
No Reliable Association between Runs of Homozygosity and Schizophrenia in a Well-Powered Replication Study

  • Emma C. Johnson ,

    Emma.C.Johnson@colorado.edu; matthew.c.keller@gmail.com.

    Affiliations Department of Psychology and Neuroscience, University of Colorado at Boulder, United States of America, Institute for Behavioral Genetics, University of Colorado at Boulder, United States of America

  • Douglas W. Bjelland,

    Affiliation Institute for Behavioral Genetics, University of Colorado at Boulder, United States of America

  • Daniel P. Howrigan,

    Affiliations Center for Human Genetic Research, Massachusetts General Hospital, Boston, Massachusetts, United States of America, Broad Institute, Cambridge, Massachusetts, United States of America, Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, United States of America

  • Abdel Abdellaoui,

    Affiliation Department of Biological Psychology, VU University Amsterdam, Amsterdam, Netherlands

  • Gerome Breen,

    Affiliation Department of Social Genetic and Developmental Psychiatry, King’s College London, London, United Kingdom

  • Anders Borglum,

    Affiliations The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Denmark, Centre for Integrative Sequencing, iSEQ, Aarhus University, Aarhus, Denmark, Department of Biomedicine, Aarhus University, Aarhus, Denmark, Department P, Aarhus University Hospital, Risskov, Denmark

  • Sven Cichon,

    Affiliations Department of Genomics, Life and Brain Center, University of Bonn, Germany, Division of Medical Genetics, Department of Biomedicine, University Basel, Basel, Switzerland, Institute of Neuroscience and Medicine (INM-1), Structural and Functional Organisation of the Brain, Genomic Imaging, Research Centre Juelich, Juelich, Germany, Institute of Human Genetics, University of Bonn, Bonn, Germany

  • Franziska Degenhardt,

    Affiliations Department of Genomics, Life and Brain Center, University of Bonn, Germany, Institute of Human Genetics, University of Bonn, Bonn, Germany

  • Andreas J. Forstner,

    Affiliations Department of Genomics, Life and Brain Center, University of Bonn, Germany, Institute of Human Genetics, University of Bonn, Bonn, Germany

  • Josef Frank,

    Affiliation Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim / Heidelberg University, Mannheim, Germany

  • Giulio Genovese,

    Affiliation Broad Institute, Cambridge, Massachusetts, United States of America

  • Stefanie Heilmann-Heimbach,

    Affiliations Department of Genomics, Life and Brain Center, University of Bonn, Germany, Institute of Human Genetics, University of Bonn, Bonn, Germany

  • Stefan Herms,

    Affiliations Department of Genomics, Life and Brain Center, University of Bonn, Germany, Division of Medical Genetics, Department of Biomedicine, University Basel, Basel, Switzerland, Institute of Human Genetics, University of Bonn, Bonn, Germany

  • Per Hoffman,

    Affiliations Department of Genomics, Life and Brain Center, University of Bonn, Germany, Division of Medical Genetics, Department of Biomedicine, University Basel, Basel, Switzerland, Institute of Human Genetics, University of Bonn, Bonn, Germany

  • Wolfgang Maier,

    Affiliation Department of Psychiatry, University of Bonn, Bonn, Germany

  • Manuel Mattheisen,

    Affiliation Institute of Human Genetics, University of Bonn, Bonn, Germany

  • Derek Morris,

    Affiliation Department of Psychiatry & Neuropsychiatric Genetics Research Group, School of Medicine, The Trinity Centre for Health Sciences, St. James's Hospital, Ireland

  • Bryan Mowry,

    Affiliations Queensland Centre for Schizophrenia Mental Health Research, The Park, Centre for Mental Health, Wacol, Australia, Department of Psychiatry, University of Queensland, Brisbane, Australia

  • Betram Müller-Mhysok,

    Affiliations Max Planck Institute of Psychiatry, Munich, Germany, Munich Cluster for Systems Neurology (SyNergy), Munich, Germany, University of Liverpool, Institute of Translational Medicine, Liverpool, United Kingdom

  • Benjamin Neale,

    Affiliations Center for Human Genetic Research, Massachusetts General Hospital, Boston, Massachusetts, United States of America, Broad Institute, Cambridge, Massachusetts, United States of America, Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, United States of America

  • Igor Nenadic,

    Affiliation Department of Psychiatry and Psychotherapy, Jena University Hospital, Jena, Germany

  • Markus M. Nöthen,

    Affiliations Department of Genomics, Life and Brain Center, University of Bonn, Germany, Institute of Human Genetics, University of Bonn, Bonn, Germany

  • Colm O’Dushlaine,

    Affiliation Neuropsychiatric Genetics Research Group, Department of Psychiatry and Institute of Molecular Medicine, Trinity College Dublin, Ireland

  • Marcella Rietschel,

    Affiliation Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim / Heidelberg University, Mannheim, Germany

  • Douglas M. Ruderfer,

    Affiliation Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, United States of America

  • Dan Rujescu,

    Affiliations Molecular and Clinical Neurobiology, Department of Psychiatry, Ludwig-Maximilians-University, Munich, Germany, Department of Psychiatry, University of Halle-Wittenberg, Halle, Germany

  • Thomas G. Schulze,

    Affiliation Institute for Psychiatric Phenomics and Genomics (IPPG), Ludwig-Maximilians-University, Munich, Germany

  • Matthew A. Simonson,

    Affiliation Mayo Clinic, Department of Health Sciences, Division of Biomedical Statistics and Informatics, Rochester, Minnesota, United States of America

  • Eli Stahl,

    Affiliations Broad Institute, Cambridge, Massachusetts, United States of America, Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, United States of America

  • Jana Strohmaier,

    Affiliation Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim / Heidelberg University, Mannheim, Germany

  • Stephanie H. Witt,

    Affiliation Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim / Heidelberg University, Mannheim, Germany

  • Schizophrenia Working Group of the Psychiatric Genomics Consortium ,

    Membership of the Schizophrenia Working Group of the Psychiatric Genomics Consortium is provided in the acknowledgments.

  • Patrick F. Sullivan,

    Affiliations Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden, Department of Genetics, University of North Carolina, Chapel Hill, NC, United States of America, Department of Psychiatry, University of North Carolina, Chapel Hill, NC, United States of America

  • Matthew C. Keller

    Emma.C.Johnson@colorado.edu; matthew.c.keller@gmail.com.

    Affiliations Department of Psychology and Neuroscience, University of Colorado at Boulder, United States of America, Institute for Behavioral Genetics, University of Colorado at Boulder, United States of America


Abstract

It is well known that inbreeding increases the risk of recessive monogenic diseases, but it is less certain whether it contributes to the etiology of complex diseases such as schizophrenia. One way to estimate the effects of inbreeding is to examine the association between disease diagnosis and genome-wide autozygosity estimated using runs of homozygosity (ROH) in genome-wide single nucleotide polymorphism arrays. Using data for schizophrenia from the Psychiatric Genomics Consortium (n = 21,868), Keller et al. (2012) estimated that the odds of developing schizophrenia increased by approximately 17% for every additional percent of the genome that is autozygous (β = 16.1, CI(β) = [6.93, 25.7], Z = 3.44, p = 0.0006). Here we describe replication results from 22 independent schizophrenia case-control datasets from the Psychiatric Genomics Consortium (n = 39,830). Using the same ROH calling thresholds and procedures as Keller et al. (2012), we were unable to replicate the significant association between ROH burden and schizophrenia in the independent PGC phase II data, although the effect was in the predicted direction, and the combined (original + replication) dataset yielded an attenuated but significant relationship between Froh and schizophrenia (β = 4.86, CI(β) = [0.90, 8.83], Z = 2.40, p = 0.02). Since Keller et al. (2012), several studies have reported inconsistent associations of ROH burden with complex traits, particularly in case-control data. These conflicting results might suggest that the effects of autozygosity are confounded by various factors, such as socioeconomic status, education, urbanicity, and religiosity, which may be associated with both real inbreeding and the outcome measures of interest.

Author Summary

It is well known that mating between relatives increases the risk that a child will have a rare recessive genetic disease, but there has also been increasing interest in, and inconsistent findings on, whether inbreeding is a risk factor for common, complex psychiatric disorders such as schizophrenia. The best powered study to date investigating this theory predicted that the odds of developing schizophrenia increase by approximately 17% for every additional percent of the genome that shows evidence of inbreeding. In this replication, we used genome-wide single nucleotide polymorphism data from 18,562 schizophrenia cases and 21,268 controls to quantify the degree to which they were inbred and to test the hypothesis that schizophrenia cases show higher mean levels of inbreeding. Contrary to the original study, we did not find evidence that distant inbreeding plays a role in schizophrenia risk. There are various confounding factors that could explain the discrepancy in results between the original study and our replication, and this should serve as a cautionary note: careful attention should be paid to issues like ascertainment when using the data from genome-wide case-control association studies for secondary analyses for which the data may not have originally been intended.

Introduction

Close inbreeding (e.g., cousin-cousin mating) is known to decrease fitness in animals[1] and to increase risk for recessive Mendelian diseases in humans[2], a phenomenon known as inbreeding depression. Inbreeding depression is thought to occur due to evolutionary selection against genetic variants that decrease fitness—e.g., variants that increase risk of disorders[3]. Such fitness-reducing variants should not only be more rare, but also more recessive than expected under a neutral evolution model (i.e., show directional dominance). If so, individuals with a greater proportion of their genome in autozygous stretches (two homologous segments of a chromosome inherited from a common ancestor identical by descent [IBD]) should have higher rates of disorders. This is because autozygous regions reveal the full, harmful effects of any deleterious, recessive alleles that existed on the haplotype of the common ancestor.

Whether inbreeding increases risk for complex disorders like schizophrenia is less clear. Previous studies have found that inbreeding is associated with higher rates of complex disorders[4–9]. However, sample sizes have typically been small and the possibility that confounding factors might explain the results has left the links inconclusive. Moreover, close inbreeding accounts for fewer than 1% of marriages in industrialized countries[10], and information on pedigrees going back many generations is difficult to collect reliably. For these reasons, investigators have recently begun looking at signatures of very distant inbreeding (e.g., common ancestry up to ~100 generations ago) using genome-wide single nucleotide polymorphism (SNP) data in an attempt to understand whether autozygosity increases the risk of schizophrenia and other complex diseases[11]. Autozygosity in SNP data is typically inferred from runs of homozygosity (ROHs): long, contiguous stretches (e.g., > 40) of homozygous SNPs. The proportion of the genome contained in such ROHs, Froh, can then be used to predict complex traits[12–19]. Keller et al.[11] showed that Froh is the optimal method for detecting inbreeding signals that are due to rare, recessive to partially recessive mutations, such as those thought to occur when traits are under directional selection[3]. The low variation in Froh means that large sample sizes (e.g., >12,000) are required to uncover realistic effects of distant inbreeding on complex diseases in samples unselected for inbreeding[11].

In 2012, Keller et al.[20] used the original Psychiatric Genomics Consortium schizophrenia data (17 case-control datasets, total n = 21,831) to investigate whether Froh is associated with increased risk of schizophrenia. The authors estimated that the odds of developing schizophrenia increased by approximately 17% for every additional percent of the genome that is contained in autozygous regions (β = 16.1, CI(β) = [6.93, 25.7], p = 6 × 10⁻⁴). This was by far the largest study to that date examining the association between Froh and any psychiatric disorder, and the significant relationship between Froh and case-control status remained robust through secondary analyses of various covariate combinations, common vs. rare IBD haplotypes, and SNP thresholds used to define ROHs. These results are consistent with the hypothesis that autozygosity causally increases the risk of schizophrenia. Nevertheless, because various confounding factors may increase likelihood of distant inbreeding as well as the probability of having offspring with schizophrenia, these results do not imply a causal relationship. For example, parents higher on schizophrenia liability may pass their higher liability to offspring and mate with more genetically similar partners (e.g., due to decreased mobility, educational opportunities, etc.).

The current study seeks to provide a well-powered, independent replication of Keller et al. (2012)[20]. In light of the growing concern about publication bias[21,22] and the dearth of well-powered replications[23,24], this follow-up analysis is a necessary step in validating the Froh-schizophrenia relationship. The present study used genome-wide SNP data from 22 independent schizophrenia case-control datasets (n = 39,830) from the PGC[25] to further examine the relationship between Froh and schizophrenia. Our replication attempt is an important contribution to the growing body of literature examining autozygosity and psychiatric disorders, and should help verify whether autozygosity estimated from ROHs is robustly related to schizophrenia risk and, by extension, can help elucidate whether schizophrenia risk alleles are biased, on average, toward recessive effects.

Results

SNP data from 28,985 schizophrenia cases and 35,017 controls were collected as detailed in Ripke et al.[25]. Quality control (QC) and analyses were conducted separately for the original and replication datasets. The “original” dataset included subjects from the PGC’s SCZ1[26] samples used by Keller et al.[20] (n = 21,868 after QC), and the “replication” dataset contained all subjects (n = 39,830 after QC) in the PGC SCZ2[25] samples not included in the original Keller et al. study, making the replication dataset independent of the original dataset analyzed in Keller et al.

Despite the number of imputed SNPs ranging from ~1.8 million to ~4.2 million in the datasets, there were not enough well imputed SNPs in common across all 22 datasets to conduct a viable ROH analysis in the same way as in the original study (see Methods). Nevertheless, Keller et al. also reported results from ROHs estimated from unimputed SNP data, and these results were highly consistent with imputed SNPs. Therefore, our primary analyses were conducted using post-QC, unimputed genotype data. We also report results on imputed SNPs (see S5–S12 Figs and S1 Table) using slightly different QC procedures than used in the original report (see Methods), which do not change the conclusions below. While ROHs from the imputed data were called from a common SNP set, ROHs from the unimputed data were called on unique sets of SNPs for each dataset.

Keller et al.[20] found that all ROH length thresholds were significantly associated with schizophrenia, but because ROH thresholds are ultimately arbitrary, they focused their discussion on the thresholds (e.g., 110 consecutive homozygous SNPs in the unimputed data) that maximized the schizophrenia-ROH relationship. In an attempt to follow as closely as possible the method used by Keller et al., we report two sets of ROH results. The first approach, a direct replication attempt of Keller et al., defined ROHs as ≥ 110 consecutive homozygous SNPs (with median length ranging from ~1 to ~3.4 Mb, depending on sample) in the unimputed data. Because using unimputed SNP data introduces large differences in mean ROH length across datasets (when defined by number of consecutive homozygous SNPs) due to varying SNP densities, we also employed a secondary replication approach using a 2.3 Mb minimum length threshold, which corresponds to the average length of ROHs defined by 110 SNPs-in-a-row in the original report. As in the original report, we also show results across all thresholds to ensure that no results were missed.
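The two ROH definitions described above (a minimum count of consecutive homozygous SNPs, or a minimum physical length) and the resulting Froh can be illustrated with a minimal Python sketch. This is not the calling pipeline used in the study: the function names (call_rohs, froh), the boolean homozygosity vector, and the toy positions and genome length are assumptions made for illustration only, and real ROH callers additionally handle missing genotypes, occasional heterozygous calls, and SNP density.

```python
def call_rohs(positions_bp, is_homozygous, min_snps=110, min_length_bp=0):
    """Return (start_bp, end_bp, n_snps) for each run of consecutive
    homozygous SNPs that meets both thresholds (hypothetical helper)."""
    rohs = []
    run_start = None
    last = len(is_homozygous) - 1
    for i, hom in enumerate(is_homozygous):
        if hom and run_start is None:
            run_start = i  # a new homozygous run begins here
        run_ends_here = run_start is not None and (not hom or i == last)
        if run_ends_here:
            run_end = i if hom else i - 1
            n_snps = run_end - run_start + 1
            length_bp = positions_bp[run_end] - positions_bp[run_start]
            if n_snps >= min_snps and length_bp >= min_length_bp:
                rohs.append((positions_bp[run_start], positions_bp[run_end], n_snps))
            run_start = None
    return rohs


def froh(rohs, genome_length_bp):
    """Proportion of the (assumed) genome length contained in ROHs."""
    return sum(end - start for start, end, _ in rohs) / genome_length_bp


# Toy data: 10 SNPs spaced 1 kb apart, with thresholds scaled down accordingly.
positions = [i * 1000 for i in range(10)]
homozygous = [True, True, True, False, True, True, True, True, True, False]
toy_rohs = call_rohs(positions, homozygous, min_snps=4)
toy_froh = froh(toy_rohs, genome_length_bp=10_000)
```

In this sketch the SNP-count definition corresponds to setting min_snps (110 in the text) and the length definition to setting min_snps=1 with min_length_bp=2_300_000.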

Table 1 gives the descriptive statistics for average ROH lengths and Froh across datasets, where ROHs were defined as ≥ 110 consecutive homozygous SNPs. There was wide variation in average Froh and ROH lengths between datasets, a consequence of using unimputed SNP data, which introduces more between-dataset variability in Froh and mean ROH length[20]. Across datasets, mean Froh was also higher (0.30% vs. 0.14%) and average ROH lengths shorter (1.1–3.4 Mb vs. 2.0–4.7 Mb) in the replication versus original datasets. Part of the reason for the mean Froh discrepancy seemed to be that replication datasets were genotyped on denser SNP chips, because this discrepancy was reduced when we defined ROHs as ≥ 2.3 Mb long (0.22% vs. 0.13%; Table 1). The remaining higher average Froh in the replication datasets appears to be due to more samples being from countries with higher overall Froh (e.g., Sweden, Estonia, Israel) in the replication datasets; the average Froh levels were very similar across replication vs. original datasets within the same countries.

Table 1. Descriptive data for the unimputed (post-QC) PGC replication data—ROHs defined as ≥ 110 consecutive homozygous SNPs or as ≥ 2.3 Mb long.

https://doi.org/10.1371/journal.pgen.1006343.t001

ROH burden results

We regressed case-control status on Froh using logistic regression, controlling for 20 principal components (PCs) from the genomic relationship matrix[27] and two SNP quality measures (excess heterozygosity and SNP missingness; see Methods). Keller et al. (2012) used mixed effects models, treating dataset as a random factor, to test the ROH burden association with schizophrenia. In the current analysis, however, we used fixed effect logistic regression models, treating dataset as a fixed factor, because a minority of the mixed effects models failed to converge. When the mixed effects models did converge, the results were highly similar to the respective fixed effect models. Figs 1 and S1 show the predicted change in odds of schizophrenia risk (and 95% confidence intervals) for every 1% increase in average Froh for each logistic regression in the replication data using ROHs defined by either ≥ 110 consecutive homozygous SNPs (Fig 1) or ROH length ≥ 2.3 Mb (S1 Fig). The overall association between schizophrenia and Froh in the replication data was in the predicted direction but not significant for ROHs defined as at least 110 consecutive homozygous SNPs (β = 0.19, CI(β) = [−4.50, 4.88], Z = 0.08, p = 0.94) or for ROHs defined as ≥ 2.3 Mb (β = 0.75, CI(β) = [−4.05, 5.56], Z = 0.31, p = 0.76). The results from analyses on ROHs called from imputed rather than raw SNP data were also non-significant (S5 Fig). As in Keller et al., we also explored increasingly long SNP and Mb ROH thresholds to assess the stability of the Froh-schizophrenia relationship (Figs 2 and 3). Across all thresholds, the only thresholds that approached significant associations between Froh and schizophrenia in the replication data were at the upper limits of the Mb-length ROH thresholds; the strongest association was for ROHs defined as ≥ 19 Mb (β = 8.64, CI(β) = [−0.85, 18.13], Z = 1.78, p = 0.07).
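As a worked check of how the reported slopes map onto odds, the sketch below assumes that β is the log-odds change per unit of Froh expressed as a proportion (so that a 1% increase corresponds to 0.01), that the reported 95% confidence intervals are symmetric Wald intervals, and that p-values are two-sided normal tail probabilities. These are assumptions for illustration rather than a description of the authors' software, and the function names are hypothetical.

```python
import math


def odds_change_per_percent(beta):
    """Multiplicative change in odds for a 0.01 (1%) increase in Froh."""
    return math.exp(beta * 0.01)


def wald_z_and_p(beta, ci_low, ci_high):
    """Recover Z and a two-sided p-value from a symmetric 95% Wald interval."""
    se = (ci_high - ci_low) / (2 * 1.96)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Original estimate (beta = 16.1): about a 17% increase in odds per 1% Froh.
original_or = odds_change_per_percent(16.1)

# Replication estimate for ROHs of >= 110 consecutive homozygous SNPs.
rep_z, rep_p = wald_z_and_p(0.19, -4.50, 4.88)

print(f"{original_or:.3f} {rep_z:.2f} {rep_p:.2f}")  # 1.175 0.08 0.94
```

Under these assumptions the replication values agree with the Z = 0.08 and p = 0.94 reported above.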

Fig 1. Estimated changes in odds of schizophrenia for each 1% increase in Froh (odds ratios; asterisks) and their 95% confidence intervals (bars) across the independent replication datasets (colored according to SNP platform) and for the total sample (black) from the unimputed SNP data, for ROHs defined as ≥ 110 consecutive homozygous SNPs.

Boxes are proportional to the square root of sample sizes (also shown at the bottom). Dataset names are on the x-axis. Only one of the individual estimated odds ratios significantly differs from one (“clm2” dataset), and the overall effect (black) is not significant (β = 0.19, Z = 0.08, p = 0.94).

https://doi.org/10.1371/journal.pgen.1006343.g001

Fig 2. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from unimputed SNP data predicting schizophrenia for different SNP thresholds of calling ROHs.

No SNP homozygosity threshold was significant.

https://doi.org/10.1371/journal.pgen.1006343.g002

Fig 3. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from unimputed SNP data predicting schizophrenia for different Mb thresholds of calling ROHs.

No Mb length thresholds reached significance.

https://doi.org/10.1371/journal.pgen.1006343.g003

We conducted a series of follow-up analyses to ensure that the failure to replicate our original report was not due to analytical error, inclusion of outlier individuals or datasets, or suppressing covariates in the replication data. We reran the same analyses described above on SNP data from the “original” report using the exact same quality control and analytic procedures performed on the replication data. Results were virtually identical to those obtained in Keller et al.’s 2012 study (S2–S4 Figs), increasing our confidence that the procedures used in the replication attempt were identical to those used in the original analysis and that the results from the original analysis were not due to analytic or procedural errors. We then reran analyses in the replication data after (a) omitting individuals with very long (>30 Mb) ROHs, (b) omitting only long ROHs, (c) including all combinations of covariates in the model (SNP missingness, average heterozygosity, 10 or 20 principal components), and (d) including only the longest ROH for each individual. The Froh-schizophrenia relationship remained non-significant in these follow-up analyses (results shown in S2 Table).

We noticed that there was greater variability in Froh in the replication datasets and that this greater variability was mostly driven by replication datasets that had n < 300. Under the premise that smaller samples might differ in genotypic or phenotypic quality, we excluded six samples that contained fewer than 300 cases (“egcu”, “ersw”, “lie2”, “pews”, “top8”, “umes”), reran our baseline analysis (including all covariates mentioned above and using an ROH threshold of ≥ 110 consecutive homozygous SNPs), but still observed a non-significant Froh-schizophrenia relationship (β = 1.04, CI(β) = [−3.88, 5.96], Z = 0.42, p = 0.68) in the predicted direction. Therefore, this post-hoc analysis does not lend support to the possibility that small samples in the replication set added noise to our analysis, obscuring an Froh-schizophrenia relationship.

Although results from the replication analysis were not significant, they were in the same direction as the original analysis. It could therefore be argued that the best estimate of the association between ROHs and schizophrenia is obtained by combining the two datasets. When we reran our analyses on the combined original + replication data (n = 61,698), all Froh associations based on ROH thresholds greater than 60 consecutive homozygous SNPs or longer than 1 Mb were significant (Figs 4 and 5). For an ROH threshold of ≥ 110 consecutive homozygous SNPs, we observed a significant Froh-schizophrenia relationship in the combined data (β = 4.86, CI(β) = [0.90, 8.83], Z = 2.40, p = 0.02). In this combined dataset, we also used a replication status-by-Froh interaction to conclude that the Froh-schizophrenia association was only marginally higher in the original compared to the replication datasets (interaction β = −3.98, Z = −1.84, p = 0.07) for ROHs defined as at least 110 consecutive homozygous SNPs.

Fig 4. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from the combined unimputed SNP data predicting schizophrenia for different SNP thresholds of calling ROHs.

All SNP thresholds greater than 60 SNPs-in-a-row were significant.

https://doi.org/10.1371/journal.pgen.1006343.g004

Fig 5. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from the combined unimputed SNP data predicting schizophrenia for different Mb thresholds of calling ROHs.

All length thresholds longer than 1 Mb were significant.

https://doi.org/10.1371/journal.pgen.1006343.g005

The effects of close versus distant inbreeding

To assess the relative importance of distant versus close inbreeding, we compared the effects of short versus long ROHs. As in the original study, we chose our ROH length threshold based on the Mb length cutoff that resulted in equal Froh variances, calculating Froh_short as the proportion of the genome contained in ROHs < 8 Mb long, and Froh_long as the proportion of the genome contained in ROHs > 8 Mb long. Although neither association was significant, the effect of Froh_short (β = −5.06, CI(β) = [−12.08,1.95], Z = −1.42, p = 0.16), caused by autozygosity arising from more ancient common ancestors, was negative (“protective”) and in the opposite direction of effect of Froh_long (β = 1.23, CI(β) = [−4.78,7.25], Z = 0.40, p = 0.69), caused by autozygosity arising from more recent common ancestors, which predicted increased risk for schizophrenia (Fig 6).
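The short/long partition described above can be sketched as follows. The sketch assumes that each individual is represented by a list of ROH lengths in Mb, that a single genome length is used as the denominator, and that the cutoff is chosen from a list of candidates by minimizing the absolute difference between the across-individual variances of Froh_short and Froh_long. The function names, the toy data, and the handling of ROHs exactly at the cutoff (counted as long) are illustrative assumptions, not the authors' procedure.

```python
from statistics import pvariance


def split_froh(roh_lengths_mb, cutoff_mb, genome_mb):
    """Return (Froh_short, Froh_long) for one individual."""
    short = sum(x for x in roh_lengths_mb if x < cutoff_mb) / genome_mb
    long_ = sum(x for x in roh_lengths_mb if x >= cutoff_mb) / genome_mb
    return short, long_


def equal_variance_cutoff(individuals, candidates_mb, genome_mb):
    """Pick the candidate cutoff whose short and long Froh variances are closest."""
    best_cutoff, best_gap = None, None
    for cutoff in candidates_mb:
        pairs = [split_froh(rohs, cutoff, genome_mb) for rohs in individuals]
        var_short = pvariance([s for s, _ in pairs])
        var_long = pvariance([l for _, l in pairs])
        gap = abs(var_short - var_long)
        if best_gap is None or gap < best_gap:
            best_cutoff, best_gap = cutoff, gap
    return best_cutoff


# Toy example: four individuals, ROH lengths in Mb, a 100 Mb "genome".
toy_individuals = [[1, 1], [1, 9], [3], [3, 9]]
toy_cutoff = equal_variance_cutoff(toy_individuals, candidates_mb=[2, 8], genome_mb=100)
toy_short, toy_long = split_froh([2.0, 3.0, 10.0], cutoff_mb=8.0, genome_mb=100.0)
```

Froh_short and Froh_long computed this way can then be entered as separate predictors in the regression.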

Fig 6. Forest plot of the change in odds of schizophrenia risk for each 1% increase in Froh due to short (< 8 Mb, blue) or long (> 8 Mb, red) ROHs for each sample in the replication.

Boxes are proportional to the square root of sample sizes, and 95% confidence intervals are indicated by the horizontal lines. Dataset names are on the y-axis, with the estimates from the combined sample at the bottom.

https://doi.org/10.1371/journal.pgen.1006343.g006

Discussion

Despite exploring various homozygous SNP length thresholds, Mb thresholds, and combinations of covariates, the findings from this study do not lend much support to the original observation of a highly significant Froh-schizophrenia association[20], and provide only equivocal support, based on combining the original and replication data, for the hypothesis that autozygosity is a risk factor for schizophrenia.

Perhaps the simplest explanation for this pattern of results is that the conclusions about distant inbreeding from the original data represent a type-I error or that the lack of replication in the current report was a type-II error. Despite the fact that the effect in the original study was highly significant (p = 6 × 10⁻⁴) and the statistical power in the replication study to detect the observed effect size in the original study was nearly 100%, it is possible that the estimated effects of the original analysis could have been over-estimated and/or those of the replication analysis under-estimated, due to sampling variability. There is some support for this interpretation, as there was not a significant difference in results between replication versus original datasets (interaction p = 0.07).

An alternative explanation for the overall pattern of results has to do with the potential influence of unmeasured confounding factors in both the original and replication analyses. Unlike genotype frequencies, which change very slowly and are unaffected by inbreeding, ROH levels can change substantially after even a single generation of inbreeding, making ROH analyses highly susceptible to confounding factors associated with both disease risk and the degree of inbreeding/outbreeding. For example, contrary to initial predictions, Abdellaoui et al.[28] identified a significant and negative (“protective”) relationship between Froh and risk for major depressive disorder (MDD) in the Dutch population. However, the authors found that religiosity was significantly associated with both higher autozygosity and lower MDD in this population. When religiosity was accounted for in their regression model, the original association between MDD and Froh disappeared. A similar effect was detected for educational attainment: highly educated individuals were more likely to migrate and mate with highly educated and more diverse partners, making highly educated spouse pairs share less ancestry and leading to their offspring having lower Froh[29]. Thus, assortative mating on variables such as education or religion could subtly influence observed Froh associations, potentially affecting results in ways that can be difficult to account for. For example, an observed Froh-schizophrenia relationship could be due to parents with a higher schizophrenia liability mating with less genetically diverse mates due to, e.g., fewer educational opportunities or lower migration rates. Thus, the causation may be reversed: schizophrenia liability in parents could cause not only higher schizophrenia risk, but also higher Froh, in offspring rather than Froh in offspring increasing their schizophrenia liability. Such reverse and third variable causation possibilities can only be tested if relevant socio-demographic variables in subjects and (optimally) their parents are collected.

The possibility of unmeasured variables confounding Froh-disorder relationships seems particularly likely in analyses conducted on ascertained samples. Ascertainment of cases and controls not perfectly matched on socio-demographic factors that might affect degree of outbreeding (e.g., socioeconomic status, education level, age, religion, urbanicity) can mask any true Froh association and bias the observed association in either direction. Such a scenario might explain otherwise contradictory findings in previous ROH case-control analyses[18,28,30–36]. For example, following two studies showing that genome-wide autozygosity was significantly associated with schizophrenia risk, including the original Keller et al. study[13,20], two newer studies failed to replicate this association[34,35], although both replication sample sizes (n = 3,400 and 11,244 respectively) were substantially smaller than the current one (n = 39,830). (It should be noted that the sample used in the latter study[35] overlapped with the samples in both the original Keller et al.[20] study and the current replication study.) Even within the same study, Froh results in ascertained samples have been inconsistent. Using PGC MDD data, Power et al.[36] found a significant positive Froh-MDD relationship in data from three German sites but a significant negative Froh-MDD relationship in six non-German sites. A possible explanation for this and other such examples of heterogeneity across sites they observed is that cases and controls differed on socio-demographic factors that were associated with Froh, and the direction of this ascertainment bias was inconsistent across data collection sites.

We believe that similar ascertainment biases could have affected results in the present study as well as in the original Keller et al.[20] report. Many of the PGC schizophrenia datasets used cases ascertained from hospitals, clinics, health surveys, and advertisements but controls from previous biomedical research volunteers, university students, blood donors, and population registries. While such differences in ascertainment between cases and controls are highly unlikely to lead to allele frequency differences, and thus are of little concern to genome-wide association studies, they could very easily lead to Froh differences due to differences in degree of inbreeding/outbreeding in the populations from which cases and controls were drawn. Controlling for ancestry principal components in this case would only help to the extent that the degree of inbreeding/outbreeding is associated with ancestry. Unfortunately, none of the other variables that might statistically control for such biases due to differences in case/control ascertainment are currently available in the PGC data collection. The PGC collection of studies was designed for association analyses; it was not optimally designed for ancillary purposes, such as ROH analyses.

It is important to recognize that even ascertainment biases that differ at random across sites would substantially inflate type-I error rates because the proper degrees of freedom for the test should be closer to the number of independent sites than to the number of independent cases and controls. To demonstrate this, we permuted data under the null hypothesis of no relationship between Froh and schizophrenia in the 17 datasets from the original 2012 study by randomly flipping case and control status within each dataset for each permutation (i.e., the case and control statuses in a dataset either all remained the same or were all flipped to the opposite status). We then calculated the overall Froh ~ schizophrenia relationship with the same logistic regression model and the same covariates as in the original analysis. Across 1,000 permutations, 183 p-values were significant (p < 0.05), implying a type-I error rate of 0.18 and demonstrating how false conclusions about Froh relationships can be reached even when ascertainment biases are random across multiple sites.
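The logic of this permutation check can be illustrated with a minimal Python sketch. It is not the analysis code used in this study: the site-level data are simulated, a simple inverse-variance-weighted z-test stands in for the logistic regression with covariates, and every function name and parameter value (17 sites, 500 cases and 500 controls per site, a between-site bias SD of 0.2 in standardized Froh units) is an assumption chosen for illustration. Swapping all case/control labels at a site negates that site's case-minus-control difference, so each permutation is represented by a random sign per site.

```python
import math
import random
import statistics


def site_summary(cases, controls):
    """Case-minus-control difference in mean Froh for one site, with its standard error."""
    diff = statistics.fmean(cases) - statistics.fmean(controls)
    se = math.sqrt(statistics.variance(cases) / len(cases)
                   + statistics.variance(controls) / len(controls))
    return diff, se


def pooled_z(summaries, signs):
    """Inverse-variance-weighted z statistic across sites.

    A sign of -1 corresponds to swapping every case/control label at that site,
    which negates the site's difference and leaves its standard error unchanged.
    """
    numerator = sum(sign * diff / se ** 2
                    for (diff, se), sign in zip(summaries, signs))
    denominator = math.sqrt(sum(1.0 / se ** 2 for _, se in summaries))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def flip_permutation_rate(summaries, n_perm=1000, alpha=0.05, seed=1):
    """Share of label-flip permutations whose pooled test gives p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        signs = [rng.choice((1, -1)) for _ in summaries]
        if two_sided_p(pooled_z(summaries, signs)) < alpha:
            hits += 1
    return hits / n_perm


def simulate_sites(n_sites=17, n_per_group=500, bias_sd=0.2, seed=0):
    """Simulate sites with no true effect but a random case-control offset in Froh."""
    rng = random.Random(seed)
    summaries = []
    for _ in range(n_sites):
        bias = rng.gauss(0.0, bias_sd)  # site-specific bias, random in direction
        cases = [rng.gauss(bias, 1.0) for _ in range(n_per_group)]
        controls = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        summaries.append(site_summary(cases, controls))
    return summaries


summaries = simulate_sites()
rate = flip_permutation_rate(summaries)
print(f"Share of permutations with p < 0.05: {rate:.3f}")
```

Because the pooled test treats individuals rather than sites as the independent units, the printed share is expected to exceed the nominal 0.05 whenever the site-level offsets are large relative to their standard errors; setting `bias_sd=0.0` in this sketch should bring it back toward 0.05.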

Conclusion

Given concerns about the false discovery rate in science[22], there has been increasing emphasis on the need for well-powered, direct replications of novel findings in genetics[23,37,38] and other fields[39–41]. The current study was a well-powered, direct replication attempt that failed to replicate an earlier finding that autozygosity arising from distant common ancestors was significantly associated with schizophrenia. As is typical with null findings, it is difficult to identify the reason for this failure to replicate. However, we have argued that a likely cause is that ROH associations are highly susceptible to confounding, especially in case-control (ascertained) samples. Thus, we believe that the conclusions of the original study were premature and the true causal relationship between schizophrenia and autozygosity could be either stronger/more positive (if the populations from which controls were ascertained were, on average, slightly less outbred than populations from which cases were ascertained) or weaker/more negative (the reverse) than reported here. Unfortunately, we do not have the ability to test these hypotheses directly in the current datasets, and doing so awaits either new samples in which cases and controls are carefully matched or the collection of information that allows potential confounders to be statistically controlled. This creates a dilemma for ROH analyses using existing case-control genome-wide data: GWAS datasets usually do not match cases and controls to the degree necessary to rule out confounding effects on ROH analyses and typically do not collect the relevant socio-demographic information necessary to control for potential confounders. The current study therefore serves as a cautionary tale for analyzing ROHs in existing ascertained GWAS datasets. Such datasets may be perfectly adequate for their designed purpose, GWAS, but may be problematic and even misleading for ROH analyses.

Methods

Psychiatric Genomics Consortium GWAS Data

Our study used 37 datasets from the Psychiatric Genomics Consortium’s SCZ2 data; these data included 28,985 schizophrenia cases and 35,017 controls, collected from 37 sites in 13 countries. Data collection and ascertainment details are described elsewhere[25].

Keller et al.[20] used 17 datasets from the PGC SCZ1[26] data. Several of these original 17 studies had recruited additional subjects by the time of our study, necessitating two well-defined, independent datasets: one including all of the individuals analyzed in the original 2012 study (the “original” dataset), and one containing only subjects not included in Keller et al.’s 2012 report (the “replication” dataset, comprising 22 studies and a total sample size of 18,562 cases and 21,268 controls after QC; see Table 1). Three of the original case-control datasets from the PGC’s SCZ1 added more cases and/or controls in SCZ2, but only two of these datasets had enough subjects to pass QC and merit inclusion in the current study. Thus, the replication study includes a “top8” dataset (N = 180), comprising the samples that were added to the “top3” dataset (N = 598) from the original 2012 study, and a “boco” dataset (N = 1,870), comprising the new cases and controls that were added to the original “bon” dataset (N = 1,778). For consistency with the original Keller et al. (2012) study[20], we excluded the three family-based datasets of parent-proband trios and three East Asian datasets.

Quality Control (QC) Procedures–Raw SNP Data

We followed the same QC procedures as Keller et al.[20]. We removed a) one individual from any pair of individuals who were related with π̂ >0.2; b) individuals with non-European ancestry as determined by principal components analysis; c) samples with SNP missingness >0.02; or d) samples with genome-wide heterozygosities >6 standard deviations above the mean. SNPs were excluded if they a) deviated from Hardy-Weinberg equilibrium at p < 1×10⁻⁶; b) had missingness >0.02; or c) had a missingness difference between cases and controls >0.02.
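As a rough illustration of how the threshold-based filters above combine, the following Python sketch applies the sample-level missingness and heterozygosity cutoffs and the three SNP-level cutoffs. It is a sketch under stated assumptions rather than the pipeline used here: the `Sample` and `Snp` records and their field names are hypothetical, the Hardy-Weinberg p-values are taken as precomputed inputs, and the relatedness and ancestry filters are omitted because they require pairwise and principal-component computations.

```python
import statistics
from dataclasses import dataclass


@dataclass
class Sample:
    sample_id: str
    missing_rate: float      # proportion of SNP calls missing for this sample
    heterozygosity: float    # genome-wide heterozygosity


@dataclass
class Snp:
    snp_id: str
    hwe_p: float             # precomputed Hardy-Weinberg p-value
    missing_rate: float      # overall missingness
    missing_cases: float     # missingness among cases
    missing_controls: float  # missingness among controls


def keep_samples(samples, max_missing=0.02, het_sd_limit=6.0):
    """Drop samples with high missingness or heterozygosity far above the mean."""
    hets = [s.heterozygosity for s in samples]
    # Mean and SD are computed over all input samples (an assumption of this sketch).
    cutoff = statistics.fmean(hets) + het_sd_limit * statistics.stdev(hets)
    return [s for s in samples
            if s.missing_rate <= max_missing and s.heterozygosity <= cutoff]


def keep_snps(snps, min_hwe_p=1e-6, max_missing=0.02, max_missing_diff=0.02):
    """Drop SNPs failing the HWE, missingness, or differential-missingness cutoffs."""
    return [v for v in snps
            if v.hwe_p >= min_hwe_p
            and v.missing_rate <= max_missing
            and abs(v.missing_cases - v.missing_controls) <= max_missing_diff]
```

Note that a cutoff of 6 standard deviations can only exclude anyone in reasonably large samples, since a single outlier also inflates the standard deviation it is compared against.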

QC Procedures–Imputed SNP Data

Early in the analysis process, we found that only including SNPs with imputation dosage r² >0.90 across all datasets, as was done in the original study[20], left us with too few SNPs with which to conduct viable ROH analyses in the replication data. Because having ROHs of similar length and SNP density is important for comparing present results to those from the 2012 study, we decided that having a similar number of SNPs to Keller et al.[20] was more important than following the exact same QC procedures. Thus, to arrive at a similar number of genome-wide SNPs in the new and old datasets, some of the QC measures described below differed from those in the 2012 investigation.

SNPs were imputed using the 1000 Genomes reference panel[42]; imputation procedures are described elsewhere[25]. Imputation dosages were converted to best-guess (highest posterior probability) SNP calls because ROH detection algorithms require discrete SNP calls, and extremely stringent QC thresholds were employed to achieve accuracy rates similar to those in genotyped SNPs[43]. We excluded any imputed SNPs that were not included in the HapMap3[44] reference panel, as was done in the 2012 study. Unlike in the original QC procedures, we did not require the dosage r² to be >0.90 in each individual dataset. We excluded any imputed SNPs that had a dosage r² <0.98 or >1.02 in the overall sample (calculated using average dosage r² weighted by sample size) or that had MAF <0.15 within each sample (vs. 0.05 in the original), leaving 340,084 high-quality imputed SNPs (vs. 398,325 in the original).
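The imputed-SNP filters can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the code used in the study: the function names and input layout are hypothetical, the MAF rule is interpreted as requiring MAF of at least 0.15 in every dataset, and the best-guess call is simply the genotype with the highest posterior probability.

```python
def best_guess_call(genotype_probs):
    """Index (0, 1, or 2 copies of the counted allele) with the highest posterior probability."""
    return max(range(len(genotype_probs)), key=lambda i: genotype_probs[i])


def weighted_mean_r2(r2_by_dataset, n_by_dataset):
    """Average dosage r-squared across datasets, weighted by sample size."""
    total_n = sum(n_by_dataset)
    return sum(r2 * n for r2, n in zip(r2_by_dataset, n_by_dataset)) / total_n


def passes_imputed_qc(r2_by_dataset, maf_by_dataset, n_by_dataset,
                      r2_low=0.98, r2_high=1.02, min_maf=0.15):
    """Keep a SNP if its weighted r-squared is in range and MAF is high enough everywhere."""
    r2 = weighted_mean_r2(r2_by_dataset, n_by_dataset)
    return r2_low <= r2 <= r2_high and all(maf >= min_maf for maf in maf_by_dataset)


# Example: two datasets of 3,000 and 1,000 individuals.
print(weighted_mean_r2([0.99, 0.97], [3000, 1000]))
print(passes_imputed_qc([0.99, 0.97], [0.20, 0.30], [3000, 1000]))
```

Weighting by sample size means that a poorly imputed SNP in a small dataset can still pass if the larger datasets impute it well, which is consistent with dropping the per-dataset r² requirement.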

ROH Calling Procedures

Again, we followed the same ROH calling procedures as in Keller et al.[20]. As recommended in a separate investigation[45] by three of the authors of the present study, we chose PLINK software[46] for its computational efficiency and superior detection of autozygous stretches. As in the 2012 study, we pruned for LD using PLINK’s --indep flag, which ensures more uniform SNP coverage across the genome and reduces false autozygosity calls by removing redundant markers. We pruned SNPs for LD using a VIF threshold of 10, which is equivalent to a multiple R² >0.90 between the focal SNP and the 50 surrounding SNPs.
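The stated equivalence between the VIF threshold and the multiple R² follows from the standard definition of the variance inflation factor, VIF = 1 / (1 − R²). A minimal Python check (the function names are ours, for illustration only):

```python
import math


def vif_from_r2(r2):
    """Variance inflation factor implied by a multiple R-squared."""
    return 1.0 / (1.0 - r2)


def r2_from_vif(vif):
    """Multiple R-squared implied by a variance inflation factor."""
    return 1.0 - 1.0 / vif


# A VIF cutoff of 10 corresponds to a multiple R-squared of 0.90.
print(math.isclose(r2_from_vif(10.0), 0.90))
print(math.isclose(vif_from_r2(0.90), 10.0))
```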

We called ROHs using PLINK’s --homozyg flags, defining initial ROHs as being ≥40 homozygous SNPs in a row with no heterozygote calls allowed. We required that ROHs have a density greater than 1 SNP per 200 kb, and split an ROH into two if a gap >500 kb existed between consecutive homozygous SNPs. We then post-processed the initial ROH calls by altering the SNPs-in-a-row threshold and the Mb length threshold; specifically, we looked at ROH calls with a minimum of 40 to 200 consecutive homozygous SNPs in increments of 10, and ROH calls with minimum lengths ranging from 1 to 20 Mb by increments of 1 Mb. We varied ROH thresholds this widely to ensure that no potential effects of autozygosity were missed, but the primary results presented here are based on two replication attempts in the unimputed data: (a) using the same SNP thresholds that gave the most straightforward comparison with the original report (this was 110 SNPs-in-a-row for the unimputed data, spanning ~1 to ~2.1 Mb in the replication datasets, and 65 SNPs-in-a-row for the imputed data), and (b) using the physical length threshold (2.3 Mb) that corresponded to the average Mb length for 110 SNPs-in-a-row in the original report.
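To make the calling criteria concrete, the following Python sketch applies them to one individual on one chromosome. It is a deliberately simplified stand-in, not PLINK's algorithm (which scans with a sliding window) and not the code used here: genotypes are assumed to be coded 0/1/2 with 1 meaning heterozygous, missing calls are not handled, and the function and parameter names are hypothetical.

```python
def call_roh(positions, genotypes, min_snps=40,
             max_gap_bp=500_000, max_bp_per_snp=200_000):
    """Return (start_bp, end_bp, n_snps) for each run of homozygous SNPs.

    A run ends at any heterozygous call or at a gap larger than max_gap_bp
    between consecutive homozygous SNPs. Runs are kept only if they contain
    at least min_snps SNPs and average no more than max_bp_per_snp per SNP.
    """
    runs = []
    current = []  # positions of the homozygous SNPs in the run being built

    def close_run():
        if len(current) >= min_snps:
            length_bp = current[-1] - current[0]
            if length_bp / len(current) <= max_bp_per_snp:
                runs.append((current[0], current[-1], len(current)))
        current.clear()

    for pos, genotype in zip(positions, genotypes):
        if genotype == 1:          # heterozygote: no heterozygous calls allowed
            close_run()
            continue
        if current and pos - current[-1] > max_gap_bp:
            close_run()            # split the run at a large gap
        current.append(pos)
    close_run()
    return runs


# 100 SNPs spaced 10 kb apart with a single heterozygous call at index 50.
positions = [i * 10_000 for i in range(100)]
genotypes = [0] * 100
genotypes[50] = 1
print(call_roh(positions, genotypes))
```

Raising `min_snps` to 110 in this sketch mirrors the primary SNPs-in-a-row threshold used for the unimputed data; with the toy input above, neither run would then qualify.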

ROH Burden Analysis

After calling ROHs, we summed the total length of all autosomal ROHs for each individual and divided that by the total SNP-mappable distance (2.77 × 10⁹ bases) to calculate Froh. Froh, the proportion of the genome contained in long homozygous regions, was used as the predictor of schizophrenia case-control status in analyses described below. As confounding factors such as population stratification, SNP missingness, call quality, and plate effects can influence Froh, we included the first 20 principal components (based on a genome relationship matrix calculated from ~30K LD-pruned SNPs), percentage of missing SNP calls in the raw data, and excess heterozygosity in all regression models[20]. We then regressed case-control status on Froh using a mixed-effects logistic regression model (available in the lme4 package in R version 3.1.0), treating dataset as a random factor, to assess the overall effect of Froh on schizophrenia across all sites. Some of the models with random effects did not converge; thus, for consistency, we modeled dataset as a fixed factor for all analyses. The results from the mixed-effects models that converged were very similar to those from the fixed effects models, giving us confidence that the fixed effects results of this analysis and the random effect results from the original Keller et al. (2012) study are commensurate. We also ran logistic regressions in each of the 22 datasets separately.
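The Froh calculation itself is simple arithmetic, sketched below in Python. The segment layout (start, end, number of SNPs) and the function name are assumptions made for illustration; the optional arguments mimic the post-processing step in which ROHs are retained only above a minimum SNP count or a minimum physical length.

```python
import math

SNP_MAPPABLE_BP = 2.77e9  # total SNP-mappable autosomal distance, in bases


def froh(roh_segments, min_snps=0, min_length_bp=0):
    """Proportion of the SNP-mappable genome covered by qualifying ROHs.

    roh_segments: iterable of (start_bp, end_bp, n_snps) tuples for one individual.
    """
    total_bp = sum(end - start
                   for start, end, n_snps in roh_segments
                   if n_snps >= min_snps and end - start >= min_length_bp)
    return total_bp / SNP_MAPPABLE_BP


# One individual with a 2.77 Mb ROH (150 SNPs) and a 1.385 Mb ROH (60 SNPs).
segments = [(0, 2_770_000, 150), (10_000_000, 11_385_000, 60)]
print(froh(segments))                           # both ROHs counted
print(froh(segments, min_snps=110))             # SNPs-in-a-row threshold
print(froh(segments, min_length_bp=2_300_000))  # physical length threshold
```

In this toy example both thresholds retain only the longer segment, so the last two calls give the same value (0.001, i.e., 0.1% of the mappable genome).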

Ethics Statement

This research was approved by CU Boulder's Institutional Review Board with regard to protocol number 13–0266 on 3/29/2016 in accordance with Federal Regulations at 45 CFR 46. Written patient consent was obtained for each individual study by the study PI, with the exception of the "clm3" and "clo3" datasets, which obtained anonymous samples via a drug monitoring service under ethical approval and in accordance with the UK Human Tissue Act.

Supporting Information

S1 Table. Descriptives for the imputed independent PGC replication data, for ROHs defined as 65 SNPs or greater.

https://doi.org/10.1371/journal.pgen.1006343.s001

(DOCX)

S2 Table. Results from follow-up analyses to ensure that failure to replicate was not due to inclusion of outlier individuals or datasets, or suppressing covariates in the replication data.

https://doi.org/10.1371/journal.pgen.1006343.s002

(DOCX)

S1 Fig. Estimated changes in odds of schizophrenia for each 1% increase in Froh (odds ratios; asterisks) and their 95% confidence intervals (bars) across the independent replication data (colored) and for the total sample (black) from the unimputed SNP data, for ROHs defined as ≥ 2.3 Mb.

Boxes are proportional to the square root of sample sizes (also shown at the bottom). Dataset names are on the x-axis. Only one of the individual estimated odds ratios significantly differs from one (“clm2” dataset), and the overall effect (black) is not significant (β = 0.75, Z = 0.31, p = 0.76).

https://doi.org/10.1371/journal.pgen.1006343.s003

(TIFF)

S2 Fig. Estimated changes in odds of schizophrenia for each 1% increase in Froh (odds ratios; asterisks) and their 95% confidence intervals (bars) across the original PGC SCZ1 data (colored) and for the total sample (black) from the unimputed SNP data.

Boxes are proportional to the square root of sample sizes (also shown at the bottom). Dataset names are on the x-axis. (While the y-axis is cut off at 3 for clarity, it should be noted that the upper limit of the 95% confidence interval is 4.1 for the “muc” dataset and 5.4 for the “top3” dataset.) Only one of the individual estimated odds ratios significantly differs from one (the “muc” dataset), but the overall effect (black) is significant (β = 16.83, p = 0.000357).

https://doi.org/10.1371/journal.pgen.1006343.s004

(TIFF)

S3 Fig. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from unimputed PGC SCZ1 original SNP data predicting schizophrenia for different SNP thresholds of calling ROHs.

All SNP homozygosity thresholds above 40 SNPs-in-a-row were significant.

https://doi.org/10.1371/journal.pgen.1006343.s005

(TIFF)

S4 Fig. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from unimputed PGC SCZ1 original SNP data predicting schizophrenia for different Mb thresholds of calling ROHs.

All ROH Mb thresholds equal to and longer than 1 Mb were significant.

https://doi.org/10.1371/journal.pgen.1006343.s006

(TIFF)

S5 Fig. Estimated changes in odds of schizophrenia for each 1% increase in Froh (odds ratios; asterisks) and their 95% confidence intervals (bars) across the independent replication datasets (colored) and for the total sample (black) from the imputed SNP data, for ROHs defined as ≥ 65 homozygous SNPs in a row.

Boxes are proportional to the square root of sample sizes (also shown at the bottom). Dataset names are on the x-axis. Only one of the individual estimated odds ratios significantly differs from one (“clm2” dataset), and the overall effect (black) is not significant (β = 0.11, Z = 0.05, p = 0.96).

https://doi.org/10.1371/journal.pgen.1006343.s007

(TIFF)

S6 Fig. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from imputed PGC SCZ2 replication SNP data predicting schizophrenia for different SNP thresholds of calling ROHs.

No SNP homozygosity thresholds were significant.

https://doi.org/10.1371/journal.pgen.1006343.s008

(TIFF)

S7 Fig. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from imputed PGC SCZ2 replication SNP data predicting schizophrenia for different Mb thresholds of calling ROHs.

No ROH Mb thresholds were significant.

https://doi.org/10.1371/journal.pgen.1006343.s009

(TIFF)

S8 Fig. Estimated changes in odds of schizophrenia for each 1% increase in Froh (odds ratios; asterisks) and their 95% confidence intervals (bars) across the original PGC SCZ1 datasets (colored) and for the total sample (black) from the imputed SNP data, for ROHs defined as ≥ 65 homozygous SNPs in a row.

Boxes are proportional to the square root of sample sizes (also shown at the bottom). Dataset names are on the x-axis. Note that this imputed analysis was performed on the original SCZ1 individuals but within the PGC’s SCZ2 data, where some of the original individuals were divided among several new datasets. This is why some of the dataset names are slightly different from those in the original unimputed PGC SCZ1 data in S2 Fig. Only one of the individual estimated odds ratios significantly differs from one (the “munc” dataset), but the overall effect (black) is significant (β = 14.88, Z = 2.43, p = 0.02).

https://doi.org/10.1371/journal.pgen.1006343.s010

(TIFF)

S9 Fig. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from imputed PGC SCZ1 original SNP data predicting schizophrenia for different SNP thresholds of calling ROHs.

All SNP length thresholds were significant.

https://doi.org/10.1371/journal.pgen.1006343.s011

(TIFF)

S10 Fig. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from imputed PGC SCZ1 original SNP data predicting schizophrenia for different Mb thresholds of calling ROHs.

All ROH Mb thresholds were significant.

https://doi.org/10.1371/journal.pgen.1006343.s012

(TIFF)

S11 Fig. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from the combined imputed SNP data predicting schizophrenia for different SNP thresholds of calling ROHs.

SNP thresholds of 120 homozygous SNPs-in-a-row and above were significant.

https://doi.org/10.1371/journal.pgen.1006343.s013

(TIFF)

S12 Fig. Slope estimates (the change in log odds for a 1% increase in Froh; points) and their 95% confidence intervals (bars) of Froh from the combined imputed SNP data predicting schizophrenia for different Mb thresholds of calling ROHs.

All Mb thresholds ≥ 3 Mb were significant.

https://doi.org/10.1371/journal.pgen.1006343.s014

(TIFF)

Acknowledgments

The following collaborators are in the Schizophrenia Working Group of the Psychiatric Genomics Consortium: Stephan Ripke, Benjamin M. Neale, Aiden Corvin, James T. R. Walters, Kai-How Farh, Peter A. Holmans, Phil Lee, Brendan Bulik-Sullivan, David A. Collier, Hailiang Huang, Tune H. Pers, Ingrid Agartz, Esben Agerbo, Margot Albus, Madeline Alexander, Farooq Amin, Silviu A. Bacanu, Martin Begemann, Richard A. Belliveau Jr, Judit Bene, Sarah E. Bergen, Elizabeth Bevilacqua, Tim B. Bigdeli, Donald W. Black, Richard Bruggeman, Nancy G. Buccola, Randy L. Buckner, William Byerley, Wiepke Cahn, Guiqing Cai, Dominique Campion, Rita M. Cantor, Vaughan J. Carr, Noa Carrera, Stanley V. Catts, Kimberly D. Chambert, Raymond C. K. Chan, Ronald Y. L. Chen, Eric Y. H. Chen, Wei Cheng, Eric F. C. Cheung, Siow Ann Chong, C. Robert Cloninger, David Cohen, Nadine Cohen, Paul Cormican, Nick Craddock, James J. Crowley, David Curtis, Michael Davidson, Kenneth L. Davis, Franziska Degenhardt, Jurgen Del Favero, Ditte Demontis, Dimitris Dikeos, Timothy Dinan, Srdjan Djurovic, Gary Donohoe, Elodie Drapeau, Jubao Duan, Frank Dudbridge, Naser Durmishi, Peter Eichhammer, Johan Eriksson, Valentina Escott-Price, Laurent Essioux, Ayman H. Fanous, Martilias S. Farrell, Josef Frank, Lude Franke, Robert Freedman, Nelson B. Freimer, Marion Friedl, Joseph I. Friedman, Menachem Fromer, Giulio Genovese, Lyudmila Georgieva, Ina Giegling, Paola Giusti-Rodríguez, Stephanie Godard, Jacqueline I. Goldstein, Vera Golimbet, Srihari Gopal, Jacob Gratten, Lieuwe de Haan, Christian Hammer, Marian L. Hamshere, Mark Hansen, Thomas Hansen, Vahram Haroutunian, Annette M. Hartmann, Frans A. Henskens, Stefan Herms, Joel N. Hirschhorn, Per Hoffmann, Andrea Hofman, Mads V. Hollegaard, David M. Hougaard, Masashi Ikeda, Inge Joa, Antonio Julià, René S. Kahn, Luba Kalaydjieva, Sena Karachanak-Yankova, Juha Karjalainen, David Kavanagh, Matthew C. Keller, James L. Kennedy, Andrey Khrunin, Yunjung Kim, Janis Klovins, James A. 
Knowles, Bettina Konte, Vaidutis Kucinskas, Zita Ausrele Kucinskiene, Hana Kuzelova-Ptackova, Anna K. Kähler, Claudine Laurent, Jimmy Lee Chee Keong, S. Hong Lee, Sophie E. Legge, Bernard Lerer, Miaoxin Li, Tao Li, Kung-Yee Liang, Jeffrey Lieberman, Svetlana Limborska, Carmel M. Loughland, Jan Lubinski, Jouko Lönnqvist, Milan Macek Jr, Patrik K. E. Magnusson, Brion S. Maher, Wolfgang Maier, Jacques Mallet, Sara Marsal, Manuel Mattheisen, Morten Mattingsdal, Robert W. McCarley, Colm McDonald, Andrew M. McIntosh, Sandra Meier, Carin J. Meijer, Bela Melegh, Ingrid Melle, Raquelle I. Mesholam-Gately, Andres Metspalu, Patricia T. Michie, Lili Milani, Vihra Milanova, Younes Mokrab, Derek W. Morris, Ole Mors, Kieran C. Murphy, Robin M. Murray, Inez Myin-Germeys, Bertram Müller-Myhsok, Mari Nelis, Igor Nenadic, Deborah A. Nertney, Gerald Nestadt, Kristin K. Nicodemus, Liene Nikitina-Zake, Laura Nisenbaum, Annelie Nordin, Eadbhard O’Callaghan, Colm O’Dushlaine, F. Anthony O’Neill, Sang-Yun Oh, Ann Olincy, Line Olsen, Jim Van Os, Psychosis Endophenotypes International Consortium, Christos Pantelis, George N. Papadimitriou, Sergi Papiol, Elena Parkhomenko, Michele T. Pato, Tiina Paunio, Milica Pejovic-Milovancevic, Diana O. Perkins, Olli Pietiläinen, Jonathan Pimm, Andrew J. Pocklington, John Powell, Alkes Price, Ann E. Pulver, Shaun M. Purcell, Digby Quested, Henrik B. Rasmussen, Abraham Reichenberg, Mark A. Reimers, Alexander L. Richards, Joshua L. Roffman, Panos Roussos, Douglas M. Ruderfer, Veikko Salomaa, Alan R. Sanders, Ulrich Schall, Christian R. Schubert, Thomas G. Schulze, Sibylle G. Schwab, Edward M. Scolnick, Rodney J. Scott, Larry J. Seidman, Jianxin Shi, Engilbert Sigurdsson, Teimuraz Silagadze, Jeremy M. Silverman, Kang Sim, Petr Slominsky, Jordan W. Smoller, Hon-Cheong So, ChrisC. A. Spencer, Eli A. Stahl, Hreinn Stefansson, Stacy Steinberg, Elisabeth Stogmann, Richard E. Straub, Eric Strengman, Jana Strohmaier, T. 
Scott Stroup, Mythily Subramaniam, Jaana Suvisaari, Dragan M. Svrakic, Jin P. Szatkiewicz, Erik Söderman, Srinivas Thirumalai, Draga Toncheva, Sarah Tosato, Juha Veijola, John Waddington, Dermot Walsh, Dai Wang, Qiang Wang, Bradley T. Webb, Mark Weiser, Dieter B. Wildenauer, Nigel M. Williams, Stephanie Williams, Stephanie H. Witt, Aaron R. Wolen, Emily H. M. Wong, Brandon K. Wormley, Hualin Simon Xi, Clement C. Zai, Xuebin Zheng, Fritz Zimprich, Naomi R. Wray, Kari Stefansson, Peter M. Visscher, Wellcome Trust Case-Control Consortium, Rolf Adolfsson, Ole A. Andreassen, Douglas H. R. Blackwood, Elvira Bramon, Joseph D. Buxbaum, Anders D. Børglum, Sven Cichon, Ariel Darvasi, Enrico Domenici, Hannelore Ehrenreich, Tõnu Esko, Pablo V. Gejman, Michael Gill, Hugh Gurling, Christina M. Hultman, Nakao Iwata, Assen V. Jablensky, Erik G. Jönsson, Kenneth S. Kendler, George Kirov, Jo Knight, Todd Lencz, Douglas F. Levinson, Qingqin S. Li, Jianjun Liu, Anil K. Malhotra, Steven A. McCarroll, Andrew McQuillin, Jennifer L. Moran, Preben B. Mortensen, Bryan J. Mowry, Markus M. Nöthen, Roel A. Ophoff, Michael J. Owen, Aarno Palotie, Carlos N. Pato, Tracey L. Petryshen, Danielle Posthuma, Marcella Rietschel, Brien P. Riley, Dan Rujescu, Pak C. Sham, Pamela Sklar, David St Clair, Daniel R. Weinberger, Jens R. Wendland, Thomas Werge, Mark J. Daly, Patrick F. Sullivan & Michael C. O’Donovan

Author Contributions

  1. Conceptualization: MCK.
  2. Data curation: GB AB SC FD AJF JF GG SHH SH PH WM MM DM BM BMM BN IN MMN CO MR DMR DR TGS ES JS SHW.
  3. Formal analysis: ECJ MCK DWB.
  4. Funding acquisition: MCK.
  5. Methodology: MCK ECJ DWB.
  6. Supervision: MCK.
  7. Writing – original draft: ECJ MCK.
  8. Writing – review & editing: ECJ MCK PFS AA DPH MAS.

References

  1. Darwin C. The effects of cross and self fertilisation in the vegetable kingdom. J. Murray; 1876.
  2. Walsh B. Evolutionary Quantitative Genetics. Handbook of Statistical Genetics: Third Edition. 2008. p. 533–86.
  3. Charlesworth B, Charlesworth D. The genetic basis of inbreeding depression. Genet Res. 1999;74(3):329–40. pmid:10689809
  4. Abaskuliev AA, Skoblo GV. Inbreeding, endogamy and exogamy among relatives of schizophrenia patients. Genetika. 1975;11(3):145–8.
  5. Bulaeva OA, Pavlova TA, Bulaeva KB. The effect of inbreeding on accumulation of complex diseases in genetic isolates. Genetika. 2009;45(8):1096–104. pmid:19769299
  6. Mansour H, Fathi W, Klei L, Wood J, Chowdari K, Watson A, et al. Consanguinity and increased risk for schizophrenia in Egypt. Schizophr Res. 2010;120(1–3):108–12. pmid:20435442
  7. Chaleby K, Tuma TA. Cousin marriages and schizophrenia in Saudi Arabia. Br J Psychiatry. 1987;150(APR.):547–9. pmid:3664138
  8. Gindilis VM, Gaĭnullin RG, Shmaonova LM. Genetico-demographic patterns of the prevalence of various forms of endogenous psychoses. Genetika. 1989;25(4):734–43. pmid:2759447
  9. Rudan I, Rudan D, Campbell H, Carothers A, Wright A, Smolej-Narancic N, et al. Inbreeding and risk of late onset complex disease. J Med Genet. 2003 Dec 1;40(12):925–32. pmid:14684692
  10. Bittles AH, Neel JV. The costs of human inbreeding and their implications for variations at the DNA level. Nat Genet. 1994;8(2):117–21. pmid:7842008
  11. Keller MC, Visscher PM, Goddard ME. Quantification of inbreeding due to distant ancestors and its detection using dense single nucleotide polymorphism data. Genetics. 2011;189(1):237–49. pmid:21705750
  12. Vine AE, McQuillin A, Bass NJ, Pereira A, Kandaswamy R, Robinson M, et al. No evidence for excess runs of homozygosity in bipolar disorder. Psychiatr Genet. 2009;19(4):165–70. pmid:19451863
  13. Lencz T, Lambert C, DeRosse P, Burdick KE, Morgan TV, Kane JM, et al. Runs of homozygosity reveal highly penetrant recessive loci in schizophrenia. Proc Natl Acad Sci U S A. 2007;104(50):19942–7. pmid:18077426
  14. Ku CS, Naidoo N, Teo SM, Pawitan Y. Regions of homozygosity and their impact on complex diseases and traits. Human Genetics. 2011. p. 1–15.
  15. McQuillan R, Leutenegger AL, Abdel-Rahman R, Franklin CS, Pericic M, Barac-Lauc L, et al. Runs of Homozygosity in European Populations. Am J Hum Genet. 2008;83(3):359–72. pmid:18760389
  16. Kirin M, McQuillan R, Franklin CS, Campbell H, Mckeigue PM, Wilson JF. Genomic runs of homozygosity record population history and consanguinity. PLoS One. 2010;5(11).
  17. Enciso-Mora V, Hosking FJ, Houlston RS. Risk of breast and prostate cancer is not associated with increased homozygosity in outbred populations. Eur J Hum Genet. 2010;18(8):909–14. pmid:20407466
  18. Spain SL, Cazier J-B, Houlston R, Carvajal-Carmona L, Tomlinson I. Colorectal cancer risk is not associated with increased levels of homozygosity in a population from the United Kingdom. Cancer Res. 2009;69(18):7422–9. pmid:19723657
  19. Hosking FJ, Papaemmanuil E, Sheridan E, Kinsey SE, Lightfoot T, Roman E, et al. Genome-wide homozygosity signatures and childhood acute lymphoblastic leukemia risk. Blood. 2010;115(22):4472–7. pmid:20231427
  20. Keller MC, Simonson MA, Ripke S, Neale BM, Gejman PV, Howrigan DP, et al. Runs of homozygosity implicate autozygosity as a schizophrenia risk factor. PLoS Genet. 2012;8(4).
  21. Thornton A, Lee P. Publication bias in meta-analysis: Its causes and consequences. J Clin Epidemiol. 2000;53(2):207–16. pmid:10729693
  22. Ioannidis JPA. Why most published research findings are false. PLoS Med. 2005;2(8):e124. pmid:16060722
  23. Duncan LE, Keller MC. A critical review of the first 10 years of candidate gene-by-environment interaction research in psychiatry. American Journal of Psychiatry. 2011. p. 1041–9. pmid:21890791
  24. Open Science Collaboration. Estimating the reproducibility of psychological science. Science. 2015 Aug 28;349(6251).
  25. Ripke S, Neale BM, Corvin A, Walters JTR, Farh K-H, Holmans PA, et al. Biological insights from 108 schizophrenia-associated genetic loci. Nature. 2014;511:421–7. pmid:25056061
  26. Ripke S, Sanders AR, Kendler KS, Levinson DF, Sklar P, Holmans PA, et al. Genome-wide association study identifies five new schizophrenia loci. Nat Genet. 2011;43(10):969–76. pmid:21926974
  27. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904–9. pmid:16862161
  28. Abdellaoui A, Hottenga JJ, Xiao X, Scheet P, Ehli EA, Davies GE, et al. Association between autozygosity and major depression: Stratification due to religious assortment. Behav Genet. 2013;43(6):455–67. pmid:23978897
  29. Abdellaoui A, Hottenga JJ, Willemsen G, Bartels M, Van Beijsterveldt T, Ehli EA, et al. Educational attainment influences levels of homozygosity through migration and assortative mating. PLoS One. 2015;10(3).
  30. Nalls MA, Guerreiro RJ, Simon-Sanchez J, Bras JT, Traynor BJ, Gibbs JR, et al. Extended tracts of homozygosity identify novel candidate genes associated with late-onset Alzheimer’s disease. Neurogenetics. 2009;10(3):183–90. pmid:19271249
  31. Sims R, Dwyer S, Harold D, Gerrish A, Hollingworth P, Chapman J, et al. No evidence that extended tracts of homozygosity are associated with Alzheimer’s disease. Am J Med Genet Part B Neuropsychiatr Genet. 2011;156(7):764–71.
  32. Ghani M, Sato C, Lee JH, Reitz C, Moreno D, Mayeux R, et al. Evidence of recessive Alzheimer disease loci in a Caribbean Hispanic data set: genome-wide survey of runs of homozygosity. JAMA Neurol. 2013;70(10):1261–7. pmid:23978990
  33. Assié G, LaFramboise T, Platzer P, Eng C. Frequency of germline genomic homozygosity associated with cancer cases. JAMA. 2008;299(12):1437–45. pmid:18364486
  34. Ruderfer DM, Lim ET, Genovese G, Moran JL, Hultman CM, Sullivan PF, et al. No evidence for rare recessive and compound heterozygous disruptive variants in schizophrenia. Eur J Hum Genet. 2014;23(July):1–3.
  35. Heron EA, Cormican P, Donohoe G, O’Neill FA, Kendler KS, Riley BP, et al. No evidence that runs of homozygosity are associated with schizophrenia in an Irish genome-wide association dataset. Schizophr Res. 2014;154(1–3):79–82. pmid:24560374
  36. Power RA, Keller MC, Ripke S, Abdellaoui A, Wray NR, Sullivan PF, et al. A recessive genetic model and runs of homozygosity in major depressive disorder. Am J Med Genet Part B Neuropsychiatr Genet. 2014;165(2):157–66.
  37. Sullivan PF. Spurious Genetic Associations. Biol Psychiatry. 2007;61(10):1121–6. pmid:17346679
  38. Collins AL, Kim Y, Sklar P, O’Donovan MC, Sullivan PF. Hypothesis-driven candidate genes for schizophrenia compared to genome-wide association results. Psychol Med. Cambridge Univ Press; 2012;42(03):607–16.
  39. Button KS, Ioannidis JPA, Mokrysz C, Nosek BA, Flint J, Robinson ESJ, et al. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. 2013;14(5):365–76. pmid:23571845
  40. Peng RD. Reproducible research and Biostatistics. Biostatistics. Biometrika Trust; 2009;10(3):405–8. pmid:19535325
  41. Makel MC, Plucker JA, Hegarty B. Replications in psychology research: How often do they really occur? Perspect Psychol Sci. Sage Publications; 2012;7(6):537–42. pmid:26168110
  42. The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012 Nov 1;491(7422):56–65. pmid:23128226
  43. Hao K, Chudin E, McElwee J, Schadt EE. Accuracy of genome-wide imputation of untyped markers and impacts on statistical power for association studies. BMC Genet. 2009;10:27. pmid:19531258
  44. International HapMap Consortium. A haplotype map of the human genome. Nature. Nature Publishing Group; 2005;437(7063):1299–320. pmid:16255080
  45. Howrigan DP, Simonson MA, Keller MC. Detecting autozygosity through runs of homozygosity: A comparison of three autozygosity detection algorithms. BMC Genomics. 2011. p. 460. pmid:21943305
  46. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75. pmid:17701901