Abstract
Although genome-wide association studies have uncovered single-nucleotide polymorphisms (SNPs) associated with complex disease, these variants account for only a small portion of heritability. Some of this 'missing heritability' may be attributable to copy-number variants (CNVs), in particular rare CNVs, but assessing their contribution remains challenging because CNVs, particularly small variants, are difficult to genotype accurately. We report a population-based approach for the identification of CNVs that integrates data from multiple samples and platforms. Our algorithm, cnvHap, jointly learns a chromosome-wide haplotype model of CNVs and cluster-based models of allele intensity at each probe. Using data for 50 French individuals assayed on four separate platforms, we found that cnvHap correctly detected at least 14% more deleted and 50% more amplified genotypes than PennCNV or QuantiSNP, with improvements of 82% and 115%, respectively, for aberrations containing <10 probes. Combining data from multiple platforms further improved sensitivity.
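To make the modelling idea in the abstract more concrete, the following is a minimal sketch of how a hidden Markov model can segment probe intensities into copy-number states. It is a generic single-sample Viterbi decoder and not cnvHap itself: the population-based haplotype model, the per-probe cluster models and the multi-platform integration described above are not represented. The function name `viterbi_copy_number`, the state means, the standard deviation and the transition probability are all illustrative assumptions, not values from the paper.

```python
import math

# Assumed toy parameters (not taken from the paper).
COPY_NUMBERS = [1, 2, 3]                 # deletion, normal, amplification
MEAN_LRR = {1: -0.5, 2: 0.0, 3: 0.3}     # assumed mean log R ratio per state
SD_LRR = 0.15                            # assumed shared standard deviation
P_STAY = 0.99                            # assumed probability of keeping the same state


def log_gaussian(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_copy_number(lrr):
    """Return the most likely copy-number state for each probe intensity."""
    if not lrr:
        return []
    n_states = len(COPY_NUMBERS)
    log_stay = math.log(P_STAY)
    log_switch = math.log((1 - P_STAY) / (n_states - 1))

    def transition(prev, cur):
        return log_stay if prev == cur else log_switch

    # Initialise with a uniform prior over states.
    score = {
        c: -math.log(n_states) + log_gaussian(lrr[0], MEAN_LRR[c], SD_LRR)
        for c in COPY_NUMBERS
    }
    backpointers = []
    for x in lrr[1:]:
        new_score, pointer = {}, {}
        for c in COPY_NUMBERS:
            # Best predecessor state for ending in state c at this probe.
            best_prev = max(COPY_NUMBERS, key=lambda p: score[p] + transition(p, c))
            pointer[c] = best_prev
            new_score[c] = (
                score[best_prev]
                + transition(best_prev, c)
                + log_gaussian(x, MEAN_LRR[c], SD_LRR)
            )
        backpointers.append(pointer)
        score = new_score

    # Trace back from the best final state.
    state = max(COPY_NUMBERS, key=lambda c: score[c])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    path.reverse()
    return path
```

With these assumed parameters, a run of three probes near -0.5 is called as a deletion, whereas a single outlying probe is not, because the gain in emission likelihood does not outweigh the cost of two state switches. This illustrates why aberrations spanning few probes are hard to call from one sample on one platform, which is the setting in which the abstract reports the largest improvements.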
Acknowledgements
We thank D. Serre, A. Montpetit and D. Vincent for advice concerning Illumina arrays and D. Peiffer (Illumina) for providing genotype data on HapMap samples. Genome Canada and Genome Quebec funded genotyping on the Illumina Human1M platform. L.J.M.C. is funded by a Research Council UK fellowship. J.E.A. is supported by the Medical Research Council. R.G.W. is supported by Johnson & Johnson and the South East England Development Agency. J.S.E.-S.M. is supported by an Imperial College Division of Medicine PhD studentship.
Author information
Contributions
L.J.M.C. designed the project with A.I.F.B., developed the cnvHap algorithm and software, analyzed data and wrote the paper. J.E.A. ran cnvPartition, PennCNV and QuantiSNP on the data and helped write the paper. R.G.W. and J.S.E.-S.M. provided critical comments and helped to write the paper. D.J.B. provided statistical advice. R.S. provided SNP genotype data, advised on its interpretation and edited the paper. A.J.d.S. provided aCGH data and advised on its interpretation. P.F. provided the DNA samples and coordinated the SNP genotyping. A.I.F.B. designed the project with L.J.M.C., coordinated the aCGH analysis, contributed to writing the paper and oversaw the project.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–9, Supplementary Tables 1–3 and Supplementary Note 1 (PDF 1513 kb)
Supplementary Software
Software, documentation and an example. (ZIP 9805 kb)
About this article
Cite this article
Coin, L., Asher, J., Walters, R. et al. cnvHap: an integrative population and haplotype–based multiplatform model of SNPs and CNVs. Nat Methods 7, 541–546 (2010). https://doi.org/10.1038/nmeth.1466
DOI: https://doi.org/10.1038/nmeth.1466