Abstract
Introduction
Variant detection protocols for clinical next-generation sequencing (NGS) need application-specific optimization. Our aim was to analyze the performance of single nucleotide variant (SNV) and copy number variant (CNV) detection programs on an NGS panel for a rare disease.
Methods
Thirty genes were sequenced in 83 patients with hereditary spastic paraplegia. The variant calls obtained with LifeScope, GATK UnifiedGenotyper and GATK HaplotypeCaller were compared with Sanger sequencing. The calling efficiency was evaluated for 187 (56 unique) SNVs and indels. Five multi-exon deletions detected by multiplex ligation-dependent probe amplification (MLPA) were assessed from the NGS panel data with ExomeDepth, panelcn.MOPS and CNVPanelizer software.
Results
All the calling algorithms consistently detected 48/51 (94%) SNVs and 1/5 (20%) indels. Two SNVs were not detected by any of the callers because of a rare reference allele, and one SNV in a low coverage region was only detected by two algorithms. Regarding CNVs, ExomeDepth accurately detected 5/5 multi-exon deletions, panelcn.MOPS 4/5 and CNVPanelizer only 3/5.
Conclusions
The calling efficiency of NGS algorithms for SNVs is influenced by variant type and coverage. NGS protocols need to account for the presence of rare variants in the reference sequence as well as for ambiguities in indel calling. CNV detection algorithms can be used to identify large deletions from NGS panel data for diagnostic applications; however, sensitivity depends on coverage, selection of the reference set and deletion size. We recommend the incorporation of several variant callers in the NGS pipeline to maximize variant detection efficiency.
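The comparison of several callers against a confirmed variant set can be illustrated with a minimal sketch. It is not the pipeline used in the study: the caller labels, the variant coordinates and the `summarize` helper are hypothetical, and variants are matched by exact (chromosome, position, reference allele, alternate allele) keys, which is an assumption that ignores differences in indel representation between callers.

```python
from typing import Dict, Set, Tuple

# A variant is keyed by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def sensitivity(calls: Set[Variant], truth: Set[Variant]) -> float:
    """Fraction of confirmed (truth) variants recovered by a call set."""
    if not truth:
        raise ValueError("truth set is empty")
    return len(calls & truth) / len(truth)


def summarize(calls_by_caller: Dict[str, Set[Variant]],
              truth: Set[Variant]) -> Dict[str, float]:
    """Per-caller sensitivity, plus consensus (all callers) and union (any caller)."""
    if not calls_by_caller:
        raise ValueError("no callers supplied")
    summary = {name: sensitivity(calls, truth)
               for name, calls in calls_by_caller.items()}
    call_sets = list(calls_by_caller.values())
    summary["consensus"] = sensitivity(set.intersection(*call_sets), truth)
    summary["union"] = sensitivity(set.union(*call_sets), truth)
    return summary


# Hypothetical toy data: five confirmed variants, three callers.
v1 = ("chr1", 100, "A", "G")
v2 = ("chr2", 200, "C", "T")
v3 = ("chr3", 300, "G", "GA")   # an insertion
v4 = ("chr4", 400, "T", "C")
v5 = ("chr5", 500, "G", "A")    # missed by every caller in this toy example
truth = {v1, v2, v3, v4, v5}

calls_by_caller = {
    "caller_a": {v1, v2, v4},
    "caller_b": {v1, v2, v3},
    "caller_c": {v1, v2},
}

if __name__ == "__main__":
    for name, value in summarize(calls_by_caller, truth).items():
        print(f"{name}: {value:.2f}")
```

In this toy data the union of the callers recovers more confirmed variants than any single caller, while a variant missed by every caller stays undetected however the call sets are combined.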
Acknowledgements
The authors are indebted to the participating patients and to the Asociación Española de Paraparesia Espástica Familiar (AEPEF). We thank the following developers of the CNV detection algorithms for their valuable help and input: Vincent Plagnol (ExomeDepth), Gundula Povysil (panelcn.MOPS), and Thomas Wolf (CNVPanelizer).
Ethics declarations
Conflict of interest and disclosures
PC, AOU, BQ, SPH, JA, MGM, SIPP, FG, JA, AC, and MJS all declare that they have no conflict of interest relevant to the content presented in this manuscript.
Funding
This project was funded by the Institute of Health Carlos III (FIS PS09/01830; PS09/01685; PS09/00839).
Ethical approval and informed consent
This study was conducted in accordance with the ethical principles of the Declaration of Helsinki and approved by the regional ethics committee Comité Autonómico de Ética de la Investigación de Galicia (CAEIG). All participants provided written informed consent.
Cite this article
Cacheiro, P., Ordóñez-Ugalde, A., Quintáns, B. et al. Evaluating the Calling Performance of a Rare Disease NGS Panel for Single Nucleotide and Copy Number Variants. Mol Diagn Ther 21, 303–313 (2017). https://doi.org/10.1007/s40291-017-0268-x
DOI: https://doi.org/10.1007/s40291-017-0268-x