Studying Cancer Genomics Through Next-Generation DNA Sequencing and Bioinformatics

  • Protocol
Clinical Bioinformatics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1168))

Abstract

Cancer is a complex disease driven by multiple mutations acquired over the lifetime of the cancer cells. These alterations, termed somatic mutations to distinguish them from inherited germline mutations, can include single-nucleotide substitutions, insertions, deletions, copy number alterations, and structural rearrangements. A patient’s cancer can contain a combination of these aberrations, and the ability to generate a comprehensive genetic profile should greatly improve patient diagnosis and treatment. Next-generation sequencing has become the tool of choice to uncover multiple cancer mutations from a single tumor source, and the falling costs of this rapid high-throughput technology are encouraging its transition from basic research into a clinical setting. However, the detection of mutations in sequencing data is still an evolving area and cancer genomic data requires some special considerations. This chapter discusses these aspects and gives an overview of current bioinformatics methods for the detection of somatic mutations in cancer sequencing data.

Abbreviations

CNA: Copy number alterations

CNV: Copy number variants

SNV: Single-nucleotide variants

SR: Structural rearrangements

Author information

Corresponding author

Correspondence to Maria A. Doyle.

Copyright information

© 2014 Springer Science+Business Media New York

About this protocol

Cite this protocol

Doyle, M.A., Li, J., Doig, K., Fellowes, A., Wong, S.Q. (2014). Studying Cancer Genomics Through Next-Generation DNA Sequencing and Bioinformatics. In: Trent, R. (eds) Clinical Bioinformatics. Methods in Molecular Biology, vol 1168. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0847-9_6

  • DOI: https://doi.org/10.1007/978-1-4939-0847-9_6

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-0846-2

  • Online ISBN: 978-1-4939-0847-9

  • eBook Packages: Springer Protocols
