Bioinformatics Analysis of Sequence Data

  • Chapter
Molecular Pathology in Cancer Research

Abstract

Bioinformatics is the application of mathematics, statistics and computer science to biological data. In this chapter, we introduce this discipline and describe approaches to basic analyses of genomic DNA and RNA data.

Author information

Corresponding author

Correspondence to Anthony T. Papenfuss.

Copyright information

© 2016 Springer Science+Business Media LLC

About this chapter

Cite this chapter

Papenfuss, A.T., Cameron, D., Schroeder, J., Vergara, I. (2016). Bioinformatics Analysis of Sequence Data. In: Lakhani, S., Fox, S. (eds) Molecular Pathology in Cancer Research. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-6643-1_14
