Finding cancer driver mutations in the era of big data research

  • Review
  • Biophysical Reviews

Abstract

In the last decade, the costs of genome sequencing have decreased considerably. The commencement of large-scale cancer sequencing projects has enabled cancer genomics to join the big data revolution. One of the challenges still facing cancer genomics research is determining which mutations are the drivers in an individual cancer, as these constitute only a small subset of the overall mutation profile of a tumour. Focusing primarily on somatic single nucleotide mutations in this review, we consider both coding and non-coding driver mutations, and discuss how such mutations might be identified from cancer sequencing datasets. We describe some of the tools and databases that are available for the annotation of somatic variants and the identification of cancer driver genes. We also address the use of genome-wide variation in mutation load to establish background mutation rates against which driver mutations under positive selection can be identified. Finally, we describe the ways in which mutational signatures can act as clues for the identification of cancer drivers, as these mutations may cause, or arise from, certain mutational processes. By defining the molecular changes responsible for driving cancer development, new cancer treatment strategies may be developed or novel preventative measures proposed.

Author information

Corresponding author

Correspondence to Rebecca C. Poulos.

Ethics declarations

Funding information

R.C.P. is supported by an Australian Government Research Training Program Scholarship. J.W.H.W. is supported by an Australian Research Council Future Fellowship (FT130100096) and a National Health and Medical Research Council Project Grant (APP1119932).

Conflicts of interest

Rebecca C. Poulos declares that she has no conflict of interest. Jason W.H. Wong declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

This article is part of a Special Issue on ‘Big Data’ edited by Joshua WK Ho and Eleni Giannoulatou.

Cite this article

Poulos, R.C., Wong, J.W.H. Finding cancer driver mutations in the era of big data research. Biophys Rev 11, 21–29 (2019). https://doi.org/10.1007/s12551-018-0415-6

  • DOI: https://doi.org/10.1007/s12551-018-0415-6
