Accurate Ensemble Prediction of Somatic Mutations with SMuRF2

  • Protocol
Variant Calling

Part of the book series: Methods in Molecular Biology (MIMB, volume 2493)

Abstract

Accurate identification of somatic mutations is crucial for discovering driver mutations in tumors. Here, we describe the updated Somatic Mutation calling method using a Random Forest (SMuRF2), an ensemble method that uses supervised machine learning to combine the predictions and auxiliary features from individual mutation callers. SMuRF2 provides an efficient workflow for predicting both somatic point mutations (single-nucleotide variants, SNVs) and small insertions/deletions (indels) in cancer genomes and exomes. We outline the updated method and provide a detailed tutorial for running SMuRF2.
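
To make the ensemble idea concrete, the following is a minimal sketch of how calls from several mutation callers could be merged into one feature row per candidate variant, which is the kind of table a supervised classifier such as a random forest would be trained on. It is not SMuRF2's code or input format: the caller names (`caller_a`, `caller_b`, `caller_c`), the single auxiliary feature (`vaf`), the derived columns, and all values are hypothetical and chosen only for illustration.

```python
# Hypothetical per-caller output: variant key -> auxiliary features reported by that caller.
# Keys are (chrom, pos, ref, alt). All names and values are made up for illustration.
CALLS = {
    "caller_a": {
        ("chr1", 100, "A", "T"): {"vaf": 0.31},
        ("chr2", 200, "G", "GA"): {"vaf": 0.12},
    },
    "caller_b": {
        ("chr1", 100, "A", "T"): {"vaf": 0.29},
    },
    "caller_c": {
        ("chr2", 200, "G", "GA"): {"vaf": 0.10},
        ("chr3", 300, "C", "G"): {"vaf": 0.05},
    },
}


def build_feature_table(calls):
    """Merge per-caller calls into one feature row per candidate variant."""
    callers = sorted(calls)
    # Union of every variant reported by at least one caller.
    variants = sorted({key for per_caller in calls.values() for key in per_caller})

    table = {}
    for key in variants:
        row = {}
        for caller in callers:
            features = calls[caller].get(key)
            # Binary prediction of this caller, plus its auxiliary feature
            # (0.0 is an assumed placeholder when the caller made no call).
            row[f"{caller}_called"] = 1 if features is not None else 0
            row[f"{caller}_vaf"] = features["vaf"] if features is not None else 0.0
        row["n_callers"] = sum(row[f"{caller}_called"] for caller in callers)
        ref, alt = key[2], key[3]
        # SNVs and indels are distinguished by comparing allele lengths.
        row["is_indel"] = int(len(ref) != len(alt))
        table[key] = row
    return table


feature_table = build_feature_table(CALLS)
for key, row in feature_table.items():
    print(key, row)
```

In a supervised setting, each row would be paired with a truth label (somatic or not) from a benchmark set, and the classifier would learn how much weight to give each caller's prediction and auxiliary features.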

Author information

Corresponding author

Correspondence to Weitai Huang.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Huang, W., Sim, N.L., Skanderup, A.J. (2022). Accurate Ensemble Prediction of Somatic Mutations with SMuRF2. In: Ng, C., Piscuoglio, S. (eds) Variant Calling. Methods in Molecular Biology, vol 2493. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2293-3_4

  • DOI: https://doi.org/10.1007/978-1-0716-2293-3_4

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2292-6

  • Online ISBN: 978-1-0716-2293-3

  • eBook Packages: Springer Protocols
