Abstract
Whole-genome sequencing (WGS) is used to determine the genetic composition of an organism. This fast-moving field is continually evolving through technical advancements and the development of new bioinformatic tools for analyzing genomic data; however, the basic principles and processes for defining and processing high-quality genome sequence information remain unchanged. Here, we introduce key considerations and describe commonly used bioinformatic steps, from processing raw genome sequence data and generating genome assemblies through to basic population genomic analysis.
Acknowledgments
This work was supported by Australian National Health and Medical Research Council (NHMRC) project grants (#1130455, #1165876 and #1098319). S.Y.C.T. is an NHMRC Career Development Fellow (#1145033). M.R.D. is a University of Melbourne C.R. Roper Fellow.
Copyright information
© 2020 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Lacey, J.A., James, T.B., Tong, S.Y.C., Davies, M.R. (2020). Whole Genome Sequence Analysis and Population Genomics of Group A Streptococci. In: Proft, T., Loh, J. (eds) Group A Streptococcus. Methods in Molecular Biology, vol 2136. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0467-0_7
DOI: https://doi.org/10.1007/978-1-0716-0467-0_7
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0466-3
Online ISBN: 978-1-0716-0467-0
eBook Packages: Springer Protocols