The microbial pan-genome
Introduction to the pan-genome
Ten years after the first genome sequence of a free-living organism was published, public databases contain 239 complete bacterial genomes. However, as shown in Table 1, only one genome per bacterial species has been sequenced in 83% of the cases, and only two in a further 8%. In a recent study [1••], eight genomes representing the serogroup (see Glossary) diversity of group B Streptococcus (GBS) strains were analyzed to determine how many genomes are needed to fully describe a
A large microbial gene pool driving evolution
Much indirect evidence had already hinted at the concept of the pan-genome, even before it was properly defined by mathematical quantification [1••]. Several studies of subtractive hybridization and comparative genome hybridization (CGH) using multiple isolates of the same species had shown that bacterial species such as Helicobacter pylori, Staphylococcus aureus and Escherichia coli display an extensive genetic diversity, with an average of 20–35% of genes being specific for a single strain [2
Core and dispensable genes
In general, the core genome includes all genes responsible for the basic aspects of the biology of a species and its major phenotypic traits. By contrast, dispensable genes contribute to the species diversity and might encode supplementary biochemical pathways and functions that are not essential for bacterial growth but which confer selective advantages, such as adaptation to different niches, antibiotic resistance, or colonization of a new host. Such genes are generally clustered on large
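The distinction between core and dispensable genes can be illustrated with plain set operations. The following is a minimal sketch, not the method of [1••]: it assumes each strain has already been reduced to a set of gene-family identifiers, and the strain names, gene labels and the `partition_pan_genome` helper are all hypothetical.

```python
from typing import Dict, Set, Tuple


def partition_pan_genome(
    strains: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, dispensable) gene sets for one species.

    `strains` maps a strain name to the set of gene families it carries.
    """
    if not strains:
        raise ValueError("need at least one strain")
    gene_sets = list(strains.values())
    pan = set().union(*gene_sets)          # every gene seen in any strain
    core = set.intersection(*gene_sets)    # genes shared by all strains
    dispensable = pan - core               # genes missing from at least one strain
    return pan, core, dispensable


# Hypothetical toy data: three strains, six gene families.
strains = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g3", "g5"},
    "strain_C": {"g1", "g2", "g6"},
}

pan, core, dispensable = partition_pan_genome(strains)
print(sorted(core))         # ['g1', 'g2']
print(sorted(dispensable))  # ['g3', 'g4', 'g5', 'g6']
```

In this toy case `g4`, `g5` and `g6` are each found in a single strain, which is the kind of strain-specific gene the hybridization studies mentioned earlier were counting. Real analyses also need an ortholog-clustering step to decide when two genes count as "the same", which this sketch takes as given.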
Serotypes and sequence types do not correlate with genomic diversity
Classical methods to catalogue bacterial species are based on knowledge of convenient phenotypic traits. The most popular is the agglutination of bacterial cells by specific antisera against the capsular polysaccharide surrounding many pathogens. For a variety of encapsulated bacteria, this method has been widely used for epidemiological studies and vaccine design, on the assumption that all strains belonging to the same serogroup are similar. More recently, techniques such as multilocus enzyme
Challenging the concept of species
Species can have an open or a closed pan-genome. An open pan-genome is typical of those species that colonize multiple environments and have multiple ways of exchanging genetic material. Streptococci, Meningococci, H. pylori, Salmonellae and E. coli have these properties and are likely to have an open pan-genome. By contrast, other species such as B. anthracis, Mycobacterium tuberculosis and Chlamydia trachomatis, which are known to be more conserved, live in isolated niches with limited access
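One informal way to picture the open/closed distinction is to count how many previously unseen genes each newly sequenced genome contributes. The sketch below is a toy illustration under stated assumptions, not the mathematical quantification used in [1••]: genomes are represented as sets of hypothetical gene labels, they are added in one fixed order, and the function name `new_genes_per_genome` is invented for this example.

```python
from typing import List, Set


def new_genes_per_genome(genomes: List[Set[str]]) -> List[int]:
    """Count genes not seen in any earlier genome, in the order given."""
    seen: Set[str] = set()
    counts: List[int] = []
    for genes in genomes:
        counts.append(len(genes - seen))  # genes new to the growing pan-genome
        seen |= genes                     # extend the pan-genome
    return counts


# Hypothetical species whose strains keep contributing new genes.
open_like = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "b", "e"}, {"a", "b", "f"}]
# Hypothetical species whose strains add nothing after the first genome.
closed_like = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}]

print(new_genes_per_genome(open_like))    # [3, 1, 1, 1]
print(new_genes_per_genome(closed_like))  # [3, 0, 0, 0]
```

A count that stays above zero as genomes are added is the behaviour associated with an open pan-genome, while a count that drops to zero suggests a closed one. Because the result depends on the order in which genomes are added, a more careful analysis would average over many orderings and fit a trend rather than read off a single run.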
Conclusions and practical implications
The need to sequence multiple genomes from each species to better understand the diversity of bacterial species is not just a theoretical exercise. Recently, it has been shown that the design of a universal vaccine against GBS was only possible using dispensable genes [26•]. In addition, sequencing of multiple genomes was instrumental in discovering the presence of the pilus in GBS, group A Streptococcus, and Pneumococcus, an essential virulence factor that had been missed by all conventional
References and recommended reading
Papers of particular interest, published within the annual period of review, have been highlighted as:
• of special interest
•• of outstanding interest
Acknowledgements
The authors would like to thank Michael Cieslewicz, Antonello Covacci and John Telford for their contribution to the pan-genome concept. They are also grateful to the IRIS Bioinformatic group, to the TIGR information technology and database server groups led by Vadim Sapiro and Michael Heaney, respectively, and to Giorgio Corsi for artwork.
Glossary
- Core genome: the pool of genes shared by all the strains of the same bacterial species.
- Dispensable genome: the pool of genes present in some, but not all, strains of the same bacterial species.
- Lateral gene transfer: mechanism by which an individual of one species transfers genetic material (i.e. DNA) to an individual of a different species.
- Pan-genome: the global gene repertoire of a bacterial species (core genome + dispensable genome).
- Serogroup: group of related bacterial strains characterized by the
References (26)
- et al. Lateral gene transfer: when will adolescence end? Mol Microbiol (2003)
- Tettelin H, Masignani V, Cieslewicz MJ, Donati C, Medini D, Ward NL, Angiuoli SV, Crabtree J, Jones A, Durkin AS et...
- et al. Evolutionary genomics of Staphylococcus aureus: insights into the origin of methicillin-resistant strains and the toxic shock syndrome epidemic. Proc Natl Acad Sci USA (2001)
- et al. Whole genome comparison of Campylobacter jejuni human isolates using a low-cost microarray reveals extensive genetic diversity. Genome Res (2001)
- et al. Extensive genomic diversity in pathogenic Escherichia coli and Shigella strains revealed by comparative genomic hybridization microarray. J Bacteriol (2004)
- et al. Environmental genome shotgun sequencing of the Sargasso sea. Science (2004)
- et al. Diversity of the human intestinal microbial flora. Science (2005)
- et al. Computational improvements reveal great bacterial diversity and high metal toxicity in soil. Science (2005)
- et al. Analysis of bacterial communities in heavy metal-contaminated soils at different levels of resolution. FEMS Microbiol Ecol (1999)
- et al. Exploring microbial diversity — a vast below. Science (2005)
- Genotypic diversity within a natural coastal bacterioplankton population. Science
- Horizontal gene transfer: a critical view. Proc Natl Acad Sci USA
- Horizontal gene transfer: the path to maturity. Mol Microbiol