On the Identification of Clinically Relevant Bacterial Amino Acid Changes at the Whole Genome Level Using Auto-PSS-Genome

  • Original research article
Interdisciplinary Sciences: Computational Life Sciences

Abstract

Clinically relevant bacterial amino acid changes can be identified with methods that detect genes showing positively selected amino acid sites (PSS). Such analyses are time consuming, however, and the frequency of genes showing evidence for PSS can be low. A pipeline that quickly and efficiently identifies the set of genes showing PSS is therefore of interest. Here, we present Auto-PSS-Genome, a Compi-based pipeline distributed as a Docker image that automates the identification of genes showing PSS using three different methods, namely codeML, FUBAR, and omegaMap. Auto-PSS-Genome accepts as input a set of FASTA files, one per genome, each containing all coding sequences, thus minimizing the work needed to conduct PSS analyses. The pipeline identifies orthologous gene sets and corrects for multiple possible problems in the input FASTA files that may prevent the automated identification of genes showing PSS. A FASTA file containing all coding sequences can also be given as an external global reference, which eases the comparison of results across species when gene names differ. In this work, we use Auto-PSS-Genome to analyse Mycobacterium leprae, which causes leprosy, and the closely related species M. haemophilum, which mainly causes ulcerating skin infections and arthritis in severely immunocompromised persons, and cervical and perihilar lymphadenitis in children. The genes identified in these two species as showing PSS may be partially responsible for virulence and drug resistance.
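
The abstract does not list which input problems the pipeline corrects. As a rough illustration only, the Python sketch below shows the kind of sanity checks one might run on a per-genome FASTA file of coding sequences before such an analysis. It is not the pipeline's own code: the function names (`parse_fasta`, `cds_problems`), the specific checks, and the set of accepted start codons (ATG, GTG, TTG, assumed here as common bacterial starts) are all assumptions made for the example.

```python
from typing import Dict, List

# Assumptions for this sketch, not taken from the pipeline itself.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {identifier: upper-case sequence}."""
    records: Dict[str, List[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the identifier.
            header = line[1:].split()[0]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def cds_problems(seq: str) -> List[str]:
    """Return a list of hypothetical problems found in one coding sequence."""
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3")
    if seq[:3] not in START_CODONS:
        problems.append("no recognised start codon")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # A stop codon anywhere except the final position breaks the reading frame.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("in-frame stop codon before the end")
    if set(seq) - set("ACGT"):
        problems.append("non-ACGT characters")
    return problems


if __name__ == "__main__":
    example = ">geneA demo\nATGAAATTT\nTAA\n>geneB\nATGTAAGGG\n>geneC\nCTGAAAT\n"
    for name, seq in parse_fasta(example).items():
        print(name, cds_problems(seq) or "ok")
```

Sequences flagged this way would need to be fixed or excluded before codon-based methods such as codeML, FUBAR, or omegaMap can be applied, since these methods assume an intact reading frame.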

Notes

  1. https://www.sing-group.org/compihub/explore/5faa52ccf05e940c9c2762e4.

  2. https://github.com/pegi3s/auto-pss-genome.

  3. https://hub.docker.com/r/pegi3s/auto-pss-genome.

  4. http://sing-group.org/compihub/explore/5f588ccb407682001ad3a1d5.

  5. https://github.com/pegi3s/check-cds.

  6. https://hub.docker.com/r/pegi3s/check-cds.

  7. https://sing-group.org/compihub/explore/5fa91806407682001ad3a1e9.

  8. https://www.maths.otago.ac.nz/~dbryant/software/PhiPack.tar.

  9. https://github.com/pegi3s/ipssa.

  10. https://hub.docker.com/r/pegi3s/ipssa.

  11. https://pegi3s.github.io/dockerfiles/.

Acknowledgements

This work was funded by National Funds through FCT—Fundação para a Ciência e a Tecnologia, I.P., under the project UIDB/04293/2020. The SING group thanks the CITI (Centro de Investigación, Transferencia e Innovación) from the University of Vigo for hosting its IT infrastructure. This work was partially supported by the Consellería de Educación, Universidades e Formación Profesional (Xunta de Galicia) under the scope of the strategic funding ED431C2018/55-GRC Competitive Reference Group.

Author information

Corresponding author

Correspondence to Jorge Vieira.

Ethics declarations

Conflict of interests

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 617 kb)

About this article

Cite this article

López-Fernández, H., Vieira, C.P., Ferreira, P. et al. On the Identification of Clinically Relevant Bacterial Amino Acid Changes at the Whole Genome Level Using Auto-PSS-Genome. Interdiscip Sci Comput Life Sci 13, 334–343 (2021). https://doi.org/10.1007/s12539-021-00439-2

  • DOI: https://doi.org/10.1007/s12539-021-00439-2
