
Computational Methods for the Discovery and Annotation of Viral Integrations


Part of the book series: Methods in Molecular Biology (MIMB, volume 2509)

Abstract

The transfer of genetic material between viruses and eukaryotic cells is pervasive. Somatic integrations of DNA viruses and retroviruses have been linked to persistent viral infection and genotoxic effects. Integrations into germline cells, referred to as Endogenous Viral Elements (EVEs), can be co-opted for host functions. Besides DNA viruses and retroviruses, EVEs can also derive from nonretroviral RNA viruses, and such EVEs have often been observed in piRNA clusters. Here, we describe a bioinformatic framework to annotate EVEs in a genome assembly, study their widespread occurrence and polymorphism, and identify sample-specific viral integrations using whole genome sequencing data.

Author information

Corresponding author

Correspondence to Mariangela Bonizzoni.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Palatini, U., Pischedda, E., Bonizzoni, M. (2022). Computational Methods for the Discovery and Annotation of Viral Integrations. In: Parrish, N.F., Iwasaki, Y.W. (eds) piRNA. Methods in Molecular Biology, vol 2509. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2380-0_18

  • DOI: https://doi.org/10.1007/978-1-0716-2380-0_18

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2379-4

  • Online ISBN: 978-1-0716-2380-0

  • eBook Packages: Springer Protocols
