Abstract
The transfer of genetic material between viruses and eukaryotic cells is pervasive. Somatic integrations of DNA viruses and retroviruses have been linked to persistent viral infection and genotoxic effects. Integrations into germline cells, referred to as Endogenous Viral Elements (EVEs), can be co-opted for host functions. Besides DNA viruses and retroviruses, EVEs can also derive from nonretroviral RNA viruses, and such EVEs have often been observed in piRNA clusters. Here, we describe a bioinformatic framework to annotate EVEs in a genome assembly, to study their widespread occurrence and polymorphism, and to identify sample-specific viral integrations using whole genome sequencing data.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Palatini, U., Pischedda, E., Bonizzoni, M. (2022). Computational Methods for the Discovery and Annotation of Viral Integrations. In: Parrish, N.F., Iwasaki, Y.W. (eds) piRNA. Methods in Molecular Biology, vol 2509. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2380-0_18
DOI: https://doi.org/10.1007/978-1-0716-2380-0_18
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2379-4
Online ISBN: 978-1-0716-2380-0
eBook Packages: Springer Protocols