The phylum Cressdnaviricota, created in 2019, includes viruses with single-stranded DNA (ssDNA) genomes and icosahedral capsids that infect diverse eukaryotes, including algae, fungi, plants, insects, and vertebrates [1]. A characteristic feature of viruses in this phylum is the presence of homologous replication-associated proteins (Reps) with an N-terminal rolling-circle replication initiation endonuclease domain of the HUH superfamily [2] and a C-terminal superfamily 3 helicase domain [3]. By contrast, the capsid proteins encoded by viruses from different families can be non-orthologous, although all cressdnaviruses for which structural information is available appear to use single jelly-roll capsid proteins for virion formation [4, 5]. The phylum consists of two classes: the class Repensiviricetes, which currently includes all fungal and plant viruses of the families Genomoviridae and Geminiviridae, and the class Arfiviricetes, which includes six virus families (Bacilladnaviridae, Circoviridae, Smacoviridae, Nanoviridae, Metaxyviridae, and Redondoviridae). However, many groups of related ssDNA viruses, informally referred to as CRESSV1 to CRESSV6, remain unclassified [6, 7].

Recently, Kinsella et al. [8] identified three groups of ssDNA viruses associated with protozoan parasites of the genera Entamoeba and Giardia. The authors suggested that viruses associated with the Entamoeba hosts could constitute two families, “Naryaviridae” and “Nenyaviridae”, whereas those associated with Giardia hosts could form the family “Vilyaviridae” [8]. The families Naryaviridae, Nenyaviridae, and Vilyaviridae are named after three rings from the Middle-earth canon [9, 10] (also known as Tolkien’s canon) [8]. Here, we report on the formal establishment of the three families and a sequence-based taxonomic framework and demarcation criteria for classification of viruses within these families.

We assembled a dataset of unclassified ssDNA virus genome sequences from the GenBank database displaying similarity to those of representative members of the originally proposed Naryaviridae, Nenyaviridae, and Vilyaviridae [8]. The Rep sequences of these 60 viruses were analyzed together with those of Bacilladnaviridae, Circoviridae, Geminiviridae, Genomoviridae, Nanoviridae, Metaxyviridae, Redondoviridae, and Smacoviridae as well as those of CRESSV1 to CRESSV6 [6, 7]. The sequences were aligned using MAFFT v7.490 [11], and the resulting alignment was trimmed with TrimAL v1.2 [12] with a gap threshold of 0.2. A maximum-likelihood phylogenetic tree was constructed using IQtree2 [13] with automatic selection of the best-fit substitution model, which was rtREV+F+R6 for this alignment. The maximum-likelihood phylogenetic analysis of the corresponding Rep proteins in the framework of other members of the phylum Cressdnaviricota confirmed that the three groups form monophyletic clades distinct from the previously classified viruses (Fig. 1). Whereas the Vilyaviridae fell within the established order Cirlivirales, the Naryaviridae and Nenyaviridae formed distinct branches within the class Arfiviricetes (Fig. 1). Thus, to bridge the gap between the family and class taxa, we created the orders Rivendellvirales and Rohanvirales, which accommodate the families Naryaviridae and Nenyaviridae, respectively. The name of the order Rivendellvirales is derived from Rivendell, an Elven refuge in a steep and hidden valley to the west of the Misty Mountains, founded in the Second Age by Elrond; Rivendell was home to a number of great Elven lords [14]. The name of the order Rohanvirales is derived from Rohan, a kingdom of the Rohirrim, bounded by the Anduin, the Misty Mountains, and Fangorn Forest, among others; once a province of Gondor, the land was given to the Men of Eotheod in return for their aid to Gondor in a battle [14].

Fig. 1
Maximum-likelihood phylogenetic tree of Rep proteins from members of the phylum Cressdnaviricota. Closely related sequence groups are collapsed into triangles, the side lengths of which are proportional to the distances between the closest and farthest leaf nodes. The Rep alignment used for tree reconstruction was taken from Krupovic et al. [1] and supplemented with sequences of members of the newly established families Naryaviridae, Nenyaviridae, and Vilyaviridae. The latter dataset included previously reported sequences [8] as well as related sequences downloaded from the GenBank database. Numbers at the nodes represent aLRT branch support. The scale bar represents the number of substitutions per site.

Viruses of the family Vilyaviridae encode capsid proteins (CPs) distantly related to those characteristic of circovirids, consistent with their placement within the order Cirlivirales. Notably, due to high sequence divergence, the similarity was only detectable when the corresponding profile hidden Markov models (HMMs) were compared using HHpred [15]. In the sequence-similarity-based network constructed using CLANS [16], the CPs of members of the families Naryaviridae and Nenyaviridae formed several mixed clusters of non-orthologous proteins (Fig. 2), consistent with the phylogenetic analysis reported previously [8]. In particular, the CPs of members of the family Naryaviridae formed four separate clusters, two of which also included nenyavirids. Profile-profile comparisons revealed a further distant relationship between these CP clusters and CPs of ssDNA viruses of the families Geminiviridae, Nanoviridae, and Redondoviridae (Fig. 2). The replacement of the capsid gene on several occasions emphasizes the importance of recombination in the evolution of the Naryaviridae and Nenyaviridae. Even viruses of the same species, as in the case of Nimphelosvirus isildur, can encode highly distinct CPs. Notably, members of both the Naryaviridae and the Nenyaviridae are associated with Entamoeba sp., and thus the interfamily gene exchange is likely to be enabled by the shared host range.
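The grouping idea behind such a similarity network can be illustrated with a simplified sketch. It assumes that pairwise P-values are already available, keeps only pairs at or below the 1e−05 cutoff given in the Fig. 2 legend, and reports connected components as clusters. This is a stand-in for illustration only: CLANS arranges sequences with a force-directed layout rather than reporting connected components, and the function name, protein labels, and P-values below are hypothetical.

```python
from typing import Dict, List, Set, Tuple

P_VALUE_CUTOFF = 1e-5  # edge threshold stated in the Fig. 2 legend


def similarity_clusters(
    proteins: List[str],
    p_values: Dict[Tuple[str, str], float],
    cutoff: float = P_VALUE_CUTOFF,
) -> List[Set[str]]:
    """Group proteins into connected components of the similarity graph.

    An edge joins two proteins when their pairwise P-value is <= cutoff.
    Every protein named in p_values must also appear in proteins.
    """
    neighbors: Dict[str, Set[str]] = {name: set() for name in proteins}
    for (a, b), p in p_values.items():
        if p <= cutoff:
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters: List[Set[str]] = []
    seen: Set[str] = set()
    for start in proteins:
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from start.
        component: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical capsid proteins and P-values, for illustration only.
example_proteins = ["cp1", "cp2", "cp3", "cp4"]
example_p_values = {
    ("cp1", "cp2"): 1e-20,
    ("cp2", "cp3"): 1e-7,
    ("cp3", "cp4"): 0.01,  # above the cutoff, so no edge is drawn
}
example_clusters = similarity_clusters(example_proteins, example_p_values)
```

In this toy input, cp4 stays on its own because its only reported similarity is weaker than the cutoff. In the sketch, a virus whose CP lands in a different component from its closest Rep relatives corresponds to the kind of capsid gene replacement discussed above.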

Fig. 2
Diversity of capsid proteins encoded by members of the families Naryaviridae, Nenyaviridae, and Vilyaviridae. Protein sequences were clustered by their pairwise sequence similarity. Lines connect sequences with P-value ≤ 1e−05. Species of the Naryaviridae (red), which form several distinct clusters, are labeled. Affiliation of the clusters to CPs of other ssDNA viruses is indicated with bold text.

To determine meaningful demarcation criteria within the families Naryaviridae, Nenyaviridae, and Vilyaviridae, we analyzed the relationships between the corresponding viruses within each of the three families by performing all-against-all genome and Rep sequence comparisons as well as phylogenetic analysis (Figs. 3, 4, 5). For species demarcation, we used 78% pairwise nucleotide sequence identity, similar to what was used for other cressdnaviricots, including genomovirids [17, 18] and smacovirids [19, 20]. Thus, all viral genomes showing sequence identity higher than 78% should be considered variant members of the existing species. Nonetheless, there may be situations where it is difficult to assign species because a particular new sequence is

  • (1) >78% identical to sequences from a particular species but <78% identical to other variants belonging to that same species;

  • (2) >78% identical to sequences from two or more different species.
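The 78% threshold is applied to pairwise nucleotide identities. As a minimal sketch, assuming two sequences that have already been aligned to the same length, identity can be computed over the columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical, and the alignment step performed by tools such as SDT is not reproduced here.

```python
SPECIES_THRESHOLD = 78.0  # percent pairwise nucleotide identity, from the text


def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences, for illustration only: 9 ungapped columns, 8 matches.
identity = pairwise_identity("ATGCATGC-A", "ATGGATGCTA")
same_species = identity > SPECIES_THRESHOLD
```

Excluding gapped columns is one common convention; counting gaps as mismatches would give lower values, so the convention should be stated whenever identities are compared against the threshold.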

Fig. 3
Genera and species of the family Vilyaviridae. (A) Pairwise all-against-all comparison of the vilyavirid genomes. Genome maps are shown on the left with amino acid identity values shown for the open reading frames generated using Clinker [44]. A heat map showing the corresponding identity values is shown on the right. (B) Maximum-likelihood tree of the family Vilyaviridae (left) and a heat map showing pairwise comparison of the Rep amino acid sequences (right). Species belonging to the same genus are indicated with the same color; the genera are listed on the right. The phylogeny was inferred using IQtree2 [13], with automatic selection of the best-fit substitution model for a given alignment, which was Q.pfam+F+I+G4. Numbers at the nodes represent aLRT branch support. The cyan line shows the established demarcation of genera. The pairwise identities were calculated using SDT [45].

Fig. 4
Genera and species of the family Naryaviridae. (A) Pairwise all-against-all comparison of the naryavirid genomes. Genome maps are shown on the left with amino acid identity values shown for the open reading frames generated using Clinker [44]. A heat map showing the corresponding identity values is shown on the right. (B) Maximum-likelihood tree of the family Naryaviridae (left) and a heat map showing pairwise comparison of the Rep amino acid sequences (right). Species belonging to the same genus are indicated with the same color; the genera are listed on the right. The phylogeny was inferred using IQtree2 [13], with automatic selection of the best-fit substitution model for a given alignment, which was LG+G4. Numbers at the nodes represent aLRT branch support. The cyan line shows a proposed demarcation of genera. The pairwise identities were calculated using SDT [45].

Fig. 5
Genera and species of the family Nenyaviridae. (A) Pairwise all-against-all comparison of the nenyavirid genomes. Genome maps are shown on the left with amino acid identity values shown for the open reading frames generated using Clinker [44]. A heat map showing the corresponding identity values is shown on the right. (B) Maximum-likelihood tree of the family Nenyaviridae (left) and a heat map showing pairwise comparison of the Rep amino acid sequences (right). Species belonging to the same genus are indicated with the same color; the genera are listed on the right. The phylogeny was inferred using IQtree2 [13], with automatic selection of the best-fit substitution model for a given alignment, which was LG+G4. Numbers at the nodes represent aLRT branch support. The cyan line shows a proposed demarcation of genera. The pairwise identities were calculated using SDT [45].

To resolve the above conflicts, we suggest adopting an approach similar to that proposed for alphasatellites [21], circoviruses [22], geminiviruses [23, 24], and genomoviruses [18]. To resolve conflict 1, we suggest that the new virus be classified within any species in which it shares >78% sequence identity with any one variant already classified as belonging to that species, even if it is <78% identical to other viruses within that species. To resolve conflict 2, we suggest that the new virus be considered a member of the species with whose members it shares the highest degree of sequence similarity.
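The two resolution rules can be sketched as a small decision function. This is an illustrative sketch rather than an official ICTV procedure: it assumes that pairwise identities between the new genome and every classified variant have already been computed, it uses the highest pairwise identity as a proxy for "highest degree of sequence similarity", and the function and species names are hypothetical.

```python
from typing import Dict, Optional

SPECIES_THRESHOLD = 78.0  # percent pairwise nucleotide identity, from the text


def assign_species(identities: Dict[str, Dict[str, float]]) -> Optional[str]:
    """Pick a species for a new genome, or return None if no species qualifies.

    identities maps species name -> {variant name -> percent identity
    between the new genome and that variant}.
    """
    best_species: Optional[str] = None
    best_identity = SPECIES_THRESHOLD
    for species, variants in identities.items():
        # Conflict 1: a single variant above the threshold is sufficient,
        # so only the best-matching variant of each species matters.
        top = max(variants.values(), default=0.0)
        # Conflict 2: among qualifying species, the highest identity wins.
        # On an exact tie, the species seen first is kept.
        if top > best_identity:
            best_species, best_identity = species, top
    return best_species


# Hypothetical identities for one new genome, for illustration only.
example = {
    "species_A": {"variant_1": 80.1, "variant_2": 75.0},  # conflict 1
    "species_B": {"variant_1": 79.0},                     # conflict 2 with A
    "species_C": {"variant_1": 60.0},
}
assigned = assign_species(example)
```

Here the new genome is assigned to species_A: it exceeds 78% identity with one variant of that species (even though it is below 78% with another), and that identity is higher than its best match in species_B. A return value of None indicates that the genome would found a new species under the 78% criterion.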

Given the interfamilial recombination observed within the Naryaviridae and Nenyaviridae, which produces genomes with diverse combinations of CPs and Reps (Fig. 2), we chose to define genera based on the cohesive phylogenetic lineages of the Rep, because this protein is generally more conserved within other ssDNA virus families than the CP and is the only protein shared by all members of the phylum Cressdnaviricota [1]. Notably, pairwise comparison of the Rep amino acid sequences fully recapitulated the Rep-phylogeny-based classification (Figs. 3, 4, 5). Using the criteria outlined above, the family Vilyaviridae was divided into 12 genera with 18 species (Fig. 3; Tables 1, 2), the family Naryaviridae was divided into four genera with five species (Fig. 4; Tables 1, 2), and the family Nenyaviridae was divided into five genera with six species (Fig. 5; Tables 1, 2).

Table 1 Etymology of the genus names in the families Naryaviridae, Nenyaviridae, and Vilyaviridae
Table 2 Summary of the virus classification in the families Naryaviridae, Nenyaviridae, and Vilyaviridae with binomial species names

Since the families Naryaviridae, Nenyaviridae, and Vilyaviridae are named after three rings from the Middle-earth canon, we followed the Tolkien Middle-earth [25] theme for genus and species names (Table 1). Furthermore, for species names, we used a binomial format with the “genus name + free-form epithet” [26, 27], where epithets are derived from various characters from the Tolkien canon. All species, genera, and families and their members are listed in Table 2.

The taxonomic changes described above were ratified by the International Committee on Taxonomy of Viruses (ICTV) in 2022 [46]. With the creation of the orders Rivendellvirales and Rohanvirales and the families Naryaviridae, Nenyaviridae, and Vilyaviridae, the phylum Cressdnaviricota now includes eight orders and 11 families. Yet, many more groups of ssDNA viruses remain to be discovered and classified, including the previously recognized CRESSV1 to CRESSV6. Notably, CRESSV2 [1, 6, 7] forms a sister group to the Nenyaviridae in the Rep phylogeny (Fig. 1) and, once formally classified, is likely to represent another family within the order Rohanvirales, whereas CRESSV1 and CRESSV3 [1, 6, 7] fall within the order Cirlivirales, together with the Vilyaviridae and Circoviridae. The families Naryaviridae and Nenyaviridae highlight the chimerism of certain ssDNA virus genomes, which was described previously among the ssDNA viruses known as ‘cruciviruses’ [28–34]. We chose here to classify naryavirids and nenyavirids based on their Rep phylogeny, whereas the cruciviruses, as a group, are united by homologous tombusvirus-like CPs and encode non-orthologous Reps, which fall into different clades of ssDNA viruses. It remains to be decided what the best approach is for classifying cruciviruses and other viruses with highly chimeric genomes.