The family Genomoviridae includes viruses with small circular single-stranded (ss) DNA genomes (~1.8–2.4 kb) encoding a rolling-circle replication initiation protein (Rep) and a capsid protein (CP) in an ambisense orientation [35]. Whereas the genomoviral CP is not recognizably similar at the sequence level to the CPs of other known viruses, the Rep is homologous to those of other eukaryotic ssDNA viruses and is most similar to those of plant viruses of the family Geminiviridae, sharing several unique sequence motifs and forming a sister group in phylogenetic analyses [21,22,23]. Accordingly, the families Genomoviridae and Geminiviridae were included in the order Geplafuvirales [36]. All eukaryotic ssDNA viruses encoding these related Reps, informally referred to as the CRESS DNA viruses [57, 82], were recently officially unified in the phylum Cressdnaviricota [36, 72].

The founding member of the Genomoviridae [35], Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1 (SsHADV-1), infects the phytopathogenic fungus Sclerotinia sclerotiorum [79] but can also replicate in its transmission vector, the mycophagous insect Lycoriella ingenua [43]. However, the vast majority of genomoviruses have been discovered by metagenomics in diverse samples (see Supplementary Table S1), and the real extent of their host range remains unknown. In 2017, a sequence-based taxonomic framework was established for genomovirus classification [70]. In particular, 78% genome-wide pairwise identity was chosen as a species demarcation threshold, whereas Rep sequence phylogeny was used to define genera. At the time, the family consisted of 121 members, which were classified based on genome sequences into 73 species divided into nine genera: Gemycircularvirus (43 species and 73 members), Gemyduguivirus (1 species and 1 member), Gemygorvirus (5 species and 9 members), Gemykibivirus (16 species and 29 members), Gemykolovirus (2 species and 3 members), Gemykrogvirus (3 species and 3 members), Gemykroznavirus (1 species and 1 member), Gemytondvirus (1 species and 1 member), and Gemyvongvirus (1 species and 1 member).
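The genus-level counts listed above can be cross-checked against the stated family totals. The following is a minimal sketch that only re-adds the numbers given in the text; the dictionary and function names are illustrative and do not come from the original framework.

```python
# Genus -> (number of species, number of members), as listed in the text
# for the 2017 classification framework.
GENERA_2017 = {
    "Gemycircularvirus": (43, 73),
    "Gemyduguivirus": (1, 1),
    "Gemygorvirus": (5, 9),
    "Gemykibivirus": (16, 29),
    "Gemykolovirus": (2, 3),
    "Gemykrogvirus": (3, 3),
    "Gemykroznavirus": (1, 1),
    "Gemytondvirus": (1, 1),
    "Gemyvongvirus": (1, 1),
}


def totals(genera):
    """Return (number of genera, total species, total members)."""
    species = sum(s for s, _ in genera.values())
    members = sum(m for _, m in genera.values())
    return len(genera), species, members


print(totals(GENERA_2017))  # (9, 73, 121)
```

The per-genus figures add up to nine genera, 73 species, and 121 members, in agreement with the totals stated for the family at that time.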

In the period since the establishment of the family Genomoviridae [35] and the first report on the classification of the then-known genomoviruses (n = 121) [70], ~420 new complete genome sequences of genomoviruses had been deposited in the GenBank database as of May 2020, including both virus isolates and viruses discovered by metagenomics [2, 3, 6,7,8,9, 11,12,13,14,15,16, 18, 24, 26,27,28,29,30,31,32,33,34, 37,38,39,40,41,42, 44,45,46, 48, 50,51,52,53,54,55,56, 58, 59, 61, 62, 64,65,66,67,68,69, 71, 73,74,75,76,77,78,79,80,81, 83]. Furthermore, the International Committee on Taxonomy of Viruses (ICTV) has recently adopted a free-form binomial species nomenclature, whereby virus species names must consist of two words, the first being the genus name and the second being a free-form species epithet, which can consist of Latin letters and/or Arabic numerals [60]. All existing species that currently do not conform to this binomial format have to be renamed before 2023. Here, we report on the classification of the new genomoviruses as well as on other taxonomic changes in the family Genomoviridae, which were approved by the ICTV following the annual ratification vote in March 2021.

The first notable change implemented in the Genomoviridae taxonomy was the adoption of the binomial species nomenclature for all 73 existing species. The correspondence between the old and new binomial species names is shown in Table 1. We note that the names of viruses that are included in the corresponding species are not affected. For instance, the virus name Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1 remains unchanged, although the species name Sclerotinia gemycircularvirus 1 has been changed to Gemycircularvirus sclero1.
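The two-word format can be illustrated with a simple pattern check. This is only a sketch under simplifying assumptions: it requires a capitalised genus name ending in "virus", followed by an epithet made of Latin letters and/or Arabic numerals, and it does not attempt to capture the full ICTV naming rules. The names `BINOMIAL` and `is_binomial` are hypothetical helpers.

```python
import re

# Assumption: the genus is a single capitalised word ending in "virus"; the
# epithet is a non-empty run of Latin letters and/or Arabic numerals.
BINOMIAL = re.compile(r"[A-Z][a-z]+virus [A-Za-z0-9]+")


def is_binomial(name: str) -> bool:
    """Return True if `name` matches the simplified two-word pattern."""
    return BINOMIAL.fullmatch(name) is not None


# New-style species names from the text pass the check...
print(is_binomial("Gemycircularvirus sclero1"))
print(is_binomial("Gemytripvirus fugra1"))
# ...whereas the old three-word species name does not.
print(is_binomial("Sclerotinia gemycircularvirus 1"))
```

Under these assumptions, the first two names are accepted and the old species name is rejected because it consists of three words and does not begin with the genus name.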

Table 1 Nomenclatural and taxonomic changes in the family Genomoviridae

The second taxonomic development involves the classification of ~420 new genomoviruses. We used the criteria of Rep amino-acid-based phylogeny (Fig. 1) to assign genomoviruses to genera, as outlined by Varsani and Krupovic [70]. By contrast, the recently isolated Fusarium graminearum gemytripvirus 1 (FgGMTV1), infecting Fusarium graminearum, a fungal plant pathogen with worldwide distribution that causes Fusarium head blight (FHB) disease in wheat and barley [39], formed a separate branch in the Rep phylogeny (Fig. 1). Notably, unlike other members of the Genomoviridae, which are monopartite, FgGMTV1 contains three genomic segments, each encoding a single protein [39]: DNA-A encodes a Rep protein; DNA-B encodes a genomovirus-like CP; and DNA-C encodes a protein of unknown function. DNA-A and DNA-B depend on each other for their replication, whereas DNA-C relies on DNA-A and DNA-B for replication and appears to enhance virus pathogenesis and transmission via conidia as well as accumulation of viral DNA in infected fungi [39]. Phylogenetic analysis suggests that the multipartite genome of FgGMTV1 has evolved from a monopartite genome of an ancestral genomovirus. Thus, based on the Rep phylogeny and its multipartite genome organization, FgGMTV1 has been classified as a member of a new species, Gemytripvirus fugra1, within a new genus, Gemytripvirus (gemini-like myco-infecting tripartite virus) [39].

Fig. 1

Maximum-likelihood phylogenetic tree of Rep amino acid sequences of 545 genomoviruses together with a subset of geminiviruses and an unclassified virus (MK032746) that are distantly related. The tree is rooted with geminivirus sequences (green) and that of MK032746, which is most closely related to classified members of the family Genomoviridae. Clades corresponding to different genomovirus genera are colored blue. The Rep sequence alignment was constructed with MAFFT [20] and trimmed with trimAl [1] using the gappyout option. The final alignment contained 435 amino acid sites and was used to construct a maximum-likelihood phylogenetic tree using IQ-TREE [47]. The best-fitting model, determined by ModelFinder [19], was LG+F+R9. Numbers at the nodes represent bootstrap support values (%).
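The alignment, trimming, and tree-building steps named in the legend can be sketched as a short command pipeline. The sketch below rests on several assumptions that are not stated in the text: the MAFFT `--auto` strategy, the executable names (`mafft`, `trimal`, `iqtree`), the input file name, and the intermediate file names are all illustrative. Only the `-gappyout` option and the LG+F+R9 model are taken from the legend, and bootstrap settings are omitted because they are not specified.

```python
import subprocess
from pathlib import Path


def build_commands(fasta, model="LG+F+R9"):
    """Return a list of (argv, stdout_path) pairs for the three steps.

    stdout_path is set only for MAFFT, which writes its alignment to stdout.
    """
    fasta = Path(fasta)
    aln = fasta.with_name(fasta.stem + ".aln.fasta")          # hypothetical name
    trimmed = fasta.with_name(fasta.stem + ".trimmed.fasta")  # hypothetical name
    return [
        # Assumption: MAFFT run with automatic strategy selection.
        (["mafft", "--auto", str(fasta)], aln),
        # Column trimming with the gappyout option, as stated in the legend.
        (["trimal", "-in", str(aln), "-out", str(trimmed), "-gappyout"], None),
        # Tree inference under the model reported as best-fitting.
        (["iqtree", "-s", str(trimmed), "-m", model], None),
    ]


def run_pipeline(fasta):
    """Execute the steps in order; requires the three tools on PATH."""
    for argv, stdout_path in build_commands(fasta):
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


for argv, _ in build_commands("rep_sequences.fasta"):
    print(" ".join(argv))
```

Passing the model explicitly reproduces the reported choice; replacing it with `MFP` would instead ask IQ-TREE to run ModelFinder before tree inference.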

Using the previously established species demarcation criterion [70], namely, a genome-wide pairwise identity of 78%, 35 viruses can be assigned to eight known species, whereas the remaining 389 viruses are classified into 164 new species (Table 1). The new species were named using the free-form binomial system. The greatly expanded dataset of genomoviruses has reinforced the validity of the previously established species demarcation criterion. Indeed, pairwise comparison of the representative sequences from each of the species (except for Gemytripvirus fugra1, whose members have a tripartite genome) showed that they share less than 78% genome-wide pairwise identity (Figs. 2 and 3). Accordingly, this threshold will continue to be used for further taxonomic classification of new genomoviruses. The genome sequences of genomoviruses belonging to specific genera and their sources are summarized in Fig. 4, and additional details are provided in Supplementary Table S1.
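The threshold-based assignment described above can be expressed as a small decision rule. This is a minimal sketch, not the procedure used for the ratified classification: it assumes that genome-wide pairwise identities to species representatives have already been computed (for example, with a tool such as SDT), and it treats exactly 78% as belonging to the existing species, which is an assumption about the boundary case. The species name "Gemykibivirus hypo1" and all identity values are hypothetical placeholders.

```python
SPECIES_THRESHOLD = 78.0  # percent genome-wide pairwise identity, from the text


def assign_species(identities, threshold=SPECIES_THRESHOLD):
    """Return the best-matching species, or None if a new species is implied.

    identities maps a species name to the percent identity between the query
    genome and that species' representative genome.
    """
    if not identities:
        return None
    best = max(identities, key=identities.get)
    # Assumption: identity equal to the threshold counts as the same species.
    return best if identities[best] >= threshold else None


# Illustrative values only; they are not measurements from the dataset.
query_a = {"Gemycircularvirus sclero1": 91.2, "Gemykibivirus hypo1": 61.5}
query_b = {"Gemycircularvirus sclero1": 70.4, "Gemykibivirus hypo1": 64.0}

print(assign_species(query_a))  # Gemycircularvirus sclero1
print(assign_species(query_b))  # None
```

A return value of `None` corresponds to a genome that shares less than 78% identity with every representative and would therefore be a candidate for a new species.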

Fig. 2

Pairwise identity matrix of the genome sequence of a representative member of each species of genomovirus (n = 236) except Gemytripvirus fugra1, whose representative, Fusarium graminearum gemytripvirus 1, has a multicomponent genome. The analysis was performed using SDT v1.2 [49].

Fig. 3

Pairwise distribution plot of the 236 representative sequences of genomoviruses (except Fusarium graminearum gemytripvirus 1 [species Gemytripvirus fugra1], which has a multicomponent genome), showing that no sequences from different species share >78% identity.

Fig. 4

Summary of the number of genomovirids assigned to each genus and the source from which the genomes were obtained.

Finally, we note that following the taxonomic assessment described herein, a number of new genomoviruses (n = 203; GenBank, downloaded 15 May 2021) have been discovered [4, 5, 10, 25, 63]. Of these, 42 can be assigned to currently established species, and 161, once classified, are likely to represent new species and genera within the family Genomoviridae. However, we would like to discourage naming newly discovered viruses using official taxon names. For instance, an ssDNA virus infecting the phytopathogenic fungus Botrytis cinerea has been isolated recently and named Botrytis cinerea genomovirus 1 (BcGV1) [17]. Although BcGV1 displays a genomic organization similar to that of genomoviruses and encodes a related Rep, the CP encoded by this virus is unrelated to that of genomoviruses. Thus, placement of BcGV1 within the family Genomoviridae is questionable.