Karyna Rosario, Kaitlin A Mettel, Bayleigh E Benner, Ryan Johnson, Catherine Scott, Sohath Z Yusseff-Vanegas, Christopher C M Baker, Deby L Cassill, Caroline Storer, Arvind Varsani, Mya Breitbart.
Abstract
Viruses encoding a replication-associated protein (Rep) within a covalently closed, single-stranded (ss)DNA genome are among the smallest viruses known to infect eukaryotic organisms, including economically valuable agricultural crops and livestock. Although circular Rep-encoding ssDNA (CRESS DNA) viruses are a widespread group for which our knowledge is rapidly expanding, biased sampling toward vertebrates and land plants has limited our understanding of their diversity and evolution. Here, we screened terrestrial arthropods for CRESS DNA viruses and report the identification of 44 viral genomes and replicons associated with specimens representing all three major terrestrial arthropod lineages, namely Euchelicerata (spiders), Hexapoda (insects), and Myriapoda (millipedes). We identified virus genomes belonging to three established CRESS DNA viral families (Circoviridae, Genomoviridae, and Smacoviridae); however, over half of the arthropod-associated viral genomes are only distantly related to currently classified CRESS DNA viral sequences. Although members of viral and satellite families known to infect plants (Geminiviridae, Nanoviridae, Alphasatellitidae) were not identified in this study, these plant-infecting CRESS DNA viruses and replicons are transmitted by hemipterans. Therefore, members from six out of the seven established CRESS DNA viral families circulate among arthropods. Furthermore, a phylogenetic analysis of Reps, including endogenous viral sequences, reported to date from a wide array of organisms revealed that most of the known CRESS DNA viral diversity circulates among invertebrates. Our results highlight the vast and unexplored diversity of CRESS DNA viruses among invertebrates and parallel findings from RNA viral discovery efforts in undersampled taxa.
Keywords: Arthropod; CRESS DNA; Discovery; Endogenous; Insect; Invertebrate; Replication-associated protein (Rep); Spider; Virus; ssDNA
Year: 2018 PMID: 30324030 PMCID: PMC6186406 DOI: 10.7717/peerj.5761
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Sample information and identified CRESS DNA genomes.
| Year | Location | Species name (common name) | Samples | Identified genomes |
|---|---|---|---|---|
| 2011 | Kenya | | Pool (2) | Arboreal ant associated circular virus 1 |
| 2011 | Kenya | | Pool (2) | Arboreal ant associated circular virus 1 |
| 2011 | Kenya | | Pool | Arboreal ant associated circular virus 1 |
| 2013 | FL USA | | Pool | Fire ant associated circular virus 1 |
| 2013 | FL USA | | Pool | Bark beetle associated circular virus 1 |
| 2014 | Puerto Rico | | Single (4) | Water beetle associated circular virus 1 |
| 2013 | Store | | Pool | Cricket associated circular virus 1 |
| 2011 | FL USA | | Single | Grasshopper associated circular virus 1 |
| 2013 | Nevis | | Pool | Fly associated circular virus 1 |
| | | | | Fly associated circular virus 3 |
| | | | | Fly associated circular virus 5 |
| 2013 | St. Barts | | Pool | Fly associated circular virus 2 |
| 2013 | Dom. Republic | | Pool | Fly associated circular virus 4 |
| 2013 | Guadeloupe | | Pool | Fly associated circular virus 6 |
| 2013 | St. Barts | | Pool | Fly associated circular virus 7 |
| 2015 | NH USA | | Single | Millipede associated circular virus 1 |
| 2017 | Victoria BC | | Single | Common house spider circular molecule 1 |
| 2017 | Victoria BC | | Single | Cybaeus spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Cybaeus spider associated circular virus 2 |
| 2017 | Victoria BC | | Single | Cybaeus spider associated circular molecule 1 |
| 2017 | Victoria BC | | Single | False black widow spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Giant house spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Giant house spider associated circular virus 2 |
| 2017 | Victoria BC | | Single | Giant house spider associated circular virus 3 |
| 2017 | Victoria BC | | Single (2) | Giant house spider associated circular virus 4 |
| 2014 | Puerto Rico | | Single | Golden silk orbweaver associated circular virus 1 |
| 2017 | FL USA | | Single | Longjawed orbweaver circular virus 1 |
| 2014 | Puerto Rico | | Single | Longjawed orbweaver circular virus 2 |
| 2017 | Victoria BC | | Single | Pimoid spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Pimoid spider associated circular virus 2 |
| | | | | Pimoid spider associated circular molecule 1 |
| 2017 | Victoria BC | | Single | Sierra dome spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Sierra dome spider associated circular virus 2 |
| 2017 | Victoria BC | | Single | Soft spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Spider associated circular virus 2 |
| 2017 | Victoria BC | | Single | Spider associated circular virus 3 |
| 2014 | FL USA | | Single | Spinybacked orbweaver circular virus 1 |
| 2017 | FL USA | | Single | Spinybacked orbweaver circular virus 1 |
| 2017 | FL USA | | Single | Spinybacked orbweaver circular virus 2 |
| 2017 | FL USA | | Single | Tentweb spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Tubeweb spider associated circular virus 1 |
| 2017 | Victoria BC | | Single | Woodlouse hunter spider associated circular virus 1 |
| 2015 | Kenya | | Pool | Termite associated circular virus 1 |
| | | | | Termite associated circular virus 3 |
| | | | | Termite associated circular virus 4 |
| 2015 | Kenya | | Pool (2) | Termite associated circular virus 2 |
Notes:
Location abbreviations: FL, Florida; NH, New Hampshire; St. Barts, Saint Barthelemy; Dom. Republic, Dominican Republic; BC, British Columbia; Store, Carolina Biological Supply.
Many specimens were taxonomically identified by sample providers. However, some specimens were identified through DNA barcoding and are indicated with an asterisk (*).
Samples were processed either as individuals (Single) or as pools (Pool) of up to 10 individuals from the same species. Some CRESS DNA genomes were recovered from multiple individuals or pools; the number of samples that independently yielded a given genome is given in parentheses. Although some genomes represent the same viral species, genomes recovered from independent samples that share less than 100% genome-wide pairwise identity were submitted to GenBank and assigned individual accession numbers (see Table 3).
CRESS DNA genome information, accession numbers, and taxonomic groups identified in this study.
| Accession number(s) | Genome | Taxonomic affiliation | Genome size (nt) | Nonanucleotide motif (type) | BLAST match source (accession number) | Rep identity (%) | Genome identity (%) |
|---|---|---|---|---|---|---|---|
| | Arboreal ant associated circular virus 1 | | 1769 | TAGTATTAC (II) | Bat feces ( | 73* | 67 |
| | Fly associated circular virus 1 | | 1722 | TAGTATTAC (II) | Cockroach ( | 89 | 85 |
| | Soft spider associated circular virus 1 | | 1937 | TAGTATTAC (II) | Shrew feces ( | 63* | 59 |
| | Spinybacked orbweaver circular virus 2 | | 1707 | TAGTATTAC (II) | Dragonfly ( | 99* | 99 |
| | Cybaeus spider associated circular virus 1 | Circularisvirus | 1991 | TAATACTAC (V) | Dragonfly ( | 61 | 60 |
| | Golden silk orbweaver associated circular virus 1 | Circularisvirus | 2054 | CAGTATTAC (V) | Dragonfly ( | 63 | 60 |
| | Longjawed orbweaver circular virus 1 | Circularisvirus | 1905 | CATTATTAC (V) | Dragonfly ( | 60 | 62 |
| | Spinybacked orbweaver circular virus 1 | Circularisvirus | 1995 | CAGTATTAC (V) | Dragonfly ( | 63 | 64 |
| | Fire ant associated circular virus 1 | Crucivirus | 3226 | TATGTGTAA (IV) | Wastewater ( | 61 | 55 |
| | Bark beetle associated circular virus 1 | | 2237 | TAATATTAT (II) | Dragonfly ( | 96* | 92 |
| | Cybaeus spider associated circular virus 2 | | 2344 | TAATATTAT (II) | Whitefly ( | 67* | 61 |
| | Fly associated circular virus 2 | | 2207 | TAACATTGT (II) | Pig feces ( | 99* | 99 |
| | Giant house spider associated circular virus 1 | | 2093 | TAATATTAT (II) | Llama feces ( | 73* | 67 |
| | Grasshopper associated circular virus 1 | | 2309 | TAACACTGT (II) | Bat feces ( | 62* | 64 |
| | Pimoid spider associated circular molecule 1 | | 1662 | TAATGTTAT (II) | Llama feces ( | 69* | 68 |
| | Pimoid spider associated circular virus 1 | | 2240 | TAATATTAT (II) | Sewage ( | 100* | 99 |
| | Sierra dome spider associated circular virus 1 | | 2232 | TAATATTAT (II) | Bird feces ( | 67* | 64 |
| | Spider associated circular virus 1 | | 2214-2216 | TAATACTAT (II) | Cow feces ( | 84* | 74 |
| | Spider associated circular virus 2 | | 2204 | TAATACTAT (II) | Cow feces ( | 85* | 71 |
| | Termite associated circular virus 2 | | 2222-2226 | TAATATTAT (II) | Thrips ( | 74* | 68 |
| | Tubeweb spider associated circular virus 1 | | 2174 | TAACACTGT (II) | Thrips ( | 63* | 61 |
| | Fly associated circular virus 3 | | 2537 | TAGTGTTAC (IV) | Macaques feces ( | 83 | 89 |
| | Fly associated circular virus 4 | | 2546 | TAGTGTTAC (IV) | Chimpanzee feces ( | 57 | 61 |
| | Cricket associated circular virus 1 | Volvovirus | 2516 | TAGTATTAC (II) | Cricket ( | 100 | 99 |
| | Common house spider circular molecule 1 | Unclassified | 1833 | TATTATTAC^ (V) | Giant panda feces ( | 62 | 63 |
| | Cybaeus spider associated circular molecule 1 | Unclassified | 1989 | TAGCACTAA (VIII) | Peatland ( | 58* | n/a |
| | False black widow spider associated circular virus 1 | Unclassified | 2199 | TAGTATTAC (I) | Reclaimed water ( | 61 | 62 |
| | Fly associated circular virus 5 | Unclassified | 1997 | TAGTATTAC (II) | Bat feces ( | 97 | 93 |
| | Fly associated circular virus 6 | Unclassified | 2103 | TAGTATTAC (IV) | Wastewater ( | 61 | 59 |
| | Fly associated circular virus 7 | Unclassified | 2010 | TAGTATTAC (IV) | Wastewater ( | 65 | 66 |
| | Giant house spider associated circular virus 2 | Unclassified | 2040 | TAGTATTAC (V) | Sphaeriid clam ( | 68 | 64 |
| | Giant house spider associated circular virus 3 | Unclassified | 2290 | TATTATTAC (I) | Amphipod ( | 61 | 59 |
| | Giant house spider associated circular virus 4 | Unclassified | 2494 | TAATATTAC (IV) | Wastewater ( | 62 | 60 |
| | Longjawed orbweaver circular virus 2 | Unclassified | 2321 | CAGTATTAC (VI) | Damselfly ( | 58 | 57 |
| | Millipede associated circular virus 1 | Unclassified | 1987 | TAGTATTAC (II) | Estuarine snail ( | 59 | 58 |
| | Pimoid spider associated circular virus 2 | Unclassified | 2125 | TAGTATTAC (I) | Bat ( | 62 | 60 |
| | Sierra dome spider associated circular virus 2 | Unclassified | 1860 | TAGTATTAC (V) | Giant panda feces ( | 57 | 57 |
| | Spider associated circular virus 3 | Unclassified | 1889 | CAACCACTC (I) | Ice shelf pond ( | 57 | 57 |
| | Tentweb spider associated circular virus 1 | Unclassified | 2127 | TAGTATTAC (II) | Dragonfly larvae ( | 62 | 60 |
| | Termite associated circular virus 1 | Unclassified | 2155 | TAATATTAC (II) | Chicken feces ( | 61* | 55 |
| | Termite associated circular virus 3 | Unclassified | 2220 | TAATGTTAC (II) | Shrub ( | 57* | 56 |
| | Termite associated circular virus 4 | Unclassified | 2152 | TAATGTTAC (II) | Tomato ( | 58* | 57 |
| | Water beetle associated circular virus 1 | Unclassified | 2244 | CAGTATTAC (II) | Ice shelf pond ( | 56 | 57 |
| | Woodlouse hunter spider associated circular virus 1 | Unclassified | 2176 | TAATAGTAG (II) | Amphipod ( | 57* | 58 |
Notes:
A few genomes were not considered viral and were labelled as “molecules” for the following reasons: (A) capsid-encoding open reading frame (ORF) seemed truncated or (B) genome only contained a single major ORF.
Groups that do not represent established taxonomic groups by ICTV are non-italicized, including Circularisvirus, Crucivirus, and Volvovirus.
Most nonamers were located at the apex of a predicted hairpin structure, with the exception of one circular molecule, which is marked with the symbol (^). Genome organizations, using the specified nonamer as a reference, are indicated in parentheses according to the genotypes discussed by Rosario, Duffy & Breitbart, 2012.
Best BLAST matches for identified CRESS DNA genomes. Some of the most closely related viruses to CRESS DNA viruses and replicons identified here, based on BLAST searches, contain a different genome organization and are indicated with the symbol (#).
Pairwise identities (PIs) between identified CRESS DNA genomes and their best BLAST match. Nucleotide PIs between replication-associated proteins (Rep) were calculated based on predicted spliced coding regions. Genomes containing Rep-encoding ORFs interrupted by an intron are marked with the symbol (*).
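The nonamer notes above can be illustrated with a short Python sketch that locates a nonanucleotide motif in a circular genome and checks whether it is flanked by a perfect inverted repeat, as expected for a motif sitting in the loop of a hairpin. This is a simplified illustration and not the procedure used in the study: the function names, the toy sequence, and the 6-nt stem length are assumptions, and a real analysis would rely on a structure-prediction tool rather than an exact-match stem check.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_nonamer(genome: str, nonamer: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every motif hit in a circular genome.

    Starts are given in plus-strand coordinates; "-" hits are matches to the
    reverse complement of the motif.
    """
    genome, nonamer = genome.upper(), nonamer.upper()
    k = len(nonamer)
    # Append the first k-1 bases so motifs spanning the origin are found.
    extended = genome + genome[: k - 1]
    hits = []
    for strand, motif in (("+", nonamer), ("-", reverse_complement(nonamer))):
        start = extended.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = extended.find(motif, start + 1)
    return sorted(hits)


def flanked_by_stem(genome: str, start: int, k: int = 9, stem_len: int = 6) -> bool:
    """Check for a perfect inverted repeat of stem_len nt around the motif.

    Assumption: the loop is exactly the k-nt motif; real hairpin loops may be
    longer and real stems may contain mismatches.
    """
    genome = genome.upper()
    n = len(genome)
    upstream = "".join(genome[(start - stem_len + i) % n] for i in range(stem_len))
    downstream = "".join(genome[(start + k + i) % n] for i in range(stem_len))
    return upstream == reverse_complement(downstream)


# Toy circular sequence (hypothetical): 6-nt stem, TAGTATTAC loop, 6-nt stem.
toy_genome = "ACGCGC" + "TAGTATTAC" + "GCGCGT" + "AAAA"
hits = find_nonamer(toy_genome, "TAGTATTAC")
```

Because the genome is treated as circular, the same motif is still found if the sequence is linearized at a different position, including one that splits the nonamer across the ends.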
Taxonomic classification framework for established CRESS DNA viral groups.
| Family | Genome-wide pairwise identity | Species demarcation criteria | Reference |
|---|---|---|---|
| | 54% | | |
| | Not reported | 75% | |
| | 55% | 80% | |
| | 54% | | |
| | 53% | 78% | |
| | Not reported | 75% | |
| | 55% | 77% | |
Notes:
Refers to the lower limit of genome-wide pairwise identities (PIs) among members of a given viral family.
With the exception of the family Bacilladnaviridae, the species demarcation criteria (SDC) are based on genome-wide PIs. The SDC may vary by subfamily (Alphasatellitidae) or genus (Geminiviridae) within a given family.
The SDC for the Bacilladnaviridae are based on the amino acid sequence PI of the replication-associated protein.
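As a rough illustration of how a genome-wide pairwise identity can be compared against a species demarcation threshold, the Python sketch below computes percent identity from two pre-aligned sequences. It is a minimal sketch under stated assumptions, not the tool or settings used in the study: columns where either sequence has a gap are ignored, a pair is treated as the same species when its identity is at or above the threshold, and the function names and example sequences are hypothetical.

```python
def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def same_species(identity_pct: float, threshold_pct: float) -> bool:
    """Assumption: identity at or above the threshold means same species."""
    return identity_pct >= threshold_pct


# Hypothetical aligned fragments; 80% is one of the thresholds listed in the table.
pi = pairwise_identity("ATGCATGCAT", "ATGCATGCAA")
print(f"{pi:.1f}% identity, same species at 80%: {same_species(pi, 80.0)}")
```

Passing the threshold as a parameter keeps the sketch usable for families with different criteria, since the values in the table range from 75% to 80%.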
Figure 1. Approximately maximum likelihood phylogenetic tree of replication-associated protein (Rep) amino acid sequences representing CRESS DNA viruses, replicons, and CRESS DNA-like endogenous viral (CEV) elements recovered from various organisms.
Branch colors distinguish sequences associated with various types of organisms. Clades containing Rep sequences falling within established CRESS DNA viral groups including the Genomoviridae (Genomo), Geminiviridae (Gemini), Bacilladnaviridae (Bac), Circoviridae (Circo), Smacoviridae (Smaco), Nanoviridae (N) and Alphasatellitidae (Alpha) were merged and are highlighted in gray. The new circularisvirus clade is highlighted with a light blue rectangle. The asterisk symbol indicates branches representing CEVs. Branches representing CEVs identified in Ephydra spp. (dipteran), including E. gracilis (EpG) and E. hians (EpH), and nematodes (Ne) are specified. CEVs identified in protists are also specified, including Entamoeba (Ent), Giardia intestinalis (G) and Blastocystis hominis (BH). Entamoeba species are further distinguished, namely E. invadens (EntI), E. histolytica (EntH), and E. dispar (EntD). Reps identified in this study are highlighted with schematics of terrestrial arthropods showing their source and broad phylogenetic distribution. Branches with <80% Shimodaira–Hasegawa (SH)-like support were collapsed. Arthropod silhouettes credit: Shutterstock vector library at https://www.shutterstock.com.
Figure 2. Maximum likelihood phylogenetic tree of replication-associated protein (Rep) amino acid sequences representing members of the family Genomoviridae and related CRESS DNA viruses.
Branch colors distinguish sequences associated with various types of organisms and environmental sources. Bars on the right indicate clades representing genomovirus genera and unclassified sequences. Clades containing Rep sequences representing Gemygorvirus, Gemyduguivirus, and Gemykrogvirus species and members from the family Geminiviridae, which were used as an outgroup, were merged. Genomovirus Reps identified in this study are named and highlighted with schematics of terrestrial arthropods from which they were identified, including viruses associated with sierra dome spiders (SdSACV), pimoid spiders (PiSACV), tubeweb spiders (TuwSACV), grasshoppers (GhACV), and termites (TACV). Viruses identified in multiple species of spiders are identified as spider associated circular viruses (SACV). Branches with <70% Shimodaira–Hasegawa (SH)-like support were collapsed. A version of the tree containing source information and accession numbers for all the sequences included in the phylogenetic analysis is available in Fig. S1. Arthropod silhouettes credit: Shutterstock vector library at https://www.shutterstock.com.
Figure 3. Midpoint rooted maximum likelihood phylogenetic tree of selected replication-associated protein (Rep) amino acid sequences representing members of the family Circoviridae and related CRESS DNA viruses.
Branch colors distinguish sequences associated with various invertebrate and vertebrate organisms. Bars on the right indicate clades representing the Cyclovirus and Circovirus genera. Rep sequences representing CRESS DNA-like endogenous viral (CEV) elements are highlighted with an asterisk symbol. Cyclovirus Reps identified in this study are highlighted with schematics of terrestrial arthropods and include viruses identified from flies (FlyACV), ants (AaACV), soft spiders (SoSACV) and spinyback orbweavers (SpOrbCV). Reps representing unclassified genome sequences forming non-Circoviridae clades used as outgroups were merged and are highlighted in gray (accessions: KX246259, KR528563, KM598407, KR528546, KM874290, KM874319, KM874343, KT945164). Branches with <70% Shimodaira–Hasegawa (SH)-like support were collapsed. Arthropod silhouettes credit: Shutterstock vector library at https://www.shutterstock.com.
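The figure legends state that branches below a given SH-like support value were collapsed. The Python sketch below shows one way such collapsing could work on a toy tree: an internal node whose support falls below the threshold is removed and its children are attached to its parent, producing a polytomy. The `Node` class, function names, and example tree are hypothetical and are not taken from the software used in the study.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str = ""
    # SH-like support (%) of the branch leading to this node; None if absent.
    support: Optional[float] = None
    children: list["Node"] = field(default_factory=list)


def collapse_low_support(node: Node, threshold: float) -> Node:
    """Return a copy of the tree with weakly supported internal branches removed."""
    new_children = []
    for child in node.children:
        child = collapse_low_support(child, threshold)
        is_internal = bool(child.children)
        if is_internal and child.support is not None and child.support < threshold:
            # Attach the grandchildren directly to this node (polytomy).
            new_children.extend(child.children)
        else:
            new_children.append(child)
    return Node(node.name, node.support, new_children)


def to_newick(node: Node) -> str:
    """Serialize to a Newick-like string without branch lengths or a semicolon."""
    if not node.children:
        return node.name
    inner = ",".join(to_newick(c) for c in node.children)
    label = "" if node.support is None else f"{node.support:g}"
    return f"({inner}){label}"


# Toy tree (hypothetical): one clade with 65% support, one with 95%.
tree = Node(children=[
    Node(support=65, children=[Node("A"), Node("B")]),
    Node(support=95, children=[Node("C"), Node("D")]),
])
collapsed = collapse_low_support(tree, threshold=70)
```

With a threshold of 70, the weakly supported (A,B) clade dissolves into the root while the (C,D) clade is kept; the same function would apply with the 80% cutoff mentioned for Figure 1.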