Ryan Ruboyianes, Michael Worobey.
Abstract
Recent discoveries indicate that the foamy virus (FV) (Spumavirus) ancestor may have been among the first retroviruses to appear during the evolution of vertebrates, demonstrated by foamy endogenous retroviruses present within deeply divergent hosts including mammals, coelacanth, and ray-finned fish. If they indeed existed in ancient marine environments hundreds of millions of years ago, significant undiscovered diversity of foamy-like endogenous retroviruses might be present in fish genomes. By screening published genomes and by applying PCR-based assays of preserved tissues, we discovered 23 novel foamy-like elements in teleost hosts. These viruses form a robust, reciprocally monophyletic sister clade with sarcopterygian host FV, with class III mammal endogenous retroviruses being the sister group to both clades. Some of these foamy-like retroviruses have larger genomes than any known retrovirus, exogenous or endogenous, due to unusually long gag-like genes and numerous accessory genes. The presence of genetic features conserved between mammalian FV and these novel retroviruses attests to a foamy-like replication biology conserved for hundreds of millions of years. We estimate that some of these viruses integrated recently into host genomes; exogenous forms of these viruses may still circulate.
Keywords: paleovirology, endogenous retrovirus, foamy virus, fish, phylogeny
Year: 2016 PMID: 28058112 PMCID: PMC5210030 DOI: 10.1093/ve/vew032
Source DB: PubMed Journal: Virus Evol ISSN: 2057-1577
EFV-harboring fish genome contigs
| Order | Common name | Contig (acc) | BLASTx match (acc) | % ID | E-value | ERV components |
|---|---|---|---|---|---|---|
| Cypriniformes | ||||||
| | Zebrafish | CAAK05053864.1 | SFVspm (ABV59399) | 29 | 6E−83 | LTR- |
| | Fathead minnow | JNCD01029789.1 | SFVcpz (AKM21185) | 30 | 5E−87 | tLTR- |
| Salmoniformes | ||||||
| | Rainbow trout | CCAF010042932.1 | EFV (NP_054716) | 30 | 5E−36 | |
| Gadiformes | ||||||
| | Cod | CAEA01131311.1 | PFV (AAA66556) | 34 | 9E−71 | |
| Pleuronectiformes | ||||||
| | Tongue sole | AGRG01061695.1 | SFVspm (ABV59399) | 33 | 2E−64 | |
| Cichliformes | ||||||
| | Midas cichlid | CCOE01001074.1 | EFV (NP_054716) | 28 | 7E−100 | LTR- |
| Cyprinodontiformes | ||||||
| | Turquoise killifish | JNBZ01063262.1 | EFV (NP_054716) | 31 | 5E−69 | |
| | Mummichog | JXMV01054445.1 | SFVmac (AFA44809) | 33 | 1E−33 | |
| | Platyfish | AGAJ01041163.1 | SFVorg (CAD67562) | 29 | 6E−92 | tLTR- |
| | Guppy | AYCK01027102.1 | SFVmac (AFA44809) | 29 | 2E−92 | LTR- |
| | Amazon molly | AZHG01028727.1 | SFVspm (ABV59399) | 29 | 4E−86 | LTR- |
| Perciformes | ||||||
| | Sablefish | AWGY01041462 | SFVcpz (AFX98084) | 37 | 4E−58 | |
| | Yellow croaker | JRPU01012463.1 | SFVspm (ABV59399) | 26 | 3E−81 | LTR- |
| | Korean mudskipper | JACL01052273.1 | SFVspm (ABV59399) | 30 | 4E−74 | |
| | Flag rockfish | AUPQ01030678.1 | SFVspm (ABV59399) | 32 | 3E−62 | |
| | Tiger rockfish | AUPR01019601.1 | SFVspm (ABV59399) | 31 | 1E−60 | |
| Formerly perciformes | ||||||
| | Bicolor damselfish | JMKM01039980.1 | SFVspm (ABV59399) | 33 | 2E−72 | tLTR- |
Host species higher taxonomic levels per Betancur-R et al. (2013), except (*), an incertae sedis member of Ovalentaria as assigned by the same authors. Common names per FishBase classification (Harel et al. 2015). (a) Related ERV described in Llorens et al. (2009). (b) Previously described in Schartl et al. (2013). ERV components marked (t) are truncated.
EFV, Equine foamy virus; PFV, Prototype foamy virus; SFVcpz, Simian foamy virus – chimpanzee; SFVorg, Simian foamy virus – orangutan, SFVmac, Simian foamy virus – macaque; SFVspm, Simian foamy virus – spider monkey.
Figure 1. Retrovirus Pol phylogeny. MrBayes consensus tree estimated from 692 aligned Pol protein positions and rooted at its midpoint for illustrative purposes. Branch lengths estimated as expected substitutions per site. Nodes with posterior probability <1.0 are labeled. Seven genera of exogenous and endogenous retroviruses in colored blocks; Snakehead retrovirus, genus yet unclassified, is uncolored; foamy-like teleost endogenous retroviruses represented by salmon-colored branches. Asterisks (*) mark foamy-like fish ERVs discovered in previous studies. Class I ERVs are grouped with gammaretroviruses and class II with betaretroviruses for simplicity. RV taxa and source citations are available in Supplementary Table 4.
Figure 2. Fish ERV genetic features. SFVcpz/hu (acc: Y07725.1) is the prototype virus for the FV clade. Open boxes represent ORFs. Premature stop codons marked with an asterisk (*). Reading frames represented by vertical orientation. Downward arrow represents furin cleavage site; arrows surmounted by ‘S’ are signal peptidase cleavage sites; p3-cleavage site marked by ‘p3’ arrow; dotted line represents ORF boundary in LTR. PPT:LTR boundary ‘gggtg’ motif marked as such. Hypothetical gag genes are green, pol are blue, and hypothetical env are red. Black boxes are unknown accessory genes unless labeled. Boxes and genome length are to scale (kb). GQR, glycine-glutamine-arginine rich box; GR, glycine-arginine rich box; IP, internal promoter; Int, integrase; LTR, long terminal repeat; PBS, primer binding site; Pro, protease; RH, ribonuclease H; RT, reverse transcriptase; TM, trans-membrane motif.
Estimates of ERV integration age
| ERV | Method | Divergence | Mutation rate | Estimated age |
|---|---|---|---|---|
| DrFV-3 | Flank TMRCA (4661 pos.) | 0.017 | 4.3e−8 | 284,000 y (95% HPD: 2.47e5–3.24e5) |
| AcERV-1 | LTR divergence (1502 pos.) | 0.002 | 6.6e−8 | 15,151 y |
| AcERV-2 | LTR divergence (1472 pos.) | 0.003 | 6.6e−8 | 22,727 y |
| AcERV-3 | LTR divergence (988 pos.) | 0.011 | 6.6e−8 | 83,333 y |
| AcERV-4 | LTR divergence (1493 pos.) | 0.011 | 6.6e−8 | 83,333 y |
| PfERV-1 | LTR divergence (665 pos.) | 0.003 | 4.89e−8 | 30,674 y |
| PfERV-2 | LTR divergence (1707 pos.) | 0.015 | 4.89e−8 | 153,374 y |
| PrERV-1 | LTR divergence (1690 pos.) | 0.005 | 4.89e−8 | 51,124 y |
| PrERV | Flank divergence (2995 pos.) | 0.016 | 4.89e−8 | 163,599 y |
| SpERV | LTR divergence (1368 pos.) | 0.007 | 4.89e−8 | 71,574 y |
| LcERV | Flank divergence (2313 pos.) | 0.003 | 4.89e−8 | 30,674 y |
| LcERV | LTR divergence (1561 pos.) | 0.010 | 4.89e−8 | 102,249 y |
Divergence calculated with K80 nucleotide substitution model (Krogh et al. 2001). Mutation rates drawn from previous studies (Luscombe et al. 2001; Tamura et al. 2011; Kratochwil et al. 2015). DrFV-3 divergence was calculated from multiple sequence alignment using BEAST (Guindon and Gascuel 2003), and the 95% highest posterior density interval is included. Confidence intervals could not be calculated for the other estimates because of the estimation methods used, so these values should not be read as precise estimates free of uncertainty. Where more than one estimate was possible, the values are presented as upper and lower estimates of minimum age.
HPD, highest posterior density; pos, nucleotide positions; TMRCA, time to most recent common ancestor; y, years. Accession numbers listed in Supplementary Table 1.
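The source does not spell out how divergence was converted to age, but the tabulated LTR and flank values are consistent with the widely used relation age = divergence / (2 × mutation rate), with the result truncated to whole years. The sketch below illustrates that relation together with a plain K80 (Kimura two-parameter) distance for a pair of aligned sequences. The function names `k80_distance` and `integration_age_years` are hypothetical, and the code is an illustration under these assumptions, not the authors' pipeline. It does not reproduce the DrFV-3 estimate, which came from a BEAST analysis.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k80_distance(seq_a, seq_b):
    """K80 distance between two aligned nucleotide sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = 0
    transversions = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # Kimura (1980): d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def integration_age_years(divergence, rate_per_site_per_year):
    """Assumed relation: both copies accumulate changes, so T = d / (2 * mu)."""
    return divergence / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    # AcERV-1 row: divergence 0.002, rate 6.6e-8
    print(int(integration_age_years(0.002, 6.6e-8)))
    # SpERV row: divergence 0.007, rate 4.89e-8
    print(int(integration_age_years(0.007, 4.89e-8)))
```

The factor of two reflects the assumption that the two LTRs (or the two flanking copies) were identical at integration and have diverged independently since, so each contributes half of the observed distance.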
Hosts with FV-like fragments; de novo sequencing
| Order | Common name | Source catalog no. | BLASTx hit (acc) | % ID | E-value |
|---|---|---|---|---|---|
| Anguilliformes | |||||
| | Goldentail moray | KU 145 | SFVspm (ABV59399) | 27 | 9E−15 |
| Cypriniformes | |||||
| | Plains minnow | MSB 57953 | SFVmac (AFA44809) | 47 | 1E−19 |
| Gadiformes | |||||
| | Giant grenadier | KU 2298 | EFV (NP_054716) | 35 | 2E−21 |
| Syngnathiformes | |||||
| | Flying gurnard | KU 237 | SFVspm (ABV59399) | 35 | 3E−25 |
| Cichliformes | |||||
| | Freshwater angelfish | KU 2846 | SFVspm (ADE05995) | 39 | 7E−27 |
| Blenniiformes | |||||
| | Doubleline clingfish | KU 7020 | SFVspm (ABV59399) | 38 | 3E−29 |
| Beloniformes | |||||
| | Bennet’s flying fish | KU 2780 | SFVspm (ABV59399) | 36 | 4E−21 |
| | Houndfish | KU 5842 | SFVspm (ADE05995) | 40 | 1E−28 |
| Perciformes | |||||
| | Deepwater serrano | SIO 08-90 | SFVspm (ABV59399) | 38 | 2E−27 |
Listed are the best hits for a representative cloned sequence successfully amplified from genomic DNA. Orders after Betancur-R et al. (2013). Common names from FishBase (Harel et al. 2015).
KU, University of Kansas; MSB, Museum of Southwestern Biology; SIO, Scripps Institution of Oceanography. BLASTx hit abbreviations: EFV, equine foamy virus; SFVmac, simian foamy virus – macaque; SFVspm, simian foamy virus – spider monkey.
Figure 3. Phylogeny of RVs and twenty-seven foamy-like fish ERVs. MrBayes consensus tree estimated from 702 Pol aa positions, including gap positions introduced with short fragments sequenced in this study. Tree is midpoint rooted for clarity. Nodes with posterior probability <1.0 are marked. Branch lengths estimated as expected substitutions per site. RV clades are marked with colored bars to right. Included are ERVs discovered and sequenced by PCR-based assay in this study (•), and ERV sequences from WGS data alone. Also labeled are taxa from WGS databases that we PCR amplified and sequenced as positive controls (+).