Joshua A Hayward, Mary Tachedjian, Jie Cui, Hume Field, Edward C Holmes, Lin-Fa Wang, Gilda Tachedjian.
Abstract
BACKGROUND: Betaretroviruses infect a wide range of species including primates, rodents, ruminants, and marsupials. They exist in both endogenous and exogenous forms and are implicated in animal diseases such as lung cancer in sheep, and in human disease, with members of the human endogenous retrovirus-K (HERV-K) group of endogenous betaretroviruses (βERVs) associated with human cancers and autoimmune diseases. To improve our understanding of betaretroviruses in an evolutionarily distinct host species, we characterized βERVs present in the genomes and transcriptomes of mega- and microbats, which are an important reservoir of emerging viruses.
Year: 2013 PMID: 23537098 PMCID: PMC3621094 DOI: 10.1186/1742-4690-10-35
Source DB: PubMed Journal: Retrovirology ISSN: 1742-4690 Impact factor: 4.602
Betaretroviral elements in mega- and microbat transcriptomes
| Query | Protein | P. alecto e-value (a) | P. alecto transcripts (b) | R. megaphyllus e-value (a) | R. megaphyllus transcripts (b) | R. ferrumequinum e-value (a) | R. ferrumequinum transcripts (b) |
|---|---|---|---|---|---|---|---|
| Betaretroviruses | Gag | 1.65×10⁻¹²¹ | 150 | 3.86×10⁻²⁸ | 9 | ND | 0 |
| | Pol | <1×10⁻²⁵⁰ | 246 | 7.80×10⁻³⁶ | 16 | 1.0×10⁻⁵⁸ | 1 |
| | Env | 1.29×10⁻⁵¹ | 48 | 4.00×10⁻¹⁵ | 3 | ND | 0 |
| | Gag | 1.75×10⁻⁵⁹ | 185 | 1.31×10⁻¹⁵ | 3 | ND | 0 |
| | Pol | <1×10⁻²⁵⁰ | 241 | 2.13×10⁻⁴⁰ | 5 | 7.0×10⁻⁵⁶ | 1 |
| | Env | 2.58×10⁻³¹ | 137 | 2.65×10⁻³¹ | 2 | 1.0×10⁻⁸⁷ | 1 |
| | Gag | 2.16×10⁻¹⁰⁴ | 190 | 1.49×10⁻²⁰ | 5 | ND | 0 |
| | Pol | <1×10⁻²⁵⁰ | 287 | 7.71×10⁻³⁴ | 21 | 1.0×10⁻⁵⁸ | 1 |
| | Env | 1.83×10⁻³³ | 140 | 1.48×10⁻³¹ | 6 | 2.0×10⁻⁹⁸ | 1 |
| | Gag | 6.85×10⁻⁵³ | 90 | 4.82×10⁻¹³ | 2 | ND | 0 |
| | Pol | <1×10⁻²⁵⁰ | 269 | 1.36×10⁻³¹ | 15 | 2.0×10⁻⁴⁵ | 1 |
| | Env | 8.39×10⁻⁵⁴ | 19 | ND | 0 | ND | 0 |
| | Gag | 1.70×10⁻¹⁰⁸ | 185 | 2.96×10⁻²⁰ | 5 | ND | 0 |
| | Pol | <1×10⁻²⁵⁰ | 290 | 9.34×10⁻³⁹ | 21 | 9.0×10⁻⁶² | 1 |
| | Env | 1.31×10⁻³⁵ | 136 | 2.75×10⁻²⁶ | 2 | 1.0×10⁻¹⁰⁹ | 1 |
a Gag, Pol, and Env proteins were translated from the genomes of extant betaretroviruses and used as search queries in a tBLASTn analysis of the Illumina sequenced transcriptome of P. alecto and the 454 sequenced transcriptomes of R. megaphyllus and R. ferrumequinum. The e-value of the transcriptome hit with the greatest sequence similarity (lowest e-value) to each query sequence is displayed.
b The number of transcripts identified in the transcriptomes with an e-value < 1 × 10⁻¹⁰. ND: No data.
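The two summary statistics described in footnotes a and b (lowest e-value per query, and number of transcripts below the 1 × 10⁻¹⁰ cutoff) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the tBLASTn results were saved in the standard 12-column BLAST tabular layout (query ID, subject ID, ..., e-value in column 11, bit score in column 12), and the query and transcript names in the example are hypothetical.

```python
import csv
import io


def summarize_hits(tabular_text, evalue_cutoff=1e-10):
    """Return {query: (best_evalue, n_transcripts_below_cutoff)}.

    Assumes standard 12-column BLAST tabular output; the e-value is
    read from the 11th column (index 10).
    """
    summary = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        entry = summary.setdefault(
            query, {"best_evalue": None, "transcripts": set()}
        )
        if entry["best_evalue"] is None or evalue < entry["best_evalue"]:
            entry["best_evalue"] = evalue
        if evalue < evalue_cutoff:
            # Count each transcript once, even if it produced several hits.
            entry["transcripts"].add(subject)
    return {
        q: (e["best_evalue"], len(e["transcripts"]))
        for q, e in summary.items()
    }


# Hypothetical example rows (IDs and values are made up for illustration).
rows = [
    ["query_Gag", "tx1", "45.0", "200", "100", "5", "1", "200", "10", "609", "1e-50", "180"],
    ["query_Gag", "tx1", "40.0", "100", "60", "2", "1", "100", "700", "999", "1e-20", "90"],
    ["query_Gag", "tx2", "30.0", "80", "50", "3", "1", "80", "5", "244", "1e-5", "40"],
    ["query_Pol", "tx3", "50.0", "300", "150", "4", "1", "300", "1", "900", "3e-12", "70"],
]
example_text = "\n".join("\t".join(r) for r in rows)
result = summarize_hits(example_text)
```

Counting distinct subject IDs rather than rows keeps a transcript with several high-scoring segments from being counted more than once, which matches the "number of transcripts" wording of footnote b.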
Figure 1. A schematic representation of PaERV-βA. Two transcripts identified in the P. alecto Illumina sequenced transcriptome overlapped by 3,152 nt with 100% sequence identity and were used to assemble the PaERV-βA genomic sequence. Indicated are the retroviral genes gag, pro, pol, and env, which have been rendered defective by random mutation since integration. Also shown are the key enzymatic active sites of the viral protease (D×G), reverse transcriptase (DDD), and integrase (DDE); the betaretroviral dUTPase domain in pro; two unique open reading frames (ORFs); the polypurine tract (PPT); and the unique 3′ (U3) region. ORF* does not appear to be genuine, but rather has arisen as a result of an insertion mutation that has disrupted a stop codon.
Full-length endogenous betaretroviruses identified in the Illumina sequenced transcriptome of P. alecto and the Sanger sequenced genomes of P. vampyrus and M. lucifugus
| βERV | Genome size (nt) (a) | Core gene status: gag, pro (c), pol, env (b) | Additional ORFs (d) | LTR length (nt) (e) | PBS (f) | Notes |
|---|---|---|---|---|---|---|
| P. vampyrus | | | | | | |
| PvERV-βA | 7,705 | Defective, Defective, Defective, Defective | 0 | 407* | Unknown | 100 nt NSR overlapping 5′ LTR and beginning of |
| PvERV-βB | 9,257 | Defective ×2; gene assignment not recoverable | 1 | 1,265 | Lys 3 | 102 nt NSR within |
| PvERV-βC | 7,126 | Defective ×3; gene assignment not recoverable | 0 | 366* | Lys 3 | Short |
| PvERV-βD | 7,928 | Defective, Defective, Defective, Defective | 1 | 398 | Lys 1,2 | NSRs overlapping 5′ LTR and |
| PvERV-βE | 7,879 | Defective ×1; gene assignment not recoverable | 1 | 371* | Lys 3 | A single stop mutation in |
| PvERV-βF | 7,804 | Defective ×3; gene assignment not recoverable | 1 | 370 | Lys 3 | 41 nt NSR at extreme 5′ end of the 5′ LTR |
| PvERV-βG | 7,631 | Defective ×3; gene assignment not recoverable | 0 | 387* | Lys 3 | Appears to contain a deletion that overlaps PPT and 3′ LTR |
| PvERV-βH | 7,843 | Defective ×3; gene assignment not recoverable | 1 | 361 | Lys 3 | |
| PvERV-βI | 7,809 | Defective, Defective, Defective, Defective | 0 | 371* | Lys 3 | |
| PvERV-βJ | 8,773 | Defective ×1; gene assignment not recoverable | 2 | 427* | Lys 1,2 | |
| PvERV-βK | 8,611 | Defective ×2; gene assignment not recoverable | 1 | 425* | Lys 1,2 | 3′ LTR appears truncated |
| P. alecto | | | | | | |
| PaERV-βA | >8,103§ | Defective, Defective, Defective, Defective | 2 | Unknown | Unknown | Contains artifact ORF (denoted as ORF* in Figure |
| M. lucifugus | | | | | | |
| MlERV-βA | 9,866 | Defective ×3; gene assignment not recoverable | 0 | 422* | Lys 1,2 | Large foreign insertion in 5′ LTR |
| MlERV-βB | 8,121 | Unknown, Defective, **Intact**, Defective | 0 | 480 | Lys 3 | 669 nt NSR within |
| MlERV-βC | 8,102 | Not recoverable; see Notes | 0 | 479* | Lys 3 | Completely intact |
| MlERV-βD | 9,007 | Defective, Defective, Defective, **Intact** | 0 | 479* | Lys 3 | Contains short foreign insertions in |
| MlERV-βE | 7,890 | Defective, Defective, Defective, Defective | 1 | 440 | Lys† | |
| MlERV-βF | 8,235 | Defective ×2; gene assignment not recoverable | 1 | 470 | Lys 3 | Small ~45 nt deletion overlapping |
a The genome size is given for the proviral version of the βERVs. § The genome size of PaERV-βA is uncertain, as the known sequence begins 25 nt upstream of the gag gene and does not include the unique 5′ (U5) region.
b The core retroviral genes gag, pro, pol, and env that contain frameshift or premature stop mutations are described as ‘defective’; those that contain neither are described as ‘intact’ (bold font).
c The pro open reading frame (ORF) of each βERV was found to encode a betaretroviral dUTPase protein domain.
d The number of ORFs that do not code for the core genes and are 300 nucleotides or greater in length.
e The length of the long terminal repeats (LTRs). * For those βERVs whose 5′ and 3′ LTR lengths differ, the value of the 5′ LTR is given.
f The specific lysine (Lys) tRNA complementary to the primer binding site (PBS) for each βERV is given. † The specific identity of the PBS of MlERV-βE is uncertain. NSR: non-sequenced region.
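Footnote d counts open reading frames of 300 nucleotides or more. A minimal sketch of such a length filter is shown below. The source does not say how ORFs were delimited, so the conventions here are assumptions: an ORF runs from an ATG start codon to the first in-frame stop codon (stop included in the length), and only the three forward reading frames are scanned. The function name `find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len=300):
    """Return (start, end, frame) for forward-strand ORFs >= min_len nt.

    Assumed convention: ORF = first ATG in frame through the next
    in-frame stop codon, with `end` exclusive and the stop included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start >= min_len:
                    orfs.append((start, end, frame))
                start = None  # look for the next ATG after this stop
    return orfs


# Synthetic example: one 306 nt ORF followed by a 9 nt ORF in the same frame.
example_seq = "CC" + "ATG" + "GCC" * 100 + "TAA" + "ATGGCCTAA"
example_orfs = find_orfs(example_seq)
```

Excluding the ORFs that correspond to gag, pro, pol, and env, as the footnote requires, would need gene coordinates for each provirus and is left out of this sketch.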
Figure 2. Phylogenetic relationships of bat and non-bat betaretroviruses. Maximum likelihood phylogenetic trees of (A) Gag, (B) Pol, and (C) Env amino acid sequences. Bootstrap values <70% are not shown, and branch lengths are drawn to a scale of amino acid substitutions per site. Bootstrap values are denoted as ** >90%; * >70% and <90%. The trees are midpoint rooted for purposes of clarity only. βERV proteins of P. vampyrus and P. alecto are highlighted in red text. βERVs of M. lucifugus are highlighted in blue text. The clades within the Gag and Pol trees highlighted with a grey background (γ-Env) contain betaretroviruses whose Env sequence is not sufficiently closely related to the Env of other betaretroviruses to be included in the Env tree.
Figure 3. Phylogenetic comparison of the envelope (Env) protein sequence of betaretroviruses and gammaretroviruses. Bootstrap values <70% are not shown, and branch lengths are drawn to a scale of amino acid substitutions per site. Bootstrap values are denoted as ** >90%; * >70% and <90%. βERV proteins of P. vampyrus and P. alecto are highlighted in red text. βERVs of M. lucifugus and R. ferrumequinum are highlighted in blue text. Gammaretroviruses are highlighted in teal text.
Figure 4. Eight sub-groups of the genus Betaretrovirus. A schematic diagram for a single representative of each group is depicted. Core retroviral genes gag, pro, pol, and env are bordered by the proviral long terminal repeats (LTRs). Also shown are other major genetic features such as open reading frames (ORFs) greater than 300 nt in length and the rec gene of HERV-K113; enzymatic active site motifs of protease (D×G), reverse transcriptase (DDD), and integrase (DDE); the primer binding site (PBS) and polypurine tracts (PPT); and the characteristic betaretroviral dUTPase domain. NSR: non-sequenced region. ORF* is part of a foreign nucleotide insertion within MlERV-βA and does not appear to be a retroviral element.
Estimation of time since integration
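| βERV | LTR divergence (a) | Time since integration (mya) (b) |
|---|---|---|
| P. vampyrus | | |
| PvERV-βA | 0.024 | 30 |
| PvERV-βB | 0.011 | 13.8 |
| PvERV-βC | 0.027 | 33.8 |
| PvERV-βD | 0.056 | ND |
| PvERV-βE | 0.011 | 13.8 |
| PvERV-βF | 0.043 | ND |
| PvERV-βG | 0.039 | ND |
| PvERV-βH | 0.006 | 7.5 |
| PvERV-βI | 0.029 | 36.3 |
| PvERV-βJ | 0.005 | 6.3 |
| PvERV-βK | 0.025 | ND |
| M. lucifugus | | |
| MlERV-βA | 0.055 | 29 |
| MlERV-βB | 0.043 | 22.6 |
| MlERV-βC | 0.008 | 4.2 |
| MlERV-βD | 0.012 | 6.3 |
| MlERV-βE | 0.007 | 3.7 |
| MlERV-βF | 0.006 | 3.2 |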
a 5′ and 3′ LTR divergence: number of nucleotide differences per site.
b Molecular clock dating was used to estimate the time in millions of years (mya) since the integration of each betaretrovirus into the host genome, based on the number of nucleotide differences between the 5′ and 3′ LTRs of each betaretrovirus [25].
ND: Not dated; these βERVs could not be dated using this method. PvERV-βD and PvERV-βF contained non-sequenced regions within their 5′ LTR, while PvERV-βG and PvERV-βK contained bulk deletions within their 3′ LTRs.
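The dating in footnote b can be illustrated with the commonly used LTR-divergence relation T = K / (2r), where K is the divergence between the 5′ and 3′ LTRs and r is the host substitution rate per site per year. The factor of 2 reflects that both LTRs accumulate mutations independently after integration. The source cites the method [25] but states neither the formula nor the rates, so both are assumptions here: the rate values below are back-calculated from the divergence and age pairs in the table (roughly 4 × 10⁻¹⁰ for the first group of rows and 9.5 × 10⁻¹⁰ for the second), and only the product 2r is actually constrained by the tabulated values.

```python
def integration_age_mya(ltr_divergence, rate_per_site_per_year):
    """Estimate time since integration in millions of years.

    Uses T = K / (2r). The formula and any rate passed in are
    assumptions for illustration; the source does not state them.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ltr_divergence / (2.0 * rate_per_site_per_year)
    return years / 1e6


# Rates inferred from the table itself (assumption, not from the source).
RATE_FIRST_GROUP = 4e-10
RATE_SECOND_GROUP = 9.5e-10

age_first = integration_age_mya(0.024, RATE_FIRST_GROUP)    # table lists 30
age_second = integration_age_mya(0.055, RATE_SECOND_GROUP)  # table lists 29
```

Because the estimate scales linearly with K and inversely with r, any uncertainty in the assumed rate carries over proportionally to the estimated age.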
Figure 5. A proposed series of events leading to the current diversity in the genus Betaretrovirus. The proposed series of evolutionary events leading to eight distinct sub-groups of betaretroviruses is based on a combination of the phylogenetic analyses of Gag, Pol, and Env protein sequences, and the genomic features and organization of individual betaretroviruses.