Max L Nibert, Jesse D Pyle, Andrew E Firth.
Abstract
Sequence accessions attributable to novel plant amalgaviruses have been found in the Transcriptome Shotgun Assembly database. Sixteen accessions, derived from 12 different plant species, appear to encompass the complete protein-coding regions of the proposed amalgaviruses, which would substantially expand the size of genus Amalgavirus from 4 current species. Other findings include evidence for UUU_CGN as a +1 ribosomal frameshifting motif prevalent among plant amalgaviruses; for a variant version of this motif found thus far in only two amalgaviruses from solanaceous plants; for a region of α-helical coiled coil propensity conserved in a central region of the ORF1 translation product of plant amalgaviruses; and for conserved sequences in a C-terminal region of the ORF2 translation product (RNA-dependent RNA polymerase) of plant amalgaviruses, seemingly beyond the region of conserved polymerase motifs. These results additionally illustrate the value of mining the TSA database and others for novel viral sequences for comparative analyses.
Keywords: Amalgaviridae; Coiled coil; Database mining; Fungal virus; Plant virus; Ribosomal frameshifting; dsRNA virus
Year: 2016; PMID: 27596539; PMCID: PMC5052127; DOI: 10.1016/j.virol.2016.07.002
Source DB: PubMed; Journal: Virology; ISSN: 0042-6822; Impact factor: 3.616
Newly proposed (top) and original (bottom) plant amalgaviruses.
| Putative host species (cultivar) | GenBank accession no. | Amalgavirus (abbrev.) | Length (bp) | ORF1p (aa) | ORF2p (aa) | ORF1+2p (aa) |
|---|---|---|---|---|---|---|
| | GAAO01011981 | AcAV1 | 3453 | 391 | 779 | 1057 |
| | GAAN01008476 | AcAV2 | 3453 | 390 | 787 | 1065 |
| | GBIE01024896 | AoAV1 | 3356 | 382 | 783 | 1056 |
| | GBIE01028534 | AoAV2 | (2971) | (388) | (716) | (989) |
| | GEFY01004381 | CoAV1 | 3333 | 398 | 774 | 1066 |
| | JW101175 | CaAV1 | 3478 | 375 | 774 | 1062 |
| | GDRJ01026949 | CdAV1 | 3443 | 402 | 774 | 1070 |
| | GDQF01098448 | EbAV1 | 3433 | 384 | 784 | 1049 |
| | GDQF01120453 | EbAV2 | 3408 | 386 | 785 | 1054 |
| | GBXZ01049574 | FpAV1 | 3412 | 382 | 784 | 1057 |
| | GBXZ01002308 | FpAV2 | 3411 | 385 | 774 | 1053 |
| | GBXZ01009138 | FpAV3 | (3288) | 385 | (768) | (1047) |
| | | | 3381 | 385 | 769 | 1048 |
| | GEAC01063629 | GaAV1 | (2793) | (228) | 774 | (896) |
| | | | 3401 | 403 | 774 | 1071 |
| | GAYX01076418 | LpAV1 | (3296) | 385 | (770) | (1049) |
| | | | 3373 | 385 | 769 | 1048 |
| | GAFF01077243 | MsAV1 | 3423 | 394 | 772 | 1058 |
| | GDHJ01028335 | PeAV1 | 3394 | 384 | 781 | 1059 |
| | GECO01025317 | PpAV1 | (3015) | (322) | 777 | (1003) |
| | | | (3186) | (365) | 777 | (1046) |
| | GAMH01005363 | SeAV1 | (2798) | 382 | (613) | (880) |
| | GCJW01039808 | ScAV1 | (2851) | 382 | (633) | (916) |
| | | | 3412 | 398 | 781 | 1064 |
| | HM029246 | BLV | 3431 | 375 | 789 | 1054 |
| | HQ128706 | RHV-A | 3427 | 404 | 777 | 1077 |
| | EF442780 | STV | 3437 | 377 | 774 | 1062 |
| | EU371896 | VCV-M | 3434 | 394 | 771 | 1057 |
Nucleotide sequences that appear to be truncated at one or both ends have their lengths listed in parentheses.
For apparently full-length ORF1 translation products, the lengths are calculated from the first in-frame Met residue to the first in-frame stop codon. For ORF1 translation products that appear to be truncated at one or both ends, the lengths are calculated to the termini and are listed in parentheses.
For apparently full-length ORF2 translation products, the lengths are calculated from the first residue following the proposed +1 PRF site to the first in-frame stop codon. For ORF2 translation products that appear to be truncated at the C-terminal end, the lengths are calculated from the first residue following the proposed +1 PRF site to the C-terminus and are listed in parentheses.
For apparently full-length ORF1+2 translation products, the lengths are calculated from the first in-frame Met residue in ORF1p to the first in-frame stop codon in ORF2p, taking into account the proposed +1 PRF site. For ORF1+2 translation products that appear to be truncated at one or both ends, the lengths are calculated to the respective termini, taking into account the proposed +1 PRF site.
Sequences for which peer-reviewed papers are also available, as indicated in the text.
Sequences that were extended by reassembling contigs from SRA entries (see text and Table S1).
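The length conventions in the table footnotes can be illustrated with a minimal Python sketch. It assumes the standard genetic code and a UUU_CGN-type site, and it assumes that the residue decoded at the slippery codon is counted with the ORF1-derived part of the fusion while ORF2p begins at the next codon in the +1 frame. The names `translate`, `product_lengths`, `orf1_start`, and `slip_index` are hypothetical, and the toy RNA is invented for illustration; it is not a viral sequence.

```python
# Standard genetic code, indexed in the conventional TCAG base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(rna, start):
    """Translate from 0-based `start` up to (not including) the first in-frame stop."""
    seq = rna.upper().replace("U", "T")
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def product_lengths(rna, orf1_start, slip_index):
    """Lengths (aa) of ORF1p, ORF2p, and the ORF1+2p fusion.

    `slip_index` is the 0-based position of the first base of the slippery
    codon (assumption: the site is supplied by the caller, not predicted).
    """
    orf1p = translate(rna, orf1_start)
    # After a +1 slip, the next codon starts 4 nt past the start of the slippery codon.
    orf2p = translate(rna, slip_index + 4)
    # Residues decoded in the ORF1 frame, up to and including the slippery codon.
    upstream = (slip_index - orf1_start) // 3 + 1
    return {"ORF1p": len(orf1p), "ORF2p": len(orf2p), "ORF1+2p": upstream + len(orf2p)}


# Invented toy RNA: AUG GCU GCU UUU CGA GCU UAA in frame 0, with UUU_CGA at index 9.
toy_rna = "AUGGCUGCUUUUCGAGCUUAAGGCAUGA"
lengths = product_lengths(toy_rna, orf1_start=0, slip_index=9)
print(lengths)
```

As in the table, the fusion product in this toy case is shorter than the sum of ORF1p and ORF2p, because ribosomes that shift leave the ORF1 frame before reaching its stop codon.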
Fig. 1. Motifs for +1 PRF. Anticodon:codon base pairs are indicated by filled circles. The positions of these +1 PRF motifs in a broader, aligned RNA sequence context are shown in Fig. S3. (A) Previously identified motif from influenza A virus (FluA) segment 3 (S3) and previously proposed motifs from plant amalgaviruses BLV, RHV-A, and VCV-M (Firth et al., 2012) are shown. Proposed motifs from newly proposed plant amalgaviruses are also shown, along with the consensus at bottom. Both UUU and UUC are decoded by a single tRNAPhe iso-acceptor that has anticodon 3′AAG (Grosjean et al., 2010). First positioned on codon UUU in the +1 PRF motif, this tRNA is then thought to slip forward by one nucleotide (arrow) in the P site (onto codon UUC), positioning the next codon (GNN) in the A site for continued translation. (B) Previously proposed motif from plant amalgavirus STV (Depierreux et al., 2016) is shown. Anticodon 3′UCC (first positioned on codon AGG in the motif) was suggested to slip forward by one nucleotide in the P site (onto codon GGC), positioning the next codon (GUC) in the A site for continued translation. (C) Newly proposed motifs from plant amalgaviruses CaAV1 and STV are shown. Anticodon 3′GAI (first positioned on codon CUU in the motif) is thought to slip forward by one nucleotide in the P site (onto codon UUA), positioning the next codon (GNC) in the A site for continued translation.
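The slip described for panel A can be traced with a short Python sketch. It scans one reading frame for the UUU_CGN motif and reports the P-site codon before and after the +1 slip, plus the codon that then enters the A site. The function names and the `orf1_frame` parameter are hypothetical, the toy RNA is invented, and a motif match alone is of course not evidence that frameshifting occurs.

```python
import re


def find_uuu_cgn(rna, orf1_frame=0):
    """Return 0-based start positions of UUU_CGN with UUU in the given frame."""
    seq = rna.upper().replace("T", "U")
    hits = []
    # The lookahead lets overlapping candidates be examined as well.
    for match in re.finditer(r"(?=UUUCG[ACGU])", seq):
        if match.start() % 3 == orf1_frame:
            hits.append(match.start())
    return hits


def describe_slip(rna, site):
    """Codons occupying the P and A sites around a +1 slip at `site`."""
    seq = rna.upper().replace("T", "U")
    return {
        "p_site_before": seq[site:site + 3],     # UUU, in the ORF1 frame
        "p_site_after": seq[site + 1:site + 4],  # UUC, after slipping 1 nt forward
        "a_site_after": seq[site + 4:site + 7],  # GNN, first codon read in the +1 frame
    }


# Invented toy RNA with a single UUU_CGA site at index 9 in frame 0.
toy_rna = "AUGGCUGCUUUUCGAGCUUAAGGCAUGA"
sites = find_uuu_cgn(toy_rna, orf1_frame=0)
slip = describe_slip(toy_rna, sites[0])
print(sites, slip)
```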
Fig. 2. Pairwise sequence identity scores. Sequences of the ORF1 (lower left) and ORF2 (upper right) translation products of the indicated viruses (original and proposed) were compared in pairs using EMBOSS: needle or needleall. Sequence identity scores are shown in %. Shading off the diagonal highlights more closely related pairs for which the ORF1p score is >40% and the ORF2p score is >65%. For these analyses, the ORF1p sequences of AoAV1 and PpAV1 began with the first residue instead of the first Met residue since their encoding sequences appear to be 5′-truncated, and the ORF2p sequences of AoAV1 and SeAV1 ended with the last residue instead of the last residue before the downstream stop codon since their encoding sequences appear to be 3′-truncated; as a result, their scores here may be artificially low in some instances.
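A pairwise identity score of this kind can be sketched in Python as a global (Needleman-Wunsch) alignment followed by counting identical columns. This is not EMBOSS needle: the sketch assumes a simple match/mismatch score with a linear gap penalty in place of a substitution matrix with affine gaps, and it assumes identity is reported as identical positions divided by alignment length. The peptide strings are invented.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    x, y = needleman_wunsch(a, b)
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


print(percent_identity("MAAFELKA", "MAAFELKA"))
print(round(percent_identity("MAAF", "MAAFEL"), 1))
```

The second call shows how a truncated sequence lowers the score even when every residue it retains is identical: the unmatched end still counts toward the alignment length.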
Fig. 3. Phylogenetic tree, ORF2p (RdRp). Sequences of the ORF2 translation products were aligned using MAFFT and then subjected to phylogenetic analysis using PhyML as described in Materials and Methods. Values estimated from the data were 0.010 for the proportion of invariable sites and 1.473 for the gamma shape parameter. Alternative use of the rtREV amino acid substitution model for PhyML (in place of LG) yielded results largely identical to those shown here. Proposed plant amalgaviruses new to this report are labeled in gray. The tree is displayed as a rectangular phylogram rooted on the branch to family Partitiviridae members. Branch support values are shown in %, and those with support values <50% are collapsed to the preceding node. The few branches with support values between 50% and 80% are drawn with thinner lines. Scale bar, average number of substitutions per alignment position. See Table S2 for a summary of abbreviations and GenBank numbers. Vertical lines: approved or proposed spans of genera and families (family Amalgaviridae has been proposed to encompass proposed genus Zybavirus by Depierreux et al. (2016)). For each genus-level taxon, the number of characterized genome segments for each virus (1 or 2) and known hosts (P, plants; F, fungi; A, alveolate protist) are indicated.
Fig. 4. Graphical analyses, ORF2p (RdRp) and ORF1p. (A) The ORF2p (RdRp) alignment for plant amalgaviruses shown in Fig. S1 was analyzed using EMBOSS: plotcon, with a window size of 10 for averaging the similarity scores. Labels A, B, and C indicate peaks corresponding to those respective RdRp motifs. The horizontal line at top indicates the span of homologies to picornavirus RdRps identified by hhpred, as implemented with defaults at http://toolkit.tuebingen.mpg.de/hhpred. Asterisks identify peaks corresponding to highly conserved sequences in a C-terminal region seemingly outside the conserved core RdRp region. (B) The ORF1p alignment for plant amalgaviruses shown in Fig. S2 was analyzed using PCOILS. Results are shown for averaging windows of 14 (dotted line), 21 (dashed line), and 28 (solid line). Fig. S2 also highlights the regions of coiled coil propensity predicted for each individual virus. Graphical results for a representative individual plant amalgavirus sequence (STV) and others are shown in Fig. S4.
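The window averaging used for a conservation plot like panel A can be sketched in Python as follows. This is not the plotcon algorithm: the per-column score here is assumed to be the fraction of sequences sharing the most common residue, whereas plotcon derives similarity from a comparison matrix. Only the sliding-window mean corresponds to the step named in the caption. The function names and the small alignment are invented for illustration.

```python
from collections import Counter


def column_conservation(alignment):
    """Per-column fraction of sequences carrying the most common residue."""
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)
            continue
        most_common_count = Counter(residues).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


def windowed_mean(scores, window=10):
    """Mean of each run of `window` consecutive scores (one value per window start)."""
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    means = [total / window]
    for i in range(window, len(scores)):
        # Slide the window by one column: add the new score, drop the oldest.
        total += scores[i] - scores[i - window]
        means.append(total / window)
    return means


# Invented toy alignment of equal-length sequences.
toy_alignment = ["MAAFELKA", "MAAFELKA", "MAGFELRA"]
conservation = column_conservation(toy_alignment)
smoothed = windowed_mean(conservation, window=4)
print(conservation)
print(smoothed)
```

Wider windows flatten narrow peaks, which is why the choice of window size (10 for panel A; 14, 21, and 28 for the PCOILS traces in panel B) affects how sharply conserved regions stand out.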