Jiří Černý, Barbora Černá Bolfíková, James J Valdés, Libor Grubhoffer, Daniel Růžek.
Abstract
Viral RNA-dependent polymerases (vRdPs) are present in all RNA viruses; unfortunately, their sequence similarity is too low for phylogenetic studies. Nevertheless, vRdP protein structures are remarkably conserved. In this study, we used the structural similarity of vRdPs to reconstruct their evolutionary history. The major strength of this work is in unifying sequence and structural data into a single quantitative phylogenetic analysis, using a powerful Bayesian approach. The resulting phylogram of vRdPs demonstrates that RNA-dependent DNA polymerases (RdDPs) of viruses within the family Retroviridae cluster in a clearly separated group of vRdPs, while RNA-dependent RNA polymerases (RdRPs) of dsRNA and +ssRNA viruses are mixed together. This evidence supports the hypothesis that RdRPs replicating +ssRNA viruses evolved multiple times from RdRPs replicating dsRNA viruses, and vice versa. Moreover, our phylogram may be presented as a scheme for RNA virus evolution. The results are in concordance with the current concept of RNA virus evolution. Finally, the methods used in our work provide a new direction for studying ancient virus evolution.
Year: 2014 PMID: 24816789 PMCID: PMC4015915 DOI: 10.1371/journal.pone.0096070
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
The list of selected vRdPs.
| Baltimore class | family | genus | virus | abbreviation | PDB | str. | res. [Å] | co-crystallized molecules | citation |
| +ssRNA viruses | | | Rabbit hemorrhagic disease virus | RHEV | 1KHV | B | 2,5 | Lu²⁺ | |
| +ssRNA viruses | | | Murine norovirus | MuNORV1 | 3UQS | A | 2 | SO₄²⁻ | |
| +ssRNA viruses | | | Norovirus | NORV | 3BSO | A | 1,74 | Mg²⁺, CTP, RNA | |
| +ssRNA viruses | | | Sapporo virus | SappV | 2CKW | A | 2,3 | | |
| +ssRNA viruses | | | Dengue virus 3 | DENV3 | 2J7W | A | 2,6 | Zn²⁺, GTP | |
| +ssRNA viruses | | | Japanese encephalitis virus | JEV | 4K6M | A | 2,6 | SAH, SO₄²⁻, Zn²⁺ | |
| +ssRNA viruses | | | Hepatitis C virus 1 | HCV1 | 1NB6 | A | 2,6 | Mn²⁺, UTP | |
| +ssRNA viruses | | | Bovine viral diarrhea virus | BVDV1 | 1S49 | A | 3 | GTP | |
| +ssRNA viruses | | | Enterobacteria phage Qβ | Qβ | 3AVX | A | 2,41 | Ca²⁺, 3′dGTP, RNA | |
| +ssRNA viruses | | | Foot and mouth disease virus | FMDV | 2E9Z | A | 3 | Mg²⁺, UTP, PPi, RNA | |
| +ssRNA viruses | | | Human rhinovirus 16 A | HuRV16A | 1XR7 | A | 2,3 | | |
| +ssRNA viruses | | | Coxsackie virus B3 | CoxVB3 | 3CDW | A | 2,5 | PPi | |
| +ssRNA viruses | | | Human rhinovirus 1B | HuRV1B | 1XR6 | A | 2,5 | K⁺ | |
| +ssRNA viruses | | | Poliovirus 1 | PolV | 3OLB | A | 2,41 | Zn²⁺, ddCTP, RNA | |
| dsRNA viruses | | | Infectious pancreatic necrosis virus | IPNV | 2YI9 | A | 2,2 | Mg²⁺ | |
| dsRNA viruses | | | Infectious bursal disease virus | IBDV | 2PUS | A | 2,4 | | |
| dsRNA viruses | | | Pseudomonas phage phi6 | Φ6 | 1HI0 | P | 3 | Mn²⁺, Mg²⁺, GTP, DNA | |
| dsRNA viruses | | | Mammalian orthoreovirus 3 | MORV3 | 1N35 | A | 2,5 | Mn²⁺, 3′dCTP, RNA | |
| dsRNA viruses | | | Simian rotavirus Sa11 | SRV | 2R7W | A | 2,6 | GTP, RNA | |
| Reverse transcribing viruses | | | Moloney murine leukemia virus | MoMLV | 1RW3 | A | 3 | | |
| Reverse transcribing viruses | | | Human immunodeficiency virus 2 | HIV2 | 1MU2 | A | 2,35 | SO₄²⁻ | |
| Reverse transcribing viruses | | | Human immunodeficiency virus 1 | HIV1 | 3V81 | C | 2,85 | nevirapine, DNA | |
The vRdPs selected as described in Materials and Methods were assigned to individual viral species, genera, families and Baltimore groups. For each vRdP, the table lists its PDB code (PDB), the protein chain used (column str.), the resolution (column res.) and the cofactor, substrate, template and product molecules present in the structure (column co-crystallized molecules).
Figure 1. Protein structures of selected vRdP representatives.
Nine representatives of the selected vRdPs were chosen. Their structures are shown as ribbon diagrams. All molecules are shown in the same orientation, with the finger subdomain on the left, the palm at the bottom and the thumb on the right. The catalytic site is positioned in the centre of each molecule, and in some protein structures it is enclosed by the finger tips located at the top of the structure. Conserved protein structures typical of vRdPs (homomorphs) are highlighted by colours: violet (hmG), dark blue (hmF), dark green (hmA), light green (hmB), yellow (hmC), orange (hmD), red (hmE), and pink (hmH). The molecular renderings in this figure were created with Swiss PDB Viewer.
Comparison of structure similarity Z-score of all vRdPs.
| | | DENV | JEV | BVDV1 | HCV1 | PolV1 | HuRV16 | HuRV1B | CoxVB3 | FMDV | NORV | MuNORV1 | RHEV | SappV | Φ6 | Qβ | IBDV | IPNV | SRV | MORV3 | HIV1 | HIV2 |
| | | 2J7W-A | 4K6M-A | 1S49-A | 1NB6-A | 3OLB-A | 1XR7-A | 1XR6-A | 3CDW-A | 2E9Z-A | 3BSO-A | 3UQS-A | 1KHV-B | 2CKW-A | 1HI0-P | 3AVX-A | 2PUS-A | 2YI9-A | 2R7W-A | 1N35-A | 3V81-C | 1MU2-A |
| JEV | 4K6M-A | 42,9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| BVDV1 | 1S49-A | 22,8 | 21,7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HCV1 | 1NB6-A | 20,5 | 17,4 | 27,4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| PolV1 | 3OLB-A | 18,1 | 16,8 | 25,3 | 21,5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HuRV16 | 1XR7-A | 18,2 | 16,6 | 25,1 | 20,9 | 52,4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HuRV1B | 1XR6-A | 18,0 | 16,5 | 24,8 | 20,7 | 52,2 | 56,7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| CoxVB3 | 3CDW-A | 18,0 | 16,3 | 25,2 | 21,0 | 53,1 | 52,4 | 53,1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| FMDV | 2E9Z-A | 19,2 | 17,2 | 26,5 | 21,6 | 41,5 | 41,3 | 41,0 | 41,6 | - | - | - | - | - | - | - | - | - | - | - | - | - |
| NORV | 3BSO-A | 20,5 | 17,5 | 27,1 | 23,8 | 32,0 | 32,3 | 38,1 | 31,8 | 32,4 | - | - | - | - | - | - | - | - | - | - | - | - |
| MuNORV1 | 3UQS-A | 20,9 | 17,7 | 28,0 | 25,2 | 31,1 | 31,5 | 31,2 | 31,4 | 32,2 | 51,0 | - | - | - | - | - | - | - | - | - | - | - |
| RHEV | 1KHV-B | 18,7 | 17,9 | 27,4 | 24,3 | 32,4 | 33,0 | 32,9 | 33,0 | 32,4 | 39,3 | 42,7 | - | - | - | - | - | - | - | - | - | - |
| SappV | 2CKW-A | 17,5 | 15,0 | 24,7 | 20,6 | 30,4 | 30,8 | 30,8 | 30,9 | 30,8 | 39,1 | 39,4 | 43,9 | - | - | - | - | - | - | - | - | - |
| Φ6 | 1HI0-P | 14,8 | 10,6 | 4,1 | 16,4 | 17,2 | 17,0 | 16,9 | 17,7 | 15,7 | 18,5 | 19,1 | 17,7 | 14,1 | - | - | - | - | - | - | - | - |
| Qβ | 3AVX-A | 11,1 | 7,7 | 14,8 | 14,1 | 14,0 | 13,5 | 13,6 | 14,5 | 13,8 | 13,2 | 14,4 | 14,9 | 12,6 | 12,3 | - | - | - | - | - | - | - |
| IBDV | 2PUS-A | 8,4 | 6,6 | 10,7 | 9,5 | 12,1 | 12,1 | 11,9 | 12,6 | 12,9 | 13,4 | 13,3 | 12,6 | 12,9 | 9,5 | 6,0 | - | - | - | - | - | - |
| IPNV | 2YI9-A | 9,8 | 6,7 | 13,9 | 12,9 | 12,4 | 12,3 | 12,1 | 13,0 | 13,5 | 15,5 | 14,2 | 14,0 | 13,2 | 10,7 | 7,7 | 42,5 | - | - | - | - | - |
| SRV | 2R7W-A | 8,9 | 9,0 | 10,2 | 10,5 | 9,7 | 9,4 | 8,3 | 8,4 | 9,3 | 9,4 | 9,1 | 10,4 | 8,5 | 9,9 | 7,8 | 4,6 | 4,6 | - | - | - | - |
| MORV3 | 1N35-A | 6,5 | 4,0 | 10,3 | 7,6 | 7,8 | 7,3 | 7,1 | 7,8 | 8,1 | 7,9 | 7,9 | 8,1 | 8,0 | 8,4 | 8,0 | 6,5 | 6,6 | 15,4 | - | - | - |
| HIV1 | 3V81-C | 4,7 | 1,6 | 6,3 | 6,5 | 5,4 | 5,5 | 4,9 | 4,8 | 5,3 | 5,5 | 5,7 | 5,7 | 4,9 | 3,8 | 5,8 | 2,8 | 2,3 | 4,0 | 5,9 | - | - |
| HIV2 | 1MU2-A | 5,4 | 4,0 | 7,9 | 7,4 | 6,2 | 6,6 | 6,8 | 6,9 | 6,1 | 7,6 | 7,9 | 6,5 | 7,4 | 5,5 | 7,7 | 3,6 | 4,3 | 4,6 | 5,1 | 28,5 | - |
| MoMLV | 1RW3-A | 4,7 | 3,4 | 7,9 | 6,2 | 7,2 | 7,4 | 7,0 | 6,8 | 6,0 | 7,6 | 6,8 | 7,5 | 7,4 | 4,9 | 6,2 | 2,6 | 3,0 | 4,0 | 3,9 | 18,2 | 20,7 |
Individual vRdP structures are identified by PDB code and chain and are assigned to a virus species. Note that the structure similarity Z-score is high among vRdPs originating from viruses classified in the same genus (see genus Enterovirus (written in bold) as the best example). Structural similarity is somewhat lower but still high among vRdPs from viruses classified in the same family (see family Picornaviridae (written in italic) as the best example). Structural similarity of vRdPs from viruses classified in different families is considerably lower and decreases as the expected phylogenetic relationship becomes more distant. Compare all other families to family Picornaviridae.
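The trend described in this caption can be checked directly on a few entries of the matrix. The following is a minimal sketch, not part of the original analysis: it copies a handful of Z-scores from the table (decimal commas included) and averages them within a genus, within a family, and across Baltimore classes. The grouping of PolV1, HuRV16, HuRV1B and CoxVB3 into genus Enterovirus, with FMDV as a picornavirus outside that genus, is an assumption taken from standard taxonomy, because the bold and italic marking is not visible here. The names `RAW`, `ZSCORE` and `z` are illustrative.

```python
from itertools import combinations
from statistics import mean

# Pairwise Z-scores copied from the table above (decimal commas as in the source).
RAW = {
    ("PolV1", "HuRV16"): "52,4",
    ("PolV1", "HuRV1B"): "52,2",
    ("HuRV16", "HuRV1B"): "56,7",
    ("PolV1", "CoxVB3"): "53,1",
    ("HuRV16", "CoxVB3"): "52,4",
    ("HuRV1B", "CoxVB3"): "53,1",
    ("PolV1", "FMDV"): "41,5",
    ("HuRV16", "FMDV"): "41,3",
    ("HuRV1B", "FMDV"): "41,0",
    ("CoxVB3", "FMDV"): "41,6",
    ("PolV1", "HIV1"): "5,4",
    ("HuRV16", "HIV1"): "5,5",
    ("HuRV1B", "HIV1"): "4,9",
    ("CoxVB3", "HIV1"): "4,8",
}

# Store each pair as an unordered key so that lookups are symmetric.
ZSCORE = {frozenset(pair): float(value.replace(",", ".")) for pair, value in RAW.items()}


def z(a, b):
    """Return the structural similarity Z-score for an unordered pair of vRdPs."""
    return ZSCORE[frozenset((a, b))]


# Assumed taxonomy (not readable from the table itself): four Enterovirus members.
ENTEROVIRUS = ["PolV1", "HuRV16", "HuRV1B", "CoxVB3"]

within_genus = mean(z(a, b) for a, b in combinations(ENTEROVIRUS, 2))
within_family = mean(z(v, "FMDV") for v in ENTEROVIRUS)   # same family, other genus
between_class = mean(z(v, "HIV1") for v in ENTEROVIRUS)   # RdRP versus RdDP

print(f"within genus:  {within_genus:.2f}")
print(f"within family: {within_family:.2f}")
print(f"across class:  {between_class:.2f}")
```

On these entries the three averages fall in the order the caption describes: highest within the genus, intermediate within the family, lowest against a reverse transcriptase.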
Matrix describing individual features used in phylogenetic analysis of vRdPs.
| Virus | Family | Genus | PDB ID | Chain | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U |
| DENV3 | | | 2J7W | A | 0 | 0 | 0 | 0 | 0 | 0 | N | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| JEV | | | 4K6M | A | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| BVDV1 | | | 1S49 | A | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| HCV1 | | | 1NB6 | A | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| PolV1 | | | 3OLB | A | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 |
| HuRV16 | | | 1XR7 | A | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 |
| HuRV1B | | | 1XR6 | A | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 |
| CoxVB3 | | | 3CDW | A | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 |
| FMDV | | | 2E9Z | A | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 |
| NORV | | | 3BSO | A | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 |
| MuNORV1 | | | 3UQS | A | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| RHEV | | | 1KHV | B | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 |
| SappV | | | 2CKW | A | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| Φ6 | | | 1HI0 | P | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 1 | 1 | 0 | 0 | 0 | 2 | 1 | 0 | 2 | 1 | 0 | 1 | 1 | 2 |
| Qβ | | | 3AVX | A | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 |
| IBDV | | | 2PUS | A | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 1 | 0 | 1 | 0 |
| IPNV | | | 2YI9 | A | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 1 | 0 | 1 | 0 |
| SRV | | | 2R7W | A | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 2 | 0 | 0 | 1 | 1 | 3 |
| MORV3 | | | 1N35 | A | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 2 | 0 | 0 | 1 | 1 | 3 |
| HIV1 | | | 3V81 | C | 1 | 1 | 2 | 1 | 0 | 1 | 2 | 0 | 2 | 2 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 |
| HIV2 | | | 1MU2 | A | 1 | 1 | 2 | 1 | 0 | 1 | 2 | 0 | 2 | 2 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 |
| MoMLV | | | 1RW3 | A | 1 | 1 | 2 | 1 | 0 | 1 | 2 | 0 | 2 | 2 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 |
Individual vRdP structures are identified by PDB code and chain and are assigned to a virus species. Rows in the matrix represent vRdPs; the 21 compared features (A to U) are listed as columns. The compared features are:
(A) polymerase product: 0 RNA, 1 DNA;
(B) polymerase template: 0 RNA, 1 both DNA and RNA;
(C) NA synthesis initiation: 0 de novo, 1 protein primer, 2 RNA primer;
(D) overall polymerase domain architecture as described in [23]: 0 active site is encircled by finger tips, 1 active site is open (fingers subdomain does not touch thumb subdomain);
(E) polymerase core organization: 0 ABC, 1 CAB;
(F) motif F length: 0 normal (motif F2 is present), 1 short (motif F2 is absent), 2 long (insertion is present in motif F);
(G) motif F structure: 0 ββα(3₁₀)β, 1 βββ, 2 ββ;
(H) F–A (C) motif connection: 0 short (≤35 amino acid residues), 1 long structured (>35 amino acid residues);
(I) motif A structure: 0 -3₁₀, 1 βα, 2 β3₁₀;
(J) A–B motif connection: 0 ααββ, 1 αββαββ, 2 ββ;
(K) length of helix in motif B: 0 normal (≤21 amino acid residues), 1 long (>22 amino acid residues);
(L) kink in motif B: 0 absent, 1 present;
(M) B–C (D) motif connection: 0 very short (≤5 amino acid residues), 1 loop (6–14 amino acid residues), 2 long helical (≥15 amino acid residues, with a helix at least 8 amino acid residues long);
(N) motif C length: 0 short (10 amino acid residues), 1 long (>10 amino acid residues);
(O) C (B)–D motif connection: 0 short loop (≤5 amino acid residues), 1 long loop (>5 amino acid residues);
(P) motif D structure: 0 3₁₀α-, 1 α-, 2 αβ;
(Q) position of helix in motif D: 0 normal position, 1 shifted position;
(R) D–E motif connection: 0 short (<20 amino acid residues), 1 long structured (>20 amino acid residues);
(S) motif E structure: 0 wide, 1 narrow;
(T) thumb domain size: 0 large (>180 amino acid residues), 1 small (<180 amino acid residues);
(U) priming motif: 0 none, 1 priming loop in thumb subdomain, 2 priming loop in palm subdomain, 3 polymerase C terminal part.
Symbols α, β, 3₁₀, and L mean α helix, β strand, 3₁₀ helix, and loop, respectively.
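To make the coding concrete, the sketch below stores a few rows of the matrix as 21-character state strings and lists the features in which two polymerases differ. This simple mismatch count is only an illustration of how the discrete states can be compared; it is not the Bayesian analysis used in the study. Treating the non-numeric entry "N" (feature G of DENV3) as missing data is an assumption, and the names `MATRIX` and `differing_features` are hypothetical.

```python
FEATURES = "ABCDEFGHIJKLMNOPQRSTU"  # the 21 feature columns, in table order

# Feature states copied from the matrix above, one character per feature.
MATRIX = {
    "DENV3": "000000N10000200000001",
    "JEV":   "000000010000200000001",
    "BVDV1": "000000011000110000001",
    "PolV1": "001000012000110200010",
    "HIV1":  "112101202201010100110",
    "HIV2":  "112101202201010100110",
}


def differing_features(a, b):
    """Return the feature letters whose coded states differ between two vRdPs.

    Non-numeric entries (such as the 'N' in DENV3) are treated as missing
    and skipped; this handling is an assumption, not taken from the paper.
    """
    return [
        feature
        for feature, x, y in zip(FEATURES, MATRIX[a], MATRIX[b])
        if x.isdigit() and y.isdigit() and x != y
    ]


for pair in [("JEV", "BVDV1"), ("PolV1", "HIV1"), ("HIV1", "HIV2")]:
    diff = differing_features(*pair)
    print(pair, len(diff), "".join(diff))
```

With these rows, the two flaviviral and pestiviral polymerases differ in three features, the poliovirus RdRP and the HIV-1 reverse transcriptase differ in twelve, and the two HIV enzymes are coded identically.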
Figure 2. Structure-based sequence alignment of the vRdP finger subdomain.
Each row begins with the name of the virus encoding the vRdP, followed by the PDB code of the vRdP. The numbers at the beginning and at the end of each row give the positions of the first and last amino acid residue of that row in the full-length protein bearing polymerase activity (including all additional protein domains). The numbering above the alignment gives the position of individual amino acid residues in the alignment. Amino acid residues forming α helices, 3₁₀ helices, and β strands are shown in red, green, and blue, respectively. Solvent-accessible amino acid residues are written in lower case letters; solvent-inaccessible residues in upper case letters. Amino acid residues with a positive phi torsion angle, residues hydrogen-bonded to a main-chain amide, and residues hydrogen-bonded to a main-chain carbonyl are underlined, written in bold, and written in italic, respectively. The most frequent amino acid residue at each alignment position is listed in the row called consensus. Highly conserved positions (more than 80%) are indicated by uppercase violet letters. The 100% conserved amino acid residues are shown by uppercase red letters. The uppermost row shows the Clustal-calculated consensus. Amino acid residues in the conserved sequence motifs G and F typical for all vRdPs are highlighted by violet and dark blue frames. Amino acid residues in the conserved structural homomorphs hmG and hmF are highlighted in the same but lighter colours.
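The consensus row and the two conservation levels described in this caption can be illustrated with a short sketch. The six sequences below are invented toy data, not taken from the alignment, and the function names are hypothetical. Computing the conserved fraction over all sequences (gaps counted in the denominator) is also an assumption; the caption does not say how gaps were handled.

```python
from collections import Counter

# Toy alignment for illustration only; these are not real vRdP sequences.
TOY_ALIGNMENT = {
    "seq1": "GDDKR",
    "seq2": "GDDKR",
    "seq3": "GDDRR",
    "seq4": "GDDKK",
    "seq5": "GDDKR",
    "seq6": "GDE-R",
}


def consensus_columns(alignment, gap="-"):
    """Return (most frequent residue, fraction of sequences carrying it) per column."""
    rows = list(alignment.values())
    total = len(rows)
    result = []
    for column in zip(*rows):
        residues = [c.upper() for c in column if c != gap]
        if not residues:
            result.append((gap, 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        result.append((residue, count / total))
    return result


def conservation_label(fraction):
    """Apply the caption's thresholds: 100% conserved, more than 80%, or neither."""
    if fraction == 1.0:
        return "fully conserved"
    if fraction > 0.8:
        return "highly conserved"
    return "variable"


columns = consensus_columns(TOY_ALIGNMENT)
consensus = "".join(residue for residue, _ in columns)
labels = [conservation_label(fraction) for _, fraction in columns]

print(consensus)
for position, ((residue, fraction), label) in enumerate(zip(columns, labels), start=1):
    print(position, residue, f"{fraction:.2f}", label)
```

In this toy case the first two columns are fully conserved, the third and fifth pass the 80% threshold with five of six sequences agreeing, and the fourth does not.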
Figure 3. Structure-based sequence alignment of the vRdP palm subdomain.
Alignment of vRdPs is as in Figure 2. Amino acid residues in the conserved sequence motifs F, A, B, and C are highlighted by dark blue, dark green, light green, and yellow frames. Amino acid residues in the conserved structural homomorphs are highlighted in the same but lighter colours. Only three amino acid residues are 100% conserved at the same position in the entire alignment: an arginine residue at position 327 in motif F, an aspartate residue at position 411 in motif A, and a glycine residue at position 517 in motif B. A fourth 100% conserved amino acid residue is an aspartate residue in motif C. Although this aspartate residue is superimposable in the protein structures, it occupies a different position in the structure-based sequence alignment of the primary structures because of the cyclic permutation in the IBDV and IPNV RdRPs (see position 397 for birnaviral RdRPs and position 580 for the remaining vRdPs).
Figure 4. Structure-based sequence alignment of the vRdP thumb subdomain.
Alignment of vRdPs is as in Figures 2 and 3. Amino acid residues in the conserved sequence motifs D and E are highlighted by orange and red frames. Amino acid residues in the conserved structural homomorphs are highlighted in the same but lighter colours. The hmH homomorph is highlighted in pink.
Figure 5. Phylogenetic tree of vRdP evolution.
The phylogenetic tree was calculated by an analysis unifying sequence and structure information. Only the names of the virus species encoding the vRdPs are listed in the tree. Individual virus species are grouped in genera (blue) and families (red) according to the current ICTV virus taxonomy.