Maria A Prostova1, Andrei A Deviatkin1, Irina O Tcelykh1,2, Alexander N Lukashev1,3, Anatoly P Gmyl1,2,3.
Abstract
BACKGROUND: Enteroviruses are small non-enveloped viruses with a (+) ssRNA genome containing one open reading frame. Enterovirus protein 3C (or 3CD for some species) binds the replicative element oriL to initiate replication. Enterovirus replication is a low-fidelity process, which on the one hand allows the virus to adapt to a changing environment and on the other hand requires additional mechanisms to maintain genome stability. Structural disturbances in the apical region of oriL domain d can be compensated by amino acid substitutions at positions 154 or 156 of 3C (amino acid numbering corresponds to poliovirus 3C), suggesting co-evolution of these interacting sequences in nature. The aim of this work was to understand co-evolution patterns of two interacting elements of the replication machinery in enteroviruses: the apical region of oriL domain d and its putative binding partners in the 3C protein.
Keywords: Enterovirus; RNA-protein interaction; Tetraloop; Virus evolution
Year: 2017 PMID: 29018627 PMCID: PMC5633025 DOI: 10.7717/peerj.3896
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Schematic representation of the interaction of poliovirus protein 3CD (colored blue) with the poliovirus genome replicative element oriL.
Domains a, b, c and d of oriL are labeled. The apical region of domain d, corresponding to the tetraloop and its flanking base pair, is colored red.
Number of full genome sequences that contained the oriL region and number of unique domain d sequences before and after filtration.
For Enterovirus E and F, the number of unique tetraloops is shown separately for the first and the second oriL.
| Species | Number of full genome sequences | Number of full genome sequences after 1% nucleic identity filtration | Number of unique tetraloops before filtration | Number of unique tetraloops after filtration | | |
|---|---|---|---|---|---|---|
| Enterovirus A | 1052 | 564 | 17 | 16 | | |
| Enterovirus B | 339 | 244 | 18 | 18 | | |
| Enterovirus C | 747 | 274 | 15 | 12 | | |
| Enterovirus D | 419 | 57 | 7 | 6 | | |
| Enterovirus E | 12 | 10 | 6 | 5 | 6 | 5 |
| Enterovirus F | 13 | 10 | 4 | 3 | 4 | 3 |
| Enterovirus G | 10 | 8 | 6 | 6 | | |
| Enterovirus H | 3 | 2 | 2 | 2 | | |
| Enterovirus J | 8 | 5 | 3 | 3 | | |
| Rhinovirus A | 159 | 118 | 8 | 8 | | |
| Rhinovirus B | 50 | 37 | 7 | 7 | | |
| Rhinovirus C | 38 | 37 | 6 | 6 | | |
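The filtration step summarized in the table can be illustrated with a minimal Python sketch. It assumes that sequences are already aligned to equal length, that "1% filtration" means a sequence is dropped when it differs from an already retained sequence by less than 1% of positions, and that the loop occupies known alignment coordinates. The greedy strategy, the function names and the toy sequences are all hypothetical and are not taken from the paper, whose actual procedure may differ.

```python
def mismatch_fraction(a, b):
    """Fraction of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def filter_by_divergence(seqs, threshold=0.01):
    """Greedy filter (an assumption, not the authors' method):
    keep a sequence only if it differs from every sequence kept
    so far by at least `threshold` of positions."""
    kept = []
    for s in seqs:
        if all(mismatch_fraction(s, k) >= threshold for k in kept):
            kept.append(s)
    return kept


def unique_loops(seqs, start, end):
    """Set of distinct loop sequences at alignment columns [start, end)."""
    return {s[start:end] for s in seqs}


# Toy data: a 4-nt loop followed by 196 identical positions.
tail = "A" * 196
genomes = [
    "UGCG" + tail,              # retained (first sequence)
    "UGCG" + "C" + tail[1:],    # 1/200 = 0.5% different: dropped
    "CAUG" + tail,              # 3/200 = 1.5% different: retained
    "UGUG" + tail,              # 1/200 = 0.5% different: dropped
]

filtered = filter_by_divergence(genomes)
loops_before = unique_loops(genomes, 0, 4)
loops_after = unique_loops(filtered, 0, 4)
```

In this toy set the loop UGUG occurs only in a genome that is nearly identical to a retained one, so it disappears from the filtered set. This is one way in which the number of unique tetraloops can decrease after filtration.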
Figure 2. Secondary structure of oriL domain d of distinct enterovirus species.
(A–I) Secondary structure of oriL domain d in the Enterovirus A-J genomes. For Enterovirus E and F, domain d of the first oriL is shown. The secondary structure of domain d of Porcine enterovirus 9 strain UKG/410/73 was folded using Krumbholz et al. (2002) as a reference; (J–L) secondary structure of oriL domain d in the Rhinovirus A-C genomes.
Occurrence of domain d apical sequences in filtered sets of full genomes of different enterovirus species and serotypes.
Tetraloops CCCG, UGUG, CAUG and UUGG, which were unique to species Enterovirus A, B, C and D and were lost upon filtration, were added back to maintain the diversity of loop sequences and are shown in blue. The red-to-green gradient is a heat map of the abundance of genomes with each domain d sequence.
Figure 3. Distribution of domain d loop sequences and amino acid motifs in the 3C protein.
(A) Distribution of domain d loop sequences. The regions corresponding to tetraloop consensuses, triloops and pentaloops are shown. The number of genomes is cut off at 15 to give a clear view of the sequence distribution. (B) Frequency plot of the amino acid sequence of 3C in species of the genus Enterovirus. The amino acid sequence logo was generated with the WebLogo server (Crooks et al., 2004). Arrows indicate amino acids of the proteolytic triad (Glu71 and Cys147), the first and the last amino acids of motif 82KFRDI86, the putative RNA-binding tripeptide 154–156 of 3C and Lys153 in the protein 3C of Rhinovirus A.
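A frequency plot such as panel B is built from per-column residue frequencies in an alignment. The Python sketch below shows that calculation only. It is not the WebLogo implementation, and the function name and the toy variants of the KFRDI motif are hypothetical illustrations rather than data from the paper.

```python
from collections import Counter


def column_frequencies(aligned):
    """Per-column residue frequencies for equal-length aligned sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in aligned)
        total = sum(counts.values())
        freqs.append({aa: n / total for aa, n in counts.items()})
    return freqs


# Hypothetical toy alignment around the motif named in the caption.
toy = ["KFRDI", "KFRDI", "KFKDI", "RFRDI"]
freqs = column_frequencies(toy)
```

Each element of `freqs` maps the residues seen in one column to their relative frequency. A logo generator then scales the letter heights from values of this kind.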