Abstract
RNA structures are widely distributed across all life forms. The global conformation of these structures is defined by a variety of constituent structural units such as helices, hairpin loops, kissing-loop motifs and pseudoknots, which often behave in a modular way. Their ubiquitous distribution is associated with a variety of functions in biological processes. The location of these structures in the genomes of RNA viruses is often coordinated with specific processes in the viral life cycle, where the presence of the structure acts as a checkpoint for deciding the eventual fate of the process. These structures have been found to adopt complex conformations and exert their effects by interacting with ribosomes, multiple host translation factors and small RNA molecules like miRNA. A number of such RNA structures have also been shown to regulate translation in viruses at the level of initiation, elongation or termination. The role of various computational studies in the preliminary identification of such sequences and/or structures and subsequent functional analysis has not been fully appreciated. This review aims to summarize the processes in which viral RNA structures have been found to play an active role in translational regulation, their global conformational features and the bioinformatics/computational tools available for the identification and prediction of these structures.
Keywords: RNA secondary structure; RNA structure dynamics; RNA viruses; bioinformatics algorithms; non-canonical translation; viral bioinformatics
Year: 2020 PMID: 31204430 PMCID: PMC7109810 DOI: 10.1093/bib/bbz054
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1. Schematic representation of non-canonical translation strategies employed by viruses and discussed in this review. (A) IRES: an extensive RNA structure in the 5′-UTR, composed of many hairpin stems, recruits the 40S ribosomal subunit and other translation factors. These factors assemble at the structure and scan the mRNA for the nearest AUG codon to start translation. (B) −1 ribosomal frameshifting: an RNA pseudoknot or stem–loop present in the overlapping genomic region stalls the actively translating ribosome and induces a shift in the reading register, so that downstream translation resumes in a new reading frame. (C) Ribosomal shunting: an extensive RNA structure located between a short and a long ORF shunts the 40S subunit of the ribosome between the two ORFs. (D) Stop codon readthrough: an RNA structure located between two ORFs prevents recognition of the cognate stop codon and binding of release factors, so that translation can continue. (E,F) Reinitiation and non-AUG initiation: in reinitiation, upstream sequence motifs [termination upstream ribosome binding site (TURBS)] interact with the 40S subunit to reinitiate translation. In non-AUG initiation, the RNA sequence and/or structure stimulates reading of a near-cognate start codon by the ribosome.
Figure 2. Experimentally verified atomic resolution RNA structures. (A) HCV IRES secondary structure showing the modular architecture and corresponding tertiary structures of different units (secondary structure schematic adapted from [40]). (B) Cryo-EM-derived structure of ribosome-bound cricket paralysis virus IRES. (C) Secondary and tertiary structure (1Z2J) of RNA stem–loop from HIV involved in −1 PRF. (D) Secondary structure of pseudoknot involved in programmed −1 ribosomal frameshift (−1 PRF) from mouse mammary tumor virus (MMTV) and its NMR structure (1RNK).
List of existing tools and databases dedicated to storing information and prediction of sequence signals involved in non-canonical translation processes in viruses and other organisms
| Program | Description |
|---|---|
| IRESite | Database housing experimentally annotated cellular and viral IRES elements, as well as |
| HCVIVdb | Database of HCV IRES sequences with published natural or engineered mutations |
| PRFdb | Database of ribosomal frameshifting signals from eukaryotic database |
| KnotInFrame | Prediction of −1 PRF signals |
| FSFinder2 | Prediction of −1 and +1 ribosomal frameshifting sites in genomic and mRNA sequences |
| FSscan | Prediction of +1 ribosomal frameshift signals |
| FSDB | Database of experimentally verified and predicted ribosomal frameshifting |
| Recode | Database of experimentally known translation recoding events and signals |
Comparison of RNA sequence/structure motif prediction programs and the algorithms they use in the background. Average sensitivity and specificity values for the predictions are provided wherever applicable. The list also includes programs that were developed but are no longer maintained (indicated by an asterisk).
| Program | Background algorithm | Description | Performance parameters |
|---|---|---|---|
| VIPS | RNALfold | Calculates local, thermodynamically stable RNA secondary structures from minimum free energy parameters | Accuracy: 51.87% |
| | RNA Align | Comparative secondary structure analysis | |
| | pknotsRG | Prediction of pseudoknotted regions in the predicted IRES secondary structures | |
| Advantages | Predicts both viral and cellular IRES structures. Can predict IRES structures with pseudoknots. | | |
| Limitations | Because the algorithm depends on conservation of sequence and structural features, prediction of cellular IRESes is poor, since cellular IRESes mostly lack any consensus sequence or structural features. | | |
| IRESPred | RNAFold | Support vector machine-based classifiers; RNAFold and RPISeq were used as back-end programs to compute the classifying features | Accuracy: 70.89% |
| | RPISeq | | |
| Advantages | Prediction scores were consistently better than those of VIPS, since the algorithm is independent of intrinsic sequence conservation bias. | | |
| Limitations | The principal parameter used by the prediction algorithm is the interaction between the IRES sequence and the 40S ribosome. This interaction is not conserved in cellular IRES and viral IRES from HCV and cricket paralysis virus. Hence, the algorithm is unable to predict an IRES that lacks any of the features defined in the feature set of the machine-learning algorithm. | | |
| KnotInFrame (−1 PRF) | PknotsRG-fs | Constraint-based folding of the input sequence to enforce pseudoknot formation, modified from the original pknotsRG program | Ranking the predictions based on differences in constrained and relaxed |
| Advantages | Computationally efficient; scans complete genomes within a few hours. | | |
| Limitations | Prediction accuracy is only as good as the accuracy of the thermodynamic parameters used for RNA secondary structure prediction. | | |
| FSFinder | | Scans for the slippery sequence motif and estimates the base-pairing possibility in the contextual region for the presence of stimulatory signals. | Sensitivity (−1 FS): 88% |
| Advantages | Predicts both −1 and +1 frameshift sequences. | | |
| Limitations | Prediction of +1 frameshifting has been tested on only two genes: protein chain release factor ( | | |
*Indicates databases which are no longer being maintained.
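
The first step described for FSFinder in the table above, scanning for a slippery sequence motif, can be illustrated with a minimal Python sketch. It is not FSFinder's implementation. It assumes the canonical −1 PRF heptamer X XXY YYZ (XXX any three identical bases, YYY either AAA or UUU, Z any base except G), which the table itself does not spell out. The function name `find_slippery_sites` and the regular expression are hypothetical, chosen only for illustration. The sketch ignores the reading frame and does not attempt the second step, estimating base pairing in the downstream region.

```python
import re

# Assumed motif (not given in the table): canonical -1 PRF heptamer X XXY YYZ.
#   XXX = three identical bases, YYY = AAA or UUU, Z = A, C or U (not G).
# The lookahead makes the scan report overlapping candidates as well.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2(AAA|UUU)[ACU]))")


def find_slippery_sites(sequence):
    """Return (0-based start, heptamer) for each candidate slippery site.

    Hypothetical helper: accepts RNA or DNA letters and converts T to U.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in SLIPPERY.finditer(rna)]


if __name__ == "__main__":
    # Toy input containing a single UUUAAAC heptamer.
    for start, heptamer in find_slippery_sites("GGCUUUAAACGGG"):
        print(start, heptamer)
```

A real predictor would go on to keep only in-frame heptamers and to score the sequence a few nucleotides downstream for a stem–loop or pseudoknot. That is where tools such as those listed above differ from a plain pattern match.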