Chun Shen Lim, Chris M Brown.
Abstract
Structured RNA elements may control virus replication, transcription and translation, and their distinct features are being exploited by novel antiviral strategies. Viral RNA elements continue to be discovered using combinations of experimental and computational analyses. However, the wealth of sequence data, notably from deep viral RNA sequencing, viromes, and metagenomes, makes computational approaches an essential discovery tool. In this review, we describe practical approaches being used to discover functional RNA elements in viral genomes. In addition to success stories in new and emerging viruses, these approaches have revealed some surprising new features of well-studied viruses, e.g., human immunodeficiency virus, hepatitis C virus, influenza, and dengue viruses. Some notable discoveries were facilitated by new comparative analyses of diverse viral genome alignments. Importantly, comparative approaches for finding RNA elements embedded in coding and non-coding regions differ. With the exponential growth of computer power we have progressed from stem-loop prediction on single sequences to cutting-edge 3D prediction, and from command-line to user-friendly web interfaces. Despite these advances, many powerful, user-friendly prediction tools and resources are underutilized by the virology community.
Keywords: RNA structure prediction; RNA viruses; bioinformatics; cis-regulatory elements; comparative genomics; non-coding RNAs; pseudoknots; structural motifs
Year: 2018 PMID: 29354101 PMCID: PMC5758548 DOI: 10.3389/fmicb.2017.02582
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Structural RNA elements, the most used prediction tools, and challenges for their prediction.
| mfold/UNAFold (Zuker, | Predictions normally only consider standard or canonical base-pairs C-G, U-A, and U-G. Single base-pairs (“lone pairs”) are often excluded by default. Functionally important alternative structures depending on ligand binding need special consideration (e.g., riboswitches). | |
| mfold/UNAFold, RNAfold, RNAStructure/Fold. | Predicted bulges may use non-canonical base-pairs e.g., U-U, A-G, kink-turn. | |
| mfold/UNAFold, RNAfold, RNAStructure/Fold. | Predicted internal loops may use non-canonical base-pairs e.g., U-U, A-G. | |
| mfold/UNAFold, RNAfold (free energy bonus). RNAComposer (Popenda et al., | Most 2D predictions do not predict the intraloop pair (e.g., the G-A pair of the GNRA loop), although 3D predictions may. Such a pair should be considered if a terminal four-base loop is predicted. Other types of loops, e.g., tri-loops and anticodon-like loops, can also be stabilized. | |
| PknotsRG (Janssen and Giegerich, | Not predicted by most 2D software. Alternative forms of pseudoknot are found. | |
| RNAComposer, 3dRNA. | Widespread, but most software will not predict these because they involve non-canonical base-pairs. Prediction requires 3D or homology-based software, which is yet to be integrated into the most used RNA structure prediction tools. | |
| mfold/UNAFold, RNAfold, RNAStructure/Fold. | Junctions may be important for ligand binding but 3D structures are difficult to predict. | |
| RNAComposer, 3dRNA. | Common in 3D structures. | |
| pAliKiss (Janssen and Giegerich, | Difficult to predict without prior knowledge. | |
| Combination of stem-loop and pseudoknot prediction tools. | No specialized tools available to date. | |
| mfold/UNAFold, LRIscan (Fricke and Marz, | Difficult to predict without prior knowledge. Only two specialized tools are available to date: CovaRNA and LRIscan. Only LRIscan is optimized for viral genomes, and it is yet to be proven useful. | |
| RNAhybrid (Rehmsmeier et al., | Difficult to predict without prior knowledge. |
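Two of the challenges listed in the table, the restriction to canonical base-pairs (C-G, U-A, and U-G) and the default exclusion of lone pairs, can be illustrated with a minimal Python sketch. It is not taken from any of the tools named above; the function names `is_canonical` and `lone_pairs` and the 0-based `(i, j)` pair representation are assumptions made for illustration only.

```python
# Hypothetical helpers for illustration; not the API of mfold, RNAfold, or any other tool.

# Canonical pairs considered by most 2D predictors: Watson-Crick plus G-U wobble.
CANONICAL = {("C", "G"), ("G", "C"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def is_canonical(base_a, base_b):
    """Return True if the two bases form a C-G, U-A, or U-G pair."""
    return (base_a.upper(), base_b.upper()) in CANONICAL


def lone_pairs(pairs):
    """Return base-pairs (i, j) that are not stacked on an adjacent pair.

    A pair is treated as "lone" when neither (i - 1, j + 1) nor
    (i + 1, j - 1) is also present in the structure.
    """
    pair_set = set(pairs)
    return sorted(
        (i, j)
        for i, j in pair_set
        if (i - 1, j + 1) not in pair_set and (i + 1, j - 1) not in pair_set
    )
```

For example, `lone_pairs([(0, 10), (1, 9), (3, 7)])` flags only `(3, 7)`, because the first two pairs stack on each other, and `is_canonical("U", "U")` is false, which is why a U-U pair in a bulge or internal loop would be missed by a predictor limited to canonical pairs.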
Figure 1. Known viral RNA structures, from stem-loops to complex tRNA-like structures. (A) The simplest form of RNA structure is a stem-loop. A stem-loop is shown with a bulge, internal loop or (B) tetraloop. (C) The loop can also base-pair with upstream or downstream sequences to form a pseudoknot. (D) Interaction between the loops of two stem-loops forms kissing hairpins. (E) A relatively complex structure is a cloverleaf or tRNA-like structure that often consists of multiple stem-loops and pseudoknots.
Figure 2. Dumbbell RNA structures of flaviviruses. Representations of the 5′ dumbbell of dengue virus 2 as (A) dot-bracket notation, (B) arc, (C) stem-loop, and (D) circular diagrams. The diagrams were drawn with VARNA (with pseudoknotted interactions). (E) Excerpts of the Stockholm file of the dumbbell elements (both 5′ and 3′ dumbbells) from Rfam. A Stockholm file consists of descriptions of the RNA structure of interest, a multiple sequence alignment, and a consensus secondary structure in dot-bracket format. (F) Rfam model of the dumbbell structure, assessed by R-scape and illustrated by R2R. (G) Representation of the 5′ dumbbell of dengue virus 2 as a 3D structure (modeled by SimRNAWeb; Magnus et al., 2016).
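To make the dot-bracket notation concrete, the following Python sketch converts a dot-bracket string into a list of base-pairs. It assumes one common extended convention in which a second bracket type such as `[` and `]` marks pseudoknotted pairs; the function name `parse_dot_bracket` and the example string are hypothetical and do not represent the dengue virus dumbbell or the exact notation used in Rfam Stockholm files.

```python
def parse_dot_bracket(structure):
    """Convert a dot-bracket string into a sorted list of 0-based (i, j) pairs.

    Each bracket type is matched on its own stack, so pairs written with a
    different bracket type may cross the "(" ")" pairs, as in a pseudoknot.
    """
    openers = {"(": ")", "[": "]", "{": "}", "<": ">"}
    closers = {close: open_ for open_, close in openers.items()}
    stacks = {open_: [] for open_ in openers}
    pairs = []

    for pos, char in enumerate(structure):
        if char in openers:
            stacks[char].append(pos)
        elif char in closers:
            stack = stacks[closers[char]]
            if not stack:
                raise ValueError(f"unmatched {char!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")

    for open_, stack in stacks.items():
        if stack:
            raise ValueError(f"unmatched {open_!r} at position {stack[-1]}")

    return sorted(pairs)


# Toy example: a stem whose loop pairs with downstream bases (a pseudoknot).
print(parse_dot_bracket("((..[[..))..]]"))
```

Keeping one stack per bracket type is the design choice that lets crossing pairs be read without ambiguity, which a single stack could not do.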
Figure 3. Approaches to the prediction of structured RNA elements in RNA viruses. A virus sequence of interest can be matched to the NCBI/RefSeq database (see section “KNOW YOUR ENEMY”). A range of related sequences can be aligned using RNA structure-informed and/or CDS-informed approaches. Structured RNA elements of a virus are likely to be conserved in structure rather than in primary sequence (red, blue, and green dots indicate mismatches). Secondary structures can be predicted from the aligned sequences, and covariation in a secondary structure can be tested statistically. Secondary structures can also be predicted directly from a single sequence using the minimum free energy (MFE) approach. RNA 3D prediction can also be done.
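The idea that a structure can be conserved while the primary sequence changes can be sketched in a few lines of Python. The example below only tallies which canonical pairs occur at two alignment columns; it is a deliberately simplified stand-in and not the statistical covariation test performed by tools such as R-scape. The function name `column_pair_summary` and the toy alignment are assumptions for illustration.

```python
CANONICAL_PAIRS = {"CG", "GC", "AU", "UA", "GU", "UG"}


def column_pair_summary(alignment, i, j):
    """Summarise base-pairing between alignment columns i and j (0-based).

    Returns the fraction of sequences in which the two columns can form a
    canonical pair, and the distinct pair types seen. Several pair types at
    the same two columns hint at compensatory changes (covariation).
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")

    # Treat DNA-style "T" as "U" so that cDNA sequences are handled as well.
    observed = [(seq[i] + seq[j]).upper().replace("T", "U") for seq in alignment]
    canonical = [pair for pair in observed if pair in CANONICAL_PAIRS]
    return {
        "fraction_paired": len(canonical) / len(observed),
        "pair_types": sorted(set(canonical)),
    }


# Toy alignment: columns 0 and 4 differ in sequence but mostly remain paired.
toy_alignment = ["GAAAC", "CAAAG", "AAAAU", "GAAAA"]
print(column_pair_summary(toy_alignment, 0, 4))
```

A real analysis would also need to account for phylogenetic relatedness among the aligned sequences, which this count ignores.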
Figure 4. Proportion of known viruses and viroids based on the Baltimore classification, as used in the ICTV database. (A) The genetic material of about 44% of known viruses and viroids is RNA. (B) About 58% of RNA viruses and viroids are positive-strand RNA viruses [ssRNA (+)], of which (C) Potyviridae are the largest family. RNA viruses are usually enriched with RNA structures. This is partly because both the replication and transcription of eukaryotic RNA viruses occur in the cytoplasm; these processes are distinct from those of the host and are driven by viral RNA elements. RNA virus transcripts therefore lack a 5′-m7G cap and are translated via unusual mechanisms such as internal ribosome entry site (IRES)-mediated translation and cap-independent translation. Only two RNA virus families are bacteriophages, namely Leviviridae and Cystoviridae, which are positive-sense single-stranded RNA and double-stranded RNA viruses, respectively.
Figure 5. Viral structured RNA elements from Rfam 12.2. (A) The number of structured RNA families published in journal articles over the years and (B) viral RNA families available in Rfam. The viral entries are likely to overrepresent RNA structures in the untranslated regions, as those located in coding sequences are often overlooked. sisRNA, stable intronic sequence RNA.
Figure 6. Structured RNA elements of BYDV. CP readthrough elements are shown in green. BYDV, barley yellow dwarf virus; CP, coat protein; BTE, BYDV-like translation element; gRNA, genomic RNA; MP, movement protein; ORF, open reading frame; RdRp, RNA-dependent RNA polymerase; sgRNA, subgenomic RNA; SL, stem-loop.
Figure 7. Functions of viral RNA structures. Viral structured RNA elements are important in viral replication, transcription and translation. Many RNA viruses hijack the host translation machinery and use unusual translation mechanisms for protein synthesis. (A) The internal ribosomal entry site (IRES) of HCV recruits eIF3 and the 43S preinitiation complex to promote a cap-independent translation mechanism called IRES-mediated translation. Domains II to IV of HCV (light green) are the key RNA motifs of the IRES. This unusual translation mechanism can be inhibited by benzimidazole, which targets domain IIa. Domain IIIf is a pseudoknot. (B) The cap-independent translation enhancer (CITE) of BYDV (BYDV-like translation element, BTE) recruits eukaryotic initiation factors and the 40S ribosomal subunit, forming long-range interactions with stem-loop-D (SL-D) to promote translation. (C) Unusual translation mechanisms can also occur in some polycistronic viral RNAs. The 5′UTR of cauliflower mosaic virus is long and highly structured, and the highly structured region contains multiple upstream AUGs. In a eukaryotic mRNA, such a 5′UTR could inhibit translation of the main open reading frame (mORF). Cauliflower mosaic virus overcomes this problem with a ribosome shunt cis-element. A ribosome first translates the small ORF (sORF) in the viral 5′UTR. During translation termination, the ribosome dissociates, but the take-off site (the sequence surrounding the termination codon) induces ribosome shunting. This allows the ribosome to bypass the highly structured region of the 5′UTR and land on the landing site, after which the mORF is translated. (D) Feline calicivirus contains two ORFs that overlap slightly at the sequence AUGA. A structured motif called the stop/restart cis-element, located upstream of AUGA, permits effective reinitiation and translation of the second ORF. A termination upstream ribosome-binding site (TURBS) located in the RNA structure allows tethering of the 40S ribosomal subunit and eIF3, which promotes reinitiation at the second ORF.