Selection Pressures on RNA Sequences and Structures.

Katja Nowick, Maria Beatriz Walter Costa, Christian Höner zu Siederdissen, Peter F Stadler.

Abstract

With the discovery of ever more functional noncoding RNAs (ncRNAs), it is becoming imperative to consider them more seriously as important players in species evolution. Although tests for negative selection on ncRNAs have existed since the beginning of this century, the SSS-test is the first that also investigates positive selection. When analyzing selection in ncRNAs, it should be taken into account that selection pressures can act independently on sequence and structure. We applied the SSS-test to explore the evolution of ncRNAs in primates and identified more than 100 long noncoding RNAs (lncRNAs) that might evolve under positive selection in humans. With this test, it is now possible to include ncRNAs more thoroughly in evolutionary studies.

Keywords:  RNA; consensus structure; positive selection; structural conservation

Year:  2019        PMID: 31496634      PMCID: PMC6716170          DOI: 10.1177/1176934319871919

Source DB:  PubMed          Journal:  Evol Bioinform Online        ISSN: 1176-9343            Impact factor:   1.625


Comment on: Walter Costa MB, Höner Zu Siederdissen C, Dunjic M, Stadler PF, Nowick K. SSS-test: a novel test for detecting positive selection on RNA secondary structure. BMC Bioinformatics. 2019;20(1):151. doi:10.1186/s12859-019-2711-y. PubMed PMID: 30898084. PubMed Central PMCID: PMC6429701. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6429701/.

To understand how species evolve and adapt to their environment, tests for natural selection have been developed. The common assumption is that parts of the genome that are responsible for adaptive phenotypic changes evolve faster than other parts. Most proteins and nucleic acids exert their biological function by means of well-defined interactions. The specificity of functional interactions as well as the need to avoid undesired binding activities translates into selection pressures on both the sequence and the 3-dimensional structure of proteins and nucleic acids.

A relatively simple test for estimating selection pressures on protein-coding genes was developed in the 1980s[1,2]. It relates the rate of nucleotide changes that cause an amino acid change (non-synonymous changes) to the rate of silent nucleotide changes (synonymous changes), referred to as the Ka/Ks or dN/dS ratio. Ratios much smaller than 1 indicate negative selection, i.e., conservation of the protein sequence. Higher ratios are usually interpreted as relaxed constraint. If the ratio exceeds 1, the excess of amino acid-changing mutations is compatible with accelerated evolution or a sign of positive selection.

Despite the increasing acknowledgment that ncRNAs are functional, a comparable test for noncoding RNA (ncRNA) genes did not exist until recently.[3] Importantly, in the case of RNAs, structure formation is dominated both thermodynamically and kinetically by the secondary structure, i.e., the pattern of base pairs and unpaired bases.
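
As an aside on the protein-coding test described above, the conventional reading of the dN/dS ratio can be sketched in a few lines of Python. This is only an illustration: the function name and the `strong_constraint` cutoff of 0.3 (standing in for "much smaller than 1") are assumptions made here, not values taken from the cited work, and real analyses estimate dN and dS with dedicated statistical models.

```python
def interpret_dnds(dn: float, ds: float, strong_constraint: float = 0.3) -> str:
    """Label a dN/dS ratio following the conventional interpretation.

    dn, ds            -- non-synonymous and synonymous substitution rates
    strong_constraint -- hypothetical cutoff for "much smaller than 1"
    """
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    omega = dn / ds
    if omega > 1.0:
        # Excess of amino acid-changing substitutions
        return "accelerated evolution or positive selection"
    if omega < strong_constraint:
        # Protein sequence is conserved
        return "negative selection"
    # Between the two: usually read as relaxed constraint
    return "relaxed constraint"


if __name__ == "__main__":
    for dn, ds in [(0.01, 0.1), (0.08, 0.1), (0.3, 0.1)]:
        print(dn, ds, interpret_dnds(dn, ds))
```

The three example pairs fall into the three classes in turn. The cutoff is deliberately exposed as a parameter, since where "much smaller than 1" begins is a judgment call.
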
The simplicity of RNA secondary structures, and their discrete combinatorial nature, makes it possible to describe selection pressures acting on the structure in terms of comparably simple rules that pertain to the preservation and turnover of base pairs. Sequence variations that locally maintain base pairing patterns are indicative of negative selection, in particular compensatory substitutions, such as the replacement of a GC pair by a CG, AU, or UA pair. On the other hand, substitutions that disrupt base pairs hint at relaxed constraints or positive selection. Conceptually, this is not different from synonymous and non-synonymous substitutions in the open reading frames (ORFs) of protein-coding genes. There is, however, an important practical difference between ORFs and RNA secondary structures: although codons are local in sequence, secondary structures are inherently nonlocal, usually involving pairs that are long-range with respect to the sequence. As a consequence, the assessment of selection pressures on secondary structure requires completely different computational tools.

It is important to realize that molecules are typically subject to multiple, superimposed selection pressures. For protein-coding genes, e.g., functional elements such as SElenoCysteine Insertion Sequences (SECIS) or Internal Ribosomal Entry Sites (IRES) require tightly constrained RNA secondary structures within protein-coding sequences. This specific type of superimposed selective pressure yields substitution patterns that are recognizable by specialized computational tools.[4] Similar situations are observed in ncRNAs. For tRNAs, e.g., the clover-leaf secondary structure and the 3-dimensional L-shape are required for loading into the ribosome and for recognition during charging, essentially independently of the sequence. On the other hand, tRNAs have an internal pol-III promoter, whose sequence must be maintained to ensure expression.

Selection may also act on the expression level.
For instance, the choice of rare codons as well as highly stable mRNA secondary structure may hamper translation. Carlini et al.[5] proposed that the balance between codon bias and mRNA secondary structure is mediated through the third codon position: here, natural selection might favor high GC or AT content to increase base pairing for weakly expressed genes and the opposite for highly expressed genes. It is a nontrivial and largely unsolved task to disentangle such superimposed selective forces. Presently available tools model only a single effect or at most a pair of specific selection pressures.

Selection pressures that independently act to maintain superimposed sequence and secondary structure features can lead to incongruent conservation of sequence and structure: in this case, sequence patterns and structural elements are shifted relative to each other. As a consequence, analogous base pairs no longer correspond to homologous sequence positions. This type of incongruent evolution violates the basic assumptions of all tools that measure secondary structure conservation: the secondary structure will not appear conserved in a sequence-based alignment, whereas in structure-based alignments nonhomologous nucleotides are aligned, leading to an exaggerated estimate of compensatory base pairs. Tools to identify such cases are only in an exploratory stage of development at best.[6]

Over the last two decades, several methods have become available to evaluate negative/stabilizing selection of secondary structures, mostly aimed at classical structured RNAs such as tRNAs, rRNAs, or snRNAs. A common assumption of all these methods is that selection acts to preserve individual base pairs. The difference between the strict consensus model of R-scape[7] and the reduced rate of evolution model implicit in RNAz[8] and cmfinder[9] is the strength of the selective pressure.
In the strict consensus model, a conserved core structure is assumed whose base pairs are present in all homologs under consideration. Thus, one expects to observe compensatory substitutions giving direct statistical evidence for the preserved base pairs. The idea is implemented in R-scape.[7] The more relaxed model only evaluates whether the secondary structures are less diverged than expected for the observed divergence of the underlying sequences: RNAz therefore measures indirect evidence in the form of a “structure conservation index” that presents the ratio of folding energies between individual folds and consensus structure, and a z-score quantifying the folding energy relative to randomized sequences. It has been shown in Ancel and Fontana[10] that stabilizing selection on the secondary structure results in more negative folding energies over evolutionary time scales.

Probabilistic models can be used to determine the type of selection acting at a given locus. For negative selection, the expectation is that the rate of change is (very) low. Higher change rates can indicate either accelerated or positive evolution. Accelerated evolution is characterized by higher accumulation of changes in a short amount of time.[11] To identify accelerated regions, one should first identify negative selection for the orthologous locus in other species and then test for accumulation of species-specific changes.[12] Analyzing human accelerated regions (HARs), it seems likely that more than one evolutionary force shapes them,[11] including positive selection.[11,13] In contrast to accelerated evolution, positive evolution occurs when the changed locus yields an advantage to the organism, being actively selected for throughout evolution in a longer time frame. Although accelerated evolution is detectable at the primary sequence level alone, it is necessary to consider a phenotypic level for the detection of positive selection, to identify an advantage over the ancestral state.
For ncRNAs and proteins, one should account for changes in the structure. This poses challenges for ncRNAs. Although it suffices for proteins to distinguish synonymous from non-synonymous substitutions, such a binary classification does not appear to work well for RNA secondary structures.[14] As a remedy, the SSS-test associates the probability of each structural change with a background model to calculate the likelihood of a change being merely random or being selected for. An excess of structural change indicates positive selection, whereas an excess of changes that are structure-conserving supports negative selection.[3] The SSS-test combines scoring models for structural change for both substitutions and indels. In this model, scores close to zero indicate negative selection and higher scores are indicative of positive selection. Empirical calibration suggests that scores higher than 10 are a strong indication of positive selection within the primate group. Researchers who wish to investigate selective pressures on ncRNAs should be mindful of the biological question and choose the most suitable approach and software (Table 1), keeping in mind the different selection pressures (Figure 1).
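
The distinction drawn above between structure-conserving and structure-disrupting changes at paired positions can be illustrated with a small Python sketch. It is a deliberately simplified toy, not the SSS-test's scoring model (which weighs the probability of each structural change against a background model and also handles indels). The function names are hypothetical, and treating GU wobble pairs as valid pairs is an assumption made here.

```python
from collections import Counter
from typing import Iterable, Tuple

Pair = Tuple[str, str]

# Watson-Crick pairs plus GU wobble pairs (including wobble is an assumption)
VALID_PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def classify_pair_change(ancestral: Pair, derived: Pair) -> str:
    """Classify the change at two positions that are paired in the ancestor."""
    if ancestral not in VALID_PAIRS:
        raise ValueError("ancestral positions do not form a base pair")
    if derived == ancestral:
        return "unchanged"
    if derived in VALID_PAIRS:
        # The pair survives, e.g. GC -> AU (compensatory substitution)
        return "structure-conserving"
    # The pair is lost, e.g. GC -> GA
    return "disruptive"


def tally_changes(changes: Iterable[Tuple[Pair, Pair]]) -> Counter:
    """Count change classes over a set of (ancestral, derived) pairs."""
    return Counter(classify_pair_change(a, d) for a, d in changes)


if __name__ == "__main__":
    observed = [
        (("G", "C"), ("A", "U")),  # compensatory
        (("G", "C"), ("G", "A")),  # pair disrupted
        (("A", "U"), ("A", "U")),  # no change
    ]
    print(tally_changes(observed))
```

In such a tally, an excess of structure-conserving changes would point toward negative selection and an excess of disruptive changes toward relaxed constraint or positive selection. Turning the counts into a statistical statement requires a background model, which this sketch omits.
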
Table 1.

Types of selective pressures on noncoding RNAs and how to detect them.

Selective pressure | Method | Level of analysis
Positive selection | SSS-test[3] | Secondary structure
Accelerated evolution | Pollard et al.[15] | Primary sequence
Negative selection | R-scape,[7] RNAz,[8] cmfinder,[9] qrna,[16] AlifoldZ,[17] EvoFold,[18] SISSIz,[19] SSS-test[3] | Secondary structure
Figure 1.

Types of selection pressures in ncRNAs: (1) positive selection, acting on the structure, in which one species acquires a structural change in the orthologous ncRNA with an advantage over the ancestral structure; (2) accelerated evolution, acting on the primary sequence, in which the sequence of a ncRNA accumulates a relatively high number of changes compared with its orthologs over a short time span; and (3) negative selection, acting on the structure, in which the ncRNA structure is maintained across orthologs over relatively long evolutionary time.

There are several advantages to the approach taken by the SSS-test: First, it can be used for detecting signs of positive as well as negative selection. Second, it allows identifying changes in structures as well as in stability. Third, small RNAs as well as lncRNAs can be investigated; in the latter case, local structures will be tested for selection.

We applied the SSS-test to more than 15 000 human lncRNAs with orthologs in various primates and identified 110 lncRNAs that are candidates for being under positive selection in humans.[3] We observed two types of patterns among these candidates: Some candidates, such as LINC02217, contain local structures with completely different shapes, whereas other candidates, such as SIX3-AS1, maintain their structure but with a clearly increased stability in human compared with their orthologs. We further performed the SSS-test to investigate which lncRNAs that have been associated with psychiatric disorders might evolve under positive selection. We discovered 8 lncRNAs that possess local structures with signs of positive selection in humans. The candidates we identified can now be further tested functionally, to decipher if and how they might be involved in human evolution, for instance, in the evolution of cognitive abilities.
The SSS-test and related software are available at https://github.com/waltercostamb/SSS-test and can now be applied for further evolutionary questions. We propose that any new genome project could annotate ncRNA genes in addition to protein-coding genes and scan for RNA structures under selection. Existing genome data and ncRNA databases could be mined and analyzed for selected ncRNAs. Biomedical studies have repeatedly found disease-associated variants within ncRNA genes. To gain further insights into the functions of such genes, their evolutionary history could be investigated with the SSS-test. Although the SSS-test is certainly a powerful test for investigating the evolution of ncRNAs, there is still ample room for improvement: presently, the cutoffs for deeming a structure to evolve under selection are empirically determined and thus need to be calibrated by the user for each dataset. In our study, we required that the candidate structures are among the most conserved structures across the phylogeny, but demonstrate a relatively strong change in a single lineage, e.g., humans. Although the workflow can be extended to detect distinct selective pressures in different lineages, it still depends on the existence of a well-conserved ancestral structure. In some cases, it is possible not only to identify a locus under positive selection but also to reconstruct the evolutionary history itself with some accuracy. This amounts to determining the order of substitution events and can be achieved under the assumption that the structural differences between extant and ancestral structure represent the direction of the selective force.[13] Taken together, the time has come to learn more about the evolutionary history of various ncRNA genes and their role in species evolution. The SSS-test can serve to identify candidates to prioritize for further functional investigations.
  16 in total

1.  Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models.

Authors:  Z Yang; R Nielsen
Journal:  Mol Biol Evol       Date:  2000-01       Impact factor: 16.240

2.  Plasticity, evolvability, and modularity in RNA.

Authors:  L W Ancel; W Fontana
Journal:  J Exp Zool       Date:  2000-10-15

3.  The relationship between third-codon position nucleotide content, codon bias, mRNA secondary structure and gene expression in the drosophilid alcohol dehydrogenase genes Adh and Adhr.

Authors:  D B Carlini; Y Chen; W Stephan
Journal:  Genetics       Date:  2001-10       Impact factor: 4.562

4.  Consensus folding of aligned sequences as a new measure for the detection of functional RNAs by comparative genomics.

Authors:  Stefan Washietl; Ivo L Hofacker
Journal:  J Mol Biol       Date:  2004-09-03       Impact factor: 5.469

5.  Fast and reliable prediction of noncoding RNAs.

Authors:  Stefan Washietl; Ivo L Hofacker; Peter F Stadler
Journal:  Proc Natl Acad Sci U S A       Date:  2005-01-21       Impact factor: 11.205

6.  CMfinder--a covariance model based RNA motif finding algorithm.

Authors:  Zizhen Yao; Zasha Weinberg; Walter L Ruzzo
Journal:  Bioinformatics       Date:  2005-12-15       Impact factor: 6.937

7.  An RNA gene expressed during cortical development evolved rapidly in humans.

Authors:  Katherine S Pollard; Sofie R Salama; Nelle Lambert; Marie-Alexandra Lambot; Sandra Coppens; Jakob S Pedersen; Sol Katzman; Bryan King; Courtney Onodera; Adam Siepel; Andrew D Kern; Colette Dehay; Haller Igel; Manuel Ares; Pierre Vanderhaeghen; David Haussler
Journal:  Nature       Date:  2006-08-16       Impact factor: 49.962

8.  Statistical evidence for conserved, local secondary structure in the coding regions of eukaryotic mRNAs and pre-mRNAs.

Authors:  Irmtraud M Meyer; István Miklós
Journal:  Nucleic Acids Res       Date:  2005-11-07       Impact factor: 16.971

9.  Identification and classification of conserved RNA secondary structures in the human genome.

Authors:  Jakob Skou Pedersen; Gill Bejerano; Adam Siepel; Kate Rosenbloom; Kerstin Lindblad-Toh; Eric S Lander; Jim Kent; Webb Miller; David Haussler
Journal:  PLoS Comput Biol       Date:  2006-04-21       Impact factor: 4.475

10.  Noncoding RNA gene detection using comparative sequence analysis.

Authors:  E Rivas; S R Eddy
Journal:  BMC Bioinformatics       Date:  2001-10-10       Impact factor: 3.169

