Maria Beatriz Walter Costa, Christian Höner Zu Siederdissen, Marko Dunjić, Peter F Stadler, Katja Nowick.
Abstract
BACKGROUND: Long non-coding RNAs (lncRNAs) play an important role in regulating gene expression and are thus important for determining phenotypes. Most attempts to measure selection in lncRNAs have focused on the primary sequence. The majority of small RNAs and at least some parts of lncRNAs must fold into specific structures to perform their biological function. Comprehensive assessments of selection acting on RNAs therefore must also encompass structure. Selection pressures acting on the structure of non-coding genes can be detected within multiple sequence alignments. Approaches of this type, however, have so far focused on negative selection. Thus, a computational method for identifying ncRNAs under positive selection is needed.
Keywords: Long non-coding RNA; Positive selection; Primate genomes; Psychiatric disorders; RNA secondary structure
Year: 2019 PMID: 30898084 PMCID: PMC6429701 DOI: 10.1186/s12859-019-2711-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Structural divergence d of synthetic data sets for families evolved under simulated negative selection pressure (neg) compared to unconstrained (ran) evolution. Each data set is composed of 100 families, evolved from one ancestral sequence to five extant sequences, differing by 5 (left) or 10 (right) accepted substitutions from the ancestor
Fig. 2 SSS-score s of synthetic data sets with simulated negative evolutionary constraints compared to simulated positive selection. Each data set is composed of 100 sequences, evolved from one ancestral sequence to one extant sequence, differing by 5 (left) or 10 (right) accepted substitutions from the ancestor
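The unconstrained ("ran") families described in the Fig. 1 caption can be illustrated with a minimal sketch. It is not the authors' simulation code: the function names, the RNA alphabet, the ancestor length of 60 and the choice to place substitutions at distinct positions are all assumptions made for illustration. The negative-selection case would additionally need a structure-prediction step to accept or reject each substitution, which is not shown.

```python
import random

BASES = "ACGU"  # assumed alphabet; the source does not specify one


def mutate_unconstrained(ancestor, n_subs, rng):
    """Apply n_subs substitutions at distinct random positions.

    Every proposed substitution is accepted, which is what makes the
    evolution unconstrained in this sketch.
    """
    seq = list(ancestor)
    for pos in rng.sample(range(len(seq)), n_subs):
        # Always switch to a different base so each substitution is visible.
        seq[pos] = rng.choice([b for b in BASES if b != seq[pos]])
    return "".join(seq)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
ancestor = "".join(rng.choice(BASES) for _ in range(60))  # hypothetical ancestor
# One family: five extant sequences, each 5 substitutions away from the ancestor.
family = [mutate_unconstrained(ancestor, 5, rng) for _ in range(5)]
```

Because positions are sampled without replacement, every extant sequence differs from the ancestor at exactly `n_subs` sites, which mirrors the "5 or 10 accepted substitutions" wording of the captions.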
Characterization of local structural selection of lncRNAs
| Species | Local structures | Conserved | Positive |
|---|---|---|---|
| Human | 8934 | 8179 (91.6%) | 111 (1.2%) |
| Pan | 8736 | 7997 (91.5%) | 90 (1.0%) |
| Gorilla | 8080 | 7199 (89.1%) | 136 (1.7%) |
| Orangutan | 6435 | 4802 (74.6%) | 315 (4.9%) |
| Macaque | 5113 | 2659 (52.0%) | 738 (14.4%) |
Only the low diverged set was considered in this analysis. Percentages of conserved and positive structures are relative to each species’ number of representatives
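The percentages are each species' conserved or positive count divided by that species' number of local structures. A short sketch of that arithmetic, using the counts from the table (the variable and function names are illustrative, not from the source):

```python
# Counts copied from the table: (local structures, conserved, positive).
counts = {
    "Human": (8934, 8179, 111),
    "Pan": (8736, 7997, 90),
    "Gorilla": (8080, 7199, 136),
    "Orangutan": (6435, 4802, 315),
    "Macaque": (5113, 2659, 738),
}


def percentages(total, conserved, positive):
    """Return (conserved %, positive %) relative to the species total."""
    return (round(100 * conserved / total, 1), round(100 * positive / total, 1))


table = {species: percentages(*row) for species, row in counts.items()}
for species, (cons, pos) in table.items():
    print(f"{species}: conserved {cons}%, positive {pos}%")
```

Recomputed values can differ from the printed ones in the last digit, depending on rounding: with these counts the human conserved share evaluates to 91.5%, against 91.6% in the table.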
Fig. 3 Local lncRNA structure LINC02217sub5: a human, b pan and c gorilla. Only the human structure obtained an SSS-score indicating positive selection with s=16.2, while the data indicate strong negative selection for the other species (s=0.0). Structures are represented by their minimum free energy. Base colors are assigned according to their pairing frequency in the structure’s ensemble. Shades of red occur in ≥90% of the ensemble, shades of green/yellow denote increasing probabilities from ≥50%. For unpaired bases, shades of red denote increasing unpairedness
Fig. 4 Local lncRNA structure SIX3-AS1sub11: a human, b pan, c orangutan, d gorilla and e macaque. Only the human structure obtained an SSS-score indicating positive selection with s=12.2, while the other species have strong negative selection scores (s=0.0). Structures are represented by their minimum free energy. Base colors are assigned according to their pairing frequency in the structure’s ensemble. Shades of red occur in ≥90% of the ensemble, shades of green/yellow denote increasing probabilities from ≥50%. For unpaired bases, shades of red denote increasing unpairedness
Human lncRNA candidates with signs of positive selection in local structures
| Gene name | Transcription age | Sequence age | No. of species with transcription | No. of species with sequence | ENSEMBL gene ID |
|---|---|---|---|---|---|
| RRS1-AS1 | African apes | Great apes | 3 | 4 | ENSG00000246145 |
| LINC01939 | African apes | Great apes | 3 | 4 | ENSG00000228799 |
| LINC01839 | Primates | Primates | 4 | 4 | ENSG00000227509 |
| LINC01802 | Primates | Primates | 5 | 5 | ENSG00000225064 |
| LINC01724 | Primates | Primates | 5 | 5 | ENSG00000227421 |
| LINC01693 | Primates | Primates | 5 | 5 | ENSG00000227764 |
| MACC1-AS1 | Primates | Primates | 5 | 5 | ENSG00000228598 |
| TRPM2-AS | Primates | Primates | 5 | 5 | ENSG00000230061 |
| LINC01258 | Primates | Primates | 5 | 5 | ENSG00000249534 |
| PLUT1 | Primates | Primates | 5 | 5 | ENSG00000247381 |
| LINC01345 | Primates | Primates | 5 | 5 | ENSG00000226374 |
| MDC1-AS1 | Primates | Eutherians | 2 | 6 | ENSG00000224328 |
| LINC01790 | Therians | Therians | 3 | 7 | ENSG00000230173 |
| LINC02042 | Eutherians | Eutherians | 5 | 5 | ENSG00000240893 |
| LINC02092 | Eutherians | Eutherians | 5 | 6 | ENSG00000234721 |
| LINC01738 | Eutherians | Eutherians | 6 | 6 | ENSG00000227947 |
| LINC02288 | Eutherians | Eutherians | 6 | 6 | ENSG00000246548 |
| LINC02217 | Eutherians | Eutherians | 6 | 6 | ENSG00000248455 |
| DNMBP-AS1 | Mammals | Amniotes | 6 | 9 | ENSG00000227695 |
| SIX3-AS1 | Tetrapods | Tetrapods | 9 | 9 | ENSG00000236502 |
The evolutionary age and expression information was taken from [20]. Gene names were retrieved from the ENSEMBL database. Only transcripts that have been assigned an HGNC ID are shown
ENSEMBL IDs and SSS-scores of local structures with signs of positive or weak positive selection in humans
| lncRNA family | Selection score |
|---|---|
| LINC00689sub40 | 8.9 |
| LINC00689sub38 | 7.4 |
| MEG3sub15 | 7.0 |
| H19sub7 | 6.8 |
| SOX2-OTsub27 | 5.9 |
| MEG3sub1 | 5.4 |
| BDNF-ASsub18 | 5.0 |
| LINC02151sub5 | 4.9 |
| MIATsub86 | 4.7 |
| MIATsub31 | 4.6 |
| NEAT1sub120 | 4.6 |
Marked in bold are local structures with SSS-score above 10.0
Fig. 5 Minimum Free Energy (MFE) structures of MIATsub92 local structures of Human and Pan. The duplication of a TTTGAACTTGGCTAACACAGG sequence in the human lineage might have driven the evolution of the structure towards a more stable structure. Prevailing red regions exhibit well-defined structures with probabilities close to 1 for paired and unpaired bases. Duplicated regions are labeled with horizontal and vertical lines, and G/A nucleotide substitution is marked with an arrow. Bonobo has the same sequence as the chimpanzee
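A duplication such as the one mentioned in the Fig. 5 caption can be checked by counting occurrences of the motif in each species' sequence. The sketch below uses the motif from the caption, but the two sequences are hypothetical toy strings, not the real MIATsub92 sequences, and placing the two copies in tandem is an assumption made only to keep the example short.

```python
MOTIF = "TTTGAACTTGGCTAACACAGG"  # motif quoted in the Fig. 5 caption


def count_occurrences(seq, motif):
    """Count occurrences of motif in seq, allowing overlapping matches."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


# Hypothetical toy sequences: one copy of the motif versus two copies.
pan_like = "ACGT" + MOTIF + "GGCA"
human_like = "ACGT" + MOTIF + MOTIF + "GGCA"

print(count_occurrences(pan_like, MOTIF), count_occurrences(human_like, MOTIF))
```

With real data, the same count would be run on the aligned human and chimpanzee sequences; a higher count in human than in chimpanzee would be consistent with a lineage-specific duplication.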