Senol Dogan, Emrulla Spahiu, Anis Cilic.
Abstract
MicroRNAs (miRNAs) are short non-coding RNAs that function in post-transcriptional gene silencing and mRNA regulation. Although miRNAs range from 17 to 27 nucleotides in length, most are 22 nucleotides long. The expression of miRNAs changes significantly in cancer, causing protein alterations in cancer cells by preventing some genes from being translated into proteins. In this research, a structural analysis of 587 miRNAs that are differentially expressed in myeloid cancer was carried out. Length distribution analysis revealed a median of 22 nucleotides, a mean of 21.69, and a variance of 1.65. Nucleotide analysis at each position showed that uracil was the most frequently observed nucleotide (27.8%) and adenine the least frequently observed (22.6%). Adenine was more frequent than uracil at the beginning of the sequences, whereas uracil was more frequent at the end. The purine content of each implicated miRNA was also assessed. A novel motif analysis script was written to detect the most frequent motifs of 3 to 7 nucleotides (3-7n) in the miRNA dataset. We detected CUG (42%) as the most frequent 3n motif, CUGC (15%) as the most frequent 4n motif, AGUGC (6%) as the most frequent 5n motif, AAGUGC (4%) as the most frequent 6n motif, and UUUAGAG (4%) as the most frequent 7n motif. In the second part of our study, we further characterized the motifs by analyzing whether they align at certain consensus sequences in our miRNA dataset, whether certain motifs target the same genes, and whether they are conserved in other species. This thorough structural study of miRNA sequences provides a novel strategy to study the implications of miRNAs in health and disease. A better understanding of miRNA structure is crucial for developing therapeutic approaches.
Keywords: cancer; consensus motifs; gene regulation; miRNA motifs; microRNA; structure
Year: 2022 PMID: 35885935 PMCID: PMC9316571 DOI: 10.3390/genes13071152
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
Figure 1. Flowchart of data collection and processing. The database or tool used for each analysis is given in brackets.
Nucleotide sequences of miRNAs implicated in myeloid cancer ¹.
| miRNA | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | Length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hsa-mir-1248 | A | C | C | U | U | C | U | U | G | U | A | U | A | A | G | C | A | C | U | G | U | G | C | U | A | A | A | 27 |
| hsa-mir-1183 | C | A | C | U | G | U | A | G | G | U | G | A | U | G | G | U | G | A | G | A | G | U | G | G | G | C | A | 27 |
| hsa-mir-1272 | G | A | U | G | A | U | G | A | U | G | G | C | A | G | C | A | A | A | U | U | C | U | G | A | A | A | | 26 |
| hsa-mir-1244 | A | A | G | U | A | G | U | U | G | G | U | U | U | G | U | A | U | G | A | G | A | U | G | G | U | U | | 26 |
| hsa-mir-921 | C | U | A | G | U | G | A | G | G | G | A | C | A | G | A | A | C | C | A | G | G | A | U | U | C | | | 25 |
| hsa-mir-638 | A | G | G | G | A | U | C | G | C | G | G | G | C | G | G | G | U | G | G | C | G | G | C | C | U | | | 25 |
| … | ||||||||||||||||||||||||||||
| hsa-mir-1279 | U | C | A | U | A | U | U | G | C | U | U | C | U | U | U | C | U | | | | | | | | | | | 17 |
¹ Positions 1–27 correspond to the 5′-3′ direction. The complete data can be found in Supplementary Table S1.
Figure 2. The distribution of miRNA length in myeloid cancer.
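
The length statistics behind this distribution can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: it uses only the seven sequences shown in the table above rather than the full set of 587 miRNAs, so its numbers will differ from the reported mean of 21.69 and variance of 1.65. The source does not say whether the variance was computed as a population or sample variance; population variance is assumed here.

```python
from collections import Counter
from statistics import mean, median, pvariance

# Subset of sequences copied from the table of myeloid cancer miRNAs
# (the full 587-miRNA dataset is in Supplementary Table S1).
mirnas = {
    "hsa-mir-1248": "ACCUUCUUGUAUAAGCACUGUGCUAAA",
    "hsa-mir-1183": "CACUGUAGGUGAUGGUGAGAGUGGGCA",
    "hsa-mir-1272": "GAUGAUGAUGGCAGCAAAUUCUGAAA",
    "hsa-mir-1244": "AAGUAGUUGGUUUGUAUGAGAUGGUU",
    "hsa-mir-921": "CUAGUGAGGGACAGAACCAGGAUUC",
    "hsa-mir-638": "AGGGAUCGCGGGCGGGUGGCGGCCU",
    "hsa-mir-1279": "UCAUAUUGCUUCUUUCU",
}

# Length of every miRNA, and how many miRNAs share each length.
lengths = {name: len(seq) for name, seq in mirnas.items()}
length_counts = Counter(lengths.values())

values = list(lengths.values())
print("mean:", round(mean(values), 2))
print("median:", median(values))
# Assumption: population variance; swap in statistics.variance for the sample variance.
print("variance:", round(pvariance(values), 2))
for length, count in sorted(length_counts.items()):
    print(f"{length} nt: {count} miRNA(s)")
```

Running the same code on the complete dataset would give the histogram values plotted in the figure.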
Comparison of publication numbers for short and long miRNAs. Counts were retrieved in February 2022 from the reference counts on each miRNA page at mirbase.org.
| Short miRNAs | Nucleotide Length | Publications |
|---|---|---|
| hsa-mir-1275 | 17 | 33 |
| hsa-mir-302e | 17 | 160 |
| hsa-mir-1207 | 18 | 33 |
| Long miRNAs | Nucleotide Length | Publications |
| hsa-mir-1248 | 27 | 16 |
| hsa-mir-1183 | 27 | 5 |
| hsa-mir-1272 | 26 | 5 |
| hsa-mir-1244 | 26 | 7 |
Figure 3. Percentage of nucleotides in each position of the miRNAs studied. Positions 1–27 correspond to the 5′-3′ direction.
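
A position-wise nucleotide percentage of this kind can be computed as in the following sketch. The function name `positional_percentages` is illustrative, and one assumption is made that the source does not confirm: at each position, percentages are taken relative to the sequences long enough to reach that position, not relative to the whole dataset.

```python
from collections import Counter

def positional_percentages(sequences):
    """Return one dict per position mapping each nucleotide to its percentage.

    Assumption: the denominator at each position is the number of sequences
    that are at least that long, so short miRNAs do not dilute late positions.
    """
    max_len = max(len(seq) for seq in sequences)
    result = []
    for pos in range(max_len):
        # Count the nucleotides found at this position (5'-3' order).
        column = Counter(seq[pos] for seq in sequences if len(seq) > pos)
        total = sum(column.values())
        result.append({nt: 100 * column[nt] / total for nt in "ACGU"})
    return result

# Three sequences taken from the table of myeloid cancer miRNAs.
sample = [
    "ACCUUCUUGUAUAAGCACUGUGCUAAA",  # hsa-mir-1248
    "AGGGAUCGCGGGCGGGUGGCGGCCU",    # hsa-mir-638
    "UCAUAUUGCUUCUUUCU",            # hsa-mir-1279
]
for position, percentages in enumerate(positional_percentages(sample), start=1):
    print(position, {nt: round(p, 1) for nt, p in percentages.items()})
```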
Purine rich and poor miRNAs.
| miRNA | Sequence | A | G | C | U | %A + G |
|---|---|---|---|---|---|---|
| Highest purine content | ||||||
| hsa-mir-765 | UGGAGGAGAAGGAAGGUGAUG | 7 | 11 | 0 | 3 | 85.71 |
| hsa-mir-1468 | AGCAAAAUAAGCAAAUGGAAAA | 14 | 4 | 2 | 2 | 81.82 |
| hsa-mir-1910 | GAGGCAGAAGCAGGAUGACA | 8 | 8 | 3 | 1 | 80.00 |
| hsa-mir-202 | AGAGGUAUAGGGCAUGGGAA | 7 | 9 | 1 | 3 | 80.00 |
| hsa-mir-1255a | AGGAUGAGCAAAGAAAGUAGAUU | 11 | 7 | 1 | 4 | 78.26 |
| hsa-mir-320a | AAAAGCUGGGUUGAGAGGGCGA | 7 | 10 | 2 | 3 | 77.27 |
| hsa-mir-936 | ACAGUAGAGGGAGGAAUCGCAG | 8 | 9 | 3 | 2 | 77.27 |
| hsa-mir-149 | AGGGAGGGACGGGGGCUGUGC | 3 | 13 | 3 | 2 | 76.19 |
| Lowest purine content | ||||||
| hsa-mir-1281 | UCGCCUCCUCCUCUCCC | 0 | 1 | 11 | 5 | 5.88 |
| hsa-mir-483 | UCACUCCUCUCCUCCCGUCUU | 1 | 1 | 11 | 8 | 9.52 |
| hsa-mir-877 | UCCUCUUCUCCCUCCUCCCAG | 1 | 1 | 12 | 7 | 9.52 |
| hsa-mir-1236 | CCUCUUCCCCUUGUCUCUCCAG | 1 | 2 | 11 | 8 | 13.64 |
| hsa-mir-1249 | ACGCCCUUCCCCCCCUUCUUCA | 2 | 1 | 13 | 6 | 13.64 |
| hsa-mir-1224 | CCCCACCUCCUCUCUCCUCAG | 2 | 1 | 13 | 5 | 14.29 |
| hsa-mir-1238 | CUUCCUCGUCUGUCUGCCCC | 0 | 3 | 10 | 7 | 15.00 |
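
The columns of this table follow directly from the sequences: count each nucleotide, then divide the purines (A + G) by the sequence length. The sketch below, with the illustrative helper name `purine_content`, reproduces the values for two rows of the table.

```python
def purine_content(sequence):
    """Count each nucleotide and compute the purine (A + G) percentage."""
    counts = {nt: sequence.count(nt) for nt in "AGCU"}
    purines = counts["A"] + counts["G"]
    counts["%A + G"] = round(100 * purines / len(sequence), 2)
    return counts

# Purine-richest and purine-poorest miRNAs from the table.
print(purine_content("UGGAGGAGAAGGAAGGUGAUG"))  # hsa-mir-765
print(purine_content("UCGCCUCCUCCUCUCCC"))      # hsa-mir-1281
```

For hsa-mir-765 this gives 7 A, 11 G, 0 C, 3 U, and 18/21 = 85.71% purines; for hsa-mir-1281 it gives 1/17 = 5.88%, matching the table.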
List of 3n motifs identified in the studied myeloid cancer miRNA dataset.
| 3n Motif | Frequency | Percentage |
|---|---|---|
| CUG | 249 | 42.42% |
| UGC | 234 | 39.86% |
| UGG | 233 | 39.69% |
| UGU | 231 | 39.35% |
| CAG | 213 | 36.29% |
| UUG | 213 | 36.29% |
| CCU | 205 | 34.92% |
| CUU | 205 | 34.92% |
| GUG | 202 | 34.41% |
| AGG | 196 | 33.39% |
| UCU | 195 | 33.22% |
| GCU | 191 | 32.54% |
| CGU | 74 | 12.61% |
| CGC | 72 | 12.27% |
| GCG | 66 | 11.24% |
| UCG | 62 | 10.56% |
| ACG | 56 | 9.54% |
| CGA | 42 | 7.16% |
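
The authors' motif analysis script is not reproduced in this record. The following is a minimal sketch of one way such a count could be made, under an assumption inferred from the table: the frequency of a motif is the number of miRNAs that contain it at least once, and the percentage is that number divided by the dataset size (for example, 249/587 = 42.42% for CUG). The function names are illustrative.

```python
from collections import Counter

def motif_presence(sequences, k):
    """Count, for every k-nucleotide motif, how many sequences contain it."""
    counts = Counter()
    for seq in sequences:
        # A set ensures a motif repeated within one miRNA is counted once.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(kmers)
    return counts

def top_motifs(sequences, k, n=5):
    """Return the n most frequent motifs as (motif, frequency, percentage)."""
    total = len(sequences)
    ranked = motif_presence(sequences, k).most_common(n)
    return [(motif, count, round(100 * count / total, 2)) for motif, count in ranked]

# Three sequences taken from the purine content table.
sample = [
    "AAAAGCUGGGUUGAGAGGGCGA",  # hsa-mir-320a
    "AGGGAGGGACGGGGGCUGUGC",   # hsa-mir-149
    "CUUCCUCGUCUGUCUGCCCC",    # hsa-mir-1238
]
for k in range(3, 8):
    print(k, top_motifs(sample, k, n=3))
```

Looping `k` from 3 to 7 mirrors the 3-7n range studied in the abstract; ties between equally frequent motifs are returned in first-seen order.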
The number of most and least detected 4n motifs.
| Most Detected 4n Motifs (Present in >70 miRNAs) | | | Least Detected 4n Motifs (Present in <10 miRNAs) | | |
|---|---|---|---|---|---|
| Motif | Frequency | Percentage | Motif | Frequency | Percentage |
| CUGC | 87 | 14.82% | CGAA | 10 | 1.70% |
| ACUG | 85 | 14.48% | CGAG | 10 | 1.70% |
| UGCA | 85 | 14.48% | CGUA | 10 | 1.70% |
| CUUU | 83 | 14.14% | UCGA | 10 | 1.70% |
| AGUG | 82 | 13.97% | ACGA | 9 | 1.53% |
| CUGG | 80 | 13.63% | ACGC | 9 | 1.53% |
| CUGU | 80 | 13.63% | UACG | 8 | 1.36% |
| UUUG | 79 | 13.46% | CGAU | 7 | 1.19% |
| CAGU | 78 | 13.29% | UUCG | 6 | 1.02% |
| UUCU | 78 | 13.29% | |||
| UGCU | 77 | 13.12% | |||
| UGUG | 77 | 13.12% | |||
| GUGC | 76 | 12.95% | |||
| UGGG | 76 | 12.95% | |||
| UCUG | 75 | 12.78% | |||
The most frequently observed long motifs.
| 7n Motif | Frequency | 6n Motif | Frequency | 5n Motif | Frequency | 4n Motif | Frequency |
|---|---|---|---|---|---|---|---|
| UUUAGAG | 19 | AAGUGC | 22 | AGUGC | 36 | CUGC | 87 |
| AAGUGCU | 18 | GCUUCC | 22 | CUUCC | 34 | ACUG | 85 |
| AGUGCUU | 16 | UUUAGA | 21 | GCUUC | 33 | UGCA | 85 |
| GUGCUUC | 15 | UUAGAG | 20 | AAGUG | 32 | CUUU | 83 |
| UGCUUCC | 15 | UGCUUC | 19 | CCUUU | 32 | AGUG | 82 |
| | | AGUGCU | 18 | CUGCC | 31 | CUGG | 80 |
Motifs and consensus sequences in miRNAs.
| miRNA | 5′-3′ | 3n | 4n | 5n | 6n | 7n |
|---|---|---|---|---|---|---|
| 519b | | GUG | GAGG | AGUGC | AAAGUG | UUUAGAG |
| | | UGC | CUUU | CUUUU | AAGUGC | AAAGUGC |
| | | CCU | GUGC | AAGUG | UUUAGA | UCCUUUU |
| | | CUU | UAGA | AGAGC | UUAGAG | UUAGAGG |
MiRNA with 7n motifs and their gene targets.
| 7n Motif | Frequency | Targeted Gene | Frequency | Percentage |
|---|---|---|---|---|
| GUGCUUC | 15 | | 15 | 100 |
| GUGCUUC | 15 | | 15 | 100 |
| GUGCUUC | 15 | | 15 | 100 |
| GUGCUUC | 15 | | 15 | 100 |
| GUGCUUC | 15 | | 15 | 100 |
| GUGCUUC | 15 | | 15 | 100 |
| UGCUUCC | 15 | | 14 | 93.33 |
| UGCUUCC | 15 | | 13 | 86.67 |
| UGCUUCC | 15 | | 13 | 86.67 |
| UGCUUCC | 15 | | 13 | 86.67 |
| UGCUUCC | 15 | | 13 | 86.67 |
| UGCUUCC | 15 | | 13 | 86.67 |
| AGUGCUU | 16 | | 15 | 93.75 |
| AGUGCUU | 16 | | 15 | 93.75 |
| AGUGCUU | 16 | | 15 | 93.75 |
| AGUGCUU | 16 | | 15 | 93.75 |
| AGUGCUU | 16 | | 15 | 93.75 |
| AGUGCUU | 16 | | 15 | 93.75 |
| AAGUGCU | 18 | | 17 | 94.44 |
| AAGUGCU | 18 | | 16 | 88.89 |
| AAGUGCU | 18 | | 16 | 88.89 |
| AAGUGCU | 18 | | 16 | 88.89 |
| AAGUGCU | 18 | | 16 | 88.89 |
| AAGUGCU | 18 | | 16 | 88.89 |
| UUUAGAG | 19 | | 8 | 42.11 |
| UUUAGAG | 20 | | 8 | 40.00 |
| UUUAGAG | 21 | | 8 | 38.10 |
| UUUAGAG | 22 | | 8 | 36.36 |
| UUUAGAG | 23 | | 8 | 34.78 |
| UUUAGAG | 24 | | 8 | 33.33 |
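
The percentage column in this table is the number of motif-bearing miRNAs that target a given gene, divided by the number of miRNAs carrying the motif (for example, 13/15 = 86.67%). The sketch below illustrates that arithmetic only. The miRNA names and gene names (`mirna_1`, `GENE_A`, `GENE_B`) are hypothetical placeholders, since the targeted gene names are not preserved in this record, and the function name is illustrative.

```python
from collections import Counter

def shared_target_percentages(targets_by_mirna):
    """For each gene, return (number of miRNAs targeting it, percentage of all miRNAs)."""
    total = len(targets_by_mirna)
    gene_counts = Counter()
    for genes in targets_by_mirna.values():
        # Use a set so a gene listed twice for one miRNA is counted once.
        gene_counts.update(set(genes))
    return {
        gene: (count, round(100 * count / total, 2))
        for gene, count in gene_counts.items()
    }

# Hypothetical example: 15 miRNAs share a motif; all target GENE_B,
# and the first 13 also target GENE_A.
targets = {
    f"mirna_{i}": ["GENE_A", "GENE_B"] if i <= 13 else ["GENE_B"]
    for i in range(1, 16)
}
for gene, (count, percent) in sorted(shared_target_percentages(targets).items()):
    print(gene, count, percent)
```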
5n and 6n motifs conserved between humans and other species.
| 5n Motifs | 6n Motifs | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| CAGUG | CUGGG | UGCAG | UUUUC | UUGCA | GAGAG | AGUGCU | AGUGCA | CUGCAG | |
|
| 5.4% | 4.8% | 4.8% | 3.1% | 2.9% | ||||
|
| 5.2% | 5.7% | 5.0% | 2.1% | |||||
|
| 6.2% | 6.4% | 5.8% | 2.0% | 2.3% | 2.3% | |||
|
| 6.6% | 6.9% | 2.9% | 3.2% | |||||
|
| 4.5% | 2.0% | 1.8% | ||||||
|
| 5.8% | 7.2% | 2.4% | 3.0% | |||||
| Human | 6.0% | 5.2% | 2.2% | ||||||
|
| 7.3% | ||||||||
|
| 8.3% | ||||||||
|
| 10.7% | ||||||||