Raheleh Salari, Cagri Aksay, Emre Karakoc, Peter J Unrau, Iman Hajirasouliha, S Cenk Sahinalp.
Abstract
BACKGROUND: Non-coding RNAs (ncRNAs) have important functional roles in the cell: for example, they regulate gene expression by forming stable joint structures with target mRNAs via complementary sequence motifs. Sequence motifs are also important determinants of ncRNA structure. Although ncRNAs are abundant, discovering novel ncRNAs in genome sequences has proven to be a hard task; in particular, past attempts at ab initio ncRNA search have mostly failed, with the exception of tools that identify microRNAs. METHODOLOGY/PRINCIPAL
Year: 2009 PMID: 19415115 PMCID: PMC2673033 DOI: 10.1371/journal.pone.0005433
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Specificity and sensitivity at different thresholds for smyRNA trained on E. coli and tested on S. flexneri, based on the 1,000 highest-ranking predictions.
Figure 2. Comparison of pentamer log-likelihood scores in S. flexneri and E. coli.
Each data point corresponds to a pentamer p positioned at (x, y), where x is the log-likelihood score of p in E. coli and y is its score in S. flexneri.
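The record does not reproduce how the pentamer scores are defined, so the following is only a minimal sketch under a stated assumption: each pentamer is scored by the log ratio of its smoothed frequency in known ncRNA sequences to its frequency in background sequence. The function names, the pseudocount, and the choice of background are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import product

K = 5  # pentamers


def kmer_counts(seqs, k=K):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def log_likelihood_scores(ncrna_seqs, background_seqs, k=K, pseudocount=1.0):
    """ASSUMED scoring: log(P(kmer | ncRNA) / P(kmer | background)).

    A pseudocount is added to every k-mer so that unseen k-mers
    do not produce log(0) or division by zero.
    """
    fg = kmer_counts(ncrna_seqs, k)
    bg = kmer_counts(background_seqs, k)
    n_kmers = 4 ** k
    fg_total = sum(fg.values()) + pseudocount * n_kmers
    bg_total = sum(bg.values()) + pseudocount * n_kmers
    scores = {}
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        p_fg = (fg[kmer] + pseudocount) / fg_total
        p_bg = (bg[kmer] + pseudocount) / bg_total
        scores[kmer] = math.log(p_fg / p_bg)
    return scores
```

Given one score dictionary per genome, the scatter plot coordinates would be the pairs `(scores_ecoli[p], scores_sflexneri[p])` for each pentamer `p`.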
Predictive power of smyRNA on different genomes.
| Genome | Type | Length (nt) | # of known ncRNAs | # of known ncRNAs returned by smyRNA | # of all subsequences returned by smyRNA |
| --- | --- | --- | --- | --- | --- |
| Cyanophora paradoxa cyanelle | Eukaryota | 135,599 | 40 | 36 (90%) | 61 |
| Kluyveromyces lactis strain NRRL Y-1140, chromosome B | Eukaryota | 1,320,834 | 32 | 25 (78%) | 104 |
| Yarrowia lipolytica strain CLIB122, chromosome A | Eukaryota | 2,303,261 | 86 | 71 (83%) | 102 |
| Yersinia pestis strain CO92 | Bacteria | 4,653,728 | 118 | 83 (70%) | 140 |
| Salmonella enterica subsp. enterica serovar Choleraesuis str. SC-B67 | Bacteria | 4,755,700 | 159 | 109 (69%) | 141 |
| Vibrio cholerae O1 biovar eltor str. N16961 chromosome I | Bacteria | 2,961,149 | 126 | 101 (80%) | 217 |
| Shigella flexneri 2a str. 301 | Bacteria | 4,607,203 | 183 | 120 (66%) | 254 |
smyRNA was trained on E. coli, and the threshold score was set to t = 11. The number and percentage of known ncRNAs recovered (fifth column) reflect the accuracy of smyRNA in predicting ncRNA genes on different genomes.
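The percentages in the fifth column can be checked directly from the counts in the table: each is the number of known ncRNAs returned divided by the number of known ncRNAs, rounded to the nearest whole percent. The short sketch below reproduces that arithmetic; the variable and function names are illustrative only.

```python
# (genome, known ncRNAs, known ncRNAs returned, all subsequences returned)
rows = [
    ("Cyanophora paradoxa cyanelle", 40, 36, 61),
    ("Kluyveromyces lactis", 32, 25, 104),
    ("Yarrowia lipolytica", 86, 71, 102),
    ("Yersinia pestis", 118, 83, 140),
    ("Salmonella enterica", 159, 109, 141),
    ("Vibrio cholerae", 126, 101, 217),
    ("Shigella flexneri", 183, 120, 254),
]


def recovered_percent(known, returned):
    """Share of known ncRNAs that were returned, as a whole percent."""
    return round(100 * returned / known)


for genome, known, returned, _all_returned in rows:
    print(f"{genome}: {returned}/{known} = {recovered_percent(known, returned)}%")
```

The last column is deliberately left out of the calculation: the record does not say how returned subsequences map onto known ncRNAs, so no precision-style ratio is derived from it here.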