Yuan Mao, Hamed Shateri Najafabadi, Reza Salavati.
Abstract
BACKGROUND: Post-transcriptional regulation of gene expression is the dominant regulatory mechanism in trypanosomatids, as their mRNAs are transcribed from polycistronic units. A few cis-acting RNA elements that affect mRNA stability or translation rate in different life stages of these parasites have been identified in the 3'-untranslated regions of trypanosomatid mRNAs. Other functional RNAs (fRNAs) also play essential roles in these organisms. However, there has been no genome-wide analysis for identification of fRNAs in trypanosomatids.
Year: 2009 PMID: 19653906 PMCID: PMC2907701 DOI: 10.1186/1471-2164-10-355
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Homology table for the predicted ncRNAs. Many candidate ncRNAs can be grouped into several homology clusters, here shown by color labels (clusters 1–15). Only ncRNAs for which there is at least one other predicted ncRNA with homology E-value < 0.0025 and alignment coverage > 50% are shown. The color of each square reflects the BLAST E-value with the sequence in the corresponding row as the query.
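The inclusion rule in the Figure 1 caption can be sketched as a simple filter over pairwise BLAST hits. This is a minimal illustration, not the authors' pipeline: the function name `ncrnas_with_homologs`, the tuple layout of `hits`, and the choice to express coverage as a fraction of the query are all assumptions made for the example.

```python
def ncrnas_with_homologs(hits, max_evalue=0.0025, min_coverage=0.5):
    """Return the query ncRNAs that have at least one *other* ncRNA as a homolog.

    `hits` is an iterable of (query_id, subject_id, evalue, coverage) tuples.
    This layout is hypothetical; coverage is assumed to be a fraction (0 to 1).
    """
    shown = set()
    for query, subject, evalue, coverage in hits:
        # Self-hits are ignored because the caption requires another ncRNA.
        if query != subject and evalue < max_evalue and coverage > min_coverage:
            shown.add(query)
    return shown
```

Because BLAST E-values depend on which sequence is the query, the sketch keeps only the query side of each qualifying hit, which mirrors the row-as-query convention stated in the caption.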
Classification of predicted ncRNAs in the T. brucei genome
| Classification | Within 100 bp of a non-overlapping coding sequence** | Within/flanking ncRNA cluster (no. of ncRNAs >2)*** | Within a strand switch region*** | Elsewhere | Total |
|---|---|---|---|---|---|
| Overlap CDS | 0 | 0 | 0 | 36 | |
| Overlap pseudogene | 0 | 0 | 0 | 1 | |
| Overlap unlikely proteins | 0 | 0 | 0 | 0 | |
| Homologous to rRNA*,** | 0 | 4 | 2 | 1 | |
| Homologous to tRNA* | 1 | 0 | 0 | 0 | |
| Overlap known ncRNA** | 0 | 25 | 7 | 1 | |
| Overlap Ingi/RIME repeat | 0 | 0 | 0 | 0 | |
| Unclassified | 8 | 1 | 1 | 43 | |
Candidate ncRNAs are classified based on either homology with known ncRNAs or overlap with known genomic features. Candidate ncRNAs within each class are further divided into subgroups based on their location relative to known genomic features.
* Each candidate may contain several closely located single ncRNAs, some of which may have already been annotated on the current release of the T. brucei genome. However, at least one ncRNA within each sequence is unannotated, for which a known homolog is found. These unannotated ncRNAs represent novel instances of their classes.
** These categories may overlap.
*** These categories may overlap.
Figure 2. Function-specific motifs in 5' UTRs of T. brucei. The functions in which each motif is significantly overrepresented or underrepresented are indicated in the second column in black and blue text, respectively. Column headings: (a) Mutual information (MI) value; (b) Z-score associated with the MI value; (c) Robustness, obtained from ten jack-knife trials of randomly removing one-third of the genes and reassessing the statistical significance of the resulting MI values; (d) Position bias indicator: "Y" if a position bias is observed; (e) Orientation bias, indicating the orientation of the motif with respect to its associated coding sequence.
Figure 3. Function-specific motifs in 3' UTRs of T. brucei. The functions in which each motif is significantly overrepresented or underrepresented are indicated in the second column in black and blue text, respectively. Column headings are the same as in Figure 2.
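Columns (a) to (c) of the motif figures can be illustrated with a small sketch: mutual information between motif presence and function membership, a Z-score against a shuffled null, and a jack-knife count over ten trials that each drop one-third of the genes. The captions do not say how the Z-score or the significance cutoff were computed, so the permutation null, the `z_threshold=3.0` cutoff, and all function names here are assumptions.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    # Sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ) over observed pairs.
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def mi_z_score(xs, ys, n_shuffles=200, rng=None):
    """Z-score of the observed MI against label shuffles (assumed null model)."""
    rng = rng or random.Random(0)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null.append(mutual_information(xs, shuffled))
    mean = sum(null) / len(null)
    sd = math.sqrt(sum((v - mean) ** 2 for v in null) / (len(null) - 1))
    # A degenerate null (no variance) carries no evidence; report 0.
    return (observed - mean) / sd if sd > 0 else 0.0

def jackknife_robustness(xs, ys, trials=10, drop_fraction=1 / 3,
                         z_threshold=3.0, rng=None):
    """Count the trials in which the motif stays significant after dropping genes."""
    rng = rng or random.Random(0)
    n = len(xs)
    keep = n - int(n * drop_fraction)
    passed = 0
    for _ in range(trials):
        idx = rng.sample(range(n), keep)  # random two-thirds of the genes
        sub_x = [xs[i] for i in idx]
        sub_y = [ys[i] for i in idx]
        if mi_z_score(sub_x, sub_y, rng=rng) >= z_threshold:
            passed += 1
    return passed
```

Here `xs` would hold 1 or 0 for motif presence in each gene's UTR and `ys` would hold 1 or 0 for membership in the function of interest; a robustness of 10 means the motif remained significant in every jack-knife trial.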
Figure 4. Function prediction using regulatory motifs in T. brucei. This figure shows inositol phosphate metabolism (KEGG:tbr00562) as an example. (A) The performance of our naïve Bayesian network using different numbers of motifs for prediction of inositol phosphate metabolism genes. We used two-fold cross-validation to assess the prediction power, with half of the dataset used for training and the other half for validation. Cross-validation was repeated 100 times for each number of motifs, and each time the AUC (area under the curve) of the ROC curve was measured as the prediction power. Error bars show the standard deviation of the AUC. (B) The ROC curve for prediction of inositol phosphate metabolism genes using all 36 predicted motifs. The grey shaded region shows the standard deviation of sensitivity. The diagonal line shows the performance that would be expected if our naïve Bayesian network were unable to predict inositol phosphate metabolism genes. This classifier has a very high specificity (~99%) at sensitivities of up to 20% for this pathway.
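The evaluation described for panel (A) can be sketched as repeated two-fold cross-validation that reports the mean and standard deviation of the AUC. This is an illustration under stated assumptions rather than the authors' code: a Bernoulli naïve Bayes model over binary motif-presence features stands in for the "naïve Bayesian network", each split is stratified by class so that both halves contain pathway genes, and every function name is hypothetical.

```python
import math
import random

def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit class priors and per-motif presence probabilities (Laplace-smoothed)."""
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = (len(rows) + alpha) / (len(X) + 2 * alpha)
        probs = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(len(X[0]))]
        model[label] = (prior, probs)
    return model

def log_odds(model, x):
    """Log posterior odds that a gene belongs to the pathway (class 1)."""
    score = 0.0
    for label, sign in ((1, 1.0), (0, -1.0)):
        prior, probs = model[label]
        ll = math.log(prior)
        for xj, p in zip(x, probs):
            ll += math.log(p if xj else 1.0 - p)
        score += sign * ll
    return score

def auc(scores, labels):
    """AUC via the rank-sum identity: P(score_pos > score_neg), ties count 0.5."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def repeated_two_fold_auc(X, y, repeats=100, seed=0):
    """Train on one random half, score the other; return (mean AUC, SD of AUC)."""
    rng = random.Random(seed)
    pos_idx = [i for i, t in enumerate(y) if t == 1]
    neg_idx = [i for i, t in enumerate(y) if t == 0]
    aucs = []
    for _ in range(repeats):
        rng.shuffle(pos_idx)
        rng.shuffle(neg_idx)
        hp, hn = len(pos_idx) // 2, len(neg_idx) // 2
        train = pos_idx[:hp] + neg_idx[:hn]   # stratified split (assumption)
        test = pos_idx[hp:] + neg_idx[hn:]
        model = train_bernoulli_nb([X[i] for i in train], [y[i] for i in train])
        scores = [log_odds(model, X[i]) for i in test]
        aucs.append(auc(scores, [y[i] for i in test]))
    mean = sum(aucs) / len(aucs)
    sd = math.sqrt(sum((a - mean) ** 2 for a in aucs) / (len(aucs) - 1))
    return mean, sd
```

Calling `repeated_two_fold_auc(X, y)` once per motif count, with `X` restricted to the chosen motifs, would give one point and one error bar of a curve like panel (A). An AUC near 0.5 corresponds to the diagonal line in panel (B).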