Payal Singh, Pradipta Bandyopadhyay, Sudha Bhattacharya, A Krishnamachari, Supratim Sengupta.
Abstract
BACKGROUND: Riboswitches are noncoding RNAs that regulate gene expression by switching from one structural conformation to another on ligand binding. The classes of riboswitches discovered so far are distinguished by the ligand whose binding induces the conformational switch. Every class of riboswitch is characterized by an aptamer domain, which provides the ligand-binding site, and an expression platform, which undergoes the conformational change on ligand binding. The sequence and structure of the aptamer domain are highly conserved among riboswitches of the same class. We propose a method for fast and accurate identification of riboswitches using profile Hidden Markov Models (pHMMs). Our method exploits the high degree of sequence conservation that characterizes the aptamer domain.
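The idea of exploiting sequence conservation can be illustrated with a much-simplified sketch. The code below is not the authors' pHMM: it keeps only per-column match scores (log-odds against a background) and omits the insert and delete states that a full profile HMM has. The toy alignment, the uniform background frequency, the pseudocount, and the function names `build_profile` and `best_hit` are all assumptions made for illustration.

```python
import math

# Hypothetical toy alignment of a conserved motif (not real riboswitch data).
ALIGNMENT = [
    "GGAUC",
    "GGAUC",
    "GGACC",
    "GGAUA",
]
ALPHABET = "ACGU"
BACKGROUND = 0.25  # assumed uniform background frequency for each base


def build_profile(alignment, pseudocount=1.0):
    """Return per-column log-odds scores, smoothed with a pseudocount."""
    profile = []
    for column in zip(*alignment):
        total = len(column) + pseudocount * len(ALPHABET)
        scores = {}
        for base in ALPHABET:
            prob = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(prob / BACKGROUND)
        profile.append(scores)
    return profile


def best_hit(profile, sequence):
    """Slide the profile along the sequence; return (score, start) of the best window."""
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[base] for col, base in zip(profile, window))
        best = max(best, (score, start))
    return best


profile = build_profile(ALIGNMENT)
score, start = best_hit(profile, "AUAUGGAUCAUA")
print(f"best window starts at {start} with log-odds score {score:.2f}")
```

A real pHMM additionally models insertions and deletions relative to the alignment, which this gapless sketch cannot handle.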
Year: 2009 PMID: 19814811 PMCID: PMC2770071 DOI: 10.1186/1471-2105-10-325
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Receiver operating characteristic (ROC) curves. (a) Lysine. (b) Sam alpha.
Sensitivity and specificity for different riboswitch families.
| Riboswitch family | Sensitivity | Specificity |
|---|---|---|
| FMN | 0.99 | 1 |
| Cobalamin | 0.99 | 1 |
| TPP | 0.99 | 1 |
| Purine | 1 | 1 |
| Sam | 1 | 1 |
| Sam alpha | 0.67 | 1 |
| Glms | 1 | 1 |
| Glycine | 1 | 1 |
| Lysine | 0.97 | 1 |
| PreQ1 | 0.95 | 0.95 |
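For reference, sensitivity and specificity follow the standard definitions, sketched below. The counts in the example are hypothetical and are not taken from the paper; they only show how a value such as 0.95 arises.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real riboswitches that the model detects: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-riboswitch sequences correctly rejected: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, chosen only for illustration.
sens = sensitivity(true_pos=19, false_neg=1)
spec = specificity(true_neg=38, false_pos=2)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```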
Figure 2. Comparison of the computational time of CM, RAVENNA and pHMM. Red: CM; green: RAVENNA; pink: pHMM. Time is shown on a log scale. pHMMs are several hundred times faster than CMs. Computing times for the different approaches were measured on 73 complete genomes from the RefSeq database, ranging in size from 20 KB to 13 MB.
Figure 3. Performance of pHMM for different riboswitch classes. (a) Hits found exclusively by CM and common hits found by both pHMM and CM, shown on a normalized scale. Orange indicates common hits; green indicates hits found exclusively by CM. (b) Hits found exclusively by pHMM and common hits found by both pHMM and CM, shown on a normalized scale. Orange indicates common hits; green indicates hits found exclusively by pHMM.
Percentage of CM hits covered by pHMM.
| Riboswitch family | CM hits | pHMM hits | Common hits | Percentage coverage I# | Percentage coverage II# |
|---|---|---|---|---|---|
| FMN | 844 | 831 | 830 | 99.40% | 100% |
| Cobalamin | 1713 | 1807 | 1703 | 99.41% | 99.59% |
| TPP | 2245 | 2250 | 2242 | 99.95% | 99.95% |
| Lysine | 651 | 621 | 618 | 97.45% | 98.25% |
| Glycine | 1185 | 1260 | 1174 | 99.66% | 100% |
| Purine | 595 | 645 | 594 | 99.83% | 99.83% |
| Sam | 1262 | 1278 | 1255 | 99.44% | 99.68% |
| Sam alpha | 194 | 233 | 127 | 69.02% | 75.14% |
| Glms | 172 | 154 | 153 | 96.83% | 97.45% |
| PreQ1 | 332 | 2387 | 267 | 90.94% | 98.84% |
I#: percentage coverage was calculated after removing hits lying within genes, hits in AT-repetitive regions, and hits located far upstream of genes. II#: percentage coverage was calculated after removing hits lying within genes, hits in AT-repetitive regions, hits located far upstream of genes, and hits upstream of putative and hypothetical genes.
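As a rough illustration of how a coverage percentage is formed, the sketch below divides the number of common hits by the number of CM hits. It assumes that the three count columns of the table are, in order, CM hits, pHMM hits, and hits common to both; that reading is an inference from the numbers, not something stated in this record. The function name `coverage_percent` is hypothetical. Because the example uses unfiltered counts, its result is lower than the I# and II# values, which were computed after the filtering steps described in the footnote.

```python
def coverage_percent(common_hits, cm_hits):
    """Percentage of CM hits that were also found by the pHMM."""
    return 100.0 * common_hits / cm_hits


# FMN row, assuming the column order CM hits, pHMM hits, common hits.
raw_fmn = coverage_percent(common_hits=830, cm_hits=844)
print(f"unfiltered FMN coverage: {raw_fmn:.2f}%")
```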
Comparison of the performance of the RibEx package with pHMMs.
| Riboswitch family | Total sequences | RibEx | pHMM |
|---|---|---|---|
| FMN | 183 | 183 | 183 |
| Cobalamin | 306 | 302 | 305 |
| TPP | 496 | 465 | 495 |
| Glycine | 217 | 184 | 217 |
| Lysine | 112 | 82 | 111 |
| Purine | 122 | 107 | 122 |
| Sam | 298 | 298 | 298 |
| Glms | 44 | 44 | 44 |
Figure 4. Flowchart of the approach, illustrating the workflow of our method.