Carl Tony Fakhry, Prajna Kulkarni, Ping Chen, Rahul Kulkarni, Kourosh Zarringhalam.
Abstract
BACKGROUND: Small RNAs (sRNAs) constitute an important class of post-transcriptional regulators that control critical cellular processes in bacteria. Recent research using high-throughput transcriptomic approaches has led to a dramatic increase in the discovery of bacterial sRNAs. However, it is generally believed that the currently identified sRNAs constitute a limited subset of the bacterial sRNA repertoire. In several cases, sRNAs belonging to a specific class are already known and the challenge is to identify additional sRNAs belonging to the same class. In such cases, machine-learning approaches can be used to predict novel sRNAs in a given class.
Keywords: Bacterial small RNA; Boltzmann; CsrA; Machine learning; RNA structure; RsmA; ToxT
Year: 2017 PMID: 28830349 PMCID: PMC5568370 DOI: 10.1186/s12864-017-4057-z
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Schematic representation of BTF feature generation. For the given RNA sequence, a stochastic sample of low-energy secondary structures is generated. Paired nucleotides are indicated by 1 and unpaired nucleotides by 0. For a given SSC triplet such as ω=(CCU, 011), a window of length 3 is slid over the sequence and over each sampled structure, and the frequency of ω is recorded.
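The window scan described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the sampled structures are already available (the structure sampler is not shown), and the function names `pairing_mask` and `ssc_triplet_counts`, the toy sequence, and the toy 0/1 masks are all hypothetical.

```python
from collections import Counter


def pairing_mask(dot_bracket):
    """Convert a dot-bracket structure into a 0/1 string (1 = paired, 0 = unpaired)."""
    return "".join("1" if symbol in "()" else "0" for symbol in dot_bracket)


def ssc_triplet_counts(sequence, masks, k=3):
    """Count every (sequence k-mer, pairing k-mer) combination over all sampled masks."""
    counts = Counter()
    for mask in masks:
        if len(mask) != len(sequence):
            raise ValueError("each mask must be as long as the sequence")
        # Slide a window of length k over the sequence and the mask together.
        for start in range(len(sequence) - k + 1):
            window = (sequence[start:start + k], mask[start:start + k])
            counts[window] += 1
    return counts


# Toy input (assumption): three hand-written masks, not real folding output.
sequence = "GCCUAG"
masks = ["001100", "011000", "101110"]
counts = ssc_triplet_counts(sequence, masks)
print(counts[("CCU", "011")])  # occurrences of omega = (CCU, 011) in the sample
```

Dividing a count by the total number of windows (number of samples times `len(sequence) - k + 1`) would give a relative frequency; whether the original features are raw or normalized counts is not stated in the caption.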
Fig. 2 Features. Counts for Robust Features in classifier training for sRNAs in the (a) RsmA pathway and (b) ToxT pathway
Cross validation results for sRNA classifier
| Class | Sensitivity | Specificity | Accuracy | Precision | AUC |
|---|---|---|---|---|---|
| RsmA | 0.99 | 1 | 1 | 1 | 1 |
| ToxT | 0.91 | 0.93 | 0.92 | 0.93 | 0.99 |
First row: RsmA-regulating sRNAs; second row: sRNA targets of ToxT
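For readers who want to relate the columns to one another, the sketch below computes sensitivity, specificity, accuracy, and precision from a confusion matrix using their standard definitions. The function name `classification_metrics` and the counts passed to it are hypothetical; they are not taken from the paper and were chosen only so that the rounded values match the ToxT row. AUC is omitted because it requires the per-sample scores, which are not given here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate (recall)
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),              # positive predictive value
    }


# Hypothetical counts (assumption): 100 positives and 100 negatives.
metrics = classification_metrics(tp=91, fp=7, tn=93, fn=9)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```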
Predictions of RsmA-regulating sRNAs in selected bacterial species
| Organism | Flanking genes | Orientation | Predicted 5′ end | Predicted 3′ end | Probability |
|---|---|---|---|---|---|
| Acinetobacter ADP1 | ACIAD0018/ACIAD0019 | →←← | 25035 | 24917 | 0.99 |
| | ACIAD2750/ACIAD2751 | ←→← | 2690560 | 2690698 | 0.99 |
| Geobacter sulfurreducens | KN400_0047/KN400_0048 | ←→← | 55576 | 55646 | 0.93 |
| | KN400_1076/phoR | ←→← | 1156573 | 1156660 | 0.99 |
| | KN400_2615 | Antisense | 2843134 | 2843215 | 0.91 |
| Oceanobacillus iheyensis | OB3267/OB3268 | ←→← | 3404835 | 3404912 | 0.95 |
| Pseudomonas putida KT2440 | PP_1864/PP_1865 | →→→ | 2085406 | 2085577 | 0.96 |
| | PP_1865/PP_1866 | →←← | 2087405 | 2087227 | 0.96 |
| | PP_1865/PP_1866 | →→← | 2087652 | 2087827 | 0.85 |
| | asd/PP_1990 | →→→ | 2256149 | 2256329 | 0.97 |
| | PP_2113/PP_2114 | →→→ | 2412827 | 2413009 | 0.84 |
| | PP_2114/PP_2115 | →←→ | 2414845 | 2414666 | 0.91 |
| | PP_2218/PP_2219 | →←→ | 2530804 | 2530622 | 0.93 |
| | PP_3547/PP_3548 | ←→← | 4022257 | 4022439 | 0.95 |
| | PP_3547/PP_3548 | ←←← | 4022850 | 4022673 | 0.95 |
| Pseudomonas syringae pv. tomato DC3000 | PSPTO_1719/PSPTO_1720 | →→← | 1889433 | 1889570 | 0.83 |
| | uvrB/PSPTO_2165 | →→← | 2380918 | 2381056 | 0.85 |
| | PSPTO_2585/amt-2 | →→→ | 2856216 | 2856355 | 0.97 |
| | PSPTO_3273/PSPTO_3274 | →←← | 3699381 | 3699244 | 0.95 |
| | PSPTO_3490/PSPTO_3491 | ←←→ | 3941102 | 3940967 | 0.77 |
| | PSPTO_3490/PSPTO_3491 | ←←→ | 3941740 | 3941605 | 0.93 |
| | PSPTO_3491 | Antisense | 3942111 | 3941974 | 0.93 |
| | fadB/PSPTO_3518 | ←←→ | 3970534 | 3970396 | 0.97 |
| | gcd/PSPTO_4197 | →←← | 4728863 | 4728726 | 0.94 |
| | PSPTO_5182/PSPTO_5183 | →←→ | 5898180 | 5898085 | 0.96 |
| Shigella flexneri | S2642/S2643 | →→→ | 2532358 | 2532472 | 0.93 |
| Vibrio fischeri ES114 | hemB/gpp | →←← | 61386 | 61312 | 0.83 |
| | pgi/cheX | →←← | 315197 | 315113 | 0.91 |
| | rpsO/pnp | →←→ | 525909 | 525838 | 0.96 |
| | ydaL/copG | ←←← | 852728 | 852653 | 0.99 |
| | VF_1096/VF_1097 | →←← | 1212030 | 1211908 | 0.99 |
Arrows indicate the orientations of the predicted sRNA (center) and the two flanking genes
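In the table above, rows whose middle arrow is ← list a 5′ coordinate larger than the 3′ coordinate, and rows whose middle arrow is → list the opposite. The sketch below makes that relationship explicit. It rests on assumptions that are not stated in the source: that → denotes the strand of increasing genome coordinates, and that both end coordinates are inclusive when computing length. The function names are hypothetical.

```python
def strand_and_length(five_prime, three_prime):
    """Infer strand from the end coordinates and return (strand, length in nt)."""
    strand = "+" if five_prime <= three_prime else "-"
    # Assumption: both coordinates are inclusive, hence the +1.
    return strand, abs(three_prime - five_prime) + 1


def matches_orientation(orientation, five_prime, three_prime):
    """Check the middle arrow against the inferred strand; None if no arrows are given."""
    if len(orientation) != 3:
        return None  # e.g. "Antisense" rows carry no arrow triplet
    expected = {"→": "+", "←": "-"}.get(orientation[1])
    if expected is None:
        return None
    return expected == strand_and_length(five_prime, three_prime)[0]


# Three rows copied from the table (orientation, predicted 5' end, predicted 3' end).
rows = [
    ("→←←", 25035, 24917),
    ("←→←", 2690560, 2690698),
    ("Antisense", 2843134, 2843215),
]
for orientation, five_prime, three_prime in rows:
    print(orientation, strand_and_length(five_prime, three_prime),
          matches_orientation(orientation, five_prime, three_prime))
```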
Fig. 3 Structural conservation of RsmA-regulating sRNAs in Pseudomonas syringae as predicted by the RNAz program. The predicted sRNA is highly conserved at the structural level, suggesting that it is functional. Note the presence of the GGA motif in the unpaired region of the predicted secondary structure
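The observation about GGA motifs in unpaired regions can be checked programmatically once a sequence and its predicted structure are available. The following is a minimal sketch, assuming the structure is given in dot-bracket notation with `.` marking unpaired positions; the function name `unpaired_motif_positions` and the toy hairpin are hypothetical and are not the sRNA shown in the figure.

```python
def unpaired_motif_positions(sequence, dot_bracket, motif="GGA"):
    """Return 0-based start positions where the motif occurs entirely in unpaired bases."""
    if len(sequence) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        end = start + len(motif)
        is_motif = sequence[start:end] == motif
        is_unpaired = all(symbol == "." for symbol in dot_bracket[start:end])
        if is_motif and is_unpaired:
            hits.append(start)
    return hits


# Toy hairpin (assumption): a 4-bp stem closing a 5-nt loop that contains GGA.
toy_sequence = "GCGCAGGAUGCGC"
toy_structure = "((((.....))))"
print(unpaired_motif_positions(toy_sequence, toy_structure))
```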
Predictions of ToxT-regulated sRNAs in Vibrio cholerae
| Organism | Flanking genes | Orientation | Predicted 5′ end | Predicted 3′ end | Probability |
|---|---|---|---|---|---|
| Vibrio cholerae | VC_0312/VC_0313 | →←→ | 323707 | 323584 | 0.97 |
| Vibrio cholerae | VC0967 | antisense | 1031946 | 1032143 | 0.97 |
| Vibrio cholerae | VC_1192/VC_1193 | →→← | 1266285 | 1266383 | 0.94 |
| Vibrio cholerae | VC_0249 | antisense | 255195 | 255110 | 0.94 |
| Vibrio cholerae | VC_0994/VC_0995 | ←←→ | 1061082 | 1060968 | 0.98 |
| Vibrio cholerae | VC_1072/VC_1073 | →→→ | 1139343 | 1139442 | 0.94 |
Arrows indicate the orientations of the predicted sRNA (center) and the two flanking genes
Fig. 4 (a) Predicted sRNA and upstream sequence from the VC0970 antisense region (the flanking gene listed in Table 3 is VC0967). The boxes indicate putative ToxT binding sites. The arrow indicates the start position of the sRNA. (b) Predicted secondary structure of the sRNA using MFOLD