Marco Di Salvo, Simone Puccio, Clelia Peano, Stephan Lacour, Pietro Alifano.
Abstract
BACKGROUND: Bacterial genomes use two mechanisms to terminate transcription: "intrinsic" (Rho-independent) termination and Rho-dependent termination. Intrinsic terminators are characterized by an RNA hairpin followed by a run of 6-8 U residues and are relatively easy to identify with one of the many available prediction programs. In contrast, Rho-dependent termination is mediated by the Rho protein factor, which first binds ribosome-free mRNA at a site characterized by C > G content and then reaches the RNA polymerase to induce its release. Unlike for intrinsic terminators, the computational prediction of Rho-dependent terminators in prokaryotes is very difficult because the sequence features required for Rho function are complex and poorly defined. For this reason, no exhaustive program for predicting Rho-dependent terminators exists yet.
Keywords: Motif; RUT site; Rho; Rho-dependent terminators; RhoTermPredict; Transcription termination
Year: 2019 PMID: 30845912 PMCID: PMC6407284 DOI: 10.1186/s12859-019-2704-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Method used for the prediction of putative Rho-dependent terminators. The procedure consists of two steps: i) identification of the RUT site and ii) identification of the RNAP pausing site in a 150 nt long region immediately downstream from the predicted RUT site 3′
Statistics of Rho-dependent terminators predicted by the RhoTermPredict algorithm in the E. coli K-12 genome sequences of the positive set
| Positive dataset size | Regions with at least one prediction (%) | Regions with more predictions (%) | Total number of predictions | Mean C/G content of RUT site |
|---|---|---|---|---|
| 1264 | 64.5 | 17.2 | 1064 | 1.6 |
Fig. 2 Distribution of the C/G content of the RUT sites of predicted terminators
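The C/G content reported for RUT sites can be read as the ratio of cytosine to guanine counts within a candidate site. The following is a minimal sketch under that assumption; the function name `cg_ratio`, the example sequence, and the choice to return infinity for a site with no G are illustrative and not taken from RhoTermPredict.

```python
import math


def cg_ratio(seq: str) -> float:
    """Return count(C) / count(G) for a nucleotide sequence.

    Assumption: a site without any G is reported as infinity.
    """
    s = seq.upper()
    c = s.count("C")
    g = s.count("G")
    if g == 0:
        return math.inf
    return c / g


# Hypothetical 8-nt example: 5 C and 2 G residues.
example_ratio = cg_ratio("CCCGCACG")
```

Under this reading, a value above 1 means the site contains more C than G residues, which matches the C > G criterion described in the abstract.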
Fig. 3 Distribution of the RUT sites of predicted Rho-dependent terminators in E. coli K-12 as a function of their distance from the BST regions. Predicted Rho-dependent terminators are grouped by the distance between the RUT site 3′-end point and the 5′-end point of the annotated BST region
Test results for RhoTermPredict and performance of the Rho-independent terminator tool ARNold on the positive and negative sets of sequencesᵃ
| Tool | TP | FN | FP | TN | Precision (%) | Recall (%) | Specificity (%) | Accuracy (%) | F1-score |
|---|---|---|---|---|---|---|---|---|---|
| RhoTermPredict | 128 | 67 | 46 | 149 | 73.6 | 65.6 | 76.4 | 71.0 | 0.7 |
| ARNold | 11 | 184 | 19 | 176 | 36.7 | 5.6 | 90.3 | 48.0 | 0.1 |
ᵃ Test experiments were repeated 10 times for 195 randomly selected sequences of the positive set of E. coli K-12, and the means were taken
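The percentage columns in the table follow from the four confusion-matrix counts through the standard definitions of precision, recall, specificity, accuracy and F1-score. The sketch below recomputes them; the `Confusion` class and `metrics` function are illustrative names. Because the tabulated counts are means over 10 repetitions, values recomputed from the rounded counts may differ from the table in the last digit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # true positives
    fn: int  # false negatives
    fp: int  # false positives
    tn: int  # true negatives


def metrics(c: Confusion) -> dict:
    """Standard binary-classification metrics, returned as fractions."""
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Counts taken from the E. coli K-12 table rows.
rho = metrics(Confusion(tp=128, fn=67, fp=46, tn=149))
arnold = metrics(Confusion(tp=11, fn=184, fp=19, tn=176))

for name, m in (("RhoTermPredict", rho), ("ARNold", arnold)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

For the RhoTermPredict row, rounding the recomputed percentages to one decimal place reproduces the tabulated values.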
Statistics of Rho-dependent terminators predicted by the RhoTermPredict algorithm in the E. coli K-12 whole genome and IRs, and evaluation with RNA-Seq data
| Dataset | Total number of predictions | Predictions next to expressed DNA regions | Validated predictions (%) |
|---|---|---|---|
| Whole genome | 23,930 | 7200 | 62.4 |
| IRs | 839 | 319 | 70.5 |
Fig. 4 Distribution of RNA-Seq read value ratios for validated genome-wide predictions. Each ratio compares the read value at the 5′-end point of the putative RUT site with the read value 150 nt downstream from the 3′-end point of the putative RUT site
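The read value ratio described in this caption can be sketched as a lookup into a per-base coverage track. The code below is an illustration only: it assumes a forward-strand site, 0-based indices, a coverage list with one read value per genomic position, and a hypothetical function name `read_value_ratio`. How the original analysis handles the reverse strand or zero downstream coverage is not stated here, so those choices are assumptions.

```python
import math
from typing import Sequence


def read_value_ratio(
    coverage: Sequence[float],
    rut_start: int,
    rut_end: int,
    offset: int = 150,
) -> float:
    """Ratio of coverage at the RUT 5'-end to coverage `offset` nt past its 3'-end.

    rut_start: index of the first base of the RUT site (5'-end point).
    rut_end:   index of the last base of the RUT site (3'-end point).
    """
    downstream_pos = rut_end + offset
    if rut_start < 0 or downstream_pos >= len(coverage):
        raise ValueError("site or downstream position outside the coverage track")
    upstream = coverage[rut_start]
    downstream = coverage[downstream_pos]
    if downstream == 0:
        # Assumption: no downstream reads is reported as an infinite ratio.
        return math.inf
    return upstream / downstream


# Toy track: 100 reads over the first 50 positions, then 20 reads.
toy_coverage = [100] * 50 + [20] * 200
toy_ratio = read_value_ratio(toy_coverage, rut_start=10, rut_end=40)
```

A ratio well above 1 indicates that coverage drops between the start of the site and the point 150 nt past its end, which is the pattern expected when transcription terminates in that interval.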
Test results for RhoTermPredict and performance of the Rho-independent terminator tool ARNold on the positive and negative sets of sequences of B. subtilis 168ᵃ
| Tool | TP | FN | FP | TN | Precision (%) | Recall (%) | Specificity (%) | Accuracy (%) | F1-score |
|---|---|---|---|---|---|---|---|---|---|
| RhoTermPredict | 17 | 17 | 5 | 29 | 77.3 | 50.0 | 85.3 | 67.5 | 0.6 |
| ARNold | 4 | 30 | 1 | 33 | 80.0 | 11.8 | 97.0 | 54.4 | 0.2 |
ᵃ Test experiments were repeated 10 times for 34 randomly selected sequences of the negative set of B. subtilis 168 (so that the positive and negative sets have the same size), and the means were taken
Fig. 5 Boxplot of the read value ratios of the predictions for various windows of C/G ratio values
Statistical information about validated predictions from RNA-Seq data
| Statistic | C/G < 1.5 | 1.5 ≤ C/G < 2 | 2 ≤ C/G < 2.5 | 2.5 ≤ C/G < 3 | C/G ≥ 3 |
|---|---|---|---|---|---|
| Number of predictions | 2930 | 1206 | 254 | 67 | 37 |
| Read value ratios mean | 13.3 | 9.3 | 7 | 10.2 | 10.5 |
| Read value ratios std | 168.1 | 25.7 | 9.4 | 13.1 | 10.5 |
| Read value ratios median | 3.2 | 3.3 | 3.4 | 4.4 | 5.2 |
| Read value ratios > 100 | 33 | 13 | 0 | 0 | 0 |
The number of predictions and the mean, median and standard deviation of the read value ratios are reported for various windows of the C/G ratio of the predicted RUT site
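The grouping behind this table can be sketched as binning each validated prediction by the C/G ratio of its RUT site and summarizing the read value ratios per bin. The snippet is a sketch under stated assumptions: the input is a list of hypothetical `(cg_ratio, read_value_ratio)` pairs, the names `summarize`, `EDGES` and `LABELS` are illustrative, and the sample standard deviation is used because the source does not say which estimator was applied.

```python
import statistics
from bisect import bisect_right

# Window boundaries taken from the table header.
EDGES = [1.5, 2.0, 2.5, 3.0]
LABELS = [
    "C/G < 1.5",
    "1.5 <= C/G < 2",
    "2 <= C/G < 2.5",
    "2.5 <= C/G < 3",
    "C/G >= 3",
]


def summarize(predictions):
    """Group (cg_ratio, read_value_ratio) pairs by C/G window and summarize."""
    groups = {label: [] for label in LABELS}
    for cg, ratio in predictions:
        # bisect_right places a value equal to an edge in the upper window.
        groups[LABELS[bisect_right(EDGES, cg)]].append(ratio)

    summary = {}
    for label, ratios in groups.items():
        summary[label] = {
            "n": len(ratios),
            "mean": statistics.mean(ratios) if ratios else None,
            # Sample standard deviation (an assumption); needs >= 2 values.
            "std": statistics.stdev(ratios) if len(ratios) > 1 else None,
            "median": statistics.median(ratios) if ratios else None,
            "over_100": sum(r > 100 for r in ratios),
        }
    return summary


# Made-up pairs, for illustration only.
toy_summary = summarize([(1.2, 2.0), (1.3, 4.0), (1.5, 150.0), (3.0, 5.0)])
```

Reporting the median next to the mean is useful here because a few very large ratios (the "> 100" row) can dominate the mean and standard deviation in the lowest C/G window.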