Faegheh Golabi, Mousa Shamsi, Mohammad Hosein Sedaaghi, Abolfazl Barzegar, Mohammad Saeid Hejazi.
Abstract
Purpose: Riboswitches are special non-coding sequences, usually located in the untranslated regions of mRNAs, that regulate gene expression and, consequently, cellular function. Furthermore, their interaction with antibiotics has recently been implicated, which has raised interest in developing bioinformatics tools for riboswitch studies. Herein, we describe the development and use of a novel block location-based feature extraction (BLBFE) method for the classification of riboswitches.
Keywords: BLBFE; Block location-based feature extraction; Classification; Non-coding RNA; Performance measures; Riboswitch; Sequential blocks
Year: 2019 PMID: 32002367 PMCID: PMC6983983 DOI: 10.15171/apb.2020.012
Source DB: PubMed Journal: Adv Pharm Bull ISSN: 2228-5881
Seven riboswitch families obtained from Rfam 13.0 database
| Family | Rfam accession | Number of sequences | | |
|---|---|---|---|---|
| Lysine | RF00168 | 47 | 183 | 11.06 |
| Cobalamin | RF00174 | 430 | 203 | 15.54 |
| Glycine | RF00504 | 44 | 101 | 15.99 |
| SAM-alpha | RF00521 | 40 | 79 | 1.18 |
| SAM-IV | RF00634 | 40 | 116 | 4.13 |
| Cyclic-di-GMP-I | RF01051 | 155 | 87 | 6 |
| SAH | RF01057 | 52 | 85 | 15.4 |
Results of the application of the sequential block finding (SBF) algorithm for 7 families of riboswitches
| Family | Sequential block | Location |
|---|---|---|
| Lysine | AGAGGUGC | 10 |
| Lysine | AGUAA | 28 |
| Cobalamin | CGGUG | 18 |
| Cobalamin | GCA | 77 |
| Cobalamin | AGC | 92 |
| Cobalamin | AGA | 175 |
| Cobalamin | GACC | 180 |
| Glycine | GGAGA | 13 |
| Glycine | CCGA | 35 |
| SAM-alpha | GUGGU | 11 |
| SAM-alpha | AUUUG | 17 |
| SAM-alpha | GCCACGU | 37 |
| SAM-IV | UCA | 3 |
| SAM-IV | GAG | 7 |
| SAM-IV | CAG | 13 |
| SAM-IV | GCUGG | 32 |
| SAM-IV | CGGCAACC | 38 |
| Cyclic-di-GMP-I | GAAA | 23 |
| Cyclic-di-GMP-I | CGCAAAGC | 35 |
| SAH | GAGGAGCG | 7 |
| SAH | UGC | 16 |
| SAH | AGGCUCGG | 36 |
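To make the idea of a block location concrete, the following minimal Python sketch looks up where each block reported for the Lysine family first occurs in a query sequence. It is not the authors' BLBFE implementation, whose exact feature definition is not given in this excerpt. The function name `block_locations`, the 1-based positions, the value 0 for a missing block, and the toy query sequence are all assumptions made for illustration.

```python
from typing import List

# Blocks reported for the Lysine family in the table above.
LYSINE_BLOCKS: List[str] = ["AGAGGUGC", "AGUAA"]


def block_locations(sequence: str, blocks: List[str]) -> List[int]:
    """Return the 1-based position of each block's first occurrence.

    A block that does not occur is encoded as 0. Both conventions are
    assumptions for this sketch, not the published feature definition.
    """
    # Normalise to upper-case RNA so DNA-style input (T) is also accepted.
    seq = sequence.upper().replace("T", "U")
    # str.find returns -1 when the block is absent, so adding 1 yields 0.
    return [seq.find(block) + 1 for block in blocks]


# Hypothetical query sequence, invented for illustration only.
toy_sequence = "GGCAUAGAGGUGCGAUUCCAGUAAGG"
features = block_locations(toy_sequence, LYSINE_BLOCKS)
print(features)
```

A vector like this, computed against the blocks of every family, is one plausible way to turn block locations into classifier inputs, although the paper may define its features differently.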
Figure 1
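Figure 2
Multiclass confusion matrix for the LDA classifier, based on the features extracted by the block location-based feature extraction (BLBFE) method
| Actual vs. predicted | Lysine | Cobalamin | Glycine | SAM-alpha | SAM-IV | Cyclic-di-GMP-I | SAH |
|---|---|---|---|---|---|---|---|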
| Lysine | 33 | 1 | 10 | 3 | 0 | 0 | 0 |
| Cobalamin | 13 | 363 | 46 | 3 | 1 | 1 | 3 |
| Glycine | 0 | 0 | 44 | 0 | 0 | 0 | 0 |
| SAM-alpha | 0 | 0 | 4 | 36 | 0 | 0 | 0 |
| SAM-IV | 0 | 0 | 4 | 0 | 36 | 0 | 0 |
| Cyclic-di-GMP-I | 0 | 0 | 52 | 0 | 0 | 103 | 0 |
| SAH | 0 | 0 | 12 | 0 | 1 | 0 | 39 |
| TP | 33 | 363 | 44 | 36 | 36 | 103 | 39 |
| FP | 13 | 1 | 128 | 6 | 2 | 1 | 3 |
| TN | 621 | 291 | 610 | 618 | 618 | 551 | 615 |
| FN | 14 | 67 | 0 | 4 | 4 | 52 | 13 |
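The TP, FP and FN rows of the LDA table can be reproduced directly from the matrix, as the sketch below illustrates. It assumes rows are the true families and columns the predicted families, which is the reading consistent with the row sums matching the family sizes. The TN row in this table matches the sum of the other families' diagonal entries (for example 654 - 33 = 621 for Lysine), which differs from the more common one-vs-rest count, so the sketch computes both. The function name `per_class_counts` and the dictionary keys are illustrative choices.

```python
from typing import Dict, List

FAMILIES = ["Lysine", "Cobalamin", "Glycine", "SAM-alpha",
            "SAM-IV", "Cyclic-di-GMP-I", "SAH"]

# LDA confusion matrix from the table above (rows: actual, columns: predicted).
LDA = [
    [33, 1, 10, 3, 0, 0, 0],
    [13, 363, 46, 3, 1, 1, 3],
    [0, 0, 44, 0, 0, 0, 0],
    [0, 0, 4, 36, 0, 0, 0],
    [0, 0, 4, 0, 36, 0, 0],
    [0, 0, 52, 0, 0, 103, 0],
    [0, 0, 12, 0, 1, 0, 39],
]


def per_class_counts(cm: List[List[int]]) -> Dict[str, List[int]]:
    """Derive per-class TP, FP, FN and two TN variants from a confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    trace = sum(cm[i][i] for i in range(n))
    tp = [cm[i][i] for i in range(n)]
    # False positives: everything predicted as class i that is not class i.
    fp = [sum(cm[r][i] for r in range(n)) - tp[i] for i in range(n)]
    # False negatives: members of class i predicted as something else.
    fn = [sum(cm[i]) - tp[i] for i in range(n)]
    # TN as it appears in the table: correct predictions for the other classes.
    tn_diagonal = [trace - tp[i] for i in range(n)]
    # TN in the usual one-vs-rest sense: everything not counted above.
    tn_one_vs_rest = [total - tp[i] - fp[i] - fn[i] for i in range(n)]
    return {"TP": tp, "FP": fp, "FN": fn,
            "TN_diagonal": tn_diagonal, "TN_one_vs_rest": tn_one_vs_rest}


counts = per_class_counts(LDA)
for name, values in counts.items():
    print(name, dict(zip(FAMILIES, values)))
```

The same function can be applied to the other confusion matrices to check their summary rows against the matrix entries.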
Multiclass confusion matrix for the PNN classifier, based on the features extracted by the block location-based feature extraction (BLBFE) method
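| Actual vs. predicted | Lysine | Cobalamin | Glycine | SAM-alpha | SAM-IV | Cyclic-di-GMP-I | SAH |
|---|---|---|---|---|---|---|---|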
| Lysine | 39 | 1 | 4 | 0 | 0 | 1 | 2 |
| Cobalamin | 6 | 407 | 7 | 3 | 1 | 5 | 1 |
| Glycine | 1 | 1 | 39 | 0 | 0 | 3 | 0 |
| SAM-alpha | 0 | 1 | 1 | 35 | 1 | 2 | 0 |
| SAM-IV | 0 | 1 | 0 | 1 | 37 | 1 | 0 |
| Cyclic-di-GMP-I | 1 | 2 | 7 | 2 | 0 | 141 | 2 |
| SAH | 0 | 0 | 2 | 0 | 0 | 2 | 48 |
| TP | 39 | 407 | 39 | 35 | 37 | 141 | 48 |
| FP | 8 | 6 | 21 | 6 | 2 | 14 | 53 |
| TN | 707 | 339 | 707 | 711 | 709 | 605 | 698 |
| FN | 8 | 23 | 5 | 5 | 3 | 14 | 52 |
Multiclass confusion matrix for the KNN classifier, based on the features extracted by the block location-based feature extraction (BLBFE) method
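| Actual vs. predicted | Lysine | Cobalamin | Glycine | SAM-alpha | SAM-IV | Cyclic-di-GMP-I | SAH |
|---|---|---|---|---|---|---|---|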
| Lysine | 39 | 3 | 2 | 0 | 0 | 2 | 1 |
| Cobalamin | 4 | 406 | 9 | 1 | 0 | 7 | 3 |
| Glycine | 0 | 14 | 28 | 0 | 0 | 2 | 0 |
| SAM-alpha | 0 | 5 | 1 | 32 | 0 | 1 | 1 |
| SAM-IV | 1 | 5 | 0 | 0 | 33 | 0 | 1 |
| Cyclic-di-GMP-I | 2 | 9 | 2 | 1 | 0 | 139 | 2 |
| SAH | 1 | 2 | 5 | 0 | 0 | 3 | 41 |
| TP | 39 | 406 | 28 | 32 | 33 | 139 | 41 |
| FP | 8 | 38 | 19 | 2 | 0 | 15 | 49 |
| TN | 679 | 312 | 690 | 686 | 685 | 579 | 677 |
| FN | 8 | 24 | 16 | 8 | 7 | 16 | 52 |
Figure 3
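Figure 4
Multiclass confusion matrix for the decision tree classifier, based on the features extracted by the block location-based feature extraction (BLBFE) method
| Actual vs. predicted | Lysine | Cobalamin | Glycine | SAM-alpha | SAM-IV | Cyclic-di-GMP-I | SAH |
|---|---|---|---|---|---|---|---|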
| Lysine | 40 | 4 | 1 | 0 | 0 | 1 | 1 |
| Cobalamin | 11 | 402 | 8 | 1 | 7 | 1 | 0 |
| Glycine | 1 | 6 | 31 | 1 | 0 | 5 | 0 |
| SAM-alpha | 0 | 4 | 0 | 30 | 4 | 2 | 0 |
| SAM-IV | 1 | 0 | 1 | 1 | 34 | 3 | 0 |
| Cyclic-di-GMP-I | 1 | 7 | 2 | 1 | 11 | 131 | 2 |
| SAH | 1 | 1 | 0 | 1 | 5 | 3 | 41 |
| TP | 40 | 402 | 31 | 30 | 34 | 131 | 41 |
| FP | 15 | 22 | 12 | 5 | 27 | 15 | 44 |
| TN | 669 | 307 | 678 | 679 | 675 | 578 | 668 |
| FN | 7 | 28 | 13 | 10 | 6 | 24 | 52 |
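For a quick side-by-side comparison of the four classifiers, the sketch below derives overall accuracy and per-family sensitivity from the TP rows of the confusion matrices and the family sizes in the first table. These are values computed here from the tabulated counts, not figures quoted from the paper, and the names `overall_accuracy` and `sensitivity` are illustrative.

```python
from typing import Dict, List

FAMILIES = ["Lysine", "Cobalamin", "Glycine", "SAM-alpha",
            "SAM-IV", "Cyclic-di-GMP-I", "SAH"]

# Number of sequences per family, from the Rfam table.
CLASS_SIZES = [47, 430, 44, 40, 40, 155, 52]

# TP rows copied from the four confusion-matrix tables.
TP_ROWS: Dict[str, List[int]] = {
    "LDA": [33, 363, 44, 36, 36, 103, 39],
    "PNN": [39, 407, 39, 35, 37, 141, 48],
    "KNN": [39, 406, 28, 32, 33, 139, 41],
    "Decision tree": [40, 402, 31, 30, 34, 131, 41],
}


def overall_accuracy(tp: List[int], sizes: List[int]) -> float:
    """Fraction of all sequences assigned to their own family."""
    return sum(tp) / sum(sizes)


def sensitivity(tp: List[int], sizes: List[int]) -> List[float]:
    """Per-family recall: correctly classified members over family size."""
    return [t / s for t, s in zip(tp, sizes)]


accuracy = {name: overall_accuracy(tp, CLASS_SIZES) for name, tp in TP_ROWS.items()}
ranking = sorted(accuracy, key=accuracy.get, reverse=True)

for name in ranking:
    print(f"{name}: accuracy = {accuracy[name]:.4f}")
    for family, value in zip(FAMILIES, sensitivity(TP_ROWS[name], CLASS_SIZES)):
        print(f"  {family}: sensitivity = {value:.3f}")
```

Because Cobalamin accounts for 430 of the 808 sequences, overall accuracy is dominated by that family, so the per-family sensitivities are the more informative view of the smaller classes.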