Nazar Zaki, Stefan Wolfsheimer, Gregory Nuel, Sawsan Khuri.
Abstract
BACKGROUND: Conotoxin has been proven to be effective in drug design and could be used to treat various disorders such as schizophrenia, neuromuscular disorders and chronic pain. With the rapidly growing interest in conotoxin, accurate conotoxin superfamily classification tools are desirable to systematize the increasing number of newly discovered sequences and structures. However, despite the significance of conotoxin and the extensive experimental investigations of it, such tools have not been intensively explored.
Year: 2011 PMID: 21619696 PMCID: PMC3133552 DOI: 10.1186/1471-2105-12-217
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Optimal alignment vs. finite-temperature alignment. (a) One highly similar region in the search space. (b) Many competitively similar regions in the search space in each window.
Number of the conotoxin protein examples in each of the four subsets.
| Superfamily | No. of Sequences |
|---|---|
| A-conotoxin | 25 |
| M-conotoxin | 13 |
| O-conotoxin | 61 |
| T-conotoxin | 17 |
| Total | 116 |
Number of the conotoxin protein examples in each of the nine subsets.
| Superfamily | No. of Sequences |
|---|---|
| A-conotoxin | 201 |
| I1-conotoxin | 32 |
| I2-conotoxin | 34 |
| M-conotoxin | 86 |
| O1-conotoxin | 318 |
| O2-conotoxin | 41 |
| O3-conotoxin | 19 |
| D-conotoxin | 18 |
| T-conotoxin | 109 |
| Total | 858 |
The contingency table.
| | Related sequences | Unrelated sequences |
|---|---|---|
| Sequence classified related | True positives (TP) | False positives (FP) |
| Sequence classified unrelated | False negatives (FN) | True negatives (TN) |
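The AC, SN and SP columns reported in the results tables can be derived from the four cells of a contingency table. The following is a minimal sketch that assumes the standard definitions of accuracy, sensitivity and specificity; the `Contingency` class and the example counts are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """Counts from a binary (related vs. unrelated) contingency table."""
    tp: int  # related sequences classified as related
    fp: int  # unrelated sequences classified as related
    fn: int  # related sequences classified as unrelated
    tn: int  # unrelated sequences classified as unrelated

    def accuracy(self) -> float:
        # AC: fraction of all sequences classified correctly
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total

    def sensitivity(self) -> float:
        # SN: fraction of related sequences that were recovered
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # SP: fraction of unrelated sequences that were rejected
        return self.tn / (self.tn + self.fp)


# Illustrative counts only (assumption): 25 related and 91 unrelated sequences.
example = Contingency(tp=24, fp=0, fn=1, tn=91)
print(example.accuracy(), example.sensitivity(), example.specificity())
```

With these made-up counts, one missed related sequence lowers SN while SP stays at 1 because no unrelated sequence was accepted.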
Effectiveness of varying temperature parameter T.
| Temperature T | A | M | O | T | Average |
|---|---|---|---|---|---|
| 1 | 85.8 | 92.61 | 47.73 | 90.34 | 79.12 |
| 2 | 92.05 | 92.61 | 90.91 | 95.45 | 92.755 |
| 3 | 93.75 | 93.18 | 88.07 | 96.59 | 92.898 |
| 4 | 93.18 | 91.48 | 86.36 | 96.59 | 91.903 |
| 5 | 91.48 | 92.05 | 86.93 | 94.32 | 91.195 |
| 6 | 90.91 | 92.05 | 87.5 | 93.75 | 91.053 |
| 7 | 91.48 | 92.61 | 86.93 | 93.75 | 91.193 |
| 8 | 90.34 | 92.61 | 87.5 | 93.75 | 91.05 |
| 9 | 90.34 | 92.61 | 87.5 | 93.75 | 91.05 |
| 10 | 90.34 | 92.61 | 87.5 | 93.75 | 91.05 |
Effectiveness of varying word parameter kmax.
| kmax | A | M | O | T | Average |
|---|---|---|---|---|---|
| 1 | 85.8 | 92.61 | 81.82 | 91.48 | 87.928 |
| 2 | 92.05 | 92.61 | 90.91 | 95.45 | 92.755 |
| 3 | 92.61 | 92.05 | 90.91 | 93.18 | 92.188 |
| 4 | 96.02 | 92.61 | 94.89 | 89.77 | 93.323 |
| 5 | 90.34 | 89.2 | 97.73 | 89.77 | 91.76 |
Effectiveness of varying window size ℓ.
| ℓ | A | M | O | T | Average |
|---|---|---|---|---|---|
| 10 | 86.36 | 93.75 | 73.3 | 65.91 | 79.83 |
| 20 | 93.75 | 94.89 | 94.89 | 89.77 | 93.325 |
| 30 | 96.59 | 97.16 | 93.75 | 94.89 | 95.5975 |
| 40 | 96.59 | 98.3 | 93.75 | 96.02 | 96.165 |
| 50 | 97.16 | 97.73 | 93.18 | 96.59 | 96.165 |
| 60 | 96.59 | 98.86 | 94.32 | 95.45 | 96.305 |
| 70 | 97.73 | 99.43 | 94.32 | 96.59 | 97.0175 |
| 80 | 97.16 | 98.86 | 93.75 | 93.75 | 95.88 |
| 90 | 97.16 | 98.3 | 95.45 | 94.89 | 96.45 |
| 100 | 96.59 | 99.43 | 94.32 | 95.45 | 96.4475 |
| 200 | 96.02 | 98.3 | 98.3 | 94.32 | 96.735 |
| 300 | 99.43 | 98.86 | 99.43 | 99.43 | 99.29 |
| 400 | 97.73 | 98.3 | 95.45 | 96.02 | 96.875 |
| 500 | 97.16 | 96.59 | 96.02 | 93.75 | 95.88 |
| 600 | 96.59 | 94.32 | 96.02 | 94.32 | 95.3125 |
| 700 | 95.45 | 93.18 | 93.75 | 96.02 | 94.6 |
| 800 | 94.89 | 91.48 | 95.45 | 93.75 | 93.8925 |
| 900 | 95.45 | 94.32 | 95.45 | 96.02 | 95.31 |
| 1000 | 95.45 | 90.91 | 93.18 | 94.89 | 93.6075 |
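The Average column in these parameter tables is the unweighted mean of the four per-superfamily accuracies. As a minimal sketch, the snippet below recomputes that mean for a few rows copied from the window-size table and picks the window size with the highest average; the variable names are illustrative, and choosing the parameter by this mean is an assumption about how the table is meant to be read.

```python
from statistics import mean

# Rows copied from the window-size table: window size -> accuracies (%) for A, M, O, T
rows = {
    10: (86.36, 93.75, 73.3, 65.91),
    70: (97.73, 99.43, 94.32, 96.59),
    300: (99.43, 98.86, 99.43, 99.43),
    1000: (95.45, 90.91, 93.18, 94.89),
}

# Unweighted mean across the four superfamilies, as in the Average column
averages = {ell: mean(acc) for ell, acc in rows.items()}

# Window size with the highest mean accuracy among the rows listed here
best_ell = max(averages, key=averages.get)
print(best_ell, round(averages[best_ell], 2))
```

For the window size 300 the unrounded mean is 99.2875, which the table reports as 99.29.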
Overall results based on DATASET-1.
| Conotoxin Superfamily | AC | SN | SP | ROC | 10-fold Cross-Validation |
|---|---|---|---|---|---|
| A | 0.9943 | 0.96 | 1 | 0.9925 | 0.983 |
| M | 0.9886 | 0.9836 | 1 | 0.9976 | 0.9773 |
| O | 0.9943 | 0.9836 | 1 | 0.9998 | 0.9772 |
| T | 0.9943 | 1 | 0.987 | 1 | 0.9943 |
Overall results based on DATASET-2.
| Conotoxin Superfamily | AC | SN | SP | ROC | 10-fold Cross-Validation |
|---|---|---|---|---|---|
| A | 0.9811 | 0.985 | 0.9787 | 0.9981 | 1 |
| I1 | 0.9943 | 0.9375 | 0.998 | 0.9937 | 0.9943 |
| I2 | 0.9925 | 0.9412 | 0.996 | 0.9995 | 1 |
| M | 0.9830 | 0.9535 | 0.9887 | 0.9976 | 0.9659 |
| O1 | 0.9906 | 0.9937 | 0.9858 | 0.9998 | 1 |
| O2 | 0.9943 | 0.9756 | 0.9959 | 0.9996 | 0.9943 |
| O3 | 1 | 1 | 1 | 1 | 1 |
| D | 1 | 1 | 1 | 1 | 1 |
| T | 0.9811 | 0.9541 | 0.9881 | 0.9932 | 0.9773 |
A performance comparison of the SVM-Freescore and other existing methods.
| Method | A: SN (SP) | M: SN (SP) | O: SN (SP) | T: SN (SP) |
|---|---|---|---|---|
| SVM-Freescore | 0.960 (1.000) | 0.984 (1.000) | 0.984 (1.000) | 1.000 (0.987) |
| IDQD | 0.960 (0.923) | 0.923 (1.000) | 0.820 (0.893) | 0.940 (0.940) |
| Multi-class SVMs | 0.840 (0.955) | 0.920 (0.800) | 0.870 (0.869) | 0.940 (0.940) |
| One-versus-rest SVMs | 0.840 (0.955) | 0.846 (1.000) | 0.820 (0.962) | 0.765 (0.929) |
| Least Hamming distance | 0.800 (0.667) | 0.539 (0.539) | 0.771 (0.723) | 0.824 (0.824) |
| ISort | 0.760 (0.792) | 0.692 (0.600) | 0.705 (0.683) | 0.882 (0.790) |
Figure 2. A performance comparison using the traditional Smith-Waterman alignment in conjunction with SVM (SVM-SW) and the SVM-Freescore.