Yong-Cui Wang, Yong Wang, Zhi-Xia Yang, Nai-Yang Deng.
Abstract
BACKGROUND: Enzymes are known as the largest class of proteins, and their functions are usually annotated by the Enzyme Commission (EC), which uses a hierarchical structure, i.e., four numbers separated by periods, to classify enzyme function. Automatically assigning an enzyme to its place in the EC hierarchy is crucial for understanding its specific molecular mechanism.
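To make the four-level structure concrete, the following minimal sketch splits an EC number into the label it implies at each level of the hierarchy. The function name `ec_levels` is an assumption made for illustration and does not come from the paper.

```python
def ec_levels(ec_number):
    """Return the cumulative prefixes of an EC number, one per hierarchy level.

    Hypothetical helper for illustration; not part of the authors' method.
    """
    parts = ec_number.strip().split(".")
    if len(parts) != 4:
        raise ValueError(
            f"expected four period-separated fields, got {ec_number!r}"
        )
    # Level k is identified by the first k fields joined by periods.
    return [".".join(parts[: k + 1]) for k in range(4)]


print(ec_levels("1.1.1.1"))  # ['1', '1.1', '1.1.1', '1.1.1.1']
```

A classifier that works on the second or third level of the hierarchy would use the second or third element of this list as the class label.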
Year: 2011 PMID: 21689481 PMCID: PMC3121122 DOI: 10.1186/1752-0509-5-S1-S6
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 3. The scheme of SVMHL and comparison with standard two-class and multi-class SVMs. Illustration of the modeling procedure of the standard two-class SVM (a), the standard multi-class SVM (b), and SVMHL (c). ϕ: R → H is a mapping from the input space R to a Hilbert space H. In Figure 3(a), red dots have labels y = +1 and blue dots have labels y = -1. In Figure 3(b) and Figure 3(c), red dots have labels y = 1, blue dots have labels y = 2, and yellow dots have labels y = 3.
The statistics of the training and testing datasets for the toy example. Class 1 is non-window glass, Class 2 is float processed building window glass, and Class 3 is non-float processed building window glass.
| Dataset | Class 1 | Class 2 | Class 3 |
|---|---|---|---|
| Training | 20 | 60 | 66 |
| Testing | 9 | 10 | 10 |
The predictive accuracy on the glass testing set.
| Class type | Standard SVM | PMSVMHL | SVMHL |
|---|---|---|---|
| Class 1 | 100% (9/9) | 100% (9/9) | 100% (9/9) |
| Class 2 | 50% (5/10) | 80% (8/10) | 80% (8/10) |
| Class 3 | 100% (10/10) | 100% (10/10) | 100% (10/10) |
| Overall | 82.8% | 93.1% | 93.1% |
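The last row of the table above follows from the per-class counts: it is the total number of correct predictions divided by the total number of test samples. A short sketch of that arithmetic, using only the counts reported in the table (the names `results` and `overall_accuracy` are illustrative):

```python
# (correct, total) per class, copied from the table above.
results = {
    "Standard SVM": [(9, 9), (5, 10), (10, 10)],
    "PMSVMHL": [(9, 9), (8, 10), (10, 10)],
    "SVMHL": [(9, 9), (8, 10), (10, 10)],
}


def overall_accuracy(per_class):
    """Pool correct predictions and sample counts over all classes."""
    correct = sum(c for c, _ in per_class)
    total = sum(t for _, t in per_class)
    return correct / total


for name, per_class in results.items():
    print(f"{name}: {overall_accuracy(per_class):.1%}")
# Standard SVM: 82.8%
# PMSVMHL: 93.1%
# SVMHL: 93.1%
```

The standard SVM gets 24 of 29 test samples right, while both SVMHL variants get 27 of 29; the entire difference comes from the second class.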
Figure 1. The comparison of the feature encoding methods AAC and CTF. The predictive accuracy (the left two subfigures) and MCC (the right two subfigures) on the second and third levels of the Ec1 subset for SVMHL with CTF and SVMHL with AAC.
The predictive accuracy and MCC on the second level of the EC hierarchy for AM-SVM and SVMHL.
| Family name | AM-SVM Accuracy (%) | AM-SVM MCC | SVMHL Accuracy (%) | SVMHL MCC |
|---|---|---|---|---|
| | 95.3 ± 3.8 | 0.95 ± 2.9% | 98.1 ± 4.9 | 0.98 ± 2.6% |
| | 94.1 ± 2.9 | 0.90 ± 8.4% | 97.6 ± 2.6 | 0.93 ± 8.2% |
| | 92.9 ± 3.9 | 0.91 ± 6.6% | 95.4 ± 3.7 | 0.94 ± 6.3% |
| | 93.6 ± 9.1 | 0.93 ± 5.4% | 95.8 ± 8.3 | 0.96 ± 4.7% |
| | 94.7 ± 6.4 | 0.89 ± 4.9% | 96.8 ± 6.2 | 0.92 ± 4.1% |
| | 89.2 ± 6.9 | 0.93 ± 5.1% | 90.1 ± 6.1 | 0.96 ± 6.2% |
The predictive accuracy and MCC on the third level of the EC hierarchy for AM-SVM and SVMHL.
| Family name | AM-SVM Accuracy (%) | AM-SVM MCC | SVMHL Accuracy (%) | SVMHL MCC |
|---|---|---|---|---|
| | 96.2 ± 4.4 | 0.96 ± 3.2% | 98.3 ± 4.6 | 0.98 ± 2.4% |
| | 89.2 ± 9.8 | 0.91 ± 10.1% | 92.1 ± 9.7 | 0.92 ± 9.5% |
| | 78.9 ± 5.2 | 0.81 ± 9.7% | 81.7 ± 4.9 | 0.81 ± 8.9% |
| | 95.6 ± 7.1 | 0.94 ± 3.4% | 96.7 ± 6.7 | 0.97 ± 2.9% |
| | 78.8 ± 4.1 | 0.89 ± 3.1% | 81.3 ± 3.4 | 0.91 ± 2.7% |
| | 84.5 ± 7.1 | 0.87 ± 8.4% | 86.4 ± 6.4 | 0.91 ± 8.7% |
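For readers unfamiliar with the MCC column, the sketch below computes the standard two-class Matthews correlation coefficient from confusion-matrix counts. It assumes a one-versus-rest reading of each class; the record does not state how the authors aggregated MCC across classes or folds, so this is only the textbook definition, and the counts in the example are made up for illustration.

```python
import math


def mcc(tp, tn, fp, fn):
    """Standard binary Matthews correlation coefficient.

    Returns 0.0 when any marginal total is zero, a common convention
    (assumed here) for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, not taken from the paper.
print(mcc(tp=5, tn=5, fp=0, fn=0))  # 1.0  (perfect prediction)
print(mcc(tp=0, tn=0, fp=5, fn=5))  # -1.0 (every prediction wrong)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the classes are of very different sizes.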
Figure 2. The protein structure of the enzyme 1MEK and its catalytic residues. The structure of 1MEK illustrates the importance of considering neighboring amino acids when predicting enzyme function. The colored spheres in the structure and the colored residues in the sequence represent the catalytic residues. 1MEK contains adjacent catalytic residues in its A chain: Cys (C: 36), Gly (G: 37), His (H: 38), Cys (C: 39). The letter and number in brackets give each residue's one-letter abbreviation and its position in the protein sequence.
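One simple way to expose neighboring residues to a classifier is to enumerate every window of three consecutive residues. The sketch below does this for the four adjacent catalytic residues named in the caption. It is an illustration of the idea only, not the authors' CTF encoding, and the function name `consecutive_triads` is an assumption.

```python
def consecutive_triads(sequence):
    """Return every window of three consecutive residues in the sequence."""
    return [sequence[i:i + 3] for i in range(len(sequence) - 2)]


# Residues 36-39 of chain A of 1MEK, as listed in the caption above.
fragment = "CGHC"
print(consecutive_triads(fragment))  # ['CGH', 'GHC']
```

Both windows contain only catalytic residues, which is the kind of local context that a per-residue composition count cannot capture.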