Guo-Zheng Li, Hao-Hua Meng, Wen-Cong Lu, Jack Y Yang, Mary Qu Yang.
Abstract
BACKGROUND: The activities of drug molecules can be predicted with QSAR (quantitative structure-activity relationship) models, which avoid the high cost and long cycle of traditional experimental methods. Because molecules with positive activity are far fewer than those with negative activity, it is important to predict molecular activities with this class imbalance taken into account.
Year: 2008 PMID: 18541060 PMCID: PMC2423448 DOI: 10.1186/1471-2105-9-S6-S7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Performance of different learning algorithms. Both graphs show BACC scores. Top: results grouped by descriptor. Bottom: results grouped by learning algorithm.
AUC statistics (%).
| Descriptor | SVM | unSVM | Bagging | unBagging | asBagging | PRIFEAB |
| BCUT | 59.4(1.3) | 61.2(1.3) | 55.0(1.0) | 55.2(1.0) | 75.3(0.8) | 75.8(0.6) |
| CONST | 50.8(0.8) | 59.3(1.9) | 50.3(1.1) | 50.4(1.1) | 75.0(0.3) | 75.3(0.5) |
| PROP | 62.3(1.4) | 63.0(1.2) | 55.4(1.5) | 55.5(1.3) | 78.0(0.9) | 78.3(0.9) |
| TOPO | 57.7(1.0) | 50.8(0.8) | 54.0(1.1) | 54.1(2.0) | 73.4(0.5) | 73.6(0.7) |
| Average | 57.6(1.1) | 58.6(1.3) | 53.7(1.2) | 53.8(1.4) | 75.4(0.6) | 75.8(0.7) |
Values outside parentheses are the mean of the respective performance measure; values inside parentheses are the standard deviation across 10 runs of 3-fold cross-validation.
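The mean(standard deviation) entries summarize 10 repetitions of 3-fold cross-validation. The following is a minimal sketch of how such a summary could be produced, not the authors' code. The helper names `repeated_kfold` and `summarize` are hypothetical, the splits are not stratified (with few positives a stratified split would usually be preferable), and the use of the sample standard deviation over per-repeat scores is an assumption.

```python
import random
import statistics


def repeated_kfold(n_samples, n_folds=3, n_repeats=10, seed=0):
    """Return, for each repeat, a list of n_folds disjoint test-index lists."""
    rng = random.Random(seed)
    repeats = []
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        repeats.append([order[k::n_folds] for k in range(n_folds)])
    return repeats


def summarize(per_repeat_scores):
    """Format scores (in %) as 'mean(sd)' with one decimal place."""
    mean = statistics.mean(per_repeat_scores)
    sd = statistics.stdev(per_repeat_scores)  # sample SD: an assumption
    return f"{mean:.1f}({sd:.1f})"


splits = repeated_kfold(n_samples=9)
print(len(splits), len(splits[0]))    # 10 repeats, 3 folds each
print(summarize([59.0, 60.0, 61.0]))  # hypothetical scores -> 60.0(1.0)
```

In practice each fold's test indices would be scored by a trained classifier, and the per-repeat score would be the average over its three folds.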
BACC statistics (%).
| Descriptor | SVM | unSVM | Bagging | unBagging | asBagging | PRIFEAB |
| BCUT | 60.0(0.3) | 60.8(0.9) | 54.5(1.0) | 54.2(1.0) | 68.7(0.7) | 67.8(0.5) |
| CONST | 50.0(0.1) | 60.1(0.6) | 50.0(0.1) | 50.1(0.3) | 69.1(0.4) | 69.3(0.5) |
| PROP | 62.0(1.0) | 62.8(0.4) | 55.6(1.1) | 54.7(1.3) | 70.3(0.1) | 71.0(0.9) |
| TOPO | 57.4(0.5) | 50.0(0.0) | 53.2(0.1) | 52.8(1.0) | 66.8(0.4) | 67.3(0.8) |
| Average | 57.4(0.5) | 58.4(0.5) | 53.3(0.6) | 53.0(0.9) | 68.7(0.4) | 68.9(0.7) |
Values outside parentheses are the mean of the respective performance measure; values inside parentheses are the standard deviation across 10 runs of 3-fold cross-validation.
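The BACC values are consistent with the usual definition of balanced accuracy as the mean of sensitivity and specificity: for SVM on BCUT, the reported sensitivity of 20.4 and specificity of 99.6 give (20.4 + 99.6) / 2 = 60.0. The sketch below computes these measures from confusion-matrix counts. The function name `rates` and the counts are hypothetical and chosen only to illustrate how plain accuracy can stay high while balanced accuracy exposes poor recall of the minority class.

```python
def rates(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and balanced accuracy."""
    sensitivity = tp / (tp + fn)   # fraction of positives recovered
    specificity = tn / (tn + fp)   # fraction of negatives recovered
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    bacc = (sensitivity + specificity) / 2
    return sensitivity, specificity, accuracy, bacc


# Hypothetical counts: 100 positives and 1000 negatives.
sens, spec, acc, bacc = rates(tp=20, fn=80, tn=990, fp=10)
print(f"accuracy={acc:.3f} bacc={bacc:.3f}")
```

Because negatives dominate, accuracy is driven almost entirely by specificity, whereas balanced accuracy weights both classes equally.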
Sensitivity statistics (%).
| Descriptor | SVM | unSVM | Bagging | unBagging | asBagging | PRIFEAB |
| BCUT | 20.4(0.7) | 22.3(1.7) | 9.8(1.8) | 8.8(2.0) | 69.1(1.6) | 68.4(1.2) |
| CONST | 0.1(0.1) | 20.4(1.2) | 0.1(0.1) | 0.2(0.5) | 73.0(1.0) | 72.5(1.0) |
| PROP | 24.5(1.9) | 26.5(1.0) | 12.0(2.3) | 9.9(2.5) | 70.4(1.8) | 71.2(1.9) |
| TOPO | 15.4(1.0) | 0.0(0.0) | 7.8(2.3) | 6.3(1.8) | 66.0(0.9) | 66.8(1.6) |
| Average | 15.1(0.9) | 17.3(1.0) | 7.4(1.6) | 6.3(1.7) | 69.6(1.3) | 69.7(1.4) |
Values outside parentheses are the mean of the respective performance measure; values inside parentheses are the standard deviation across 10 runs of 3-fold cross-validation.
Specificity statistics (%).
| Descriptor | SVM | unSVM | Bagging | unBagging | asBagging | PRIFEAB |
| BCUT | 99.6(0.1) | 99.3(0.1) | 99.2(0.1) | 99.6(0.1) | 68.3(0.4) | 67.3(0.2) |
| CONST | 99.9(0.1) | 99.7(0.1) | 99.9(0.1) | 99.9(0.1) | 65.2(0.5) | 66.1(0.3) |
| PROP | 99.5(0.1) | 99.1(0.1) | 99.3(0.1) | 99.4(0.2) | 70.3(0.4) | 70.8(0.2) |
| TOPO | 99.3(0.1) | 100.0(0.0) | 98.6(0.3) | 99.2(0.1) | 67.7(0.3) | 67.7(0.3) |
| Average | 99.6(0.1) | 99.5(0.1) | 99.3(0.1) | 99.5(0.1) | 67.9(0.4) | 68.0(0.3) |
Values outside parentheses are the mean of the respective performance measure; values inside parentheses are the standard deviation across 10 runs of 3-fold cross-validation.
Positive predictive value (PPV) statistics (%).
| Descriptor | SVM | unSVM | Bagging | unBagging | asBagging | PRIFEAB |
| BCUT | 50.4(2.6) | 38.6(2.8) | 19.1(2.7) | 27.0(4.1) | 3.9(0.1) | 3.8(0.1) |
| CONST | NaN(NaN) | 60.7(3.6) | NaN(NaN) | NaN(NaN) | 3.8(0.1) | 3.9(0.1) |
| PROP | 4.6(0.2) | 3.6(1.3) | 23.2(2.2) | 25.7(5.4) | 4.3(0.1) | 4.4(0.1) |
| TOPO | 3.0(1.8) | NaN(NaN) | 9.7(1.5) | 12.8(4.2) | 3.7(0.1) | 3.7(0.1) |
| Average | 19.3(1.5) | 34.3(2.6) | 17.3(2.1) | 21.8(4.6) | 3.9(0.1) | 4.0(0.1) |
Values outside parentheses are the mean of the respective performance measure; values inside parentheses are the standard deviation across 10 runs of 3-fold cross-validation.
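The NaN entries presumably arise when a classifier predicts no positives at all, so that PPV = TP / (TP + FP) has a zero denominator. For unSVM on TOPO, the reported sensitivity of 0.0 and specificity of 100.0 imply TP = FP = 0. The sketch below illustrates this, and also shows why PPV can remain low at high sensitivity when positives are rare. The function names and the 2% prevalence are hypothetical illustration values, not figures taken from the source.

```python
import math


def ppv(tp, fp):
    """Positive predictive value; NaN when nothing is predicted positive."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        return math.nan
    return tp / predicted_positive


def ppv_from_rates(sensitivity, specificity, prevalence):
    """PPV implied by sensitivity, specificity and positive-class prevalence."""
    tp = sensitivity * prevalence              # true-positive fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return ppv(tp, fp)


print(ppv(0, 0))                         # nan: no positive predictions
print(ppv_from_rates(0.70, 0.68, 0.02))  # hypothetical 2% prevalence
```

With a rare positive class, even a modest false-positive rate produces far more false positives than true positives, which drives PPV down.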
Negative predictive value (NPV) statistics (%).
| Descriptor | SVM | unSVM | Bagging | unBagging | asBagging | PRIFEAB |
| BCUT | 98.5(0.1) | 98.6(0.1) | 98.3(0.1) | 98.3(0.1) | 99.2(0.1) | 99.1(0.1) |
| CONST | 98.2(0.1) | 98.5(0.1) | 98.2(0.1) | 98.2(0.1) | 99.2(0.1) | 99.2(0.1) |
| PROP | 98.5(0.3) | 98.6(0.1) | 98.4(0.1) | 98.3(0.1) | 99.2(0.1) | 99.2(0.1) |
| TOPO | 98.4(0.1) | 98.2(0.0) | 98.3(0.1) | 98.3(0.1) | 99.1(0.1) | 99.1(0.1) |
| Average | 98.4(0.1) | 98.5(0.1) | 98.3(0.1) | 98.3(0.1) | 99.2(0.1) | 99.2(0.1) |
Values outside parentheses are the mean of the respective performance measure; values inside parentheses are the standard deviation across 10 runs of 3-fold cross-validation.
Correction (overall accuracy) statistics (%).
| Descriptor | SVM | unSVM | Bagging | unBagging | asBagging | PRIFEAB |
| BCUT | 98.2(0.1) | 97.9(0.1) | 97.6(0.1) | 97.8(0.1) | 68.3(0.4) | 67.3(0.2) |
| CONST | 98.2(0.1) | 98.3(0.1) | 98.2(0.1) | 98.2(0.1) | 65.3(0.5) | 66.2(0.3) |
| PROP | 98.1(0.1) | 97.8(0.1) | 97.6(0.1) | 97.8(0.1) | 70.3(0.3) | 70.8(0.2) |
| TOPO | 97.8(0.1) | 98.2(0.0) | 97.0(0.2) | 97.5(0.1) | 67.6(0.3) | 67.7(0.3) |
| Average | 98.1(0.1) | 98.1(0.1) | 97.6(0.1) | 97.8(0.1) | 67.9(0.4) | 68.0(0.3) |
Values outside parentheses are the mean of the respective performance measure; values inside parentheses are the standard deviation across 10 runs of 3-fold cross-validation.
Ratio of the number of features used in PRIFEAB to the total number of features (%).
| BCUT | CONST | PROP | TOPO | Average |
| 93.3(2.0) | 95.9(2.2) | 98.2(0.5) | 99.0(0.1) | 96.6(1.2) |
Values outside parentheses are the mean; values inside parentheses are the standard deviation across 10 runs of 3-fold cross-validation.
Figure 2. The Bagging approach.
Figure 3. The unBagging approach.
Figure 4. The asBagging approach.
Figure 5. The PRIFEAB approach.