Marta Pinto, Michael Trauner, Gerhard F Ecker.
Year: 2012 PMID: 23198001 PMCID: PMC3505902 DOI: 10.1002/minf.201200049
Source DB: PubMed Journal: Mol Inform ISSN: 1868-1743 Impact factor: 3.353
Composition of the training (TR) and test (TS) sets considered in the present work. S: substrates; NS: non-substrates.
| PCC | TR/TS selection | Total S | Total NS | TR S | TR NS | TS S | TS NS |
|---|---|---|---|---|---|---|---|
| 0.25 | MACCS | 154 | 1050 | 126 | 838 | 28 | 212 |
| 0.25 | Descriptors | 154 | 1050 | 123 | 841 | 31 | 209 |
| 0.25 | Random | 154 | 1050 | 123 | 840 | 31 | 210 |
| 0.30 | MACCS | 98 | 1106 | 85 | 879 | 13 | 227 |
| 0.30 | Descriptors | 98 | 1106 | 80 | 884 | 18 | 222 |
| 0.30 | Random | 98 | 1106 | 78 | 962 | 20 | 222 |
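The composition table above can be checked mechanically. The following is a minimal sketch, not part of the original work: it transcribes the rows as they appear here, computes the fraction of substrates in each set to show how imbalanced the classes are, and tests whether the TR and TS counts add up to the totals. The names `ROWS`, `summarize`, and `SUMMARIES` are illustrative only.

```python
# Rows transcribed from the composition table above:
# (PCC, TR/TS selection, total S, total NS, TR S, TR NS, TS S, TS NS)
ROWS = [
    (0.25, "MACCS", 154, 1050, 126, 838, 28, 212),
    (0.25, "Descriptors", 154, 1050, 123, 841, 31, 209),
    (0.25, "Random", 154, 1050, 123, 840, 31, 210),
    (0.30, "MACCS", 98, 1106, 85, 879, 13, 227),
    (0.30, "Descriptors", 98, 1106, 80, 884, 18, 222),
    (0.30, "Random", 98, 1106, 78, 962, 20, 222),
]


def summarize(row):
    """Return class fractions and additivity checks for one table row."""
    pcc, selection, tot_s, tot_ns, tr_s, tr_ns, ts_s, ts_ns = row
    return {
        "pcc": pcc,
        "selection": selection,
        # Fraction of substrates (minority class) in each set.
        "substrate_fraction_total": tot_s / (tot_s + tot_ns),
        "substrate_fraction_tr": tr_s / (tr_s + tr_ns),
        "substrate_fraction_ts": ts_s / (ts_s + ts_ns),
        # Do training and test counts add up to the reported totals?
        "s_consistent": tr_s + ts_s == tot_s,
        "ns_consistent": tr_ns + ts_ns == tot_ns,
    }


SUMMARIES = [summarize(row) for row in ROWS]

if __name__ == "__main__":
    for s in SUMMARIES:
        print(
            f"PCC={s['pcc']:.2f} {s['selection']:<12}"
            f" S fraction (total)={s['substrate_fraction_total']:.3f}"
            f" S ok={s['s_consistent']} NS ok={s['ns_consistent']}"
        )
```

Substrates make up roughly 13% of the compounds at PCC 0.25 and roughly 8% at PCC 0.30, which is the imbalance the misclassification costs in the performance table are meant to offset. As transcribed here, the 0.30/Random row is the only one whose NS counts do not add up (962 + 222 = 1184 rather than 1106); the sketch only flags this and does not say which figure is off.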
Figure 1. Schematic representation of the procedure used to build models for ABCC2 substrate prediction.
Performance of the best models obtained for the noncharged and charged initial databases at the PCC values considered in the present work (0.25 and 0.30). For each model, results are given for the test set (TS) and for 5-fold and 10-fold cross-validation on the training set (TR).
| PCC | Database | TR/TS selection | Misclassification cost (FN : FP) | Machine learning algorithm | Evaluation | Specificity (%) | Sensitivity (%) | Precision | G-mean | MCC | Accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.25 | Noncharged | Random | 65 : 2.5 | Bag[a]+J48 | TS | 72.38 | 70.97 | 0.28 | 0.72 | 0.31 | 71.67 |
| 0.25 | Noncharged | Random | 65 : 2.5 | Bag+J48 | 5-fold (TR) | 72.62 | 67.48 | 0.27 | 0.70 | 0.29 | 70.05 |
| 0.25 | Noncharged | Random | 65 : 2.5 | Bag+J48 | 10-fold (TR) | 72.86 | 65.85 | 0.26 | 0.69 | 0.28 | 69.36 |
| 0.25 | Charged | Random | 150 : 3.5 | RF[b] | TS | 71.90 | 77.42 | 0.29 | 0.75 | 0.35 | 74.66 |
| 0.25 | Charged | Random | 150 : 3.5 | RF | 5-fold (TR) | 67.26 | 73.17 | 0.36 | 0.70 | 0.28 | 70.22 |
| 0.25 | Charged | Random | 150 : 3.5 | RF | 10-fold (TR) | 67.26 | 76.42 | 0.37 | 0.72 | 0.30 | 71.84 |
| 0.30 | Noncharged | Descriptors | 81 : 1.20 | Bag+J48 | TS | 75.23 | 77.78 | 0.20 | 0.76 | 0.31 | 76.50 |
| 0.30 | Noncharged | Descriptors | 81 : 1.20 | Bag+J48 | 5-fold (TR) | 67.76 | 68.76 | 0.16 | 0.68 | 0.21 | 68.26 |
| 0.30 | Noncharged | Descriptors | 81 : 1.20 | Bag+J48 | 10-fold (TR) | 72.62 | 62.50 | 0.17 | 0.67 | 0.21 | 67.56 |
| 0.30 | Charged | Descriptors | 80 : 1.10 | RF | TS | 73.87 | 77.78 | 0.19 | 0.76 | 0.30 | 75.83 |
| 0.30 | Charged | Descriptors | 80 : 1.10 | RF | 5-fold (TR) | 66.52 | 68.75 | 0.16 | 0.68 | 0.20 | 67.63 |
| 0.30 | Charged | Descriptors | 80 : 1.10 | RF | 10-fold (TR) | 71.27 | 67.50 | 0.18 | 0.69 | 0.23 | 69.38 |
[a] Bag denotes the bagging algorithm. [b] RF refers to the Random Forest machine learning method.
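The performance measures reported above all derive from a confusion matrix. The sketch below uses the standard textbook definitions; it is an illustration and not the authors' code. The example counts (TP = 22, FN = 9, TN = 152, FP = 58) are an assumption: they are not given in the source but were back-calculated from the reported percentages for the noncharged model with random TR/TS selection and its test set of 31 substrates and 210 non-substrates. The function name `classification_metrics` is hypothetical.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on substrates
    specificity = tn / (tn + fp)   # recall on non-substrates
    precision = tp / (tp + fp)     # fraction of predicted substrates that are real
    g_mean = math.sqrt(sensitivity * specificity)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "g_mean": g_mean,
        "mcc": mcc,
        # Mean of sensitivity and specificity (often called balanced accuracy).
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Assumed counts, back-calculated from the reported percentages (not in the source).
EXAMPLE = classification_metrics(tp=22, fn=9, tn=152, fp=58)

if __name__ == "__main__":
    for name, value in EXAMPLE.items():
        print(f"{name}: {value:.4f}")
```

With these assumed counts, the sensitivity, specificity, G-mean, and MCC round to the values in the corresponding table row. The reported Accuracy for that row (71.67) coincides with the mean of sensitivity and specificity rather than with the plain fraction of correct predictions; this is an observation from the numbers, not a definition stated in the source.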
Set of 2D MOE2010 descriptors used to build each one of the models shown in Table 2.
| Descriptor | Description | PCC=0.25, Noncharged | PCC=0.25, Charged | PCC=0.30, Noncharged | PCC=0.30, Charged |
|---|---|---|---|---|---|
| a_don | Number of hydrogen bond donor atoms. | | | | |
| a_nBr | Number of bromine atoms. | | | | |
| a_nCl | Number of chlorine atoms. | | | | |
| a_nN | Number of nitrogen atoms. | | | | |
| a_nO | Number of oxygen atoms. | | | | |
| a_nS | Number of sulfur atoms. | | | | |
| b_count | Number of bonds. | | | | |
| PEOE_VSA+1 | Sum of the van der Waals surface area of atoms having a charge in the range [0.05, 0.10). | | | | |
| PEOE_VSA+2 | Sum of the van der Waals surface area of atoms having a charge in the range [0.10, 0.15). | | | | |
| PEOE_VSA+3 | Sum of the van der Waals surface area of atoms having a charge in the range [0.15, 0.20). | | | | |
| PEOE_VSA+4 | Sum of the van der Waals surface area of atoms having a charge in the range [0.20, 0.25). | | | | |
| PEOE_VSA_FNEG | Fractional negative van der Waals area. | | | | |
| PEOE_VSA_FPOS | Fractional positive van der Waals area. | | | | |
| PEOE_VSA_POS | Total positive van der Waals area. | | | | |
| PEOE_VSA_PPOS | Total polar positive van der Waals area. | | | | |
| rings | Number of rings. | | | | |
| SlogP_VSA0 | Sum of the van der Waals surface area of atoms whose contribution to logP is ≤−0.40. | | | | |
| SlogP_VSA1 | Sum of the van der Waals surface area of atoms whose contribution to logP is in (−0.40, −0.20]. | | | | |
| SMR_VSA1 | Sum of the van der Waals surface area of atoms whose contribution to MR is in (0.11, 0.26]. | | | | |
| SMR_VSA2 | Sum of the van der Waals surface area of atoms whose contribution to MR is in (0.26, 0.35]. | | | | |
| SMR_VSA4 | Sum of the van der Waals surface area of atoms whose contribution to MR is in (0.39, 0.44]. | | | | |
| TPSA | Total polar surface area (connection table approximation). | | | | |
| vsa_base | Number of basic atoms. | | | | |
| vsa_don | Sum of van der Waals surface areas of pure hydrogen bond donor atoms. | | | | |
| vsa_other | Sum of van der Waals surface areas of atoms typed as “others”. | | | | |
Figure 2. Principal component analysis of the 16 descriptors used to build the model.