Katsumi Omagari, Daisuke Mitomo, Satoru Kubota, Haruki Nakamura, Yoshifumi Fukunishi.
Abstract
We examined procedures for combining the results of two different in silico drug-screening methods to achieve a high hit ratio. When the 3D structure of the target protein and some active compounds are known, both structure-based and ligand-based in silico screening methods can be applied. In the present study, the machine-learning score modification multiple target screening (MSM-MTS) method was adopted as the structure-based screening method, and the machine-learning docking score index (ML-DSI) method was adopted as the ligand-based screening method. To combine the compound sets predicted by these two screening methods, we examined the product of the sets (the consensus set, i.e. their intersection) and the sum of the sets (their union). The consensus set achieved a higher hit ratio than either the sum of the sets or each individual predicted set. In addition, the combination was shown to be robust against structural diversity of the target protein, both among different crystal structures and among snapshot structures taken from molecular dynamics simulations.
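The two ways of combining the predicted sets can be illustrated with a minimal Python sketch. The compound identifiers and set contents below are toy data invented for illustration, not results from the study, and `set_hit_ratio` assumes that the hit ratio of a predicted set is the percentage of its members that are known actives; the abstract does not spell out the exact formula.

```python
def combine(structure_based, ligand_based):
    """Return the consensus (product) and sum sets of two predicted sets."""
    a, b = set(structure_based), set(ligand_based)
    return {"consensus": a & b, "sum": a | b}


def set_hit_ratio(predicted, actives):
    """Percentage of predicted compounds that are active (assumed definition)."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return 100.0 * len(predicted & set(actives)) / len(predicted)


# Toy data (hypothetical): three known actives and two predicted sets.
actives = {"c1", "c2", "c3"}
msm_mts_set = {"c1", "c2", "c4", "c5"}             # structure-based prediction
ml_dsi_set = {"c1", "c2", "c3", "c6", "c7", "c8"}  # ligand-based prediction

combined = combine(msm_mts_set, ml_dsi_set)
ratios = {
    "MSM-MTS": set_hit_ratio(msm_mts_set, actives),
    "ML-DSI": set_hit_ratio(ml_dsi_set, actives),
    "sum": set_hit_ratio(combined["sum"], actives),
    "consensus": set_hit_ratio(combined["consensus"], actives),
}
```

In this toy case the consensus set is smaller but purer than either input, while the sum set recovers every active at the cost of a lower hit ratio. That is the trade-off the abstract describes, although real screening results need not behave this cleanly.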
Keywords: conformation of active site; consensus score; in silico; protein-based screening; protein-ligand docking; screening
Year: 2008 PMID: 21918604 PMCID: PMC3169939 DOI: 10.2147/aabc.s3767
Source DB: PubMed Journal: Adv Appl Bioinform Chem ISSN: 1178-6949
Figure 1. Schematic representation of the screening methods in the current study. The same procedure was applied to models A and B. The protein set consists of the proteins listed in Appendixes A and B.
q values and hit ratios for target protein models A and B
| Protein | PDB | RMSD A | RMSD B | Conventional A | Conventional B | MTS&MASC A | MTS&MASC B | DSM A | DSM B | MSM A | MSM B | ML-DSI A | ML-DSI B | Sum A | Sum B | Consensus A | Consensus B |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HIVP | 3hvp | 0.0 | 2.0 | 89.4 | 85.4 | 89.0 | 83.5 | 89.0 | 89.2 | 91.2 | 90.9 | 83.1 | 82.2 | 90.6 | 90.9 | 89.6 | 90.3 |
| | 1aid | 1.4 | 1.8 | 85.3 | 87.6 | 86.9 | 87.5 | 88.8 | 89.6 | 91.7 | 91.7 | 80.0 | 83.4 | 92.4 | 91.5 | 90.5 | 90.2 |
| | 1hpx | 2.3 | 2.6 | 95.4 | 80.2 | 94.7 | 81.5 | 94.3 | 89.7 | 92.2 | 92.0 | 84.4 | 80.6 | 91.8 | 91.4 | 90.7 | 91.6 |
| | 1htf1 | 2.4 | 2.3 | 88.9 | 87.7 | 82.0 | 83.2 | 92.5 | 90.6 | 91.0 | 90.5 | 83.3 | 85.9 | 90.9 | 90.4 | 92.8 | 91.5 |
| | 1htf2 | 2.4 | 2.0 | 83.2 | 87.2 | 71.9 | 80.0 | 93.0 | 90.7 | 92.2 | 92.7 | 77.4 | 75.6 | 91.9 | 91.5 | 91.8 | 91.3 |
| | 1ivp | 2.2 | 2.5 | 82.7 | 64.4 | 74.8 | 49.5 | 90.9 | 49.5 | 91.0 | 91.4 | 80.4 | 77.6 | 91.0 | 91.2 | 91.0 | 90.0 |
| | 4phv | 2.4 | 2.6 | 92.3 | 83.9 | 79.8 | 73.9 | 90.4 | 91.5 | 92.2 | 91.3 | 79.3 | 84.8 | 91.2 | 91.0 | 89.9 | 90.3 |
| | average | | | 88.2 | 82.3 | 82.7 | 77.0 | 91.3 | 84.4 | 91.7 | 91.5 | 81.1 | 81.5 | 91.4 | 91.1 | 90.9 | 90.7 |
| | σq | | | 4.4 | 7.7 | 7.5 | 11.9 | 1.9 | 14.3 | 0.5 | 0.7 | 2.3 | 3.5 | 0.6 | 0.4 | 1.0 | 0.6 |
| COX2 | 5cox | 0.0 | 1.6 | 46.7 | 33.1 | 89.2 | 70.5 | 55.5 | 49.1 | 94.3 | 95.0 | 89.4 | 80.1 | 96.8 | 96.2 | 98.2 | 97.8 |
| | 1cx2 | 0.7 | 1.6 | 55.8 | 44.3 | 92.0 | 83.0 | 98.7 | 97.4 | 95.2 | 94.8 | 91.0 | 82.2 | 95.9 | 96.8 | 98.0 | 98.5 |
| | 1pxx | 0.8 | 1.4 | 46.3 | 33.2 | 85.8 | 78.4 | 99.1 | 78.4 | 95.4 | 95.0 | 88.5 | 84.0 | 96.8 | 96.7 | 98.1 | 98.3 |
| | 4cox | 0.6 | 1.3 | 45.0 | 16.4 | 82.2 | 52.9 | 95.6 | 31.7 | 97.4 | 95.7 | 99.5 | 89.0 | 96.5 | 97.2 | 98.5 | 98.6 |
| | 6cox | 0.6 | 1.5 | 55.2 | 12.3 | 91.3 | 31.9 | 95.1 | 93.7 | 94.9 | 95.4 | 81.7 | 87.6 | 96.1 | 96.0 | 98.4 | 97.9 |
| | average | | | 49.8 | 27.9 | 88.1 | 63.3 | 88.8 | 70.1 | 95.4 | 95.2 | 90.0 | 84.5 | 96.4 | 96.6 | 98.2 | 98.2 |
| | σq | | | 4.7 | 11.8 | 3.7 | 18.8 | 16.7 | 25.6 | 1.0 | 0.3 | 5.7 | 3.3 | 0.3 | 0.4 | 0.2 | 0.5 |
| THR | 1l3f | 0.0 | 1.3 | 44.5 | 9.0 | 63.1 | 24.9 | 69.9 | 13.6 | 93.5 | 95.5 | 81.5 | 79.3 | 95.3 | 93.7 | 95.9 | 95.7 |
| | 1tlp | 1.0 | 1.4 | 42.9 | 42.3 | 57.7 | 59.5 | 16.7 | 16.8 | 95.0 | 95.2 | 87.1 | 83.7 | 95.2 | 95.2 | 95.8 | 95.7 |
| | 1tmn | 0.9 | 1.2 | 46.1 | 42.1 | 56.1 | 60.5 | 15.8 | 87.8 | 95.4 | 95.0 | 85.4 | 84.2 | 95.0 | 95.2 | 95.9 | 96.1 |
| | 2tmn | 0.9 | 1.5 | 52.3 | 47.4 | 75.7 | 73.0 | 53.5 | 38.1 | 95.5 | 95.6 | 84.8 | 85.8 | 95.4 | 95.4 | 96.0 | 96.0 |
| | average | | | 46.4 | 35.2 | 63.2 | 54.5 | 39.0 | 39.1 | 94.9 | 95.3 | 84.7 | 83.2 | 95.2 | 94.8 | 95.9 | 95.9 |
| | σq | | | 3.6 | 15.3 | 7.7 | 17.9 | 23.5 | 29.7 | 0.8 | 0.2 | 2.0 | 2.4 | 0.1 | 0.7 | 0.1 | 0.2 |
| GST | 16gs | 0.0 | 2.6 | 69.3 | 72.7 | 50.8 | 60.9 | 28.0 | 24.2 | 86.3 | 85.8 | 82.3 | 82.4 | 91.6 | 92.8 | 92.7 | 94.3 |
| | 18gs | 0.4 | 2.0 | 71.5 | 70.4 | 62.9 | 48.6 | 29.4 | 26.5 | 85.9 | 87.4 | 87.9 | 83.3 | 91.4 | 93.4 | 94.9 | 93.0 |
| | 2gss | 0.3 | 1.3 | 67.4 | 70.4 | 54.3 | 59.7 | 26.5 | 28.8 | 87.8 | 86.1 | 86.0 | 85.9 | 92.4 | 93.7 | 93.3 | 95.5 |
| | 3pgt | 0.5 | 1.7 | 68.2 | 63.8 | 46.4 | 34.7 | 25.0 | 23.3 | 87.5 | 88.1 | 82.5 | 77.7 | 92.1 | 92.3 | 94.7 | 94.9 |
| | average | | | 69.1 | 69.3 | 53.6 | 51.0 | 27.2 | 25.7 | 86.8 | 86.9 | 84.6 | 82.3 | 91.9 | 93.1 | 93.9 | 94.4 |
| | σq | | | 1.5 | 3.3 | 6.1 | 10.6 | 1.6 | 2.1 | 0.8 | 0.9 | 2.4 | 2.9 | 0.4 | 0.6 | 0.9 | 0.9 |
| average | | | | 66.4 | 56.7 | 74.3 | 63.9 | 67.4 | 60.0 | 91.7 | 91.4 | 81.4 | 83.7 | 92.9 | 92.5 | 93.6 | 93.4 |
| hit ratio at 1% | | | | 23.8 | 16.2 | 15.1 | 7.3 | 21.1 | 14.7 | 35.8 | 29.3 | 16.7 | 16.1 | 28.8 | 25.4 | 47.2 | 46.0 |
Notes: The q value is the area under the database enrichment curve. The q value is 50 for random screening and the maximum value is 100. The hit ratio is evaluated at the first 1% of the entries in the database. The σq value is the standard deviation of the q values. Model A includes the original crystal structures of the target proteins. Model B includes the model structures of the target proteins obtained by the MD simulations in explicit water.
Protein: name of the target protein.
PDB: PDB code.
represents apo structure.
RMSD: RMSD value from the apo structure of protein model A.
MTS&MASC: combination of the MTS and MASC methods (sum set) with the original docking score.
DSM: MTS method with the DSM method.
MSM: MTS method with the MSM method (MSM-MTS method).
ML-DSI: ML-DSI method.
Sum: sum set of the compounds predicted by the MSM-MTS and ML-DSI methods.
Consensus: consensus set of the compounds predicted by the MSM-MTS and ML-DSI methods.
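The quantities defined in the notes to the table can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: the enrichment curve is taken as the fraction of actives found versus the fraction of the ranked database screened, the area is integrated with the trapezoidal rule, and `hit_ratio` assumes the percentage of actives among the top-ranked entries. The ranked list is toy data. The last lines recompute the average and σq for the seven HIVP "Conventional" model A q values from the table.

```python
from statistics import mean, pstdev


def enrichment_curve(ranked_ids, actives):
    """Points (fraction of database screened, fraction of actives found)."""
    actives = set(actives)
    points, found = [(0.0, 0.0)], 0
    for rank, compound in enumerate(ranked_ids, start=1):
        if compound in actives:
            found += 1
        points.append((rank / len(ranked_ids), found / len(actives)))
    return points


def q_value(points):
    """Trapezoidal area under the enrichment curve, scaled to 0-100."""
    area = sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return 100.0 * area


def hit_ratio(ranked_ids, actives, fraction=0.01):
    """Percentage of actives among the top `fraction` of entries (assumed)."""
    actives = set(actives)
    n_top = max(1, round(fraction * len(ranked_ids)))
    return 100.0 * sum(c in actives for c in ranked_ids[:n_top]) / n_top


# Toy ranking (hypothetical): 10 compounds, the 2 actives ranked first.
ranked = [f"c{i}" for i in range(1, 11)]
toy_actives = {"c1", "c2"}
q_best = q_value(enrichment_curve(ranked, toy_actives))
q_worst = q_value(enrichment_curve(ranked[::-1], toy_actives))
top20_hit_ratio = hit_ratio(ranked, toy_actives, fraction=0.2)

# q values from the table: HIVP, Conventional, model A (3hvp ... 4phv).
hivp_conventional_a = [89.4, 85.3, 95.4, 88.9, 83.2, 82.7, 92.3]
average_q = round(mean(hivp_conventional_a), 1)
sigma_q = round(pstdev(hivp_conventional_a), 1)  # population form, divide by N
```

With a finite database the trapezoidal area only approaches 100 for a perfect ranking (90 in this ten-compound toy case), and a ranking that places the actives last gives the mirror value. For the HIVP row, the population standard deviation reproduces the tabulated σq of 4.4, whereas the sample form (dividing by N − 1) would give 4.7, so the tabulated σq appears consistent with the population form.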
Figure 2. Database enrichment curves for models A and B. Filled circles, open circles, green squares, and red squares represent the results by the MSM-MTS method, the ML-DSI method, the sum sets of predicted compounds by the MSM-MTS and the ML-DSI methods, and the consensus sets of predicted compounds by the MSM-MTS and the ML-DSI methods, respectively. a: database enrichment curves for model A, in which the target protein structures are the original crystal structures. b: database enrichment curves for model B, in which the target protein structures are the model structures obtained by the MD simulations in explicit water.