Abstract
Computational determination of protein-ligand interaction potential is important for many biological applications, including virtual screening for therapeutic drugs. The novel internal consensus scoring strategy is an empirical approach that combines an extended set of 9 binding terms with a neural network capable of analyzing diverse complexes. Like conventional consensus methods, internal consensus is capable of maintaining multiple distinct representations of protein-ligand interactions. In typical use, the method was trained using ligand classification data (binding/no binding) for a single receptor. The internal consensus analyses successfully distinguished protein-ligand complexes from decoys (r², 0.895 for a series of typical proteins). Results are superior to other tested empirical methods. In virtual screening experiments, internal consensus analyses provide consistent enrichment as determined by ROC-AUC and pROC metrics.
Year: 2011 PMID: 21860668 PMCID: PMC3157911 DOI: 10.1371/journal.pone.0023215
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Combinations of factors.
Combining factors in varying proportions can effectively produce novel factors during training that are functions of the original factors. VDW1, VDW2 and a hybrid factor are shown as a function of atom distances. Dashed line, distance function of factor VDW1; solid line, function of factor VDW2 and dotted line, a 1∶1 mixture (coefficients of VDW1 and VDW2 both set to fraction 0.5). Free energy values are scaled to the range 0–1. Energy values are presented for an atom pair with each atom assuming a VDW radius of 1.5 Angstroms.
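The mixing described in this caption can be illustrated with a minimal sketch. The two distance functions below are hypothetical stand-ins (a 12-6 and a softer 8-4 Lennard-Jones-style term) and are not the paper's VDW1 and VDW2, whose functional forms are not given here. The contact distance of 3.0 Angstroms follows from the two 1.5 Angstrom radii in the caption, and the names `vdw_a`, `vdw_b`, `mix` and `scale01` are assumptions introduced for illustration.

```python
def vdw_a(d, r_sum=3.0):
    """Hypothetical 12-6 term (NOT the paper's VDW1); minimum of -1 at d = r_sum."""
    x = r_sum / d
    return x ** 12 - 2 * x ** 6


def vdw_b(d, r_sum=3.0):
    """Hypothetical softer 8-4 term (NOT the paper's VDW2); minimum of -1 at d = r_sum."""
    x = r_sum / d
    return x ** 8 - 2 * x ** 4


def mix(d, w_a=0.5, w_b=0.5):
    """Weighted sum of the two factors; 0.5/0.5 is the 1:1 mixture in the caption."""
    return w_a * vdw_a(d) + w_b * vdw_b(d)


def scale01(values):
    """Rescale a list of energies linearly to the range 0-1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Sample atom-pair distances from 2.8 to 5.7 Angstroms.
distances = [2.8 + 0.1 * i for i in range(30)]
hybrid = scale01([mix(d) for d in distances])
```

With equal weights, the hybrid curve lies between the two parent curves at every distance, so a network that adjusts the two coefficients during training can in effect interpolate between them.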
Factor-factor scoring correlation for a mixture of proteins.
Figure 2. Neural network training speed.
The accuracy of internal consensus predictions is compared to the number of training cycles. Overtraining is evident in the curve in which accuracy drops after an increase in training cycles. Squares, trypsin; triangles, HIV-1 protease; diamonds, DUD database set of proteins.
Figure 3. ROC curve analysis.
Receiver operating characteristic (ROC) curves for analysis of internal consensus and Vina classification of native ligand and decoy complexes. A. Trypsin; B. HIV protease; C. 39 DUD proteins. Solid line, internal consensus; dashed line, Vina. A diagonal (dotted line) represents a random selection. Curves above the diagonal represent successful separation of decoys and native ligands.
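As a rough illustration of how the area under such a curve can be computed, the sketch below uses the rank-based (Mann-Whitney) formulation: the AUC equals the probability that a randomly chosen native ligand scores higher than a randomly chosen decoy. The function name `roc_auc` and the example scores are hypothetical. The sketch assumes higher scores mean "more ligand-like", so scores from a program such as Vina, where more negative values indicate better binding, would need their sign flipped first.

```python
def roc_auc(scores, labels):
    """ROC-AUC as the fraction of (ligand, decoy) pairs ranked correctly.

    labels: 1 for a native ligand, 0 for a decoy. Ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one ligand and one decoy")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores: one ligand (0.4) is outranked by one decoy (0.6).
example_auc = roc_auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

A value of 0.5 corresponds to the diagonal in the figure, and 1.0 to complete separation of ligands from decoys.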
Efficiency of internal consensus analysis and Vina in classification of native ligand and decoy complexes.
| | Internal consensus | | | |
| Protein target | ROC-AUC | s.d. | Correlation (r²) | s.d. |
| DUD Database | 0.996 | 0.005 | 0.895 | 0.078 |
| Trypsin | 1.000 | <0.001 | 1.000 | <0.001 |
| HIV-1 protease | 1.000 | <0.001 | 0.950 | 0.071 |
AUC and r² correlation are distinct methods for scoring classification accuracy. Both have a range of 0–1 with values less than 0.5 indicating a relative lack of classification. Values were scored for independent data samples. Standard deviations (s.d.) are shown.
r-values were negative.
Correlation between factor scores and protein-ligand complex formation.
Protein-ligand databases: DUD ligand/decoy database; Human trypsin complexes; HIV-1 protease inhibitor complexes. Correlation of factor score with ligand binding (1.0) versus decoy binding (0.0).
Figure 4. Ligand conformation selection.
HIV-1 protease crystal structure 1BV7 with native ligand XV638 (gray) from the Protein Data Bank is shown with a superimposed modeled XV638 ligand (RMSD, 0.71, black) whose conformation was selected out of 45 candidate conformations by internal consensus analysis. Of the 45 conformations, 4 had RMSD values less than 2.0. Most native VDW contacts between protein and ligand are conserved (56/84 contacts with a 0.8 Angstrom threshold). Mottling of the ligand occurs where the native and modeled structures are tightly aligned.
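The selection and RMSD check described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions: candidate poses share the receptor's coordinate frame (so no superposition step is applied), atoms are listed in the same order in every pose, and the toy coordinates and scores are invented. The names `rmsd` and `pick_best` are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def pick_best(scores):
    """Index of the highest-scoring candidate conformation."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Toy data (hypothetical): a 3-atom "native" ligand and two candidate poses.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
candidates = [
    [(x + 0.5, y, z) for x, y, z in native],  # shifted by 0.5 along x
    [(x + 3.0, y, z) for x, y, z in native],  # shifted by 3.0 along x
]
scores = [0.92, 0.15]  # hypothetical scoring-function outputs

best = pick_best(scores)
best_rmsd = rmsd(candidates[best], native)
near_native = sum(1 for c in candidates if rmsd(c, native) < 2.0)
```

In this toy case the selected pose has an RMSD of 0.5 to the native structure, and one of the two candidates falls under the 2.0 cutoff used in the caption.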
Figure 5. Virtual screening.
Results of ROC-AUC analysis for 39 DUD protein virtual screening analyses are shown. AUC values obtained by the internal consensus method are compared to those from Vina scoring. Values above 0.5 indicate successful selection of ligands over decoys. The differences between Vina and the internal consensus method are significant (two-tailed, paired t-test; p < 2.0×10⁻⁷).
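For readers unfamiliar with the paired test named in this caption, the sketch below computes the paired t statistic from per-target differences using only the standard library. The per-target AUC values are not reproduced here, so the input numbers are invented purely to exercise the function, and the name `paired_t` is hypothetical. Converting the statistic to a two-tailed p-value requires the t distribution with n − 1 degrees of freedom, which a statistics package would normally supply.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Invented values for illustration only (not data from the study).
method_a = [2.0, 4.0, 6.0, 9.0]
method_b = [1.0, 2.0, 3.0, 5.0]
t_stat, dof = paired_t(method_a, method_b)
```

Pairing matters because each protein target is scored by both methods, so the test is applied to the per-target differences and not to the two groups as independent samples.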
Ability of methods to reduce a large sample of mostly decoy ligands to a small sample of complexes enriched for genuine binding ligands as determined by pROC metric.
| Protein target | Internal consensus (pROC) | Vina (pROC) |
| Trypsin | 0.856 | 0.513 |
| Estrogen receptor | 0.885 | 0.944 |
| Thymidine kinase | 1.493 | 0.431 |
| Retinoic acid X receptor | 1.696 | 2.228 |
| Src tyrosine kinase | 0.755 | 0.690 |
| Neuraminidase | 0.769 | 0.451 |
| S-adenosyl homocysteine hydrolase | 1.251 | 0.963 |
| HIV-1 protease | 1.028 | 0.771 |
Significant. pROC critical value (P<0.05) is 0.70 [34].
Figure 6. Enrichment curves.
The ability of analysis by the internal consensus approach and Vina to promote ligand enrichment over decoys in virtual screening is shown. Enrichment is presented as a function of the fraction of the original database eliminated in the screen. Protein targets: A. Thymidine kinase; B. Estrogen receptor; C. Neuraminidase; D. S-adenosyl homocysteine hydrolase. An enrichment factor of 1.0 corresponds to a random selection of genuine ligands from decoys. Closed markers, Vina; open markers, internal consensus.
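One common way to compute an enrichment factor of the kind plotted here is sketched below: rank all compounds by score, discard the stated fraction of the database from the bottom of the ranking, and compare the hit rate among the retained compounds with the hit rate in the full database. This is an assumed definition for illustration and may differ in detail from the one used for the figure. The function name `enrichment_factor` and the example data are hypothetical, and higher scores are assumed to indicate more likely binders.

```python
def enrichment_factor(scores, labels, fraction_eliminated):
    """Hit rate among retained compounds divided by the overall hit rate.

    labels: 1 for a genuine ligand, 0 for a decoy.
    fraction_eliminated: share of the database discarded (0 <= f < 1).
    """
    if not 0.0 <= fraction_eliminated < 1.0:
        raise ValueError("fraction_eliminated must be in [0, 1)")
    n_total = len(scores)
    actives_total = sum(labels)
    if n_total == 0 or actives_total == 0:
        raise ValueError("need a non-empty database with at least one ligand")
    # Rank best-scoring compounds first and keep the top of the list.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_keep = max(1, round(n_total * (1.0 - fraction_eliminated)))
    kept = ranked[:n_keep]
    hit_rate_kept = sum(y for _, y in kept) / n_keep
    hit_rate_all = actives_total / n_total
    return hit_rate_kept / hit_rate_all


# Hypothetical database: 10 compounds, the 2 ligands ranked on top.
scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
ef_80 = enrichment_factor(scores, labels, 0.8)
ef_0 = enrichment_factor(scores, labels, 0.0)
```

When nothing is eliminated the factor is 1.0, matching the random-selection baseline in the caption. In this toy case, eliminating 80% of the database while retaining both ligands gives a factor of 5.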
PDBIDs of protein-ligand complexes used in analysis.
| 1A9M_B, 1AAQ_B, 1AJV_A, 1B6J_B, 1B6K_A, 1B6L_A, 1B6M_B, 1BDQ_B, 1BV7_A, 1C70_B, 1D4K_A, 1D4L_A, 1D4Y_A, 1DIF_B, 1DMP_B, 1G2K_B, 1G35_B, 1GNM_B, 1HBV_A, 1HIH_B, 1HOS_A, 1HPO_B, 1HPS_B, 1HPX_B, 1HSH_A, 1HVH_B, 1HVI_A, 1HVJ_A, 1HVK_A, 1HVL_B, 1HVR_A, 1HVS_A, 1HXW_B, 1KZK_A, 1MES_B, 1MSM_A, 1MTR_B, 1OHR_A, 1PRO_A, 1QBR_A, 1QBU_B, 1SBG_B, 1SDT_A, 1SH9_B, 1TCX_B, 1W5X_A, 1Z1H_A, 1Z1R_A, 1ZP8_A, 1ZPA_A, 2BPV_B, 2BPY_B, 2F80_B, 2HB3_B, 2I0A_A, 2I0D_A, 3AID_A, 7UPJ_A. |
| 1C1R_A, 1C5P_A, 1C5Q_A, 1C5S_A, 1C5T_A, 1CE5_A, 1F0T_A, 1F0U_A, 1G3B_A, 1G3C_A, 1GHZ_A, 1GI1_A, 1GI4_A, 1GI6_A, 1GJ6_A, 1K1I_A, 1K1L_A, 1K1N_A, 1KIM_A, 1O2H_A, 1O2J_A, 1O2N_A, 1O2O_A, 1O2S_A, 1O2W_A, 1O2Z_A, 1O30_A, 1O33_A, 1O36_A, 1O38_A, 1O3D_A, 1O3F_A, 1O3H_A, 1O3J_A, 1PPC_A, 1PPH_A, 1QB1_A, 1QB6_A, 1QB9_A, 1QBN_A, 1QBO_A, 1TNG_A, 1TNH_A, 1TNJ_A, 1TNK_A, 1TNL_A, 1V2K_A, 1V2N_A, 1V2O_A, 2BZA_A, 2FX6_A. |