Woong-Hee Shin, Xiaolei Zhu, Mark Gregory Bures, Daisuke Kihara.
Abstract
Virtual screening has been widely used in the drug discovery process. Ligand-based virtual screening (LBVS) methods compare a library of compounds with a known active ligand. Two notable advantages of LBVS methods are that they do not require structural information of a target receptor and that they are faster than structure-based methods. LBVS methods can be classified based on the complexity of the ligand structure information utilized: one-dimensional (1D), two-dimensional (2D), and three-dimensional (3D). Unlike 1D and 2D methods, 3D methods can have enhanced performance since they treat the conformational flexibility of compounds. This paper reviews a number of 3D methods. In addition, four representative 3D methods were benchmarked to understand their performance in virtual screening. Specifically, we tested their performance in key aspects, including the ability to find dissimilar active compounds and computational speed.
Keywords: 3D Zernike descriptors; PL-PatchSurfer; Patch-Surfer; ROCS; USR; ligand-based virtual screening; molecular shape; molecular surface; three-dimensional similarity
Year: 2015 PMID: 26193243 PMCID: PMC5005041 DOI: 10.3390/molecules200712841
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Schematic illustration of SHAFTS procedure. (A) Generate pharmacophore feature points of selected active molecule; (B) Search a database by superimposing feature triplet; (C) Rank compounds by the similarity score. Reprinted with permission from [23]. Copyright (2015) American Chemical Society.
Figure 2. Schematic illustration of PL-PatchSurfer. (A) Generated molecular surface of a ligand; (B) Patch generation and 3DZD of physicochemical feature calculation; (C) Searching binding ligands from a ligand database by finding complementary patch pairs with the query receptor pocket.
Figure 3. Schematic view of Blaze, which was previously called Fieldscreen. (A) A known active molecule is selected as a query; (B) Field points generation; (C) Searching a ligand database; (D) Rank compounds in the database by the similarity score. Reprinted with permission from [13]. Copyright (2015) American Chemical Society.
Enrichment factors and area under the curve (AUC) values of the methods with different maximum numbers of conformations generated.
| Method | EF2% | EF5% | EF10% | AUC |
|---|---|---|---|---|
| USR | 10.0 | 6.2 | 4.1 | 0.76 |
| GZD | 13.4 | 8.0 | 5.3 | 0.81 |
| PS | 10.7 | 6.6 | 4.9 | 0.78 |
| ROCS | 20.1 | 10.7 | 6.2 | 0.83 |
| USR | 9.6 | 6.3 | 4.1 | 0.75 |
| GZD | 13.5 | 7.9 | 5.0 | 0.78 |
| PS | 10.6 | 6.5 | 4.9 | 0.78 |
| ROCS | 18.8 | 9.7 | 6.0 | 0.81 |
| USR | 9.6 | 6.1 | 4.1 | 0.75 |
| GZD | 12.9 | 7.3 | 4.9 | 0.75 |
| PS | 10.3 | 6.5 | 4.8 | 0.77 |
| ROCS | 18.2 | 9.4 | 5.9 | 0.80 |
| USR | 8.8 | 5.8 | 4.0 | 0.70 |
| GZD | 12.1 | 7.4 | 4.9 | 0.75 |
| PS | 10.3 | 6.4 | 4.7 | 0.77 |
| ROCS | 15.9 | 8.5 | 5.6 | 0.79 |
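The record does not spell out how the two metrics in this table are computed. The following is a minimal sketch under the commonly used definitions, which are assumed here: EF at x% is the hit rate among the top x% of the ranked library divided by the hit rate of the whole library, and AUC is the fraction of active/decoy pairs in which the active is ranked higher. The function names and the toy ranking are hypothetical and are not taken from the benchmark.

```python
from typing import Sequence


def enrichment_factor(labels_ranked: Sequence[int], fraction: float) -> float:
    """EF at a top fraction of a ranked list (1 = active, 0 = decoy).

    Assumed definition: (actives in top / size of top) / (actives / total).
    """
    n_total = len(labels_ranked)
    n_actives = sum(labels_ranked)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(labels_ranked[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)


def roc_auc(labels_ranked: Sequence[int]) -> float:
    """AUC as the fraction of (active, decoy) pairs with the active ranked first.

    Assumes the list is already sorted best-first and contains no tied scores.
    """
    n_actives = sum(labels_ranked)
    n_decoys = len(labels_ranked) - n_actives
    if n_actives == 0 or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")
    decoys_seen = 0
    pairs_won = 0
    for label in labels_ranked:
        if label == 1:
            # Every decoy not yet seen is ranked below this active.
            pairs_won += n_decoys - decoys_seen
        else:
            decoys_seen += 1
    return pairs_won / (n_actives * n_decoys)


# Hypothetical toy ranking: 3 actives among 10 compounds, best score first.
toy_ranking = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
ef_20 = enrichment_factor(toy_ranking, 0.20)
auc = roc_auc(toy_ranking)
print(f"EF20% = {ef_20:.2f}, AUC = {auc:.3f}")
```

Under these definitions a random ranking gives EF close to 1 and AUC close to 0.5, which is the baseline against which the tabulated values would be read.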
Pairwise Student's t-test between pairs of the methods.
| ROCS | GZD | PS | |
|---|---|---|---|
| GZD | - | - | |
| PS | 1.118 | - | |
| USR | 0.360 |
The t-values of performance of individual methods are shown. t-values at p-value = 0.1 and p-value = 0.05 are 1.302 and 1.682, respectively. A t-value is shown in bold if it is larger than 1.682 (i.e., statistically significant at p-value = 0.05) and underlined if it is larger than 1.302 (i.e., statistically significant at p-value = 0.1).
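The reading rule in the note above can be written as a small check. This is only a sketch: the two thresholds are the ones quoted in the note, the function name is hypothetical, and the two t-values are the ones that survive in the table as extracted (the column each belongs to cannot be recovered from this record, so they are keyed by row only).

```python
# Critical t-values quoted in the table note.
T_CRIT_P10 = 1.302  # threshold for p-value = 0.1
T_CRIT_P05 = 1.682  # threshold for p-value = 0.05


def significance_label(t_value: float) -> str:
    """Classify a t-value using the thresholds stated in the table note."""
    if t_value > T_CRIT_P05:
        return "significant at p = 0.05"
    if t_value > T_CRIT_P10:
        return "significant at p = 0.1"
    return "not significant"


# t-values that remain legible in the extracted table, keyed by row label.
surviving_values = {"PS row": 1.118, "USR row": 0.360}
for row, t_value in surviving_values.items():
    print(f"{row}: t = {t_value:.3f} -> {significance_label(t_value)}")
```

Both surviving values fall below 1.302, which is consistent with their appearing in the table without bold or underline emphasis.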
Figure 4. 2D and 3D structures of active compounds that PS ranked high relative to other programs. Fifty conformations were generated for each compound. (a) The template compound is ipratropium and the query compound is propantheline. PS, USR, GZD, and ROCS ranked the query compound as third, 92nd, 86th, and 122nd, respectively; (b) The template compound is fentanyl and the query compound is methadone. The rankings are fifth, 127th, 66th, and 168th for PS, USR, GZD, and ROCS, respectively. Template and query compounds are colored in gold and cyan, respectively. Matched pairs by PS have the same color codes.
Enrichment factors at 10%, 5%, and 2% and AUC values of each method after removing similar active compounds from the dataset.
| Method | EF2% | EF5% | EF10% | AUC |
|---|---|---|---|---|
| USR | 10.0 | 6.2 | 4.1 | 0.76 |
| GZD | 13.4 | 8.0 | 5.3 | 0.81 |
| PS | 10.7 | 6.6 | 4.9 | 0.78 |
| ROCS | 20.1 | 10.7 | 6.2 | 0.83 |
| USR | 5.3 | 4.7 | 3.4 | 0.721 |
| GZD | 8.5 | 6.4 | 4.7 | 0.775 |
| PS | 8.2 | 5.2 | 4.2 | 0.758 |
| ROCS | 15.6 | 9.4 | 5.6 | 0.801 |
| USR | 5.9 | 3.9 | 3.0 | 0.652 |
| GZD | 7.5 | 5.6 | 4.4 | 0.740 |
| PS | 7.9 | 4.8 | 3.9 | 0.736 |
| ROCS | 13.6 | 8.6 | 5.3 | 0.764 |
| USR | 3.5 | 3.0 | 2.0 | 0.621 |
| GZD | 6.0 | 4.3 | 3.5 | 0.719 |
| PS | 6.2 | 4.2 | 3.5 | 0.710 |
| ROCS | 10.1 | 7.9 | 4.3 | 0.727 |
Enrichment factors (EF) at 10%, 5%, and 2% for combined methods.
| Combined Programs | 2% | 5% | 10% |
|---|---|---|---|
| USR + GZD | 13.7 | 7.7 | 4.7 |
| USR + PS | 13.1 | 7.9 | 5.0 |
| USR + ROCS | 17.1 | 9.1 | 5.4 |
| GZD + PS | 16.0 | 9.1 | 5.9 |
| GZD + ROCS | 20.3 | 10.8 | 5.3 |
| PS + ROCS | 20.5 | 10.7 | 6.4 |
Student’s t-test between consensus method and individual methods (EF2%).
| Combined Programs | Single Program | Single Program | ||
|---|---|---|---|---|
| USR + GZD | USR | GZD | 0.150 | |
| USR + PS | USR | PS | ||
| USR + ROCS | USR | ROCS | ||
| GZD + PS | GZD | PS | ||
| GZD + ROCS | GZD | ROCS | 0.137 | |
| PS + ROCS | PS | ROCS | 0.452 |
The t-values for improvements of the combined methods from each individual method are shown. t-values at p-value = 0.1 and p-value = 0.05 are 1.302 and 1.682, respectively. A t-value is shown in bold if it is larger than 1.682 (i.e., statistically significant at p-value = 0.05) and underlined if it is larger than 1.302 (i.e., statistically significant at p-value = 0.1).
Computational time for calculating similarity between 850 compounds and a template compound.
| Programs | Time (s) |
|---|---|
| USR | 2.1 |
| GZD | 2.3 |
| PS | 4.4 |
| ROCS | 5.1 |
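To put the timings on a per-compound footing, the arithmetic below converts the tabulated times into throughput and into a slowdown relative to the fastest program. It is a sketch that uses only the numbers in the table; the variable names are illustrative.

```python
# Values taken from the timing table: seconds to compare 850 compounds
# against one template compound.
N_COMPOUNDS = 850
seconds = {"USR": 2.1, "GZD": 2.3, "PS": 4.4, "ROCS": 5.1}

# Compounds processed per second for each program.
throughput = {name: N_COMPOUNDS / t for name, t in seconds.items()}

# Time relative to the fastest program in the table.
fastest = min(seconds.values())
slowdown = {name: t / fastest for name, t in seconds.items()}

for name in seconds:
    print(f"{name}: {throughput[name]:.0f} compounds/s, "
          f"{slowdown[name]:.2f}x the fastest time")
```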