Jingxian Zhang, Bucong Han, Xiaona Wei, Chunyan Tan, Yuzong Chen, Yuyang Jiang.
Abstract
Target selective drugs, such as dopamine receptor (DR) subtype selective ligands, are developed for enhanced therapeutics and reduced side effects. In silico methods have been explored for searching DR selective ligands, but have encountered difficulties associated with high subtype similarity and ligand structural diversity. Machine learning methods have shown promising potential in searching target selective compounds, and their target selective capability can be further enhanced. In this work, we introduced a new two-step support vector machine target-binding and selectivity screening method for searching DR subtype selective ligands, which was tested together with three previously used machine learning methods for searching D1, D2, D3 and D4 selective ligands. It correctly identified 50.6%-88.0% of the 21-408 subtype selective ligands and 71.7%-81.0% of the 39-147 multi-subtype ligands. Its subtype selective ligand identification rates are significantly better than, and its multi-subtype ligand identification rates are comparable to, the best rates of the previously used methods. Our method produced low false-hit rates in screening 13.56 million PubChem, 168,016 MDDR and 657,736 ChEMBLdb compounds. Molecular features important for subtype selectivity were extracted by using the recursive feature elimination (RFE) feature selection method. These features are consistent with literature-reported features. Our method showed similar performance in searching estrogen receptor subtype selective ligands. Our study demonstrated the usefulness of the two-step target binding and selectivity screening method in searching subtype selective ligands from large compound libraries.
Year: 2012 PMID: 22720033 PMCID: PMC3376116 DOI: 10.1371/journal.pone.0039076
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Datasets of our collected dopamine receptor D1, D2, D3 and D4 ligands, non-ligands and putative non-ligands.
| Dopamine Receptor Subtype | Training, positive samples: Ligands published before 2010 (No. of chemical families covered by ligands) | Training, negative samples: Non-ligands published before 2010 | Training, negative samples: Putative non-ligands | Independent testing, positive samples: Ligands published since 2010 (percent of ligands outside training chemical families) | Independent testing, negative samples: Non-ligands published since 2010 |
| D1 | 491 (225) | 264 | 65198 | 59 (25.42%) | 25 |
| D2 | 2202 (642) | 1577 | 63687 | 135 (16.30%) | 65 |
| D3 | 1355 (463) | 631 | 62927 | 76 (18.42%) | 28 |
| D4 | 1486 (433) | 526 | 63272 | 29 (34.48%) | 33 |
Dopamine receptor D1, D2, D3 and D4 ligands (Ki < 1 μM) and non-ligands (Ki > 10 μM) were collected as described in the Methods section, and putative non-ligands were generated from representative compounds of compound families with no known ligand. These datasets were used for training and testing the multi-label machine learning models.
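The Ki cutoffs in the note above can be read as a simple labelling rule. The following is a minimal sketch, not the authors' code: the function name `label_by_ki` is hypothetical, and treating compounds with Ki between 1 and 10 μM as belonging to neither set is an assumption, since the note does not say how they were handled.

```python
from typing import Optional

LIGAND_MAX_KI_UM = 1.0       # Ki < 1 uM -> ligand (cutoff from the table note)
NON_LIGAND_MIN_KI_UM = 10.0  # Ki > 10 uM -> non-ligand (cutoff from the table note)


def label_by_ki(ki_um: float) -> Optional[str]:
    """Return 'ligand', 'non-ligand', or None for the intermediate range."""
    if ki_um <= 0:
        raise ValueError("Ki must be positive")
    if ki_um < LIGAND_MAX_KI_UM:
        return "ligand"
    if ki_um > NON_LIGAND_MIN_KI_UM:
        return "non-ligand"
    # Assumption: compounds with 1 uM <= Ki <= 10 uM are left unlabelled.
    return None


# Toy values for illustration only, not measurements from the paper.
for ki in (0.05, 5.0, 25.0):
    print(ki, label_by_ki(ki))
```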
Datasets of our collected dopamine receptor D1, D2, D3 and D4 selective ligands against another subtype.
| Dopamine receptor subtype | Selectivity against the second subtype | Number of subtype selective ligands against the second subtype | Range of binding affinity ratio | Mean of binding affinity ratio |
| D1 | D2 | 97 | 10–4533 | 359 |
| D1 | D3 | 21 | 11–559 | 122 |
| D1 | D4 | 29 | 11–4600 | 770 |
| D2 | D1 | 43 | 10–3707 | 337 |
| D2 | D3 | 37 | 10–615 | 66 |
| D2 | D4 | 63 | 10–1851 | 113 |
| D3 | D1 | 48 | 17–38461 | 3863 |
| D3 | D2 | 99 | 10–6666 | 259 |
| D3 | D4 | 85 | 10–9111 | 950 |
| D4 | D1 | 27 | 13–4761 | 1315 |
| D4 | D2 | 408 | 10–10752 | 2962 |
| D4 | D3 | 207 | 10–51162 | 1175 |
The binding affinity ratio is the experimentally measured binding affinity to the second subtype divided by that to the first subtype: (Ki of the second subtype / Ki of the first subtype). This dataset was used as positive samples for testing subtype selectivity of our developed virtual screening models.
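The ratio defined above is straightforward to compute. Below is a minimal sketch with hypothetical function names; the 10-fold cutoff used in `is_subtype_selective` is an assumption inferred from the smallest ratios listed in the table (all are at least 10), not a threshold stated in this section.

```python
def binding_affinity_ratio(ki_first: float, ki_second: float) -> float:
    """Ki of the second subtype divided by Ki of the first subtype."""
    if ki_first <= 0 or ki_second <= 0:
        raise ValueError("Ki values must be positive")
    return ki_second / ki_first


def is_subtype_selective(ki_first: float, ki_second: float,
                         min_ratio: float = 10.0) -> bool:
    """True if the compound binds the first subtype at least min_ratio times
    more tightly than the second. min_ratio=10 is an assumed cutoff."""
    return binding_affinity_ratio(ki_first, ki_second) >= min_ratio


# Toy Ki values (same units for both subtypes), for illustration only.
print(binding_affinity_ratio(2.0, 200.0))
print(is_subtype_selective(2.0, 200.0))
print(is_subtype_selective(2.0, 10.0))
```

A larger ratio means weaker binding to the second subtype relative to the first, so higher values indicate stronger selectivity for the first subtype.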
Datasets of our collected dopamine receptor multi-subtype ligands.
| Ligand Group | Binding Subtypes | Number of Ligands of Subtypes | Used as Testing Dataset |
| Dual Subtype Ligands | D1 and D2 | 147 | Yes |
| Dual Subtype Ligands | D1 and D3 | 4 | No |
| Dual Subtype Ligands | D1 and D4 | 8 | No |
| Dual Subtype Ligands | D3 and D4 | 100 | Yes |
| Triple Subtype Ligands | D1, D2 and D3 | 39 | Yes |
| Triple Subtype Ligands | D1, D3 and D4 | 2 | No |
| Quadruple Subtype Ligands | D1, D2, D3 and D4 | 60 | Yes |
The four groups of this dataset marked "Yes" were used as negative samples for testing the subtype selectivity of our developed multi-label machine learning models.
The performance of our new method 2SBR-SVM and that of previously used methods Combi-SVM, ML-kNN and RAkEL-DT in predicting dopamine receptor subtype selective ligands.
Values are the percent of subtype selective ligands predicted as subtype selective with respect to the second subtype.
| Dopamine receptor subtype | Selectivity against the second subtype | Number of subtype selective ligands | Combi-SVM | ML-kNN | RAkEL-DT | 2SBR-SVM |
| D1 | D2 | 97 | 13.40% | 30.93% | 75.26% | 86.60% |
| D1 | D3 | 21 | 23.81% | 23.81% | 47.62% | 66.67% |
| D1 | D4 | 29 | 17.24% | 58.62% | 44.83% | 65.52% |
| D1 | Average | | 18.15% | 37.79% | 55.90% | 72.93% |
| D2 | D1 | 43 | 55.81% | 62.79% | 69.77% | 93.02% |
| D2 | D3 | 37 | 16.22% | 21.62% | 62.16% | 81.08% |
| D2 | D4 | 63 | 14.29% | 39.68% | 30.16% | 82.54% |
| D2 | Average | | 28.77% | 41.36% | 54.03% | 85.55% |
| D3 | D1 | 48 | 72.92% | 87.50% | 85.42% | 56.25% |
| D3 | D2 | 99 | 22.22% | 26.26% | 50.51% | 51.52% |
| D3 | D4 | 85 | 17.65% | 31.76% | 22.35% | 50.59% |
| D3 | Average | | 37.60% | 48.51% | 52.76% | 52.79% |
| D4 | D1 | 27 | 74.07% | 70.37% | 85.19% | 82.50% |
| D4 | D2 | 408 | 33.33% | 28.43% | 57.60% | 88.00% |
| D4 | D3 | 209 | 26.79% | 24.40% | 45.46% | 83.73% |
| D4 | Average | | 44.73% | 41.07% | 62.75% | 84.74% |
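The per-subtype summary row in the table above matches the unweighted mean of the three rows for that subtype. The sketch below reproduces this for the 2SBR-SVM column of D1 and, for comparison, shows a pooled rate weighted by the number of ligands. The pooled figure is not reported in the source; it is included only to illustrate how the two summaries differ, and both function names are hypothetical.

```python
from typing import Sequence


def unweighted_mean(rates: Sequence[float]) -> float:
    """Plain average of per-pair identification rates (in percent)."""
    return sum(rates) / len(rates)


def pooled_rate(rates: Sequence[float], counts: Sequence[int]) -> float:
    """Identification rate over all ligands, weighting each pair by its size."""
    hits = sum(rate * n for rate, n in zip(rates, counts))
    return hits / sum(counts)


# 2SBR-SVM rates for D1 selective ligands against D2, D3 and D4 (from the table).
d1_rates = [86.60, 66.67, 65.52]
d1_counts = [97, 21, 29]

print(round(unweighted_mean(d1_rates), 2))  # 72.93, the table's summary value
print(round(pooled_rate(d1_rates, d1_counts), 2))
```

Because the D1-versus-D2 set is much larger than the other two, the pooled rate sits closer to its 86.60% than the unweighted mean does.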
The performance of our new method 2SBR-SVM and that of previously used methods Combi-SVM, ML-kNN and RAkEL-DT in predicting dopamine receptor multi-subtype ligands as non-selective ligands.
Values are the percent of multi-subtype ligands predicted as non-selective ligands.
| Ligand Group | Binding subtypes | Number of Multi-Subtype Ligands | Combi-SVM | ML-kNN | RAkEL-DT | 2SBR-SVM |
| Dual Subtype Ligands | D1 and D2 | 147 | 68.02% | 31.97% | 35.37% | 76.19% |
| Dual Subtype Ligands | D3 and D4 | 100 | 83.0% | 37.0% | 39.0% | 81.0% |
| Triple Subtype Ligands | D1, D2 and D3 | 39 | 76.92% | 28.2% | 33.33% | 71.79% |
| Quadruple Subtype Ligands | D1, D2, D3 and D4 | 60 | 75.42% | 36.67% | 38.75% | 71.67% |
Virtual screening performance of our new method 2SBR-SVM and of our previously used method Combi-SVM in scanning 13.56 million PubChem compounds, 168,016 MDDR compounds and 657,736 ChEMBLdb compounds.
| Dopamine receptor subtype | Method | Number and Percent of the 13.56M PubChem Compounds Identified as subtype selective ligands | Number and Percent of the 168,016 MDDR Compounds Identified as subtype selective ligands | Number and Percent of the 657,736 ChEMBLdb Compounds Identified as subtype selective ligands |
| D1 | SVM (Single Label) | 6798 (0.0501%) | 463 (0.28%) | 1034 (0.16%) |
| D1 | Combi-SVM | 4948 (0.0365%) | 383 (0.23%) | 755 (0.11%) |
| D1 | 2SBR-SVM | 650 (0.0048%) | 140 (0.08%) | 355 (0.05%) |
| D2 | SVM (Single Label) | 17786 (0.1312%) | 1105 (0.66%) | 3208 (0.49%) |
| D2 | Combi-SVM | 10080 (0.0743%) | 712 (0.42%) | 2023 (0.31%) |
| D2 | 2SBR-SVM | 1132 (0.0083%) | 108 (0.06%) | 686 (0.10%) |
| D3 | SVM (Single Label) | 19813 (0.1461%) | 1149 (0.68%) | 3057 (0.46%) |
| D3 | Combi-SVM | 6055 (0.0447%) | 679 (0.40%) | 1894 (0.29%) |
| D3 | 2SBR-SVM | 1498 (0.0110%) | 156 (0.09%) | 687 (0.10%) |
| D4 | SVM (Single Label) | 21444 (0.1581%) | 1160 (0.69%) | 3489 (0.53%) |
| D4 | Combi-SVM | 9186 (0.0677%) | 790 (0.47%) | 2579 (0.39%) |
| D4 | 2SBR-SVM | 1961 (0.0145%) | 134 (0.08%) | 907 (0.14%) |
For comparison, the results of single-label SVM, which identifies putative subtype binding ligands regardless of their possible binding to another subtype, are also included.
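The difference between single-label screening and a two-step binding-then-selectivity screen can be sketched as two filters applied in sequence. This is only an illustration of the general idea suggested by the method's name: the internals of 2SBR-SVM are not described in this section, and the classifiers here are hypothetical stand-ins (plain set lookups) rather than trained SVM models.

```python
from typing import Callable, Iterable, List

Compound = str  # hypothetical: a compound identifier
Classifier = Callable[[Compound], bool]


def single_label_hits(compounds: Iterable[Compound],
                      binds_subtype: Classifier) -> List[Compound]:
    """Keep every compound predicted to bind the subtype, selective or not."""
    return [c for c in compounds if binds_subtype(c)]


def two_step_hits(compounds: Iterable[Compound],
                  binds_subtype: Classifier,
                  is_selective: Classifier) -> List[Compound]:
    """Step 1 keeps putative binders; step 2 keeps those predicted selective."""
    binders = single_label_hits(compounds, binds_subtype)
    return [c for c in binders if is_selective(c)]


# Toy stand-ins for trained models (assumptions for illustration only).
library = ["c1", "c2", "c3", "c4"]
predicted_binders = {"c1", "c2", "c3"}
predicted_selective = {"c1", "c3", "c4"}

print(single_label_hits(library, predicted_binders.__contains__))
print(two_step_hits(library, predicted_binders.__contains__,
                    predicted_selective.__contains__))
```

Since the second filter can only remove compounds, a two-step screen of this shape never returns more hits than the single-label screen, which is consistent with the lower counts reported for 2SBR-SVM in the table.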
Top-ranked molecular descriptors for distinguishing dopamine receptor subtype D1, D2, D3 or D4 selective ligands, selected by the RFE feature selection method.
| Dopamine receptor subtype | Top-ranked molecular descriptors for distinguishing subtype selective ligands and ligands of other subtypes |
| D1 | Number of O atoms, Sum of Estate of atom type dssC, Sum of Estate of atom type ssO, Sum of Estate of atom type ssNH, Graph-theoretical shape coefficient, Sum of H Estate of atom type HsNH2 |
| D2 | Number of H-bond acceptors, Sum of H Estate of atom type HaaNH, Sum of H Estate of atom type HCsats, Sum of Estate of atom type dssC, Sum of Estate of atom type aasC, Sum of Estate of atom type aaNH |
| D3 | Sum of Estate of atom type dsCH, Sum of H Estate of atom type HsOH, Sum of H Estate of atom type HCsats, Sum of Estate of atom type aaaC, Sum of Estate of atom type sOH, Number of H-bond donors |
| D4 | Molecular path count of length 2, Sum of Estate of atom type ssCH2, 3rd order Kier shape index, Topological radius, Sum of Estate of atom type aasC, Kier Molecular Flexibility Index |
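Recursive feature elimination ranks descriptors by repeatedly fitting a model, discarding the least important feature, and refitting on the remainder. The sketch below shows that loop in plain Python. It is not the authors' implementation: the difference of class means is swapped in as a simple stand-in for SVM-derived feature weights, the third descriptor name is hypothetical, and the data are toy values.

```python
from typing import Callable, List, Sequence

WeightFn = Callable[[List[List[float]], Sequence[int]], List[float]]


def mean_difference_weights(X: List[List[float]], y: Sequence[int]) -> List[float]:
    """Stand-in for model weights: per-feature difference of class means."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    return [
        sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j in range(len(X[0]))
    ]


def recursive_feature_elimination(X: Sequence[Sequence[float]],
                                  y: Sequence[int],
                                  feature_names: Sequence[str],
                                  weight_fn: WeightFn,
                                  n_keep: int) -> List[str]:
    """Drop the feature with the smallest |weight| until n_keep remain."""
    active = list(range(len(feature_names)))
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        weights = weight_fn(X_active, y)  # refit on the surviving features
        worst = min(range(len(active)), key=lambda k: abs(weights[k]))
        del active[worst]
    return [feature_names[j] for j in active]


# Toy data: label 1 = subtype selective ligand, 0 = ligand of another subtype.
names = ["Number of O atoms", "uninformative descriptor",
         "Graph-theoretical shape coefficient"]
X = [[3, 0.5, 1.0], [4, 0.4, 1.2], [0, 0.5, 0.1], [1, 0.6, 0.2]]
y = [1, 1, 0, 0]

print(recursive_feature_elimination(X, y, names, mean_difference_weights, 2))
```

In practice the weights would come from the trained classifier and the descriptors would be scaled first, since raw magnitudes otherwise dominate the ranking.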