Amara Jabeen1, Claire A de March2, Hiroaki Matsunami2,3, Shoba Ranganathan1.
Abstract
Olfactory receptors (ORs) constitute the largest superfamily of G protein-coupled receptors (GPCRs). ORs sense odorants and also play ectopic roles in non-nasal tissues. Matching the enormous repertoire of olfactory stimuli to their cognate ORs through machine learning (ML) will advance understanding of the olfactory system, receptor characterization, and exploitation of the receptors' therapeutic potential. In the current study, we selected two broadly tuned ectopic human OR proteins, OR1A1 and OR2W1, to expand their known chemical space using molecular descriptors. We present a scheme for selecting the optimal features required to train an ML-based model, based on which we selected random forest (RF) as the best performer. High-activity agonist prediction involved screening five databases comprising ~23 M compounds with the trained RF classifier. To evaluate the effectiveness of the ML-based virtual screening and to check receptor binding site compatibility, we docked the top target ligands to carefully developed receptor model structures. Finally, experimental validation of selected compounds with significant docking scores through in vitro assays revealed two novel high-activity agonists for OR1A1 and one for OR2W1.
Keywords: G protein-coupled receptors; luciferase assay; machine learning; molecular descriptors; olfactory receptor; random forest; virtual ligand screening
Year: 2021 PMID: 34768977 PMCID: PMC8583936 DOI: 10.3390/ijms222111546
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Workflow for agonist identification based on machine learning, molecular docking, and in vitro testing. Details of feature selection are shown in Supplementary Figure S3, and the process for database filtration is shown in Supplementary Figure S5. Compounds from five different databases were downloaded and classified as agonists or non-agonists for OR1A1 and OR2W1.
Figure 2. First two principal components of (A) OR1A1 and OR2W1 agonists and non-agonists, showing the diverse nature of the agonists for both ORs, and (B) training and testing datasets, showing considerable overlap between the training and testing sets.
Figure 3. Performance comparison of different classifiers based on accuracy, sensitivity, specificity, and F1 score for (A) OR1A1 and (B) OR2W1. Training and testing datasets are compared for each OR. RF: random forest; NB: Naïve Bayes; SVM: support vector machine.
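The four metrics named in the Figure 3 caption can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch using their standard definitions; the function name `classification_metrics`, the label encoding (1 = agonist, 0 = non-agonist), and the toy labels are assumptions made for illustration and are not taken from the study.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity and F1 for binary labels.

    Labels are assumed to be 1 (agonist) or 0 (non-agonist).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(numerator, denominator):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy is useful when agonists are much rarer than non-agonists, since accuracy alone can look high for a model that rarely predicts the minority class.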
Highly probable OR1A1 agonists based on docking scores. Control in italics; experimentally tested compounds underlined.
| PubChem_CID | Compound Name | Database | Chemical Nature | ICM Docking Score |
|---|---|---|---|---|
| 10465547 | [( | ZINC | Heterocyclic compound | −24.058033 |
| 70545042 | Prop-2-enyl 3-iodobenzoate | ZINC | Heterocyclic compound | −20.113147 |
| 56806459 | 2-(4-Methylphenoxy)pentan-3-one | ZINC | Ketone | −19.297316 |
| 84603836 | (3-Fluorophenyl)methyl 4-methylpentanoate | ZINC | Ester | −18.084389 |
| 101977 | D-citronellol | ChEBI, HMDB | Terpene | −18.009 |
| 56828593 | 4-(3-Fluorophenyl)-3-methyl-4-oxobutanenitrile | ZINC | Heterocyclic compound | −17.436507 |
| 17973047 | 4-(1-Methylcyclopropyl)phenol | ZINC | Heterocyclic compound | −17.286264 |
| 30842889 | Prop-2-enyl 2-(2,4-difluorophenyl)acetate | ZINC | Ester | −17.136729 |
| 22048986 | 6-Chloro-1-(3-fluorophenyl)hexan-1-one | ZINC | Ketone | −17.053948 |
| 11470552 | | ZINC | Ester | −16.812695 |
| 45085600 | (5 | ZINC | Ketone | −16.733556 |
| 5352782 | 3-[( | ZINC | Heterocyclic compound | −16.708306 |
| 7021479 | | ZINC | Ester | −16.649985 |
| 59382573 | (3-Methoxyphenyl)methyl butanoate | ZINC | Ester | −16.572913 |
| 84177 | Ethyl 2-(4-chlorophenyl)acetate | ZINC | Ester | −16.278326 |
| 6368521 | 1-[( | ZINC | Heterocyclic compound | −16.266073 |
| 78901972 | (3-Fluorophenyl)methyl 2-propylsulfanylacetate | ZINC | Ester | −16.165344 |
| 8842 | Citronellol | OdorDB, ChEBI | Terpene | −14.6017 |
| 22311 | | OdorDB, COD | Terpene | −13.9064 |
| 1318 | | ChEBI | Heterocyclic compound | −13.7334 |
| 24473 | Dihydrocarvone | OdorDB | Ketone | −11.3054 |
| 131752167 | 2,10-Bisaboladiene-1,4-diol | HMDB | Alcohol | −10.6224 |
| 78236 | 4-Nonanone | HMDB | Ketone | −10.5291 |
Highly probable OR2W1 agonists based on docking scores. Control in italics; experimentally tested compounds underlined.
| PubChem_CID | Compound Name | Database | Chemical Nature | ICM Docking Score |
|---|---|---|---|---|
| 13433021 | 4-Methyl-2-m-tolylpyridine | ZINC | Heterocyclic compound | −14.515694 |
| 2733871 | 2,4-Dimethyl-1-phenylpyrrole | ZINC | Heterocyclic compound | −14.150921 |
| 249799 | 1-Butoxy-4-phenylbenzene | ZINC | Heterocyclic compound | −13.419691 |
| 22562335 | Methyl 3-(4-ethoxyphenyl)prop-2-ynoate | ZINC | Heterocyclic compound | −12.639271 |
| 16530415 | (2,3,4,5,6-Pentafluorophenyl)methyl 2-hydroxy-3-methylbenzoate | ZINC | Heterocyclic compound | −11.02916 |
| 3847415 | | ZINC | Heterocyclic compound | −8.767672 |
| 12252872 | Ethyl 4-hydroxy-3-prop-2-enylbenzoate | ZINC | Heterocyclic compound | −8.68684 |
| 231770 | 1,3-bis(4-Bromophenyl)prop-2-en-1-one | ZINC | Heterocyclic compound | −8.668044 |
| 7129 | | ZINC | Ether | −8.208685 |
| 60008260 | Ethyl 2-amino-5-cyanobenzoate | ZINC | Heterocyclic compound | −8.011839 |
Figure 4. Dose-response curves for the tested compounds against (A) OR1A1 and (B) OR2W1 in the luciferase assay (see Methods). The tested compounds were randomly selected from the compounds short-listed after machine learning and molecular docking, in order to evaluate the random forest model predictions. Cell surface expression of the two ORs, measured by flow cytometry, is shown in Figure S7.
Figure 5. Comparison of the potency of the agonists and control identified in the current study with those reported by Bushdid et al. [14] for (A) OR1A1 and (B) OR2W1. The agonists reported in the current study are more potent than those previously reported by Bushdid et al. [14] for both ORs.
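Dose-response curves such as those in Figure 4 are commonly summarized with a four-parameter logistic (Hill) model, from which a potency value (EC50) is read. The sketch below only evaluates that model; it is an assumption that such a model applies here, since this record does not state which curve-fitting procedure the authors used. The function name `hill_response` and all parameter names and values are hypothetical.

```python
def hill_response(conc, bottom, top, ec50, hill_slope=1.0):
    """Evaluate a four-parameter logistic (Hill) dose-response curve.

    conc and ec50 must share the same concentration units.
    bottom and top are the lower and upper response plateaus.
    """
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    if ec50 <= 0:
        raise ValueError("ec50 must be positive")
    if conc == 0:
        # No ligand: the response sits at the lower plateau.
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill_slope)


# Illustrative parameters only (not values from the study).
half_max = hill_response(conc=10.0, bottom=0.0, top=1.0, ec50=10.0)
low_dose = hill_response(conc=0.1, bottom=0.0, top=1.0, ec50=10.0)
high_dose = hill_response(conc=1000.0, bottom=0.0, top=1.0, ec50=10.0)
```

Under this model, a lower EC50 means that the half-maximal response is reached at a lower concentration, which is the usual sense in which one agonist is described as more potent than another.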