Guohua Huang, Jincheng Li, Chenglin Zhao.
Abstract
Interactions between drugs and proteins are central to drug discovery and development. Numerous methods have recently been developed for identifying drug–target interactions, but few address interactions between post-translationally modified proteins and drugs. We present a machine learning-based method for identifying associations between small molecules and binding-associated S-nitrosylated (SNO-) proteins. Small molecules were encoded by molecular fingerprints, SNO-proteins were encoded by an information entropy-based method, and a random forest was used to train the classifier. Ten-fold and leave-one-out cross validations achieved areas under the receiver operating characteristic curve of 0.7235 and 0.7490, respectively. Computational analysis of similarity suggested that SNO-proteins associated with the same drug shared statistically significant similarity, and that drugs associated with the same SNO-protein did likewise. The method and these findings can help identify drug–SNO associations and facilitate the discovery and development of SNO-associated drugs.
Keywords: SNO; fingerprints; information entropy; machine learning; random forest
Year: 2018 PMID: 29671802 PMCID: PMC6017196 DOI: 10.3390/molecules23040954
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
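The abstract reports performance as the area under the ROC curve (AUC). As a reminder of what that number measures, the sketch below computes AUC through its rank interpretation: the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The function name `roc_auc` and the 0/1 label encoding are assumptions made for illustration; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: sequence of 0/1 values (1 = known association; assumed encoding).
    scores: classifier scores, where higher means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))  # 0.75
```

Under this reading, a value of 0.5 corresponds to random ranking and 1.0 to perfect separation. In a cross-validation setting the scores would come from held-out predictions, which this sketch does not cover.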
Figure 1. Overview of the proposed method.
Figure 2. Molecular fingerprints. The top diagram (A) illustrates structure-based fingerprints, where eight substructures were explored and three substructures, marked with circles, were present. The bottom diagram (B) illustrates hash-based fingerprints, where all paths starting from NH2 (circle) with a length of up to six were explored, and each path was then hashed into a binary bit.
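The hash-based scheme in panel B can be sketched in a few lines. The example below is a simplified illustration, not the Daylight or CDK implementation: the molecule is a hypothetical toy graph of heavy atoms, bond orders are ignored, paths are not canonicalized (so `N-C` and `C-N` hash separately), and `zlib.crc32` stands in for whatever hash function a real toolkit uses. The names `enumerate_paths` and `hashed_fingerprint` are introduced here for illustration only.

```python
import zlib


def enumerate_paths(atoms, bonds, start, max_len):
    """Yield label strings for all simple paths from `start` with up to `max_len` bonds."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def walk(path):
        yield "-".join(atoms[i] for i in path)
        if len(path) - 1 < max_len:  # path currently holds len(path) - 1 bonds
            for nxt in neighbors[path[-1]]:
                if nxt not in path:  # simple paths only: never revisit an atom
                    yield from walk(path + [nxt])

    yield from walk([start])


def hashed_fingerprint(atoms, bonds, n_bits=1024, max_len=6):
    """Set one bit per path, enumerating paths from every atom in turn."""
    bits = [0] * n_bits
    for start in range(len(atoms)):
        for label in enumerate_paths(atoms, bonds, start, max_len):
            # crc32 is a stand-in hash, chosen because it is stable across runs.
            bits[zlib.crc32(label.encode("utf-8")) % n_bits] = 1
    return bits


# Toy heavy-atom skeleton N-C-C-O (assumed example molecule).
atoms = ["N", "C", "C", "O"]
bonds = [(0, 1), (1, 2), (2, 3)]
print(list(enumerate_paths(atoms, bonds, start=0, max_len=6)))
# ['N', 'N-C', 'N-C-C', 'N-C-C-O']
fp = hashed_fingerprint(atoms, bonds)
print(len(fp), sum(fp))
```

The figure shows paths from a single starting atom; the sketch repeats the same enumeration from every atom. Because different paths can hash to the same position, the number of set bits is at most the number of distinct paths.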
The categories and the bit numbers of fingerprints.
| Category | Number of Bits | Category | Number of Bits |
|---|---|---|---|
| E-state | 79 | Klekota–Roth | 4860 |
| Daylight | 1024 | MACCS | 166 |
| CDK extended | 1024 | CDK substructure | 307 |
| CDK graph | 1024 | PubChem | 881 |
| CDK hybridization | 1024 | | |
Figure 3. The receiver operating characteristic (ROC) curves of ten types of fingerprint by ten-fold cross validation.
Figure 4. The ROC curves of five algorithms by leave-one-out cross validation.
Figure 5. Illustration of cases of similarity. The top diagram (A) illustrates the similarity of SNO-proteins associated with the same small molecule. The bottom diagram (B) illustrates the similarity of small molecules associated with the same SNO-protein P63000-178.
Twelve predicted drug–SNO associations, similar drugs, and SNO-proteins targeted by similar drugs.
| Predicted Associations | Similar Small Molecules | SNO-Proteins Targeted by Similar Molecules |
|---|---|---|
| DB04427–P18031-215 | DB06887, DB07719, DB08003, DB08549, DB08591, DB08593, DB02827 | P18031-215 |
| DB01960–P15121-299 | DB02338, DB03461 | P15121-299 |
| DB04315–P15121-299 | DB02338, DB03461, DB08772 | P15121-299 |
| DB08213–P18031-215 | DB02827, DB06887, DB07134, DB07719, DB07730, DB08003, DB08549, DB08591, DB08593 | P18031-215 |
| DB04502–P18031-215 | DB01962, DB03483, DB03557, DB06887, DB07719, DB08003, DB08549, DB08591 | P18031-215 |
| DB00114–P18031-215 | DB01962, DB07480 | P18031-215 |
| DB02051–P18031-215 | DB06887, DB07719, DB08549, DB08591 | P18031-215 |
| DB07905–P18031-215 | no similar drugs found | P18031-215 |
| DB08607–P18031-215 | DB02072, DB02827, DB03102, DB03670, DB07298 | P18031-215 |
| DB02200–P18031-215 | DB03483, DB03557, DB03714, DB06887, DB07651, DB07719, DB08003, DB08549, DB08783 | P18031-215 |
| DB00171–P15121-299 | DB02338, DB03461 | P15121-299 |
| DB00155–P43235-139 | DB04276, DB04523, DB07592 | P43235-139 |
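The text here does not say which similarity measure was used to select the "similar small molecules" in the table above. For binary fingerprints, the Tanimoto coefficient is a common choice, so the sketch below assumes it. The function names, the placeholder molecule IDs, and the 0.8 threshold are all hypothetical and serve only to show the mechanics.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two equal-length binary fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    # Two all-zero fingerprints share no features; treat them as dissimilar.
    return both / either if either else 0.0


def similar_molecules(query_id, fingerprints, threshold=0.8):
    """Return IDs whose fingerprint is at least `threshold`-similar to the query.

    `threshold` is an arbitrary placeholder, not a value from the source.
    """
    query = fingerprints[query_id]
    return sorted(
        other
        for other, fp in fingerprints.items()
        if other != query_id and tanimoto(query, fp) >= threshold
    )


# Placeholder IDs and toy 5-bit fingerprints (not real DrugBank data).
toy = {
    "mol_a": [1, 1, 1, 1, 0],
    "mol_b": [1, 1, 1, 1, 1],  # shares 4 of 5 set bits with mol_a -> 0.8
    "mol_c": [0, 0, 0, 1, 1],  # shares 1 of 5 set bits with mol_a -> 0.2
}
print(similar_molecules("mol_a", toy))  # ['mol_b']
```

A query with no neighbor above the threshold returns an empty list, which corresponds to a row reporting that no similar drugs were found.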