Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition.

Sabrina Jaeger, Simone Fulle, Samo Turk.

Abstract

Inspired by natural language processing techniques, we introduce Mol2vec, an unsupervised machine learning approach that learns vector representations of molecular substructures. As in Word2vec models, where vectors of closely related words lie close together in the vector space, Mol2vec learns substructure vectors that point in similar directions for chemically related substructures. Compounds can then be encoded as vectors by summing the vectors of their individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pretrained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. Its prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as a reference compound representation. Mol2vec can be easily combined with ProtVec, which applies the same Word2vec concept to protein sequences, resulting in a proteochemometric approach that is alignment-independent and can therefore also be used for proteins with low sequence similarity.
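
The encoding step described in the abstract, where a compound vector is the sum of its substructure vectors, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the substructure identifiers (`sub_a`, `sub_b`, `sub_c`), the three-dimensional toy embeddings, and the choice to skip unknown substructures are hypothetical and are not taken from the trained Mol2vec model or its software.

```python
import math

# Toy 3-dimensional embeddings keyed by hypothetical substructure identifiers.
# A trained model would learn such vectors from a large corpus of compounds.
TOY_EMBEDDINGS = {
    "sub_a": [1.0, 0.0, 2.0],
    "sub_b": [0.5, 1.5, -1.0],
    "sub_c": [-1.0, 0.5, 0.0],
}


def compound_vector(substructures, embeddings, dim=3):
    """Encode a compound as the sum of its substructure vectors.

    Identifiers missing from `embeddings` are skipped; this is an
    assumption of the sketch, not a statement about the published method.
    """
    total = [0.0] * dim
    for sub in substructures:
        vec = embeddings.get(sub)
        if vec is None:
            continue
        total = [t + v for t, v in zip(total, vec)]
    return total


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


if __name__ == "__main__":
    mol_1 = compound_vector(["sub_a", "sub_b"], TOY_EMBEDDINGS)
    mol_2 = compound_vector(["sub_a", "sub_c"], TOY_EMBEDDINGS)
    print(mol_1, mol_2, cosine_similarity(mol_1, mol_2))
```

The resulting fixed-length vectors could then serve as input features for a supervised model, and cosine similarity gives one way to check whether related compounds point in similar directions.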

Year:  2018        PMID: 29268609     DOI: 10.1021/acs.jcim.7b00616

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956

