A comparison of molecular representations for lipophilicity quantitative structure-property relationships with results from the SAMPL6 logP Prediction Challenge.

Raymond Lui, Davy Guan, Slade Matthews.

Abstract

Effective representation of a molecule is required to develop useful quantitative structure-property relationships (QSPR) for accurate prediction of chemical properties. The octanol-water partition coefficient logP, a measure of lipophilicity, is an important property for pharmacological and toxicological endpoints used in the pharmaceutical and regulatory spheres. We compare physicochemical descriptors, structural keys, and circular fingerprints in their ability to effectively represent a chemical space and characterise molecular features to correlate with lipophilicity. Exploratory landscape continuity analyses revealed that whole-molecule physicochemical descriptors could map together compounds that were similar in both molecular features and logP, indicating higher potential for use in logP QSPRs compared to the substructural approach of structural keys and circular fingerprints. Indeed, logP QSPR models parameterised by physicochemical descriptors consistently performed with the lowest error. Our best performing model was a stochastic gradient descent-optimised multilinear regression with 1438 descriptors, returning an internal benchmark RMSE of 1.03 log units. This corroborates the well-established notion that lipophilicity is an additive, whole-molecule property. We externally tested the model by participating in the 2019 SAMPL6 logP Prediction Challenge and blindly predicting logP for 11 protein kinase inhibitor fragment-like molecules. Our model returned an RMSE of 0.49 log units, placing eighth overall and third in the empirical methods category (submission ID 'hdpuj'). Permutation feature importance analyses revealed that physicochemical descriptors could characterise predictive molecular features highly relevant to the kinase inhibitor fragment-like molecules.
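As an illustration of the three techniques named in the abstract (a multilinear regression fitted by stochastic gradient descent, RMSE as the error measure, and permutation feature importance), the following is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state their software, hyperparameters, or descriptor set. The data here are synthetic stand-ins for descriptors and logP values, and every function name, the learning rate, and the epoch count are assumptions made for this example.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as y (log units for logP)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def predict(weights, bias, rows):
    """Multilinear model: bias plus a weighted sum of descriptor values."""
    return [bias + sum(w * x for w, x in zip(weights, row)) for row in rows]


def fit_sgd(rows, targets, lr=0.05, epochs=200, seed=0):
    """Fit the multilinear model by per-sample SGD on squared error.

    lr and epochs are illustrative values, not taken from the paper.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a new random order each epoch
        for i in order:
            err = bias + sum(w * x for w, x in zip(weights, rows[i])) - targets[i]
            weights = [w - lr * err * x for w, x in zip(weights, rows[i])]
            bias -= lr * err
    return weights, bias


def permutation_importance(weights, bias, rows, targets, seed=0):
    """Increase in RMSE when a single descriptor column is shuffled."""
    rng = random.Random(seed)
    base = rmse(targets, predict(weights, bias, rows))
    scores = []
    for j in range(len(weights)):
        col = [row[j] for row in rows]
        rng.shuffle(col)
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, col)]
        scores.append(rmse(targets, predict(weights, bias, shuffled)) - base)
    return scores


# Synthetic data (assumption): three made-up "descriptors", the third of which
# has no influence on the target, plus a small amount of Gaussian noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [1.5 * x0 - 0.8 * x1 + 0.3 + data_rng.gauss(0, 0.05) for x0, x1, _ in X]

w_fit, b_fit = fit_sgd(X, y)
train_rmse = rmse(y, predict(w_fit, b_fit, X))
importances = permutation_importance(w_fit, b_fit, X, y)
```

In this toy setting, shuffling either informative descriptor raises the RMSE far more than shuffling the uninformative third one, which is the signal a permutation importance analysis looks for. The model in the abstract uses 1438 descriptors and reports RMSE on held-out benchmark and challenge molecules, whereas this sketch scores only its own training data.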

Keywords:  Machine learning; Physicochemical properties; QSPR; SAMPL6; logP

Year: 2020   PMID: 31933037   DOI: 10.1007/s10822-020-00279-0

Source DB: PubMed   Journal: J Comput Aided Mol Des   ISSN: 0920-654X   Impact factor: 3.686


