Persistent spectral hypergraph based machine learning (PSH-ML) for protein-ligand binding affinity prediction.

Xiang Liu, Huitao Feng, Jie Wu, Kelin Xia.

Abstract

Molecular descriptors are essential not only to quantitative structure activity/property relationship (QSAR/QSPR) models, but also to machine learning based chemical and biological data analysis. In this paper, we propose persistent spectral hypergraph (PSH) based molecular descriptors or fingerprints for the first time. Our PSH-based molecular descriptors are used to characterize molecular structures and interactions, and are further combined with machine learning models, in particular gradient boosting tree (GBT), for protein-ligand binding affinity prediction. Unlike traditional molecular descriptors, which are usually based on molecular graph models, a hypergraph-based topological representation is proposed for protein-ligand interaction characterization. Moreover, a filtration process is introduced to generate a series of nested hypergraphs at different scales. For each of these hypergraphs, its eigen spectrum information can be obtained from the corresponding (Hodge) Laplacian matrix. PSH studies the persistence and variation of the eigen spectrum of the nested hypergraphs during the filtration process. Molecular descriptors or fingerprints can be generated from persistent attributes, which are statistical or combinatorial functions of PSH, and combined with machine learning models, in particular GBT. We test our PSH-GBT model on the three most commonly used datasets: PDBbind-2007, PDBbind-2013 and PDBbind-2016. To the best of our knowledge, our results on all three datasets are better than those of all existing machine learning models with traditional molecular descriptors. © The authors 2021. Published by Oxford University Press. All rights reserved.
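
The filtration-and-spectrum idea in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the authors' method: it assumes an ordinary distance-cutoff graph on a point cloud instead of a hypergraph, uses only the graph Laplacian L = D - A (the lowest-order Hodge Laplacian), and picks its own summary statistics. The function names, the tolerance `tol` and the choice of attributes are hypothetical and serve illustration only.

```python
import numpy as np


def graph_laplacian(points, cutoff):
    """Graph Laplacian L = D - A of the graph joining points within `cutoff`."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    adj = dist <= cutoff
    np.fill_diagonal(adj, False)          # no self-loops
    adj = adj.astype(float)
    return np.diag(adj.sum(axis=1)) - adj


def persistent_attributes(points, cutoffs, tol=1e-8):
    """One feature row per filtration value (assumed attribute set):
    [count of zero eigenvalues, min, max, mean, sum of nonzero eigenvalues]."""
    rows = []
    for cutoff in cutoffs:
        eig = np.linalg.eigvalsh(graph_laplacian(points, cutoff))
        zero_count = int(np.sum(eig <= tol))   # equals the number of connected components
        nonzero = eig[eig > tol]
        if nonzero.size:
            stats = [nonzero.min(), nonzero.max(), nonzero.mean(), nonzero.sum()]
        else:
            stats = [0.0, 0.0, 0.0, 0.0]       # no edges yet at this scale
        rows.append([zero_count, *stats])
    return np.array(rows)


# Toy point cloud standing in for atom coordinates (made-up values).
points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [5.0, 5.0, 5.0]])
cutoffs = [0.5, 1.0, 1.5, 10.0]              # increasing cutoffs give nested graphs
features = persistent_attributes(points, cutoffs)
print(features.ravel())                       # flattened vector, one descriptor per complex
```

Because the cutoffs increase, each graph contains the previous one, which mirrors the nested structure of a filtration. A flattened vector like this could then be passed, one row per complex, to any gradient boosting regressor; that step is left out here because the abstract does not specify the model settings.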

Keywords:  Drug design; Hodge Laplacian; Machine learning; Persistent spectral hypergraph

Year:  2021        PMID: 33837771     DOI: 10.1093/bib/bbab127

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  4 in total

1.  Hodge theory-based biomolecular data analysis.

Authors:  Ronald Koh Joon Wei; Junjie Wee; Valerie Evangelin Laurent; Kelin Xia
Journal:  Sci Rep       Date:  2022-06-11       Impact factor: 4.996

2.  Prediction of Binding Free Energy of Protein-Ligand Complexes with a Hybrid Molecular Mechanics/Generalized Born Surface Area and Machine Learning Method.

Authors:  Lina Dong; Xiaoyang Qu; Yuan Zhao; Binju Wang
Journal:  ACS Omega       Date:  2021-11-21

3.  Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction.

Authors:  Xiang Liu; Huitao Feng; Jie Wu; Kelin Xia
Journal:  PLoS Comput Biol       Date:  2022-04-06       Impact factor: 4.475

4.  XLPFE: A Simple and Effective Machine Learning Scoring Function for Protein-Ligand Scoring and Ranking.

Authors:  Lina Dong; Xiaoyang Qu; Binju Wang
Journal:  ACS Omega       Date:  2022-06-13
