
Persistent Cohomology for Data With Multicomponent Heterogeneous Information.

Zixuan Cang, Guo-Wei Wei.

Abstract

Persistent homology is a powerful tool for characterizing the topology of a dataset at various geometric scales. When applied to the description of molecular structures, persistent homology can capture multiscale geometric features and reveal certain interaction patterns in terms of topological invariants. However, in addition to the geometric information, molecular structures carry a wide variety of nongeometric information, such as element types, atomic partial charges, atomic pairwise interactions, and electrostatic potential functions, that is not described by persistent homology. Although element-specific homology and electrostatic persistent homology can encode some nongeometric information into geometry-based topological invariants, it is desirable to have a mathematical paradigm that systematically embeds both geometric and nongeometric information, i.e., multicomponent heterogeneous information, into unified topological representations. To this end, we propose a persistent-cohomology-based framework for the enriched representation of data. In our framework, nongeometric information can either be distributed globally or reside locally on the datasets in the geometric sense, and it can be properly defined on topological spaces, i.e., simplicial complexes. Using the proposed framework, enriched barcodes are extracted from datasets to represent heterogeneous information. We consider a variety of datasets to validate the present formulation and illustrate the usefulness of the proposed method. It is found that the proposed framework outperforms or at least matches state-of-the-art methods in protein-ligand binding affinity prediction from massive biomolecular datasets without resorting to any deep learning formulation.
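As a rough illustration of what "topology at various geometric scales" means, the following minimal sketch computes only the 0-dimensional persistence barcode (connected components) of a small point cloud under a Vietoris-Rips filtration. It is not the persistent cohomology framework or the enriched barcodes proposed in the paper. The function name `h0_barcode`, the toy coordinates, and the convention that an edge enters the filtration at its Euclidean length are all assumptions made for this example.

```python
from itertools import combinations
from math import dist, inf


def h0_barcode(points):
    """Return 0-dimensional Vietoris-Rips bars as (birth, death) pairs.

    Assumed convention: every point is born at scale 0, and an edge
    appears at a scale equal to its Euclidean length.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        # Find the component representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairwise edges in order of increasing length.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge, so one of them dies at this scale.
            parent[ri] = rj
            bars.append((0.0, length))

    # One component survives at every scale.
    bars.append((0.0, inf))
    return bars


# Hypothetical toy data: two well-separated clusters in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.0), (11.0, 0.0)]
bars = h0_barcode(points)
for birth, death in bars:
    print(f"[{birth}, {death})")
```

Short bars correspond to points merging within a cluster, while the long finite bar records the scale at which the two clusters join. Nongeometric attributes such as element types or partial charges play no role in this sketch, which is the limitation the abstract describes.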

Keywords:  55U10; 55U30; 92C40; biophysics; drug design; machine learning; topological data analysis

Year:  2020        PMID: 34222831      PMCID: PMC8249079          DOI: 10.1137/19m1272226

Source DB:  PubMed          Journal:  SIAM J Math Data Sci        ISSN: 2577-0187

