Impact of similarity threshold on the topology of molecular similarity networks and clustering outcomes.

Gergely Zahoránszky-Kőhalmi, Cristian G Bologa, Tudor I Oprea.

Abstract

BACKGROUND: Methods based on complex network theory, together with the emergence of "Big Data", have reshaped how structure-activity relationships of molecules are investigated. These new methods face an important challenge: how to restructure a large molecular dataset into a network that best serves the subsequent analyses. Focusing on network clustering, our study addresses this open question by proposing a data transformation method and a clustering framework.
RESULTS: Using the WOMBAT and PubChem MLSMR datasets, we investigated how varying the similarity threshold applied to the similarity matrix affects the average clustering coefficient of the resulting similarity networks. These similarity networks were then clustered with the InfoMap algorithm. We devised a systematic method to generate so-called "pseudo-reference" clustering datasets, which compensate for the lack of large-scale reference datasets. Using the clustering framework, we observed how varying the similarity threshold affects both the average clustering coefficient and the clustering performance.
CONCLUSIONS: We observed that the average clustering coefficient, viewed as a function of the similarity threshold, exhibits a peak that spans a range of threshold values. This peak is preceded by a steep decline in the number of edges of the similarity network. The maximum of this peak is well aligned with the best clustering outcome. Thus, if no reference set is available, choosing the similarity threshold associated with this peak would be a near-ideal setting for the subsequent network cluster analysis. The proposed method can be used as a general approach to determine an appropriate similarity threshold for generating the similarity network of large-scale molecular datasets.
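
The threshold scan described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`threshold_network`, `average_clustering`, `scan_thresholds`), the toy similarity matrix, and the choice of an inclusive cutoff (`>=`) are all assumptions made for illustration. The sketch builds an unweighted network from a symmetric similarity matrix at each threshold, then reports the edge count and the average clustering coefficient, with nodes of degree below 2 counted as 0.

```python
from itertools import combinations


def threshold_network(sim, t):
    """Return adjacency sets, linking i and j when sim[i][j] >= t (assumed cutoff rule)."""
    n = len(sim)
    adj = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if sim[i][j] >= t:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj:
        k = len(nbrs)
        if k < 2:
            continue
        # Count pairs of neighbours that are themselves connected.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def scan_thresholds(sim, thresholds):
    """Return (threshold, edge count, average clustering coefficient) per threshold."""
    rows = []
    for t in thresholds:
        adj = threshold_network(sim, t)
        n_edges = sum(len(nbrs) for nbrs in adj) // 2
        rows.append((t, n_edges, average_clustering(adj)))
    return rows


# Hypothetical 4-molecule similarity matrix (illustrative values only).
toy = [
    [1.0, 0.9, 0.8, 0.2],
    [0.9, 1.0, 0.85, 0.3],
    [0.8, 0.85, 1.0, 0.4],
    [0.2, 0.3, 0.4, 1.0],
]
rows = scan_thresholds(toy, [0.1, 0.35, 0.5, 0.95])
for t, n_edges, acc in rows:
    print(f"threshold={t:.2f} edges={n_edges} avg_clustering={acc:.3f}")
```

At very low thresholds the network is complete and the coefficient is trivially 1, so on real data the peak of interest would be sought after the steep drop in edge count, as the abstract describes. For large datasets a graph library would likely be preferable to this quadratic pure-Python loop.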

Year:  2016        PMID: 27030802      PMCID: PMC4812625          DOI: 10.1186/s13321-016-0127-5

Source DB:  PubMed          Journal:  J Cheminform        ISSN: 1758-2946            Impact factor:   5.514


  8 in total

1.  Modulation of Triple Artemisinin-Based Combination Therapy Pharmacodynamics by Plasmodium falciparum Genotype.

Authors:  Megan R Ansbro; Zina Itkin; Lu Chen; Gergely Zahoranszky-Kohalmi; Chanaki Amaratunga; Olivo Miotto; Tyler Peryea; Charlotte V Hobbs; Seila Suon; Juliana M Sá; Arjen M Dondorp; Rob W van der Pluijm; Thomas E Wellems; Anton Simeonov; Richard T Eastman
Journal:  ACS Pharmacol Transl Sci       Date:  2020-11-02

2.  Erratum to: Impact of similarity threshold on the topology of molecular similarity networks and clustering outcomes.

Authors:  Gergely Zahoránszky-Kőhalmi; Cristian G Bologa; Oleg Ursu; Tudor I Oprea
Journal:  J Cheminform       Date:  2016-05-20       Impact factor: 5.514

3.  SmartGraph: a network pharmacology investigation platform.

Authors:  Gergely Zahoránszky-Kőhalmi; Timothy Sheils; Tudor I Oprea
Journal:  J Cheminform       Date:  2020-01-21       Impact factor: 5.514

4.  DeepGraphMolGen, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach.

Authors:  Yash Khemchandani; Stephen O'Hagan; Soumitra Samanta; Neil Swainston; Timothy J Roberts; Danushka Bollegala; Douglas B Kell
Journal:  J Cheminform       Date:  2020-09-04       Impact factor: 5.514

5.  Quantifying Mineral-Ligand Structural Similarities: Bridging the Geological World of Minerals with the Biological World of Enzymes.

Authors:  Daniel Zhao; Stuart Bartlett; Yuk L Yung
Journal:  Life (Basel)       Date:  2020-12-10

6.  Network-based piecewise linear regression for QSAR modelling.

Authors:  Jonathan Cardoso-Silva; Lazaros G Papageorgiou; Sophia Tsoka
Journal:  J Comput Aided Mol Des       Date:  2019-10-18       Impact factor: 3.686

Review 7.  The Transporter-Mediated Cellular Uptake and Efflux of Pharmaceutical Drugs and Biotechnology Products: How and Why Phospholipid Bilayer Transport Is Negligible in Real Biomembranes.

Authors:  Douglas B Kell
Journal:  Molecules       Date:  2021-09-16       Impact factor: 4.411

8.  A Novel Network Science and Similarity-Searching-Based Approach for Discovering Potential Tumor-Homing Peptides from Antimicrobials.

Authors:  Maylin Romero; Yovani Marrero-Ponce; Hortensia Rodríguez; Guillermin Agüero-Chapin; Agostinho Antunes; Longendri Aguilera-Mendoza; Felix Martinez-Rios
Journal:  Antibiotics (Basel)       Date:  2022-03-17