
BioSM: metabolomics tool for identifying endogenous mammalian biochemical structures in chemical structure space.

Mai A Hamdalla, Ion I Mandoiu, Dennis W Hill, Sanguthevar Rajasekaran, David F Grant.

Abstract

The structural identification of unknown biochemical compounds in complex biofluids continues to be a major challenge in metabolomics research. Using LC/MS, there are currently two major options for solving this problem: searching small biochemical databases, which often do not contain the unknown of interest, or searching large chemical databases, which include large numbers of nonbiochemical compounds. Searching larger chemical databases (larger chemical space) increases the odds of identifying an unknown biochemical compound, but only if nonbiochemical structures can be eliminated from consideration. In this paper we present BioSM, a cheminformatics tool that uses known endogenous mammalian biochemical compounds (as scaffolds) and graph matching methods to identify endogenous mammalian biochemical structures in chemical structure space. The results of a comprehensive set of empirical experiments suggest that BioSM identifies endogenous mammalian biochemical structures with high accuracy. In a leave-one-out cross validation experiment, BioSM correctly predicted 95% of 1388 Kyoto Encyclopedia of Genes and Genomes (KEGG) compounds as endogenous mammalian biochemicals using 1565 scaffolds. Analysis of two additional biological data sets containing 2330 human metabolites (HMDB) and 2416 plant secondary metabolites (KEGG) resulted in biochemical annotations of 89% and 72% of the compounds, respectively. When a data set of 3895 drugs (DrugBank and USAN) was tested, 48% of these structures were predicted to be biochemical. However, when a set of synthetic chemical compounds (Chembridge and Chemsynthesis databases) was examined, only 29% of the 458,207 structures were predicted to be biochemical. Moreover, BioSM predicted that 34% of 883,199 randomly selected compounds from PubChem were biochemical. We then expanded the scaffold list to 3927 biochemical compounds and reevaluated the above data sets to determine whether scaffold number influenced model performance. Although there were significant improvements in model sensitivity and specificity using the larger scaffold list, the data set comparison results were very similar. These results suggest that additional biochemical scaffolds will not further improve our representation of biochemical structure space and that the model is reasonably robust. BioSM provides a qualitative (yes/no) and quantitative (ranking) method for endogenous mammalian biochemical annotation of chemical space and, thus, will be useful in the identification of unknown biochemical structures in metabolomics. BioSM is freely available at http://metabolomics.pharm.uconn.edu.
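
The leave-one-out idea described in the abstract can be illustrated with a small Python sketch. This is not BioSM's implementation: the abstract does not give the matching algorithm or any cutoff, so the sketch swaps in a Jaccard overlap on sets of fragment labels as a crude stand-in for graph matching. The compound names, fragment labels, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet

# Toy stand-in (assumption): a molecule is just a set of fragment labels.
Molecule = FrozenSet[str]


def jaccard(a: Molecule, b: Molecule) -> float:
    """Overlap score in [0, 1]; a crude substitute for graph matching."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_score(query: Molecule, scaffolds: Dict[str, Molecule]) -> float:
    """Highest overlap between the query and any scaffold (the ranking value)."""
    return max((jaccard(query, s) for s in scaffolds.values()), default=0.0)


def is_biochemical(query: Molecule, scaffolds: Dict[str, Molecule],
                   threshold: float = 0.5) -> bool:
    """Yes/no call: does the best scaffold overlap reach the (hypothetical) cutoff?"""
    return best_score(query, scaffolds) >= threshold


def leave_one_out_recall(compounds: Dict[str, Molecule],
                         threshold: float = 0.5) -> float:
    """Hold each compound out of the scaffold set, classify it against the
    remaining compounds, and return the fraction called biochemical."""
    if not compounds:
        return 0.0
    hits = 0
    for name, mol in compounds.items():
        others = {k: v for k, v in compounds.items() if k != name}
        if is_biochemical(mol, others, threshold):
            hits += 1
    return hits / len(compounds)


# Hypothetical compounds with made-up fragment labels.
toy_compounds: Dict[str, Molecule] = {
    "cpd_A": frozenset({"C-C", "C-O", "C=O"}),
    "cpd_B": frozenset({"C-C", "C-O", "C=O", "C-N"}),
    "cpd_C": frozenset({"C-C", "C-N", "C-O"}),
    "cpd_D": frozenset({"S-S", "C-Cl"}),
}

if __name__ == "__main__":
    print(leave_one_out_recall(toy_compounds))
```

In this toy set, three of the four compounds share enough fragments with another compound to pass the cutoff, while `cpd_D` shares none. The same `best_score` value serves as the ranking output, and thresholding it gives the yes/no output.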


Year:  2013        PMID: 23330685      PMCID: PMC3866231          DOI: 10.1021/ci300512q

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


