
Empowering large chemical knowledge bases for exposomics: PubChemLite meets MetFrag.

Emma L Schymanski, Todor Kondić, Steffen Neumann, Paul A Thiessen, Jian Zhang, Evan E Bolton.

Abstract

Compound (or chemical) databases are an invaluable resource for many scientific disciplines. Exposomics researchers need to find and identify relevant chemicals that cover the entirety of potential (chemical and other) exposures over entire lifetimes. This is a daunting task: the largest chemical databases contain over 100 million chemicals, yet these resources have broadly acknowledged knowledge gaps, leaving researchers with too much, yet not enough, information at the same time to perform comprehensive exposomics research. Furthermore, improvements in analytical technologies and computational mass spectrometry workflows, coupled with the rapid growth in databases and increasing demand for high-throughput "big data" services from the research community, present significant challenges for both data hosts and workflow developers. This article explores how to reduce candidate search spaces in non-target small molecule identification workflows while increasing content usability in the context of environmental and exposomics analyses, so as to profit from the increasing size and information content of large compound databases and improve efficiency at the same time. These methods are explored using PubChem, the NORMAN Network Suspect List Exchange and the in silico fragmentation approach MetFrag. A subset of the PubChem database relevant for exposomics, PubChemLite, is presented as a database resource that can be (and has been) integrated into current workflows for high resolution mass spectrometry. Benchmarking datasets from earlier publications are used to show how experimental knowledge and existing datasets can be used to detect and fill gaps in compound databases, progressively improving large resources such as PubChem and topic-specific subsets such as PubChemLite.
PubChemLite is a living collection that is updated as annotation content in PubChem is updated, and is exported to allow direct integration into existing workflows such as MetFrag. The source code and files necessary to recreate or adjust it are jointly hosted by the research parties (see data availability statement). This effort shows that enhancing the FAIRness (Findability, Accessibility, Interoperability and Reusability) of open resources can mutually enhance several resources for whole community benefit. The authors explicitly welcome additional community input on ideas for future developments.
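
The idea of reducing a candidate search space can be illustrated with a minimal sketch. It is not the PubChemLite build procedure or MetFrag's implementation; it only assumes that each database entry carries a neutral monoisotopic mass and some count of annotation categories. The `Candidate` fields, the `select_candidates` helper, the thresholds and the toy entries are all hypothetical and chosen purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    identifier: str        # hypothetical ID, not a real PubChem CID
    mono_mass: float       # neutral monoisotopic mass in Da
    annotation_count: int  # hypothetical number of annotation categories


def ppm_window(mass: float, ppm: float) -> tuple[float, float]:
    """Return the (low, high) mass bounds for a relative tolerance in ppm."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


def select_candidates(
    db: list[Candidate],
    neutral_mass: float,
    ppm: float = 5.0,
    min_annotations: int = 0,
) -> list[Candidate]:
    """Keep entries inside the mass window that meet the annotation threshold.

    Results are ordered by annotation count, highest first, as one simple
    (assumed) way of putting better-documented compounds at the top.
    """
    low, high = ppm_window(neutral_mass, ppm)
    hits = [
        c for c in db
        if low <= c.mono_mass <= high and c.annotation_count >= min_annotations
    ]
    return sorted(hits, key=lambda c: c.annotation_count, reverse=True)


# Toy entries with made-up masses; they do not describe real compounds.
TOY_DB = [
    Candidate("toy-1", 200.0000, 3),
    Candidate("toy-2", 200.0008, 0),
    Candidate("toy-3", 200.0100, 5),
    Candidate("toy-4", 199.9995, 1),
]

if __name__ == "__main__":
    # Mass filter alone, then mass filter plus an annotation requirement.
    print([c.identifier for c in select_candidates(TOY_DB, 200.0)])
    print([c.identifier for c in select_candidates(TOY_DB, 200.0, min_annotations=1)])
```

In this toy case the mass window alone keeps three of the four entries, and requiring at least one annotation category removes the unannotated one. Restricting a large database to annotated entries applies the same kind of filter once, ahead of time, so that each query searches a smaller list.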

Keywords:  Chemical database; Cheminformatics; Compound database; Compound knowledge base; Environmental science; Exposomics; FAIR; High resolution mass spectrometry; Identification; Open science

Year:  2021        PMID: 33685519     DOI: 10.1186/s13321-021-00489-0

Source DB:  PubMed          Journal:  J Cheminform        ISSN: 1758-2946            Impact factor:   5.514


