Ziling Fan1, Amber Alley2, Kian Ghaffari2, Habtom W Ressom3. 1. Department of Biochemistry and Molecular & Cellular Biology, Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, Washington, DC, USA. 2. Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, Suite 173, Building D, 4000 Reservoir Road NW, Washington, DC, 20057, USA. 3. Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, Suite 173, Building D, 4000 Reservoir Road NW, Washington, DC, 20057, USA. hwr@georgetown.edu.
Abstract
INTRODUCTION: Metabolite annotation is a critical and challenging step in mass spectrometry-based metabolomic profiling. In a typical untargeted MS/MS-based metabolomic study, experimental MS/MS spectra are matched against those in spectral libraries for metabolite annotation. Yet existing spectral libraries cover only a small fraction of known compounds.
OBJECTIVE: The objective is to develop a method that helps rank putative metabolite IDs for analytes whose reference MS/MS spectra are not present in spectral libraries.
METHODS: We introduce MetFID, which uses an artificial neural network (ANN) trained to predict molecular fingerprints from experimental MS/MS data. To narrow the search space, MetFID retrieves candidates from metabolite databases using the molecular formula or the m/z value of the precursor ion of each analyte. The candidate whose fingerprint is most similar to the predicted fingerprint is used for metabolite annotation. A comprehensive evaluation was performed by training MetFID on MS/MS spectra from the MoNA repository and the NIST library and by testing with structure-disjoint MS/MS spectra from the NIST library, the CASMI 2016 dataset, and in-house MS/MS data from a cancer biomarker discovery study.
RESULTS: We observed that training separate models for distinct ranges of collision energies improved model performance compared to a single model covering a wide range of collision energies. Using MetaboQuest to retrieve candidates, MetFID ranked the correct putative ID first for about 50% of the testing cases. Using the independent testing dataset, we demonstrated that MetFID has the potential to improve the accuracy of ranking putative metabolite IDs by more than 5% compared to other tools such as ChemDistiller, CSI:FingerID, and MetFrag.
CONCLUSION: MetFID offers a promising opportunity to enhance the accuracy of metabolite annotation by using an ANN for molecular fingerprint prediction.
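The candidate retrieval and ranking steps described in METHODS can be illustrated with a minimal sketch. It is not MetFID's implementation: the abstract does not name the similarity measure, so the continuous Tanimoto score used here is an assumption, as are the 10 ppm tolerance, the candidate dictionary layout, and the function names `tanimoto`, `within_ppm`, and `rank_candidates`. The predicted fingerprint is taken to be a vector of per-bit probabilities, standing in for the output of a trained ANN.

```python
from typing import Dict, List, Sequence, Tuple


def tanimoto(predicted: Sequence[float], candidate: Sequence[int]) -> float:
    """Continuous Tanimoto similarity between predicted bit probabilities
    and a binary candidate fingerprint (assumed metric, not from the paper)."""
    if len(predicted) != len(candidate):
        raise ValueError("fingerprints must have the same length")
    dot = sum(p * c for p, c in zip(predicted, candidate))
    denom = sum(p * p for p in predicted) + sum(c * c for c in candidate) - dot
    return dot / denom if denom > 0 else 0.0


def within_ppm(candidate_mz: float, precursor_mz: float, tol_ppm: float) -> bool:
    """True if the candidate m/z lies within tol_ppm of the precursor m/z."""
    return abs(candidate_mz - precursor_mz) / precursor_mz * 1e6 <= tol_ppm


def rank_candidates(
    predicted: Sequence[float],
    precursor_mz: float,
    candidates: List[Dict],
    tol_ppm: float = 10.0,  # hypothetical tolerance, chosen for illustration
) -> List[Tuple[str, float]]:
    """Keep candidates whose m/z matches the precursor, then sort them by
    fingerprint similarity to the prediction, best match first."""
    scored = [
        (c["name"], tanimoto(predicted, c["fingerprint"]))
        for c in candidates
        if within_ppm(c["mz"], precursor_mz, tol_ppm)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy 4-bit fingerprints; real fingerprints have hundreds to thousands of bits.
    predicted_fp = [0.9, 0.1, 0.8, 0.2]
    toy_candidates = [
        {"name": "A", "mz": 180.0634, "fingerprint": [1, 0, 1, 0]},
        {"name": "B", "mz": 180.0640, "fingerprint": [0, 1, 0, 1]},
        {"name": "C", "mz": 181.5000, "fingerprint": [1, 0, 1, 0]},
    ]
    for name, score in rank_candidates(predicted_fp, 180.0634, toy_candidates):
        print(name, round(score, 3))
```

In this toy example, candidate C is dropped by the m/z filter even though its fingerprint matches the prediction well, which shows why the retrieval step constrains the search space before any similarity scoring takes place.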
Authors: Kai Dührkop; Markus Fleischauer; Marcus Ludwig; Alexander A Aksenov; Alexey V Melnik; Marvin Meusel; Pieter C Dorrestein; Juho Rousu; Sebastian Böcker. Journal: Nat Methods. Date: 2019-03-18. Impact factor: 28.547.
Authors: Christoph Ruttkies; Emma L Schymanski; Sebastian Wolf; Juliane Hollender; Steffen Neumann. Journal: J Cheminform. Date: 2016-01-29. Impact factor: 5.514.