Tianwei Yu¹, Dean P Jones¹
¹ Department of Biostatistics and Bioinformatics, Rollins School of Public Health and Department of Medicine, School of Medicine, Emory University, Atlanta, GA 30322, USA.
Abstract
MOTIVATION: Peak detection is a key step in the preprocessing of untargeted metabolomics data generated from high-resolution liquid chromatography-mass spectrometry (LC/MS). The common practice is to use filters with predetermined parameters to select peaks in the LC/MS profile. This rigid approach can perform suboptimally when the chosen peak model and parameters do not suit the characteristics of the data.
RESULTS: Here we present a method that learns directly from various data features of the extracted ion chromatograms (EICs) to differentiate true peak regions from noise regions in the LC/MS profile. It uses knowledge of known metabolites together with robust machine learning approaches. Unlike currently available methods, the new approach does not assume a parametric peak shape model and allows maximum flexibility. We demonstrate the superiority of the new approach using real data. Because matching to known metabolites entails uncertainties and cannot be considered a gold standard, we also developed a probabilistic receiver-operating characteristic (pROC) approach that can incorporate these uncertainties.
AVAILABILITY AND IMPLEMENTATION: The new peak detection approach is implemented as part of the apLCMS package, available at http://web1.sph.emory.edu/apLCMS/
CONTACT: tyu8@emory.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
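The abstract does not spell out how the pROC is computed, so the following is only a minimal sketch of one plausible reading, not the authors' definition: each detected feature carries a classifier score and a probability that its match to a known metabolite is correct, and those probabilities act as fractional labels when accumulating true and false positives. The function names `soft_roc` and `auc` and the example numbers are hypothetical.

```python
def soft_roc(scores, probs):
    """ROC curve with probabilistic labels (illustrative assumption only).

    scores: classifier score per feature (higher = more peak-like).
    probs:  probability in [0, 1] that each feature is a true match.
    Returns (fpr, tpr) as lists of points starting at (0, 0).
    """
    if len(scores) != len(probs):
        raise ValueError("scores and probs must have the same length")
    total_pos = sum(probs)                    # expected number of true peaks
    total_neg = sum(1.0 - p for p in probs)   # expected number of noise features
    if total_pos <= 0 or total_neg <= 0:
        raise ValueError("need nonzero expected positives and negatives")

    # Sweep the threshold from the highest score downwards.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0.0
    for rank, i in enumerate(order):
        tp += probs[i]          # fractional true positive
        fp += 1.0 - probs[i]    # fractional false positive
        is_last = rank == len(order) - 1
        # Emit a point only once all features tied at this score are counted.
        if is_last or scores[order[rank + 1]] != scores[i]:
            tpr.append(tp / total_pos)
            fpr.append(fp / total_neg)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
               for k in range(1, len(fpr)))


# Made-up example: four features with uncertain metabolite matches.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_probs = [0.95, 0.70, 0.20, 0.05]
example_fpr, example_tpr = soft_roc(example_scores, example_probs)
example_auc = auc(example_fpr, example_tpr)
```

With probabilities of exactly 0 or 1 this reduces to the ordinary ROC curve, and with uninformative probabilities (all 0.5) the curve falls on the diagonal, which is why fractional labels are a natural way to carry matching uncertainty into the evaluation.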