Jian Guo, Sam Shen, Min Liu, Chenjingyi Wang, Brian Low, Ying Chen, Yaxi Hu, Shipei Xing, Huaxu Yu, Yu Gao, Mingliang Fang, Tao Huan.
Abstract
Extracting metabolic features from liquid chromatography-mass spectrometry (LC-MS) data has been a long-standing bioinformatic challenge in untargeted metabolomics. Conventional feature extraction algorithms fail to recognize features with low signal intensities, poor chromatographic peak shapes, or characteristics that do not fit the parameter settings. This problem also poses a challenge for MS-based exposome studies, as low-abundance metabolic or exposomic features cannot be automatically recognized from raw data. To address this data processing challenge, we developed an R package, JPA (short for Joint Metabolomic Data Processing and Annotation), to comprehensively extract metabolic features from raw LC-MS data. JPA performs feature extraction by combining a conventional peak picking algorithm with strategies for (1) recognizing features that have poor peak shapes but do have tandem mass spectra (MS2) and (2) picking up features from a user-defined targeted list. The performance of JPA in global metabolomics was demonstrated using serially diluted urine samples, in which JPA was able to rescue an average of 25% of the metabolic features that the conventional peak picking algorithm missed because of dilution. More importantly, the chromatographic peak shapes, analytical accuracy, and precision of the rescued metabolic features were all evaluated. Furthermore, owing to its sensitive feature extraction, JPA achieved a limit of detection (LOD) that was up to thousands of times lower when automatically processing metabolomics data from a serially diluted metabolite standard mixture analyzed in HILIC(−) and RP(+) modes. Finally, the performance of JPA in exposome research was validated using a mixture of 250 drugs and 255 pesticides at environmentally relevant levels. JPA detected an average of 2.3-fold more exposure compounds than conventional peak picking alone.
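The combination of three extraction routes described in the abstract can be illustrated with a small sketch. This is not JPA's code (JPA is an R package, and its internals are not given here); it only shows, in Python, one plausible way to take the union of features from peak picking (PP), MS2 recognition (MR), and a targeted list (TL) without counting the same feature twice. The `Feature` class, the function names, the example values, and the matching tolerances (10 ppm in m/z, 10 s in retention time) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    mz: float    # mass-to-charge ratio
    rt: float    # retention time in seconds
    source: str  # extraction route that produced the feature (hypothetical labels)


def same_feature(a, b, ppm_tol=10.0, rt_tol=10.0):
    """Treat two features as identical if m/z and RT agree within tolerance.

    The tolerance defaults are illustrative assumptions, not JPA settings.
    """
    ppm_diff = abs(a.mz - b.mz) / a.mz * 1e6
    return ppm_diff <= ppm_tol and abs(a.rt - b.rt) <= rt_tol


def merge_features(peak_picked, ms2_recognized, targeted):
    """Union of the three lists, keeping the first route that found each feature."""
    merged = list(peak_picked)
    for candidates in (ms2_recognized, targeted):
        for cand in candidates:
            # Only add a candidate that no earlier route has already reported.
            if not any(same_feature(cand, kept) for kept in merged):
                merged.append(cand)
    return merged


# Made-up example values, chosen only to exercise the matching logic.
pp = [Feature(180.0634, 120.0, "PP")]
mr = [Feature(180.0635, 121.5, "MR"), Feature(255.2330, 300.0, "MR")]
tl = [Feature(255.2331, 301.0, "TL"), Feature(310.1200, 45.0, "TL")]

merged = merge_features(pp, mr, tl)
print([(f.mz, f.source) for f in merged])
```

In this toy input, one MS2-recognized feature and one targeted-list feature duplicate earlier entries, so only the features missed by the preceding routes are added. Giving peak picking priority is a design choice of the sketch, meant to mirror the abstract's description of the other two strategies as rescuing features that peak picking missed.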
Keywords: data processing; exposomics; feature extraction; metabolite annotation; untargeted metabolomics
Year: 2022 PMID: 35323655 PMCID: PMC8952385 DOI: 10.3390/metabo12030212
Source DB: PubMed Journal: Metabolites ISSN: 2218-1989
Figure 1. Schematic workflow of JPA. The numbers stand for the three methods of feature extraction used in JPA. The green check means the metabolic feature can be detected using the method. The stop sign means the metabolic feature cannot be detected using the method.
Figure 2. Mechanistic explanation of JPA-MS2 recognition (JPA-MR) in extracting metabolic features.
Figure 3. The influence of dilution on the (A) number and (B) fidelity rate of features extracted by JPA-peak picking (JPA-PP) and JPA-MS2 recognition (JPA-MR) in urine.
Figure 4. Comparison of original urine data in RP(+) mode processed by JPA-PP and JPA-MR in (A) number of features, (B) quantitative precision, (C) number of annotated endogenous metabolites, and (D) processing time.
Figure 5. Comparison of (A) endogenous metabolite standards and (B) exposome chemicals extracted by JPA-peak picking (JPA-PP), JPA-MS2 recognition (JPA-MR), and JPA-targeted list (JPA-TL).
Figure 6. Circular bar plot of LODs of endogenous metabolites calculated based on the results of JPA-peak picking (JPA-PP), JPA-peak picking + MS2 recognition (JPA-PP + MR), and JPA-peak picking + MS2 recognition + targeted list (JPA-PP + MR + TL) on (A) HILIC(−) and (B) RP(+) mode data. The column plot inside the circular plot shows the absolute LOD of one representative metabolite in each mode.