Miao Yu1, Georgia Dolios2, Lauren Petrick2,3.
Abstract
Unknown features in untargeted metabolomics and non-targeted analysis (NTA) are identified using fragment ions from MS/MS spectra to predict the structures of the unknown compounds. Precursor ions are commonly selected for fragmentation using data-dependent acquisition (DDA) strategies or, following statistical analysis, using targeted MS/MS approaches. However, the precursor ions selected by DDA cover only a biased subset of the peaks or features found in full scan data. In addition, different statistical analyses can select different precursor ions for MS/MS analysis, which makes post-hoc validation of ions selected by a secondary analysis impossible when MS/MS spectra were collected only for the precursor ions selected by the original statistical method. Here we propose an automated, exhaustive, statistical model-free workflow, paired mass distance-dependent analysis (PMDDA), for reproducible collection of MS2 fragment ions from unknown compounds found in MS1 full scan data. Our workflow first removes redundant peaks from MS1 data and then exports a list of precursor ions for pseudo-targeted MS/MS analysis on independent peaks. This workflow provides comprehensive coverage of MS2 collection on unknown compounds found in full scan analysis using a "one peak for one compound" approach without a priori redundant peak information. We compared pseudo-spectra formation and the number of MS2 spectra linked to MS1 data using the PMDDA workflow with those obtained using the CAMERA and RAMClustR algorithms. More annotated compounds, molecular networks, and unique MS/MS spectra were found using PMDDA than with CAMERA or RAMClustR. In addition, PMDDA can generate a preferred ion list for iterative DDA to enhance coverage of compounds when instruments support such functions. Finally, compounds with signals in both positive and negative modes can be identified by the PMDDA workflow to further reduce redundancies.
The whole workflow is fully reproducible as a Docker image, xcmsrocker, with both the original data and the data processing template.
Keywords: Data analysis; High-resolution mass spectrometry; Metabolomics; Open science; Reproducible research; Workflow
Year: 2022 PMID: 35172886 PMCID: PMC8848943 DOI: 10.1186/s13321-022-00586-8
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Fig. 1 PMDDA workflow. Raw peaks are filtered by the GlobalStd algorithm to remove redundant peaks, and the remaining peaks are then merged by cluster analysis to generate the precursor ion list. The selected peaks are assigned to multiple injections to collect the fragment ions for structure identification. The whole analysis is available as a data processing template in the ‘rmwf’ package. The complete data analysis is reproducible with the xcmsrocker image.
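The injection-assignment step named in Fig. 1 can be illustrated with a minimal Python sketch. This is not the ‘rmwf’ implementation: the greedy rule, the `Precursor` type, the function name `assign_injections`, and the `min_rt_gap` parameter are all assumptions introduced here for illustration. The idea shown is only that precursor ions eluting too close together are spread across separate injections so that each can be fragmented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Precursor:
    """Hypothetical container for one selected precursor ion."""
    mz: float  # mass-to-charge ratio
    rt: float  # retention time in seconds


def assign_injections(precursors: List[Precursor],
                      min_rt_gap: float = 6.0) -> List[List[Precursor]]:
    """Greedily spread precursors over injections.

    Assumption (not from the source): two precursors may share an
    injection only if their retention times differ by at least
    `min_rt_gap` seconds.
    """
    injections: List[List[Precursor]] = []
    for p in sorted(precursors, key=lambda x: x.rt):
        for inj in injections:
            # Each injection is kept sorted by rt, so only the last
            # entry needs to be checked against the new precursor.
            if p.rt - inj[-1].rt >= min_rt_gap:
                inj.append(p)
                break
        else:
            # No existing injection has room: open a new one.
            injections.append([p])
    return injections
```

With this rule, the number of injections grows with the local density of precursors along the retention time axis, so a larger `min_rt_gap` trades instrument time for cleaner isolation of each precursor.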
Fig. 2 UpSet plot of metabolite networks found from CAMERA-selected ions, RAMClustR-selected ions, PMDDA-selected ions, and iterative DDA (left panel: positive mode data; right panel: negative mode data). The ‘iDDA’ set denotes iterative DDA with PMDDA-selected precursor ions as the preferred list.
Fig. 3 Features linked between positive and negative ionization modes by a PMD of 2.02 Da within a retention time shift of 10 s. The red and blue circles represent positive and negative ions, respectively. Compounds with identities confirmed by MS/MS annotation against GNPS are colored in black.
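The linking rule described in the Fig. 3 caption can be sketched in Python as follows. The PMD of 2.02 Da and the 10 s retention time window come from the caption. The function name `link_pos_neg`, the `(mz, rt)` tuple layout, and the mass tolerance `mz_tol` are assumptions made for this example and do not reflect the authors' implementation.

```python
from typing import List, Tuple

Feature = Tuple[float, float]  # (m/z, retention time in seconds)


def link_pos_neg(pos: List[Feature],
                 neg: List[Feature],
                 pmd: float = 2.02,      # paired mass distance from the caption
                 mz_tol: float = 0.01,   # assumed mass tolerance in Da
                 rt_tol: float = 10.0    # retention time window from the caption
                 ) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where pos[i] and neg[j] may be the same compound.

    A pair is linked when the positive-mode m/z exceeds the negative-mode
    m/z by `pmd` (within `mz_tol`) and the retention times differ by at
    most `rt_tol` seconds.
    """
    links = []
    for i, (mz_p, rt_p) in enumerate(pos):
        for j, (mz_n, rt_n) in enumerate(neg):
            mass_ok = abs((mz_p - mz_n) - pmd) <= mz_tol
            rt_ok = abs(rt_p - rt_n) <= rt_tol
            if mass_ok and rt_ok:
                links.append((i, j))
    return links
```

The nested loop is quadratic in the number of features, which is acceptable for a sketch. For large peak lists, sorting by retention time and scanning only a sliding window would avoid most comparisons.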