Xiaoli Wei, Xue Shi, Seongho Kim, Jeffrey S Patrick, Joe Binkley, Maiying Kong, Craig McClain, Xiang Zhang.
Abstract
A data-dependent peak model (DDPM) based spectrum deconvolution method was developed for the analysis of high-resolution LC-MS data. To construct the selected ion chromatograms (XICs), a clustering method, density-based spatial clustering of applications with noise (DBSCAN), is applied to all m/z values of an LC-MS data set to group them into XICs. DBSCAN constructs the XICs without the need for a user-defined m/z variation window. After XIC construction, the peaks of molecular ions in each XIC are detected using both the first and the second derivative tests, followed by an optimized chromatographic peak model selection method for peak deconvolution. Six chromatographic peak models are considered: Gaussian, log-normal, Poisson, gamma, exponentially modified Gaussian, and a hybrid of exponential and Gaussian models. Abundant nonoverlapping peaks are chosen to find the optimal peak models, which are both data- and retention-time-dependent. Analysis of 18 spiked-in LC-MS data sets demonstrates that the proposed DDPM spectrum deconvolution method outperforms the traditional method. On average, the DDPM approach not only detected 58 more chromatographic peaks in each of the test LC-MS data sets but also improved the retention time and peak area by 3% and 6%, respectively.
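The grouping step described in the abstract can be illustrated with a small, pure-Python DBSCAN restricted to one dimension. This is a minimal sketch, not the authors' implementation: the function name `dbscan_1d`, the `eps` and `min_pts` values, and the example m/z values are all assumptions chosen for illustration, and the abstract does not say how the actual DBSCAN parameters were set.

```python
from bisect import bisect_left, bisect_right


def dbscan_1d(values, eps, min_pts):
    """Cluster 1-D values with DBSCAN; returns one label per input value.

    Labels are 0, 1, ... for clusters and -1 for noise.
    eps and min_pts are illustrative assumptions, not values from the paper.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    xs = [values[i] for i in order]
    n = len(xs)

    def neighbors(k):
        # All sorted positions whose value lies within eps of xs[k].
        lo = bisect_left(xs, xs[k] - eps)
        hi = bisect_right(xs, xs[k] + eps)
        return range(lo, hi)

    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for k in range(n):
        if labels[k] is not None:
            continue
        nb = neighbors(k)
        if len(nb) < min_pts:
            labels[k] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[k] = cluster
        stack = list(nb)
        while stack:
            j = stack.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbj = neighbors(j)
            if len(nbj) >= min_pts:  # j is a core point: keep expanding
                stack.extend(nbj)

    # Map labels back to the original input order.
    out = [None] * n
    for pos, i in enumerate(order):
        out[i] = labels[pos]
    return out


# Hypothetical m/z values: two ions plus one stray measurement.
mz = [200.1000, 200.1004, 200.1002, 350.2500, 350.2503, 350.2501, 500.0]
labels = dbscan_1d(mz, eps=0.001, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Each cluster label would correspond to one XIC, while points labeled -1 are treated as noise and left out.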
Year: 2014 PMID: 24533635 PMCID: PMC3982975 DOI: 10.1021/ac403803a
Source DB: PubMed Journal: Anal Chem ISSN: 0003-2700 Impact factor: 6.986
Figure 1. Information on the m/z span within the XICs generated by DBSCAN for analysis of LC-MS data acquired from sample S251: (A) the cumulative distribution of the Δm/z values of all XICs; (B) the relationship between the m/z span in XICs and the number of chromatographic peaks detected in the corresponding XICs.
Figure 2. The distribution of Silhouette scores of all clusters obtained by DBSCAN from the entire LC-MS data of sample S251.
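For readers unfamiliar with the Silhouette score, the sketch below computes it for 1-D clusters using the standard definition s = (b − a) / max(a, b), where a is the mean distance to points in the same cluster and b is the mean distance to the nearest other cluster. How the paper aggregates scores per cluster is not stated here; averaging the point scores within each cluster is an assumption, as are the function name and the example values.

```python
def silhouette_by_cluster(values, labels):
    """Mean silhouette score per cluster for 1-D data.

    Assumes noise points have already been dropped by the caller.
    """
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    per_cluster = {}
    for lab, own in clusters.items():
        scores = []
        for v in own:
            if len(own) == 1:
                scores.append(0.0)  # conventional score for a singleton
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(abs(v - u) for u in own) / (len(own) - 1)
            # b: smallest mean distance to any other cluster
            b = min(
                sum(abs(v - u) for u in other) / len(other)
                for k, other in clusters.items()
                if k != lab
            )
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
        per_cluster[lab] = sum(scores) / len(scores)
    return per_cluster


# Hypothetical m/z values for two well-separated clusters.
mz = [200.1000, 200.1002, 200.1004, 350.2500, 350.2501, 350.2503]
scores = silhouette_by_cluster(mz, [0, 0, 0, 1, 1, 1])
print(scores)
```

Scores close to 1 indicate tight, well-separated clusters; scores near 0 or below suggest that a cluster overlaps its neighbor.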
Figure 3. An example of XICs generated using different user-defined m/z variation windows and the DBSCAN approach: (A) Δm/z ≤ 4 ppm; (B) Δm/z ≤ 5 ppm; (C) Δm/z ≤ 6 ppm; (D) Δm/z ≤ 7 ppm; (E) Δm/z ≤ 8 ppm; (F) DBSCAN approach.
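To make the ppm windows concrete, the following sketch computes the m/z span of a group of measurements in parts per million and lists which of the windows from Figure 3 would accept it. Using the mean m/z as the reference mass is an assumption, as is the second m/z value in the example; only 812.6157 appears in the source.

```python
def mz_span_ppm(mz_values):
    """Span (max - min) of a group of m/z values, in ppm of their mean."""
    mean_mz = sum(mz_values) / len(mz_values)
    return (max(mz_values) - min(mz_values)) / mean_mz * 1e6


# 812.6157 is taken from the text; 812.6200 is a hypothetical second reading.
group = [812.6157, 812.6200]
span = mz_span_ppm(group)
print(f"span = {span:.2f} ppm")

# Which user-defined windows (in ppm) would keep both readings in one XIC?
accepting = [w for w in (4, 5, 6, 7, 8) if span <= w]
print(accepting)  # [6, 7, 8]
```

The example shows why a fixed window is sensitive to its setting: a span of roughly 5.3 ppm splits the readings at 4 or 5 ppm but keeps them together at 6 ppm and above.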
The Analysis Results of All Spiked-in LC-MS Data Set Using the User Defined m/z Variation Window (Δm/z ≤ 7 ppm) and the DBSCAN Approach
| approach | | | μrt_10 (%) | μarea_10 (%) | μrt_25 (%) | μarea_25 (%) | μrt_40 (%) | μarea_40 (%) |
|---|---|---|---|---|---|---|---|---|
| Δm/z ≤ 7 ppm | 5141 | 1827 | 0.41 | 14.3 | 0.41 | 14.5 | 0.38 | 14.2 |
| DBSCAN | 5557 | 1885 | 0.39 | 14.3 | 0.34 | 13.7 | 0.33 | 13.7 |
Figure 4. A sample XIC constructed by the DBSCAN approach with an m/z variation of 37 ppm.
Figure 5. Effect of five chromatographic peak models in fitting a region of an XIC containing overlapped chromatographic peaks: (A) PMM model; (B) GMM model; (C) GaMM model; (D) LNMM model; (E) EMGM model.
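Two of the peak shapes named in the abstract, the Gaussian and the exponentially modified Gaussian (EMG), can be written down compactly. The sketch below uses the textbook area-normalized forms; the parameter names (`area`, `mu`, `sigma`, `tau`) and values are assumptions for illustration, and the mixture-fitting procedure behind the models in Figure 5 is not reproduced here.

```python
import math


def gaussian_peak(t, area, mu, sigma):
    """Symmetric Gaussian peak with the given area, center, and width."""
    norm = area / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((t - mu) / sigma) ** 2)


def emg_peak(t, area, mu, sigma, tau):
    """Exponentially modified Gaussian: a Gaussian (mu, sigma) convolved
    with an exponential decay of time constant tau > 0 (tailing peak)."""
    z = (sigma / tau - (t - mu) / sigma) / math.sqrt(2.0)
    expo = math.exp(sigma ** 2 / (2.0 * tau ** 2) - (t - mu) / tau)
    return area / (2.0 * tau) * expo * math.erfc(z)


def trapezoid(f, lo, hi, n=6000):
    """Simple trapezoidal integration of f over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# Hypothetical peak: unit area, centered near t = 10, with tailing.
area = trapezoid(lambda t: emg_peak(t, 1.0, 10.0, 1.0, 2.0), 0.0, 60.0)
print(f"integrated EMG area = {area:.4f}")

# The EMG tails to the right, whereas the Gaussian is symmetric about mu.
print(emg_peak(13.0, 1.0, 10.0, 1.0, 2.0) > emg_peak(7.0, 1.0, 10.0, 1.0, 2.0))
```

A symmetric model such as the Gaussian cannot follow the tail of a skewed peak, which is one reason to compare several candidate shapes on the data.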
The Number of Aligned Chromatographic Peaks Detected in All 18 Spiked-in Samples Using the XICs Constructed by m/z Variation Window Approach and DBSCAN Approach
| frequency (%) | number of samples | Δm/z window | DBSCAN |
|---|---|---|---|
| 100 | 18 | 295 | 298 |
| 80–99 | 15, 16, 17 | 255 | 270 |
| 60–79 | 11, 12, 13, 14 | 206 | 223 |
Frequency is the number of samples in which a chromatographic peak was aligned divided by the total number of samples, expressed as a percentage.
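As a quick check of how the frequency bins in the table relate to sample counts, the arithmetic can be sketched as follows. The function name is hypothetical; the total of 18 samples and the bin boundaries come from the table.

```python
def frequency_percent(n_aligned, n_total=18):
    """Percentage of samples in which a peak was aligned."""
    return 100.0 * n_aligned / n_total


# Sample counts listed in the table, grouped by their frequency bin.
for n in (18, 17, 16, 15, 14, 13, 12, 11):
    print(n, round(frequency_percent(n), 1))
```

With 18 samples, counts of 15 to 17 give 83.3% to 94.4% (the 80–99 bin), and counts of 11 to 14 give 61.1% to 77.8% (the 60–79 bin), matching the rows above.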
Figure 6. Comparison of XIC construction using the DBSCAN and m/z variation window approaches: (A) XIC of a peak with an m/z value of 812.6157 constructed from sample S403 by DBSCAN; (B) XIC of the same peak constructed using an m/z variation window of 7 ppm; (C) four peaks fitted by the EMGM model using the XIC data presented in part A.