Rosa Alba Sola Martínez, José María Pastor Hernández, Gema Lozano Terol, Julia Gallego-Jara, Luis García-Marcos, Manuel Cánovas Díaz, Teresa de Diego Puente.
Abstract
The noninvasive diagnosis and monitoring of high-prevalence diseases such as cardiovascular diseases, cancers and chronic respiratory diseases are currently priority objectives in health care. In this regard, the analysis of volatile organic compounds (VOCs) has been identified as a potential noninvasive tool for the diagnosis and surveillance of several diseases. Despite its advantages, this strategy is not yet a routine clinical tool. One obstacle is the lack of reproducible protocols for each step of the biomarker discovery phase, and this issue is present in particular at the data preprocessing step. This paper therefore presents an open source workflow for preprocessing the data obtained from exhaled breath samples analyzed by gas chromatography coupled with single quadrupole mass spectrometry (GC/MS). The workflow connects two approaches to transform raw data into a matrix suitable for statistical analysis, and it also matches compounds from breath samples against a spectral library. Three free R packages (xcms, cliqueMS and eRah) are used for this purpose. The paper also presents a suitable protocol for collecting exhaled breath samples from infants under 2 years of age for GC/MS analysis.
Year: 2020 PMID: 33319832 PMCID: PMC7738550 DOI: 10.1038/s41598-020-79014-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Steps of the biomarker discovery phase conducted before data preprocessing.
Figure 2. Workflow for data preprocessing after exhaled breath analysis using TD-GC/q-MS.
Figure 3. Differences in retention time between the two sample groups. The graph shows the retention times of the 337 peaks detected at 91 m/z relative to toluene in the room air content samples (210 in Group 1 samples and 127 in Group 2 samples). Statistically significant differences between the retention times of the two groups were observed using the two-tailed Mann–Whitney U test (p value = 2.2e−16).
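The group comparison described in the Figure 3 caption can be illustrated with a minimal sketch of a two-tailed Mann–Whitney U test. This is not the authors' code: the function name `mann_whitney_u` is hypothetical, the p value uses the normal approximation with tie and continuity corrections, and the retention times are simulated placeholders (only the group sizes, 210 and 127, come from the caption).

```python
import math
import random


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p value) via the normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    # Pool the observations, remembering which group each one came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank for a run of ties
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every observation is identical
        return u_x, 1.0
    # Continuity-corrected z score; clamp at 0 so p never exceeds 1.
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p_two_sided = math.erfc(z / math.sqrt(2))
    return u_x, p_two_sided


if __name__ == "__main__":
    # Simulated retention times (made-up values, not data from the paper).
    rng = random.Random(0)
    group1 = [rng.gauss(1.000, 0.002) for _ in range(210)]
    group2 = [rng.gauss(1.004, 0.002) for _ in range(127)]
    u, p = mann_whitney_u(group1, group2)
    print(f"U = {u:.1f}, two-sided p = {p:.3g}")
```

A rank-based test such as this one makes no normality assumption about the retention times, which is one plausible reason for preferring it over a t test here.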
Results obtained by both approaches.
| Samples | Approach 1: No. of features (Group 1) | Approach 1: No. of features (Group 2) | Approach 1: No. of compounds (Group 1) | Approach 1: No. of compounds (Group 2) | Approach 2: No. of compounds (Group 1) | Approach 2: No. of compounds (Group 2) |
|---|---|---|---|---|---|---|
| Exhaled breath of mothers | 542 | 524 | 1613 | 1241 | 835 | 554 |
| Exhaled breath of children | 467 | 543 | 1242 | 1105 | 802 | 569 |
| Room air content | 867 | 937 | 2927 | 2223 | 913 | 610 |
Figure 4. Spectral alignment using eRah. (A) Elution profile of toluene in room air content samples from Group 1 and Group 2 before and after alignment. (B) Elution profile of limonene in room air content samples from Group 1 and Group 2 before and after alignment. The plots in this figure were generated with the plotAlign function of the eRah package in R.
Filtered features and filtered compounds.
| Samples | No. of filtered features (Group 1) | No. of filtered features (Group 2) | No. of filtered compounds (Group 1) | No. of filtered compounds (Group 2) |
|---|---|---|---|---|
| Exhaled breath of mothers | 494 | 452 | 194 | 150 |
| Exhaled breath of children | 377 | 483 | 155 | 158 |
| Room air content | 824 | 877 | 231 | 193 |
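As a worked illustration of how the two tables relate, the share of features retained after filtering can be computed directly from the reported counts. The sketch below assumes that the filtered features are a subset of the features reported for Approach 1, which the tables suggest but do not state; the helper name `retained_pct` is hypothetical.

```python
# Counts copied from the two tables, as (Group 1, Group 2).
features = {
    "Exhaled breath of mothers": (542, 524),
    "Exhaled breath of children": (467, 543),
    "Room air content": (867, 937),
}
filtered_features = {
    "Exhaled breath of mothers": (494, 452),
    "Exhaled breath of children": (377, 483),
    "Room air content": (824, 877),
}


def retained_pct(before, after):
    """Percentage of items that remain after filtering."""
    if before <= 0:
        raise ValueError("'before' must be a positive count")
    return 100.0 * after / before


for sample, counts in features.items():
    kept = filtered_features[sample]
    pcts = [retained_pct(b, a) for b, a in zip(counts, kept)]
    print(f"{sample}: Group 1 {pcts[0]:.1f}%, Group 2 {pcts[1]:.1f}%")
```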
Figure 5. Possible duplicate compounds. The percentages of each subset of possible duplicate compounds among the compounds detected by the first approach, the compounds detected by the second approach and the filtered compounds were compared using a two-tailed ANOVA test followed by Bonferroni post hoc tests (p value < 0.01 (**), p value < 0.0001 (****)).
Figure 6. Exploratory analysis of filtered features by principal component analysis (PCA). (A) PCA score plot (PC-1 vs. PC-2) of breath samples from Group 1. (B) PCA score plot (PC-1 vs. PC-2) of breath samples from Group 2.