Michalis Koureas, Dimitrios Kalompatsios, Grigoris D Amoutzias, Christos Hadjichristodoulou, Konstantinos Gourgoulianis, Andreas Tsakalof.
Abstract
The aim of the present study was to compare the efficiency of targeted and untargeted breath analysis in discriminating lung cancer (Ca+) patients from healthy controls (HC) and from patients with benign pulmonary diseases (Ca-). Exhaled breath samples from 49 Ca+ patients, 36 Ca- patients and 52 HC were analyzed by an SPME-GC-MS method. Untargeted treatment of the acquired data was performed with the web-based platform XCMS Online combined with manual reprocessing of raw chromatographic data. Machine learning methods were applied to estimate the efficiency of breath analysis in the classification of the participants.
Keywords: breath analysis; cancer biomarkers; exhaled breath; lung cancer; untargeted analysis; volatile organic compounds; volatolomics
Year: 2021 PMID: 33946997 PMCID: PMC8125376 DOI: 10.3390/molecules26092609
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Cloud plots with results of pairwise XCMS analysis between (a) Ca+ vs. HC. Detection settings: p-value < 0.01, fold change > 2, m/z range: 0–140, retention range: 0–14 min, max intensity > 10,000 and (b) Ca+ vs. Ca− characteristics (ion). Detection settings: p-value < 0.05, fold change > 1.1, m/z range: 0–140, retention range: 0–14 min, max intensity > 10,000.
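The detection settings quoted in the caption amount to a set of threshold filters on each detected ion feature. The following is a minimal sketch of such a filter, not the XCMS Online API: the `DetectionSettings` class, the `passes` helper, the field names and the example feature values are all hypothetical and serve only to illustrate how the two sets of thresholds differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionSettings:
    """Thresholds as quoted in the Figure 1 caption (names are hypothetical)."""
    max_p_value: float
    min_fold_change: float
    mz_range: tuple = (0.0, 140.0)        # m/z range: 0-140
    rt_range_min: tuple = (0.0, 14.0)     # retention range: 0-14 min
    min_max_intensity: float = 10_000.0   # max intensity > 10,000


# (a) Ca+ vs. HC and (b) Ca+ vs. Ca- settings from the caption
CA_VS_HC = DetectionSettings(max_p_value=0.01, min_fold_change=2.0)
CA_VS_BENIGN = DetectionSettings(max_p_value=0.05, min_fold_change=1.1)


def passes(feature: dict, s: DetectionSettings) -> bool:
    """Return True if a feature satisfies every threshold in `s`."""
    return (
        feature["p_value"] < s.max_p_value
        and feature["fold_change"] > s.min_fold_change
        and s.mz_range[0] <= feature["mz"] <= s.mz_range[1]
        and s.rt_range_min[0] <= feature["rt_min"] <= s.rt_range_min[1]
        and feature["max_intensity"] > s.min_max_intensity
    )


# Invented example feature, for illustration only
example = {"p_value": 0.02, "fold_change": 1.5, "mz": 91.0,
           "rt_min": 10.3, "max_intensity": 25_000.0}
print(passes(example, CA_VS_HC), passes(example, CA_VS_BENIGN))
```

With these invented values the feature is rejected under the stricter Ca+ vs. HC settings and retained under the more permissive Ca+ vs. Ca− settings.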
Figure 2. Flow chart of the process applied for selecting, identifying and processing informative compounds.
Identification of compounds based on spectra comparison with NIST library and retention time criteria.
| Candidate Compound | Probability (NIST), % | Match Score (NIST) | Retention Time, min | Retention Time Simulated 1, min | Deviations in Retention Time, % | Experimentally Determined Retention Index | Retention Index (NIST) 2 | Deviations in Retention Index, % |
|---|---|---|---|---|---|---|---|---|
| 3-methyl-furan | 86 | 892 | 5.03 | NA | NA | 615 | 602 | 2.16 |
| acetaldoxime | 53 | 753 | 5.38 | NA | NA | 625 | 606 | 3.14 |
| Benzene * | 72 | 923 | 7.17 | 7.73 | −7.81 | 677 | 647 | 4.64 |
| acetic acid | 59 | 912 | 7.86 | NA | NA | 698 | 650 | 7.38 |
| 1-methoxy-2-propanol | 69 | 891 | 8.27 | 7.83 | 5.32 | 711 | 658 | 8.05 |
| dimethyl furan | 78 | 852 | 8.33 | 8.66 | −3.96 | 714 | 694 | 2.88 |
| methyl propyl sulfide | 89 | 840 | 8.79 | NA | NA | 729 | 714 | 2.10 |
| 1-methylthio-(E)-1-propene ** | 90 | 877 | 9.57 | NA | NA | 756 | 722 | 4.71 |
| Toluene * | 34 | 868 | 10.31 | 10.95 | −6.21 | 782 | 750 | 4.27 |
| propionic acid | 78 | 702 | 10.59 | NA | NA | 792 | 712 | 11.24 |
| p-xylene ** | 81 | 845 | 12.00 | 12.1 | −0.83 | 859 | 833 | 3.12 |
| ethyl benzene * | 61 | 877 | 12.10 | 12.38 | −2.31 | 890 | 858 | 3.73 |
| Styrene * | 37 | 869 | 12.3 | 12.79 | −3.98 | 906 | 876 | 3.42 |
| methylacetamide | 78 | 831 | 12.79 | NA | NA | 959 | 825 | 16.24 |
| p-benzoquinone | 90 | 817 | 13.00 | NA | NA | 982 | 888 | 10.59 |
| N-2-Aminoethyl acetamide | 62 | 800 | 13.2 | NA | NA | 1005 | NA | |
| eucalyptol | 52 | 848 | 13.51 | NA | NA | 1060 | 1017 | 4.23 |
* Verified by analytical standard. ** NIST probability is given for all isomer compounds. Mass spectra were very similar for isomers of these compounds, so the compounds were identified based on RI similarities. 1 Retention time was simulated with Pro EZGC Chromatogram Modeler, Restek Corporation. 2 Retention indices were derived from the NIST database for a fully non-polar column (100% polydimethylsiloxane). NA: not available with equivalent column.
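The two deviation columns in the table above can be reproduced from the tabulated values. The sketch below rests on an assumption inferred from the numbers rather than stated in the text: the retention time deviation appears to be expressed relative to the experimental retention time, while the retention index deviation appears to be expressed relative to the NIST value. The function names are hypothetical.

```python
def rt_deviation_percent(experimental_rt: float, simulated_rt: float) -> float:
    """Signed retention time deviation, in % of the experimental value.

    Assumption: the denominator is the experimental retention time,
    which is what reproduces the tabulated figures.
    """
    return round((experimental_rt - simulated_rt) / experimental_rt * 100.0, 2)


def ri_deviation_percent(experimental_ri: float, nist_ri: float) -> float:
    """Retention index deviation, in % of the NIST reference value.

    Assumption: the denominator is the NIST retention index.
    """
    return round((experimental_ri - nist_ri) / nist_ri * 100.0, 2)


# Benzene row: RT 7.17 min (simulated 7.73 min), RI 677 (NIST 647)
print(rt_deviation_percent(7.17, 7.73))   # -7.81
print(ri_deviation_percent(677, 647))     # 4.64
```

The same functions reproduce the other rows, for example 2.16% for 3-methyl-furan (615 vs. 602) and 5.32% for the retention time of 1-methoxy-2-propanol (8.27 vs. 7.83 min).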
Comparative analysis of the areas of the 29 chromatographic peaks between patient groups and relative presence in ambient air.
| Compound | Relative Presence in Ambient Air 1 | Ca+/HC: Trend in LC Patients | Ca+/HC: Significance * | Ca+/Ca−: Trend in LC Patients | Ca+/Ca−: Significance * |
|---|---|---|---|---|---|
| unknown | insignificant | ↑ | 0.052 | ↑ | 0.311 |
| unknown | moderate | ↓ | 0.071 | ↓ | 0.056 |
| 3-methyl-furan * | low | ↓ | 0.514 | ↓ | 0.482 |
| acetaldoxime | high | ↓↓↓ | <0.001 | ↓ | 0.341 |
| unknown | moderate | ↓↓↓ | <0.001 | ↓ | 0.689 |
| unknown | low | ↑↑ | 0.01 | ↓↓ | 0.013 |
| benzene | moderate | ↓↓↓ | <0.001 | ↓ | 0.089 |
| unknown | moderate | ↓↓↓ | <0.001 | ↓ | 0.756 |
| acetic acid | low | ↓↓↓ | <0.001 | ↓ | 0.979 |
| 1-methoxy-2-propanol | high | ↓↓↓ | <0.001 | ↓ | 0.272 |
| dimethyl furan | low | ↓ | 0.125 | ↓ | 0.286 |
| unknown | moderate | ↑↑↑ | 0.002 | ↓ | 0.396 |
| unknown | moderate | ↓ | 0.902 | ↓ | 0.082 |
| methyl propyl sulfide | insignificant | ↓↓↓ | <0.001 | ↓↓ | 0.035 |
| 1-methylthio-(E)-1-propene | insignificant | ↓↓↓ | <0.001 | ↓ | 0.239 |
| unknown | insignificant | ↓↓↓ | <0.001 | ↓ | 0.185 |
| toluene | moderate | ↑↑↑ | 0.001 | ↑ | 0.986 |
| propionic acid | insignificant | ↓↓↓ | <0.001 | ↑ | 0.384 |
| unknown | high | ↑ | 0.053 | ↑ | 0.752 |
| unknown | moderate | ↓ | 0.124 | ↓ | 0.175 |
| ethylbenzene | moderate | ↑↑↑ | <0.001 | ↑ | 0.618 |
| xylene(p,o,m) | moderate | ↑↑↑ | <0.001 | ↑ | 0.434 |
| styrene | moderate | ↑↑↑ | <0.001 | ↑ | 0.423 |
| methylacetamide | high | ↓ | 0.178 | ↑ | 0.539 |
| p-benzoquinone | insignificant | ↓ | 0.076 | ↓ | 0.388 |
| N-2-Aminoethyl acetamide | moderate | ↓↓↓ | <0.001 | ↓ | 0.824 |
| unknown | moderate | ↓↓↓ | <0.001 | ↓ | 0.104 |
| eucalyptol | low | ↑ | 0.066 | ↑ | 0.511 |
| unknown | moderate | ↑ | 0.092 | ↑ | 0.463 |
1 Determined from mean breath/mean air ratio. Insignificant: >20, low: 5–20, moderate: 0.5–5, high: <0.5. * Significance determined by Mann–Whitney test. ↑, ↓: p > 0.05, ↑↑, ↓↓: p = 0.01–0.05, ↓↓↓, ↑↑↑: p < 0.01.
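The footnote's two coding schemes (ambient air presence from the breath/air ratio, and arrow count from the p-value) can be written out explicitly. This is a minimal sketch with hypothetical function names. The footnote does not say which category the boundary values 5 and 0.5 belong to, so their assignment here is an assumption; a p-value of exactly 0.01 is given two arrows, as in the table.

```python
def ambient_presence(breath_to_air_ratio: float) -> str:
    """Map the mean breath / mean air ratio to the footnote's categories."""
    if breath_to_air_ratio > 20:
        return "insignificant"
    if breath_to_air_ratio >= 5:      # boundary handling is an assumption
        return "low"
    if breath_to_air_ratio >= 0.5:    # boundary handling is an assumption
        return "moderate"
    return "high"


def trend_symbol(p_value: float, higher_in_cancer: bool) -> str:
    """Return the arrow notation used in the table for a trend and p-value."""
    arrow = "↑" if higher_in_cancer else "↓"
    if p_value < 0.01:
        return arrow * 3
    if p_value <= 0.05:
        return arrow * 2
    return arrow


print(ambient_presence(30.0), ambient_presence(1.2))
print(trend_symbol(0.002, True), trend_symbol(0.035, False), trend_symbol(0.311, True))
```

Note the inverse relationship in the first function: a high breath/air ratio means the compound's presence in ambient air is insignificant relative to breath.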
Results of machine learning methods (random forest) to estimate the discrimination efficiency of the breath analysis.
| Analysis no. | Approach | Variable | Comparison Groups | Smoking Habit | Features Used | Accuracy, % | AUC |
|---|---|---|---|---|---|---|---|
| 1 | targeted | Br | Ca+ vs. HC | All | t1–t19 | 85.14 | 0.95 |
| 2 | targeted | Br | Ca+ vs. HC | All | t4, t5, t7–t11, t13–t15, t18 | 89.10 | 0.97 |
| 3 | targeted | Br | Ca− vs. HC | All | t1–t19 | 86.36 | 0.91 |
| 4 | targeted | Br | Ca− vs. HC | All | t4, t5, t7–t17 | 88.63 | 0.94 |
| 5 | targeted | Br | Ca+ & Ca− vs. HC | All | t1–t19 | 86.70 | 0.96 |
| 6 | targeted | Br | Ca+ & Ca− vs. HC | All | t1, t4, t5, t7–t15, t17 | 90.50 | 0.96 |
| 7 | targeted | Br | Ca+ vs. Ca− | All | t1–t19 | 43.50 | 0.39 |
| 8 | targeted | Br | Ca+ vs. Ca− | All | t4, t9, t17 | 52.90 | 0.55 |
| 9 | untargeted | Br | Ca+ vs. HC | All | u1–u29 | 86.14 | 0.94 |
| 10 | untargeted | Br | Ca+ vs. HC | All | u4, u8, u12, u14, u16, u19, u28, u29 | 91.08 | 0.96 |
| 11 | untargeted | Br | Ca− vs. HC | All | u1–u29 | 89.77 | 0.94 |
| 12 | untargeted | Br | Ca− vs. HC | All | u4, u6, u8, u12, u26, u27, u29 | 94.3 | 0.97 |
| 13 | untargeted | Br | Ca+ & Ca− vs. HC | All | u1–u29 | 86.9 | 0.95 |
| 14 | untargeted | Br | Ca+ & Ca− vs. HC | All | u4, u8, u11, u12, u19, u22, u26, u27, u29 | 92 | 0.97 |
| 15 | untargeted | Br | Ca+ vs. Ca− | All | u1–u29 | 52.9 | 0.54 |
| 16 | untargeted | Br | Ca+ vs. Ca− | All | u4, u20, u26 | 75.3 | 0.82 |
| 17 | untargeted | Sbtr | Ca+ vs. Ca− | All | t1–t19, u1–u29 | 57.6 | 0.54 |
| 18 | untargeted | Sbtr | Ca+ vs. Ca− | All | u2, u4, u6, u11, u14, u25, u28, u29 | 71.76 | 0.78 |
| 19 | merged | Br | Ca+ vs. Ca− | All | u1–u29, t1–t19 | 44.7 | 0.44 |
| 20 | merged | Br | Ca+ vs. Ca− | All | t9, u4, u26 | 72.9 | 0.72 |
| 21 | untargeted | Br | Ca+ vs. Ca− | Non-smokers | u1–u29 | 59.4 | 0.57 |
| 22 | untargeted | Br | Ca+ vs. Ca− | Non-smokers | u4, u20, u26 | 72.5 | 0.68 |
| 23 | untargeted | Br | Ca+ vs. Ca− | Non-smokers | u4, u11, u13, u20, u26 | 76.8 | 0.85 |
Br: corresponds to breath compound levels, Sbtr: corresponds to breath subtract levels, Ca+: patients diagnosed with lung cancer, Ca−: patients with pathological CT findings not diagnosed with lung cancer by histological/cytological examination, HC: healthy controls.
Features from targeted analysis: t1: isoprene, t2: acetone, t3: 2-propanol, t4: hexane, t5: 1-propanol, t6: 2-butanone, t7: cyclohexane, t8: benzene, t9: thiophene, t10: 1-butanol, t11: toluene, t12: octane, t13: ethyl butyrate, t14: hexanal, t15: ethyl benzene, t16: styrene, t17: cyclohexanone, t18: octanal, t19: nonanal.
Features from untargeted analysis: u1: unknown, u2: unknown, u3: 3-methyl-furan, u4: acetaldoxime, u5: unknown, u6: unknown, u7: benzene, u8: unknown, u9: acetic acid, u10: 1-methoxy-2-propanol, u11: dimethyl furan, u12: unknown, u13: unknown, u14: 1-methylthio-(E)-1-propene, u15: allyl methyl sulfide, u16: unknown, u17: toluene, u18: propionic acid, u19: unknown, u20: unknown, u21: ethylbenzene, u22: p-xylene, u23: styrene, u24: methylacetamide, u25: p-benzoquinone, u26: N-2-aminoethyl acetamide, u27: unknown, u28: eucalyptol, u29: unknown.
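The two performance columns in the table above, accuracy and AUC, can be computed from a classifier's outputs as sketched below. This is not the authors' pipeline: the function names are hypothetical, the labels and scores are invented toy data, and the AUC is computed with the standard pairwise (Mann–Whitney) formulation rather than with any particular library.

```python
def accuracy_percent(y_true: list, y_pred: list) -> float:
    """Share of correctly classified samples, in percent."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def roc_auc(y_true: list, scores: list) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = Ca+, 0 = comparison group
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]            # hypothetical class probabilities
predicted = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption

print(round(accuracy_percent(labels, predicted), 2))  # 66.67
print(round(roc_auc(labels, scores), 3))              # 0.889
```

An AUC near 0.5, as in analyses 7, 15, 17 and 19, corresponds to scores that rank Ca+ and Ca− samples no better than chance, whereas the accuracy depends additionally on the chosen cut-off.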