Ping-Hsien Tsou1, Zong-Lin Lin2, Yu-Chiang Pan3, Hui-Chen Yang1, Chien-Jen Chang1, Sheng-Kai Liang1, Yueh-Feng Wen1, Chia-Hao Chang1, Lih-Yu Chang1, Kai-Lun Yu1, Chia-Jung Liu1, Li-Ta Keng1, Meng-Rui Lee1, Jen-Chung Ko1, Guan-Hua Huang2,3, Yaw-Kuen Li3,4.
Abstract
(1) Background: Lung cancer is silent in its early stages and fatal in its advanced stages. The current examinations for lung cancer are usually based on imaging. Conventional chest X-rays lack accuracy, and chest computed tomography (CT) is associated with radiation exposure and cost, limiting screening effectiveness. Breathomics, a noninvasive strategy, has recently been studied extensively. Volatile organic compounds (VOCs) derived from human breath can reflect metabolic changes caused by diseases and possibly serve as biomarkers of lung cancer.
Keywords: SIFT-MS; XGBoost; breath analysis; lung cancer; machine learning; volatile organic compounds
Year: 2021 PMID: 33801001 PMCID: PMC8003836 DOI: 10.3390/cancers13061431
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Summary of the noninvasive detection methods for lung cancer.
| Biomarkers/Specimen | Analytic Platform | Detection Target | Sensitivity (%) | Advantages | Deficiencies | Ref. |
|---|---|---|---|---|---|---|
| CTCs/Blood | IF; FISH | EpCAM, Size-based cells | 30.0–69.5 | Viable cell, high specificity, high throughput | Limited sensitivity; require enrichment; only detect advanced cancers | [ |
| Traditional Proteins/Blood | ECLIA | CEA, CYFRA 21-1 | 22–69 | Rapid and common | Limited sensitivity and specificity | [ |
| Novel Proteins/EBC, Saliva, Urine, Blood | Microarray; LC-MS/MS | CKAP4, exosomal proteins (NFX1, PKG1, GPC1) | 70.0–84.0 | Higher sensitivity; high throughput; rapid | Quantity required (MS); validation required | [ |
| microRNA/Blood | Microarray; RT-PCR; NGS | miRNAs-126, -145, -210 and -205-5p, -17, -190b, -19a, -19b, -26b, -375 | 80.0–91.5 | High throughput, stable | Specialized abilities and facilities are required | [ |
| Methylated DNA/Blood | NGS; PCR | HOXD10, PAX9, PTPRN2, STAG3, SHOX2 | 70.0–87.8 | High sensitivity and specificity | Require standardization | [ |
| ctDNA/Blood | NGS; Multiplex-PCR | Genetic mutation, SNVs | 48.0–59.0 | Target for precision medicine; early detection (~70 days prior to CT image) | Limited sensitivity, require expensive equipment | [ |
| VOCs/Exhaled Breath | E-Nose sensors; GC-MS; PTR-MS, IMS; LPPI-MS | propanol, isoprene, acetone, pentane, hexanal, toluene, benzene, ethylbenzene, and others | 81.0–96.5 | Rapid, simple, noninvasive; inexpensive | Require standardization | [ |
Abbreviations: CTCs (circulating tumor cells); IF (immunofluorescence); FISH (fluorescence in situ hybridization); EpCAM (epithelial cell adhesion molecule); ECLIA (electrochemiluminescence immunoassay); CEA (carcinoembryonic antigen); CYFRA 21-1 (cytokeratin fraction 21-1); EBC (exhaled breath condensate); NGS (next-generation sequencing); CT (computed tomography); RT-PCR (reverse transcription PCR); ctDNA (circulating tumor DNA); SNVs (single-nucleotide variants); GC-MS (gas chromatography mass spectrometry); PTR-MS (proton transfer reaction mass spectrometry); IMS (ion mobility spectrometry); LPPI-MS (low-pressure photoionization mass spectrometry); VOCs (volatile organic compounds).
Figure 1. The research flow chart. (a) Collecting the alveolar air. The breath was exhaled through the mouthpiece with a direct-connect three-way valve. In the first stage, the exhaled air flows through Exit 1. When the volume of the front portion of the exhaled air reaches 0.2 L, the three-way valve switches to Exit 2 and starts to collect the rest of the exhaled air, i.e., alveolar air, in a 1.0 L aluminum bag. The process of sample collection may be repeated 2–3 times to collect enough sample for analysis. (b) Delivery and analysis of an exhaled sample. Selected ion flow tube mass spectrometry (SIFT-MS) extracts the exhaled breath from an aluminum bag and analyzes the composition of volatile organic compounds (VOCs). The VOC data are used for model construction, machine learning, and prediction of lung cancer.
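The valve logic described in panel (a) can be sketched as a small simulation. This is only an illustration under stated assumptions: the 0.2 L front portion and the 1.0 L bag come from the caption, while the function name `collect_alveolar_air`, the per-exhalation volumes, and the rule that collection stops once the bag is full are hypothetical.

```python
DEAD_SPACE_L = 0.2    # front portion vented through Exit 1 (from the caption)
BAG_CAPACITY_L = 1.0  # aluminum bag volume (from the caption)


def collect_alveolar_air(exhalations_l,
                         dead_space_l=DEAD_SPACE_L,
                         bag_capacity_l=BAG_CAPACITY_L):
    """Return the volume (L) held in the bag after a series of exhalations.

    For each exhalation, the first `dead_space_l` litres leave through
    Exit 1; the remainder is routed to the bag through Exit 2.
    Assumption: collection stops as soon as the bag is full.
    """
    collected = 0.0
    for volume in exhalations_l:
        alveolar = max(0.0, volume - dead_space_l)  # nothing kept if breath is too short
        collected = min(bag_capacity_l, collected + alveolar)
        if collected >= bag_capacity_l:
            break
    return collected


# Hypothetical example: three exhalations of 0.5 L each.
print(round(collect_alveolar_air([0.5, 0.5, 0.5]), 3))
```

With these assumed volumes, each exhalation contributes roughly 0.3 L, which is consistent with the caption's note that collection may need to be repeated 2–3 times.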
Characteristics of patients with lung cancer and healthy volunteers of the study.
| Characteristic | Lung Cancer Patients (n = 148) | Healthy Controls (n = 168) |
|---|---|---|
| Age (years) * | ||
| Mean ± SD | 64.5 ± 11 | 31.4 ± 10.4 |
| Range | 37–90 | 20–74 |
| Sex, n (%) | ||
| Female | 75 (50.7) | 101 (60.1) |
| Male | 73 (49.3) | 67 (39.9) |
| Smoking status, n (%) | ||
| Current smoker | 9 (6) | 0 |
| Former smoker | 47 (31.2) | 1 |
| Nonsmoker | 92 (62.1) | 167 (99) |
| Lung cancer type, n (%) | - | |
| Adenocarcinoma | 108 (72.9) | |
| Squamous cell carcinoma | 17 (11.5) | |
| Small cell lung cancer | 14 (9.5) | |
| Other lung cancer | 8 (5.4) | |
| Targetable driver mutation, n (%) | ||
| EGFR | - | |
| Exon 19 deletion | 33 (22.3) | |
| Exon 21 point mutation | 30 (20.3) | |
| T790M | 6 (4.1) | |
| ALK | 7 (4.7) | |
| ROS1 | 3 (2.0) | |
| Wild type | 75 (50.7) | |
| PD-L1 expression, n (%) | ||
| >50% | 18 (12.1) | - |
| 1–49% | 57 (39.0) | |
| <1% | 29 (19.6) | |
| Clinical stage status, n (%) | ||
| IA and B | 4 (2.7) | - |
| IIA and B | 4 (2.7) | |
| IIIA | 8 (5.4) | |
| III B and C | 27 (18.2) | |
| IVA | 65 (43.9) | |
| IVB | 40 (27.0) |
* Significantly different between lung cancer patients and healthy controls at p-value < 0.05. † Significantly different between lung cancer patients and healthy controls at p-value < 0.1. Abbreviations: EGFR (epidermal growth factor receptor); ALK (anaplastic lymphoma kinase); ROS1 (ROS1 oncogene); NTUH (National Taiwan University Hospital Hsin-Chu Branch); NCTU (National Yang Ming Chiao Tung University).
The 116 VOCs selected for selected ion flow tube mass spectrometry analysis of breath samples in this study.
| No. | Compound | No. | Compound | No. | Compound | No. | Compound |
|---|---|---|---|---|---|---|---|
| 1 *,† | beta-caryophyllene (87-44-5) | 30 | 2-pentanone (107-87-9) | 59 | diethyl ether (60-29-7) | 88 † | 1,4-diaminobutane (110-60-1) |
| 2 | pyrrole (109-97-7) | 31 *,† | (E)-2-heptenal (18829-55-5) | 60 | isobutyl alcohol (78-83-1) | 89 | o-xylene (95-47-6) |
| 3 * | benzoic acid (65-85-0) | 32 † | 3-buten-2-one (78-94-4) | 61 † | 2-methylpentane (107-83-5) | 90 † | cyclopentane (287-92-3) |
| 4 *,† | 2,5-dimethylfuran (625-86-5) | 33 † | butanone (78-93-3) | 62 | methylcyclopentane (96-37-7) | 91 | propane (74-98-6) |
| 5 * | acetophenone (98-86-2) | 34 *,† | 1,5-diaminopentane (462-94-2) | 63 † | heptanal (111-71-7) | 92 | heptane (142-82-5) |
| 6 | pyridine (110-86-1) | 35 *,† | alpha-terpinene (99-86-5) | 64 | 1-butanol (71-36-3) | 93 | propanal (123-38-6) |
| 7 * | 2-methylpyrazine (109-08-0) | 36 * | 1-butyne (107-00-6) | 65 | 3-methyl-2-butenal (107-86-8) | 94 * | 2-propanol (67-63-0) |
| 8 † | tridecane (629-50-5) | 37 | 1-methyl-2-pyrrolidinone (872-50-4) | 66 † | pentanoic acid (109-52-4) | 95 *,† | cyclohexane (110-82-7) |
| 9 † | 2,5-dimethylpyrazine (123-32-0) | 38 † | diisopropyl ether (108-20-3) | 67 * | ethylbenzene (100-41-4) | 96 | ethane (74-84-0) |
| 10 † | 1,3-butadiene (106-99-0) | 39 | 2-pentanone new (107-87-9) | 68 *,† | 1-heptene (592-76-7) | 97 | carbon disulfide (75-15-0) |
| 11 *,† | dodecane (112-40-3) | 40 * | 1,2,4-trimethylbenzene (95-63-6) | 69 *,† | dimethyl sulfide (75-18-3) | 98 *,† | trimethylamine (75-50-3) |
| 12 | propyne (74-99-7) | 41 † | nonane (111-84-2) | 70 *,† | propanoic acid (79-09-4) | 99 | acetaldehyde (75-07-0) |
| 13 † | (E)-2-nonenal (18829-56-6) | 42 * | propylbenzene (103-65-1) | 71 | toluene (108-88-3) | 100 | dimethyl ether (115-10-6) |
| 14 * | 4-isopropyl toluene (99-87-6) | 43 | 3-butyn-2-ol new (2028-63-9) | 72 | p-xylene (106-42-3) | 101 † | acetic acid (64-19-7) |
| 15 † | 2-hexanone (591-78-6) | 44 *,† | cyclohexanone (108-94-1) | 73 † | 3-methylbutanal (590-86-3) | 102 | propene (115-07-1) |
| 16 | undecane (1120-21-4) | 45 *,† | ethylcyclohexane (1678-91-7) | 74 | butanal (123-72-8) | 103 | formaldehyde (50-00-0) |
| 17 * | benzaldehyde (100-52-7) | 46 † | 2-methylbutanal (96-17-3) | 75 | xylenes + ethylbenzene (1330-20-7) | 104 | furan (110-00-9) |
| 18 * | styrene (100-42-5) | 47 * | nonanal (124-19-6) | 76 * | isopropylamine (75-31-0) | 105 * | 1-propanol (71-23-8) |
| 19 *,† | eucalyptol (470-82-6) | 48 *,† | limonene (138-86-3; 7705-14-8) | 77 *,† | methyl acetate (79-20-9) | 106 † | isobutane (75-28-5) |
| 20 † | furfural (98-01-1) | 49 † | 2-pentene (109-68-2) | 78 *,† | 1-hexene (592-41-6) | 107 | isoprene (78-79-5) |
| 21 * | 1-pentanol (71-41-0) | 50 | decane (124-18-5) | 79 *,† | 1-butene (106-98-9) | 108 * | formic acid (64-18-6) |
| 22 *,† | butyl acetate (123-86-4) | 51 | methyl n-propyl sulfide (3877-15-4) | 80 † | pentanal (110-62-3) | 109 | pentane (109-66-0) |
| 23 * | octanal (124-13-0) | 52 † | 2-methylpropanal (78-84-2) | 81 | 1-methoxy-2-propanol (107-98-2) | 110 * | acetonitrile (75-05-8) |
| 24 * | 3-methyl-1-butanol (123-51-3) | 53 *,† | acetoin (513-86-0) | 82 | 2,3-butanediol (513-85-9; 513-89-3) | 111 * | ethanol (64-17-5) |
| 25 † | (E)-2-hexenal (6728-26-3) | 54 *,† | alpha-pinene (80-56-8; 2437-95-8) | 83 † | hexanal (66-25-1) | 112 † | hexane (110-54-3) |
| 26 † | 1,4-butyrolactone (96-48-0) | 55 * | acrylonitrile (107-13-1) | 84 *,† | acrolein (107-02-8) | 113 * | methanol (67-56-1) |
| 27 † | 6-methyl-5-hepten-2-one (110-93-0) | 56 *,† | ethyl acetate (141-78-6) | 85 † | acetic anhydride (108-24-7) | 114 * | acetone (67-64-1) |
| 28 | benzene (71-43-2) | 57 *,† | 2,3-butanedione (431-03-8) | 86 † | 3-methylpentane (96-14-0) | 115 * | butane (106-97-8) |
| 29 † | decanal (112-31-2) | 58 *,† | 2-methyl-2-propenal (78-85-3) | 87 *,† | octane (111-65-9) | 116 * | ethanedial (107-22-2) |
* VOCs that showed a significant difference between lung cancer patients and healthy controls in all three statistical hypothesis tests adopted. † VOCs whose concentration was not significantly different between the National Yang Ming Chiao Tung University (NCTU) and the National Taiwan University Hospital Hsin-Chu Branch (NTUH) in all statistical hypothesis tests adopted.
Figure 2. The heat map of 116 volatile organic compound (VOC) measurements (log concentrations) for 316 participants. In the color matrix at the center, each row represents a participant, and each column represents a single VOC with the diverging color scheme of red (high concentration) and blue (low concentration). VOCs and participants are clustered using the agglomerative hierarchical clustering method. In the color bar on the left, red indicates cancer patients, green indicates healthy volunteers of the National Yang Ming Chiao Tung University (NCTU), and blue indicates healthy volunteers of the National Taiwan University Hospital Hsin-Chu Branch (NTUH). The two color bars at the top represent the significance of VOCs. The first bar indicates whether VOCs showed significant differences between lung cancer patients and healthy controls (brown for significant and black for nonsignificant). The second bar indicates whether VOCs were significantly different between NCTU and NTUH (green for significantly different and blue for not significantly different).
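The clustering step mentioned in the caption can be illustrated with a minimal pure-Python sketch. The caption states only that agglomerative hierarchical clustering was applied to log concentrations; the average linkage, the Euclidean distance, the natural logarithm, the function names, and the toy values below are all assumptions made for illustration, not the authors' settings or data.

```python
import math


def log_transform(rows):
    """Natural-log transform of strictly positive concentrations."""
    return [[math.log(v) for v in row] for row in rows]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage).

    Starts with one cluster per point and repeatedly merges the two
    clusters with the smallest mean pairwise distance until
    `n_clusters` remain. Returns a list of sorted index lists.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(euclidean(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical concentrations (arbitrary units): 4 participants x 3 VOCs.
toy = [
    [1.0, 2.0, 1.5],
    [1.1, 2.1, 1.4],
    [50.0, 80.0, 60.0],
    [55.0, 75.0, 65.0],
]
groups = agglomerative(log_transform(toy), n_clusters=2)
print(groups)
```

In a heat map such as Figure 2, the same procedure would be applied to rows (participants) and, after transposing the matrix, to columns (VOCs) to obtain the two orderings.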
Summary of various algorithms applied to lung cancer diagnosis using multiple VOCs.
| Algorithms | Analytical Platform | Patients with Cancer No. | Analyzed VOC No. | Sensitivity % | Specificity % | AUC | Reference/(Year) |
|---|---|---|---|---|---|---|---|
| Stepwise Discriminant Analysis | GC-MS | 67 | 9 | 85.1 | 80.5 | NR | [ |
| Logistic Regression | GC-MS | 193 | 16 | 84.6 | 80.0 | 0.88 | [ |
| Weighted Digital Sum Discriminator | GC-MS | 193 | 30 | 84.5 | 81 | 0.9 | [ |
| Support Vector Machine | GC-MS | 107 | 5 | 95 | 89 | NR * | [ |
| Artificial Neural Networks | GC-MS | 108 | 88 | 86.36 | 86.36 | 0.86 | [ |
| K-nearest Neighbor | GC-MS | 325 | NR | NR | NR | 0.63 † | [ |
| Extreme Gradient Boosting | SIFT-MS | 148 | 116 | 82 | 94 | 0.95 | This work (considering only participants' VOCs) |
| | | | | 96 | 88 | 0.98 | This work (considering both participants' VOCs and environmental VOCs) |
Abbreviations: AUC, area under the curve; GC-MS, gas chromatography-mass spectrometry; NR, not reported; SIFT-MS, selected ion flow tube mass spectrometry. * Accuracy: 89%. † Classification of adenocarcinoma versus squamous cell carcinoma patients.
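For readers comparing the columns of the table above, the following sketch shows how sensitivity, specificity, and AUC are conventionally computed from binary labels and classifier scores. The labels, scores, and the 0.5 decision threshold are hypothetical and are not taken from any of the cited studies; the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half).

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = cancer, 0 = control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: three patients, three controls.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={area:.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes performance across all thresholds, which is one reason the two rows for this work can trade sensitivity against specificity while both report a high AUC.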