Inese Polaka1, Manohar Prasad Bhandari1, Linda Mezmale1,2,3, Linda Anarkulova1,4,5, Viktors Veliks1, Armands Sivins1,2, Anna Marija Lescinska1,2,3, Ivars Tolmanis6,7, Ilona Vilkoite6,8,9, Igors Ivanovs2,3, Marta Padilla10, Jan Mitrovics10, Gidi Shani11, Hossam Haick11, Marcis Leja1,6.
Abstract
BACKGROUND: Gastric cancer is one of the deadliest malignant diseases, and non-invasive options for its screening and diagnosis are limited. In this article, we present a multi-modular device for breath analysis coupled with a machine learning approach that detects cancer-specific breath from the shapes of sensor response curves (taxonomies of clusters).
Keywords: breath analysis; electronic nose; gastric cancer; machine learning; screening
Year: 2022 PMID: 35204584 PMCID: PMC8871298 DOI: 10.3390/diagnostics12020491
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Figure 1. The point-of-care device used in the study: (a) the design of the device, with a disposable mouthpiece inserted at the front; (b) the main blocks of the system.
Figure 2. Diagram of the data analysis process.
Figure 3. An example of the data after preprocessing: (a) common features describing the curve; (b) curves of a GNP sensor; (c) curves of an analogue MOX sensor; (d) curves of a digital MOX sensor.
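The scalar curve features named in the results tables (minimum, average, maximum, average of the last 10 time points, area under the curve) can be illustrated with a minimal sketch. The function name `curve_features`, the uniform sample spacing `dt`, the trapezoidal rule for the area, and the synthetic curve are all assumptions made for illustration; the source does not state how the authors computed these values.

```python
from statistics import fmean


def curve_features(response, dt=1.0, tail=10):
    """Summarize one sensor response curve with five scalar features.

    `response` is a list of readings; `dt` (assumed uniform sample spacing)
    and `tail` (number of final points to average) are illustrative parameters.
    """
    if len(response) < 2:
        raise ValueError("need at least two samples")
    # Area under the curve by the trapezoidal rule (an assumed choice).
    auc = sum((a + b) / 2.0 * dt for a, b in zip(response, response[1:]))
    return {
        "minimum": min(response),
        "average": fmean(response),
        "maximum": max(response),
        "tail_average": fmean(response[-tail:]),
        "auc": auc,
    }


# Synthetic curve for illustration only (not study data).
curve = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 10.0]
features = curve_features(curve)
```

Each feature reduces a whole curve to one number, which is why such features cannot distinguish two curves that reach the same extremes along different shapes.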
Figure 4. Cluster taxonomy of the responses from one gold nanoparticle sensor.
Classification results (with 95% CI) using Naïve Bayes classifiers.
| Feature | Overall Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| Minimum | 72.18% (71.49–72.87%) | 46.9% (45.39–48.41%) | 85.51% (84.76–86.26%) |
| Average | 74.21% (73.5–74.91%) | 51.85% (50.35–53.34%) | 86.02% (85.27–86.76%) |
| Maximum | 73.7% (72.96–74.44%) | 53.44% (51.94–54.94%) | 84.38% (83.6–85.16%) |
| Average of the last 10 time points | 73.74% (73.02–74.45%) | 53.00% (51.51–54.49%) | 84.67% (83.9–85.44%) |
| Area under the curve | 73.75% (73.04–74.47%) | 50.77% (49.28–52.26%) | 85.88% (85.13–86.64%) |
| Cluster (DTWARP distance, Ward linkage, InfoGain) | 77.81% (77.15–78.48%) | 64.05% (62.66–65.44%) | 85.04% (84.29–85.78%) |
| Cluster (Euclidean distance, Ward linkage, Symm.Unc.) | 77.1% (76.41–77.79%) | 66.54% (65.21–67.87%) | 82.64% (81.83–83.45%) |
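For readers comparing the three columns of the table, the following sketch shows the standard definitions of overall accuracy, sensitivity, and specificity from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the study; they are chosen only to show how accuracy can sit between a lower sensitivity and a higher specificity when controls outnumber cases.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion counts.

    tp/fn count cases (true positives, false negatives);
    tn/fp count controls (true negatives, false positives).
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one case and one control")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,   # share of cases detected
        "specificity": tn / negatives,   # share of controls correctly cleared
    }


# Hypothetical counts for illustration only (not study data).
metrics = classification_metrics(tp=32, fn=18, tn=85, fp=15)
```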
Figure 5. The overall accuracy of Naïve Bayes classifiers: mean values and 95% confidence intervals.
Figure 6. Sensitivity of Naïve Bayes classifiers: mean values and 95% confidence intervals.
Figure 7. Specificity of Naïve Bayes classifiers: mean values and 95% confidence intervals.
Figure 8. The area under the ROC curve of Naïve Bayes classifiers: mean values and 95% confidence intervals.
Figure 9. An example of the characteristic shapes used in a Naïve Bayes model: (a) a taxonomy for GNP sensor responses cut at six clusters; (b,c) taxonomies of two other GNP sensors cut at 10 and four clusters; (d) one MOXD sensor cut at five clusters. The dashed lines show individual measurements, and the solid bold lines show the cluster-characteristic shapes.
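The idea of cutting a cluster taxonomy at a fixed number of clusters can be sketched with a small agglomerative clustering routine. This is an illustration under stated assumptions, not the authors' implementation: it uses Euclidean distance with complete linkage (one of the configurations named in the results tables, which also list DTW distance and Ward linkage), it takes the pointwise mean of each cluster as the characteristic shape (the source does not say how that shape is derived), and the function names and curves are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long curves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage_clusters(curves, k):
    """Merge clusters bottom-up until k remain; return lists of curve indices."""
    if not 1 <= k <= len(curves):
        raise ValueError("k must be between 1 and the number of curves")
    clusters = [[i] for i in range(len(curves))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the two farthest members.
                d = max(euclidean(curves[p], curves[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def mean_shape(curves, members):
    """Pointwise mean of the member curves (assumed characteristic shape)."""
    n = len(members)
    return [sum(curves[m][t] for m in members) / n
            for t in range(len(curves[0]))]


# Synthetic curves for illustration only (not study data).
curves = [
    [0.0, 1.0, 2.0, 3.0],
    [0.0, 1.1, 2.1, 3.1],
    [3.0, 2.0, 1.0, 0.0],
    [3.1, 2.1, 1.1, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
clusters = complete_linkage_clusters(curves, k=3)
shapes = [mean_shape(curves, c) for c in clusters]
```

The cluster index assigned to a measurement can then serve as a categorical feature for a classifier, in place of a single scalar summary of the curve.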
Classification results using Random Forests.
| Feature | Overall Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| Minimum | 69.89% (69.19–70.59%) | 45.3% (43.87–46.73%) | 82.93% (82.12–83.74%) |
| Average | 70.58% (69.86–71.29%) | 47% (45.56–48.45%) | 83.13% (82.3–83.95%) |
| Maximum | 70.33% (69.66–71.01%) | 43.23% (41.82–44.64%) | 84.72% (83.95–85.49%) |
| Average of the last 10 time points | 70.97% (70.25–71.69%) | 48.79% (47.32–50.25%) | 82.78% (81.96–83.6%) |
| Area under the curve | 70.51% (69.79–71.23%) | 46.54% (45.1–47.99%) | 83.27% (82.44–84.1%) |
| Cluster (DTWARP distance, Ward linkage, ReliefF) | 75.01% (74.39–75.63%) | 46.52% (45.13–47.91%) | 90.12% (89.5–90.74%) |
| Cluster (Euclidean distance, complete linkage, ReliefF) | 74.51% (73.90–75.13%) | 46.04% (44.68–47.41%) | 89.66% (89.01–90.30%) |
Figure 10. The overall accuracy of Random Forests: mean values and 95% confidence intervals.
Classification results using SVMs.
| Feature | Overall Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| Minimum | 73.84% (73.23–74.45%) | 41.31% (39.99–42.64%) | 91.14% (90.51–91.78%) |
| Maximum | 73.45% (72.79–74.11%) | 45.72% (44.31–47.14%) | 88.2% (87.53–88.88%) |
| Average | 74.26% (73.64–74.87%) | 43.16% (41.77–44.55%) | 90.74% (90.12–91.37%) |
| Average of the last 10 time points | 75.1% (74.47–75.74%) | 48.33% (46.94–49.71%) | 89.27% (88.62–89.92%) |
| Area under the curve | 72.75% (72.13–73.37%) | 40.86% (39.48–42.25%) | 89.68% (89.02–90.33%) |
| Cluster (DTWARP distance, complete linkage, InfoGain) | 74.87% (74.16–75.59%) | 60.73% (59.31–62.15%) | 82.48% (81.66–83.30%) |
| Cluster (Euclidean distance, Ward linkage, InfoGain) | 73.86% (73.07–74.65%) | 61.05% (59.52–62.58%) | 80.72% (79.84–81.6%) |
Figure 11. The overall accuracy of SVMs: mean values and 95% confidence intervals.