Mikolaj Wieczorek1, Alexander Weston1, Matthew Ledenko2, Jonathan Nelson Thomas2, Rickey Carter1, Tushar Patel2.
Abstract
Liver diseases such as cirrhosis are known to alter the composition of volatile organic compounds (VOCs) in patient breath samples. Previous studies have demonstrated that liver cirrhosis can be diagnosed from breath samples, but they were limited to a handful of discrete, well-characterized compounds. We used VOC profiles from breath samples of 46 individuals: 35 with cirrhosis and 11 healthy controls. A deep neural network was optimized to discriminate between healthy controls and individuals with cirrhosis. A 1D convolutional neural network (CNN) predicted which patients had cirrhosis with an AUC of 0.90 (95% CI: 0.75, 0.99). Shapley Additive Explanations (SHAP) characterized discrete, observable peaks that were implicated in prediction, and the top peaks (based on the average SHAP profiles on the test dataset) were noted. CNNs can predict the presence of cirrhosis from a full volatolomics profile of patient breath samples. SHAP values indicate the presence of discrete, detectable peaks in the VOC signal.
Keywords: breath; cirrhosis; deep learning; prediction; volatile organic compound
Year: 2022 PMID: 36250077 PMCID: PMC9556819 DOI: 10.3389/fmed.2022.992703
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
FIGURE 1. Flow-chart of sample collection and analysis.
FIGURE 2. Data partition using the group fourfold cross-validation (CV) method. Within the training data, each split represents one independently trained model. Models were evaluated on a hold-out test dataset of 22 patients (75 samples).
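The grouped partition described in this caption can be illustrated with a minimal sketch. It assumes that "group" means the patient, so all technical replicates from one patient stay on the same side of every split; the function name `group_kfold`, the greedy fold balancing, and the toy patient IDs are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

def group_kfold(groups, n_splits=4):
    """Yield (train_idx, val_idx) pairs in which no group appears on both sides."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    if len(by_group) < n_splits:
        raise ValueError("need at least as many groups as folds")
    # Greedy balancing (an assumption): place the largest groups first,
    # always into the fold that currently holds the fewest samples.
    folds = [[] for _ in range(n_splits)]
    for g in sorted(by_group, key=lambda g: (-len(by_group[g]), str(g))):
        smallest = min(range(n_splits), key=lambda k: len(folds[k]))
        folds[smallest].extend(by_group[g])
    for k in range(n_splits):
        val_idx = sorted(folds[k])
        train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train_idx, val_idx

# Hypothetical data: 8 patients with 3 breath-sample replicates each.
patient_ids = [f"p{i}" for i in range(8) for _ in range(3)]
splits = list(group_kfold(patient_ids, n_splits=4))
for fold, (train_idx, val_idx) in enumerate(splits, start=1):
    val_patients = sorted({patient_ids[i] for i in val_idx})
    print(f"CV{fold}: {len(train_idx)} train samples, validation patients {val_patients}")
```

Splitting by patient rather than by sample keeps replicates of the same breath profile from leaking between training and validation data.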
FIGURE 3. Diagram of custom CNN model architecture.
Comparison of demographics between healthy and cirrhosis patients.
| | Median (minimum, maximum) or No. (%) of patients | | |
| Characteristic | Disease (n = 35) | Healthy (n = 11) | P-value |
| Sex | | | 1.00 |
| Female | 17 (48.6%) | 6 (54.5%) | |
| Male | 18 (51.4%) | 5 (45.5%) | |
| Age (years) | 61.0 (33.0, 76.0) | 45.0 (24.0, 60.0) | |
| Age group (years) | | | |
| (20, 50) | 7 (20.0%) | 8 (72.7%) | |
| (50, 80) | 28 (80.0%) | 3 (27.3%) | |
| Body mass index (kg/m2) | 30.2 (20.2, 41.3) | 27.6 (21.0, 41.8) | 0.42 |
| Body mass index (categorical) | | | 0.20 |
| Healthy weight (18.5–24.9) | 5 (14.3%) | 2 (18.2%) | |
| Overweight (25.0–29.9) | 10 (28.6%) | 6 (54.5%) | |
| Obesity (>30.0) | 20 (57.1%) | 3 (27.3%) | |
P-values result from a Wilcoxon rank sum test (continuous variables) or Fisher’s exact test (categorical variables). Bold values denote statistical significance at the p < 0.05 level.
Comparison of characteristics across disease stage for the cirrhosis study population.
| | Median (minimum, maximum) or No. (%) of patients | | | |
| Characteristic | Cirrhosis stage I, compensated (n = 13) | Cirrhosis stage II, compensated (n = 12) | Cirrhosis stage III, decompensated (n = 10) | P-value |
| Ascites | 0 (0.0%) | 0 (0.0%) | 10 (100.0%) | |
| Varices | 0 (0.0%) | 12 (100.0%) | 8 (80.0%) | |
| Platelets | 185.0 (123.0, 272.0) | 92.5 (44.0, 279.0) | 83.0 (36.0, 238.0) | |
| MELD | 8.0 (6.0, 20.0) | 10.0 (7.0, 19.0) | 13.0 (7.0, 28.0) | |
| APRI | 0.4 (0.2, 1.1) | 0.8 (0.2, 3.5) | 0.9 (0.3, 3.5) | 0.16 |
| FIB4 | 2.4 (0.6, 4.2) | 3.7 (1.2, 10.7) | 6.0 (1.3, 14.8) | |
| Etiology | | | | 0.13 |
| Non-alcoholic steatohepatitis (NASH) | 10 (76.9%) | 8 (66.7%) | 3 (30.0%) | |
| Alcoholic liver cirrhosis (ALC) | 0 (0.0%) | 2 (16.7%) | 0 (0.0%) | |
| Hepatitis C Virus (HCV) | 1 (7.7%) | 1 (8.3%) | 1 (10.0%) | |
| HCV + ALC | 0 (0.0%) | 0 (0.0%) | 2 (20.0%) | |
| Primary sclerosing cholangitis 2 (PSC 2) | 2 (15.4%) | 1 (8.3%) | 3 (30.0%) | |
| Hemochromatosis | 0 (0.0%) | 0 (0.0%) | 1 (10.0%) | |
P-values result from a Kruskal-Wallis rank sum test (continuous variables) or Fisher’s exact test (categorical variables). MELD, model for end-stage liver disease; APRI, aspartate aminotransferase to platelet ratio index; FIB4, Fibrosis-4 index for liver fibrosis. Bold values denote statistical significance at the p < 0.05 level.
FIGURE 4. (A) Receiver operating characteristic (ROC) curves of the final models at the sample level. Area under the ROC curve (AUC) is annotated for each model. The ensemble's confusion matrix heatmaps at the sample (B) and patient (C) levels summarize the frequency of True Positives (TP), False Negatives (FN), True Negatives (TN), and False Positives (FP).
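As a rough illustration of the two quantities in this caption, the sketch below computes an AUC as the fraction of (positive, negative) pairs ranked correctly and tallies TP, FN, TN, and FP at a fixed threshold. The labels and scores are made-up toy values, cirrhosis is assumed to be the positive class (label 1), and treating a score equal to the threshold as positive is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_counts(labels, scores, threshold=0.5):
    """Return (TP, FN, TN, FP); scores >= threshold are called positive (assumption)."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp, fn, tn, fp

# Toy example: 1 = cirrhosis, 0 = healthy control (hypothetical values).
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2]
auc = roc_auc(labels, scores)
counts = confusion_counts(labels, scores)
print(f"AUC = {auc:.3f}, (TP, FN, TN, FP) = {counts}")
```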
Model performance metrics at sample and patient levels at the 0.5 threshold.
| Model | AUC | Accuracy | Sensitivity | Specificity | PPV | NPV | F1 score |
| Sample level | | | | | | | |
| Ensemble | 0.899 | 86.7% 65/75 | 100.0% 59/59 | 37.5% 6/16 | 85.5% 59/69 | 100.0% 6/6 | 92.2 |
| CV1 | 0.800 | 86.7% 65/75 | 100.0% 59/59 | 37.5% 6/16 | 85.5% 59/69 | 100.0% 6/6 | 92.2 |
| CV2 | 0.890 | 85.3% 64/75 | 93.2% 55/59 | 56.2% 9/16 | 88.7% 55/62 | 69.2% 9/13 | 90.9 |
| CV3 | 0.771 | 85.3% 64/75 | 98.3% 58/59 | 37.5% 6/16 | 85.3% 58/68 | 85.7% 6/7 | 91.3 |
| CV4 | 0.682 | 81.3% 61/75 | 93.2% 55/59 | 37.5% 6/16 | 84.6% 55/65 | 60.0% 6/10 | 88.7 |
| Patient level | | | | | | | |
| Ensemble | 0.894 (0.751, 1.000) | 86.4% (65.1%, 97.1%) 19/22 | 100.0% (80.5%, 100.0%) 17/17 | 40.0% (5.3%, 85.3%) 2/5 | 85.0% (62.1%, 96.8%) 17/20 | 100.0% (15.8%, 100.0%) 2/2 | 91.9 |
| CV1 | 0.824 (0.627, 1.000) | 86.4% (65.1%, 97.1%) 19/22 | 100.0% (80.5%, 100.0%) 17/17 | 40.0% (5.3%, 85.3%) 2/5 | 85.0% (62.1%, 96.8%) 17/20 | 100.0% (15.8%, 100.0%) 2/2 | 91.9 |
| CV2 | 0.882 (0.691, 1.000) | 81.8% (59.7%, 94.8%) 18/22 | 88.2% (63.6%, 98.5%) 15/17 | 60.0% (14.7%, 94.7%) 3/5 | 88.2% (63.6%, 98.5%) 15/17 | 60.0% (14.7%, 94.7%) 3/5 | 88.2 |
| CV3 | 0.800 (0.486, 1.000) | 86.4% (65.1%, 97.1%) 19/22 | 100.0% (80.5%, 100.0%) 17/17 | 40.0% (5.3%, 85.3%) 2/5 | 85.0% (62.1%, 96.8%) 17/20 | 100.0% (15.8%, 100.0%) 2/2 | 91.9 |
| CV4 | 0.682 (0.371, 0.994) | 81.8% (59.7%, 94.8%) 18/22 | 94.1% (71.3%, 99.9%) 16/17 | 40.0% (5.3%, 85.3%) 2/5 | 84.2% (60.4%, 96.6%) 16/19 | 66.7% (9.4%, 99.2%) 2/3 | 88.9 |
95% confidence intervals are reported at the patient level only; clustering of technical replicates precluded calculation of the exact confidence interval at the sample level. PPV, positive predictive value; NPV, negative predictive value.
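The patient-level entries can be reproduced from confusion-matrix counts, as in the following sketch. It assumes that the "exact" interval is the Clopper-Pearson binomial interval (this matches the reported 80.5% lower bound for 17/17), and it solves for the bounds by bisection rather than with a statistics library; the function names are illustrative.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion k/n."""
    def solve(f, target):
        # f is decreasing in p; bisect for the p where f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    # Lower bound: P(X >= k | p) = alpha/2, i.e. cdf(k - 1) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

def metrics(tp, fn, tn, fp):
    """Threshold-based metrics; cirrhosis is taken as the positive class."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Ensemble, patient level, read off the table: 17/17 sensitivity, 2/5 specificity.
m = metrics(tp=17, fn=0, tn=2, fp=3)
sens_ci = clopper_pearson(17, 17)
spec_ci = clopper_pearson(2, 5)
print({name: round(100 * value, 1) for name, value in m.items()})
print("sensitivity CI:", sens_ci, "specificity CI:", spec_ci)
```

With only five healthy controls in the test set, the specificity interval spans most of the unit range, which is why the point estimate of 40.0% should be read with caution.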
FIGURE 5. Distribution of the ensemble model's predicted probabilities for healthy vs. disease classifications, stratified by the true stage of cirrhosis. Ground truth labels of healthy (red) and disease (blue) are displayed. The y-axis shows the probability values of the model output. Model performance is reported at the sample level (A) and at the patient level (B), where sample probabilities are aggregated by their median.
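The patient-level aggregation mentioned in this caption can be sketched as follows: the probabilities of all samples from one patient are reduced to their median, and that median is compared with the 0.5 threshold. The patient IDs and probabilities are hypothetical, and calling a median equal to the threshold "disease" is an assumption.

```python
from collections import defaultdict
from statistics import median

def aggregate_by_patient(patient_ids, probs, threshold=0.5):
    """Map each patient to (median probability, predicted label)."""
    by_patient = defaultdict(list)
    for pid, p in zip(patient_ids, probs):
        by_patient[pid].append(p)
    result = {}
    for pid, values in by_patient.items():
        med = median(values)
        # Assumption: a median at or above the threshold is called disease (1).
        result[pid] = (med, int(med >= threshold))
    return result

# Hypothetical sample-level outputs for two patients.
patient_ids = ["a", "a", "a", "b", "b"]
probs = [0.875, 0.375, 0.75, 0.25, 0.5]
patient_level = aggregate_by_patient(patient_ids, probs)
print(patient_level)
```

Using the median rather than the mean makes the patient-level call less sensitive to a single outlying replicate.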
FIGURE 6. Beeswarm summary plots on train and test data. This plot combines feature importance and feature effects. Every feature (VOC) is represented as a row on the y-axis (3,400 total) and SHAP values are on the x-axis (multiple VOCs may overlap at a single index). Each dot represents a Shapley value for a given sample prediction. The color intensity shows the magnitude of importance of each feature.
FIGURE 7. Patient breath samples with overlaid heatmaps that identify the 5 most important peaks from each CV model (up to 20 peaks total) in the classification of liver cirrhosis for a healthy control (A) and 3 individuals with stage I (B), stage II (C), and stage III (D) cirrhosis, respectively. Compounds are represented by indices on the y-axis and VOC signal value is on the x-axis; darker shading indicates that the feature was selected by multiple CV models.
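The peak selection behind this figure can be sketched as below: each CV model contributes its top-ranked features, and a count of how many models selected each feature index corresponds to the shading depth. Ranking by mean absolute SHAP value is an assumption (the source says only that top peaks were based on average SHAP profiles), and the tiny arrays are toy values with k = 2 instead of 5.

```python
from collections import Counter

def top_k_features(shap_values, k=5):
    """Rank features by mean |SHAP| over samples (rows) and return the top k indices."""
    n_samples = len(shap_values)
    n_features = len(shap_values[0])
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n_samples
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: -mean_abs[j])[:k]

def selection_counts(per_model_shap, k=5):
    """Count how many CV models place each feature index among their top k."""
    counts = Counter()
    for shap_values in per_model_shap:
        counts.update(top_k_features(shap_values, k))
    return counts

# Toy SHAP matrices (samples x features) for two hypothetical CV models.
model_a = [[0.5, -0.1, 0.0, 0.2], [0.3, 0.1, 0.0, -0.4]]
model_b = [[0.0, 0.6, 0.0, 0.1], [0.0, -0.2, 0.1, 0.5]]
counts = selection_counts([model_a, model_b], k=2)
print(counts)  # a higher count corresponds to darker shading
```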