Mohit Pandey, Zhuoran Xu, Evan Sholle, Gabriel Maliakal, Gurpreet Singh, Zahra Fatima, Daria Larine, Benjamin C Lee, Jing Wang, Alexander R van Rosendael, Lohendran Baskaran, Leslee J Shaw, James K Min, Subhi J Al'Aref.
Abstract
BACKGROUND: Heart failure (HF) is a major cause of morbidity and mortality. However, much of the clinical data is unstructured in the form of radiology reports, while the process of data collection and curation is arduous and time-consuming.
Year: 2020 PMID: 32730362 PMCID: PMC7392233 DOI: 10.1371/journal.pone.0236827
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Overall study design and workflow.
11,808 thoracoabdominal CT reports from 11,808 patients were included in the final analysis.
Prevalence of radiographic findings between the ground truth cohort (manually annotated reports) and the Clever-Heart cohort.
For the ground truth corpus, the prevalence percentage is calculated based on known, manually annotated ground truths. For Clever-Heart, values are based on predictions from the CNN model.
| No. | Radiographic finding | Ground truth CT reports: count | Ground truth CT reports: prevalence (%) | Clever-Heart (as predicted): count | Clever-Heart (as predicted): prevalence (%) |
|---|---|---|---|---|---|
| 1 | Aortic Aneurysm | 120 | 7.69 | 754 | 6.38 |
| 2 | Ascites | 299 | 19.16 | 1516 | 12.83 |
| 3 | Atelectasis | 691 | 44.29 | 5625 | 47.63 |
| 4 | Atherosclerosis | 620 | 39.74 | 5639 | 47.75 |
| 5 | Cardiomegaly | 390 | 25.00 | 3374 | 28.57 |
| 6 | Enlarged Liver | 86 | 5.50 | 162 | 1.37 |
| 7 | GB Thickening | 25 | 1.60 | 9 | 0.07 |
| 8 | Hernia | 366 | 23.46 | 2594 | 21.96 |
| 9 | Hydronephrosis | 65 | 4.16 | 136 | 1.15 |
| 10 | Lymphadenopathy | 484 | 31.02 | 2593 | 21.95 |
| 11 | Pleural Effusion | 673 | 43.14 | 4823 | 40.84 |
| 12 | Pneumonia | 278 | 17.82 | 1678 | 14.21 |
| 13 | Previous Surgery | 778 | 49.87 | 4719 | 39.96 |
| 14 | Pulmonary Edema | 94 | 6.02 | 516 | 4.36 |
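The Clever-Heart prevalence column can be checked against the predicted counts and the 11,808 reports in the cohort. The following is a minimal sketch, assuming prevalence is simply the count divided by the cohort size; the function name is illustrative, and because the published values appear to be truncated rather than rounded in places, the last digit can differ by 0.01.

```python
COHORT_SIZE = 11_808  # Clever-Heart reports, as stated in the text


def prevalence_percent(count: int, total: int = COHORT_SIZE) -> float:
    """Return count / total as a percentage, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 2)


# Predicted counts taken from the Clever-Heart columns of the table
print(prevalence_percent(3374))  # Cardiomegaly
print(prevalence_percent(162))   # Enlarged Liver
```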
Performance of the Convolutional Neural Network (CNN) model benchmarked against machine learning algorithms for the extraction of 14 pre-selected radiographic findings.
| Feature | Naive Bayes Precision | Naive Bayes Recall | Naive Bayes F1 Score | Naive Bayes ROC AUC | SVM Precision | SVM Recall | SVM F1 Score | SVM ROC AUC | CNN Precision | CNN Recall | CNN F1 Score | CNN ROC AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Aortic Aneurysm | 0.82 | 0.91 | 0.86 | 0.49 | 0.82 | 0.91 | 0.86 | 0.94 | 0.94 | 0.94 | 0.91 | 0.98 |
| Ascites | 0.62 | 0.78 | 0.69 | 0.73 | 0.64 | 0.80 | 0.70 | 0.90 | 0.87 | 0.87 | 0.84 | 0.98 |
| Atelectasis | 0.71 | 0.57 | 0.44 | 0.80 | 0.31 | 0.56 | 0.40 | 0.85 | 0.97 | 0.96 | 0.96 | 0.98 |
| Atherosclerosis | 0.55 | 0.58 | 0.45 | 0.77 | 0.34 | 0.58 | 0.43 | 0.83 | 0.90 | 0.88 | 0.89 | 0.98 |
| Cardiomegaly | 0.56 | 0.75 | 0.64 | 0.65 | 0.52 | 0.72 | 0.60 | 0.88 | 0.86 | 0.86 | 0.84 | 0.98 |
| Enlarged Liver | 0.89 | 0.95 | 0.92 | 0.65 | 0.89 | 0.95 | 0.92 | 0.92 | 0.94 | 0.97 | 0.95 | 0.96 |
| GB Thickening | 0.97 | 0.99 | 0.98 | 0.43 | 0.98 | 0.99 | 0.99 | 0.77 | 0.97 | 0.99 | 0.98 | 0.83 |
| Hernia | 0.55 | 0.74 | 0.63 | 0.69 | 0.58 | 0.76 | 0.66 | 0.92 | 0.97 | 0.97 | 0.97 | 0.99 |
| Hydronephrosis | 0.91 | 0.95 | 0.93 | 0.68 | 0.88 | 0.94 | 0.91 | 0.88 | 0.89 | 0.95 | 0.92 | 0.99 |
| Lymphadenopathy | 0.46 | 0.68 | 0.55 | 0.75 | 0.41 | 0.64 | 0.50 | 0.82 | 0.86 | 0.85 | 0.84 | 0.95 |
| Pleural Effusion | 0.76 | 0.57 | 0.43 | 0.84 | 0.35 | 0.59 | 0.44 | 0.90 | 0.96 | 0.96 | 0.96 | 0.98 |
| Pneumonia | 0.69 | 0.83 | 0.75 | 0.67 | 0.69 | 0.83 | 0.75 | 0.88 | 0.87 | 0.88 | 0.86 | 0.96 |
| Previous Surgery | 0.73 | 0.73 | 0.73 | 0.84 | 0.23 | 0.48 | 0.31 | 0.78 | 0.86 | 0.85 | 0.85 | 0.95 |
| Pulmonary Edema | 0.88 | 0.94 | 0.91 | 0.65 | 0.88 | 0.94 | 0.90 | 0.95 | 0.93 | 0.92 | 0.89 | 1.00 |
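For readers less familiar with the metrics in this table, the sketch below shows how precision, recall, and F1 score follow from the counts of a binary confusion matrix. The counts are hypothetical, and the source does not state whether the reported values are per-class or averaged across classes, so this only illustrates the definitions and does not reproduce the study's numbers.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall and F1 for one class from confusion counts.

    tp: reports where the finding is present and predicted present
    fp: reports where the finding is absent but predicted present
    fn: reports where the finding is present but predicted absent
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for a single radiographic finding
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```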
Fig 2. Receiver Operator Characteristic (ROC) curves for the 14 pre-selected radiographic findings.
The Convolutional Neural Network (CNN) is compared to Naive Bayes, Support Vector Machine (SVM), and random guessing.
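The area under an ROC curve equals the probability that a randomly chosen positive report receives a higher score than a randomly chosen negative one, which is why random guessing sits at 0.5. The sketch below computes the AUC directly from that definition; the labels and scores are made-up values, not data from the study.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = finding present) and classifier scores
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```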
Fig 3. Explaining the output of a trained CNN model using layer-wise relevance propagation.
The predicted label is treated as the true class label. Color intensities are normalized to the maximum absolute relevance score in each report, so the deepest red marks the word with the highest positive relevance for the predicted label and the deepest blue marks the word with the most negative relevance for the same label.
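The per-report normalization described in this caption can be sketched as follows. This is an illustration under assumptions: the relevance scores are hypothetical, the function name is invented here, and the mapping of the normalized value to a red or blue shade is left to the plotting code.

```python
def normalize_relevance(scores: list[float]) -> list[float]:
    """Scale word relevance scores of one report into [-1, 1].

    Each score is divided by the largest absolute score in the report,
    so the most relevant word (positive or negative) gets magnitude 1.
    """
    max_abs = max((abs(s) for s in scores), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in scores]  # no relevance anywhere: neutral colors
    return [s / max_abs for s in scores]


# Hypothetical relevance scores for the words of a single report
words = ["moderate", "pleural", "effusion", "no", "pneumothorax"]
relevance = [0.5, 1.5, 2.0, -1.0, -0.25]
for word, value in zip(words, normalize_relevance(relevance)):
    color = "red" if value > 0 else "blue" if value < 0 else "neutral"
    print(f"{word:>13}: {value:+.3f} ({color})")
```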
Fig 4. Correlation between radiographic findings and all-cause mortality.
(A-B) Forest plots for the Cox models with all 14 variables and with the 9 selected variables, respectively. Numbers are hazard ratios, and the horizontal lines span the 95% confidence intervals. Blue marks variables that are significant at p<0.05. (C) Variable importance plot of the Random Survival Forest model. Large positive values indicate variables that contribute strongly to predictive accuracy. Values close to zero indicate a small contribution. Negative values indicate that predictive accuracy would improve if the variable were left unspecified.
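The hazard ratios and intervals in a forest plot of this kind are obtained by exponentiating a Cox coefficient and its Wald confidence limits. The sketch below shows that arithmetic with hypothetical coefficients and standard errors; it does not use the study's fitted values, and the function name is illustrative.

```python
import math


def hazard_ratio_ci(beta: float, se: float, z: float = 1.96):
    """Return (hazard ratio, lower, upper, significant) for a Cox coefficient.

    beta: log hazard ratio estimated by the Cox model
    se:   standard error of beta
    z:    normal quantile; 1.96 gives an approximate 95% interval
    A variable is flagged significant when the interval excludes 1.
    """
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    significant = lower > 1.0 or upper < 1.0
    return hr, lower, upper, significant


# Hypothetical coefficients for two findings
for name, beta, se in [("finding A", 0.50, 0.10), ("finding B", 0.10, 0.20)]:
    hr, lo, hi, sig = hazard_ratio_ci(beta, se)
    print(f"{name}: HR={hr:.2f} (95% CI {lo:.2f}-{hi:.2f}) significant={sig}")
```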
Fig 5. Prognostication of outcomes using the CNN model.
(A-C) Time-dependent ROC curves at 30, 60 and 365 days. (D) Time-dependent Brier scores.
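As background for panel D, the Brier score at a fixed time point is the mean squared difference between the predicted probability of the event and the observed outcome, with lower values being better. The sketch below is a simplified version that assumes no censored patients; time-dependent Brier scores in survival analysis normally add censoring weights, which are omitted here, and the inputs are hypothetical.

```python
def brier_score(predicted: list[float], observed: list[int]) -> float:
    """Mean squared error between predicted event probabilities and outcomes.

    predicted: probability that the event occurred by the chosen time point
    observed:  1 if the event occurred by that time, 0 otherwise
    Simplification: every patient's status at the time point is known.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((p - y) ** 2 for p, y in zip(predicted, observed))
    return total / len(predicted)


# Hypothetical predicted mortality probabilities at one time point
print(brier_score([0.9, 0.1, 0.8], [1, 0, 1]))
```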