Matthew D Li1, Nishanth T Arun1, Mehak Aggarwal1, Sharut Gupta1, Praveer Singh1, Brent P Little2, Dexter P Mendoza2, Gustavo C A Corradi3, Marcelo S Takahashi3, Suely F Ferraciolli3, Marc D Succi4, Min Lang2, Bernardo C Bizzo1,5, Ittai Dayan5, Felipe C Kitamura3,6, Jayashree Kalpathy-Cramer1,5.
Abstract
To tune and test the generalizability of a deep learning-based model for assessment of COVID-19 lung disease severity on chest radiographs (CXRs) from different patient populations. A published convolutional Siamese neural network-based model previously trained on hospitalized patients with COVID-19 was tuned using 250 outpatient CXRs. This model produces a quantitative measure of COVID-19 lung disease severity, the pulmonary x-ray severity (PXS) score. The model was evaluated on CXRs from 4 test sets: 3 from the United States (patients hospitalized at an academic medical center (N = 154), patients hospitalized at a community hospital (N = 113), and outpatients (N = 108)) and 1 from Brazil (patients at an academic medical center emergency department (N = 303)). Radiologists from both countries independently assigned reference standard CXR severity scores, which were correlated with the PXS scores as a measure of model performance (Pearson R). The Uniform Manifold Approximation and Projection (UMAP) technique was used to visualize the neural network results. After tuning with outpatient data, the deep learning model showed high performance in the 2 United States hospitalized patient datasets (R = 0.88 and R = 0.90, compared to baseline R = 0.86). Model performance was similar, though slightly lower, when tested on the United States outpatient and Brazil emergency department datasets (R = 0.86 and R = 0.85, respectively). UMAP showed that the model learned disease severity information that generalized across test sets. A deep learning model that extracts a COVID-19 severity score on CXRs showed generalizable performance across multiple populations from 2 continents, including outpatients and hospitalized patients.
Year: 2022 PMID: 35866818 PMCID: PMC9302282 DOI: 10.1097/MD.0000000000029587
Source DB: PubMed Journal: Medicine (Baltimore) ISSN: 0025-7974 Impact factor: 1.817
Figure 1. Schematic of study design. A previously published Siamese neural network-based model for extracting lung disease severity from CXRs was tuned using new CXR data and evaluated in 4 test sets.
Summary of dataset characteristics and radiologist mRALE scores.
| | Hospital 1 Outpatient Dataset (United States): patients presenting for outpatient imaging who tested positive by COVID-19 RT-PCR | | | | Hospital 2 Emergency Test Set (Brazil): patients presenting to emergency department with suspected COVID-19 | | | |
|---|---|---|---|---|---|---|---|---|
| | All | Training/validation set | Outpatient test set | P value | All | RT-PCR positive | RT-PCR negative | P value |
| CXRs, N | 358 | 250 | 108 | | 303 | 203 | 100 | |
| Unique Patients, N | 349 | 248 | 106 | | 242 | 167 | 75 | |
| Age (years), median (Q1–Q3) | 53 (41–64) | 52 (41–65) | 53 (41–63) | 0.9 | 41 (33–52) | 40 (33–50) | 44 (33–52) | 0.2 |
| Sex, N | 186 (52%) | 132 (53%) | 54 (50%) | 0.7 | 175 (58%) | 113 (56%) | 62 (62%) | 0.4 |
| mRALE, median (Q1–Q3) | 1.0 (0–3.5) | 1.0 (0–3.0) | 1.0 (0–4.5) | 0.2 | 0.3 (0–2.7) | 0.3 (0–2.8) | 0.3 (0–1.8) | 0.6 |
| mRALE, N | | | | | | | | |
| mRALE = 0 | 123 (34%) | 88 (35%) | 35 (32%) | | 122 (40%) | 84 (41%) | 38 (38%) | |
| 0 < mRALE ≤ 4 | 164 (46%) | 122 (49%) | 42 (39%) | | 126 (42%) | 78 (38%) | 48 (48%) | |
| 4 < mRALE ≤ 10 | 58 (16%) | 30 (12%) | 28 (26%) | | 29 (10%) | 22 (11%) | 7 (7%) | |
| mRALE > 10 | 13 (4%) | 10 (4%) | 3 (3%) | | 26 (9%) | 19 (9%) | 7 (7%) | |
P values in the Hospital 1 columns compare the outpatient (internal) test set with the training/validation set; P values in the Hospital 2 columns compare patients who tested positive vs negative by COVID-19 RT-PCR.
mRALE = Modified Radiographic Assessment of Lung Edema; N = number; Q1–Q3 = quartile 1 to quartile 3 (i.e., interquartile range).
Figure 2. Boxplots show variable distributions in patient age (A) and lung disease severity by mRALE score (B) in the different CXR test sets. Boxplots show the median and interquartile range (IQR), where the whiskers extend up to 1.5 × IQR.
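The boxplot convention named in the Figure 2 caption (median, IQR box, whiskers limited to 1.5 × IQR) can be illustrated with a short Python sketch. This is not the authors' plotting code. The function name `boxplot_stats`, the sample values, and the choice of the "inclusive" quantile interpolation are assumptions made for illustration. The sketch also assumes the common Tukey convention in which each whisker ends at the most extreme data point lying within 1.5 × IQR of the box.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Return the quartiles, whisker ends, and outliers for a Tukey-style boxplot.

    Assumption: whiskers end at the most extreme observations that fall
    within 1.5 x IQR of Q1 and Q3; points beyond that are outliers.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    # "inclusive" interpolates linearly between the sample minimum and maximum.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical severity scores, not data from the study.
example_scores = [0, 0, 1, 1, 2, 3, 4, 20]
stats = boxplot_stats(example_scores)
print(stats)
```

With these hypothetical values the single high score falls outside the upper fence, so it would be drawn as an individual outlier point rather than stretching the whisker.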
Summary of x-ray equipment manufacturers extracted from DICOM metadata.
| Dataset | Manufacturer (headquarters) | Number of CXRs |
|---|---|---|
| Hospital 1 inpatient test set (United States) | Agfa (Mortsel, Belgium) | 136 |
| | GE Healthcare (Chicago, USA) | 1 |
| | Varian (Palo Alto, USA) | 4 |
| | Not available | 13 |
| Hospital 1 outpatient test set (United States) | Agfa (Mortsel, Belgium) | 108 |
| Hospital 2 emergency test set (Brazil) | Fujifilm Corporation (Tokyo, Japan) | 303 |
| Hospital 3 inpatient test set (United States) | Agfa (Mortsel, Belgium) | 33 |
| | Carestream (Rochester, USA) | 72 |
| | Kodak (Rochester, USA) | 2 |
| | Philips (Amsterdam, Netherlands) | 2 |
| | Siemens (Munich, Germany) | 2 |
Figure 3. Scatterplots show the correlation between radiologist-determined mRALE score and the deep learning-based PXS score in the Hospital 1 Inpatient Test Set (R = 0.88) (A), Hospital 1 Outpatient Test Set (R = 0.86) (B), Hospital 2 Emergency Test Set (R = 0.85) (C), and Hospital 3 Inpatient Test Set (R = 0.90) (D). Linear regression 95% confidence intervals are shown in each scatterplot.
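The performance metric reported in Figure 3 is the Pearson correlation between the radiologist mRALE score and the model PXS score. A minimal Python sketch of that calculation follows. It is not the authors' analysis code. The function name `pearson_r` and the paired values are hypothetical and serve only to show the formula.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired scores, not data from the study.
mrale_scores = [0.0, 1.0, 3.5, 6.0, 12.0]
pxs_scores = [0.5, 1.2, 3.0, 5.5, 9.8]
r = pearson_r(mrale_scores, pxs_scores)
print(f"Pearson R = {r:.2f}")
```

Pearson R measures only linear association, so a high value does not by itself show that PXS and mRALE agree in absolute scale.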
Figure 4. Dimensionality reduction using UMAP shows the relationships between CXR data passed through the deep learning-based PXS score model from all 4 test sets (total N = 678), color-coded for PXS score (A), mRALE score (B), and test set (C). For the legend in (C), H indicates Hospital. Across the different test sets, a representation of lung disease severity is learned by the PXS score model.