Eric Engle, Andrei Gabrielian, Alyssa Long, Darrell E Hurt, Alex Rosenthal.
Abstract
The availability of trained radiologists for fast processing of chest X-rays (CXRs) in regions burdened with tuberculosis (TB) has always been a challenge, affecting both timely diagnosis and patient monitoring. The paucity of annotated lung images from TB patients hampers attempts to apply data-oriented algorithms in research and clinical practice. The TB Portals Program database (TBPP, https://TBPortals.niaid.nih.gov) is a global collaboration curating a large collection of the most dangerous, hard-to-cure drug-resistant tuberculosis (DR-TB) patient cases. TBPP, with 1,179 (83%) DR-TB patient cases, is a unique collection that is well positioned as a testing ground for deep learning classifiers. As of January 2019, the TBPP database contains 1,538 CXRs, of which 346 (22.5%) are annotated by a radiologist and 104 (6.7%) by a pulmonologist, leaving 1,088 (70.7%) CXRs without annotations. The Qure.ai qXR artificial intelligence tool for automated CXR interpretation was blind-tested on the 346 radiologist-annotated CXRs from the TBPP database. Qure.ai qXR predictions for cavity, nodule, pleural effusion, and hilar lymphadenopathy successfully matched human expert annotations. In addition, we tested the 12 Qure.ai classifiers to determine whether they correlate with treatment success (information provided by treating physicians). Ten descriptors were found to be significant: abnormal CXR (p = 0.0005), pleural effusion (p = 0.048), nodule (p = 0.0004), hilar lymphadenopathy (p = 0.0038), cavity (p = 0.0002), opacity (p = 0.0006), atelectasis (p = 0.0074), consolidation (p = 0.0004), indicator of TB disease (p < 0.0001), and fibrosis (p < 0.0001). We conclude that applying the fully automated Qure.ai CXR analysis tool is useful for fast, accurate, uniform, large-scale CXR annotation assistance, as it performed well even for DR-TB cases that were not used for initial training.
Testing artificial intelligence algorithms (encompassing both machine learning and deep learning classifiers) on diverse data collections, such as TBPP, is critically important for progress toward clinically adopted automatic assistants for medical data analysis.
Year: 2020 PMID: 31978149 PMCID: PMC6980594 DOI: 10.1371/journal.pone.0224445
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Qure.ai qXR (http://qure.ai/qxr/) TB deep learning classifiers.
| Classifier | Description |
|---|---|
| Abnormal | Indication of abnormality on the chest X-ray |
| Blunted CP angle | Indicating the presence of a pleural effusion or pleural thickening |
| Cardiomegaly | Increased heart size, increased cardio-thoracic ratio |
| Pleural Effusion | A build-up of excess fluid between layers of pleura outside of the lungs |
| Nodule | Small well-defined opacity in the lung fields |
| Hilar lymphadenopathy | Hilar enlargement, prominence or visible lymph nodes |
| Cavity | A gas-filled space, seen as a lucency or low-attenuation area |
| Opacity | A general term indicating an abnormal radiopaque region, includes a wide range of well circumscribed or ill-defined abnormalities in the lung fields |
| Atelectasis | Decrease in lung capacity |
| Consolidation | Airspace opacification, often following a lobar pattern |
| Tuberculosis | Indication of TB disease on the chest X-ray |
| Fibrosis | Reticular shadowing or evidence of scarring |
346 TBPP CXR Annotations versus Qure.ai Binary Prediction, Crosstabulation Summary Statistics.
| Lung Feature | CXR Accuracy | P-Value | Sensitivity | Sensitivity Upper CL | Sensitivity Lower CL | Specificity | Specificity Lower CL | Specificity Upper CL |
|---|---|---|---|---|---|---|---|---|
| | 73% | <0.0001 | 62.1% | 79.7% | 44.4% | 74.1% | 69.3% | 79.0% |
| | 80% | <0.0001 | 75.0% | 84.3% | 65.7% | 82.1% | 77.4% | 86.7% |
| | 21% | 0.5181 | 19.4% | 23.6% | 15.2% | 72.7% | 46.4% | 99.0% |
| | 91% | <0.0001 | 60.0% | 76.2% | 43.8% | 94.9% | 92.4% | 97.3% |
| | 61% | 0.0003 | 59.7% | 65.1% | 54.3% | 74.2% | 58.8% | 89.6% |
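The columns of these crosstabulation tables can be reproduced from a 2x2 table of expert annotation versus binary prediction. The following is a minimal sketch, assuming the confidence limits are 95% Wald limits clipped to the 0% to 100% range (the reported limits are consistent with this, but the section does not state the method). The function names and the counts are hypothetical; the sensitivity counts (21 of 35) were chosen only because they reproduce a reported row with sensitivity 60.0% and limits 43.8% to 76.2%.

```python
import math


def wald_limits(successes, total, z=1.96):
    """Return (lower, upper) Wald confidence limits for a proportion, clipped to [0, 1]."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def crosstab_summary(tp, fn, fp, tn):
    """Summarize a 2x2 table of expert annotation (truth) versus binary prediction."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),
        "sensitivity_cl": wald_limits(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_cl": wald_limits(tn, tn + fp),
    }


# Hypothetical counts, not taken from the TBPP data.
summary = crosstab_summary(tp=21, fn=14, fp=10, tn=90)
lower, upper = summary["sensitivity_cl"]
print(f"sensitivity {summary['sensitivity']:.1%} ({lower:.1%} to {upper:.1%})")
```

Wald limits become unreliable for small groups or proportions near 0% or 100%, which may explain the limits of exactly 0.0% and 100.0% in the smaller subgroup tables.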
184 TBPP CT Annotations versus Qure.ai Binary Prediction, Crosstabulation Summary Statistics.
| Lung Feature | Imputed CT Accuracy | P-Value | Sensitivity | Sensitivity Upper CL | Sensitivity Lower CL | Specificity | Specificity Lower CL | Specificity Upper CL |
|---|---|---|---|---|---|---|---|---|
| | 65% | 0.0318 | 69.7% | 77.1% | 62.2% | 48.7% | 33.0% | 64.4% |
| | 65% | <0.0001 | 84.3% | 91.8% | 76.7% | 46.3% | 36.3% | 56.3% |
| | 72% | <0.0001 | 89.6% | 95.4% | 83.8% | 47.4% | 36.4% | 58.5% |
| | 42% | 0.0214 | 92.2% | 99.5% | 84.8% | 22.6% | 15.5% | 29.7% |
| | 45% | 0.1629 | 45.3% | 53.0% | 37.7% | 39.1% | 19.2% | 59.1% |
271 TBPP CXR Annotations for Drug Resistant TB versus Qure.ai Binary Prediction, Crosstabulation Summary Statistics.
| Lung Feature | CXR Accuracy | P-Value | Sensitivity | Sensitivity Upper CL | Sensitivity Lower CL | Specificity | Specificity Lower CL | Specificity Upper CL |
|---|---|---|---|---|---|---|---|---|
| | 77% | <0.0001 | 65.0% | 85.9% | 44.1% | 77.7% | 72.5% | 82.8% |
| | 80% | <0.0001 | 75.6% | 85.2% | 66.1% | 82.4% | 77.0% | 87.8% |
| | 22% | 0.737 | 20.2% | 25.0% | 15.3% | 75.0% | 45.0% | 100.0% |
| | 91% | <0.0001 | 62.1% | 79.7% | 44.4% | 94.6% | 91.8% | 97.5% |
| | 62% | 0.0046 | 60.9% | 67.0% | 54.8% | 69.6% | 50.8% | 88.4% |
75 TBPP CXR Annotations for Sensitive TB versus Qure.ai Binary Prediction, Crosstabulation Summary Statistics.
| Lung Feature | CXR Accuracy | P-Value | Sensitivity | Sensitivity Upper CL | Sensitivity Lower CL | Specificity | Specificity Lower CL | Specificity Upper CL |
|---|---|---|---|---|---|---|---|---|
| | 60% | 0.3557 | 55.6% | 88.0% | 23.1% | 60.6% | 48.8% | 72.4% |
| | 80% | 0.0073 | 66.7% | 100.0% | 28.9% | 81.2% | 71.9% | 90.4% |
| | 19% | 0.4549 | 16.7% | 25.3% | 8.1% | 66.7% | 13.3% | 100.0% |
| | 92% | <0.0001 | 50.0% | 90.0% | 10.0% | 95.7% | 90.8% | 100.0% |
| | 59% | 0.0223 | 55.2% | 67.1% | 43.3% | 87.5% | 64.6% | 100.0% |
63 TBPP CXR Annotations for Relapse TB Patient Case versus Qure.ai Binary Prediction, Crosstabulation Summary Statistics.
| Lung Feature | CXR Accuracy | P-Value | Sensitivity | Sensitivity Upper CL | Sensitivity Lower CL | Specificity | Specificity Lower CL | Specificity Upper CL |
|---|---|---|---|---|---|---|---|---|
| | 76% | 0.0407 | 57.1% | 93.8% | 20.5% | 78.6% | 67.8% | 89.3% |
| | 75% | 0.0001 | 90.9% | 100.0% | 73.9% | 71.2% | 58.8% | 83.5% |
| | 13% | 0.641 | 9.8% | 17.3% | 2.4% | 0.0% | 0.0% | 0.0% |
| | 95% | 0.0023 | 33.3% | 86.7% | 0.0% | 98.3% | 95.1% | 100.0% |
| | 65% | 0.0669 | 63.9% | 76.0% | 51.9% | 0.0% | 0.0% | 0.0% |
187 TBPP CXR Annotations for New TB Patient Case versus Qure.ai Binary Prediction, Crosstabulation Summary Statistics.
| Lung Feature | CXR Accuracy | P-Value | Sensitivity | Sensitivity Upper CL | Sensitivity Lower CL | Specificity | Specificity Lower CL | Specificity Upper CL |
|---|---|---|---|---|---|---|---|---|
| | 71% | 0.0006 | 73.3% | 95.7% | 51.0% | 70.3% | 63.5% | 77.2% |
| | 82% | <0.0001 | 69.2% | 87.0% | 51.5% | 84.5% | 78.9% | 90.1% |
| | 19% | 0.9765 | 17.1% | 22.6% | 11.6% | 83.3% | 53.5% | 100.0% |
| | 94% | <0.0001 | 58.8% | 82.2% | 35.4% | 97.1% | 94.5% | 99.6% |
| | 57% | 0.0237 | 54.8% | 62.4% | 47.2% | 71.4% | 52.1% | 90.8% |
346 TBPP CXR Annotations versus Qure.ai Continuous Score, Receiver Operating Characteristic Statistics Summary.
| Lung Feature | CXR Accuracy | 95% Wald Lower CL | 95% Wald Upper CL |
|---|---|---|---|
| | 74.69% | 65.0% | 84.4% |
| | 84.17% | 79.1% | 89.3% |
| | 50.58% | 31.1% | 70.1% |
| | 84.77% | 76.2% | 93.3% |
| | 71.96% | 62.9% | 81.0% |
Fig 1. ROC Curve for CXR: Hilar Lymphadenopathy.
Fig 5. ROC Curve for CXR: Nodule.
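A receiver operating characteristic summary of a continuous score is usually reported as the area under the ROC curve (AUC), which equals the probability that a randomly chosen annotated-positive image scores higher than a randomly chosen annotated-negative one. The following is a minimal sketch of that pairwise definition; the function name and the toy scores are hypothetical, and the section does not say which software produced the reported values.

```python
def empirical_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as half a win."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: classifier scores and expert annotations (1 = feature present).
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 0, 1, 0]
print(empirical_auc(toy_scores, toy_labels))  # 3 of 4 positive/negative pairs ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking, so a confidence interval that spans 50% indicates a score that does not separate the two groups.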
184 TBPP CT Annotations versus Qure.ai Continuous Score, Receiver Operating Characteristic Statistics Summary.
| Lung Feature | CT Accuracy | 95% Wald Lower CL | 95% Wald Upper CL |
|---|---|---|---|
| | 61.47% | 51.8% | 71.1% |
| | 70.05% | 62.5% | 77.6% |
| | 77.67% | 70.9% | 84.4% |
| | 64.71% | 56.5% | 72.9% |
| | 56.25% | 43.9% | 68.6% |
Fig 6. ROC Curve for CT: Hilar Lymphadenopathy.
Fig 10. ROC Curve for CT: Nodule.
281 Qure.ai classifier binary value predicting TBPP outcome died/failure.
| Lung Feature | Number of cases present | Fisher's Exact Test | Sensitivity | Specificity |
|---|---|---|---|---|
| | 263 | 0.770 | 95.1% | 6.8% |
| | 126 | 0.059 | 55.7% | 58.2% |
| | 16 | 0.536 | 3.3% | 93.6% |
| | 38 | 0.056 | 21.3% | 88.6% |
| | 174 | 0.001 | 80.3% | 43.2% |
| | 93 | 0.046 | 44.3% | 70.0% |
| | 120 | 0.002 | 60.7% | 62.3% |
| | 249 | 0.108 | 95.1% | 13.2% |
| | 82 | 0.204 | 36.1% | 72.7% |
| | 179 | 0.035 | 75.4% | 39.5% |
| | 242 | 0.208 | 91.8% | 15.5% |
| | 208 | 0.003 | 88.5% | 30.0% |
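Fisher's exact test asks whether a binary classifier flag and the treatment outcome are associated in a 2x2 table. The following is a minimal standard-library sketch of the usual two-sided version, which sums the hypergeometric probabilities of all tables that are no more likely than the observed one. The function name is hypothetical. The example counts are an assumption: they were reconstructed from one reported row (174 cases present, sensitivity 80.3%, specificity 43.2%) under the further assumption of 61 died/failure and 220 cured cases, neither of which is stated in this section.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so tables tied with the observed one are included.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# Assumed layout: rows = outcome (died/failure, cured), columns = flag (present, absent).
p_value = fisher_exact_two_sided(49, 12, 125, 95)
print(f"two-sided p = {p_value:.4f}")
```

Because the test conditions on the table margins, it remains valid for the sparse rows in this table, such as the classifier flagged in only 16 cases.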
281 Qure.ai continuous classifier score predicting TBPP outcome.
| Lung Feature | Cured: Mean Score | Died/Failure: Mean Score | WMW Test (p-value) |
|---|---|---|---|
| Abnormal | 132.05 | 173.28 | 0.0005 |
| Blunted CP angle | 136.75 | 156.31 | 0.0963 |
| Cardiomegaly | 137.72 | 152.84 | 0.1999 |
| Pleural Effusion | 135.93 | 159.29 | 0.048 |
| Nodule | 131.84 | 174.03 | 0.0004 |
| Hilar lymphadenopathy | 133.54 | 167.88 | 0.0038 |
| Cavity | 131.35 | 175.77 | 0.0002 |
| Opacity | 132.08 | 173.16 | 0.0006 |
| Atelectasis | 134.11 | 164.85 | 0.0074 |
| Consolidation | 131.82 | 174.1 | 0.0004 |
| Tuberculosis | 130.28 | 179.67 | < .0001 |
| Fibrosis | 130.53 | 178.75 | < .0001 |
Qure.ai classifier score predicting TBPP outcome died; receiver operating characteristic statistics summary.
| Lung Feature | Accuracy | 95% Wald Lower CL | 95% Wald Upper CL |
|---|---|---|---|
| Abnormal | 64.67% | 57.1% | 72.2% |
| Blunted CP angle | 56.96% | 49.0% | 64.9% |
| Cardiomegaly | 55.38% | 47.4% | 63.4% |
| Pleural Effusion | 58.32% | 49.7% | 66.9% |
| Nodule | 65.01% | 57.6% | 72.4% |
| Hilar lymphadenopathy | 62.22% | 54.4% | 70.0% |
| Cavity | 65.80% | 58.1% | 73.5% |
| Opacity | 64.62% | 57.0% | 72.2% |
| Atelectasis | 61.30% | 53.9% | 68.7% |
| Consolidation | 65.04% | 57.5% | 72.6% |
| Tuberculosis | 67.58% | 59.8% | 75.3% |
| Fibrosis | 67.16% | 59.4% | 74.9% |
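The accuracy values in this table appear to be linked to the mean scores in the continuous-score outcome table through the standard relation between the Wilcoxon-Mann-Whitney (WMW) rank sum and the area under the ROC curve. The following is a minimal sketch of that conversion, assuming the mean scores are mean ranks over all 281 cases and that the groups contain 61 died/failure and 220 cured cases. Neither assumption is stated in this section; the group sizes were inferred because they are consistent with the reported sensitivities and mean scores. The function name is hypothetical.

```python
def auc_from_mean_rank(mean_rank_positive, n_positive, n_negative):
    """Convert the mean rank of the positive group into an area under the ROC curve.

    Uses U = R - n_pos * (n_pos + 1) / 2 and AUC = U / (n_pos * n_neg),
    where R is the rank sum of the positive group.
    """
    rank_sum = mean_rank_positive * n_positive
    u_statistic = rank_sum - n_positive * (n_positive + 1) / 2
    return u_statistic / (n_positive * n_negative)


# Assumed group sizes (not stated in the source): 61 died/failure, 220 cured.
N_DIED, N_CURED = 61, 220

# First two rows of the continuous-score table: died/failure mean scores.
for mean_rank in (173.28, 156.31):
    auc = auc_from_mean_rank(mean_rank, N_DIED, N_CURED)
    print(f"mean rank {mean_rank} -> AUC {auc:.2%}")
```

Under these assumptions the first two rows give about 64.67% and 56.96%, matching the first two rows of the table above, which suggests that both outcome tables list the classifiers in the same order.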