Zhi Zhen Qin, Melissa S Sander, Bishwa Rai, Collins N Titahong, Santat Sudrungrot, Sylvain N Laah, Lal Mani Adhikari, E Jane Carter, Lekha Puri, Andrew J Codlin, Jacob Creswell.
Abstract
Deep learning (DL) neural networks have only recently been employed to interpret chest radiography (CXR) to screen and triage people for pulmonary tuberculosis (TB). No published studies have compared multiple DL systems across multiple populations. We conducted a retrospective evaluation of three DL systems (CAD4TB, Lunit INSIGHT, and qXR) for detecting TB-associated abnormalities in chest radiographs from outpatients in Nepal and Cameroon. All 1196 individuals received an Xpert MTB/RIF assay and a CXR that was read by two groups of radiologists and by the DL systems. Xpert was used as the reference standard. The area under the curve of the three systems was similar: Lunit (0.94, 95% CI: 0.93-0.96), qXR (0.94, 95% CI: 0.92-0.97) and CAD4TB (0.92, 95% CI: 0.90-0.95). When matching the sensitivity of the radiologists, the specificities of the DL systems were significantly higher except for one. Using DL systems to read CXRs could reduce the number of Xpert MTB/RIF tests needed by 66% while maintaining sensitivity at 95% or better. Using a universal cutoff score resulted in different performance at each site, highlighting the need to select scores based on the population screened. These DL systems should be considered by TB programs where human resources are constrained and automated technology is available.
Year: 2019 PMID: 31628424 PMCID: PMC6802077 DOI: 10.1038/s41598-019-51503-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Demographic and Clinical Characteristics of People Screened with Chest X-ray and Xpert MTB/RIF.
| Characteristic | Overall: Xpert Positive | Overall: Xpert Negative | Overall: Total | Nepal: Xpert Positive | Nepal: Xpert Negative | Nepal: Total | Cameroon: Xpert Positive | Cameroon: Xpert Negative | Cameroon: Total |
|---|---|---|---|---|---|---|---|---|---|
| N | 109 | 1087 | 1196 | 94 | 421 | 515 | 15 | 666 | 681 |
| Age (median [IQR]) | 36 [25, 47] | 48 [31, 61] | 46 [30, 61] | 38 [25, 48.75] | 47 [29, 62] | 45 [28, 61] | 33 [25.50, 38] | 48 [32, 61] | 47 [32, 61] |
| Sex = F/M (%) | 32/77 (29.4/70.6) | 614/473 (56.5/43.5) | 646/550 (54.0/46.0) | 27/67 (28.7/71.3) | 200/221 (47.5/52.5) | 227/288 (44.1/55.9) | 5/10 (33.3/66.7) | 414/252 (62.2/37.8) | 419/262 (61.5/38.5) |
| Cough (%) | 105 (96.3) | 632 (58.1) | 737 (61.6) | 93 (98.9) | 368 (87.4) | 461 (89.5) | 12 (80) | 264 (39.6) | 276 (40.5) |
| Fever (%) | 76 (69.7) | 444 (40.8) | 520 (43.5) | 68 (72.3) | 218 (51.8) | 286 (55.5) | 8 (53.3) | 226 (33.9) | 234 (34.4) |
| Weight loss (%) | 74 (67.9) | 550 (50.6) | 624 (52.2) | 62 (66) | 150 (35.6) | 212 (41.2) | 12 (80) | 400 (60.1) | 412 (60.5) |
| Night Sweats (%) | 37 (33.9) | 269 (24.7) | 306 (25.6) | 29 (30.9) | 83 (19.7) | 112 (21.7) | 8 (53.3) | 186 (27.9) | 194 (28.5) |
| | | | | | | | | | |
| Yes | 89 (81.7) | 982 (90.3) | 1071 (89.5) | 20 (21.3) | 76 (18.1) | 96 (18.6) | 0 (0) | 22 (3.3) | 22 (3.2) |
| No | 0 (0.0) | 7 (0.6) | 7 (0.6) | 74 (78.7) | 345 (81.9) | 419 (81.4) | 15 (100) | 637 (95.6) | 652 (95.7) |
| Unknown | 20 (18.3) | 98 (9.0) | 118 (9.9) | | | | 0 (0) | 7 (1.1) | 7 (1) |
| | | | | | | | | | |
| Negative | 17 (15.6) | 254 (23.4) | 271 (22.7) | 9 (9.6) | 50 (11.9) | 59 (11.5) | 8 (53.3) | 204 (30.6) | 212 (31.1) |
| Positive | 10 (9.2) | 28 (2.6) | 38 (3.2) | 4 (4.3) | 3 (0.7) | 7 (1.4) | 6 (40) | 25 (3.8) | 31 (4.6) |
| Unknown | 82 (75.2) | 805 (74.1) | 887 (74.2) | 81 (86.2) | 368 (87.4) | 449 (87.2) | 1 (6.7) | 437 (65.6) | 438 (64.3) |
| AFB positive (%) | 76 (70.4) | 2 (0.2) | 78 (6.6) | 67 (71.3) | 1 (0.2) | 68 (13.2) | 9 (64.3) | 1 (0.2) | 10 (1.5) |
| Xpert positive (%) | 109 (100.0) | 0 (0.0) | 109 (9.1) | 94 (100) | 0 (0) | 94 (18.3) | 15 (100) | 0 (0) | 15 (2.2) |
| CAD4TB (median [IQR]) | 86 [75, 99] | 45 [25, 54] | 46 [28, 59] | 86 [77, 99] | 48 [25, 69] | 54 [40.5, 79.5] | 77 [58, 99] | 44 [25, 49] | 44 [25, 50] |
| Lunit (median [IQR]) | 0.99 [0.97, 0.99] | 0.08 [0.01, 0.51] | 0.10 [0.01, 0.80] | 0.99 [0.97, 0.99] | 0.37 [0.05, 0.90] | 0.69 [0.09, 0.96] | 0.97 [0.90, 0.98] | 0.03 [0.01, 0.15] | 0.03 [0.01, 0.18] |
| qXR (median [IQR]) | 0.90 [0.82, 0.94] | 0.17 [0.10, 0.36] | 0.19 [0.11, 0.50] | 0.92 [0.84, 0.94] | 0.30 [0.12, 0.65] | 0.44 [0.14, 0.82] | 0.78 [0.64, 0.86] | 0.14 [0.09, 0.23] | 0.15 [0.10, 0.24] |
| Senior Radiologist (Nepal) = Normal CXR (%) | | | | 4 (4.3) | 203 (48.2) | 207 (40.2) | | | |
| Junior Radiologist & Residents (Nepal) = Normal CXR (%) | | | | 12 (12.8) | 290 (68.9) | 302 (58.6) | | | |
| Radiologist (Cameroon) = Normal CXR (%) | | | | | | | 3 (20) | 495 (74.3) | 498 (73.1) |
| Teleradiology Company (Cameroon) = Normal CXR (%) | | | | | | | 3 (20) | 494 (74.2) | 497 (73.0) |
Figure 1. Frequency distribution of the abnormality scores of CAD4TB, Lunit, and qXR.
Figure 2. The ROC curves of CAD4TB (v6), Lunit (v4.7.2) and qXR (v2) using Xpert results as the reference (n = 1196).
Figure 3. The ROC curves of CAD4TB (v6), Lunit (v4.7.2) and qXR (v2) among individuals with negative smear (n = 1102).
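The area under an ROC curve can be read as the probability that a randomly chosen Xpert-positive radiograph receives a higher abnormality score than a randomly chosen Xpert-negative one. The following is a minimal sketch of that rank-based calculation; the function name `auc_from_scores` and the toy scores are illustrative assumptions, not code or data from the study.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Probability that a positive case outscores a negative case (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical abnormality scores, NOT taken from the study.
toy_positive = [0.99, 0.90, 0.40]
toy_negative = [0.10, 0.35, 0.50, 0.05]
toy_auc = auc_from_scores(toy_positive, toy_negative)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a data set of this size; a sort-based rank sum would give the same value for larger ones.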
Radiologist and Deep Learning System Performance for Chest Radiographs and Tuberculosis.
| Reader | Human Readers: Accuracy | Human Readers: Sensitivity (95% CI) | Human Readers: Specificity (95% CI) | CAD4TB (v6): Accuracy | CAD4TB (v6): Sensitivity (95% CI) | CAD4TB (v6): Specificity (95% CI) | Lunit (v4.7.2): Accuracy | Lunit (v4.7.2): Sensitivity (95% CI) | Lunit (v4.7.2): Specificity (95% CI) | qXR (v2): Accuracy | qXR (v2): Sensitivity (95% CI) | qXR (v2): Specificity (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nepal | | | | | | | | | | | | |
| Senior Radiologist | 0.57 | 0.96 (0.89–0.99) | 0.48 (0.43–0.53) | 0.74 | 0.96 (0.89–0.99) | 0.69 (0.64–0.73) | 0.67 | 0.96 (0.89–0.99) | 0.6 (0.55–0.65) | 0.7 | 0.97* (0.91–0.99) | 0.65 (0.6–0.69) |
| Junior Radiologist & Residents | 0.72 | 0.87 (0.79–0.93) | 0.69 (0.64–0.73) | 0.77 | 0.87 (0.79–0.93) | 0.75 (0.71–0.79) | 0.85 | 0.87 (0.79–0.93) | 0.78 (0.73–0.82) | 0.69 | 0.87 (0.79–0.93) | 0.81 (0.76–0.84) |
| Cameroon | | | | | | | | | | | | |
| Radiologist | 0.74 | 0.8 (0.52–0.96) | 0.74 (0.71–0.78) | 0.9 | 0.8 (0.52–0.96) | 0.9 (0.87–0.92) | 0.94 | 0.8 (0.52–0.96) | 0.94 (0.92–0.96) | 0.94 | 0.8 (0.52–0.96) | 0.95 (0.93–0.96) |
| Teleradiology Company | 0.74 | 0.8 (0.52–0.96) | 0.74 (0.71–0.77) | 0.9 | 0.8 (0.52–0.96) | 0.9 (0.87–0.92) | 0.94 | 0.8 (0.52–0.96) | 0.94 (0.92–0.96) | 0.94 | 0.8 (0.52–0.96) | 0.95 (0.93–0.96) |
*The sensitivity of qXR version 2 closest to that of the senior radiologist is 97%, instead of 96%.
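Because a DL system's sensitivity can only take the values reachable at its observed scores, matching a human reader's sensitivity means picking the threshold whose sensitivity is closest to the target, which may not hit it exactly (as with the 97% versus 96% case noted above). One way this matching could be done is sketched below; the helper names, the tie-breaking rule, and the toy data are assumptions for illustration and are not taken from the study.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called abnormal."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def match_sensitivity(scores, labels, target):
    """Return (threshold, sensitivity, specificity) with sensitivity closest to target.

    Assumed tie-break: among equally close thresholds, prefer higher specificity.
    """
    best_key, best_result = None, None
    for t in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, t)
        key = (abs(sens - target), -spec)
        if best_key is None or key < best_key:
            best_key, best_result = key, (t, sens, spec)
    return best_result


# Hypothetical scores and Xpert labels (1 = positive), NOT study data.
toy_scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.20, 0.10, 0.05]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
matched = match_sensitivity(toy_scores, toy_labels, target=0.75)
print(matched)
```

The specificity returned at the matched threshold is the quantity that would then be compared against the human reader's specificity.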
Performance of CAD4TB (v6), Lunit (v4.7.2), and qXR (v2) at Selected Thresholds.
| Threshold criterion | CAD4TB (v6): Threshold | CAD4TB (v6): Accuracy | CAD4TB (v6): Sensitivity (95% CI) | CAD4TB (v6): Specificity (95% CI) | Lunit (v4.7.2): Threshold | Lunit (v4.7.2): Accuracy | Lunit (v4.7.2): Sensitivity (95% CI) | Lunit (v4.7.2): Specificity (95% CI) | qXR (v2): Threshold | qXR (v2): Accuracy | qXR (v2): Sensitivity (95% CI) | qXR (v2): Specificity (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ROC01# | 63 | 0.85 | 0.91 (0.84–0.96) | 0.84 (0.82–0.86) | 0.92 | 0.89 | 0.87 (0.79–0.93) | 0.89 (0.87–0.91) | 0.67 | 0.86 | 0.88 (0.8–0.93) | 0.89 (0.87–0.91) |
| Sensitivity ≥95% | 57 | 0.81 | 0.95 (0.9–0.98) | 0.8 (0.77–0.82) | 0.55 | 0.77 | 0.95 (0.9–0.98) | 0.76 (0.73–0.78) | 0.49 | 0.83 | 0.95 (0.90–0.98) | 0.82 (0.79–0.84) |
| Reduce Xpert tests by 1/2 | 47 | 0.61 | 0.97 (0.92–0.99) | 0.57 (0.54–0.6) | 0.11 | 0.6 | 0.99 (0.95–1) | 0.56 (0.53–0.59) | 0.18 | 0.57 | 0.97 (0.92–0.99) | 0.53 (0.5–0.56) |
| Reduce Xpert tests by 2/3 | 53 | 0.75 | 0.96 (0.91–0.99) | 0.73 (0.7–0.76) | 0.39 | 0.74 | 0.95 (0.9–0.98) | 0.72 (0.69–0.74) | 0.34 | 0.75 | 0.96 (0.91–0.99) | 0.73 (0.7–0.76) |
| Reduce Xpert tests by 3/4 | 59 | 0.82 | 0.94 (0.87–0.97) | 0.82 (0.79–0.84) | 0.79 | 0.82 | 0.93 (0.86–0.97) | 0.81 (0.79–0.84) | 0.5 | 0.83 | 0.93 (0.86–0.97) | 0.82 (0.8–0.84) |
| Max Accuracy | 88 | 0.92 | 0.47 (0.37–0.57) | 0.96 (0.95–0.97) | 0.98 | 0.94 | 0.58 (0.48–0.67) | 0.97 (0.96–0.98) | 0.84 | 0.94 | 0.71 (0.61–0.79) | 0.96 (0.94–0.97) |
#ROC01: the point on the ROC that was closest to the coordinates (0,1), the perfect classification.
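The ROC01 criterion can be made concrete: each candidate threshold gives a point (1 − specificity, sensitivity) on the ROC curve, and the chosen threshold is the one whose point has the smallest Euclidean distance to (0, 1). A minimal sketch follows, assuming that a score at or above the threshold is called abnormal; the function name and toy data are hypothetical and not from the study.

```python
import math


def roc01_threshold(scores, labels):
    """Return (threshold, distance) for the ROC point closest to (0, 1)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    best_t, best_d = None, math.inf
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        # Distance from (false positive rate, true positive rate) to (0, 1).
        d = math.hypot(1.0 - specificity, 1.0 - sensitivity)
        if d < best_d:
            best_t, best_d = t, d
    return best_t, best_d


# Hypothetical scores and Xpert labels (1 = positive), NOT study data.
demo_scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.20, 0.10, 0.05]
demo_labels = [1, 1, 1, 1, 0, 0, 0, 0]
roc01_t, roc01_d = roc01_threshold(demo_scores, demo_labels)
print(roc01_t, round(roc01_d, 4))
```

This criterion weights sensitivity and specificity equally, which is why it lands at a different operating point from the sensitivity-targeted rows of the table.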
Individuals with Bacteriologically Confirmed TB, Missed by Deep Learning Systems at 95% Sensitivity.
| Individual Missed | Senior Radiologist (Nepal) | Junior Radiologist & Residents (Nepal) | Field Radiologist (Cameroon) | Teleradiology Company (Cameroon) | CAD4TB score | qXR score | Lunit score | Annotation by a senior pulmonologist |
|---|---|---|---|---|---|---|---|---|
| Nepal 1 | Normal | Normal | | | 9 | 0.1205 | 0.2330 | Normal |
| Nepal 2 | Abnormal | Abnormal | | | 65 | 0.4225 | 0.9853 | Abnormal: may be an azygous lobe (normal variant) but also could be apical TB |
| Nepal 3 | Abnormal | Abnormal | | | 71 | 0.5223 | 0.3458 | Abnormal maybe old scar: minimal tenting of the right diaphragm |
| Nepal 4 | Normal | Normal | | | 63 | 0.1517 | 0.1318 | Non-TB abnormality: elevated right hemidiaphragm |
| Nepal 5 | Normal | Normal | | | 46 | 0.4992 | 0.2728 | Normal |
| Cameroon 1 | | | Normal | Abnormal | 48 | 0.1259 | 0.0176 | Abnormal in the left mid lung field by the heart- could be TB but not classic |
| Cameroon 2 | | | Normal | Normal | 53 | 0.6695 | 0.8744 | Abnormal: ill defined infiltrate in the right upper lung field |
| Cameroon 3 | | | Normal | Normal | 18 | 0.2504 | 0.5518 | Normal lung field but cardiomegaly |
Figure 4. (a) The ROC curves of CAD4TB (v6), Lunit (v4.7.2) and qXR (v2) in Nepal (n = 515). (b) The ROC curves of CAD4TB (v6), Lunit (v4.7.2) and qXR (v2) in Cameroon (n = 681).
Performance of CAD4TB (v6), Lunit (v4.7.2), and qXR (v2) at Selected Thresholds Stratified by Study Sites.
| Threshold criterion | CAD4TB (v6): Threshold | CAD4TB (v6): Accuracy | CAD4TB (v6): Sensitivity (95% CI) | CAD4TB (v6): Specificity (95% CI) | Lunit (v4.7.2): Threshold | Lunit (v4.7.2): Accuracy | Lunit (v4.7.2): Sensitivity (95% CI) | Lunit (v4.7.2): Specificity (95% CI) | qXR (v2): Threshold | qXR (v2): Accuracy | qXR (v2): Sensitivity (95% CI) | qXR (v2): Specificity (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nepal | | | | | | | | | | | | |
| Sensitivity ≥95% | 63 | 0.74 | 0.95 (0.88–0.98) | 0.69 (0.65–0.74) | 0.8 | 0.71 | 0.95 (0.88–0.98) | 0.66 (0.61–0.7) | 0.52 | 0.72 | 0.95 (0.88–0.98) | 0.67 (0.62–0.71) |
| Reduce Xpert tests by 1/2 | 55 | 0.68 | 0.98 (0.93–1) | 0.61 (0.56–0.66) | 0.69 | 0.67 | 0.96 (0.89–0.99) | 0.6 (0.55–0.65) | 0.44 | 0.67 | 0.97 (0.91–0.99) | 0.61 (0.56–0.65) |
| Reduce Xpert tests by 2/3 | 73 | 0.78 | 0.81 (0.71–0.88) | 0.78 (0.73–0.82) | 0.93 | 0.79 | 0.87 (0.79–0.93) | 0.78 (0.73–0.82) | 0.7 | 0.81 | 0.9 (0.83–0.96) | 0.79 (0.75–0.83) |
| Reduce Xpert tests by 3/4 | 80 | 0.82 | 0.7 (0.6–0.79) | 0.85 (0.81–0.88) | 0.96 | 0.85 | 0.81 (0.71–0.88) | 0.86 (0.83–0.9) | 0.82 | 0.86 | 0.79 (0.69–0.86) | 0.88 (0.84–0.91) |
| Max Accuracy | 94 | 0.85 | 0.35 (0.26–0.46) | 0.96 (0.93–0.97) | 0.98 | 0.89 | 0.63 (0.52–0.73) | 0.94 (0.92–0.96) | 0.88 | 0.88 | 0.67 (0.57–0.76) | 0.92 (0.89–0.95) |
| Cameroon | | | | | | | | | | | | |
| Sensitivity ≥95% | 48 | 0.7 | 0.93* (0.68–1) | 0.7 (0.66–0.73) | 0.55 | 0.88 | 0.93* (0.68–1) | 0.88 (0.86–0.91) | 0.25 | 0.78 | 0.93* (0.68–1) | 0.78 (0.74–0.81) |
| Reduce Xpert tests by 1/2 | 45 | 0.54 | 0.93 (0.68–1) | 0.53 (0.5–0.57) | 0.03 | 0.51 | 0.93 (0.68–1) | 0.5 (0.46–0.53) | 0.14 | 0.48 | 0.93 (0.68–1) | 0.47 (0.44–0.51) |
| Reduce Xpert tests by 2/3 | 47 | 0.66 | 0.93 (0.68–1) | 0.65 (0.61–0.69) | 0.1 | 0.68 | 0.93 (0.68–1) | 0.68 (0.64–0.71) | 0.2 | 0.7 | 0.93 (0.68–1) | 0.69 (0.66–0.73) |
| Reduce Xpert tests by 3/4 | 50 | 0.77 | 0.87 (0.6–0.98) | 0.76 (0.73–0.79) | 0.17 | 0.77 | 0.93 (0.68–1) | 0.76 (0.73–0.79) | 0.24 | 0.77 | 0.93 (0.68–1) | 0.77 (0.74–0.8) |
| Max Accuracy | 90 | 0.99 | 0.4 (0.16–0.68) | 1 (0.99–1) | 0.97 | 0.98 | 0.4 (0.16–0.68) | 0.99 (0.98–0.99) | 0.77 | 0.98 | 0.53 (0.27–0.79) | 0.99 (0.97–0.99) |
*Due to the limited number of Xpert-positive patients in the site in Cameroon, the sensitivity closest to 95% is 93%.
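The "Reduce Xpert tests by ..." rows can be sanity-checked from the reported sensitivity and specificity together with the site counts in the demographics table. The sketch below assumes that only people scoring at or above the threshold go on to Xpert testing, so the tests avoided are the true negatives plus the false negatives. The function name is an illustrative assumption, and because the published values are rounded, the results are approximate.

```python
def fraction_tests_avoided(sensitivity, specificity, n_positive, n_negative):
    """Share of screened people below the threshold, i.e. not sent for Xpert."""
    true_negatives = specificity * n_negative
    false_negatives = (1.0 - sensitivity) * n_positive
    return (true_negatives + false_negatives) / (n_positive + n_negative)


# Nepal, CAD4TB (v6): 94 Xpert-positive and 421 Xpert-negative individuals.
half = fraction_tests_avoided(0.98, 0.61, 94, 421)        # "Reduce Xpert tests by 1/2" row
two_thirds = fraction_tests_avoided(0.81, 0.78, 94, 421)  # "Reduce Xpert tests by 2/3" row
print(f"{half:.3f} {two_thirds:.3f}")
```

The same function shows why sensitivity in Cameroon moves in coarse steps: with only 15 Xpert-positive individuals, each missed case changes sensitivity by 1/15, so 14/15 (93%) is the closest attainable value to 95%.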