Miriam Harris, Amy Qi, Luke Jeagal, Nazi Torabi, Dick Menzies, Alexei Korobitsyn, Madhukar Pai, Ruvandhi R Nathavitharana, Faiz Ahmad Khan.
Abstract
We undertook a systematic review of the diagnostic accuracy of artificial intelligence-based software for identification of radiologic abnormalities (computer-aided detection, or CAD) compatible with pulmonary tuberculosis on chest x-rays (CXRs). We searched four databases for articles published between January 2005 and February 2019. We summarized data on CAD type, study design, and diagnostic accuracy. We assessed risk of bias with QUADAS-2. We included 53 of the 4712 articles reviewed: 40 focused on CAD design methods ("Development" studies) and 13 focused on evaluation of CAD ("Clinical" studies). Meta-analyses were not performed due to methodological differences. Development studies were more likely than Clinical studies to use CXR databases with greater potential for bias. Areas under the receiver operating characteristic curve (median AUC [IQR]) were significantly higher in Development studies (0.88 [0.82-0.90]) than in Clinical studies (0.75 [0.66-0.87]; p = 0.004), and with deep-learning (0.91 [0.88-0.99]) than with machine-learning (0.82 [0.75-0.89]; p = 0.001). We conclude that CAD programs are promising, but the majority of work thus far has been on development rather than clinical evaluation. We provide concrete suggestions on what study design elements should be improved.
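The abstract summarizes AUCs as median [IQR]. As a minimal sketch of how such a summary can be computed, the Python snippet below uses only the standard library. The function name `median_iqr`, the choice of the "inclusive" quartile method, and the AUC values are assumptions made for illustration; they are not taken from the review, which does not describe its computation here.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, Q1, Q3) for a list of AUC values.

    Quartiles use the 'inclusive' method; other conventions can give
    slightly different cut points on small samples.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical AUC values, for illustration only (not data from the review).
example_aucs = [0.70, 0.80, 0.85, 0.90, 0.95]

med, q1, q3 = median_iqr(example_aucs)
print(f"{med:.2f} [{q1:.2f}-{q3:.2f}]")
```

The same helper could be applied separately to each group of studies being compared (for example Development versus Clinical) before testing for a difference.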
Year: 2019 PMID: 31479448 PMCID: PMC6719854 DOI: 10.1371/journal.pone.0221339
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Study flow diagram.
Computer aided detection (CAD).
Methods of studies included in the descriptive analysis.
| Author and year | Country where CXR completed | Databases used | Computer software | Reference | Accuracy measures |
|---|---|---|---|---|---|
| Heo et al, 2019 | South Korea | YU AWHE | Not named | Human reader | AUC |
| Hwang et al, 2018 | South Korea, USA, China | SNUH, BMC, KUHG, DEMC, MC, CH | DLAD | Liquid culture, NAAT, and/or TB treatment | AUC |
| Lakhani et al, 2017 | USA, China | MC, CH, TJH, Belarus | AlexNet and GoogLeNet | Human reader | AUC, Sn, Sp |
| Santosh et al, 2017 | USA, China, India | MC, CH, IN | Not named | Human reader | AUC, Sn, Sp |
| Lopes et al, 2017 | USA, China | MC, CH | Not named | Human reader | AUC |
| Santosh et al, 2016 | USA, China | MC, CH | Not named | Human reader | AUC |
| Hwang et al, 2016 | South Korea, USA, China | KIT, MC, CH | AlexNet | Human reader | AUC |
| Ilena et al, 2018 | China | CH | Matlab | Human reader | Sn, Sp, TP, TN, FP, FN |
| Rajaraman et al, 2018 | China, USA, Kenya, India | CH, MC, Kenya, IN | Not named | Human reader | AUC |
| Sivaramakrishnan et al, 2018 | China, USA, Kenya, India | CH, MC, Kenya, IN | Custom 12-layer CNN | Human reader | AUC |
| Vajda et al, 2018 | USA, China | MC, CH | Matlab | Human reader | AUC |
| Alfadhli et al, 2017 | USA | MC | Not named | Human reader | AUC, Sn, TP |
| Fatima et al, 2017 | USA | MC | Not named | Human reader | Sn, Sp |
| Ding et al, 2017 | China, India, Kenya | Kenya, IN, CH | Not named | Human reader | NR |
| Hogeweg et al, 2017 | Japan, Sub-Saharan Africa | JSRT, Sub-Saharan Africa | Not named | Human reader | AUC |
| Udayakumar et al, 2017 | USA, China | MC, CH | SVM and CBC techniques | Human reader | AUC |
| Maduskar et al, 2016 | Zambia | Large Zambian | Not named | Human reader | AUC |
| Poornimadevi et al, 2016 | Japan, USA | JSRT, MC | Not named | Human reader | Sn, Sp |
| Karargyris et al, 2016 | China, Japan | JSRT, CH | Not named | Human reader | AUC |
| Melendez et al, 2016 | Zambia | Zambian | Not named | Human reader | AUC |
| Melendez et al, 2015 | Zambia, Tanzania, Gambia | Zambian, Tanzania, Gambian | Not named | Human reader | NR |
| Hogeweg et al, 2015 | UK, South Africa | F&T, TB-NEAT | Not named | Human reader, Liquid culture, composite reference standard | AUC, Sn, Sp |
| Giacomini et al, 2015 | Brazil | Prospective, study-specific | Not named | Liquid culture+ | NR |
| Jaeger et al, 2015 | China | CH | Not named | Human reader | NR |
| Requena-Mendez et al, 2015 | Peru | CXR from DOT study in Peru | Not named | Human reader | NR |
| Jaeger et al, 2014 | China, USA, Japan | JSRT, MC, CH | Not named | Human reader | AUC, Sn, Sp |
| Melendez et al, 2014 | Zambia, South Africa | Zambian | TB-Xpredict | Human reader | AUC |
| Chauhan et al, 2014 | India | IN | Not named | Human reader | NR |
| Seixas et al, 2013 | Brazil | Clinical data set from another study | Artificial Neural Network | Composite reference | NR |
| Sundaram et al, 2013 | Not specified | Not specified | Not named | Human reader | NR |
| Jaeger et al, 2012 | USA, Japan | JSRT, MC | Not named | Human reader | AUC |
| Xu et al, 2011 | Japan, Canada | JSRT, Calgary dataset | Andrews' curve | Human reader | TP, FP, FPR |
| Noor et al, 2011 | Malaysia | Retrospective non-clinical study specific radiological | Not named | Human reader | Sn, Sp |
| Shen et al, 2010 | Canada | JSRT, Calgary | Not named | Human reader | TP, FPR |
| Mouton et al, 2010 | South Africa | Clinical dataset from previous study not specific to PTB | Not named | Human reader | AUC |
| Hogeweg et al, 2010 | Sub-Saharan Africa | Sub-Saharan Africa | CAD with rib suppression | Human reader | AUC |
| Hogeweg et al, 2010 | Not specified | Not specified | Not named | Human reader | NR |
| Lieberman et al, 2009 | China | Prospective, study-specific | Not named | Human reader | NR |
| Arzhaeva et al, 2009 | Netherlands | F&T | Not named | Human reader | AUC |
| Noor et al, 2005 | China, USA | MC, CH | Andrews' curve | Composite reference | NR |
| Koesoemadinata et al, 2018 | Indonesia | Prospective study-specific | CAD4TB | Liquid culture/NAAT | AUC, Sn, Sp |
| Melendez et al, 2018 | United Kingdom | Find & Treat | CAD4TB | Human reader, TB treatment | AUC, Sn, Sp, TP, FP, TN, FN |
| Zaidi et al, 2018 | Pakistan | Sehatmand Zindagi (Healthy Life) | CAD4TB | NAAT | AUC, Sn, Sp |
| Rahman et al, 2017 | Bangladesh | Prospective, study-specific | CAD4TB | NAAT | AUC, Sn, Sp |
| Melendez et al, 2017 | Zambia | Zambia National TB Prevalence Survey | CAD4TB | Human reader CXR-, Liquid culture/NAAT for CXR+ | AUC, Sn, Sp |
| Muyoyeta et al, 2017 | Zambia | Prospective, study-specific | CAD4TB | NAAT for CXR+, AFB Smear for CXR- | NR |
| Melendez et al, 2016 | South Africa | TB-NEAT collaborative study | CAD4TB | Liquid culture | AUC, Sn, Sp |
| Philipsen et al, 2015 | South Africa | TB-NEAT collaborative study | CAD4TB | NAAT, liquid culture | AUC, Sn, Sp |
| Steiner et al, 2015 | Tanzania | TB REACH project | CAD4TB | Human reader | AUC, Sn, Sp |
| Muyoyeta et al, 2015 | Zambia | Prospective, study-specific | CAD4TB | NAAT, AFB Smear for CXR- | AUC, Sn, Sp |
| Breuninger et al, 2014 | Tanzania | TB Cohort and TB CHILD study | CAD4TB | Liquid culture, AFB smear | AUC, Sn, Sp |
| Muyoyeta et al, 2014 | Zambia | Prospective, study-specific | CAD4TB | NAAT | AUC, Sn, Sp |
| Maduskar et al, 2013 | Zambia | Prospective, study-specific | CAD4TB | Liquid culture, AFB smear | AUC, Sn, Sp |
CXR, chest x-ray; USA, United States of America; UK, United Kingdom; AI, artificial intelligence; YU AWHE, Yonsei University annual worker's health examination; SNUH, Seoul National University Hospital; BMC, Boramae Medical Center; KUHG, Kyunghee University Hospital at Gangdong; DEMC, Daejeon Eulji Medical Center; MC, Montgomery County; CH, Shenzhen Hospital, China; IN, Indian collection New Delhi; TJH, Thomas Jefferson Hospital dataset; JSRT, Japanese Society of Radiology; KIT, Korean Institute of Tuberculosis; F&T, Find and Treat; DLAD, deep learning automatic detection; SVM, support vector machines; CBC, clustering based classification; CAD, computer aided detection; NAAT, nucleic acid amplification test; AFB, acid fast bacilli; ‘+’, positive; ‘-’, negative; AUC, area under the receiver operating characteristic curve; Sn, sensitivity; Sp, specificity; NR, not reported; TP, true positives; FP, false positives; FPR, false positive rate; TN, true negatives; FN, false negatives; ACC, accuracy.
* Trajman et al. Pleural fluid ADA, IgA-ELISA and NAAT sensitivities for the diagnosis of pleural tuberculosis Study
**Composite reference: positive culture/NAAT and/or initiation of TB treatment
† In these studies the study database was developed prospectively for the specific study.
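The accuracy measures abbreviated above (Sn, Sp, FPR) all derive from the four cells of a 2×2 table of CAD result against the reference standard. The following Python sketch shows the standard definitions; the function name `accuracy_measures` and the counts are hypothetical and are not drawn from any included study.

```python
def accuracy_measures(tp, fp, tn, fn):
    """Compute sensitivity, specificity and false positive rate.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives and false negatives against the reference standard.
    """
    sn = tp / (tp + fn)   # proportion of reference-positive CXRs flagged by CAD
    sp = tn / (tn + fp)   # proportion of reference-negative CXRs cleared by CAD
    return {"Sn": sn, "Sp": sp, "FPR": 1.0 - sp}


# Hypothetical counts, for illustration only.
measures = accuracy_measures(tp=80, fp=10, tn=90, fn=20)
print({name: round(value, 2) for name, value in measures.items()})
```

Studies that report only TP and FPR, as some rows in the table do, give enough to recover Sp but not Sn unless FN is also known.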
Accuracy measures reported by development studies.
| Author and year | Database(s) used for training of CAD | Number of CXRs used for training | Database(s) used for testing CAD | Number of CXRs used for testing | Number of TB positive CXRs | AUC (95% CI) | Threshold score | Sn (95% CI) | Sp (95% CI) |
|---|---|---|---|---|---|---|---|---|---|
| Heo et al, 2019 | YU AWHE | 2000 | YU AWHE | 37475 | 1202 | 0.91 (NR), 0.92 (NR) | NR | NR | NR |
| Hwang et al, 2018 | SNUH | 60989 | SNUH, BMC, KUHG, DEMC, MC, CH | NR | 6768 | 0.988 (0.976–0.999) | NR | 0.95 (SNUH), 0.94 (BMC), 1.0 (KUHG), 1.0 (DEMC), 1.0 (MC), 0.95 (CH) | 1.0 (SNUH), 0.96 (BMC), 0.91 (KUHG), 0.98 (DEMC), 0.94 (MC), 0.91 (CH) |
| Lakhani et al, 2017 | MC, CH, TJH, Belarus | 857 | MC, CH, TJH, Belarus | 150 | 75 | 0.99 (0.96–1.00) | NR | 0.97 (0.90–1.0) | 0.95 (0.87–0.98) |
| Santosh et al, 2017 | MC, CH, IN | 976 | MC, CH, IN | 976 | 478 | 0.92 (MC), 0.82 (CH), 0.96 (IN) | NR | 0.88 (MC), 0.78 (CH), 0.92 (IN) | 0.81 (MC), 0.76 (CH), 0.86 (IN) |
| Lopes et al, 2017 | NR | NR | CHMC, CI,NR | 1031 | 550 | 0.834 (CH) 0.926 (MC) | NR | NR | NR |
| Santosh et al, 2016 | NR | NR | CHMC, CI | 878 | 400 | 0.93 (CH) & 0.88 (MC) | NR | NR | NR |
| Hwang et al, 2016 | KIT | 9221 | KIT, MC, CH | 2427 | NR | 0.96 | NR | NR | NR |
| Ilena et al, 2018 | CH | 20 | CH | 30 | 15 | NR | NR | 0.67 (NR) | 0.86 (NR) |
| Rajaraman et al, 2018 | CH, MC, AMPATH, Kenya, IN | 2073 | CH, MC, Kenya, IN | 2073 | 785 | 0.991 (CH), 0.962 (MC), 0.826 (Kenya), 0.965 (IN) | NR | NR | NR |
| Sivaramakrishnan et al, 2018 | CH, MC, Kenya, IN | 1659 | CH, MC, Kenya, IN | 1228 | 785 | 0.926 (CH), 0.833 (MC), 0.775 (Kenya), 0.956 (IN) | NR | NR | NR |
| Vajda et al, 2018 | MC, CH | NR | MC, CH | 814 | 392 | 0.91 (MC), 0.99 (CH) | NR | NR | NR |
| Alfadhli et al, 2017 | MC | 97 | MC | 41 | 58 | 0.89 | NR | 0.79 | NR |
| Fatima et al, 2017 | MC | 138 | MC | 138 | 58 | NR | NR | 0.83 | 0.78 |
| Udayakumar et al, 2017 | MC, CH | NR | MC, CH | NR | NR | 0.87 | NR | 0.81 | 0.74 |
| Hogeweg et al, 2017 | JSRT, Sub-Saharan Africa | NR | Sub-Saharan Africa | 348 | 174 | 0.891 | NR | NR | NR |
| Ding et al, 2017 | NR | NR | Kenya, IN, CH | NR | NR | 0.949 (CH), 0.982 (IN), 0.76 (Kenya) | NR | NR | NR |
| Maduskar et al, 2016 | Large Zambian | 629 | Large Zambian | 638 | NR | 0.9 | NR | 0.83 | 0.70 |
| Poornimadevi et al, 2016 | JSRT | 247 | JSRT | 247 | NA | NR | NR | 0.56 | 0.36 |
| Karargyris et al, 2016 | CH | 43 | JSRT, CH | NR | NR | 0.93 | NR | NR | NR |
| Melendez et al, 2016 | Zambian | 461 | Zambian | 456 | 248 | 0.87 | 0.45 | NR | NR |
| Melendez et al, 2015 | Zambian, Tanzania, Gambian | 1323 | Zambian, Tanzania, Gambian | 1313 | 671 | 0.86 (Zambia), 0.88 (Tanzania), 0.91 (Gambia) | NR | NR | NR |
| Hogeweg et al, 2015 | F&T, TB-NEAT | 400 | F&T, TB-NEAT | 400 | 153 | 0.87 (0.81–0.92) (F&T), 0.74 (0.69–0.83) (TB-NEAT) | NR | NR | NR |
| Jaeger et al, 2014 | MC, CH, JSRT | 1000 | MC, CH | 753 | 333 | 0.87 | NR | 0.78 (0.70–0.85) | 0.81 (0.71–0.89) |
| Melendez et al, 2014 | Zambian | 461 | Zambian | 456 | NR | 0.88 | NR | NR | NR |
| Chauhan et al, 2014 | IN | 204 | IN | 102 | 153 | 0.96 (0.86–0.99) (DA), 0.89 (0.77–0.96) (DB) | NR | 0.96 (DA), 0.88 (DB) | 0.92 (DA), 0.84 (DB) |
| Sundaram et al, 2013 | NR | 95 | NR | 95 | 52 | NR | NR | 0.75 | 0.90 |
| Jaeger et al, 2012 | JSRT | 247 | MC | 138 | NR | 0.83 | NR | NR | NR |
| Xu et al, 2011 | JSRT, Calgary | 60 | JSRT, Calgary | 60 | NR | NR | NR | 0.68 | 0.68 |
| Noor et al, 2011 | Retrospective non-clinical | 90 | Retrospective non-clinical | 213 | 208 | NR | NR | 0.88 | 0.84 |
| Shen et al, 2010 | JSRT, Calgary | 18 | JSRT, Calgary | 131 | 19 | NR | NR | 0.82 | NR |
| Mouton et al, 2010 | Clinical non-TB specific | 119 | Clinical non-TB specific | 119 | NR | NR | 0.78 | NR | NR |
| Hogeweg et al, 2017 | CRASS | 348 | CRASS, JSRT | 498 | NR | 0.75 | NR | NR | NR |
| Arzhaeva et al, 2009 | F&T | 217 | F&T | 217 | 37 | NR | 0.83 TB-sus, 0.74 micro | NR | NR |
CAD, computer aided detection; YU AWHE, Yonsei University annual worker's health examination; SNUH, Seoul National University Hospital; BMC, Boramae Medical Center; KUHG, Kyunghee University Hospital at Gangdong; DEMC, Daejeon Eulji Medical Center; MC, Montgomery County; CH, Shenzhen Hospital, China; IN, Indian collection New Delhi; TJH, Thomas Jefferson Hospital dataset; AMPATH, Academic Model Providing Access to Healthcare; JSRT, Japanese Society of Radiology; KIT, Korean Institute of Tuberculosis; F&T, Find and Treat; AUC, area under the receiver operating characteristic curve; 95% CI, 95 percent confidence interval; NR, not reported; DA, dataset A; DB, dataset B; Sn, sensitivity; Sp, specificity; TP, true positives; FP, false positives; FPR, false positive rate; TB-sus, TB suspect.
* No 95% CI reported
+ Average AUC from KIT, MC, Shenzhen
++ 128 of the normal images were the same CXRs used in the training
# Both an external and a radiological reference standard were used. The external reference for tuberculosis was set by an independent test not associated with the CXR: the result of sputum culture testing for the TB-NEAT database, and a combination of sputum culture testing and clinical diagnosis for the Find & Treat database
## Two CXR digital image datasets, dataset A and B, were obtained from two different X-ray machines available at the National Institute of Tuberculosis and Respiratory Diseases, New Delhi
†The database was split between TB suspect cases were re-read by a third radiologist, and if classified differently were excluded. The database contained 256 normal radiographs, 178 TB suspect radiographs, and 37 microbiologically diagnosed TB CXRs.
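Most development studies in the table report AUC as their main accuracy measure. As a rough illustration of what that number means, the Python sketch below computes AUC as the probability that a randomly chosen TB-positive CXR receives a higher CAD score than a randomly chosen TB-negative one, with ties counted as one half. The function name `auc_from_scores` and the scores are hypothetical; the included studies may have used other estimators or software.

```python
def auc_from_scores(scores_tb, scores_non_tb):
    """Estimate AUC by comparing every TB score with every non-TB score.

    Each pair contributes 1 if the TB CXR scores higher, 0.5 on a tie,
    and 0 otherwise; the AUC is the mean contribution over all pairs.
    """
    wins = 0.0
    for s_pos in scores_tb:
        for s_neg in scores_non_tb:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    return wins / (len(scores_tb) * len(scores_non_tb))


# Hypothetical CAD scores, for illustration only.
tb_scores = [0.9, 0.8, 0.4]
non_tb_scores = [0.7, 0.3, 0.2]

print(round(auc_from_scores(tb_scores, non_tb_scores), 3))
```

Because AUC is computed over all possible cut-offs, it does not by itself indicate the Sn and Sp achieved at the threshold a program would use in practice, which is why the table lists the threshold score separately.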
Fig 2. Quality assessment (QUADAS-2) graph of development studies.
Fig 3. Quality assessment (QUADAS-2) graph of clinical studies.
Fig 4. Forest plots of accuracy measures of development and CAD4TB studies.
TP, true positive; FP, false positive; FN, false negative; TN, true negative; AI, artificial intelligence; CXRs, chest x-rays; ML, machine learning; DL, deep learning; CI, confidence interval; NAAT, nucleic acid amplification test.
Fig 5. Boxplots of the AUC of studies stratified by software design, CXR usage, reference standard, and degree of patient selection, index test, and reference standard bias.
AUC, area under the curve; Vs, versus; CXR, chest x-ray.
| • For the databases used to assess CAD accuracy, describe whether CXRs had been used for triage or screening purposes. |
| • Apply QUADAS-2 to assess the risk of bias in the databases used to evaluate CAD’s diagnostic accuracy. |
| • Describe how CXRs were selected for training and testing. |
| • Use different CXRs from separate databases for training and testing. |
| • Clearly define true positive PTB. |
| • Use a microbiologic reference standard of culture (preferred) or NAAT. |
| • For CAD that output a continuous score, preferably pre-specify the threshold used to differentiate between a positive and negative CAD result. |
| • For CAD that output a continuous score, report how the threshold score was determined. |
| • State whether pre-training/verification of CAD with local CXRs is required prior to use in each setting. |
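
Two of the recommendations above, pre-specifying the threshold and testing on CXRs from a separate database, can be sketched together in Python. The snippet fixes a threshold on one dataset and then applies it unchanged to a second dataset. The function names, the rule for choosing the threshold (the highest observed score that reaches a target sensitivity), and all scores and labels are assumptions for illustration, not a method taken from the reviewed studies.

```python
def sn_sp_at_threshold(scores, labels, threshold):
    """Return (Sn, Sp) when CXRs scoring >= threshold are called positive.

    labels: 1 for reference-standard TB, 0 for non-TB.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        cad_positive = score >= threshold
        if label == 1:
            if cad_positive:
                tp += 1
            else:
                fn += 1
        else:
            if cad_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def pick_threshold(scores, labels, target_sn):
    """Pick the highest observed score whose Sn meets target_sn (hypothetical rule)."""
    for candidate in sorted(set(scores), reverse=True):
        sn, _ = sn_sp_at_threshold(scores, labels, candidate)
        if sn >= target_sn:
            return candidate
    raise ValueError("no threshold reaches the target sensitivity")


# Hypothetical data: the threshold is fixed on one database...
dev_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
dev_labels = [1, 1, 1, 1, 0, 0, 0, 0]
threshold = pick_threshold(dev_scores, dev_labels, target_sn=0.75)

# ...and then applied, unchanged, to CXRs from a separate database.
test_scores = [0.95, 0.65, 0.5, 0.85, 0.7, 0.3, 0.2, 0.1]
test_labels = [1, 1, 1, 1, 0, 0, 0, 0]
sn, sp = sn_sp_at_threshold(test_scores, test_labels, threshold)
print(threshold, sn, sp)
```

Reporting the rule used in `pick_threshold` alongside the resulting Sn and Sp is what lets readers judge whether the threshold was chosen independently of the test data.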