Minghuan Wang1, Chen Xia2, Lu Huang3, Shabei Xu1, Chuan Qin1, Jun Liu4, Ying Cao2, Pengxin Yu2, Tingting Zhu3, Hui Zhu3, Chaonan Wu3, Rongguo Zhang2, Xiangyu Chen5, Jianming Wang6, Guang Du7, Chen Zhang5, Shaokang Wang2, Kuan Chen2, Zheng Liu8, Liming Xia3, Wei Wang1.
Abstract
Background: Prompt identification of patients suspected to have COVID-19 is crucial for disease control. We aimed to develop a deep learning algorithm on the basis of chest CT for rapid triaging in fever clinics.
Year: 2020 PMID: 32984796 PMCID: PMC7508506 DOI: 10.1016/S2589-7500(20)30199-0
Source DB: PubMed Journal: Lancet Digit Health ISSN: 2589-7500
Figure 1: Development and validation of a deep learning algorithm to provide rapid triage in fever clinics and to automatically analyse lung opacities on the basis of chest CT scans
(A) Overview of the development and validation of the algorithm. (B) Evaluation of triage efficiency. Black lines show the standard workflow in Chinese fever clinics: after a patient's CT examination is completed, a first reader drafts a radiology report in first-in-first-out order, and a second radiologist then revises and approves that report before sending it to a fever clinician. On receiving the radiological report, the fever clinician decides whether the patient qualifies as a suspected case and should receive RT-PCR testing. We proposed that the workflow in fever clinics could be expedited by directly notifying either the second radiologist (ie, scan-to-second-reader triage; red line) or the fever clinician (ie, scan-to-fever-clinician triage; green line) of suspected cases triaged by AI. AI=artificial intelligence.
Figure 2: Data collection
(A) Dataset for algorithm development and internal validation. (B) Dataset for external validation.
Characteristics of the Tongji dataset used for algorithm development and internal validation
| | Total | Development set | Internal validation set | |
| Sex | ||||
| Male | 1544 (50%) | 1235 (50%) | 309 (48%) | |
| Female | 1542 (50%) | 1212 (50%) | 330 (52%) | |
| Age, years | 55 (39–67) | 55 (39–67) | 55 (38–66) | |
| COVID-19 positive CT scans | 2086 (68%) | 1647 (67%) | 439 (69%) | |
| COVID-19 negative CT scans | 1000 (32%) | 800 (33%) | 200 (31%) | |
| CT manufacturers | ||||
| GE Medical System (Chicago, IL, USA) | 140 (5%) | 122 (5%) | 18 (3%) | |
| MinFound Medical Systems (Shaoxing, China) | 6 (<1%) | 6 (<1%) | 0 | |
| Siemens (Munich, Germany) | 2180 (71%) | 1723 (70%) | 457 (72%) | |
| Toshiba (Tokyo, Japan) | 74 (2%) | 51 (2%) | 23 (4%) | |
| United Imaging Healthcare (Shanghai, China) | 686 (22%) | 545 (22%) | 141 (22%) | |
Data are n (%), or median (IQR).
Characteristics of the external validation dataset and accuracy of AI-aided triage
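| | Overall | Tianyou Hospital (Wuhan) | Xianning Central Hospital (Xianning) | Second Xiangya Hospital (Changsha) | |
| Sex | |||||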
| Male | 1079 (51%) | 506 (46%) | 463 (56%) | 110 (54%) | |
| Female | 1041 (49%) | 591 (54%) | 357 (44%) | 93 (46%) | |
| Age, years | 43 (31–56) | 48 (36–58) | 34 (27–49) | 45 (34–63) | |
| CT scans, n | 2120 | 1097 | 820 | 203 | |
| CT manufacturers | |||||
| GE Medical System (Chicago, IL, USA) | 1730 (82%) | 978 (90%) | 752 (92%) | 0 | |
| Siemens (Munich, Germany) | 271 (13%) | 0 | 68 (8%) | 203 (100%) | |
| United Imaging Healthcare (Shanghai, China) | 119 (6%) | 119 (11%) | 0 | 0 | |
| CT findings | |||||
| Positive | 802 (38%) | 547 (50%) | 180 (22%) | 75 (37%) | |
| Negative | 1318 (62%) | 550 (50%) | 640 (78%) | 128 (63%) | |
| AI-aided triage performance | |||||
| Sensitivity (95% CI) | 0·923 (0·914–0·932) | 0·934 (0·925–0·944) | 0·900 (0·880–0·924) | 0·893 (0·862–0·932) | |
| Specificity (95% CI) | 0·851 (0·842–0·860) | 0·855 (0·840–0·868) | 0·859 (0·846–0·874) | 0·789 (0·752–0·828) | |
| Positive predictive value (95% CI) | 0·790 (0·777–0·803) | 0·865 (0·851–0·878) | 0·643 (0·613–0·673) | 0·713 (0·662–0·764) | |
| Negative predictive value (95% CI) | 0·948 (0·941–0·954) | 0·929 (0·919–0·940) | 0·968 (0·962–0·976) | 0·927 (0·905–0·953) | |
| AUC (95% CI) | 0·953 (0·949–0·959) | 0·966 (0·961–0·971) | 0·931 (0·921–0·945) | 0·908 (0·888–0·929) | |
| RT-PCR testing | |||||
| Positive | 217/910 (24%) | 118/411 (29%) | 87/369 (24%) | 12/130 (9%) | |
| Negative | 693/910 (76%) | 293/411 (71%) | 282/369 (76%) | 118/130 (91%) | |
| AI-aided triage performance | |||||
| Sensitivity (95% CI) | 0·876 (0·854–0·898) | 0·907 (0·883–0·935) | 0·839 (0·803–0·880) | 0·833 (0·750–1·000) | |
| Specificity (95% CI) | 0·519 (0·501–0·539) | 0·386 (0·358–0·401) | 0·660 (0·633–0·686) | 0·517 (0·473–0·562) | |
| Positive predictive value (95% CI) | 0·363 (0·343–0·384) | 0·373 (0·345–0·401) | 0·432 (0·392–0·469) | 0·149 (0·109–0·189) | |
| Negative predictive value (95% CI) | 0·930 (0·918–0·944) | 0·911 (0·889–0·939) | 0·930 (0·913–0·949) | 0·968 (0·957–1·000) | |
| AUC (95% CI) | 0·774 (0·757–0·791) | 0·725 (0·699–0·751) | 0·837 (0·810–0·866) | 0·679 (0·608–0·781) | |
| RT-PCR positive and CT positive cases | 191/217 (88%) | 109/118 (92%) | 72/87 (83%) | 10/12 (83%) | |
| Sensitivity of AI-aided triage on RT-PCR positive and CT positive cases (95% CI) | 0·974 (0·966–0·987) | 0·972 (0·964–0·989) | 0·972 (0·963–1·000) | 1·000 (1·000–1·000) | |
Data are n (%), median (IQR), or n/N (%), unless otherwise stated. AUC=area under the receiver operating characteristic curve.
Dataset from Wuhan (Hubei province): 2018 population, 8 837 300 (according to National Bureau of Statistics of China), 50 006 confirmed COVID-19 cases (calculated up to March 25, 2020, according to the National Health Commission of the People's Republic of China), and a disease prevalence of 0·566% (ie, number of confirmed cases in the total population).
Dataset from Xianning (Hubei province): 2018 population, 2 485 000 (according to National Bureau of Statistics of China), 836 confirmed COVID-19 cases (calculated up to March 25, 2020, according to the National Health Commission of the People's Republic of China), and a disease prevalence of 0·034%.
Dataset from Changsha (Hunan province): 2018 population, 7 288 600 (according to 2018 National Bureau of Statistics of China data), 242 confirmed COVID-19 cases (calculated up to March 25, 2020, according to the National Health Commission of the People's Republic of China), and a disease prevalence of 0·003%.
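The prevalence figures in the three dataset notes follow directly from the quoted case counts and populations. A minimal Python sketch of that arithmetic is given here as a check; the names `CITIES` and `prevalence_percent` are illustrative and do not come from the source.

```python
# Confirmed cases and 2018 populations as quoted in the dataset notes above.
CITIES = {
    "Wuhan": (50_006, 8_837_300),
    "Xianning": (836, 2_485_000),
    "Changsha": (242, 7_288_600),
}


def prevalence_percent(confirmed: int, population: int) -> float:
    """Confirmed cases as a percentage of the total population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * confirmed / population


for city, (confirmed, population) in CITIES.items():
    # Three decimal places, matching the precision used in the notes.
    print(f"{city}: {prevalence_percent(confirmed, population):.3f}%")
```

Rounded to three decimal places, the results agree with the quoted 0·566%, 0·034%, and 0·003%.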
Of 802 positive CT scans, 772 clearly mentioned COVID-19 signs in the radiological impression section and 30 had an ambiguous radiological impression description but described COVID-19 signs in the radiological findings section (11 in the Tianyou Hospital dataset, nine in the Xianning Central Hospital dataset, and ten in the Second Xiangya Hospital dataset).
Of 1318 negative CT studies, 593 had negative CT findings and 725 had positive findings not associated with COVID-19 (324 in the Tianyou Hospital dataset, 291 in the Xianning Central Hospital dataset, and 110 in the Second Xiangya Hospital dataset).
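Radiological CT findings were used as the reference standard for the AI-aided triage performance rows listed under CT findings.
RT-PCR was used as the reference standard for the AI-aided triage performance rows listed under RT-PCR testing.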
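The positive and negative predictive values in the external validation table depend on how common positive cases are in each dataset, which is why they differ more between columns than sensitivity and specificity do. The sketch below applies the standard Bayes relation to the rounded point estimates in the first data column; it is an illustrative check, not the authors' analysis code, and `predictive_values` is a hypothetical helper name.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1.0 - sensitivity) * prevalence      # false-negative fraction
    tn = specificity * (1.0 - prevalence)      # true-negative fraction
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# CT findings as reference: 802 of 2120 scans were positive.
ppv_ct, npv_ct = predictive_values(0.923, 0.851, 802 / 2120)
print(f"CT reference:     PPV={ppv_ct:.3f}  NPV={npv_ct:.3f}")

# RT-PCR as reference: 217 of 910 tested patients were positive.
ppv_pcr, npv_pcr = predictive_values(0.876, 0.519, 217 / 910)
print(f"RT-PCR reference: PPV={ppv_pcr:.3f}  NPV={npv_pcr:.3f}")
```

Because the inputs are already rounded to three decimal places, the results should match the tabulated values (0·790 and 0·948 with CT findings as reference; 0·363 and 0·930 with RT-PCR) only to within rounding.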
Figure 3: AI triage accuracy for the internal validation set, external validation set overall, and three individual hospital datasets
The black points indicate the sensitivity and specificity thresholds used. AUC=area under the receiver operating characteristic curve.
Triage efficiency for the external validation set
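| | Overall | Tianyou Hospital (Wuhan) | Xianning Central Hospital (Xianning) | Second Xiangya Hospital (Changsha) | |
| True positive scans, n | 698 | 511 | 129 | 58 | |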
| Median draft report time (IQR), min | 16·21 (11·67–25·71) | 14·50 (10·75–21·11) | 24·75 (15·67–43·72) | 25·73 (20·10–38·84) | |
| Median report approval time (IQR), min | 23·06 (15·67–39·20) | 19·23 (14·33–27·33) | 47·37 (27·12–96·35) | 198·77 (45·85–675·83) | |
| Median AI triage time (IQR), min | 0·55 (0·43–0·63) | 0·58 (0·45–0·64) | 0·50 (0·42–0·58) | 0·48 (0·44–0·53) | |
| Median reduction in triage time under scan-to-second-reader triage workflow (IQR), min | 15·73 (11·05–25·25) | 14·03 (10·13–20·55) | 24·31 (15·13–43·40) | 25·24 (19·65–38·48) | |
| p value | <0·0001 | <0·0001 | <0·0001 | <0·0001 | |
| 257·42 | 243·09 | 114·59 | 69·32 | ||
| Median reduction in triage time under scan-to-fever-clinician triage workflow (IQR), min | 22·62 (15·12–38·63) | 18·77 (13·88–26·73) | 47·03 (26·53–95·83) | 198·28 (45·31–675·26) | |
| p value | <0·0001 | <0·0001 | <0·0001 | <0·0001 | |
| 188·08 | 253·61 | 87·37 | 50·78 | ||
Raw data did not follow a normal distribution.
Calculated by comparing the results with zero.
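The reductions in the triage efficiency table are summarised as a median with IQR, presumably because the raw data were not normally distributed. They are also slightly different from the difference between the two tabulated medians, which is consistent with the reduction being computed per scan and then summarised. The sketch below illustrates that order of operations under this assumption. The sample times are invented for illustration and are not study data, and the helper names are hypothetical.

```python
from statistics import median, quantiles

# Hypothetical per-scan times in minutes (NOT study data).
draft_report_min = [12.0, 15.5, 18.0, 22.5, 30.0]   # standard workflow
ai_triage_min = [0.50, 0.60, 0.40, 0.55, 0.50]      # AI triage


def triage_time_reduction(standard, ai):
    """Per-scan time saved when AI triage replaces the standard step."""
    if len(standard) != len(ai):
        raise ValueError("both lists must describe the same scans")
    return [s - a for s, a in zip(standard, ai)]


def median_iqr(values):
    """Return (median, first quartile, third quartile)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


reductions = triage_time_reduction(draft_report_min, ai_triage_min)
med, q1, q3 = median_iqr(reductions)
print(f"Median reduction {med:.2f} min (IQR {q1:.2f}-{q3:.2f})")
```

Quartile conventions differ between statistical packages, so the IQR bounds from `method="inclusive"` may not match another tool exactly on small samples.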
Performance of AI and radiologists for the identification of changes in lesion burden between two CT scans
| True positive | 50 | 47 | 52 | 51 |
| True negative | 42 | 47 | 42 | 45 |
| False positive | 6 | 1 | 6 | 3 |
| False negative | 2 | 5 | 0 | 1 |
| Accuracy (95% CI) | 0·920 (0·900–0·950) | 0·940 (0·925–0·962) | 0·940 (0·925–0·962) | 0·960 (0·950–0·988) |
| Sensitivity (95% CI) | 0·962 (0·947–1·000) | 0·904 (0·872–0·951) | 1·000 (1·000–1·000) | 0·981 (0·974–1·000) |
| Specificity (95% CI) | 0·875 (0·833–0·923) | 0·979 (0·971–1·000) | 0·875 (0·833–0·923) | 0·938 (0·917–0·974) |
Data are n, unless stated otherwise. 52 patients had an increase in lesion burden volume and were defined as positive. 48 patients did not have any increase in lesion burden volume and were defined as negative. We presented the complete information to show interrater variability. AI=artificial intelligence.
Correct prediction of lesion burden volume increase.
Correct prediction of no increase in lesion burden volume.
Incorrect prediction of lesion burden volume increase.
Incorrect prediction of no increase in lesion burden volume.
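The accuracy, sensitivity, and specificity point estimates in the lesion burden table can be recomputed from the true and false positive and negative counts. The following sketch does so for the four columns in table order. Because the column headings (AI and individual radiologists) are not available in this record, the columns are indexed by position only, and the `Confusion` class is an illustrative helper rather than anything from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # correct prediction of lesion burden volume increase
    tn: int  # correct prediction of no increase
    fp: int  # incorrect prediction of increase
    fn: int  # incorrect prediction of no increase

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.tn + self.fp + self.fn)

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


# Counts copied from the table, one entry per column, in table order.
COLUMNS = [
    Confusion(tp=50, tn=42, fp=6, fn=2),
    Confusion(tp=47, tn=47, fp=1, fn=5),
    Confusion(tp=52, tn=42, fp=6, fn=0),
    Confusion(tp=51, tn=45, fp=3, fn=1),
]

for i, c in enumerate(COLUMNS, start=1):
    print(f"column {i}: accuracy={c.accuracy:.3f} "
          f"sensitivity={c.sensitivity:.3f} specificity={c.specificity:.3f}")
```

Each column sums to 52 positive and 48 negative patients, matching the table note, and the recomputed point estimates agree with the tabulated values to three decimal places. The confidence intervals are not reproduced here, since the method used to derive them is not stated in this record.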