Literature DB >> 34580550

A practical artificial intelligence system to diagnose COVID-19 using computed tomography: A multinational external validation study.

Ali Abbasian Ardakani1, Robert M Kwee2, Mohammad Mirza-Aghazadeh-Attari3, Horacio Matías Castro4, Taha Yusuf Kuzan5, Kübra Murzoğlu Altintoprak5, Giulia Besutti6,7, Filippo Monelli6,7, Fariborz Faeghi1, U Rajendra Acharya8,9,10, Afshin Mohammadi11.   

Abstract

Computed tomography has gained an important role in the early diagnosis of COVID-19 pneumonia. However, the ever-increasing number of patients has overwhelmed radiology departments and reduced the quality of services. Artificial intelligence (AI) systems could help remedy this situation, but their limited application under real-world conditions has restricted their adoption in clinical settings. This study validated a clinical AI system, COVIDiag, to aid radiologists in the accurate and rapid evaluation of COVID-19 cases. Fifty COVID-19 and 50 non-COVID-19 pneumonia cases were included from each of five centers in Argentina, Turkey, Iran, the Netherlands, and Italy; the Dutch database included only 50 COVID-19 cases. The performance parameters, namely sensitivity, specificity, accuracy, and area under the ROC curve (AUC), were computed for each database using the COVIDiag model. The most common pattern of involvement among COVID-19 cases in all databases was bilateral involvement of upper and lower lobes with ground-glass opacities. The best sensitivity of 92.0% was recorded for the Italian database. The system achieved AUCs of 0.983, 0.914, 0.910, and 0.882 for Argentina, Turkey, Iran, and Italy, respectively, and a sensitivity of 86.0% for the Dutch database. Across all cohorts, the COVIDiag model diagnosed COVID-19 pneumonia with an AUC of 0.921 (sensitivity, specificity, and accuracy of 88.8%, 87.0%, and 88.0%, respectively). Our study confirmed the accuracy of the proposed AI model (COVIDiag) in the diagnosis of COVID-19 cases. Furthermore, the system demonstrated consistently good diagnostic performance on multinational databases, which is critical for establishing the generalizability and objectivity of the proposed COVIDiag model. Our results are significant as they provide real-world evidence for the applicability of AI systems in clinical medicine.
© 2021 Elsevier B.V. All rights reserved.


Keywords:  Artificial intelligence; Coronavirus infections; Machine learning; Pneumonia; Tomography, X-ray computed

Year:  2021        PMID: 34580550      PMCID: PMC8457921          DOI: 10.1016/j.patrec.2021.09.012

Source DB:  PubMed          Journal:  Pattern Recognit Lett        ISSN: 0167-8655            Impact factor:   3.756


Introduction

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a "once in a generation" pandemic. Conventionally, the resulting disease, termed coronavirus disease 2019 (COVID-19), was diagnosed early by first-line clinicians or health workers and confirmed with reverse transcription-polymerase chain reaction (RT-PCR) [1]. The high specificity of RT-PCR is guaranteed by diagnostic kits that target specific regions of the viral ribonucleic acid (RNA) that are not found in other members of the coronavirus family [2]. However, this comes at the cost of reduced sensitivity, as the detection of these sequences depends on high levels of technical and procedural efficacy. Early studies suggested that the sensitivity of RT-PCR was below 70 percent: a subgroup of patients tested negative for the virus while asymptomatic and tested positive after becoming symptomatic, and some symptomatic COVID-19 patients never had a positive RT-PCR test result [3], [4], [5], [6]. To overcome these limitations and address the accessibility and availability of commercial tests, a handful of other diagnostic methods have been considered, especially when RT-PCR is unavailable or its results are delayed [7]. One of these is chest CT, which was shown to be sensitive in diagnosing COVID-19 pneumonia and may be used as an adjunct to RT-PCR [8], [9], [10]. Early chest CT of the disease showed a dominant pattern consisting of bilateral ground-glass opacities (GGOs) located at the periphery of the lungs. In later stages, patients developed airspace consolidation and, in some cases, fibrous band formation. Although no pathognomonic signs are seen on CT images, the distribution of opacities, especially during the disease peak when prevalence was high, made diagnosis by radiologists possible [11]. 
Chest CT was considered for the initial triage of patients and was later incorporated into treatment guidelines as a proxy indicator of disease severity. Some studies also suggested a prognostic role for the initial CT images obtained from patients [12]. This reliance on imaging highlighted the inherent limitations of radiology practice. First, limited access to suitable imaging machines and expert radiologists may restrict the wide-scale use of chest imaging on the front lines [13]. Second, correct diagnosis may be challenging, as a wide range of conditions, such as other viral and atypical pneumonias, can have similar radiologic presentations [14]. Finally, an overload of patients and imaging output may heavily burden radiology units and strain infrastructure and human resources, reducing the quality of image interpretation [15,16]. To mitigate these limitations, novel computer-aided diagnosis (CAD) systems have been proposed. CT images are used as input to these CAD systems, as they are widely used in clinical settings and have achieved acceptable accuracy in detecting COVID-19. Such systems have been fed either the images themselves or the interpretations of the images [17,18]. A CAD system can be an optimal aid in the diagnosis of COVID-19, as it relies on limited information from a human source, is fast and accurate, and omits the subjective weighing of different inputs that radiologists often perform. In fact, objectifying this step of diagnosis may prove beneficial to any form of evidence-based clinical decision-making. To achieve a desirable level of generalizability, any CAD system needs to be validated on multinational external databases to confirm consistent performance across different clinical settings, imaging protocols, populations, and countries. 
Furthermore, these systems should be compared with radiologists to determine whether their relative efficacy would help in real clinical situations [19]. Importantly, most of the AI systems currently being considered for COVID-19 detection have not been compared with radiologists [20]. In the present study, we validate a clinical CAD system (COVIDiag) that previously yielded better diagnostic performance than radiologists, and we test the performance of the model on databases from different countries to determine its efficiency in routine daily practice [17]. The COVIDiag model was validated using five databases from five different countries. The proposed COVIDiag software is made freely available and attached to the article.

Patients and Methods

The present study reports the results of the real-world application of an AI system based on CT findings of COVID-19 and non-COVID-19 patients. The COVIDiag model was built and validated (internal validation) in our previous publication [17]. For external validation purposes, the databases used in this study were gathered independently from different countries and were interpreted by independent readers.

Patients

This study was approved by the Institutional Review Boards of the respective centers. Laboratory-confirmed COVID-19 and non-COVID-19 patients were included from five centers in different countries and continents: 1- Hospital Italiano de Buenos Aires, Buenos Aires, Argentina; 2- University of Health Sciences, Sancaktepe Şehit Prof. Dr. İlhan Varank Training and Research Hospital, Istanbul, Turkey; 3- Imam-Asadabadi Hospital, Tabriz University of Medical Sciences, Tabriz, Iran; 4- Zuyderland Medical Center, Heerlen/Sittard-Geleen, The Netherlands; and 5- AUSL-IRCCS di Reggio Emilia, Reggio Emilia, Italy. COVID-19 was confirmed by RT-PCR testing of nasopharyngeal and/or pharyngeal swabs. Fig. 1 demonstrates the geographical distribution of the databases.
Fig. 1

Geographical distribution of databases used in this study.

Exclusion criteria consisted of individuals with suggestive clinical symptoms but negative RT-PCR test results, and those with pre-existing chronic lung conditions such as silicosis, idiopathic fibrosis, or cystic fibrosis. Patients with a gap of more than 3 days between PCR and CT examination were also excluded. The non-COVID-19 group consisted of individuals with other causes of atypical and viral pneumonia, acquired before the COVID-19 pandemic to eliminate uncertainties and confounding factors.

CT imaging and features extraction

All cases had undergone high-resolution CT (HRCT) imaging. The HRCT protocols of each imaging center are presented in Table 1. Fifteen CT features that were useful for diagnosing COVID-19 cases [17] were extracted by an expert pulmonologist with 6 years of experience (Argentinian database) and by expert radiologists with 5 (Dutch), 6 (Turkish), 5 (Iranian), and 5 (Italian) years of experience in thoracic imaging:
Table 1

Computed tomography parameters of five centers used in this study.

| Center          | Scanner                          | Tube voltage (kVp) | Tube current (mAs) | Pitch   | Matrix    | Reconstruction slice thickness (mm) | Reconstruction algorithm                             |
| Argentina       | Canon Aquilion 64                | 120                | 50-100             | 1.5     | 512 × 512 | 1                                   | Adaptive Iterative Dose Reduction 3D (AIDR 3D)       |
|                 | Canon Activion 16                | 120                | 50-100             | 0.9     | 512 × 512 | 2                                   | Adaptive Iterative Dose Reduction 3D (AIDR 3D)       |
| Turkey          | GE Optima 520                    | 120                | 100-200            | 0.8-2.0 | 512 × 512 | 1.25                                | Adaptive Statistical Iterative Reconstruction (ASIR) |
| Iran            | Siemens SOMATOM Scope            | 120                | 50-100             | 0.8-1.5 | 512 × 512 | 1.5                                 | Model-based Iterative Reconstruction (MBIR)          |
| The Netherlands | Philips Incisive                 | 120                | 73                 | 1.0     | 512 × 512 | 1                                   | Iterative                                            |
|                 | Siemens SOMATOM Definition Flash | 120                | 85                 | 1.2     | 512 × 512 | 1                                   | Iterative                                            |
| Italy           | Siemens SOMATOM Definition Edge  | 120                | 50-150             | 1.2     | 512 × 512 | 1                                   | Advanced Modeled Iterative Reconstruction (ADMIRE)   |
1. Location of the lesion, 1 (side of involvement: unilateral vs. bilateral)
2. Location of the lesion, 2 (position of involved lobes: lower, upper, or both lobes)
3. Distribution pattern (central, peripheral, or both)
4. Number of lesions (single, if one patch of a lesion existed; multiple, if 2-4 patches of lesions existed; diffuse, if the entire lobe was affected bilaterally)
5. GGO (hazy, ill-defined opacities that do not obscure the underlying lung parenchyma)
6. Consolidation (an opacification of the lung field that obscures the underlying lung parenchyma)
7. Reticular opacity (linear lesions with a thickness of 3 mm or less)
8. Nodules (round or oval lesions with well-defined margins, regardless of lesion diameter)
9. Bronchial wall thickening (abnormal thickening of bronchial walls, usually due to an inflammatory response)
10. Air bronchogram (air-filled bronchi made visible by the opacification of the surrounding alveoli)
11. Cavity
12. Crazy-paving (the simultaneous appearance of GGO and interlobular and intralobular septal thickening)
13. Pleural effusion (abnormal accumulation of fluid within the pleural space, without regard to the characteristics of the fluid)
14. Pleural thickening (thickening of either the parietal or visceral pleura; less than 5 mm in most benign causes)
15. Lymphadenopathy (a lymph node measuring 10 mm or more in the short axis)
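These fifteen findings are the only inputs the model receives. As a minimal sketch of how a single case might be represented, the first four findings can be encoded as categorical integers and the remaining eleven as binary flags. Note that all field names below are our own and the actual COVIDiag input format is not specified in this excerpt:

```python
# Hypothetical encoding of the 15 CT findings for one case; the real
# COVIDiag software's input format is not described in the text.
CATEGORICAL = {
    "side": ["unilateral", "bilateral"],
    "lobes": ["lower", "upper", "both"],
    "distribution": ["central", "peripheral", "both"],
    "lesion_count": ["single", "multiple", "diffuse"],
}
BINARY = [
    "ggo", "consolidation", "reticular_opacity", "nodules",
    "bronchial_wall_thickening", "air_bronchogram", "cavity",
    "crazy_paving", "pleural_effusion", "pleural_thickening",
    "lymphadenopathy",
]

def encode_case(case: dict) -> list:
    """Map one case's CT findings to a 15-element integer feature vector."""
    vec = [CATEGORICAL[k].index(case[k]) for k in CATEGORICAL]
    vec += [int(case[k]) for k in BINARY]
    return vec

# A typical COVID-19 presentation from the Results section: bilateral,
# both lobes, peripheral, multiple lesions, GGO with crazy-paving.
example = {
    "side": "bilateral", "lobes": "both", "distribution": "peripheral",
    "lesion_count": "multiple", "ggo": True, "consolidation": False,
    "reticular_opacity": False, "nodules": False,
    "bronchial_wall_thickening": False, "air_bronchogram": False,
    "cavity": False, "crazy_paving": True, "pleural_effusion": False,
    "pleural_thickening": False, "lymphadenopathy": False,
}
vec = encode_case(example)  # 15 integers, one per CT finding
```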

Artificial intelligence technique

In this study, the validity of the COVIDiag model for differentiating COVID-19 from non-COVID-19 cases was evaluated. The COVIDiag model is based on ensemble learning and diagnosed COVID-19 and non-COVID-19 cases with an area under the ROC curve (AUC) of 0.988 and 0.965 in the training and testing datasets of the Ardakani et al. study, respectively [17]. All fifteen CT findings from each database were fed into the COVIDiag model in the order listed above. The COVIDiag software is made freely available and attached to the article.
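The excerpt names ensemble learning but does not describe the underlying learners (the details are in reference [17]). As one concrete instance of the general idea, the sketch below bags one-rule classifiers over categorical findings and combines them by majority vote; this is our illustrative substitution, not the actual COVIDiag algorithm:

```python
import random
from collections import Counter

def best_stump(X, y):
    """One-rule classifier: pick the single feature whose value-wise
    majority vote misclassifies the fewest training cases."""
    best_err, best_f, best_rule = None, None, None
    for f in range(len(X[0])):
        rule = {}
        for v in {row[f] for row in X}:
            labels = [yi for row, yi in zip(X, y) if row[f] == v]
            rule[v] = Counter(labels).most_common(1)[0][0]
        err = sum(rule[row[f]] != yi for row, yi in zip(X, y))
        if best_err is None or err < best_err:
            best_err, best_f, best_rule = err, f, rule
    return lambda row, f=best_f, rule=best_rule: rule.get(row[f], 0)

def bagged_ensemble(X, y, n_models=25, seed=0):
    """Majority vote over one-rule classifiers trained on bootstrap samples."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(best_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: int(2 * sum(m(row) for m in models) >= n_models)

# Toy data: label 1 (COVID-19) iff the first finding is present;
# the second finding is pure noise.
X = [[1, i % 2] for i in range(8)] + [[0, i % 2] for i in range(8)]
y = [1] * 8 + [0] * 8
predict = bagged_ensemble(X, y)
```

Bagging reduces the variance of the individual weak rules, which is the usual motivation for ensembling over a single classifier on small tabular datasets such as a set of 15 CT findings.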

Statistical analysis and performance analysis

To evaluate the model, the following parameters were determined for each database:

Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)
Accuracy = (TP + TN) / (TP + TN + FP + FN)

where TP and TN are the numbers of cases correctly diagnosed as COVID-19 and non-COVID-19 pneumonia, respectively, FP is the number of non-COVID-19 cases wrongly diagnosed as COVID-19, and FN is the number of COVID-19 cases wrongly diagnosed as non-COVID-19. ROC curve analysis was used to determine the AUC with its 95% confidence interval (CI) and to evaluate the algorithm's performance on each database. A p-value less than 0.05 was considered significant. All statistical analyses were performed with SPSS software (version 24, IBM Corporation).
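These definitions translate directly into code. A small self-contained sketch (not the authors' SPSS workflow) computes the three rates from the confusion-matrix counts, together with an empirical AUC from per-case scores:

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

def auc_from_scores(scores_pos, scores_neg):
    """Empirical AUC: the probability that a COVID-19 case scores higher
    than a non-COVID-19 case, counting ties as half a win; this equals
    the area under the empirical ROC curve."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Example: the Argentinian database's reported 90.0% sensitivity and
# 96.0% specificity on 50 + 50 cases correspond to these counts.
m = diagnostic_metrics(tp=45, tn=48, fp=2, fn=5)  # 0.90 / 0.96 / 0.93
```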

Results

The performance of the COVIDiag model was evaluated on databases from five different countries. Each database from Argentina, Turkey, Iran, and Italy had 50 COVID-19 and 50 non-COVID-19 pneumonia cases; the database from the Netherlands included only 50 COVID-19 pneumonia subjects. The common pattern in COVID-19 patients was bilateral involvement (229/250, 91.6%) with multiple patchy opacities (137/250, 54.8%) affecting both upper and lower segments of the lungs (201/250, 80.4%), with the lesions predominantly affecting the peripheral regions (133/250, 53.2%). In addition, ground-glass opacity and crazy-paving were seen in 242 (96.8%) and 67 (26.8%) of the 250 COVID-19 cases, respectively. Table 2 presents the other CT findings of patients from each of the five centers.
Table 2

CT chest findings of COVID-19 and non-COVID-19 groups based on each center.

Values are n (%). C, COVID-19 group; NC, non-COVID-19 group (the Dutch database contributed COVID-19 cases only).

| CT Finding | Argentina C | Argentina NC | Turkey C | Turkey NC | Iran C | Iran NC | Italy C | Italy NC | Netherlands C |
Location 1
| Unilateral | 1 (2.0) | 14 (28.0) | 8 (16.0) | 17 (34.0) | 7 (14.0) | 19 (38.0) | 0 (0.0) | 9 (18.0) | 5 (10.0) |
| Bilateral | 49 (98.0) | 36 (72.0) | 42 (84.0) | 33 (66.0) | 43 (86.0) | 31 (62.0) | 50 (100) | 41 (82.0) | 45 (90.0) |
Location 2
| Lower lobe | 15 (30.0) | 18 (36.0) | 17 (34.0) | 10 (20.0) | 5 (10.0) | 12 (24.0) | 0 (0.0) | 10 (20.0) | 5 (10.0) |
| Upper lobe | 0 (0.0) | 5 (10.0) | 2 (4.0) | 6 (12.0) | 4 (8.0) | 25 (50.0) | 1 (2.0) | 2 (4.0) | 0 (0.0) |
| Both lobes | 35 (70.0) | 27 (54.0) | 31 (62.0) | 34 (68.0) | 41 (82.0) | 13 (26.0) | 49 (98.0) | 38 (76.0) | 45 (90.0) |
Distribution
| Peripheral | 31 (62.0) | 6 (12.0) | 34 (68.0) | 3 (6.0) | 37 (74.0) | 8 (16.0) | 5 (10.0) | 13 (26.0) | 26 (52.0) |
| Central | 0 (0.0) | 17 (34.0) | 0 (0.0) | 1 (2.0) | 4 (8.0) | 22 (44.0) | 0 (0.0) | 3 (6.0) | 0 (0.0) |
| Both central and peripheral | 19 (38.0) | 27 (54.0) | 16 (32.0) | 46 (92.0) | 9 (18.0) | 20 (40.0) | 45 (90.0) | 34 (68.0) | 24 (48.0) |
Lesion
| Single | 1 (2.0) | 12 (24.0) | 6 (12.0) | 5 (10.0) | 7 (14.0) | 15 (30.0) | 0 (0.0) | 4 (8.0) | 4 (8.0) |
| Multiple | 34 (68.0) | 33 (66.0) | 40 (80.0) | 26 (52.0) | 35 (70.0) | 29 (58.0) | 2 (4.0) | 23 (46.0) | 26 (52.0) |
| Diffuse | 15 (30.0) | 5 (10.0) | 4 (8.0) | 19 (38.0) | 8 (16.0) | 6 (12.0) | 48 (96.0) | 23 (46.0) | 20 (40.0) |
GGO
| No | 4 (8.0) | 38 (76.0) | 0 (0.0) | 4 (8.0) | 1 (2.0) | 34 (68.0) | 2 (4.0) | 11 (22.0) | 1 (2.0) |
| Yes | 46 (92.0) | 12 (24.0) | 50 (100) | 46 (92.0) | 49 (98.0) | 16 (32.0) | 48 (96.0) | 39 (78.0) | 49 (98.0) |
Consolidation
| No | 29 (58.0) | 11 (22.0) | 33 (66.0) | 8 (16.0) | 29 (58.0) | 26 (52.0) | 31 (62.0) | 15 (30.0) | 28 (56.0) |
| Yes | 21 (42.0) | 39 (78.0) | 17 (34.0) | 42 (84.0) | 21 (42.0) | 24 (48.0) | 19 (38.0) | 35 (70.0) | 22 (44.0) |
Reticular opacity
| No | 27 (54.0) | 44 (88.0) | 44 (88.0) | 33 (66.0) | 47 (94.0) | 21 (42.0) | 8 (16.0) | 9 (18.0) | 46 (92.0) |
| Yes | 23 (46.0) | 6 (12.0) | 6 (12.0) | 17 (34.0) | 3 (6.0) | 29 (58.0) | 42 (84.0) | 41 (82.0) | 4 (8.0) |
Nodule
| No | 48 (96.0) | 21 (42.0) | 48 (96.0) | 21 (42.0) | 42 (84.0) | 32 (64.0) | 48 (96.0) | 26 (52.0) | 49 (98.0) |
| Yes | 2 (4.0) | 29 (58.0) | 2 (4.0) | 29 (58.0) | 8 (16.0) | 18 (36.0) | 2 (4.0) | 24 (48.0) | 1 (2.0) |
Bronchial wall thickening
| No | 46 (92.0) | 37 (74.0) | 47 (94.0) | 11 (22.0) | 30 (60.0) | 31 (62.0) | 47 (94.0) | 22 (44.0) | 47 (94.0) |
| Yes | 4 (8.0) | 13 (26.0) | 3 (6.0) | 39 (78.0) | 20 (40.0) | 19 (38.0) | 3 (6.0) | 28 (56.0) | 3 (6.0) |
Air bronchogram
| No | 43 (86.0) | 38 (76.0) | 47 (94.0) | 22 (44.0) | 34 (68.0) | 48 (96.0) | 36 (72.0) | 22 (44.0) | 39 (78.0) |
| Yes | 7 (14.0) | 12 (24.0) | 3 (6.0) | 28 (56.0) | 16 (32.0) | 2 (4.0) | 14 (28.0) | 28 (56.0) | 11 (22.0) |
Cavity
| No | 50 (100) | 43 (86.0) | 50 (100) | 50 (100) | 50 (100) | 47 (94.0) | 50 (100) | 46 (92.0) | 48 (96.0) |
| Yes | 0 (0.0) | 7 (14.0) | 0 (0.0) | 0 (0.0) | 0 (0.0) | 3 (6.0) | 0 (0.0) | 4 (8.0) | 2 (4.0) |
Crazy paving
| No | 35 (70.0) | 50 (100) | 47 (94.0) | 48 (96.0) | 36 (72.0) | 42 (84.0) | 32 (64.0) | 42 (84.0) | 37 (74.0) |
| Yes | 15 (30.0) | 0 (0.0) | 3 (6.0) | 2 (4.0) | 14 (28.0) | 8 (16.0) | 18 (36.0) | 8 (16.0) | 13 (26.0) |
Pleural effusion
| No | 47 (94.0) | 36 (72.0) | 50 (100) | 35 (70.0) | 43 (86.0) | 37 (74.0) | 50 (100) | 31 (62.0) | 44 (88.0) |
| Yes | 3 (6.0) | 14 (28.0) | 0 (0.0) | 15 (30.0) | 7 (14.0) | 13 (26.0) | 0 (0.0) | 19 (38.0) | 6 (12.0) |
Pleural thickening
| No | 50 (100) | 47 (94.0) | 43 (86.0) | 32 (64.0) | 50 (100) | 45 (90.0) | 47 (94.0) | 40 (80.0) | 48 (96.0) |
| Yes | 0 (0.0) | 3 (6.0) | 7 (14.0) | 18 (36.0) | 0 (0.0) | 5 (10.0) | 3 (6.0) | 10 (20.0) | 2 (4.0) |
Lymphadenopathy
| No | 45 (90.0) | 48 (96.0) | 46 (92.0) | 38 (76.0) | 47 (94.0) | 33 (66.0) | 48 (96.0) | 35 (70.0) | 45 (90.0) |
| Yes | 5 (10.0) | 2 (4.0) | 4 (8.0) | 12 (24.0) | 3 (6.0) | 17 (34.0) | 2 (4.0) | 15 (30.0) | 5 (10.0) |
In the non-COVID-19 group, the common pattern consisted of bilateral involvement (141/200, 70.5%) with multiple patchy opacities (111/200, 55.5%) affecting both upper and lower segments of the lungs (112/200, 56.0%), with the lesions mostly occurring in both the central and peripheral regions of the lobes (127/200, 63.5%). The most common imaging sign in the non-COVID-19 group was consolidation (140/200, 70.0%), followed by ground-glass opacities (113/200, 56.5%). The COVIDiag model distinguished COVID-19 from non-COVID-19 pneumonia cases with AUCs of 0.983 (95% CI: 0.963-1.000; 90.0% sensitivity; 96.0% specificity), 0.914 (95% CI: 0.856-0.972; 86.0% sensitivity; 88.0% specificity), 0.910 (95% CI: 0.849-0.970; 90.0% sensitivity; 84.0% specificity), and 0.882 (95% CI: 0.810-0.953; 92.0% sensitivity; 80.0% specificity) for the Argentinian, Turkish, Iranian, and Italian databases, respectively. As the database from the Netherlands consisted only of COVID-19 patients, only sensitivity was determined: the model diagnosed 43 of 50 COVID-19 cases correctly (86.0% sensitivity; Table 3). Across all cohorts, the COVIDiag model diagnosed COVID-19 pneumonia with an AUC of 0.921 (95% CI: 0.894-0.948; 88.8% sensitivity; 87.0% specificity; 88.0% accuracy). Fig. 2 presents ROC curves and radar plots of COVIDiag for the different databases.
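The pooled figures quoted above follow arithmetically from the per-centre rates, since every sensitivity is reported on a group of 50 COVID-19 cases and every specificity on a group of 50 non-COVID-19 cases. A quick sanity check:

```python
# Per-centre rates reported for COVIDiag; each rate is over a 50-case group
# (the Dutch centre contributed COVID-19 cases only, hence no specificity).
sens = {"Argentina": 0.90, "Turkey": 0.86, "Iran": 0.90,
        "Italy": 0.92, "Netherlands": 0.86}
spec = {"Argentina": 0.96, "Turkey": 0.88, "Iran": 0.84, "Italy": 0.80}

tp = sum(round(s * 50) for s in sens.values())  # 222 of 250 COVID-19 cases
tn = sum(round(s * 50) for s in spec.values())  # 174 of 200 non-COVID-19 cases

pooled_sensitivity = tp / 250        # 0.888, as reported
pooled_specificity = tn / 200        # 0.870, as reported
pooled_accuracy = (tp + tn) / 450    # 0.880, as reported
```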
Table 3

Summary of developed AI systems for COVID-19 diagnosis.

Model development and internal validation (metric cells give training/internal-validation values):

| Ref | Network | Origin | Training patients (internal validation) | Pneumonia groups | Centers, countries for development | Sen (%) | Spc (%) | Acc (%) | AUC |
| [29] | DenseNet121 | United States | 984 (296) | COVID-19, non-COVID-19 (viral, bacterial, fungal) | Six centers, four countries (China, Italy, Japan, USA) | NA/NA | NA/NA | NA/91.70 | NA/NA |
| [30] | Inception | Netherlands | 476 (105) | Negative, positive COVID-19 | Two centers, Netherlands | NA/85.7 | NA/89.8 | NA/NA | NA/0.950 |
| [18] | ResNet18 | China | 2246 (260) | COVID-19, non-COVID-19 (viral, bacterial, mycoplasma), normal | Seven centers, China | NA/94.93 | NA/91.13 | NA/92.49 | NA/0.979 |
| [31] | DenseNet | China | 709 (NA) | COVID-19, non-COVID-19 (viral, bacterial, mycoplasma, fungal) | Two centers, China | 78.93/NA | 89.93/NA | 81.24/NA | 0.900/NA |
| [32] | U-Net-based algorithm | China | 2447 (639) | COVID-19, non-COVID-19 | Two centers, China | NA/97.30 | NA/85.00 | NA/NA | NA/0.985 |
| [33] | ResNet152 | China | 2688 (2688) | COVID-19, non-COVID-19 (viral, bacterial) | Three centers, China | NA/87.30 | NA/96.60 | NA/NA | NA/0.974 |
| This study [17] | Ensemble learning | Iran | 488 (124) | COVID-19, non-COVID-19 (viral, atypical) | Single center | 94.67/93.54 | 93.03/90.32 | 93.85/91.94 | 0.988/0.965 |

External validation:

| Ref | Patients | Centers, countries | Sen (%) | Spc (%) | Acc (%) | AUC |
| [29] | 1337 | Six centers, four countries (China, Italy, Japan, USA) | 84.00 | 93.00 | 90.80 | 0.949 |
| [30] | 262 | One center, Netherlands | 82.00 | 80.50 | NA | 0.880 |
| [18] | 208 | Yichang, China | 92.51 | 85.92 | 90.70 | 0.971 |
| [18] | 242 | Hefei, China | 94.74 | 89.19 | 90.32 | 0.970 |
| [18] | 409 | Wuhan, China | 94.03 | 88.46 | 91.20 | 0.961 |
| [18] | 140 | Guangzhou, China | 90.00 | 84.15 | 84.78 | 0.951 |
| [18] | 107 | Ecuador | 86.67 | 82.26 | 84.11 | 0.905 |
| [31] | 161 | Heilongjiang, China | 79.35 | 81.16 | 80.12 | 0.880 |
| [31] | 226 | Anhui, China | 80.39 | 76.61 | 78.32 | 0.870 |
| [32] | 369 (820 scans) | Xianning, China | 83.90 | 66.00 | NA | 0.837 |
| [32] | 411 (1097 scans) | Tianyou, China | 90.70 | 38.60 | NA | 0.725 |
| [32] | 130 (203 scans) | Xiangy, China | 83.30 | 51.70 | NA | 0.679 |
| [33] | 2539 | Seven centers, China | 78.00 | 93.50 | NA | 0.921 |
| This study [17] | 100 | Argentina | 90.0 | 96.0 | 93.0 | 0.983 |
| This study [17] | 100 | Turkey | 86.0 | 88.0 | 87.0 | 0.914 |
| This study [17] | 100 | Iran | 90.0 | 84.0 | 87.0 | 0.910 |
| This study [17] | 100 | Italy | 92.0 | 80.0 | 86.0 | 0.882 |
| This study [17] | 50 | The Netherlands | 86.0 | NA | NA | NA |

Sen, sensitivity; Spc, specificity; Acc, accuracy; AUC, area under the ROC curve; NA, not available.

Fig. 2

(a) ROC curves and (b) radar plots of COVIDiag model on different centers. Sen, sensitivity; Spc, specificity; Acc, accuracy; AUC, area under the ROC curve.


Discussion

In the present study, we have validated the performance of our proposed AI system (COVIDiag) in differentiating COVID-19 from non-COVID-19 pneumonia cases using CT images obtained from five countries. Because health systems worldwide have been overwhelmed by a staggering number of patients during the pandemic, early diagnosis is essential for managing patients in time [21,22]. Chest imaging has been used both as an initial screening method and as a secondary measure to determine the extent of COVID-19 progression [23]. The inadequate infrastructure in this field has urged radiologists and medical imaging experts to actively look for new options to increase the quality and quantity of image interpretation. Hence, AI systems have been developed using CT and chest radiographic images, which are the most accessible imaging modalities in almost any clinical setting [8]. The World Health Organization (WHO) has also encouraged researchers to "study the role of artificial intelligence in chest imaging in different settings" [7]. The Italian Society of Medical and Interventional Radiology has investigated the use of AI in diagnosing COVID-19 and has endorsed multicenter studies that validate different AI schemes [24]. However, neither this society nor its peers have endorsed the wide-scale use of AI-based diagnostic tools as the first step in screening and diagnosis [25]. Factors limiting wide-scale use include the lack of multicenter and multinational studies, privacy issues, and technical issues pertaining to the specifications of CT imaging [26,27]. National and international initiatives, such as the "AI-ROBOTICS vs. COVID-19 initiative" set up by the European Commission, together with academic, private, and non-profit efforts, have aimed to provide such evidence [28]. The results of such research initiatives are presented in the following paragraphs. A multicenter study performed by Harmon et al. 
utilized multinational databases gathered from China, Italy, Japan, and the United States to validate an AI model using CT images to detect COVID-19 and non-COVID-19 (viral, bacterial, fungal) pneumonias. Their control group consisted of a wide spectrum of subjects with bacterial, viral, or fungal pneumonia. They reached an AUC of 0.949 and 0.947 with the 3D and hybrid 3D models, respectively [29]. A similar multicenter study was performed by Dutch researchers, using an AI tool that assigned a COVID-19 Reporting and Data System (CO-RADS) score to each series of imaging sequences presented to the machine. The deep learning-based algorithm could diagnose positive COVID-19 cases with an AUC of 0.95 and 0.88 on the internal test and external dataset, respectively [30]. Zhang et al. validated an AI model on 1366 cases of COVID-19 and non-COVID-19 (viral, bacterial, mycoplasma) pneumonia from all over China. Their AI system could diagnose COVID-19 cases with an AUC of 0.979 on internal validation, and AUCs of 0.971, 0.970, 0.961, and 0.951 for four other independent Chinese cohorts. However, the system's performance on a foreign database (Ecuador) decreased to an AUC of 0.905 [18]. Wang et al. developed a deep learning algorithm that automatically segmented CT images and diagnosed and differentiated COVID-19 from other pneumonias (viral, bacterial, mycoplasma, fungal). Their algorithm was based on DenseNet and could diagnose COVID-19 pneumonia with an AUC of 0.900 on the training dataset. External validation was done on two databases from different regions of China (Anhui, 226 patients, and Heilongjiang, 161 patients). The system achieved an AUC of 0.870 and 0.880 for the first and second external validation databases, respectively [31]. A similar study was performed in China, where an AI model was developed using 2447 patients. This AI model was based on the U-Net algorithm, and consecutive CT slices were fed to the system as the primary input. 
Internal and external validations were done by applying 639 and 910 cases from three different centers to the algorithm. The model achieved an AUC of 0.985 on the internal validation dataset. However, its performance dropped to AUCs of 0.837, 0.725, and 0.679 on the three external validation databases [32]. Another multicenter study developed an AI system using 2688 patients with COVID-19 and other pneumonias (viral and bacterial) obtained from three Chinese centers. Their system achieved an AUC of 0.974 and 0.921 on the internal and external validation datasets, respectively [33]. Our results indicate that the performance of the COVIDiag model is comparable to these studies. Moreover, the COVIDiag model is simple, easy to run, and does not require a high-performance computer because it is based on classical machine learning, whereas the models proposed in the other studies are based on deep learning. One point to note is that these studies included all types of non-COVID-19 pneumonia, which can inflate a model's apparent performance, since differentiating COVID-19 from non-viral pneumonia is not the critical task. Because the image patterns of patients with COVID-19 and other viral or atypical pneumonias look similar, the main and vital task of radiologists is to differentiate these two groups [34], [35], [36]. The COVIDiag model was therefore developed to differentiate COVID-19 from other viral or atypical pneumonias and to help radiologists in their daily practice. Table 3 summarizes the comparison between our proposed COVIDiag model and similar studies. According to the WHO advisory guide, chest imaging should be used as a primary method to diagnose COVID-19 pneumonia under three conditions: 1- RT-PCR kits are not available; 2- results of RT-PCR are not available within 24 hours; and 3- RT-PCR results are negative in patients suspected of COVID-19 pneumonia. 
Therefore, the COVIDiag model can be used effectively on CT findings under those conditions and address the relevant WHO concerns, especially those in section 5.1 of the advice guide [7]. Furthermore, the model can be used in regions with a shortage of diagnostic kits or whenever PCR results are not instantly available [37]. Although the results of the COVIDiag model are encouraging, it has the following limitations. A majority of the studies reported earlier, and those not mentioned here, depended only on images rather than their interpretations. Since the COVIDiag model is based on CT findings determined by radiologists, it is prone to human error. However, a model that relies on basic radiologic signs, which can be recognized even by non-radiologist clinicians such as front-line emergency room physicians, pulmonologists, and other specialists, may be easier to implement in resource-poor settings. Imaging outputs are also prone to alteration in individuals with COVID-19 undergoing treatment, in those with pre-existing conditions or superimposed medical problems such as bacterial or fungal pneumonia or cardiac failure, and in those being treated with certain medications. These conditions may cause a deviation from the typical pattern reported for COVID-19 patients and accordingly affect the performance of an AI model [18,38]. Human interpretation of imaging signs may reduce this error to some extent but cannot eliminate it. A major advantage of the COVIDiag model is that it is not affected by imaging parameters, so it can be deployed more readily than other AI models; deep learning-based models, in contrast, can be affected by image characteristics such as resolution [29]. Another limitation is that our study did not involve any pediatric patients. However, according to the literature, the CT findings of pediatric and adult patients are similar [39]. 
Therefore, we presume that our model can also be used for children, but this application needs further validation.

Conclusion

The significant increase in the number of COVID-19 patients undergoing CT imaging for diagnostic and prognostic purposes has led to an imbalance between the radiology workforce and its workload. One proposed way to restore this balance is to incorporate AI systems into routine medical practice. In the present study, the performance of our proposed AI system, COVIDiag, was validated using five databases obtained from five different countries on three continents. We found that the proposed COVIDiag model yields acceptable diagnostic performance on all databases and performs consistently across groups of patients from different countries. Our results are significant as they provide evidence for the applicability of AI systems in real-world clinical settings and the integration of AI into medicine.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (36 in total)

1.  Essentials for Radiologists on COVID-19: An Update-Radiology Scientific Expert Panel.

Authors:  Jeffrey P Kanne; Brent P Little; Jonathan H Chung; Brett M Elicker; Loren H Ketai
Journal:  Radiology       Date:  2020-02-27       Impact factor: 11.105

2.  Accuracy of CT in a cohort of symptomatic patients with suspected COVID-19 pneumonia during the outbreak peak in Italy.

Authors:  Giulia Besutti; Paolo Giorgi Rossi; Valentina Iotti; Lucia Spaggiari; Riccardo Bonacini; Andrea Nitrosi; Marta Ottone; Efrem Bonelli; Tommaso Fasano; Simone Canovi; Rossana Colla; Marco Massari; Ivana Maria Lattuada; Laura Trabucco; Pierpaolo Pattacini
Journal:  Eur Radiol       Date:  2020-07-14       Impact factor: 5.315

3.  Imaging Profile of the COVID-19 Infection: Radiologic Findings and Literature Review.

Authors:  Ming-Yen Ng; Elaine Y P Lee; Jin Yang; Fangfang Yang; Xia Li; Hongxia Wang; Macy Mei-Sze Lui; Christine Shing-Yen Lo; Barry Leung; Pek-Lan Khong; Christopher Kim-Ming Hui; Kwok-Yung Yuen; Michael D Kuo
Journal:  Radiol Cardiothorac Imaging       Date:  2020-02-13

4.  Use of CT and artificial intelligence in suspected or COVID-19 positive patients: statement of the Italian Society of Medical and Interventional Radiology.

Authors:  Emanuele Neri; Vittorio Miele; Francesca Coppola; Roberto Grassi
Journal:  Radiol Med       Date:  2020-04-29       Impact factor: 3.469

5.  The Role of Chest Imaging in Patient Management During the COVID-19 Pandemic: A Multinational Consensus Statement From the Fleischner Society.

Authors:  Geoffrey D Rubin; Christopher J Ryerson; Linda B Haramati; Nicola Sverzellati; Jeffrey P Kanne; Suhail Raoof; Neil W Schluger; Annalisa Volpi; Jae-Joon Yim; Ian B K Martin; Deverick J Anderson; Christina Kong; Talissa Altes; Andrew Bush; Sujal R Desai; Jonathan Goldin; Jin Mo Goo; Marc Humbert; Yoshikazu Inoue; Hans-Ulrich Kauczor; Fengming Luo; Peter J Mazzone; Mathias Prokop; Martine Remy-Jardin; Luca Richeldi; Cornelia M Schaefer-Prokop; Noriyuki Tomiyama; Athol U Wells; Ann N Leung
Journal:  Chest       Date:  2020-04-07       Impact factor: 9.410

6.  Development and evaluation of an artificial intelligence system for COVID-19 diagnosis.

Authors:  Cheng Jin; Weixiang Chen; Yukun Cao; Zhanwei Xu; Zimeng Tan; Xin Zhang; Lei Deng; Chuansheng Zheng; Jie Zhou; Heshui Shi; Jianjiang Feng
Journal:  Nat Commun       Date:  2020-10-09       Impact factor: 14.919

7.  Evidence based management guideline for the COVID-19 pandemic - Review article.

Authors:  Maria Nicola; Niamh O'Neill; Catrin Sohrabi; Mehdi Khan; Maliha Agha; Riaz Agha
Journal:  Int J Surg       Date:  2020-04-11       Impact factor: 6.071

8.  Testing for SARS-CoV-2 (COVID-19): a systematic review and clinical guide to molecular and serological in-vitro diagnostic assays.

Authors:  Antonio La Marca; Martina Capuzzo; Tiziana Paglia; Laura Roli; Tommaso Trenti; Scott M Nelson
Journal:  Reprod Biomed Online       Date:  2020-06-14       Impact factor: 3.828

9.  Deep learning-based triage and analysis of lesion burden for COVID-19: a retrospective study with external validation.

Authors:  Minghuan Wang; Chen Xia; Lu Huang; Shabei Xu; Chuan Qin; Jun Liu; Ying Cao; Pengxin Yu; Tingting Zhu; Hui Zhu; Chaonan Wu; Rongguo Zhang; Xiangyu Chen; Jianming Wang; Guang Du; Chen Zhang; Shaokang Wang; Kuan Chen; Zheng Liu; Liming Xia; Wei Wang
Journal:  Lancet Digit Health       Date:  2020-09-22

10.  Artificial intelligence for the detection of COVID-19 pneumonia on chest CT using multinational datasets.

Authors:  Stephanie A Harmon; Thomas H Sanford; Sheng Xu; Evrim B Turkbey; Holger Roth; Ziyue Xu; Dong Yang; Andriy Myronenko; Victoria Anderson; Amel Amalou; Maxime Blain; Michael Kassin; Dilara Long; Nicole Varble; Stephanie M Walker; Ulas Bagci; Anna Maria Ierardi; Elvira Stellato; Guido Giovanni Plensich; Giuseppe Franceschelli; Cristiano Girlando; Giovanni Irmici; Dominic Labella; Dima Hammoud; Ashkan Malayeri; Elizabeth Jones; Ronald M Summers; Peter L Choyke; Daguang Xu; Mona Flores; Kaku Tamura; Hirofumi Obinata; Hitoshi Mori; Francesca Patella; Maurizio Cariati; Gianpaolo Carrafiello; Peng An; Bradford J Wood; Baris Turkbey
Journal:  Nat Commun       Date:  2020-08-14       Impact factor: 14.919

