Zaid Abdi Alkareem Alyasseri, Mohammed Azmi Al-Betar, Iyad Abu Doush, Mohammed A Awadallah, Ammar Kamal Abasi, Sharif Naser Makhadmeh, Osama Ahmad Alomari, Karrar Hameed Abdulkareem, Afzan Adam, Robertas Damasevicius, Mazin Abed Mohammed, Raed Abu Zitar.
Abstract
COVID-19 is the disease caused by a novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). COVID-19 has become a pandemic, infecting more than 152 million people in over 216 countries and territories. The exponential increase in the number of infections has rendered traditional diagnosis techniques inefficient. Consequently, many researchers have developed intelligent techniques based on deep learning (DL) and machine learning (ML) that can assist the healthcare sector in providing quick and precise COVID-19 diagnosis. This paper provides a comprehensive review of the most recent DL and ML techniques for COVID-19 diagnosis, covering studies published from December 2019 until April 2021. In total, the review includes more than 200 studies carefully selected from several publishers, such as IEEE, Springer and Elsevier. We classify the research tracks into two categories, DL and ML, and present public COVID-19 datasets established and extracted from different countries. The measures used to evaluate diagnosis methods are comparatively analysed and discussed. We conclude that, for COVID-19 diagnosis and outbreak prediction, SVM is the most widely used ML mechanism and CNN is the most widely used DL mechanism, while accuracy, sensitivity, and specificity are the most widely used measurements in previous studies. Finally, this review will guide the research community on the upcoming development of ML and DL for COVID-19 and inspire their future work.
Keywords: 2019‐nCoV; COVID‐19; COVID‐19 dataset; deep learning; machine learning
Year: 2021 PMID: 34511689 PMCID: PMC8420483 DOI: 10.1111/exsy.12759
Source DB: PubMed Journal: Expert Syst ISSN: 0266-4720 Impact factor: 2.812
FIGURE 1 COVID‐19 person‐to‐person spread (Al‐Betar et al., 2021)
FIGURE 2 Publication statistics on ML and DL works in COVID‐19. (a) Publication per database. (b) COVID‐19 publication. (c) Distribution of published COVID‐19 research articles. (d) COVID‐19 publication
FIGURE 3 Flowchart of selected studies involving the query and inclusion criteria
FIGURE 4 Categorization of related works
FIGURE 5 Machine learning research for COVID‐19. (a) Number of machine learning publications. (b) Mapping of machine learning publications
FIGURE 6 Supervised learning approaches for COVID‐19 (Asnaoui et al., 2020)
FIGURE 7 (a) Application of agglomerative and divisive clustering to a dataset of five objects, {a, b, c, d, e}. (b) Application of partition clustering to a dataset of fourteen clusters
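The agglomerative case in Figure 7(a) can be illustrated with a minimal sketch. It assumes single-linkage merging and hypothetical one-dimensional positions for the five objects; neither the linkage rule nor the coordinates come from the paper, and the function name `agglomerative` is chosen only for this example.

```python
from itertools import combinations


def agglomerative(points):
    """Single-linkage agglomerative clustering.

    `points` maps an object name to a 1-D position (hypothetical data).
    Returns the clusters formed at each merge step, in order.
    """
    # Start with every object in its own cluster.
    clusters = [frozenset([name]) for name in points]
    merges = []

    def linkage(pair):
        # Single linkage: distance between the two closest members.
        left, right = pair
        return min(abs(points[p] - points[q]) for p in left for q in right)

    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        best = min(combinations(clusters, 2), key=linkage)
        merged = best[0] | best[1]
        clusters = [c for c in clusters if c not in best] + [merged]
        merges.append(merged)
    return merges


# Hypothetical positions for the five objects named in the caption.
points = {"a": 0.0, "b": 1.0, "c": 5.0, "d": 6.5, "e": 12.0}
merges = agglomerative(points)
for step, cluster in enumerate(merges, start=1):
    print(step, sorted(cluster))
```

Divisive clustering runs in the opposite direction: it starts from the single cluster {a, b, c, d, e} and splits it repeatedly until each object stands alone.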
FIGURE 8 Deep learning steps for COVID‐19 (Ozturk et al., 2020)
FIGURE 9 Deep learning publications in different countries for COVID‐19. (a) Number of deep learning publications. (b) Mapping of deep learning publications
Existing public COVID‐19 datasets
| Dataset | Data type | Classification Output | Dataset type | Characteristics | Primary dataset | Secondary dataset | Techniques | Achievement | Available online |
|---|---|---|---|---|---|---|---|---|---|
| World Health Organization (Siddiqui et al., | Outbreak | Statistical report | ‘Coronavirus disease (COVID‐19) situation reports’ | Rate of infection for temperature in various provinces of China. | ✓ | (Kumar, Arora, et al., | Accuracy Scenario 1: 80.6%, Scenario 2: 85.2%, Scenario 3: 99.9% | WHO. Coronavirus disease (COVID‐2019) situation reports. ‘ | |
| Protegen database (Randhawa et al., | Laboratory findings | Diagnosis | annotated proteins | Positive samples of 397 bacterial and 178 viral protective antigens (PAgs) and 4979 negative samples. | ✓ | (Ong, Wong, et al., | ‘ | ||
| (NCBI) database (DNA sequence data) (Randhawa et al., | Laboratory findings | Diagnosis | COVID‐19 virus sequence | 5000 unique viral genomic sequences ( 61.8 million bp) | ✓ | (Kannan et al., | Method 1: 72.7%, Method 2: 68.7%, Method 3: 91.45% | ‘ | |
| Hungary‐COVID‐Data (Pinter et al., | Outbreak | Statistical report | Time series data | Statistical analyses of the cases of COVID‐19 and the fatality rate in Hungary | ✓ | (Chandra et al., | ‘ | ||
| First dataset: (pneumonia) database, Second dataset: COVID‐19 images (Elaziz et al., | Medical image | Diagnosis | chest x‐ray image | Dataset part 1: 216 COVID‐19 and 1,675 normal cases. Dataset part 2: 219 COVID‐19 and 1,341 normal cases. | ✓ | (Kassani et al., | Accuracy Dataset1: 96.09% Dataset2: 98.09% | Dataset 1: ‘ | |
| real novel COVID‐19 data (Fayyoumi et al., | Laboratory findings | Diagnosis | Signs and symptoms of the patients | 64 negative and 41 positive PCR tests | ✓ | (Bandyopadhyay & Dutta, | Accuracy: 91.67% | unavailable | |
| Indian COVID‐19 dataset (Kavadi et al., | Outbreak | Statistical report | Time series data | Statistical reports of COVID‐19 cases in India | ✓ | (Prakash et al., | Case 1: 97.82%, Case 2: 98%, Case 3: 96.66% Case 4: 97.50% | ‘ | |
| COVID‐19 epidemiological data (Wang, Zhengm et al., | Outbreak | Statistical report | Time series data | COVID‐19 infected cases in numerous countries ( Brazil, Russia, India, Peru and Indonesia) | ✓ | (Punn et al., | Accuracy: 94.99%, Sensitivity 93.34%, Specificity 94.30% | ‘ | |
| U.S. hospitals COVID‐19 admission data (Burdick et al., | Laboratory findings | Diagnosis | Patient Vital Sign and Lab Measurement | 197 confirmed cases based on 12 factors, for example systolic blood pressure, Blood pressure, and heart rate. | ✓ | (Cheng et al., | Accuracy: 76.2%, Sensitivity 95%, Specificity 76.3% | ‘ | |
| Facebook‐Covid‐19 data (Sear et al., | Outbreak | Statistical report | COVID‐19 related comments | Data on common Facebook posts from the date 1/17/2020 until 2/28/2020. | ✓ | (Ahmed, Shahbaz, et al., | Sensitivity Case 1: 94.3% and 90.9%, Case 2: 79.6% and 81.5% | ‘ | |
| COVID‐XRay‐5K DATASET (Minaee et al., | Medical image | Diagnosis | 5000 images | COVID‐19 X‐ray samples and non‐COVID samples | ✓ | (Ahammed et al., | Sensitivity rate 98% and specificity rate 90% | ‘ | |
| ChestX‐ray8 database (Wang et al., | Medical image | Diagnosis | x‐ray images | A total of 108,948 images, of which 24,636 include one or more patient images; the remaining 84,312 are normal cases. Last updated 20 July 2020 | ✓ | (Sohan, | Sensitivity 72%, Specificity 82% | ‘ | |
| Deep‐Learning‐COVID‐19‐on‐CXR‐using‐Limited‐Training‐Data‐Sets (Oh et al., | Medical image | Diagnosis | x‐ray images | Available public CXR datasets: JSRT, SCR, NLM(MC), Pneumonia, and COVID‐19 | ✓ | (Cohen et al., | Accuracy: 91.9% | ‘ | |
| COVID‐19 and Pneumonia Scans Dataset (Brunese et al., | Medical image | Diagnosis | x‐ray images | 5887 images | ✓ | (Afshar et al., | Area under the ROC curve higher than 97% | ‘ | |
| Random Sample of NIH Chest X‐ray Dataset (Wang, Peng, et al., | Medical image | Diagnosis | x‐ray images | A total of 5,606 images and labels extracted from the NIH Chest X‐ray dataset | ✓ | (Antin et al., | Accuracy 90.13% | ‘ | |
| ICLUS ‐ Italian Covid‐19 Lung Ultra‐sound project (Wang, Peng, et al., | Medical video | Diagnosis | Lung ultrasound video | A database of ultrasound images that can possibly be used for identifying patient status at different stages | ✓ | (Roy et al., | Accuracy convex: 84%, linear: 94% | ‘ | |
| COVID‐CT (Zhao et al., | Medical image | Diagnosis | CT images | Data from COVID‐19 patients at Tongji Hospital, Wuhan, China, between January and April. A total of 349 CT images with clinical features, extracted from 216 COVID‐19 patients. | ✓ | (Yang et al., | Accuracy 95.37%, Sensitivity 95.99%, Specificity 94.76% | ‘ | |
| PEDIATRIC CXR DATASET (Kermany et al., | Medical image | Diagnosis | chest X‐ray images | Collected from Guangzhou Women and Children's Medical Center in Guangzhou; includes different classes of medical images, such as non‐COVID‐19 viral pneumonia, bacterial pneumonia, and normal lungs. | ✓ | (Haghanifar et al., | Accuracy rate 92.8%, sensitivity rate 93.2% and specificity rate 90.1% | ‘ | |
| RSNA CXR DATASET (Shih et al., | Medical image | Diagnosis | chest X‐ray images | Multi‐expert curated chest X‐ray image dataset containing samples from the National Institutes of Health (NIH) CXR‐14 | ✓ | (Rahman, Khandakar, et al., | Accuracy rate 93.08%, sensitivity rate 97.53%, precision rate 93.15%, and F‐score of 94.57% | ‘ | |
| TWITTER COVID‐19 CXR DATASET | Medical image | Diagnosis | chest X‐ray images | A collection of 134 CXRs of confirmed SARS‐CoV‐2 cases, with 2K × 2K pixel resolution in JFIF format, provided by a cardiothoracic radiologist from Spain via Twitter | ✓ | (Haghanifar et al., | Accuracy rate 99.01%, and area under the curve 99.72% | ‘ | |
| MONTREAL COVID‐19 CXR DATASET (Cohen et al., | Medical image | Diagnosis | chest X‐ray images | A collection of 179 CXRs | ✓ | (Rajaraman, Siegelman, et al., | Accuracy rate 99.01%, and area under curve 99.72% | ‘ | |
| COVID‐19 Chest X‐Rays for Lung Severity Scoring (Cohen et al., | Medical image | Diagnosis | chest X‐ray images | 131 CXR images extracted from 84 COVID‐19 cases | ✓ | (Mangal et al., | Accuracy rate 96.6%, precision rate 93.17%, recall rate 98.25% and F‐measure rate 95.6% | ‘ | |
| TCIA dataset (Hu et al., | Medical image | Diagnosis | 3D CT lung scans | 60 3D CT images retrieved based on manual delineations of the lung anatomy | ✓ | (Choi et al., | iAUC: 0.620; 95% CI: 0.501–0.756 | ‘ | |
| large‐scale COVID‐19 CX‐R image dataset (Al‐Waisy et al., | Medical image | Diagnosis | chest X‐ray image | 800 images in four main classes: normal, COVID‐19, viral pneumonia, and bacterial pneumonia. | ✓ | (Al‐Waisy et al., | RMSE: 0.012%, MSE: 0.011%, specificity: 100%, sensitivity: 99.98%, precision: 100%, F1‐score: 99.99%, and accuracy rate: 99.99% | ‘ | |
| The World Mortality Dataset (Karlinsky & Kobak, | Outbreak | Statistical report | Time series data | Weekly, monthly, or quarterly all‐cause mortality data collected from 77 countries, openly available as the regularly updated World Mortality Dataset | ✓ | (Karlinsky & Kobak, | AUC score: 81.38% and accuracy score: 81.30% | ‘ |
Evaluation measures for COVID‐19 diagnosis models
| Measures | Equation | References |
|---|---|---|
| Accuracy | | (Apostolopoulos et al., |
| Sensitivity | | (Fayyoumi et al., |
| Specificity | | (Fayyoumi et al., |
| G_Mean | | (Fayyoumi et al., |
| Precision | | (Fayyoumi et al., |
| F1‐score | | (Alazab et al., |
| Root mean square error | | (Alazab et al., |
| Mean absolute percentage error | | (Arora et al., |
| Mean absolute error | | (Peng & Nagata, |
| Explained variance | | (Zeroual et al., |
| Root Mean squared log error | | (Zeroual et al., |
| Receiver operating characteristic | | (Z. Li et al., |
| Logarithmic loss | | (Vaid, Kalantar, & Bhandari, |
| Testing the execution time | | (Brunese et al., |
| Area under curve | | (Z. Li et al., |
| Average time for COVID‐19 detection | | (Brunese et al., |
| Matthews correlation coefficient | | (Cole et al., |
| Other evaluations (cluster visualization, | ‐ | (Vaid, Cakan, et al., |
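For reference, the textbook definitions of several measures named in the table are given below. These are the commonly used forms and may differ in notation from the equations in the reviewed studies. The symbols are assumptions introduced here: TP, TN, FP and FN are the counts of true positives, true negatives, false positives and false negatives; $y_i$ is an observed value, $\hat{y}_i$ a predicted value, $p_i$ a predicted probability of the positive class, and $n$ the number of samples.

```latex
\begin{align*}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Sensitivity} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{Precision}   &= \frac{TP}{TP + FP} \\
\text{G\_Mean}     &= \sqrt{\text{Sensitivity} \times \text{Specificity}} \\
\text{F1-score}    &= \frac{2 \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}} \\
\text{MCC}         &= \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \\
\text{RMSE}        &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \\
\text{MAE}         &= \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert \\
\text{MAPE}        &= \frac{100\%}{n} \sum_{i=1}^{n} \left\lvert \frac{y_i - \hat{y}_i}{y_i} \right\rvert \\
\text{RMSLE}       &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} \bigl(\log(1 + \hat{y}_i) - \log(1 + y_i)\bigr)^2} \\
\text{Log loss}    &= -\frac{1}{n} \sum_{i=1}^{n} \bigl[\, y_i \log p_i + (1 - y_i) \log(1 - p_i) \,\bigr]
\end{align*}
```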
FIGURE 10 Evaluation measures
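The confusion-matrix measures in the evaluation table (accuracy, sensitivity, specificity, precision, F1-score, G_Mean and the Matthews correlation coefficient) can be computed with a short sketch like the one below. It uses the standard textbook definitions. The function name `classification_metrics` and the example counts are hypothetical and are not taken from any reviewed study.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification measures from confusion counts."""

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    g_mean = math.sqrt(sensitivity * specificity)
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "g_mean": g_mean,
        "mcc": mcc,
    }


# Hypothetical counts for a test set of 200 cases.
m = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these hypothetical counts, accuracy is 185/200 = 0.925, sensitivity is 90/100 = 0.9 and specificity is 95/100 = 0.95. Reporting sensitivity and specificity alongside accuracy matters for diagnosis tasks because accuracy alone can look high on imbalanced data even when many positive cases are missed.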
FIGURE 11 The percentage of using machine learning approaches for COVID‐19
FIGURE 12 The percentage of using deep learning approaches for COVID‐19