Saratu Yusuf Ilu, Prasad Rajesh, Hassan Mohammed.
Abstract
Coronaviruses are a large family of viruses that cause infections in both animals and humans, including the common cold, coronavirus disease 2019 (COVID-19), severe acute respiratory syndrome (SARS), and Middle East respiratory syndrome (MERS). This study aims to predict the number of COVID-19 positive cases in the 36 states of Nigeria using a long short-term memory (LSTM) deep learning algorithm. The proposed approach employs K-means clustering to detect outliers and principal component analysis (PCA) to select important features from the dataset. LSTM was chosen because it can model non-linear relationships, and COVID-19 case counts follow a non-linear pattern, which makes LSTM well suited to predicting them. For comparison, several other machine learning algorithms, such as naive Bayes, XGBoost, and a support vector machine (SVM), were also evaluated. In this comparison, LSTM outperformed all the other algorithms.
Keywords: COVID-19; Classification and clustering; LSTM; Machine learning
Year: 2022 PMID: 35677733 PMCID: PMC9164683 DOI: 10.1016/j.imu.2022.100990
Source DB: PubMed Journal: Inform Med Unlocked ISSN: 2352-9148
Fig. 1. Scatter diagram for COVID-19 cases and dates.
Sample of the dataset before preprocessing (from https://covid19.ncdc.gov.ng/report/#!).
| State | Confirmed (total) | Confirmed (last week) | Recoveries (total) | Recoveries (last week) | Deaths (total) | Deaths (last week) | Active cases (total) | Testing (total) | Testing (last week) |
|---|---|---|---|---|---|---|---|---|---|
| Abia | 2,153 | 1 | 2,118 | 6 | 34 | 0 | 1 | 51,549 | 5,782 |
| Adamawa | 1,203 | 0 | 1,103 | 0 | 32 | 0 | 68 | 30,873 | 36 |
| Akwa Ibom | 4,638 | 0 | 4,562 | 0 | 44 | 0 | 32 | 57,574 | 4,397 |
| Anambra | 2,825 | 0 | 2,760 | 0 | 19 | 0 | 46 | 55,303 | 216 |
| Bauchi | 1,939 | 0 | 1,882 | 0 | 24 | 0 | 33 | 37,712 | 7 |
| Bayelsa | 1,310 | 2 | 1,277 | 0 | 28 | 0 | 5 | 37,682 | 83 |
| Benue | 2,129 | 0 | 1,764 | 0 | 25 | 0 | 340 | 50,361 | 210 |
| Borno | 1,629 | 0 | 1,580 | 0 | 44 | 0 | 5 | 28,135 | 294 |
| Cross River | 805 | 20 | 760 | 7 | 25 | 0 | 20 | 18,976 | 106 |
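The raw report stores counts as text with thousands separators, so a preprocessing step has to convert them to integers. The sketch below shows one way to do that and to cross-check the active-cases column. The identity `active = confirmed - recoveries - deaths` is an assumption inferred from the sample rows, not something the source states, and the function names are illustrative.

```python
def parse_count(cell: str) -> int:
    """Convert a cell such as '2, 153', '2,153' or '1764' to an integer."""
    return int(cell.replace(",", "").replace(" ", ""))


def active_cases(confirmed: str, recoveries: str, deaths: str) -> int:
    """Active cases implied by the cumulative totals (assumed identity)."""
    return parse_count(confirmed) - parse_count(recoveries) - parse_count(deaths)


# Cumulative totals taken from three of the sample rows:
# (state, confirmed, recoveries, deaths, reported active cases)
rows = [
    ("Abia", "2, 153", "2, 118", "34", "1"),
    ("Benue", "2, 129", "1764", "25", "340"),
    ("Cross River", "805", "760", "25", "20"),
]

for state, confirmed, recoveries, deaths, reported in rows:
    derived = active_cases(confirmed, recoveries, deaths)
    # Print the derived value and whether it matches the reported column.
    print(state, derived, derived == parse_count(reported))
```

A mismatch between the derived and reported values would be a cheap signal that a row was mis-parsed or that columns were shifted during extraction.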
Sample of the dataset after preprocessing.
| Date | Id | Confirmed cases |
|---|---|---|
| | 46 | 9940 |
| | 47 | 10300 |
| | 48 | 11179 |
| | 49 | 9676 |
| | 50 | 8506 |
| | 51 | 6606 |
| | 52 | 5720 |
| | 53 | 3583 |
| | 54 | 2878 |
| | 55 | 2122 |
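A sequence model such as an LSTM is usually trained on fixed-length windows of past values paired with the next value. The sketch below applies that framing to the confirmed-case counts in the sample. The look-back length of 3 and the min-max scaling are assumptions chosen for illustration; the source does not state which window length or scaling was used.

```python
from typing import List, Tuple


def min_max_scale(series: List[float]) -> List[float]:
    """Rescale a series to the range [0, 1]."""
    lo, hi = min(series), max(series)
    return [(value - lo) / (hi - lo) for value in series]


def make_windows(series: List[float], lookback: int) -> Tuple[List[List[float]], List[float]]:
    """Split a series into (input window, next value) pairs."""
    inputs, targets = [], []
    for start in range(len(series) - lookback):
        inputs.append(series[start:start + lookback])
        targets.append(series[start + lookback])
    return inputs, targets


# Confirmed-case counts for Ids 46 to 55 in the sample.
confirmed = [9940, 10300, 11179, 9676, 8506, 6606, 5720, 3583, 2878, 2122]

scaled = min_max_scale(confirmed)
X, y = make_windows(scaled, lookback=3)  # lookback=3 is an assumed value
print(len(X), "training pairs, each with", len(X[0]), "inputs")
```

Each row of `X` would feed the network as one input sequence, with the matching entry of `y` as the value to predict.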
Fig. 2. LSTM architecture containing memory blocks.
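For reference, the update performed by one memory block can be written with the standard LSTM equations below. This is the common textbook formulation and is given here as an assumption; the source does not specify which LSTM variant was used. Here $x_t$ is the input at step $t$, $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the logistic sigmoid, and $\odot$ element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The gates are what let the cell state carry information across many time steps, which is the property the abstract relies on when it describes LSTM as suited to non-linear case counts.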
Fig. 3. Flowchart of the COVID-19 prediction model.
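The abstract says K-means clustering is used to detect outliers but does not describe how. One common variant, sketched below as an assumption, clusters the values and flags points that lie unusually far from their own centroid. The cluster count, the threshold, and the sample values are all hypothetical.

```python
from statistics import mean, pstdev


def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's algorithm on scalars; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids over the sorted data.
    centroids = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def flag_outliers(values, k=2, threshold=2.0):
    """Return values whose distance to their centroid exceeds
    mean + threshold * standard deviation of all such distances."""
    centroids, labels = kmeans_1d(values, k)
    distances = [abs(v - centroids[lab]) for v, lab in zip(values, labels)]
    cutoff = mean(distances) + threshold * pstdev(distances)
    return [v for v, d in zip(values, distances) if d > cutoff]


# Hypothetical daily counts with one suspicious spike.
daily_counts = [100, 110, 105, 98, 102, 300, 310, 295, 305, 299, 420]
print(flag_outliers(daily_counts))
```

This distance rule has a known weakness: a sufficiently extreme value can capture a centroid of its own and then sit at distance zero, so very small clusters may also need to be treated as outliers.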
Comparison of ML algorithms.
| Algorithm | Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| Naïve Bayes | 69% | 75% | 70% |
| SVM | 92% | 89% | 90% |
| LSTM | 98.1% | 98% | 98% |
| XGBoost | 91% | 76% | 80% |
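The three columns in the comparison are standard confusion-matrix metrics. The sketch below shows how they are conventionally computed from true/false positive and negative counts. The counts are hypothetical, and the source does not say how the classes were defined or which counts produced the reported percentages.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),    # true positive rate (recall)
        "specificity": tn / (tn + fp),    # true negative rate
    }


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because accuracy blends both classes, a model can score well on it while lagging on sensitivity, as the XGBoost row illustrates (91% accuracy against 76% sensitivity).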