Pablo Ormeño, Gastón Márquez, Camilo Guerrero-Nancuante, Carla Taramasco.
Abstract
Epivigila is a Chilean integrated epidemiological surveillance system with more than 17,000,000 Chilean patient records, making it an essential and unique source of information for the quantitative and qualitative analysis of the COVID-19 pandemic in Chile. Nevertheless, given the extensive volume of data managed by Epivigila, it is difficult for health professionals to classify the records and determine which symptoms and comorbidities are related to infected patients. This paper compares machine learning techniques (support-vector machine, decision tree and random forest) for determining whether a patient has COVID-19 based on the symptoms and comorbidities reported in Epivigila. From the group of patients with COVID-19, we selected a 10% sample of confirmed patients to execute and evaluate the techniques. We used precision, recall, accuracy, F1-score, and AUC to compare the techniques. The results suggest that the support-vector machine performs better than decision tree and random forest in terms of recall, accuracy, F1-score, and AUC. Machine learning techniques help process and classify large volumes of data more efficiently and effectively, speeding up healthcare decision making.
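The count-based metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch and not the authors' code: the function name `classification_metrics` and the counts in the usage line are hypothetical, chosen only for illustration. AUC is left out because it needs ranked prediction scores rather than counts.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Ratios with a zero denominator are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0  # harmonic mean
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not taken from the study).
example = classification_metrics(tp=80, fp=20, tn=70, fn=30)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Reporting recall and specificity side by side matters here because a classifier can trade one for the other, which a single accuracy figure would hide.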
Keywords: Epivigila; comorbidities; machine learning; symptoms
Year: 2022 PMID: 35805713 PMCID: PMC9265284 DOI: 10.3390/ijerph19138058
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 4.614
Figure 1. Proposed methodology.
Figure 2. Epidemiological surveillance process conducted in Chile.
Symptoms and comorbidities used in our study.
| Symptoms | Comorbidities |
|---|---|
| Tachypnoea | Asthma |
| Odynophagia | Chronic kidney disease |
| Cyanosis | Chronic lung disease |
| Abdominal pain | High blood pressure |
| Headache | Obesity |
| Fever | Immunocompromised patient |
| Diarrhoea | Chronic heart disease |
| Loss of taste | Diabetes |
| Myalgia | Chronic neurological disease |
| Chest pain | Chronic liver disease |
| Prostration | Cardiovascular disease |
| Dyspnoea | |
| Cough | |
| Loss of smell | |
Male and female demographics.
| All Patients | Suspected | Confirmed | Total |
|---|---|---|---|
| Mean age (interquartile range) | 37 (27–52) | 36 (26–51) | 39 (28–54) |
| Male gender (%) | 52.1% | 52.7% | 51.2% |
| Female gender (%) | 47.9% | 47.3% | 48.8% |
| Have symptoms (%) | 52.6% | 33.3% | 79.3% |
Figure 3. Distribution of confirmed patients by age.
Figure 4. Symptoms frequency.
Figure 5. Comorbidities frequency.
Figure 6. Relative frequency (percentage) of symptoms in confirmed and discarded patients.
Figure 7. Relative frequency (percentage) of comorbidities in confirmed and discarded patients.
Top 5 features reported by SVM, RF and DT techniques categorized by age ranges.
| Dataset | Technique | 1st | 2nd | 3rd | 4th | 5th |
|---|---|---|---|---|---|---|
| Age (0–20) | SVM | Abdominal pain | Loss of taste | Chronic kidney disease | Tachypnea | Chronic liver disease |
| | RF | Chronic heart disease | Odynophagia | Diarrhoea | Cough | Fever |
| | DT | Abdominal pain | Odynophagia | Diarrhoea | Loss of smell | Cough |
| Age (21–60) | SVM | Abdominal pain | High blood pressure | Chronic kidney disease | Asthma | Diabetes |
| | RF | Abdominal pain | Diarrhoea | Chronic heart disease | Cough | Fever |
| | DT | Abdominal pain | Cough | Odynophagia | Dyspnoea | Chronic heart disease |
| Age (61–96) | SVM | Abdominal pain | Diabetes | Loss of taste | Odynophagia | Fever |
| | RF | Abdominal pain | Cough | Diarrhoea | Chronic heart disease | Fever |
| | DT | Abdominal pain | Cough | Diarrhoea | Chronic heart disease | Fever |
| Age (0–96) | SVM | Abdominal pain | Chronic lung disease | Immunocompromised patient | Diabetes | Loss of taste |
| | RF | Abdominal pain | Cough | Diarrhoea | Chronic heart disease | Fever |
| | DT | Abdominal pain | Chronic heart disease | Cough | Cardiovascular disease | Odynophagia |
Precision, recall, F1-score, specificity, and AUC results for SVM, RF and DT techniques categorized by age ranges. Intense red indicates the highest values.
| Dataset | Technique | Precision | Recall | F1-score | Specificity | AUC |
|---|---|---|---|---|---|---|
| Age (0–20) | SVM | 0.613 | 0.680 | 0.645 | 0.574 | 0.640 |
| | RF | 0.628 | 0.604 | 0.616 | 0.712 | 0.636 |
| | DT | 0.628 | 0.558 | 0.591 | 0.737 | 0.626 |
| Age (21–60) | SVM | 0.717 | 0.785 | 0.749 | 0.705 | 0.739 |
| | RF | 0.735 | 0.721 | 0.728 | 0.758 | 0.732 |
| | DT | 0.731 | 0.667 | 0.697 | 0.792 | 0.712 |
| Age (61–96) | SVM | 0.730 | 0.811 | 0.768 | 0.687 | 0.753 |
| | RF | 0.717 | 0.690 | 0.704 | 0.747 | 0.705 |
| | DT | 0.718 | 0.607 | 0.658 | 0.779 | 0.680 |
| Age (0–96) | SVM | 0.727 | 0.798 | 0.761 | 0.681 | 0.748 |
| | RF | 0.739 | 0.740 | 0.740 | 0.740 | 0.738 |
| | DT | 0.746 | 0.684 | 0.713 | 0.784 | 0.724 |
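As a consistency check on the table above, the value reported immediately after recall in each row can be compared with the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below does this for the four SVM rows, with the numbers copied from the table. The helper name `f1_score` and the tolerance of 0.001 are assumptions; the tolerance allows for the published precision and recall already being rounded to three decimals.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported value after recall) for the SVM rows,
# copied from the table above.
svm_rows = {
    "Age (0-20)": (0.613, 0.680, 0.645),
    "Age (21-60)": (0.717, 0.785, 0.749),
    "Age (61-96)": (0.730, 0.811, 0.768),
    "Age (0-96)": (0.727, 0.798, 0.761),
}

TOLERANCE = 1e-3  # assumed slack for three-decimal rounding of the inputs

for dataset, (precision, recall, reported) in svm_rows.items():
    recomputed = f1_score(precision, recall)
    status = "consistent" if abs(recomputed - reported) < TOLERANCE else "mismatch"
    print(f"{dataset}: recomputed={recomputed:.4f} reported={reported:.3f} {status}")
```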
SVM, RF and DT accuracy results categorized by age range. Intense red indicates the highest values.
| Feature | 0–20 Years SVM | 0–20 Years RF | 0–20 Years DT | 21–60 Years SVM | 21–60 Years RF | 21–60 Years DT | 61–96 Years SVM | 61–96 Years RF | 61–96 Years DT | 0–96 Years SVM | 0–96 Years RF | 0–96 Years DT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Abdominal Pain | 0.9712 | 0.0711 | 0.1946 | 0.9962 | 0.1973 | 0.5565 | 0.9930 | 0.1723 | 0.4304 | 0.9919 | 0.1859 | 0.4294 |
| Asthma | 0.0016 | 0.0346 | 0.0527 | 0.0002 | 0.0211 | 0.0216 | 0.0000 | 0.0421 | 0.0359 | 0.0000 | 0.0053 | 0.0055 |
| Cardiovascular disease | 0.0000 | 0.0458 | 0.0379 | 0.0000 | 0.0289 | 0.0197 | 0.0000 | 0.0160 | 0.0231 | 0.0000 | 0.0308 | 0.0395 |
| Chest pain | 0.0001 | 0.0363 | 0.0484 | 0.0000 | 0.0042 | 0.0039 | 0.0000 | 0.0132 | 0.0169 | 0.0000 | 0.0312 | 0.0187 |
| Chronic heart disease | 0.0001 | 0.0946 | 0.0276 | 0.0000 | 0.1043 | 0.0292 | 0.0000 | 0.0676 | 0.0331 | 0.0000 | 0.0813 | 0.0657 |
| Chronic kidney disease | 0.1198 | 0.0052 | 0.0046 | 0.0632 | 0.0036 | 0.0043 | 0.0000 | 0.0137 | 0.0125 | 0.0000 | 0.0111 | 0.0141 |
| Chronic liver disease | 0.0849 | 0.0028 | 0.0049 | 0.0000 | 0.0152 | 0.0166 | 0.0000 | 0.0134 | 0.0104 | 0.0000 | 0.0039 | 0.0045 |
| Chronic lung disease | 0.0003 | 0.0116 | 0.0093 | 0.0000 | 0.0126 | 0.0170 | 0.0000 | 0.0224 | 0.0076 | 0.0001 | 0.0288 | 0.0247 |
| Chronic neurological disease | 0.0000 | 0.0029 | 0.0014 | 0.0000 | 0.0106 | 0.0094 | 0.0000 | 0.0088 | 0.0068 | 0.0000 | 0.0027 | 0.0003 |
| Cough | 0.0000 | 0.0793 | 0.0617 | 0.0000 | 0.0886 | 0.0334 | 0.0000 | 0.1032 | 0.0283 | 0.0000 | 0.0879 | 0.0566 |
| Cyanosis | 0.0002 | 0.0639 | 0.0516 | 0.0000 | 0.0308 | 0.0130 | 0.0000 | 0.0304 | 0.0302 | 0.0000 | 0.0326 | 0.0102 |
| Diabetes | 0.0007 | 0.0275 | 0.0367 | 0.0000 | 0.0193 | 0.0203 | 0.0000 | 0.0169 | 0.0170 | 0.0000 | 0.0148 | 0.0157 |
| Diarrhoea | 0.0000 | 0.0820 | 0.0680 | 0.0000 | 0.1221 | 0.0168 | 0.0000 | 0.0770 | 0.0432 | 0.0000 | 0.0863 | 0.0303 |
| Dyspnoea | 0.0001 | 0.0390 | 0.0394 | 0.0000 | 0.0268 | 0.0297 | 0.0000 | 0.0518 | 0.0465 | 0.0000 | 0.0403 | 0.0333 |
| Fever | 0.0000 | 0.0734 | 0.0464 | 0.0000 | 0.0744 | 0.0212 | 0.0000 | 0.0606 | 0.0093 | 0.0000 | 0.0652 | 0.0107 |
| Headache | 0.0283 | 0.0010 | 0.0022 | 0.0000 | 0.0001 | 0.0000 | 0.0000 | 0.0035 | 0.0026 | 0.0000 | 0.0014 | 0.0036 |
| High blood pressure | 0.0001 | 0.0035 | 0.0016 | 0.1019 | 0.0008 | 0.0012 | 0.0000 | 0.0152 | 0.0164 | 0.0000 | 0.0060 | 0.0054 |
| Immunocompromised patient | 0.0001 | 0.0010 | 0.0018 | 0.0000 | 0.0012 | 0.0020 | 0.0000 | 0.0173 | 0.0173 | 0.0000 | 0.0275 | 0.0299 |
| Loss of smell | 0.0001 | 0.0625 | 0.0645 | 0.0000 | 0.0329 | 0.0174 | 0.0000 | 0.0305 | 0.0268 | 0.0000 | 0.0375 | 0.0270 |
| Loss of taste | 0.4791 | 0.0521 | 0.0490 | 0.0000 | 0.0249 | 0.0268 | 0.0000 | 0.0228 | 0.0193 | 0.0000 | 0.0259 | 0.0281 |
| Myalgia | 0.0000 | 0.0487 | 0.0405 | 0.0000 | 0.0485 | 0.0257 | 0.0000 | 0.0158 | 0.0192 | 0.0000 | 0.0435 | 0.0253 |
| Obesity | 0.0000 | 0.0016 | 0.0013 | 0.0000 | 0.0030 | 0.0030 | 0.0000 | 0.0351 | 0.0168 | 0.0000 | 0.0193 | 0.0170 |
| Odynophagia | 0.0000 | 0.0820 | 0.0802 | 0.0000 | 0.0576 | 0.0316 | 0.0000 | 0.0277 | 0.0318 | 0.0000 | 0.0590 | 0.0380 |
| Prostration | 0.0000 | 0.0043 | 0.0056 | 0.0000 | 0.0013 | 0.0007 | 0.0000 | 0.0087 | 0.0129 | 0.0000 | 0.0078 | 0.0063 |
| Tachypnea | 0.1129 | 0.0061 | 0.0105 | 0.0000 | 0.0245 | 0.0233 | 0.0000 | 0.0376 | 0.0174 | 0.0000 | 0.0089 | 0.0047 |
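The top-5 table earlier in this section can be recovered by ranking the per-feature scores. Below is a minimal sketch, assuming the scores in the table above act as feature weights where a larger value means a more influential feature. The values are copied from the first numeric column (SVM, 0–20 years); the function name `top_k_features` is hypothetical.

```python
def top_k_features(scores: dict, k: int = 5) -> list:
    """Return the k feature names with the largest scores, highest first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Scores copied from the SVM column for ages 0-20 in the table above.
svm_scores_age_0_20 = {
    "Abdominal Pain": 0.9712,
    "Asthma": 0.0016,
    "Cardiovascular disease": 0.0000,
    "Chest pain": 0.0001,
    "Chronic heart disease": 0.0001,
    "Chronic kidney disease": 0.1198,
    "Chronic liver disease": 0.0849,
    "Chronic lung disease": 0.0003,
    "Chronic neurological disease": 0.0000,
    "Cough": 0.0000,
    "Cyanosis": 0.0002,
    "Diabetes": 0.0007,
    "Diarrhoea": 0.0000,
    "Dyspnoea": 0.0001,
    "Fever": 0.0000,
    "Headache": 0.0283,
    "High blood pressure": 0.0001,
    "Immunocompromised patient": 0.0001,
    "Loss of smell": 0.0001,
    "Loss of taste": 0.4791,
    "Myalgia": 0.0000,
    "Obesity": 0.0000,
    "Odynophagia": 0.0000,
    "Prostration": 0.0000,
    "Tachypnea": 0.1129,
}

print(top_k_features(svm_scores_age_0_20))
```

For this column the ranking reproduces the SVM row for ages 0–20 in the top-5 table: abdominal pain, loss of taste, chronic kidney disease, tachypnea, chronic liver disease.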