Malik Yousef, Louise C Showe, Izhar Ben Shlomo.
Abstract
The COVID-19 pandemic has flooded all triage stations, making it difficult to carefully select those most likely infected. Data on the total numbers of patients tested, infected, and hospitalized are fragmentary, which makes this selection harder still. The Israeli Ministry of Health made public its registry of immediate clinical data and the respective infected/not infected status for all viral DNA tests performed up to Apr. 18th, 2020, comprising almost 120,000 tests. We used a machine-learning algorithm to find out which immediate clinical elements, including age and gender, mattered most in identifying the true status of the tested persons, so that surveillance can in future be better allocated to those belonging to high-risk groups. In addition to the analyses applied to the first batch of the available data (up to Apr. 11th), we further tested the algorithm on the independent second batch (Apr. 12th to 18th). Fever, cough and headache were the most diagnostic, differing in degree of importance in different subgroups. A higher percentage of men were found positive (9.3 vs. 7.3%), but gender did not matter for the clinical presentation. The prediction power of the model was high, with an accuracy of 0.84 and an area under the curve of 0.92. We provide a hand-held short checklist with a verbal description of importance for the leading symptoms, which should expedite triage and enable proper selection of people for further follow-up.
Keywords: COVID_19; clinical presentation; machine learning; national registry; risk allocation
Year: 2021 PMID: 33675198 PMCID: PMC8035960 DOI: 10.1515/jib-2020-0050
Source DB: PubMed Journal: J Integr Bioinform ISSN: 1613-4516
Immediate clinical features recorded for people sampled for viral DNA in Israel.
| Parameter | Value | Details |
|---|---|---|
| Test date | dd/mm/yy | Day received in the lab |
| Gender | M/F/unknown | |
| Corona result | P/N/unknown (+at work) | The outcome of the COVID-19 test |
| Age ≥ 60 | Y/N/unknown | Value is 1 if ≥60; 0 otherwise |
| Cough | Y/N/unknown | Before the test |
| Fever | Y/N/unknown | Before the test |
| Sore throat | Y/N/unknown | Before the test |
| Shortness of breath | Y/N/unknown | Before the test |
| Headache | Y/N/unknown | Before the test |
| Test indication | Abroad/Contact with infected/Other | Arriving from abroad? Contact with infected person? Other |
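To make the registry layout above concrete, the following is a minimal sketch of how one row could be turned into numeric features for a classifier. The field names (`cough`, `age_60_plus`, `corona_result`, and so on) and the choice to represent "unknown" as `None` are assumptions made for illustration; they are not taken from the Ministry of Health file or from the authors' pipeline.

```python
from typing import Dict, Optional

# Hypothetical column names for the five symptom flags in the table above.
SYMPTOMS = ("cough", "fever", "sore_throat", "shortness_of_breath", "headache")


def encode_flag(value: str) -> Optional[int]:
    """Map 'Y'/'1' to 1 and 'N'/'0' to 0; anything else (e.g. 'unknown') to None."""
    v = value.strip().upper()
    if v in ("Y", "1"):
        return 1
    if v in ("N", "0"):
        return 0
    return None


def encode_record(row: Dict[str, str]) -> Dict[str, Optional[int]]:
    """Encode one registry row as binary features plus a binary label."""
    features = {name: encode_flag(row.get(name, "unknown")) for name in SYMPTOMS}
    features["age_60_plus"] = encode_flag(row.get("age_60_plus", "unknown"))
    # Gender: M -> 1, F -> 0, unknown -> None.
    gender = row.get("gender", "unknown").strip().upper()
    features["male"] = {"M": 1, "F": 0}.get(gender)
    # Test outcome: P (positive) -> 1, N (negative) -> 0, unknown -> None.
    result = row.get("corona_result", "unknown").strip().upper()
    features["label"] = {"P": 1, "N": 0}.get(result)
    return features
```

Rows whose label or features come back as `None` would then have to be dropped or imputed before training; which of the two the authors did is not stated here.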
National Israeli COVID-19 DNA tests performed until Apr. 11, 2020, without the “Test Indication” feature (first batch). Average results obtained from 100 Monte Carlo cross-validation (MCCV) iterations.
| Ratio | #pos | #neg | Acc | Sen | Spe | Prec | F1 | AUC |
|---|---|---|---|---|---|---|---|---|
| 1:1 | 6,427 | 6,428 | 0.929 | 1.000 | 0.859 | 0.876 | 0.934 | 0.959 |
| 1:2 | 6,427 | 12,855 | 0.906 | 0.998 | 0.860 | 0.781 | 0.876 | 0.963 |
| 1:3 | 6,427 | 19,282 | 0.902 | 0.950 | 0.886 | 0.740 | 0.829 | 0.966 |
| Stdv | | | 0.007 | 0.011 | 0.013 | 0.015 | 0.008 | 0.004 |
| Random labels (ratio 1:2) | | | 0.36 | 0.98 | 0.05 | 0.34 | 0.51 | 0.51 |
| Tests on second batch data (Apr. 12th to 18th 2020) | | | 0.84 | 0.82 | 0.84 | | | 0.92 |
The Ratio column gives the ratio of positives to negatives. Acc: accuracy; Sen: sensitivity; Spe: specificity; Prec: precision; F1: F1-measure; AUC: area under the curve. “Stdv” is the average of the standard deviations for each corresponding measurement. The “Random labels” row gives the result for the same data after shuffling the labels. The last row gives the results obtained by deploying the model trained on the first batch on the second batch of the data (recorded Apr. 12th to 18th).
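The measures defined above are linked: for a given positive-to-negative ratio, accuracy, precision and F1 follow from sensitivity and specificity. The sketch below uses the standard definitions to show that relationship. The function name and signature are illustrative only. Because the tabulated values are averages over repeated runs, they should be expected to match these identities only approximately.

```python
def metrics_from_rates(sensitivity: float, specificity: float, neg_per_pos: float) -> dict:
    """Derive accuracy, precision and F1 from sensitivity and specificity.

    neg_per_pos is the number of negatives per positive (2 for a 1:2 ratio).
    Counts are expressed per single positive sample.
    """
    tp = sensitivity                        # true positives
    tn = specificity * neg_per_pos          # true negatives
    fp = (1.0 - specificity) * neg_per_pos  # false positives
    accuracy = (tp + tn) / (1.0 + neg_per_pos)
    precision = tp / (tp + fp)
    f1 = 2.0 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "precision": precision, "f1": f1}


# Sensitivity and specificity taken from the ratio 1:2 row of the table.
print(metrics_from_rates(0.998, 0.860, 2))
```

For the 1:2 row the derived accuracy, precision and F1 come out close to the tabulated 0.906, 0.781 and 0.876, which is a quick consistency check on how the columns line up.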
Figure 1: The decision tree built from immediate clinical data, obtained by applying the DT to all first-batch data published by the Ministry of Health on detection of viral DNA of COVID-19 up to Apr. 11, 2020, using the dataNonTI set at ratio 1:2.
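The fixed positive-to-negative ratios and the repeated random evaluation referred to in the table caption can be illustrated with a short sketch. It assumes that the ratio is reached by randomly undersampling the negatives and that each Monte Carlo cross-validation iteration is an independent random train/test split. Neither detail, nor the 10% test fraction used as a default, is confirmed by the text above; all names are hypothetical.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def undersample(labels: Sequence[int], neg_per_pos: int, seed: int = 0) -> List[int]:
    """Return sorted indices of all positives plus randomly drawn negatives.

    At most neg_per_pos negatives are kept per positive (labels: 1 = positive, 0 = negative).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_neg = min(len(neg), neg_per_pos * len(pos))
    return sorted(pos + rng.sample(neg, n_neg))


def mccv_splits(indices: Sequence[int], n_iter: int = 100,
                test_fraction: float = 0.1, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists from n_iter independent random splits."""
    rng = random.Random(seed)
    n_test = max(1, int(len(indices) * test_fraction))
    for _ in range(n_iter):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]
```

A model would be fitted on each `train` list and scored on the matching `test` list, and the per-iteration scores averaged to give one row of the results table.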
Significance scores of each clinical feature in the National Israeli COVID-19 DNA tests, as assigned by the RF model on the first batch of the data (until Apr. 11, 2020).
| Feature | Score |
|---|---|
| Fever | 1 |
| Cough | 0.643832075 |
| Headache | 0.462187066 |
| Sore throat | 0.227116751 |
| Shortness of breath | 0.110647828 |
| Gender | 0.009394193 |
| Age ≥ 60 | 0.007142431 |
A checklist for triage of patients seen for possible COVID-19 infection.
| Feature | Likelihood of infection |
|---|---|
| 1. fever | Very high |
| 2. cough | Both present – absolute |
| 3. headache | |
| 4. sore throat | Absolute |
| 5. shortness of breath | Absolute |
| 6. gender | Marginal |
| 7. age ≥ 60 | More likely to be |