Combination of four clinical indicators predicts the severe/critical symptom of patients infected COVID-19.

Liping Sun1, Fengxiang Song2, Nannan Shi2, Fengjun Liu2, Shenyang Li2, Ping Li3, Weihan Zhang2, Xiao Jiang4, Yongbin Zhang3, Lining Sun5, Xiong Chen2, Yuxin Shi6.   

Abstract

BACKGROUND: Although the death rate of COVID-19 is less than 3%, the fatality rate of severe/critical cases is high, according to the World Health Organization (WHO). Identifying patients who are likely to become severe/critical before symptoms occur could therefore save medical resources. METHODS AND MATERIALS: In this study, all 336 patients infected with COVID-19 in Shanghai up to March 12th were retrospectively enrolled and divided into training and test datasets. In addition, 220 clinical and laboratory observations/records were collected. Clinical indicators associated with severe/critical symptoms were identified, and a model for predicting severe/critical symptoms was developed.
RESULTS: In total, 36 clinical indicators significantly associated with severe/critical symptoms were identified. These indicators were mainly thyroxine-related and immune-related cells and products. A Support Vector Machine (SVM) using an optimized combination of age, GSH, CD3 ratio and total protein performed well in discriminating mild from severe/critical cases. The area under the receiver operating characteristic curve (AUROC) reached 0.9996 and 0.9757 in the training and testing datasets, respectively. With a cut-off value of 0.0667, the recall rate was 93.33 % and 100 % in the training and testing datasets, respectively. Cox multivariate regression and survival analyses revealed that the model significantly discriminated the severe/critical cases and made use of the information in the selected clinical indicators.
CONCLUSION: The model was robust and effective in predicting severe/critical COVID-19 cases.
Copyright © 2020 The Authors. Published by Elsevier B.V. All rights reserved.

Keywords:  COVID-19; Prediction; SVM; critical/severe symptom

Year:  2020        PMID: 32442756      PMCID: PMC7219384          DOI: 10.1016/j.jcv.2020.104431

Source DB:  PubMed          Journal:  J Clin Virol        ISSN: 1386-6532            Impact factor:   3.168


Introduction/Background

As of March 14th, 2020, the COVID-19 (SARS-CoV-2) epidemic had caused 81,021 infections and 3194 deaths in China; in other countries, 64,299 cases and 2234 deaths had been reported. According to current knowledge, the coronavirus shares 79 % of its sequence with SARS-CoV, which was prevalent in 2002–2003, especially in China, and 96 % with a bat coronavirus [1]. The virus uses ACE2 as its receptor for cell entry [2]. Clinical observations suggest that the incubation time for COVID-19 is 3–5 days, ranging from 0 to 24 days or more, similar to SARS [3]. According to a study in Wuhan, the mean incubation period was 5.2 days (95 % CI, 4.1–7.0 days), and the epidemic doubled every 7.4 days [4]. The R0 was estimated to be 2.24–3.58 [5]. In a previous study, the most common early clinical symptoms were fever (98 %), cough (76 %), dyspnea (55 %) and myalgia or fatigue (44 %); sputum production (28 %) and headache (8 %) were also reported [4]. Consistent with that study, fever (91.7 %), cough (75.0 %), fatigue (75.0 %) and gastrointestinal symptoms (39.6 %) were the most common clinical manifestations [6]. Laboratory features including leukopenia (25 %), lymphopenia (25 %) and raised aspartate aminotransferase (37 %, including seven of 28 non-ICU patients) were also reported. In addition, abnormalities of AST, ALT, γ-GT, LDH and α-HBDH were reported [5]. Histopathologic changes and CT features have also been described [6,7]. Clinically, the criteria for severe cases were respiratory distress (more than 30 breaths/min), SpO2 < 93 % at rest, and PaO2/FiO2 ≤ 300 mmHg. Critical cases were defined by respiratory failure, shock and extrapulmonary organ failure [8]. However, mild cases may develop into severe or critical ones. Despite the effort devoted to CT-based early diagnosis of critical cases [9], its performance remains unclear, and no model for predicting whether a mild case will become severe or critical has yet been reported.
In this study, we aimed to identify the initial clinical observations or laboratory features significantly associated with severe/critical cases, and to predict whether the disease would progress to a severe/critical state. Machine learning has been emphasized as a tool for investigating COVID-19 [10].

Materials and methods

Sample enrollment and Clinical feature collection

This study was approved by the Ethics Committee of Shanghai Public Health Clinical Center, and all patients gave informed consent. All patients diagnosed by PCR in Shanghai from 2019-12-22 to 2020-3-12 were enrolled. Because Shanghai Public Health Clinical Center was the only hospital designated for COVID-19 treatment, patients were transferred there days after the initial diagnosis; the clinical and laboratory features were generated at the Center, and these samples were treated as the initial ones. Temperature, heart rate and blood pressure were recorded when the patients reached the hospital. Demographic information, laboratory features and clinical indicators were collected from the electronic record system of Shanghai Public Health Clinical Center and re-arranged manually by expert doctors. Access to the system was approved by the director of the hospital. History of hypertension, diabetes, coronary disease and tuberculosis was collected for each patient. A severe/critical symptom was defined as meeting any one of the following criteria: (a) respiratory frequency ≥30/min; (b) resting pulse oximeter oxygen saturation ≤93 %; or (c) oxygenation index (PaO2/FiO2) ≤ 300 mm Hg.
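
The three criteria above amount to a simple any-of rule. The following is a minimal Python sketch of that rule, not the authors' code; the parameter names (`resp_rate`, `spo2`, `pao2_fio2`) and the handling of unmeasured values as `None` are assumptions made for illustration.

```python
from typing import Optional


def is_severe_critical(resp_rate: Optional[float],
                       spo2: Optional[float],
                       pao2_fio2: Optional[float]) -> bool:
    """Return True if any of the three criteria from the text is met.

    resp_rate : respiratory frequency in breaths/min (criterion a, >= 30)
    spo2      : resting pulse-oximeter oxygen saturation in % (criterion b, <= 93)
    pao2_fio2 : oxygenation index PaO2/FiO2 in mm Hg (criterion c, <= 300)
    A value of None means "not measured" and cannot trigger its criterion
    (an assumption of this sketch, not something stated in the paper).
    """
    return any([
        resp_rate is not None and resp_rate >= 30,
        spo2 is not None and spo2 <= 93,
        pao2_fio2 is not None and pao2_fio2 <= 300,
    ])


print(is_severe_critical(22, 97, 420))    # no criterion met
print(is_severe_critical(31, 97, None))   # criterion (a) met
```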

Laboratory assays

Pharyngeal swab specimens collected from each patient were used for COVID-19 viral nucleic acid detection by PCR assay, as previously described [6]. All laboratory data were generated at Shanghai Public Health Clinical Center according to the guidelines. The laboratory features include: Systolic pressure, Urine protein, Urinary red blood

Statistical analyses

For sample demographic analysis, Fisher's exact test was used. For feature selection, both Student's t-test and the Wilcoxon rank test were applied to each clinical/laboratory feature, and features that differed significantly (p < 0.01) in both tests were retained. Survival analysis was implemented with the R package “survival”, using the onset of critical/severe symptoms as the event and the time to that event as the survival time; p < 0.01 was considered significant. All analyses were performed on the R platform (v3.6).
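
The dual-test rule for feature selection can be sketched as a filter over two sets of p-values. The Python snippet below is an illustration only: it assumes the per-feature p-values have already been computed elsewhere (for instance with R's `t.test` and `wilcox.test`), and the function name, feature names and p-values are all made up.

```python
from typing import Dict, List


def retain_features(p_ttest: Dict[str, float],
                    p_wilcoxon: Dict[str, float],
                    alpha: float = 0.01) -> List[str]:
    """Keep features whose p-value is below alpha in BOTH tests."""
    return sorted(
        name for name in p_ttest
        if name in p_wilcoxon
        and p_ttest[name] < alpha
        and p_wilcoxon[name] < alpha
    )


# Made-up p-values for illustration only; they are not results from the study.
p_t = {"feature_a": 0.0004, "feature_b": 0.0080, "feature_c": 0.2000}
p_w = {"feature_a": 0.0010, "feature_b": 0.0300, "feature_c": 0.1500}

# feature_b passes the t-test but not the Wilcoxon test, so it is dropped.
print(retain_features(p_t, p_w))
```

Requiring significance in both a parametric and a rank-based test is stricter than either test alone, which suits a setting with 220 candidate features and only 26 severe/critical cases.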

Results

Demographics and clinical characteristics

A total of 336 patients diagnosed with COVID-19 by PCR kit were enrolled in this study, comprising 310 non-severe/critical cases and 26 severe/critical cases (Table 1). Ten of the 26 severe/critical cases already showed critical/severe symptoms when they reached the hospital. Among all cases, 74 were from Wuhan, Hubei Province, 4 were from Iran, and the others were from other regions of China. The median age of all cases was 50 years; the median age was 48 for non-severe patients and 65 for severe or critical patients. Among these patients, 79 had hypertension, 29 had diabetes, 17 had coronary disease, and 4 had a history of tuberculosis.
Table 1

Characteristics of samples enrolled. Note that not all information was collected.

Characteristic            All        Non-S/C    S/C        p value
Age                       50         48         65         3.10E-06
                          (36−49)    (35−62)    (63−75)
Gender: Female            158        152        6          0.013
Gender: Male              177        157        20
Hypertension: No          256        241        15         0.028
Hypertension: Yes         79         68         11
Diabetes: No              301        281        20         0.056
Diabetes: Yes             29         24         5
Coronary disease: No      319        298        21         0.0061
Coronary disease: Yes     17         12         5

Clinical and laboratory features associated with severe/critical cases

The clinical and laboratory results of the enrolled patients were analyzed. In total, 249 laboratory and clinical records were obtained from the initial assays performed within 24 h of the patients' arrival at the hospital, including but not limited to liver function tests, blood tests and immunocytochemistry. The data were re-arranged and cleaned, and features with few records were excluded; for example, HBV load had fewer than 6 records. Finally, 220 features were included. The samples were divided into non-severe/critical and severe/critical groups, and the clinical features were compared between them. Student's t-test and the Wilcoxon rank test were used, and features with p < 0.01 in both tests were considered significantly associated with severe/critical symptoms. In total, 36 clinical and laboratory features were significantly associated with severe/critical cases (Fig. 1). These were mainly immune features (including CD3, CD4, CD19, CRP, super-sensitive CRP, leukomonocytes and neutrophils), thyroid hormones (including triiodothyronine, free triiodothyronine, thyroxine and free thyroxine), and electrolyte balance (Na+, Cl−). Because severe/critical symptoms were already present in 10 of these 26 patients when they reached the hospital, these features may reflect the characteristics of severe/critical cases rather than early signs of them. In other words, these features might serve for diagnosis rather than prediction. The severe/critical samples were therefore further divided into two groups: one that did not show severe/critical symptoms when the samples were collected and one that did. The statistical differences of these 33 features were re-analyzed. Interestingly, none of these features differed significantly between the two groups (Table S1). This may imply that the various immune cells participate in the severe/critical disease, and that the laboratory features are already present before the onset of severe/critical symptoms.
Fig. 1

The clinical indicators of severe/critical and non-severe/critical cases. A. The clinical feature values were z-score transformed. Red indicates high values, white indicates missing values and green indicates low values. The blue columns represent the mild cases, while the red columns represent severe/critical samples. B. Violin plots of the indicators; the two groups on the x-axis of each panel are mild and severe cases, respectively, and the y-axis represents the value of the indicator (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.).

Support Vector Machine (SVM) for discriminating the severe/critical disease

Since the performance of single clinical indicators was not satisfactory, combinations of features were considered for prediction. Because the risk of over-fitting increases rapidly with the number of features, combinations of fewer than five features were used. An exhaustive enumeration method, which lists all combinations of the features, was applied to combinations of 2, 3 and 4 features. The samples were divided into training and testing datasets. The training dataset contains 15 severe/critical cases and 178 mild cases, while the testing dataset consists of 11 severe/critical cases and 132 mild cases. In this step, a Support Vector Machine (SVM) model was developed on the training set using the selected features and then used to predict the outcome in the testing set. The area under the receiver operating characteristic curve (AUROC) was used to evaluate the performance of the model in both the training and testing datasets. As expected, the models performed better than single features. Among the combinations, that of age, GSH, CD3 percentage and total protein (the AUROC for each feature alone was 0.79.3, 0.7970, 0.8147 and 0.7443, respectively) reached an AUROC of 0.9996 in the training dataset (Fig. 2a, Table 2). This combination was therefore used; the performance of the model was also satisfactory in the testing dataset, with an AUROC of 0.9757 (Fig. 2b).
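
To give a sense of the size of this search, the sketch below enumerates every combination of 2, 3 and 4 features from a pool of 36 and shows a rank-based AUROC that could be used to score each candidate. It is a hedged illustration in standard-library Python, not the authors' pipeline (which ran in R): the feature names are placeholders, the scores are toy numbers, and the SVM fitting step itself is omitted.

```python
from itertools import combinations
from math import comb
from typing import Iterator, Sequence, Tuple


def candidate_combinations(features: Sequence[str],
                           sizes: Sequence[int] = (2, 3, 4)
                           ) -> Iterator[Tuple[str, ...]]:
    """Yield every feature combination of the given sizes, smallest first."""
    for k in sizes:
        yield from combinations(features, k)


def auroc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive case outranks a negative one.

    Ties count one half (the Mann-Whitney formulation of the AUROC).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Placeholder names standing in for the 36 retained features.
features = [f"feature_{i}" for i in range(36)]
n_candidates = sum(comb(len(features), k) for k in (2, 3, 4))
print(n_candidates)  # 630 + 7140 + 58905 = 66675

# Toy scores, not study data: 8 of the 9 positive/negative pairs are ordered correctly.
print(auroc([0.9, 0.8, 0.4], [0.5, 0.2, 0.1]))
```

With tens of thousands of candidate models and only 15 severe/critical cases in the training set, the best training AUROC is likely to be optimistic, which is why the held-out testing AUROC matters.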
Fig. 2

Receiver Operating Characteristic (ROC) curves evaluating the performance of the SVM model in the training (A) and testing (B) datasets. The black dots mark the optimized cut-off value (0.0667).

Table 2

The combinations that performed best in the training set using SVM models.

Combinations                                                            Training AUC   Testing AUC
Age, GSH, CD3 ratio, total protein                                      0.999616858    0.975711
Neutrophil percentage, albumin, GSH, CD4 ratio                          0.997318008    0.975711
HCRP, Serum myoglobin, CL, CD4 ratio                                    0.998357964    0.969466
Age, Cl, Calcium, LDH                                                   0.997318008    0.951748
Age, Serum myoglobin, Retinol binding protein, Acid glycoprotein        0.990960452    0.951423
Neutrophil percentage, Procalcitonin, Serum myoglobin, total protein    0.977024482    0.958362

The detailed prediction results are shown in Table 3. Using the optimized cut-off value of 0.0667, only one sample in the training dataset was a false negative; all other samples were correctly predicted. When the same model was applied to the testing dataset, the recall rate was 100 %, with 15 false positives and no false negatives. In summary, the four-feature SVM model is robust and effective in predicting severe/critical patients.

Performance of SVM model

The performance of the SVM model was further examined by survival analysis. Since only three deaths occurred among the enrolled cases, the “event” was defined as the time at which a clinical severe/critical symptom was observed. Using the aforementioned cut-off value of 0.0667, the samples were divided into two groups, named the Low-risk and High-risk groups. Because the number of samples with severe/critical symptoms is limited, the training and testing sets were combined for these analyses. As expected, the High-risk group had a higher severe rate than the Low-risk group (Fig. 3a, p<1e-16). Because a proportion of cases already showed severe/critical symptoms at sampling, which may bias the analysis, these samples were then excluded from the “survival” analysis. Consistent with the previous result, the severe/critical symptom rate of the High-risk group remained significantly higher than that of the Low-risk group (Fig. 3b, p<1e-16).
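
The grouping step can be sketched as follows. This Python snippet only stratifies samples by the reported cut-off and compares crude event rates; it ignores time-to-event, so it is a simplification and not the survival analysis performed with the R package “survival”. The function names and the toy samples are assumptions for illustration.

```python
from typing import Dict, List, Sequence, Tuple

CUTOFF = 0.0667  # optimized cut-off value reported in the text


def split_by_risk(samples: Sequence[Tuple[float, bool]],
                  cutoff: float = CUTOFF) -> Dict[str, List[bool]]:
    """Group samples by predicted value; each sample is (risk_value, had_event)."""
    groups: Dict[str, List[bool]] = {"Low-risk": [], "High-risk": []}
    for risk_value, had_event in samples:
        label = "High-risk" if risk_value > cutoff else "Low-risk"
        groups[label].append(had_event)
    return groups


def event_rate(events: Sequence[bool]) -> float:
    """Fraction of a group that reached a severe/critical state."""
    return sum(events) / len(events) if events else float("nan")


# Toy samples for illustration only; these are not patients from the study.
toy = [(0.01, False), (0.03, False), (0.05, True),
       (0.20, True), (0.70, True), (0.09, False)]
groups = split_by_risk(toy)
for label, events in groups.items():
    print(label, len(events), round(event_rate(events), 3))
```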
Fig. 3

Performance of the model. “Survival” analysis of the High-risk and Low-risk groups in all samples (A) and in samples without severe/critical symptoms at sampling (B). The predicted values in different groups (C).

In addition to the survival analyses, the predicted risk value was compared between severe/critical and mild cases. As expected, the risk value of severe/critical cases was significantly higher than that of mild cases (Fig. 3c). Cox multivariate regression showed that, among the features used in the model, GSH, total protein and CD3 percentage were not statistically significant, whereas age and the predicted risk value were (Table 4). Notably, although age is statistically significant, its hazard ratio is much lower than that of the model's risk value (1.04 vs 33), indicating that the model is more informative than the individual features (Table 4).
Table 3

True positive, true negative, false positive and false negative values of the model in training and testing datasets.

Training         Predicted Positive   Predicted Negative
Real Positive    14                   1
Real Negative    0                    174
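
The recall figures quoted for the model follow directly from these counts. The sketch below recomputes them in Python; the training counts come from Table 3, while the testing counts (11 true positives, no false negatives, 15 false positives) are taken from the numbers stated in the text. The `Confusion` class is a hypothetical helper, not part of the study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # severe/critical cases predicted positive
    fn: int  # severe/critical cases predicted negative
    fp: int  # mild cases predicted positive

    @property
    def recall(self) -> float:
        """Share of real positives that the model flags."""
        return self.tp / (self.tp + self.fn)

    @property
    def precision(self) -> float:
        """Share of flagged patients who are real positives."""
        return self.tp / (self.tp + self.fp)


training = Confusion(tp=14, fn=1, fp=0)   # Table 3
testing = Confusion(tp=11, fn=0, fp=15)   # counts stated in the text

print(f"training recall    {training.recall:.4f}")     # 0.9333
print(f"training precision {training.precision:.4f}")  # 1.0000
print(f"testing recall     {testing.recall:.4f}")      # 1.0000
print(f"testing precision  {testing.precision:.4f}")   # 0.4231
```

The low cut-off trades precision for recall: in the testing set every severe/critical case is caught, but more than half of the flagged patients are mild.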
Table 4

Cox multivariate regression using features and predicted values.

Variables        HR        Lower 95 % CI   Upper 95 % CI   p-value
Age              1.0425    1.0025          1.084           0.0368
GSH              0.9966    0.9744          1.019           0.7703
CD3 Percent      0.9817    0.9427          1.022           0.3715
Total protein    0.9307    0.8553          1.013           0.0958
Predict value    32.9883   8.6023          126.505         3.43E-07
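
A hazard ratio in a Cox model is the exponential of the fitted coefficient, and a Wald-type 95 % interval is exp(β ± 1.96·SE), so the reported HR should sit at the geometric mean of its two bounds. The short Python check below applies this to two rows of Table 4. It assumes the intervals are Wald intervals, which the text does not state explicitly, and the variable names are illustrative.

```python
from math import log, sqrt

# (HR, lower 95 % CI, upper 95 % CI) as printed in Table 4
table4 = {
    "Age": (1.0425, 1.0025, 1.084),
    "Predict value": (32.9883, 8.6023, 126.505),
}

Z_95 = 1.959964  # standard normal quantile for a two-sided 95 % interval

for name, (hr, lo, hi) in table4.items():
    beta = log(hr)                          # Cox coefficient (log hazard ratio)
    se = (log(hi) - log(lo)) / (2 * Z_95)   # standard error implied by the interval
    hr_from_ci = sqrt(lo * hi)              # geometric mean of the two bounds
    print(f"{name}: beta={beta:.3f} se={se:.3f} HR from CI={hr_from_ci:.4f}")
```

Both rows are internally consistent under that assumption. Note that the two hazard ratios are per unit of their own scale (one year of age versus one unit of the predicted value), so the comparison of 1.04 with 33 depends on those units.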

Discussion

The spread of COVID-19 has placed a huge burden on medical resources because of its high severe/critical rate. In this study, clinical and laboratory features were analyzed, and 36 of them were found to be significantly associated with the clinical outcome (severe/critical symptoms) of patients infected with COVID-19. Interestingly, although some patients (10 of 26) already showed severe/critical symptoms while the others (16 of 26) were still mild when they underwent clinical and laboratory examination, the features of all these cases were similar. The features include dysfunction of immune cells and immune products, including CD3, CD4, CD19, CRP, high-sensitive CRP, leukomonocytes and neutrophils. Consistent with this, a previous study reported that severe cases have significantly higher leukocyte counts and CRP [6]. Taking these clues together, we suspect that the acute immune response starts several days before severe/critical symptoms begin. The lack of a prediction model makes early detection difficult. Models for COVID-19 diagnosis and prognosis have been developed, with at least 27 studies describing 31 prediction models [11]. Among these models, 10 addressed survival risk, while only two aimed to predict progression to a severe or critical state. A recent study found that one demographic and six serological indicators (age, serum lactate dehydrogenase, C-reactive protein, the coefficient of variation of red blood cell distribution width (RDW), blood urea nitrogen, albumin and direct bilirubin) were associated with severe symptoms, which is consistent with our study [12]. That model has a sensitivity of 77.5 % and a specificity of 78.4 % in the validation cohort. Because the laboratory indicators in that study were limited, its sensitivity and specificity are not satisfactory.
Another study collected data from 133 patients with mild symptoms in Wuhan and used AI with multivariate logistic regression to predict which patients would develop severe symptoms; the best AUC achieved was 0.954. However, the sample number is the major concern [13]. Compared with these models, our model drew on more than 200 clinical indicators, achieved better performance, and was further validated. We also noticed that triiodothyronine (T3), free triiodothyronine, thyroxine (T4) and free thyroxine were significantly lower in severe/critical patients. The AUROC of triiodothyronine reached 0.96. Although a correlation between thyroxine and severe/critical symptoms has not been reported in COVID-19 or MERS, a relationship between critical illness and thyroxine has been reported and could be used for prognosis. In addition, some SARS-infected patients had decreased T3 and T4 [14], which may be caused by necrosis of the thyroid [15].
The model can be used as follows: develop an SVM model using the existing data, consisting of the clinical outcome (severe/critical symptoms) and the features (age, GSH, total protein and CD3 percentage); input the corresponding data for each individual; and the likelihood that the patient will develop S/C symptoms is generated. If the value is higher than the cut-off (0.0667), the patient is predicted to progress to S/C, and vice versa.
The limitation of this study is the relatively small sample size (N = 336). Owing to the relatively advanced treatment available in the Shanghai region, the critical/severe symptom rate is lower, which results in a limited number of severe/critical cases. In addition, among the patients with severe/critical symptoms, some already showed those symptoms when the samples were collected. In future work, we will collect and analyze more samples from other regions to further validate our model.
In summary, we analyzed more than 200 clinical and laboratory features and proposed an SVM-based model to predict the likelihood that patients progress to severe/critical symptoms. The model was developed in the training dataset and validated in the testing dataset; the AUROC was 0.9996 and 0.9757, respectively, suggesting the robustness of the model.

Author contributions

SL, LG, SX designed the project; SF, SL, SN, LF and LG collected the data, LG, SL and LP analyzed the data; LG, SL and SX interpreted the results.

Declaration of Competing Interest

The authors declare no (potential) conflict of interest