Development and validation of a risk score using complete blood count to predict in-hospital mortality in COVID-19 patients.

Hui Liu^1,2, Jing Chen^3,4, Qin Yang^5, Fang Lei^1,6,3, Changjiang Zhang^6,3,7, Juan-Juan Qin^6,3, Ze Chen^1,6,3, Lihua Zhu^6,3, Xiaohui Song^3, Liangjie Bai^3, Xuewei Huang^6,3, Weifang Liu^3, Feng Zhou^3,5, Ming-Ming Chen^6,3, Yan-Ci Zhao^6,3, Xiao-Jing Zhang^1,6,3, Zhi-Gang She^6,3, Qingbo Xu^8, Xinliang Ma^9, Peng Zhang^1,3,5, Yan-Xiao Ji^3,5, Xin Zhang^3, Juan Yang^3, Jing Xie^6, Ping Ye^10, Elena Azzolini^11,12, Alessio Aghemo^11,12, Michele Ciccarelli^11,12, Gianluigi Condorelli^11,12, Giulio G Stefanini^11,12, Jiahong Xia^13, Bing-Hong Zhang^14, Yufeng Yuan^15, Xiang Wei^13, Yibin Wang^16, Jingjing Cai^1,17, Hongliang Li^1,6,3,5.

Abstract

BACKGROUND: To develop a sensitive risk score predicting the risk of mortality in patients with coronavirus disease 2019 (COVID-19) using complete blood count (CBC).
METHODS: We performed a retrospective cohort study from a total of 13,138 inpatients with COVID-19 in Hubei, China, and Milan, Italy. Among them, 9,810 patients with ≥2 CBC records from Hubei were assigned to the training cohort. CBC parameters were analyzed as potential predictors for all-cause mortality and were selected by the generalized linear mixed model (GLMM).
FINDINGS: Five risk factors were derived to construct a composite score (PAWNN score) using the Cox regression model, including platelet counts, age, white blood cell counts, neutrophil counts, and neutrophil:lymphocyte ratio. The PAWNN score showed good accuracy for predicting mortality in 10-fold cross-validation (AUROCs 0.92-0.93) and subsets with different quartile intervals of follow-up and preexisting diseases. The performance of the score was further validated in 2,949 patients with only 1 CBC record from the Hubei cohort (AUROC 0.97) and 227 patients from the Italian cohort (AUROC 0.80). The latent Markov model (LMM) demonstrated that the PAWNN score has good prediction power for transition probabilities between different latent conditions.
CONCLUSIONS: The PAWNN score is a simple and accurate risk assessment tool that can predict the mortality for COVID-19 patients during their entire hospitalization. This tool can assist clinicians in prioritizing medical treatment of COVID-19 patients.
FUNDING: This work was supported by National Key R&D Program of China (2016YFF0101504, 2016YFF0101505, 2020YFC2004702, 2020YFC0845500), the Key R&D Program of Guangdong Province (2020B1111330003), and the medical flight plan of Wuhan University (TFJH2018006).

Keywords: COVID-19; complete blood count; latent markov model; mortality; prediction model; risk score

Year: 2021 PMID: 33521746 PMCID: PMC7831644 DOI: 10.1016/j.medj.2020.12.013

Source DB: PubMed Journal: Med (N Y) ISSN: 2666-6340

Introduction

The outbreak of coronavirus disease 2019 (COVID-19) continues to escalate, with particular intensity in developing countries, where the surging demand for pandemic prevention and control has put a major strain on underresourced national health systems. Considering the heterogeneity in pathogenic manifestation among COVID-19 patients, there is an urgent need to develop an accurate and robust risk assessment tool to evaluate the disease prognosis that is also easy and economical to implement. Such a tool would help frontline clinicians to optimize medical interventions with very limited medical resources. Several prognostic models for COVID-19 have been reported in the past few months. However, most of them are considered to have a high risk of bias due to deficiencies in the methodologies used. Furthermore, some predictors in these models rely on tests that are time-consuming and costly, thus reducing their applicability in regions with limited medical resources. Extensive and dynamic changes in peripheral blood cells, such as lymphopenia and neutrophilia, have been observed in patients with COVID-19, and are considered to be closely associated with the severity of the disease [4, 5, 6]. Furthermore, complete blood count (CBC) is one of the most commonly available tests in the clinic, with minimal time and cost involved. Here, we have collected and analyzed longitudinal data about CBC from a large sample of COVID-19 cases and found that a composite score based on a few selected CBC parameters could dynamically and with high accuracy predict the risk of imminent death during hospitalization. We also revealed the longitudinal trajectories of CBC parameters and the composite score for the disease severity of COVID-19 across the duration of hospitalization. The findings of our study would be particularly helpful for optimizing clinical decision making and potentially reducing the mortality rate in countries suffering from a significant shortage of medical resources.

Results

Clinical characteristics of patients from Hubei province

As shown in Figure 1, a final total of 12,759 patients (6,593 non-severe survivors, 4,246 severe survivors, 984 deaths, and 936 censored) were included in this analysis from the Hubei cohort. The median follow-up was 17 days (interquartile range [IQR], 11–26). The baseline clinical characteristics, preexisting chronic diseases, and laboratory examinations at admission are described in Table 1. The median age of the participants was 59 (IQR, 46–68) years, and 48.3% were males. The median period from symptom onset to hospitalization was 11 (IQR, 7–20) days. The median peripheral oxygen saturation (SpO2) was 97% (IQR, 95–98), and 73.3% of the patients had fever. Hypertension (36.4%), diabetes (17.2%), and coronary arterial disease (9.2%) were the most common coexisting chronic diseases in this cohort. On admission, 40.4% of the patients showed decreased lymphocyte counts. Increased neutrophil counts and decreased platelet counts occurred in 15.5% and 9.7% of the patients, respectively. Inflammation markers, including C-reactive protein (CRP) and procalcitonin, were elevated in 49.2% and 42.1% of the patients, respectively. Elevated alanine aminotransferase (ALT), blood urea nitrogen (BUN), and creatine kinase-myocardial band (CK-MB), indicating liver, kidney, and cardiac impairment, were reported in 22.3%, 9.5%, and 5.5% of the participants, respectively. The differences in baseline characteristics among patients from the non-severe survivor, severe survivor, and death groups are detailed in Table 1.

Figure 1

Flowchart for patient selection and distribution of the training and the validation cohorts

a, excluded due to leukemia; b, with at least 2 complete blood count (CBC) records; c, with CBC records at arbitrary time points; d, with only 1 CBC test; e, remained in hospitals at the end of follow-up date. LMM, latent Markov model.

Table 1

Baseline characteristics of the Hubei cohort

Variables	All (n = 12,759)	Non-severe survivor (n = 6,593)	Severe survivor (n = 4,246)	Death (n = 984)
Clinical characteristics of admission

Median age (IQR), y	59 (46–68)	56.0 (42.0–66.0)	60.0 (48.0–68.0)	70.0 (63.0–78.0)
Male sex, no./total no. (%)	6,157 (48.3)	3,016/6,593 (45.8)	2,031/4,246 (47.8)	648/984 (65.9)
Median heart rate (IQR), bpm	84 (78–96)	81.0 (77.0–90.0)	94.0 (80.0–107.0)	89.0 (78.0–101.0)
Median respiratory rate (IQR)	20 (19–21)	20.0 (19.0–20.0)	20.0 (20.0–22.0)	21.0 (20.0–25.0)
Median systolic blood pressure (IQR), mmHg	128 (120–140)	128.0 (120.0–140.0)	127.0 (117.0–140.0)	131.0 (120.0–146.0)
Median diastolic blood pressure (IQR), mmHg	79 (71–87)	80.0 (72.0–88.0)	78.0 (70.0–85.0)	77.0 (69.0–86.0)
Fever, no./total no. (%)	9,351 (73.3)	4,617/6,593 (70.0)	3,278/4,246 (77.2)	782/984 (79.5)
Median SpO₂ (IQR), %	97 (95–98)	98.0 (96.0–99.0)	97.0 (95.0–98.0)	90.0 (81.0–96.0)
Days from symptom to hospitalization (IQR), days	11 (7–20)	12.0 (6.0–22.0)	11.0 (7.0–19.0)	10.0 (6.0–14.0)
Median follow-up time (IQR), days	17 (11–26)	16.0 (10.0–24.0)	22.0 (14.0–32.0)	9.0 (5.0–17.0)

Comorbidities on admission

Diabetes mellitus (any type), n (%)	2,200 (17.2)	852 (12.9)	872 (20.5)	290 (29.5)
Chronic obstructive pulmonary disease, n (%)	150 (1.18)	51 (0.8)	50 (1.2)	36 (3.7)
Hypertension, n (%)	4,648 (36.4)	2,020 (30.6)	1,680 (39.6)	559 (56.8)
Coronary arterial disease, n (%)	1,168 (9.2)	454 (6.9)	397 (9.4)	202 (20.5)
Heart failure, n (%)	99 (0.8)	11 (0.2)	31 (0.7)	46 (4.7)
Cerebrovascular disease, n (%)	413 (3.2)	148 (2.2)	125 (2.9)	81 (8.2)
Renal insufficiency, n (%)	500 (3.9)	168 (2.6)	132 (3.1)	150 (15.2)
Neoplastic disease, n (%)	385 (3.0)	180 (2.7)	110 (2.6)	61 (6.2)
Liver disease, n (%)	276 (2.2)	125 (1.9)	103 (2.4)	32 (3.3)

Laboratory examination on admission

WBC count >9.5 × 10⁹/L, no./total no. (%)	1,296/12,759 (10.2)	320/6,593 (4.9)	431/4,246 (10.2)	411/984 (41.8)
Neutrophil count >6.3 × 10⁹/L, no./total no. (%)	1,980/12,759 (15.5)	501/6,593 (7.6)	695/4,246 (16.4)	574/984 (58.3)
Lymphocyte count <1.1 × 10⁹/L, no./total no (%)	5,150/12,759 (40.4)	1,832/6,493 (27.8)	1,958/4,246 (46.1)	841/984 (85.5)
Platelet count <125 × 10⁹/L, no./total no. (%)	1,236/12,759 (9.7)	439/6,493 (6.7)	374/4,246 (8.8)	304/984 (30.9)
CRP > ULN, no./total no. (%)a	3,565/7,247 (49.2)	1,550/4,157 (37.3)	1,055/1,871 (56.4)	529/542 (97.6)
Procalcitonin > ULN, no./total no. (%)a	4,303/10,212 (42.1)	1,444/4,927 (29.3)	1,859/3,629 (51.2)	696/854 (81.5)
Alanine transaminase >40 U/L, no./total no. (%)	2,717/12,213 (22.3)	1,260/6,245 (20.2)	1,005/4,159 (24.2)	261/949 (27.5)
BUN > ULN, no./total no. (%)a	1,178/12,361 (9.5)	302/6,308 (4.8)	296/4,188 (7.1)	449/959 (46.8)
CK-MB > ULN, no./total no. (%)a	462/8,339 (5.5)	138/3,870 (3.6)	109/2,960 (3.7)	177/748 (23.7)
Total cholesterol >5.17 mmol/L, no./total no. (%)	1,302/10,301 (12.6)	762/4,936 (15.4)	409/3,745 (10.9)	49/820 (6.0)
D-dimer > ULN, no./total no. (%)a	5,497/11,384 (48.3)	2,038/5,618 (36.3)	2,175/3,985 (54.6)	819/930 (88.1)
LDL-C >3.37 mmol/L, no./total no. (%)	1,302/9,220 (14.1)	731/4,604 (15.9)	421/3,103 (13.6)	57/720 (7.9)

bpm, beats per minute; BUN, blood urea nitrogen; CK-MB, creatine kinase-myocardial band; CRP, C-reactive protein; IQR, interquartile range; LDL-C, low-density lipoprotein cholesterol; SpO2, peripheral oxygen saturation; ULN, upper limit of normal; WBC, white blood cell.

a, ULN was defined according to criteria in each hospital.

Dynamic trajectories of CBC parameters

Figure 2 illustrates the linear fitting curve for dynamic trajectories of 13 CBC parameters from admission to day 30 of hospitalization, grouped by disease severity. The levels of white blood cell (WBC) counts, neutrophil counts, neutrophil percentage, and neutrophil:lymphocyte ratio (NLR) increased along with the severity of the disease. These severity-correlated elevations were discernible early on at admission and persisted throughout the 30-day period of hospitalization. For all four parameters, an upward trend was observed during the first 2 weeks, followed by a decline in the later period. In contrast, the levels of lymphocyte counts, lymphocyte percentage, monocyte counts, eosinophil counts, basophil counts, and platelet counts were decreased along with the severity of the disease. The differences among the non-severe survivor, the severe survivor, and the death groups were also evident early on at admission and persisted during hospitalization. Finally, for red blood cell (RBC) counts, hematocrit, and hemoglobin concentrations, the differences were not significant on admission. However, during hospitalization, the values of these three parameters decreased steadily, and the slope and the magnitude of the decline correlated with the disease severity.

Figure 2

Dynamic trajectories of 13 CBC parameters in patients with COVID-19

Smooth trajectories of the values of CBC parameters by the severity of the disease with 95% confidence intervals were plotted based on locally weighted regression and smoothing scatterplots. The horizontal dotted lines represent the empirical upper limit of normal (ULN) or lower limit of normal (LLN) of these CBC parameters. M, males; F, females.

Predictor selection and score development

All of the 9,810 patients (8,311 discharged, 773 deaths, and 726 censored) in the training cohort were included for variable selection and risk score development. The variable selection process is detailed in Figure 3 and Tables S1 and S2. A total of 38 variables (25 categorized variables and 13 continuous parameters after being scaled and centered) were used for subsequent variable selection. The generalized linear mixed models (GLMMs) with each variable as a fixed effect were ranked by the Akaike information criterion (AIC) and are reported in Table S1. Comparisons of hierarchical models with different random effects (models 1, 2, and 3) are shown in Table S2, and model 1, with patient-specific random slope and random intercept, was chosen due to its lowest AIC value. To further select the fixed-effect predictors with the most impact, we used multivariate GLMMs with a forward stepwise approach based on the AIC rankings in Table S1. As shown in Table S2, model 4 represents an intermediate step of the forward selection process, and model 5 was the final model selected, with the fixed effects NLR (3 categories), platelet counts decrease, WBC counts increase, and neutrophil counts increase, while maintaining the hierarchical structure of patient-specific random intercept and random slope. Model 6 was based on model 5, with an additional controlled age category. Age was categorized based on COVID-19 fatality ratios: category 1 (the reference category), ages 0–49, for which the fatality ratio was <1%; category 2, ages 50–59; category 3, ages 60–69; and category 4, ages ≥70. All of the parameters, including all age categories, remained significant in model 6.
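The forward stepwise idea described above can be sketched generically. This is a minimal illustration, not the authors' GLMM procedure: `forward_select_by_aic`, the `fit_aic` callback, and the toy AIC function are all hypothetical names introduced here, and the sketch stops on AIC alone rather than using entry and stay significance levels.

```python
from typing import Callable, Iterable, List, Tuple


def forward_select_by_aic(
    candidates: Iterable[str],
    fit_aic: Callable[[Tuple[str, ...]], float],
) -> List[str]:
    """Greedy forward selection: repeatedly add the variable that lowers AIC most.

    `fit_aic` is a hypothetical callback that fits a model with the given
    fixed effects and returns its AIC (lower is better).
    """
    selected: List[str] = []
    remaining = list(candidates)
    best_aic = fit_aic(tuple(selected))  # AIC of the model with no fixed effects
    while remaining:
        # Score every one-variable extension of the current model.
        trials = [(fit_aic(tuple(selected + [v])), v) for v in remaining]
        trial_aic, trial_var = min(trials)
        if trial_aic >= best_aic:
            break  # no candidate improves the AIC, so stop
        selected.append(trial_var)
        remaining.remove(trial_var)
        best_aic = trial_aic
    return selected


# Toy AIC function for illustration only (not fitted to any data):
# each extra variable costs 2 AIC units, and "useful" variables earn a gain.
TOY_GAIN = {"nlr": 40.0, "platelets": 15.0, "wbc": 5.0}


def toy_aic(variables: Tuple[str, ...]) -> float:
    return 100.0 + sum(2.0 - TOY_GAIN.get(v, 0.0) for v in variables)


chosen = forward_select_by_aic(["wbc", "hematocrit", "nlr", "platelets"], toy_aic)
print(chosen)
```

In a real analysis, `fit_aic` would wrap a mixed-model fit with the chosen random-effects structure; the toy function only makes the control flow visible.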

Figure 3

Flowchart for variable selection

A total of 38 CBC factors—13 numeric and 25 categorized variables—were included in the selection process. Generalized linear mixed models (GLMMs) with each variable as a fixed effect were built. The multivariate GLMM with stepwise forward selection following the Akaike information criterion (AIC) ranking established from the univariate fixed-effects models was applied. The significance levels for entry and stay were set to 0.05, and the multivariate GLMM was further controlled for age. A multivariate model with 5 variables was selected as the optimal model and was used to develop the risk assessment score.

We assigned each of the factors derived from model 6 a numeric score by rounding up the number of its specific coefficient in the Cox proportional hazards regression model. Thus, a risk-assessment scoring model based on PAWNN parameters (platelet, age, WBC counts, neutrophil counts, and NLR) was established, with the possible scores ranging from 0 to 12 points (Table 2). A PAWNN score of 6 was the cutoff value for discriminating the risk of death in the Cox model and in the Kaplan-Meier curve.

Table 2

Factors generated from the GLMM and point distribution according to the coefficients in the Cox model

Covariates	Estimates	SE	z	p	Points
NLR > 4.06	3.50	0.11	14.46	<0.001	6
NLR 2.22–4.06	1.03	0.12	4.35	<0.001	2
Platelet counts decrease	0.75	0.03	10.42	<0.001	2
Neutrophil counts increase	0.46	0.04	7.59	<0.001	1
WBC counts increase	0.15	0.04	2.86	0.004	1

Age, y

50–59	0.51	0.06	2.76	0.006	1
60–69	0.47	0.06	2.85	0.004	1
≥70	0.75	0.06	4.68	<0.001	2

Platelet count decrease indicates platelet count <100 × 10⁹/L, neutrophil count increase indicates neutrophil count >6.3 × 10⁹/L, and WBC count increase indicates WBC count >9.5 × 10⁹/L. GLMM, generalized linear mixed model; NLR, neutrophil:lymphocyte ratio; SE, standard error; WBC, white blood cell.

Cox model: $h(t, X) = h_0(t)\exp(3.5\,[\mathrm{NLR} > 4.06] + 1.03\,[\mathrm{NLR}\ 2.22\text{–}4.06] + 0.75\,[\text{platelet counts decrease}] + 0.46\,[\text{neutrophil counts increase}] + 0.15\,[\text{WBC counts increase}] + 0.51\,[\text{age } 50\text{–}59] + 0.47\,[\text{age } 60\text{–}69] + 0.75\,[\text{age} \geq 70])$.
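The point assignments in Table 2 and the Cox model above can be turned into a small calculator. The following is a minimal sketch, not a validated clinical tool. The function names are hypothetical, cell counts are assumed to be in 10⁹/L, the 2.22 and 4.06 boundaries are assumed to belong to the middle NLR category, and a non-positive lymphocyte count is rejected rather than interpreted.

```python
import math


def _nlr(neutrophils, lymphocytes):
    """Neutrophil:lymphocyte ratio; rejects non-positive lymphocyte counts (assumption)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive to compute NLR")
    return neutrophils / lymphocytes


def pawnn_score(age, platelets, wbc, neutrophils, lymphocytes):
    """Return PAWNN points (0-12) using the Table 2 point column."""
    nlr = _nlr(neutrophils, lymphocytes)
    points = 0
    if nlr > 4.06:
        points += 6
    elif nlr >= 2.22:  # inclusive lower boundary is an assumption
        points += 2
    if platelets < 100:  # platelet counts decrease
        points += 2
    if neutrophils > 6.3:  # neutrophil counts increase
        points += 1
    if wbc > 9.5:  # WBC counts increase
        points += 1
    if age >= 70:
        points += 2
    elif age >= 50:  # ages 50-59 and 60-69 both score 1 point
        points += 1
    return points


def relative_hazard(age, platelets, wbc, neutrophils, lymphocytes):
    """exp(linear predictor) of the Cox model, relative to the reference profile."""
    nlr = _nlr(neutrophils, lymphocytes)
    lp = 0.0
    if nlr > 4.06:
        lp += 3.5
    elif nlr >= 2.22:
        lp += 1.03
    if platelets < 100:
        lp += 0.75
    if neutrophils > 6.3:
        lp += 0.46
    if wbc > 9.5:
        lp += 0.15
    if age >= 70:
        lp += 0.75
    elif age >= 60:
        lp += 0.47
    elif age >= 50:
        lp += 0.51
    return math.exp(lp)


# Hypothetical patient: 72 y, platelets 90, WBC 12.0, neutrophils 10.0, lymphocytes 0.8
print(pawnn_score(72, 90, 12.0, 10.0, 0.8))
print(relative_hazard(72, 90, 12.0, 10.0, 0.8))
```

Keeping the integer score and the continuous relative hazard as separate functions makes it explicit that the bedside score is a coarsened version of the fitted model.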

Performance of PAWNN score in the training and validation cohorts

We performed internal validation for the accuracy and specificity of the PAWNN score by 10-fold cross-validation among patients with outcomes. The area under the receiver operating characteristic (AUROC) curves based on 10 randomly divided subsets of the training cohort ranged from 0.92 (95% confidence interval [CI] 0.91–0.93) to 0.93 (95% CI 0.92–0.94) (Figure S1A). The trend plot of the PAWNN score showed a clear distinction of levels among the non-severe survivor, severe survivor, and death groups during the course of hospitalization (Figure S1B). By dividing the follow-up period into quartiles, we found that the PAWNN score remained highly accurate for predicting mortality at different time intervals. The lowest AUROC, 0.89 (95% CI 0.88–0.90), was observed for 0–1 day after hospital admission, with a cutoff value of 6 points, and the highest AUROC, 0.94 (95% CI 0.93–0.94/0.95), was reached beyond 8 days after admission, with a cutoff value of 6 points (Figure S1C; Table S3). In the validation dataset of 2,949 patients (2,528 discharged, 211 died, and 210 censored) from Hubei Province with only a single CBC test during hospitalization, the PAWNN score had an AUROC of 0.97 (95% CI 0.96–0.98), a sensitivity of 93.84% (95% CI 90.51–98.10), and a specificity of 90.90% (95% CI 85.13–92.84) (Table 3). The performance of the PAWNN score was further tested in a cohort of COVID-19 patients from Milan, Italy, where CBC data were collected at admission. Baseline characteristics of this cohort are described in Table S4. The predictive performance in the Italian cohort remained high, with an AUROC of 0.80 (95% CI 0.74–0.86), a sensitivity of 68.83% (95% CI, 58.44–94.81), and a specificity of 80.67% (95% CI, 49.33–87.33) (Table 3).
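For readers less familiar with the AUROC, it equals the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name `auroc` and the score lists are hypothetical and are not taken from the study data.

```python
def auroc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one case in each outcome group")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical PAWNN scores, for illustration only
died = [9, 7, 6, 12]
survived = [0, 2, 3, 6, 1, 4]
value = auroc(died, survived)
print(round(value, 3))
```

The double loop is quadratic in the number of patients, which is fine for a demonstration; rank-based formulas give the same number more efficiently on large cohorts.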

Table 3

Performance of PAWNN score in Wuhan and Italian validation datasets

	Hubei participants with 1 CBC test	Italian participants
No. patients	2,739	227
AUROC (95% CI)	0.97 (0.96–0.98)	0.80 (0.74–0.86)
Total accuracy, % (95% CI)	91.13 (86.05–92.85)	76.65 (65.20–81.94)
Sensitivity, % (95% CI)	93.84 (90.51–98.10)	68.83 (58.44–94.81)
Specificity, % (95% CI)	90.90 (85.13–92.84)	80.67 (49.33–87.33)
PPV, % (95% CI)	46.30 (35.27–52.01)	64.77 (49.32–73.97)
NPV, % (95% CI)	99.44 (99.13–99.81)	83.67 (79.00–93.94)

AUROC, area under the receiver operating characteristic curve; CBC, complete blood count; CI, confidence interval; NPV, negative predictive value; PPV, positive predictive value.
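The quantities in Table 3 all derive from a 2 × 2 table of predicted versus observed outcomes at the chosen cutoff. The sketch below shows the standard definitions; the function name and the counts are hypothetical and do not reproduce the study's confusion matrix.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2x2 diagnostic metrics from true/false positive/negative counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # deaths correctly flagged
        "specificity": tn / (tn + fp),  # survivors correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who died
        "npv": tn / (tn + fn),          # cleared patients who survived
    }


# Hypothetical counts for 1,000 patients with 100 deaths (illustration only)
m = diagnostic_metrics(tp=90, fp=50, tn=850, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Because PPV depends on how common death is in the cohort, a score can combine high sensitivity and specificity with a modest PPV when mortality is low, which is the pattern seen in the Hubei column of Table 3.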

Latent Markov Model (LMM)

The LMM consists of a structural model for the latent disease status and a measurement model for the observed indicators, which are the parameters selected in GLMMs. By fitting the proposed model with the LMest package in R, we found that the model with three statuses resulted in the lowest AIC (Table S5). In the following, we report the results obtained with this number of statuses. The parameters’ effects on the logit of initial probabilities are presented in Table S6. Table S7 reports the model initial probability configurations, which indicate that the two outcome groups performed similarly at the early phase of the test. These probabilities allowed us to characterize the latent status. The first latent status (LS 1; low-risk group) and the second status (LS 2; medium-risk group) showed the highest probability of corresponding to survival, whereas the third status (LS 3; high-risk group) corresponded mostly to death. These probability distributions represent response strategies as a function of the latent component or LS of the disease. In particular, in our context, LS 1 may be understood as a safe status, whereas LS 2 characterizes a type of endangered status. Finally, LS 3 indicates a near-death disease status. The structure of the three-status model with the prevalence of each status is shown in Figure 4. The means and standard deviations (SDs) of the PAWNN score were calculated by status to evaluate the ability of the PAWNN score to identify LS at different time points. The transition probabilities represent, at a given time point, the probability of moving from the current LS to a different LS or of remaining in the same LS.

Figure 4

Status prevalence and transition probabilities between the statuses at subsequent time points

The number of survivors and non-survivors with their means and standard deviations of the PAWNN score of each status at each time point are recorded in the boxes. The transition probabilities represent the probability that a member of a given status at a specified time point will transition to another given status at the next time point. Transition probabilities are represented by arrows. Time 1, time 2, and time 3 were selected based on days 0–7 after admission (time 1), value of CBC records during days 8–14 (time 2) of hospitalization, and CBC during hospitalization ≥15 days (time 3).

The transition trajectories showed a clear pattern: all patients who died either progressed from the medium-risk to the high-risk status or moved directly from the low-risk status to the high-risk status, whereas fewer patients at medium risk transitioned back to the low-risk status. The transition probabilities from the low-risk status at time 1 to the medium-risk and high-risk statuses at time 2 were 18% and 1%, respectively. The transition probability from the medium-risk status at time 1 to the low-risk status at time 2 was 27%. Of the patients at time 2, 17% of those at low risk and 1% of those at medium risk transitioned to the high-risk status at time 3. In general, the PAWNN score was significantly higher for the high-risk group (LS 3) than for survivors at all time points. The PAWNN score was a good predictor of death at all 3 time points, with AUROCs of 0.77 at time 1, 0.91 at time 2, and 0.97 at time 3 (Tables S8 and S9). The PAWNN score also demonstrated good discrimination power for time 2 LS, with an AUROC of 0.86, and for time 3 LS, with an AUROC of 0.81. For predicting transitions, the time 1 PAWNN score predicted time 2 LS with an AUROC of 0.86, and the time 2 PAWNN score predicted time 3 LS with an AUROC of 0.81 (Table S9).
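To make the notion of a transition probability concrete, the sketch below estimates row-normalised transition proportions from pairs of statuses at consecutive time points. It is a simplification under stated assumptions: statuses are treated as observed rather than latent, which is not how an LMM estimates them, and the function name and all counts are hypothetical. The low-risk row is chosen to match the reported 18% and 1%, and the medium-risk row matches only the reported 27%; the remaining split is invented for illustration.

```python
from collections import Counter

STATUSES = ("low", "medium", "high")


def transition_matrix(pairs, statuses=STATUSES):
    """Row-normalised transition proportions from (status_t, status_t_plus_1) pairs."""
    counts = Counter(pairs)
    matrix = {}
    for src in statuses:
        row_total = sum(counts[(src, dst)] for dst in statuses)
        matrix[src] = {
            dst: (counts[(src, dst)] / row_total if row_total else 0.0)
            for dst in statuses
        }
    return matrix


# Hypothetical status pairs for time 1 -> time 2 (illustration only)
pairs = (
    [("low", "low")] * 81
    + [("low", "medium")] * 18
    + [("low", "high")] * 1
    + [("medium", "low")] * 27
    + [("medium", "medium")] * 60   # invented split
    + [("medium", "high")] * 13     # invented split
)
tm = transition_matrix(pairs)
print(tm["low"])
print(tm["medium"])
```

Each row sums to 1 whenever at least one patient starts in that status, so a row describes where patients in one status end up at the next time point.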

Discussion

In this study, we have developed and validated a composite score (PAWNN score) using age and 4 CBC parameters (platelet, WBC counts, neutrophil counts, and NLR) to predict mortality in patients with COVID-19 during hospitalization. The performance of this risk score was satisfactory in accuracy based on AUROCs in both the training and the validation cohorts from China and an external validation cohort from Italy. Therefore, the PAWNN score may serve as an accurate and reliable tool to quickly quantify the risk of imminent death across different clinical cohorts. Since the CBC is among the most commonly available and low-cost tests, the PAWNN score can be readily determined and implemented as a very simple and economic tool to prioritize patients for physicians with diverse backgrounds and specialties quickly. In the subgroup analysis, we showed that the AUROC was lower in the patients admitted before February 12 compared to those admitted after February 12 (Table S10). It is suggested that the limited medical resources and lack of therapeutic experience may disrupt proper treatment for patients with COVID-19 before February 12 (the first peak day of daily new cases in China). While the performance of the PAWNN score was stable in patients with different age, gender, and preexisting comorbidities (Tables S10 and S11), a PAWNN score of 6 demonstrated good performance in discriminating the risk of death in all of the groups (Figure S2). If a patient’s predicted risk for death is low (e.g., PAWNN score ≤6), the physician may choose to monitor at peripheral or district hospitals, while a high-risk estimate may support more aggressive intervention or early transfer to tertiary centers and admission to the intensive care unit. Our PAWNN score is particularly helpful for physicians to allocate limited resources to the most needy in areas with less advanced healthcare systems during a surge of COVID-19. 
In Table S3, we categorize patients with CBC tests into quartiles according to the number of hospital days from admission. The death rate was higher in the first two quartiles but lower in the last two quartiles. A possible explanation for this phenomenon may be associated with the stringent rules for patient discharge in China. After COVID-19-associated symptoms are significantly relieved, hospital discharge is authorized only after 2 consecutive negative viral PCR tests. Thus, patients who survive have a long hospital stay. In the first quartile, there was a higher proportion of sick patients and higher mortality. The PAWNN score can dynamically predict mortality during the entire course of hospitalization. At different intervals during the hospital stay, the PAWNN score shows high predictive accuracy for death based on ROC curves. In the clinical setting, evaluation at admission decides medical resource allocation. The PAWNN score demonstrates decent but slightly lower prediction accuracy in the earliest phase of hospitalization, which indicates a larger variability in the earlier phases of the disease; the closer to the outcome, the more accurate the PAWNN score in its prediction. Furthermore, using a data-driven approach (i.e., LMM), we defined longitudinal statuses of COVID-19 patients based on selected CBC parameters. The PAWNN score was a distinguishing feature of the defined statuses at each time point. LMM also allowed us to examine the transition probabilities between statuses over time simultaneously. One of the most interesting findings of this analysis is the existence of transitioning from the low-risk status at time 1 to the medium-risk status at time 2 and then to the high-risk status at time 3. These transitioning trajectories are correlated with a marked increase in mortality risk, and the PAWNN score showed good predictive accuracy for class transitioning at the subsequent time point. 
Thus, early prediction of transition probabilities based on the current PAWNN score can help to modify interventions in advance. This further supports the significance of the PAWNN score for dynamic monitoring of prognosis in COVID-19 patients during the entire course of hospitalization. Our study systematically analyzed the dynamic trajectories of 13 CBC parameters in patients with different severities of COVID-19. In general, the temporal patterns of changes in all CBC parameters are distinctly different between non-survivors and survivors. Moreover, the dynamic changes in all parameters are closely associated with the course of disease progression. These data can help us assess the status of the disease in COVID-19 patients during hospitalization. The identified predictors in PAWNN score from our study have been implicated in several previous studies as potential risk factors for mortality or severe illness related to COVID-19. Liu et al. have proposed that NLR on admission could serve as an independent predictor of in-hospital mortality for COVID-19 patients. Both lymphocytopenia and neutrophilia were often observed in earlier reports of COVID-19 symptoms, which showed more prominent manifestations among non-survivors versus survivors. , An overt lymphocytopenia suggests that lymphocyte deficiency or incapacity is a critical cellular pathology of COVID-19, and less robust immune responses following severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2) infection may contribute to disease progression. Moreover, previous studies demonstrate that ∼7–14 days from symptom onset, there is a surge in the clinical manifestations of COVID-19-related complications along with a pronounced systemic increase of inflammatory mediators and cytokines, the so-called cytokine storm. Neutrophils are the main source of chemokines and cytokines. 
In our study, neutrophil counts in the high-risk group were above the normal range throughout hospitalization, which reflects a sustained status of inflammatory overactivation. This may induce the cytokine storm and subsequently contribute to the development of acute respiratory distress syndrome (ARDS) and death. In our study, the levels of platelet counts in non-survivors were substantially lower than those in survivors, which agreed with the findings of Zhou et al. and a recent meta-analysis. The mechanisms of thrombocytopenia in coronavirus infections may be multifactorial, but a surge in platelet consumption in response to endothelial damage caused by coronavirus infection and mechanical ventilation is plausible. Currently, the severity and risk of death in patients with COVID-19 are often graded with the use of only a single parameter, such as lymphopenia. Tan et al. analyzed the time courses of lymphocyte counts in a small sample of COVID-19 patients and proposed a model that found that patients with <20% lymphocytes at days 10–12 from illness onset and <5% at days 17–19 have the worst prognosis. Nevertheless, this model was based on a small sample size and lacked the minimal level of test and validation of its robustness. Furthermore, a single parameter may not adequately reflect disease progression at the function level. Thus, a sensitive and quantitative composite score may be more valuable for risk stratification. Recently, Liang et al. developed a clinical risk score (COVID-GRAM) to predict the occurrence of critical illness based on 1,590 hospitalized patients with COVID-19. However, some of the parameters in this score, such as chest radiography, lactate dehydrogenase, and direct bilirubin, require additional time and cost, which undermines its application as a simple, economic, yet broadly applicable tool in regions with scarce medical resources. 
Furthermore, this score was based only on variables at hospital admission; therefore, it may not be suitable for dynamic monitoring during the course of hospitalization. In this study, we developed and validated a readily applicable risk assessment tool, the PAWNN score, to dynamically estimate the risk of imminent death for patients with COVID-19 during the course of hospitalization by using CBC parameters. As CBC is the most commonly available test, the PAWNN score may assist frontline clinicians from diverse backgrounds and specialties to optimize the use of limited medical resources in areas experiencing a surge in COVID-19.

Limitations of study

Although our study developed and validated a simple and dynamically applicable composite risk score based on a large retrospective in-hospital cohort of COVID-19, several limitations should be noted in interpreting the results. First, our study was based on retrospectively collected CBC parameters and focused on prediction. Hence, no causal conclusions can be drawn from our algorithm. Second, this simple CBC model could be a convenient tool for stratifying patients at risk in the clinic; however, it may have a limited ability to explain all of the variance across patient populations. Third, repeated CBC tests were carried out at different time intervals for each patient; thus, bias may occur due to an increased number of tests in patients with severe illness. Fourth, due to the limited longitudinal data for patients in the external validation sets, we were not able to validate the ability of the PAWNN score to indicate the probability of transition between classes during hospitalization. Fifth, the patient cohort was recruited from hospitalized patients from only one province in China, and the sample size of the Italian cohort was relatively small. Whether this model generalizes to outpatients and to patients with different genetic and geographic backgrounds requires further external validation.

STAR★Methods

Key resources table

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to the Lead Contact, Hongliang Li (lihl@whu.edu.cn).

Materials availability

The study did not generate any new reagents or materials.

Data and code availability

Data and code involved in this research are available from the corresponding author upon reasonable request. The research team will provide an email address for communication once the information sharing is approved. The proposal should include detailed aims, a statistical plan, and other information or materials needed to justify the request and guarantee the security of the data. The related patient data will be shared after review and approval of the submitted proposal and any related requested materials. Of note, data with patient names, national identification numbers, and other identifiers cannot be shared.

Experimental model and subject details

Participants and study procedure

We performed a retrospective cohort study of a total of 13,138 in-hospital patients with confirmed COVID-19 and at least one CBC test in Hubei Province, China, and Milan, Italy. Among them, 12,911 patients with COVID-19 who were admitted to 16 COVID-19 designated hospitals in Hubei Province, China, from January 1st, 2020 to April 15th, 2020, and had CBC recorded at least once during hospitalization were initially evaluated for inclusion. The Italian cohort initially included 227 patients with confirmed COVID-19 admitted between March 1st, 2020, and March 31st, 2020, to Humanitas Research Hospital in Milan, Italy. All patients were consecutively enrolled in the study in each designated hospital. COVID-19 was diagnosed by clinical manifestations, chest computed tomography (CT), or reverse transcription-polymerase chain reaction (RT-PCR) according to the New Coronavirus Pneumonia Prevention and Control Program (5th edition) published by the National Health Commission of China and WHO interim guidance. The classification of non-severe and severe cases followed the New Coronavirus Pneumonia Prevention and Control Program (5th edition). During hospitalization, patients with fever or suspected respiratory infection plus one of the following clinical manifestations were classified as severe cases: respiratory rate > 30 breaths/min, severe respiratory distress, or SpO2 < 93% on room air. Of the initial 12,911 patients in Hubei Province, 152 with leukemia were excluded. Follow-up ended on April 26th, 2020. Of the remaining 12,759 patients, 11,823 had either been discharged or died and were included in our analysis; the 936 who remained in hospital at the end of follow-up were treated as censored during the model building process. Among these 12,759 patients, 9,810 who had at least two CBC tests during hospitalization were designated as the training cohort to establish a dynamically applicable risk score based on the longitudinal data. 
A total of 3,174 patients who had at least three CBC tests in three different phases of hospitalization were included in the development of the LMM. The 2,949 patients in Hubei Province who had only one CBC record were used to externally validate the performance of the PAWNN score in discriminating the risk of death. The C-statistic of the PAWNN score was calculated in 2,739 patients with clear outcomes at the end of follow-up (Figure 1). To test the generalizability of the PAWNN score in a Western population, its performance in discriminating the risk of death and its C-statistic were also externally validated in 227 patients from Italy, of whom 77 had died and 150 had been discharged by the end of follow-up (Figure 1). The study design was approved by the central ethics boards and was accepted or approved by each collaborating hospital. Patient informed consent was waived by the ethics committees of each hospital. Part of the baseline data in this manuscript has been used in our previous articles (references 30, 31, 32, 33).
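
The cohort counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original analysis (which was run in R); the constant and function names are illustrative assumptions, while every number is taken from the text.

```python
# Cohort counts as reported in the text; all names are illustrative.
HUBEI_SCREENED = 12_911           # Hubei inpatients with at least one CBC
LEUKEMIA_EXCLUDED = 152           # excluded because of leukemia
CENSORED_IN_HOSPITAL = 936        # still hospitalized at end of follow-up
TRAINING_AT_LEAST_TWO_CBC = 9_810 # training cohort (two or more CBC tests)
SINGLE_CBC_VALIDATION = 2_949     # Hubei validation cohort (one CBC test)
ITALY_TOTAL = 227                 # Italian validation cohort
ITALY_DIED = 77
ITALY_DISCHARGED = 150


def cohort_flow():
    """Derive the intermediate counts implied by the reported numbers."""
    hubei_eligible = HUBEI_SCREENED - LEUKEMIA_EXCLUDED
    return {
        "all_inpatients": HUBEI_SCREENED + ITALY_TOTAL,
        "hubei_eligible": hubei_eligible,
        "hubei_with_outcome": hubei_eligible - CENSORED_IN_HOSPITAL,
        "hubei_by_cbc_count": TRAINING_AT_LEAST_TWO_CBC + SINGLE_CBC_VALIDATION,
        "italy_by_outcome": ITALY_DIED + ITALY_DISCHARGED,
    }


if __name__ == "__main__":
    for name, count in cohort_flow().items():
        print(f"{name}: {count:,}")
```

Under these numbers, the training cohort and the single-CBC validation cohort together account for every eligible Hubei patient, and the discharged or deceased patients plus the censored patients do the same.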

Method details

Data collection and definitions

Demographic information, clinical characteristics, medical history, laboratory tests, and outcome data were obtained with data collection forms from patients’ electronic medical records. CBC parameters including RBC counts, hemoglobin concentrations, hematocrit, WBC counts, neutrophil counts, neutrophil percentage, lymphocyte counts, lymphocyte percentage, basophil counts, eosinophil counts, monocyte counts, and platelet counts during hospitalization were extracted. NLR was calculated and used as a candidate predictor due to its significant changes reported in COVID-19 patients. There were no missing data on the 13 CBC parameters in all included patients. The outcome was defined as all-cause mortality during hospitalization. All included patients were stratified into non-severe survivor, severe survivor, and death groups. Personal identification information (e.g., name and ID) of each participant was anonymized before data extraction by giving a new study ID through a coding system. A team of experienced physicians carefully interpreted and double-checked all data to ensure accuracy. Mortality and collected medical information were evaluated by two independent groups of physicians.

Quantification and statistical analysis

Statistical analysis

All statistical analyses were conducted using R-3.6.3 (R Foundation for Statistical Computing, Vienna, Austria). Continuous variables were presented as the median and IQR or the mean and SD. Categorical variables were presented as frequency and percentage (%). Fitted curves for the dynamic trajectories of CBC parameters from admission to Day 30 of hospitalization, by severity of COVID-19, were obtained using locally weighted scatterplot smoothing (LOESS). A two-sided P value less than 0.05 was used to define statistical significance.

Predictor selection and model development

We followed the TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) statement for reporting multivariable prediction model development and validation (Methods S1). The variable selection process is depicted in Figure 3. All patients with COVID-19 in the training cohort were included for variable selection and risk score development. Before fixed-effect selection, models with patient-specific and site-specific random effects were compared with a model without random effects using the AIC and the marginal and conditional R² recommended by Nakagawa and Schielzeth and by Johnson. Associations between CBC records and the outcome were evaluated using hierarchical GLMMs with the selected random effects. To better understand the contribution of each CBC variable, all CBC variables except NLR were transformed into three forms: a scaled and centered numeric variable, and two categorical variables indicating values > upper limit of normal (ULN) or < lower limit of normal (LLN) according to the reference ranges set at each participating hospital. NLR was transformed into a scaled and centered numeric variable and a categorical variable with three categories (< 2.22, 2.22-4.06, > 4.06) based on a recent study. Thus, a total of 38 variables were used for subsequent variable selection. After that, GLMMs with each variable as a fixed effect were built. We then entered these variables, as well as their two-way interactions with time, into the multivariate GLMM with stepwise forward selection following the AIC ranking established from the univariate fixed-effect models. The significance levels for entry and stay in the selection process were set at 0.05. To ensure the quality of the developed model, the best candidates for the final regression model were identified manually by dropping covariates with a p value > 0.05 one at a time until all of the regression coefficients were significantly different from 0. 
Lastly, age was added to the final chosen model as a confounder. Predictive ability was summarized with the marginal and conditional R², which represent the proportion of variation explained by the fixed effects only and the proportion of variation explained by both fixed and random effects, respectively. The conditional R² can be viewed as the total amount of variability in the patient’s outcome that can be explained by the model when accounting for both fixed and random effects. Meanwhile, the marginal R² helps characterize the ability of the fixed effects to predict the outcome.
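
The candidate-predictor encoding described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' R code: the function names and variable labels are assumptions, the sample standard deviation is assumed for scaling, and the reference range passed to `range_flags` is a hypothetical placeholder because the actual ranges were set at each participating hospital.

```python
from statistics import mean, stdev

# The 12 CBC parameters other than NLR listed in the data collection section.
CBC_VARIABLES = [
    "rbc", "hemoglobin", "hematocrit", "wbc", "neutrophil_count",
    "neutrophil_pct", "lymphocyte_count", "lymphocyte_pct",
    "basophil_count", "eosinophil_count", "monocyte_count", "platelets",
]


def nlr(neutrophil_count, lymphocyte_count):
    """Neutrophil:lymphocyte ratio from two absolute counts."""
    return neutrophil_count / lymphocyte_count


def nlr_category(value):
    """Three NLR categories with the cut-offs given in the text."""
    if value < 2.22:
        return "<2.22"
    if value <= 4.06:
        return "2.22-4.06"
    return ">4.06"


def scale_and_center(values):
    """Center on the mean and divide by the sample SD (an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def range_flags(value, lln, uln):
    """Two indicator variables relative to a hospital reference range."""
    return {"above_uln": value > uln, "below_lln": value < lln}


def candidate_names():
    """Names of all candidate predictors entering variable selection."""
    names = []
    for var in CBC_VARIABLES:
        names += [f"{var}_scaled", f"{var}_above_uln", f"{var}_below_lln"]
    names += ["nlr_scaled", "nlr_category"]
    return names


if __name__ == "__main__":
    print(len(candidate_names()))            # 12 * 3 + 2 candidates
    print(nlr_category(nlr(6.0, 1.5)))       # ratio of 4.0
    # Hypothetical platelet reference range, for illustration only.
    print(range_flags(90, lln=125, uln=350))
```

Counting the names reproduces the total stated in the text: three forms for each of the 12 CBC variables plus two forms for NLR gives 38 candidates.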

Score development and validation

To further validate the predictive ability of the fixed effects and to facilitate clinical use, we assigned each of the final selected predictors a numeric score that was proportional to its coefficient in the Cox proportional hazards regression model. A composite score was thereby developed, and time-varying scores were calculated for each patient from their CBC records during hospitalization. The temporal patterns of change in patients’ scores among the non-severe survivor, severe survivor, and death groups were shown in trend plots. Covariate-specific time-dependent ROC analysis was conducted to assess the overall performance of the model for predicting the risk of death in the training cohort. The PAWNN score was derived from the Cox model and was further validated internally and externally. First, we conducted an internal validation of the score by 10-fold cross-validation in the training cohort to estimate its accuracy. Moreover, follow-up days were divided into quartiles to evaluate prediction accuracy by follow-up time, using the AUROC as well as the sensitivity and specificity of the maximum score in each quartile. Then, the performance of the score was further validated in 2,949 patients from Hubei Province with CBC tested only once. The score’s performance was also validated externally in an Italian cohort of 227 patients with only baseline CBC records.
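
The idea of assigning points proportional to Cox coefficients and recomputing the score at each CBC record can be sketched as follows. This is a hedged illustration in Python, not the published PAWNN score: the coefficient values, the binary form of each predictor, and all names are hypothetical placeholders, and rounding to the smallest coefficient is only one possible way to make points proportional to coefficients.

```python
# HYPOTHETICAL coefficients for illustration only; these are NOT the
# published PAWNN coefficients, and the binary predictor forms are assumed.
HYPOTHETICAL_COX_COEFFICIENTS = {
    "platelets_low": 0.9,
    "older_age": 0.4,
    "wbc_high": 0.7,
    "neutrophils_high": 1.1,
    "nlr_high": 1.3,
}


def points_from_coefficients(coefficients):
    """Integer points proportional to each coefficient.

    The smallest absolute coefficient is used as the unit (1 point).
    """
    unit = min(abs(beta) for beta in coefficients.values())
    return {name: round(beta / unit) for name, beta in coefficients.items()}


def record_score(flags, points):
    """Composite score for one CBC record given its risk-factor flags."""
    return sum(points[name] for name, present in flags.items() if present)


def score_trajectory(records, points):
    """Time-varying scores: one (day, score) pair per CBC record."""
    return [(day, record_score(flags, points)) for day, flags in records]


if __name__ == "__main__":
    points = points_from_coefficients(HYPOTHETICAL_COX_COEFFICIENTS)
    # Two hypothetical CBC records for a single patient.
    records = [
        (1, {"nlr_high": True, "platelets_low": False}),
        (5, {"nlr_high": True, "platelets_low": True, "neutrophils_high": True}),
    ]
    trajectory = score_trajectory(records, points)
    print(points)
    print(trajectory)
    print(max(score for _, score in trajectory))  # maximum score so far
```

Recomputing the score at every record is what makes it usable for dynamic monitoring, and the maximum over a follow-up window corresponds to the "max score" evaluated per quartile in the text.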

Latent Markov model

To unmask a “latent” (i.e., unobserved) construct of patient statuses and to further investigate how the score characterizes a patient’s condition and predicts transitions between statuses over time, a longitudinal analysis with the fixed effects selected by the multivariate GLMM was conducted using the LMM in patients with three or more CBC test records on three or more different dates during hospitalization (n = 3,174). This analysis allowed us to investigate whether the current composite score can predict transition probabilities and the risk of death early, and thus help to modify interventions in advance. Three time points were selected for patient assessment in the LMM: CBC records from day 0 to 7 after admission (Time 1), CBC records from day 8 to 14 of hospitalization (Time 2), and CBC records after more than 15 days of hospitalization (Time 3). Three assessment time points enabled us to obtain the same CBC profile for each patient’s status membership over time. To select the best model with a given number of latent statuses (LS), we balanced goodness of fit and parsimony as measured by the AIC, where a lower AIC indicates a better model. We used the LMM to simultaneously estimate status prevalence at each time point and transitions between the statuses over time. The mean and SD of the scores were also calculated for each status at each time point to characterize the statuses. Furthermore, the discrimination accuracy of the score for status membership and its prediction accuracy for status transitions were evaluated by ROC and multiple-class ROC analysis.
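
The assignment of CBC records to the three assessment windows and the AIC-based choice of the number of latent statuses can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' LMest-based R code: function names are illustrative, day 15 is assumed to fall into Time 3, eligibility is read as having at least one record in each window, and the log-likelihoods in the example are hypothetical.

```python
from collections import defaultdict


def time_window(day):
    """Map a hospitalization day to an assessment window (1, 2, or 3).

    Assumption: day 15 and later belong to Time 3; the text specifies
    days 0-7 for Time 1 and days 8-14 for Time 2.
    """
    if day < 0:
        raise ValueError("day must be non-negative")
    if day <= 7:
        return 1
    if day <= 14:
        return 2
    return 3


def group_by_window(days):
    """Group a patient's CBC test days by assessment window."""
    windows = defaultdict(list)
    for day in sorted(days):
        windows[time_window(day)].append(day)
    return dict(windows)


def eligible_for_lmm(days):
    """Assumed reading of the criterion: a record in all three windows."""
    return set(group_by_window(days)) == {1, 2, 3}


def aic(log_likelihood, n_parameters):
    """Akaike information criterion; lower values indicate a better model."""
    return 2 * n_parameters - 2 * log_likelihood


def best_number_of_statuses(candidates):
    """Pick the number of latent statuses with the lowest AIC.

    `candidates` maps a number of statuses to (log-likelihood, parameters).
    """
    return min(candidates, key=lambda k: aic(*candidates[k]))


if __name__ == "__main__":
    print(group_by_window([0, 3, 9, 16]))
    print(eligible_for_lmm([0, 3, 9, 16]), eligible_for_lmm([1, 2, 10]))
    # HYPOTHETICAL fits, for illustration only.
    fits = {2: (-1200.0, 15), 3: (-1150.0, 26), 4: (-1145.0, 39)}
    print(best_number_of_statuses(fits))
```

In the hypothetical example, the four-status model has the highest log-likelihood but its extra parameters are penalized, so the three-status model has the lowest AIC.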

REAGENT or RESOURCE	SOURCE	IDENTIFIER
Software and algorithms
R-3.6.3	R Foundation for Statistical Computing	https://www.r-project.org/
Adobe Illustrator CC 2019	Adobe	https://www.adobe.com/cn
ggplot2-3.3.2	Wickham¹⁶	https://cran.r-project.org/web/packages/ggplot2/index.html
pROC-1.16.2	Robin et al.¹⁷	https://cran.r-project.org/web/packages/pROC/index.html
lme4-1.1-26	Bates et al.¹⁸	https://cran.r-project.org/web/packages/lme4/index.html
caret-6.0-86	Max Kuhn	https://cran.r-project.org/web/packages/caret/index.html
data.table-1.13.4	Matt Dowle	https://cran.r-project.org/web/packages/data.table/
effects-4.2-0	Fox and Weisberg,¹⁹ Fox,²⁰ Fox and Hong²¹	https://cran.r-project.org/web/packages/effects/
LMest-3.0.1	Bartolucci et al.²²	https://cran.r-project.org/web/packages/LMest/index.html
cAIC4-0.9	Saefken et al.²³	https://cran.r-project.org/web/packages/cAIC4/index.html
LCAvarsel-1.1	Fop et al.,²⁴ Dean and Raftery²⁵	https://cran.r-project.org/web/packages/LCAvarsel/index.html
survival-3.2-7	Therneau and Grambsch²⁶	https://cran.r-project.org/web/packages/survival/index.html
survminer-0.4.8	Alboukadel Kassambara	https://cran.r-project.org/web/packages/survminer/index.html
