Suxia Bao1, Hong-Yi Pan2, Wei Zheng1, Qing-Qing Wu1,3, Yi-Ning Dai1, Nan-Nan Sun4, Tian-Chen Hui1,2, Wen-Hao Wu1,5, Yi-Cheng Huang1, Guo-Bo Chen6, Qiao-Qiao Yin1,7, Li-Juan Wu8, Rong Yan1, Ming-Shan Wang1, Mei-Juan Chen1, Jia-Jie Zhang1, Li-Xia Yu9, Ji-Chan Shi10, Nian Fang11, Yue-Fei Shen12, Xin-Sheng Xie13, Chun-Lian Ma14, Wan-Jun Yu15, Wen-Hui Tu16, Bin Ju4, Hai-Jun Huang1, Yong-Xi Tong1, Hong-Ying Pan1.
Abstract
Early identification of coronavirus disease 2019 (COVID-19) pneumonia among numerous suspected cases is critical for the early isolation and treatment of patients. The purpose of the study was to develop and validate a rapid screening model to predict early COVID-19 pneumonia from suspected cases using a random forest algorithm in China. A total of 914 patients with initially suspected COVID-19 pneumonia at multiple centers were prospectively included. The computer-assisted embedding method was used to screen the variables. The random forest algorithm was adopted to build a rapid screening model based on the training set. The screening model was evaluated by the confusion matrix and receiver operating characteristic (ROC) analysis in the validation set. The rapid screening model was built on 4 epidemiological features, 3 clinical manifestations, decreased white blood cell count and lymphocytes, and imaging changes on chest X-ray or computed tomography. The area under the ROC curve was 0.956, and the model had a sensitivity of 83.82% and a specificity of 89.57%. The confusion matrix revealed that the prospective screening model had an accuracy of 87.0% for predicting early COVID-19 pneumonia. Here, we developed and validated a rapid screening model that could predict early COVID-19 pneumonia with high sensitivity and specificity. The use of this model to screen for COVID-19 pneumonia has epidemiological and clinical significance.
Year: 2021 PMID: 34128861 PMCID: PMC8213313 DOI: 10.1097/MD.0000000000026279
Source DB: PubMed Journal: Medicine (Baltimore) ISSN: 0025-7974 Impact factor: 1.817
Clinical characteristics of the patients with COVID-19 pneumonia/non-COVID-19 pneumonia.
| Characteristic | COVID-19 pneumonia (n = 361) | Non-COVID-19 pneumonia (n = 553) | P value |
| Age (years) | 47.16 ± 14.47 | 37.87 ± 17.94 | <0.001 |
| Sex (male) | 204 (56.51%) | 277 (50.09%) | 0.057 |
| Comorbidities | 110 (30.47%) | 73 (13.20%) | <0.001 |
| Hypertension | 71 (19.67%) | 42 (7.60%) | <0.001 |
| Diabetes mellitus | 23 (6.37%) | 14 (2.53%) | 0.005 |
| Cancer | 4 (1.11%) | 5 (0.90%) | 0.745 |
| COPD | 3 (0.83%) | 10 (1.81%) | 0.266 |
| Hepatitis B infection | 8 (2.22%) | 5 (0.90%) | 0.151 |
| Others | 11∗ (3.05%) | 14† (2.53%) | 0.681 |
| Travel or residence within 14 days | | | |
| The outbreak area (Wuhan) | 120 (33.24%) | 70 (12.65%) | <0.001 |
| Areas in Hubei Province near the outbreak area (Wuhan) | 9 (2.49%) | 68 (12.30%) | <0.001 |
| Other areas with persistent local transmission or communities with confirmed COVID-19 pneumonia cases | 162 (44.88%) | 279 (50.45%) | 0.104 |
| Contact within 14 days with patients who had fever or respiratory symptoms, or who had a history of travel to or residence in the following areas | | | |
| The outbreak area (Wuhan) | 93 (25.76%) | 64 (11.57%) | <0.001 |
| Areas in Hubei Province near the outbreak area (Wuhan) | 8 (2.21%) | 24 (4.34%) | 0.097 |
| Other areas with persistent local transmission, or communities with confirmed COVID-19 pneumonia cases | 99 (27.42%) | 81 (14.65%) | <0.001 |
| Association with a cluster outbreak | 133 (36.84%) | 16 (2.89%) | <0.001 |
| Exposure to wild animals | 1 (0.27%) | 1 (0.18%) | 1.000 |
| Contact with patients with influenza A | 3 (0.83%) | 15 (2.71%) | 0.052 |
| Contact with patients with influenza B | 5 (1.38%) | 15 (2.71%) | 0.248 |
| Fever‡ | 175 (48.48%) | 261 (47.20%) | 1.000 |
| Body temperature | 37.51 ± 0.83 | 37.48 ± 0.82 | 0.527 |
| Dry cough (cough without sputum) | 158 (43.77%) | 214 (38.70%) | 0.130 |
| Sputum | 130 (36.01%) | 136 (24.59%) | <0.001 |
| Fatigue | 98 (27.15%) | 59 (10.67%) | <0.001 |
| Dyspnea | 33 (9.14%) | 11 (1.99%) | <0.001 |
| Conjunctival congestion | 2 (0.63%) | 4 (0.72%) | 1.000 |
| Nasal congestion | 14 (3.88%) | 47 (8.50%) | 0.006 |
| Diarrhea or abdominal ache | 37 (10.25%) | 21 (3.80%) | <0.001 |
| Dizziness or headache | 28 (7.76%) | 48 (8.68%) | 0.628 |
| Nausea or vomiting | 13 (3.60%) | 7 (1.27%) | 0.021 |
| Sore throat | 19 (5.26%) | 60 (10.85%) | 0.004 |
| Muscle soreness | 17 (4.71%) | 3 (0.54%) | <0.001 |
| White blood cell count (×10^9/L) | 5.40 ± 2.52 | 7.30 ± 3.01 | <0.001 |
| Normal or reduced white blood cells§ | 340 (94.18%) | 459 (83.00%) | <0.001 |
| Lymphocyte count (×10^9/L) | 1.25 ± 1.03 | 1.71 ± 0.85 | <0.001 |
| Reduced lymphocytes‖ | 193 (53.46%) | 141 (25.50%) | <0.001 |
| Neutrophil cell count (×10^9/L) | 3.90 ± 4.28 | 4.95 ± 3.33 | <0.001 |
| Normal or reduced neutrophil cells¶ | 322 (89.20%) | 429 (77.58%) | <0.001 |
| C-reactive protein level (mg/L) | 20.99 ± 26.71 | 18.50 ± 35.29 | 0.254 |
| Chest X-ray or CT scanning | | | |
| Normal | 16 (4.43%) | 242 (43.76%) | <0.001 |
| Unilateral local patchy shadowing | 93 (25.76%) | 134 (24.23%) | |
| Bilateral multiple ground-glass opacity | 126 (34.90%) | 98 (17.72%) | |
| Bilateral diffuse ground-glass shadowing with pulmonary consolidation | 123 (34.07%) | 30 (5.42%) | |
| Other imaging abnormalities such as pulmonary nodule or pleural effusion | 3 (0.88%) | 49 (8.86%) | |
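The P values for the categorical rows can be sanity-checked from the counts alone. The sketch below assumes a Pearson chi-square test without continuity correction; this excerpt does not state which test the authors used, so treat that choice and the helper name `chi_square_2x2` as assumptions. It is applied to the "Sex (male)" row.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# "Sex (male)" row: 204 of 361 COVID-19 patients, 277 of 553 non-COVID-19 patients.
covid_male, covid_total = 204, 361
other_male, other_total = 277, 553

chi2, p_value = chi_square_2x2(
    covid_male, covid_total - covid_male,
    other_male, other_total - other_male,
)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Under this assumption the result lands close to the 0.057 reported for that row. Rows with very small counts (for example, exposure to wild animals) would normally call for an exact test instead.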
Figure 1. The top 10 variables ranked by regression coefficients. The threshold of variable selection was manually set to an absolute value of 0.85. The top 10 variables were screened for further analysis.
Collinearity statistics among the top 10 variables ranked by regression coefficient.
| Parameter | Tolerance | VIF |
| Chest X-ray or CT | 0.42 | 2.38 |
| Cluster outbreak | 0.78 | 1.29 |
| Travel/residence (Wuhan) | 0.75 | 1.33 |
| Contact with others (Wuhan) | 0.75 | 1.34 |
| Contact with others (other areas) | 0.75 | 1.34 |
| Muscle soreness | 0.94 | 1.06 |
| Dyspnea | 0.90 | 1.11 |
| Fatigue | 0.79 | 1.26 |
| Lymphocyte count | 0.25 | 4.02 |
| White blood cell count | 0.23 | 4.41 |
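Tolerance and VIF are reciprocals (VIF = 1 / tolerance), so the two columns can be cross-checked against each other. The sketch below does that for the values in the table; the 5.0 cut-off is a common rule of thumb and an assumption here, since the excerpt does not state the threshold the authors applied.

```python
# (tolerance, VIF) pairs copied from the collinearity table.
collinearity = {
    "Chest X-ray or CT": (0.42, 2.38),
    "Cluster outbreak": (0.78, 1.29),
    "Travel/residence (Wuhan)": (0.75, 1.33),
    "Contact with others (Wuhan)": (0.75, 1.34),
    "Contact with others (other areas)": (0.75, 1.34),
    "Muscle soreness": (0.94, 1.06),
    "Dyspnea": (0.90, 1.11),
    "Fatigue": (0.79, 1.26),
    "Lymphocyte count": (0.25, 4.02),
    "White blood cell count": (0.23, 4.41),
}

VIF_THRESHOLD = 5.0  # assumed rule-of-thumb cut-off, not taken from the source

largest_gap = 0.0
flagged = []
for name, (tolerance, reported_vif) in collinearity.items():
    implied_vif = 1.0 / tolerance
    largest_gap = max(largest_gap, abs(implied_vif - reported_vif))
    if reported_vif >= VIF_THRESHOLD:
        flagged.append(name)
    print(f"{name}: reported VIF {reported_vif:.2f}, implied {implied_vif:.2f}")

print(f"largest gap between reported and implied VIF: {largest_gap:.3f}")
print(f"variables at or above the assumed threshold: {flagged}")
```

The small gaps come from the tolerance column being rounded to two decimals. The two blood-count variables have the highest VIFs, which is expected because lymphocytes are a component of the white blood cell count.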
Figure 2. The importance of the predictor variables in differentiating the probability of early COVID-19 pneumonia in the random forest model. The relative measure was scaled to a maximum of 100 and used to compare the importance of the variables in the model. COVID-19 = coronavirus disease 2019.
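Scaling importances to a maximum of 100 amounts to dividing each raw importance by the largest one and multiplying by 100. The sketch below shows this with placeholder feature names and made-up raw values; none of these numbers come from the study, and `scale_to_100` is a hypothetical helper.

```python
def scale_to_100(importances):
    """Rescale raw importances so that the largest value becomes 100."""
    top = max(importances.values())
    if top <= 0:
        raise ValueError("at least one importance must be positive")
    return {name: 100.0 * value / top for name, value in importances.items()}


# Hypothetical raw importances, for illustration only.
raw_importances = {"feature_a": 0.4, "feature_b": 0.2, "feature_c": 0.1}

scaled = scale_to_100(raw_importances)
for name, value in sorted(scaled.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.1f}")
```

The rescaling preserves the ranking and the ratios between variables, so it changes only how the bars are labeled.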
Diagnostic sensitivity, specificity and AUC of the fast screening model with different parameters.
| n_estimators | Sensitivity | Specificity | AUC |
| 1 | 0.7058 | 0.8522 | 0.779 |
| 5 | 0.8676 | 0.8609 | 0.9426 |
| 10 | 0.8235 | 0.8783 | 0.9491 |
| 20 | 0.8235 | 0.8783 | 0.9493 |
| 30 | 0.8235 | 0.8783 | 0.9527 |
| 40 | 0.8382 | 0.8957 | 0.9555 |
| 50 | 0.8235 | 0.8957 | 0.9552 |
| 60 | 0.8088 | 0.8957 | 0.9529 |
| 70 | 0.8382 | 0.8957 | 0.9525 |
| 80 | 0.8382 | 0.9043 | 0.9519 |
| 90 | 0.8382 | 0.8957 | 0.9525 |
| 100 | 0.8529 | 0.8783 | 0.9516 |
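One way to read this table is to pick the `n_estimators` value with the highest AUC. The sketch below does so from the rows as printed; choosing by AUC alone is an assumption about the selection rule, although the resulting row matches the sensitivity, specificity, and AUC quoted in the abstract.

```python
# Rows copied from the table: (n_estimators, sensitivity, specificity, AUC).
results = [
    (1, 0.7058, 0.8522, 0.779),
    (5, 0.8676, 0.8609, 0.9426),
    (10, 0.8235, 0.8783, 0.9491),
    (20, 0.8235, 0.8783, 0.9493),
    (30, 0.8235, 0.8783, 0.9527),
    (40, 0.8382, 0.8957, 0.9555),
    (50, 0.8235, 0.8957, 0.9552),
    (60, 0.8088, 0.8957, 0.9529),
    (70, 0.8382, 0.8957, 0.9525),
    (80, 0.8382, 0.9043, 0.9519),
    (90, 0.8382, 0.8957, 0.9525),
    (100, 0.8529, 0.8783, 0.9516),
]

# Assumed selection rule: maximize AUC (index 3 of each row).
best = max(results, key=lambda row: row[3])
n_estimators, sensitivity, specificity, auc = best
print(f"n_estimators={n_estimators}: sensitivity={sensitivity}, "
      f"specificity={specificity}, AUC={auc}")
```

Beyond about 30 trees the AUC varies only in the third decimal place, so the differences between the later rows are small relative to what a validation set of this size can resolve.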
Figure 3. The capacity of the screening variables to predict early COVID-19 pneumonia. The ROC curve and AUC in the rapid screening model for discriminating the probability of early COVID-19 pneumonia. AUC = area under the curve, COVID-19 pneumonia = novel coronavirus pneumonia, ROC = receiver operating characteristic curve.
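The AUC can be interpreted as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way on a tiny made-up example; the labels and scores are illustrative only and are not study data, and `roc_auc` is a hypothetical helper rather than the authors' code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Illustrative labels (1 = COVID-19 pneumonia) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

This pairwise form is quadratic in the number of cases, which is fine for a validation set of a few hundred patients; library implementations typically use a sorted, rank-based computation instead.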
The confusion matrix in predicting early COVID-19 pneumonia in the validation set.
| Parameter | Group size | COVID-19 pneumonia | Non-COVID-19 pneumonia | Precision | Recall | F1-score |
| COVID-19 pneumonia | 68 | 57 | 11 | 0.83 | 0.84 | 0.83 |
| Non-COVID-19 pneumonia | 115 | 12 | 103 | 0.90 | 0.90 | 0.90 |
| Overall | | | | 0.87 | 0.87 | 0.87 |
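The summary metrics follow directly from the four counts in the confusion matrix. The sketch below recomputes them, treating COVID-19 pneumonia as the positive class and reading each row as the true group and each column as the predicted group (an assumption about the table layout that is consistent with the reported recall values).

```python
# Counts from the confusion matrix (validation set).
true_positive = 57    # COVID-19 pneumonia predicted as COVID-19 pneumonia
false_negative = 11   # COVID-19 pneumonia predicted as non-COVID-19 pneumonia
false_positive = 12   # non-COVID-19 pneumonia predicted as COVID-19 pneumonia
true_negative = 103   # non-COVID-19 pneumonia predicted as non-COVID-19 pneumonia

total = true_positive + false_negative + false_positive + true_negative

sensitivity = true_positive / (true_positive + false_negative)  # recall, positive class
specificity = true_negative / (true_negative + false_positive)  # recall, negative class
precision = true_positive / (true_positive + false_positive)
f1_score = 2 * precision * sensitivity / (precision + sensitivity)
accuracy = (true_positive + true_negative) / total

print(f"sensitivity = {sensitivity:.4f}")
print(f"specificity = {specificity:.4f}")
print(f"precision   = {precision:.2f}")
print(f"F1-score    = {f1_score:.2f}")
print(f"accuracy    = {accuracy:.2f}")
```

These reproduce the sensitivity of 83.82%, the specificity of 89.57%, and the overall accuracy of about 87% quoted in the abstract.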