Guangjian Liu1, Yi Xu2, Xinming Wang3, Xutian Zhuang3, Huiying Liang1, Yun Xi3, Fangqin Lin1, Liyan Pan1, Taishan Zeng4, Huixian Li1, Xiaojun Cao5, Gansen Zhao6, Huimin Xia7.
Abstract
Children with severe hand, foot, and mouth disease (HFMD) often present with the same clinical features as those with mild HFMD during the early stage, yet later deteriorate rapidly with a fulminant disease course. Our goals were to (1) develop a machine learning system to automatically identify cases at high risk of severe HFMD at the time of admission and (2) compare the effectiveness of the new system with that of the existing risk scoring system. Data on 2,532 children with HFMD admitted between March 2012 and July 2015 were collected retrospectively from a medical center in China. Using a holdout strategy and 10-fold cross-validation, we developed four models with the random forest algorithm using different variable sets. The prediction system HFMD-RF, based on the model with 16 variables from both structured and unstructured data, achieved 0.824 sensitivity, 0.931 specificity, 0.916 accuracy, and 0.916 area under the curve in the independent test set. Most notably, HFMD-RF offers significant gains over the pediatric critical illness score commonly used in clinical practice. As all the selected risk factors can be easily obtained, HFMD-RF might prove useful for reducing mortality and complications of severe HFMD.
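The holdout and 10-fold cross-validation steps mentioned in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the 75/25 stratified split is an assumption (the abstract does not state the ratio, although the set sizes of 1,899 and 633 reported in the baseline table are consistent with it), and the helper names `stratified_holdout` and `k_fold` are hypothetical.

```python
import random

def stratified_holdout(labels, test_fraction=0.25, seed=0):
    """Split indices into train/test while preserving the class ratio.

    test_fraction=0.25 is an assumption, not a value stated in the source.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

def k_fold(indices, k=10, seed=0):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, folds[i]

# 365 severe (1) and 2,167 mild (0) cases, as in the total data set.
labels = [1] * 365 + [0] * 2167
train_idx, test_idx = stratified_holdout(labels)
n_train_severe = sum(labels[i] for i in train_idx)
n_test_severe = sum(labels[i] for i in test_idx)

# Cross-validation is run inside the training set only; the test set stays untouched.
folds = list(k_fold(train_idx, k=10))
print(len(train_idx), len(test_idx), n_train_severe, n_test_severe)
```

Keeping the test indices out of the cross-validation loop is what lets the test-set figures be read as an independent estimate.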
Year: 2017 PMID: 29180702 PMCID: PMC5703994 DOI: 10.1038/s41598-017-16521-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Baseline Characteristics.
| Characteristics | Total Data Set (n = 2532) | Training Set (n = 1899) | Test Set (n = 633) | χ²/t | P-value |
|---|---|---|---|---|---|
| Stage (Severe/Mild) | 365/2167 | 274/1625 | 91/542 | 0.13 | 0.732 |
| Gender (Male/Female) | 1715/817 | 1282/617 | 433/200 | 0.17 | 0.677 |
| Vomiting (Yes/No) | 393/2139 | 292/1607 | 101/532 | 0.12 | 0.727 |
| Age (month) | 24.11 ± 17.14 | 24.19 ± 16.94 | 23.86 ± 17.72 | 0.43 | 0.669 |
| Respiratory rate (/min) | 27.27 ± 5.61 | 27.27 ± 6.16 | 27.27 ± 3.46 | −0.01 | 0.989 |
| Peak temperature (°C) | 39.08 ± 0.71 | 39.08 ± 0.71 | 39.09 ± 0.72 | −0.30 | 0.763 |
| Fever duration (day) | 2.50 ± 1.76 | 2.52 ± 1.79 | 2.47 ± 1.67 | 0.65 | 0.513 |
| Blood glucose (mmol/L) | 5.56 ± 6.14 | 5.54 ± 1.35 | 5.62 ± 1.42 | −1.21 | 0.226 |
| Platelet (10^9/L) | 322.03 ± 93.06 | 321.61 ± 92.70 | 323.29 ± 94.20 | −0.27 | 0.790 |
| Percentage of lymphocytes (%) | 39.35 ± 16.41 | 39.53 ± 16.50 | 38.82 ± 16.16 | 0.95 | 0.345 |
| Lactate dehydrogenase (U/L) | 310 (73) | 309 (72) | 313 (73) | 0.50 | 0.620* |
| Alkaline phosphatase (IU/L) | 178.96 ± 66.34 | 179.12 ± 61.69 | 178.46 ± 78.73 | 0.49 | 0.624 |
| Creatine kinase (IU/L) | 114 (80) | 114 (80) | 113 (80) | 0.30 | 0.761* |
| Creatine kinase-MB (IU/L) | 32.34 ± 18.07 | 32.31 ± 17.98 | 32.42 ± 18.36 | −0.12 | 0.904 |
| Creatinine (µmol/L) | 24.21 ± 5.84 | 24.13 ± 5.73 | 24.45 ± 6.15 | −1.38 | 0.169 |
| Uric acid (µmol/L) | 285.41 ± 83.18 | 285.33 ± 83.26 | 285.66 ± 83.00 | −0.09 | 0.932 |
| Blood chlorine (mmol/L) | 100.31 ± 2.34 | 100.31 ± 2.25 | 100.29 ± 2.61 | 0.22 | 0.830 |
| Alanine aminotransferase (IU/L) | 18 (11) | 18 (11) | 18 (11) | −0.75 | 0.456* |
Data are mean ± standard deviation or median (interquartile range). The Chi-square test was used for comparison of categorical variables and the two-sample t test for continuous variables between the training set and the test set.
*Transformed logarithmically to assume a near-normal distribution for t-test.
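As a worked illustration of the Chi-square comparison described in the footnote, the sketch below applies the standard Pearson statistic for a 2×2 table (without continuity correction) to the Gender row. Whether the authors applied a continuity correction is not stated, so that choice is an assumption, and `chi_square_2x2` is a hypothetical helper name.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom) for
    the 2x2 table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Gender (Male/Female): training set 1282/617, test set 433/200.
chi2, p_value = chi_square_2x2(1282, 617, 433, 200)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

For this row the result comes out at roughly 0.17 with a p-value near 0.68, in line with the values reported in the table.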
Figure 1. Identification and selection procedure of clinical features for machine learning models.
Characteristics of severe and mild HFMD groups.
| Variables | Severe HFMD | Mild HFMD | χ²/t | p-value |
|---|---|---|---|---|
| Vomiting (Yes/Total) | 133/365 | 260/2167 | 142.31 | <0.001 |
| Age (month) | 30.58 ± 20.01 | 23.02 ± 16.36 | 6.85 | <0.001 |
| Respiratory rate (/min) | 28.72 ± 12.94 | 27.03 ± 2.87 | 2.49 | 0.013 |
| Peak temperature (°C) | 39.21 ± 0.57 | 39.06 ± 0.73 | 4.40 | <0.001 |
| Fever duration (day) | 3.27 ± 1.93 | 2.38 ± 1.69 | 8.30 | <0.001 |
| Blood glucose (mmol/L) | 6.14 ± 1.39 | 5.47 ± 1.34 | 8.78 | <0.001 |
| Platelet (10^9/L) | 353.53 ± 108.05 | 316.72 ± 89.23 | 6.07 | <0.001 |
| Percentage of lymphocytes (%) | 39.77 ± 16.43 | 36.90 ± 16.15 | −3.09 | 0.002 |
| Lactate dehydrogenase (U/L) | 320 (67) | 228 (250) | −13.10 | <0.001* |
| Alkaline phosphatase (IU/L) | 163.19 ± 43.48 | 181.61 ± 69.12 | −6.69 | <0.001 |
| Creatine kinase (IU/L) | 120 (53) | 113 (79) | −3.94 | <0.001* |
| Creatine kinase-MB (IU/L) | 34.56 ± 17.84 | 19.12 ± 13.13 | −17.82 | <0.001 |
| Creatinine (µmol/L) | 24.44 ± 6.93 | 23.60 ± 5.94 | −2.43 | 0.015 |
| Uric acid (µmol/L) | 287.51 ± 82.58 | 272.98 ± 85.74 | −3.09 | 0.002 |
| Blood chlorine (mmol/L) | 100.00 ± 3.46 | 100.36 ± 2.09 | 2.74 | 0.006 |
| Alanine aminotransferase (IU/L) | 19 (6) | 18 (11) | −2.58 | 0.010* |
Data are mean ± standard deviation or median (interquartile range). The Chi-square test was used for comparison of categorical variables and the two-sample t test for continuous variables between the severe group and the mild group.
*Transformed logarithmically to assume a near-normal distribution for t-test.
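The two-sample t statistic can be recomputed from summary statistics alone. The sketch below does this for the fever duration row, using group sizes of 365 (severe) and 2,167 (mild) taken from the total data set and assuming no missing values. The footnote does not say whether equal variances were assumed, so both the unequal-variance (Welch) and pooled forms are shown; the function names are hypothetical.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic without assuming equal variances (Welch)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic assuming a common variance (Student)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se

# Fever duration (day): severe 3.27 ± 1.93, mild 2.38 ± 1.69.
t_welch = welch_t(3.27, 1.93, 365, 2.38, 1.69, 2167)
t_pooled = pooled_t(3.27, 1.93, 365, 2.38, 1.69, 2167)
print(f"Welch t = {t_welch:.2f}, pooled t = {t_pooled:.2f}")
```

For this row the Welch form lands close to the reported 8.30, while the pooled form is noticeably larger. Other rows may follow the pooled form, so this should not be read as the authors' rule.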
Figure 2. The receiver operating characteristic (ROC) curves of the four random forest models.
Figure 3. The importance of the 16 variables of the fourth random forest model.
Figure 4. The receiver operating characteristic (ROC) curves of the HFMD-RF system and the pediatric critical illness score (PCIS).
Performance comparison of HFMD-RF against the pediatric critical illness score (PCIS).
| Data Set | Model | Sensitivity | Specificity | Accuracy | AUC |
|---|---|---|---|---|---|
| Performance | | | | | |
| Training Set | PCIS | 0.631 | 0.742 | 0.726 | 0.723 |
| Training Set | HFMD-RF | 0.807 | 0.969 | 0.945 | 0.919 |
| Test Set | PCIS | 0.670 | 0.753 | 0.741 | 0.768 |
| Test Set | HFMD-RF | 0.824 | 0.931 | 0.916 | 0.916 |
| Performance comparison | | | | | |
| Training Set | HFMD-RF against PCIS | 27.9%, P < 0.001 | 30.6%, P < 0.001 | 30.2%, P < 0.001 | 26.8%, P < 0.001 |
| Test Set | HFMD-RF against PCIS | 21.3%, P = 0.017 | 25.2%, P < 0.001 | 24.7%, P < 0.001 | 18.8%, P < 0.001 |
The Chi-square test was used for comparison of Sensitivity, Specificity and Accuracy, and the two-sample t test for AUC between HFMD-RF and PCIS.
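For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and accuracy follow from a confusion matrix. The counts are hypothetical: they were chosen to sum to the 91 severe and 542 mild test-set cases and to land near the reported HFMD-RF test-set figures, but they are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # severe cases correctly flagged
    specificity = tn / (tn + fp)   # mild cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts (not from the source): 91 severe and 542 mild cases.
sens, spec, acc = classification_metrics(tp=75, fn=16, tn=505, fp=37)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Because severe cases make up only about 14% of the data, accuracy is dominated by specificity, which is why sensitivity is reported separately.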