Chen-Ying Hung, Ching-Heng Lin, Tsuo-Hung Lan, Giia-Sheun Peng, Chi-Chun Lee.
Abstract
BACKGROUND: Intelligent decision support systems (IDSS) have been applied to disease management tasks. Deep neural networks (DNNs) are artificial intelligence techniques that offer high modeling power. The application of DNNs to large-scale data for estimating stroke risk needs to be assessed and validated. This study aims to apply a DNN to derive a stroke prediction model from a large electronic health record database.
Year: 2019 PMID: 30865675 PMCID: PMC6415884 DOI: 10.1371/journal.pone.0213007
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The 3-year and 8-year stroke rate of patients in the 5 risk categories in the testing datasets.
Characteristics of development and testing datasets.
| Characteristics | Development | Testing dataset 1 | Testing dataset 2 |
|---|---|---|---|
| No. of records | 8,952,000 | 1,118,320 | 1,122,596 |
| No. of records with stroke events | 43,911 | 4,578 | 5,000 |
| **Patient demographics** | | | |
| No. of patients | 672,214 | 84,342 | 83,931 |
| No. of patients with stroke events | 2,060 | 239 | 245 |
| No. of OPD visits in 2003, median (IQR) | 11 (5–20) | 11 (5–20) | 11 (5–20) |
| Men, No. (%) | 326,337 (48.5) | 41,078 (48.7) | 40,916 (48.7) |
| Age in years, mean (SD) | 35.5 (20.2) | 35.5 (20.2) | 35.5 (20.3) |
| **Co-morbidity, No. (%)** | | | |
| Hypertension | 79,696 (11.9) | 9,968 (11.8) | 9,944 (11.8) |
| Hyperlipidemia | 50,929 (7.6) | 6,395 (7.6) | 6,288 (7.5) |
| Diabetes mellitus | 38,635 (5.7) | 5,017 (5.9) | 4,780 (5.7) |
| Ischemic heart disease | 32,126 (4.8) | 3,975 (4.7) | 3,973 (4.7) |
| Atrial fibrillation | 1,958 (0.3) | 235 (0.3) | 224 (0.3) |
| Heart failure | 7,959 (1.2) | 989 (1.2) | 1,020 (1.2) |
| **Medication use, No. (%)** | | | |
| Antiplatelet agents | 75,252 (11.2) | 9,425 (11.2) | 9,311 (11.1) |
| Renin angiotensin system inhibitors | 50,885 (7.6) | 6,438 (7.6) | 6,306 (7.5) |
| Beta blockers | 87,051 (12.9) | 11,019 (13.1) | 10,966 (13.1) |
| Calcium channel blockers | 67,373 (10.0) | 8,438 (10.0) | 8,268 (9.9) |
| Other antihypertensive drugs | 70,876 (10.5) | 9,023 (10.7) | 8,844 (10.5) |
| Statins | 18,567 (2.8) | 2,382 (2.8) | 2,312 (2.8) |
| Oral hypoglycemic agents | 28,149 (4.2) | 3,641 (4.3) | 3,415 (4.1) |
| Insulins | 4,746 (0.7) | 637 (0.8) | 592 (0.7) |
OPD = outpatient department; IQR = interquartile range; SD = standard deviation.
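The patient counts in the characteristics table imply that stroke events are rare in all three datasets. As a minimal sketch, the proportion of patients with a stroke event can be recomputed from the table's patient-level counts. The dictionary keys (`development`, `testing_1`, `testing_2`) and the helper `stroke_proportion` are illustrative names, and the assignment of the two testing columns to "1" and "2" by column order is an assumption.

```python
# Patient-level counts copied from the characteristics table.
# Key names and the ordering of the two testing datasets are assumptions.
datasets = {
    "development": {"patients": 672_214, "stroke_patients": 2_060},
    "testing_1": {"patients": 84_342, "stroke_patients": 239},
    "testing_2": {"patients": 83_931, "stroke_patients": 245},
}


def stroke_proportion(counts):
    """Fraction of patients in a dataset with at least one recorded stroke event."""
    return counts["stroke_patients"] / counts["patients"]


proportions = {name: stroke_proportion(c) for name, c in datasets.items()}

for name, p in proportions.items():
    print(f"{name}: {100 * p:.2f}% of patients had a stroke event")
```

With event proportions this low, overall accuracy would be dominated by the non-stroke majority, which may be one reason the results are reported as AUC, sensitivity, and specificity.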
Fig 2. Performance of the deep learning model for predicting 3-year stroke occurrence in (A) testing dataset 1 and (B) testing dataset 2.
Fig 3. Sensitivity and specificity of the DNN model for predicting 3-year stroke occurrence in different testing time periods under (A) the high-specificity operating point and (B) the high-sensitivity operating point.
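An operating point is a score threshold applied to the model output: raising it favors specificity, lowering it favors sensitivity. The sketch below illustrates that trade-off only. The labels, scores, and both thresholds are hypothetical values chosen for the example, not data or cut-offs from the study, and `sensitivity_specificity` is an illustrative helper.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = stroke within the prediction window, 0 = no stroke.
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.02, 0.05, 0.10, 0.15, 0.30, 0.45, 0.40, 0.60, 0.70, 0.90]

# Hypothetical thresholds standing in for the two operating points.
high_specificity = sensitivity_specificity(labels, scores, threshold=0.65)
high_sensitivity = sensitivity_specificity(labels, scores, threshold=0.35)

print("high-specificity point (sens, spec):", high_specificity)
print("high-sensitivity point (sens, spec):", high_sensitivity)
```

On this toy data the higher threshold produces no false positives but misses one event, while the lower threshold catches every event at the cost of two false positives.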
Performance of currently available stroke risk assessment scores and the deep learning model.
| Characteristics of testing populationᵃ | No. in corresponding population in our testing datasets | Performance of the DNN model, AUC (95% CI) | Performance of current score, AUC | Stroke risk score, published year |
|---|---|---|---|---|
| age 35–74, women | 39,248 | 0.870 (0.845–0.896) | 0.774 | Framingham, 1991 |
| | | | 0.788 | QRISK1, 2007 |
| | | | 0.784 | ASSIGN, 2007 |
| | | | 0.817 | QRISK2, 2008 |
| age 35–74, men | 35,595 | 0.832 (0.803–0.862) | 0.760 | Framingham, 1991 |
| | | | 0.767 | QRISK1, 2007 |
| | | | 0.764 | ASSIGN, 2007 |
| | | | 0.792 | QRISK2, 2008 |
| age 30–74, women | 46,450 | 0.887 (0.864–0.909) | 0.774 | Framingham, 2008 |
| age 30–74, men | 41,913 | 0.853 (0.826–0.879) | 0.835 | Framingham, 2008 |
| age >45, women | 27,458 | 0.819 (0.795–0.842) | 0.809 | Reynolds, 2007 |
| age >50, men | 19,629 | 0.746 (0.718–0.775) | 0.708 | Reynolds, 2008 |
| age 25–84, women | 56,929 | 0.896 (0.877–0.915) | 0.880 | QRISK3, 2017 |
| age 25–84, men | 51,796 | 0.870 (0.851–0.890) | 0.858 | QRISK3, 2017 |

ᵃ Age ranges and gender characteristics of testing populations in reference papers.
AUC = area under the receiver operating characteristic curve; CI = confidence intervals.
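One simple way to read the comparison table is to check where each published score's AUC falls relative to the DNN's 95% CI for the matching subpopulation. The sketch below does only that arithmetic on the tabulated values. It is not a formal statistical test, since the published AUCs come from other cohorts, and the helper `position` and the tuple layout are illustrative choices.

```python
# (population, DNN AUC, CI lower, CI upper, [(score name, published AUC), ...])
# Values copied from the comparison table.
rows = [
    ("age 35-74, women", 0.870, 0.845, 0.896,
     [("Framingham 1991", 0.774), ("QRISK1 2007", 0.788),
      ("ASSIGN 2007", 0.784), ("QRISK2 2008", 0.817)]),
    ("age 35-74, men", 0.832, 0.803, 0.862,
     [("Framingham 1991", 0.760), ("QRISK1 2007", 0.767),
      ("ASSIGN 2007", 0.764), ("QRISK2 2008", 0.792)]),
    ("age 30-74, women", 0.887, 0.864, 0.909, [("Framingham 2008", 0.774)]),
    ("age 30-74, men", 0.853, 0.826, 0.879, [("Framingham 2008", 0.835)]),
    ("age >45, women", 0.819, 0.795, 0.842, [("Reynolds 2007", 0.809)]),
    ("age >50, men", 0.746, 0.718, 0.775, [("Reynolds 2008", 0.708)]),
    ("age 25-84, women", 0.896, 0.877, 0.915, [("QRISK3 2017", 0.880)]),
    ("age 25-84, men", 0.870, 0.851, 0.890, [("QRISK3 2017", 0.858)]),
]


def position(auc, lower, upper):
    """Locate a published AUC relative to the DNN's confidence interval."""
    if auc < lower:
        return "below"
    if auc > upper:
        return "above"
    return "inside"


results = [
    (population, name, position(auc, lower, upper))
    for population, _dnn_auc, lower, upper, scores in rows
    for name, auc in scores
]

for population, name, where in results:
    print(f"{population:18s} {name:16s} {where} the DNN 95% CI")
```

Read this way, the older scores tend to fall below the DNN interval, while several of the more recent scores fall inside it.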