Branimir Ljubic, Shoumik Roychoudhury, Xi Hang Cao, Martin Pavlovski, Stefan Obradovic, Richard Nair, Lucas Glass, Zoran Obradovic.
Keywords: Alzheimer's disease prediction; Cognitive impairment; Deep learning; Electronic medical records; Recurrent Neural Networks
Year: 2020 PMID: 33011665 PMCID: PMC7502243 DOI: 10.1016/j.cmpb.2020.105765
Source DB: PubMed Journal: Comput Methods Programs Biomed ISSN: 0169-2607 Impact factor: 5.428
Datasets used in the experiments and the number of patients in each of them.
| Dataset | Number of patients |
|---|---|
| Naïve model, patients with 4 visits, all studied domains | 2,600 |
| SCRP model, 4 visits, all studied domains | 2,324 |
| SCRP model, 3 visits, all studied domains | 3,199 |
| SCRP model, 2 visits, all studied domains | 3,568 |
| SCRP, 4 visits, conditions and drugs domains | 3,726 |
| SCRP, 4 visits, conditions and measurements domains | 3,846 |
| SCRP, 4 visits, conditions domain | 6,418 |
Fig. 1. LSTM RNN deep learning system designed in this research: an input layer (patient timeline), an embedding layer, a sequence modeling layer, and an output layer (disease risk score).
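The four layers named in the Fig. 1 caption can be illustrated with a minimal, untrained sketch in plain Python. Everything specific here is an assumption made for illustration and is not taken from the paper: the class name `TinyLSTMRisk`, the vocabulary and layer sizes, averaging the code embeddings within a visit, the omission of bias terms, and the random weights. It only shows how a patient timeline could flow through an embedding, a single LSTM layer, and a sigmoid output to yield a risk score.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMRisk:
    """Hypothetical illustration: timeline -> embedding -> LSTM -> risk score."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # Embedding layer: one vector per medical code.
        self.embedding = rand_matrix(vocab_size, embed_dim, rng)
        # One weight matrix per gate (input, forget, candidate, output),
        # each applied to the concatenation [x_t, h_{t-1}]. Biases omitted.
        self.gates = {
            g: rand_matrix(hidden_dim, embed_dim + hidden_dim, rng)
            for g in "ifgo"
        }
        # Output layer: maps the final hidden state to a single logit.
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]

    def embed_visit(self, codes):
        # Assumption: a visit is a non-empty list of code indices whose
        # embeddings are averaged into one visit vector.
        vecs = [self.embedding[c] for c in codes]
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def forward(self, timeline):
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # cell state
        for visit in timeline:  # sequence modeling layer
            z = self.embed_visit(visit) + h  # concatenate x_t and h_{t-1}
            i = [sigmoid(a) for a in matvec(self.gates["i"], z)]
            f = [sigmoid(a) for a in matvec(self.gates["f"], z)]
            g = [math.tanh(a) for a in matvec(self.gates["g"], z)]
            o = [sigmoid(a) for a in matvec(self.gates["o"], z)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        # Output layer: sigmoid turns the logit into a risk score in (0, 1).
        return sigmoid(sum(w * hj for w, hj in zip(self.w_out, h)))


# A made-up patient timeline of four visits, each a list of code indices.
model = TinyLSTMRisk(vocab_size=50, embed_dim=8, hidden_dim=4)
timeline = [[3, 17], [5], [17, 42, 8], [1, 3]]
risk = model.forward(timeline)
```

With random weights the score carries no clinical meaning; in practice the weights would be learned from labeled patient timelines.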
Fig. 2. Naïve model results: AUPRC scores obtained by LSTM RNN for the condition and measurement domains separately and for their ensemble (c.m.), using the naïve AD dataset selection for patients with at least 4 visits.
Fig. 3. The average precision/recall comparison of models using the condition domain, the measurement domain, and an ensemble of the two domains (combined).
Fig. 4. AUPRCs of AD predictions by LSTM RNN trained on the SCRP dataset using each of three domains (condition, drug, measurement) separately and as an ensemble (c.m.d.).
Fig. 5. The average precision/recall comparison of LSTM RNN models trained on the SCRP dataset using the condition, measurement, and drug domains and an ensemble of all three domains (c.m.d.).
Prediction of AD by LSTM RNN trained on the SCRP dataset using the drugs, measurements, and conditions domains and their ensemble (c.m.d.), with a fixed testing dataset size (234 patients) and different training dataset sizes. The evaluation metric is AUPRC.
| No. of patients in training dataset | No. of patients in testing dataset | AUPRC Drugs | AUPRC Measurements | AUPRC Conditions | AUPRC c.m.d. |
|---|---|---|---|---|---|
| 2,090 | 234 | 0.985 ± 0.040 | 0.986 ± 0.048 | 0.651 ± 0.031 | 0.991 ± 0.038 |
| 1,800 | 234 | 0.982 ± 0.048 | 0.985 ± 0.050 | 0.650 ± 0.040 | 0.990 ± 0.033 |
| 1,500 | 234 | 0.869 ± 0.053 | 0.889 ± 0.060 | 0.648 ± 0.043 | 0.908 ± 0.035 |
| 1,250 | 234 | 0.860 ± 0.049 | 0.866 ± 0.052 | 0.647 ± 0.037 | 0.871 ± 0.042 |
| 1,000 | 234 | 0.810 ± 0.058 | 0.814 ± 0.059 | 0.645 ± 0.052 | 0.835 ± 0.055 |
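For readers unfamiliar with the metric in this table, the area under the precision-recall curve is commonly estimated by average precision: the mean of the precision values at the ranks where a true positive appears. The sketch below is a generic illustration of that estimator, not the authors' evaluation code; the function name `average_precision` and the toy labels and scores are assumptions, and ties between scores are not handled specially.

```python
def average_precision(y_true, y_score):
    """Step-wise estimate of AUPRC for binary labels (1 = positive)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Toy example: positives ranked 1st and 3rd out of four patients.
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

In the toy example the precisions at the two positives are 1/1 and 2/3, so the estimate is their mean, 5/6.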
Fig. 6. Prediction of AD by LSTM RNN model, trained on the SCRP dataset using the drugs domain in different splits of the dataset for training and testing.
Results of experiments with SCRP datasets with different numbers of visits (2, 3, or 4) for drugs, measurements, and conditions domains as well as for an ensemble of all 3 domains (c.m.d.).
| Datasets Positive cohorts | No. of patients | AUPRC Drugs | AUPRC Measurements | AUPRC Conditions | AUPRC c.m.d. |
|---|---|---|---|---|---|
| 4 visits | 2,324 | | | | |
| 3 visits | 3,199 | 0.974 ± 0.042 | 0.973 ± 0.039 | 0.640 ± 0.032 | 0.982 ± 0.035 |
| 2 visits | 3,568 | 0.893 ± 0.046 | 0.908 ± 0.048 | 0.601 ± 0.030 | 0.924 ± 0.042 |