Yijun Shao, Qing T Zeng, Kathryn K Chen, Andrew Shutes-David, Stephen M Thielke, Debby W Tsuang.
Abstract
BACKGROUND: Dementia is underdiagnosed in both the general population and among Veterans. This underdiagnosis decreases quality of life, reduces opportunities for interventions, and increases health-care costs. New approaches are therefore necessary to facilitate the timely detection of dementia. This study seeks to identify cases of undiagnosed dementia by developing and validating a weakly supervised machine-learning approach that incorporates the analysis of both structured and unstructured electronic health record (EHR) data.
Keywords: Dementia; Diagnosis; Machine learning; Medical records; Veterans
Year: 2019 PMID: 31288818 PMCID: PMC6617952 DOI: 10.1186/s12911-019-0846-4
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Results of the manual chart review
| Risk score bin | # of dementias | # of unclears | # of non-dementias | Rate of undiagnosed dementia^a (Unclear = dementia) | Rate of undiagnosed dementia^a (Unclear = non-dementia) |
|---|---|---|---|---|---|
| 0.0 ~ 0.1 | 3 | 1 | 26 | 0.133 | 0.1 |
| 0.1 ~ 0.2 | 2 | 1 | 7 | 0.3 | 0.2 |
| 0.2 ~ 0.3 | 2 | 0 | 8 | 0.2 | 0.2 |
| 0.3 ~ 0.4 | 6 | 1 | 3 | 0.7 | 0.6 |
| 0.4 ~ 0.5 | 3 | 2 | 5 | 0.5 | 0.3 |
| 0.5 ~ 0.6 | 5 | 0 | 5 | 0.5 | 0.5 |
| 0.6 ~ 0.7 | 5 | 1 | 4 | 0.6 | 0.5 |
| 0.7 ~ 0.8 | 6 | 0 | 4 | 0.6 | 0.6 |
| 0.8 ~ 0.9 | 8 | 0 | 2 | 0.8 | 0.8 |
| 0.9 ~ 1.0 | 7 | 2 | 1 | 0.9 | 0.7 |
^a “Unclear = dementia” indicates that subjects who were classified as “unclear” during the manual chart review are counted in the “dementia” group, so the rate of undiagnosed dementia is calculated as (# of dementias + # of unclears) / (# of dementias + # of unclears + # of non-dementias). Conversely, “Unclear = non-dementia” indicates that subjects who were classified as “unclear” are counted in the “non-dementia” group, so the rate of undiagnosed dementia is calculated as (# of dementias) / (# of dementias + # of unclears + # of non-dementias).
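The two rate columns can be recomputed from the counts in the table. The following is a minimal sketch, not the authors' code: the function and variable names are illustrative, and the "Unclear = non-dementia" rate is taken as dementias divided by all reviewed charts in the bin, which is what the tabulated values imply.

```python
# Counts per risk-score bin, copied from the chart-review table:
# (dementias, unclears, non-dementias). Names here are illustrative.
COUNTS = [
    (3, 1, 26), (2, 1, 7), (2, 0, 8), (6, 1, 3), (3, 2, 5),
    (5, 0, 5), (5, 1, 4), (6, 0, 4), (8, 0, 2), (7, 2, 1),
]


def undiagnosed_rates(dementia, unclear, non_dementia):
    """Return (rate with unclear = dementia, rate with unclear = non-dementia)."""
    total = dementia + unclear + non_dementia
    # Unclear charts counted as dementia.
    rate_unclear_as_dementia = (dementia + unclear) / total
    # Unclear charts counted as non-dementia, so only confirmed dementias count.
    rate_unclear_as_non_dementia = dementia / total
    return rate_unclear_as_dementia, rate_unclear_as_non_dementia


rates = [undiagnosed_rates(*row) for row in COUNTS]

for i, (upper, lower) in enumerate(rates):
    print(f"{i / 10:.1f} ~ {(i + 1) / 10:.1f}: {upper:.3f}  {lower:.3f}")
```

The two conventions bracket the true rate in each bin: treating every unclear chart as dementia gives an upper bound, and treating every unclear chart as non-dementia gives a lower bound.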
Demographics of the cases and controls
| | Cases (n = 1861) | Controls (n = 9305) |
|---|---|---|
| Mean age | 79.8 | 79.5 |
| Gender | ||
| Female | 62 (3.3%) | 310 (3.3%) |
| Male | 1799 (96.7%) | 8995 (96.7%) |
| Race | ||
| Black | 112 (6.0%) | 428 (4.6%) |
| White | 1434 (77.1%) | 7099 (76.3%) |
| Other | 64 (3.4%) | 245 (2.6%) |
| Unknown | 251 (13.5%) | 1533 (16.5%) |
| Ethnicity | ||
| Hispanic | 28 (1.5%) | 135 (1.5%) |
| Non-Hispanic | 1679 (90.2%) | 8170 (87.8%) |
| Unknown | 154 (8.3%) | 1000 (10.7%) |
The most significant topic features (p < 0.01) between cases and controls
| # | Topic (10 of its words shown) |
|---|---|
| 1 | dementia, memory, cognitive, wife, problems, loss, impairment, galantamine, mmse, Alzheimer, … |
| 2 | angry, asked, behavior, police, upset, told, staff, agitated, made, leave, ... |
| 3 | family, home, daughter, care, member, members, sister, granddaughter, grandson, brother, ... |
| 4 | qd, bid, prn, mg, qhs, lisinopril, tid, asa, metoprolol, meds, ... |
| 5 | plan, agree, reviewed, note, examined, discussed, findings, assessment, resident, concur, ... |
| 6 | ct, scan, contrast, chest, radiology, abdomen, pelvis, ordered, cat, pet, ... |
| 7 | taking, meds, pills, medication, takes, stopped, states, prescribed, pill, tabs, ... |
| 8 | resident, care, visit, nursing, home, staff, contract, daily, offered, date, ... |
| 9 | issues, related, health, problems, medical, issue, discussed, time, plan, treatment, ... |
| 10 | transfer, patient, report, transferred, ward, care, receiving, rn, condition, unit, ... |
| 11 | continues, continue, reports, remains, continued, time, encouraged, work, plan, improved, ... |
| 12 | housing, stable, months, part, stay, living, worried, household, rent, past, ... |
Fig. 1. Prediction scores and originally assigned case/control status. The distribution of cases (red bars) and controls (blue bars) established by our original inclusion and exclusion criteria compared to the results of our logistic regression model (prediction score indicates the likelihood of having dementia)
Linear regression results
| | Intercept (Unclear = Dementia^a) | Slope (Unclear = Dementia^a) | Intercept (Unclear = Non-Dementia^a) | Slope (Unclear = Non-Dementia^a) |
|---|---|---|---|---|
| Value | 0.1565 | 0.7335 | 0.1015 | 0.6970 |
| p-value | 0.084 | 0.001 | 0.199 | 0.001 |
^a “Unclear = dementia” indicates that subjects who were classified as “unclear” during the manual chart review are classified in the “dementia” group, whereas “Unclear = non-dementia” indicates that subjects who were classified as “unclear” during the manual chart review are classified in the “non-dementia” group
Fig. 2. Lines fit to the rates of undiagnosed dementias estimated in the 10 risk bins. The x values for the bins were taken as the midpoints of the bin intervals. The left figure illustrates the results when the “Unclear” diagnoses were treated as dementia (i.e., “Unclear = Dementia”), whereas the right figure illustrates the results when the “Unclear” diagnoses were treated as non-dementia (i.e., “Unclear = Non-Dementia”)
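The line fit can be illustrated with an ordinary least-squares regression of the tabulated rates on the bin midpoints. This is a sketch under stated assumptions rather than the authors' procedure: it assumes an unweighted fit on the rounded rates as printed in the chart-review table, and `fit_line` is a hypothetical helper name. It does not compute the p-values.

```python
# Bin midpoints 0.05, 0.15, ..., 0.95, as described in the Fig. 2 caption.
MIDPOINTS = [0.05 + 0.1 * i for i in range(10)]

# Rates copied from the chart-review table (rounded as printed there).
RATES_UNCLEAR_DEMENTIA = [0.133, 0.3, 0.2, 0.7, 0.5, 0.5, 0.6, 0.6, 0.8, 0.9]
RATES_UNCLEAR_NON_DEMENTIA = [0.1, 0.2, 0.2, 0.6, 0.3, 0.5, 0.5, 0.6, 0.8, 0.7]


def fit_line(xs, ys):
    """Unweighted ordinary least squares; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


fit_dementia = fit_line(MIDPOINTS, RATES_UNCLEAR_DEMENTIA)
fit_non_dementia = fit_line(MIDPOINTS, RATES_UNCLEAR_NON_DEMENTIA)

print("Unclear = Dementia:     intercept=%.4f slope=%.4f" % fit_dementia)
print("Unclear = Non-Dementia: intercept=%.4f slope=%.4f" % fit_non_dementia)
```

Under these assumptions the fitted intercepts and slopes agree with the reported linear regression values to within rounding, which suggests that the published fit treated each bin as a single equally weighted point.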
Fig. 3. ROC curves of the prediction with AUC values. The blue and red colors correspond to “Unclear = Dementia” and “Unclear = Non-Dementia,” respectively. AUC: area under the ROC curve; ROC: receiver operating characteristic
Performance for the identification of undiagnosed dementias
| Threshold | SEN (Unclear = Dementia^a) | SPE (Unclear = Dementia^a) | SEN (Unclear = Non-Dementia^a) | SPE (Unclear = Non-Dementia^a) |
|---|---|---|---|---|
| 0.037 | 0.889 | 0.756 | 0.888 | 0.751 |
| 0.061 | 0.825 | 0.832 | 0.826 | 0.827 |
| 0.102 | 0.736 | 0.895 | 0.736 | 0.890 |
SEN: sensitivity; SPE: specificity. ^a “Unclear = dementia” indicates that subjects who were classified as “unclear” during the manual chart review are classified in the “dementia” group, whereas “Unclear = non-dementia” indicates that subjects who were classified as “unclear” during the manual chart review are classified in the “non-dementia” group
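For readers unfamiliar with the two metrics, the sketch below shows how sensitivity and specificity follow from a score threshold. It is a generic illustration only: the scores and labels are made-up toy values, the function name is hypothetical, and flagging a subject when the score is greater than or equal to the threshold is an assumption. It does not reproduce how the study derived the values in the table.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Compute (sensitivity, specificity) for a given score threshold.

    scores: predicted risk scores in [0, 1]
    labels: 1 for dementia, 0 for non-dementia
    A subject is flagged when score >= threshold (an assumption).
    Assumes both classes are present in `labels`.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn)  # share of true dementias that are flagged
    specificity = tn / (tn + fp)  # share of non-dementias that are not flagged
    return sensitivity, specificity


# Toy data, invented purely for illustration.
toy_scores = [0.02, 0.05, 0.08, 0.15, 0.30, 0.04, 0.65, 0.90]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

sens, spec = sensitivity_specificity(toy_scores, toy_labels, threshold=0.061)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Raising the threshold flags fewer subjects, which lowers sensitivity and raises specificity. The table shows the same trade-off across the thresholds 0.037, 0.061, and 0.102.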