Malaz Boustani (1,2,3), Anthony J Perkins (4), Rezaul Karim Khandker (5), Stephen Duong (6), Paul R Dexter (3), Richard Lipton (7), Christopher M Black (5), Vasu Chandrasekaran (8), Craig A Solid (9), Patrick Monahan (10).
1. Indiana University Center for Health Innovation and Implementation Science, Indiana Clinical Translational Science Institute, Indianapolis, Indiana.
2. Sandra Eskenazi Center for Brain Care Innovation, Eskenazi Health, Indianapolis, Indiana.
3. Indiana University Center for Aging Research, Regenstrief Institute, Inc, Indianapolis, Indiana.
4. Department of Biostatistics, Indiana University School of Medicine, Indianapolis, Indiana.
5. Center for Observational and Real-World Evidence, Merck & Co., Inc., Kenilworth, New Jersey.
6. The Gerontology Institute, Georgia State University, Atlanta, Georgia.
7. Department of Neurology, Albert Einstein College of Medicine, Bronx, New York.
8. Center for Observational and Real-World Evidence, Merck & Co., Inc., Boston, Massachusetts.
9. Solid Research Group, LLC, Saint Paul, Minnesota.
10. Department of Biostatistics, Indiana University School of Medicine and School of Public Health, Indianapolis, Indiana.
Abstract
OBJECTIVES: Developing scalable strategies for the early identification of Alzheimer's disease and related dementia (ADRD) is important. We aimed to develop a passive digital signature for early identification of ADRD using electronic medical record (EMR) data.
DESIGN: A case-control study.
SETTING: The Indiana Network for Patient Care (INPC), a regional health information exchange in Indiana.
PARTICIPANTS: Patients identified with ADRD and matched controls.
MEASUREMENTS: We used data from the INPC, which includes structured and unstructured (visit notes, progress notes, medication notes) EMR data. Cases and controls were matched on age, race, and sex. The derivation sample consisted of 10 504 cases and 39 510 controls; the validation sample included 4500 cases and 16 952 controls. We constructed models to identify early 1- to 10-year, 3- to 10-year, and 5- to 10-year ADRD signatures. The analyses included 14 diagnostic risk variables and 10 drug classes in addition to new variables produced from unstructured data (eg, disorientation, confusion, wandering, apraxia). The area under the receiver operating characteristic curve (AUROC) was used to determine the best models.
RESULTS: In the validation samples, the AUROCs for the 1- to 10-year, 3- to 10-year, and 5- to 10-year models that used only structured data were .689, .649, and .633, respectively. For the same samples and years, models that used both structured and unstructured data produced AUROCs of .798, .748, and .704, respectively. Using a cutoff chosen to maximize sensitivity and specificity, the 1- to 10-year, 3- to 10-year, and 5- to 10-year models had sensitivity that ranged from 51% to 62% and specificity that ranged from 80% to 89%.
CONCLUSION: EMR-based data provide a targeted and scalable process for early identification of risk of ADRD as an alternative to traditional population screening.
J Am Geriatr Soc 68:511-518, 2020.
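The two evaluation quantities named in the abstract, the AUROC and a cutoff that balances sensitivity against specificity, can be illustrated with a minimal sketch. The abstract does not give the exact cutoff criterion, so the example assumes Youden's index (sensitivity + specificity - 1). The function names `auroc` and `best_cutoff` and the toy labels and scores are hypothetical and are not study data.

```python
from typing import List, Tuple


def auroc(labels: List[int], scores: List[float]) -> float:
    """AUROC as the probability that a random case outscores a random
    control, counting ties as one half (pairwise definition)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(labels: List[int],
                scores: List[float]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) for the cutoff that
    maximizes Youden's index. Assumption: the study's criterion may differ.
    A record is flagged as a predicted case when score >= cutoff."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best, best_j = (0.0, 0.0, 0.0), -1.0
    for c in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= c)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        if sens + spec - 1 > best_j:
            best_j = sens + spec - 1
            best = (c, sens, spec)
    return best


# Hypothetical toy data: 1 = case, 0 = matched control; scores are model outputs.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.35, 0.2, 0.2, 0.1, 0.1]

print("AUROC:", auroc(labels, scores))
print("cutoff, sensitivity, specificity:", best_cutoff(labels, scores))
```

The pairwise AUROC definition is quadratic in sample size, so for samples as large as those in the study a rank-based computation or a statistics library would be the practical choice.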