Development and validation of various phenotyping algorithms for Diabetes Mellitus using data from electronic health records.

Santiago Esteban, Manuel Rodríguez Tablado, Francisco E Peper, Yamila S Mahumud, Ricardo I Ricci, Karin S Kopitowski, Sergio A Terrasa.

Abstract

BACKGROUND AND OBJECTIVE: Recent progress towards precision medicine has encouraged the use of electronic health records (EHRs) as a source of the large amounts of data required to study the effect of treatments or risk factors in more specific subpopulations. Phenotyping algorithms allow patients to be classified automatically according to their electronic phenotype, which facilitates the setup of retrospective cohorts. Our objective is to compare the performance of different classification strategies (standardized problems only, rule-based algorithms, statistical learning algorithms (six learners), and stacked generalization (five versions)) for categorizing patients according to their diabetic status (diabetic, non-diabetic, or inconclusive; diabetes of any type) using information extracted from EHRs.
METHODS: Patient information was extracted from the EHR at Hospital Italiano de Buenos Aires, Buenos Aires, Argentina. For the derivation and validation datasets, two probabilistic samples of patients from different years (2005: n = 1663; 2015: n = 800) were extracted. The only inclusion criterion was age (≥40 and <80 years). Four researchers manually reviewed all records and classified patients according to their diabetic status (diabetic: diabetes registered as a health problem or fulfilling the ADA criteria; non-diabetic: not fulfilling the ADA criteria and having at least one fasting glycemia below 126 mg/dL; inconclusive: no data regarding their diabetic status or only one abnormal value). The best performing algorithms within each strategy were tested on the validation set.
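The three reference categories described above can be sketched as a small rule function. This is an illustration only, not the authors' code: the `PatientRecord` layout is hypothetical, and `meets_ada_criteria` is a deliberately simplified stand-in (two or more fasting values at or above 126 mg/dL), since the abstract does not spell out how the ADA criteria were applied. Treating a record with exactly one abnormal value as inconclusive, even when normal values are also present, is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import List

FASTING_THRESHOLD_MG_DL = 126  # fasting glycemia threshold named in the abstract


@dataclass
class PatientRecord:
    # Hypothetical layout; the real EHR schema is not described in the abstract.
    has_diabetes_problem: bool = False
    fasting_glucose_mg_dl: List[float] = field(default_factory=list)


def count_abnormal(record: PatientRecord) -> int:
    """Number of fasting values at or above the threshold."""
    return sum(g >= FASTING_THRESHOLD_MG_DL for g in record.fasting_glucose_mg_dl)


def meets_ada_criteria(record: PatientRecord) -> bool:
    # ASSUMPTION: simplified proxy for the ADA criteria (two or more abnormal
    # fasting values). The full criteria include other tests.
    return count_abnormal(record) >= 2


def classify(record: PatientRecord) -> str:
    """Return 'diabetic', 'non-diabetic', or 'inconclusive'."""
    if record.has_diabetes_problem or meets_ada_criteria(record):
        return "diabetic"
    if count_abnormal(record) == 1:
        return "inconclusive"  # only one abnormal value
    if any(g < FASTING_THRESHOLD_MG_DL for g in record.fasting_glucose_mg_dl):
        return "non-diabetic"  # at least one fasting value below 126 mg/dL
    return "inconclusive"      # no data on diabetic status


print(classify(PatientRecord(fasting_glucose_mg_dl=[95, 101])))
```

A record with a registered diabetes problem is labelled diabetic regardless of laboratory values, mirroring the "registered as a health problem" clause.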
RESULTS: The standardized codes algorithm achieved a Kappa coefficient value of 0.59 (95% CI 0.49, 0.59) in the validation set. The Boolean logic algorithm reached 0.82 (95% CI 0.76, 0.88). A slightly higher value was achieved by the Feedforward Neural Network (0.9, 95% CI 0.85, 0.94). The best performing learner was the stacked generalization meta-learner that reached a Kappa coefficient value of 0.95 (95% CI 0.91, 0.98).
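For readers unfamiliar with the metric, the sketch below shows how a Kappa coefficient can be computed from reference and predicted labels. It assumes the unweighted Cohen's kappa (the abstract does not state the variant), uses made-up toy labels rather than study data, and does not reproduce the confidence intervals reported above.

```python
from collections import Counter


def cohen_kappa(reference, predicted):
    """Unweighted Cohen's kappa between two equally long label sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed agreement: fraction of records where both labels match.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    expected = sum(ref_counts[label] * pred_counts[label] for label in ref_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both sequences use a single identical label
    return (observed - expected) / (1 - expected)


# Toy labels for illustration only (not data from the study).
reference = ["diabetic", "diabetic", "non-diabetic", "non-diabetic", "inconclusive", "non-diabetic"]
predicted = ["diabetic", "non-diabetic", "non-diabetic", "non-diabetic", "inconclusive", "non-diabetic"]
print(round(cohen_kappa(reference, predicted), 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one class (here, non-diabetic) dominates the sample.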
CONCLUSIONS: The stacked generalization strategy and the feedforward neural network showed the best classification metrics in the validation set. Implementing these algorithms makes it possible to use the data of thousands of patients accurately.
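Stacked generalization trains a meta-learner on the out-of-fold predictions of several base learners. The sketch below illustrates that idea with toy components; it is not the authors' implementation. `CentroidLearner`, `LookupMetaLearner`, the two features (fasting glucose, HbA1c), and all data values are hypothetical, and the example is binary, whereas the study used three classes, six learners, and five stacking versions that the abstract does not detail.

```python
from collections import Counter, defaultdict


class CentroidLearner:
    """Toy base learner: nearest class mean on a single feature."""

    def __init__(self, feature_index):
        self.feature_index = feature_index
        self.centroids = {}

    def fit(self, rows, labels):
        sums, counts = defaultdict(float), Counter()
        for row, label in zip(rows, labels):
            sums[label] += row[self.feature_index]
            counts[label] += 1
        self.centroids = {label: sums[label] / counts[label] for label in counts}
        return self

    def predict(self, row):
        value = row[self.feature_index]
        return min(self.centroids, key=lambda label: abs(value - self.centroids[label]))


class LookupMetaLearner:
    """Toy meta-learner: most frequent true label per pattern of base predictions."""

    def fit(self, patterns, labels):
        votes = defaultdict(Counter)
        for pattern, label in zip(patterns, labels):
            votes[pattern][label] += 1
        self.table = {p: c.most_common(1)[0][0] for p, c in votes.items()}
        return self

    def predict(self, pattern):
        # Unseen pattern: fall back to a majority vote over the base predictions.
        return self.table.get(pattern, Counter(pattern).most_common(1)[0][0])


def fit_stack(rows, labels, make_base_learners, n_folds=3):
    """Fit base learners and a meta-learner trained on out-of-fold predictions."""
    n = len(rows)
    out_of_fold = [None] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        train_rows = [rows[i] for i in train]
        train_labels = [labels[i] for i in train]
        learners = [l.fit(train_rows, train_labels) for l in make_base_learners()]
        for i in held_out:
            out_of_fold[i] = tuple(l.predict(rows[i]) for l in learners)
    meta = LookupMetaLearner().fit(out_of_fold, labels)
    # Refit the base learners on all data for use at prediction time.
    base = [l.fit(rows, labels) for l in make_base_learners()]
    return base, meta


def predict_stack(base, meta, row):
    return meta.predict(tuple(l.predict(row) for l in base))


# Made-up (fasting glucose mg/dL, HbA1c %) pairs, alternating between classes.
rows = [(150, 8.0), (90, 5.0), (140, 7.5), (95, 5.2), (160, 9.0), (100, 5.5),
        (135, 7.0), (85, 4.9), (155, 8.2), (92, 5.1), (145, 7.8), (98, 5.4)]
labels = ["diabetic", "non-diabetic"] * 6

base, meta = fit_stack(rows, labels, lambda: [CentroidLearner(0), CentroidLearner(1)])
print(predict_stack(base, meta, (148, 7.9)))
```

Training the meta-learner on out-of-fold rather than in-sample predictions keeps it from simply rewarding base learners that overfit the training data.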
Copyright © 2017 Elsevier B.V. All rights reserved.

Keywords:  Diabetes Mellitus; Electronic health records; Electronic phenotyping algorithms; Stacked generalization

Year:  2017        PMID: 29054261     DOI: 10.1016/j.cmpb.2017.09.009

Source DB:  PubMed          Journal:  Comput Methods Programs Biomed        ISSN: 0169-2607            Impact factor:   5.428


