Development and validation of various phenotyping algorithms for Diabetes Mellitus using data from electronic health records.

Santiago Esteban¹, Manuel Rodríguez Tablado², Francisco E Peper², Yamila S Mahumud², Ricardo I Ricci², Karin S Kopitowski³, Sergio A Terrasa⁴.

Abstract

BACKGROUND AND OBJECTIVE: Recent progress towards precision medicine has encouraged the use of electronic health records (EHRs) as a source of the large amounts of data required to study the effect of treatments or risk factors in more specific subpopulations. Phenotyping algorithms automatically classify patients according to their particular electronic phenotype, thus facilitating the setup of retrospective cohorts. Our objective is to compare the performance of different classification strategies (standardized problems only, rule-based algorithms, statistical learning algorithms (six learners) and stacked generalization (five versions)) for the categorization of patients according to their diabetic status (diabetic, non-diabetic and inconclusive; diabetes of any type) using information extracted from EHRs.
METHODS: Patient information was extracted from the EHR at Hospital Italiano de Buenos Aires, Buenos Aires, Argentina. For the derivation and validation datasets, two probabilistic samples of patients from different years (2005: n = 1663; 2015: n = 800) were extracted. The only inclusion criterion was age (≥40 & <80 years). Four researchers manually reviewed all records and classified patients according to their diabetic status (diabetic: diabetes registered as a health problem or fulfilling the ADA criteria; non-diabetic: not fulfilling the ADA criteria and having at least one fasting glycemia below 126 mg/dL; inconclusive: no data regarding their diabetic status or only one abnormal value). The best performing algorithms within each strategy were tested on the validation set.
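The manual labelling criteria described in the methods can be read as a small decision rule. The following Python sketch is only an illustration of that reading, not the authors' procedure. The `PatientRecord` layout and the `classify` function are hypothetical. "Fulfilling the ADA criteria" is simplified here to two or more fasting glucose values at or above 126 mg/dL, which omits the other ADA criteria. A single abnormal value is treated as inconclusive even when normal values are also present, which is one possible reading of the abstract.

```python
from dataclasses import dataclass, field
from typing import List

FASTING_THRESHOLD_MG_DL = 126  # fasting glycemia threshold quoted in the abstract


@dataclass
class PatientRecord:
    # Hypothetical record layout; the study's actual EHR schema is not given.
    has_diabetes_problem: bool = False
    fasting_glucose_mg_dl: List[float] = field(default_factory=list)


def classify(record: PatientRecord) -> str:
    """Return 'diabetic', 'non-diabetic' or 'inconclusive' for one record."""
    values = record.fasting_glucose_mg_dl
    abnormal = [v for v in values if v >= FASTING_THRESHOLD_MG_DL]
    normal = [v for v in values if v < FASTING_THRESHOLD_MG_DL]

    # ASSUMPTION: the ADA criteria are reduced to "two or more abnormal
    # fasting values"; the real criteria include further tests.
    if record.has_diabetes_problem or len(abnormal) >= 2:
        return "diabetic"
    # Non-diabetic: no abnormal value and at least one normal fasting value.
    if not abnormal and normal:
        return "non-diabetic"
    # No data at all, or exactly one abnormal value.
    return "inconclusive"


# Toy records, invented for illustration only.
labels = [
    classify(PatientRecord(has_diabetes_problem=True)),
    classify(PatientRecord(fasting_glucose_mg_dl=[131.0, 140.0])),
    classify(PatientRecord(fasting_glucose_mg_dl=[95.0])),
    classify(PatientRecord()),
    classify(PatientRecord(fasting_glucose_mg_dl=[130.0, 95.0])),
]
```

A rule of this kind is easy to audit, but it depends entirely on how the criteria are encoded, so any real implementation would need the full definitions used by the reviewers.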
RESULTS: The standardized codes algorithm achieved a Kappa coefficient of 0.59 (95% CI 0.49, 0.59) in the validation set. The Boolean logic algorithm reached 0.82 (95% CI 0.76, 0.88). The feedforward neural network achieved a higher value of 0.9 (95% CI 0.85, 0.94). The best performing learner was the stacked generalization meta-learner, which reached a Kappa coefficient of 0.95 (95% CI 0.91, 0.98).
CONCLUSIONS: The stacked generalization strategy and the feedforward neural network showed the best classification metrics in the validation set. The implementation of these algorithms enables the exploitation of the data of thousands of patients accurately.
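The results above are reported as Kappa coefficients, which measure agreement between an algorithm's labels and the manual review after discounting agreement expected by chance. The sketch below computes unweighted Cohen's kappa, assuming that is the variant meant (the abstract does not say). The function name `cohen_kappa` and the toy labels are made up for illustration and are not data from the study.

```python
from collections import Counter


def cohen_kappa(reference, predicted):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(reference)

    # Observed agreement: share of patients given the same label by both.
    p_observed = sum(r == p for r, p in zip(reference, predicted)) / n

    # Chance agreement: sum over labels of the product of marginal shares.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    p_chance = sum(ref_counts[lab] * pred_counts[lab] for lab in ref_counts) / (n * n)

    if p_chance == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Toy labels, invented for illustration only.
manual = ["diabetic", "diabetic", "non-diabetic",
          "non-diabetic", "inconclusive", "non-diabetic"]
algorithm = ["diabetic", "non-diabetic", "non-diabetic",
             "non-diabetic", "inconclusive", "non-diabetic"]

kappa = cohen_kappa(manual, algorithm)
print(f"kappa = {kappa:.3f}")
```

In this toy case the two label sets agree on five of six patients (observed agreement about 0.83), yet kappa is lower because part of that agreement would be expected by chance given how often each label occurs.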

Keywords: Diabetes Mellitus; Electronic health records; Electronic phenotyping algorithms; Stacked generalization

Year: 2017 PMID: 29054261 DOI: 10.1016/j.cmpb.2017.09.009

Source DB: PubMed Journal: Comput Methods Programs Biomed ISSN: 0169-2607 Impact factor: 5.428

Cited

5 in total

1. A multi-class classification model for supporting the diagnosis of type II diabetes mellitus.

Authors: Kuang-Ming Kuo; Paul Talley; YuHsi Kao; Chi Hsien Huang
Journal: PeerJ Date: 2020-09-10 Impact factor: 2.984

2. Phenotype Inference with Semi-Supervised Mixed Membership Models.

Authors: Victor A Rodriguez; Adler Perotte
Journal: Proc Mach Learn Res Date: 2019-08

3. Performance evaluation of case definitions of type 1 diabetes for health insurance claims data in Japan.

Authors: Tasuku Okui; Chinatsu Nojiri; Shinichiro Kimura; Kentaro Abe; Sayaka Maeno; Masae Minami; Yasutaka Maeda; Naoko Tajima; Tomoyuki Kawamura; Naoki Nakashima
Journal: BMC Med Inform Decis Mak Date: 2021-02-11 Impact factor: 2.796

4. Diabetes and the direct secondary use of electronic health records: Using routinely collected and stored data to drive research and understanding. (Review)

Authors: Tim Robbins; Sarah N Lim Choi Keung; Sailesh Sankar; Harpal Randeva; Theodoros N Arvanitis
Journal: Digit Health Date: 2018-10-08

5. Automated Phenotyping Tool for Identifying Developmental Language Disorder Cases in Health Systems Data (APT-DLD): A New Research Algorithm for Deployment in Large-Scale Electronic Health Record Systems.

Authors: Courtney E Walters; Rachana Nitin; Katherine Margulis; Olivia Boorom; Daniel E Gustavson; Catherine T Bush; Lea K Davis; Jennifer E Below; Nancy J Cox; Stephen M Camarata; Reyna L Gordon
Journal: J Speech Lang Hear Res Date: 2020-08-11 Impact factor: 2.297
