Kushan De Silva (1), Wai Kit Lee (2), Andrew Forbes (3), Ryan T Demmer (4), Christopher Barton (5), Joanne Enticott (2).
(1) Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, Victoria, Australia. Electronic address: kushan.ranakombu@monash.edu.
(2) Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, Victoria, Australia.
(3) Biostatistics Unit, Division of Research Methodology, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Melbourne, Victoria, Australia.
(4) Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA; Mailman School of Public Health, Columbia University, New York, USA.
(5) Department of General Practice, School of Primary and Allied Health Care, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Notting Hill, Victoria, Australia.
Abstract
OBJECTIVE: We aimed to identify machine learning (ML) models for type 2 diabetes (T2DM) prediction in community settings and determine their predictive performance.
METHOD: A systematic review of ML predictive modelling studies since 2009 was conducted across 13 databases. Primary outcomes were metrics of discrimination, calibration, and classification. Secondary outcomes were important variables, level of validation, and intended use of the models. Meta-analysis of c-indices, subgroup analyses, meta-regression, publication bias assessments, and sensitivity analyses were conducted.
RESULTS: Twenty-three studies (40 prediction models) were included. Three, 14, and 6 studies had high, moderate, and low risk of bias, respectively. All studies conducted internal validation, whereas none conducted external validation of their models. Twenty studies provided classification metrics to varying extents, whereas only 7 studies performed model calibration. Eighteen studies reported both the variables used for model development and their feature importance. Twelve studies highlighted the potential applicability of their models for T2DM screening. Meta-analysis produced a good pooled c-index (0.812). Sources of heterogeneity were identified through subgroup analyses and meta-regression. Issues pertaining to methodological quality and reporting were observed.
CONCLUSIONS: We found evidence of good performance of ML models for T2DM prediction in the community. Improvements to methodology, reporting, and validation are needed before they can be used at scale.
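To illustrate what pooling c-indices in a meta-analysis can involve, the following is a minimal sketch of one common approach: DerSimonian-Laird random-effects pooling on the logit scale. The abstract does not state which estimator or transformation the authors used, so this is an assumption chosen for illustration only. The function name `pool_c_indices` and the input values are hypothetical and are not data from the review.

```python
import math


def logit(p):
    """Map a c-index in (0, 1) to the unbounded logit scale."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a logit-scale value back to the (0, 1) scale."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_c_indices(c_indices, standard_errors):
    """Pool c-indices with a DerSimonian-Laird random-effects model.

    Assumption: pooling is done on the logit scale, with the standard
    error of logit(c) approximated by the delta method.
    """
    k = len(c_indices)
    y = [logit(c) for c in c_indices]
    # Delta method: SE(logit(c)) is approximately SE(c) / (c * (1 - c)).
    v = [(se / (c * (1.0 - c))) ** 2
         for c, se in zip(c_indices, standard_errors)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # Between-study variance (tau^2), truncated at zero.
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / scale) if scale > 0 else 0.0

    # Random-effects weights and pooled estimate.
    w_re = [1.0 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": inv_logit(y_re),
        "ci_low": inv_logit(y_re - 1.96 * se_re),
        "ci_high": inv_logit(y_re + 1.96 * se_re),
        "tau2": tau2,
        "i2_percent": i2,
    }


# Hypothetical c-indices and standard errors, for illustration only.
example_c = [0.78, 0.85, 0.80, 0.74, 0.88]
example_se = [0.020, 0.015, 0.030, 0.025, 0.010]
result = pool_c_indices(example_c, example_se)
print(result)
```

Pooling on the logit scale keeps the confidence interval inside (0, 1) after back-transformation, which is one reason this transformation is often preferred over pooling raw c-indices. A large I² would point to the kind of heterogeneity that subgroup analyses and meta-regression are meant to explain.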
Authors: Georgios Baskozos; Andreas C Themistocleous; Harry L Hebert; Mathilde M V Pascal; Jishi John; Brian C Callaghan; Helen Laycock; Yelena Granovsky; Geert Crombez; David Yarnitsky; Andrew S C Rice; Blair H Smith; David L H Bennett Journal: BMC Med Inform Decis Mak Date: 2022-05-29 Impact factor: 3.298
Authors: Kushan De Silva; Siew Lim; Aya Mousa; Helena Teede; Andrew Forbes; Ryan T Demmer; Daniel Jönsson; Joanne Enticott Journal: PLoS One Date: 2021-05-05 Impact factor: 3.240
Authors: Luis Fregoso-Aparicio; Julieta Noguez; Luis Montesinos; José A García-García Journal: Diabetol Metab Syndr Date: 2021-12-20 Impact factor: 3.320
Authors: Paula Dhiman; Jie Ma; Constanza L Andaur Navarro; Benjamin Speich; Garrett Bullock; Johanna A A Damen; Lotty Hooft; Shona Kirtley; Richard D Riley; Ben Van Calster; Karel G M Moons; Gary S Collins Journal: Diagn Progn Res Date: 2022-07-07