| Literature DB >> 34930452 |
Luis Fregoso-Aparicio, Julieta Noguez, Luis Montesinos, José A. García-García.
Abstract
Diabetes Mellitus is a severe, chronic disease that occurs when blood glucose levels rise above certain limits. In recent years, machine and deep learning techniques have been used to predict diabetes and its complications. However, researchers and developers still face two main challenges when building type 2 diabetes predictive models. First, there is considerable heterogeneity in previous studies regarding the techniques used, making it challenging to identify the optimal one. Second, there is a lack of transparency about the features used in the models, which reduces their interpretability. This systematic review aimed to provide answers to the above challenges. The review followed the PRISMA methodology, enriched with the one proposed by Keele and Durham Universities. Ninety studies were included, and the type of model, complementary techniques, dataset, and performance parameters reported were extracted. Eighteen different types of models were compared, with tree-based algorithms showing top performances. Deep Neural Networks proved suboptimal, despite their ability to deal with large and dirty data. Balancing data and feature selection techniques proved helpful in increasing the models' efficiency. Models trained on tidy datasets achieved almost-perfect performance.
Keywords: Deep learning; Diabetes; Electronic health records; Machine learning; Review
Year: 2021 PMID: 34930452 PMCID: PMC8686642 DOI: 10.1186/s13098-021-00767-9
Source DB: PubMed Journal: Diabetol Metab Syndr ISSN: 1758-5996 Impact factor: 3.320
Strings used in the search
| Database | Search string |
|---|---|
| PubMed | ((diabetes[Title] AND predictive) AND machine learning) |
| Web of Science | TI=(diabetes) AND ALL=(predictive AND machine learning) |
Fig. 1 Flow diagram indicating the results of the systematic review with inclusions and exclusions
Fig. 2 Percentage of each subgroup in the quality assessment. The criterion does not apply to two results for Quality Assessment Questions 1 and 3
Detailed classification of methods that predict the main factors for diagnosing the onset of diabetes
| Cite | References | Machine learning model | Validation parameter | Data sampling | Complementary techniques | Description of the population |
|---|---|---|---|---|---|---|
| [ | Arellano-Campos et al. (2019) | Cox proportional hazard regression | Accuracy: 0.75 hazard ratios | Cross-validation (k = 10) and bootstrapping | Beta-coefficients model | Baseline: 7636 follow-up: 6144 diabetes: 331 age: 32–54 |
| [ | You et al. (2019) | Super learning: ensemble learner by choosing a weighted combination of algorithms | Average treatment effect | Cross-validation | Targeted learning query language logistic and tree regression | Total: 78,894 control: 41,127 diabetes: 37,767 age: > 40 |
| [ | Maxwell et al. (2017) | Deep Neural Network with sigmoid activation and cross-entropy loss | Accuracy: 0.921 F1-score: 0.823 precision: 0.915 sensitivity: 0.867 | Training set (90%) test set (10%) tenfold cross-validation | RAkEL-LibSVM RAkEL-MLP RAkEL-SMO RAkEL-J48 RAkEL-RF MLkNN | Total: 110,300 imbalanced 6 disease categories |
| [ | Nguyen et al. (2019) | Deep Neural Network with three embedding and two hidden layers | Specificity 0.96 accuracy: 0.84 sensitivity: 0.31 AUC (ROC): 0.84 | Training set (70%): cross-validation 9:1 test set (30%) | Generalized linear model large-scale regression | Total: 76,214 78 diseases age: 25–78 |
| [ | Pham et al. (2017) | Recurrent Neural Network Convolutional-Long Short-Term Memory (C-LSTM) | F1-score: 0.79 precision: 0.66 | Training set (66%) tuning set (17%) test set (17%) | Support vector machine and random forests | Diabetes: 12,000 age: 18–100 mean age: 73 |
| [ | Spänig et al. (2019) | Deep Neural Networks with hyperbolic tangent (tanh) activation | AUC (ROC) = 0.71 AUC (ROC) = 0.68 | Training set (80%) test set (20%) | Sub-sampling approach, support vector machine with RBF kernel | Total: 4814 diabetes: 646 diagnosis: 397 not diag: 257 age: 45–75 imbalanced |
| [ | Wang et al. (2020) | Convolutional neural network and bidirectional long short-term memory | Precision: 92.3 recall: 90.5 F score: 91.3 accuracy: 92.8 | Training set (70%) validation set (10%) test set (20%) | SVM-TFIDF CNN BiLSTM | Total: 18,625 diabetes: 5645 10 disease categories |
| [ | Kim et al. (2020) | Class activation map and CNN (SSANet) | R2 = 0.75 MAE = 3.55 AUC (ROC) = 0.77 | Training set (89%) validation set (1%) test set (10%) | Linear regression | Total: 412,026 norm: 243,668 diabetes: 14,189 age: 19–90 |
| [ | Bernardini et al. (2020) | Sparse balanced support vector machine (SB-SVM) | Recall = 0.7464 AUC (ROC) = 0.8143 | Tenfold cross-validation | Sparse 1-norm SVM | Total: 2433 diabetes: 225 control: 2208 age: 60–80 imbalanced |
| [ | Mei et al. (2017) | Hierarchical recurrent neural network | AUC (ROC) = 0.9268 Accuracy = 0.6745 | Training set (80%) validation set (10%) test set (10%) | Linear regression | Total: 620,633 |
| [ | Prabhu et al. (2019) | Deep belief neural network | Recall: 1.0 precision: 0.68 F1 score: 0.80 | Training set validation set test set | Principal component analysis | Pima Indian Women Diabetes Dataset |
| [ | Bernardini et al. (2020) | Multiple instance learning boosting | Accuracy: 0.83 F1-score: 0.81 precision: 0.82 recall: 0.83 AUC (ROC): 0.89 | Tenfold cross-validation | None | Total: 252 diabetes: 252 age: 54–72 |
| [ | Solares et al. (2019) | Hazard ratios using Cox regression | AUC (ROC): 0.75, concordance (C-statistic) | Derivation set (80%) validation (20%) | None | Total: 80,964 diabetes: 2267 age: 50 |
| [ | Kumar et al. (2017) | Support vector machine, Naive Bayes, K-nearest neighbor C4.5 decision tree | Precision: 0.65, 0.68, 0.7, 0.72 recall: 0.69, 0.68, 0.7, 0.74 accuracy: 0.69, 0.67, 0.7, 0.74 F-score: 0.65, 0.68, 0.7, 0.72 | N-fold (N = 10) cross validation | None | Diabetes: 200 age: 1–100 |
| [ | Olivera et al. (2017) | Logistic regression artificial neural network K-nearest neighbor Naïve Bayes | AUC (ROC): 75.44, 75.48, 74.94, 74.47 balanced accuracy: 69.3, 69.47, 68.74, 68.95 | Training set (70%) test set (30%) tenfold cross-validation | Forward selection | Diabetes: 12,447 unknown: 1359 age: 35–74 |
| [ | Alghamdi et al. (2017) | Naïve Bayes tree, random forest, logistic model tree, J48 decision tree | Kappa: 1.34, 3.63, 1.37, 0.70, 1.14 recall (%): 99.2, 99.2, 90.8, 99.9, 99.4 specificity (%): 1.6, 3.1, 21.2, 0.50, 1.3 accuracy (%): 83.9, 84.1, 79.9, 84.3, 84.1 | N-fold cross-validation | Multiple linear regression, gain ranking method, synthetic minority oversampling technique | Total: 32,555 diabetes: 5099 imbalanced |
| [ | Xie et al. (2017) | K2 structure-learning algorithm | Accuracy = 82.48 | Training set (75%) test set (25%) | None | Total: 21,285 diabetes: 1124 age: 35–65 |
| [ | Peddinti et al. (2017) | Regularised least-squares regression for binary risk classification | Odds ratio accuracy: 0.77 | Tenfold cross-validation | Logistic regression | Total: 543 diabetes: 146 age: 48–50 |
| [ | Maniruzzaman et al. (2017) | Linear discriminant analysis, quadratic discriminant analysis, Naïve Bayes, Gaussian process classification, support vector machine, artificial neural network, Adaboost, logistic regression, decision tree, random forest | Accuracy: 0.92 sensitivity: 0.96 specificity: 0.80 PPV: 0.91 NPV: 0.91 AUC (ROC): 0.93 | Cross-validation K2, K4, K5, K10, and JK | Random forest, logistic regression, mutual information, principal component analysis, analysis of variance Fisher discriminant ratio | Pima Indian diabetic dataset |
| [ | Dutta et al. (2018) | Logistic regression support vector machine random forest | Sensitivity: 0.80, 0.75, 0.84 F1-score: 0.80, 0.79, 0.84 | Training set (67%) test set (33%) | None | Diabetes: 130 control: 262 imbalanced age: 21–81 |
| [ | Alhassan et al. (2018) | Long short-term memory deep learning gated-recurrent unit deep learning | Accuracy: 0.97 F1-score: 0.96 | Training set (90%) test set (10%) tenfold cross-validation | Logistic regression, support vector machine, multi-layer perceptron | Total: 41,000,000 imbalanced diabetes: 62% |
| [ | Hertroijs et al. (2018) | Latent growth mixture modelling | Specificity: 81.2% sensitivity: 78.4% accuracy: 92.3% | Training set (90%) test set (10%) fivefold cross-validation | K-nearest neighbour | Total: 105,814 age: > 18 |
| [ | Kuo et al. (2020) | Random forest C5.0 support vector machine | Accuracy: 1 F1-score: 1 AUC (ROC): 1 sensitivity: 1 | Tenfold cross-validation | Information gain (features) gain ratio | Total: 149 diabetes: 149 age: 21–91 |
| [ | Pimentel et al. (2018) | Naïve Bayes, alternating decision tree, random forest, random tree, k-nearest neighbor, support vector machine | Specificity: 0.76, 0.88, 0.87, 0.97, 0.82, 0.85 sensitivity: 0.62, 0.50, 0.33, 0.42, 0.40, 0.59 AUC (ROC): 0.73, 0.81, 0.87, 0.74, 0.62, 0.63 | Training set (70%) test set (30%) tenfold cross-validation | SMOTE | Total: 9947 imbalanced diabetes: 13% age: 21–93 |
| [ | Talaei-Khoei et al. (2018) | Artificial neural network, support vector machine, logistic regression, decision tree | AUC (ROC): 0.614, 0.831, 0.738, 0.793 sensitivity: 0.608, 0.683, 0.677, 0.687 specificity: 0.783, 0.950, 0.712, 0.651 MCC: 0.797, 0.922, 0.581, 0.120 MCE: 0.844, 0.989, 0.771, 0.507 | Oversampling technique, random undersampling | Synthetic minority oversampling, LASSO, AIC and BIC | Total: 10,911 imbalanced diabetes: 51.9% |
| [ | Perveen et al. (2019) | J48 decision tree, Naïve Bayes | TPR: 0.85, 0.782, 0.852, 0.774 FPR: 0.218, 0.15, 0.226, 0.148 precision: 0.814, 0.782, 0.807 recall: 0.85, 0.802, 0.852, 0.824 F-measure: 0.831, 0.634, 0.829, 0.774 MCC: 0.634, 0.823, 0.628, 0.798 AUC (ROC): 0.883, 0.873, 0.836, 0.826 | K-medoids undersampling | Logistic regression | Total: 667,907 age: 22–74 diabetes: 8.13% imbalanced |
| [ | Yuvaraj et al. (2019) | Decision tree Naïve Bayes random forest | Precision: 87, 91, 94 recall: 77, 82, 88 F-measure: 82, 86, 91 accuracy: 88, 91, 94 | Training set (70%) test set (30%) | Information gain RHadoop | Total: 75,664 |
| [ | Deo et al. (2019) | Bagged trees, linear support vector machine | Accuracy: 91% AUC (ROC): 0.908 | Training set (70%) test set (30%) fivefold cross-validation, holdout validation | Synthetic minority oversampling technique, Gower’s distance | Total: 140 diabetes: 14 imbalanced age: 12–90 |
| [ | Jakka et al. (2019) | K nearest neighbor, decision tree, Naive Bayes, support vector machine, logistic regression, random forest | Accuracy: 0.73, 0.70, 0.75, 0.66, 0.78, 0.74 recall: 0.69, 0.72, 0.74, 0.64, 0.76, 0.69 F1-score: 0.69, 0.72, 0.74, 0.40, 0.75, 0.69 misclassification rate: 0.31, 0.29, 0.26, 0.36, 0.24, 0.29 AUC (ROC): 0.70, 0.69, 0.70, 0.61, 0.74, 0.70 | None | None | Pima Indians Diabetes dataset |
| [ | Radja et al. (2019) | Naive Bayes, support vector machine, decision table, J48 decision tree | Precision: 0.80, 0.79, 0.76, 0.79 precision: 0.68, 0.74, 0.60, 0.63 recall: 0.84, 0.90, 0.81, 0.81 recall: 0.61, 0.54, 0.53, 0.60 F1-score: 0.76, 0.76, 0.71, 0.74 | Tenfold cross-validation | None | Total: 768 diabetes: 500 control: 268 |
| [ | Choi et al. (2019) | Logistic regression, linear discriminant analysis, quadratic discriminant analysis, K-nearest neighbor | AUC (ROC): 0.78, 0.77, 0.76, 0.77 | Tenfold cross-validation | Information gain | Total: 8454 diabetes: 404 age: 40–72 |
| [ | Akula et al. (2019) | K nearest neighbor, support vector machine, decision tree, random forest, gradient boosting, neural network, Naive Bayes | Overall accuracy: 0.86 precision: 0.24 negative prediction: 0.99 sensitivity: 0.88 specificity: 0.85 F1-score: 0.38 | Training set: 800 test set: 10,000 | None | Pima Indians Diabetes Dataset Practice Fusion Dataset total: 10,000 age: 18–80 |
| [ | Xie et al. (2019) | Support vector machine, decision tree, logistic regression, random forest, neural network, Naive Bayes | Accuracy: 0.81, 0.74, 0.81, 0.79, 0.82, 0.78 sensitivity: 0.43, 0.52, 0.46, 0.50, 0.37, 0.48 specificity: 0.87, 0.78, 0.87, 0.84, 0.90, 0.82 AUC (ROC): 0.78, 0.72, 0.79, 0.76, 0.80, 0.76 | Training set (67%) test set (33%) | Odds ratio, synthetic minority over-sampling technique | Total: 138,146 diabetes: 20,467 age: 30–80 |
| [ | Lai et al. (2019) | Gradient boosting machine, logistic regression, random forest, Rpart | AUC (ROC): 84.7%, 84.0%, 83.4%, 78.2% | Training set (80%) test set (20%) tenfold cross-validation | Misclassification costs | Total: 13,309 diabetes: 20.9% age: 18–90 imbalanced |
| [ | Brisimi et al. (2018) | Alternating clustering and classification | AUC (ROC): 0.8814, 0.8861, 0.8829, 0.8812 | Training set (40%) test set (60%) | Sparse (l1-regularized) support vector machines, random forests, gradient tree boosting | Diabetes: 47,452 control: 116,934 age mean: 66 |
| [ | Abbas et al. (2019) | Support vector machine with Gaussian radial basis | Accuracy: 96.80% sensitivity: 80.09% | Tenfold cross-validation | Minimum redundancy maximum relevance algorithm | Total: 1438 diabetes: 161 age: 25–64 |
| [ | Sarker et al. (2020) | K-nearest neighbors | Precision: 0.75 recall: 0.76 F-score: 0.75 AUC (ROC): 0.72 | Tenfold cross validation | Adaptive boosting, logistic regression, Naive Bayes, support vector machine decision tree | Total: 500 age: 10–80 |
| [ | Cahn et al. (2020) | Gradient boosting trees model | AUC (ROC): 0.87 sensitivity: 0.61 specificity: 0.91 PPV: 0.16 | Training set: THIN dataset validation set: AppleTree dataset MHS dataset | Logistic regression | Age: 40–80 THIN: total = 3,068,319 pre-DM: 40% DM: 2.9% AppleTree: pre-DM: 381,872 DM: 2.3% MHS: pre-DM: 12,951 DM: 2.7% |
| [ | Garcia-Carretero et al. (2020) | K-nearest neighbors | Accuracy: 0.977 sensitivity 0.998 specificity 0.838 PPV: 0.976 NPV: 0.984 AUC (ROC): 0.89 | Tenfold cross-validation | Random forest | Age: 44–72 pre-DM = 1647 diabetes: 13% |
| [ | Zhang et al. (2020) | Logistic regression, classification and regression tree, gradient boosting machine, artificial neural networks, random forest, support vector machine | AUC (ROC): 0.84, 0.81, 0.87, 0.85, 0.87, 0.84 accuracy: 0.75, 0.80, 0.81, 0.74, 0.86, 0.76 sensitivity: 0.79, 0.67, 0.76, 0.81, 0.80, 0.75 specificity: 0.75, 0.81, 0.82, 0.73, 0.78, 0.77 PPV: 0.23, 0.26, 0.29, 0.26, 0.26, 0.24 NPV: 0.97, 0.96, 0.97, 0.98, 0.98, 0.97 | Tenfold cross-validation | Synthetic minority over-sampling technique | Total: 36,652 age: 18–79 |
| [ | Albahli et al. (2020) | Logistic regression | Accuracy: 0.97 | Tenfold cross-validation | Random forest, eXtreme Gradient Boosting | Pima Indians Diabetes dataset age: 21–81 |
| [ | Haq et al. (2020) | Decision tree (iterative Dichotomiser 3) | Accuracy: 0.99 sensitivity 1 specificity 0.98 MCC: 0.99 F1-score: 1 AUC (ROC): 0.998 | Training set (70%) test set (30%) hold out training set (90%) test set (10%) tenfold cross-validation | Ada Boost, random forest | Total = 2000 diabetes: 684 age: 21–81 |
| [ | Yang et al. (2020) | Linear discriminant analysis, support vector machine, random forest | AUC: 0.85, 0.84, 0.83 sensitivity: 0.80, 0.79, 0.78 specificity: 0.74, 0.75, 0.73 accuracy: 0.75, 0.74, 0.74 PPV: 0.36, 0.36, 0.35 | Training set (80%, 2011–2014), test set (20%, 2011–2014) and validation set (2015–2016) fivefold cross-validation | Binary logistic regression | Total = 8057 age: 20–89 imbalanced |
| [ | Ahn et al. (2020) | Random forest, support vector machine | AUC (ROC): 1.00, 0.95 | Tenfold cross-validation | ELISA | Age: 43–68 |
| [ | Sarwar et al. (2018) | K nearest neighbors, Naive Bayes, support vector machine, decision tree, logistic regression, random forest | Accuracy: 0.77, 0.74, 0.77, 0.71, 0.74, 0.71 | Training set (70%) test set (30%) tenfold cross-validation | None | Pima Indians Diabetes Dataset |
| [ | Zou et al. (2018) | Random forest J48 decision tree Deep Neural Network | Accuracy: 0.81, 0.79, 0.78 sensitivity: 0.85, 0.82, 0.82 specificity: 0.77, 0.76, 0.75 MCC: 0.62, 0.57, 0.57 | Fivefold cross-validation | Principal component analysis, minimum redundancy maximum relevance | Pima Indians and Luzhou datasets |
| [ | Farran et al. (2019) | Logistic regression k-nearest neighbours support vector machine | AUC (ROC): 3-year: 0.74, 0.83, 0.73 5-year: 0.72, 0.82, 0.68 7-year: 0.70, 0.79, 0.71 | Fivefold cross-validation | None | Diabetes: 40,773 control: 107,821 age: 13–65 |
| [ | Xiong et al. (2019) | Multilayer perceptron, AdaBoost, random forest, support vector machine, gradient boosting | Accuracy: 0.87, 0.86, 0.86, 0.86, 0.86 | Training set (60%) test set (20%) tenfold cross-validation set (20%) | Missing values imputed with feature mean | Total: 11,845 diabetes: 845 age: 20–100 |
| [ | Dinh et al. (2019) | Support vector machine, random forest, gradient boosting, logistic regression | AUC (ROC): 0.89, 0.94, 0.96, 0.72 sensitivity: 0.81, 0.86, 0.89, 0.67 precision: 0.81, 0.86, 0.89, 0.67 F1-score: 0.81, 0.86, 0.89, 0.67 | Training set (80%) test set (20%) tenfold cross-validation | None | Case 1: 21,131 diabetes: 5532 case 2: 16,426 prediabetes: 6482 |
| [ | Liu et al. (2019) | LASSO, SCAD, MCP, stepwise regression | AUC (ROC): 0.71, 0.70, 0.70, 0.71 sensitivity: 0.64, 0.64, 0.64, 0.63 specificity: 0.68, 0.68, 0.68, 0.68 precision: 0.35, 0.35, 0.35, 0.35 NPV: 0.87, 0.87, 0.87, 0.87 | Training set (70%) test set (30%) tenfold cross-validation | None | Total: 5481 age: > 40 |
| [ | Muhammad et al. (2020) | Logistic regression support vector machine K-nearest neighbor random forest Naive Bayes gradient boosting | Accuracy: 0.81, 0.85, 0.82, 0.89, 0.77, 0.86 AUC (ROC): 0.80, 0.85, 0.82, 0.86, 0.77, 0.86 | None | Correlation coefficient analysis | Total: 383 age: 1–150 diabetes: 51.9% |
| [ | Tang et al. (2020) | EMR-image multimodal network (CNN) | Accuracy: 0.86 F1-score: 0.76 AUC (ROC): 0.89 Sensitivity: 0.68 Precision: 0.88 | Fivefold cross-validation | None | Total: 997 diabetes: 401 |
| [ | Maniruzzaman et al. (2021) | Naive Bayes decision tree Adaboost random forest | Accuracy: 0.87, 0.90, 0.91, 0.93 AUC (ROC): 0.82, 0.78, 0.90, 0.95 | Tenfold cross-validation | Logistic regression | Total: 6561 diabetes: 657 age: 30–64 imbalanced |
| [ | Boutilier et al. (2021) | Random forest logistic regression Adaboost K-nearest neighbors decision trees | AUC (ROC): 0.91, 0.91, 0.90, 0.86, 0.78 | Tenfold cross-validation | 2-Sided Wilcoxon signed rank test | Total: 2278 diabetes: 833 age: 35–63 |
| [ | Li et al. (2021) | Extreme gradient boosting (GBT) | AUC (ROC): 0.91 precision: 0.82 sensitivity: 0.80 F1-score: 0.77 | Training set (60%) validation (20%) test set (20%) | Genetic algorithm | Diabetics: 570 control: 570 prediabetics: 570 age: 33–68 |
| [ | Lam et al. (2021) | Random forest logistic regression extreme gradient boosting GBT | AUC (ROC): 0.86 F1-score: 0.82 | Tenfold cross-validation | None | Control: 19,852 diabetes: 3103 age: 40–69 |
| [ | Deberneh et al. (2021) | Random forest support vector machine XGBoost | Accuracy: 0.73, 0.73, 0.72 precision: 0.74, 0.74, 0.74 F1-score: 0.74, 0.74, 0.73 sensitivity: 0.73, 0.74, 0.72 Kappa: 0.60, 0.60, 0.58 MCC: 0.60, 0.60, 0.58 | Tenfold cross-validation | ANOVA, Chi-squared, SMOTE feature Importance | Total: 535,169 diabetes: 4.3% prediabetes: 36% age: 18–108 |
| [ | He et al. (2021) | Cox regression | C-statistic: 0.762 | Hold out | None | Total: 68,299 diabetes: 1281 age: 40–69 |
| [ | García-Ordás et al. (2021) | Convolutional neural network (CNN) | Accuracy: 0.92 | Training set (90%) test set (10%) | Variational and sparse autoencoders | Pima Indians |
| [ | Kanimozhi et al. (2021) | Hybrid particle swarm optimization-artificial fish swarm optimization | Accuracy: 1, 0.99 specificity: 0.86, 0.83 sensitivity: 1, 0.99 MCC: 0.91, 0.92 Kappa: 0.96, 0.98 | Training set (90%) test set (10%) fivefold cross-validation | Min–max scaling, kernel extreme learning machine | Pima Indians Diabetics, Diabetic Research Center |
| [ | Ravaut et al. (2021) | Extreme gradient boosting tree | AUC (ROC): 0.84 | Training set (86%) validation (7%) test set (7%) | Mean absolute Shapley values | Total: 15,862,818 diabetes: 19,137 age: 40–69 |
| [ | De Silva et al. (2021) | Logistic regression | AUC (ROC): 0.75 accuracy: 0.62 specificity: 0.62 sensitivity: 0.77 PPV: 0.09 NPV: 0.98 | Training set (30%) validation (30%) test set (40%) | SMOTE ROSE | Total: 16,429 diabetes: 5.6% age: >20 |
| [ | Kim et al. (2021) | Deep neural network, logistic regression, decision tree | Accuracy: 0.80, 0.80, 0.71 | Fivefold cross-validation | Wald test | Total: 3889 diabetes: 746 age: 40–69 |
| [ | Vangeepuram et al. (2021) | Naive Bayes | AUC (ROC): 0.75 accuracy: 0.62 specificity: 0.62 sensitivity: 0.77 PPV: 0.09 NPV: 0.98 | Fivefold cross-validation | Friedman-Nemenyi | Total: 2858 diabetes: 828 age: 12–19 |
| [ | Recenti et al. (2021) | Random forest Ada-boost gradient boosting | Accuracy: 0.90, 0.79, 0.86 precision: 0.88, 0.78, 0.84 F1-score: 0.90, 0.81, 0.87 sensitivity: 0.93, 0.84, 0.90 specificity: 0.87, 0.76, 0.82 AUC (ROC): 0.97, 0.90, 0.95 | Tenfold cross-validation | SMOTE | Total: 2943 age: 66–98 imbalance |
| [ | Ramesh et al. (2021) | Support vector machine | Accuracy: 0.83 specificity: 0.79 sensitivity: 0.87 | Tenfold cross-validation | MICE LASSO | Pima Indians |
| [ | Lama et al. (2021) | Random forest | AUC (ROC): 0.78 | Fivefold cross-validation | SHAP TreeExplainer | Total: 3342 diabetes: 556 age: 35–54 |
| [ | Shashikant et al. (2021) | Gaussian process-based kernel | Accuracy: 0.93 precision: 0.94 F1-score: 0.95 sensitivity: 0.96 specificity: 0.82 AUC (ROC): 0.89 | Tenfold cross-validation | Non-linear HRV | Total: 135 diabetes: 100 age: 20–70 |
| [ | Kalagotla et al. (2021) | Stacking multi-layer perceptron, support vector machine logistic regression | Accuracy: 0.78 precision: 0.72 sensitivity: 0.51 F1-score: 0.60 | Hold out k-fold cross-validation | Matrix correlation | Pima Indians |
| [ | Moon et al. (2021) | Logistic regression | AUC (ROC): 0.94 | Training set (47%) validation (30%) test set (23%) | Cox regression | Total: 14,977 diabetes: 636 age: 48–69 |
| [ | Ihnaini et al. (2021) | Ensemble deep learning model | Accuracy: 0.99 precision: 1 sensitivity: 0.99 F1-score: 0.99 RMSE: 0 MAE: 0.6 | Hold out | None | Pima Indians merged with Hospital Frankfurt, Germany dataset |
| [ | Rufo et al. (2021) | LightGBM | Accuracy: 0.98 specificity: 0.96 AUC (ROC): 0.98 Sensitivity: 0.99 | Tenfold cross-validation | Min–max scale | Diabetes: 1030 Control: 1079 age: 12–90 |
| [ | Haneef et al. (2021) | Linear discriminant analysis | Accuracy: 0.67 specificity: 0.67 sensitivity: 0.62 | Training set (80%) test set (20%) | Z-score transformation random down sampling | Total 44,659 age 18–69 imbalanced |
| [ | Wei et al. (2022) | Random forest | AUC (ROC): 0.70 R2: 0.40 | Training set (70%) test set (30%) tenfold cross-validation | LASSO PCA | Total: 8501 age: 15–50 diabetes: 8.92% imbalanced |
| [ | Leerojanaprapa et al. (2019) | Bayesian network | AUC (ROC): 0.78 | Training set (70%) test set (30%) | None | Total: 11,240 diabetes: 5.53% age: 15–19 |
| [ | Subbaiah et al. (2020) | Random forest | Accuracy: 1 specificity: 1 sensitivity: 1 Kappa: 1 | Training set (70%) test set (30%) | None | Pima Indians |
| [ | Thenappan et al. (2020) | Support vector machine | Accuracy: 0.97 specificity: 0.96 sensitivity: 0.94 precision: 0.96 | Training set (70%) test set (30%) | Principal component analysis | Pima Indians |
| [ | Sneha et al. (2019) | Support vector machine, random forest, Naive Bayes, decision tree, k-nearest neighbors | Accuracy: 0.78, 0.75, 0.74, 0.73, 0.63 | Training set (70%) test set (30%) | None | Total: 2500 age: 29–70 |
| [ | Jain et al. (2020) | Support vector machine, random forest, k-nearest neighbors | Accuracy: 0.74, 0.74, 0.76 precision: 0.67, 0.72, 0.70 sensitivity: 0.52, 0.44, 0.54 F1-score: 0.58, 0.55, 0.61 AUC (ROC): 0.74, 0.83, 0.83 | Training set (70%) test set (30%) | None | Control: 500 diabetes: 268 age: 21–81 |
| [ | Syed et al. (2020) | Decision forest | F1-Score: 0.87 precision: 0.81 AUC (ROC): 0.90 Sensitivity: 0.91 | Training set (80%) test set (20%) | Pearson Chi-squared | Total: 4896 diabetes: 990 age: 40–60 |
| [ | Nuankaew et al. (2020) | Average weighted objective distance | Precision: 0.99 accuracy: 0.90 specificity: 0.97 | Training set (70%) test set (30%) | None | Mendeley data for diabetes |
| [ | Samreen et al. (2021) | Stack NB, LR, KNN, DT, SVM, RF, Ada-boost, GBT | Accuracy: 0.98, 0.99 (SVD) | Training set (70%) test set (30%) tenfold cross-validation | One hot encoding, singular value decomposition | Age: 20–90 |
| [ | Fazakis et al. (2021) | Weighted voting LR-RF | AUC (ROC): 0.88 | Hold-out | Forward/backward stepwise selection | English longitudinal study of ageing |
| [ | Omana et al. (2021) | Newton's divided difference method | Accuracy: 0.97 S-error: 0.06 | Hold-out | Non-linear autoregressive regression | Total: 812,007 diabetes: 23.49% |
| [ | Ravaut et al. (2021) | Extreme gradient boosting tree | AUC (ROC): 0.80 | Training set (87%) validation (7%) test set (6%) | Mean absolute Shapley values | Total: 14,786,763 diabetes: 27,820 age: 10–100 imbalance |
| [ | Lang et al. (2021) | Deep belief network | AUC (ROC): 0.82 sensitivity: 0.80 specificity: 0.73 | Hold-out | Stratified sampling | Total: 1778 diabetes: 279 |
| [ | Gupta et al. (2021) | Deep Neural Network | Precision: 0.90 accuracy: 0.95 sensitivity: 0.95 F1-score: 0.93 specificity: 0.95 | Hold-out | None | Pima Indians |
| [ | Roy et al. (2021) | Gradient boosting tree | Accuracy: 0.92 precision: 0.86 sensitivity: 0.87 specificity: 0.79 AUC (ROC): 0.84 | Tenfold cross-validation | Correlation matrix SMOTE | Total: 500 diabetes: 289 age: 20–80 Imbalanced |
| [ | Zhang et al. (2021) | Bagging boosting GBT, RF, GBM | Accuracy: 0.82 sensitivity: 0.85 specificity: 0.82 AUC (ROC): 0.89 | Training set (80%) test set (20%) tenfold cross-validation | SMOTE | Total: 37,730 diabetes: 9.4% age: 50–70 Imbalanced |
| [ | Turnea et al. (2018) | Decision tree | Accuracy: 0.74 sensitivity: 0.60 specificity: 0.82 RMSE: 26.1 | Training set (75%) test set (25%) | None | Pima Indians |
| [ | Vettoretti et al. (2021) | RFE-Borda | RMSE: 0.98 | None | Correlation matrix | English longitudinal study of ageing |
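Many of the tabulated studies share the same evaluation recipe: balance the minority class (often with SMOTE), train a tree-based ensemble, and report AUC (ROC) under tenfold cross-validation. A minimal sketch of that recipe, using scikit-learn on synthetic data; the dataset, class weights, and hyperparameters are illustrative and not taken from any study above, and naive random oversampling stands in for SMOTE (which lives in the separate imbalanced-learn package):

```python
# Sketch of the common recipe in the table: balance classes,
# train a tree ensemble, report cross-validated AUC (ROC).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic imbalanced "cohort" (~10% positive class), illustrative only.
X, y = make_classification(n_samples=2000, n_features=8,
                           weights=[0.9, 0.1], random_state=0)

# Naive random oversampling of the minority class until classes balance
# (a simple stand-in for SMOTE).
minority = np.flatnonzero(y == 1)
n_extra = len(y) - 2 * len(minority)
extra = np.random.default_rng(0).choice(minority, size=n_extra, replace=True)
X_bal = np.vstack([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])

# Tree-based ensemble evaluated with tenfold cross-validation.
model = RandomForestClassifier(n_estimators=200, random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
auc = cross_val_score(model, X_bal, y_bal, cv=cv, scoring="roc_auc").mean()
print(f"Mean tenfold AUC (ROC): {auc:.3f}")
```

Note one caveat relevant to interpreting the near-perfect scores in the table: oversampling before splitting into folds leaks duplicated minority samples between training and test folds and inflates AUC; applying the oversampler inside each training fold (e.g., via imbalanced-learn's pipeline) avoids this.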
Fig. 3 Scatterplot of AUC (ROC) vs. accuracy for included studies. Numbers correspond to reference numbers and dot colors to the type of model; the ideal model lies at (1, 1), i.e., both AUC (ROC) and accuracy equal to 1
Fig. 4 Average accuracy by model. For papers reporting more than one model, the best-scoring model is plotted; higher values indicate better models
Fig. 5 Average AUC (ROC) by model. For papers reporting more than one model, the best-scoring model is plotted; higher values indicate better models