| Literature DB >> 34862560 |
Abstract
Diabetes mellitus has been an increasing concern owing to its high morbidity, and the average age of individuals affected by this disease has now decreased to the mid-twenties. Given the high prevalence, it is necessary to address this problem effectively. Many researchers and doctors have now developed detection techniques based on artificial intelligence to better approach problems that are missed owing to human error. Data mining techniques have been deployed by various practitioners, with algorithms such as density-based spatial clustering of applications with noise and ordering points to identify the clustering structure, the use of machine vision systems to learn from facial image data and gain better features for model training, and diagnosis via presentation of iridocyclitis for detection of the disease through iris patterns. Machine learning classifiers such as support vector machines, logistic regression, and decision trees have been comparatively discussed by various authors. Deep learning models such as artificial neural networks and recurrent neural networks have been considered, with a primary focus on long short-term memory and convolutional neural network architectures in comparison with other machine learning models. Various parameters, such as the root-mean-square error, mean absolute error, area under the curve, and graphs with varying criteria, are commonly used. In this study, challenges pertaining to data inadequacy and model deployment are discussed. The future scope of such methods has also been discussed, and new methods are expected to enhance the performance of existing models, allowing them to attain greater insight into the conditions on which the prevalence of the disease depends.
Keywords: Deep learning; Diabetes detection; Health care; Machine learning
Year: 2021 PMID: 34862560 PMCID: PMC8642577 DOI: 10.1186/s42492-021-00097-7
Source DB: PubMed Journal: Vis Comput Ind Biomed Art ISSN: 2524-4442
Fig. 1 Classification results of SVM, KNN, NB, DTs, and LR in the form of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN), the parameters of the confusion matrix [17]
Fig. 2 The weight of each feature contributing to the result variable [18]
Fig. 3 Proposed architecture of the detection system [16]
Fig. 4 Performance of each algorithm on the newly acquired dataset [20]
Fig. 5 Prevalence of diabetic retinopathy among patients at the time of clinical diagnosis of NIDDM
Comparison of accuracy for the KNN and DT models on the pre-processed dataset and on the dataset without pre-processing [23]
| Data | KNN | DT |
|---|---|---|
| With pre-processing | 0.9545 | 0.936 |
| Without pre-processing | 0.9318 | 0.9434 |
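The effect shown in the table above can be sketched as follows: the same KNN classifier evaluated with and without a pre-processing step. This is an illustrative sketch on synthetic data; the concrete pre-processing used in [23] is not specified in this record, so feature scaling is assumed here as a stand-in.

```python
# Sketch: same KNN classifier with and without pre-processing (assumed here
# to be feature scaling), on a synthetic stand-in for the diabetes data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Without pre-processing: KNN on the raw features.
raw = KNeighborsClassifier().fit(X_tr, y_tr)
# With pre-processing: standardize features before KNN (distance-based
# methods like KNN are sensitive to feature scale).
scaled = make_pipeline(StandardScaler(), KNeighborsClassifier()).fit(X_tr, y_tr)

print("without pre-processing:", raw.score(X_te, y_te))
print("with pre-processing:   ", scaled.score(X_te, y_te))
```

The exact accuracies depend on the dataset; the point is the comparison pattern, not the numbers.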
Fig. 6 Results of different classification algorithms on prediction modelling, where NB outperforms the other two
Accuracy results of different classifiers at different values of k-fold validation (%)
| Support vector classifier | DT | RF | LR | Multi-layer perceptron |
|---|---|---|---|---|
| 77.6 | 69 | 69.9 | 77.8 | 77.5 |
| 77.6 | 69.9 | 70 | 77.6 | 78.7 |
| 77.5 | 71.5 | 72.9 | 77.6 | 78.2 |
| 77.5 | 69.5 | 70 | 77.6 | 77.6 |
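The evaluation in the table above can be sketched with scikit-learn's cross-validation utilities. This is a minimal sketch on synthetic data (the sample and feature counts are assumptions); it reproduces the comparison pattern, not the reported numbers.

```python
# Sketch: comparing the five classifiers from the table under k-fold
# cross-validation, on a synthetic stand-in dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=768, n_features=8, random_state=0)

models = {
    "Support vector classifier": SVC(),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "Multi-layer perceptron": MLPClassifier(max_iter=1000, random_state=0),
}

for k in (5, 10):  # different k-fold settings, as the rows of the table vary k
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=k, scoring="accuracy")
        print(f"k={k:2d}  {name}: {scores.mean() * 100:.1f}%")
```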
Fig. 7 Accuracy of the classification algorithms on the PIMA Indian diabetes dataset. The proposed hybrid model shows the maximum accuracy
A comprehensive study of machine learning methods used by various researchers
| Algorithm | Method used/innovation | Application and future work | Results and limitations (if specified) | References |
|---|---|---|---|---|
| J48, AdaBoost, and bagging on base classifier | The model was trained on the Canadian Primary Care Sentinel Surveillance Network dataset with several features. The author used the ensemble method AdaBoost on the base classifier, a J48 DT. | The author claimed that these ensemble algorithms can be used on other disease datasets to increase accuracy. | The AdaBoost algorithm with J48 as the base classifier showed the maximum accuracy, followed by bagging and then the J48 classifier alone. The AROC was used as the parameter. | [ |
| NB with clustering | The dataset used was the PIMA Indians Diabetes Dataset with eight attributes. The model is NB performed after prior clustering, and it was compared with an NB-only model. Five hundred and thirty-one instances of data were divided into 5 clusters. Only the fourth cluster, consisting of 148 instances, was used for testing. | By collecting a large amount of training data, the accuracy can be increased many-fold, helping people by providing a system that gives a correct prediction without their having to consult a doctor. | The parameters used for evaluation were accuracy, sensitivity, and specificity. The model with clustering showed a 10% increase in accuracy and a 53.11% rise in sensitivity, but at the cost of a 10.99% fall in specificity and a reduced dataset size. | [ |
| DTs, LR, and NB with bagging and boosting | Initial datasets were collected from primary care units and (after further processing) consisted of 11 features and data from 30,122 people. The three algorithms were used along with bagging and boosting methods, which decrease overfitting and increase accuracy. | The final model with the highest accuracy was deployed on a commercial web application. | The following accuracies were obtained with bagging and boosting: DT 85.090, LR 82.308, NB 81.010, bagging with DT (BG+DT) 85.333, bagging with LR (BG+LR) 82.318, bagging with NB (BG+NB) 80.960, boosting with DT (BT+DT) 84.098, boosting with LR (BT+LR) 82.312, and boosting with NB (BT+NB) 81.019. RF, at 85.558, shows the maximum accuracy. The ROC was used for final validation. | [ |
| LR, KNN, SVM, LDA, NB, DT, and RF | The author collected a raw dataset from Noakhali medical hospital containing 9843 samples with 14 attributes. Eighty percent of the data was taken for training and the rest for testing. All the algorithms chosen by the author were used for model building, and validation was then performed on them. | The author proposed that the accuracy of early treatment can be enhanced to lessen the suffering of patients. Additionally, more classifiers can be implemented to pick out the leading one for record performance and extend the work to automated analysis. | The RF classifier performed the best in classifying the data, and LR showed the worst performance. Although machine learning classifiers are widely used, they still lag behind deep learning models in terms of accuracy. | [ |
| LR and DTs | The dataset was prepared using a questionnaire administered to 1487 individuals, of whom 735 were diabetic and the remaining 752 were not. A Pearson chi-square test was carried out on all the characteristics. The models' performance was evaluated on three parameters: accuracy, sensitivity, and specificity. A confusion matrix was also built to determine model performance. | Recently, many researchers have implemented various algorithms and networks to compare them and find the most feasible one; DTs and LR are among the most used. | LR achieved an accuracy of 76.54%, sensitivity of 79.4%, and specificity of 73.54% on the testing data, while the DT achieved an accuracy of 76.97%, sensitivity of 78.11%, and specificity of 75.78%. Overall, the DT model performed better than the LR model. The dataset poses a limitation: it was collected from only one region of China; had it been collected from different regions, the model's implementation would be more practical. | [ |
| SVM and LR | The Practice Fusion de-identified dataset, taken from Kaggle and containing data on approximately 10000 patients, was used for the study. The features were divided into baseline, lab-test, diagnosis, and medication categories. For the classification task, LR and SVM were deployed. The LR model was implemented using the GLM function, and the SVM used a linear kernel. The area under the ROC was the evaluation parameter. | LR is a model widely used in public health and clinical practice for disease detection and risk calculation. | When using a smaller subset of features, the LR model performed slightly better than the SVM model. | [ |
| DTs and NB | The dataset considered was the PIMA Indian diabetes database. After applying feature selection, the author obtained five features. 10-fold cross-validation was used for data preparation, after which the J48 DT algorithm and NB were applied. Model performance was evaluated using the mean absolute error (MAE), root-mean-square error (RMSE), relative absolute error, root relative squared error, and kappa statistic. | The author proposed gathering data from different populations to make a more representative model. The work can be further enhanced to include automation. | Using a percentage split of 70:30, the J48 DT algorithm correctly classified 177 instances (76.95%), whereas NB achieved an accuracy of 79.56%. Accuracy was higher with the percentage split, which suggests the models do not achieve good accuracy on larger datasets. | [ |
| RF and XGBoost | The author used the PIMA diabetes dataset. Using Jupyter Notebook as an IDE, the author trained the models on 8 of the 9 attributes provided in the dataset. The algorithms used were RF and XGBoost. After setting the hyperparameters, the models were trained. | The author suggests the use of more algorithms in this branch of machine learning, such as hybrid models, for better accuracy. | The accuracy of the RF classifier came out to 71.9%. The hybrid model proposed through XGBoost achieved an accuracy of 74.1%. The accuracies of these models were lower than those already available; the hyperparameters need better tuning to optimize the algorithms. | [ |
| DT (J48) and NB | The author used the PIMA Indian diabetes dataset with 8 attributes, reduced to 5 through feature selection. Pre-processing was performed in WEKA using 10-fold validation. The model was created using 70% of the dataset, and the rest was used for testing. | In the future, the plan is to gather data from different locales around the world and build a more precise and general predictive model for diabetes diagnosis. Future studies will likewise focus on gathering data from a later time period and discovering new potential prognostic elements to incorporate. The work can be extended and improved toward the automation of diabetes analysis. | The J48 algorithm was 76.95% accurate, with other parameters reported such as the kappa statistic, MAE, RMSE, relative absolute error, and root relative absolute error. The NB algorithm was accurate up to 79.56%. Since this model is not optimally configured, a developed model would require more training data for creation and testing. | [ |
| Genetic algorithms with fuzzy logic | This work implements genetic and fuzzy algorithms for effective disease prediction. For the implementation of the GA, MATLAB R2006b was used. Feature selection was implemented using fuzzy logic algorithms. First, a simultaneous mapping was performed based on a measure of the appropriateness of variable values to each class, using suitable membership functions according to each type of feature. Then, simple fuzzy reasoning mechanisms were proposed to deal with classification in a unified way. | The proposed work helps minimize cost and increase accuracy and can be used in the future for better implementations. | Through this GA approach, the accuracy went up to 87%, with the training cost reduced by more than 50%. | [ |
| k-NN, NB, DT, RF, SVM, LR | These models were compared for the detection of type-2 diabetes. A total of 300 samples were taken, of which 161 were diabetic, 60 non-diabetic, and the rest unconfirmed. Using feature summarization with the WEKA tool, eight features were selected. These algorithms were compared against a proposed framework that automatically extracts patterns of type 2 DM. | Genome-wide and phenome-wide association studies were applied in the hope of finding associations with DM; these proved to be important for future models. | The proposed model was evaluated on the basis of accuracy, precision, specificity, sensitivity, and AUC. The proposed algorithm gained an AUC of 0.98, which outperforms the state-of-the-art AUC of 0.71. | [ |
| LR, DT, RF, SVM | The author used the PIMA Indian women dataset, concerned with women's health, with 8 attributes. Different models were trained on this dataset under different hyperparameters. | The author proposed creating advanced models based on RF because of its highest accuracy and ability to overcome overfitting. | The models were compared on the basis of accuracy. RF gained the highest accuracy at 77.06%, followed by SVM. | [ |
| DT – J48, RF | The author obtained a dataset of hospital physical examinations from Luzhou. An independent test set of 13700 samples was taken. The data contained 14 attributes. The other dataset was the PIMA Indian diabetes dataset. The DT and RF algorithms were implemented in WEKA with principal component analysis. | The author hoped to predict the type of diabetes using a dataset containing the required data, which would be an added advantage for improving accuracy. | The RF and J48 algorithms achieved accuracies of 73.95% and 73.88%, respectively, on the Luzhou dataset and 71.44% and 71.67%, respectively, on the PIMA dataset. | [ |
| NB and SVM | A dataset of 500 patient records was collected from a diabetes healthcare institute, covering diabetic patients with symptoms of heart disease. The dataset contained 9 attributes. Both algorithms were implemented in WEKA. For SVM, a radial basis function kernel was used. | Classifiers of this kind can help in the early detection of a diabetic patient's vulnerability to heart disease, so patients can be forewarned to change their lifestyle. This can prevent diabetic patients from being affected by heart disease, resulting in lower mortality rates as well as reduced state healthcare costs. | NB classified 74% of instances correctly. For SVM, the accuracy gained was 95.6%. | [ |
| DT, SVM, K-NN, RF | The author used the PIMA Indian dataset. Data normalization was then performed, followed by feature selection. | The benefit of this optimized machine learning model is that it is suitable for patients with DM and can be applied across healthcare environments. | The models were compared on the basis of accuracy, sensitivity, and specificity. DT achieved 78.25% accuracy. | [ |
Fig. 8 General representation of a neural network, where x denotes the input weights and y the output weights
Fig. 9 Graphs of actual values (blue line) against predicted values (red line)
A comprehensive study of deep learning methods used by various researchers
| Algorithm | Method used/innovation | Application | Results and limitations (if specified) | References |
|---|---|---|---|---|
| Deep belief neural network | The dataset used was the PIMA Indian diabetes dataset with 768 instances and 8 features. The hidden activation function used was the ReLU (rectified linear unit) across three hidden layers, with sigmoid as the input activation function. The batch size was set to 100 and the number of epochs to 5. | Because the network, when compared with conventional machine learning classifiers, obtained high results, the author believes it can be tweaked as needed and used for the detection of other diseases as well. | The results of the proposed network and of the conventional methods were assessed on three parameters: recall, precision, and F1 measure. This network obtained high recall, precision, and F1 values, which shows that it is a good model. | [ |
| Long short-term memory (LSTM) neural networks – RNNs | The author used the Direct Net Inpatient Accuracy Study dataset, which contains approximately 110 instances. The neural network model proposed has one LSTM layer and predicts blood sugar levels. For the search grid, the author considered the number of LSTM units, dense units, and sequence lengths. The parameter used to evaluate model performance was the RMSE. | The author elaborated on real-life applications, proposing to deploy LSTM models on mobile platforms, apps, and cloud servers for availability to the masses. | It was concluded that the use of LSTMs for blood glucose level prediction is promising. The RMSE values obtained range from a minimum of 4.67 to a maximum of 29.12. Missing data is an issue: if the patient removes the CGM device, the model must be trained in a way that lets it automatically handle the missing data. | [ |
| Deep prediction model | The data were from six individuals in the age group of 22-29 years. The deep prediction model is a multiple-layer model consisting of data-driven predictors fed with glucose measurements and time series. The first part comprises autoregressive models with external inputs, or ANNs, followed by an extreme learning machine (ELM), which gives the glucose level predictions. The learning speed of ELMs is very fast, and they are also easier to implement. | Because ELM models are easy to implement for any classification application and can approximate continuous functions, they are widely used in this field of research for implementing efficient models. | The model was evaluated on three parameters, RMSE, CC, and time lag (TL), at different glucose concentrations for PH = 15, 30, and 45 min, with three types of input combinations involving glucose and sugar concentrations. For all PHs, the linear models with the ELM outperform the ANN-ELM, achieving reduced TL, reduced RMSE, and increased CC. | [ |
| Empirical mode decomposition (EMD) and LSTM | The dataset was obtained from a Shanghai hospital and contains 174 instances. The data were used to train two models: one an LSTM, the other an LSTM+EMD. LSTMs are improved versions of RNNs, and EMD is an adaptive signal decomposition method for non-linear and non-stationary signals. The performance measures were the MAE and RMSE. | The author proposed making the model more accurate by training it with real-time or personalized data to give better results. Deploying the model to mobile clients was also suggested. | The parameters considered for evaluating performance were the MAE and RMSE: the MAE measures the prediction error, and the RMSE gives the deviation between the observed value and the truth value. The results were observed over time intervals of 30, 60, 90, and 120 min. | [ |
| Temporal convolutional network with vanilla LSTM | The raw dataset included two male instances and four female instances. Feature selection was performed on the dataset, followed by hyperparameter tuning. The LSTM network used had 3 hidden layers of size 50. The TCN block has 10 layers, each with a dilation factor of 2 and a kernel size of 4. | The author stated that this study of using DL algorithms in time-series prediction would help practitioners and researchers select appropriate models with practical parameters. | The author used three parameters for evaluation: RMSE, temporal gain, and normalized energy of second order. The values of RMSE, temporal gain, and energy were much higher for the vanilla LSTM than for the TCN. The model's limitation is that it does not collect data with deeper meaning for personalized output. | [ |
| ANN | The medical dataset was taken from Noakhali medical college, containing data on 9483 patients with 14 features. The dataset was split 80% for training and the remaining 20% for testing. For the ANN, the author chose the SoftMax activation function with six hidden layers. Training was done using the ReLU activation function with 25 epochs. | Because ANNs achieve high accuracy, the author recommends using them for model prediction and disease detection as a much better alternative to other ML models. | To increase model performance, an extra hidden layer was added and the number of epochs was increased to 100. The accuracy came out to 95.14% on the testing data and 96.42% on training. The author attributes the low accuracy of other models to smaller datasets and the models' inability to adapt to various datasets. | [ |
| LSTM and Bi-LSTM (RNN) | The author collected the dataset from real patients by monitoring their health. The model was trained and tested on 26 datasets containing real-time CGM data. The model consists of one LSTM layer and one Bi-LSTM layer, each with 4 units, and three fully connected layers with 8, 64, and 8 units, respectively. The number of epochs ranged from 100 to 2000. The parameters used for evaluation were RMSE, CC, and TL. | The author proposed using the model with oral drugs, insulin pens, and CSII pumps, all of which incorporate CGM measurements. | The results were calculated at different PH = 15, 30, 45, and 60 min. Running the models for epochs from 100 to 2000 in steps of 100, epochs 900, 1300, 1500, and 1700 showed good accuracy and were therefore chosen as the pre-training epoch numbers for the particular PH levels. | [ |
| CNN | A risk prediction model can accurately identify the risk of the disease using a CNN. The study was based on a group of around 5900 steel workers. A research survey was conducted to gain real-time data on features such as gender, age, disease history, lifestyle, and physical examinations. The model was evaluated using ROCs and the area under the curve. | It provides a basis for the self-health management of steel workers, facilitates the rational allocation of medical and health resources and the development of health services, and provides a basis for government departments to make decisions. | The prediction accuracy of the model on the three datasets was 94.5%, 91.0%, and 89.0%, respectively, and the AUC was 0.950 (95%CI: 0.938–0.962), 0.916 (95%CI: 0.888–0.945), and 0.899 (95%CI: 0.859–0.939). This shows that the established model can accurately predict the risk of type-2 diabetes in steel workers. | [ |
| AlexNet and GoogLeNet | In this study, the author developed a model called IGRNet to detect and diagnose prediabetes effectively using 5-s 12-lead electrocardiograms from 2251 cases. The neural networks were compared with traditional ML algorithms such as SVM, RF, and KNN. | The author suggested using this hybrid model for future predictions owing to its efficient performance. | The results of the networks were compared over different activation functions. The IGRNet model gained an accuracy of 85.6%, which was higher than that of any other model used for comparison. | [ |
| DNN | The data were retrieved from the UCI machine learning repository – PIMA Indian diabetes dataset. The neural network has 4 hidden layers, with 12, 16, 16, and 14 neurons, respectively. | The proposed system will be supportive for medical staff as well as for the common people because, with five-fold cross-validation, the accuracy was higher than that of any other model, and on comparison with other authors' established models it came out the highest. | The data samples were divided using 5-fold and 10-fold cross-validation. The accuracy of this model was 98.04% for five-fold and 97.27% for ten-fold. The results were also analyzed through ROCs and the F1 score. | [ |
| LSTM and GRU | The dataset used included records of over 14000 patients from 2010 to 2015. Episodes were used to represent the measures, and each sequence had 30 features. The dataset was used to train the LSTM and GRU models, and the results were compared with MLP models. | The models achieved a high accuracy of 97% even with sequences of 3-d length. | The LSTM and GRU models achieved higher accuracy than the MLP. For longer dependencies the LSTM outperformed, while on shorter sequences the GRU performed better; both gained an accuracy of over 97%. Owing to a lack of datasets concentrated on type-2 diabetes, replicating this work with different datasets can be difficult. | [ |
| Multilayer perceptron network | The author used the PIMA Indian diabetes dataset. The ANN developed contains 4 layers for the MLP network having 8-12-8-1 nodes. Three more networks were formed with nodes 8-32-32-1, 8-64-64-1 and 8-128-128-1. With the ReLU activation function in the input layer and hidden layers, the output layer is sigmoid. | The author proposed to extend this work for further accuracy increase to help in early prediction for diabetes. | The results for the perceptron were calculated for 10 runs for 150 epochs. The highest average accuracy was gained by the network having nodes 8-128-128-1 followed by 8-64-64-1 and then 8-32-32-1. | [ |
| ELM algorithm | The data were acquired from multiple sources, including medical laboratories, hospitals, and public datasets. The dataset was pre-processed to include 12 important attributes affecting diabetes. ELMs train faster. Three hundred and twenty samples were used for training and 480 for testing. | The goal of this research was the study of diabetic treatment in healthcare using big data and machine learning. It presents a big data processing system employing an ELM algorithm. | The proposed approach proved efficient. The goal was to reduce the FPs and FNs and boost the precision and recall rates, which the author achieved. | [ |
| Neural network | The author obtained a dataset of hospital physical examinations from Luzhou. An independent test set of 13700 samples was taken out. The data contained 14 attributes. The other dataset was the PIMA Indian diabetes dataset. The method is a two-layer network with sigmoid hidden neurons and SoftMax output neurons. | The author hopes to predict the type of diabetes using a suitable dataset, which would be a lead for improving accuracy. | The neural network gained an accuracy of 74.14% on the Luzhou dataset and 74.75% on the PIMA Indian dataset with the use of principal component analysis. | [ |
| CNN | The dataset used by the author was taken from the National Institute of Diabetes and consists of nine parameters. Fuzzification was performed on the dataset to populate it for the CNN. The α values of the neural networks were 2 and 5, and these networks were compared with a CNN. | Fuzzification proves useful since it provides diverse data to train on. | The neural network with α = 2 performed better than the one with α = 5, and the CNN performed better than both. | [ |
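Many rows in the deep learning table share the same time-series setup: a continuous glucose trace is cut into fixed-length windows, each window predicts a value one prediction horizon (PH) ahead, and predictions are scored with the RMSE. A minimal, framework-free sketch of that setup (the sine-based glucose trace and window length are assumptions for illustration, not data from any cited study):

```python
# Sketch: windowing a glucose trace for one-step-ahead prediction and
# scoring with RMSE, the metric used throughout the table above.
import numpy as np

def make_windows(series, window):
    """Slide a fixed-length window over the series; each window is paired
    with the value that immediately follows it (PH = one step here)."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

rng = np.random.default_rng(0)
t = np.arange(500)
# Synthetic glucose trace in mg/dL: slow oscillation plus sensor noise.
glucose = 120 + 30 * np.sin(t / 24) + rng.normal(0, 2, size=t.size)

X, y = make_windows(glucose, window=12)

# Persistence baseline any LSTM/TCN would have to beat:
# predict that the next value equals the last observed value.
baseline_pred = X[:, -1]
print(f"{X.shape=}  persistence RMSE = {rmse(y, baseline_pred):.2f} mg/dL")
```

The (window, target) pairs produced here are exactly what the LSTM, TCN, and EMD+LSTM models in the table consume; only the predictor differs.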