Andreas Triantafyllidis, Haridimos Kondylakis, Dimitrios Katehakis, Angelina Kouroubali, Lefteris Koumakis, Kostas Marias, Anastasios Alexiadis, Konstantinos Votis, Dimitrios Tzovaras.
Abstract
BACKGROUND: Major chronic diseases such as cardiovascular disease (CVD), diabetes, and cancer impose a significant burden on people and health care systems around the globe. Recently, deep learning (DL) has shown great potential for the development of intelligent mobile health (mHealth) interventions for chronic diseases, which could revolutionize the delivery of health care anytime, anywhere.
Keywords: chronic disease; deep learning; mHealth; mobile phone; review
Year: 2022 PMID: 35377325 PMCID: PMC9016515 DOI: 10.2196/32344
Source DB: PubMed Journal: JMIR Mhealth Uhealth ISSN: 2291-5222 Impact factor: 4.947
Figure 1. PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram. CVD: cardiovascular disease; DL: deep learning; mHealth: mobile health.
Table 1. Characteristics of the included studies (N=20).
| Study | Target disease | Participants, N | Age | Study period |
| --- | --- | --- | --- | --- |
| Al-Makhadmeh et al | CVDa | 10 | N/Ab | No |
| Ali et al | CVD | 597 (2 data sets combined with 303 and 294 participants) | 29-79 years | No |
| Dami et al | CVD | Four databases: (1) 70,000 participants, (2) 20,000 participants, (3) 139 patients with hypertension, and (4) 303 participants | N/A | Participants in database 3 were followed for 12 months |
| Deperlioglu et al | CVD | N/A | N/A | Usability study for 4 months |
| Fu et al | CVD | 20,000 | N/A | No (tested in the real world) |
| Huda et al | CVD | 47 | N/A | No |
| Torres-Soto et al | CVD | 163 | Mean 68 (cardioversion cohort), 56 (exercise stress test cohort), and 67 (ambulatory cohort) years | No |
| Cappon et al | Diabetes (T1DMc) | 6 | 20-80 years | 8 weeks |
| Chen et al | Diabetes (T1DM) | 6 | 20-80 years | 8 weeks |
| Efat et al | Diabetes | 25 | N/A | Data collected during a 2-month period |
| Faruqui et al | Diabetes (T2DMd) | 10 patients in the smartphone group (overweight or obese) | 21-75 years | 6 months |
| Goyal et al | Diabetes | 30 | N/A | No (tested in the real world) |
| Joshi et al | Diabetes | 46 | 17-80 years | No |
| Sánchez-Delacruz et al | Diabetes | 15 | 29-62 years | No |
| Sevil et al | Diabetes | 25 | Mean 24.88 (SD 3.15) years | 430-hour experiment |
| Suriyal et al | Diabetes | N/A | N/A | No |
| Ech-Cherif et al | Cancer | N/A | N/A | No |
| Guo et al | Cancer | N/A | N/A | No |
| Hu et al | Cancer | 917 | N/A | No |
| Uthoff et al | Cancer | 99 | Mean 40 (SD 14.1) years | No (tested in the real world) |
aCVD: cardiovascular disease.
bN/A: not applicable.
cT1DM: type 1 diabetes mellitus.
dT2DM: type 2 diabetes mellitus.
Table 2. Algorithms and outcomes of the included studies (N=20).
| Study | DLa outcome | DL algorithm | Data used | Features selected | Performance | Comparison with classic MLb algorithms |
| --- | --- | --- | --- | --- | --- | --- |
| Al-Makhadmeh et al | Detection of heart disease | Higher-order Boltzmann deep belief neural network | 123 instances and 23 attributes collected from 10 patients using sensor devices from data sets available in UCIc repository | ECGd, blood pressure, chest pain typology, cholesterol level, vessel information, minimum and maximum heart rate, angina, and depression symptoms | 99.5% sensitivity | No |
| Ali et al | Detection of heart disease | Feedforward network that uses backpropagation techniques and gradient algorithms (ensemble approach) | Cleveland and Hungarian data sets available from UCI repository containing EMRe and sensor data (physiological measurements) | Demographic (age and sex), clinical (chest pain type, number of major vessels colored by fluoroscopy, and exercise test results), and sensor (resting blood pressure and fasting blood sugar) | 84% accuracy | SVMf (71.8%), logistic regression (73.7%), random forest (73.7%), decision tree (74.8%), and naïve Bayes (80.4%) |
| Dami et al | Prediction of cardiovascular events to prevent SCDg and heart attacks | Combination of deep belief network and LSTMh RNNi | Four databases: (1) Kaggle heart disease data set archive, (2) database from Shahid Beheshti Hospital Research Center, (3) database from PhysioNet site including patients from the Naples Federico II University Hospital in Italy, and (4) UCI data set archive from 1988 | Age, sex, weight, height, body surface area, BMI, smoker or not, systolic blood pressure, diastolic blood pressure, intima media thickness, left ventricular mass index, and ejection fraction | 88% accuracy, 87% F-measure, and 87% precision | Logistic regression, SVM, and random forest (56% accuracy on average) |
| Deperlioglu et al | Classification of heart sounds | Autoencoder neural networks | PASCALj A-training and B-training heart sound data sets | No | 96.03% accuracy for normal diagnosis, 91.91% accuracy for extrasystole diagnosis, and 90.11% accuracy for murmur diagnosis | SVM, naïve Bayes, decision tree, and AdaBoost (84.2%-93.3% accuracy) |
| Fu et al | CVDk detection | A hybrid of a CNNl and an RNN | The test set includes 15,437 anonymous ECG recordings collected from several tertiary hospitals in China | No | 95.53%-99.97% CVD detection accuracy | No |
| Huda et al | Arrhythmia detection | CNN | MIT-BIHm arrhythmia data set obtained from PhysioNet | No | 94.03% accuracy in classifying abnormal cardiac rhythm | No |
| Torres-Soto et al | Arrhythmia event detection | Pretraining using convolutional denoising autoencoders followed by CNN, transfer learning, and auxiliary signal quality estimation | Data available through Synapse (Synapse ID: syn21985690) | Regions of the upslope from the systolic phase were found to be informative for AFn class-specific predictions | 98% sensitivity, 99% specificity, and 96% F1 | Random forest (32% sensitivity, 79% specificity, and 39% F1) |
| Cappon et al | Prediction of short-time blood glucose levels | LSTM RNN | OhioT1DM data set containing CGMo data, lifestyle data (diet, exercise, and sleep), galvanic skin response, skin temperature, and magnitude of acceleration | CGM, injected insulin as reported by the pump, and self-reported meals and exercise | RMSEp of 20.20 and 34.19 for 30- and 60-minute prediction, respectively | No |
| Chen et al | Prediction of short-time blood glucose levels | Dilated RNNs | The OhioT1DM data set of continuous glucose monitoring data and the corresponding daily events from 6 patients with type 1 diabetes | CGM, insulin doses, carbohydrate intake, and time index; additional data included exercise, heart rate, and skin temperature | RMSE of 15.299 to 22.710 for different participants | No |
| Efat et al | Risk level classification of patients with diabetes | Artificial neural network | 2-month data from 25 patients with diabetes | Patients’ age, sex, sugar level, heart pulse, food intake, sleep time, and exercise or calorie burn | 84.29% accuracy, 82.35% sensitivity, and 86.11% specificity | No |
| Faruqui et al | Prediction of daily glucose levels | LSTM RNN | 10 overweight or obese patients with diabetes (T2DMq) | Daily mobile health lifestyle data on diet, physical activity, weight, and previous glucose levels from the day before | 33.33% (patient 7) to 86.67% (patient 2) accuracy | KNNr regression (10%-56% accuracy) |
| Goyal et al | Real-time DFUs localization | Faster R-CNNt | Transfer learning with ImageNet (Stanford Vision Lab) and Microsoft COCO data set; 1775 images of DFUs | Low-level features such as edge detection, corner detection, texture descriptors, shape-based descriptors, and color descriptors | 91.8% mean average precision | SVM (70.3% precision) |
| Joshi et al | Continuous blood glucose monitoring | LMBPu | NIRv optical spectroscopy data | No | AvgEw 6.09% and mARDx 6.07% | Multiple polynomial regression (AvgE 4.88% and mARD 4.86% for serum glucose examination) |
| Sánchez-Delacruz et al | Diabetic neuropathy detection | Classifiers combined with multilayer perceptron | Raw data from 5 accelerometers | Accelerometer data | 85% accuracy | No |
| Sevil et al | Classification of activity into 5 stages for determining the energy expenditure for diabetes therapy | RNN | Data sets not available | 23 information features selected out of 2216 are reported in the paper | 94.8% classification accuracy | KNN, SVM, naïve Bayes, decision tree, linear discrimination, and ensemble learning (75.7%-93.1% accuracy) |
| Suriyal et al | Diabetic retinopathy detection | MobileNets in TensorFlow with the help of RMSprop and asynchronous gradient descent | Data set available in Kaggle database | No | 73% accuracy, 74% sensitivity, and 63% specificity | No |
| Ech-Cherif et al | Benign and malignant cancer detection | Resource-constrained, mobile-ready deep neural network | Three databases: DermNet, ISICy Archive, and Dermofit Image Library | Cancerous or not | 91.33% accuracy | No |
| Guo et al | Identification of cervix and noncervix images | Ensemble method that consists of 3 DL architectures: RetinaNet, deep SVDDz, and a customized CNN | Four data sets were used in this study: MobileODT, Kaggle, and COCO2017 for training and validation, and SEVIAaa for testing | Normal samples as cervix images and anomalous samples as noncervix images | 91.6% accuracy and 89% F1 score | No |
| Hu et al | Detection of cervical precancer | Automated visual evaluation, RetinaNet, and Adam optimization algorithm | Microsoft COCO images, 7334 training images, 970 validation images, and 1058 test images | No specific features were reported in the paper | ROCab curve with AUCac of 0.95 | No |
| Uthoff et al | Early detection of precancerous and cancerous lesions in the oral cavity | CNN, VGG-Mad network pretrained on the ImageNet data set | 170 image pairs | WLIae and AFIaf provided the most information about type of lesion and size of the affected area | Sensitivity, specificity, positive predictive values, and negative predictive values of 81.25%-94.94%; 0.908 AUC | No |
aDL: deep learning.
bML: machine learning.
cUCI: University of California, Irvine.
dECG: electrocardiogram.
eEMR: electronic medical record.
fSVM: support vector machine.
gSCD: sudden cardiac death.
hLSTM: long short-term memory.
iRNN: recurrent neural network.
jPASCAL: Pattern Analysis, Statistical Modeling, and Computational Learning.
kCVD: cardiovascular disease.
lCNN: convolutional neural network.
mMIT-BIH: Massachusetts Institute of Technology–Beth Israel Hospital.
nAF: atrial fibrillation.
oCGM: continuous glucose monitoring.
pRMSE: root mean squared error.
qT2DM: type 2 diabetes mellitus.
rKNN: k-nearest neighbor.
sDFU: diabetic foot ulcer.
tR-CNN: region-based convolutional neural network.
uLMBP: Levenberg–Marquardt backpropagation.
vNIR: near-infrared.
wAvgE: average error.
xmARD: mean absolute relative difference.
yISIC: International Skin Imaging Collaboration.
zSVDD: support vector data description.
aaSEVIA: smartphone-enhanced visual inspection with acetic acid.
abROC: receiver operating characteristic.
acAUC: area under the curve.
adVGG-M: visual geometry group multi-scale.
aeWLI: white-light imaging.
afAFI: autofluorescence imaging.
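The performance column above mixes a regression metric (RMSE, used by the glucose-forecasting studies) with classification metrics (accuracy, sensitivity, specificity, F1). As a reference point for how these recurring metrics are defined, here is a minimal pure-Python sketch; it is illustrative only and not code from any included study:

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, e.g. for CGM glucose forecasts (mg/dL)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and F1 for binary labels (1 = disease present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)          # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f1
```

Note that accuracy alone can be misleading on imbalanced clinical data sets, which is why several studies in the table report sensitivity and specificity alongside it.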
Table 3. Model architectures in the included studies (N=20).
| Study | DLa parameters | DL hyperparameters |
| --- | --- | --- |
| Al-Makhadmeh et al | Deep belief network; features trained using the Boltzmann machine classifiers by computing the energy consumption of the network | Cross-entropy loss of 0.0178, L1 loss of 0.0187, and L2 loss of 0.025 |
| Ali et al | Ensemble DL model composed of 5 layers: the input layer, 3 hidden layers, and the output layer; fully connected hidden layer with 20 nodes | Ada optimizer with a learning rate of 0.03; ReLUb activation function |
| Dami et al | A deep belief network selected and represented a set of features from the hybrid feature vector and then passed it to the LSTMc neural network, which consists of an input layer, a hidden layer (with 100 hidden units), 2 fully connected layers, a softmax layer, and an output layer | SGDd for optimizing cross-entropy as the default loss function |
| Deperlioglu et al | Autoencoder neural network with a hidden layer size of 10; a softmax layer was used | Scaled conjugate gradient algorithm and a cross-entropy cost function were used in the coding layer, along with an L2 weight regularization coefficient |
| Fu et al | A hybrid of CNNe and RNNf; 32 convolutional layers (input for CNN) grouped into 8 stages, where each stage was a cascade of four 1D convolutional layers with a kernel size of 16. The final prediction layer was a fully connected dense layer | Before each convolutional layer, a nonlinear transformation was applied: a combination of batch normalization, ReLU activation, and dropout |
| Huda et al | 1D convolution (CNN), max-pooling, and batch normalization. The flattened layer output was passed through a fully connected layer and a second fully connected dense layer. In addition, a softmax layer with 14 outputs was used | Dropout layers were used |
| Torres-Soto et al | Convolutional and pooling layers in the encoder and upsampling and convolutional layers in the decoder; 3 convolutional layers and 3 pooling layers for the encoder segment, and 3 convolutional layers and 3 upsampling layers for the decoder segment of the CDAEg | Weights were randomly initialized according to the He distribution, and Adam was used as the optimization method. Each model was trained with MSEh loss for 200 epochs, with a reduction in learning rate of 0.001 every 25 epochs if the validation loss did not improve |
| Cappon et al | A bidirectional LSTM input layer composed of 128 cells with a look-back period of 15 minutes (ie, 3 samples); 2 LSTM layers composed of 64 and 32 cells, respectively; and a fully connected layer consisting of a single neuron computing the BGi level prediction at 2 different PHsj (ie, 30 and 60 minutes) | BLSTMk architecture, hyperparameters, and look-back period were chosen by trial and error to compromise between model complexity and accuracy |
| Chen et al | A 3-layered DRNNl with 32 cells in each layer | Dilations of 1, 2, and 4 implemented for the 3 layers from bottom to top, respectively |
| Efat et al | RNN | In forward propagation, the sigmoid activation function was applied; for backpropagation, the margin of error of the output was measured, and the weights were adjusted accordingly |
| Faruqui et al | LSTM with 5-60 layers and 5-40 neurons in the feedforward neural network | Dropout rate of 0.10-0.45. An allowable unit change of 0.01 for the dropout rate parameter and of 1 for the number of neurons in the LSTM and feedforward layers was selected. A total of 35 × 55 × 35 = 67,375 combinations were tested before finding the optimal hyperparameters |
| Goyal et al | Faster R-CNNm with ResNet101, Faster R-CNN with Inception-ResNetV2, Faster R-CNN with InceptionV2, and R-FCNn with ResNet101 | For Faster R-CNN, the L2 regularizer weight was set to 0.0, with an initializer generating a truncated normal distribution with an SD of 0.01 and batch normalization with a decay of 0.9997 and an epsilon of 0.001. Training used a batch size of 2 and a momentum optimizer (value 0.9) with a manual step learning rate: an initial rate of 0.0002, 0.00002 at epoch 40, and 0.000002 at epoch 60. For training R-FCN, the same hyperparameters were used as for Faster R-CNN, with the only change being the learning rate, set to 0.0005 |
| Joshi et al | DNNo with 10 hidden layers | Sigmoid activation functions |
| Sánchez-Delacruz et al | 23 assembled algorithms were tested by combining them with the deep ANN (multilayer perceptron). The best results were obtained with the combination of FilteredClassifier and the DL model | The base function was applied to the input values, and softmax was used as the activation function |
| Sevil et al | Combination of different layers, including fully connected, LSTM, softmax, regression, ReLU, and dropout layers | L2 regularization=0.05 |
| Suriyal et al | MobileNet CNN with 28 layers. The first layer was a fully connected layer | After each layer, there was batch normalization and a ReLU nonlinear function, except at the final layer. Training was done in TensorFlow with the help of RMSprop and asynchronous gradient descent |
| Ech-Cherif et al | The MobileNetV2 model was used, with its classification layer excluded and replaced by a dense layer with two classes: benign and malignant | The pretrained MobileNetV2 model was used with the Adam optimizer and a starting learning rate of 0.4. For each experiment, the learning rate was decayed by half every 2 epochs. All experiments were run for 55 epochs, with a batch size of 32 |
| Guo et al | 4 sequentially connected convolutional blocks followed by 2 fully connected layers and softmax for the last layer | N/Ap |
| Hu et al | ResNet-50 architecture | The number of iterations and batch size were optimized against mean average precision and validation classification loss |
| Uthoff et al | 4 sequentially connected convolutional blocks followed by 2 fully connected layers | N/A |
aDL: deep learning.
bReLU: Rectified Linear Unit.
cLSTM: long short-term memory.
dSGD: stochastic gradient descent.
eCNN: convolutional neural network.
fRNN: recurrent neural network.
gCDAE: convolutional denoising autoencoder.
hMSE: mean squared error.
iBG: blood glucose.
jPH: prediction horizon.
kBLSTM: bidirectional long short-term memory.
lDRNN: dilated recurrent neural network.
mR-CNN: region-based convolutional neural network.
nR-FCN: region-based fully convolutional network.
oDNN: deep neural network.
pN/A: not applicable.
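The glucose-forecasting rows (Cappon et al, Chen et al) describe recurrent models that consume a short look-back window of CGM samples and predict the value at a fixed horizon, eg, a 15-minute look-back (3 samples) with 30- and 60-minute prediction horizons. A minimal sketch of how such a series is turned into supervised (window, target) pairs, assuming 5-minute CGM sampling; `make_windows` is a hypothetical helper for illustration, not code from the papers:

```python
def make_windows(cgm, look_back=3, horizon=6):
    """Slice a CGM series (one reading every 5 minutes) into supervised pairs.

    Each input is `look_back` consecutive readings (3 samples = 15 minutes)
    and the target is the reading `horizon` steps after the window
    (6 steps = 30 minutes), matching a 30-minute prediction horizon.
    """
    pairs = []
    for i in range(len(cgm) - look_back - horizon + 1):
        window = cgm[i:i + look_back]             # model input sequence
        target = cgm[i + look_back + horizon - 1]  # value at the horizon
        pairs.append((window, target))
    return pairs
```

The resulting pairs would feed a recurrent model such as the stacked bidirectional LSTM described above; the model's RMSE is then computed between the predicted and observed targets at each horizon.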