Tanzila Saba, Amjad Rehman Khan, Ibrahim Abunadi, Saeed Ali Bahaj, Haider Ali, Maryam Alruwaythi.
Abstract
Depression is a prevalent mental disorder worldwide. Recognizing its early signs is critical for evaluating and preventing mental illness. With advances in machine learning, it is possible to build intelligent systems that detect depressive symptoms through speech analysis. This study presents a hybrid model to identify and predict depression from Arabic speech. The proposed hybrid model combines a convolutional neural network (CNN) with a support vector machine (SVM). Experiments are performed on an Arabic speech benchmark data set of 200 speech samples; 70% of the data were reserved for training and 30% for testing. The hybrid model (CNN + SVM) attained accuracy rates of 90.0% and 91.60% in the training and testing stages, respectively. To validate these results, a recurrent neural network (RNN) and a CNN were also applied individually to the same data set, and the results were compared. The RNN achieved accuracy rates of 80.70% and 81.60% in the training and testing stages, and the CNN achieved 88.50% and 86.60%. Based on this analysis, the proposed hybrid model produced better prediction results than the individual RNN and CNN models on the same data set. Furthermore, the proposed model had lower FPR and FNR and higher accuracy, AUC, sensitivity, and specificity than the individual RNN and CNN models in predicting depression. Finally, the findings will help classify depression from Arabic speech and will benefit physicians, psychiatrists, and psychologists in detecting depression.
Year: 2022 PMID: 35669665 PMCID: PMC9166990 DOI: 10.1155/2022/8622022
Source DB: PubMed Journal: Comput Intell Neurosci
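The 70%/30% split described in the abstract can be illustrated with a minimal sketch. The shuffling procedure, the seed, and the `split_indices` helper are assumptions made for illustration; the source does not state how samples were assigned to each partition.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into train and test lists.

    Hypothetical helper: the record does not describe the actual procedure.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# 200 samples, as stated in the abstract.
train_idx, test_idx = split_indices(200)
print(len(train_idx), len(test_idx))  # 140 60
```

Under these assumptions the training partition holds 140 samples and the test partition 60.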
Figure 1: The proposed research architecture.
Architecture of proposed hybrid model.
| Layer (CNN + SVM) | Output shape | Parameters | Note |
|---|---|---|---|
| Conv1D1 | (Nil, 202, 32) | 128 | |
| Max pool 1 | (Nil, 101, 32) | 0 | |
| Conv1D2 | (Nil, 101, 64) | 6,208 | |
| Max pool 2 | (Nil, 50, 64) | 0 | |
| Conv1D3 | (Nil, 50, 64) | 12,352 | |
| Max pool 3 | (Nil, 25, 64) | 0 | |
| Dropout | (Nil, 25, 64) | 0 | |
| Flatten | (Nil, 1,600) | 0 | |
| Dense 1 | (Nil, 128) | 204,928 | Used for SVM |
| Dense 2 | (Nil, 1) | 129 | |
| Total params | | 223,745 | |
| Trainable params | | 223,745 | |
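The parameter counts in the table can be checked with a short calculation. This is a sketch under two assumptions that the record does not state: a kernel size of 3 for every Conv1D layer and a single input channel. Both are inferred only because they reproduce the listed counts. The helper names are hypothetical.

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Weights (kernel_size * in_channels * filters) plus one bias per filter."""
    return kernel_size * in_channels * filters + filters


def dense_params(in_features, units):
    """Weights (in_features * units) plus one bias per unit."""
    return in_features * units + units


KERNEL_SIZE = 3     # assumption: inferred from the counts, not stated in the source
INPUT_CHANNELS = 1  # assumption: inferred from the counts, not stated in the source

layer_params = {
    "Conv1D1": conv1d_params(KERNEL_SIZE, INPUT_CHANNELS, 32),
    "Conv1D2": conv1d_params(KERNEL_SIZE, 32, 64),
    "Conv1D3": conv1d_params(KERNEL_SIZE, 64, 64),
    "Dense 1": dense_params(25 * 64, 128),  # Flatten output: 25 * 64 = 1,600
    "Dense 2": dense_params(128, 1),
}
total = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name}: {count:,}")
print(f"Total: {total:,}")
```

Pooling, dropout, and flatten layers have no trainable parameters, so they are omitted from the sum.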
Figure 2: Architecture of RNN.
Figure 3: Hybrid model accuracy on training and testing data.
Figure 4: Confusion matrix results of the hybrid model on train and test data.
Figure 5: RNN and CNN accuracies comparisons for training and testing data.
Figure 6: RNN and CNN accuracies vs loss against 25 epochs.
Figure 7: The confusion matrix results of RNN and CNN models for training and testing data.
Performance comparisons of hybrid model with RNN and CNN.
| Stage | Model | Accuracy (%) | AUC | Sensitivity (%) | Specificity (%) | FPR | FNR |
|---|---|---|---|---|---|---|---|
| Training | RNN | 80.70 | 0.81 | 100 | 61.9 | 0.0 | 0.380 |
| Training | CNN | 88.50 | 0.89 | 95.6 | 81.6 | 0.043 | 0.183 |
| Training | CNN + SVM | 90 | 0.90 | 98.5 | 81.6 | 0.014 | 0.183 |
| Testing | RNN | 81.60 | 0.81 | 100 | 62 | 0.0 | 0.379 |
| Testing | CNN | 86.60 | 0.86 | 93.5 | 79.3 | 0.064 | 0.206 |
| Testing | CNN + SVM | 91.60 | 0.91 | 100 | 82.7 | 0.0 | 0.172 |
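The accuracy, FPR, and FNR columns follow from confusion-matrix counts. The sketch below uses hypothetical test-set counts: they are not taken from the confusion matrices in Figures 4 and 7, but are inferred by assuming a 60-sample test split (30% of 200) and choosing counts consistent with the reported values. The `rates` helper is likewise illustrative.

```python
def rates(tp, fn, fp, tn):
    """Accuracy, false positive rate, and false negative rate from counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "fpr": fp / (fp + tn),  # share of negatives flagged as positive
        "fnr": fn / (fn + tp),  # share of positives that were missed
    }


# Hypothetical counts (assumption): inferred, not read from the source figures.
inferred_test_counts = {
    "RNN": dict(tp=18, fn=11, fp=0, tn=31),
    "CNN": dict(tp=23, fn=6, fp=2, tn=29),
    "CNN + SVM": dict(tp=24, fn=5, fp=0, tn=31),
}

results = {model: rates(**c) for model, c in inferred_test_counts.items()}
for model, r in results.items():
    print(f"{model}: acc={r['accuracy']:.4f} fpr={r['fpr']:.4f} fnr={r['fnr']:.4f}")
```

With these counts the computed values agree with the testing rows to within about 0.001, which suggests the table truncates rather than rounds its entries.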
Performance measured with precision, recall, and f1-score.
| Stage | Model | Precision | Recall | F1-score |
|---|---|---|---|---|
| Training | RNN | 1 | 0.619718 | 0.765217 |
| Training | CNN | 0.95082 | 0.816901 | 0.878788 |
| Training | CNN + SVM | 0.983051 | 0.816901 | 0.892308 |
| Testing | RNN | 1 | 0.62069 | 0.765957 |
| Testing | CNN | 0.92 | 0.793103 | 0.851852 |
| Testing | CNN + SVM | 1 | 0.827586 | 0.90566 |
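The F1-score column is the harmonic mean of precision and recall, so it can be recomputed from the other two columns. The sketch below uses only the values reported in the table; the `f1_score` helper is a hypothetical name introduced here, not a function from the source.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as listed in the table.
reported = {
    ("Training", "RNN"): (1.0, 0.619718, 0.765217),
    ("Training", "CNN"): (0.95082, 0.816901, 0.878788),
    ("Training", "CNN + SVM"): (0.983051, 0.816901, 0.892308),
    ("Testing", "RNN"): (1.0, 0.62069, 0.765957),
    ("Testing", "CNN"): (0.92, 0.793103, 0.851852),
    ("Testing", "CNN + SVM"): (1.0, 0.827586, 0.90566),
}

for (stage, model), (p, r, f1_reported) in reported.items():
    print(f"{stage} {model}: computed={f1_score(p, r):.6f} reported={f1_reported}")
```

Each recomputed value agrees with the reported F1-score to within rounding of the six-digit inputs.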
Figure 8: The ROC with AUC of the RNN, CNN, and hybrid model based on Arabic speech analysis.