Faisal Mashel Albagmi, Aisha Alansari, Deema Saad Al Shawan, Heba Yaagoub AlNujaidi, Sunday O Olatunji.
Abstract
The rapid spread of the Covid-19 outbreak led many countries to enforce precautionary measures such as complete lockdowns. These lifestyle-altering measures caused a significant increase in anxiety levels globally. For that reason, decision-makers are in dire need of methods to prevent potential public mental health crises. Machine learning has shown its effectiveness in the early prediction of several diseases. Therefore, this study aims to classify two-class and three-class anxiety problems early by utilizing a dataset collected during the Covid-19 pandemic in Saudi Arabia. The data were collected from 3017 participants from all regions of the Kingdom via an online survey containing questions to identify factors influencing anxiety levels, followed by questions from the GAD-7, a screening tool for Generalized Anxiety Disorder. The prediction models were built using the Support Vector Machine classifier for its robust outcomes on medical data and the J48 Decision Tree for its interpretability and comprehensibility. The experiments yielded promising results for the early classification of two-class and three-class anxiety problems. The Support Vector Machine classifier outperformed the J48 Decision Tree, attaining a classification accuracy of 100%, precision of 1.0, recall of 1.0, and F-measure of 1.0 using 10 features.
Keywords: Anxiety; COVID-19; Machine learning; Pandemic; Saudi Arabia
Year: 2022 PMID: 35071730 PMCID: PMC8766246 DOI: 10.1016/j.imu.2022.100854
Source DB: PubMed Journal: Inform Med Unlocked ISSN: 2352-9148
Overview of the reviewed sources arranged by date of publication.
| Author(s) Citation | Title of article or chapter | Objective | Method | Findings | Limitations |
|---|---|---|---|---|---|
| [ | Modeling anxiety and fear of COVID-19 using machine learning in a sample of Chinese adults: associations with psychopathology, sociodemographic, and exposure variables | To examine vulnerability factors associated with increased anxiety and fear. | The researchers used the R caret package for machine learning, with algorithm-specific packages: glmnet (lasso, ridge, and elastic net regression), rf (random forest), xgbTree (extreme gradient boosted regression), and svmRadial (support vector machine with a radial basis function kernel). | Stress and rumination were the most relevant variables in modeling COVID-19-related anxiety intensity, according to shrinkage machine learning methods. The most powerful predictor of perceived COVID-19 death threat was health anxiety. | Data came from a single geographical area (China). |
| [ | Predictive modeling of depression and anxiety using electronic health records and a novel machine learning approach with artificial intelligence | To identify important predictors of GAD and MDD risk using artificial intelligence. | A novel machine learning process was used to re-analyze data from an observational study to tackle the problem of predicting MDD and GAD. The pipeline is an algorithmically diverse collection of machine learning approaches, including deep learning. | Being comfortable with living conditions and having public health insurance were the two most important factors in predicting MDD. Up-to-date vaccinations and marijuana usage were the two most powerful predictors of GAD. Our findings show that machine learning algorithms for detecting GAD and MDD based on EHR data have a moderate predictive performance. | The original screening for MDD and GAD outcomes may not have identified all cases in the community. |
| [ | Predicting generalized anxiety disorder among women using Shapley value | To predict GAD among women using the Shapley value. | On the mental health data set, the Shapley value was used for feature selection ahead of the data mining classifiers. | Feature selection improved the results of the prediction models (Naïve Bayes, Random Forest, and J48). | Small sample size (180 participants). |
| [ | Toward Robust Anxiety Biomarkers: A Machine Learning Approach in a Large-Scale Sample. | To predict trait anxiety from neuroimaging measurements in humans. | They compared a suite of neuroimaging-based machine learning models using Python to predict anxiety within a discovery sample (n = 531, 307 women) via k-fold cross-validation. The final model (a stacked model incorporating region-to-region functional connectivity, amygdala seed-to-voxel connectivity, and volumetric and cortical thickness data) was then evaluated in a held-out, unseen test sample (n = 348, 209 women). | The stacked model predicted anxiety within the discovery sample but failed to generalize to the holdout sample. | The researchers studied a limited set of brain phenotypes and applied a circumscribed set of approaches. |
| [ | Assessment of Anxiety, Depression and Stress using Machine Learning Models | To predict anxiety, depression, and stress using 8 algorithms. | Using data from the online DASS42 tool, eight machine learning algorithms were used to predict the occurrence of psychological issues such as anxiety, depression, and stress. | The prediction accuracy obtained by utilizing the hybrid algorithm was higher than that obtained by using single methods, although the radial basis function network, which falls within the category of neural networks, yielded the highest accuracy. | NA |
| [ | Learning the Mental Health Impact of COVID-19 in the United States with Explainable Artificial Intelligence | To learn a ranked list of factors that could indicate a predisposition to a mental disorder during the COVID-19 pandemic. | They surveyed 17,764 adults in the United States and used Bayesian network inference to identify key factors affecting mental health during the COVID-19 pandemic. | Using the Bayesian network model, they found that patients with a chronic mental disease were more susceptible to mental problems during the COVID-19 pandemic. | The data analyzed are limited to one geographical area (United States). |
| [ | Screening of anxiety and depression among seafarers using machine learning technology | To compare the performance of different machine learning algorithms for screening of anxiety and depression among seafarers. | After obtaining the required approval and ethical clearance, 470 sailors were interviewed at the Haldia Dock Complex in India. Five machine learning classifiers, i.e., CatBoost, Logistic Regression, Naïve Bayes, Random Forest, and Support Vector Machine, were evaluated using the Python programming language. | CatBoost was the best classifier for predicting anxiety and depression, with an accuracy of 82.6% and a precision of 84.1%. | The study emphasized the application of machine learning technology in the field of automated screening for mental health illness. |
| [ | Detecting anxiety on Reddit | To detect anxiety-related posts from Reddit using various linguistic features. | study anxiety disorders through personal | apply N-gram language modeling, vector embeddings, topic | They achieve an accuracy of 91% with vector-space word embeddings, and an accuracy of 98% when combined with lexicon-based features. |
Fig. 1. Maximum hyperplane distance.
Fig. 2. Decision tree scheme.
Survey questions.
| Variable | Label |
|---|---|
| Q3 | Nationality |
| Q18 | Gender |
| Q19 | Age |
| Q20 | Marital status |
| Q21 | How many people are in the house? (Includes house workers and drivers) |
| Q22 | Are you or any of your household members at increased risk of contracting the coronavirus? (This includes anyone over the age of 60 or pregnant or having comorbidities) |
| Q24A1 | Have you tested positive for COVID-19? |
| Q24A2 | Have you been suspected of carrying the coronavirus? |
| Q24A3 | Has any member of your family been diagnosed with coronavirus? |
| Q25 | Qualification |
| Q26 | Occupation |
| Q28 | What is the method followed by your employer, or academic institution during the pandemic? (Online or in person) |
| Q30 | Feeling nervous, anxious, or on edge |
| Q31 | Not being able to stop or control worrying |
| Q32 | Worrying too much about different things |
| Q33 | Trouble relaxing |
| Q34 | Being so restless that it's hard to sit still |
| Q35 | Becoming easily annoyed or irritable |
| Q36 | Feeling afraid as if something awful might happen |
| Q37 | How difficult have these problems made it to do work, take care of things at home, or get along with other people |
| Geo-region | Geographical region |
| Anxiety (Two category) | Anxiety two categories (Anxious and non-anxious) |
| Anxiety (Three category) | Anxiety score three categories (Mild-Moderate-Severe) |
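Items Q30 to Q36 in the table above are the seven GAD-7 questions, each coded from 0 to 3, so a respondent's total lies between 0 and 21. The following is a minimal sketch of how such a total could be mapped to the three categories named above. The cutoffs of 10 and 15 are the commonly cited GAD-7 cut points for moderate and severe anxiety; whether the authors used exactly these thresholds is an assumption, as is the choice to label everything below the moderate cutoff as "Mild". The function names are hypothetical.

```python
# Item names taken from the survey table; each item is coded 0-3.
GAD7_ITEMS = ("Q30", "Q31", "Q32", "Q33", "Q34", "Q35", "Q36")


def gad7_total(responses: dict) -> int:
    """Sum the seven GAD-7 item scores for one respondent."""
    total = 0
    for item in GAD7_ITEMS:
        score = responses[item]
        if score not in (0, 1, 2, 3):
            raise ValueError(f"{item} must be coded 0-3, got {score!r}")
        total += score
    return total


def three_category(total: int, moderate_cutoff: int = 10, severe_cutoff: int = 15) -> str:
    """Map a GAD-7 total to Mild / Moderate / Severe.

    The default cutoffs are ASSUMED (commonly cited GAD-7 cut points),
    not confirmed by the source.
    """
    if not 0 <= total <= 21:
        raise ValueError("GAD-7 total must be between 0 and 21")
    if total >= severe_cutoff:
        return "Severe"
    if total >= moderate_cutoff:
        return "Moderate"
    return "Mild"


# Hypothetical respondent, for illustration only.
respondent = {"Q30": 2, "Q31": 1, "Q32": 2, "Q33": 1, "Q34": 2, "Q35": 1, "Q36": 2}
score = gad7_total(respondent)
label = three_category(score)
```

Keeping the cutoffs as parameters makes it easy to swap in whatever thresholds a given study actually applied.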
Statistical analysis of the dataset.
| Attributes | Mean | Median | Standard Deviation | Max. | Min. |
|---|---|---|---|---|---|
| Q3 | 1.063 | 1 | 0.242 | 2 | 1 |
| Q18 | 1.560 | 2 | 0.496 | 2 | 1 |
| Q19 | 3.307 | 3 | 1.300 | 6 | 1 |
| Q20 | 1.731 | 2 | 0.560 | 4 | 1 |
| Q21 | 6.733 | 7 | 3.026 | 30 | 0 |
| Q22 | 1.651 | 2 | 0.477 | 2 | 1 |
| Q24A1 | 0.002 | 0 | 0.048 | 1 | 0 |
| Q24A2 | 0.006 | 0 | 0.075 | 1 | 0 |
| Q24A3 | 0.010 | 0 | 0.099 | 1 | 0 |
| Q25 | 3.731 | 4 | 0.954 | 5 | 1 |
| Q26 | 2.730 | 2 | 1.487 | 6 | 1 |
| Q30 | 1.056 | 1 | 1.046 | 3 | 0 |
| Q31 | 0.638 | 0 | 0.910 | 3 | 0 |
| Q32 | 0.930 | 1 | 0.982 | 3 | 0 |
| Q33 | 0.700 | 0 | 0.941 | 3 | 0 |
| Q34 | 0.754 | 0 | 0.976 | 3 | 0 |
| Q35 | 0.768 | 0 | 0.966 | 3 | 0 |
| Q36 | 0.627 | 0 | 0.900 | 3 | 0 |
| Q37 | 1.696 | 2 | 0.712 | 4 | 1 |
| Geo-region | 1.022 | 1 | 0.989 | 4 | 0 |
Correlation between each Attribute and the First Experiment Target Attribute.
| Attributes | Target Attribute | Correlation coefficient |
|---|---|---|
| Q31 | Anxiety Two category | 0.69032 |
| Q32 | Anxiety Two category | 0.68472 |
| Q30 | Anxiety Two category | 0.68466 |
| Q33 | Anxiety Two category | 0.67673 |
| Q36 | Anxiety Two category | 0.65965 |
| Q35 | Anxiety Two category | 0.58508 |
| Q34 | Anxiety Two category | 0.54546 |
| Q37 | Anxiety Two category | 0.48791 |
| Q19 | Anxiety Two category | 0.14877 |
| Q22 | Anxiety Two category | 0.11936 |
| Q26 | Anxiety Two category | 0.09987 |
| Q20 | Anxiety Two category | 0.08589 |
| Q18 | Anxiety Two category | 0.06622 |
| Q24A2 | Anxiety Two category | 0.05201 |
| Georegion | Anxiety Two category | 0.05052 |
| Q3 | Anxiety Two category | 0.02726 |
| Q24A3 | Anxiety Two category | 0.02619 |
| Q21 | Anxiety Two category | 0.01486 |
| Q25 | Anxiety Two category | 0.01195 |
| Q24A1 | Anxiety Two category | 0.00648 |
Correlation between each Attribute and the Second Experiment Target Attribute.
| Attributes | Target Attribute | Correlation coefficient |
|---|---|---|
| Q31 | Anxiety Three category | 0.64316 |
| Q30 | Anxiety Three category | 0.63942 |
| Q32 | Anxiety Three category | 0.63888 |
| Q33 | Anxiety Three category | 0.63119 |
| Q36 | Anxiety Three category | 0.61451 |
| Q35 | Anxiety Three category | 0.54564 |
| Q34 | Anxiety Three category | 0.50835 |
| Q37 | Anxiety Three category | 0.45479 |
| Q19 | Anxiety Three category | 0.13954 |
| Q22 | Anxiety Three category | 0.11146 |
| Q26 | Anxiety Three category | 0.09348 |
| Q20 | Anxiety Three category | 0.08045 |
| Q18 | Anxiety Three category | 0.06233 |
| Q24A2 | Anxiety Three category | 0.04852 |
| Georegion | Anxiety Three category | 0.04767 |
| Q3 | Anxiety Three category | 0.02526 |
| Q24A3 | Anxiety Three category | 0.02427 |
| Q21 | Anxiety Three category | 0.01428 |
| Q25 | Anxiety Three category | 0.01328 |
| Q24A1 | Anxiety Three category | 0.00807 |
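The two correlation tables rank every attribute against the target, which is the kind of ordering typically used to pick the top-k features. The sketch below shows one way to produce such a ranking. It assumes a Pearson coefficient and ranking by absolute value; the source does not state which coefficient was used, and the column values are toy data invented for illustration. The function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, target):
    """Return (name, |r|) pairs sorted from strongest to weakest."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def top_k(columns, target, k):
    """Names of the k attributes most correlated with the target."""
    return [name for name, _ in rank_features(columns, target)[:k]]


# Toy data (NOT from the study): six respondents, binary target.
target = [0, 0, 1, 1, 1, 0]
columns = {
    "Q19": [3, 4, 2, 5, 3, 4],
    "Q31": [0, 1, 2, 3, 3, 0],
    "Q24A1": [0, 0, 0, 0, 0, 0],
}
ranking = rank_features(columns, target)
selected = top_k(columns, target, 2)
```

On this toy data the GAD-7 item ranks first and the constant column last, mirroring the pattern in the tables above where Q30 to Q37 dominate the ranking.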
Fig. 3. Tuning Kernel function.
Fig. 4. Tuning the cost.
Optimum hyperparameters for the proposed SVM model.
| Parameters | Optimal value chosen |
|---|---|
| Kernel | Poly Kernel |
| C | 2 |
| Epsilon | 1.0E-12 |
Fig. 5. Tuning the confidence factor.
Optimum hyperparameters for the proposed J48 model.
| Parameters | Optimal value chosen |
|---|---|
| Confidence Factor | 0.45 |
| MinNumObj | 2 |
Fig. 6. Optimizing Kernel functions.
Fig. 7. Optimizing the cost.
Optimum hyperparameters for the proposed SVM model.
| Parameters | Optimal value chosen |
|---|---|
| Kernel | Poly Kernel |
| C | 4 |
| Epsilon | 1.0E-12 |
Fig. 8. Tuning the confidence factor.
Optimum hyperparameters for the proposed J48 model.
| Parameters | Optimal value chosen |
|---|---|
| Confidence Factor | 0.15 |
| MinNumObj | 2 |
Average accuracy of different feature subsets of the two-class classification experiment.
| Number of features | Accuracy of SVM | Accuracy of J48 | Average accuracy of each set of features |
|---|---|---|---|
| | 100% | 95.79% | 97.90% |
| | 100% | 95.96% | 97.98% |
| | 95.76% | 95.00% | 95.38% |
| | 92.97% | 93.27% | 93.12% |
| | 91.95% | 91.51% | 91.73% |
| | 90.19% | 90.19% | 90.19% |
Average accuracy of different feature subsets of the three-class classification experiment.
| Number of features | Accuracy of SVM | Accuracy of J48 | Average accuracy of each set of features |
|---|---|---|---|
| | 100% | 92.81% | 96.40% |
| | 100% | 93.50% | 96.75% |
| | 93.14% | 91.48% | 92.31% |
| | 89.63% | 89.96% | 89.79% |
| | 87.11% | 88.66% | 87.89% |
| | 85.18% | 86.77% | 85.98% |
Results of classifiers after optimization and feature selection of the two-class classification experiment.
| Performance Measure | SVM | J48 |
|---|---|---|
| Accuracy | 100% | 95.96% |
| Precision | 1 | 0.974 |
| Recall | 1 | 0.975 |
| F-measure | 1 | 0.975 |
SVM Confusion matrix after Optimization and Feature Selection of the Two-class Classification Experiment.
| | Predicted: Anxiety | Predicted: Non-Anxiety |
|---|---|---|
| Actual: Anxiety | 2425 (TP) | 0 (FN) |
| Actual: Non-Anxiety | 0 (FP) | 592 (TN) |
Results of classifiers after optimization and feature selection of the three-class classification experiment.
| Performance Measure | SVM | J48 |
|---|---|---|
| Accuracy | 100% | 93.50% |
| Precision | 1 | 0.933 |
| Recall | 1 | 0.935 |
| F-measure | 1 | 0.934 |
J48 confusion matrix after optimization and feature selection of the two-class classification experiment.
| | Predicted: Anxiety | Predicted: Non-Anxiety |
|---|---|---|
| Actual: Anxiety | 2365 (TP) | 60 (FN) |
| Actual: Non-Anxiety | 62 (FP) | 530 (TN) |
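The reported accuracy, precision, recall, and F-measure can be recomputed from the four cells of a two-class confusion matrix. The sketch below does this for the J48 counts above and for the SVM counts from the earlier two-class matrix, treating Anxiety as the positive class. That choice of positive class is an assumption, although it reproduces the J48 figures reported for this experiment (95.96% accuracy, precision 0.974, recall 0.975, F-measure 0.975). The function name is hypothetical.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, precision, recall, and F-measure for the positive class."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Counts copied from the two-class confusion matrices in the source.
j48 = binary_metrics(tp=2365, fn=60, fp=62, tn=530)
svm = binary_metrics(tp=2425, fn=0, fp=0, tn=592)

for name, result in (("J48", j48), ("SVM", svm)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Both matrices sum to 3017, the number of survey participants.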
SVM Confusion matrix after Optimization and Feature Selection of the Three-class Classification Experiment.
| | Predicted: Mild | Predicted: Moderate | Predicted: Severe |
|---|---|---|---|
| Actual: Mild | 2425 | 0 | 0 |
| Actual: Moderate | 0 | 247 | 0 |
| Actual: Severe | 0 | 0 | 345 |
J48 confusion matrix after optimization and feature selection of the three-class classification experiment.
| | Predicted: Mild | Predicted: Moderate | Predicted: Severe |
|---|---|---|---|
| Actual: Mild | 2375 | 0 | 76 |
| Actual: Moderate | 0 | 211 | 34 |
| Actual: Severe | 50 | 36 | 235 |
J48 TP, FP, FN, and TN rates of the Three-class Classification Experiment.
| Rate | Mild | Moderate | Severe |
|---|---|---|---|
| TP | 2375 | 211 | 235 |
| FP | 50 | 36 | 110 |
| FN | 76 | 34 | 86 |
| TN | 516 | 2736 | 2586 |
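The per-class counts in this table follow from the three-class J48 confusion matrix by a one-vs-rest reading: for each class, TP is the diagonal cell, FN is the rest of its row, FP is the rest of its column, and TN is everything else. The sketch below assumes rows are actual classes and columns are predicted classes, as the matrix is labeled in the source. The function name is hypothetical.

```python
def per_class_counts(matrix, labels):
    """One-vs-rest TP/FP/FN/TN for each class of a square confusion matrix.

    Assumes matrix[i][j] counts samples of actual class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    counts = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                  # rest of the row
        fp = sum(row[k] for row in matrix) - tp   # rest of the column
        tn = total - tp - fp - fn
        counts[label] = {"TP": tp, "FP": fp, "FN": fn, "TN": tn}
    return counts


# J48 three-class confusion matrix as printed in the source.
labels = ["Mild", "Moderate", "Severe"]
j48_matrix = [
    [2375, 0, 76],
    [0, 211, 34],
    [50, 36, 235],
]
counts = per_class_counts(j48_matrix, labels)
accuracy = sum(j48_matrix[k][k] for k in range(len(labels))) / sum(map(sum, j48_matrix))
```

The diagonal sum divided by the total gives the overall accuracy, which matches the 93.50% reported for J48 in the three-class experiment.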
Comparing the accuracies of classifiers in 2-class and 3-class Experiments.
| Classifier | Anxiety Two-class | Anxiety Three-class |
|---|---|---|
| SVM | 100% | 100% |
| J48 | 95.96% | 93.50% |
Fig. 9. SVM ROC curve for classifying the two-class problem: (a) Class zero (b) Class one.
Fig. 10. SVM ROC curve for classifying the three-class problem: (a) Class zero (b) Class one (c) Class two.
Fig. 11. J48 ROC curve for classifying the two-class problem: (a) Class zero (b) Class one.
Fig. 12. J48 ROC curve for classifying the three-class problem: (a) Class zero (b) Class one (c) Class two.