Abstract
Correlation analysis is a widely used technique for identifying relationships in data. These relationships indicate how relevant each attribute is to the target class to be predicted. This study uses correlation analysis and machine learning-based approaches to identify the attributes in a dataset that have a significant impact on classifying a patient's mental health status. The correlation analysis was performed in Weka on two datasets: one of depressive disorder symptoms recorded alongside weather conditions, and one for emotion classification based on physiological sensor readings. Pearson's product-moment correlation and several classification algorithms were used for the analysis. The results show interesting correlations in weather attributes for bipolar patients, as well as in features extracted from physiological data for emotional states.
Keywords: correlation analysis; data analytics; health care; machine learning
Year: 2018 PMID: 30572595 PMCID: PMC6313491 DOI: 10.3390/ijerph15122907
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Correlation based on direction, form, and dispersion strength.
Correlation techniques based on data types. The highlighted Spearman’s correlation shows that it considers only ordinal data in the categorical dataset type.
| Data Types | Dependent Variable: Categorical | Dependent Variable: Quantitative |
|---|---|---|
| Categorical (Nominal) | Chi-square test of independence | Analysis of variance (ANOVA) |
| Categorical (Ordinal) | Chi-square test of independence | Spearman’s correlation |
| Quantitative | Lift, χ² test of independence (categorized quantitative variable) | Pearson product-moment correlation, Spearman’s correlation |
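As a rough illustration of the two correlation measures named in the table above, the following sketch computes Pearson's product-moment correlation and Spearman's rank correlation in plain Python. The helper names (`pearson`, `ranks`, `spearman`) and the sample data are hypothetical and are not taken from the study, which ran its analysis in Weka.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's correlation: Pearson's correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(round(r_pearson, 3), round(r_spearman, 3))
```

Because Spearman's measure works on ranks, it reports a perfect monotonic relationship for this data, while Pearson's measure is slightly lower because the relationship is not linear.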
Figure 2. Methodology to identify strong predictor attributes.
Depressive Disorder Symptoms for Bipolar and Melancholia disorder. The dds keyword in symptom ID (Identifier) represents Depressive Disorder Symptom, followed by the identifier number.
| Symptom ID | Symptoms | Bipolar Disorder | Melancholia Disorder |
|---|---|---|---|
| dds.01 | Sadness/Worthless/Hopeless | (major) | (major) |
| dds.02 | Insomnia | (major) | (major) |
| dds.03 | Retardation | (major) | (major) |
| dds.05 | Elevated feelings and energy for activity | (major) | |
| dds.16 | Isolation | (minor) | |
| dds.12 | Loss of Interest | (minor) | (major) |
| dds.22 | Fatigue | (major) | (major) |
| dds.10 | Anxiety | (minor) | |
| dds.08 | Suicide | (major) | |
| dds.11 | Weight Loss/Gain | (minor) | (major) |
| dds.07 | Irritation | (major) | (minor) |
Sample Questions for depressive disorder symptom dds.01. The dds keyword in Question ID (Identifier) represents Depressive Disorder Symptom, followed by the identifier number, followed by q, which represents Question, followed by question number.
| Question ID | Question | Score: Yes, always | Score: Yes, sometimes | Score: No, never |
|---|---|---|---|---|
| dds.01.q1 | Do you consider objects and situations as unimportant as you think you are (e.g., homework, grooming, waking up in the morning etc.)? | 30 | 15 | 0 |
| dds.01.q2 | Do you feel that you have no value? | 30 | 15 | 0 |
| dds.01.q3 | Do you usually walk with your head down? | 5 | 2.5 | 0 |
| dds.01.q4 | Do you often have negative statements? | 5 | 2.5 | 0 |
| dds.01.q5 | Do you often use gestures that are dramatic and out of context? | 5 | 2.5 | 0 |
| dds.01.q6 | Do you feel loss of interest in doing activities? | 5 | 2.5 | 0 |
| dds.01.q7 | Do you often perceive your skill set as inadequate for the task at hand? | 5 | 2.5 | 0 |
| dds.01.q8 | Do you mostly have negative anticipation about your future? | 5 | 2.5 | 0 |
| dds.01.q9 | Do you feel losing affection in things? | 5 | 2.5 | 0 |
| dds.01.q10 | Have you ever mentioned some of the following or similar statements: Who could ever want to be my friend? What do my parents really think of me? Why would anyone want to accept the love of a worthless person like me? | 5 | 2.5 | 0 |
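The per-question scores in the table above could be combined into a single value for symptom dds.01, for example by summing them. The sketch below assumes simple summation and assumes that unanswered questions contribute nothing; neither rule is stated in the table, and the names `WEIGHTS` and `symptom_score` are hypothetical.

```python
# Scores per response, transcribed from the sample table for symptom dds.01.
HIGH = {"always": 30.0, "sometimes": 15.0, "never": 0.0}  # questions q1 and q2
LOW = {"always": 5.0, "sometimes": 2.5, "never": 0.0}     # questions q3 to q10

WEIGHTS = {f"dds.01.q{i}": (HIGH if i <= 2 else LOW) for i in range(1, 11)}


def symptom_score(responses):
    """Sum the scores of the given answers (assumption: plain summation,
    with unanswered questions contributing nothing)."""
    return sum(WEIGHTS[question][answer] for question, answer in responses.items())


# Highest reachable total if every question is answered "Yes, always".
max_score = sum(scores["always"] for scores in WEIGHTS.values())

# Hypothetical set of answers from one respondent.
responses = {
    "dds.01.q1": "sometimes",
    "dds.01.q2": "always",
    "dds.01.q3": "never",
    "dds.01.q6": "sometimes",
}
score = symptom_score(responses)
print(score, max_score)
```

Under this reading, the first two questions carry 60 of the 100 available points, so they dominate the score for this symptom.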
Profile information of dataset used for depressive disorder analysis. Min Value and Max Value are the Minimum and Maximum values in the dataset. Std. Dev is the standard deviation.
| Attribute | Type | Unit | Min Value | Max Value | Mean | Std. Dev |
|---|---|---|---|---|---|---|
| Season | Nominal | - | - | - | - | - |
| Temperature | Numeric | °C | −8.2 | 31.6 | 12.748 | 10.551 |
| Atmospheric Pressure | Numeric | hPa | 998.1 | 1034.1 | 1016.88 | 8.083 |
| Humidity | Numeric | % | 32 | 99 | 64.621 | 15.665 |
| Visibility | Numeric | km | 1.8 | 20 | 12.82 | 5.071 |
| Wind Speed | Numeric | km/h | 1.9 | 15.2 | 6.481 | 2.488 |
| Rain | Nominal | - | - | - | - | - |
| Snow | Nominal | - | - | - | - | - |
| Storm | Nominal | - | - | - | - | - |
| Fog | Nominal | - | - | - | - | - |
| Ozone | Numeric | ppm | 0.006 | 0.044 | 0.027 | 0.012 |
| Carbon Monoxide | Numeric | ppm | 0.48 | 0.82 | 0.64 | 0.098 |
| Nitrogen dioxide | Numeric | ppm | 0.02 | 0.043 | 0.033 | 0.007 |
| Depression Severity | Nominal | - | - | - | - | - |
Description of the eight emotional states examined.
| Emotion Class | Source to Trigger | Description | Arousal and Valence Scale |
|---|---|---|---|
| No Emotion | Blank | Boredom | Low arousal and neutral valence |
| Anger | Images of people arousing anger | Feeling of fighting | Very high arousal and very negative valence |
| Hate | Image of injustice and cruelty | Anger of lesser severity | Low arousal and negative valence |
| Grief | Image of deformed child or thought of loss of mother | Sadness | High arousal and negative valence |
| Platonic Love | Images of family summer | Happiness and peace | Low arousal and positive valence |
| Romantic Love | Erotic imagery | Lust and feeling for romance | Very high arousal and positive valence |
| Joy | Song of joy | Stronger feelings of happiness | Medium high arousal and positive valence |
| Reverence | Images of holy places and reciting prayers | Calm and peaceful feelings | Very low arousal and neutral valence |
Extracted features of raw physiological signals described by Picard et al. [51].
| Feature Label | Description |
|---|---|
| f1 | Windowed means of the raw signals. |
| f2 | Standard deviations of the raw signals, based on windowed means. |
| f3 | Windowed means of absolute values of the first forward differences of the raw signals. |
| f4 | Windowed means of absolute values of the first forward differences of the normalized signals. |
| f5 | Windowed means of absolute values of the second forward differences of the raw signals. |
| f6 | Windowed means of absolute values of the second forward differences of the normalized signals. |
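The six features listed above can be sketched for a single window of samples as follows. This is a minimal illustration under stated assumptions, not the extraction code used in the study: the second difference is taken as `x[n+2] - x[n]`, the sample standard deviation is used, and "normalized" means subtracting the window mean and dividing by that standard deviation. The function names and the sample window are hypothetical.

```python
from statistics import mean, stdev


def abs_diff_mean(signal, lag):
    """Mean absolute difference between samples that are `lag` positions apart."""
    return mean(abs(signal[i + lag] - signal[i]) for i in range(len(signal) - lag))


def window_features(window):
    """Compute f1..f6 for one window of raw samples (needs at least 3 samples)."""
    mu = mean(window)
    sigma = stdev(window)  # assumption: sample standard deviation
    if sigma > 0:
        normalized = [(value - mu) / sigma for value in window]
    else:
        # A constant window has no variation to normalize.
        normalized = [0.0] * len(window)
    return {
        "f1": mu,                            # mean of the raw signal
        "f2": sigma,                         # standard deviation of the raw signal
        "f3": abs_diff_mean(window, 1),      # first differences, raw
        "f4": abs_diff_mean(normalized, 1),  # first differences, normalized
        "f5": abs_diff_mean(window, 2),      # second differences, raw (lag 2)
        "f6": abs_diff_mean(normalized, 2),  # second differences, normalized
    }


# Hypothetical window of sensor readings.
features = window_features([1.0, 2.0, 4.0, 7.0])
print(features)
```

With this definition of normalization, f4 and f6 equal f3 and f5 divided by the standard deviation, which makes them comparable across signals with different amplitudes.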
Profile information of dataset used for emotion detection. Min Value and Max Value are the Minimum and Maximum values in the dataset. Std. Dev is the standard deviation.
| Attribute | Type | Min Value | Max Value | Mean | Std. Dev |
|---|---|---|---|---|---|
| EMG-f1 | Numeric | 1.24 | 329.11 | 3.644 | 6.438 |
| EMG-f2 | Numeric | 0 | 192.05 | 1.147 | 5.582 |
| EMG-f3 | Numeric | 0 | 50.115 | 0.017 | 0.196 |
| EMG-f4 | Numeric | 0 | 7.29 | 0.003 | 0.022 |
| EMG-f5 | Numeric | 0 | 50.594 | 0.014 | 0.188 |
| EMG-f6 | Numeric | 0 | 7.29 | 0.003 | 0.021 |
| BVP-f1 | Numeric | 20.717 | 58.378 | 33.545 | 0.82 |
| BVP-f2 | Numeric | 1.002 | 48.439 | 9.023 | 4.736 |
| BVP-f3 | Numeric | 0 | 36.351 | 0.085 | 0.47 |
| BVP-f4 | Numeric | 0 | 0.493 | 0.008 | 0.022 |
| BVP-f5 | Numeric | 0 | 37.159 | 0.076 | 0.614 |
| BVP-f6 | Numeric | 0 | 0.417 | 0.007 | 0.018 |
| GSR-f1 | Numeric | 1.41 | 12.996 | 4.905 | 2.318 |
| GSR-f2 | Numeric | 0 | 0.973 | 0.025 | 0.051 |
| GSR-f3 | Numeric | 0 | 12.108 | 0.002 | 0.071 |
| GSR-f4 | Numeric | 0 | 3.358 | 0.001 | 0.013 |
| GSR-f5 | Numeric | 0 | 12.113 | 0.002 | 0.1 |
| GSR-f6 | Numeric | 0 | 3.362 | 0 | 0.018 |
| Respiration-f1 | Numeric | 37.853 | 64.598 | 56.82 | 6.802 |
| Respiration-f2 | Numeric | 0 | 3.252 | 0.297 | 0.319 |
| Respiration-f3 | Numeric | 0 | 63.365 | 0.016 | 0.707 |
| Respiration-f4 | Numeric | 0 | 3.306 | 0.009 | 0.023 |
| Respiration-f5 | Numeric | 0 | 63.365 | 0.018 | 0.999 |
| Respiration-f6 | Numeric | 0 | 3.33 | 0.001 | 0.016 |
| Emotion | Nominal | - | - | - | - |
Figure 3. Process flow of Data Analytics: (a) Emotion detection. (b) Identifying depressive disorder severity.
Ranking of attributes based on Correlation Coefficient.
| Rank | Bipolar-Disorder | Merit | Melancholia-Disorder | Merit |
|---|---|---|---|---|
| 1 | Temperature | 0.526 | Season | 243.18 |
| 2 | Atmospheric Pressure | 0.421 | Ozone | 182.8 |
| 3 | Season | 0.38 | Carbon-monoxide | 155.3 |
| 4 | Ozone | 0.31 | Temperature | 102.6 |
| 5 | Nitrogen-dioxide | 0.29 | Nitrogen-dioxide | 94.4 |
| 6 | Carbon-monoxide | 0.264 | Atmospheric Pressure | 33.6 |
| 7 | Snow | 0.204 | Fog | 15.2 |
| 8 | Humidity | 0.179 | Snow | 10.2 |
| 9 | Fog | 0.13 | Storm | 6.5 |
| 10 | Rain | 0.125 | Humidity | 2.43 |
| 11 | Visibility | 0.122 | Visibility | 1.7 |
| 12 | Storm | 0.105 | Wind speed | 0.008 |
| 13 | Wind speed | 0.07 | Rain | 0.003 |
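A correlation-based ranking like the one above can be sketched by scoring each numeric attribute with the absolute value of its Pearson correlation against a numerically coded class and sorting by that merit. The code below is only an approximation of the idea: it handles numeric attributes only, it is not Weka's implementation, and the attribute values and severity codes are hypothetical rather than taken from the study's dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rank_by_correlation(columns, target):
    """Return (attribute, merit) pairs sorted from strongest to weakest,
    where merit is |Pearson r| between the attribute and the target."""
    merits = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(merits.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: six observations of two weather attributes.
columns = {
    "wind_speed": [6.0, 3.0, 7.0, 4.0, 6.5, 4.5],
    "temperature": [2.0, 5.0, 9.0, 14.0, 20.0, 25.0],
}
severity = [0, 0, 1, 1, 2, 2]  # hypothetical numeric coding of the class

ranking = rank_by_correlation(columns, severity)
for name, merit in ranking:
    print(f"{name}: {merit:.3f}")
```

Taking the absolute value means that a strong negative correlation ranks as highly as a strong positive one, since either makes the attribute informative about the class.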
Figure 4. Scatter plot in Weka of the top-ranked weather parameters for bipolar disorder.
Correlation-based ranking for physiological dataset.
| Physiological Sensor | Correlation with Emotion | p-value |
|---|---|---|
| EMG | −0.085599801 | <0.001 |
| BVP | −0.001507079 | 0.39 |
| GSR | −0.073850046 | <0.001 |
| RESP | −0.023263153 | <0.001 |
Correlation-based ranking for physiological dataset.
| Feature | EMG-f1 | EMG-f3 | EMG-f5 | BVP-f3 | BVP-f5 | GSR-f3 | GSR-f4 | GSR-f5 | Resp-f3 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ||||||||
| 0.9095 | ||||||||
| 0.0200 | 0.6472 | |||||||
| −0.0219 | 0.2994 | 0.6227 | ||||||
| 4.09 × 10⁻⁵ | 0.16253 | 0.336666 | 0.647586 | 1 | ||||
| 0.0037 | 0.2420 | 0.2529 | 0.7875 | 0.6084 | 1 | |||
| 0.0001 | 0.1733 | 0.3592 | 0.5576 | 0.8610 | 0.7066 | 0.3286 | 1 | |
| 0.0002 | 0.0867 | 0.1793 | 0.1491 | 0.2309 | 0.333 | 0.6973 | 0.4712 | |
| 0.0006 | 0.2431 | 0.25349 | 0.86176 | 0.66583 | 0.90413 | 0.23156 | 0.64011 | 1 |
| 6.76 × 10⁻⁵ | 0.17323 | 0.35939 | 0.609938 | 0.941885 | 0.639795 | 0.163974 | 0.905452 | 0.706954 |
Correlation-based ranking for physiological dataset.
| Rank | Feature | Weight for Ranking | Rank | Feature | Weight for Ranking |
|---|---|---|---|---|---|
| 1 | BVP-f2 | 0.12407258 | 13 | BVP-f3 | 5.1806 × 10⁻⁴ |
| 2 | GSR-f1 | 0.09261232 | 14 | Resp-f5 | 6.639 × 10⁻⁵ |
| 3 | EMG-f1 | 0.07958459 | 15 | BVP-f5 | 5.092 × 10⁻⁵ |
| 4 | GSR-f2 | 0.0676282 | 16 | GSR-f5 | 3.489 × 10⁻⁵ |
| 5 | EMG-f2 | 0.0629491 | 17 | Resp-f3 | 3.022 × 10⁻⁵ |
| 6 | Resp-f2 | 0.05206328 | 18 | Resp-f6 | 2.369 × 10⁻⁵ |
| 7 | Resp-f1 | 8.8696 × 10⁻³ | 19 | EMG-f3 | 2.314 × 10⁻⁵ |
Classification results for bipolar disorder. SVM stands for Support Vector Machine.
| Number of Predictors | Logit Boost (%) | SVM (%) | Random Forest (%) | Logistic Regression (%) |
|---|---|---|---|---|
| 13 | 85.16 | 80.77 | 87.36 | 79.12 |
| 12 | 85.16 | 80.77 | 87.91 | 80.22 |
| 11 | 86.26 | 81.32 | 88.46 | 83.52 |
| 10 | 86.26 | 82.42 | 89.01 | 84.07 |
| 9 | 86.26 | 81.32 | 88.46 | 85.16 |
| 8 | 85.71 | 84.07 | 89.01 | 84.62 |
| 7 | 85.71 | 84.62 | 87.91 | 84.07 |
| 6 | 84.62 | 84.62 | 87.91 | 81.87 |
| 5 | 85.71 | 84.62 | 85.16 | 84.07 |
| 4 | 82.42 | 84.62 | 84.07 | 81.87 |
| 3 | 74.18 | 72.53 | 67.58 | 72.53 |
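To make the pattern in the table above easier to read, the following sketch finds, for each classifier, its highest accuracy and the predictor counts at which that accuracy is reached. The accuracy values are transcribed from the table; the names `ACCURACY`, `MODELS`, and `best_setting` are hypothetical.

```python
# Accuracies (%) transcribed from the table above, keyed by number of predictors.
# Column order: Logit Boost, SVM, Random Forest, Logistic Regression.
ACCURACY = {
    13: (85.16, 80.77, 87.36, 79.12),
    12: (85.16, 80.77, 87.91, 80.22),
    11: (86.26, 81.32, 88.46, 83.52),
    10: (86.26, 82.42, 89.01, 84.07),
    9: (86.26, 81.32, 88.46, 85.16),
    8: (85.71, 84.07, 89.01, 84.62),
    7: (85.71, 84.62, 87.91, 84.07),
    6: (84.62, 84.62, 87.91, 81.87),
    5: (85.71, 84.62, 85.16, 84.07),
    4: (82.42, 84.62, 84.07, 81.87),
    3: (74.18, 72.53, 67.58, 72.53),
}
MODELS = ("Logit Boost", "SVM", "Random Forest", "Logistic Regression")


def best_setting(model_index):
    """Return (best accuracy, sorted predictor counts that reach it) for one model."""
    best = max(row[model_index] for row in ACCURACY.values())
    counts = sorted(n for n, row in ACCURACY.items() if row[model_index] == best)
    return best, counts


summary = {name: best_setting(i) for i, name in enumerate(MODELS)}
for name, (best, counts) in summary.items():
    print(f"{name}: {best}% with {counts} predictors")
```

In these figures, no classifier reaches its best accuracy with all 13 predictors, and every classifier drops sharply when only 3 predictors remain.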
Figure 5. Accuracies of prediction models with respect to stepwise feature selection.