Ward van Breda, Vincent Bremer, Dennis Becker, Mark Hoogendoorn, Burkhardt Funk, Jeroen Ruwaard, Heleen Riper.
Abstract
In this paper, we explore the potential of predicting therapy success for patients in mental health care. Such predictions can eventually improve the process of matching effective therapy types to individuals. In the EU project E-COMPARED, a variety of information is gathered about patients suffering from depression. We use this data, where 276 patients received treatment as usual and 227 received blended treatment, to investigate to what extent we are able to predict therapy success. We utilize different encoding strategies for preprocessing, varying feature selection techniques, and different statistical procedures for this purpose. Significant predictive power is found with average AUC values up to 0.7628 for treatment as usual and 0.7765 for blended treatment. Adding daily assessment data for blended treatment does not currently improve predictive accuracy. Cost-effectiveness analysis is needed to determine the added potential for real-world applications.
Keywords: Classification; Depression; E-health; Prediction; Therapy success
Year: 2018 PMID: 29862165 PMCID: PMC5945603 DOI: 10.1016/j.invent.2017.08.003
Source DB: PubMed Journal: Internet Interv ISSN: 2214-7829
The EMA measures that are present in the dataset.
| Abbreviation | EMA question |
|---|---|
| Mood | How is your mood right now? |
| Worry | How much do you worry about things at the moment? |
| Self-esteem | How good do you feel about yourself right now? |
| Sleep | How did you sleep tonight? |
| Activities done | To what extent have you carried out enjoyable activities today? |
| Enjoyed activities | How much have you enjoyed the day's activities? |
| Social contact | How much have you been involved in social interactions today? |
The number of successful and unsuccessful therapy effects in the TAU and BT datasets.
| Treatment | Nr patients | Unsuccessful | Successful |
|---|---|---|---|
| TAU | 276 | 231 | 45 |
| BT | 227 | 169 | 58 |
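The counts in the table above imply a marked class imbalance, which is one reason AUC is more informative here than raw accuracy. The following minimal sketch works through that arithmetic; the function names are illustrative and not taken from the paper.

```python
# Counts copied from the table above: (unsuccessful, successful).
OUTCOMES = {"TAU": (231, 45), "BT": (169, 58)}


def success_rate(unsuccessful, successful):
    """Fraction of patients whose therapy was labelled successful."""
    return successful / (unsuccessful + successful)


def majority_baseline_accuracy(unsuccessful, successful):
    """Accuracy of a trivial model that always predicts the larger class."""
    return max(unsuccessful, successful) / (unsuccessful + successful)


for name, (unsuccessful, successful) in OUTCOMES.items():
    print(
        name,
        unsuccessful + successful,
        round(success_rate(unsuccessful, successful), 3),
        round(majority_baseline_accuracy(unsuccessful, successful), 3),
    )
# TAU 276 0.163 0.837
# BT 227 0.256 0.744
```

A model that always predicts "unsuccessful" would already reach about 84% accuracy on TAU, so accuracy alone says little about whether successful cases are being identified.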
Number of features in the binary-encoded dataset for each feature selection setting.
| Feature selection | TAU | BT | BT + EMA |
|---|---|---|---|
| None | 420 | 419 | 538 |
| FS | 119 | 113 | 232 |
| FS + RFE | 50 | 50 | 50 |
Number of features in the mixed-encoded dataset for each feature selection setting.
| Feature selection | TAU | BT | BT + EMA |
|---|---|---|---|
| None | 292 | 272 | 391 |
| FS | 83 | 83 | 202 |
| FS + RFE | 50 | 50 | 50 |
Resulting AUCs using the different setups including 95% confidence intervals.
| Encoding | Feature selection | RF (50) | RF (100) | KNN | GLMB |
|---|---|---|---|---|---|
| Binary | No | 0.7282 | 0.7430 | 0.5515 | 0.7027 |
| Binary | FS | 0.7401 | 0.6257 | 0.7098 | |
| Binary | FS + RFE | 0.7289 | 0.7337 | 0.6748 | 0.6856 |
| Mixed | No | 0.5875 | 0.6757 | | |
| Mixed | FS | 0.7499 | 0.6521 | 0.6871 | |
| Mixed | FS + RFE | 0.7402 | 0.7531 | 0.6871 | 0.6713 |
| Binary | No | 0.7151 | 0.7254 | 0.6608 | 0.7344 |
| Binary | FS | 0.6968 | 0.7077 | 0.6802 | 0.7404 |
| Binary | FS + RFE | 0.6864 | 0.6968 | 0.6919 | 0.7369 |
| Mixed | No | 0.7244 | 0.7145 | 0.7009 | |
| Mixed | FS | 0.7115 | 0.7187 | 0.6955 | |
| Mixed | FS + RFE | 0.6899 | 0.6869 | 0.6794 | 0.7496 |
| Binary | No | 0.7126 | 0.7320 | 0.6243 | 0.7383 |
| Binary | FS | 0.7229 | 0.7200 | 0.6506 | 0.7543 |
| Binary | FS + RFE | 0.6931 | 0.6962 | 0.6385 | 0.7000 |
| Mixed | No | 0.7251 | 0.7196 | 0.6801 | 0.7565 |
| Mixed | FS | 0.7179 | 0.7124 | 0.6522 | 0.7607 |
| Mixed | FS + RFE | 0.6669 | 0.6860 | 0.6399 | 0.6974 |
Fig. 1. Average ROC over 20 samples for TAU using a random forest with 50 trees, mixed encoding and no feature selection. The mean AUC is 0.7620, with a 95% confidence interval of 0.7181–0.7892.
Fig. 2. Average ROC over 20 samples for BT using a GLMB with FS as feature selection and mixed encoding. The mean AUC is 0.7765, with a 95% confidence interval of 0.7143–0.7822.
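The figure captions report a mean AUC over 20 samples. As an illustration of what is being averaged, the sketch below computes AUC from its rank-based definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one) and then averages it over several samples. This is not the authors' code: the function names and the toy labels and scores are hypothetical, and the source does not say how the confidence intervals were obtained, so none is computed here.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def mean_auc(samples):
    """Average the AUC over an iterable of (labels, scores) pairs."""
    values = [auc(labels, scores) for labels, scores in samples]
    return sum(values) / len(values)


# Hypothetical toy data, not taken from the study.
toy_samples = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1], [0.2, 0.9]),
]
print(auc(*toy_samples[0]))   # 0.75
print(mean_auc(toy_samples))  # 0.875
```

The pairwise form is quadratic in the number of cases, which is acceptable for datasets of a few hundred patients; a sort-based implementation would be preferable for larger data.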
Top 10 features that were found to be important by the predictive model that generated the highest range of AUC values for the TAU dataset: a random forest with 50 trees, mixed encoding and no feature selection. The order of the features is the average order over the 20 training iterations.
| Imp. | Feature name | Dataset |
|---|---|---|
| 1 | aPHQ_sum | PHQ-9 (eng.) |
| 2 | aAge | Demographics |
| 3 | atreat17aMPK | Current tr. |
| 4 | aQIDS05 | QIDS-SR16 |
| 5 | aPHQ03 | PHQ-9 |
| 6 | aPHQ04 | PHQ-9 |
| 7 | aEQ5D5L4 | EQ-5D-5L |
| 8 | aPHQ02 | PHQ-9 |
| 9 | aEQ5D5L5 | EQ-5D-5L |
| 10 | aQIDS14 | QIDS-SR16 |
Top 10 features that were found to be important by the predictive model that generated the highest range of AUC values for the BT dataset: a GLMB with FS as feature selection and mixed encoding. The order of the features is the average order over the 20 training iterations.
| Imp. | Feature name | Dataset |
|---|---|---|
| 1 | ccxPoland | Demographics |
| 2 | aQIDS08 | QIDS-SR16 |
| 3 | aQIDS11 | QIDS-SR16 |
| 4 | atreat17a1 | Current tr. |
| 5 | amini9no | M.I.N.I. |
| 6 | atreat4 | Current tr. |
| 7 | aMarital | Demographics |
| 8 | amini7b | M.I.N.I. |
| 9 | aQIDS01 | QIDS-SR16 |
| 10 | atreat7 | Current tr. |
The confusion matrix that results from the prediction of the model for TAU using a random forest, no feature selection and mixed encoding with a chosen criterion value of 0.09, which corresponds to a true positive rate of 0.8462 and a false positive rate of 0.4854.
| | Actual 0 | Actual 1 |
|---|---|---|
| Predicted 0 | 476 | 30 |
| Predicted 1 | 449 | 165 |
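The rates quoted in the caption can be recomputed from the four cells. The sketch below assumes that rows are the predicted class and columns the actual class, since that reading reproduces the quoted values; the function name is illustrative.

```python
def tpr_fpr(tn, fn, fp, tp):
    """Return (true positive rate, false positive rate) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # share of actual positives that were predicted positive
    fpr = fp / (fp + tn)  # share of actual negatives that were predicted positive
    return tpr, fpr


# Cells from the confusion matrix, read as rows = predicted, columns = actual.
tpr, fpr = tpr_fpr(tn=476, fn=30, fp=449, tp=165)
print(round(tpr, 4), round(fpr, 4))  # 0.8462 0.4854
```

With this reading, 165 of the 195 actual successes are detected, while 449 of the 925 actual non-successes are also flagged as successes, which is the trade-off implied by the low criterion value of 0.09.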