John Wallert, Julia Boberg, Viktor Kaldo, David Mataix-Cols, Oskar Flygare, James J Crowley, Matthew Halvorsen, Fehmi Ben Abdesslem, Magnus Boman, Evelyn Andersson, Nils Hentati Isacsson, Ekaterina Ivanova, Christian Rück.
Abstract
This study applied supervised machine learning with multi-modal data to predict remission of major depressive disorder (MDD) after psychotherapy. Genotyped adult patients (n = 894, 65.5% women, age 18-75 years) diagnosed with mild-to-moderate MDD and treated with guided Internet-based Cognitive Behaviour Therapy (ICBT) at the Internet Psychiatry Clinic in Stockholm were included (2008-2016). Predictor types were demographic, clinical, process (e.g., time to complete online questionnaires), and genetic (polygenic risk scores). Outcome was remission status post ICBT (cut-off ≤10 on MADRS-S). Data were split into train (60%) and validation (40%) given ICBT start date. Predictor selection employed human expertise followed by recursive feature elimination. Model derivation was internally validated through cross-validation. The final random forest model was externally validated against a (i) null, (ii) logit, (iii) XGBoost, and (iv) blended meta-ensemble model on the hold-out validation set. Feature selection retained 45 predictors representing all four predictor types. With unseen validation data, the final random forest model proved reasonably accurate at classifying post ICBT remission (Accuracy 0.656 [0.604, 0.705], P vs null model = 0.004; AUC 0.687 [0.631, 0.743]), slightly better vs logit (bootstrap D = 1.730, P = 0.084) but not vs XGBoost (D = 0.463, P = 0.643). Transparency analysis showed model usage of all predictor types at both the group and individual patient level. A new, multi-modal classifier for predicting MDD remission status after ICBT treatment in routine psychiatric care was derived and empirically validated. The multi-modal approach to predicting remission may inform tailored treatment, and deserves further investigation to attain clinical usefulness.
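The date-ordered split, the outcome coding, and the comparison against a null model can be illustrated with a minimal Python sketch. The field names (`start`, `madrs_s_post`), the helper names, and the toy records are hypothetical and are not study data. The one-sided exact binomial test shown here is one common way to compare accuracy with a no-information rate; the abstract does not state which test the authors used.

```python
from datetime import date
from math import comb

REMISSION_CUTOFF = 10  # remission is defined as post-treatment MADRS-S <= 10


def label_remission(madrs_s_post):
    """Return True when the post-treatment score meets the remission cut-off."""
    return madrs_s_post <= REMISSION_CUTOFF


def temporal_split(records, train_fraction=0.6):
    """Order patients by start date; the earliest fraction trains, the rest validates."""
    ordered = sorted(records, key=lambda r: r["start"])
    n_train = round(len(ordered) * train_fraction)
    return ordered[:n_train], ordered[n_train:]


def binom_p_greater(successes, n, p_null):
    """One-sided exact binomial P(X >= successes) when the true rate is p_null."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(successes, n + 1)
    )


# Hypothetical toy records (assumption: one dict per patient).
records = [
    {"start": date(2008 + i % 9, 1 + i % 12, 1), "madrs_s_post": (7 * i) % 30}
    for i in range(10)
]
train, valid = temporal_split(records)
valid_labels = [label_remission(r["madrs_s_post"]) for r in valid]

# Toy example: 7 correct out of 10 against a null accuracy of 0.5.
p_value = binom_p_greater(7, 10, 0.5)
```

Splitting on start date rather than at random means the validation patients all began treatment after the training patients, which mimics applying the model prospectively.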
Year: 2022 PMID: 36050305 PMCID: PMC9437007 DOI: 10.1038/s41398-022-02133-3
Source DB: PubMed Journal: Transl Psychiatry ISSN: 2158-3188 Impact factor: 7.989
Fig. 1 Workflow of predictor selection, model derivation, and validation.
CV cross-validation, ICBT Internet-based Cognitive Behaviour Therapy, LOGIT binomial logistic regression, MDD Major Depressive Disorder, RF random forest, RFE recursive feature elimination, XGB eXtreme gradient boosted trees, META blended meta-ensemble model of LOGIT, RF, and XGB.
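A blended ensemble combines the predicted probabilities of its member models. The source does not specify how LOGIT, RF, and XGB were blended, so the sketch below assumes a plain weighted average purely for illustration; the function name `blend` and all probabilities are hypothetical.

```python
def blend(prob_lists, weights=None):
    """Weighted average of per-model predicted probabilities (assumed blending rule)."""
    n_models = len(prob_lists)
    if weights is None:
        weights = [1.0] * n_models  # equal weights unless stated otherwise
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, patient_probs)) / total
        for patient_probs in zip(*prob_lists)
    ]


# Hypothetical predicted remission probabilities for three patients.
logit_p = [0.2, 0.6, 0.9]
rf_p = [0.4, 0.5, 0.7]
xgb_p = [0.3, 0.7, 0.8]

meta_p = blend([logit_p, rf_p, xgb_p])
```

Other blending schemes, such as fitting a second-level model on the member predictions, are equally compatible with the caption's wording.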
Patient summary characteristics grouped by type and stratified by outcome.
| Variable | No Remission | Remission | Missing |
|---|---|---|---|
| n | 451 | 338 | 105 |
| Process | | | |
| ICBT start week of year | 25.8 (16.1) | 24.1 (15.6) | 26 |
| MADRS-S time of day | 15:06:08 (5:10:04) | 16:01:15 (4:71:03) | 29 |
| EQ5D time to complete | 141.4 (308.2) | 133.96 (362.1) | 31 |
| Genetic | | | |
| PRS-IQ ( | 0.05 (0.99) | −0.01 (1.04) | 0 |
| PRS-IQ ( | −0.07 (0.97) | 0.08 (0.99) | 0 |
| PRS-IQ ( | 0.00 (1.00) | 0.10 (0.96) | 0 |
| PRS-MDD ( | 0.02 (0.98) | −0.06 (1.03) | 0 |
| PRS-MDD ( | −0.04 (0.88) | −0.11 (0.88) | 0 |
| PRS-ASD ( | −0.00 (0.97) | 0.03 (1.03) | 0 |
| PRS-ASD ( | −0.00 (0.97) | 0.02 (1.02) | 0 |
| PRS-ADHD ( | −0.08 (1.00) | 0.04 (0.98) | 0 |
| PRS-ADHD ( | −0.04 (0.97) | −0.09 (1.00) | 0 |
| PRS-BP ( | −0.07 (0.99) | 0.06 (1.04) | 0 |
| PRS-EDU ( | 0.02 (0.95) | 0.11 (0.99) | 0 |
| PRS-EDU ( | 0.07 (0.99) | 0.08 (0.94) | 0 |
| PRS-Ancestry ( | 0.01 (1.07) | −0.02 (0.93) | 0 |
| Demographic | | | |
| Age | 39.0 (11.9) | 37.8 (11.7) | 0 |
| Education | 4.9 (1.2) | 5.2 (1.2) | 2 |
| Work experience | 311 (69) | 259 (77) | 2 |
| Clinical | | | |
| Prior mild MDD | 57 (13) | 59 (18.0) | 35 |
| Prior moderate MDD | 96 (22) | 72 (22) | 35 |
| Previous depression episodes | 5.6 (7.9) | 4.1 (5.2) | 0 |
| MADRS-S screen | 26.4 (5.7) | 23.3 (6.0) | 0 |
| MADRS-S pre | 24.0 (5.8) | 19.85 (6.0) | 0 |
| MADRS pre | 21.8 (5.8) | 19.7 (5.6) | 47 |
| PHQ-9 pre | 16.6 (5.1) | 13.7 (5.1) | 29 |
| EQ5D pre | 0.6 (0.2) | 0.7 (0.2) | 5 |
| EQ5D extreme anx/dep | 186 (43) | 89 (27) | 31 |
| EQ5D moderate pain | 216 (50) | 118 (36) | 31 |
| LSAS screen | 45.0 (25.2) | 38.1 (24.3) | 88 |
| PDSS screen | 5.5 (5.6) | 5.0 (5.4) | 34 |
| AUDIT screen | 5.6 (4.9) | 4.8 (4.4) | 2 |
| AUDIT-C screen | 3.5 (2.1) | 3.2 (2.0) | 59 |
| CGI-S pre | 3.7 (0.7) | 3.6 (0.8) | 111 |
| GSE pre | 24.7 (4.9) | 26.1 (5.0) | 263 |
| Prior psy meds | 0.9 (1.2) | 0.6 (0.9) | 0 |
| Current psy meds | 1.3 (1.3) | 0.9 (1.2) | 0 |
| No prior psy med | 139 (32) | 144 (44) | 35 |
| Any current psy med | 235 (65) | 147 (51) | 157 |
| Variable sleep-wake pattern | 174 (51) | 109 (43) | 218 |
| Reduced sex drive | 181 (58) | 131 (56) | 272 |
| Retarded speech | 28 (8) | 26 (10) | 207 |
| Reduced facial expressions | 35 (10) | 33 (13) | 213 |
| Agitation | 39 (11) | 27 (11) | 208 |
| GAF pre | 61.1 (7.2) | 62.7 (7.3) | 214 |
Data are integer count (%) or decimal mean (SD). Total sample n = 894.
See Table S2 in the Appendix for a full description of variables.
ADHD attention-deficit hyperactivity disorder, ASD autism spectrum disorder, AUDIT Alcohol Use Disorders Identification Test, BP bipolar disorder, CGI-S Clinical Global Impression Scale, EQ5D-3L EuroQoL’s quality of life five dimension scale with three level items, EuroQoL European Quality of Life group, GAF Global Assessment of Functioning, GSE General Self-efficacy Scale, GWS genome-wide significance, ICBT internet-delivered cognitive behaviour therapy, LSAS Liebowitz Social Anxiety Scale, MADRS-S Montgomery-Åsberg Depression Rating Scale-Self report, MDD major depressive disorder, PDSS Panic Disorder Severity Scale, PRS polygenic risk score, SNP single-nucleotide polymorphism.
Fig. 2 Individual predictors sorted by predictor type and relative Gini importance for predicting remission post ICBT in adults with MDD.
Needle length on the x-axis represents relative importance, with the strongest predictor scaled to 1 and the others expressed as proportional fractions of 1. Colour groups predictors into one of four categories (Process, Genetic, Demographic, and Clinical). The vertical line marks the RFE cut-off for final model inclusion, which retained 45 predictors (solid dot) and discarded 24 predictors (transparent dot). Total sample n = 894. ICBT Internet-based Cognitive Behaviour Therapy, MDD Major Depressive Disorder, RFE Recursive Feature Elimination.
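The scaling described in this caption can be shown with a short Python sketch. The predictor names and raw importance values are hypothetical, and the fixed top-k cut stands in for the RFE cut-off only for illustration; it does not reproduce recursive feature elimination itself, which refits the model repeatedly.

```python
def relative_importance(raw):
    """Scale importances so the strongest predictor equals 1."""
    top = max(raw.values())
    return {name: value / top for name, value in raw.items()}


def split_at_cutoff(rel, k):
    """Rank predictors by relative importance; keep the top k, discard the rest."""
    ranked = sorted(rel, key=rel.get, reverse=True)
    return ranked[:k], ranked[k:]


# Hypothetical raw Gini importances for four predictors.
raw_gini = {"pred_a": 12.0, "pred_b": 6.0, "pred_c": 3.0, "pred_d": 1.5}

rel = relative_importance(raw_gini)
kept, dropped = split_at_cutoff(rel, k=2)
```

Because every value is divided by the same maximum, the ranking of predictors is unchanged and only the axis scale differs from the raw importances.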
Fig. 3 Individual model performance on the hold-out test set.
AUC area under the receiver operating characteristics curve, LOGIT logistic regression, RF random forest, XGB extreme gradient boosted trees.
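For readers unfamiliar with the metric, AUC equals the probability that a randomly chosen remitter receives a higher predicted probability than a randomly chosen non-remitter, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the labels and scores are hypothetical and the function is an illustration, not the authors' code.

```python
def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = remission) and predicted probabilities.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.6, 0.3]

toy_auc = auc(labels, scores)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two outcome groups.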
Fig. 4 Partial prediction plots of the strongest two predictors by predictor type.
Each column represents one of the four predictor types, from which the two top numeric predictors have been plotted on the x- and y-axes, respectively. Colour represents the individual variable contribution to the predicted probability of remission in two ways. The upper panel row is colour-blindness friendly (Blue = higher probability of remission) and the bottom row is greyscale compatible (Light = higher probability of remission). The probability contribution of each variable in each panel is calculated when all other predictors in the model are centred and held constant.
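The idea of varying two predictors while holding the others constant can be sketched as follows. This assumes that "centred" means fixing every other predictor at its sample mean; `toy_predict` is a hypothetical stand-in for a fitted classifier, and the variable names and values are illustrative only.

```python
from statistics import mean


def partial_grid(predict, data, x_name, y_name, x_values, y_values):
    """Predicted probability over an x/y grid with all other predictors at their mean."""
    baseline = {name: mean(row[name] for row in data) for name in data[0]}
    grid = []
    for y in y_values:
        grid_row = []
        for x in x_values:
            point = dict(baseline)   # copy so the baseline is never modified
            point[x_name] = x
            point[y_name] = y
            grid_row.append(predict(point))
        grid.append(grid_row)
    return grid


def toy_predict(row):
    """Hypothetical model: returns a probability clipped to [0, 1]."""
    score = 0.9 - 0.02 * row["madrs_s_pre"] + 0.01 * row["gse_pre"] - 0.001 * row["age"]
    return min(1.0, max(0.0, score))


# Hypothetical patients; age is held at its mean while the other two vary.
data = [
    {"madrs_s_pre": 20.0, "gse_pre": 24.0, "age": 30.0},
    {"madrs_s_pre": 28.0, "gse_pre": 28.0, "age": 50.0},
]
grid = partial_grid(
    toy_predict, data, "madrs_s_pre", "gse_pre", [15.0, 30.0], [20.0, 30.0]
)
```

Each cell of `grid` corresponds to one coloured point in a panel: rows follow the y-axis values and columns follow the x-axis values.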
Fig. 5 Individual case predictions exemplified with six individual patients.
The top 10 predictors and their cut-off values influencing the probability for post ICBT remission in each patient. Data are from the final RF model with additional modelling using the LIME framework.