Valeria De Angel, Serena Lewis, Katie White, Carolin Oetzmann, Daniel Leightley, Emanuela Oprea, Grace Lavelle, Faith Matcham, Alice Pace, David C Mohr, Richard Dobson, Matthew Hotopf.
Abstract
The use of digital tools to measure physiological and behavioural variables of potential relevance to mental health is a growing field sitting at the intersection between computer science, engineering, and clinical science. We summarised the literature on remote measuring technologies, mapping methodological challenges and threats to reproducibility, and identified leading digital signals for depression. Medical and computer science databases were searched between January 2007 and November 2019. Published studies linking depression and objective behavioural data obtained from smartphone and wearable device sensors in adults with unipolar depression and healthy subjects were included. A descriptive approach was taken to synthesise study methodologies. We included 51 studies and found threats to reproducibility and transparency arising from failure to provide comprehensive descriptions of recruitment strategies, sample information, feature construction and the determination and handling of missing data. The literature is characterised by small sample sizes, short follow-up duration and great variability in the quality of reporting, limiting the interpretability of pooled results. Bivariate analyses show consistency in statistically significant associations between depression and digital features from sleep, physical activity, location, and phone use data. Machine learning models found predictive value in aggregated features. Given the pitfalls in the combined literature, these results should be taken purely as a starting point for hypothesis generation. Since this research is ultimately aimed at informing clinical practice, we recommend improvements in reporting standards including consideration of generalisability and reproducibility, such as wider diversity of samples, thorough reporting of methodology and the reporting of potential bias in studies with numerous features.
Year: 2022 PMID: 35017634 PMCID: PMC8752685 DOI: 10.1038/s41746-021-00548-8
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
Fig. 1 Study selection flowchart.
Medical and computer science databases were searched to ensure relevant fields were covered. The flowchart lists the reasons studies were excluded from data extraction and quality assessment.
Summary characteristics of included studies.
Passive feature types assessed across studies: sleep, physical activity, circadian rhythm, sociability, location, phone use, physiological and environmental. The final column gives the number of feature types each study assessed.

| First author | Year | Country | Field | Nᵃ | % female | Mean age (range/SD) | RMT follow-up (days)ᵃ | Sample type | Depression measure | Total feature types |
|---|---|---|---|---|---|---|---|---|---|---|
| Avila-Moraes | 2013 | Brazil | M | 30 | 100.0 | 44 (18–60) | 7 | Clinical | BDI, HAM-D, MADRS | 3 |
| Ben-Zeev | 2015 | USA | M | 37 | 21.0 | 22.5 (19–0) | 70 | Student | PHQ-9 | 4 |
| Boukhechba | 2018 | USA | M | 72 | 51.4 | 19.8 (2.4) | 14 | Student | DASS-21 | 3 |
| Burns | 2011 | USA | M | 7 | 87.5 | 37.4 (19–51) | 56 | Community | PHQ-9 | 5 |
| Byrne | 2019 | Australia | M | 42 | 0.0 | (18–29) | 7 | Community | SCRAM - dep | 3 |
| Caldwell | 2019 | USA | M | 115 | 100.0 | 27.5 (6.1) | 3 | Community | BDI-II | 1 |
| Cho | 2016 | South Korea | M | 532 | 56.0 | 57 | 720 | Community | BDI-II | 1 |
| David | 2018 | USA | M | 132 | 60.0 | 20.68 (18–21) | 7 | Student | PHQ-4 | 2 |
| Difrancesco | 2019 | Netherlands | M | 359 | 62.4 | 50.1 (11.1) | 7 | Community | BDI-II | 3 |
| Dillon | 2018 | Ireland | M | 396 | 50.8 | nr | 7 | Clinical | CES-D | 1 |
| Doane | 2015 | USA | M | 76 | 76.0 | 18.1 (0.4) | 3 | Student | CES-D | 1 |
| Doryab | 2014 | USA | M | 6 | 33.3 | nr | 120 | Student | CES-D | 3 |
| Ghandeharioun | 2017 | USA | CS | 12 | 75.0 | 37 (20–73) | 56 | Clinical | HAM-D | 6 |
| Haeffel | 2017 | USA | M | 47 | 55.3 | 20.9 | 7 | Student | BDI-II | 1 |
| Hori | 2016 | Japan | M | 40 | 52.5 | 39.8 | 7 | Clinical | HAM-D | 1 |
| Jacobson | 2019 | Brazil | M | 15 | 87.0 | 47.6 (10.5) | 7 | Clinical | BDI, HAM-D | 2 |
| Kawada | 2007 | Japan | M | 105 | 29.5 | 24.1 (1.8) | 4 | Student | CES-D | 3 |
| Knight | 2018 | Australia | M | 23 | 77.0 | 20.7 (3.2) | 3 | Community | DASS-21 | 1 |
| Li | 2018 | Australia | M | 375 | 53.9 | 59.5 (5.5) | 7 | Community | CES-D | 1 |
| Lu | 2018 | USA | CS | 103 | 76.7 | (18–25) | nr | Student | QIDS | 3 |
| Luik | 2013 | Netherlands | M | 1734 | 53.4 | 62.3 (9.4) | 7 | Community | CES-D | 1 |
| Luik | 2015 | Netherlands | M | 1714 | 53.6 | 62.2 (9.4) | 7 | Community | CES-D | 1 |
| McCall | 2015 | USA | M | 58 | 67.0 | 42.1 (12.4) | 56 | Clinical | HAM-D | 1 |
| Mendoza-Vasconez | 2019 | USA | M | 266 | nr | 40.6 (9.9) | 7 | Community | HAM-D | 1 |
| Moukaddam | 2019 | USA | M | 22 | 76.0 | 50.3 (10.1) | 56 | Clinical | PHQ-9 | 2 |
| Naismith | 2011 | Australia | M | 44 | 43 | 62.3 | 14 | Clinical | HAM-D | 1 |
| Park | 2007 | USA | M | 54 | 57.4 | 43 (21–76) | 14 | Community | CES-D | 2 |
| Pillai | 2014 | USA | M | 39 | 73.8 | 55 (3.2) | 7 | Student | BDI-II | 1 |
| Pratap | 2019 | USA | M | 271 | 77.8 | 33.4 (10.7) | 90 | Community | PHQ-2 | 2 |
| Robillard | 2013 | Australia | M | 66 | 62.7 | 21.5 | 7 | Clinical | clinician assessment | 1 |
| Robillard | 2014 | Australia | M | 238 | 64.3 | 40.4 | 10 | Clinical | HAM-D | 2 |
| Robillard | 2015 | Australia | M | 342 | 55.1 | 22.3 | 14 | Clinical | clinician assessment | 2 |
| Robillard | 2016 | Australia | M | 25 | 48.0 | 20.9 (4.6) | 14 | Clinical | clinician assessment | 2 |
| Robillard | 2018 | USA | M | 12 | 58.0 | 20.1 (18–31) | 13 | Clinical | clinician assessment | 1 |
| Saeb | 2015 | USA | M | 21 | 71.4 | 28.9 (19–58) | 14 | Student | PHQ-9 | 3 |
| Saeb | 2016 | USA | M | 38 | 20.8 | nr | 70 | Community | PHQ-9 | 2 |
| Sano | 2018 | USA | M | 47 | 72.0 | (18–25) | 30 | Student | MCSF-12 | 7 |
| Slyepchenko | 2019 | Canada | M | 70 | 57.9 | (18–65) | 15 | Clinical | MINI | 3 |
| Smagula (a) | 2018a | USA | M | 145 | 67.0 | 60 (36–82) | 9 | Community | HAM-D | 1 |
| Smagula (b) | 2018 | USA | M | 45 | 38.8 | 38.08 | 10 | Community | HAM-D | 1 |
| Stremler | 2017 | Canada | M | 101 | 62.7 | 34.1 | 5 | Community | CES-D | 1 |
| Tao | 2019 | China | M | 220 | 52.3 | 20.3 (2.4) | 7 | Student | PROMIS - dep | 1 |
| Vallance | 2013 | Canada | M | 385 | 0.0 | 65.3 (7.5) | 3 | Community | CES-D | 1 |
| Vanderlind | 2014 | USA | M | 35 | 42.3 | 19.8 (18–23) | 21 | Student | CES-D | 2 |
| Wahle | 2016 | Switzerland | M | 36 | 64.3 | (20–57) | 14 | Community | PHQ-9 | 4 |
| Wang | 2014 | USA | CS | 48 | 20.8 | nr | 7 | Student | PHQ-9 | 3 |
| Wang | 2018 | USA | CS | 83 | 51.8 | 20.1 (2.3) | 126 | Student | PHQ-8 | 6 |
| White | 2017 | USA | M | 418 | 60.3 | 57 (35–85) | 7 | Community | CES-D | 2 |
| Yang | 2017 | China | CS | 48 | nr | nr | 70 | Student | PHQ-9 | 1 |
| Yaugher | 2015 | USA | M | 100 | 58.3 | 18.6 (18–27) | 7 | Student | PAI-dep | 1 |
| Yue | 2018 | USA | CS | 54 | nr | (18–25) | nr | Student | PHQ-9 | 2 |
| 58.0 | 57.9 | 37.2 | 9 | 16 | 31 | 24 | 14 | 14 | 14 | 7 | 4 | 1 | ||||||
RMT remote measurement technologies, SD standard deviation, M medical field, CS computer science field, BDI Beck’s Depression Inventory, HAM-D Hamilton Depression Rating Scale, MADRS Montgomery–Åsberg Depression Rating Scale, PHQ Patient Health Questionnaire, PAI-dep Personality Assessment Inventory-depression subscale, CES-D Center for Epidemiologic Studies Depression Scale, MINI Mini International Neuropsychiatric Interview, PROMIS Patient-Reported Outcomes Measurement Information System, MCSF-12 Mental Component of the Short Form Health Survey, QIDS Quick Inventory of Depressive Symptomatology, DASS Depression Anxiety Stress Scales, SCRAM sleep, circadian rhythms, and mood questionnaire.
ᵃ Number of participants/length of follow-up included in passive data collection samples; these may be lower than overall study sample sizes.
The breakdown of study designs within each sample type.
| Study design | Total | Student | Community | Clinical |
|---|---|---|---|---|
| Cross-sectional | 19 | 4 | 10 | 5 |
| Case-control | 6 | 0 | 1 | 5 |
| Cohort | 23 | 14 | 6 | 3 |
| RCT | 3 | 0 | 2 | 1 |
| Total | 51 | 18 | 19 | 14 |
Fig. 2 Sample sizes and follow-up times for all included studies.
The number of studies by length of participant follow-up, differentiated by sample size.
Fig. 3 Quality of the literature by each domain.
The figure shows the number of studies scoring on each study quality item. Two points are given for fully addressing a quality criterion, 1 point for partially addressing it, and 0 points for failing to address it.
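The point scheme in this caption can be sketched in a few lines of Python. This is a minimal illustration only: the item names are hypothetical, and summing the item points into a single per-study total is an assumption, since the source does not state how the "Quality rating" values reported in the later tables were derived.

```python
from typing import Dict

# Points per rating level, as described in the Fig. 3 caption.
POINTS = {"full": 2, "partial": 1, "none": 0}


def quality_score(ratings: Dict[str, str]) -> int:
    """Sum the points over all quality items rated for one study.

    Summation into one total is an assumption made for illustration.
    """
    return sum(POINTS[rating] for rating in ratings.values())


# Hypothetical item names and ratings, not taken from the review.
example_study = {
    "recruitment_described": "full",
    "missing_data_reported": "partial",
    "feature_construction_described": "none",
}
total = quality_score(example_study)
print(total)
```

An unknown rating label raises a `KeyError`, which is a deliberate choice here so that mistyped ratings are not silently scored as zero.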
Fig. 4 Feature associations with depression by behaviour type.
The number of times each feature (a sleep, b physical activity, c circadian rhythm, d sociability, e location and f phone use) has been reported in all included studies and their association with depression, where these associations are defined as having a below-threshold p-value (“Significant Association”), above-threshold p-value (“Non-Significant Association”), and where statistical methods have been used that do not yield p-values (“Non-p-value”). The graphs also show the number of studies assessing each feature.
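The three categories in this caption amount to a simple tallying rule, sketched below in Python. The threshold of 0.05 is an assumption (the caption only says "threshold", and individual studies may have used their own), and the feature names and p-values are invented for illustration.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

ALPHA = 0.05  # assumed threshold; the caption does not state a value


def categorise(p_value: Optional[float], alpha: float = ALPHA) -> str:
    """Map one reported result onto the three caption categories."""
    if p_value is None:  # method did not yield a p-value
        return "Non-p-value"
    if p_value < alpha:
        return "Significant Association"
    return "Non-Significant Association"


def tally(reports: Iterable[Tuple[str, Optional[float]]]) -> Counter:
    """Count (feature, category) pairs across all reported associations."""
    return Counter((feature, categorise(p)) for feature, p in reports)


# Invented example reports: (feature name, p-value or None).
reports = [
    ("sleep efficiency", 0.01),
    ("sleep efficiency", 0.20),
    ("sleep efficiency", 0.03),
    ("step count", None),
]
counts = tally(reports)
print(counts)
```

A tally of this kind counts reports rather than effect sizes, so it says nothing about the strength or direction of an association.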
Details for studies analysing combined features using classification models.
| Study ID | Quality rating | First author, year | Device | Groups | N | No. of features | Feature type | Algorithm/model | Performance measure | Discrimination value | Missing data handling | Validation method | Comparison models |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 12 | Sano, 2018 | Q sensor, smartphone | MCS SF-12 Low vs. High | 47 | 204 | PA, SC, Li | SVM RBF | Accuracy | 85.1 | Interpolation | 10-fold cross-validation | LASSO, SVM Linear |
| | | | | | | 441 | PA, L, PU, SC, ST | SVM RBF | Accuracy | 86.1 | | | |
| | | | | | | 700 | S, PA, PU, SC, ST, HR, Cl | SVM RBF | Accuracy | 77.2 | | | |
| | | | | | | 296 | S, PA, PU | SVM RBF | Accuracy | 78.7 | | | |
| | | | | | | 25 | PU | SVM RBF | Accuracy | 71.1 | | | |
| | | | | | | 25 | S | SVM RBF | Accuracy | 65 | | | |
| 2 | 8 | Yue, 2018 | Android | Clinician MDD vs. HC | 25 | 8 | PA, L | SVM RBF | F1 | 0.66 | Multiple imputation | LOOCV | l2-regularised (ridge) regression |
| | | | iPhone | | 54 | 8 | PA, L | SVM RBF | F1 | 0.76 | | | |
| 3 | 8 | Wahle, 2016 | Smartphone | PHQ-9 Dep vs. HC | 36 | 120 | PA, So, L, PU | Random Forest | Accuracy | 60.1 | Unclear | LOOCV | SVM |
| 4 | 10 | Pratap, 2019 | Smartphone | PHQ-2 Dep vs. HC | 93 | 10 | So, L | Random Forest | Median AUC | >0.50 (for 80.6% sample) | Mean imputation | None | |
| 5 | 7 | Saeb, 2015 | Android | PHQ-9 Dep vs. HC | 18 | 8 | CR, L | Elastic Net Logistic Regression | Accuracy | 78.8 | Unclear | LOOCV | |
| 6 | 7 | Wang, 2018 | Smartphone | PHQ-4 Dep vs. HC | 83 | 9 | S, PA, L, PU, HR | Lasso Logistic Regression | AUC | 0.809 | Unclear | 10-fold cross-validation | |
| 7 | 9 | Lu, 2018 | Smartphone, Fitbit | QIDS | 69 | 36 | S, PA, So | Multi-Task Deep Learning | F1 | 0.77 | Exclusion | LO(W)OCV | STL (Lasso), STL (Ridge), MTL Lasso and Ridge |
MCS SF mental component survey short form, PHQ Patient Health Questionnaire, MDD major depressive disorder, HC healthy control, S sleep, PA physical activity, CR circadian rhythm, So sociability, L location, PU phone use, SC skin conductance, ST skin temperature, HR heart rate, Li light, Cl clinical data, SVM RBF support vector machine-radial basis function, AUC area under the curve, LOOCV leave-one-out cross-validation, STL single-task learning, MTL multi-task learning.
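The validation methods and performance measures named in this table can be made concrete with a short Python sketch. It shows how k-fold and leave-one-out (LOOCV) splits partition a sample, and how accuracy and F1 are computed for a binary outcome. It is not the procedure of any listed study: the splits here are contiguous and unshuffled, and the function names are illustrative. Note also that the table reports accuracy as a percentage, whereas this sketch returns a fraction.

```python
from typing import Iterator, List, Sequence, Tuple


def k_fold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds over n samples."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    base, extra = divmod(n, k)  # the first `extra` folds get one more sample
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, test
        start = stop


def loocv_indices(n: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Leave-one-out is k-fold with k equal to the number of samples."""
    return k_fold_indices(n, n)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2TP / (2TP + FP + FN) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


folds = list(k_fold_indices(5, 2))
loo = list(loocv_indices(3))
```

With repeated measurements per participant, splitting by observation can place the same person in both the training and test sets, so a split by participant is usually the safer design.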
Details for studies analysing combined features using regression models.
| Study ID | Quality rating | First author, year | Device | Outcome | N | No. of features | Feature type | Algorithm | Performance measure | Exact statistic | Missing data handling | Validation method | Comparison |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 8 | Yue, 2018 | Android | PHQ-9 | 25 | 8 | PA, L | SVM RBF | r | 0.46 | Multiple imputation | LOOCV | Support Vector Multivariate Linear Regression |
| | | | iPhone | PHQ-9 | 54 | 8 | PA, L | SVM RBF | r | 0.41 | | | Support Vector Multivariate Linear Regression |
| 4 | 10 | Pratap, 2019 | Smartphone | PHQ-2 | 93 | 10 | PA, So, L | Random Forest | R² | ≈ 0 | Mean imputation | None reported | |
| 5 | 7 | Saeb, 2015 | Smartphone | PHQ-9 | 18 | 8 | CR, L | Elastic Net Linear Regression | Mean NRMSD | 0.251 | Unclear | LOOCV | |
| | | | | | 21 | 2 | PU | Elastic Net Linear Regression | Mean NRMSD | 0.273 | | | |
| 6 | 7 | Wang, 2018 | Smartphone | Pre PHQ-8 | 83 | 10 | S, PA, L, PU, HR | Lasso Linear Regression | MAE | 2.4 | Unclear | 10-fold cross-validation | |
| | | | | Post PHQ-8 | 83 | 5 | S, PA, So, L, PU | Lasso Linear Regression | MAE | 3.6 | | | |
| 7 | 9 | Lu, 2018 | Smartphone, Fitbit | QIDS | 69 | 36 | S, PA, So | Multi-Task Deep Learning | R² | 0.44 | Exclusion | LO(W)OCV | STL (Lasso), STL (Ridge), MTL Lasso and Ridge |
| 8 | 7 | Burns, 2011 | Smartphone | PHQ-9 | 7 | 38 | PA, So, L, PU, Li | Regression Trees | Accuracy | nr | Unclear | 10-fold cross-validation | |
| 9 | 8 | Jacobson, 2019 | Actiwatch | BDI-II | 15 | nr | PA, Li | XGBoost | r | 0.86 | Unclear | LOOCV | |
| 10 | 7 | Ghandeharioun, 2017 | Empatica, Smartphone | HRDS | 12 | 700 | S, PA, PU | Combination of regularised regression, robust-to-outlier, boosting, Random Forest and Gaussian Process | RMSE | 4.5 | Multiple imputation | 10-fold cross-validation | |
PHQ Patient Health Questionnaire, QIDS Quick Inventory of Depressive Symptomatology, nr not reported, S sleep, PA physical activity, CR circadian rhythm, So sociability, L location, PU phone use, SC skin conductance, ST skin temperature, HR heart rate, Li light, Cl clinical data, SVM RBF support vector machine-radial basis function, NRMSD normalised root-mean-square deviation, RMSE root-mean-square error, MAE mean absolute error, STL single-task learning, MTL multi-task learning.
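The error measures abbreviated above (MAE, RMSE, NRMSD) follow standard definitions, sketched here in Python. One point is an assumption: NRMSD is normalised by the range of the observed scores, which is a common convention but not necessarily the one used in the studies listed. The example scores are invented.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error (also called root-mean-square deviation)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def nrmsd(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """RMSE divided by the observed range (assumed normalisation)."""
    score_range = max(y_true) - min(y_true)
    if score_range == 0:
        raise ValueError("observed scores have zero range")
    return rmse(y_true, y_pred) / score_range


# Invented questionnaire scores and predictions, for illustration only.
observed = [0.0, 10.0, 20.0]
predicted = [2.0, 10.0, 16.0]
print(mae(observed, predicted), rmse(observed, predicted), nrmsd(observed, predicted))
```

MAE and RMSE are in the units of the outcome scale, so values from studies using different questionnaires (for example PHQ-8 and HAM-D) are not directly comparable; a normalised measure such as NRMSD partly addresses this.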