Konstantinos Ioannidis1, Samuel R Chamberlain1, Matthias S Treder2, Franz Kiraly3, Eric W Leppink4, Sarah A Redden4, Dan J Stein5, Christine Lochner5, Jon E Grant6.
Abstract
Problematic internet use is common, functionally impairing, and in need of further study. Its relationship with obsessive-compulsive and impulsive disorders is unclear. Our objective was to evaluate whether problematic internet use can be predicted from recognised forms of impulsive and compulsive traits and symptomatology. We recruited volunteers aged 18 and older using media advertisements at two sites (Chicago, USA, and Stellenbosch, South Africa) to complete an extensive online survey. State-of-the-art out-of-sample evaluation of machine learning predictive models was used; the models included Logistic Regression, Random Forests and Naïve Bayes. Problematic internet use was identified using the Internet Addiction Test (IAT). A total of 2006 complete cases were analysed, of whom 181 (9.0%) had moderate/severe problematic internet use. Using Logistic Regression and Naïve Bayes we produced a classification prediction with a receiver operating characteristic area under the curve (ROC-AUC) of 0.83 (SD 0.03), whereas using a Random Forests algorithm the prediction ROC-AUC was 0.84 (SD 0.03) [all three models superior to baseline models, p < 0.0001]. The models showed robust transfer between the study sites in all validation sets [p < 0.0001]. Prediction of problematic internet use was possible using specific measures of impulsivity and compulsivity in a population of volunteers. Moreover, this study offers proof-of-concept in support of using machine learning in psychiatry to demonstrate replicability of results across geographically and culturally distinct settings.
Keywords: ADHD; Compulsivity; Impulsivity; Internet use; Machine learning; OCD
Year: 2016 PMID: 27580487 PMCID: PMC5119576 DOI: 10.1016/j.jpsychires.2016.08.010
Source DB: PubMed Journal: J Psychiatr Res ISSN: 0022-3956 Impact factor: 4.791
Demographic and clinical characteristics in the full sample (n = 2006, controls = 1825, cases = 181).
| Variable | IAT score <50 | IAT score ≥ 50 | p-value | Corrected p-value (*177) | Effect size |
|---|---|---|---|---|---|
| IAT scores (SD) | 30.6 (7.3) | 59.9 (9.8) | <0.0001 v | <0.0001 v | 0.57 |
| Age, years (SD) | 29.8 (13.3) | 33.2 (14.3) | <0.0010 v | 0.1685 v | |
| Gender, male, n (%) | 1199 (65.6) | 117 (64.6) | 0.8386 | >0.99 | |
| Race, Caucasian, n (%) | 1345 (73.6) | 102 (56.3) | <0.0001 | 0.0002 | 0.11 |
| Education, n (%) | 12 (0.6) | 1 (0.6) | | | |
| High school graduate | 198 (10.8) | 26 (14.3) | | | |
| Some college | 444 (24.3) | 68 (37.5) | 0.0001 | 0.0253 | 0.10 |
| College graduate | 740 (40.5) | 63 (34.8) | | | |
| Beyond College | 431 (23.6) | 23 (12.7) | | | |
| GAD, n (%) | 322 (17.6) | 78 (43.1) | <0.0001 | <0.0001 | 0.18 |
| Social Anxiety Disorder, n (%) | 209 (11.4) | 58 (32.0) | <0.0001 | <0.0001 | 0.17 |
| ADHD, n (%) | 753 (41.2) | 131 (72.3) | <0.0001 | <0.0001 | 0.18 |
| OCD, n (%) | 159 (8.7) | 50 (27.6) | <0.0001 | <0.0001 | 0.17 |
Internet addiction test (IAT) score <50 (Controls n = 1825).
IAT score ≥ 50 (problematic internet use, n = 181). All scores are mean (SD) unless otherwise noted. Statistic: chi-square, except where marked ‘v’ (ANOVA). Numbers in parentheses after counts are percentages within the respective group. GAD: Generalized Anxiety Disorder; ADHD: Attention-Deficit Hyperactivity Disorder; OCD: Obsessive-Compulsive Disorder.
Bonferroni correction applied.
Effect sizes are eta squared for ANOVA and phi for chi square tests.
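The two footnotes above can be illustrated with a minimal sketch. It assumes that the 177 in the column header is the number of tests used for the Bonferroni correction, and that the non-Caucasian counts equal each group total minus the Caucasian count (that is, no missing data); neither assumption is confirmed by the source. The function names are illustrative.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for the 2x2 table [[a, b], [c, d]]; |phi| equals sqrt(chi2 / n)."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value: raw p times the number of tests, capped at 1."""
    return min(1.0, p * n_tests)


# Race row: Caucasian counts are taken from the table. The complements are
# ASSUMED to be group total minus Caucasian count (no missing data).
controls_total, cases_total = 1825, 181
controls_caucasian, cases_caucasian = 1345, 102
phi = phi_coefficient(
    controls_caucasian, controls_total - controls_caucasian,
    cases_caucasian, cases_total - cases_caucasian,
)
print(round(abs(phi), 2))

# Gender row: a raw p of 0.8386 multiplied by 177 saturates at 1.
print(bonferroni(0.8386, 177))
```

Because the raw p-values in the table are rounded, multiplying them by 177 will not always reproduce the corrected column exactly.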
Fig. 1. Summary figure of comparisons between models that included both impulsivity and compulsivity measures and baseline models in all validation set-ups. ROC-AUC: Receiver Operating Characteristic curve – Area Under the Curve; PR-AUC: Precision-Recall curve – Area Under the Curve. All p values are from the Wilcoxon signed rank test with continuity correction. All significant values support the alternative hypothesis that the true location shift is not equal to zero, and therefore that models including both impulsivity and compulsivity were superior to models with baseline variables only. IMP-COMP: Models that include impulsivity and compulsivity variables as well as baseline variables. Baseline: Models that include baseline variables only. Stellenb.=>Chicago: Models trained on the Stellenbosch set and tested on the Chicago set. Chicago=>Stellenb.: Models trained on the Chicago set and tested on the Stellenbosch set. Significance codes: ‘***’ <0.001 ‘**’ <0.01 ‘*’ <0.05 ‘.’ ≥0.05.
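A paired Wilcoxon signed rank test with continuity correction, as named in the caption, can be sketched as follows. This is a plain normal-approximation version written for illustration, not the authors' code, and the AUC values are hypothetical placeholders rather than results from the study.

```python
import math
from statistics import NormalDist


def wilcoxon_signed_rank(x, y):
    """Two-sided paired Wilcoxon signed rank test using the normal
    approximation with continuity correction. Returns (V, p_value)."""
    diffs = [round(a - b, 12) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # tied |d| values share the mean rank
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    v = sum(r for r, d in zip(ranks, diffs) if d > 0)  # sum of positive ranks
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    shift = v - mean
    correction = 0.5 if shift > 0 else -0.5 if shift < 0 else 0.0
    z = (shift - correction) / math.sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return v, min(1.0, p_value)


# HYPOTHETICAL per-fold ROC-AUC values, for illustration only.
imp_comp_auc = [0.84, 0.81, 0.86, 0.83, 0.85, 0.80, 0.87, 0.82, 0.88, 0.79]
baseline_auc = [0.70, 0.68, 0.71, 0.66, 0.73, 0.69, 0.72, 0.65, 0.69, 0.71]
v_stat, p_value = wilcoxon_signed_rank(imp_comp_auc, baseline_auc)
print(v_stat, p_value)
```

The test is paired because each validation fold yields one score per model, so the comparison is made fold by fold rather than between two independent samples.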
Overview of variable importance results of Logistic Regression and Random Forest models listed by averaged variable importance ranks from all sets – only first 15 items displayed.
| Variable | VI rank average |
|---|---|
| Race (non-Caucasian) | 2.6 |
| Age (older) | 3.2 |
| Impulses to harm self or others (PADUA) | 3.8 |
| Checking compulsion (PADUA) | 4 |
| Motor impulsivity (BIS) | 5 |
| ASRS | 7.6 |
| ADHD diagnosis | 8 |
| PADUA dressing grooming Compulsions (PADUA) | 8.8 |
| GAD diagnosis | 9.6 |
| Attention impulsivity (BIS) | 10.6 |
| PADUA contamination obsessions and washing compulsions (PADUA) | 10.8 |
| Social Anxiety Diagnosis | 11.8 |
| Thoughts of harm to self or others (PADUA) | 12.2 |
| Non-planning impulsivity (BIS) | 12.6 |
| OCD diagnosis | 13 |
| … | … |
ADHD – Attention Deficit Hyperactivity Disorder; ASRS – Adult ADHD Self-Report Scale (ASRS-v1.1); BIS – Barratt Impulsiveness Scale 11; GAD – Generalized Anxiety disorder; OCD – Obsessive-Compulsive disorder; PADUA – Padua Inventory-Revised; VI – Variable importance.
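Averaging variable importance ranks across models and validation sets, as described in the table caption, might look like the sketch below. The ranks and the shortened variable names are hypothetical; the source does not list the per-set ranks behind the averages.

```python
from statistics import mean


def average_ranks(rankings):
    """rankings: list of dicts mapping variable name -> rank (1 = most important).
    Returns (variable, mean rank) pairs sorted from most to least important."""
    variables = rankings[0].keys()
    averaged = {name: mean(r[name] for r in rankings) for name in variables}
    return sorted(averaged.items(), key=lambda item: item[1])


# HYPOTHETICAL ranks from three model/validation set-ups, for illustration only.
rankings = [
    {"race": 1, "age": 2, "checking": 3},
    {"race": 2, "age": 1, "checking": 3},
    {"race": 1, "age": 3, "checking": 2},
]
for name, rank in average_ranks(rankings):
    print(f"{name}: {rank:.2f}")
```

A lower average rank means a variable was consistently near the top across set-ups, which is how the table above is ordered.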
Fig. 2. Receiver operating characteristic and Precision-Recall curves for Logistic Regression and Random Forest machine learning prediction models trained and tested in the full data set. ‘Blue’ line: Prediction model curve using baseline plus impulsivity and compulsivity variables. ‘Green dotted’ line: Prediction model curve using baseline plus impulsivity variables. ‘Blue dotted’ line: Prediction model curve using baseline plus compulsivity variables. ‘Red’ line: Prediction model curve using baseline variables only. ‘Grey dotted’ line: Prediction model curve ‘at chance’ level with randomized variable scores.
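The area under a ROC curve can be computed without drawing the curve, as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control. The sketch below uses that rank-based definition; the labels and scores are hypothetical and the function name is illustrative.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (case, control) pairs in which the case
    has the higher score; tied pairs count one half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# HYPOTHETICAL labels (1 = problematic internet use) and predicted scores.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.50, 0.70, 0.90]
print(roc_auc(labels, scores))

# Uninformative scores give 0.5, the 'at chance' level.
print(roc_auc(labels, [0.5] * len(labels)))
```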
Logistic Regression model in the full data set (in-sample), with problematic internet use category (moderate and severely problematic versus controls) as dependent variable.
| Variable | Estimate ± Std. Error | z value | Pr(>|z|) |
|---|---|---|---|
| Age | 0.51 ± 0.09 | 5.59 | <0.0001 |
| Gender | −0.12 ± 0.19 | −0.63 | 0.5304 |
| Race | 0.74 ± 0.19 | 3.97 | <0.0001 |
| Education | 0.71 ± 1.24 | 0.57 | 0.5681 |
| | 0.69 ± 1.23 | 0.56 | 0.5773 |
| | 0.43 ± 1.23 | 0.35 | 0.7293 |
| | 0.02 ± 1.25 | 0.02 | 0.9845 |
| ASRS | 0.52 ± 0.01 | 5.31 | <0.0001 |
| Attention impulsivity (BIS) | 0.15 ± 0.12 | 1.22 | 0.2213 |
| Motor impulsivity (BIS) | 0.37 ± 0.12 | 3.15 | 0.0016 |
| Non-planning impulsivity (BIS) | 0.05 ± 0.10 | 0.53 | 0.5981 |
| Checking compulsion (PADUA) | 0.41 ± 0.12 | 3.41 | 0.0007 |
| PADUA contamination obsessions and washing compulsions (PADUA) | −0.02 ± 0.11 | −0.17 | 0.8681 |
| PADUA dressing grooming compulsions (PADUA) | 0.16 ± 0.10 | 1.69 | 0.0904 |
| Impulses to harm self or others (PADUA) | 0.32 ± 0.08 | 4.01 | <0.0001 |
| Thoughts of harm to self or others (PADUA) | 0.11 ± 0.11 | 0.97 | 0.3305 |
ASRS – Adult ADHD Self-Report Scale (ASRS-v1.1) Symptom Checklist; BIS – Barratt Impulsiveness Scale 11; PADUA – Padua Inventory-Revised.
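The columns of the table above are linked by the Wald test: the z value is the estimate divided by its standard error, and Pr(>|z|) is the two-sided normal tail probability. The sketch below recomputes a p-value from a reported z value and converts one coefficient to an odds ratio with an approximate 95% interval. The source does not report odds ratios or state how predictors were scaled, so the odds ratio applies per unit of the predictor as coded; the function names are illustrative.

```python
import math
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a Wald z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def odds_ratio_ci(estimate, std_error, z_crit=1.96):
    """Odds ratio with an approximate 95% Wald interval from a
    logistic regression coefficient and its standard error."""
    return (
        math.exp(estimate),
        math.exp(estimate - z_crit * std_error),
        math.exp(estimate + z_crit * std_error),
    )


# Motor impulsivity (BIS): reported z value of 3.15.
print(round(two_sided_p(3.15), 4))

# Race: reported estimate 0.74 with standard error 0.19.
odds_ratio, lower, upper = odds_ratio_ci(0.74, 0.19)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
```

Recomputing z from the rounded estimate and standard error (for example 0.74 / 0.19) gives values slightly different from the reported column, which is expected from rounding.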