Zonglin Di, Xiaoliang Gong, Jingyu Shi, Hosameldin O. A. Ahmed, Asoke K. Nandi.
Abstract
With the unprecedented development of the Internet comes the challenge of Internet Addiction (IA), which, according to state-of-the-art research, is hard to diagnose and treat. In this study, we explored the feasibility of machine learning methods to detect IA. We acquired a dataset of 2397 Chinese college students (age: 19.17 ± 0.70, male: 64.17%) who completed the Brief Self-Control Scale (BSCS), the 11th version of the Barratt Impulsiveness Scale (BIS-11), the Chinese Big Five Personality Inventory (CBF-PI), and the Chen Internet Addiction Scale (CIAS), where CBF-PI includes five sub-features (Openness, Extraversion, Conscientiousness, Agreeableness, and Neuroticism) and BIS-11 includes three sub-features (Attention, Motor, and Non-planning). We applied Student's t-test to the dataset for feature selection, and Support Vector Machines (SVMs), including C-SVM and ν-SVM, with grid search for classification and parameter optimization. This work illustrates that the SVM is a reliable method for the assessment of IA and for questionnaire data analysis. The best IA detection accuracy, 96.32%, was obtained by C-SVM on the 6-feature dataset without normalization. Finally, BIS-11, BSCS, Motor, Neuroticism, Non-planning, and Conscientiousness are shown to be promising features for the detection of IA.
Keywords: Feature selection; IA detection; Internet addiction (IA); Personality questionnaire; Support vector machine
Year: 2019 PMID: 31508477 PMCID: PMC6726843 DOI: 10.1016/j.abrep.2019.100200
Source DB: PubMed Journal: Addict Behav Rep ISSN: 2352-8532
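The pipeline the abstract describes — questionnaire features classified by C-SVM and ν-SVM with grid search — can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic stand-ins for the questionnaire scores, and the parameter grids, kernel choice, and split are arbitrary assumptions.

```python
# Sketch of C-SVM vs nu-SVM with grid search (assumed setup, not the paper's code).
import numpy as np
from sklearn.svm import SVC, NuSVC
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for 6 questionnaire features (e.g. BIS-11, BSCS, ...)
X = rng.normal(size=(400, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # hypothetical IA / normal label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# C-SVM: grid search over the penalty C and the RBF kernel width gamma.
c_svm = GridSearchCV(SVC(kernel="rbf"),
                     {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}, cv=5)
c_svm.fit(X_tr, y_tr)

# nu-SVM: C is re-parameterized as nu, an upper bound on the fraction of
# margin errors (and a lower bound on the fraction of support vectors).
nu_svm = GridSearchCV(NuSVC(kernel="rbf"),
                      {"nu": [0.1, 0.3, 0.5], "gamma": [0.01, 0.1, 1]}, cv=5)
nu_svm.fit(X_tr, y_tr)

print(f"C-SVM test accuracy:  {c_svm.score(X_te, y_te):.3f}")
print(f"nu-SVM test accuracy: {nu_svm.score(X_te, y_te):.3f}")
```

The two formulations solve the same margin-maximization problem; ν is often easier to tune because it has a direct interpretation as an error fraction.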
Details of the recruited participants and the valid sample.
| Name | Participants | Valid participants |
|---|---|---|
| Male | 2004 (64.17%) | 1686 (60.98%) |
| Female | 1119 (35.83%) | 1079 (39.02%) |
| Total | 3123 (100.00%) | 2765 (100.00%) |
| Age | 19.17 ± 0.70 | 19.17 ± 0.68 |
| Gender | 0.21 | 0.22 |
Mean and standard deviation of each feature in the two groups used in the experiments.
| Name | Normal | IA |
|---|---|---|
| Gender | 0.24 | 0.17 |
| Age | 19.19 ± 0.67 | 19.13 ± 0.73 |
| Attention | 37.98 ± 9.44 | 35.79 ± 4.69 |
| Motor | 20.16 ± 5.81 | 25.41 ± 5.94 |
| Non-planning | 22.39 ± 6.81 | 27.70 ± 5.55 |
| Openness | 51.66 ± 9.97 | 48.83 ± 9.19 |
| Conscientiousness | 50.92 ± 9.64 | 43.84 ± 9.44 |
| Extroversion | 50.71 ± 10.28 | 47.09 ± 9.69 |
| Agreeableness | 51.43 ± 10.15 | 47.49 ± 10.40 |
| Neuroticism | 48.82 ± 9.44 | 56.90 ± 10.58 |
| BIS-11 total score | 80.53 ± 18.07 | 88.89 ± 16.17 |
| CBF-PI total score | 253.55 ± 49.49 | 244.12 ± 49.30 |
| Self-Control Scale | 22.18 ± 6.81 | 27.70 ± 5.54 |
| CIAS score | 39.74 ± 10.44 | 76.42 ± 11.05 |
Fig. 1. The framework of the present study.
Fig. 2. Illustration of SVM on a 2-feature dataset. The filled points on the dashed lines indicate the support vectors.
Student's t-test results for feature selection, ordered by ascending p-value.
| Feature | h | p-Value | 95% CI |
|---|---|---|---|
| BIS | 1 | 1.34e-130 | [7.71, 8.99] |
| BSCS | 1 | 8.74e-127 | [4.16, 4.86] |
| Motor | 1 | 1.02e-99 | [4.78, 5.71] |
| Neuroticism | 1 | 9.04e-84 | [−8.76, −7.19] |
| Non-planning | 1 | 2.66e-82 | [4.71, 5.74] |
| Conscientiousness | 1 | 8.32e-67 | [−7.66, −6.13] |
| Attention | 1 | 7.33e-23 | [1.70, 2.53] |
| Agreeableness | 1 | 4.59e-21 | [−4.76, −3.13] |
| Extraversion | 1 | 1.57e-17 | [2.72, 4.33] |
| CBF-PI | 1 | 1.81e-16 | [7.00, 11.34] |
| Openness | 1 | 3.08e-12 | [−3.56, −2.00] |
| Sex | 0 | 0.14 | [−0.11, −0.00] |
| Age | 0 | 0.50 | [−0.01, 0.07] |
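The h / p-value decisions in the table above can be reproduced in outline with a two-sample Student's t-test per feature. A minimal sketch, assuming MATLAB-style output where h = 1 means the null hypothesis is rejected at α = 0.05; the group scores and feature names here are synthetic stand-ins, not the study data:

```python
# Per-feature Student's t-test for feature selection (illustrative data only).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
normal = rng.normal(50, 10, size=(200, 3))   # stand-in "Normal" group scores
ia = normal + np.array([8.0, -7.0, 0.1])     # "IA" group: shifted means per feature

features = ["Motor", "Conscientiousness", "Age"]  # illustrative names only
for name, a, b in zip(features, ia.T, normal.T):
    t, p = ttest_ind(a, b)   # two-sample Student's t-test (equal variances)
    h = int(p < 0.05)        # h = 1: reject the null, keep the feature
    print(f"{name}: h={h}, p={p:.2e}")
```

Features with h = 0 (here the near-identical "Age" columns) are dropped, which is how the 13-feature set shrinks to the 11-, 8-, 7-, and 6-feature datasets used below.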
Detection results of C-SVM, ν-SVM and FNN without grid search on every dataset.
| Dataset | C-SVM (%) | ν-SVM (%) | FNN (%) |
|---|---|---|---|
| 13F0 | 51.00 ± 1.49 | 67.51 ± 1.53 | 69.73 |
| 13F1 | 51.46 ± 0.83 | 54.47 ± 1.32 | 57.41 |
| 13F2 | 51.95 ± 0.66 | 65.71 ± 1.91 | 57.39 |
| 11F0 | 52.66 ± 1.14 | 56.93 ± 0.73 | 54.98 |
| 11F1 | 51.30 ± 1.08 | 57.42 ± 1.30 | 57.35 |
| 11F2 | 51.59 ± 0.30 | 62.74 ± 2.81 | 57.73 |
| 8F0 | 72.51 ± 0.07 | 60.65 ± 0.79 | |
| 8F1 | 77.52 ± 0.33 | 75.00 ± 0.72 | 74.91 |
| 8F2 | 77.23 ± 0.37 | 75.00 ± 0.71 | 74.67 |
| 7F0 | 74.89 ± 0.32 | 73.79 ± 0.68 | 74.45 |
| 7F1 | 77.60 ± 1.28 | 76.04 ± 0.28 | 73.89 |
| 7F2 | 72.47 | ||
| 6F0 | 76.05 ± 0.74 | 75.03 ± 0.73 | 71.74 |
| 6F1 | 75.15 ± 0.78 | 75.76 ± 0.71 | 72.47 |
| 6F2 | 75.08 ± 0.73 | 75.79 ± 0.72 | 72.39 |
F0 refers to the dataset without normalization. F1 refers to the dataset with the normalization in [−1, 1]. F2 refers to the dataset with the normalization in [0, 1].
Bold emphasizes the best detection accuracy achieved by each classifier.
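The F1 and F2 variants in the footnote above are plain min–max rescalings of each feature. A short sketch, assuming per-feature min–max scaling (the exact scheme is not specified in this excerpt):

```python
# F0 = raw scores; F1 = per-feature rescaling to [-1, 1]; F2 = rescaling to [0, 1].
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[20.0, 80.0],
              [25.0, 90.0],
              [30.0, 100.0]])  # toy questionnaire scores, one column per feature

f1 = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)  # F1
f2 = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)   # F2

print(f1.min(), f1.max())  # -1.0 1.0
print(f2.min(), f2.max())  # 0.0 1.0
```

For SVMs with an RBF kernel, scaling mainly changes which (C, γ) region is optimal, which is consistent with the normalized and unnormalized datasets giving different winners in the tables.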
Detection results of C-SVM and ν-SVM with grid search on every dataset, alongside the best result without grid search.
| Dataset | C-SVM (%) | ν-SVM (%) | Best in Table 3 |
|---|---|---|---|
| 13F0 | 51.00 ± 1.48 | 67.77 ± 2.01 | 69.73 |
| 13F1 | 51.46 ± 0.83 | 56.91 ± 1.14 | 57.41 |
| 13F2 | 51.95 ± 0.66 | 65.71 ± 2.78 | 65.71 ± 1.91 |
| 11F0 | 51.16 ± 1.14 | 57.40 ± 1.65 | 56.93 ± 0.73 |
| 11F1 | 51.30 ± 1.10 | 57.42 ± 1.32 | 57.42 ± 1.30 |
| 11F2 | 51.59 ± 0.30 | 65.16 ± 2.81 | 62.74 ± 2.81 |
| 8F0 | 78.27 ± 0.41 | 73.70 ± 0.29 | 75.58 |
| 8F1 | 78.74 ± 0.35 | 77.34 ± 0.38 | 77.52 ± 0.33 |
| 8F2 | 78.90 ± 0.19 | 77.06 ± 0.53 | 77.23 ± 0.37 |
| 7F0 | 72.28 ± 0.34 | 75.08 ± 0.29 | 74.89 ± 0.32 |
| 7F1 | 84.78 ± 1.31 | 77.60 ± 1.28 | |
| 7F2 | 84.58 ± 1.47 | 78.33 ± 1.11 | |
| 6F0 | 78.32 ± 0.36 | 76.05 ± 0.74 | |
| 6F1 | 76.05 ± 0.74 | 78.54 ± 0.44 | 75.76 ± 0.71 |
| 6F2 | 76.02 ± 0.59 | 78.74 ± 0.43 | 75.79 ± 0.72 |
Bold emphasizes the best detection accuracy achieved by each classifier.
The dataset notation (F0, F1, F2) is the same as in Table 3.
This indicates that the best result in Table 3 was obtained by C-SVM.
This indicates that the best result in Table 3 was obtained by ν-SVM.
This indicates that the best result in Table 3 was obtained by FNN.
Fig. 3. The relationship between g and accuracy when the g-step is fixed for C-SVM on the 6-feature dataset without normalization. Each line of a different color stands for a different g-step from 1 to 256. The x-axis is g from 1 to 256 and the y-axis is the accuracy. Point A gives the best performance, with a C-step of 35 and a g-step of 12. Points B (g-step of 16) and C (g-step of 17) are turning points, which shows that the g-step is the key parameter.
Fig. 4. The relationship between C-step and accuracy when the C-step is fixed for C-SVM on the 6-feature dataset without normalization. Each line of a different color stands for a different C-step from 1 to 256. The x-axis is the C-step from 1 to 256 and the y-axis is the accuracy. Point A gives the best performance, with a C-step of 35 and a g-step of 12, corresponding to Point A in Fig. 3.
Fig. 5. The relationship among g-step, time, and accuracy. Point E's accuracy is 96.06%, with a g-step of 37 and a computation time of 33.94 s. Point D's accuracy is 96.08%, with a g-step of 1 and a computation time of 228.7 s. The x-axis is the g-step value, the y-axis is the mean time for each g-step, and the z-axis is the mean accuracy for each g-step.
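The time/accuracy trade-off in Fig. 5 follows from grid size alone: assuming the g-step is the sampling interval of the γ grid (our reading of the figure, not a definition given in this excerpt), a larger step means fewer candidate values for the grid search to evaluate. A toy sketch:

```python
# Assumed interpretation: g-step thins the gamma grid, so search time
# shrinks roughly in proportion to the number of candidates evaluated.
def grid_points(lo, hi, step):
    """Candidate parameter values from lo to hi, sampled every `step`."""
    return list(range(lo, hi + 1, step))

fine = grid_points(1, 256, 1)     # g-step = 1: 256 candidates (slow, like Point D)
coarse = grid_points(1, 256, 37)  # g-step = 37: 7 candidates (fast, like Point E)
print(len(fine), len(coarse))     # 256 7
```

This matches the figure qualitatively: the coarse grid (Point E, 33.94 s) loses only 0.02 percentage points of accuracy against the exhaustive grid (Point D, 228.7 s).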