Şaban Karayağız, Burcu Oralhan, Zeki Oralhan, Hamza Turabieh, Monirujjaman Khan.
Abstract
Data mining is a method for extracting accurate, previously unknown, and meaningful information from large data sets. In bioinformatics, it is used as a tool for assessing how accurately different algorithms classify data. In this study, data mining was used to compare the classification accuracy of different algorithms and to predict the types of compulsive behavior of patients with obsessive-compulsive disorder. Data from 164 people (70 males and 94 females) were analyzed. Participants were between 7 and 73 years old, with a mean age of 32.4. Data on the participants' sociodemographic characteristics, course of disease, treatments, family histories, and obsession and compulsion types were collected through data collection instruments. Classification algorithms available in the WEKA software were used to process the data. The effect of the types of obsession on the types of compulsion was determined using regression models, and the levels of success of the generated models were compared. The results showed a moderate positive correlation (r = 0.35) between these two variables. According to the coefficient of determination, obsession explained 11% of the variance in compulsion. These findings supported the hypothesis that the types of obsession affect the types of compulsion.
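The link between the reported correlation and the coefficient of determination can be sketched as follows. This is a minimal illustration, assuming a regression with a single predictor, in which case the coefficient of determination equals the squared Pearson correlation. The variable names are illustrative, and the abstract does not state whether the 11% figure was derived from an unrounded correlation or from an adjusted estimate.

```python
# Rounded correlation between obsession and compulsion types, as reported.
r = 0.35

# For a single-predictor regression, R^2 is the square of Pearson's r
# (assumption: the abstract does not describe the exact model).
r_squared = r ** 2

print(f"r = {r:.2f}, r^2 = {r_squared:.4f}")
```

Squaring the rounded coefficient gives a value of roughly 0.12, which is of the same order as the reported 11%; the small difference may reflect rounding or a different estimator.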
Year: 2022 PMID: 35502414 PMCID: PMC9056265 DOI: 10.1155/2022/8040622
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.809
Sociodemographic characteristics of the participants.
| Variable | Category | Code | Frequency | Percent | Cumulative percent |
|---|---|---|---|---|---|
| Gender | Female | 1 | 94 | 57.3 | 57.3 |
| | Male | 2 | 70 | 42.7 | 100.0 |
| | Total | | 164 | 100.0 | |
| Age | 0-18 | 1 | 13 | 7.9 | 7.9 |
| | 19-35 | 2 | 101 | 61.6 | 69.5 |
| | Up to 36 | 3 | 50 | 30.5 | 100.0 |
| | Total | | 164 | 100.0 | |
| Marital status | Single | 1 | 75 | 45.7 | 45.7 |
| | Marriage | 2 | 89 | 54.3 | 100.0 |
| | Total | | 164 | 100.0 | |
| Child | No | 1 | 87 | 53.0 | 53.0 |
| | Yes | 2 | 77 | 47.0 | 100.0 |
| | Total | | 164 | 100.0 | |
| Occupation | Employed | 1 | 109 | 66.5 | 66.5 |
| | Unemployed | 0 | 55 | 33.5 | 100.0 |
| | Total | | 164 | 100.0 | |
| Income | 0-2000 | 1 | 103 | 62.8 | 62.8 |
| | Up to 2001 | 2 | 61 | 37.2 | 100.0 |
| | Total | | 164 | 100.0 | |
| Disease_his | First | 1 | 30 | 18.3 | 18.3 |
| | Chronic | 2 | 30 | 18.3 | 36.6 |
| | Repetitive | 3 | 104 | 63.4 | 100.0 |
| | Total | | 164 | 100.0 | |
| Family_his | (1) Degree relative | 1 | 86 | 52.4 | 52.4 |
| | (2) Degree relative | 2 | 14 | 8.5 | 61.0 |
| | No feature | 3 | 64 | 39.0 | 100.0 |
| | Total | | 164 | 100.0 | |
| Obsession | Damage | 1 | 17 | 10.4 | 10.4 |
| | Contamination | 2 | 44 | 26.8 | 37.2 |
| | Order | 3 | 19 | 11.6 | 48.8 |
| | Doubt | 4 | 50 | 30.5 | 79.3 |
| | Other | 5 | 34 | 20.7 | 100.0 |
| | Total | | 164 | 100.0 | |
| Compulsive | Cleaning | 1 | 59 | 36.0 | 36.0 |
| | Avoidance | 2 | 15 | 9.1 | 45.1 |
| | Checking | 3 | 71 | 43.3 | 88.5 |
| | Other | 4 | 19 | 11.6 | 100.0 |
| | Total | | 164 | 100.0 | |
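The percent and cumulative percent columns of the table above can be derived from the frequencies alone. The sketch below shows one way to do this for the obsession categories; the function name `frequency_table` is hypothetical, and the cumulative percent is assumed to be computed from cumulative counts rather than by summing rounded percentages.

```python
from itertools import accumulate


def frequency_table(counts):
    """Return (label, frequency, percent, cumulative percent) rows."""
    total = sum(counts.values())
    running = accumulate(counts.values())  # cumulative frequencies
    return [
        (label, n, round(100 * n / total, 1), round(100 * cum / total, 1))
        for (label, n), cum in zip(counts.items(), running)
    ]


# Frequencies of the obsession types, taken from the table above.
obsession = {"Damage": 17, "Contamination": 44, "Order": 19, "Doubt": 50, "Other": 34}

rows = frequency_table(obsession)
for row in rows:
    print(row)
```

The same function can be applied to any other variable in the table by passing its category frequencies in the order listed.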
Figure 1. Data mining research framework [23].
Comparison of results of the generated models.
| Classifier | Time taken to build model | Correctly classified instances | Kappa statistic | Mean absolute error | Root mean squared error | Relative absolute error | Root relative squared error |
|---|---|---|---|---|---|---|---|
| JRIP | | | | 0.267 | | 80.89% | |
| J48 | | | | | | | |
| Naive Bayes | | 57.92% | 0.345 | 0.264 | | 79.58% | |
| PART | | 56.09% | 0.329 | | 0.408 | | 100.49% |
| Random Forest | | 55.48% | 0.316 | 0.255 | 0.387 | 77.06% | 95.17% |
| Multilayer Perceptron | 0.44 | 51.82% | 0.255 | | 0.442 | | 108.79% |
| ONER | | | | | 0.427 | | 105.14% |
| SMO | 0.04 | | | 0.299 | | 90.16% | |
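The accuracy and kappa columns in the comparison above are both derived from a classifier's confusion matrix. The sketch below computes them for a small hypothetical two-class matrix (the study's own confusion matrices are not given here), and then back-calculates the baseline error implied by the Random Forest row, assuming that relative absolute error expresses the mean absolute error as a percentage of a baseline predictor's error. All function names are illustrative.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix."""
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return observed, (observed - expected) / (1 - expected)


def implied_baseline_error(error, relative_error_percent):
    """Baseline error implied by an error and its relative (percent) form."""
    return error / (relative_error_percent / 100)


# Hypothetical confusion matrix, for illustration only (rows: actual class).
hypothetical = [[20, 5], [10, 15]]
acc, kappa = accuracy_and_kappa(hypothetical)
print(f"accuracy = {acc:.2f}, kappa = {kappa:.2f}")

# Random Forest row: mean absolute error 0.255, relative absolute error 77.06%.
baseline_mae = implied_baseline_error(0.255, 77.06)
print(f"implied baseline MAE = {baseline_mae:.3f}")
```

Because kappa corrects for chance agreement, it is lower than raw accuracy whenever some agreement would be expected from the class proportions alone, which is why the kappa values in the table are well below the percentages of correctly classified instances.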
Figure 2. Decision tree of result.
Scores of the variables affecting the types of compulsion.
| Attribute no. | Variable | Rank | Normalized |
|---|---|---|---|
| 10 | obsession | 0.479 | 0.793 |
| 9 | family_his | 0.044 | 0.073 |
| 7 | disease_his | 0.027 | 0.044 |
| 5 | occupation | 0.016 | 0.027 |
| 4 | child | 0.009 | 0.014 |
| 6 | income | 0.009 | 0.014 |
| 2 | age | 0.007 | 0.012 |
| 1 | gender | 0.007 | 0.012 |
| 3 | marital status | 0.006 | 0.010 |
| | Total | 0.604 | 1.000 |
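The Normalized column appears to be each Rank score divided by the column total, so that the values sum to 1. The sketch below reproduces that calculation from the rounded Rank scores in the table; it is an assumption about how the column was obtained, and because the inputs are already rounded, a few results differ from the published values in the third decimal place.

```python
# Rank scores of the variables, as listed in the table above.
rank_scores = {
    "obsession": 0.479,
    "family_his": 0.044,
    "disease_his": 0.027,
    "occupation": 0.016,
    "child": 0.009,
    "income": 0.009,
    "age": 0.007,
    "gender": 0.007,
    "marital status": 0.006,
}

total = sum(rank_scores.values())

# Share of the total score attributed to each variable.
normalized = {name: score / total for name, score in rank_scores.items()}

for name, share in normalized.items():
    print(f"{name:15s} {share:.3f}")
```

Under this reading, obsession type accounts for roughly four fifths of the total score, with family history a distant second.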