
Best early-onset Parkinson dementia predictor using ensemble learning among Parkinson's symptoms, rapid eye movement sleep disorder, and neuropsychological profile.

Haewon Byeon.

Abstract

BACKGROUND: Despite the frequent progression from Parkinson's disease (PD) to Parkinson's disease dementia (PDD), the basis to diagnose early-onset Parkinson dementia (EOPD) in the early stage is still insufficient.
AIM: To explore the prediction accuracy of sociodemographic factors, Parkinson's motor symptoms, Parkinson's non-motor symptoms, and rapid eye movement sleep disorder for diagnosing EOPD using PD multicenter registry data.
METHODS: This study analyzed 342 Parkinson patients (66 EOPD patients and 276 PD patients with normal cognition), younger than 65 years. An EOPD prediction model was developed using a random forest algorithm and the accuracy of the developed model was compared with the naive Bayesian model and discriminant analysis.
RESULTS: The overall accuracy of the random forest was 89.5%, which was higher than that of discriminant analysis (78.3%) and that of the naive Bayesian model (85.8%). In the random forest model, the Korean Mini Mental State Examination (K-MMSE) score, Korean Montreal Cognitive Assessment (K-MoCA) score, sum of boxes in Clinical Dementia Rating (CDR), global score of CDR, motor score of the Unified Parkinson's Disease Rating Scale (UPDRS), and Korean Instrumental Activities of Daily Living (K-IADL) score were confirmed as the major variables with high weight for EOPD prediction. Among them, the K-MMSE score was the most important factor in the final model.
CONCLUSION: It was found that Parkinson-related motor symptoms (e.g., motor score of UPDRS) and instrumental daily performance (e.g., K-IADL score) in addition to cognitive screening indicators (e.g., K-MMSE score and K-MoCA score) were predictors with high accuracy in EOPD prediction. ©The Author(s) 2020. Published by Baishideng Publishing Group Inc. All rights reserved.

Keywords:  Discriminant analysis; Early-onset Parkinson dementia; Ensemble learning method; Naive Bayesian model; Neuropsychological test; Risk factor

Year:  2020        PMID: 33269221      PMCID: PMC7672787          DOI: 10.5498/wjp.v10.i11.245

Source DB:  PubMed          Journal:  World J Psychiatry        ISSN: 2220-3206


Core Tip: It is believed that if the Korean Mini Mental State Examination (K-MMSE) is given priority over other cognitive screening tests in order to distinguish early-onset Parkinson dementia (EOPD) from Parkinson’s disease, the accuracy of detecting EOPD will be higher than conducting other screening tests first. However, further epidemiological studies will be needed to fully comprehend the results of better accuracy of the K-MMSE than that of Korean Montreal Cognitive Assessment while detecting EOPD using the developed ensemble-based prediction model.

INTRODUCTION

Dementia is a typical disease of old age in which a person with previously normal cognition develops cognitive impairment from various causes during the aging process. It presents with diverse symptoms, such as memory impairment; decreased cognitive functions, including language ability and frontal lobe executive function; and behavioral and psychological symptoms of dementia (BPSD), depending on the type and progression of the disease[1]. It burdens caregivers, as well as patients, psychologically and economically[2]. In particular, care costs are extremely high because there is currently no cure for dementia and patients need care for a long time. As of 2019, South Korea spends KRW 14.6 trillion managing dementia, which is 0.8% of the GDP, and this is expected to increase more than sevenfold (KRW 106.5 trillion)[3]. It was reported that South Korea had 700000 dementia patients out of 7 million elderly people in 2017, which is already 10% of the total elderly population[3]. This is approximately a 35% increase from 540000 dementia patients in 2012[3], and the number tends to increase steadily by more than 7% per year[3]. Therefore, reducing dementia prevalence through prevention, early diagnosis, and early management is a key mental health policy issue that the South Korean government must address as its society ages. Geriatricians evaluate the characteristics of dementia by classifying it into several types in order to diagnose it as soon as possible. Recently, many studies[4,5] examined the characteristics of the disease after categorizing it into early-onset dementia (EOD: Occurring before 65 years old) and late-onset dementia (LOD: Occurring at 65 years old or later) based on the age at onset of dementia symptoms. These studies revealed that EOD and LOD showed differences in imaging tests as the disease progressed.
For example, in the case of Alzheimer’s disease, EOD caused a greater loss of cerebral synapses and more severe infiltration of senile plaques and neurofibrillary tangles than LOD[6-8]. Moreover, atrophy extended beyond the temporal lobe to the frontal and parietal lobes[6-8]. Additionally, since EOD patients are more likely to have a family history of dementia than LOD patients, it is suspected that EOD is affected by genetic predisposition more than LOD[7]. However, these imaging tests are not an ideal way to identify dementia in the early stages because they allow skilled medical personnel to diagnose dementia accurately only after it has progressed to some extent. Moreover, previous studies[9] that examined the characteristics of EOD mostly evaluated Alzheimer’s dementia. The demographic and neuropsychological characteristics of early-onset Parkinson dementia (EOPD) are relatively unknown. In summary, despite the frequent progression from Parkinson’s disease (PD) to Parkinson’s disease dementia (PDD), the basis for diagnosing EOPD in the early stages is still insufficient. Currently, it is impossible to detect EOPD in the early stages using only the simple cognitive screening test that is commonly administered for all types of dementia in South Korean public health centers. Although several biomarker candidates based on the cerebrospinal fluid (CSF) test (e.g., Aβ1-42 and total tau) have been suggested for diagnosing dementia in the early stages[10], the CSF test is not widely applicable: it is painful, so examinees tend to refuse it, and its reliability cannot be tested. Furthermore, because a range of factors (e.g., gender, education level, and depression) affect EOPD[11-13], it would be necessary to develop a prediction model by applying PD motor and non-motor symptoms and sociodemographic indices[11,12] in addition to cognitive characteristics[14].
Byeon[15] argued that previous studies[16,17] were limited to the exploration of individual risk factors because they used regression models for predicting dementia and variables were quite limited because the prediction models mainly included neuropsychological tests. Therefore, there are limitations to developing a highly reliable model to predict EOPD using individual (single) indicators, such as PD symptoms and neuropsychological tests. In order to develop an accurate prediction model, it is necessary to develop a comprehensive model that includes sociodemographic indices, PD motor symptoms, PD non-motor symptoms, rapid eye movement (REM) sleep behavior disorder, and neuropsychological indices. Recent studies have used machine learning algorithms as a method to predict a high disease risk group[18,19]. Machine learning is a process of analyzing relationships and rules in data to extract valuable information from the data. Random decision forest (RF) has been used widely; it produces many decision trees using an ensemble algorithm to overcome the limitations of overfitting and predicts target variables by combining them[20]. We are not aware of any published RF-based machine learning studies to analyze EOPD prediction capability by considering sociodemographic factors, PD motor symptoms, PD non-motor symptoms, REM sleep disorder, and neuropsychological profiles, together with cognitive function. This study explored the prediction accuracy of sociodemographic factors, PD motor symptoms, PD non-motor symptoms, and REM sleep disorder for diagnosing EOPD using a large-scale PD registry dataset.

MATERIALS AND METHODS

Participants

This study was performed by analyzing the Parkinson’s Disease Epidemiology in Korea (PDEPI-Korea) multicenter registry data provided by the National Biobank of Korea (NB-Korea, No. KBN-2019-005). The study was approved by the Research Ethics Review Board of the NB-Korea (No. KBN-2019-005) and the Korea Centers for Disease Control and Prevention (Korea-CDC, No. KBN-2019-1327). The NB-Korea was founded in 2008 upon the approval of the Ministry of Health and Welfare because of the need to manage bio data systematically at the national level. It has been managed by the Korea-CDC. The ultimate goal of the NB-Korea is to promote biomedical research and public health. Please see Byeon[20] for details on the data source. PD was diagnosed by a psychiatrist according to the diagnostic criteria of the United Kingdom Parkinson's Disease Society Brain Bank[21]. In this study, PDD was defined as patients who met the diagnostic criteria of probable PDD, suggested by the Movement Disorder Society Task Force[22]. This study excluded patients who had other causative diseases, such as hydrocephalus and vascular Parkinsonism, determined from magnetic resonance imaging. This study analyzed 342 PD patients [66 EOPD patients and 276 PD patients with normal cognition (PD-NC)] who were younger than 65 years. Sample size calculations based on power analysis are shown in Figure 1. As the minimum number of samples calculated based on power analysis was 210 (group 1 = 105, group 2 = 105) with significance level (α) = 0.05, effect size d = 0.5, and power of test (1-β) = 0.95 under a normal distribution, the number of samples (n = 342) in our study was appropriate.
Figure 1

Sample size calculations.
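
The power analysis above can be approximated with a short calculation. The sketch below is not the author's procedure: it assumes a two-sided, two-sample comparison of means and uses the normal approximation, and the function name `n_per_group` is introduced here only for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(alpha: float, power: float, d: float) -> int:
    """Per-group sample size for a two-sided, two-sample comparison of means.

    Normal approximation (an assumption of this sketch):
    n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2, rounded up.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the two-sided test
    z_beta = z.inv_cdf(power)             # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)


# Values reported in the text: alpha = 0.05, d = 0.5, power = 0.95
n = n_per_group(alpha=0.05, power=0.95, d=0.5)
print(n, 2 * n)  # 104 208
```

The approximation gives 104 per group. An exact calculation based on the t distribution is slightly larger, which is consistent with the reported 105 per group (210 in total). Either way, the analyzed sample of 342 exceeds the minimum.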

Measurement

The outcome variable was defined as the presence of EOPD (yes or no), a binary variable, from a diagnosis by a neurologist. The explanatory variables included age; gender; education level (middle school graduate and below, or high school graduate and above); dominant hand (left hand or right hand); family PD history (yes or no); family dementia history (yes or no); pack-years (non-smoking, 1-20, 21-40, 41-60, or ≥ 61 pack-years); coffee-drinking (yes or no); coffee drinking period (no, ≤ 5, 6-9, or ≥ 10 years); mean coffee intake per day (no, ≤ 1, 2-3, or ≥ 4 cups); pesticide exposure recognition (never, currently not exposed but exposed previously, or currently exposed to pesticide); disease history (manganese poisoning, carbon monoxide poisoning, encephalitis, traumatic brain injury, stroke, alcoholism, diabetes, hyperlipidemia, hypertension, and/or atrial fibrillation); PD-related motor signs (tremor, akinesia/bradykinesia, postural instability, and/or late motor complications); neuropsychological characteristics determined by assessments such as the Korean Mini Mental State Examination (K-MMSE)[23], the Korean Montreal Cognitive Assessment (K-MoCA)[24], the sum of boxes in Clinical Dementia Rating (CDR)[25], the global CDR score[26], Korean Instrumental Activities of Daily Living (K-IADL)[27], the total score of the Unified Parkinson’s Disease Rating Scale (UPDRS)[28], the motor score of UPDRS[29], Hoehn and Yahr staging (H&Y staging)[30], and the Schwab & England Activities of Daily Living scale (Schwab & England ADL)[31]; and REM sleep behavior disorders. The definitions of the explanatory variables are shown in Table 1.
Table 1

Measurement and definition of variables

| Variable | Measurement | Characteristics |
| --- | --- | --- |
| Sociodemographic factors | Age | Continuous variable |
| | Gender | Male or female |
| | Education | Middle school graduate and below or high school graduate and above |
| | Mainly used hand | Left hand, right hand, or both hands |
| | Family dementia history | Yes or No |
| | Family PD history | Yes or No |
| Environmental factors | Exposure to pesticide | Never, currently not exposed but exposed previously, or currently exposed to pesticide |
| Health behaviors | Pack-years | Non-smoking, 1-20, 21-40, 41-60, or ≥ 61 pack-years |
| | Coffee-drinking | Yes or No |
| | Mean coffee intake per day (cups/d) | No, ≤ 1, 2-3, or ≥ 4 cups |
| | Coffee drinking period (yr) | No, ≤ 5, 6-9, or ≥ 10 |
| Disease history | Carbon monoxide poisoning | Yes or No |
| | Manganese poisoning | Yes or No |
| | Encephalitis | Yes or No |
| | Traumatic brain injury | Yes or No |
| | Stroke | Yes or No |
| | Alcoholism | Yes or No |
| | Diabetes | Yes or No |
| | Hypertension | Yes or No |
| | Hyperlipidemia | Yes or No |
| | Atrial fibrillation | Yes or No |
| Neuropsychological characteristics | Total score of KMMSE | Continuous variable |
| | Total score of KMoCA | Continuous variable |
| | Global CDR score | Continuous variable |
| | Sum of boxes in CDR | Continuous variable |
| | K-IADL | Continuous variable |
| | Total score of UPDRS | Continuous variable |
| | Motor score of UPDRS | Continuous variable |
| | H&Y staging | Continuous variable |
| | Schwab & England ADL | Continuous variable |
| Sleep behavior disorders | REM sleep behavior disorders | Yes or No |
| Exercise characteristics related to PD (PD related motor signs) | Tremor | Yes or No |
| | Rigidity | Yes or No |
| | Bradykinesia | Yes or No |
| | Postural instability | Yes or No |
| | LMC | Yes or No |

PD: Parkinson’s disease; KMMSE: Korean Mini Mental State Examination; KMoCA: Korean Montreal Cognitive Assessment; CDR: Clinical Dementia Rating; K-IADL: Korean Instrumental Activities of Daily Living; UPDRS: Unified Parkinson’s Disease Rating Scale; H&Y staging: Hoehn and Yahr staging; Schwab & England ADL: Schwab & England Activities of Daily Living scale; REM: Rapid eye movement; LMC: Late motor complications.

Development and evaluation of EOPD prediction model

The EOPD prediction model was developed using an RF algorithm, and the accuracy of the developed model was compared with that of the naive Bayesian model and discriminant analysis. All analyses were performed using R version 3.5.2 (R Foundation for Statistical Computing, Vienna, Austria). RF is an ensemble classifier that trains multiple randomized decision trees and is a machine learning method based on the meta-learning of decision trees. It consists of a training stage, in which many decision trees are built, and a test stage, in which an input vector is classified or predicted. The ensemble can be expressed as Forest F = {f1, …, fn}. The distributions obtained from the decision trees of the forest were averaged over the number (T) of decision trees and were then classified. To combine the predictions for each sample, the average was used when the target variable was continuous, and the majority vote was used when it was categorical (Figure 2).
Figure 2

The random forest.
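
The combination rule described above (averaging for a continuous target, majority vote for a categorical one) can be sketched in a few lines. This is an illustration only, not the R code used in the study; the function name `combine_predictions` and the example tree outputs are hypothetical.

```python
from collections import Counter
from statistics import mean


def combine_predictions(tree_outputs, categorical):
    """Combine the outputs of the individual trees for one sample.

    categorical=True  -> majority vote over class labels
    categorical=False -> arithmetic mean of numeric predictions
    """
    if categorical:
        # most_common(1) returns the label with the highest count;
        # ties go to the label that was encountered first.
        return Counter(tree_outputs).most_common(1)[0][0]
    return mean(tree_outputs)


# Hypothetical outputs from five trees for one patient
votes = ["EOPD", "PD-NC", "EOPD", "EOPD", "PD-NC"]
label = combine_predictions(votes, categorical=True)             # "EOPD"

# Hypothetical numeric outputs from three trees
score = combine_predictions([0.2, 0.4, 0.9], categorical=False)
```

Because the outcome in this study is binary (EOPD: yes or no), the majority-vote branch is the one that applies to the model described here.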

RF is similar to bagging in that it improves stability by combining decision trees generated from multiple bootstrap samples, based on the majority rule. However, it is conceptually different from bagging because it uses randomly selected explanatory variables in each bootstrap sample. RF is, in theory, resistant to overfitting because it introduces randomness for both features and learning instances. Moreover, it is not much affected by noise or outliers, and it is more accurate than other machine learning methods, such as single decision trees. The accuracy of RF increases as the number of trees increases. However, the gain may reach an elbow point, beyond which the slope of improvement drops steeply. Moreover, each tree is more likely to have a more complex structure when non-critical explanatory variables are selected. Consequently, this study used a grid search to minimize problems such as the elbow point by tuning mtry, the RF hyperparameter that sets the number of candidate explanatory variables, in advance. The procedure of developing an RF-based prediction model is presented in Figure 3.
Figure 3

The development process of a random decision forest-based prediction model.
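
The two sources of randomness mentioned above, bootstrap sampling of patients and random selection of candidate variables, can be sketched as follows. This is a minimal illustration, not the study's implementation: the helper names are hypothetical, the seed is arbitrary, and the feature count of 30 is taken from the number of variables listed in Table 4. In the standard RF algorithm the candidate variables are redrawn at every split of a tree.

```python
import random


def bootstrap_with_oob(n_samples, rng):
    """Draw a bootstrap sample (with replacement) and return it with its out-of-bag indices."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    # Out-of-bag cases are those never drawn; they serve as a built-in test set for this tree.
    oob = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, oob


def candidate_features(n_features, mtry, rng):
    """Pick mtry distinct candidate variables (without replacement)."""
    return sorted(rng.sample(range(n_features), mtry))


rng = random.Random(0)                      # fixed seed so the sketch is reproducible
in_bag, oob = bootstrap_with_oob(342, rng)  # 342 patients, as in this study
features = candidate_features(30, 4, rng)   # mtry = 4 out of 30 variables

print(len(in_bag), len(oob), features)
```

On average roughly a third of the cases are left out of each bootstrap sample, which is what makes the out-of-bag error usable for tuning mtry.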

Comparison of model prediction accuracy

This study selected an algorithm with the best model performance as the final model by comparing the overall accuracy of RF, discriminant analysis, and the naive Bayesian model. Moreover, this study showed the variable importance of the final model. A partial dependence plot was presented to visually confirm the marginal effects of an input variable with the highest importance on a response variable. The function of partial dependence is given in the following Equation.
Equation
In the above equation, p1 (x, xic) is Pr (Y = 1), calculated from a specific value of the variable of interest (x) and the fixed values of the remaining predictors (xic). This probability is calculated as the proportion of trees in the random decision forest that classify the observation into the Y = 1 category. In other words, partial dependence shares the same concept as the log odds of the logit model, and it is the mean of the log odds calculated over all observations i.
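
The equation itself is not reproduced in this text. One form that is consistent with the description above (the mean, over all n observations, of the log odds of Y = 1) is given below as an assumption rather than as the author's exact formula; the symbol for the averaged function is introduced here.

```latex
\bar{f}(x) \;=\; \frac{1}{n}\sum_{i=1}^{n}
  \log\frac{p_1(x,\,x_{iC})}{1 - p_1(x,\,x_{iC})}
```

For a two-class problem, the centered-log form used by the `partialPlot` function of the R randomForest package equals one half of this log odds, so the two versions produce curves of the same shape.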

RESULTS

General characteristics of the participants

The general characteristics of 342 participants with PD were analyzed (Table 2). The mean age of the subjects was 57.3 years old (SD = 5.7). The initial age at diagnosis of PD was 56.7 years old (SD = 5.9). Smokers made up 88.6% of the participants, subjects with a family history of PD were 5.1% of the participants, and subjects with a family history of dementia were 7.4% of the participants. It was found that 19.3% of the subjects had EOPD.
Table 2

General characteristics of the subjects (n = 342)

| Characteristics | Category | n (%) unless noted |
| --- | --- | --- |
| Age, mean ± SD (yr) | | 57.3 ± 5.7 |
| K-MMSE, mean ± SD | | 25.6 ± 4.0 |
| K-MoCA, mean ± SD | | 21.2 ± 5.1 |
| Global CDR score, mean ± SD | | 0.4 ± 0.3 |
| Sum of boxes in CDR, mean ± SD | | 1.5 ± 1.7 |
| K-IADL, mean ± SD | | 1.0 ± 2.4 |
| Total score of UPDRS, mean ± SD | | 41.3 ± 21.8 |
| Motor score of UPDRS, mean ± SD | | 23.1 ± 11.1 |
| H&Y staging, mean ± SD | | 2.3 ± 0.6 |
| Schwab & England ADL, mean ± SD | | 77.5 ± 15.0 |
| Gender | Male | 174 (50.9) |
| | Female | 168 (49.1) |
| Education | Middle school graduate and below | 195 (57.0) |
| | High school graduate and above | 147 (43.0) |
| Handedness | Right | 318 (93.0) |
| | Left | 15 (4.4) |
| | Both hands | 9 (2.6) |
| Family PD history | No | 279 (94.9) |
| | Yes | 15 (5.1) |
| Family dementia history | No | 264 (92.6) |
| | Yes | 21 (7.4) |
| Smoking (pack year) | 1-20 | 18 (7.9) |
| | 21-40 | 9 (2.6) |
| | 41-60 | 3 (0.9) |
| | 61+ | 303 (88.6) |
| Coffee consumption | No | 174 (50.9) |
| | Yes | 168 (49.1) |
| Carbon monoxide poisoning | No | 294 (93.3) |
| | Yes | 21 (6.7) |
| Traumatic brain injury | No | 306 (97.1) |
| | Yes | 9 (2.9) |
| Diabetes | No | 276 (82.3) |
| | Yes | 60 (17.7) |
| Hypertension | No | 249 (73.5) |
| | Yes | 90 (26.5) |
| Hyperlipidemia | No | 303 (89.4) |
| | Yes | 36 (10.6) |
| Atrial fibrillation | No | 336 (99.1) |
| | Yes | 3 (0.9) |
| Tremor | No | 120 (36.0) |
| | Yes | 213 (64.0) |
| Rigidity | No | 24 (7.2) |
| | Yes | 309 (92.8) |
| Bradykinesia | No | 21 (6.3) |
| | Yes | 312 (93.7) |
| Postural instability | No | 159 (50.5) |
| | Yes | 156 (49.5) |
| REM sleep behavior disorders | No | 195 (61.3) |
| | Yes | 123 (38.7) |
| Late motor complications | Only ON-OFF/Wearing OFF | 57 (17.9) |
| | Only levodopa-induced dyskinesia | 12 (3.8) |
| | Both ON-OFF/Wearing OFF and levodopa-induced dyskinesia are present | 48 (15.1) |
| | Both ON-OFF/Wearing OFF and levodopa-induced dyskinesia are absent | 201 (63.2) |
| Depression | No | 147 (67.1) |
| | Yes | 72 (32.9) |

K-MMSE: Korean Mini Mental State Examination; K-MoCA: Korean Montreal Cognitive Assessment; CDR: Clinical Dementia Rating; K-IADL: Korean Instrumental Activities of Daily Living; UPDRS: Unified Parkinson’s Disease Rating Scale; H&Y staging: Hoehn and Yahr staging; Schwab & England ADL: Schwab & England Activities of Daily Living; PD: Parkinson’s disease; REM: Rapid eye movement.

Development of the EOPD prediction model using RF

This study varied mtry, the number of explanatory variables used in the decision trees constituting the RF, from 3 to 13, and selected the value with the smallest out-of-bag (OOB) error. The changes in OOB error are presented in Table 3. The optimal mtry in this study was 4, which showed the lowest error rate (10.5%). When ntree, the number of trees generated, and mtry were set to 500 and 4, respectively, the final RF model of this study had an overall accuracy of 89.5%.
Table 3

Error of Out-Of-Bag

| Mtry (n) | Error of Out-Of-Bag |
| --- | --- |
| 3 | 0.140 |
| 4 | 0.105 |
| 5 | 0.149 |
| 6 | 0.132 |
| 7 | 0.140 |
| 8 | 0.123 |
| 9 | 0.149 |
| 10 | 0.123 |
| 11 | 0.140 |
| 12 | 0.158 |
| 13 | 0.149 |
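
The selection step of the grid search can be retraced directly from the values in Table 3. The sketch below only replays the reported numbers; it does not refit any model, and the variable names are illustrative.

```python
# Out-of-bag error for each mtry value, copied from Table 3
oob_error = {
    3: 0.140, 4: 0.105, 5: 0.149, 6: 0.132, 7: 0.140, 8: 0.123,
    9: 0.149, 10: 0.123, 11: 0.140, 12: 0.158, 13: 0.149,
}

# Grid search selection: keep the mtry with the smallest out-of-bag error
best_mtry = min(oob_error, key=oob_error.get)

# Overall accuracy implied by that error rate
accuracy = 1 - oob_error[best_mtry]

print(best_mtry, round(accuracy, 3))  # 4 0.895
```

The implied accuracy of 0.895 matches the 89.5% overall accuracy reported for the final model.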

Selection of the final EOPD prediction model

The overall accuracy of the RF was 89.5%, and it was higher than that of both discriminant analysis (78.3%) and the naive Bayesian model (85.8%). Therefore, the RF was assumed to be the most accurate prediction model among EOPD prediction models, and it was selected as the final prediction model. In Figure 4, the black line indicates the change in the error rate across 500 bootstrap samples; the error rate became relatively stable when the number of bootstrap samples exceeded 60. Additionally, the multidimensional scaling plot of RF, which visualizes the classification results through a two-dimensional diagram, is presented in Figure 5.
Figure 4

Error rate of the random forest model (500 trees).

Figure 5

Multidimensional scaling plot of random forest (blue = early-onset Parkinson dementia and red = Parkinson’s disease)

Importance of variables in the final EOPD prediction model

The normalized importance of variables in the RF model, the final model, is presented in Figure 6 and Table 4. In this model, K-MMSE score, K-MoCA score, sum of boxes in CDR, global score of CDR, motor score of UPDRS, and K-IADL score were confirmed as the major variables with high weight for EOPD prediction. Among them, K-MMSE score was the most important factor in the final model.
Figure 6

Importance of variables in the random forest-based early-onset Parkinson dementia prediction model (only the top six are presented). K-MMSE: Korean Mini Mental State Examination; K-MoCA: Korean Montreal Cognitive Assessment; CDR: Clinical Dementia Rating; K-IADL: Korean Instrumental Activities of Daily Living; UPDRS: Unified Parkinson’s Disease Rating Scale.

Table 4

The normalized importance of variables in the random forest model

| Variables | Mean decrease Gini |
| --- | --- |
| K-MMSE | 7.224 |
| K-MoCA | 2.992 |
| Sum of boxes in CDR score | 2.872 |
| Global CDR score | 2.304 |
| Motor UPDRS | 2.104 |
| K-IADL | 1.720 |
| Total UPDRS | 1.587 |
| Schwab & England ADL | 1.258 |
| H&Y staging | 1.040 |
| Late motor complications | 0.775 |
| Consumption of coffee | 0.699 |
| Education level | 0.527 |
| Pack year | 0.505 |
| BDI | 0.409 |
| Tremor | 0.338 |
| Postural instability | 0.338 |
| Rigidity | 0.331 |
| Gender | 0.255 |
| REM sleep behavior disorders | 0.249 |
| Hypertension | 0.168 |
| Handedness | 0.151 |
| Diabetes | 0.146 |
| Hyperlipidemia | 0.129 |
| Carbon monoxide poisoning | 0.124 |
| Family PD history | 0.074 |
| Family dementia history | 0.066 |
| Bradykinesia | 0.034 |
| Manganese poisoning | 0.013 |
| Traumatic brain injury | 0.002 |
| Atrial fibrillation | < 0.001 |

K-MMSE: Korean Mini Mental State Examination; K-MoCA: Korean Montreal Cognitive Assessment; CDR: Clinical Dementia Rating; K-IADL: Korean Instrumental Activities of Daily Living; UPDRS: Unified Parkinson’s Disease Rating Scale; H&Y staging: Hoehn and Yahr staging; Schwab & England ADL: Schwab & England Activities of Daily Living; PD: Parkinson’s disease; REM: Rapid eye movement; BDI: Beck Depression Inventory.
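
The importance measure in Table 4 is built from decreases in Gini impurity. As a rough illustration of what a single such decrease is, the sketch below computes it for one split. The parent node uses the class counts of this study (66 EOPD, 276 PD-NC); the child counts are hypothetical and chosen only so that they sum to the parent. In a random forest, these decreases are accumulated over every split that uses a given variable and averaged across trees.

```python
def gini(counts):
    """Gini impurity of a node, given the number of cases in each class."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its two children."""
    n = sum(parent)
    weighted = (sum(left) / n) * gini(left) + (sum(right) / n) * gini(right)
    return gini(parent) - weighted


parent = (66, 276)   # (EOPD, PD-NC) counts from this study
left = (50, 20)      # hypothetical child node, e.g. low cognitive score
right = (16, 256)    # hypothetical child node, e.g. high cognitive score

decrease = gini_decrease(parent, left, right)
print(round(gini(parent), 3), round(decrease, 3))
```

A variable that repeatedly produces large decreases of this kind, as K-MMSE does in Table 4, ends up with a high mean decrease Gini.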

The partial dependence plot for K-MMSE, the most important variable in the EOPD prediction model, is presented in Figure 7. When the other factors (variables) were identical, the probability of the absence of EOPD tended to decrease as K-MMSE scores increased (Figure 7). In other words, it was confirmed that K-MMSE had the largest impact on EOPD prediction even after adjusting for other neuropsychological tests, PD symptoms, medical history, REM sleep disorder, depression, and sociodemographic factors.
Figure 7

Partial dependence plot. K-MMSE: Korean Mini Mental State Examination; K-MoCA: Korean Montreal Cognitive Assessment; CDR: Clinical Dementia Rating.
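
A curve like the one in Figure 7 can be computed by fixing the variable of interest at each grid value, keeping every other predictor at its observed value, and averaging the log odds over all observations. The sketch below illustrates that procedure under stated assumptions: `toy_p1` is a hypothetical stand-in for the fitted model's Pr(Y = 1), its coefficients are arbitrary and carry no clinical meaning, and the observed values in `others` are invented.

```python
from math import exp, log


def toy_p1(mmse, other):
    """Hypothetical stand-in for a fitted model's Pr(Y = 1); coefficients are arbitrary."""
    return 1.0 / (1.0 + exp(-(0.3 * (mmse - 24) + 0.5 * other)))


def partial_dependence(model, x_value, others):
    """Mean log odds of Y = 1 with the variable of interest fixed at x_value."""
    log_odds = []
    for other in others:
        p = model(x_value, other)           # prediction with x overridden, rest as observed
        log_odds.append(log(p / (1.0 - p)))
    return sum(log_odds) / len(log_odds)


# Invented observed values of the remaining predictor for four patients
others = [-1.0, 0.0, 0.5, 2.0]

# Evaluate the curve on a coarse grid of scores
curve = [(x, partial_dependence(toy_p1, x, others)) for x in range(18, 31, 3)]
for x, value in curve:
    print(x, round(value, 3))
```

With a real model, `toy_p1` would be replaced by the forest's predicted class proportion, and `others` by the full rows of observed covariates.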

DISCUSSION

Choosing a test with high feasibility and accuracy is critical for easily distinguishing EOPD from PD in the point-of-care environment. This requires a comprehensive comparison of the prediction accuracy of various predictors of EOPD, such as neuropsychological tests, lifestyle, sociodemographic factors, PD symptoms, depression, and REM sleep disorders. This study used RF to analyze the prediction accuracy of various cognitive screening tests and neuropsychological profiles that could distinguish EOPD from PD. The tests were ranked by importance, from highest to lowest, as follows: K-MMSE score, K-MoCA score, sum of boxes in CDR, global score of CDR, motor score of UPDRS, and K-IADL score. It is noteworthy that the motor score of UPDRS, in addition to cognitive screening tests, was an important test in predicting EOPD. This is probably because participants with EOPD were more likely to show non-typical symptoms, such as movement problems, gait problems, and coordination problems[22], and the motor score of UPDRS could comprehensively measure these PD motor symptoms. In this study, K-MMSE score was the most important neuropsychological test for detecting EOPD. Moreover, the accuracy of K-MMSE was higher than that of K-MoCA. An essential factor in the diagnosis of EOPD is a decline in cognitive function that began after the onset of PD. This decline gradually progresses in various domains, such as executive function, memory, and visuospatial function. In particular, it has been reported that PDD patients experience impaired executive functions, reflecting a decrease in problem-solving ability from the early stages of dementia[32], as well as impaired visuospatial function[33]. These impairments are reported to be significantly milder in other types of dementia, such as Alzheimer’s disease[34].
K-MMSE and K-MoCA have been widely used as screening tests that simply compare the decline of various cognitive functions by types and comprehensively assess cognitive functions prior to in-depth tests in the point-of-care environment. The results of this study showed that the accuracy of K-MMSE was higher than that of K-MoCA when distinguishing EOPD from PD. Therefore, it is believed that if K-MMSE is given priority over other cognitive screening tests in order to distinguish EOPD from PD, the accuracy of detecting EOPD will be higher than conducting other screening tests first. However, further epidemiological studies will be needed to fully comprehend the results of the better accuracy of K-MMSE than that of K-MoCA while detecting EOPD, using the developed ensemble-based prediction model. Machine learning has the disadvantage of being unable to interpret the derived results, although it has better prediction accuracy than the traditional regression model. Therefore, future studies are required to develop explainable artificial intelligence models that have high prediction accuracy and are able to interpret results. Another finding of this study was that the accuracy of RF was higher than that of both the naive Bayesian model and discriminant analysis. These results agreed with Byeon[35], which showed that the ensemble algorithm was more accurate than a regression analysis or decision trees for predicting cognitive impairment in old age. RF has high prediction performance because it generates various decision trees from a number of bootstrap samples, and the possibility of overfitting is low[35]. In particular, RF showed good predictive performance even when classifying binary variables using imbalanced disease data[15,18]. Therefore, it is believed that, compared to traditional statistical techniques such as discriminant analysis, using RF will increase accuracy while exploring major variables, allowing us to predict EOPD. 
The importance of this study was that it identified the prediction accuracy of sociodemographic factors, PD motor symptoms, PD non-motor symptoms, REM sleep disorder, and neuropsychological profiles for distinguishing EOPD from PD, using national registry data provided by the National Biobank of Korea. The limitations of the study are as follows: (1) The data source of this study was the registry data of multiple institutions and subjects were not randomly sampled; (2) The prediction model did not include candidate markers, genetic information, or biomarkers; (3) Genes such as PRKN and LRRK2, which are known to be risk factors for PD and highly related to cognitive functions, were not examined; and (4) Even though administration of PD medicine could affect the results of cognitive tests, it was not considered as an input variable of the prediction model. Therefore, it is expected that more clinically meaningful results can be derived when a prediction model is developed by including genetic information or biomarkers in addition to neuropsychological tests. Furthermore, since PD medicine influences the expression of behavioral and cognitive symptoms, its use should be investigated when developing an EOPD prediction model in the future.

CONCLUSION

It was found that Parkinson-related motor symptoms (e.g., motor score of UPDRS) and instrumental daily performance (e.g., K-IADL score), in addition to cognitive screening indicators (e.g., K-MMSE score and K-MoCA score), were highly accurate predictors in EOPD prediction. Moreover, the accuracy of RF was higher than that of both the naive Bayesian model and discriminant analysis. This study showed the need for a customized screening test that can detect EOPD early using biomarkers or genetic big data.

ARTICLE HIGHLIGHTS

Research background

Despite the frequent progression from Parkinson’s disease (PD) to Parkinson dementia, the basis to diagnose early-onset Parkinson dementia (EOPD) in the early stage is still insufficient.

Research motivation

There are limitations to developing a highly reliable model to predict EOPD using individual indicators such as PD symptoms and neuropsychological tests. In order to develop an accurate prediction model, it is necessary to develop a comprehensive model that includes sociodemographic indices, Parkinson’s motor symptoms, Parkinson’s non-motor symptoms, rapid eye movement (REM) sleep behavior disorder, and neuropsychological indices.

Research objectives

The objectives of our study were to explore the prediction accuracy of sociodemographic factors, Parkinson’s motor symptoms, Parkinson’s non-motor symptoms, and REM sleep disorder for diagnosing EOPD using PD multicenter registry data.

Research methods

This study analyzed the Parkinson’s Disease Epidemiology multicenter registry data provided by the National Biobank of Korea. The sample comprised 342 Parkinson patients younger than 65 years (66 EOPD patients and 276 PD patients with normal cognition). The EOPD prediction model was developed using a random forest algorithm, and its accuracy was compared with that of the naive Bayesian model and discriminant analysis.
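The comparison described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the registry data are not reproduced here, so the example uses synthetic data from scikit-learn with the same sample size and a similar class ratio (66 of 342). The feature count, the hyperparameters, the use of linear discriminant analysis, and the 5-fold stratified cross-validation are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the registry: 342 rows, about 81% in the majority class
# (276/342). Feature count and all values are made up for illustration.
X, y = make_classification(
    n_samples=342,
    n_features=12,
    n_informative=6,
    weights=[276 / 342],
    random_state=0,
)

# The three model families named in the text; hyperparameters are assumptions.
models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "naive_bayes": GaussianNB(),
    "discriminant_analysis": LinearDiscriminantAnalysis(),
}

# Stratified folds keep the class ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def compare_models(models, X, y, cv):
    """Return the mean cross-validated accuracy for each model."""
    return {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean())
        for name, model in models.items()
    }


scores = compare_models(models, X, y, cv)
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Because the data are synthetic, the printed accuracies say nothing about EOPD; the sketch only shows how the three classifiers can be evaluated on identical folds so that their accuracies are comparable.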

Research results

When the predictors of EOPD were compared using “normalized importance of variables”, the Korean Mini Mental State Examination score was the most important predictor of EOPD. In addition, the accuracy of the random forest was higher than that of the naive Bayesian model and that of discriminant analysis.
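As a rough illustration of how a “normalized importance” ranking can be produced, the sketch below rescales raw importance values so that the largest equals 100%. This follows one common convention and is an assumption; the source does not state the exact formula used. The function name `normalized_importance` and the raw values are hypothetical and are not figures reported in the study.

```python
def normalized_importance(importances):
    """Scale raw importances so that the largest one equals 100 (percent)."""
    top = max(importances.values())
    if top <= 0:
        raise ValueError("at least one importance must be positive")
    return {name: 100.0 * value / top for name, value in importances.items()}


# Hypothetical raw importances; these are NOT values reported in the study.
raw = {"K-MMSE": 0.30, "K-MoCA": 0.24, "CDR sum of boxes": 0.18, "K-IADL": 0.12}

# Rank variables from most to least important on the normalized scale.
ranked = sorted(normalized_importance(raw).items(), key=lambda kv: kv[1], reverse=True)
for name, pct in ranked:
    print(f"{name}: {pct:.1f}%")
```

On this scale every variable is read relative to the strongest predictor, which is how a single variable can be described as the most important factor in the final model.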

Research conclusions

Compared with traditional statistical techniques such as discriminant analysis, random forest is expected to increase prediction accuracy while also identifying the major variables that predict EOPD.

Research perspectives

It is necessary to develop a customized screening test that can detect EOPD early using biomarkers or genetic big data.

ACKNOWLEDGEMENTS

The authors wish to thank the NB-Korea for providing the raw data.