Random-Forest-Algorithm-Based Applications of the Basic Characteristics and Serum and Imaging Biomarkers to Diagnose Mild Cognitive Impairment.

Juan Yang1,2, Haijing Sui3, Ronghong Jiao4, Min Zhang5, Xiaohui Zhao5, Lingling Wang6, Wenping Deng7, Xueyuan Liu1,2.   

Abstract

BACKGROUND: Mild cognitive impairment (MCI) is considered the early stage of Alzheimer's Disease (AD). The purpose of our study was to analyze the basic characteristics and serum and imaging biomarkers for the diagnosis of MCI patients as a more objective and accurate approach.
METHODS: The Montreal Cognitive Assessment (MoCA) was used to test 119 patients aged ≥65. The serum biomarkers measured were preprandial blood glucose, triglyceride, total cholesterol, Aβ1-40, Aβ1-42, and P-tau. All subjects were scanned with 1.5T MRI (GE Healthcare, WI, USA) to obtain DWI, DTI, and ASL images. DTI was used to calculate fractional anisotropy (FA), DWI to calculate the apparent diffusion coefficient (ADC), and ASL to calculate cerebral blood flow (CBF). All images were then registered to Montreal Neurological Institute (MNI) space. The medians of FA, ADC, and CBF were extracted in 116 brain regions by automated anatomical labeling. The basic characteristics included gender, education level, and history of hypertension, diabetes, and coronary heart disease. The data were randomly divided into a training set and a test set. A recursive random forest algorithm was applied to the diagnosis of MCI, and recursive feature elimination (RFE) was used to screen the significant basic characteristics and serum and imaging biomarkers. The overall accuracy, sensitivity, specificity, ROC curve, and area under the curve (AUC) were calculated on the test set.
RESULTS: When imaging biomarkers were the variables of the MCI diagnostic model, the training accuracy of the random forest was 100%, the test accuracy was 86.23%, the sensitivity was 78.26%, and the specificity was 100%. When the basic characteristics and the serum and imaging biomarkers were combined as variables of the MCI diagnostic model, the training accuracy was 100%, the test accuracy was 97.23%, the sensitivity was 94.44%, and the specificity was 100%. RFE analysis showed that age, Aβ1-40, and cerebellum_4_6 were the most important basic characteristic, serum biomarker, and imaging biomarker, respectively.
CONCLUSION: Imaging biomarkers can effectively diagnose MCI. The diagnostic capacity of the basic characteristics or serum biomarkers alone is limited, but combining them with imaging biomarkers can improve the diagnostic capacity, as indicated by the sensitivity of 94.44% and the specificity of 100% in our model. As a machine learning method, a random forest can help diagnose MCI effectively while screening important influencing factors.

Keywords:  Machine learning; algorithms; cognitive dysfunction; diagnostic tool; mild cognitive impairment; screening

Year:  2022        PMID: 35088670      PMCID: PMC9189735          DOI: 10.2174/1567205019666220128120927

Source DB:  PubMed          Journal:  Curr Alzheimer Res        ISSN: 1567-2050            Impact factor:   3.040


INTRODUCTION

Mild cognitive impairment (MCI) is well recognized as the early stage of Alzheimer's Disease (AD) [1-3]. About 38.2% of MCI patients developed AD within 5-10 years, and a 2017 meta-analysis demonstrated that 5-15% of those aged over 65 with amnestic MCI (aMCI) developed AD every year [4, 5]. The incidence of MCI rises with age, and the prevalence of aMCI in people aged ≥60 is approximately 17%-25% in different countries [6-9]. Screening for MCI is therefore important; however, up to now, the diagnosis of MCI has relied on rating scales, which are known to be subjective and heterogeneous [10, 11]. Since imaging, serum, and demographic characteristics have been considered risk factors of MCI, researchers have continued to pursue biomarkers of MCI that indicate progression to AD [12-14].

Neuroimaging biomarkers are considered promising because they are noninvasive and convenient [15]. Common neuroimaging biomarkers are found across different MRI modalities and sequences, such as Diffusion Tensor Imaging (DTI), Diffusion-Weighted Imaging (DWI), and Arterial Spin Labeling (ASL). As previously reported, DTI has been used to examine correlations with dynamic Vasomotor Reactivity (DVR), which indicates dysregulation of the cerebral microcirculation, considered an early cause of cognitive impairment [16]. DTI can help find microstructural changes; for example, reduced anisotropy has been associated with MCI or the early stage of AD [17, 18]. Gyebnár et al. performed voxel-wise and region of interest (ROI) analyses of fractional anisotropy (FA) with ANOVA and found that FA was lower in white matter ROIs of individuals with MCI; their logistic regression showed that measuring FA of the crus of fornix along with grey matter volumetry improved the discrimination of aMCI from non-MCI individuals [17]. Tu et al. concluded that FA measures appeared to be more sensitive DTI parameters than MD values in detecting microstructural changes between subjective cognitive decline and MCI. They reported that MCI had significant inverse correlations with the FA value within the genu of the corpus callosum and the left forceps minor and that, based on regression analysis, MCI was best predicted by the FA value within the left forceps minor [19]. Ray et al. evaluated regional alterations in the ADC of cortical gray and white matter and of subcortical structures known to be involved in MCI, stating that ADC from gray and white matter of different brain regions could be analyzed by applying an automated template-masking method in conjunction with a skeleton-based region competition segmentation algorithm [20]. Moreover, DTI-based network measures were found to be novel predictors of AD progression [21], which supports the evidence that a vascular mechanism is an underlying component of cognitive impairment [22]. Recently, new mathematical methods have been proposed that use multi-modality MRI to identify potential imaging biomarkers of MCI or pre-AD, such as binary network matrices derived from DWI and DTI [23].

Regarding serum biomarkers, some studies have suggested that elevated total cholesterol (TC), triglyceride (TG), and low-density lipoprotein cholesterol (LDL-C) levels are potential risk factors for cognitive impairment [24-27]. Glucose has also been considered a biomarker of MCI or AD [28, 29]. Regarding the basic characteristics, age, gender, education, and hypertension are also regarded as risk factors and biomarkers of MCI or AD [30-33]. However, few studies have explored why imaging and serum biomarkers and basic characteristics can be considered the best variables for diagnosing MCI.

The purpose of our study was to apply the random-forest algorithm to the basic characteristics and the serum and imaging biomarkers in order to develop a more accurate and objective diagnostic model of MCI.

MATERIALS AND METHODS

Ethical Statement

This study was approved by the Medical Ethics Committee of Shanghai Pudong New Area People’s Hospital, Shanghai, China (Prylz-2020-085). Written informed consent was obtained from all participants or their legally acceptable representatives.

Subject Recruitment

MCI was defined according to the following criteria: 1) cognitive concern or complaint by the subject, informant, nurse, or physician, with CDR = 0.5; 2) objective impairment in at least 1 cognitive domain, based on performance 1.5 SD below the mean using the norms obtained in the pilot study; 3) essentially normal functional activities, as determined by the CDR and the ADL evaluation; and 4) absence of dementia according to DSM-IV. All subjects were recruited from the local community. The Montreal Cognitive Assessment (MoCA) was used to assess MCI in individuals aged ≥65 without a medical history of schizophrenia, mental retardation, Parkinson's disease, stroke, or other medical conditions that could interfere with the assessment. Those classified as MCI patients or normal controls underwent MRI examination after voluntarily signing informed consent forms. Based on history, MRI examination, and blood biomarker testing, we excluded those with vascular disease, Parkinson’s disease, tumors, mental disorders, abnormal hearing or vision, or drug abuse.

Medical and Neurological Examination

The MoCA/CDR/ADL questionnaires were administered and scored by five physicians who had received unified training in MoCA testing. The demographic data comprised age, gender, and medical history of hypertension, diabetes, and heart disease. Participants were first screened by MoCA, with a score of 26 as the threshold between normal cognition and MCI [34].

MRI Examination

A 1.5T MRI (GE Healthcare, WI, USA) examination was performed at the Department of Image, Shanghai Pudong New Area People’s Hospital in Shanghai, China. The sequences included T1, DTI, ASL, and DWI. The indexes selected were fractional anisotropy (FA), cerebral blood flow (CBF), and apparent diffusion coefficient (ADC).

Brain Imaging Segmentation

The skull-stripped images were obtained using the mixture models before registering to the Montreal Neurological Institute (MNI) space using the SyN method [35]. FA, ADC, and CBF median values were extracted from 116 brain regions using automated anatomical labeling parcellation (AAL) [36].
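The regional extraction step can be illustrated with a minimal sketch. It assumes the atlas and the parameter map have already been registered to the same space and flattened into equal-length voxel lists; the function name `regional_medians`, the toy label values, and the toy FA values are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import median


def regional_medians(labels, values, background=0):
    """Return {region_label: median value} over the voxels of each atlas region."""
    if len(labels) != len(values):
        raise ValueError("labels and values must have the same length")
    by_region = defaultdict(list)
    for label, value in zip(labels, values):
        if label != background:  # skip voxels outside the atlas
            by_region[label].append(value)
    return {label: median(vals) for label, vals in sorted(by_region.items())}


# Toy flattened volumes (hypothetical): atlas label and FA value per voxel.
atlas_labels = [0, 1, 1, 1, 2, 2, 0, 2, 2]
fa_values = [0.0, 0.30, 0.50, 0.40, 0.20, 0.60, 0.0, 0.40, 0.80]

fa_medians = regional_medians(atlas_labels, fa_values)
print(fa_medians)
```

With a 116-region atlas, the same call applied to the FA, ADC, and CBF maps would give 3 × 116 median features per subject. The median is less sensitive than the mean to outlier voxels at region boundaries.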

Serum Examination

Whole blood was collected from all participants by venipuncture after overnight fasting; 4 ml of blood was collected into an anticoagulant tube (BD Vacutainer, USA), which was kept for 1 h at room temperature (RT) before being centrifuged at 1,000 g for 10 min at RT. The resultant supernatant (serum) was divided into 2 Eppendorf tubes (1 ml each) and stored at -80°C until examination. Eight components were selected for this study: Aβ1-40, Aβ1-42, P-tau, preprandial blood glucose, total cholesterol, triglycerides, high-density lipoprotein cholesterol (HDL), and low-density lipoprotein cholesterol (LDL). Enzyme-linked immunosorbent assay (ELISA)-based techniques were used to assess serum Aβ1-40, Aβ1-42, and P-tau. The serum levels of Aβ1-40 (Cat. No: DAB140B, Sensitivity: 1.31-8.17 pg/ml), Aβ1-42 (Cat. No: DAB142, Sensitivity: 0.762-4.73 pg/ml), and P-tau (Cat. No: CSB-E17929h, Sensitivity: <7.8 pg/ml) were quantified using commercial ELISA kits purchased from R&D Systems (MN, USA) according to the manufacturer’s protocol. The concentrations of serum HDL, LDL, and preprandial blood glucose were determined with a Cobas C501 automatic biochemistry analyzer using the enzymatic conversion method. The kit was supplied by Roche Diagnostics GmbH (Mannheim, Germany).

Random-forest Algorithm

All the data were randomly split into a training set and a test set. The recursive random forest algorithm, a supervised machine learning method, was applied to the diagnosis of MCI. Recursive feature elimination (RFE) was used to screen the significant basic characteristics and serum and imaging biomarkers, and the overall accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curve, and area under the curve (AUC) were calculated on the test set. Diagnostic models of MCI were established separately on the basic characteristics, the serum biomarkers, and the imaging biomarkers, and jointly on all three sets of variables. We compared the models in terms of accuracy, sensitivity, specificity, and AUC. Afterward, through RFE, we analyzed the significant basic characteristics and serum and imaging biomarkers in the different models. The random forest algorithm was implemented in Python using Anaconda, a Python-based data science platform.
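The workflow described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the text names only Python and Anaconda, so the use of scikit-learn, the 70/30 stratified split, the number of trees, the number of retained features, and the synthetic data are all assumptions made here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the study data (assumption): 119 subjects, 40 features.
X, y = make_classification(
    n_samples=119, n_features=40, n_informative=8, random_state=0
)

# Random split into training and test sets (the 70/30 ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Recursive feature elimination wrapped around a random forest:
# repeatedly fit the forest and drop the least important features.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
selector = RFE(forest, n_features_to_select=10, step=2)
selector.fit(X_train, y_train)

# Evaluate the final forest (trained on the retained features) on the test set.
y_pred = selector.predict(X_test)
y_score = selector.predict_proba(X_test)[:, 1]
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()

metrics = {
    "accuracy": (tp + tn) / (tp + tn + fp + fn),
    "sensitivity": tp / (tp + fn),
    "specificity": tn / (tn + fp),
    "auc": roc_auc_score(y_test, y_score),
}
selected = np.flatnonzero(selector.support_)  # column indices of retained features

print(metrics)
print("retained feature indices:", selected)
```

In this sketch `selector.ranking_` gives the elimination order (rank 1 marks retained features), which is one way to obtain an importance ordering like those reported in the Results. Running the same pipeline once per feature group and once on all features combined would reproduce the separate and combined models described above.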

RESULTS

A total of 119 people aged over 65 were assessed by MoCA before voluntarily undergoing MRI and serum examination. Of the 119 subjects, 55 were placed in the MCI group and 64 in the NC group; their basic characteristics are listed in Table 1. After brain region segmentation, 116 cerebral areas were extracted (see the supplementary material), each providing the FA value from DTI, the ADC value from DWI, and the CBF value from ASL.

With the basic characteristics as the variables of the diagnostic model of MCI, the random forest gave a training accuracy of 96.38%, a test accuracy of 38.88%, a sensitivity of 63.15%, and a specificity of 58.82%, with AUC=0.56. With the serum biomarkers as the variables, the random forest gave a training accuracy of 100%, a test accuracy of 38.99%, a sensitivity of 44%, and a specificity of 27.27%, with AUC=0.35. With the imaging biomarkers as the variables, the random forest gave a training accuracy of 100%, a test accuracy of 86.23%, a sensitivity of 88.89%, and a specificity of 94.44%, with AUC=0.97. When the three sets of variables were combined, the random forest gave a training accuracy of 100%, a test accuracy of 97.23%, a sensitivity of 94.44%, and a specificity of 100%. ROC curves of the test set were obtained for the three sets of variables of the diagnostic model of MCI (Fig. ).

For the basic characteristics, the random forest algorithm ranked the variables in the following order of importance: age > hypertension > education > gender > diabetes > coronary heart disease (Fig. ). For the serum biomarkers, the order was: Aβ1-42/1-40 > Aβ1-42 > TAU > triglycerides > Aβ1-40 > LDL cholesterol (Fig. ). For the imaging biomarkers, the top ten were, in order: FA of left cerebellum_4_6, DWI of right insula, DWI of left olfactory, FA of left inferior frontal gyrus-opercular part, CBF of vermis_7, DWI of the posterior cingulate gyrus, DWI of right fusiform, FA of the left rolandic operculum, DWI of right olfactory, and FA of left insula. In particular, FA of cerebellum_4_6_L was the most important variable of the diagnostic model of MCI (Fig. ).
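For readers who want to see how sensitivity, specificity, and accuracy relate to one another, the sketch below computes them from confusion-matrix counts. The counts are hypothetical: the paper does not report a confusion matrix, and 17 of 18 MCI cases and 18 of 18 controls is just one set of counts that roughly matches the percentages reported for the combined model.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true MCI cases detected
    specificity = tn / (tn + fp)  # share of true controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical test set: 18 MCI cases (17 detected) and 18 controls (all cleared).
sens, spec, acc = diagnostic_metrics(tp=17, fn=1, tn=18, fp=0)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}, accuracy={acc:.2%}")
# prints: sensitivity=94.44%, specificity=100.00%, accuracy=97.22%
```

Accuracy is the prevalence-weighted average of sensitivity and specificity, so on any single test set it must lie between the two.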

DISCUSSION

Because the scale-dependent diagnosis of MCI is known to be subjective and heterogeneous, our study sought biomarkers that could serve as a more objective approach to identifying MCI in aged individuals and thereby support a more effective diagnostic model of MCI. We chose a random forest algorithm to analyze the basic characteristics and the serum and imaging biomarkers as the variables of our diagnostic model, investigating which variables were more accurate and reliable. In previous work, a backward elimination random forest (BWERF) improved accuracy because it used backward elimination to exclude noise genes and aggregated the individual importance values to decide which transcription factors (TFs) to retain [37]. Guo et al. used a random-forest algorithm to analyze biomarkers for predicting prognosis in patients with hepatocellular carcinoma and developed a model that was verified by external validation [38]. Our findings also suggested that the diagnostic model of MCI based on a random-forest algorithm was accurate in sorting out the important variables of the model.

In our study, we screened the basic characteristics, serum biomarkers, and imaging biomarkers using a random forest algorithm. The imaging biomarkers were found to be more accurate and reliable variables than the basic characteristics and serum biomarkers, with a sensitivity of 88.89% and a specificity of 94.44%. Moreover, when the three dimensions of basic characteristics, serum biomarkers, and imaging biomarkers were combined, the diagnostic model of MCI performed best, with a sensitivity of 94.44% and a specificity of 100%. The findings suggest that the random-forest algorithm can be quick, effective, and accurate in screening diagnostic variables from many candidates by ranking them according to their significance. They further suggest that, when applying the random forest algorithm, a diagnostic model with more variables may perform better. Random forest algorithms may therefore have advantages over traditional statistical methodology for diagnostic or prediction models.

In the current study, we analyzed through RFE the significant basic characteristics, serum biomarkers, and imaging biomarkers in the different models. In the diagnostic model in which the basic characteristics were the variables, the random forest algorithm gave the order of importance age > hypertension > education > gender > diabetes > coronary heart disease. Although these demographic characteristics have been reported in different studies to be closely correlated with MCI or AD [28, 39-42], few studies have reported an order of importance. In the diagnostic model in which the serum biomarkers were the variables, the random forest algorithm gave the order of importance Aβ1-42/1-40 > Aβ1-42 > TAU > triglycerides > Aβ1-40 > LDL cholesterol. Serum Aβ, P-tau, and LDL cholesterol have been considered effective biomarkers of MCI or AD [43-45]; here, we further evaluated their importance through RFE.

For the imaging biomarkers, FA, ADC, and CBF median values were extracted from 116 brain regions using an automated anatomical labeling parcellation template. This differs from the traditional selection of image regions of interest (ROI) in that it captures more comprehensive information about the whole brain, thus providing more accurate image information for machine learning and higher-quality image markers for the establishment of a diagnostic model. This advantage can be ascribed to the use of brain image segmentation techniques. Accordingly, our random-forest-based diagnostic model of MCI was found to be highly sensitive and specific. The top ten imaging biomarkers were FA of left cerebellum_4_6, DWI of right insula, DWI of left olfactory, FA of left inferior frontal gyrus-opercular part, CBF of vermis_7, DWI of the posterior cingulate gyrus, DWI of right fusiform, FA of the left rolandic operculum, DWI of right olfactory, and FA of left insula. In particular, FA of cerebellum_4_6_L was the most important variable of the current diagnostic model. This suggests that FA impairment of the left cerebellum could be a significant indicator in MCI individuals, which to some extent agrees with previously reported findings that an abnormal cerebellum was correlated with MCI [46], but differs from AD, in which the hippocampus was found to be impaired [47]. In general, hippocampal atrophy is accepted as an imaging biomarker for AD diagnosis [48]; however, in our study, the cerebellum_4_6 of the MCI patients was found to be altered. We therefore hypothesize that the cerebellum_4_6 may be altered earlier than hippocampal atrophy appears.

CONCLUSION

Based on the random forest, the diagnostic model of MCI with imaging features as variables was superior to the models with serum biomarkers or basic characteristics. The diagnostic model performed best when the basic characteristics were combined with the imaging and serum biomarkers. RFE identified the FA of the left cerebellum as the most important imaging biomarker. Moreover, the order of importance of the basic characteristics obtained through RFE was age > hypertension > education > gender > diabetes > coronary heart disease, and the order of importance of the serum biomarkers was Aβ1-42/1-40 > Aβ1-42 > TAU > triglycerides > Aβ1-40 > LDL cholesterol.

LIMITATION

This study had a small sample size, and it was difficult to obtain the serum and imaging biomarkers at the same time. The diagnostic model was validated only internally and lacks external validation. Relatively few serum biomarkers were included, which may have put them at a disadvantage compared with the imaging biomarkers as features of the diagnostic model of MCI. Further investigations are needed to determine whether adding more serum biomarkers could improve the serum-biomarker diagnostic model.

AUTHORS’ CONTRIBUTIONS

YJ conceived and designed the project. YJ performed all the experiments and wrote the manuscript. CGY performed the examination. DWP evaluated the random forest algorithm. ZM performed the brain imaging segmentation. All authors reviewed the manuscript.

Table 1. The basic characteristics and serum biomarkers of MCI and NC groups.

Characteristic                                   MCI (N=55)       NC (N=64)
Gender (Female)                                  32 (43.8%)       40 (54.8%)
Hypertension (Yes)                               41 (51.2%)       39 (48.8%)
Diabetes (Yes)                                   8 (57.1%)        6 (42.9%)
Coronary heart disease (Yes)                     14 (60.9%)       9 (39.1%)
Age (year)                                       72±5.24          70.20±4.77
Fasting plasma glucose (mmol/l)                  5.25±1.38        6.12±7.73
High density lipoprotein cholesterol (mmol/l)    1.43±0.45        1.45±0.42
Low density lipoprotein cholesterol (mmol/l)     2.37±0.82        2.54±0.82
Total cholesterol (mmol/l)                       4.13±1.00        4.43±1.24
Triglycerides (mmol/l)                           1.27±0.82        1.34±0.80
Aβ1-40 (pg/ml)                                   171.93±90.45     141.58±49.31
Aβ1-42 (pg/ml)                                   58.69±87.03      38.49±37.06
P-tau (pg/ml)                                    177.77±78.30     175.46±98.37
  47 in total

Review 1.  Cognitive and psychological issues in postural tachycardia syndrome.

Authors:  Vidya Raj; Morwenna Opie; Amy C Arnold
Journal:  Auton Neurosci       Date:  2018-03-27       Impact factor: 3.145

2.  What can DTI tell about early cognitive impairment? - Differentiation between MCI subtypes and healthy controls by diffusion tensor imaging.

Authors:  Gyula Gyebnár; Ádám Szabó; Enikő Sirály; Zsuzsanna Fodor; Anna Sákovics; Pál Salacz; Zoltán Hidasi; Éva Csibri; Gábor Rudas; Lajos R Kozák; Gábor Csukly
Journal:  Psychiatry Res Neuroimaging       Date:  2017-10-31       Impact factor: 2.376

3.  Mild cognitive impairment: apparent diffusion coefficient in regional gray matter and white matter structures.

Authors:  Kimberly M Ray; Huali Wang; Yong Chu; Ya-Fang Chen; Alberto Bert; Anton N Hasso; Min-Ying Su
Journal:  Radiology       Date:  2006-10       Impact factor: 11.105

4.  Random-forest algorithm based biomarkers in predicting prognosis in the patients with hepatocellular carcinoma.

Authors:  Lingyun Guo; Zhenjiang Wang; Yuanyuan Du; Jie Mao; Junqiang Zhang; Zeyuan Yu; Jiwu Guo; Jun Zhao; Huinian Zhou; Haitao Wang; Yanmei Gu; Yumin Li
Journal:  Cancer Cell Int       Date:  2020-06-17       Impact factor: 5.722

5.  Metabolomics in the Development and Progression of Dementia: A Systematic Review.

Authors:  Yanfeng Jiang; Zhen Zhu; Jie Shi; Yanpeng An; Kexun Zhang; Yingzhe Wang; Shuyuan Li; Li Jin; Weimin Ye; Mei Cui; Xingdong Chen
Journal:  Front Neurosci       Date:  2019-04-12       Impact factor: 4.677

6.  A New Serum Biomarker Set to Detect Mild Cognitive Impairment and Alzheimer's Disease by Peptidome Technology.

Authors:  Koji Abe; Jingwei Shang; Xiaowen Shi; Toru Yamashita; Nozomi Hishikawa; Mami Takemoto; Ryuta Morihara; Yumiko Nakano; Yasuyuki Ohta; Kentaro Deguchi; Masaki Ikeda; Yoshio Ikeda; Koichi Okamoto; Mikio Shoji; Masamitsu Takatama; Motohisa Kojo; Takeshi Kuroda; Kenjiro Ono; Noriyuki Kimura; Etsuro Matsubara; Yosuke Osakada; Yosuke Wakutani; Yoshiki Takao; Yasuto Higashi; Kyoichi Asada; Takehito Senga; Lyang-Ja Lee; Kenji Tanaka
Journal:  J Alzheimers Dis       Date:  2020       Impact factor: 4.472

7.  Later-Onset Hypertension Is Associated With Higher Risk of Dementia in Mild Cognitive Impairment.

Authors:  Hongyun Qin; Binggen Zhu; Chengping Hu; Xudong Zhao
Journal:  Front Neurol       Date:  2020-11-26       Impact factor: 4.003

8.  Blood Lipids and Cognitive Performance of Aging Polish Adults: A Case-Control Study Based on the PolSenior Project.

Authors:  Oliwia McFarlane; Mariusz Kozakiewicz; Kornelia Kędziora-Kornatowska; Dominika Gębka; Aleksandra Szybalska; Małgorzata Szwed; Alicja Klich-Rączka
Journal:  Front Aging Neurosci       Date:  2020-11-17       Impact factor: 5.750

9.  Age- and Sex-Specific Prevalence and Modifiable Risk Factors of Mild Cognitive Impairment Among Older Adults in China: A Population-Based Observational Study.

Authors:  Jingzhu Fu; Qian Liu; Yue Du; Yun Zhu; Changqing Sun; Hongyan Lin; Mengdi Jin; Fei Ma; Wen Li; Huan Liu; Xumei Zhang; Yongjie Chen; Zhuoyu Sun; Guangshun Wang; Guowei Huang
Journal:  Front Aging Neurosci       Date:  2020-10-30       Impact factor: 5.750

10.  Serum Tau Proteins as Potential Biomarkers for the Assessment of Alzheimer's Disease Progression.

Authors:  Eunjoo Nam; Yeong-Bae Lee; Cheil Moon; Keun-A Chang
Journal:  Int J Mol Sci       Date:  2020-07-15       Impact factor: 5.923
