Xia-An Bi, Ruipeng Cai, Yang Wang, Yingchao Liu.
Abstract
Alzheimer's disease (AD) is a complex neurodegenerative disease involving a variety of pathogenic factors, and identifying its etiology has been a major concern of researchers. Neuroimaging is a basic and important means of exploring this problem. A main current research direction is to combine neuroimaging with data from other modalities and to exploit their complementarity to uncover latent information about AD. Machine learning methods have great potential and have achieved some success in this research area. A few studies have proposed solutions for multimodal data fusion; however, an overall analytical framework covering both data fusion and the analysis of fusion results has thus far been lacking. In this paper, we first put forward a novel multimodal data fusion method, and then present a new machine learning framework comprising data fusion, classification, feature selection, and disease-causing factor extraction. A real dataset of 37 AD patients and 35 normal controls (NC) with functional magnetic resonance imaging (fMRI) and genetic data was used to verify the effectiveness of the framework, which was more accurate in classification and optimal feature extraction than other methods. Furthermore, we revealed disease-causing brain regions and genes, such as the olfactory cortex, insula, posterior cingulate gyrus, lingual gyrus, CNTNAP2, LRP1B, FRMD4A, and DAB1. The results show that the machine learning framework could effectively perform multimodal data fusion analysis, providing new insights and perspectives for the diagnosis of Alzheimer's disease.
Keywords: Alzheimer’s disease; disease diagnosis; functional magnetic resonance imaging; gene; multimodal fusion analysis framework
Year: 2019 PMID: 31649738 PMCID: PMC6795747 DOI: 10.3389/fgene.2019.00976
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Baseline characteristics of AD and CN.
| Variables (Mean ± SD) | AD (n = 37) | CN (n = 35) | Statistic |
|---|---|---|---|
| Gender (M/F) | 19/18 | 13/22 | |
| Age (years) | 75.35 ± 7.949 | 77.14 ± 6.175 | |
*The p value was obtained by the chi-square test.
**The p value was obtained by the two-sample t-test.
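The two tests named in the footnotes can be sketched from the summary values in the table alone. This is a minimal illustration using only the Python standard library, not the authors' computation. The function names are hypothetical, the chi-square test is assumed to be the uncorrected Pearson form, and the t-test is assumed to use a pooled variance. The record does not state either choice.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail p value equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic and degrees of freedom, assuming equal variances."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, n1 + n2 - 2


# Gender counts (M/F): AD 19/18, CN 13/22, taken from the table above.
chi2, p_gender = chi_square_2x2(19, 18, 13, 22)

# Age (mean, SD, n): AD 75.35, 7.949, 37; CN 77.14, 6.175, 35.
t_age, dof = pooled_t_statistic(75.35, 7.949, 37, 77.14, 6.175, 35)

print(f"gender: chi2 = {chi2:.4f}, p = {p_gender:.4f}")
print(f"age: t = {t_age:.4f}, degrees of freedom = {dof}")
```

Turning the t statistic into a p value needs the Student t distribution, which the standard library does not provide; a statistics package would normally supply that step.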
Figure 1. Overview of the multimodal fusion analysis framework. (A) denotes unprocessed fMRI data. (B) denotes unprocessed gene data. (C) denotes the fusion process of multimodal data. (D) denotes the construction of the multimodal random forest (MRF) model for classification and optimal feature extraction. (E) denotes the pathogenic brain regions and genes extracted by the feature fusion scheme and the multimodal random forest model.
Figure 2. The accuracy of MRF with different numbers of base classifiers.
Figure 3. Classification performance of feature subsets with different numbers of features.
Figure 4. The top 20 features with the strongest recognition abilities.
The performance comparison of different methods.
| Method | Discoveries | Classification accuracy of SVM | Overlap with our method |
|---|---|---|---|
| Pearson + MRF | 245 | 0.8667 | — |
| Pearson + RSVMC | 400 | 0.8000 | 135 (p = 5.710129e-23) |
| Pearson + t-test | 351 | 0.7000 | 88 (p = 1.054523e-11) |
| CCA + t-test | 313 | 0.7667 | 116 (p = 1.883298e-06) |
| DCA + t-test | 329 | 0.7333 | 99 (p = 4.343267e-14) |
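The record does not say which test produced the overlap p values in the last column. One common way to judge whether two feature sets overlap more than chance would allow is a hypergeometric tail probability, sketched below under that assumption. The discovery counts (245 and 400) and the overlap (135) come from the table; the universe size of 4000 candidate features is a purely hypothetical placeholder, so the resulting number is not expected to reproduce the reported value.

```python
from math import comb


def hypergeom_tail(overlap, size_a, size_b, universe):
    """P(X >= overlap) when size_b items are drawn from `universe` items,
    of which size_a are marked (hypergeometric upper tail)."""
    max_overlap = min(size_a, size_b)
    favorable = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    # Exact integer arithmetic until the final division avoids overflow.
    return favorable / comb(universe, size_b)


# From the table: Pearson + MRF found 245 features, Pearson + RSVMC found 400,
# and 135 were shared. UNIVERSE_SIZE is a hypothetical assumption.
UNIVERSE_SIZE = 4000
p_overlap = hypergeom_tail(135, 245, 400, UNIVERSE_SIZE)
print(f"P(overlap >= 135) = {p_overlap:.3e}")
```

The result depends strongly on the assumed universe size, so any comparison with the tabulated p values requires the actual number of candidate features used in the study.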
Figure 5. Comparison of the multimodal method and the unimodal method.
Figure 6. (A) denotes the frequencies of abnormal brain regions related to AD. (B) denotes the locations of the corresponding abnormal brain regions.
Figure 7. The frequencies of the main pathogenic genes.