Keping Chai, Xiaolin Zhang, Shufang Chen, Huaqian Gu, Huitao Tang, Panlong Cao, Gangqiang Wang, Weiping Ye, Feng Wan, Jiawei Liang, Daojiang Shen.
Abstract
Aberrant deposits of neurofibrillary tangles (NFT), the main characteristic of Alzheimer's disease (AD), are highly related to cognitive impairment. However, the pathological mechanism of NFT formation is still unclear. This study explored differences in gene expression patterns in multiple brain regions [entorhinal, temporal, and frontal cortex (EC, TC, FC)] with distinct Braak stages (0-VI), and identified the hub genes via weighted gene co-expression network analysis (WGCNA) and machine learning. For WGCNA, consensus modules were detected and correlated with the single sample gene set enrichment analysis (ssGSEA) scores. After overlapping the differentially expressed genes (DEGs, Braak stages 0 vs. I-VI) with the genes in the module of interest, Metascape analysis and Random Forest were conducted to explore the function of the overlapping genes and obtain the most significant genes. We found that the three brain regions have highly similar gene expression patterns and, via machine learning, that oxidative damage plays a vital role in NFT formation. Through further filtering of genes from the modules of interest by Random Forest, we screened out key genes, such as LYN, LAPTM5, and IFI30. These key genes, including LYN, LAPTM5, and ARHGDIB, may play an important role in the development of AD through the inflammatory response pathway mediated by microglia.
Keywords: Braak stages; WGCNA; neurodegeneration; random forest; ssGSEA
Year: 2022 PMID: 35912089 PMCID: PMC9326231 DOI: 10.3389/fnagi.2022.837770
Source DB: PubMed Journal: Front Aging Neurosci ISSN: 1663-4365 Impact factor: 5.702
Figure 1 (A) Comparison between EC set-specific modules and EC-FC consensus modules of the global co-expression network. The numbers in the table are the counts of genes shared between EC modules and consensus modules. The table is color-coded by -log(p), where p is the p-value of Fisher's exact test for the overlap of the two modules. The darker the red, the more pronounced the overlap. (B,C) Clustering dendrograms of consensus module eigengenes for identifying meta-modules show similar major branching patterns in the EC and FC eigengene networks. (D,G) Heatmaps of the eigengene adjacencies in the EC and FC eigengene networks. Each row and column corresponds to an eigengene tagged by consensus module color. Within each heatmap, red represents high adjacency (positive correlation) and blue represents low adjacency (negative correlation), as shown in the color legend. (E) Bar plot of the preservation degree of each consensus eigengene, shown as the height of the bar (y-axis); each colored bar corresponds to the eigengene of the associated consensus module. The high density value D (Preserve EC, FC) = 0.91 indicates high overall preservation between the EC and FC networks. (F) Adjacency heatmap of the preservation network between the EC and FC consensus eigengene networks. The saturation of the red color indicates the degree of correlation preservation between EC and FC module eigengenes.
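The module-overlap coloring in panel (A) can be illustrated with a short sketch. The code below is an assumption-laden illustration, not the authors' pipeline: it computes the one-sided (enrichment) form of Fisher's exact test as a hypergeometric upper tail, and the function names `overlap_pvalue` and `neg_log10` as well as all gene counts are hypothetical.

```python
from math import comb, log10


def overlap_pvalue(n_universe, n_a, n_b, n_shared):
    """One-sided Fisher's exact test for the overlap of two gene modules.

    Returns P(overlap >= n_shared) when a module of size n_b is drawn at
    random from n_universe genes, n_a of which belong to the other module.
    """
    max_shared = min(n_a, n_b)
    total = comb(n_universe, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_universe - n_a, n_b - k)
        for k in range(n_shared, max_shared + 1)
    )
    return tail / total


def neg_log10(p):
    """-log10(p), the quantity used for the color scale; guards against p == 0."""
    return float("inf") if p == 0 else -log10(p)


# Hypothetical counts: 10 genes in total, two modules of 5 genes sharing all 5.
p = overlap_pvalue(10, 5, 5, 5)
print(p, neg_log10(p))
```

A larger -log10(p) corresponds to a darker cell in such a table. Whether the original analysis used a one-sided or two-sided test is not stated in the caption.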
Figure 2 ssGSEA and WGCNA analysis of the EC data. (A) Heatmap of the ssGSEA scores of the different gene sets in the corresponding samples. (B) Heatmap of the adjusted R2 of the best subset regression result for each ssGSEA pathway. (C) Plot of the adjusted R2 against the number of features in the best subset regression model. (D) Bar plot of the importance of each ssGSEA pathway in the RF model. (E) Pearson correlation coefficients between the pathways and the module eigengenes; numbers in brackets indicate the corresponding p-values.
Figure 3 Identifying the overlapping genes between downregulated DEGs in the aged group and genes in the black module. (A) Heatmap of the expression of DEGs. (B) Heatmap of the expression of the top 30 genes in the black module. (C) Venn diagram of the overlap between the downregulated DEGs and the genes in the black module. (D) Heatmap showing the expression of the overlapping genes in different samples.
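The overlap step in panel (C) amounts to a set intersection. A minimal sketch follows, assuming the two gene lists are already available as plain lists of symbols; the function name `overlap_genes` and the placeholder gene symbols are hypothetical and do not come from the study.

```python
def overlap_genes(downregulated_degs, module_genes):
    """Return the genes present in both lists, sorted for reproducible output."""
    return sorted(set(downregulated_degs) & set(module_genes))


# Placeholder symbols only; real input would be the downregulated DEGs
# and the gene members of the module of interest.
degs = ["GENE_A", "GENE_B", "GENE_C"]
module = ["GENE_B", "GENE_C", "GENE_D"]
print(overlap_genes(degs, module))
```

Converting to sets first removes duplicate symbols, so a gene listed twice in either input is reported once.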
Figure 4 Metascape analysis of the overlapping genes. The network shows the enriched GO terms; the log P value (−23 to −8) indicates the significance of the enrichment.
Figure 5 Identifying the most important genes via RF and the cellular distribution of the important genes in the brain. (A) Random Forest algorithm result. The blue box plots correspond to the minimum, average, and maximum Z scores of a color attribute. The red, yellow, and green boxes represent the Z scores of rejected, tentative, and confirmed genes, respectively. (B) The PPI network of the important genes from STRING. (C) Heatmap of the distribution of the selected genes across different cell types.