Shweta Bagewadi Kawalia, Tamara Raschka, Mufassra Naz, Ricardo de Matos Simoes, Philipp Senger, Martin Hofmann-Apitius.
Abstract
Alzheimer's disease (AD) progressively destroys cognitive abilities in the aging population, with tremendous effects on memory. Despite recent progress in understanding the underlying mechanisms, high drug attrition rates have called into question our knowledge of its etiology. Re-evaluation of past studies could help to elucidate molecular-level details of this disease. Several methods exist to infer gene regulatory networks (GRNs) from such studies, but most of them do not address the context specificity and completeness of the generated networks, and so miss lesser-known candidates. In this study, we present a novel strategy that corroborates common mechanistic patterns across large-scale AD gene expression studies and further prioritizes potential biomarker candidates. To infer GRNs, we applied an optimized version of the BC3Net algorithm, named BC3Net10, capable of deriving robust and coherent patterns. In principle, this approach initially leverages the power of literature knowledge to extract AD-specific genes for generating viable networks. Our findings suggest that AD GRNs show significant enrichment for key signaling mechanisms involved in neurotransmission. Among the prioritized genes, well-known AD genes were prominent in synaptic transmission, which is implicated in cognitive deficits. Moreover, less intensively studied AD candidates (STX2, HLA-F, HLA-C, RAB11FIP4, ARAP3, AP2A2, ATP2B4, ITPR2, and ATP2A3) are also involved in neurotransmission, providing new insights into the underlying mechanism. To our knowledge, this is the first study to generate knowledge-instructed GRNs, and it demonstrates an effective way of combining literature-based knowledge and data-driven analysis to identify lesser-known candidates embedded in stable and robust functional patterns across disparate datasets.
Keywords: Alzheimer’s disease; gene regulatory networks; microarray analysis; synaptic transmission
Year: 2017 PMID: 28800327 PMCID: PMC5611835 DOI: 10.3233/JAD-170011
Source DB: PubMed Journal: J Alzheimers Dis ISSN: 1387-2877 Impact factor: 4.472
Fig. 1. The overall strategy applied to obtain robust gene expression patterns across public Alzheimer’s disease studies. First, four gene expression datasets were shortlisted from the NeuroTransDB database. The selected studies underwent preprocessing and quality control. In each dataset, the intensity values were limited to the genes in the seed list. To enrich the seed, functional enrichment was applied: genes from the significant pathways identified in each dataset’s subnetwork (edge weight >0.5), generated using the BC3Net10 approach, were added to the seed. When no additional genes were identified, the subnetworks from all iterations were merged, separately for each dataset, into an aggregated network for further prioritization of the genes using genetic variant analysis.
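The iterative loop described in the caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: `expand_seed`, `toy_enrich`, and the placeholder gene labels are all hypothetical, and the enrichment step (network inference plus pathway analysis) is reduced to a toy callback.

```python
from typing import Callable, Iterable, Set, Tuple


def expand_seed(
    initial_seed: Iterable[str],
    enrich: Callable[[Set[str]], Set[str]],
    max_iterations: int = 20,
) -> Tuple[Set[str], int]:
    """Grow a seed gene list until an enrichment round adds no new genes.

    `enrich` is a hypothetical callback standing in for one full round:
    restrict the data to the seed, infer subnetworks, and return the genes
    of the significant pathways.
    """
    seed = set(initial_seed)
    for iteration in range(1, max_iterations + 1):
        new_genes = enrich(seed) - seed
        if not new_genes:
            return seed, iteration  # converged: nothing left to add
        seed |= new_genes
    return seed, max_iterations


# Toy stand-in for pathway enrichment (placeholder gene labels, not real data):
# a pathway counts as "significant" when at least two of its genes are in the seed.
TOY_PATHWAYS = {
    "pathway_1": {"A", "B", "C"},
    "pathway_2": {"B", "C", "D"},
    "pathway_3": {"X", "Y", "Z"},
}


def toy_enrich(seed: Set[str]) -> Set[str]:
    hits = [genes for genes in TOY_PATHWAYS.values() if len(genes & seed) >= 2]
    return set().union(*hits)


final_seed, iterations = expand_seed({"A", "B"}, toy_enrich)
print(sorted(final_seed), iterations)
```

The stopping rule mirrors the caption: the loop ends at the first iteration that contributes no additional genes.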
List of datasets shortlisted from the NeuroTransDB database for generating gene regulatory networks. Final selected studies are highlighted in bold.
| GEO ID | No. of samples (diseased) | No. of samples (control) | Sample source | Stage | Platform |
| | 87 | 74 | Entorhinal cortex, hippocampus, primary visual cortex, prefrontal cortex, medial temporal gyrus, superior frontal gyrus | – | Affymetrix HG U133 Plus 2 |
| | 129 | 101 | Cerebellum | LOAD | Rosetta/Merck Human 44k 1.1 microarray |
| | 129 | 101 | Visual cortex | LOAD | Rosetta/Merck Human 44k 1.1 microarray |
| | 129 | 101 | Dorsolateral prefrontal cortex | LOAD | Rosetta/Merck Human 44k 1.1 microarray |
| GSE13214 | 52 | 40 | Hippocampus, frontal cortex | Braak 4–6 | Homo sapiens 4.8K 02-01 amplified cDNA |
| GSE15222 | 176 | 187 | Cortical tissue | LOAD | Sentrix HumanRef-8 Expression BeadChip |
| GSE29676 | 350 | 200 | Blood | – | Invitrogen ProtoArray v5.0 |
| GSE33528 | 615 | 600 | Blood | LOAD | Illumina HumanHap650Yv2 Genotyping BeadChip |
Fig. 2. Venn diagram depicting the gene overlap between the subnetworks (edge weight >0.5) of the four datasets, generated using the initial seed. The initial seed was compiled from the top 500 genes retrieved by querying SCAIView for Alzheimer’s disease-related genes. No gene is common to all four datasets’ subnetworks. Differences in platform, analytical method, tissue source, etc. could contribute to this lack of overlap.
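The overlap shown in the Venn diagram amounts to thresholding each network and intersecting the resulting gene sets. The sketch below assumes each network is available as a list of `(source, target, weight)` tuples; the function names, dataset labels, and gene labels are hypothetical placeholders.

```python
def subnetwork_genes(edges, threshold=0.5):
    """Return the genes that sit on edges with weight strictly above the threshold."""
    genes = set()
    for source, target, weight in edges:
        if weight > threshold:
            genes.update((source, target))
    return genes


def common_genes(networks, threshold=0.5):
    """Intersect the thresholded gene sets of all networks (dict: name -> edge list)."""
    gene_sets = [subnetwork_genes(edges, threshold) for edges in networks.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy edge lists with placeholder labels (not real data).
toy_networks = {
    "dataset_1": [("G1", "G2", 0.9), ("G2", "G3", 0.4)],
    "dataset_2": [("G2", "G4", 0.7), ("G5", "G6", 0.5)],
}
print(common_genes(toy_networks))
```

An empty result, as reported for the four subnetworks built from the initial seed, simply means that no gene survives thresholding in every dataset.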
Statistics of the iterative functional enrichment approach
| Iteration (i) | Seed | No. of overlapping pathways between the four datasets | No. of enriched candidate genes obtained from the overlapping pathways |
| 1 | SCAIView (500) | 10 | 820 |
| 2 | i1+820 | 38 | 1148 |
| 3 | i2+1148 | 30 | 361 |
| 4 | i3+361 | 30 | 84 |
| 5 | i4+84 | 32 | 41 |
| 6 | i5+41 | 33 | 7 |
| 7 | i6+7 | 37 | – |
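As a quick check of the seed notation (i1+820, i2+1148, and so on), the running seed size can be tallied from the table. The totals below are an upper bound that assumes every enriched candidate gene is new to the seed, which the table does not state explicitly.

```python
from itertools import accumulate

initial_seed_size = 500  # SCAIView seed (from the table)
added_per_iteration = [820, 1148, 361, 84, 41, 7]  # iterations 1-6 (from the table)

# Seed size entering iterations 1..7, assuming no duplicates among added genes.
seed_sizes = list(accumulate(added_per_iteration, initial=initial_seed_size))
print(seed_sizes)
```

Under that assumption the seed entering the seventh iteration holds at most 2961 genes, and the shrinking increments (820 down to 7) show the enrichment converging.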
Fig. 3. Stratification of the nodes and edges in the four aggregated networks. Each stack in the bar plot represents the fraction of nodes added in that iteration (IT) relative to the aggregated network (considered as 1). The addition of nodes remained stable across the datasets in each iteration. However, the inclusion of edges varies, presumably because the nodes newly included in each iteration give rise to newly inferred interactions. (a) Fraction of added nodes in different iterations; (b) Fraction of added edges in different iterations.
Fig. 4. Mean and variance distribution across the four datasets for the nodes and edges added in each iteration. The addition of nodes and edges reaches saturation after the seventh iteration, suggesting that the generated GRNs are complete. The relatively high number of edges (see y-axis range) shows immense inter-connectivity between the genes in the GRNs. (a) Boxplot for mean and variance distribution of nodes; (b) Boxplot for mean and variance distribution of edges.
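The per-iteration fractions in Fig. 3 and their spread across datasets in Fig. 4 can be illustrated with a short sketch. It assumes each dataset provides one set of nodes (or edges) per iteration; the function name and the toy labels are hypothetical, and the sample variance is used here as one plausible reading of "variance".

```python
from statistics import mean, variance


def added_fractions(items_per_iteration):
    """Fraction of the aggregated network first contributed by each iteration."""
    seen = set()
    new_counts = []
    for items in items_per_iteration:
        new_items = set(items) - seen
        new_counts.append(len(new_items))
        seen |= new_items
    total = len(seen)  # size of the aggregated network (considered as 1)
    return [count / total for count in new_counts] if total else []


# Toy node sets per iteration for two placeholder datasets (not real data).
toy = {
    "dataset_1": [{"a", "b"}, {"b", "c"}, {"c", "d", "e"}],
    "dataset_2": [{"a", "b", "c"}, {"c", "d"}, {"d"}],
}
fractions = {name: added_fractions(sets) for name, sets in toy.items()}

# Mean and sample variance of the added fraction across datasets, per iteration.
summary = [(mean(values), variance(values)) for values in zip(*fractions.values())]
print(fractions)
print(summary)
```

An iteration whose fraction is zero in every dataset corresponds to the saturation point described in the caption.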
Hub genes identified in the aggregated network for the four datasets. The genes are sorted by their hub degree within each dataset. Only significant pathways are listed here (see Table 4 for the list)
| GEO ID | Gene Symbols | Hub Degree | Pathway Annotation (CPDB) | Similar results in other datasets? |
| GSE5281 | HFE | 244 | – | – |
| | ATP2A3 | 162 | Calcium signaling, pancreatic secretion | – |
| | GLP1R | 150 | Insulin secretion | – |
| | ADRBK1 | 145 | Endocytosis | GSE44770 |
| | CACNG4, CACNG6 | 141 | – | – |
| | KCNJ5 | 132 | Estrogen signaling | – |
| | P2RX2 | 130 | Calcium signaling | GSE44770 |
| | KPNA2 | 122 | – | – |
| | NOX1 | 118 | – | – |
| | CACNG5 | 113 | – | – |
| | EPN1 | 113 | Endocytosis | – |
| | WAS | 112 | – | – |
| | CASP10 | 111 | Apoptosis | – |
| | HSPB6, EPHA4 | 109 | – | – |
| | ADNP | 108 | – | – |
| | DNAH3 | 106 | – | – |
| | GRIN2A | 105 | Calcium signaling | – |
| | UBQLN1 | 101 | – | – |
| | IL34, ATP5A1, UBE2L3 | 100 | – | – |
| | DPYSL2 | 99 | – | – |
| | FOLR2 | 98 | Endocytosis | – |
| | NPR1 | 96 | – | – |
| | DNM1L, KLC1, ATP5G3 | 92 | – | – |
| GSE44768 | RASGRF1 | 80 | – | – |
| | DNAL4 | 63 | – | – |
| | EPHA1 | 60 | – | – |
| | CHRND | 59 | – | – |
| | TRPC1 | 54 | Pancreatic secretion | GSE5281, GSE44770 |
| | PAK7 | 50 | – | – |
| | NDUFA4 | 44 | – | – |
| | CHMP4B | 44 | Endocytosis | – |
| GSE44770 | IVNS1ABP | 103 | – | – |
| | FGF18 | 92 | – | – |
| | ATF2 | 90 | Estrogen signaling, insulin secretion | – |
| | CTSG | 88 | – | – |
| | GABRE | 86 | – | – |
| | FBXL2 | 81 | – | – |
| | GAPDH | 75 | – | – |
| | DIO1 | 72 | Thyroid hormone signaling | – |
| | CACNB3, CDK2 | 66 | – | – |
| | NFKBIB | 66 | Adipocytokine signaling, neurotrophin signaling, NOD-like receptor signaling | GSE44768 |
| | PRDM4 | 64 | Neurotrophin signaling | – |
| | MAPK9 | 63 | Adipocytokine signaling, neurotrophin signaling, NOD-like receptor signaling | – |
| | PIK3CB | 63 | Apoptosis, estrogen signaling, neurotrophin signaling, thyroid hormone signaling | GSE5281 |
| GSE44771 | HSPA2 | 18 | Endocytosis, estrogen signaling | – |
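The hub degrees in the table can be read as connectivity counts. The sketch below assumes that hub degree means the number of distinct neighbours of a gene in an undirected network; the source does not define the measure or the cut-off for calling a gene a hub, so `node_degrees`, `top_hubs`, and the toy labels are hypothetical.

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct neighbours per node, treating edges as undirected."""
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        degree.update(edge)  # each endpoint gains one neighbour
    return degree


def top_hubs(edges, n):
    """Return the n highest-degree nodes, breaking ties alphabetically."""
    degree = node_degrees(edges)
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:n]


# Toy edge list with placeholder labels; ("B", "A") duplicates ("A", "B").
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "A")]
print(top_hubs(toy_edges, 2))
```

Sorting within each dataset by this count reproduces the layout of the table, where genes are listed by decreasing hub degree.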
Landscape of significant pathways (p-value <0.05) determined across datasets
| Pathway category | Common pathway | Total no. of genes in the pathway | No. of genes enriched: GSE5281 | No. of genes enriched: GSE44768 | No. of genes enriched: GSE44770 | No. of genes enriched: GSE44771 | No. of genes enriched: Consensus |
| Cancer | Basal cell carcinoma | 55 | 5 | 2 | 7 | 1 | 15 |
| Cancer | Colorectal cancer | 62 | 6 | 2 | 8 | 1 | 14 |
| Cancer | Pathways in cancer | 398 | 64 | 27 | 40 | 3 | 119 |
| Cancer | Small cell lung cancer | 86 | 15 | 4 | 9 | 2 | 27 |
| Comorbidity | Amyotrophic lateral sclerosis | 51 | 14 | 2 | 5 | 1 | 18 |
| Comorbidity | Arrhythmogenic right ventricular cardiomyopathy | 74 | 13 | 5 | 8 | 2 | 21 |
| Comorbidity | Dilated cardiomyopathy | 90 | 17 | 6 | 7 | 2 | 26 |
| Comorbidity | Hypertrophic cardiomyopathy | 83 | 14 | 7 | 7 | 2 | 23 |
| Comorbidity | Rheumatoid arthritis | 91 | 10 | 4 | 7 | 1 | 20 |
| Infection | Epithelial cell signaling in Helicobacter pylori infection | 68 | 9 | 2 | 6 | 1 | 16 |
| Infection | Influenza A | 177 | 25 | 4 | 22 | 2 | 46 |
| Infection | Shigellosis | 61 | 12 | 3 | 7 | 1 | 19 |
| Infection | Toxoplasmosis | 120 | 14 | 3 | 12 | 2 | 26 |
| Infection | Tuberculosis | 179 | 21 | 4 | 23 | 3 | 46 |
| Infection | Vibrio cholera infection | 54 | 9 | 2 | 2 | 1 | 13 |
| Infection | Viral myocarditis | 60 | 12 | 2 | 5 | 2 | 19 |
| Others | Melanogenesis | 101 | 18 | 3 | 10 | 1 | 30 |
| Others | Neuroactive ligand-receptor interaction | 275 | 57 | 24 | 30 | 3 | 98 |
| Potential | Apoptosis | 86 | 14 | 2 | 6 | 1 | 20 |
| Potential | Calcium signaling pathway | 180 | 43 | 12 | 16 | 2 | 62 |
| Potential | Endocytosis | 213 | 47 | 10 | 21 | 4 | 70 |
| Potential | Neurotrophin signaling pathway | 120 | 24 | 6 | 17 | 1 | 44 |
| Potential | NOD-like receptor signaling pathway | 57 | 9 | 3 | 6 | 1 | 16 |
| Potential | PPAR signaling pathway | 69 | 11 | 4 | 9 | 2 | 22 |
| Potential | Synaptic vesicle cycle | 63 | 15 | 4 | 8 | 1 | 26 |
| Potential | Adipocytokine signaling pathway | 70 | 17 | 6 | 8 | 1 | 27 |
| Potential | Insulin secretion | 86 | 18 | 3 | 10 | 1 | 28 |
| Potential | Pancreatic secretion | 96 | 21 | 5 | 9 | 1 | 30 |
| Potential (hormones) | Estrogen signaling pathway | 100 | 23 | 4 | 7 | 1 | 32 |
| Potential (hormones) | Thyroid hormone signaling pathway | 119 | 26 | 3 | 10 | 1 | 37 |
| Potential (others) | Lysosome | 122 | 13 | 7 | 11 | 4 | 33 |
| Potential (others) | Phagosome | 155 | 31 | 4 | 16 | 2 | 48 |
Fig. 5. The landscape of p-values for the final list of significant pathways. For easier visualization, 1 - p is plotted on the y-axis instead of the p-value. Each line in the graph represents the aggregated GRN for the specified dataset (see chart legend). The listed pathways show a higher significance level in the consensus GRN than in the aggregated GRNs of the individual datasets.
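The text here does not state which test produced these p-values. A one-sided hypergeometric over-representation test is a common choice for pathway enrichment and is assumed in the sketch below; the function name and the toy numbers are hypothetical and are not taken from the study.

```python
from math import comb


def enrichment_p_value(population, pathway_size, sample_size, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(population, pathway_size, sample_size).

    population   : number of background genes (assumed known)
    pathway_size : number of background genes annotated to the pathway
    sample_size  : number of genes in the network being tested
    overlap      : number of network genes annotated to the pathway
    """
    upper = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers chosen so the result is easy to verify by hand: (36 + 4) / 120.
p = enrichment_p_value(population=10, pathway_size=4, sample_size=3, overlap=2)
plotted_value = 1 - p  # the quantity shown on the y-axis of the figure
print(p, plotted_value)
```

Plotting 1 - p places the most significant pathways at the top of the axis, which is why the consensus line sits above the per-dataset lines.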
List of genes prioritized using genetic variant analysis
| Rank | Gene Symbol | RegulomeDB score | No. of evidences for AD | Pathways involved |
| 1 | IL1B | 1b | 1073 | Apoptosis, NOD-like receptor signaling |
| 2 | NSF | 1d | 8 | Synaptic vesicle cycle |
| 3 | HLA-F | 1f | 0 | Endocytosis |
| 4 | NOTCH4 | 1f | 3 | Thyroid hormone signaling |
| 5 | VCL | 1f | 10 | Shigellosis |
| 6 | PSAP | 1f | 3 | Lysosome |
| 7 | STX2 | 1f | 2 | Synaptic vesicle cycle |
| 8 | GGA2 | 1f | 4 | Lysosome |
| 9 | STK11 | 1f | 7 | Adipocytokine signaling |
| 10 | CSF3R | 1f | 5 | Pathways in cancer |
| 11 | LMNA | 1f | 11 | Arrhythmogenic right ventricular cardiomyopathy, Dilated cardiomyopathy, Hypertrophic cardiomyopathy |
| 12 | CTNNA2 | 1f | 3 | Arrhythmogenic right ventricular cardiomyopathy |
| 13 | HLA-C | 1f | 1 | Endocytosis |
| 14 | RAB11FIP4 | 1f | 0 | Endocytosis |
| 15 | GRIN2A | 2a | 52 | Calcium signaling |
| 16 | RBX1 | 2a | 0 | Viral Myocarditis |
| 17 | KCNJ5 | 2a | 0 | Estrogen signaling |
| 18 | EPHA4 | 2b | 18 | Hub Genes |
| 19 | CACNG4 | 2b | 0 | Arrhythmogenic right ventricular cardiomyopathy, Dilated cardiomyopathy, Hypertrophic cardiomyopathy |
| 20 | PLA2G5 | 2b | 7 | Pancreatic secretion |
| 21 | ATP2B4 | 2b | 1 | Calcium signaling, pancreatic secretion |
| 22 | P2RY14 | 2b | 0 | Neuroactive ligand receptor interaction |
| 23 | P2RY13 | 2b | 0 | Neuroactive ligand receptor interaction |
| 24 | PTGER4 | 2b | 11 | Neuroactive ligand receptor interaction |
| 25 | ARAP3 | 2b | 0 | Endocytosis |
| 26 | FGF1 | 2b | 22 | Pathways in cancer |
| 27 | RPS6KA2 | 2b | 0 | Neurotrophin signaling |
| 28 | RAPGEF1 | 2b | 0 | Neurotrophin signaling |
| 29 | GABBR2 | 2b | 1 | Estrogen signaling |
| 30 | PRF1 | 2b | 1 | Viral myocarditis |
| 31 | ITGA8 | 2b | 0 | Arrhythmogenic right ventricular cardiomyopathy, Dilated cardiomyopathy, Hypertrophic cardiomyopathy |
| 32 | AP2A2 | 2b | 0 | Endocytosis, Synaptic vesicle cycle |
| 33 | ITPR2 | 2b | 2 | Calcium signaling, Estrogen signaling, pancreatic secretion |
| 34 | MED13L | 2b | 0 | Thyroid hormone signaling |
| 35 | COL4A1 | 2b | 0 | Pathways in cancer |
| 36 | KCNJ6 | 2b | 3 | Estrogen signaling |
| 37 | ATP2A3 | 2b | 0 | Calcium signaling, Pancreatic secretion |
| 38 | ASAP2 | 3a | 1 | Endocytosis |
| 39 | FYN | 3a | 70 | Viral myocarditis |
| 40 | NTRK2 | 3a | 124 | Neurotrophin signaling |
| 41 | PAK1 | 3a | 7 | Epithelial cell signaling in Helicobacter pylori infection |
| 42 | COL4A2 | 3a | 0 | Small cell lung cancer, Pathways in cancer |
| 43 | BMP4 | 3a | 5 | Thyroid hormone signaling |
| 44 | GABRB3 | 3a | 0 | Neuroactive ligand receptor interaction |
| 45 | CEBPB | 3a | 12 | Tuberculosis |
| 46 | EPHA1 | 5 | 31 | Hub Genes |
| 47 | DPYSL2 | 5 | 47 | Hub Genes |
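The ranking in the table follows the RegulomeDB score, where a lower category number indicates stronger regulatory evidence. A minimal sketch of that ordering is given below, assuming the score is a digit optionally followed by a letter; `regulome_key` is a hypothetical helper, and the gene and score pairs are copied from the table.

```python
def regulome_key(score):
    """Split a RegulomeDB score such as '1b' into (1, 'b'); '5' becomes (5, '')."""
    digits = "".join(ch for ch in score if ch.isdigit())
    suffix = "".join(ch for ch in score if ch.isalpha())
    return int(digits), suffix


# Gene and score pairs taken from the table, deliberately shuffled.
genes = [("DPYSL2", "5"), ("GRIN2A", "2a"), ("IL1B", "1b"), ("FYN", "3a"), ("NSF", "1d")]

# sorted() is stable, so genes sharing a score keep their input order.
ranked = sorted(genes, key=lambda gene: regulome_key(gene[1]))
print([symbol for symbol, _ in ranked])
```

How ties within a score category (for example, the many 1f and 2b genes) were ordered is not stated, so the sketch leaves them in input order.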
Fig. 6. Subnetworks of the three shortlisted potential pathways (extracted from the consensus network) involved in neurotransmission. Nodes in cyan are involved in more than one pathway, and the size of each node depends on the number of pathways in which it is involved. Triangle nodes represent the presence of a SNP. (a) Calcium signaling pathway; (b) Endocytosis pathway; (c) Synaptic vesicle cycle.