Yuqing Hang, Mohammed Aburidi, Benafsh Husain, Allison R Hickman, William L Poehlman, F Alex Feltus.
Abstract
The human brain is a complex organ that consists of several regions, each with a unique gene expression pattern. Our intent in this study was to construct a gene co-expression network (GCN) for the normal brain using RNA expression profiles from the Genotype-Tissue Expression (GTEx) project. The brain GCN contains gene correlation relationships that are broadly present in the brain or specific to thirteen brain regions, which we later combined into six overarching brain mini-GCNs based on the brain's structure. Using the expression profiles of brain region-specific GCN edges, we determined how well the brain region samples could be discriminated from each other, visually with t-SNE plots or quantitatively with the Gene Oracle deep learning classifier. Next, we tested these gene sets for their relevance to human tumors of brain and non-brain origin. Interestingly, we found that genes in the six brain mini-GCNs showed markedly higher mutation rates in tumors relative to matched sets of random genes. Further, we found that cortex genes subdivided Head and Neck Squamous Cell Carcinoma (HNSC) tumors and Pheochromocytoma and Paraganglioma (PCPG) tumors into distinct groups. The brain GCN and mini-GCNs are useful resources for the classification of brain regions and identification of biomarker genes for brain-related phenotypes.
Year: 2020 PMID: 33051491 PMCID: PMC7553962 DOI: 10.1038/s41598-020-73611-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Normal brain gene co-expression network. (A) The right panel represents the whole gene co-expression network (GCN) constructed from 1671 GTEx brain RNAseq samples from 13 different brain regions. The left panel is the corresponding t-SNE visualization for the 1691 brain GCN genes, where RNA expression profiles sorted regions into multiple clusters. Each color represents a different region shown in the legend. (B) Six brain region mini-GCNs are shown on the right side of each panel. The corresponding t-SNE visualizations for those region-specific genes are shown on the left side of each panel. Non-black dots in each t-SNE plot represent the corresponding region-specific samples and black dots represent samples from all other regions. For all basal ganglia specific gene sets, red, orange and yellow dots represent caudate basal ganglia, nucleus accumbens basal ganglia, and putamen basal ganglia samples respectively. The red and orange dots from cerebellum and cerebellar hemisphere specific gene sets represent cerebellum and cerebellar hemisphere samples respectively. All red dots from other region-specific gene sets represent the particular region-specific samples.
Normal brain GCN edge attributes.
| Region | Samples | Nodes | Edges [1] | Modules [2] | All genes specific eQTLs | [RNA] All genes mean | [RNA] All genes stdev | [RNA] Enriched node mean | [RNA] Enriched node stdev | k | Unique edges | Unique edge percentage |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Full network (all regions) | 1671 | 1691 | 7812 | 183 | 38,549 | 0.57 | 3.34 | 3.65 | 2.71 | 9.24 | 434 | 0.06 |
| Brain amygdala | 100 | 188 | 145 | 87 | 4,097 | 0.57 | 3.34 | 4.50 | 1.99 | 1.54 | 0 | 0.00 |
| Brain anterior cingulate cortex BA24 | 121 | 419 | 468 | 41 | 11,833 | 0.57 | 3.34 | 4.62 | 1.90 | 2.23 | 0 | 0.00 |
| Brain caudate (basal ganglia) | 160 | 690 | 2076 | 131 | 33,554 | 0.57 | 3.34 | 3.89 | 2.49 | 6.02 | 200 | 0.10 |
| Brain cerebellar hemisphere | 136 | 270 | 225 | 4 | 43,794 | 0.57 | 3.34 | 4.86 | 1.90 | 1.67 | 1 | 0.00 |
| Brain cerebellum | 173 | 327 | 301 | 13 | 129,623 | 0.57 | 3.34 | 4.98 | 1.84 | 1.84 | 60 | 0.20 |
| Brain cortex | 158 | 646 | 928 | 74 | 42,199 | 0.57 | 3.34 | 4.87 | 1.87 | 2.87 | 24 | 0.03 |
| Brain frontal cortex BA9 | 129 | 545 | 735 | 54 | 20,048 | 0.57 | 3.34 | 5.03 | 1.81 | 2.70 | 0 | 0.00 |
| Brain hippocampus | 123 | 377 | 909 | 114 | 8,147 | 0.57 | 3.34 | 3.99 | 2.70 | 4.82 | 2 | 0.00 |
| Brain hypothalamus | 121 | 440 | 1502 | 103 | 7,853 | 0.57 | 3.34 | 3.67 | 2.59 | 6.83 | 70 | 0.05 |
| Brain nucleus accumbens (basal ganglia) | 147 | 536 | 693 | 73 | 24,106 | 0.57 | 3.34 | 4.77 | 1.83 | 2.59 | 5 | 0.01 |
| Brain putamen (basal ganglia) | 124 | 427 | 466 | 75 | 14,446 | 0.57 | 3.34 | 4.70 | 1.84 | 2.18 | 2 | 0.00 |
| Brain spinal cord cervical c1 | 91 | 145 | 95 | 37 | 14,663 | 0.57 | 3.34 | 3.80 | 2.26 | 1.31 | 26 | 0.27 |
| Brain substantia nigra | 88 | 111 | 84 | 51 | 2,782 | 0.57 | 3.34 | 3.75 | 2.30 | 1.51 | 44 | 0.52 |
[1] For edge enrichment, the significant edges for each sub-cluster are those with p-values less than 1E−10; [2] For module enrichment, the significant modules for each sub-cluster are those with p-values less than 1E−3.
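The table does not define the k and unique edge columns. The reported values are consistent with k being the mean node degree (2 × edges / nodes) and with the last column being the fraction of unique edges (unique edges / edges) rather than a percentage. The sketch below checks that reading against three rows; both definitions are assumptions inferred from the numbers, and the function names are illustrative.

```python
# Assumption: k is the mean node degree (2 * edges / nodes) and the
# "Unique edge percentage" column is the fraction unique_edges / edges.
# These definitions are inferred from the table values, not stated in the source.

def mean_degree(nodes: int, edges: int) -> float:
    """Average number of edges per node in an undirected network."""
    return 2 * edges / nodes

def unique_edge_fraction(unique_edges: int, edges: int) -> float:
    """Share of a network's edges that are unique to that network."""
    return unique_edges / edges

# (nodes, edges, unique edges) copied from three rows of the table
rows = {
    "Full network (all regions)": (1691, 7812, 434),
    "Brain caudate (basal ganglia)": (690, 2076, 200),
    "Brain substantia nigra": (111, 84, 44),
}

for region, (nodes, edges, unique) in rows.items():
    k = mean_degree(nodes, edges)
    frac = unique_edge_fraction(unique, edges)
    print(f"{region}: k = {k:.2f}, unique fraction = {frac:.2f}")
```

Rounded to two decimals, these reproduce the k and unique edge values listed for the same rows.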
Figure 2. Brain region-specific GCN attributes. (A) Number of link community modules unique to 0–13 brain regions. (B) Number of edges unique in 0–13 brain regions. (C) Number of region-specific edge associated GTEx eQTLs unique in 1–13 brain regions.
Unique region-specific edge attributes.
| Region | Edges | Nodes | eQTLs | Nodes counted | eQTLs/node |
|---|---|---|---|---|---|
| Full network (all regions) | 434 | 344 | 2228 | 59 | 37.76 |
| Brain amygdala | 0 | 0 | 0 | 0 | 0 |
| Brain anterior cingulate cortex BA24 | 0 | 0 | 0 | 0 | 0 |
| Brain caudate (basal ganglia) | 200 | 139 | 917 | 22 | 41.68 |
| Brain cerebellar hemisphere | 1 | 2 | 0 | 0 | 0 |
| Brain cerebellum | 60 | 76 | 874 | 17 | 51.41 |
| Brain cortex | 24 | 40 | 296 | 9 | 32.89 |
| Brain frontal cortex BA9 | 0 | 0 | 0 | 0 | 0 |
| Brain hippocampus | 2 | 4 | 0 | 0 | 0 |
| Brain hypothalamus | 70 | 63 | 108 | 6 | 18 |
| Brain nucleus accumbens (basal ganglia) | 5 | 9 | 23 | 1 | 23 |
| Brain putamen (basal ganglia) | 2 | 4 | 0 | 0 | 0 |
| Brain spinal cord cervical C1 | 26 | 28 | 0 | 0 | 0 |
| Brain substantia nigra | 44 | 43 | 10 | 4 | 2.5 |
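The eQTLs/node column appears to be the eQTL count divided by the "Nodes counted" column, not by the total node count. A minimal sketch of that arithmetic follows; the interpretation is an assumption inferred from the table values, and `eqtls_per_node` is an illustrative name.

```python
# Assumption: eQTLs/node = eQTLs / "Nodes counted", with 0 reported when no
# nodes were counted. Inferred from the table; not stated in the source.

def eqtls_per_node(eqtls: int, nodes_counted: int) -> float:
    """Mean number of eQTLs per counted node; 0.0 when no nodes were counted."""
    if nodes_counted == 0:
        return 0.0
    return eqtls / nodes_counted

# (eQTLs, nodes counted) copied from the table
rows = {
    "Full network (all regions)": (2228, 59),
    "Brain cerebellum": (874, 17),
    "Brain cortex": (296, 9),
    "Brain amygdala": (0, 0),
}

for region, (eqtls, counted) in rows.items():
    print(f"{region}: {eqtls_per_node(eqtls, counted):.2f} eQTLs per counted node")
```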
Region-specific module information.
| Module | Edges | Enriched region | p-value |
|---|---|---|---|
| M005 | 5 | Cerebellum | 4.20E−11 |
| M126 | 3 | Cerebellar hemisphere; cerebellum | 2.54E−78; 1.57E−102 |
Unique brain module functional enrichment analysis.
| Module | Region | Adj. p [1] | Term ID | Term definition |
|---|---|---|---|---|
| M0005 | Cerebellum | 1.22E−03 | MIM:114850 | CARBOXYPEPTIDASE A1 |
| M0005 | Cerebellum | 1.22E−03 | MIM:246600 | PANCREATIC LIPASE |
| M0005 | Cerebellum | 1.22E−03 | MIM:276000 | PROTEASE, SERINE, 1 |
| M0005 | Cerebellum | 1.94E−03 | GO:0005615 | Extracellular space |
| M0005 | Cerebellum | 3.42E−03 | GO:0006508 | Proteolysis |
| M0005 | Cerebellum | 1.57E−03 | GO:0008233 | Peptidase activity |
| M0005 | Cerebellum | 5.63E−03 | GO:0008236 | Serine-type peptidase activity |
| M0005 | Cerebellum | 2.25E−03 | GO:0016787 | Hydrolase activity |
| M0005 | Cerebellum | 7.70E−03 | GO:0061365 | Positive regulation of triglyceride lipase activity |
| M0005 | Cerebellum | 9.55E−03 | IPR001314 | Peptidase S1A, chymotrypsin family |
| M0005 | Cerebellum | 8.05E−03 | IPR018114 | Serine proteases, trypsin family, histidine active site |
| M0005 | Cerebellum | 6.96E−03 | IPR033116 | Serine proteases, trypsin family, serine active site |
| M0005 | Cerebellum | 1.18E−03 | PF00089 | Trypsin |
| M0005 | Cerebellum | 9.42E−03 | R-HSA-196854 | Metabolism of vitamins and cofactors |
| M0126 | Cerebellar hemisphere; cerebellum | 2.18E−03 | MIM:603140 | PHOSPHATIDYLINOSITOL 5-PHOSPHATE 4-KINASE, TYPE II, ALPHA |
| M0126 | Cerebellar hemisphere; cerebellum | 2.18E−03 | MIM:609410 | SYNAPTOJANIN 2 |
| M0126 | Cerebellar hemisphere; cerebellum | 2.18E−03 | MIM:610072 | ERMIN |
| M0126 | Cerebellar hemisphere; cerebellum | 2.18E−03 | MIM:616027 | ACTIN-BINDING PROTEIN ANILLIN |
| M0126 | Cerebellar hemisphere; cerebellum | 8.74E−03 | IPR031970 | Anillin, N-terminal domain |
| M0126 | Cerebellar hemisphere; cerebellum | 8.74E−03 | IPR034973 | Synaptojanin-2, RNA recognition motif |
| M0126 | Cerebellar hemisphere; cerebellum | 8.74E−03 | IPR034974 | Synaptojanin-2 |
| M0126 | Cerebellar hemisphere; cerebellum | 6.71E−03 | PF08174 | Cell division protein anillin |
| M0126 | Cerebellar hemisphere; cerebellum | 5.03E−03 | PF08952 | Domain of unknown function (DUF1866) |
| M0126 | Cerebellar hemisphere; cerebellum | 3.35E−03 | PF16018 | Anillin N-terminus |
| M0126 | Cerebellar hemisphere; cerebellum | 4.30E−03 | R-HSA-1483255 | PI Metabolism |
| M0126 | Cerebellar hemisphere; cerebellum | 1.73E−03 | R-HSA-1660499 | Synthesis of PIPs at the plasma membrane |
[1] Bonferroni-adjusted p-value < 0.01.
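For readers unfamiliar with the correction named in the footnote, a Bonferroni adjustment multiplies each raw p-value by the number of tests and caps the result at 1. The sketch below is a generic illustration only: the raw p-values and the number of tests are made-up placeholders, since the source does not report them.

```python
# Generic Bonferroni adjustment. The raw p-values and the number of tests
# below are hypothetical placeholders, not values from the study.

def bonferroni_adjust(p_values, n_tests):
    """Return Bonferroni-adjusted p-values: min(1, p * n_tests) for each p."""
    return [min(1.0, p * n_tests) for p in p_values]

def significant_terms(p_values, n_tests, alpha=0.01):
    """Indices of terms whose adjusted p-value falls below alpha."""
    adjusted = bonferroni_adjust(p_values, n_tests)
    return [i for i, p in enumerate(adjusted) if p < alpha]

raw = [1e-5, 4e-4, 0.03]   # hypothetical raw p-values
n_tests = 200              # hypothetical number of terms tested

print(bonferroni_adjust(raw, n_tests))
print(significant_terms(raw, n_tests))
```

With these placeholder inputs only the first term stays below the 0.01 cutoff after adjustment.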
Figure 3. Gene Oracle classification of brain regions with brain region-specific edges. (A) Classification accuracy (X-axis) of region-specific gene sets (Y-axis; green bars) versus matched numbers of random genes (red bars) over 1671 GTEx brain samples from 13 different brain regions. (B) Confusion plot showing correct classifications (diagonal boxes) and misclassified samples for each region-specific gene set. The upper number in each diagonal box is the number of correctly classified samples, and the lower number is the corresponding percentage for that class. The other boxes show the number of misclassified samples.
Figure 4. Combinatorial analysis of spinal cord, cortex and substantia nigra gene sets. Heatmaps depicting the frequency of genes present in the classification subsets that were generated at each Gene Oracle Phase 2 iteration. Each row is an iteration and each column is a gene from the cortex/spinal/substantia nigra sets. Darker colors correspond to higher frequencies.
Gene Oracle candidate genes for brain GTEx dataset.
| Region-specific set | Candidate genes identified by Gene Oracle phase II |
|---|---|
| Substantia nigra | DRD2, FAM189A1, KCNJ6, SYNGR3, SNCA, CCDC85A, PTPRU, KCND3, CADPS2, RFK, SLC8A1 |
| Cortex | SNAP25, DMTN, STX1B, CABLES2, L1CAM, PKP4, AAK1, KCNAB2, DAAM2, IL12A-AS1 |
| Spinal cord | PLPPR3, CAMK2N2, PTPN5, ADGRB1, TUNAR, MIR124-2HG, CACNG3 |
Random Forest candidate genes for brain GTEx dataset.
| Region-specific set | Candidate genes identified by Random Forest |
|---|---|
| Substantia nigra | |
| Cortex | |
| Spinal cord | PNMA6F, |
Genes in bold emphasis are common between the two methods.
Figure 5. Classification potential for decomposed gene sets. (A) Classification accuracies for the full region-specific gene sets (green) were compared to accuracies of the candidate genes identified by Gene Oracle (blue), non-candidate genes identified by Gene Oracle (gray), candidate genes identified by Random Forest (orange), and non-candidate genes identified by Random Forest (purple). (B) Same as (A) but only for decomposed genes identified by Random Forest.
Figure 6. t-SNE visualization of region-specific genes on TCGA tumor data. t-SNE was performed using TCGA RNAseq data from brain region sub-GCN genes. 1431 tumor samples from four tumor subtypes are shown. Tumor RNA expression profiles sorted regions into multiple clusters. Each color represents different regions. Red represents GBM; green represents HNSC; blue represents LGG; yellow represents PCPG.
Mutation rates for brain region-specific gene sets in five TCGA tumors.
| Region | TS genes | Polymorphism detection method | Tumor type | Mutated TS genes | Mutated randomized control genes (mean) | P-value | TS genes: mutated tumors | Random genes: mutated tumors (mean) | P-value | TS genes: total mutations | Random genes: total mutations (mean) | P-value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Basal ganglia | 145 | Muse | GBM | 122 | 77.1 | < 0.01 | 139 | 114.77 | 0.12 | 446 | 308.3 | 0.04 |
| Basal ganglia | 145 | Muse | HNSC | 124 | 84.2 | < 0.01 | 286 | 248.61 | 0.05 | 610 | 496.47 | 0.09 |
| Basal ganglia | 145 | Muse | LGG | 96 | 63.6 | < 0.01 | 106 | 79.55 | 0.08 | 389 | 316.2 | 0.1 |
| Basal ganglia | 145 | Muse | PCPG | 10 | 8.7 | 0.23 | 11 | 9.53 | 0.23 | 12 | 9.89 | 0.21 |
| Basal ganglia | 145 | Muse | KIRC | 67 | 55.5 | 0.03 | 118 | 91.4 | 0.04 | 150 | 115.49 | 0.05 |
| Cerebellum | 78 | Muse | GBM | 68 | 44.9 | < 0.01 | 154 | 86.21 | 0.01 | 402 | 207.68 | 0 |
| Cerebellum | 78 | Muse | HNSC | 69 | 48.4 | < 0.01 | 293 | 196.33 | 0 | 703 | 344.58 | 0 |
| Cerebellum | 78 | Muse | LGG | 60 | 37.8 | < 0.01 | 100 | 53.93 | 0.02 | 428 | 202.14 | 0 |
| Cerebellum | 78 | Muse | PCPG | 9 | 5.48 | 0.05 | 10 | 6.13 | 0.06 | 10 | 6.36 | 0.07 |
| Cerebellum | 78 | Muse | KIRC | 53 | 33.39 | < 0.01 | 102 | 62.69 | 0 | 137 | 73.17 | 0 |
| Cortex | 40 | Muse | GBM | 34 | 23.5 | < 0.01 | 49 | 45.57 | 0.32 | 118 | 103.52 | 0.26 |
| Cortex | 40 | Muse | HNSC | 36 | 25.4 | < 0.01 | 126 | 122.54 | 0.35 | 194 | 176.84 | 0.26 |
| Cortex | 40 | Muse | LGG | 27 | 19.4 | < 0.01 | 36 | 27.44 | 0.17 | 129 | 103.29 | 0.18 |
| Cortex | 40 | Muse | PCPG | 4 | 2.45 | 0.1 | 4 | 2.91 | 0.19 | 4 | 2.99 | 0.2 |
| Cortex | 40 | Muse | KIRC | 19 | 16.95 | 0.18 | 35 | 34.35 | 0.36 | 39 | 37.99 | 0.38 |
| Hypothalamus | 63 | Muse | GBM | 60 | 33 | < 0.01 | 92 | 54.57 | 0.04 | 196 | 125.23 | 0.04 |
| Hypothalamus | 63 | Muse | HNSC | 54 | 36.5 | < 0.01 | 175 | 135.66 | 0.04 | 298 | 202.4 | 0.03 |
| Hypothalamus | 63 | Muse | LGG | 44 | 27.4 | < 0.01 | 56 | 37.39 | 0.04 | 188 | 134.15 | 0.06 |
| Hypothalamus | 63 | Muse | PCPG | 2 | 3.6 | 0.75 | 2 | 3.95 | 0.78 | 2 | 4.14 | 0.78 |
| Hypothalamus | 63 | Muse | KIRC | 28 | 22.54 | 0.01 | 48 | 41.54 | 0.17 | 55 | 46.45 | 0.16 |
| Spinal cord | 28 | Muse | GBM | 20 | 12.3 | < 0.01 | 33 | 20.4 | 0.05 | 74 | 41.44 | 0.02 |
| Spinal cord | 28 | Muse | HNSC | 22 | 13.5 | < 0.01 | 86 | 57.66 | 0.05 | 113 | 70.03 | 0.04 |
| Spinal cord | 28 | Muse | LGG | 21 | 9.9 | < 0.01 | 22 | 12.48 | 0.04 | 77 | 42.4 | 0.02 |
| Spinal cord | 28 | Muse | PCPG | 1 | 1.1 | 0.3 | 1 | 1.2 | 0.34 | 1 | 1.2 | 0.34 |
| Spinal cord | 28 | Muse | KIRC | 11 | 8.21 | 0.12 | 19 | 15.2 | 0.23 | 21 | 16.01 | 0.17 |
| Substantia nigra | 43 | Muse | GBM | 33 | 23.4 | < 0.01 | 61 | 45.25 | 0.11 | 123 | 98.89 | 0.15 |
| Substantia nigra | 43 | Muse | HNSC | 36 | 25.4 | < 0.01 | 143 | 114.3 | 0.1 | 198 | 162.79 | 0.14 |
| Substantia nigra | 43 | Muse | LGG | 27 | 19.9 | < 0.01 | 41 | 28.45 | 0.05 | 139 | 101.72 | 0.08 |
| Substantia nigra | 43 | Muse | PCPG | 2 | 2.74 | 0.5 | 7 | 3.2 | 0.05 | 8 | 3.28 | 0.03 |
| Substantia nigra | 43 | Muse | KIRC | 20 | 16.76 | 0.13 | 30 | 33.29 | 0.59 | 30 | 36.66 | 0.7 |
| Kidney | 20 | Muse | GBM | 19 | 13.44 | 0.01 | 18 | 32.43 | 0 | 34 | 67.57 | 0 |
| Kidney | 20 | Muse | HNSC | 20 | 14.07 | < 0.01 | 56 | 90.82 | 0 | 62 | 121.02 | 0 |
| Kidney | 20 | Muse | LGG | 19 | 11.43 | < 0.01 | 8 | 18.48 | 0 | 44 | 65.24 | 0 |
| Kidney | 20 | Muse | PCPG | 12 | 1.8 | < 0.01 | 1 | 1.96 | 0 | 1 | 1.98 | 0 |
| Kidney | 20 | Muse | KIRC | 19 | 10.55 | < 0.01 | 256 | 25.96 | 0 | 507 | 30.32 | 0 |
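Each row above compares a count for a gene set against the mean of the same count over random gene sets and reports a p-value. One common way to obtain such a p-value is an empirical test: draw many random gene sets of the same size, count mutated genes in each, and report the fraction of draws that reach or exceed the observed count. The sketch below illustrates that idea on toy data. It is not the authors' procedure: the source says the random sets were matched but gives no further detail here, so this version matches on set size only, and every gene name and number in it is synthetic.

```python
import random

# Illustrative empirical (resampling) p-value on synthetic data.
# Assumptions: random sets are matched on size only, and the p-value is the
# plain fraction of random draws with a count >= the observed count.

def mutated_count(gene_set, mutated_genes):
    """Number of genes in gene_set that carry at least one mutation."""
    return len(set(gene_set) & mutated_genes)

def random_set_counts(background, mutated_genes, set_size, n_draws, seed=0):
    """Mutated-gene counts for n_draws random gene sets of the given size."""
    rng = random.Random(seed)
    return [
        mutated_count(rng.sample(background, set_size), mutated_genes)
        for _ in range(n_draws)
    ]

def empirical_p_value(observed, null_counts):
    """Fraction of null counts that are at least as large as the observed count."""
    exceed = sum(1 for c in null_counts if c >= observed)
    return exceed / len(null_counts)

# Synthetic example: 1000 background genes, 300 of them mutated.
background = [f"gene{i}" for i in range(1000)]
mutated = set(background[:300])

# A 40-gene set of interest that is deliberately enriched for mutated genes.
target_set = background[:30] + background[900:910]

observed = mutated_count(target_set, mutated)
null = random_set_counts(background, mutated, len(target_set), n_draws=1000)
p = empirical_p_value(observed, null)

print(f"observed = {observed}, random mean = {sum(null) / len(null):.2f}, p = {p}")
```

Under this scheme a reported p-value of 0 would simply mean that no random draw reached the observed count, which is one possible reading of the zeros in the table.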