Jinmyung Choi, Parisa Shooshtari, Kaitlin E Samocha, Mark J Daly, Chris Cotsapas.
Abstract
Using robust, integrated analysis of multiple genomic datasets, we show that genes depleted for non-synonymous de novo mutations form a subnetwork of 72 members under strong selective constraint. We further show this subnetwork is preferentially expressed in the early development of the human hippocampus and is enriched for genes mutated in neurological Mendelian disorders. We thus conclude that carefully orchestrated developmental processes are under strong constraint in early brain development, and perturbations caused by mutation have adverse outcomes subject to strong purifying selection. Our findings demonstrate that selective forces can act on groups of genes involved in the same process, supporting the notion that purifying selection can act coordinately on multiple genes. Our approach provides a statistically robust, interpretable way to identify the tissues and developmental times where groups of disease genes are active.
Year: 2016 PMID: 27305007 PMCID: PMC4909280 DOI: 10.1371/journal.pgen.1006121
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Fig 1. The Protein Interaction Network Tissue Search (PINTS) workflow.
We project gene-wise selective constraint scores [6] onto the InWeb protein-protein interaction dataset [16] and use a heuristic version of the prize-collecting Steiner Tree algorithm [17,29] to detect clusters of interacting constrained genes. We assess significance empirically, by randomly assigning the scores to genes 1000 times and calibrating detected subnetwork parameters. We then test any significant subnetwork for usual patterns of preferential expression [32] across the Roadmap Epigenome Project expression data [14], a cosmopolitan tissue atlas, using a Markov random field approach. The approach is flexible and modular, so gene interaction and tissue expression reference datasets can be altered according to the application.
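The empirical significance step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the PINTS implementation: the toy network, the scores, the `best_edge_sum` statistic (a stand-in for the subnetwork parameters produced by the Steiner tree heuristic) and the add-one correction in `empirical_p_value` are all hypothetical choices made for the example.

```python
import random

def empirical_p_value(observed, null_values):
    """Share of null statistics at least as large as the observed one.
    The +1 correction (an assumption here) keeps the estimate above zero."""
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)

def permutation_null(scores, statistic, n_permutations=1000, seed=0):
    """Randomly reassign scores to genes and recompute the statistic each time."""
    rng = random.Random(seed)
    genes = list(scores)
    values = [scores[g] for g in genes]
    null = []
    for _ in range(n_permutations):
        rng.shuffle(values)
        null.append(statistic(dict(zip(genes, values))))
    return null

# Hypothetical toy interaction network (a simple path) and constraint scores.
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
SCORES = {"a": 9.0, "b": 8.0, "c": 1.0, "d": 0.5, "e": 0.2, "f": 0.1}

def best_edge_sum(scores):
    """Stand-in network statistic: highest combined score of two interacting genes."""
    return max(scores[u] + scores[v] for u, v in EDGES)

observed = best_edge_sum(SCORES)
null = permutation_null(SCORES, best_edge_sum, n_permutations=1000)
p_value = empirical_p_value(observed, null)
print(f"observed={observed}, empirical p={p_value:.3f}")
```

With 1000 permutations and this correction, the smallest p-value the procedure can report is 1/1001, so the resolution of the test is set by the number of permutations.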
Fig 2. Selectively constrained genes form a 72-member network, preferentially expressed in fetal brain, heart and immune cell populations.
A: constrained genes form a connected subnetwork of genes in the extreme of the constraint score distribution. B: the constrained subnetwork (orange dots) contains more genes (node p < 0.001), has more connections (edge p < 0.001), is more densely connected (clustering coefficient p = 0.008) and explains more total constraint (sum p < 0.001) than networks discovered in 1000 permutations of the constraint data (boxplots and black dots). C: the constrained subnetwork is preferentially expressed in a subset of Roadmap Epigenome Project tissues, including fetal brain. Preferentially expressed nodes and the shortest paths connecting them are in color; grey nodes are not preferentially expressed in each displayed tissue. D: the most consistent preferential expression signal is seen in fetal brain, and it is robust to the stringency of the preferential expression threshold.
A 72-member constrained gene subnetwork.
We find that 67/107 significantly constrained genes form a single protein-protein interaction subnetwork. Five additional genes are also included (gray shading), as our cluster detection algorithm by design looks for a backbone of null nodes connected to many signal nodes. As shown in Fig 2, the subnetwork is significantly larger and more densely connected than expected by chance, and is preferentially expressed in a subset of early-stage neural and immune tissues.
| Gene | Constraint score | Chr | Start | End | Gene | Constraint score | Chr | Start | End |
|---|---|---|---|---|---|---|---|---|---|
| | 9.977 | 14 | 101964528 | 102050792 | | 4.940 | 1 | 19074506 | 19210276 |
| | 8.302 | 17 | 1650629 | 1684882 | | 4.905 | 17 | 7884806 | 7912760 |
| | 7.973 | X | 53532096 | 53686729 | | 4.866 | 16 | 8892094 | 8964514 |
| | 6.604 | 19 | 10961001 | 11065395 | | 4.826 | 20 | 63981135 | 64033100 |
| | 6.578 | 17 | 7484366 | 7514618 | | 4.806 | 20 | 58839718 | 58911192 |
| | 6.436 | 1 | 237042205 | 237833988 | | 4.791 | X | 123600561 | 123733056 |
| | 6.388 | X | 71118556 | 71142454 | | 4.772 | 13 | 32031300 | 32299122 |
| | 6.166 | 2 | 96274336 | 96305515 | | 4.753 | X | 71533083 | 71575897 |
| | 6.162 | 12 | 6570083 | 6607476 | | 4.729 | 4 | 56977722 | 57031168 |
| | 5.974 | 1 | 11106535 | 11262507 | | 4.687 | 10 | 76869601 | 77638595 |
| | 5.971 | 9 | 137138390 | 137168762 | | 4.685 | 17 | 29390464 | 29551904 |
| | 5.794 | 19 | 49119389 | 49151026 | | 4.683 | X | 80670854 | 80809688 |
| | 5.747 | 11 | 118436490 | 118526832 | | 4.671 | 9 | 128552558 | 128633665 |
| | 5.720 | 8 | 102253012 | 102412841 | | 4.670 | 6 | 78935867 | 79078236 |
| | 5.589 | 3 | 4493348 | 4847840 | | 4.670 | 11 | 61299451 | 61342596 |
| | 5.547 | 17 | 59619689 | 59696956 | | 4.665 | 14 | 64535905 | 64546173 |
| | 5.541 | X | 154348524 | 154374638 | | 4.644 | 2 | 219434846 | 219498287 |
| | 5.514 | 19 | 18831938 | 18868236 | | 4.639 | 10 | 110567691 | 110604636 |
| | 5.450 | X | 153947553 | 153971807 | | 4.629 | 17 | 8474205 | 8630761 |
| | 5.428 | 3 | 47802909 | 47850195 | | 4.621 | 2 | 61477849 | 61538626 |
| | 5.423 | 2 | 54456285 | 54671445 | | 4.610 | 2 | 224470150 | 224585397 |
| | 5.418 | 2 | 197389784 | 197435091 | | 4.592 | 13 | 109752698 | 109786568 |
| | 5.387 | 9 | 2015219 | 2193624 | | 4.587 | 7 | 45574140 | 45723116 |
| | 5.363 | 22 | 39570753 | 39689737 | | 4.564 | 19 | 1446302 | 1473244 |
| | 5.360 | X | 53374149 | 53422728 | | 4.547 | 1 | 15941869 | 15976132 |
| | 5.334 | 12 | 13537337 | 13980119 | | 4.517 | 9 | 35696948 | 35732395 |
| | 5.211 | 19 | 48394875 | 48444931 | | 4.496 | 22 | 36281281 | 36388018 |
| | 5.178 | X | 71366239 | 71532374 | | 4.478 | 19 | 3976056 | 3985469 |
| | 5.162 | 9 | 35056064 | 35073249 | | 4.451 | 4 | 39822863 | 39977956 |
| | 5.146 | 16 | 58519951 | 58629886 | | 4.438 | 19 | 46674275 | 46717127 |
| | 5.109 | 5 | 14143702 | 14532128 | | 4.436 | 19 | 15235519 | 15332545 |
| | 5.100 | 5 | 157266079 | 157395598 | | 4.364 | 11 | 123057489 | 123063230 |
| | 5.065 | 19 | 39436156 | 39476670 | | 4.198 | 3 | 41194837 | 41260096 |
| | 5.028 | 10 | 35638249 | 35642278 | | 3.997 | 12 | 124911604 | 124917368 |
| | 4.993 | 19 | 12699194 | 12724011 | | 3.858 | 1 | 9651732 | 9729114 |
| | 4.945 | 7 | 74657667 | 74760692 | | 2.170 | 5 | 68215720 | 68301821 |
The 72-member constrained gene subnetwork is enriched for canonical pathways reflecting neuronal and immune functionality and basic aspects of cell cycle control.
We tested pathways from two sources (the Reactome database and KEGG, the Kyoto Encyclopedia of Genes and Genomes), assessing how many genes are in each pathway (All), how many map onto the 9729 interconnected genes in our analysis (Mapped), and how many are present in the constrained subnetwork (Subnetwork). We assess significance using both the GSEA approach of a Kolmogorov-Smirnov (KS) test and a simple hypergeometric (HG) test of expected overlaps.
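A hypergeometric overlap test of this kind can be sketched in Python with the standard library alone. The function name `hypergeom_tail` and its parameterization are illustrative assumptions: the sketch uses the 9729 interconnected genes as the background, the 72 subnetwork members as the draw, and an upper tail that includes the observed overlap. The exact background and tail convention behind the HG column are not stated here, so the result need not match the tabulated values.

```python
from math import comb

def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` belong to the pathway."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(max(observed, 0), upper + 1)
    )
    # Exact integer arithmetic, converted to a float only at the end.
    return tail / total

# Example inputs taken from the "Developmental biology (Reactome)" row:
# 344 mapped pathway genes, 10 of them in the 72-member subnetwork.
p = hypergeom_tail(population=9729, successes=344, draws=72, observed=10)
print(f"hypergeometric upper-tail p = {p:.3g}")
```

Summing exact binomial coefficients avoids floating-point underflow for small tail probabilities, at the cost of working with very large integers.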
| Name | All | Mapped | Subnetwork | KS | HG |
|---|---|---|---|---|---|
| Developmental biology (Reactome) | 397 | 344 | 10 | 2.53E-19 | 1.83E-05 |
| Immune system (Reactome) | 934 | 702 | 9 | 4.98E-08 | 1.96E-02 |
| Adaptive immune system (Reactome) | 540 | 421 | 8 | 3.13E-10 | 2.08E-03 |
| Axon guidance (Reactome) | 252 | 220 | 8 | 4.62E-16 | 1.65E-05 |
| mRNA Processing (Reactome) | 162 | 120 | 8 | 9.06E-12 | 1.05E-07 |
| Calcium signaling pathway (KEGG) | 179 | 163 | 7 | 6.00E-08 | 1.36E-05 |
| Spliceosome (KEGG) | 129 | 85 | 7 | 5.32E-21 | 9.60E-08 |
| mRNA splicing (Reactome) | 112 | 74 | 7 | 6.26E-20 | 3.19E-08 |
| Processing of capped intron containing pre-mRNA (Reactome) | 141 | 102 | 7 | 3.21E-13 | 3.99E-07 |
| Pathways in cancer (KEGG) | 329 | 301 | 6 | 4.05E-12 | 4.21E-03 |
| Regulation of actin cytoskeleton (KEGG) | 217 | 188 | 6 | 8.08E-13 | 2.74E-04 |
| Cell cycle (Reactome) | 422 | 332 | 6 | 7.95E-03 | 7.12E-03 |
| mRNA splicing minor pathway (Reactome) | 46 | 20 | 6 | 2.90E-05 | 3.56E-11 |
| Signalling by NGF (Reactome) | 218 | 191 | 6 | 3.04E-21 | 3.02E-04 |
| Focal adhesion (KEGG) | 202 | 188 | 5 | 9.72E-07 | 1.70E-03 |
| Long term potentiation (KEGG) | 71 | 60 | 5 | 9.64E-15 | 2.97E-06 |
| MAPK signaling pathway (KEGG) | 268 | 233 | 5 | 3.02E-15 | 4.92E-03 |
| HIV infection (Reactome) | 208 | 163 | 5 | 1.32E-06 | 8.12E-04 |
| HIV life cycle (Reactome) | 126 | 95 | 5 | 1.03E-02 | 4.27E-05 |
| Late phase of HIV life cycle (Reactome) | 105 | 85 | 5 | 1.36E-02 | 2.27E-05 |
| Neuronal system (Reactome) | 280 | 219 | 5 | 3.80E-28 | 3.64E-03 |
| NGF signalling via TRKa from the plasma membrane (Reactome) | 138 | 120 | 5 | 1.09E-14 | 1.57E-04 |
| Signaling by GPCR (Reactome) | 921 | 415 | 5 | 1.63E-11 | 6.21E-02 |
The 72-member constrained gene subnetwork is preferentially expressed in a range of tissues and brain structures.
We find strong enrichment in a variety of tissues, predominantly neural and immune-derived samples sourced from the Roadmap Epigenome Project (REP) and the BrainSpan Atlas. We report only tissues passing significance with two conservative, independent empirical approaches: random permutation of preferential expression values for the subnetwork across tissues (permutation); and comparison to the largest subnetworks detected when we permute constraint scores for all 9729 InWeb genes (resampled).
| Source | Tissue | Developmental stage | Permutation p-value | Resampled p-value | Tissue-specific genes |
|---|---|---|---|---|---|
| REP | CD34+ | Perinatal (cord blood) | 0.00100 | 0.00100 | 10 |
| REP | Fetal brain | Fetal | 0.01100 | 0.00100 | 16 |
| REP | CD8+ | Adult (>20 years) | 0.01700 | 0.00100 | 10 |
| REP | Fetal thymus | Fetal | 0.04800 | 0.00100 | 5 |
| BrainSpan | Caudal ganglionic eminence | 2A (8–9 pcw) | 0.00125 | 0.00125 | 20 |
| BrainSpan | Dorsolateral prefrontal cortex | 2A (8–9 pcw) | 0.00125 | 0.00125 | 16 |
| BrainSpan | Hippocampal anlage | 2A (8–9 pcw) | 0.00125 | 0.00125 | 17 |
| BrainSpan | Lateral ganglionic eminence | 2A (8–9 pcw) | 0.00125 | 0.00125 | 19 |
| BrainSpan | Primary motor-sensory cortex | 2A (8–9 pcw) | 0.00125 | 0.00125 | 20 |
| BrainSpan | Medial frontal cortex | 2A (8–9 pcw) | 0.00125 | 0.00125 | 19 |
| BrainSpan | Orbital frontal cortex | 2A (8–9 pcw) | 0.00250 | 0.00125 | 14 |
| BrainSpan | Parietal neocortex | 2A (8–9 pcw) | 0.00250 | 0.00125 | 18 |
| BrainSpan | Medial ganglionic eminence | 2A (8–9 pcw) | 0.00375 | 0.00125 | 18 |
| BrainSpan | Occipital neocortex | 2A (8–9 pcw) | 0.00500 | 0.00125 | 18 |
| BrainSpan | Hippocampus | 2B (10–12 pcw) | 0.00625 | 0.00125 | 18 |
| BrainSpan | Hippocampus | 3A (13–15 pcw) | 0.00625 | 0.00125 | 19 |
| BrainSpan | Primary somatosensory cortex | 3A (13–15 pcw) | 0.01250 | 0.00125 | 20 |
| BrainSpan | Primary visual cortex | 4 (19–24 pcw) | 0.01750 | 0.00125 | 22 |
| BrainSpan | Posterior superior temporal cortex | 3B (16–18 pcw) | 0.01875 | 0.00125 | 22 |
| BrainSpan | Posteroventral parietal cortex | 3A (13–15 pcw) | 0.02250 | 0.00125 | 19 |
| BrainSpan | Cerebellar cortex | 4 (19–24 pcw) | 0.02500 | 0.00125 | 19 |
| BrainSpan | Primary motor cortex | 3A (13–15 pcw) | 0.02750 | 0.00125 | 19 |
| BrainSpan | Striatum | 3A (13–15 pcw) | 0.04125 | 0.00125 | 17 |
| BrainSpan | Dorsolateral prefrontal cortex | 4A (19–24 pcw) | 0.04625 | 0.00250 | 21 |
Fig 3. The 72-member selectively constrained gene subnetwork is active in early brain development, particularly in the hippocampus.
A: the constrained subnetwork shows elevated signatures of preferential expression in early stages of brain development. B: the signature is most robust in the hippocampus and its ancestral structures (orange), with some enrichment in ventral forebrain and parietal cortical wall structures very early in development (8–9 post-conception weeks). C: The constrained subnetwork shows significant preferential expression in early developmental stages, with patterns of expression losing this enrichment signature by mid-gestation. Preferentially expressed nodes and the shortest paths connecting them are colored orange; grey nodes are not preferentially expressed in each displayed tissue. Overall, these data suggest the constrained subnetwork is specifically active in very early stages of hippocampal formation.