Abstract
Recent advances in identification and characterization of human-specific regulatory DNA sequences set the stage for the assessment of their global impact on physiology and pathology of modern humans. Gene set enrichment analyses (GSEA) of 8405 genes linked with 35,074 human-specific neuro-regulatory single-nucleotide changes (hsSNCs) revealed numerous significant associations with morphological structures, physiological processes, and pathological conditions of modern humans. Significantly enriched traits include more than 1000 anatomically distinct regions of the adult human brain, many different types of cells and tissues, more than 200 common human disorders, and more than 1000 records of rare diseases. Thousands of genes connected with neuro-regulatory hsSNCs have been identified, which represent essential genetic elements of the autosomal inheritance and offspring survival phenotypes. A total of 1494 hsSNC-linked genes are associated with either autosomal dominant or recessive inheritance, and 2273 hsSNC-linked genes have been associated with premature death, embryonic lethality, as well as pre-, peri-, neo-, and post-natal lethality phenotypes of both complete and incomplete penetrance. Differential GSEA implemented on hsSNC-linked loci and associated genes identified a set of 7990 hsSNC-target genes linked to evolutionarily distinct classes of human-specific regulatory sequences (HSRS). Notably, the expression of a majority of these genes (5389 genes; 67%) is regulated by stem cell-associated retroviral sequences (SCARS), and SCARS-regulated genes captured a dominant fraction (91%) of significant phenotypic associations linked with hsSNCs. Interrogations of the MGI database revealed readily available mouse models tailored for precise experimental definitions of functional effects of hsSNCs and SCARS on genes causally affecting thousands of mammalian phenotypes and implicated in hundreds of common and rare human disorders.
These observations suggest that a preponderance of human-specific traits evolved under a combinatorial regulatory control of distinct classes of HSRS and neuro-regulatory loci harboring hsSNCs that are fixed in humans, distinct from other primates, and located in differentially accessible chromatin regions during brain development.
Keywords: fixed neuro-regulatory human-specific single nucleotide mutations; human phenotypic uniqueness; human-specific regulatory sequences; human-specific traits; stem cell–associated retroviral sequences
Year: 2020 PMID: 32902713 PMCID: PMC7480002 DOI: 10.1007/s10577-020-09639-w
Source DB: PubMed Journal: Chromosome Res ISSN: 0967-3849 Impact factor: 5.239
Fig. 1 GREAT analysis identifies 8405 human genes associated with 35,074 neuro-regulatory human-specific single nucleotide changes (hsSNCs) identified in differentially accessible (DA) chromatin regions during human and chimpanzee brain development in cerebral organoids. a Patterns of genomic associations between neuro-regulatory hsSNCs and putative target genes defined at different single nearest gene maximum extensions. GREAT algorithm version 4.0.4. b A total of 1064 of all 35,074 SNCs (3%) are not associated with any genes in the human genome, while a total of 34,010 (97%) human-specific SNCs in DA regions appear associated with 8405 human genes. GREAT algorithm version 3.0.0
Associations with human physiological processes and pathological conditions of 8405 genes linked with 35,074 human-specific single nucleotide changes (SNC) within differentially accessible (DA) chromatin regions identified during human and chimpanzee brain development in cerebral organoids
| Database | Number of significant records* |
|---|---|
| ARCHS4 Human Tissues | 39 |
| GO Biological Process 2018 | 392 |
| GO Molecular Function 2018 | 89 |
| GO Cellular Component 2018 | 33 |
| KEGG 2019 Human | 129 |
| KEGG 2019 Mouse | 106 |
| MGI Mammalian Phenotype Level 4 2019 | 407 |
| MGI Mammalian Phenotype 2017 | 749 |
| Human Phenotype Ontology | 298 |
| GWAS Catalog 2019 | 241 |
| Rare Diseases AutoRIF Gene Lists | 1116 |
| Rare Diseases GeneRIF Gene Lists | 473 |
| Rare Diseases GeneRIF ARCHS4 Predictions | 603 |
| Rare Diseases AutoRIF ARCHS4 Predictions | 641 |
| Aging Perturbations from GEO (upregulated genes) | 34 |
| Aging Perturbations from GEO (downregulated genes) | 67 |
| Human Brain Regions: Allen Brain Atlas (upregulated genes) | 1218 |
| Human Brain Regions: Allen Brain Atlas (downregulated genes) | 1102 |
| Disease Perturbations from GEO (downregulated genes) | 240 |
| Disease Perturbations from GEO (upregulated genes) | 204 |
| Human Database of Genotypes and Phenotype (dbGaP) | 136 |
| DisGeNET database | 1313 |
| UK Biobank GWAS v1 | 357 |
GEO gene expression omnibus, GO Gene Ontology, GWAS genome-wide association studies, ARCHS4 all RNA-seq and ChIP-seq sample and signature search, KEGG Kyoto Encyclopedia of Genes and Genomes, MGI mouse genome informatics.
*Defined at adjusted p value < 0.05
Fig. 2 A dominant majority (6640 of 8405 genes; 79%) of genes linked to 35,074 human-specific single nucleotide changes (hsSNCs) in differentially accessible (DA) chromatin regions during human and chimpanzee brain development in cerebral organoids represents genetic markers of 26 human brain regions. a Number of brain-region marker genes linked to 35,074 neuro-regulatory hsSNCs in specified human brain regions (normalized values calculated per 1000 region-specific marker genes are shown). Genes linked to hsSNCs were identified among genes significantly upregulated in specified human brain regions using the Allen Brain Atlas database (records manifesting increased expression at a 1.5-fold cutoff were identified and selected for analyses). b Number (percent) of genes differentially expressed (DE) in human versus chimpanzee adult brains among genes linked to neuro-regulatory human-specific SNCs. Genes linked to hsSNCs were identified among genes differentially expressed in eight regions of human versus chimpanzee adult brains (Xu et al., 2018). c The relative ranking of the 26 brain regions based on the numbers of hsSNC-linked marker genes identified in each region of the human brain (normalized values per 1000 marker genes are reported)
Gene set enrichment analyses of the MGI Mammalian Phenotype Level 4 (2019) database identify mammalian phenotypes manifesting significant associations with neuro-regulatory human-specific SNC-linked genes. Top 40 of the 407 significant records are reported
| Mammalian phenotype | Overlap | p value | Adjusted p value |
|---|---|---|---|
| | 315/517 | 1.55E-18 | 8.15E-15 |
| MP:0001262_decreased_body_weight | 773/1471 | 2.04E-17 | 5.36E-14 |
| MP:0001405_impaired_coordination | 247/405 | 6.90E-15 | 1.21E-11 |
| | 362/643 | 9.65E-14 | 1.27E-10 |
| MP:0001463_abnormal_spatial_learning | 120/172 | 1.46E-13 | 1.54E-10 |
| MP:0002169_no_abnormal_phenotype_detected | 958/1944 | 6.87E-12 | 6.02E-09 |
| MP:0004811_abnormal_neuron_physiology | 78/107 | 8.69E-11 | 6.53E-08 |
| MP:0002206_abnormal_CNS_synaptic_transmission | 75/103 | 2.18E-10 | 1.27E-07 |
| MP:0000267_abnormal_heart_development | 111/168 | 2.43E-10 | 1.28E-07 |
| | 246/432 | 2.04E-10 | 1.34E-07 |
| | 360/669 | 2.90E-10 | 1.39E-07 |
| | 122/191 | 8.21E-10 | 3.60E-07 |
| MP:0001899_absent_long_term_depression | 27/28 | 1.11E-09 | 4.50E-07 |
| MP:0001732_postnatal_growth_retardation | 360/677 | 1.85E-09 | 6.48E-07 |
| MP:0001698_decreased_embryo_size | 293/537 | 2.09E-09 | 6.88E-07 |
| MP:0002152_abnormal_brain_morphology | 119/187 | 1.84E-09 | 6.92E-07 |
| | 154/256 | 3.23E-09 | 9.44E-07 |
| MP:0002741_small_olfactory_bulb | 35/40 | 3.07E-09 | 9.51E-07 |
| MP:0001469_abnormal_contextual_conditioning_behavior | 48/61 | 5.45E-09 | 1.43E-06 |
| MP:0001473_reduced_long_term_potentiation | 84/124 | 6.05E-09 | 1.45E-06 |
| MP:0000788_abnormal_cerebral_cortex_morphology | 104/161 | 5.79E-09 | 1.45E-06 |
| | 172/293 | 5.33E-09 | 1.48E-06 |
| MP:0001954_respiratory_distress | 121/194 | 7.90E-09 | 1.81E-06 |
| MP:0001575_cyanosis | 129/210 | 1.00E-08 | 2.19E-06 |
| MP:0001953_respiratory_failure | 103/161 | 1.47E-08 | 3.09E-06 |
| MP:0002906_increased_susceptibility_to_pharmacologically_induced_seizures | 70/101 | 2.60E-08 | 5.25E-06 |
| MP:0002910_abnormal_excitatory_postsynaptic_currents | 59/83 | 7.68E-08 | 1.50E-05 |
| | 499/997 | 9.80E-08 | 1.84E-05 |
| MP:0002066_abnormal_motor_capabilities/coordination/movement | 102/164 | 1.41E-07 | 2.55E-05 |
| MP:0006254_thin_cerebral_cortex | 53/74 | 2.36E-07 | 4.14E-05 |
| | 339/656 | 2.60E-07 | 4.28E-05 |
| MP:0000807_abnormal_hippocampus_morphology | 63/92 | 2.56E-07 | 4.34E-05 |
| MP:0006009_abnormal_neuronal_migration | 59/85 | 2.94E-07 | 4.69E-05 |
| | 143/247 | 3.20E-07 | 4.95E-05 |
| MP:0000031_abnormal_cochlea_morphology | 44/59 | 3.84E-07 | 5.77E-05 |
| MP:0000852_small_cerebellum | 58/84 | 4.91E-07 | 7.17E-05 |
| MP:0002063_abnormal_learning/memory/conditioning | 42/56 | 5.43E-07 | 7.52E-05 |
| MP:0003633_abnormal_nervous_system_physiology | 79/123 | 5.36E-07 | 7.62E-05 |
| MP:0010025_decreased_total_body_fat_amount | 263/498 | 5.94E-07 | 8.01E-05 |
| MP:0009937_abnormal_neuron_differentiation | 75/116 | 7.01E-07 | 9.22E-05 |
Gene set enrichment analyses were performed employing the Enrichr bioinformatics platform (see “Methods”). All analyzed records are reported in the Supplemental Table Set S2. Overlap refers to the number of hsSNC-linked genes and mammalian phenotype–associated genes in corresponding categories. Italicized records highlight the classification categories defined as offspring survival phenotypes
Structurally, functionally, and evolutionarily distinct families of human-specific regulatory sequences (HSRS) manifest common enrichment patterns of associations with 8405 hsSNC-linked genes
| Classification category/reference database | Number of records (hg19) | Associated genes | Common with all 8405 hsSNC-linked genes | Percent, hsSNCs genes | Percent, HSRS genes |
|---|---|---|---|---|---|
| Fixed human-specific insertions | 11,878 | 7979 | 5290 | 62.94 | 66.30 |
| Human-specific TE loci expressed in human dorsolateral prefrontal cortex | 4637 | 4051 | 2719 | 32.35 | 67.12 |
| Set of duplicated regions in GRCh38 space | 7599 | 6618 | 3654 | 43.47 | 55.21 |
| Fixed human-specific deletions | 5883 | 5489 | 3835 | 45.63 | 69.87 |
| Human-specific STR expansions | 4875 | 4844 | 3354 | 39.90 | 69.24 |
| hsTFBS | 3803 | 1087 | 750 | 8.92 | 69.00 |
| ace-DHS | 3538 | 3445 | 2553 | 30.37 | 74.11 |
| FHSRR | 4249 | 2810 | 1899 | 22.59 | 67.58 |
| Human-specific STR contractions | 1279 | 973 | 554 | 6.59 | 56.94 |
| hESC_FHSRR_DHS | 1932 | 1458 | 1096 | 13.04 | 75.17 |
| DHS_FHSRR (non-hESC) | 2118 | 552 | 307 | 3.65 | 55.62 |
| Human accelerated regions (HARs) | 2745 | 2281 | 1890 | 22.49 | 82.86 |
| haDHS | 524 | 747 | 659 | 7.84 | 88.22 |
| Human-biased CNCC enhancers | 1000 | 1439 | 1110 | 13.21 | 77.14 |
| Chimp-biased CNCC enhancers | 1000 | 1445 | 1106 | 13.16 | 76.54 |
| H3K4me3 peaks with human-specific enrichment in prefrontal neurons | 410 | 578 | 308 | 3.66 | 53.29 |
| Human-specific hESC functional enhancers | 1619 | 1214 | 816 | 9.71 | 67.22 |
| All HSRS | 59,089 | 13,824 | 7406 | 88.11 | 53.57 |
Definitions of structurally, functionally, and evolutionarily distinct families of human-specific regulatory sequences (HSRS) can be found in Glinsky (2020a); Glinsky and Barakat (2019).
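The two percent columns in the table above can be reproduced from the gene counts. The following is a minimal sketch, assuming that "Percent, hsSNCs genes" is the common-gene count divided by all 8405 hsSNC-linked genes and that "Percent, HSRS genes" is the same count divided by the genes associated with that HSRS family; the function name `overlap_percents` is illustrative and does not come from the paper.

```python
# Hypothetical helper: recompute the two percent columns for one table row.
# Assumption: both percentages share the same numerator (genes in common).

def overlap_percents(common, hsrs_genes, hssnc_genes=8405):
    """Return (percent of hsSNC-linked genes, percent of HSRS-associated genes)."""
    pct_of_hssnc = round(100 * common / hssnc_genes, 2)
    pct_of_hsrs = round(100 * common / hsrs_genes, 2)
    return pct_of_hssnc, pct_of_hsrs

# Row "Fixed human-specific insertions": 7979 associated genes, 5290 in common.
print(overlap_percents(5290, 7979))   # (62.94, 66.3)

# Row "All HSRS": 13,824 associated genes, 7406 in common.
print(overlap_percents(7406, 13824))  # (88.11, 53.57)
```

Under this reading, the first percentage measures how much of the hsSNC-linked gene set a given HSRS family covers, and the second measures how much of that family's target set is hsSNC-linked.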
Fig. 3 Distinct families of regulatory DNA sequences comprising a compendium of 59,089 human-specific regulatory sequences (HSRS) manifest common enrichment patterns of associations with sub-sets of 8405 neuro-regulatory hsSNC-linked genes. a Number of genes identified by the GREAT algorithm as putative regulatory targets of distinct families of HSRS. b Number of hsSNC-linked genes among genes comprising putative regulatory targets of distinct families of HSRS. c Genome-wide correlation patterns between the number of genes comprising the putative regulatory targets of distinct families of HSRS and the number of hsSNC-linked genes among the HSRS-target genes
Enrichment within human-specific genomic regulatory networks (GRNs) of 8405 genes associated with human-specific neuro-regulatory single nucleotide changes (hsSNCs)
| Classification category | Number of genes | Genes associated with hsSNCs |
|---|---|---|
| Networks of genes associated with expression of transposable elements (TE) in human dorsolateral prefrontal cortex | | |
| Human genome | 63,677 | 8405 |
| Networks of genes associated with human DLPFC-expressed TE | 22,863 | 6547 |
| Percent | 35.9 | 77.89 |
| Enrichment** | 1 | 2.17 |
| p value* | 0 | |
| GES of the multi-lineage markers expressing (MLME) cells of human preimplantation embryo | | |
| Human genome | 63,677 | 8405 |
| GES of the MLME cells of human preimplantation embryo | 12,735 | 5218 |
| Percent | 20 | 62.08 |
| Enrichment** | 1 | 3.10 |
| p value* | 0 | |
| Regulatory networks of genes associated with human-specific structural variants*** | | |
| Human genome | 63,677 | 8405 |
| Genes associated with human-specific deletions and insertions | 10,992 | 3056 |
| Percent | 17.26 | 36.36 |
| Enrichment** | 1 | 2.11 |
| p value* | 0 | |
| Gene expression signature of the HERVH/LBP9 network in hESC | | |
| Human genome | 63,677 | 8405 |
| Genes associated with the HERVH/LBP9 pathway in hESC | 11,507 | 4073 |
| Percent | 18.07 | 48.46 |
| Enrichment** | 1 | 2.68 |
| p value* | 0 | |
| Network of genes associated with regulatory TE in naïve and primed hESC | | |
| Human genome | 63,677 | 8405 |
| Genes associated with regulatory TE in naïve and primed hESC | 6148 | 2787 |
| Percent | 9.65 | 33.16 |
| Enrichment** | 1 | 3.44 |
| p value* | 0 | |
| Network of genes differentially expressed in human fetal brain and adult neocortex | | |
| Human genome | 63,677 | 8405 |
| Human fetal brain/adult neocortex signature genes | 4764 | 2448 |
| Percent | 7.48 | 29.13 |
| Enrichment** | 1 | 3.89 |
| p value* | 0 | |
| Human neurogenesis in vivo network^ | | |
| Human genome | 63,677 | 8405 |
| Gene expression signatures of human neurogenesis in vivo | 11,911 | 5467 |
| Percent | 18.71 | 65.04 |
| Enrichment** | 1 | 3.48 |
| p value* | 0 | |
| Human corticogenesis in vitro network | | |
| Human genome | 63,677 | 8405 |
| Gene expression signatures of human corticogenesis in vitro | 12,334 | 5253 |
| Percent | 19.37 | 62.50 |
| Enrichment** | 1 | 3.23 |
| p value* | 0 | |
Human-specific genomic regulatory networks (GRNs) were defined previously (Glinsky, 2020a, see text for details) based on the following primary contributions:
Networks of genes associated with expression of transposable elements (TE) in human dorsolateral prefrontal cortex: Guffanti et al. (2018)
Gene expression signature (GES) of the multi-lineage markers expressing (MLME) cells of human preimplantation embryo: Glinsky et al. (2018)
Regulatory networks of genes associated with human-specific structural variants: Kronenberg et al. (2018)
Gene expression signature of the HERVH/LBP9 network in hESC: Wang et al. (2014)
Network of genes associated with regulatory TE in naïve and primed hESC: Theunissen et al. (2016)
Network of genes differentially expressed in human fetal brain and adult neocortex: Glinsky (2017). doi: 10.1101/022913
Human neurogenesis in vivo network: Nowakowski et al. (2017)
Human corticogenesis in vitro network: van de Leemput et al. (2014)
TE transposable genetic elements, hESC human embryonic stem cell, DLPFC dorsolateral prefrontal cortex, MLME multi lineage markers expression
*p values were estimated using the hypergeometric distribution test
**Expected values were estimated based on the number of genes in the human genome (63,677) and the number of genes in the corresponding category of human-specific regulatory networks
***This category of genes was reported in Kronenberg et al. (2018)
^This category of genes was reported in Nowakowski et al. (2017)
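The enrichment values and the hypergeometric test named in the footnotes can be illustrated with the counts from the table. The sketch below is one plausible reading, not the authors' code: it assumes enrichment is the observed percent among the 8405 hsSNC-linked genes divided by the genome-wide percent, and that the test is the upper tail of a hypergeometric distribution. The function names are hypothetical, and the tail is computed in log space because probabilities this small underflow to 0 in double precision.

```python
import math

def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def enrichment(genome, network, linked, overlap):
    """Observed percent among linked genes divided by the genome-wide percent."""
    expected_pct = 100 * network / genome
    observed_pct = 100 * overlap / linked
    return observed_pct / expected_pct

def log10_hypergeom_tail(genome, network, linked, overlap):
    """log10 of P(X >= overlap) when `linked` genes are drawn without
    replacement from `genome` genes, of which `network` belong to the GRN."""
    denom = log_choose(genome, linked)
    lower = max(overlap, linked - (genome - network))
    upper = min(network, linked)
    logs = [
        log_choose(network, k) + log_choose(genome - network, linked - k) - denom
        for k in range(lower, upper + 1)
    ]
    peak = max(logs)  # log-sum-exp keeps the sum numerically stable
    total = peak + math.log(sum(math.exp(x - peak) for x in logs))
    return total / math.log(10)

# DLPFC-expressed TE network: 22,863 of 63,677 genes; 6547 of 8405 linked genes.
print(round(enrichment(63677, 22863, 8405, 6547), 2))  # 2.17
print(log10_hypergeom_tail(63677, 22863, 8405, 6547) < -300)  # True
```

Working with the logarithm of the tail probability is a design choice for this sketch; a statistics library would typically offer an equivalent survival function.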
Fig. 4 SCARS-regulated genes associated with neuro-regulatory human-specific SNCs manifest prominent patterns of significant associations with physiological and pathological phenotypes of modern humans. Among all neuro-regulatory human-specific SNC-linked genes, 5389 SCARS-regulated genes were identified and segregated from the remaining 3016 non-SCARS-regulated hsSNC-linked genes (non-SCARS-regulated genes). These two categories of hsSNC-linked genes were subjected to gene set enrichment analyses to identify statistically significant associations with morphological features and physiological and pathological phenotypes of modern humans. Results of the analyses were separated into two sets (a and b) using a threshold of 250 significant records per database and plotted for visualization. a Results for databases with fewer than 250 significant records. b Results for databases with more than 250 significant records. c Summary of the results reflecting prominent contributions of SCARS-regulated genes to a global scale of significant associations of neuro-regulatory human-specific SNC-linked genes to morphological features and physiological and pathological phenotypes of modern humans.
Comparisons of impacts of 5389 SCARS-regulated genes and 3016 non-SCARS-regulated genes linked with human-specific neuro-regulatory SNCs on a global scale of significant associations to aging and diseases of modern humans
| Classification category | Number of significant records | Percent | Enrichment ratio |
|---|---|---|---|
| Aging-associated gene expression changes | | | |
| Aging upregulated genes | | | |
| All genes linked with human-specific neuro-regulatory SNCs | 34 | | |
| SCARS-regulated hsSNC-linked genes | 130 | 97.0 | 32.5 |
| Non-SCARS-regulated hsSNC-linked genes | 4 | 3.0 | |
| Aging downregulated genes | | | |
| All genes linked with human-specific neuro-regulatory SNCs | 67 | | |
| SCARS-regulated hsSNC-linked genes | 155 | 98.7 | 77.5 |
| Non-SCARS-regulated hsSNC-linked genes | 2 | 1.3 | |
| Disease-associated gene expression changes | | | |
| Diseases GEO database (upregulated genes) | | | |
| All genes linked with human-specific neuro-regulatory SNCs | 204 | | |
| SCARS-regulated hsSNC-linked genes | 507 | 99.2 | 126.8 |
| Non-SCARS-regulated hsSNC-linked genes | 4 | 0.8 | |
| Diseases GEO database (downregulated genes) | | | |
| All genes linked with human-specific neuro-regulatory SNCs | 240 | | |
| SCARS-regulated hsSNC-linked genes | 476 | 97.9 | 47.6 |
| Non-SCARS-regulated hsSNC-linked genes | 10 | 2.1 | |
| DisGeNET database | | | |
| All genes linked with human-specific neuro-regulatory SNCs | 1313 | | |
| SCARS-regulated hsSNC-linked genes | 1126 | 98.5 | 66.2 |
| Non-SCARS-regulated hsSNC-linked genes | 17 | 1.5 | |
Differential GSEA were carried out independently on 5389 SCARS-regulated and 3016 non-SCARS-regulated neuro-regulatory hsSNC-linked genes employing 29 genomic databases. Results were recorded and reported for comparisons of the numbers of significantly enriched records. Percent and enrichment ratio columns report metrics reflective of the relative contributions of SCARS-regulated and non-SCARS-regulated hsSNC-linked genes
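The percent and enrichment ratio values reported for this comparison are consistent with a simple calculation on the two record counts. The sketch below is inferred from the numbers rather than taken from the paper's methods: it assumes percent is each group's share of the combined SCARS and non-SCARS record counts, and that the enrichment ratio is the SCARS count divided by the non-SCARS count. The function name `relative_contribution` is illustrative.

```python
# Hypothetical helper: relative contributions of the two gene groups to the
# number of significantly enriched records in one database.

def relative_contribution(scars_records, non_scars_records):
    """Return (percent SCARS, percent non-SCARS, enrichment ratio)."""
    total = scars_records + non_scars_records
    pct_scars = round(100 * scars_records / total, 1)
    pct_non_scars = round(100 * non_scars_records / total, 1)
    ratio = round(scars_records / non_scars_records, 1)
    return pct_scars, pct_non_scars, ratio

# Aging upregulated genes: 130 records for SCARS-regulated genes, 4 for the rest.
print(relative_contribution(130, 4))    # (97.0, 3.0, 32.5)

# DisGeNET database: 1126 versus 17 records.
print(relative_contribution(1126, 17))  # (98.5, 1.5, 66.2)
```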