Abstract
Thousands of candidate human-specific regulatory sequences (HSRS) have been identified, supporting the hypothesis that uniquely human phenotypes result from human-specific alterations of genomic regulatory networks. Collectively, a compendium of multiple diverse families of HSRS that are functionally and structurally divergent from Great Apes could be defined as the backbone of human-specific genomic regulatory networks. Here, a conservation pattern analysis of 18,364 candidate HSRS was carried out, requiring that 100% of bases remap during the alignments of human, chimpanzee, and bonobo sequences. A total of 5,535 candidate HSRS were identified that are: (i) highly conserved in Great Apes; (ii) evolved by the exaptation of highly conserved ancestral DNA; (iii) defined by either the acceleration of mutation rates on the human lineage or the functional divergence from non-human primates. The pathway of exaptation of highly conserved ancestral DNA seems mechanistically distinct from the evolution of regulatory DNA segments driven by the species-specific expansion of transposable elements. Genome-wide proximity placement analysis of HSRS revealed that a small fraction of topologically associating domains (TADs) contain more than half of HSRS from four distinct families. TADs that are enriched for HSRS, termed rapidly evolving in humans TADs (revTADs), comprise 0.8-10.3% of 3,127 TADs in the hESC genome. RevTADs manifest distinct correlation patterns between placements of human accelerated regions, human-specific transcription factor-binding sites, and recombination rates. There is a significant enrichment within revTAD boundaries of hESC-enhancers, primate-specific CTCF-binding sites, human-specific RNAPII-binding sites, hCONDELs, and H3K4me3 peaks with human-specific enrichment at TSS in prefrontal cortex neurons (P < 0.0001 in all instances).
The present analysis supports the idea that phenotypic divergence of Homo sapiens is driven by the evolution of human-specific genomic regulatory networks via at least two mechanistically distinct pathways of creation of divergent sequences of regulatory DNA: (i) recombination-associated exaptation of the highly conserved ancestral regulatory DNA segments; (ii) human-specific insertions of transposable elements.
Keywords: DNase I hypersensitive sites; exaptation of ancestral regulatory DNA; human accelerated regions; human-specific regulatory sequences; human-specific transcription factor binding sites
Year: 2016 PMID: 27503290 PMCID: PMC5630920 DOI: 10.1093/gbe/evw185
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Distribution of Highly Conserved in Non-Human Primates Regulatory Sequences among 15,371 Candidate Human-Specific Regulatory Sequence
| HSRS/Genomes | haDHS | HARs | HSTFBS | DHS_FHSRR | hESC_FHSRR | Other_FHSRR |
|---|---|---|---|---|---|---|
| Human genome (hg19) | 524 | 2,745 | 3,803 | 2,118 | 1,932 | 4,249 |
| Human genome (hg38) | 524 | 2,739 | 3,714 | 2,114 | 1,928 | 4,235 |
| Mouse genome conversion (mm10) | 66 | 1,004 | 12 | 4 | 0 | 0 |
| Reciprocal conversion to human genome | 23 | 560 | 1 | 2 | 0 | 0 |
| Percent conserved in rodents' genome | | | | | | |
| Chimpanzee genome conversion | 439 | 2,404 | 56 | 5 | 0 | 13 |
| Reciprocal conversion to human genome | 390 | 2,146 | 40 | 0 | 0 | 1 |
| Percent conserved in Chimpanzee | 74.4 | 78.3 | 1.1 | 0 | 0 | 0 |
| Bonobo genome conversion | 425 | 2,341 | 495 | 242 | 396 | 438 |
| Reciprocal conversion to human genome | 383 | 2,123 | 262 | 199 | 306 | 350 |
| Percent conserved in Bonobo | 73.1 | 77.5 | 7.1 | 9.4 | 15.9 | 8.3 |
| Conserved in non-human primates | | | | | | |
| Percent conserved in non-human primates | | | | | | |
| Bonobo and Chimp conserved | 370 | 2,004 | 31 | 0 | 0 | 0 |
| Chimp only conserved | 21 | 141 | 9 | 0 | 0 | 1 |
| Bonobo only conserved | 13 | 117 | 231 | 199 | 306 | 350 |
NOTE.—LiftOver algorithm MinMatch (minimum ratio of bases that must remap) threshold was 1.00. HSRS, human-specific regulatory sequences; HSTFBS, human-specific transcription factor-binding sites; haDHS, human accelerated DNase I hypersensitive sites; HARs, human accelerated regions; DHS, DNase I hypersensitive sites; FHSRR, fixed human-specific regulatory regions.
Chimpanzee genome PanTro4 conversion.
Conserved in non-human primates sequences were defined based on both direct and reciprocal conversions to either one or both Chimpanzee and Bonobo genomes at MinMatch threshold of 1.00.
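The three conservation categories in the table follow from simple set logic over the sequences that survive both the direct and the reciprocal conversion for each genome. The following is a minimal sketch of that logic, not the analysis code used in the study; the function name `classify_conserved` and the sequence identifiers are hypothetical.

```python
def classify_conserved(chimp_ok, bonobo_ok):
    """Split sequence IDs by the non-human primate genome(s) they remap to.

    chimp_ok / bonobo_ok are sets of IDs assumed to have passed both the
    direct and the reciprocal conversion at MinMatch = 1.00
    (hypothetical inputs, prepared elsewhere).
    """
    return {
        "both": chimp_ok & bonobo_ok,         # Bonobo and Chimp conserved
        "chimp_only": chimp_ok - bonobo_ok,   # Chimp only conserved
        "bonobo_only": bonobo_ok - chimp_ok,  # Bonobo only conserved
        "conserved": chimp_ok | bonobo_ok,    # conserved in either or both
    }


# Toy example with made-up identifiers.
chimp_ok = {"seq1", "seq2", "seq3"}
bonobo_ok = {"seq2", "seq3", "seq4"}
groups = classify_conserved(chimp_ok, bonobo_ok)
for name, ids in groups.items():
    print(name, sorted(ids))
```

Under this reading, the "conserved in non-human primates" count for a family is the sum of the three disjoint categories.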
Distribution of Highly Conserved in Non-Human Primates Regulatory Sequences among Candidate Human-Specific Regulatory Sequence Defined by the Functional Divergence from Chimpanzee or Deletions of Ancestral DNA in the Human Genome
| HSRS/Genomes | Human-Biased CNCC’s Enhancers | Chimp-Biased CNCC’s Enhancers | hCONDELs | H3K4me3 Signatures in Human Prefrontal Neurons | All HSRS |
|---|---|---|---|---|---|
| Human genome (hg19) | 1,000 | 1,000 | 583 | 410 | 18,364 |
| Human genome (hg38) | 996 | 998 | 245 | 394 | 17,887 |
| Mouse genome conversion (mm10) | 21 | 30 | 22 | 0 | 1,159 |
| Reciprocal conversion to human genome | 4 | 7 | 18 | 0 | 615 |
| Percent conserved in rodents' genome | 0.4 | 0.7 | 7.3 | 0 | 3.4 |
| Chimpanzee genome conversion | 871 | 884 | 17 | 86 | 4,775 |
| Reciprocal conversion to human genome | 765 | 785 | 12 | 36 | 4,175 |
| Percent conserved in Chimpanzee | 76.8 | 78.7 | 4.9 | 9.1 | 23.3 |
| Bonobo genome conversion | 844 | 847 | 71 | 74 | 6,173 |
| Reciprocal conversion to human genome | 754 | 760 | 63 | 36 | 5,236 |
| Percent conserved in Bonobo | 75.7 | 76.2 | 25.7 | 9.1 | 29.3 |
| Conserved in non-human primates | | | | | |
| Percent conserved in non-human primates | | | | | |
| Bonobo and Chimp conserved | 715 | 725 | 7 | 22 | 3,874 |
| Chimp only conserved | 50 | 60 | 5 | 14 | 301 |
| Bonobo only conserved | 39 | 35 | 56 | 14 | 1,360 |
NOTE.—LiftOver algorithm MinMatch (minimum ratio of bases that must remap) threshold was 1.00. HSRS, human-specific regulatory sequences; hCONDELs, human-specific deletions of regulatory DNA; CNCCs, cranial neural crest cells; All HSRS column shows the sum of records for each category from the corresponding entries in tables 1 and 2.
*Chimpanzee genome PanTro4 conversion.
**Conserved in non-human primates sequences were defined based on both direct and reciprocal conversions to either one or both Chimpanzee and Bonobo genomes at MinMatch threshold of 1.00.
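As a quick arithmetic check on the "All HSRS" column, the three disjoint conservation categories can be summed and the printed percentages recomputed. The sketch below uses only counts copied from the table. Using the hg38 count as the denominator is an inference (it reproduces the printed chimpanzee and bonobo percentages), not something the notes state explicitly.

```python
# Counts copied from the "All HSRS" column of the table above.
HG38_TOTAL = 17887  # assumed denominator; reproduces the printed percentages
RECIPROCAL = {"chimpanzee": 4175, "bonobo": 5236}
CATEGORIES = {"both": 3874, "chimp_only": 301, "bonobo_only": 1360}


def percent(count, total):
    """Percentage of `total` represented by `count`, to one decimal place."""
    return round(100 * count / total, 1)


# Sequences conserved in either one or both non-human primate genomes.
conserved_total = sum(CATEGORIES.values())
percents = {genome: percent(n, HG38_TOTAL) for genome, n in RECIPROCAL.items()}

print(conserved_total)  # 5535, the total reported in the abstract
print(percents)         # {'chimpanzee': 23.3, 'bonobo': 29.3}
```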
Two distinct pathways of human regulatory DNA divergence during evolution of human-specific genomic regulatory networks. (A) Sequence conservation analyses of 18,364 candidate human-specific regulatory sequences (HSRS) revealed two distinct patterns of regulatory DNA alignments to genomes of non-human primates (NHP): (i) an alignment pattern with a significant majority (from 77.1% to 82.6%) of candidate HSRS being highly conserved in genomes of Bonobo and Chimpanzee (blue features in the figure); (ii) an alignment pattern with only a minority (from 7.3% to 15.9%) of candidate HSRS being highly conserved in genomes of Bonobo and Chimpanzee (red features in the figure). It is proposed that these two distinct sequence conservation patterns reflect two mechanistically distinct pathways of human regulatory DNA divergence during evolution (see text for details). For each family of HSRS, the percentages of regulatory DNA segments that are highly conserved in NHP (blue) and human-specific (red) are shown. Panel (A) is a graphical summary of the primary data reported in tables 1 and 2, based on a sequence conservation threshold of 1.00 during both direct and reciprocal conversions, thus requiring that 100% of bases remap during the alignments. Panel (B) illustrates the sequence conservation analyses based on a sequence conservation threshold of 0.95 during direct conversion from human to NHP genomes without reciprocal conversion corrections.
Genomic Features Associated with 60 Rapidly Evolving in Humans Topologically Associating Domains
| Genomic features | Genome | revTADs | Expected | Enrichment | P value |
|---|---|---|---|---|---|
| Human Accelerated Regions (HARs) | 2,745 | 378 | 53 | 7.4 | <0.0001 |
| Human-specific TFBS | 3,803 | 1,370 | 73 | 18.8 | <0.0001 |
| Lamina-associated domains (LADs) | 1,344 | 54 | 26 | 2.1 | 0.0019 |
| Human-specific CTCF-binding sites | 591 | 312 | 11 | 28.4 | <0.0001 |
| Human-specific NANOG-binding sites | 826 | 192 | 16 | 12 | <0.0001 |
| Human-specific RNAPII-binding sites | 290 | 181 | 6 | 30.2 | <0.0001 |
| Human-specific regulatory regions identified in H1-hESC | 1,932 | 109 | 37 | 2.9 | <0.0001 |
| Human-specific regulatory regions identified in multiple cells | 4,249 | 417 | 82 | 5.1 | <0.0001 |
| DHS-defined human-specific regulatory regions | 2,118 | 558 | 41 | 13.6 | <0.0001 |
| Human-specific conservative deletions (CONDELs) | 583 | 29 | 11 | 2.6 | <0.0001 |
| Human ESC enhancers | 6,823 | 240 | 131 | 1.8 | <0.0001 |
| Human-specific transcriptional network in the brain | 6,622 | 147 | 127 | 1.2 | 0.3856 |
| Primate-specific CTCF-binding sites | 29,081 | 1,269 | 558 | 2.3 | <0.0001 |
| H3K27ac peaks with human-specific enrichment in embryonic limb at E33 stage | 780 | 31 | 15 | 2.1 | 0.0238 |
| H3K4me3 peaks with human-specific enrichment in prefrontal cortex (PFC) neurons | 410 | 29 | 8 | 3.6 | <0.0001 |
NOTE.—hESC, human embryonic stem cells; TFBS, transcription factor-binding site; HARs, human accelerated region; LAD, lamina-associated domain; TAD, topologically-associating domain; RNAPII, RNA polymerase II; PFC, prefrontal cortex; DHS, DNase hypersensitive sites; CONDELs, conservative deletions; E33, embryonic day 33; Expected number of genomic features was estimated based on the ratio of the number of human rapidly-evolving TADs (n = 60) to the total number of TADs in hESC (n = 3,127).
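The expected counts described in the note can be sketched as a simple proportion: the genome-wide count of a feature scaled by the fraction of TADs that are revTADs (60 of 3,127). The snippet below is an illustrative reconstruction, checked here only against the human-specific TFBS row. Rounding the expected value to an integer and dividing the observed count by that rounded value are assumptions, and the statistical test behind the P values is not reproduced.

```python
N_REVTADS = 60   # rapidly evolving in humans TADs
N_TADS = 3127    # all TADs in the hESC genome


def expected_in_revtads(genome_count):
    """Expected number of features in revTADs under proportional placement."""
    return round(genome_count * N_REVTADS / N_TADS)


def enrichment(observed, expected):
    """Fold enrichment of the observed count over the expected count."""
    return round(observed / expected, 1)


# Human-specific TFBS row: 3,803 genome-wide, 1,370 observed within revTADs.
expected_hstfbs = expected_in_revtads(3803)
fold_hstfbs = enrichment(1370, expected_hstfbs)
print(expected_hstfbs, fold_hstfbs)  # 73 18.8
```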
High-complexity patterns of the genomic architecture of individual rapidly evolving in humans Topologically Associating Domains (revTADs) harboring hundreds of regulatory elements and reflecting distinct association profiles between placements of HARs and TFBS residing within the revTADs. UCSC Genome Browser view of the revTAD on human chr6 harboring 10 Human Accelerated Regions, HARs (red bars), 10 hESC-enriched enhancers (black bars), 52 primate-specific TFBS for NANOG (26 sites), POU5F1 (10 sites), CTCF (26 sites), and 72 recombination hotspots with recombination rates of at least 10 cM/Mb (blue bars). Genomic coordinates of the POU3F2 super-enhancer domain in the hESC genome are depicted by the horizontal arrow. Supplementary figure S1, Supplementary Material online reports multiple correlation screens revealing distinct patterns of associations between placements of HARs and TFBS residing within the revTADs.
Distinct correlation profiles of HSRS and recombination rates within the revTADs distinguish placement patterns of HARs and HSTFBS. (A, B) Visualization of the placement distribution patterns of HARs (low positioned red bars), recombination hotspots, RHs (blue bars), and HSTFBS (high positioned red bars with designations of TF names) within the revTADs. Note that revTADs containing high numbers of RHs (192–402 RHs) tend to harbor higher numbers of HARs (15–24 HARs) and no HSTFBS (panel A). In contrast, revTADs containing intermediate (69–94 RHs) or low (2 RHs) numbers of RHs tend to harbor intermediate and low numbers of HARs and multiple HSTFBS (panel B). (C) A model of genome evolution driven by the increasing complexity of genomic regulatory networks (GRNs). It is proposed that mechanistically distinct processes creating HSRS occur within the context of the intrinsic division of mammalian genomes into regions of high and low recombination rates. Genomic regions of high and low recombination rates are associated with the low and high probabilities of TE insertion and/or retention as well as C/G and A/T alleles’ bias, respectively. According to this model, the continuing emergence of new enhancer elements constitutes a critical creative event driving the increasing complexity of GRNs in the hESC genome. Potential mechanisms of HSRS-mediated effects on principal regulatory structures of interphase chromatin involve: (i) creation of new TFBS and novel enhancer elements; (ii) increasing density of conventional enhancers, which would facilitate a transition to super-enhancer structures; (iii) emergence of overlapping CTCF/cohesin-binding sites and LMNB1-binding sites; (iv) continuing insertion of clusters of Alu elements near the putative DNA bending sites.
Collectively, the ensemble of these structural changes facilitated by the targeted placements and retention of HSRS at defined genomic locations would enable the emergence of new super-enhancer domains and facilitate the remodeling of existing TADs to drive evolution of GRNs. MADE, cytosine methylation associated DNA editing.