Álvaro Perdomo-Sabogal, Katja Nowick.
Abstract
Differences in gene regulation have been suggested to play essential roles in the evolution of phenotypic changes. Although DNA changes in cis-regulatory elements affect only the regulation of their corresponding gene, variations in gene regulatory factors (trans) can have a broader effect, because the expression of many target genes might be affected. Aiming to better understand how natural selection may have shaped the diversity of gene regulatory factors in humans, we assembled a catalog of all proteins involved in controlling gene expression. We found that at least five DNA-binding transcription factor classes are enriched among genes located in candidate regions for selection, suggesting that they might be relevant for understanding regulatory mechanisms involved in human local adaptation. The class of KRAB-ZNFs, zinc-finger (ZNF) genes with a Krüppel-associated box, stands out in three ways: first, it has the most genes located in candidate regions for positive selection; second, it displays the most nonsynonymous single nucleotide polymorphisms (SNPs) with high genetic differentiation between populations within these regions; and third, it has 27 KRAB-ZNF gene clusters with high extended haplotype homozygosity. Our further characterization of nonsynonymous SNPs in ZNF genes located within candidate regions for selection suggests regulatory modifications that might influence the expression of target genes at the population level. Our detailed investigation of three candidate regions revealed possible explanations for how SNPs may influence the prevalence of schizophrenia, eye development, and fertility in humans, among other phenotypes. The genetic variation we characterized here may be responsible for subtle to rough regulatory changes that could be important for understanding human adaptation.
Keywords: Krüppel-associated box (KRAB-ZNF) cluster; positive selection; schizophrenia; transcription factor
Year: 2019 PMID: 31228201 PMCID: PMC6685493 DOI: 10.1093/gbe/evz131
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Composition of 3,344 GRF Genes Considered in This Study (see Supplementary Material Online for Selection Criteria) and the Sources Where These Genes Were Previously Cataloged
| Extant Inventories of Human GRFs | Genes Included | % Included |
|---|---|---|
| | 1,640 | 84.1 |
| | 1,804 | 96.6 |
| | 1,734 | 87.2 |
| | 572 | 96.5 |
| | 339 | 96.3 |
| | 2,998 | 92.3 |
| | 2,225 | 86.6 |
| | 1,506 | 99.8 |
| Present work | 3,344 | 100 |
Association between GRF and non-GRF genes and the level of significance for three statistics for identifying candidate regions for positive selection and one for measuring genetic differentiation (FST).
| Test | Populations | Fisher Exact Test (Bonferroni Corrected) | Odds Ratio | Feature |
|---|---|---|---|---|
| CLR | CEU | 3.96E-15 | 1.207 | Enrichment |
| | CHB | 9.72E-02 | 1.066 | No difference |
| | YRI | 2.70E-07 | 1.132 | Enrichment |
| XP-CLR | CEU versus CHB | 3.96E-04 | 1.145 | Enrichment |
| | CEU versus YRI | 1.58E-14 | 1.278 | Enrichment |
| | CHB versus CEU | 3.42E-10 | 1.235 | Enrichment |
| | CHB versus YRI | 8.64E-08 | 1.203 | Enrichment |
| | YRI versus CEU | 4.50E-09 | 1.219 | Enrichment |
| | YRI versus CHB | 1 | 1.01 | No difference |
| XP-EHH | CEU versus CHB | 3.96E-15 | 1.367 | Enrichment |
| | CEU versus YRI | 3.96E-15 | 0.906 | Depletion |
| | CHB versus CEU | 1.73E-03 | 1.043 | No difference |
| | CHB versus YRI | 3.96E-15 | 0.896 | Depletion |
| | YRI versus CEU | 1 | 1.016 | No difference |
| | YRI versus CHB | 1 | 0.988 | No difference |
| FST | CEU versus CHB | 1.04E-01 | 0.971 | No difference |
| | YRI versus CEU | 1.19E-01 | 1.023 | No difference |
| | YRI versus CHB | 1 | 1.013 | No difference |
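The kind of enrichment test summarized in this table can be sketched as follows. This is a minimal illustration, assuming a two-sided Fisher's exact test on a 2×2 table (GRF versus non-GRF genes, inside versus outside candidate regions) followed by a Bonferroni correction. The function names, the table layout, and any counts passed in are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout (an assumption for illustration):
      a = GRF genes inside candidate regions,     b = GRF genes outside
      c = non-GRF genes inside candidate regions, d = non-GRF genes outside
    Returns (odds_ratio, p_value).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)


def bonferroni(p_value, n_tests):
    """Bonferroni correction: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)
```

Under this sketch, an odds ratio above 1 with a small corrected P value would correspond to the "Enrichment" label, and an odds ratio below 1 with a small corrected P value to "Depletion". The number of tests used for the correction is left to the caller because the source does not state it here.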
Fig. 1.—Enrichment analyses for genes from the ten largest DNA-binding GRF classes located in regions exhibiting high scores for four methods for detecting candidate regions for positive selection and one for measuring genetic differentiation. This heatmap shows the results from Fisher's exact test after correcting for multiple testing using the Bonferroni correction for each population or cross-population comparison, respectively.
Main Biological Roles of the Five Repeatedly Enriched GRF Classes within the Top 5% of Putative Regions for Positive Selection
| GRF Family | Examples of Main Regulatory Roles |
|---|---|
| Forkhead boxes | Cell growth, proliferation, differentiation, and longevity; embryonic development; cell migration; organ development; T-lymphocyte proliferation |
| C2H2 | Establishment of the chromosomal architecture; embryonic development, cell differentiation and proliferation, regulation of the cell cycle and apoptosis |
| KRAB-ZNF | Recruitment of TRIM28/KAP-1 for repression of gene expression, epigenetic silencing; early embryonic development; repression of ERVs and transposable elements; establishment of postzygotic reproductive isolation (speciation) |
| Homeo domain | Body plan specification during embryogenesis, regulation of axial patterning, segment or cell identity and proliferation; formation and cell fate determination in metazoan development, crucial for normal temporospatial limb and organ development |
| High-mobility HMG | Bind temporally to nucleosomes to modify local chromatin architecture; DNA replication and repair; architectural proteins of nucleus and mitochondrial DNA; signaling regulators in the cytoplasm and as inflammatory cytokines |
KRAB-ZNF Clusters Exhibiting One to Multiple Candidate Regions for Positive Selection in Three Human Populations (CEU, CHB, and YRI)
| Chromosome | Start | End | Length Haplotype (Mb) | Population | GRF Genes | Non-GRF Genes | P Value |
|---|---|---|---|---|---|---|---|
| chr19 | 9746367 | 9886927 | 0.14 | CEU | | | 0.001 |
| chr19 | 9679258 | 9871747 | 0.19 | CHB | | | 0.001 |
| chr19 | 9623427 | 9710798 | 0.09 | CEU | | | 0.001 |
| chr19 | 9433260 | 9579560 | 0.15 | CHB | | | 0.039 |
| chr7 | 99049790 | 99226981 | 0.18 | CEU | | | 0.001 |
| chr19 | 12290691 | 12477728 | 0.19 | CEU | | | 0.001 |
| chr19 | 11569316 | 11654956 | 0.09 | CEU | | | 0.001 |
| chr19 | 11569316 | 11651077 | 0.08 | CHB | | | 0.001 |
| chr19 | 11681367 | 11763981 | 0.08 | CHB | | | 0.001 |
| chr19 | 11911546 | 12194995 | 0.28 | CHB | | | 0.001 |
| chr19 | 19518253 | 19658472 | 0.14 | CEU | | | 0.041 |
| chr19 | 20219280 | 20473261 | 0.25 | CEU | | | 0.001 |
| chr19 | 22736627 | 22847686 | 0.11 | CEU | | | 0.001 |
| chr19 | 22849806 | 23075779 | 0.23 | CEU | | | 0.001 |
| chr19 | 22736073 | 22789623 | 0.05 | CHB | | | 0.032 |
| chr19 | 22797143 | 23066423 | 0.27 | CHB | | | 0.008 |
| chr19 | 23167970 | 23274391 | 0.11 | CEU | | | 0.001 |
| chr19 | 23566484 | 23647327 | 0.08 | CEU | | | 0.014 |
| chr19 | 24159713 | 24258543 | 0.1 | CEU | | | 0.001 |
| chr19 | 24165702 | 24249831 | 0.08 | CHB | | | 0.001 |
| chr19 | 20912174 | 21159445 | 0.25 | CHB | | | 0.009 |
| chr19 | 20961835 | 21046198 | 0.08 | YRI | | | 0.009 |
| chr19 | 35379737 | 35443530 | 0.06 | CHB | | | 0.001 |
| chr19 | 37401178 | 37684941 | 0.28 | CHB | | | 0.003 |
| chr19 | 38129568 | 38255337 | 0.13 | CHB | | | 0.039 |
| chr19 | 52350176 | 52471785 | 0.12 | CHB | | | 0.033 |
| chr19 | 52350054 | 52407858 | 0.06 | CEU | | | 0.005 |
| chr19 | 52409615 | 52511217 | 0.1 | CEU | | | 0.025 |
| chr19 | 52533305 | 52665989 | 0.13 | CEU | | | 0.014 |
| chr19 | 52995729 | 53064163 | 0.07 | CEU | | | 0.031 |
| chr3 | 40531136 | 40630291 | 0.1 | CEU | | | 0.031 |
| chr6 | 28040581 | 28337801 | 0.3 | CEU | | | 0.001 |
| chr6 | 28342884 | 28426378 | 0.08 | CEU | | | 0.003 |
| chr12 | 1.33E+08 | 1.34E+08 | 0.3 | CHB | | | 0.001 |
| chr1 | 2.47E+08 | 2.47E+08 | 0.1 | | | | 0.001 |
| chr3 | 44554702 | 44742478 | 0.19 | CHB | | | 0.001 |
| chr16 | 31009588 | 31165239 | 0.16 | | | | 0.001 |
Note.—The patterns of variation are considered unlikely under neutrality based on the results from our simulated data. Regions found in two populations were kept separate. The significance was assessed by simulating a null model using coalescence (see Materials and Methods). An extended version of this table can be found in the Supplementary Material online.
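The haplotype length column of this table appears to be the span of each region in megabases, that is, End minus Start divided by 10^6 and rounded to two decimals. The short sketch below reproduces that arithmetic for coordinates listed in the table. The function name is hypothetical, and the rounding rule is an assumption inferred from the listed values.

```python
def region_length_mb(start, end, ndigits=2):
    """Span of a genomic region in megabases (assumed: (end - start) / 1e6)."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return round((end - start) / 1_000_000, ndigits)


# First chr19 row of the table: 9746367 to 9886927
first_region = region_length_mb(9746367, 9886927)
```

For rows whose coordinates are given only in scientific notation, such as chr12 and chr1, the length cannot be recomputed this way because the exact positions are not recoverable from the table.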
Fig. 2.—KRAB-ZNF gene cluster located on chromosome 6 (chr6: 28.04–28.42 Mb) in four European populations, exhibiting very high genetic differentiation (a), high CLR (b) and XP-CLR (c) scores, long EHH (e, f), and multiple high-frequency haplotypes. Note that the scale on the y axis differs between plots. All values correspond to the raw scores obtained for each method. In the FST track (a), SNPs above the solid lines show moderate (FST > 0.15, blue line) or high (FST > 0.25, red line) genetic differentiation. Bigger dots indicate two highly differentiated SNPs, rs1635 (CEU vs. CHB, red) and rs1997660 (CEU vs. YRI, green). The H12 statistics track (e) shows the H scores for: homozygosity of the most frequent haplotype (H1), homozygosity calculated using all except the most frequent haplotype (H2), the ratio H2/H1, and the combination of the most and second most frequent haplotypes (H12). The H12 track (f) shows four populations with European, one with Asian, and one with African background; the solid red line marks the H12 threshold we defined genome wide (0.1). Dotted vertical lines indicate the extent of the positively selected region within this KRAB-ZNF cluster.
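The H statistics named in this caption can be illustrated with a small sketch. It assumes the commonly used haplotype homozygosity definitions: H1 is the sum of squared haplotype frequencies, H12 is the same sum after pooling the two most frequent haplotypes, and H2 is H1 without the contribution of the most frequent haplotype. The caption's shorthand for H1 may differ from this definition. The function name and the example haplotype labels are hypothetical and not taken from the study.

```python
from collections import Counter


def h_statistics(haplotypes):
    """Compute H1, H2, H12, and H2/H1 from a list of haplotype labels.

    Assumed definitions (p_i = frequency of the i-th most common haplotype):
      H1  = sum(p_i^2)
      H12 = (p_1 + p_2)^2 + sum(p_i^2 for i > 2)
      H2  = H1 - p_1^2
    """
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    counts = Counter(haplotypes)
    n = sum(counts.values())
    freqs = sorted((c / n for c in counts.values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)
    h12 = (p1 + p2) ** 2 + sum(p * p for p in freqs[2:])
    h2 = h1 - p1 * p1
    return {"H1": h1, "H2": h2, "H12": h12, "H2/H1": h2 / h1}


# Hypothetical window with three haplotypes at frequencies 0.5, 0.3, and 0.2
stats = h_statistics(["A"] * 5 + ["B"] * 3 + ["C"] * 2)
```

In this sketch, a window would be flagged when its H12 value exceeds the genome-wide threshold of 0.1 mentioned in the caption. A low H2/H1 ratio is usually read as pointing toward a single dominant haplotype, and a higher ratio toward several frequent haplotypes.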
Fig. 3.—Three missense variants located in two genes within a KRAB-ZNF gene cluster that might have undergone positive selection in European populations. Top left and middle, allelic frequencies of two nonsynonymous SNPs located in the NKAPL gene. Top right, allelic frequencies of one nonsynonymous SNP located in the PGBD1 gene. Bottom, genotypic frequencies for CEU, CHB, and YRI.
Fig. 4.—KRAB-ZNF gene cluster exhibiting a hard sweep on chromosome 3 in the CHB population (chr3: 44.55–44.74 Mb). Three methods for detecting positive selection and FST for measuring genetic differentiation produced very high scores for this region (a–d) when compared with other regions genome wide. Note that the scale on the y axis differs between plots. All values correspond to the raw scores obtained for each method. FST (b) and XP-EHH (d) results indicate very high genetic differentiation and a haplotype with EHH that spans about 188 kb (vertical dotted lines). This KRAB-ZNF cluster contains eight ZNF genes. The regions flanking this 188-kb haplotype upstream and downstream also exhibit EHH, which suggests that they belong to the same selective sweep (about 272 kb). Four highly differentiated nonsynonymous SNPs (green: CHB vs. YRI; red: CEU vs. YRI) in regions coding for protein domains of ZKSCAN7, ZNF35, ZNF501, and ZNF502 may be of functional relevance. Hierarchical boosting results (e) suggest that this corresponds to an incomplete recent selective sweep. Solid and dotted horizontal lines indicate thresholds for FST (blue: FST > 0.15, red: FST > 0.25) and boosting significance thresholds as defined by Pybus et al. (2015) (red: complete, orange: incomplete, blue: recent, and purple: ancient), respectively.