Sarah H Lockwood, Anna Guan, Abigail S Yu, Chi Zhang, Artem Zykovich, Ian Korf, Bruce Rannala, David J Segal.
Abstract
Variants that alter the DNA-binding specificity of transcription factors could affect the specificity for and expression of potentially many target genes, as has been observed in several tumor-derived mutations. Here we examined whether such trans expression quantitative trait loci (trans-eQTLs) could similarly result from common genetic variants. We chose to focus on the Cys2-His2 class of zinc finger transcription factors because they are the most abundant superfamily of transcription factors in humans and have well-characterized DNA-binding interactions. We identified 430 SNPs that cause missense substitutions in the DNA-contacting residues. Fewer common missense SNPs were found at DNA-contacting residues than at non-DNA-contacting residues (P = 0.00006), consistent with possible functional selection against SNPs at DNA-contacting positions. Functional predictions based on zinc finger transcription factor (ZNF) DNA-binding preferences also suggested that many common substitutions could potentially alter binding specificity. However, Hardy-Weinberg Equilibrium analysis and examination of seven orthologs within the primate lineage failed to find evidence of trans-eQTLs associated with the DNA-contacting positions or evidence of different selection pressures on contemporary or evolutionary timescales. The overall conclusion was that common SNPs that alter the DNA-contacting residues of these factors are unlikely to produce strong trans-eQTLs, consistent with the observations by others that trans-eQTLs in humans tend to be few and weak. Some rare SNPs might alter specificity and have remained rare due to purifying selection. The study also underscores the need for large-scale eQTL mapping efforts that might provide experimental evidence for SNPs that alter the choice of transcription factor binding sites.
Keywords: Hardy-Weinberg Equilibrium; nonsynonymous SNPs; trans-expression quantitative trait loci; transcription factors; zinc finger proteins
Year: 2014 PMID: 24970883 PMCID: PMC4169156 DOI: 10.1534/g3.114.012195
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Common SNPs (MAF ≥1%) are observed significantly less frequently at DNA-contacting than at non-DNA-contacting amino acids. (A) The sequence logo representing the 3415 human zinc fingers in this study. The PFAM description of a C2H2 zinc finger motif (PF00096) is shown above the logo. # indicates amino acid positions important for the structure of the motif. Conserved Cys and His residues are indicated in orange. Bracketed regions represent additional amino acids observed in a minority of fingers (percent indicated in gray). Position numbering, by convention starting at the first amino acid of the α-helix, is shown below the logo. (B) The structure of a typical C2H2 zinc finger (Finger 2 of Zif268, PDB accession 1AAY) with conserved Cys and His residues (orange) ligating a zinc ion (yellow). The β carbons (spheres) of residues on the DNA-contacting face of the finger (red) point down toward the DNA bases (not shown), whereas non-DNA-contacting residues (blue) face away from the DNA. (C) The number of common SNPs (as defined in Materials and Methods) at DNA-contacting (red) and non-DNA-contacting (blue) residues of the zinc finger.
eQTLs reported in the literature for the 1040 missense SNPs (P < 1.0E−5)
| Finger position | SNP ID | SNP gene | Probe ID | Probe gene | Distance (bp) | Effect (P value) | Tissue | Study |
|---|---|---|---|---|---|---|---|---|
| C2 | rs2230752 | ZNF177 | GI_37622342-A | ZNF266 | 30,946 | 2.0E−26 | LBL | |
| C2 | rs2230752 | ZNF177 | ILMN_1753782 | ZNF266 | 30,901 | 7.0E−12 | Cerebellum | |
| C2 | rs7257872 | ZNF584 | GI_13325056-S | SLC27A5 | 81,182 | 3.1E−11 | LBL | |
| −2 | rs2074060 | ZNF772 | ILMN_1680693 | ZNF419 | 13,513 | 1.7E−09 | Temporal cortex | |
| −2 | rs2074060 | ZNF772 | ILMN_1680693 | ZNF419 | 13,513 | 1.2E−08 | Frontal cortex | |
| 5 | rs1465789 | ZNF132 | GI_13325056-S | SLC27A5 | 63,912 | 3.2E−14 | LBL | |
| 5 | rs1465789 | ZNF132 | | ZNF132 | 1,345 | 7.8E−08 | LBL | |
LBL, lymphoblastoid cell lines.
Not in Hardy-Weinberg Equilibrium; observed heterozygotes more than expected.
Figure 2. SNPs deviating from Hardy-Weinberg Equilibrium (HWE) for each of the 14 populations in the 1000 Genomes data set. The probability of deviation from HWE is shown for SNPs occurring in DNA-contacting (light and dark red) and non-DNA-contacting (light and dark blue) positions. The black horizontal line in each graph corresponds to a Fisher exact P value of 0.05.
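To make the HWE comparison concrete, the following minimal sketch computes the genotype counts expected under Hardy-Weinberg proportions and flags a heterozygote excess. It is an illustration only: the study reports Fisher exact P values, whereas this sketch uses the simpler chi-square goodness-of-fit statistic, and the function names and genotype counts are hypothetical.

```python
def hwe_expected(n_aa, n_ab, n_bb):
    """Expected genotype counts (AA, AB, BB) under Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    # Allele frequency of A estimated from the observed genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    return p * p * n, 2 * p * q * n, q * q * n


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed and expected genotype counts.

    Assumption: a chi-square approximation stands in for the exact test
    named in the figure caption.
    """
    observed = (n_aa, n_ab, n_bb)
    expected = hwe_expected(n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def heterozygote_excess(n_aa, n_ab, n_bb):
    """True when more heterozygotes are observed than expected under HWE."""
    return n_ab > hwe_expected(n_aa, n_ab, n_bb)[1]


if __name__ == "__main__":
    # Hypothetical counts: 10 AA, 80 AB, 10 BB.
    print(hwe_expected(10, 80, 10))
    print(hwe_chi_square(10, 80, 10))
    print(heterozygote_excess(10, 80, 10))
```

A large statistic together with `heterozygote_excess(...)` returning `True` corresponds to the kind of deviation described in the table footnote (more observed heterozygotes than expected).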
Figure 3. Selection pressures on zinc finger orthologs within the primate lineage. The posterior probability of positive (red), neutral (green), or negative (blue) selection pressure at each amino acid position (x-axis), based on the Bayes Empirical Bayes (BEB) analysis of orthologs in the primate lineages, is shown for seven zinc finger proteins. Gray bars on the cartoon above each graph indicate the positions of zinc fingers within the protein. The dashed purple horizontal line at the top of each graph corresponds to 95% probability. The approximate evolutionary distances, in millions of years ago [Mya; based on (Perelman et al.)], of the eight species used in this analysis are shown at the lower right.
Figure 4. Predictions of deleterious and altered DNA sequence recognition effects of missense SNPs. (A) The overlap of predictions made by four algorithms that determine if a SNP will have a deleterious effect on protein function. (B) The confidence of the predictions (i.e., an increased number of algorithms making the prediction) is shown as a function of the MAF. SNPs occurring in DNA-contacting (light and dark red) and non-DNA-contacting (light and dark blue) positions are shown. The black vertical line corresponds to a MAF of 1%. The frequency of SNPs occurring at less than 0.01% in the 1000 Genomes populations cannot be accurately quantified; therefore, such SNPs are clustered as MAF ≤0.001. SNPs with variable low MAFs may be overlapping in these columns. (C) The summed Kullback-Leibler distance between the DNA binding sites of the reference and SNP forms of the affected zinc finger, as predicted by ZFModels (Gupta et al.), is shown as a function of the MAF. Larger values indicate greater divergence. (D) Representative sequence motifs for several summed Kullback-Leibler distances. The amino acid changed by the SNP is underlined. Note that the actual protein–DNA interactions are inverted in this depiction; the amino acid in position −1 typically affects the most 3′ base and position 6 typically affects the most 5′ base.
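The summed Kullback-Leibler distance in panel C can be illustrated with a short sketch. It assumes each binding site is a list of columns holding A, C, G, T probabilities, computes the divergence per column in bits, and sums over columns. The function names, the pseudocount, and the example motifs are hypothetical, and the exact formulation used with ZFModels (for example, whether the distance is symmetrized) may differ.

```python
import math


def kl_column(p, q, pseudocount=1e-6):
    """KL divergence (bits) between two base-probability columns.

    Assumption: a small pseudocount is added and the columns renormalized
    so that zero probabilities do not produce log(0) or division by zero.
    """
    p_adj = [x + pseudocount for x in p]
    q_adj = [x + pseudocount for x in q]
    p_sum, q_sum = sum(p_adj), sum(q_adj)
    total = 0.0
    for a, b in zip(p_adj, q_adj):
        a, b = a / p_sum, b / q_sum
        total += a * math.log2(a / b)
    return total


def summed_kl(reference_motif, variant_motif):
    """Sum of per-column KL divergences between two equal-length motifs."""
    if len(reference_motif) != len(variant_motif):
        raise ValueError("motifs must have the same number of columns")
    return sum(kl_column(p, q) for p, q in zip(reference_motif, variant_motif))


if __name__ == "__main__":
    # Hypothetical 3-bp sites; columns are ordered A, C, G, T.
    reference = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1], [0.25, 0.25, 0.25, 0.25]]
    variant = [[0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1], [0.25, 0.25, 0.25, 0.25]]
    print(summed_kl(reference, variant))
```

Identical motifs give a distance of zero, and the value grows as the predicted base preferences of the SNP form diverge from those of the reference form.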