Zhenguo Lin, Wei-Sheng Wu, Han Liang, Yong Woo, Wen-Hsiung Li.
Abstract
BACKGROUND: How transcription factor binding sites (TFBSs) are distributed in the promoter region has implications for gene regulation. Previous studies used the translation start codon as the reference point to infer the TFBS distribution. However, it is biologically more relevant to use the transcription start site (TSS) as the reference point. In this study, we reexamined the spatial distribution of TFBSs, investigated various promoter features that may affect the distribution, and studied the effect of TFBS distribution on transcriptional regulation.
Year: 2010 PMID: 20958978 PMCID: PMC3091728 DOI: 10.1186/1471-2164-11-581
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. The distribution of TFBSs in yeast promoters. (A) The frequency distribution of the distances from the TFBSs to the transcription start site (TSS) in genes. This figure shows the spatial distribution of TFBSs in the promoters of 4369 genes (using TFBS dataset IV). The value at each position relative to the TSS is a moving average over a 41 bp window. The distribution has a very sharp peak ~115 bp upstream of the TSS, and the TFBSs are strongly concentrated in the ~100 bp region from 180 to 80 bp upstream of the TSS. The blue solid line represents observed values; the red solid and dotted lines represent the mean and 95% confidence intervals of 1000 randomized tests. (B) No sharp peak was found in the frequency distribution of TFBSs relative to the translation start codon. This figure was generated using the same data as in (A) except that the translation start codon instead of the TSS was used as the reference point. (C) The frequency of deletion polymorphisms in the TSS-proximal region is the lowest in the promoter region and is significantly lower than random expectation. This figure was generated using the same data as in (A).
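The smoothing described for Figure 1A can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the input format (a list of signed TFBS-to-TSS distances in bp, negative meaning upstream), and the handling of the window at the edges of the region (it is simply truncated) are all assumptions made for illustration.

```python
from collections import Counter


def tfbs_position_counts(distances, start, end):
    """Count TFBSs at each integer position in [start, end] relative to the TSS.

    `distances` is a hypothetical input: one signed offset (bp) per TFBS,
    negative for upstream of the TSS.
    """
    counts = Counter(distances)
    return [counts.get(pos, 0) for pos in range(start, end + 1)]


def centered_moving_average(values, window=41):
    """Smooth `values` with a centered moving average.

    The 41 bp default mirrors the window size named in the caption.
    Assumption: near the edges the window is truncated to the available
    positions rather than padded.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed
```

For example, `centered_moving_average(tfbs_position_counts(distances, -500, 100))` would give one smoothed frequency per position from 500 bp upstream to 100 bp downstream of the TSS; the range itself is an arbitrary choice here.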
Figure 2. The effects of promoter architecture features on the TFBS distribution. (A) Strong negative correlation between the TFBS frequency and the genome-wide average of nucleosome occupancy in the ~200 bp region upstream of the TSS. The intergenic regions were aligned with reference to the TSS. (B) The yeast genes were clustered into four groups by k-means clustering based on a 1201 bp region surrounding the TSS. (C) The frequency distributions of TFBSs relative to the TSS in four clusters of genes with diverse nucleosome occupancy patterns. The figure is presented in the same way as Figure 1A. (D) The TFBS distribution differs between TATA box-containing and TATA box-less genes. The TATA box-containing genes have more broadly distributed TFBSs and a higher TFBS frequency in the promoter region. (E) The distributions of TFBSs relative to the TSS in three categories of 5'UTR length: short, medium and long. The long 5'UTR genes have more broadly distributed TFBSs and a higher TFBS frequency in the promoter region.
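A per-position correlation of the kind reported in panel (A) could be computed along the following lines. The caption does not state which correlation coefficient was used, so the Pearson coefficient here is an assumption, as are the function name and the two equal-length input lists (TFBS frequency and average nucleosome occupancy at the same positions upstream of the TSS).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumption: xs holds TFBS frequencies and ys holds average nucleosome
    occupancy at the same positions; neither sequence may be constant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near -1 would correspond to the strong negative relationship the caption describes, with positions of low nucleosome occupancy carrying high TFBS frequency.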
Figure 3. Intrinsic correlations among nucleosome positioning, presence/absence of TATA box and 5'UTR length. (A) Average 5'UTR lengths among the four gene groups clustered by k-means clustering based on their nucleosome occupancy patterns. The mean value of each group is indicated by a bar and the error bars indicate one standard error. (B) The nucleosome occupancy in the 1201 bp window surrounding the TSS in the three groups of genes with different 5'UTR lengths. The TSS-proximal region of long 5'UTR genes has the highest nucleosome occupancy. (C) The proportion of TATA box-containing genes and TATA box-less genes in each 5'UTR length group of genes. The long 5'UTR genes have the highest proportion of TATA box-containing genes and the lowest proportion of TATA box-less genes. (D) TATA box-containing genes have slightly longer 5'UTRs than do TATA box-less genes.
Figure 4. The distribution patterns of TFBSs among genes with different expression profiles. (A) Genes with high expression plasticity have TFBS distribution patterns distinct from those of low plasticity genes. (B) The density of TFBSs is positively correlated with gene expression level in rich media.