Chuan Xu1, Joong-Ki Park2, Jianzhi Zhang1.
Abstract
Alternative transcriptional initiation (ATI) refers to the frequent observation that one gene has multiple transcription start sites (TSSs). Although this phenomenon is thought to be adaptive, the specific advantage is rarely known. Here, we propose that each gene has one optimal TSS and that ATI arises primarily from imprecise transcriptional initiation that could be deleterious. This error hypothesis predicts that (i) the TSS diversity of a gene reduces with its expression level; (ii) the fractional use of the major TSS increases, but that of each minor TSS decreases, with the gene expression level; and (iii) cis-elements for major TSSs are selectively constrained, while those for minor TSSs are not. By contrast, the adaptive hypothesis does not make these predictions a priori. Our analysis of human and mouse transcriptomes confirms each of the three predictions. These and other findings strongly suggest that ATI predominantly results from molecular errors, requiring a major revision of our understanding of the precision and regulation of transcription.
Year: 2019 PMID: 30883542 PMCID: PMC6438578 DOI: 10.1371/journal.pbio.3000197
Source DB: PubMed Journal: PLoS Biol ISSN: 1544-9173 Impact factor: 8.029
Fig 1. The TSS diversity of a gene generally decreases with the gene expression level.
(A) The Simpson index of TSS diversity of a gene in the human universal sample declines with the expression level of the gene in the sample. (B) Spearman's correlations between gene expression level and Simpson index of TSS diversity in each of five human cell lines and 11 human tissue samples examined. (C) The Shannon index of TSS diversity of a gene in the human universal sample declines with the expression level of the gene in the sample. (D) Spearman's correlations between gene expression level and Shannon index of TSS diversity in each human cell line and tissue sample examined. In (A) and (C), each black dot represents a gene. Spearman's rank correlation coefficient (ρ) and associated P-value are presented for the original unbinned data (gray) and down-sampled data (black), respectively. Each red dot shows the mean X-value and mean Y-value of the genes in each of 10 equal-interval bins (i.e., all bins have the same log10(RPM) interval), while the error bars show standard errors (the error bar is absent when a bin contains only one gene). In (B) and (D), gray squares and black triangles show the correlations on the basis of the original unbinned data and down-sampled data, respectively. P < 5 × 10^−3 for all correlations. Sample IDs listed on the x-axis refer to those in S1 Table. Data are available at https://github.com/ZhixuanXu/Nonadaptive-alternative-TSSs. ID, identifier; RPM, reads mapped to the gene per million reads; TSS, transcription start site.
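To make the two diversity measures in this caption concrete, the following minimal Python sketch computes both from the per-TSS read counts of a single gene. The caption does not state the exact formulas, so the definitions here are assumptions: the Simpson index is taken as the Gini-Simpson form 1 − Σ p_i² and the Shannon index as −Σ p_i ln p_i, where p_i is the fractional use of TSS i. The function names and the toy read counts are hypothetical and are not taken from the authors' repository.

```python
import math


def tss_fractions(read_counts):
    """Convert per-TSS read counts of one gene into fractional uses."""
    total = sum(read_counts)
    if total <= 0:
        raise ValueError("gene has no TSS-supporting reads")
    # TSSs with zero reads contribute nothing to either index.
    return [count / total for count in read_counts if count > 0]


def simpson_index(read_counts):
    """Assumed Gini-Simpson form: 1 - sum(p_i^2); 0 when only one TSS is used."""
    return 1.0 - sum(p * p for p in tss_fractions(read_counts))


def shannon_index(read_counts):
    """Assumed natural-log Shannon entropy: -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in tss_fractions(read_counts))


# Hypothetical genes: one initiating almost always at a single TSS,
# and one spreading initiation across three TSSs.
precise_gene = [98, 1, 1]
imprecise_gene = [40, 30, 30]

for name, counts in [("precise", precise_gene), ("imprecise", imprecise_gene)]:
    print(name, simpson_index(counts), shannon_index(counts))
```

Under either assumed definition, a gene that concentrates initiation on one TSS scores lower than a gene that spreads initiation across several, which is the quantity correlated with expression level in panels (A) to (D).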
Fig 2. Increased fractional use of the most frequently used TSS of a gene and decreased fractional use of each other TSS when gene expression level rises.
(A) Spearman's correlation (ρ) between the expression level of a gene and the fractional uses of its TSSs in the human universal sample. TSSs are ranked on the basis of their fractional uses in the sample concerned, with rank #1 being the most frequently used one (major TSS). Each dot represents a gene. Gray and black ρ and P are based on the original and down-sampled data, respectively. (B) Spearman's rank correlation between the expression level of a gene and the fractional uses of its TSSs in each human cell line or tissue sample examined. P < 10^−39 in all cases. Squares and triangles show the correlations on the basis of the original and down-sampled data, respectively. In both panels, the correlation for TSSs with a particular rank is calculated using the genes that have at least that particular number of TSSs. Sample IDs listed on the x-axis of (B) refer to those in S1 Table. Data are available at https://github.com/ZhixuanXu/Nonadaptive-alternative-TSSs. ID, identifier; RPM, reads mapped to the gene per million reads; TSS, transcription start site.
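The rank-wise analysis described in this caption can be sketched as follows. For each gene, TSSs are sorted by fractional use, and for a given rank the Spearman correlation between expression level and fractional use is computed over only those genes that have at least that many TSSs. This is an illustrative sketch, not the authors' pipeline: the function names, the pure-Python Spearman implementation, and the toy expression values and read counts are all assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def fractional_use_by_rank(read_counts):
    """Fractional uses sorted so that index 0 is the major (rank #1) TSS."""
    total = sum(read_counts)
    return sorted((c / total for c in read_counts), reverse=True)


def rho_for_rank(genes, rank):
    """Correlate expression with the fractional use of the rank-th TSS,
    using only genes that have at least `rank` TSSs."""
    eligible = [(expr, fractional_use_by_rank(counts)[rank - 1])
                for expr, counts in genes if len(counts) >= rank]
    return spearman_rho([e for e, _ in eligible], [f for _, f in eligible])


# Hypothetical (expression level, per-TSS read counts) pairs.
toy_genes = [
    (1.0, [5, 3, 2]),
    (10.0, [60, 25, 15]),
    (100.0, [800, 150, 50]),
    (1000.0, [9500, 500]),
]

for rank in (1, 2, 3):
    print(rank, rho_for_rank(toy_genes, rank))
```

In this toy data set the major TSS gains share as expression rises while each minor TSS loses share, so the sign of ρ flips between rank #1 and the lower ranks, mirroring the pattern the figure reports. The last toy gene has only two TSSs and is therefore excluded from the rank #3 correlation.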