Wu Wei, Vicent Pelechano, Bianca P Hennig, Jingwen Wang, Yujie Zhang, Ilaria Piazza, Yerma Pareja Sanchez, Christophe D Chabbert, Sophie H Adjalley, Lars M Steinmetz.
Abstract
Cryptic transcription is widespread and generates a heterogeneous group of RNA molecules of unknown function. To improve our understanding of cryptic transcription, we investigated the transcription start site (TSS) usage, chromatin organization, and posttranscriptional consequences of cryptic transcripts in Saccharomyces cerevisiae. We show that TSSs of chromatin-sensitive internal cryptic transcripts retain features comparable to those of canonical TSSs in terms of DNA sequence, directionality, and chromatin accessibility. We define the 5' and 3' boundaries of cryptic transcripts and show that, in contrast to RNA degradation-sensitive ones, they often overlap with the end of the gene, thereby using the canonical polyadenylation site, and associate with polyribosomes. We show that chromatin-sensitive cryptic transcripts can be recognized by ribosomes and may produce truncated polypeptides from downstream, in-frame start codons. Finally, we confirm the presence of the predicted polypeptides by reanalyzing N-terminal proteomic data sets. Our work suggests that a fraction of chromatin-sensitive internal cryptic promoters initiates the transcription of alternative truncated mRNA isoforms. The expression of these chromatin-sensitive isoforms is conserved from yeast to human, expanding the functional consequences of cryptic transcription and proteome complexity.
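The idea of a truncated polypeptide starting at a downstream, in-frame start codon can be illustrated with a minimal sketch. It assumes the annotated ORF is given as a DNA string beginning at its canonical ATG and that the internal TSS is given as a 0-based offset into that string. The function name `first_inframe_atg` and both parameters are hypothetical and do not come from the paper. The sketch deliberately ignores out-of-frame ATGs, which a scanning ribosome could also encounter.

```python
from typing import Optional


def first_inframe_atg(orf: str, itss_offset: int) -> Optional[int]:
    """Return the 0-based ORF position of the first in-frame ATG at or
    after itss_offset, or None if there is none.

    Assumption: orf starts at the canonical start codon, so positions
    divisible by 3 are in the annotated reading frame.
    """
    orf = orf.upper()
    # Round the offset up to the next codon boundary of the annotated frame.
    start = itss_offset + (-itss_offset) % 3
    # Walk codon by codon, staying in the annotated frame.
    for pos in range(start, len(orf) - 2, 3):
        if orf[pos:pos + 3] == "ATG":
            return pos
    return None
```

Translation from the returned position to the annotated stop codon would give the predicted N-terminally truncated protein, sharing its C terminus with the full-length product.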
Year: 2019 PMID: 31740578 PMCID: PMC6886497 DOI: 10.1101/gr.243378.118
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Figure 1. Genome-wide identification of chromatin- and RNA degradation–sensitive TSSs. Detected chromatin-sensitive cryptic transcripts tend to overlap coding genes in the same orientation. (A) Representative 5′ cap sequencing track. Score (normalized counts) of collapsed replicates is shown (see Methods). Significantly differentially expressed TSS clusters are marked by * (P-adj < 0.001). (B) Classification of differentially expressed TSSs with respect to annotated features. Annotations of stable unannotated transcripts (SUTs), cryptic unstable transcripts (CUTs), and UTR lengths are from Xu et al. (2009). (C) Distribution of differentially expressed TSSs with respect to annotated ORF-T TSSs. ORF-T refers to transcripts associated with canonical ORFs as described by strand-specific tiling arrays (Xu et al. 2009). (D) Relationship between TSSs identified in the analyzed strains. Each horizontal line represents an identified TSS cluster. On the left, we display the log2 fold change (FC) relative to the wild-type strain (red, up-regulated; blue, down-regulated). In black, we indicate which of the identified TSSs can be classified as iTSSs. Finally, TSSs that are significantly differentially expressed compared to wild type are shown on the right (in red). Only TSSs identified as differentially expressed with respect to the wild type in at least one condition are shown.
Figure 2. The sequence and chromatin features of iTSSs resemble those of canonical TSSs. (A) Sequence preference of set2Δ iTSSs compared with canonical TSSs (TSSs down-regulated in set2Δ, which often overlap with canonical TSSs). (B) MNase protection pattern for canonical ORF-T TSSs. MNase fragments are divided by length into nucleosome-protected fragments (nuc) and subnucleosomal fragments (sub). Vertical dotted lines depict canonical nucleosome dyad axes (in black) and putative TF binding sites (in red). (C) Heatmaps depicting in detail the MNase protection pattern for canonical ORF-T TSSs in the wild-type strain and set2Δ. Each line of the heatmaps corresponds to an analyzed region for nucleosome fragments (in blue) and subnucleosomal fragments (in red), ordered by gene expression (Xu et al. 2009). The metagene profile aggregating all heatmap rows is shown above as black dots. (D) Heatmaps depicting in detail the MNase protection pattern for set2Δ iTSSs, as in C. The heatmap is sorted by iTSS expression level. Chromatin data are reanalyzed from Chabbert et al. (2015).
Figure 3. Full-length set2Δ iTSS-derived transcripts use canonical polyadenylation sites. (A) Comparison of the TSSs and TTSs of set2Δ iTSS-initiated transcripts with annotated ORF-T boundaries (Xu et al. 2009). set2Δ iTSS-derived transcripts originate within the body of the gene (internal 5′) but use canonical 3′ polyadenylation sites. (B) TSSs down-regulated in set2Δ use canonical 5′ and 3′ sites. (C) TIFSeq coverage for the YOL022C gene as an example. The upper part shows TSS mapping (as in Fig. 1A). The bottom part shows full-length transcripts in blue. Each line connecting an identified TSS and a poly(A) site represents one full-length transcript. The red arrow indicates the appearance of a set2Δ-sensitive iTSS. Nucleosomes are shown in green (Venters and Pugh 2009).
Figure 4. A fraction of iTSS-derived transcripts associate with ribosomes, and the internal methionine can be recognized as a novel start codon. (A) Relative association with the polyribosome fraction after sucrose fractionation versus total extract. The number of analyzed events (present at a sufficient level in the wild-type strain) is indicated to the right of each plot. (B) Example of the 5PSeq start codon–associated signature after glucose depletion for coding genes. To decrease the effect of potential outliers, values above the 95th percentile at each distance from the start codon were set to that percentile. (C) Start codon–associated signature after glucose depletion for predicted novel start codons in set2Δ iTSS-derived transcripts. Those positions are expected to behave as internal methionines in a wild-type strain. (D) As in C, but showing the subset of cryptic start codons whose mRNAs are more associated with polyribosomes (fold change > 0 in A).
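The outlier handling described for panel B of Figure 4 can be sketched as a per-position cap. This is a minimal sketch, not the authors' code. It assumes the signal is stored as a matrix with one row per gene and one column per distance from the start codon, and it assumes a linear-interpolation definition of the percentile, since the caption does not specify one. The names `percentile` and `cap_per_position` are hypothetical.

```python
from typing import List


def percentile(values: List[float], q: float) -> float:
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list.

    Assumption: the caption does not say which percentile definition
    was used; this is one common choice.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def cap_per_position(matrix: List[List[float]], q: float = 95.0) -> List[List[float]]:
    """Cap each column (one distance from the start codon) at its own
    q-th percentile and return a new matrix; the input is not modified."""
    n_positions = len(matrix[0])
    # One threshold per column, computed across all genes (rows).
    caps = [percentile([row[j] for row in matrix], q) for j in range(n_positions)]
    return [[min(row[j], caps[j]) for j in range(n_positions)] for row in matrix]
```

Capping values instead of discarding them keeps every gene in the metagene average while limiting how much a single highly covered gene can dominate any one position.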
Figure 5. Chromatin-sensitive iTSSs encode peptides that can be detected by MS. Sequencing tracks display the 5′ cap sequencing score (normalized counts) of collapsed replicates for wild type (in black) and set2Δ (in blue). Identified N-terminal peptides are highlighted in yellow, and their orientations are indicated by a red arrow. The three potential translations of the DNA in the same orientation as the detected peptide are displayed in gray. (A) Truncation of SAS4 (MEVEPEVIR). (B) Truncation of CNA1 (MNAGVLPR). (C) Chromatin-sensitive transcript encoding a peptide in the 3′ UTR of MON2 (YDMLIEIVVCFIPST). N-terminal COFRADIC data are from Varland et al. (2018).
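The three potential translations shown in gray in Figure 5 correspond to the three reading frames on one strand. The following is a minimal sketch of how such frames can be computed, assuming the standard genetic code and a DNA string already oriented like the detected peptide. The helper names `translate` and `three_frames` are hypothetical, and the toy sequence in the comment is made up and is not a genomic sequence from the paper.

```python
from typing import List

# Standard genetic code, built with the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon
    and 'X' marks any codon containing a non-ACGT character."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def three_frames(dna: str) -> List[str]:
    """Return the translations of frames 0, 1 and 2 on the given strand."""
    return [translate(dna[offset:]) for offset in range(3)]


# Toy input (made up): three_frames("ATGGCCTAA") gives one string per frame.
```

A detected peptide can then be located by searching each frame's translation for its amino acid sequence, which identifies the frame in which it is encoded.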