Benoît Castandet, Arnaud Germain, Amber M Hotto, David B Stern.
Abstract
Chloroplast transcription requires numerous quality control steps to generate the complex but selective mixture of accumulating RNAs. To gain insight into how this RNA diversity is achieved and regulated, we systematically mapped transcript ends by developing a protocol called Terminome-seq. Using Arabidopsis thaliana as a model, we catalogued >215 primary 5′ ends corresponding to transcription start sites (TSS), as well as 1628 processed 5′ ends and 1299 3′ ends. While most termini were found in intergenic regions, numerous abundant termini were also found within coding regions and introns, including several major TSS at unexpected locations. A consistent feature was the clustering of both 5′ and 3′ ends, contrasting with the prevailing description of discrete 5′ termini, suggesting an imprecision of the transcription and/or RNA processing machinery. Numerous termini correlated with the extremities of small RNA footprints or predicted stem-loop structures, in agreement with the model of passive RNA protection. Terminome-seq was also implemented for pnp1-1, a mutant lacking the processing enzyme polynucleotide phosphorylase. Nearly 2000 termini were altered in pnp1-1, revealing a dominant role in shaping the transcriptome. In summary, Terminome-seq permits precise delineation of the roles and regulation of the many factors involved in organellar transcriptome quality control.
Year: 2019 PMID: 31732725 PMCID: PMC7145512 DOI: 10.1093/nar/gkz1059
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Plastome-scale view of Terminome-seq results. End coverages are the average of two Col-0 biological replicates and given in RPM. 5′ ends obtained with or without TAP treatment are red and blue, respectively, and 3′ ends are displayed in green. Gene models are indicated between the tracks corresponding to the plus and minus strands of the plastome. One copy of the large inverted repeat is omitted for clarity. Selected TSS described in more detail in the main text are marked by arrows and labeled, as are the psbB and rRNA operons. Tick marks are every 1000 nt.
Figure 2. Coverage and distribution of transcript termini. (A) Comparison of genome coverage between RNA-seq and Terminome-seq. While the plastome is almost fully covered by at least one read in RNA-seq, only 12.7 and 15.8% is covered by 5′ and 3′ ends, respectively. Data for RNA-seq correspond to the average of two previously published WT replicates (19). Termini coverage at >10 RPM is marked and discussed in the text. (B) Terminome-seq read distribution in the WT. The results are the average of two biological replicates. Reads antisense to exons (as-exon) are reads mapping to the antisense strand of a known coding region.
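The coverage percentages quoted in panel A can be illustrated with a minimal Python sketch. It assumes that per-position end counts for two replicates are available as plain lists and that "covered" means an averaged coverage above zero. The function names, the toy counts and the 10-position "genome" are hypothetical and are not taken from the authors' pipeline.

```python
def to_rpm(counts, total_mapped_reads):
    """Scale raw per-position end counts to reads per million (RPM)."""
    scale = 1_000_000 / total_mapped_reads
    return [c * scale for c in counts]


def average_replicates(rep_a, rep_b):
    """Average two replicate RPM tracks position by position."""
    return [(a + b) / 2 for a, b in zip(rep_a, rep_b)]


def percent_covered(rpm_track, min_rpm=0.0):
    """Percent of positions whose coverage is strictly above min_rpm."""
    covered = sum(1 for value in rpm_track if value > min_rpm)
    return 100.0 * covered / len(rpm_track)


# Toy data (made-up numbers): 10 positions, two replicates.
rep1 = to_rpm([0, 5, 0, 0, 40, 0, 0, 1, 0, 0], total_mapped_reads=1_000_000)
rep2 = to_rpm([0, 3, 0, 0, 60, 0, 0, 0, 0, 0], total_mapped_reads=2_000_000)
track = average_replicates(rep1, rep2)

any_cov = percent_covered(track)         # positions with any end coverage
high_cov = percent_covered(track, 10.0)  # positions above the 10 RPM mark
print(any_cov, high_cov)
```

The same `percent_covered` call with a threshold of 10 corresponds to the ">10 RPM" category marked in the figure, under the assumption that the threshold is applied to the replicate average.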
Figure 3. TSS analysis. (A) The abundance of 5′ ends at each position for both genome strands was compared between +TAP and −TAP libraries. The dashed line separates the 352 ends that have a +TAP/−TAP ratio >10 from those with a lower ratio. Putative TSS were filtered to remove any ends that did not reach 50% of the coverage of the most represented read within a 5 nt stretch. The pie chart shows the initiating nucleotide of the remaining 215 TSS. (B) The novel TSS detected within clpP intron 1 were mapped using 5′ RACE. 5′ RACE was completed with (red) and without (blue) prior TAP treatment, and sequenced clones are represented by colored arrowheads above/below the nucleotide sequence. The stained gel of the corresponding PCR reactions is shown at right. The gene model between exons 1 and 2 is shown with Terminome-seq results. +TAP 5′ ends are in red, −TAP ends in blue and 3′ ends are in green. The X-axis is genomic position and the Y-axis is RPM coverage. Black arrowheads P1 and P2 represent the 3′ primers used for 5′ RACE.
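The two-step TSS selection described for panel A (a +TAP/−TAP ratio above 10, then a 50% local-maximum filter over a 5 nt stretch) could be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the 5 nt stretch is taken to be a window centered on each candidate, a small pseudocount is added to avoid division by zero, and all names and toy values are hypothetical.

```python
def candidate_tss(plus_tap, minus_tap, min_ratio=10.0, pseudocount=0.1):
    """Return positions whose +TAP/-TAP coverage ratio exceeds min_ratio.

    The pseudocount (an assumption of this sketch) keeps the ratio
    defined where the -TAP library has no coverage.
    """
    hits = []
    for pos, (plus, minus) in enumerate(zip(plus_tap, minus_tap)):
        if plus > 0 and (plus + pseudocount) / (minus + pseudocount) > min_ratio:
            hits.append(pos)
    return hits


def filter_local_maxima(positions, plus_tap, window=5, min_fraction=0.5):
    """Keep candidates reaching min_fraction of the strongest end nearby.

    The window is centered on the candidate (assumed reading of
    "within a 5 nt stretch").
    """
    half = window // 2
    kept = []
    for pos in positions:
        lo = max(0, pos - half)
        hi = min(len(plus_tap), pos + half + 1)
        local_max = max(plus_tap[lo:hi])
        if plus_tap[pos] >= min_fraction * local_max:
            kept.append(pos)
    return kept


# Toy RPM tracks for one strand (made-up numbers).
plus_tap = [0, 20, 100, 60, 0, 0, 0, 50, 0]
minus_tap = [0, 1, 2, 1, 0, 0, 0, 40, 0]

hits = candidate_tss(plus_tap, minus_tap)
kept = filter_local_maxima(hits, plus_tap)
print(hits, kept)
```

In the toy data, position 7 is dropped at the ratio step because it is equally abundant without TAP (a processed end rather than a primary one), and position 1 is dropped at the second step because it falls below half of the neighboring peak.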
Figure 4. Transcript termini of the psbB operon highlighting the role of a secondary structure and RNA binding protein in defining ends. (A) Terminome-seq coverage of the psbB operon with the corresponding gene models, with exons in gray and introns in white. −TAP 5′ ends are in blue and 3′ ends are in green; bent arrows represent TSS inferred from +TAP data. Underlined letters mark the locations that are expanded in panels B and C; numbered peaks and promoters refer to features listed in Table 1. (B) 3′ end coverage for a stem-loop structure between psbT and psbN. The stem is highlighted in green in the nucleotide sequence and the Mfold (119) predicted secondary structure is at right. Asterisks highlight the previously described ends (30). (C) The gene model, nucleotide sequence and end coverage for the HCF152 binding site. Reads accumulate at both the 5′ and 3′ ends of the binding site on the plus strand, indicative of a protected RNA fragment. The color code is the same as in panel A.
Table 1. Description of transcript ends originating from the psbB operon
| End number | Genome position | Notes |
|---|---|---|
| TSS, +TAP 5′ ends | | |
| 1 | 72 200 | P |
| 2 | 72 409 | Internal to |
| 3 | 74 393 | P |
| 4 | 76 153 | Internal to |
| 5 | 76 375–76 376 | Upstream |
| 6 | 76 391 | Internal to |
| 7 | 76 780 | Internal to |
| 8 | 75 482 | Antisense to |
| Processed 5′ ends, −TAP 5′ ends | | |
| 1 | 72 320 | |
| 2 | 73 211 | Internal to |
| 3 | 73 658 | Internal to |
| 4 | 74 418 | |
| 5 | 74 441 | |
| 6 | 74 794 | |
| 7 | 74 847 | First nucleotide of |
| 8 | 76 627 | Internal to |
| 9 | 76 679 | Internal to |
| 10 | 76 760 | Internal to |
| 11 | 76 830 | Internal to |
| 12 | 76 863 | Internal to |
| 3′ ends | | |
| 1 | 72 601 | Internal to |
| 2 | 72 786 | Internal to |
| 3 | 73 371 | Internal to |
| 4 | 73 838 | Internal to |
| 5 | 74 082 | First nucleotide of |
| 6 | 74 242 | 3′ end of |
| 7 | 74 405 | |
| 8 | 74 687 | Internal to |
| 9 | 74 814 | |
| 10 | 76 358 | |
| 11 | 76 543 | Internal to |
| 12 | 77 014 | Internal to |
| 13 | 77 047 | Internal to |
| 14 | 77 765 | 3′ end of |
| 15 | 77 892 | 3′ end of |
| 16 | 77 716 | 3′ end of |
| 17 | 74 211 | 3′ end of |
Figure 5. Distribution, coverage and location of transcript termini in pnp1-1. (A) Plastome-scale view of end coverages from the average of two Col-0 and pnp1-1 biological replicates in RPM. 5′ ends obtained without TAP treatment are blue (WT) and pink (pnp1-1), and 3′ ends are displayed in green (WT) and orange (pnp1-1). Gene models are indicated between the tracks corresponding to the plus and minus strands of the plastome. One copy of the large inverted repeat is omitted for clarity. Tick marks are every 1000 nt. (B) Comparison of Terminome-seq coverage for WT and pnp1-1. While 12.7 and 15.8% of the WT plastome is represented by 5′ and 3′ ends, respectively, these numbers increase to 26.1 and 39.5%, respectively, in pnp1-1. Termini coverage at >10 RPM (0.94 and 1.07% for 5′ and 3′ ends, respectively, in pnp1-1) is marked and discussed in the text. (C) Terminome-seq read distribution in pnp1-1. The results are the average of two biological replicates. as-exon refers to reads mapping to the antisense strand of known coding regions.
Figure 6. Terminome-seq coverage in pnp1-1. (A) The RPM abundance of −TAP 5′ ends and 3′ ends was compared between WT and pnp1-1 at a genomic level. The dashed lines separate ends that are at least 10-fold more abundant in a given genotype. For example, 349 5′ ends are more abundant in the PNPase mutant. (B) Terminome-seq coverage upstream of the rbcL gene. Color coding of ends is provided in an inset. Genome position 54 958 is the rbcL coding region 5′ end according to the TAIR10 annotation. The rbcL processed 5′ end (position 54 889) correlates with the 5′ end of the smRNA footprint of MRL1 (highlighted in blue). (C) Terminome-seq coverage downstream of the rbcL gene, with labeling as in Panel B. Genome position 56 397 is the 3′ end of the coding region. The stem-loop downstream of the gene (positions 56 437–56 488) is highlighted in green and matches a smRNA (66). Other genome positions discussed in the text are also labeled.
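The 10-fold comparison between genotypes in Figure 6A can be sketched in Python as a per-position classification. This is a hedged illustration only: the pseudocount, the handling of positions with no coverage in either genotype, the label names and the toy RPM values are all assumptions of this sketch, not details given in the source.

```python
def classify_ends(wt_rpm, mutant_rpm, fold=10.0, pseudocount=0.1):
    """Label each covered position by its mutant/WT abundance ratio.

    Returns a dict mapping position -> 'mutant_up', 'wt_up' or 'unchanged'.
    The pseudocount (an assumption) keeps ratios finite when one
    genotype has no coverage at a position.
    """
    labels = {}
    for pos, (wt, mut) in enumerate(zip(wt_rpm, mutant_rpm)):
        if wt == 0 and mut == 0:
            continue  # no end detected in either genotype
        ratio = (mut + pseudocount) / (wt + pseudocount)
        if ratio >= fold:
            labels[pos] = "mutant_up"
        elif ratio <= 1 / fold:
            labels[pos] = "wt_up"
        else:
            labels[pos] = "unchanged"
    return labels


# Toy RPM tracks (made-up numbers) for WT and the mutant.
wt = [0, 2, 50, 0, 5]
mutant = [0, 40, 3, 12, 6]

labels = classify_ends(wt, mutant)
n_mutant_up = sum(1 for label in labels.values() if label == "mutant_up")
print(labels, n_mutant_up)
```

Counting the `mutant_up` labels over a whole genome track would give a figure analogous to the number of 5′ ends reported as more abundant in the PNPase mutant, subject to the assumptions listed above.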