
Measures of RNA metabolism rates: Toward a definition at the level of single bonds.

Leonhard Wachutka, Julien Gagneur.

Abstract

We give an overview of experimental and computational methods to estimate RNA metabolism rates genome-wide. We then advocate a local definition of RNA metabolism rate at the level of individual phosphodiester bonds. Rates of formation and disappearance of individual bonds are unambiguously defined, in contrast to rates of complete transcripts. We show that, compared with previous approaches, the recently developed transient transcriptome sequencing (TT-seq) protocol allows estimation of the metabolism rates of individual bonds with the least positional bias.


Keywords:  RNA metabolism; RNA sequencing; kinetics; splicing; transcription


Year:  2016        PMID: 27841720      PMCID: PMC5423486          DOI: 10.1080/21541264.2016.1257972

Source DB:  PubMed          Journal:  Transcription        ISSN: 2154-1272


Importance of RNA metabolism rates

All stages of RNA metabolism contribute to the control of gene expression, including RNA synthesis, splicing, and degradation. The ratio between the synthesis and degradation rates determines steady-state levels of mature RNA. Upon a transcriptional trigger, both degradation and splicing rates contribute to the time needed to reach new steady-state levels. Whereas variations in RNA synthesis rates are the major determinants of mRNA levels, RNA degradation rates further fine-tune mRNA abundance and can be dynamically changed to shape gene expression. Combinations of synthesis and degradation rates enable different gene regulatory strategies that can favor turnover or robustness, and high or low levels of expression. Although genome-wide studies of RNA splicing rates are fewer, it is clear that splicing rates also vary widely between and within genes, with an impact on the composition of the isoform repertoire of a cell. Altogether, precise quantitative measurements of transcription, degradation, and splicing rates are necessary to obtain a deeper understanding of gene expression control and of the underlying mechanisms.

Limitations of steady-state RNA-seq data

At steady state, production and degradation of every molecular species balance each other. The mature RNA concentration is consequently the ratio of the synthesis rate over the degradation rate (Fig. 1). Hence, steady-state RNA-seq cannot untangle synthesis rates from degradation rates. The same issue affects all proxies for splicing kinetics that are derived from steady-state data. For instance, using the ratio of reads spanning exon–exon junctions over reads spanning exon–intron junctions to assess splicing efficiency is subject to confounding. Indeed, steady-state levels of exon–exon junction reads are proportional to the synthesis rate over the degradation rate, and steady-state levels of exon–intron junction reads are proportional to the synthesis rate over the splicing rate (Fig. 1). Hence, the ratio of exon–exon over exon–intron reads equals the splicing rate over the decay rate. This ratio allows comparing the splicing rates of different junctions within one gene, under the reasonable assumption that all exon–exon junctions of the gene have a similar decay rate. A straightforward comparison of splicing between different genes, however, is biased because decay rates generally vary among genes. Similarly, the ratio of exonic over intronic reads depends not only on how fast the precursor RNA is processed, but also on the stability of the spliced-out introns and on the stability of the mature RNA. Altogether, the use of steady-state RNA-seq to infer rates or variations of rates is intrinsically limited.
Figure 1.

RNA-seq versus metabolic labeling approaches. (A) Sketch of the concentration curves of labeled unspliced and spliced RNA after labeling, based on classic first-order kinetics. (B) Steady-state data alone are not enough to untangle synthesis, degradation, and splicing rates. In particular, the ratio of spliced over unspliced reads, often taken as a proxy for splicing efficiency, depends on the degradation rate. (C) For labeling durations much shorter than the degradation or splicing time, unspliced reads reflect RNA synthesis (top). Labeling time series can be analyzed using a kinetic model, for instance first-order kinetics (Eser 2016). Such a time series allows analyzing the kinetics of splice junction formation (bottom).
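
The steady-state relations described in this section can be checked numerically. The following is a minimal sketch, assuming a two-step first-order model (unspliced precursor produced at the synthesis rate and removed by splicing, mature RNA produced by splicing and removed by decay). The function names and the rate values of the two genes are hypothetical and chosen only for illustration.

```python
def steady_state_levels(synthesis, splicing, decay):
    """Steady state of dP/dt = synthesis - splicing * P and
    dM/dt = splicing * P - decay * M (assumed first-order model)."""
    unspliced = synthesis / splicing  # proportional to exon-intron junction reads
    spliced = synthesis / decay       # proportional to exon-exon junction reads
    return unspliced, spliced


def junction_ratio(synthesis, splicing, decay):
    """Ratio of exon-exon over exon-intron levels at steady state."""
    unspliced, spliced = steady_state_levels(synthesis, splicing, decay)
    return spliced / unspliced


# Two hypothetical genes with the SAME splicing rate but different decay rates.
gene_a = dict(synthesis=10.0, splicing=0.5, decay=0.05)
gene_b = dict(synthesis=2.0, splicing=0.5, decay=0.01)

ratio_a = junction_ratio(**gene_a)
ratio_b = junction_ratio(**gene_b)
print("gene A ratio:", ratio_a)
print("gene B ratio:", ratio_b)
```

Under these assumptions the ratio equals the splicing rate divided by the decay rate, so the two genes yield different ratios although they are spliced equally fast. The synthesis rate cancels out of the ratio.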


Estimation of rates using RNA metabolic labeling

To circumvent these limitations, alternative protocols are used that directly probe the kinetics. One class of approaches is based on transcriptional arrest. However, great care should be taken when using transcriptional arrest, because arresting transcription is a major stress on cells and because transcription and degradation are globally coupled. Alternatively, Dölken et al. developed a technique based on metabolic labeling, which has been successfully applied to many eukaryotes including yeast, fly, mouse, and human. The key idea is to use a modified nucleotide, usually 4-thiouridine (4sU), to tag newly synthesized RNA starting from one point in time. Labeling durations as short as 90 s have been applied, allowing the investigation of very rapid events such as splicing and the degradation of short-lived non-coding RNAs. The approach is often applied with a single labeling duration. However, a whole labeling time course of a steady-state cell population can also be used, giving more time points to investigate the kinetics and fit the rates. Studying the time dependence of the rates upon a stress or during the cell cycle requires mathematical modeling of the time dependency of the kinetic parameters. To accurately determine the rates, one further has to model the underlying read-generating process in great detail. Even though the absolute amount of labeled RNA of any transcript increases with labeling duration, the number of sequenced reads of short-lived transcripts decreases, as short-lived RNAs represent a decreasing fraction of all purified RNAs. Hence, to fit a kinetic model to read counts and to obtain absolute measures of half-life, a normalization factor that corresponds to the overall amount of labeled RNA in the samples prior to purification must be estimated. This has been done either using spike-ins or by fitting a global model jointly across all genes.
Moreover, it is important to control for cross-contamination with unlabeled RNA, especially for short labeling durations, where the proportion of unlabeled RNA in the sample can be so large that even a small cross-contamination leads to a large fraction of unlabeled reads in the purified samples. Also, inaccurate determination of feature length (exon, intron, junction) or of GC content can introduce artificial correlations between the kinetic rates and, for example, the length of a gene. Hence, correlations between length (transcript, 5′-UTR, 3′-UTR) and any kinetic parameter should be considered with great care. Estimates of the synthesis rate are particularly sensitive to these biases compared with estimates of splicing and decay rates. We observed a large bias for short genes in fission yeast (mostly non-coding). In general, long observation periods are desirable. However, 4sU-induced inhibition of RNA translation for long labeling periods gives an upper limit of 1 h–2 h of labeling time.
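
The need for a normalization factor can be illustrated with a small calculation. This is a minimal sketch, assuming first-order kinetics with labeling starting at time zero and ignoring splicing, cross-contamination, and sequencing biases. The two transcripts, their half-lives, and the labeling durations are hypothetical.

```python
import math


def labeled_amount(synthesis, decay, t):
    """Labeled RNA after labeling for duration t, assuming
    dL/dt = synthesis - decay * L with L(0) = 0."""
    return synthesis / decay * (1.0 - math.exp(-decay * t))


# Hypothetical half-lives in minutes; both transcripts share one synthesis rate.
half_lives = {"short_lived": 5.0, "long_lived": 60.0}
decay = {name: math.log(2) / h for name, h in half_lives.items()}
durations = [2.0, 5.0, 15.0, 60.0]  # labeling durations in minutes

short_amounts = []  # absolute labeled amount of the short-lived transcript
short_shares = []   # its share of the labeled pool (what read counts reflect)
for t in durations:
    a_short = labeled_amount(1.0, decay["short_lived"], t)
    a_long = labeled_amount(1.0, decay["long_lived"], t)
    short_amounts.append(a_short)
    short_shares.append(a_short / (a_short + a_long))
    print(f"{t:5.1f} min  amount={a_short:7.3f}  share={short_shares[-1]:.3f}")
```

In this toy setting the absolute labeled amount of the short-lived transcript grows with labeling duration while its share of the labeled pool shrinks, which is why read counts alone do not give absolute half-lives without an estimate of the total labeled RNA.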

Defining rates of individual phosphodiester bonds

The notions of RNA synthesis rates, degradation rates, and splicing rates of introns lead to practical and conceptual difficulties due to the interleaved nature of transcription. A single gene locus can give rise to many splice isoforms simultaneously as well as to many overlapping non-coding RNAs. It is not possible to unambiguously allocate each read to one of these transcripts. Statistical models that try to untangle the concentrations of overlapping isoforms exist, but lead to highly coupled estimates. Hence, delineating the RNA metabolism rates of overlapping transcripts is extremely difficult. The issue is not only technical but also biological. In addition to splicing variation, the widespread variations in 5′ and 3′ ends imply that an extremely large number of unique RNA sequences are transcribed from one gene. Variations in the exact transcription start site may affect synthesis rates, and variations in the 3′ end can affect RNA stability by adding or removing cis-regulatory motifs with a role in RNA degradation. Therefore, similar isoforms may have significantly different RNA metabolism rates. Hence, although summary statistics at the level of a whole gene are certainly useful simplifications, we argue that a definition of rates for individual bonds helps clarify the notions and devise clear mathematical models. We distinguish five types of phosphodiester bonds: exonic, exon–intron, intronic, intron–exon, and exon–exon bonds (Fig. 2). The production rate of the first four types is equal to the synthesis rate, or transcription rate, of these bonds. In steady-state culture conditions, the junction formation rate of the exon–exon bonds also equals the transcription rate, assuming no loss during RNA processing. However, tracking exon–exon bonds during a labeling time course allows studying splicing kinetics and, in particular, the delay between transcription and junction formation.
The degradation rates of exon–intron bonds are the cleavage rates of the donor sites, and the degradation rates of intron–exon bonds are the cleavage rates of the acceptor sites. For a single intron, these two cleavage rates need not be equal: the donor site bonds likely have a longer half-life, since they are transcribed before the acceptor sites, and alternative splicing implies that one donor site can correspond to multiple acceptor sites. The synthesis rate of a bond is the sum of the synthesis rates of all transcripts (including all alternatively spliced isoforms) containing it. In contrast, the degradation rate of a single bond does not trivially relate to the degradation rates of the RNA species (including all alternatively spliced isoforms) that contain it. It is some combination of those rates, in a way that depends on the degradation kinetics and the amount of each RNA species. Nonetheless, both the synthesis rates and the degradation rates of individual bonds at steady state are well-defined quantities.
Figure 2.

Synthesis and degradation rates of individual phosphodiester bonds. To simplify the figure, we only consider a gene without overlapping transcripts. However, the definitions apply to configurations with overlapping transcripts and can also be defined for non-canonical splicing (cryptic-, circular-, and trans-splicing). The interpretation of the synthesis and degradation rate of each bond is given to the right.
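
The relation between bond-level and transcript-level rates can be made concrete with a small example. This is a minimal sketch, assuming steady state and first-order degradation of each RNA species, so that the effective degradation rate of a bond is its total synthesis rate divided by its total steady-state level. The function name, the isoform list, and the bond labels are hypothetical.

```python
def bond_rates(isoforms, bond):
    """Return (synthesis rate, effective degradation rate) of one bond.

    isoforms: list of dicts with keys 'synthesis', 'decay', and 'bonds'
    (the set of bond labels the RNA species contains).
    Assumes steady state and first-order decay of every species.
    """
    containing = [iso for iso in isoforms if bond in iso["bonds"]]
    if not containing:
        raise ValueError(f"no RNA species contains bond {bond!r}")
    # Synthesis rate of the bond: sum over all species containing it.
    synthesis = sum(iso["synthesis"] for iso in containing)
    # Steady-state level of the bond: sum of the species levels.
    level = sum(iso["synthesis"] / iso["decay"] for iso in containing)
    # Effective rate: a level-weighted mean of the species decay rates.
    return synthesis, synthesis / level


# Two hypothetical isoforms sharing the exonic bond "e1".
isoforms = [
    {"synthesis": 3.0, "decay": 0.10, "bonds": {"e1", "e2"}},
    {"synthesis": 1.0, "decay": 0.02, "bonds": {"e1", "e3"}},
]

e1_synthesis, e1_degradation = bond_rates(isoforms, "e1")
print("bond e1 synthesis:", e1_synthesis)
print("bond e1 effective degradation:", e1_degradation)
```

In this sketch the bond's synthesis rate is a plain sum, whereas its degradation rate lies between the decay rates of the two isoforms and shifts toward the more abundant species.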

One should note that the definition of rates described above goes hand in hand with a careful annotation of exon and intron boundaries. To this end, it is important not to rely on existing annotations but to adopt a data-driven approach, with read mapping algorithms that allow de novo identification of splice sites. It would also be interesting to address non-canonical types of splicing (circular and trans-splicing), which could be revealed by such de novo identification, possibly with an adapted protocol. For these non-canonical cases too, the definition of metabolism rates of phosphodiester bonds applies and will be a useful concept.

Transient-transcriptome profiling

We recently contributed to the development of transient transcriptome sequencing (TT-seq), a protocol for transient-transcriptome profiling that will be instrumental for estimating metabolism rates of individual bonds, as it addresses an important positional bias of the standard 4sU-seq protocol. Standard 4sU-seq leads to an overrepresentation of reads from the 5′ ends of genes due to labeling of transcription products that were already ongoing when labeling started. This effect is particularly important in higher eukaryotes, where the polymerase takes significantly longer to transcribe a gene in its full extent (typically 20 min in human) than the labeling durations required for studying rapid events such as splicing or the degradation of short-lived RNAs. One consequence is that a fraction of the reads sequenced with 4sU-seq protocols are actually not labeled, and that these reads are enriched toward the 5′ ends of genes (Fig. 3). Hence, synthesis and degradation rate estimates based on standard 4sU-seq protocols are biased by gene length. Another effect is due to co-transcriptional splicing. In the standard 4sU-seq protocol, the pulled-down RNAs may have introns in their 5′ end already spliced out prior to the labeling. Consequently, there is a relatively higher amount of exon–exon reads in the 5′ ends of genes, and thus the introns toward the 5′ ends of genes appear to be spliced faster. Whether the first introns are generally spliced faster than other introns, as single-gene microscopy has indicated, is difficult to assess from standard 4sU-seq data. In contrast, we expect data obtained by TT-seq to show a more uniform and less biased coverage of exon–exon reads.
Figure 3.

TT-seq enables uniform mapping of the human transient transcriptome. When labeling with 4sU for a shorter duration than the time required for the polymerase to complete transcription, only a short part (red) of a nascent transcript gets labeled (top). In standard 4sU-seq, the complete nascent transcript is purified and sequenced. This leads to higher coverage in the 5′ ends of genes and, due to co-transcriptional splicing, to a high ratio of exon–exon reads over exon–intron reads toward the 5′ ends of genes (bottom left). In contrast, TT-seq, whose fragmentation step allows sequencing only the labeled part of nascent transcripts, leads to a more uniform coverage along transcripts and a more uniform distribution of the ratio of exon–exon over exon–intron reads (bottom right). Having a uniform coverage is important to study the metabolism of individual bonds in an unbiased fashion.
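
The positional bias described in this section can be reproduced with a toy model. This is a rough sketch, not the analysis used for TT-seq data. It assumes evenly spaced polymerases moving at constant speed, no degradation or splicing during labeling, and integer position units. The names `gene_len` and `labeled_len` (elongation speed times labeling duration) and their values are hypothetical.

```python
def coverage_profiles(gene_len, labeled_len):
    """Per-position coverage for a 4sU-seq-like and a TT-seq-like readout.

    Each 'front' is where a polymerase would be at the end of labeling.
    Fronts beyond gene_len stand for transcripts completed during labeling.
    """
    full_transcript = [0] * gene_len  # whole labeled transcript is sequenced
    labeled_only = [0] * gene_len     # only the labeled fragment is sequenced
    for front in range(1, gene_len + labeled_len):
        rna_end = min(front, gene_len)               # 3' end of the RNA
        labeled_start = max(0, front - labeled_len)  # first labeled position
        for x in range(rna_end):
            full_transcript[x] += 1
        for x in range(labeled_start, rna_end):
            labeled_only[x] += 1
    return full_transcript, labeled_only


# Hypothetical gene of 60 units; 15 units are transcribed during labeling.
susu_like, ttseq_like = coverage_profiles(gene_len=60, labeled_len=15)
print("4sU-seq-like coverage, 5' vs 3' end:", susu_like[0], susu_like[-1])
print("TT-seq-like coverage, 5' vs 3' end:", ttseq_like[0], ttseq_like[-1])
```

Under these assumptions, sequencing whole labeled transcripts gives coverage that declines from the 5′ to the 3′ end, while sequencing only the labeled fragments gives flat coverage along the gene.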

In conclusion, we predict that TT-seq will become an important protocol to study splicing kinetics genome-wide, as it alleviates positional biases that former labeling protocols entail. To analyze RNA metabolism with TT-seq data, we suggest a switch from a gene-level to a single-bond-level focus.