
Misincorporation by RNA polymerase is a major source of transcription pausing in vivo.

Katherine James1, Pamela Gamba1, Simon J Cockell2, Nikolay Zenkin1.   

Abstract

The transcription error rate estimated from mistakes in end product RNAs is 10⁻³–10⁻⁵. We analyzed the fidelity of nascent RNAs from all actively transcribing elongation complexes (ECs) in Escherichia coli and Saccharomyces cerevisiae and found that 1–3% of all ECs in wild-type cells, and 5–7% of all ECs in cells lacking proofreading factors are, in fact, misincorporated complexes. With the exception of a number of sequence-dependent hotspots, most misincorporations are distributed relatively randomly. Misincorporation at hotspots does not appear to be stimulated by pausing. Since misincorporation leads to a strong pause of transcription due to backtracking, our findings indicate that misincorporation could be a major source of transcriptional pausing and lead to conflicts with other RNA polymerases and replication in bacteria and eukaryotes. This observation implies that physical resolution of misincorporated complexes may be the main function of the proofreading factors Gre and TFIIS. Although misincorporation mechanisms between bacteria and eukaryotes appear to be conserved, the results suggest the existence of a bacteria-specific mechanism(s) for reducing misincorporation in protein-coding regions. The links between transcription fidelity, human disease, and phenotypic variability in genetically-identical cells can be explained by the accumulation of misincorporated complexes, rather than mistakes in mature RNA.

Year:  2017        PMID: 28180286      PMCID: PMC5388426          DOI: 10.1093/nar/gkw969

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Correct copying of genetic information into RNA is one of the requirements of successful gene expression. Overall transcription fidelity, i.e. correctness of the final RNA product, has an estimated error rate of ∼10⁻³–10⁻⁵ (1–5), and is a result of the accuracy of nucleotide incorporation by RNA polymerase (RNAP) and of the proofreading of occasional misincorporation events. The accuracy of nucleoside triphosphate (NTP) choice is mainly determined by the RNAP active site (5,6). Different misincorporations are not equally frequent (5), and sequencing of transcripts produced in vitro by E. coli RNAP has revealed a strong bias in errors toward G>A misincorporation (misincorporation of AMP instead of GMP, resulting in an A:C mismatched base pair), with a C preceding (C−1 in the RNA) the misincorporation position (2). Upon misincorporation, the elongation complex (EC) backtracks by 1 base pair (7,8) (scheme in Figure 1A). From this conformation, the hydrolysis of the second phosphodiester bond of the transcript by the RNAP active center removes the error in the form of a dinucleotide (8). In vitro, this reaction is greatly stimulated by cleavage factors, Gre for bacterial RNAP (9) and TFIIS for eukaryotic RNAP II (10). Besides contributing to errors in the final RNA products, misincorporation events were shown to cause long-lived pausing due to RNAP backtracking in vitro (7,8). However, until recently, misincorporation-caused backtracked pauses had not been investigated in vivo due to their random and transient nature. Additionally, misincorporation-induced pauses were overlooked due to the small effect of cleavage factors on the error rate in the final RNA products (2,11), which led to the intuitive suggestion that misincorporation is a very rare event and, thus, could not contribute significantly to pausing. Importantly, however, backtracked pauses can be detrimental to cells; for instance, sequence-specific backtracked pauses have been proposed to cause RNA polymerase traffic jams, and were shown to cause conflicts with replication forks, leading to DNA double-strand breaks and genome instability (11–13).
Figure 1.

(A) Upon misincorporation, the elongation complex (EC) backtracks by 1 base pair, which then leads to further backtracking (7,8). Misincorporated and deeply backtracked ECs result in long-living pauses of transcription until resolved by intrinsic or factor-dependent cleavage. The paused ECs may cause collisions with replication, and cause RNAP traffic jams. (B) Native Elongating Transcripts sequencing (NET-seq) is a technique that involves sequencing of the 3΄ proximal parts of transcripts that are bound to transcribing RNAP. Shown is the scheme of the transcription EC, with positions in the transcript RNA (red) numbered from the 3΄ end. (C) The error rates at the 3΄ to –10 positions of the nascent RNAs of all active ECs with no filtering from S. cerevisiae (Sc; wild-type and ΔTFIIS mutant strains) and E. coli (Ec; independent data set for wild-type and ΔGre mutant strains). (D) The specific misincorporation rates at the 3΄, −1 and −2 positions for all ECs with no filtering from wild-type and mutant E. coli and S. cerevisiae strains.

An increase in transcription error rate has been linked to aging and various diseases (14–17), and may be a significant source of stochastic variability at the single-cell level (1,18). The generally accepted view is that the effects of lower transcriptional fidelity on cellular functions are caused by the mistakes in final RNA products, despite the error rate of mRNA translation being much higher than that of transcription. In contrast, the possible involvement of misincorporation-caused pausing is usually not considered, since it is thought to be a rare event. However, while overall fidelity can be reliably measured in vivo, the dynamics of the formation and resolution of misincorporated complexes, and their possible roles in the cell, remain obscure.

Native Elongating Transcripts sequencing (NET-seq) is a technique that involves sequencing of the 3΄ proximal parts of transcripts that are bound to transcribing RNAP, i.e. are actively elongating (19) (scheme in Figure 1B). These data provide a global snapshot of all transcription elongation complexes in the cell with precise identification of the 3΄ ends of the nascent RNAs. NET-seq has been used to study genome-wide pausing of transcription by identifying those genomic positions enriched with the 3΄ ends of the nascent RNAs (19–22). Further analysis of ECs using an RNase footprinting technique showed that ∼1% of backtracked ECs in an E. coli strain lacking Gre factors are misincorporated (20).

Here, we analyzed published NET-seq datasets of all (not limited to paused ECs) nascent elongating RNAs in E. coli and yeast strains with and without cleavage factors for possible errors at 3΄ positions (19,21). We found that a far higher proportion of all ECs (1–3% in wild-type and 5–7% in mutant strains) is represented by stalled misincorporated complexes than has been proposed. The analysis also suggests that, despite the sequence bias of misincorporation events, pausing per se may not be required for an increased rate of misincorporation, in contrast to what has been suggested earlier (20).

MATERIALS AND METHODS

Data sources

NET-seq data for Saccharomyces cerevisiae and E. coli, for both wild type and dst or greA/B deletion strains, and equivalent total RNA-seq data (Supplementary Table S1) (19,21), were downloaded from the National Center for Biotechnology Information's (NCBI) Gene Expression Omnibus (23) website and converted into fastq format using sratoolkit version 2.5.2 (http://www.ncbi.nlm.nih.gov/sra). The reference genome for E. coli (NC_000913.3) was also downloaded from NCBI (23), while the nuclear S. cerevisiae reference genome (S288C) was downloaded from the Saccharomyces Genome Database (24).

Quality control and pre-processing

Dataset quality was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) to check per-base and per-tile sequence quality. Where necessary, raw reads were adaptor trimmed as described in the original publications (19,21). In the case of the S. cerevisiae datasets, reads aligning to tRNAs, snoRNAs and rRNAs were removed as described by Weissman and colleagues (19). Reads from the EcWT dataset (21) with N bases at the 3΄ adjacent position were also excluded due to a systematic sequencing error at this position identified during quality control (Supplementary Figure S1).

Genome alignment

We employed a stringent genome alignment strategy in order to optimize the accuracy of the error rate calculation (Supplementary Figure S2). K-mer counts were performed using jellyfish at default settings over both strands (25). Reads were aligned to genomes using Bowtie (26), allowing two mismatched bases in a seed region of 14 nucleotides and reporting only unique alignments (-n 2 -l 14 -m 1).
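The role of the k-mer counts in choosing a seed length can be illustrated with a minimal Python sketch. It is not the jellyfish-based procedure used in the study: the function names and the toy sequence are hypothetical, and strand-independent counting is approximated here by collapsing each k-mer with its reverse complement, which is an assumption about what "over both strands" means.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Collapse a k-mer and its reverse complement into one representative."""
    return min(kmer, reverse_complement(kmer))


def unique_kmer_fraction(genome: str, k: int) -> float:
    """Fraction of distinct canonical k-mers that occur exactly once."""
    counts = Counter(
        canonical(genome[i:i + k]) for i in range(len(genome) - k + 1)
    )
    if not counts:
        return 0.0  # sequence shorter than k
    unique = sum(1 for n in counts.values() if n == 1)
    return unique / len(counts)


# Toy example (hypothetical sequence): longer k-mers are more often unique.
toy_genome = "ACGACGTT"
for k in (3, 8):
    print(k, unique_kmer_fraction(toy_genome, k))
```

Scanning k upward on a real genome and picking the shortest k at which nearly all k-mers are unique is one way to arrive at a seed length such as the 14 nucleotides used here.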

Error rates

Data analysis was carried out in R using the BioConductor seqTools (27) and IRanges (28) packages. Total error rates were calculated as the percentage of total reads with a mismatched base at each read position in the alignment, after excluding bases with a Phred quality score of <30 (Supplementary Figure S3) and mismatches involving ambiguous N bases. Specific error rates were calculated as the percentage of total reads with a specific mismatch, for example an A incorporated instead of a G (G>A misincorporation), at each read position, with the same Phred threshold of 30 and the same exclusion of mismatches involving ambiguous N bases.
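The per-position calculation described above can be sketched in Python as follows. This is an illustration only, not the R code used in the study: the input format (triples of read, reference and Phred scores, oriented so that the last base is the RNA 3΄ end), the function name, and the choice to use only bases that pass the filter as the denominator are all assumptions.

```python
from collections import defaultdict


def positional_error_rates(alignments, min_phred=30, max_offset=10):
    """Percentage of mismatched bases at each offset from the 3' end.

    alignments: iterable of (read, reference, phred_scores) of equal length,
    oriented 5'->3' so that index -1 is the 3' end (assumed input format).
    Offset 0 is the 3' end, offset 1 is position -1, and so on.
    """
    mismatches = defaultdict(int)
    totals = defaultdict(int)
    for read, ref, quals in alignments:
        for offset in range(min(max_offset + 1, len(read))):
            i = len(read) - 1 - offset
            # Skip low-quality calls and ambiguous bases.
            if quals[i] < min_phred or read[i] == "N" or ref[i] == "N":
                continue
            totals[offset] += 1
            if read[i] != ref[i]:
                mismatches[offset] += 1
    return {o: 100.0 * mismatches[o] / totals[o] for o in sorted(totals)}


# Toy data (hypothetical): one G>A mismatch at the 3' end, one low-quality
# mismatch at -1 that is filtered out, and one read ending in N.
toy = [
    ("ACGA", "ACGG", [40, 40, 40, 40]),
    ("ACGG", "ACGG", [40, 40, 40, 40]),
    ("ACTG", "ACGG", [40, 40, 10, 40]),
    ("ACGN", "ACGG", [40, 40, 40, 40]),
]
rates = positional_error_rates(toy)
print(rates[0], rates[1])
```

A specific error rate (for example G>A) follows from the same loop by counting only positions where the reference base is G and the read base is A.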

Statistical analysis

Experimental rates of false positives (matches classed as mismatches) and false negatives (mismatches classed as matches) were calculated based on the published error rates for the enzymes used in the reverse transcription (Primerscript (Clontech.com) or Superscript (29)) and PCR (PrimeStar Max (Clontech.com) or Phusion (https://www.neb.com/faq)) steps, and on a sequencing miscall rate of 1 in 1000, which corresponds to a Phred score threshold of 30 (30). In all cases there was assumed to be a two in three chance of a genuine mismatch remaining mismatched following an error. Accuracy of the error rates was then calculated as the percentage of all observed errors that were true positive mismatches.
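One possible reading of this calculation is sketched below in Python. It is an interpretation, not the authors' implementation: the single combined per-base error rate `experimental_error`, the function names and the example numbers are hypothetical, and the supplementary material of the study should be consulted for the exact procedure. The sketch assumes that an experimental error turns a matched base into a mismatch, and leaves a genuine mismatch mismatched with probability 2/3.

```python
def phred_to_error(q: float) -> float:
    """Convert a Phred quality score to a miscall probability."""
    return 10 ** (-q / 10)


def estimated_accuracy(observed_rate: float, experimental_error: float) -> float:
    """Fraction of observed mismatches that are genuine (true positives).

    Model (assumed): with true mismatch rate m and experimental error e,
      true positives  = m * (1 - e) + m * e * 2/3 = m * (1 - e/3)
      false positives = (1 - m) * e
      observed        = true positives + false positives
    Solving for m gives m = (observed - e) / (1 - 4e/3).
    """
    if observed_rate <= 0:
        raise ValueError("observed_rate must be positive")
    e = experimental_error
    m = max(0.0, (observed_rate - e) / (1 - 4 * e / 3))
    true_positives = m * (1 - e / 3)
    return true_positives / observed_rate


# Hypothetical inputs: a 3% observed mismatch rate and a combined error of
# the sequencing miscall rate at Phred 30 plus an assumed 2e-4 from enzymes.
combined_error = phred_to_error(30) + 2e-4
print(estimated_accuracy(0.03, combined_error))
```

Under this model the estimated accuracy falls quickly as the observed mismatch rate approaches the experimental error rate, which is the qualitative reason why a high 3΄ mismatch rate is less sensitive to library-preparation errors than the much lower rates at internal positions.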

Sequence analysis

Single base variations between the experimental strains and their reference genomes were identified using samtools and bcftools following the method of Li (31,32). The positions of specific misincorporations (for instance G>A) were then mapped to the reference genomes using the BioConductor seqTools (27) and IRanges (28) packages, excluding those at positions with identified mutations. Misincorporation hotspots for the EcΔGre and ScΔTFIIS datasets were defined as having >50 misincorporations. Sequence logos were created using the R seqLogo package (33). Generic Feature Format Version 3 files (GFF3) were downloaded from the NCBI website in order to identify protein coding regions (CDSs). Aligned locations were identified from the bowtie output using BEDTools (34), and the BioConductor seqTools (27) and IRanges (28) packages. Transcribed but non-translated regions (UTRs) in EcRNA were identified using Rockhopper (35), and S. cerevisiae S288C UTRs (36) were obtained from the Saccharomyces Genome Database (24). ‘Translated’ regions were defined as aligned locations within CDSs, and ‘transcribed non-translated’ regions as aligned locations within the UTRs. In S. cerevisiae, introns were also included in the ‘transcribed non-translated’ regions. Misincorporation rates were calculated for the EcΔGre and ScΔTFIIS genomic regions as misincorporated positions per 100 000 bp and as hotspots per 100 000 bp.
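The hotspot definition and the per-100 000 bp normalization can be illustrated with a short Python sketch. The function name, the input format (one genomic coordinate per misincorporated read) and the toy numbers are hypothetical; only the >50 threshold and the 100 000 bp scaling are taken from the text.

```python
from collections import Counter

HOTSPOT_THRESHOLD = 50  # hotspots have >50 misincorporations (from the text)


def summarize_region(misincorporation_positions, region_length_bp):
    """Count misincorporated positions and hotspots, scaled per 100 000 bp.

    misincorporation_positions: one genomic coordinate per misincorporated
    read that falls inside the region (assumed input format).
    """
    reads_per_position = Counter(misincorporation_positions)
    n_positions = len(reads_per_position)
    n_hotspots = sum(
        1 for n in reads_per_position.values() if n > HOTSPOT_THRESHOLD
    )
    scale = 100_000 / region_length_bp
    return {
        "positions": n_positions,
        "hotspots": n_hotspots,
        "positions_per_100000bp": n_positions * scale,
        "hotspots_per_100000bp": n_hotspots * scale,
    }


# Toy data (hypothetical): position 10 has 51 reads (a hotspot), position 20
# has exactly 50 (not a hotspot), position 30 has a single read.
toy_positions = [10] * 51 + [20] * 50 + [30]
summary = summarize_region(toy_positions, region_length_bp=200_000)
print(summary)
```

Applying the same function separately to the translated and the transcribed non-translated regions gives the two rates that are compared for each strain.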

RESULTS

We analyzed the NET-seq data for wild-type S. cerevisiae (ScWT), a mutant S. cerevisiae lacking the cleavage factor TFIIS (ScΔTFIIS) (19), wild-type E. coli (EcWT), and a mutant E. coli lacking cleavage factors GreA and GreB (EcΔGre) (21) (Supplementary Table S1). To enable high accuracy error rate calculation, the data were subjected to extensive bioinformatic pre-processing prior to alignment to reference genomes (Supplementary Figures S1 and S2). Equivalent mRNA-seq data (conventional sequencing of total RNA (19,21), available for the ScWT (ScRNA) and EcWT (EcRNA) datasets) were analyzed in parallel as a control for possible mistakes during library preparation and sequencing, and to account for differences between the laboratory strains and their reference genome sequences. We compared total error rates from the 3΄ end to position −10 of the aligned transcripts, which approximately corresponds to the length of the RNA/DNA hybrid within the EC (Figure 1B). Surprisingly, we found that the very 3΄ position of the nascent RNAs carried a large number of erroneous nucleotides in all strains (Figure 1C). The error rate at positions −1 to −10 was far lower and comparable to the overall error rate in the total RNA-seq data (Supplementary Figure S4, Table S2, see below regarding −1 and −2 positions of yeast data), indicating that 3΄ mismatched reads in the alignment represent the misincorporated ECs. The proportions of misincorporated ECs in EcWT and ScWT were approximately 3% and 1% of all ECs, respectively (Figure 1C). In mutant strains lacking cleavage factors, EcΔGre and ScΔTFIIS, the proportions of misincorporated ECs were ∼5% and 7%, respectively (Figure 1C). Although consistent with the ability of cleavage factors to proofread misincorporation events, the proportions of misincorporated ECs in WT and mutant strains were much higher than one would expect given the error rate of synthesis by RNAP: 10⁻³–10⁻⁶ (5,6,37,38). The proportion of misincorporated ECs was also far higher than could be proposed based on the assumption that 1% of backtracked ECs are misincorporated (20).

The pattern of specific 3΄ misincorporations was similar for all datasets with a strong bias toward G>A misincorporations (Figure 1D), consistent with previous observations in vitro (2,5,38). There were several G>A misincorporation hotspots, i.e. positions where misincorporation happened frequently (>50 reads per location). For these hotspots in the mutant E. coli and S. cerevisiae there was a clear bias toward C preceding the position of the G>A misincorporation (Figure 2A), consistent with previous observations on final RNA products (2). The sequence bias in hotspots is also consistent with the earlier finding that CG motifs increase G>A misincorporations (20) (although we do not observe coincidence of misincorporation hotspots and pausing at the −1 position), and suggests that this mechanism is conserved between bacteria and eukaryotes. However, by far the largest number of G>A misincorporation events occurred away from hotspots and were represented by only one or a few reads per location, suggesting that G>A misincorporation is a largely random event. When all misincorporation events were taken into account, the bias toward C preceding G>A misincorporation decreased in E. coli and disappeared in S. cerevisiae (Figure 2A), indicating that formation of many misincorporated ECs is not restricted to the CG motifs or pause sites.
Figure 2.

Sequence logos for the specific misincorporations (T of the read corresponds to U in the RNA). (A) The sequences surrounding the G>A misincorporations in the EcΔGre and ScΔTFIIS strains. (B) The sequences surrounding the C>A and U>A hotspots for the ScΔTFIIS strain. (C) The sequences surrounding the U>C misincorporations at the 3΄, −1 and −2 positions in ScΔTFIIS.

S. cerevisiae also showed considerable U>A and C>A misincorporations at 3΄ ends (Figure 1D), though with less obvious sequence bias in the surrounding sequence of hotspots (Figure 2B). The error rates in the S. cerevisiae datasets were slightly elevated at positions −1 and −2, with bias toward U>C mismatch in both positions (Figure 1D). The sequences downstream of the −1 and −2 U>C misincorporations were slightly biased toward Us, suggesting that some misalignment of the template may favor read-through of misincorporated C by one or two positions (Figure 2C). It is also possible that U>C errors in the −1 and −2 positions of the transcript may facilitate pausing or backtracking of the EC, thus resulting in the accumulation of these mistakes in nascent transcripts. Another unexpected observation from the S. cerevisiae misincorporated ECs was a bias toward Ts at the +1 and +2 positions in the non-template strand (irrespective of the misincorporated NMP; Figure 2C). While this bias could be due to a complex structural effect, it is also possible that this sequence diminishes the selection of correct NTPs by the template strand downstream of the active center proposed for eukaryotic RNAPs (39).

We found that in protein-coding sequences (ORFs) of E. coli, the G>A misincorporation hotspots were far less abundant than in the transcribed untranslated regions, with 1.34 and 10.68 hotspots per 0.1 Mb, respectively (Table 1). Other (non-hotspot) G>A misincorporated ECs were distributed evenly. No difference in the distribution of hotspots or other misincorporated complexes between coding and non-coding regions was seen in S. cerevisiae, suggesting the existence of a bacteria-specific mechanism to minimize formation of misincorporated ECs in protein coding sequences.
Table 1.

Distribution of G>A misincorporations and hotspots

Dataset    Type      # locations   # ECs     Translated                     Transcribed non-translated
                                             Length (bp)   mm/100 000 bp    Length (bp)   mm/100 000 bp
ScΔTFIIS   All       197947        361319    7173492       1848.14          1241209       1326.21
           Hotspot   40            4939                    0.22                           0.24
EcΔGre     All       199307        519122    3871814       4356.2           140405        4352.41
           Hotspot   223           35023                   1.34                           10.68

The number of G>A misincorporation (mm) positions and hotspots (G>A hotspots were defined as having >50 misincorporations) in the deletion mutants, and the misincorporation rates in the translated regions in comparison to the transcribed non-translated regions. Transcribed translated regions were defined as aligned locations within protein coding sequences, while transcribed non-translated regions as aligned locations within the untranslated regions. In S. cerevisiae introns were also included in the transcribed non-translated regions.


DISCUSSION

Our bespoke pipeline was designed to optimize the accuracy of the error rate calculation while minimizing the loss of data and, consequently, the observed effects are likely to be biological, rather than experimental (library preparation, sequencing and data processing) in origin, for a number of reasons:

Although the reverse transcription step is known to introduce errors, since these enzymes do not proofread (40), and have estimated error rates on the order of 10⁻⁴ to 10⁻⁵ (41), the 3΄ error rate on the order of 10⁻¹ to 10⁻² observed here would not be affected by the documented error rates of the enzymes used in library preparation. In fact, error rates at the 3΄ position were estimated to be >70% non-experimental, while error rates in the −1 to −10 positions had far lower estimated accuracy (Supplementary Table S2). In addition, 3΄ error rates were significantly higher than error rates at randomly-generated 3΄ ends of the total RNA-seq controls (libraries originated from random alkaline fragmentation of total RNA) (Supplementary Figure S4).

The sequence specificity of misincorporation (G>A) was consistent with the bias observed in mature RNAs (2), and in nascent RNAs from a different study (20) (Figure 2A), and was different from the specificity of the errors in randomly-generated 3΄ ends of the control total RNA-seq datasets (libraries originated from random alkaline fragmentation of total RNA) (Figure 1D).

The observed difference in the 3΄ end error rate between WT and strains lacking proofreading factors would not be expected if these were experimentally-derived errors.

Assessment of dataset quality using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) indicated sufficient quality to reliably quantify transcriptional fidelity using these data in the majority of cases, although some reads were omitted from the error rate calculations to ensure accuracy (Supplementary Figure S1).
Alignment was carried out allowing two mismatches within a seed region of 14 nucleotides, chosen to minimize seed length while ensuring k-mer uniqueness (Supplementary Figure S2). Alteration of the number of allowed mismatches had little effect on the alignments or observed error rate (Supplementary Figure S3A, Tables S3 and S4).

In order to reduce the effect of sequencing miscalls, a Phred threshold of 30, equivalent to a 99.9% base call accuracy rate (30), was applied at each position (Supplementary Figure S3B). Alteration of the Phred threshold also had virtually no effect on the observed error rates (Supplementary Figure S3C).

Positions with single base pair variations between experimental strain and reference genome were excluded from the positional analysis to ensure misincorporation rates were not inflated by genuine mutations.

An earlier study in E. coli analyzed 3΄ errors in NET-seq reads of particular lengths (14–18 nucleotides), representing ECs in different translocation states (20). This study found an unusually high rate (0.8%) of misincorporation associated with backtracked ECs in the ΔGre strain, which approximates to a misincorporation rate of <0.5% of total ECs in the cell (taking into account the distribution of misincorporations between translocation states, and possible underestimation of the proportion of non-backtracked ECs). Using the NET-seq data from two different studies (19,21), we report an at least 10-fold higher abundance of misincorporated ECs in the mutant E. coli and S. cerevisiae strains lacking the proofreading factors. This difference in the proportion of misincorporated ECs can be explained by differences in the EC isolation protocols between the studies. The DNA digestion preceding EC isolation was performed on ice in the studies by Larson et al. and Churchman et al. (19,21) (the data analyzed by us), which would slow down all the reactions of RNAP. In the study by Imashimizu et al. (20), this digestion procedure was performed at room temperature. We suggest that, at room temperature, the high concentrations of Mn2+ required by DNase I facilitated the intrinsic proofreading activity of RNAP, leading to lower proportions of misincorporated ECs. This variation in the preparation of NET-seq ECs has also likely caused the striking difference in the proportions of the misincorporated ECs observed in the WT strains of these studies; while our analysis of the Larson et al. and Churchman et al. (19,21) data showed 1–3% of all complexes as misincorporated, Imashimizu et al. (20) reported ∼0.1%. We suggest that the difference was caused by ongoing Gre-dependent proofreading during EC isolation in the work by Imashimizu et al.

Based on their analysis of misincorporation within the subset of sequence-dependent pauses, Imashimizu et al. (20) suggested that, during sequence-dependent pausing, a C−1 increases the rate of G>A misincorporation at the following position (no sequence analysis was presented for non-paused ECs). This observation cannot exclude that misincorporation is induced merely by the C−1G+1 sequence, without involvement of a pause. Our analysis of all ECs did reveal several misincorporation hotspots with a clear bias to C preceding G>A misincorporation in both E. coli and S. cerevisiae. However, we did not observe any strong pausing at the position preceding misincorporation (−1 position; not shown), suggesting that these hotspots may occur at the misincorporation-inducing elements (C−1G+1) but without involvement of pausing. Most of the misincorporations we observed were singular events (represented by one or a few reads, Figure 3), indicating that they were not formed on particular misincorporation-inducing sequences, but happen more randomly. Furthermore, we observed lower (E. coli) or no (S. cerevisiae) sequence bias around the G>A misincorporations in the non-hotspot misincorporated ECs (Figure 2A), which, thus, sequence-wise, also appear to be distributed more randomly (although in E. coli G>A misincorporation is somewhat favored at C−1G+1). Taken together, our results suggest an unusually high abundance of randomly distributed misincorporated ECs, and that misincorporation events may not necessarily be determined by the formation of a pause at a C−1G+1 sequence prior to the misincorporation.
Figure 3.

The number of misincorporated ECs (A) and misincorporation positions (B) as the threshold for misincorporations per position is increased. The vast majority of misincorporations occur at positions with a single misincorporation event.

The sequence-bias toward G>A misincorporation, misincorporation hotspots at CG sequences, and the overall proportion of misincorporated ECs, were similar in E. coli and S. cerevisiae mutant and wild-type strains. This similarity indicates that the proposed pausing and misincorporation induced by CG sequences, and the overall mechanism of misincorporation and proofreading, are conserved between bacteria and eukaryotes. Misincorporation of A was more frequent than any other nucleotide, irrespective of the base in the template strand (Figure 1D). This observation could be explained by the presence of some structural or chemical determinants in the RNAP active site, which make binding of the non-complementary ATP less dependent on the template base and/or facilitate binding in the conformation favorable for misincorporation. Interestingly, the rate of the removal of the erroneous AMP (via transcript assisted second phosphodiester bond hydrolysis) is significantly faster than that for other mistakes (8,42). It is tempting to speculate that this propensity has evolved to counteract more frequent misincorporations of AMP. Despite similarities in the mechanisms of misincorporation, the distribution of the misincorporation hotspots within the genome was different for E. coli and S. cerevisiae (Table 1). Hotspots were clearly reduced in protein-coding (1.34 hotspots per 0.1 Mb) versus transcribed but untranslated regions (10.68 hotspots per 0.1 Mb) in E. coli. It is possible that ORFs may have evolved to minimize error-prone sequences (hotspots) to reduce the formation of incorrect proteins. However, the absence of such bias in S. cerevisiae does not support this idea (Table 1). 
Instead, bacteria may have minimized the hotspots for formation of misincorporated ECs in the protein coding regions to reduce their interference with coupled translation, which would not be required in eukaryotes where transcription and translation are uncoupled. This idea is supported by the finding that sequence-specific pausing is also enriched in 5΄ untranslated regions (20). It is tempting to speculate that bacteria may have an as yet unrecognized mechanism(s) that decreases the formation of misincorporated ECs, or facilitates correcting/overcoming them, more efficiently at the error-prone sequences of ORFs than at other sequences and in the untranslated regions. One speculative possibility is that translation itself suppresses misincorporation events, although this hypothesis requires further investigation.

Misincorporation at some sequences has been shown to be much faster (∼10 times) than on random sequences (2). However, even these reported rates of misincorporation cannot account for the observed proportion of misincorporated ECs. Since misincorporation leads to stable backtracking (7,8), the observed proportion of misincorporated complexes is likely to be a result of their accumulation due to their inefficient resolution, even in the WT strains. In this scenario, misincorporated complexes accumulate relatively slowly, but are also slowly resolved. Importantly, the error rate in the mature RNA products would not be changed, since misincorporated ECs are not productive in formation of a mature RNA until they are proofread.

The high proportion of misincorporated ECs suggests that they could be a major source of strong pauses in the cell, and, thus, the main cause of conflicts with fellow RNAPs and replication complexes. This hypothesis is supported by the recent findings that DksA, which participates in prevention of collisions between transcription and replication (43), increases the accuracy of RNA synthesis, i.e. decreases misincorporation events (44,45). It is also possible that random transient misincorporation pausing could be used by bacteria to help couple transcription and translation, by slowing down the former.

The very high proportion of misincorporated ECs observed here implies that one of the major roles of TFIIS and Gre factors is the resolution of misincorporation events. Since the contribution of cleavage factors to the correctness of the final RNA product is modest (2,4), our results suggest that the main function of the Gre and TFIIS factors is to physically resolve stalling of misincorporated ECs, rather than to correct the RNA sequence per se. Indeed, a significant degree of cell filamentation, often accompanied by a diffuse nucleoid morphology, was observed in an E. coli mutant lacking Gre and DksA factors, indicative of problems with replication and/or chromosome segregation (not shown). Consistently, similar defects were observed in a ΔgreA mutant of S. pneumoniae, which has only one Gre factor and has no DksA (11). Notably, factors involved in the repair of collapsed replication forks or double-strand breaks become essential in a ΔgreA ΔgreB background (46). The mechanisms by which the cells deal with a high proportion of misincorporated ECs in the absence of the fidelity factors are a subject for future studies.

Stochastic fluctuations in protein expression, often referred to as noise, can cause significant phenotypic heterogeneity in isogenic cell populations and are essential for the activation of bimodal genetic switches that result in alternative expression states. It has been proposed that transcription errors could be a cause of such noise by leading to the production of non-functional regulatory proteins (18). However, our findings suggest that the random formation of misincorporated stalled ECs could also produce considerable noise by physically blocking transcription of regulatory genes. 
Similarly, it is possible that cellular defects and diseases linked to the fidelity of transcription (1,14–18) could, in fact, be caused by the accumulation of misincorporated ECs and subsequent conflicts with other molecular mechanisms, rather than by errors in the final RNA products per se.
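
The accumulation argument made in the Discussion (misincorporated complexes form slowly but are also resolved slowly) can be illustrated with a two-state steady-state calculation. The sketch below is not a model from this study. It assumes that each EC switches between an active and a misincorporated state with first-order rates `k_mis` and `k_res` (hypothetical names), ignores initiation and termination, and uses an illustrative elongation speed of 50 nt/s together with illustrative per-nucleotide misincorporation probabilities. Under those assumptions the misincorporated fraction is k_mis / (k_mis + k_res).

```python
def steady_state_fraction(k_mis: float, k_res: float) -> float:
    """Steady-state fraction of ECs in the misincorporated state.

    Two-state model (an assumption, not taken from the paper):
        active --k_mis--> misincorporated --k_res--> active
    Both rates are in events per second per complex.
    """
    if k_mis < 0 or k_res <= 0:
        raise ValueError("k_mis must be >= 0 and k_res must be > 0")
    return k_mis / (k_mis + k_res)


def resolution_rate_for_fraction(k_mis: float, fraction: float) -> float:
    """Resolution rate k_res that yields a given steady-state fraction."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    return k_mis * (1 - fraction) / fraction


if __name__ == "__main__":
    elongation_nt_per_s = 50.0  # assumed illustrative value
    # Assumed per-nucleotide misincorporation probabilities (illustrative).
    for p_mis in (1e-3, 1e-4, 1e-5):
        k_mis = p_mis * elongation_nt_per_s
        # 1-3% is the wild-type range of misincorporated ECs quoted in the abstract.
        for observed in (0.01, 0.03):
            k_res = resolution_rate_for_fraction(k_mis, observed)
            lifetime_s = 1.0 / k_res  # mean lifetime of a misincorporated EC
            print(f"p_mis={p_mis:g} fraction={observed:.2f} "
                  f"mean lifetime={lifetime_s:.2f} s")
```

With these assumed inputs, a 1–3% misincorporated fraction requires misincorporated complexes to persist for much longer than the time spent adding one nucleotide. Slower resolution, as would be expected without Gre or TFIIS, raises the fraction without any change in the misincorporation rate itself.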
  43 in total

1.  RNA polymerase modulators and DNA repair activities resolve conflicts between DNA replication and transcription.

Authors:  Brigitte W Trautinger; Razieh P Jaktaji; Ekaterina Rusakova; Robert G Lloyd
Journal:  Mol Cell       Date:  2005-07-22       Impact factor: 17.970

2.  Nonrandom variations in human cancer ESTs indicate that mRNA heterogeneity increases during carcinogenesis.

Authors:  Marie Brulliard; Dalia Lorphelin; Olivier Collignon; Walter Lorphelin; Benoit Thouvenot; Emmanuel Gothié; Sandrine Jacquenet; Virginie Ogier; Olivier Roitel; Jean-Marie Monnez; Pierre Vallois; Frances T Yen; Olivier Poch; Marc Guenneugues; Gilles Karcher; Pierre Oudet; Bernard E Bihain
Journal:  Proc Natl Acad Sci U S A       Date:  2007-04-23       Impact factor: 11.205

3.  The transcription factor DksA prevents conflicts between DNA replication and transcription machinery.

Authors:  Ashley K Tehranchi; Matthew D Blankschien; Yan Zhang; Jennifer A Halliday; Anjana Srivatsan; Jia Peng; Christophe Herman; Jue D Wang
Journal:  Cell       Date:  2010-05-14       Impact factor: 41.582

4.  Structural basis of transcription: mismatch-specific fidelity mechanisms and paused RNA polymerase II with frayed RNA.

Authors:  Jasmin F Sydow; Florian Brueckner; Alan C M Cheung; Gerke E Damsma; Stefan Dengl; Elisabeth Lehmann; Dmitry Vassylyev; Patrick Cramer
Journal:  Mol Cell       Date:  2009-06-26       Impact factor: 17.970

5.  A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data.

Authors:  Heng Li
Journal:  Bioinformatics       Date:  2011-09-08       Impact factor: 6.937

6.  The accuracy of reverse transcriptase from HIV-1.

Authors:  J D Roberts; K Bebenek; T A Kunkel
Journal:  Science       Date:  1988-11-25       Impact factor: 47.728

7.  Use of an in vivo reporter assay to test for transcriptional and translational fidelity in yeast.

Authors:  Randal J Shaw; Nicholas D Bonawitz; Daniel Reines
Journal:  J Biol Chem       Date:  2002-05-02       Impact factor: 5.157

8.  Rep and PriA helicase activities prevent RecA from provoking unnecessary recombination during replication fork repair.

Authors:  Akeel A Mahdi; Carol Buckman; Lynda Harris; Robert G Lloyd
Journal:  Genes Dev       Date:  2006-08-01       Impact factor: 11.361

9.  Mutation rates and intrinsic fidelity of retroviral reverse transcriptases.

Authors:  Luis Menéndez-Arias
Journal:  Viruses       Date:  2009-12-04       Impact factor: 5.048

10.  Bacterial global regulators DksA/ppGpp increase fidelity of transcription.

Authors:  Mohammad Roghanian; Nikolay Zenkin; Yulia Yuzenkova
Journal:  Nucleic Acids Res       Date:  2015-01-20       Impact factor: 16.971

