Literature DB >> 29900693

Genome-wide screening of potential RNase Y-processed mRNAs in the M49 serotype Streptococcus pyogenes NZ131.

Zhiyun Chen¹, Rahul Raghavan², Fengxia Qi¹, Justin Merritt^3,4, Jens Kreth³.

Abstract

RNase Y is a major endoribonuclease in Group A streptococcus (GAS) and other Gram-positive bacteria. Our previous study showed that RNase Y was involved in mRNA degradation and processing in GAS. We hypothesized that mRNA processing regulated the expression of important GAS virulence factors via altering their mRNA stabilities and that RNase Y mediated at least some of the mRNA-processing events. The aims of this study were to (1) identify mRNAs that were processed by RNase Y and (2) confirm the mRNA-processing events. The transcriptomes of Streptococcus pyogenes NZ131 wild type and its RNase Y mutant (Δrny) were examined with RNA-seq. The data were further analyzed to define GAS operons. The mRNA stabilities of the wild type and Δrny at subgene level were determined with tiling array analysis. Operons displaying segmental stability in the wild type but not in the Δrny were predicted to be RNase Y processed. Overall 865 operons were defined and their boundaries predicted. Further analysis narrowed down 15 mRNAs potentially processed by RNase Y. A selection of four candidates including folC1 (folylpolyglutamate synthetase), prtF (fibronectin-binding protein), speG (streptococcal exotoxin G), ropB (transcriptional regulator of speB), and ypaA (riboflavin transporter) mRNAs was examined with Northern blot analysis. However, only folC1 was confirmed to be processed, but it is unlikely that RNase Y is responsible. We conclude that GAS use RNase Y to selectively process mRNA, but the overall impact is confined to selected virulence factors.

Entities: CellLine Chemical Disease Species

Keywords: zzm321990Streptococcus pyogeneszzm321990; RNA processing; RNase Y; genomics

Mesh：

Substances：

Year: 2018 PMID： 29900693 PMCID： PMC6460267 DOI： 10.1002/mbo3.671

Source DB: PubMed Journal: Microbiologyopen ISSN： 2045-8827 Impact factor: 3.139

INTRODUCTION

RNA processing, also known as RNA maturation or posttranscriptional modification, is a process that converts primary transcript RNA into mature RNA. RNA maturation is common in eukaryotes requiring extensive processing of the primary transcripts (Lodish et al., 2000; Nazar, 2004). Prokaryotic RNA processing has been reported mainly in stable RNAs belonging to the protein synthesis machinery (rRNAs and tRNAs), but rarely in mRNAs or small RNAs (Condon, 2007; Deutscher, 2006). However, emerging evidence indicates that bacterial RNA processing also takes place in mRNAs and small RNAs and plays an important role in gene regulation. Processing of a polycistronic mRNA leads to differential stability of individual cistrons, which allows cistrons in the same operon to be expressed at different levels. Known examples in Bacillus subtilis include the dnaK operon, the gapA operon, and the ilv‐leu operon (Bechhofer, 2009). This endonucleolytic processing alters the stability of individual cistrons residing in these polycistronic mRNAs to reduce the stability of segments encoding proteins needed in lower abundance and increasing the stability of segments encoding proteins needed in higher abundance (Ludwig, Rebhan, Blencke, Merzbacher, & Stulke, 2002; Ludwig et al., 2001; Meinken, Blencke, Ludwig, & Stulke, 2003). RNA processing is also necessary for maturation and functional activity of some noncoding small RNAs. For example, Streptococcus pyogenes CRISPR RNA, which is important in the interference with foreign genetic elements like plasmids and phages, must be cleaved within the repeat sequences by RNase III to generate the active spacer RNA (Deltcheva et al., 2011). Processing in the untranslated region (UTR) of a monocistronic mRNA can change the stability of the mRNA molecule, making the processed transcript more or less stable than the primary transcript (Condon & Bechhofer, 2011). Previously, we have investigated the posttranscriptional regulation of speB, a gene that encodes an extracellular protease secreted by S. pyogenes in late exponential and stationary phases. Our transcript analyses demonstrated that the primary speB transcript was modified via endonuclease cleavage in the 5’ UTR, and that the processed speB transcript was threefold more stable than the primary transcript. The ropB mRNA, which encodes a transcriptional activator of speB and is divergently transcribed from the same intergenic region as speB, is also modified in the 5’ UTR in a similar manner and becomes more stable after the modification (Chen, Itzek, Malke, Ferretti, & Kreth, 2013). Interestingly, both processing events were mediated by RNase Y, a major S. pyogenes endoribonuclease (Chen, Itzek, Malke, Ferretti, & Kreth, 2012; Chen et al., 2013). These observations led us to suspect that mRNA processing might occur more frequently in S. pyogenes than was generally assumed and that RNase Y might play a major role in this process. Here, we used a combination of RNA‐seq, tiling microarray and Northern blot analysis to identify processed mRNAs in the S. pyogenes genome and to further investigate the role of RNase Y in mRNA processing. RNA‐seq was used to predict S. pyogenes operons and to define operon boundaries with high resolution. Tiling arrays were then used to calculate mRNA half‐lives across entire transcripts to identify those with altered segmental stabilities (regions of different RNA stability relative to the rest of the transcript). Transcripts displaying altered segmental stabilities in the wild type (WT) but not in the RNase Y mutant were identified as candidate RNase Y‐processed mRNAs. Selected transcripts were then further examined to confirm RNA processing and to determine its effect upon protein production.

METHODS

Bacterial strains and growth conditions

Streptococcus pyogenes NZ131 (M49) (McShan et al., 2008) and its derivatives were routinely grown in C medium (0.5% proteose peptone #3; 1.5% yeast extract; 10 mM K2HPO4; 0.4 mM MgSO4; 17 mM NaCl) (Lyon, Gibson, & Caparon, 1998) at 37°C without aeration. Erythromycin and spectinomycin were added at a final concentration of 2 μg/ml and 150 μg/ml when required. The deletion of rny was accomplished by an overlap extension PCR technique replacing rny with an erythromycin resistance cassette. The RNase Y gene deleted strain (Δrny), its complement strain (Δrny_pDL278::rny), and Δrny carrying an empty vector (Δrny_pDL278) were reported in a previous study (Chen et al., 2012).

RNA‐seq sample preparation and sequencing

RNA extraction and purification were carried out by following a previously described protocol (Ajdic & Pham, 2007; Chen et al., 2012). Briefly, total RNA was extracted from bacterial samples with TRIzol reagent (Invitrogen, Carlsbad, CA) and was treated twice with Turbo DNase (Invitrogen) to remove chromosomal DNA. Ribosomal RNAs were removed from the total RNA by using a bacteria‐specific rRNA‐depletion kit (Ribo‐Zero, Illumina). RNA integrity before and after the treatment was determined on an RNA Nano LabChip (Agilent). RNA samples were sent to the Oklahoma Medical Research Foundation (OMRF, Oklahoma City, OK) for cDNA library construction using TruSeq RNA Sample Preparation v2 Protocol (Illumina) and DNA sequencing (100 bp paired‐end) on Illumina HiSeq 2000.

Alignment of sequencing reads to genome

Streptococcus pyogenes NZ131 complete genome was obtained from NCBI (accession No. CP000829). Bowtie2 (Langmead & Salzberg, 2012) was used to generate the S. pyogenes genome index and to align paired‐end RNA‐seq data using “very‐sensitive‐local” settings. The aligned files were sorted, converted to BAM format, and indexed using Samtools (Li et al., 2009). The locations of genes and tRNAs were determined by aligning corresponding FASTA files to the S. pyogenes genome and converting the aligned sequences into BED format. Percentage of unambiguously mapped reads was determined with CLC Genomics Workbench (Qiagen).

Reads coverage depth calculation

Reads coverage depth of each base on the chromosome (called “base read” in the following sections) was computed with BEDTools using the “genomeCoverageBed” function (Quinlan & Hall, 2010). The base reads were log2 transformed to reduce in‐gene variation. To calculate the expression level of a particular sequence, the log‐transformed base reads in that sequence were added, averaged, and exponentially transformed.

Operon prediction

We used the following criteria to determine whether two consecutive genes belonged to the same operon: (i) they were no more than 100 bp apart on the chromosome; (ii) genes were encoded on the same strand; (iii) their expression levels were no more than fourfold different; and (iv) their intergenic region did not contain a poorly expressed domain (i.e., its expression level was no more than half of the least expressed gene). A Python‐based script was developed to group S. pyogenes genes into operons based on these criteria. The Python code has been uploaded and can be accessed by the following link: https://github.com/ZhiyunChen/RNA_Processing.

Validation of operon prediction

The accuracy of the operon predictions was evaluated by comparing them to the Prokaryotic Operon DataBase (ProOpDB), a computer‐based operon prediction tool (Taboada, Ciria, Martinez‐Guerrero, & Merino, 2012).

Operon boundary identification

We assumed that base reads within a gene transcript should be stable but would drastically increase or decrease at the border. Based on this assumption, we used the coefficient of variation (CV) of base reads as a key indicator for transcription boundary identification. To identify the transcriptional start site (TSS) of an operon, we first defined a “search region” that covered the translational start site of the first gene in the operon and its surrounding sequences. The “search region” included the first half portion of the gene’s ORF (or the first 200 base, whichever was shorter) and extended upstream into its intergenic region until it reached the site with the lowest base read in the intergenic region. CV of all base reads in the “search region” was calculated and used as a reference (refCV). Then, a 25‐bp window was created and slid through the “search region” with a one‐base step size. CV of base reads covered by the sliding window was determined after each step. The window location displaying the maximum CV (maxCV) was considered to contain the TSS if maxCV was twofold higher than refCV. TSS was defined as the site with the lowest base read in that window. The transcriptional stop site of an operon was defined in a similar manner. To unify operon boundaries obtained from different RNA‐seq datasets, the boundary that gave the shortest UTR was selected and was regarded as the “real” boundary of that operon. The procedures mentioned above were implemented with Python scripts.

mRNA half‐life calculation at a subgene level

In a previous study, we used Affymetrix microarray analysis to determine S. pyogenes mRNA half‐lives at a single gene level (Chen et al., 2013). Since Affymetrix used multiple 25‐base oligonucleotide probes to detect one gene at different sites, it was possible to determine mRNA half‐lives at a subgene level. The chip was designed to detect the expression of each ORF with a set of 17 probes. It also detected each intergenic region with a set of tiled probes, that is, the probes were spaced at a 14‐ to 23‐base distance so that their ends overlapped. Affymetrix Power Tools were used to extract raw probe hybridization signals from microarray CEL files (GEO accession number: GSE40198). mRNA decay rate at each probe‐binding site was calculated by using a “steepest slope” method (Chen et al., 2013). The overall mRNA half‐life of an operon was obtained as an averaged half‐life of all probe‐binding sites that belonged to the operon.

RNase Y‐processed mRNA identification

We reasoned that an unprocessed transcript would have a uniform decay rate throughout its sequence in both WT and ∆rny strains, but a transcript that is processed by RNase Y would display altered segmental stability in WT in comparison to the ∆rny strain (Mader, Hennig, Hecker, & Homuth, 2004). We screened the S. pyogenes transcriptome for operons displaying segmental RNA stabilities. An operon was considered to represent a potentially processed transcript if it contained a domain in which its average half‐life was twofold higher or lower than the overall operon half‐life. To reduce false positives, we mandatorily defined that the domain must contain at least two consecutive probe‐binding sites. Two lists of processed RNA candidates in the WT and ∆rny were generated, respectively. The lists were compared to find operons that were present in the WT but absent in the ∆rny. These operons were likely processed by RNase Y specifically. The procedures mentioned above were implemented with Python scripts.

Northern blot, real‐time PCR, and 5’ RACE analyses

Northern blot, real‐time‐PCR, and 5’ RACE analyses were carried out as described previously (Chen et al., 2012, 2013). Additionally, for Northern blot analysis of the FMN riboswitch 5S rRNA was used as loading control to account for the small size of the transcript. The Northern blot probe sequences are listed in Table 1.

Table 1

Characteristics of Northern blot probes

Name	Sequence	Gene locus	Orientation on chromosome	Start point	Amplicon (bp)
folC1_1F	AAGGGAGTGCATATCGTTGG	Spy49_0851	Forward	851069	434
folC1_1R(T7)	taa tac gac tca cta tag gg GCGGCAATATCAGCGTAAGT	Spy49_0851	Reverse	851502
folC1_2F	cacatcgctatgccaggac	Spy49_0851	Forward	851696	400
folC1_2R(T7)	taa tac gac tca cta tag gg tgaaatcagcaaccctacca	Spy49_0851	Reverse	852095
speG_5’UTR_1L	gcaacacttgtgcgtgaagt	Spy49_ig0084	Forward	194721	152
speG_5’UTR_1R(T7)	taa tac gac tca cta tag gg cctaaataaaaatatcaatcggtttca	Spy49_ig0084	Reverse	194872
speG_1F	ACCCCATGCGATTATGAAAA	Spy49_0187	Forward	195031	310
speG_1R(T7)	taa tac gac tca cta tag gg GAACAACCTCAGAGGGCAAA	Spy49_0187	Reverse	195340
ypaA_LD_1L	cgagcgcaagctgatgtg	Spy49_ig0086	Forward	319649	120
ypaA_LD_1R(T7)	taa tac gac tca cta tag gg aaattgaaagaagttccgtcgt	Spy49_ig0086	Reverse	319768
ypaA_1F	ATTATTCCAGGCGCAGCTTT	Spy49_0307	Forward	319870	426
ypaA_1R(T7)	taa tac gac tca cta tag gg GCAAATATCAACCCCCTCAA	Spy49_0307	Reverse	320295
prtF_5’UTR_1L	TTT GAC AGT TGT CCT GTA GTC TTT	Spy49_ig0045	Forward	125913	79
prtF_5’UTR_1R(T7)	TAA TAC GAC TCA CTA TAG GGT TGT GTC ATT TAT TTT CTC TCT CCA	Spy49_ig0045	Reverse	125991
prtF_1F	GCTTCCGCTAGAATCAGGTG	Spy49_0119	Forward	128524	475
prtF_1R(T7)	taa tac gac tca cta tag gg TTGCTCGTCTGGAAGCTTTT	Spy49_0119	Reverse	128998

Characteristics of Northern blot probes

Protein sample preparation and Western blot analysis

Streptococcus pyogenes whole‐cell lysate was prepared by incubating 108 PBS‐washed cells with 2 μg streptococcal phage lysin (PlyC, kindly provided by Dr. Vincent Fischetti, Rockefeller University, New York) (Nelson, Schuch, Chahales, Zhu, & Fischetti, 2006) in a 0.1 ml reaction mix at 37°C for 10 min or until the cells completely lysed. Extracellular protein was prepared by following a TCA precipitation protocol. Briefly, 10 ml S. pyogenes culture was centrifuged at 5,000 for 10 min and filtered though a 0.22 μm syringe filter (VWR). Clarified culture supernatant, after chilled on ice, was mixed with 10% volume ice cold trichloroacetic acid (TCA, 100% in acetone) and was incubated at 4°C for 3 hr. The mixture was centrifuged at 20,000 at 4°C for 10 min. The pellet was washed with 1 ml ice‐cold acetone for three times to remove residual TCA. Washed pellet was briefly air‐dried and suspended in 0.1 ml H2O. Western blot analysis was carried out by following a previously described protocol (Kreth, Chen, Ferretti, & Malke, 2011).

Accession numbers

The RNA‐seq sequencing and Affymetrix microarray data have been deposited in the GEO database under accession numbers: GSE40198 and GSE99533.

RESULTS

Transcription map generation with RNA‐seq

mRNA processing often occurs in the 5’ UTR of a transcript (e.g., speB and ropB in S. pyogenes) or between ORFs of a polycistronic operon (e.g., the gapA and dnaK operons in B. subtilis). In order to detect these processing events, it is first necessary to have a transcriptome map that identifies the boundary of each transcript as well as the organization of each polycistronic operon. We used RNA‐seq analysis to generate a transcriptome map of nephritogenic S. pyogenes NZ131, serotype M49 WT, and an isogenic ∆rny mutant. Two independent WT and ∆rny mutant strains were grown in C medium until the late exponential phase and used for the construction of cDNA libraries followed by deep sequencing. Sequencing generated a total of 11–12 billion reads for each sample. Subsequently, the sequence reads were aligned to the S. pyogenes NZ131 genome.

Quality control of RNA‐seq results

Among all RNA‐seq reads, more than 99% reads were unambiguously mapped to the S. pyogenes genome (Table S1). Among these mapped reads, less than 1% were mapped to rRNAs or tRNAs, which indicated a successful removal of structural RNAs in sample preparation (data not shown). The two biological replicates (referred to as wt1 and wt2 for the wild type and Δrny1 and Δrny2 for the RNase Y mutant) yielded highly consistent results (r 2 = 0.857 for the WT and r 2 = 0.998 for the Δrny) (Figure 1a). Quantitative RT‐PCR (qPCR) was used to evaluate the consistency of RNA‐seq results. Eleven genes were randomly selected and their relative transcript abundance in the Δrny mutant versus the wild type was determined with qRT‐PCR and RNA‐seq analysis, respectively. Results obtained from the two methods showed a strong correlation (r 2 = 0.996) (Figure 1b). We conclude that our RNA‐seq datasets are of high quality and can be used for downstream data analysis.

Figure 1

RNA‐seq quality control. (a) Consistency of gene abundance between biological replicates of Streptococcus pyogenes NZ131 wild type and isogenic Δrny mutant determined by RNA‐seq. The gene abundance of the replicates was plotted against each other on a logarithmic scale scatter diagram. The trend lines are presented for visual determination of consistency along with calculated R2‐values. (b) Correlation of gene expression levels between selected genes (gyrA, acpA, cfa, emm49, pnpA, prtF, ropB, sagA, ska, speG, and Spy49_1276) comparing RNA‐seq and qRT‐PCR results. The trend line is presented for visual determination of consistency along with calculated R 2‐value

Operon prediction and transcriptome map construction

The computer‐based operon prediction tool Prokaryotic Operon DataBase (ProOpDB) was used to determine the potential number of operons encoded in the S. pyogenes genome (Taboada et al., 2012). This analysis was performed to be able to compare our prediction approach using RNA‐seq data to an independent method. Overall, 720 gene pairs were reported to be cotranscribed by ProOpDB. The first step of constructing a transcriptome map from our RNA‐seq data was to group genes into operons. S. pyogenes NZ131 has a single circular DNA of 1.8 million bases including 1699 protein‐encoding genes, 18 rRNAs, and 66 tRNAs (McShan et al., 2008). RNA‐seq datasets of both the WT and Δrny strains were used for operon prediction and yielded highly consistent results. Specifically, predictions from WT2, Δrny1, and Δrny2 datasets were almost identical with only 1–2 uniquely predicted gene pairs found in each sample. Predictions from WT1 were slightly different from the others with 14 uniquely predicted gene pairs (Figure 2a). This observation suggests that an RNase Y mutation does not affect S. pyogenes operon organization. We, therefore, combined the four predictions to include all possible operons. A total of 865 operons were predicted, in which 491 CDS (coding sequences) were monocistronic, 169 dicistronic, 102 tricistronic, and 103 contain ≥4 CDS (Figure 2b). The Spy49_1460–Spy49_1481 operon contains the highest number of genes (20) and encodes phage proteins, while the second largest operon consists of 16 genes and encodes tRNAs. Our global operon prediction results can be found in Supplemental Table S2. Comparing our RNA‐seq analysis with the ProOpDB prediction (which we assumed to be 100% correct) found concordance in 720 cotranscribed gene pairs (81% sensitivity) and 627 monocistronic gene pairs (71% specificity). A total of 144 gene pairs were predicted as cotranscribed by RNA‐seq analysis, but not by ProOpDB. Therefore, to test their coexpression patterns, we randomly picked 10 gene pairs and indeed confirmed their cotranscription with RT‐PCR (Figure S1). We, therefore, concluded that our operon prediction strategy is highly reliable and possibly even more accurate than that of ProOpBD predictions.

Figure 2

Transcriptome analysis and operon prediction. (a) Venn diagram showing the relationship of individual operon predictions using Streptococcus pyogenes NZ131 wild type (wt1 and wt2) and isogenic Δrny mutant (rny1 and rny2) RNA‐seq datasets. (b) Distribution of the number of CDS (coding sequences) in operons in the S. pyogenes NZ131 genome

Identification of operon boundries

The next step was to define the boundaries of each operon. The boundary of an operon was defined as an abrupt upshift or downshift of RNA‐seq reads along the chromosome. Although the transcription start and end sites (TSSs and TESs) of most genes are located outside of their coding regions, some occur within their coding regions (Remmele et al., 2014). Therefore, a “sliding‐window” method was developed to search for abrupt changes in RNA‐seq reads in the upstream and downstream intergenic regions as well as within the coding regions of each operon (see Materials and Methods for details). We screened the upstream and 5’ regions of each operon with a 25‐bp window where an abrupt increase of RNA‐seq reads occurred and to define the left operon boundary as the site where the lowest RNA‐seq read was found in that window. The same method was used to identify the right operon boundary. RNA‐seq datasets of both the WT and Δrny strains were used for boundary definition and yielded similar results. Of all predicted boundaries, 1471 (85%) showed less than 10‐bp difference. We next combined the four prediction results by selecting the prediction that yielded the shortest UTR as the “true” boundary (Supplemental table S2). Based on this transcriptome map, we calculated the length of UTRs for the 865 operons. The majority of 5’ UTRs varied between 0 and 200 nt with a median of 43 nt (Figure 3a). The majority of 3’ UTRs varied between 0 and 150 nt with a median of 33 nt (Figure 3b). No expression difference was observed between the WT and Δrny strains, which confirmed our previous finding that the mutation of RNase Y does not affect gene expression (Chen et al., 2013).

Figure 3

Distribution of the UTR lengths in the Streptococcus pyogenes NZ131 primary transcripts. (a) Length distribution of 5’ UTR. (b) Length distribution of 3’ UTR

Distribution of the UTR lengths in the Streptococcus pyogenes NZ131 primary transcripts. (a) Length distribution of 5’ UTR. (b) Length distribution of 3’ UTR We then performed 5’ RACE on 10 highly expressed genes to test the accuracy of our transcriptional boundary predictions using RNA‐seq. Transcriptional start sites determined by the two methods were very similar to each other (Figure 4). For 6 of 10 genes, the length of the 5’ UTR as predicted by RNA‐seq was within 6 nt of the TSS identified by RACE. For three other genes, the difference was 11–12 nt. The largest discrepancy was observed for the ropB gene (Spy49_1691). Its 5’ UTR was determined to be 369 nt by RACE and 108 nt with RNA‐seq. A closer observation indicated that the ropB 5’ UTR was overlapped by its upstream gene, speB. When the speB gene was highly expressed, it apparently masked a large portion of the ropB 5’ UTR. We conclude that our transcriptional boundary predictions are reliable for well‐separated operons but are less reliable for overlapping operons.

Figure 4

Validation of predicted 5’ UTR lengths. The lengths of 10 selected genes determined with RNA‐seq analysis was validated by 5’ RACE and the results plotted against each other on a logarithmic scale scatter diagram. The trend line is presented for visual determination of consistency. The data point outside the trend line represents the UTR of speB with 693nt in length

Screen for putative RNase Y‐processed mRNAs

We screened the S. pyogenes transcriptome for putative RNase Y‐processed mRNAs by searching for transcripts that displayed segmental stability in the WT, but not in the Δrny mutant. Segmental stability was defined as an uneven decay pattern of a transcript (i.e., the transcript contained a segment with a significantly higher or lower stability than the rest of the transcript). A Python‐based script was developed to automatically identify mRNAs with segmental stabilities in the S. pyogenes transcriptome. A total of 80 mRNAs displaying segmental stabilities were identified in the WT. Among these mRNAs, 29 exhibited altered diminished segmental stabilities in the ∆rny mutant. Visual inspection of the decay patterns of these mRNAs showed that the apparent‐altered segmental stabilities in 14 mRNAs were due to random signal variations. The remaining 15 mRNAs and their gene arrangements are presented in Table 2. Four mRNAs with a potential role in GAS virulence, ypaA, speG, prtF, and folC1 were selected for further investigation. Each of these mRNAs displayed altered segmental stabilities in their 5’ regions in the WT, but not in the Δrny mutant. They also showed significantly increased abundance in the Δrny mutant. Northern blot analysis was carried out to examine their transcript patterns in the WT, Δrny mutant, Δrny mutant carrying an empty vector (Δrny_pDL278), and complemented Δrny mutant (Δrny_pDL278::rny) strains. Results showed that the WT and Δrny complemented strains yielded similar results, while the Δrny and Δrny_pDL278 strains yielded similar results for all examined mRNAs. These observations confirmed that the changes in transcript patterns were due to the RNase Y mutation, but not to polar effects or unknown mutations. We, therefore, report only the results from the WT and Δrny mutant in the following sections.

Table 2

Putative RNase Y‐processed mRNAs

Operon	Strand	Gene locus	Gene name	Product name
operon_47	+	Spy49_0086	pbp1b	Multimodular transpeptidase‐transglycosylase
operon_62	+	Spy49_0119	prtF	Fibronectin‐binding protein
operon_95	+	Spy49_0187	speG	Exotoxin‐type G precursor
operon_105	+	Spy49_0204	fasB	Putative histidine kinase
		Spy49_0205	–	Putative histidine kinase
		Spy49_0206	fasA	Response regulator FasA
operon_153	+	Spy49_0307	ypaA	Riboflavin transporter YpaA
operon_163	+	Spy49_0325	–	Hypothetical protein Spy49_0325
operon_179	−	Spy49_0350c	–	Hypothetical protein Spy49_0350c
operon_209	+	Spy49_0418	–	Hypothetical protein Spy49_0418
		Spy49_0419	–	Putative multidrug resistance efflux pump
		Spy49_0420	rpmG	50S ribosomal protein L33
operon_216	+	Spy49_0433	–	Putative sugar transferase
operon_216	+	Spy49_0434	–	Putative glucosyl transferase
operon_290	+	Spy49_0590	–	Hypothetical protein Spy49_0590
		Spy49_0591	murA	UDP‐N‐acetylglucosamine 1‐carboxyvinyltransferase
		Spy49_0592	–	Hypothetical protein Spy49_0592
operon_392	+	Spy49_0851	folC1	Dihydrofolate synthase/folylpolyglutamate synthase
		Spy49_0852	folE	GTP cyclohydrolase I
		Spy49_0853	folP	Dihydropteroate synthase
		Spy49_0854	folQ	Dihydroneopterin aldolase
		Spy49_0855	folK	2‐amino‐4‐hydroxy‐6‐hydroxymethyldihydropteridine pyrophosphokinase
operon_448	−	Spy49_0974c	–	Putative lipoprotein
operon_448	−	Spy49_0975c	cdd	Cytidine deaminase
operon_463	+	Spy49_1008	–	Putative amino acid symporter
operon_463	+	Spy49_1009	–	Putative cation efflux system protein
operon_791	−	Spy49_1689c	spi	Spi SpeB protease inhibitor
operon_791	−	Spy49_1690c	speB	Strepotococcal cysteine protease (streptopain)/streptococcal pyrogenic exotoxin B (SpeB)
operon_842	+	Spy49_1781	–	Hypothetical protein Spy49_1781
		Spy49_1782	–	Hypothetical protein Spy49_1782
		Spy49_1783	–	Hypothetical protein Spy49_1783

Putative RNase Y‐processed mRNAs

FolC1

The folC1 gene encodes dihydrofolate/folylpolyglutamate synthase (DHFS/FPGS), which is involved in folate synthesis and intracellular retention (Bognar, Osborne, & Shane, 1987). Operon prediction suggested that the folC1 gene was cotranscribed with four other folate synthesis genes, folEPQK. mRNA stability analysis revealed segmental stabilities within the folC1 ORF in the WT. Specifically, the 5’ region of the folC1 gene had a measured half‐life of 2.6 min versus 0.7 min for the remaining portion of the operon. The whole operon was strongly stabilized in the ∆rny background (half‐life = 6.8 min) with diminished segmental stabilities (Figure 5a). Northern blot analysis of folC1 using a probe targeting the 5’ region of the gene identified at least three transcripts with distinct sizes: 3.5 kb, 1.3 kb, and 0.65 kb. The 3.5 kb and 1.3 kb transcripts represented the folC1‐EPQK mRNA and the full‐length folC1 mRNA, respectively (Figure 5b). The 0.65 kb transcript could be detected with the probe targeting the 5’ region of the gene (folC 1.1), but not with the probe targeting the 3’ region of the gene (folC 1.2). This suggested that the 0.65 kb transcript encompassed the first half of the folC1 gene. What is the source of this truncated transcript? It seems unlikely to be created via premature transcriptional termination since no terminator‐like sequence could be identified within the folC1 ORF. It is possible that the full‐length folC1 transcript was processed internally so that the 3’ region of the transcript was rapidly degraded, though the 5’ region was stabilized via an unknown mechanism. However, the 0.65 kb transcript is also present in the ∆rny background, excluding a direct role of RNase Y in the generation of this fragment.

Figure 5

Role of RNase Y in transcript processing of folC1. (a) Analysis of folC1gene expression levels and mRNA half‐lives in Streptococcus pyogenes NZ131 wild type and Δrny mutant. Gene expression is presented as RNA‐seq reads and mRNA half‐life is presented as probe signal intensity relative to the chromosomal position of the folate synthesis operon. (b) Northern blot analysis of the 5’ and 3’ UTRs of folC1. The transcripts were detected with specific probes in (1) wild type, (2) Δrny, (3) Δrny carrying an empty vector (Δrny_pDL278), and (4) Δrny complemented (Δrny_pDL278::rny) strains

SpeG

The speG gene encodes superantigen G, a secreted protein that activates T cells in a nonantigen‐specific fashion and is responsible for the clinical features of streptococcal toxic shock syndrome (STSS) (Steer, Lamagni, Curtis, & Carapetis, 2012). The speG gene is preceded by the 4.5S RNA‐encoding gene (the RNA component of the signal recognition particle (SRP) (Herskovits, Bochkareva, & Bibi, 2000)) and followed by the pgi gene (encoding glucose‐6‐phosphate isomerase) on the chromosome. Sequence analysis revealed that the three genes have independent promoters (Figure 6b). Northern blot signals were barely detectable for the speG gene in the WT (Figure 6b). In the ∆rny strain, it was cotranscribed with the 4.5S RNA and pgi gene so that the read through transcripts (2.8 kb and 1.2 kb) became detectable. The 4.5S RNA‐encoding gene was strongly expressed in both WT and ∆rny mutant as a single 0.1 kb transcript (Figure 6b). The 0.1 kb transcript was unlikely the RNase Y‐processed product of mRNA read through, since the RNase Y mutation did not lead to a reduction of 4.5S RNA in abundance. It is also possible that the increased 5’ stability of the speG gene was due to the highly structured 4.5S RNA preceding the gene, rather than mRNA processing.

Figure 6

Role of RNase Y in transcript processing of speG. (a) Analysis of gene expression levels and mRNA half‐lives of speG in Streptococcus pyogenes NZ131 wild type and Δrny mutant. Gene expression is presented as RNA‐seq reads and mRNA half‐life is presented as probe signal intensity relative to the chromosomal position of the speG‐encoding operon. (b) Northern blot analysis of speG and the upstream 4.5S RNA. The transcripts were detected with specific probes in (1) wild type, (2) Δrny, (3) Δrny carrying an empty vector (Δrny_pDL278), and (4) Δrny complemented (Δrny_pDL278::rny) strains

YpaA

The ypaA gene encodes a riboflavin transporter. The gene is preceded by a typical flavin mononucleotide (FMN) riboswitch (Patenge et al., 2012) in its 5’ UTR and is followed by a putative phosphatase‐encoding gene (Spy49_0308) (Figure 7a). The FMN riboswitch functions as a riboflavin sensor that activates the ypaA gene expression in response to low levels of riboflavin. The ypaA gene was not induced under the condition used in this study since the culture medium was replete with riboflavin. Therefore, only the FMN riboswitch (0.2 kb transcript), but not the ypaA mRNA, was detected in the WT. The ypaA mRNA was detected in the ∆rny mutant as a 1.6 kb read through transcript (Figure 7b). Similar FMN riboswitch mRNA levels were observed in the WT and ∆rny mutant, which suggests that the expression of FMN riboswitch is unaffected by the RNase Y mutation. It also suggests that the FMN riboswitch is not a breakdown product of the ypaA read through, since the accumulation of ypaA read through in the ∆rny mutant does not lead to a reduction of the FMN riboswitch.

Figure 7

Role of RNase Y in transcript processing of ypaA. (a) Analysis of gene expression levels and mRNA half‐lives of ypaA in Streptococcus pyogenes NZ131 wild type and Δrny mutant. Gene expression is presented as RNA‐seq reads and mRNA half‐life is presented as probe signal intensity relative to the chromosomal position of the operon‐encoding ypaA. (b) Northern blot analysis of ypaA and the upstream FMN riboswitch. The transcripts were detected with specific probes in (1) wild type, (2) Δrny, (3) Δrny carrying an empty vector (Δrny_pDL278), and (4) Δrny complemented (Δrny_pDL278::rny) strains

PrtF

The prtF gene encodes a fibronectin‐binding adhesin protein F that mediates binding of S. pyogenes to extracellular matrix and host cells (Fogg & Caparon, 1997) (Figure 8a). Northern blot analysis of the prtF mRNA with a CDS probe detected a full‐length mRNA (3.5 kb transcript) and multiple transcripts with reduced sizes in the WT. The same transcripts were observed in the ∆rny strain with stronger signal intensity, which suggested that these shorter transcripts were not processed mRNAs but were probably degradation intermediates. Similar expression patterns were observed with Northern blot analysis targeting the prtF 5’ UTR. An additional 0.2 kb transcript was found in the Δrny mutant, which we presumed to be another degradation intermediate (Figure 8b).

Figure 8

Role of RNaseY in transcript processing of prtF. (a) Analysis of gene expression levels and mRNA half‐lives of prtF in Streptococcus pyogenes NZ131 wild type and Δrny mutant. Gene expression is presented as RNA‐seq reads and mRNA half‐life is presented as probe signal intensity relative to the chromosomal position of the prtF gene. (b) Northern blot analysis of prtF 5’ UTR and the prtF orf. The transcripts were detected with specific probes in (1) wild type, (2) Δrny, (3) Δrny carrying an empty vector (Δrny_pDL278), and (4) Δrny complemented (Δrny_pDL278::rny) strains

Effects of altered mRNA levels on protein abundance

We questioned whether the alteration of mRNA expression in the Δrny background correlated with differential protein production. Four proteins, SpeB, RopB, PrtF, and SpeG, were examined for their abundance in the WT, Δrny, Δrny_pDL278 and Δrny_pDL278::rny strains. We were not able to examine the protein abundance of YpaA or FolC1 because of the lack of specific antibodies, and several attempts of protein tagging were unsuccessful. Western blots detected RopB, PrtF, and SpeG with specific antibodies (Figure 9a,b), while secreted SpeB was detected via SDS‐PAGE and Coomassie blue staining (Figure 9c). Similar amounts of the PrtF and SpeG proteins were quantified in all tested samples (Figure 9d), despite their significantly increased mRNA levels in the Δrny mutant. Significantly reduced RopB protein production was observed in the Δrny strain even though ropB mRNA levels were unaffected (Figure 9d). Significantly reduced SpeB production was observed in the Δrny strain (Figure 9d), which was in accordance with the reduced speB mRNA level in that strain as reported before (Chen et al., 2013).

Figure 9

Detection of protein abundance of selected proteins. (a) Western blot analysis with RopB‐ and PrtF‐specific antibodies using whole‐cell preparations. The Coomassie Blue stained SDS‐gel page is shown as a loading control. (b) Western blot detection of secreted SpeG. (c) Coomassie Blue stained SDS‐page gel showing whole‐cell lysates from late exponential phase cells. SpeB is highly expressed in the late exponential growth phase and the protein is directly visible via SDS‐PAGE. (d) Quantification of band intensities using Image J. The abundance is presented relative to the wild type. For RopB, SpeG and Prtf n = 3; for SpeB n = 2. (* = significant difference; p ≤ 0.01, Student’s t test)

DISCUSSION

The aim of this study was the identification of new RNase Y mRNA targets in S. pyogenes. We previously demonstrated that RNase Y is the major endonuclease responsible for global mRNA turnover in S. pyogenes (Chen et al., 2013). Although not essential, RNase Y ensures rapid turnover of mRNA and a deletion of the rny gene increases the average mRNA half‐life from 1.24 min to 2.79 min (Chen et al., 2013). RNase Y activity also seems to be important during infection, since an RNase Y deletion mutant is highly attenuated in a murine subcutaneous infection model (Kang, Caparon, & Cho, 2010). During our initial studies, we observed that RNase Y is able to process primary transcripts. This was shown for ropB and speB, encoding the important transcriptional regulator that activates expression of streptococcus exotoxin B‐encoding gene (speB), a major S. pyogenes virulence factor (Chen et al., 2012, 2013). Here, we combined RNA‐seq and tiling‐microarray analyses to search for additional RNase Y‐processed mRNAs in S. pyogenes on a genome‐wide scale. To do so, it was first necessary to predict the global operon structure in the cell. Overall 1699 genes were grouped into 865 operons, with 491 monocistrons. The median lengths of the 5’ UTRs and 3’ UTRs were predicted to be 43 nt and 33 nt, respectively (Figure 3a,b). Another S. pyogenes strain MGAS2221 (serotype M1) has been analyzed by RNA‐seq, though the complete transcriptome map is not publically available (McClure et al., 2013). Analysis of the MGAS2221 strain identified 412 polycistronic operons, which is comparable to our result. The average UTR lengths have not been reported for this strain. Our operon prediction also yielded similar results to those reported for related human pathogen Streptococcus agalactiae, which encodes 891 operons and 484 monocistrons (Rosinski‐Chupin et al., 2015). Moreover, the distribution of 5’ UTR lengths in strain NZ131 is proportional to the distribution reported for S. agalactiae and the swine pathogen Streptococcus suis (Rosinski‐Chupin et al., 2015; Wu et al., 2014). To further validate our results, we compared our operon predictions with a computational prediction using ProOpDB. Both our control experiments and the comparison to existing data from closely related streptococci support our operon prediction method. After comparing the wild type and RNase Y mutant, transcript analysis narrowed our search for RNase Y‐processed mRNAs to 15 operons. Further experimental examination of four mRNAs was performed to verify if their transcripts serve as an RNase Y substrate. Three were either transcriptional read through products or regular unprocessed mRNAs and one (folC1) showed a truncated transcript independent of RNase Y. These observations do not support our original hypothesis that RNase Y‐mediated mRNA processing plays a significant role in global gene regulation, rather this enzyme seems to possess a more specific regulon. The folC1 mRNA was initially identified as RNase Y substrate based on the segmental stability pattern. FolC catalyzes the conversion of folates to polyglutamate (Bognar et al., 1987). Bacteria require folates for the biosynthesis of glycine, methionine, formylmethionine, thymidine, purines, and pantothenate (Bognar et al., 1987). Unlike other known processed mRNAs that are cleaved either in the UTRs or between ORFs, the folC1 mRNA was processed in the middle of the protein‐coding region, but unlikely by RNase Y since the detected truncated transcript was equally abundant in the rny mutant and wild type. However, we observed an increase in mRNA stability for the rny mutant strain, but it is unclear if this is a direct effect. No functional enzyme activity was expected to result from the processed folC1 mRNA, since only the 5’ half of the mRNA was expressed. Another copy of the folC gene, folC2 (Spy49_0638), has been identified in the S. pyogenes genome. Sequence analysis showed 64% similarity between the two genes. Therefore, it is possible that folC2 has redundant functions that can compensate under conditions when folC1 is processed. However, the physiological implications of folC1 processing and the exact mechanism how the truncated transcript is generated still requires further investigation. While the number of RNase Y‐processed transcripts is apparently low, it affects important virulence traits of S. pyogenes (Chen et al., 2013; Kang et al., 2010) and Figures S2 and S3. The importance of RNase Y targets ropB and speB for S. pyogenes virulence is well established (Carroll & Musser, 2011). Interestingly, a recent study of RNase Y‐dependent mRNA decay and processing in pathogenic Clostridium perfringens demonstrated a similar effect on global mRNA stability, but similarly identified only two transcripts stabilized by RNase Y processing (Obana, Nakamura, & Nomura, 2017). The first, colA, requires RNase Y‐dependent processing to allow access to its ribosome‐binding site and subsequent translation into the major toxin collagenase. The second, pilA2, is stabilized after processing and results in the production of a major pilin component of the type IV pili important for biofilm formation and host cell adherence (Obana et al., 2017). A difference in the role of RNase Y was observed in S. aureus: S. aureus lacking RNase Y only showed stabilization of some operons, but is not responsible for global mRNA turnover suggesting that RNase Y has a more specialized function in this organism (Marincola et al., 2012). Initially, the transcript of the sae operon, which encodes a major virulence regulator, was identified as RNase Y processed (Marincola et al., 2012) and further examination nicely showed that a secondary structure downstream of the cleavage site is important to guide RNase Y for mRNA cleavage, but the actual cleavage site is not important for the cleavage process (Marincola & Wolz, 2017) A more detailed study showed that 248 orfs are strongly affected by RNase Y. Interestingly, the authors also demonstrated that subcellular localization of RNase Y in S. aureus via its membrane anchor restricts its target choice (Khemici, Prados, Linder, & Redder, 2015). Although our approach identified RNase Y‐processed mRNAs, it has several limitations. First, segmental stability is not a unique character of processed mRNAs, but can also be observed in operons containing multiple promoters as well as through transcriptional read through. To reduce these false positives, we subtracted “processed” mRNA candidates of the RNase Y mutant from those of the WT so that the remaining mRNAs were most likely processed by RNase Y. This approach reduced the number of mRNA candidates from 80 to 26, but apparently did not completely prevent the detection of false positives. An alternative approach is to separate primary mRNAs (which are 5’ triphosphorylated) from processed mRNAs (which are 5’ monophosphorylated) and then analyze them individually (Jager, Forstner, Sharma, Santangelo, & Reeve, 2014). However, this approach cannot differentiate between processed mRNA and degraded fragments, which again would lead to false positives. The combination of the two approaches may yield more precise predictions. Second, the precise determination of transcriptional start sites (TSSs) at a single‐nucleotide resolution requires the application of differential RNA sequencing (dRNA‐seq) (Borries, Vogel, & Sharma, 2012), which was not available for this study. Alternatively, we used whole‐transcript RNA‐seq datasets to estimate operon boundaries. The estimation is not as accurate as that of dRNA‐seq because the reads are relatively depleted toward the 5’ and 3’ ends (Wang, Gerstein, & Snyder, 2009). However, for the purpose of our study, it was also unnecessary to determine the exact TSS of each operon. Furthermore, our results clearly indicate the involvement of RNase Y in mRNA processing and read through degradation. However, direct evidence showing that RNase Y is responsible for the cleavage is absent. Recombinant B. subtilis RNase Y has been successfully purified from E. coli and displayed the ability to degrade SAM‐dependent riboswitches in vitro (Shahbabian, Jamalli, Zig, & Putzer, 2009). We were able to purify recombinant S. pyogenes RNase Y from E. coli, but the recombinant protein did not display any endonucleolytic activity in vitro (data not shown). Surprisingly, we did not find an obvious correlation between the mRNA level and protein abundance of the genes examined in this study. The speG and prtF mRNAs, which were strongly stabilized and overexpressed in the Δrny mutant, yielded similar amounts of protein to the WT. Conversely, the ropB mRNA was processed in the WT, but not in the Δrny mutant. In this case, a lack of ropB processing in the Δrny mutant led to reduced RopB protein abundance. The poor correlation between mRNA and protein levels has also been observed in other bacterial species (Dressaire, Laurent, Loubiere, Besse, & Cocaign‐Bousquet, 2010; Lu, Vogel, Wang, Yao, & Marcotte, 2007; Picard et al., 2012), which implies the involvement of translational regulation. Picard et al. reported a strong inverse correlation between mRNA stability and ribosome occupancy (the fraction of mRNA engaged in translation) and ribosome density in Lactococcus lactis (Picard et al., 2012). Both ribosome occupancy and ribosome density are major determinants of final protein expression levels. The authors reasoned that mRNAs with less stable secondary structures in the 5’ UTR would be more rapidly degraded on one hand but could be better translated with higher level of ribosome occupancy on the other hand. Consequently, mRNAs with higher turnover rates also have higher protein production rates. Thus, it is possible that the speG read through transcript is poorly translated because of strong secondary structures in their 5’ regions (Figure S4). Similarly, the unprocessed ropB mRNA created as a result of an RNase Y mutation is presumably translated at a lower efficiency than the processed mRNA because of two step‐loop structures in the 5’ UTR (Neely, Lyon, Runft, & Caparon, 2003).

CONFLICT OF INTEREST

The authors declare no conflict of interest. Click here for additional data file. Click here for additional data file.

44 in total

1. Linear covariance models to examine the determinants of protein levels in Lactococcus lactis.

Authors: Clémentine Dressaire; Béatrice Laurent; Pascal Loubière; Philippe Besse; Muriel Cocaign-Bousquet
Journal: Mol Biosyst Date: 2010-05-06

2. Constitutive expression of fibronectin binding in Streptococcus pyogenes as a result of anaerobic activation of rofA.

Authors: G C Fogg; M G Caparon
Journal: J Bacteriol Date: 1997-10 Impact factor: 3.490

3. PlyC: a multimeric bacteriophage lysin.

Authors: Daniel Nelson; Raymond Schuch; Peter Chahales; Shiwei Zhu; Vincent A Fischetti
Journal: Proc Natl Acad Sci U S A Date: 2006-07-03 Impact factor: 11.205

4. Role of RNase Y in Clostridium perfringens mRNA Decay and Processing.

Authors: Nozomu Obana; Kouji Nakamura; Nobuhiko Nomura
Journal: J Bacteriol Date: 2016-12-28 Impact factor: 3.490

5. Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation.

Authors: Peng Lu; Christine Vogel; Rong Wang; Xin Yao; Edward M Marcotte
Journal: Nat Biotechnol Date: 2006-12-24 Impact factor: 54.908

6. Single nucleotide resolution RNA-seq uncovers new regulatory mechanisms in the opportunistic pathogen Streptococcus agalactiae.

Authors: Isabelle Rosinski-Chupin; Elisabeth Sauvage; Odile Sismeiro; Adrien Villain; Violette Da Cunha; Marie-Elise Caliot; Marie-Agnès Dillies; Patrick Trieu-Cuot; Philippe Bouloc; Marie-Frédérique Lartigue; Philippe Glaser
Journal: BMC Genomics Date: 2015-05-30 Impact factor: 3.969