Ana Moreno-Blanco1, Virtu Solano-Collado1,2, Alejandro Ortuno-Camuñas1, Manuel Espinosa1, Sofía Ruiz-Cruz3,4, Alicia Bravo5. 1. Centro de Investigaciones Biológicas Margarita Salas, Consejo Superior de Investigaciones Científicas, Ramiro de Maeztu 9, 28040, Madrid, Spain. 2. Institute of Medical Sciences, University of Aberdeen, Foresterhill, Aberdeen, AB25 2ZD, UK. 3. Centro de Investigaciones Biológicas Margarita Salas, Consejo Superior de Investigaciones Científicas, Ramiro de Maeztu 9, 28040, Madrid, Spain. sofia.ruizcruz@ucc.ie. 4. School of Microbiology, University College Cork and APC Microbiome Ireland, Western Road, Cork, T12 YT20, Ireland. sofia.ruizcruz@ucc.ie. 5. Centro de Investigaciones Biológicas Margarita Salas, Consejo Superior de Investigaciones Científicas, Ramiro de Maeztu 9, 28040, Madrid, Spain. abravo@cib.csic.es.
Abstract
The Gram-positive bacterium Streptococcus pneumoniae is a major human pathogen that shows high levels of genetic variability. The pneumococcal R6 genome harbours several gene clusters that are not present in all strains of the species. One of these clusters contains two divergent genes, pclA, which encodes a putative surface-exposed protein that contains large regions of collagen-like repeats, and spr1404 (here named pclR). PclA was shown to mediate pneumococcal adherence to host cells in vitro. In this work, we demonstrate that PclR (494 amino acids) is a transcriptional activator. It stimulates transcription of the pclA gene by binding to a specific DNA site upstream of the core promoter. In addition, we show that PclR has common features with the MgaSpn transcriptional regulator (493 amino acids), which is also encoded by the R6 genome. These proteins have high sequence similarity (60.3%), share the same organization of predicted functional domains, and generate multimeric complexes on linear double-stranded DNAs. However, on the PpclA promoter region, MgaSpn binds to a site different from the one recognized by PclR. Our results indicate that PclR and MgaSpn have similar DNA-binding properties but different DNA-binding specificities, pointing to a different regulatory role of both proteins.
The core genome of a given bacterial species contains genes shared by all strains. In addition, bacterial genomes often harbour a variable number of genes that are present in one or more, but not all, strains of the species. These accessory genes contribute to the high degree of genetic variability found in many bacterial species. The function of the accessory genes can be very diverse, including a wide range of adaptive traits that might be beneficial for the bacteria under certain environmental situations[1]. The Gram-positive bacterium Streptococcus pneumoniae (the pneumococcus) is a major human pathogen that shows high levels of genetic diversity. It is normally found as a harmless commensal of the upper respiratory tract (mainly the nasopharynx). Nevertheless, in individuals with a weakened immune system, the pneumococcus can migrate to other tissues/organs and cause life-threatening diseases, such as pneumonia, bacteraemia, and meningitis[2,3]. Despite the development of different vaccines and antibiotic therapies, S. pneumoniae remains a leading cause of morbidity and mortality worldwide, being the most common cause of bacterial pneumonia in children under five years old (https://www.who.int/en/news-room/fact-sheets/detail/pneumonia). An interesting aspect of S. pneumoniae is its capacity to incorporate exogenous DNA into its genome, which is mainly achieved by horizontal gene transfer mechanisms[4,5] and plays an important role in its adaptation and evolution. Comparative genomic analyses have shown that over 20% of the coding sequences of any single pneumococcal isolate are not present in all strains[5-7]. Furthermore, it has been estimated that the rate at which the pneumococcus acquires genetic variation through recombination is much higher than the rate at which random mutations occur[8].
The genome sequences of the pneumococcal strains TIGR4 (serotype 4) and R6 (a derivative of D39, serotype 2) were published in 2001[9,10]. A comparison of both sequences revealed that, among other differences, the R6 genome has six gene clusters that are absent from the TIGR4 genome[11]. One of the R6-specific clusters (9634 bp) consists of two divergent genes, spr1403 (new locus tag: SPR_RS06970) and spr1404 (new locus tag: SPR_RS06975). The spr1403 gene encodes a putative cell wall anchored protein that contains large regions of collagen-like repeats, the number of which varies between strains[12]. This protein was named PclA for pneumococcal collagen-like protein A[12]. By using PCR techniques, Paterson et al.[12] found that some clinical isolates from invasive pneumococcal disease harboured the pclA-spr1404 gene cluster in the same genomic location as strain R6. Moreover, these authors showed that a pclA deletion mutant strain was defective in adhesion to host cells in vitro. The distribution of pclA was further analysed in a collection of pneumococcal clinical isolates from patients with community-acquired pneumonia[13]. This study showed that the presence of pclA was significantly associated with Pneumococcal Molecular Epidemiology Network (PMEN)[14] clones, which suggested that PclA might contribute to the selection of prevalent clones[13]. The PMEN was established in 1997 to standardize the nomenclature and classification of antibiotic-resistant pneumococcal clones worldwide (https://www.pneumogen.net/pmen). Regarding the spr1404 gene, it was reported that it encodes a putative transcriptional regulator[12] but its function has not been investigated.
Global transcriptional regulators play crucial roles during bacterial adaptation to specific niches. They can rapidly adjust the gene expression pattern to new environmental conditions. The pneumococcal mgaSpn gene[15], first named mgrA[16], encodes a protein of 493 amino acids that is a member of the Mga/AtxA family of global transcriptional regulators[17]. This family includes Mga from S. pyogenes, AtxA from Bacillus anthracis, and MafR from Enterococcus faecalis[18-20]. Bioinformatics analyses have shown that MgaSpn is highly conserved in 25 pneumococcal strains whose genomes have been completely sequenced, including TIGR4 (locus sp1800) and R6 (locus spr1622)[15,21]. Signature-tagged mutagenesis in TIGR4 revealed that MgaSpn plays a significant role in both nasopharyngeal colonization and the development of pneumonia in murine infection models. Moreover, MgaSpn was shown to repress, directly or indirectly, the expression of several genes located within the rlrA pathogenicity islet[16,22]. This islet is absent from many pneumococcal strains[23], including R6[10]. Further studies performed in the pneumococcal R6 strain demonstrated that MgaSpn functions as a transcriptional activator. It activates the transcription of a four-gene operon (spr1623-spr1626) by binding to a specific DNA site upstream of the P1623B promoter (positions − 60 to − 99)[15,17]. In vitro DNA binding experiments have shown that MgaSpn (i) generates multimeric complexes on linear double-stranded DNA fragments, (ii) binds to linear double-stranded DNAs with little or no sequence specificity, and (iii) has a preference for AT-rich DNA sites and for DNA regions that contain a potential intrinsic curvature. Because of these findings, we proposed that MgaSpn recognizes structural characteristics in its DNA targets rather than specific nucleotide sequences[17,24].
In this work, we have investigated the function of the pneumococcal spr1404 gene. We demonstrate that this gene (named pclR herein for pclA regulator) encodes a protein (PclR; 494 amino acids) that activates the transcription of the pclA gene in pneumococcal cells grown to mid-log phase under standard laboratory conditions. This activation requires a specific DNA site, which is located upstream of the PpclA core promoter. PclR interacts with such a site in vitro. Therefore, PclR could play a regulatory role in pneumococcal adhesion to human cells. In addition, we show that PclR and MgaSpn have high sequence similarity (60.3%), share the same organization of predicted functional domains, and display common features in their interaction with DNA. However, despite these similarities, these regulators have different DNA-binding specificities and different regulatory capacities.
Results
Organization of predicted functional domains in PclR
The pneumococcal R6 genome (NCBI RefSeq NC_003098.1)[10] has several gene clusters that are absent from other pneumococcal genomes. One of them consists of the spr1403 gene (pclA, pneumococcal collagen-like protein A)[12] and the spr1404 gene (here named pclR) (Fig. 1). The ATG codon at coordinate 1,388,136 is likely the translation initiation codon of the pclR gene, as it is preceded by a putative Shine-Dalgarno sequence (5′-GGAGGAAA-3′). Translation from this ATG codon results in a protein of 494 residues (PclR). EMBOSS Needle Pairwise Sequence Alignment[25,26] of PclR and the pneumococcal MgaSpn transcriptional regulator (493 residues; locus_tag spr1622) revealed that these proteins have 60.3% of similarity and 40.1% of identity (Supplementary Fig. S1). According to the Conserved Domain Database (CDD)[27] and the Protein Families Database (Pfam)[28], PclR is predicted to have (i) two N-terminal helix-turn-helix DNA-binding domains, the so-called HTH_Mga (Family PF08280.14, residues 6 to 65) and Mga (Family PF05043.16, residues 72 to 158) domains, and (ii) a central phosphoenolpyruvate phosphotransferase system (PTS) regulation domain (PRD) (Family PRD_Mga PF08270.14, residues 174 to 391) (Supplementary Fig. S2). Moreover, the protein structure prediction server Phyre2[29] revealed that the C-terminal region of PclR (residues 398 to 488) has structural homology to a PTS EIIB-like component. Thus, the organization of predicted functional domains in PclR is similar to the one reported for MgaSpn[15,24]. Supplementary Fig. S3 shows the predicted three-dimensional structure of the PclR monomer according to the AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk)[30,31], as well as the location of the predicted functional domains on such a structure. The AlphaFold Database predicts similar three-dimensional structures for the PclR and MgaSpn monomers (Supplementary Fig. S4).
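As a rough illustration of how identity and similarity percentages such as those quoted above are derived from a pairwise alignment, the following Python sketch counts identical and similar columns over the full alignment length. The residue groups are a simplified, hypothetical stand-in for a substitution matrix; EMBOSS Needle scores similarity with a matrix such as EBLOSUM62, so this sketch is not expected to reproduce the reported 60.3% and 40.1% values.

```python
from typing import Tuple

# Simplified, hypothetical residue groups (an assumption for illustration only;
# not the scoring scheme used by EMBOSS Needle).
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def are_similar(a: str, b: str) -> bool:
    """Return True if two residues are identical or share a toy group."""
    if a == b:
        return True
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aln_a: str, aln_b: str) -> Tuple[float, float]:
    """Percent identity and similarity over the alignment length.

    Both inputs are rows of the same gapped alignment ('-' marks a gap).
    Gap columns count towards the denominator but never as matches.
    """
    if len(aln_a) != len(aln_b) or not aln_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    identical = 0
    similar = 0
    for a, b in zip(aln_a, aln_b):
        if a == "-" or b == "-":
            continue
        if a == b:
            identical += 1
        if are_similar(a, b):
            similar += 1
    length = len(aln_a)
    return 100.0 * identical / length, 100.0 * similar / length


# Toy alignment (hypothetical sequences, not PclR or MgaSpn).
identity, similarity = identity_and_similarity("MKV-LE", "MRVALD")
```

Because identical columns are also counted as similar, the similarity value is always at least as large as the identity value, which is consistent with the relationship between the two percentages reported for PclR and MgaSpn.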
Figure 1
The R6-specific pclA-pclR cluster. Gene spr1404 has been named pclR in this work. For each gene, the coordinates of the translation start and stop codons are indicated. Arrows upstream of the genes represent promoters. The nucleotide sequence of the region spanning coordinates 1,387,910 to 1,388,152 of the S. pneumoniae R6 chromosome is shown. The main sequence elements (− 35 box and − 10 box) of the promoters identified in this work (PpclA and PpclR) and the transcription start sites (+1 position) are indicated. The putative Shine-Dalgarno sequence (SD) and the translation start codon (ATG) of the pclA and pclR genes are indicated in boldface letters.
Expression of the pclR gene under laboratory conditions
By quantitative RT-PCR (qRT-PCR) assays and using the comparative CT method[32], we determined the relative expression of the pclR gene in pneumococcal R6 cells grown under standard laboratory conditions: AGCH medium supplemented with 0.2% yeast extract and 0.3% sucrose, 37 °C, and without aeration. Compared to the stationary phase, transcription of pclR was found to be higher (~ 3.2-fold) at the logarithmic growth phase (Supplementary Table S1). We also determined the relative expression of the regulatory mgaSpn gene. Like pclR, transcription of mgaSpn was higher (~ 4.3-fold) in exponentially growing R6 cells (Supplementary Table S2). Thus, all the experiments shown in this work were performed during the logarithmic growth phase.
Paterson et al. (2008) constructed a pclR (spr1404) deletion mutant strain by allelic replacement with a spectinomycin resistance cassette[12]. Using such a mutant, they found that the lack of pclR had no significant effect on the intracellular levels of pclA transcripts[12]. This finding suggested to us that, under laboratory conditions, an increase in the expression of the pclR gene could be necessary to detect an effect on the transcription of the pclA gene. Therefore, to test this hypothesis (see below), we constructed two R6 derivative strains designed to produce different levels of PclR. Specifically, we inserted the promoterless pclR gene into the pDLF constitutive expression vector[20] in both orientations, generating the recombinant plasmids pDLFpclR (expression of pclR) and pDLFpclR-i (no expression of pclR). Then, we introduced each recombinant plasmid into the R6 strain. By qRT-PCR, we determined the relative expression of the pclR gene in both strains: R6/pDLFpclR (expression of pclR from the chromosome and the plasmid) and R6/pDLFpclR-i (expression of pclR only from the chromosome). As expected, the amount of pclR transcripts was higher (~ 3.1-fold) in strain R6/pDLFpclR (Supplementary Table S3). Each recombinant plasmid was also introduced into the R6∆mga mutant strain, which lacks the mgaSpn gene[15]. As shown in Supplementary Table S4, the amount of pclR transcripts was higher (~ 4.9-fold) in strain R6∆mga/pDLFpclR compared to strain R6∆mga/pDLFpclR-i. In the next sections, we will refer to R6/pDLFpclR and R6∆mga/pDLFpclR as strains with high levels of pclR expression, and to R6/pDLFpclR-i and R6∆mga/pDLFpclR-i as strains with low levels of pclR expression.
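The fold differences reported above come from the comparative CT method. A minimal Python sketch of that calculation is given here for orientation; the function name, argument names, and CT values are hypothetical, and the sketch assumes a single reference (internal control) gene and an amplification efficiency close to 100% for both amplicons.

```python
def relative_expression(ct_target_test: float, ct_ref_test: float,
                        ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Fold change of a target gene (test vs. control condition) as 2^(-ddCT).

    Each CT is first normalised to the reference gene measured in the same
    sample (dCT); the control dCT is then subtracted from the test dCT (ddCT).
    """
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    delta_delta_ct = delta_ct_test - delta_ct_ctrl
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: the target crosses the threshold two cycles earlier
# in the test condition, with an unchanged reference gene.
fold = relative_expression(20.0, 15.0, 22.0, 15.0)
```

A difference of one cycle corresponds to a two-fold difference in template under these assumptions, so fold changes of about three to five correspond to ddCT values of roughly −1.6 to −2.3 cycles.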
Identification of the promoter of the pclR gene
The BPROM program (Softberry, Inc.) predicts a promoter sequence (named PpclR herein) upstream of the pclR gene. It has a canonical − 10 element (TATAAT) and a possible − 35 element (TTTATA) at the suboptimal spacer length of 19 nucleotides (Fig. 1). By transcriptional fusions, we analysed the promoter activity of such a sequence (Fig. 2A). A 185-bp DNA fragment (coordinates 1,387,937 to 1,388,121) was inserted into the pASTT promoter-probe vector, which is based on the gfp reporter gene. The recombinant plasmid (pASTT-PpclR) was then introduced into R6∆mga/pDLFpclR (high levels of pclR expression) and R6∆mga/pDLFpclR-i (low levels of pclR expression). In both strains, gfp expression was ~ 2.5-fold higher than the basal level (strains harbouring pASTT, 0.08 ± 0.02 units). Similar results were obtained with the plasmid pASTT-PpclR∆105 (Fig. 2A). These results showed that (i) the 80-bp region between coordinates 1,388,042 and 1,388,121 contains a promoter sequence, and (ii) different levels of pclR expression do not affect the activity of such a promoter (no autoregulation). Furthermore, no promoter activity was detected when the region between coordinates 1,388,099 and 1,388,121 was deleted (plasmid pASTT-PpclR∆-10) (Fig. 2A). Such a deletion removes the − 10 element of the PpclR promoter (see Fig. 1).
Figure 2
Identification of the PpclR promoter. (A) Fluorescence assays. Three regions of the R6 chromosome were amplified by PCR and inserted into the SacI site of the promoter-probe vector pASTT. The coordinates of the inserted regions are indicated. The tetL gene confers resistance to tetracycline[33]. The promoter-less gfp gene encodes a variant of the green fluorescent protein[34]. The T1T2 box represents the tandem transcriptional terminators T1 and T2 of the E. coli rrnB ribosomal RNA operon[35]. The grey arrow represents the − 10 element of the PpclR promoter. Each pASTT derivative was introduced into strain R6∆mga/pDLFpclR (high levels of pclR expression) and strain R6∆mga/pDLFpclR-i (low levels of pclR expression). The intensity of fluorescence (arbitrary units) corresponds to 0.8 ml of culture (OD650 = 0.4). In each case, three independent cultures were analysed. (B) Plasmid pASTT-PpclR. The main sequence elements (− 35 box and − 10 box) of the PpclR promoter and the ATG translation start codon of the gfp gene are indicated. The SacI site is underlined. The position of the Int-gfp oligonucleotide used for primer extension is shown. The asterisk indicates the transcription start site identified in this work. (C) Analysis of primer extension reactions. Lane 1: total RNA from R6/pASTT-PpclR cells and oligonucleotide Int-gfp. Lane 2: total RNA from R6 cells and oligonucleotide Dw1404-2. Dideoxy-mediated chain termination sequencing reactions (M13mp18 DNA and primer − 40 M13[36]) were run in the same gel as DNA size markers. The corresponding lanes (A, C, G, and T) came from the same gel but were taken at a lower exposure (delineation with dividing lines). See the full-length gel (high and low exposures) at the end of the Supplementary Information file. The size (in nucleotides) of the cDNA products is indicated on the left of the gel.
The transcription start site of the pclR gene was identified by primer extension assays.
We used total RNA from R6 cells and the oligonucleotide Dw1404-2 (coordinates 1,388,208 to 1,388,232) (Table 1). A cDNA product of 114 nucleotides was detected (Fig. 2C, lane 2), which could correspond to a transcription initiation event at coordinate 1,388,119. This coordinate is located 6 nucleotides downstream of the − 10 element of the PpclR promoter (Fig. 1). Additionally, we performed primer extension assays with total RNA from R6 cells harbouring pASTT-PpclR. In this plasmid, the gfp reporter gene is under the control of the PpclR promoter (Fig. 2A). As a primer, we used the oligonucleotide Int-gfp (Table 1), which anneals to gfp transcripts (Fig. 2B). A cDNA product of 105 nucleotides was detected (Fig. 2C, lane 1), which could correspond to a transcription initiation event at coordinate 1,388,120. This coordinate is located 7 nucleotides downstream of the − 10 element of the PpclR promoter (Fig. 1). In addition to the mentioned cDNA products, a possible non-specific product of 121 nucleotides was detected in both primer extension reactions (Fig. 2C, lanes 1 and 2). From these results, we conclude that the pneumococcal RNA polymerase recognizes the PpclR promoter and initiates transcription at coordinate 1,388,119/1,388,120 (Fig. 1).
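The coordinate arithmetic behind the primer extension result can be sketched in Python as follows. The sketch assumes that the cDNA length includes the primer and that the 5′ end of the primer maps to the outermost coordinate of the annealing region (1,388,232 for Dw1404-2, since pclR is transcribed towards increasing coordinates); the function name and the "+"/"-" strand labels are illustrative choices, not part of the original work.

```python
def tss_from_primer_extension(primer_5prime_coord: int, cdna_length: int,
                              transcript_strand: str) -> int:
    """Chromosomal coordinate of the transcript 5' end.

    The cDNA starts at the 5' end of the primer and stops at the 5' end of
    the RNA, so the start site lies (cdna_length - 1) positions away from
    the primer 5' end, in the upstream direction of the transcript.
    """
    if cdna_length < 1:
        raise ValueError("cDNA length must be at least 1")
    if transcript_strand == "+":
        # Transcript runs towards higher coordinates; the cDNA extends towards lower ones.
        return primer_5prime_coord - cdna_length + 1
    if transcript_strand == "-":
        # Transcript runs towards lower coordinates; the cDNA extends towards higher ones.
        return primer_5prime_coord + cdna_length - 1
    raise ValueError("transcript_strand must be '+' or '-'")


# Dw1404-2 spans coordinates 1,388,208 to 1,388,232; the cDNA was 114 nucleotides long.
pclr_tss = tss_from_primer_extension(1_388_232, 114, "+")
```

With these inputs the function returns 1,388,119, the coordinate reported above for the reaction with oligonucleotide Dw1404-2.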
Table 1
Oligonucleotides used in this work.
Name            Sequence (5′ to 3′) (a)
FpclR           CTTAGACAAAAAAGCATGCAATGAATTATGG
RpclR           GTAAAGGAAGTATAGCATGCAGATAAGAGAA
DwpclA          CATTTTAAACTCCGAGCTCGTATTTTA
pclR-Dw         CATACGGGTATTGAGCTCATAATTCATT
pclR∆105        AGCCACTTATATTGAGCTCATTTTTCCG
pclR∆-10        TTCATTATATCGAGCTCTGTCTAAGTAG
UppclA          CTAATTTTTCGGCGAGCTCATGTAATT
FpclA∆173       GAAAATTGTCGTTAGAGCTCGGAAAAATT
FpclA∆203       GAATAATATAAGGAGCTCTTAAGTGCAAA
FpclA∆224       GTGCAAAAACAGAGCTCATTAATTGTT
RpclA∆-10       CAATGTACAACTATTTGAGCTCTTAATCAAT
pclR-Nde        ATGGAGGAAATACCCATATGAGAAACCTTT
pclR-Xho-His    GACTTTTTTGATCTCGAGTAAAGTATTGGA
Dw1404-2        CGGCTAGTTCATGTAATTTCATCCA
Int-gfp         CATCACCATCTAATTCAACAAG
Fera-q          GATTGCCATCATGAGTGACAAGG
Rera-q          AGTGTCCACTTCGCGAAGGGT
FpclR-q         CCAACCTCTATCGACTGGGCA
RpclR-q         CAGGAAGGTCAGGAAAAGGC
FpclA-q         GACGTGATGGTTCAGCTCCA
RpclA-q         GGATTTGTCACCGTAATTGT
1622A           AGTTCCTGATTGTATTCCCT
1622J           GAATAAGGATAATCTGATTTGGCA
F1623-q         GGGGGACAGTGGTTCTATCA
1623B           CGTAAATTTACATGAACAGTTGGG
− 40 M13        GTTTTCCCAGTCACGAC
Up1404          CTCCTAGATAGTATTTTATAGT
Dw1404          GAATTAGGGTTTCCATTAAGCGT
Up1404-2        CAATGTACAACTATTTGAGGCA
1622H           CGGATTAAACCTCTTGCAATTATACC
1622I           CAAATTCTTTAATTGTTGCTATTA
(a) Restriction sites are underlined, and the base changes that generate restriction sites are in bold.
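Because underlining and boldface are not visible in a plain-text rendering of Table 1, the restriction sites can be located by a simple sequence search. The Python sketch below does this for a subset of the oligonucleotides. The recognition sequences (SacI GAGCTC, SphI GCATGC, NdeI CATATG, XhoI CTCGAG) are standard, but assigning a given enzyme to a given oligonucleotide is an inference from the sequence and the oligonucleotide name, not a statement taken from the table.

```python
from typing import Dict, List, Tuple

# Standard recognition sequences; all four are palindromic, so searching
# one strand is sufficient.
RECOGNITION_SITES: Dict[str, str] = {
    "SacI": "GAGCTC",
    "SphI": "GCATGC",
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
}

# Subset of Table 1 (sequences written 5' to 3').
OLIGOS: Dict[str, str] = {
    "FpclR": "CTTAGACAAAAAAGCATGCAATGAATTATGG",
    "pclR-Dw": "CATACGGGTATTGAGCTCATAATTCATT",
    "pclR-Nde": "ATGGAGGAAATACCCATATGAGAAACCTTT",
    "pclR-Xho-His": "GACTTTTTTGATCTCGAGTAAAGTATTGGA",
    "Int-gfp": "CATCACCATCTAATTCAACAAG",
}


def find_sites(sequence: str) -> List[Tuple[str, int]]:
    """Return (enzyme, 1-based start position) for every site occurrence."""
    hits: List[Tuple[str, int]] = []
    seq = sequence.upper()
    for enzyme, site in RECOGNITION_SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((enzyme, start + 1))
            start = seq.find(site, start + 1)
    return hits


for name, seq in OLIGOS.items():
    print(name, find_sites(seq))
```

Oligonucleotides used only for primer extension or qRT-PCR, such as Int-gfp, are expected to return an empty list.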
PclR activates the promoter of the pclA gene in bacterial cultures
By qRT-PCR assays and using total RNA from strains R6/pDLFpclR (high levels of pclR expression) and R6/pDLFpclR-i (low levels of pclR expression), we analysed the effect of PclR on the transcription of the pclA gene. Transcription of pclA was found to be higher (~ 3.4-fold) in the strain with high levels of pclR expression (Supplementary Table S5). Moreover, using total RNA from strains R6∆mga/pDLFpclR (high levels of pclR expression) and R6∆mga/pDLFpclR-i (low levels of pclR expression), we confirmed that the amount of pclA transcripts was higher (~ 4.5-fold) in the strain with high levels of pclR expression (Supplementary Table S6). These results indicated that PclR has a positive effect on the transcription of the pclA gene, both in the presence and in the absence of the MgaSpn regulator.
The ATG codon at coordinate 1,387,920 is likely the translation start site of the pclA gene (Fig. 1). Sequence analysis of the region located between coordinates 1,388,224 and 1,387,910 revealed the existence of a putative promoter (named PpclA herein), in which the − 35 (TTGATT) and − 10 (TACATT) elements are separated by 17 nucleotides (optimal length). To analyse whether such a sequence had promoter activity, we constructed several transcriptional fusions based on the gfp reporter gene (Fig. 3A). First, we inserted a 288-bp DNA fragment (coordinates 1,388,224 to 1,387,937) into the promoter-probe vector pASTT and introduced the recombinant plasmid (pASTT-PpclA) into the pneumococcal R6 strain. Measuring the fluorescence of the cultures, we did not detect significant differences in gfp expression between R6/pASTT (0.07 ± 0.01 units; background level) and R6/pASTT-PpclA (0.08 ± 0.01 units). However, when pASTT-PpclA was introduced into R6∆mga/pDLFpclR (high levels of pclR expression) and R6∆mga/pDLFpclR-i (low levels of pclR expression), we detected a higher level of gfp expression (~ 2.5-fold) in the strain with high levels of pclR expression (Fig. 3A). Similar results were obtained with plasmids pASTT-PpclA∆103 and pASTT-PpclA∆173, which allowed us to conclude that the 115-bp region between coordinates 1,388,051 and 1,387,937 contains a PclR-dependent promoter. No PclR-dependent promoter activity was detected (i) when the − 10 element of the PpclA promoter was deleted (from coordinate 1,387,974 to 1,387,937; plasmid pASTT-PpclA∆-10), and (ii) when a 30-bp region located upstream of the PpclA promoter was removed (from coordinate 1,388,051 to 1,388,021; plasmids pASTT-PpclA∆203 and pASTT-PpclA∆224) (Figs. 1, 3A). Finally, by primer extension assays (Fig. 3C), we confirmed that the PpclA promoter located on pASTT-PpclA∆103 (Fig. 3B) is functional. We used total RNA from strain R6∆mga/pDLFpclR/pASTT-PpclA∆103 (high levels of pclR expression) and the oligonucleotide Int-gfp, which anneals to gfp transcripts (Fig. 3B). A cDNA product of 114 nucleotides was detected (Fig. 3C), which could correspond to a transcription initiation event at coordinate 1,387,946 (Figs. 1, 3B). This coordinate is located 7 nucleotides downstream of the − 10 element of the PpclA promoter. Taken together, these results indicate that PclR activates transcription from the PpclA promoter. This activation requires sequences located between positions − 75 and − 105 of the PpclA promoter.
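The correspondence between promoter positions and chromosomal coordinates stated above can be checked with a short Python sketch. It assumes the usual convention that the transcription start site is position +1 and that there is no position 0, and it treats pclA as transcribed towards decreasing coordinates (consistent with its start site at 1,387,946 and its ATG at 1,387,920). The function name and the "+"/"-" strand labels are illustrative.

```python
def position_to_coordinate(position: int, tss_coord: int, strand: str) -> int:
    """Convert a position relative to the start site (+1) into a coordinate."""
    if position == 0:
        raise ValueError("there is no position 0; the start site is +1")
    # Signed distance from the start site, measured in the transcript direction.
    offset = position - 1 if position > 0 else position
    if strand == "+":
        return tss_coord + offset
    if strand == "-":
        return tss_coord - offset
    raise ValueError("strand must be '+' or '-'")


PPCLA_TSS = 1_387_946  # transcription start site of pclA reported in the text

# Boundaries of the region required for PclR-dependent activation.
upstream_near = position_to_coordinate(-75, PPCLA_TSS, "-")
upstream_far = position_to_coordinate(-105, PPCLA_TSS, "-")
```

Under these assumptions, positions − 75 and − 105 map to coordinates 1,388,021 and 1,388,051, the boundaries of the upstream region removed in plasmids pASTT-PpclA∆203 and pASTT-PpclA∆224.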
Figure 3
Identification of the PpclA promoter. (A) Fluorescence assays. Six regions of the R6 chromosome were amplified by PCR and inserted into the SacI site of the promoter-probe vector pASTT. The coordinates of the inserted regions are indicated. The grey arrow represents the − 10 element of the PpclA promoter. See also legend to Fig. 2A. (B) Plasmid pASTT-PpclA∆103. The main sequence elements (− 35 box and − 10 box) of the PpclA promoter and the ATG translation start codon of the gfp gene are indicated. The SacI site is underlined. The position of the Int-gfp oligonucleotide used for primer extension is shown. The asterisk indicates the transcription start site identified in this work. (C) Primer extension reaction using total RNA from strain R6∆mga/pDLFpclR/pASTT-PpclA∆103. The Int-gfp oligonucleotide was used as a primer. Dideoxy-mediated chain termination sequencing reactions (M13mp18 DNA and primer − 40 M13[36]) were run in the same gel as DNA size markers. The corresponding lanes (A, C, G, and T) came from the same gel but were taken at a lower exposure (delineation with dividing lines). See the full-length gel (high and low exposures) at the end of the Supplementary Information file. The size (in nucleotides) of the cDNA product is indicated on the left of the gel.
PclR binds upstream of the PpclA core promoter
By DNase I footprinting experiments, we analysed whether PclR recognized the PpclA promoter region. A 270-bp DNA fragment (coordinates 1,388,196 to 1,387,927) was radioactively labelled at the 5′-end of the coding strand. The labelled DNA (2 nM) was then incubated with increasing concentrations of a His-tagged version of the PclR protein (PclR-His) (Fig. 4A). At 400 nM of PclR-His, protections against DNase I digestion were observed at a particular region, from position − 169 to − 68 relative to the transcription initiation site of the PpclA promoter. Moreover, two positions located at − 139 and − 103 were more sensitive to DNase I cleavage (Fig. 4A). To determine the region protected by PclR-His on the non-coding strand, a 281-bp DNA fragment (coordinates 1,388,232 to 1,387,952) was radioactively labelled at the 5′-end of the non-coding strand (Fig. 4B). At 400 nM of PclR-His, major changes in DNase I sensitivity (diminished cleavages) were observed from position − 152 to − 83. These results indicated that PclR-His recognizes a site located between positions − 169 and − 68 of the PpclA promoter (Fig. 4C). This region contains the sequence (from position − 105 to − 75) that PclR needs to activate the PpclA promoter (Fig. 3A). Thus, we conclude that PclR activates transcription of the pclA gene by binding to a specific site upstream of the PpclA core promoter. Using the bend.it server (pongor.itk.ppke.hu/dna/bend_it.html), we calculated the bendability/curvature propensity plot of the 270-bp DNA fragment. The profile contains two potential intrinsic curvatures (~ 10–11 degrees per helical turn) within the PclR binding site (Supplementary Fig. S5). Intrinsic curvatures flanked by regions of bendability have been also predicted in DNA sites recognized by the MgaSpn transcriptional activator[17].
Figure 4
DNase I footprints of PclR-His-DNA complexes. (A) The 270-bp DNA fragment (coordinates 1,388,196 to 1,387,927) was 32P-labelled at the 5′ end of the coding strand (pclA gene) using the 32P-labelled Dw1404 oligonucleotide. (B) The 281-bp DNA fragment (coordinates 1,388,232 to 1,387,952) was 32P-labelled at the 5′ end of the non-coding strand (relative to pclA) using the 32P-labelled Up1404-2 oligonucleotide. The labelled DNA (2 nM) was incubated with the indicated concentrations of PclR-His and then digested with DNase I. Non-digested DNA (F) and dideoxy-mediated chain termination sequencing reactions were run in the same gel (lanes A, C, G, T). In panel A, the sequence corresponds to the coding strand of the 270-bp DNA fragment (32P-labelled Dw1404 oligonucleotide). In panel B, the sequence corresponds to the non-coding strand of the 281-bp DNA fragment (32P-labelled Up1404-2 oligonucleotide). Densitometer scans corresponding to DNA without PclR-His (blue line) and DNA with PclR-His (400 nM, black line) are shown. The protected regions are indicated with brackets. Arrowheads indicate positions that are slightly more sensitive to DNase I cleavage. The indicated positions are relative to the transcription start site of the pclA gene. (C) Nucleotide sequence of the region that spans coordinates 1,388,138 to 1,387,918 of the R6 chromosome. The − 35 and − 10 elements of the PpclA promoter are indicated. The transcription start site (+1 position), the putative Shine-Dalgarno sequence (SD), and the translation start codon (ATG) of the pclA gene are indicated. Brackets indicate regions protected against DNase I digestion. Black arrowheads indicate positions that are slightly more sensitive to DNase I cleavage. The grey box includes the site recognized by PclR-His.
On both DNA strands and at 800 nM of PclR-His (Fig. 4A,B), regions protected against DNase I digestion were observed along the DNA fragment, which suggested that, upon binding to the primary site, additional PclR-His units interacted with the adjacent DNA regions. This result is consistent with the ability of PclR-His to generate multimeric complexes on linear double-stranded DNAs (Supplementary Fig. S6A), a feature previously reported for the MgaSpn transcriptional regulator[17,24]. Specifically, we performed electrophoretic mobility shift assays (EMSAs) with the 270-bp DNA fragment that had been used in the DNase I footprinting assay. As shown in Supplementary Fig. S6A, the 32P-labelled DNA was incubated with different concentrations of PclR-His in the presence of non-labelled competitor calf thymus DNA. Free and bound DNAs were separated by electrophoresis on a native polyacrylamide (6%) gel. At 200 nM of PclR-His, free DNA and four protein-DNA complexes were detected. In addition, as the protein concentration was increased, such complexes disappeared and higher-order complexes appeared. This pattern of complexes suggested that multiple protein units bind in an orderly manner on the same linear DNA molecule.
PclR and MgaSpn have different DNA-binding specificities
According to EMBOSS Needle Pairwise Sequence Alignment[25,26], the N-terminal regions (first 170 amino acids) of PclR and MgaSpn share high sequence similarity (66.5% similarity and 50% identity). Both regions contain two predicted helix-turn-helix DNA-binding domains, the so-called HTH_Mga (residues 6 to 65) and Mga (residues 72 to 158) domains (Supplementary Figs. S1, S2, and S3). To determine whether MgaSpn recognized the PpclA promoter region, we performed DNase I footprinting assays using MgaSpn-His and the 270-bp DNA fragment. The 270-bp DNA fragment was radioactively labelled at the 5′-end of the coding strand (Fig. 5A). At 75 nM of MgaSpn-His, diminished DNase I cleavages were observed from position − 173 to − 196, and from − 102 to − 115. Moreover, positions − 47, − 69, − 87, and − 131 were slightly more sensitive to DNase I digestion (Fig. 5A,C). This result was confirmed in shorter electrophoretic runs (Supplementary Fig. S7). At higher MgaSpn-His concentrations, protections against DNase I digestion were observed along the entire DNA fragment (Fig. 5A), which is consistent with the pattern of protein-DNA complexes observed by EMSA (Supplementary Fig. S6B), and with the ability of MgaSpn to form multimeric complexes on linear DNA[17]. The region protected by MgaSpn-His on the non-coding strand was defined using the 281-bp DNA fragment (Fig. 5B). At 100 nM of MgaSpn-His, diminished cleavages were mostly observed from position − 174 to − 213, and from − 103 to − 110. In addition, positions − 87, − 88, − 126, − 145, − 160, − 173, − 245 and − 251 were more sensitive to DNase I digestion (Fig. 5B,C). These results showed that PclR-His and MgaSpn-His recognize different sites on the PpclA promoter region (Fig. 6). MgaSpn-His binds preferentially to two sites: Site A (from − 173 to − 213) and Site B (from − 102 to − 115). Site A is adjacent to the region recognized by PclR-His (from − 68 to − 169) and Site B lies within that region.
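The spatial relationship between the three footprints can be checked with simple interval arithmetic. The following is a minimal sketch, not part of the original analysis: the `Site` class and the `gap_bp` helper are hypothetical names introduced here for illustration, and the positions are those reported above, relative to the pclA transcription start site.

```python
from typing import NamedTuple


class Site(NamedTuple):
    """A protected region; positions are relative to the pclA +1 position."""
    name: str
    upstream: int    # most upstream (most negative) protected position
    downstream: int  # most downstream protected position

    def contains(self, other: "Site") -> bool:
        # True when the other site lies entirely within this one.
        return self.upstream <= other.upstream and other.downstream <= self.downstream

    def overlaps(self, other: "Site") -> bool:
        # True when the two sites share at least one position.
        return self.upstream <= other.downstream and other.upstream <= self.downstream


def gap_bp(upstream_site: Site, downstream_site: Site) -> int:
    """Number of positions separating two non-overlapping sites."""
    return downstream_site.upstream - upstream_site.downstream - 1


# Footprint boundaries as reported in the text (hypothetical variable names).
pclr_site = Site("PclR-His", -169, -68)
site_a = Site("MgaSpn-His Site A", -213, -173)
site_b = Site("MgaSpn-His Site B", -115, -102)

print(pclr_site.contains(site_b))   # Site B inside the PclR-His site?
print(pclr_site.overlaps(site_a))   # Site A overlapping the PclR-His site?
print(gap_bp(site_a, pclr_site))    # positions between Site A and the PclR-His site
```

Under these boundaries, Site B falls inside the PclR-His footprint, whereas Site A does not overlap it and is separated from it by only a few positions, which is what "adjacent" means in the text.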
Next, we analysed whether MgaSpn influenced the expression of the pclA gene. Specifically, by qRT-PCR assays, we determined the relative expression of the pclA gene in two strains: R6∆mga/pDLPsulA::mga (plasmid-encoded MgaSpn) and R6∆mga/pDL287 (absence of MgaSpn). As shown in Supplementary Table S7, plasmid-encoded MgaSpn had no significant effect on the intracellular levels of pclA transcripts. Plasmid pDLPsulA::mga had been used previously to demonstrate that plasmid-encoded MgaSpn activates the P1623B promoter[15]. Thus, the function (if any) of the interaction of MgaSpn-His with the sites A and B (Fig. 6) remains unknown.
Figure 5
DNase I footprints of MgaSpn-His-DNA complexes. (A) Binding of MgaSpn-His to the 270-bp DNA fragment (coordinates 1,388,196 to 1,387,927), which was 32P-labelled at the 5′ end of the coding strand (relative to pclA) using the 32P-labelled Dw1404 oligonucleotide. (B) Binding of MgaSpn-His to the 281-bp DNA fragment (coordinates 1,388,232 to 1,387,952), which was 32P-labelled at the 5′ end of the non-coding strand (relative to pclA) using the 32P-labelled Up1404-2 oligonucleotide. Non-digested DNA (F) and dideoxy-mediated chain termination sequencing reactions were run in the same gel (lanes A, C, G, T). In panel A, the sequence corresponds to the coding strand of the 270-bp DNA fragment (32P-labelled Dw1404 oligonucleotide). In panel B, the sequence corresponds to the non-coding strand of the 281-bp DNA fragment (32P-labelled Up1404-2 oligonucleotide). Densitometer scans corresponding to DNA without MgaSpn-His (blue line) and DNA with MgaSpn-His (black line) are shown. Brackets represent the MgaSpn-His protected regions. Positions more sensitive to DNase I cleavage are indicated with arrowheads. The indicated positions are relative to the transcription start site (+1 position) of the pclA gene. (C) Nucleotide sequence of the region that spans coordinates 1,388,213 to 1,387,951 of the R6 chromosome. The − 35 and − 10 elements of the PpclA promoter are indicated. MgaSpn-His protected regions (brackets) and positions more sensitive to DNase I cleavage (arrowheads) are indicated. The two sites recognized preferentially by MgaSpn-His (Sites A and B) are shown.
Figure 6
Sites recognized by PclR-His and MgaSpn-His in the region that contains the PpclA and PpclR promoters. The sites were defined by DNase I footprinting assays (see Figs. 4 and 5). The location of the main elements of each promoter (− 35 and − 10 boxes), as well as the transcription start site (position +1) and the translation initiation codon (ATG) of the pclA and pclR genes are indicated. The − 68, − 102, − 115, − 169, − 173 and − 213 positions are relative to the transcription start site of the pclA gene.
MgaSpn activates the transcription of the spr1623-spr1626 operon by binding to a specific site upstream of the P1623B promoter (positions − 60 to − 99)[15,17]. By DNase I footprinting assays, we also analysed whether PclR-His recognized the P1623B promoter region. We used a 222-bp DNA fragment (coordinates 1,598,298 to 1,598,519) that contains the P1623B promoter and the site recognized by MgaSpn[17]. Specific regions protected against DNase I digestion were not detected (Supplementary Fig. S8), indicating that PclR-His does not recognize a specific site on the P1623B promoter region. This result correlated with the inability of PclR to influence the activity of the P1623B promoter. By qRT-PCR assays, we found similar levels of spr1623 transcripts in strains that produce different levels of PclR: R6/pDLFpclR (high levels of pclR expression) versus R6/pDLFpclR-i (low levels of pclR expression) (Supplementary Table S8), and R6∆mga/pDLFpclR (high levels of pclR expression) versus R6∆mga/pDLFpclR-i (low levels of pclR expression) (Supplementary Table S9).
Taken together, we conclude that PclR and MgaSpn have different DNA-binding specificities. They recognize different sites on the PpclA promoter region. Moreover, unlike MgaSpn, PclR does not bind to the P1623B promoter region. In agreement with these results, MgaSpn does not affect the activity of the PpclA promoter, and PclR does not affect the activity of the P1623B promoter.
Discussion
S. pneumoniae is an opportunistic pathogen able to proliferate in different niches of the human host. Its adaptation to new environments and host-imposed stresses partially relies on the activity of specific transcriptional regulators. The genome of the pneumococcal R6 strain has several gene clusters that are absent from other strains. One of these clusters contains two divergent genes, pclA, which encodes a putative cell surface protein[12], and pclR, whose function has been investigated in this work. We have identified the promoter of each gene (PpclA and PpclR) and demonstrated that PclR functions as a transcriptional activator. It stimulates pclA transcription by binding to a specific site upstream of the PpclA core promoter. PclA is a collagen-like protein, which contains the peptidoglycan anchor LPXTG motif and several GXY amino acid repeats[12]. This repeating pattern is the most typical feature in the molecular architecture of bacterial collagen-like proteins[37]. In pathogenic streptococci, surface-exposed collagen-like proteins have been associated with processes of colonization, biofilm formation, and evasion of the host immune response[38]. In the case of PclA, Paterson et al.[12] reported that a pclA deletion mutant strain is defective in adherence and invasion of nasopharyngeal and epithelial cells in vitro. Thus, we speculate that PclR could have a regulatory role during pneumococcal colonization. Using the EMBOSS Needle Pairwise Sequence Alignment program[25,26], we have found that PclR has sequence similarity (40.4%) to the Mga global regulator (530 residues; GenBank AAT87855.1) of the Gram-positive bacterium S. pyogenes (Group A Streptococcus; GAS). It has been reported that Mga positively regulates the transcription of the scl1 gene (also known as sclA)[39-41]. This gene encodes a collagen-like surface protein (Scl1) that interacts with integrins, cellular fibronectin, and laminin[42-44]. Moreover, it has been shown that Scl1 mediates GAS adherence to and internalization by human pharyngeal epithelial cells, playing an important role in pathogenesis[43].
DNA rearrangements and gene acquisition are natural strategies for the generation of genetic diversity in S. pneumoniae, a feature that has been recently shown to be increased by the presence of temperate bacteriophages integrated into different regions of the pneumococcal chromosome[45]. It has been reported that the gene content between pairs of pneumococcal isolates can diverge by as much as 30%[46]. The sequences of the pneumococcal TIGR4 and R6 genomes were published in 2001[9,10]. A comparison of the two sequences revealed the existence of strain-specific genes, many of which are organized in clusters. Specifically, the TIGR4 genome has twelve gene clusters (~ 7% of the total genome) that are not present in R6, and the R6 genome has six gene clusters (~ 3% of the total genome) that are absent from TIGR4[11]. PCR analyses of the distribution of the R6-specific pclA-pclR gene cluster in a collection of clinical isolates revealed that many of these isolates lacked both genes (~ 60% of the strains examined)[12]. Subsequently, pclA was found to be associated with Pneumococcal Molecular Epidemiology Network (PMEN) clones[13]. Clones included in the PMEN are resistant to one or more antibiotics that are in wide clinical use. Moreover, they have a wide geographic distribution (https://www.pneumogen.net/pmen). Now, we have analysed whether the pclR gene was present in the 24 pneumococcal genomes shown in Supplementary Table S10. Such genomes are fully sequenced and assembled (NCBI database). Moreover, they encode a highly conserved MgaSpn regulator[15,21]. Using the BLASTP protein sequence alignment program[47], we have found that only nine out of the 24 genomes encode PclR: strains ATCC 700669, A026, D39, JJA, INV104, ST556, Taiwan19F14, TCH8431/19A and 70585. The PclR regulator of these strains is identical or almost identical to the PclR regulator encoded by the R6 genome (Supplementary Table S10). Like R6, the nine genomes also encode PclA.
A study based on RNA-seq revealed profound changes in the relative amount of the RNAs synthesized by the pneumococcal D39V strain under a wide range of infection-relevant conditions. The expression data as well as the co-expression matrix were published in the PneumoExpress database (https://veeninglab.com/pneumoexpress)[48]. The D39V genome contains the pclA-pclR gene cluster (genes SPV_1376 and SPV_1377 in D39V). Searching in PneumoExpress, we have found that the highest expression level of pclA and pclR corresponds to bacteria grown in nose-mimicking conditions, which simulate colonization. Both genes were also highly expressed in bacteria grown in lung-mimicking conditions, which simulate pneumonia, and in cerebrospinal fluid-mimicking conditions from 37 to 40 °C, which simulate meningeal fever. In the case of the mgaSpn regulatory gene (SPV_1587) and its target operon spr1623-spr1626 (SPV_1588-SPV_1591), the highest expression level also corresponds to bacteria grown in nose-mimicking conditions. Hence, the expression data suggest that PclR and MgaSpn could play a significant role during nasopharyngeal colonization. Previous studies performed by Hemsley et al.[16] showed that an mgaSpn deletion mutant strain was attenuated for both nasopharyngeal carriage and pneumonia in murine infection models. Concerning the expression of the mgaSpn and pclR regulatory genes under standard laboratory conditions (this work), transcription of both genes was found to be higher in the logarithmic phase compared to the stationary phase. Most of the transcription processes in exponentially growing pneumococcal bacteria are initiated by the RNA polymerase that contains the housekeeping sigma factor SigA, also known as RpoD and σ43. In the promoters recognized by the housekeeping factor, the consensus sequence of the − 10 element is 5′-TATAAT-3′, which is present in the promoter of mgaSpn (Pmga) and the promoter of pclR (PpclR). It has been shown that SigA recognizes the Pmga promoter in vitro[49].
The pneumococcal MgaSpn transcriptional regulator is a member of the Mga/AtxA family[17-19], which also includes the global regulator MafR of E. faecalis[20]. Here we have shown that PclR shares some features with MgaSpn. These proteins have nearly the same size (494 and 493 residues, respectively), exhibit a high degree of sequence similarity (60%), and have the same organization of predicted functional domains, including two N-terminal helix-turn-helix DNA-binding motifs. Furthermore, PclR can generate multimeric complexes on linear double-stranded DNA fragments, a feature reported first for MgaSpn[17] and later for MafR[50]. Regarding their mechanism to activate transcription from specific promoters, both proteins stimulate transcription by binding to a specific site upstream of the core promoter. PclR recognizes a site upstream of the PpclA promoter (positions − 68 to − 169), and MgaSpn activates transcription of a four-gene operon (spr1623-spr1626) by binding to a site upstream of the P1623B promoter (positions − 60 to − 99)[17]. Nevertheless, despite these similarities, we have shown that PclR and MgaSpn have different DNA-binding specificities. PclR does not bind to the site recognized by MgaSpn on the P1623B promoter region, and MgaSpn does not bind to the site recognized by PclR on the PpclA promoter region. As a consequence, PclR does not influence the expression of the spr1623 gene, and MgaSpn does not influence the expression of the pclA gene.
In summary, the pclA-pclR gene cluster of the pneumococcal R6 strain is not present in all strains of the species. Our present work demonstrates that PclR is a transcriptional activator of the pclA gene (collagen-like protein). PclR recognizes a specific DNA site upstream of the PpclA core promoter. Moreover, PclR is homologous to the MgaSpn transcriptional regulator, which is also encoded by the R6 genome. Our study shows that PclR and MgaSpn have similar DNA-binding properties but different DNA-binding specificities.
Materials and methods
Oligonucleotides, bacterial strains, and plasmids
The oligonucleotides used in this work are listed in Table 1. S. pneumoniae strains R6[10] and R6∆mga[15] were used. R6∆mga lacks the mgaSpn regulatory gene. The pneumococcal strains R6∆mga/pDL287 (absence of MgaSpn) and R6∆mga/pDLPsulA::mga (plasmid-encoded MgaSpn) were described previously[15]. Plasmid pDLF is a constitutive expression vector that carries a kanamycin resistance gene[20]. This vector has an engineered unique restriction site for SphI downstream of the enterococcal P2493 promoter[34]. Plasmids pDLFpclR and pDLFpclR-i are pDLF derivatives. For their construction, a 1594-bp region of the R6 chromosome was amplified by PCR using the FpclR and RpclR oligonucleotides. After SphI digestion, the 1561-bp restriction fragment was inserted into the SphI site of pDLF in both orientations. pDLFpclR is the recombinant plasmid that carries the pclR gene under the control of the P2493 promoter. Plasmid pASTT is a promoter-probe vector based on the gfp reporter gene[51]. It is a pAST derivative[34] and carries a tetracycline resistance gene. The following pASTT-derivatives were constructed in this work. In all cases, a region of the R6 chromosome was amplified by PCR using the indicated primers. Then, the PCR product was digested with SacI, and the restriction fragment was ligated to the SacI-linearized pASTT vector: (a) pASTT-PpclR (primers DwpclA and pclR-Dw, 190-bp restriction fragment), (b) pASTT-PpclR∆105 (primers pclR∆105 and pclR-Dw, 85-bp restriction fragment), (c) pASTT-PpclR∆-10 (primers DwpclA and pclR∆-10, 169-bp restriction fragment), (d) pASTT-PpclA (primers UppclA and DwpclA, 292-bp restriction fragment), (e) pASTT-PpclA∆103 (primers pclR-Dw and DwpclA, 190-bp restriction fragment), (f) pASTT-PpclA∆173 (primers FpclA∆173 and DwpclA, 119-bp restriction fragment), (g) pASTT-PpclA∆203 (primers FpclA∆203 and DwpclA, 91-bp restriction fragment), (h) pASTT-PpclA∆224 (primers FpclA∆224 and DwpclA, 70-bp restriction fragment) and (i) pASTT-PpclA∆-10 (primers UppclA and RpclA∆-10, 255-bp restriction fragment). For protein overproduction, an inducible expression system based on the Escherichia coli strain BL21(DE3) (a gift of F. W. Studier) and the plasmid vector pET24b (Novagen) was used. This strain carries the gene for T7 RNA polymerase under the control of the lacUV5 promoter[52], which is inducible by isopropyl β-D-1-thiogalactopyranoside (IPTG). Vector pET24b is based on the phi10 promoter recognized by the T7 RNA polymerase. Plasmid pET24b-pclR-His encodes the PclR-His protein, which carries the Leu-Glu-6xHis peptide fused to its C-terminus. For its construction, a 1517-bp region of the R6 chromosome was amplified by PCR using the pclR-Nde and pclR-Xho-His oligonucleotides. The amplified DNA was digested with NdeI and XhoI, and the 1484-bp digestion product was inserted into pET24b. Plasmid pET24b-mgaSpn-His encodes the MgaSpn-His protein[15].
Growth and transformation of bacteria
Pneumococcal cells were grown in AGCH medium[34,53] supplemented with 0.3% sucrose and 0.2% yeast extract, at 37 °C in a static water bath. For plasmid-harbouring cells, the medium was supplemented with kanamycin (50 µg/ml; pDLF derivatives) and/or tetracycline (1 µg/ml; pASTT derivatives). The protocol used for natural transformation of S. pneumoniae was described previously[33]. E. coli cells carrying a pET24b derivative were grown in tryptone-yeast extract (TY) medium supplemented with kanamycin (30 µg/ml), at 37 °C in a shaking water bath. The protocol used to transform E. coli by electroporation was described previously[54].
Overproduction and purification of His-tagged proteins
E. coli strains BL21(DE3)/pET24b-mgaSpn-His[15] and BL21(DE3)/pET24b-pclR-His (this work, see above) were used. The protocols used to overproduce and purify the MgaSpn-His protein were described previously[15]. MgaSpn-His purification involved the use of a HisTrap HP column (GE Healthcare) and a HiLoad Superdex 200 gel filtration column (Amersham). For overproduction and purification of the PclR-His protein, the protocols reported for MafR-His[50] were used. Basically, PclR-His purification included the following steps: (i) precipitation of nucleic acids with polyethyleneimine (PEI) (0.2%) in the presence of NaCl (300 mM), an ionic strength low enough to recover PclR-His in the PEI pellet, (ii) elution of PclR-His from the PEI pellet using a higher ionic strength buffer (700 mM NaCl), (iii) precipitation of the eluted proteins with 70% saturated ammonium sulphate, and (iv) fast-pressure liquid chromatography (Biologic Duoflow, Bio-Rad) on a nickel affinity column (HisTrap HP) (Supplementary Fig. S9). Protein concentration was determined using a NanoDrop ND-2000 Spectrophotometer (Thermo Scientific).
DNA and RNA isolation
Genomic DNA from S. pneumoniae was prepared as reported[53]. Plasmid DNA was prepared using the High Pure Plasmid Isolation Kit (Roche Applied Science) as described[34]. Total RNA was isolated using the RNeasy Mini Kit (QIAGEN). Cultures were processed as specified by the supplier, except that cells were resuspended in a buffer that contained 50 mM Tris–HCl, pH 7.6, 1 mM EDTA, 50 mM NaCl, and 0.1% deoxycholate. Then, cells were incubated at 37 °C for 5 min (cell lysis). The integrity of rRNAs was analysed by agarose gel electrophoresis. RNA concentration was determined using a NanoDrop ND-2000 Spectrophotometer.
Polymerase chain reaction (PCR) and quantitative RT-PCR (qRT-PCR)
Phusion High-Fidelity DNA Polymerase (Thermo Scientific) was used for all PCR applications as reported[34]. PCR products were purified with the QIAquick PCR Purification Kit (QIAGEN). In the qRT-PCR assays, for each strain, total RNA was isolated from three independent bacterial cultures. Then, from each RNA preparation, cDNA was synthesized. For cDNA synthesis with random primers, the iScript Select cDNA Synthesis Kit (Bio-Rad) was used as described previously[20]. To rule out the presence of genomic DNA in the RNA preparations, reactions without adding reverse transcriptase were performed. Quantitative PCRs were carried out using the iQ SYBR Green Supermix (Bio-Rad) and an iCycler Thermal Cycler (Bio-Rad) as reported[20]. From each cDNA sample, three PCRs per gene (gene of interest and internal control gene) were performed. Data were analysed with the iQ™5 Optical System Software. Relative quantification of gene expression was performed using the comparative CT method[32]. The era gene (spr0871) was used as the internal control gene (oligonucleotides Fera-q and Rera-q). The oligonucleotides used to determine the relative expression of pclR (FpclR-q and RpclR-q), pclA (FpclA-q and RpclA-q), mgaSpn (1622A and 1622J) and spr1623 (F1623-q and 1623B) are shown in Table 1. The threshold cycle values (CT) of the gene of interest and the internal control gene were used to calculate $2^{-\Delta C_T}$, where $\Delta C_T = C_T(\text{gene of interest}) - C_T(\text{internal control})$. In general, for each cDNA sample (three in total), the mean CT from the three PCRs for the gene of interest, the mean CT from the three PCRs for the internal control gene, and the $2^{-\Delta C_T}$ value were calculated. Then, the mean ± standard deviation of the three $2^{-\Delta C_T}$ values was calculated. The differences between two groups were analysed using a Student's t-test (paired, two-tailed). For the gene of interest, the fold change in expression (FC) in one strain compared to another was obtained by dividing the corresponding mean $2^{-\Delta C_T}$ values. The results of these analyses are shown in Supplementary Tables S1 to S9 and Supplementary Fig. S10.
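The comparative CT calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the CT values are hypothetical and are not data from this study, and the paired t-test is omitted because it would require a statistics package beyond the standard library.

```python
from statistics import mean, stdev


def relative_expression(ct_gene, ct_control):
    """Return 2^-dCT for one cDNA sample.

    ct_gene and ct_control are the CT values of the technical PCR replicates
    for the gene of interest and the internal control gene, respectively.
    """
    delta_ct = mean(ct_gene) - mean(ct_control)
    return 2.0 ** (-delta_ct)


def summarize(samples):
    """Return (mean, standard deviation) of 2^-dCT over independent cultures.

    samples is a list of (ct_gene_replicates, ct_control_replicates) pairs,
    one pair per cDNA sample.
    """
    values = [relative_expression(gene, control) for gene, control in samples]
    return mean(values), stdev(values)


def fold_change(samples_a, samples_b):
    """Fold change of strain A relative to strain B (ratio of mean 2^-dCT)."""
    return summarize(samples_a)[0] / summarize(samples_b)[0]


# Hypothetical CT values: three cultures per strain, three PCRs per gene.
strain_a = [([20, 20, 20], [18, 18, 18])] * 3  # dCT = 2 in every culture
strain_b = [([22, 22, 22], [18, 18, 18])] * 3  # dCT = 4 in every culture

print(summarize(strain_a))
print(fold_change(strain_a, strain_b))
```

With these made-up values, a ΔCT that is two cycles lower in strain A corresponds to a four-fold higher relative expression, which illustrates why each cycle of difference counts as a factor of two.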
Primer extension
The ThermoScript Reverse Transcriptase enzyme (Invitrogen) and [α-32P]-dATP (3000 Ci/mmol; PerkinElmer) were used as reported[15]. Basically, primer extension reactions (20 µl) contained 1 pmol of the indicated oligonucleotide and 2.5–15 µg of total RNA from the indicated strain. To anneal the primer with the transcript, samples were incubated at 65 °C for 5 min. Extension reactions were carried out at 55 °C for 45 min. After heating at 85 °C for 5 min, non-incorporated [α-32P]-dATP was removed using Illustra MicroSpin™ G-25 columns (GE Healthcare). Samples were ethanol precipitated as described[51]. cDNA products were analysed by sequencing gel (8 M urea, 6% polyacrylamide) electrophoresis. As DNA size markers, dideoxy-sequencing reactions were carried out using M13mp18 DNA, primer − 40 M13[36], and the Sequenase Version 2.0 DNA Sequencing kit (USB Corporation). Labelled products were visualized using a Fujifilm Image Analyser FLA-3000.
Fluorescence assays
Pneumococcal cells harbouring a pASTT derivative were grown as indicated above to an optical density at 650 nm (OD650) of 0.3–0.4 (exponential phase). Then, cultures were processed as reported[51]. Fluorescence intensity was measured using a Thermo Scientific Varioskan Flash instrument.
DNase I footprinting assays
Oligonucleotides were 32P-labelled at the 5′-end as described[17]. PCR amplification using a 32P-labelled oligonucleotide was used to obtain double-stranded DNA fragments labelled at the 5′-end of one of the strands. Three regions of the R6 chromosome were amplified by PCR: (a) a 270-bp region (coordinates 1,388,196 to 1,387,927) using the Up1404 and Dw1404 oligonucleotides, (b) a 281-bp region (coordinates 1,388,232 to 1,387,952) using the Up1404-2 and Dw1404-2 oligonucleotides, and (c) a 222-bp region (coordinates 1,598,298 to 1,598,519) using the 1622H and 1622I oligonucleotides. Binding reactions and DNase I digestion were performed as described[51]. Samples were analysed by sequencing gel (6% polyacrylamide, 8 M urea) electrophoresis. Labelled products were visualized using a Fujifilm Image Analyser FLA-3000 and the intensity of the bands was quantified using the Quantity One software (Bio-Rad).
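As a quick consistency check, the stated fragment sizes can be recomputed from the chromosomal coordinates, counting both end positions. The sketch below is illustrative; `fragment_length` is a hypothetical helper name, and the coordinates are those given in the text.

```python
def fragment_length(start: int, end: int) -> int:
    """Length in bp of a region given inclusive coordinates in either orientation."""
    return abs(end - start) + 1


# Coordinates of the three amplified regions of the R6 chromosome.
regions = {
    "Up1404 / Dw1404": (1_388_196, 1_387_927),
    "Up1404-2 / Dw1404-2": (1_388_232, 1_387_952),
    "1622H / 1622I": (1_598_298, 1_598_519),
}

for primers, (start, end) in regions.items():
    print(primers, fragment_length(start, end), "bp")
```

Taking the absolute difference makes the helper independent of strand orientation, which matters here because the two PpclA fragments are given with decreasing coordinates.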
Electrophoretic mobility shift assays
Binding reactions were performed as described[50]. When indicated, non-labelled competitor calf thymus DNA and 32P-labelled DNA were added simultaneously to the binding reaction. Reaction mixtures were analysed by electrophoresis on native polyacrylamide (6%) gels.
Authors: L McGee; L McDougal; J Zhou; B G Spratt; F C Tenover; R George; R Hakenbeck; W Hryniewicz; J C Lefévre; A Tomasz; K P Klugman Journal: J Clin Microbiol Date: 2001-07 Impact factor: 5.948
Authors: S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman Journal: Nucleic Acids Res Date: 1997-09-01 Impact factor: 16.971
Authors: H Tettelin; K E Nelson; I T Paulsen; J A Eisen; T D Read; S Peterson; J Heidelberg; R T DeBoy; D H Haft; R J Dodson; A S Durkin; M Gwinn; J F Kolonay; W C Nelson; J D Peterson; L A Umayam; O White; S L Salzberg; M R Lewis; D Radune; E Holtzapple; H Khouri; A M Wolf; T R Utterback; C L Hansen; L A McDonald; T V Feldblyum; S Angiuoli; T Dickinson; E K Hickey; I E Holt; B J Loftus; F Yang; H O Smith; J C Venter; B A Dougherty; D A Morrison; S K Hollingshead; C M Fraser Journal: Science Date: 2001-07-20 Impact factor: 47.728
Authors: Reinhold Brückner; Michael Nuhn; Peter Reichmann; Beate Weber; Regine Hakenbeck Journal: Int J Med Microbiol Date: 2004-09 Impact factor: 3.473
Authors: Claudio Donati; N Luisa Hiller; Hervé Tettelin; Alessandro Muzzi; Nicholas J Croucher; Samuel V Angiuoli; Marco Oggioni; Julie C Dunning Hotopp; Fen Z Hu; David R Riley; Antonello Covacci; Tim J Mitchell; Stephen D Bentley; Morgens Kilian; Garth D Ehrlich; Rino Rappuoli; E Richard Moxon; Vega Masignani Journal: Genome Biol Date: 2010-10-29 Impact factor: 13.583