Miriam Moscoso, Ernesto García.
Abstract
The polysaccharide capsule is the main virulence factor of Streptococcus pneumoniae and makes the bacterium resistant to phagocytosis. Expression of the capsular polysaccharide must be adjusted at different stages of pneumococcal infection; thus, its transcriptional regulation appears to be crucial. To gain insight into the existence of regulatory mechanisms common to most serotypes, a bioinformatic analysis of the DNA region located upstream of the capsular locus was performed. With the exception of serotype 37, the capsular locus is located between dexB and aliA on the pneumococcal chromosome. Up to 26 different sequence organizations were found among pneumococci synthesizing their capsule through a Wzy-polymerase-dependent mechanism, mostly varying according to the presence or absence of distinct insertion elements. As a consequence, only approximately 250 bp (including a 107 bp RUP_A element) were conserved in 86 sequences, and only a short (ca. 87 bp) region located immediately upstream of cpsA was strictly conserved in all the sequences analyzed. An exhaustive search for possible operator sequences was carried out. Interestingly, although the promoter region of serotype 3 isolates differs completely from that of other serotypes, most of the proteins proposed to regulate transcription in serotype 3 pneumococci were also predicted to function as possible regulators in non-serotype 3 S. pneumoniae isolates.
Year: 2009 PMID: 19429668 PMCID: PMC2695774 DOI: 10.1093/dnares/dsp007
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Figure 1. Consensus sequences derived from alignments of strains synthesizing CPS through a Wzy-polymerase- (A and B) or a synthase-dependent mechanism (C). Eighty-six (A), 115 (B) and four (C) sequences were aligned. The frequency of each nucleotide is indicated above the sequence, in vertical format. The position of the transcription initiation site is assigned +1 and other positions are numbered accordingly. (A) The RUP_A element is shaded and a sequence possibly related to some ISs is italicized. Asterisks indicate a region deleted in several strains. Abbreviations: H, A or C or T; K, G or T; M, A or C; R, A or G; S, C or G; W, A or T; Y, C or T. (B) The initiation codon of cpsA is indicated with an arrow. The −35, −10 and the transcription initiation site are indicated by white lettering on a black background. The underlined sequence corresponds to that tandemly duplicated in some strains. (C) Aligned nucleotide sequence of three serotype 3 strains. At positions −94 and −56, a slash indicates a T (or no nucleotide), or an A (or no nucleotide), respectively.
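The ambiguity abbreviations listed in the caption (H, K, M, R, S, W, Y) are standard IUPAC nucleotide codes. As a minimal sketch of how such a consensus line can be derived from an alignment, the Python snippet below collapses each alignment column into one IUPAC symbol. The function name `consensus` and the rule that every observed base counts (no frequency threshold, gaps ignored) are assumptions made for illustration; the procedure actually used to build the figure is not described in this record.

```python
# IUPAC ambiguity code for each set of bases observed in an alignment column.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus(aligned):
    """Return one IUPAC symbol per column of equally long aligned sequences.

    Assumption (hypothetical rule): every base seen in a column contributes,
    and gap characters ('-') are ignored; an all-gap column yields '-'.
    """
    symbols = []
    for column in zip(*aligned):
        bases = frozenset(b.upper() for b in column if b != "-")
        symbols.append(IUPAC[bases] if bases else "-")
    return "".join(symbols)


print(consensus(["ACGT", "ACGA", "ATGT"]))  # AYGW
```

A frequency cut-off (for example, ignoring bases seen in only one sequence) would change the output, so this sketch only illustrates the notation and does not reproduce the figure.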
Binding sites of transcriptional regulators putatively involved in CPS biosynthesis of S. pneumoniae
| Transcriptional regulator (Accession no.) | Binding site | Pneumococcal homolog (Accession no.) | Identity/similarity (log10 | Sequence (5′ position)a, Wzy-dependentb | Sequence (5′ position)a, Synthase-dependent | |
|---|---|---|---|---|---|---|
| ComX1 (Q8CM18) | TACGAATA | — | — | TACtAAgA (−75f) | TgaGAATA (+19f) | |
| TAacAATA (−152r) | TAttAATA (+25f) | |||||
| TAaGAAcA (−111r) | TAatAATR (+28f) | |||||
| TACGAtTA (−4r) | TAttAATA (+25r) | |||||
| CopY (Q8DQJ7) | KACAN2TGTA | — | — | TAgARTaGTA (−186f) | GAggACTGTA (+44f) | |
| TACACATcTg (−178f) | GACtGTaGTA (+47f) | |||||
| TAaAGATGcA (−63f) | ||||||
| aAaAGGTGTA (−44f) | ||||||
| TAtAATcGTA (−13f) | ||||||
| TAtAGGTGTt (+10f) | ||||||
| TACtAYTcTA (−177r) | ||||||
| cAgATGTGTA (−169r) | ||||||
| GACAAATtgA (−120r) | ||||||
| TgCATCTtTA (−54r) | ||||||
| TACACCTtTt (−35r) | ||||||
| TACgATTaTA (−4r) | ||||||
| aACACCTaTA (+19r) | ||||||
| MalR (P0A4T2) | AAACGTTTc | — | — | AAWCGaTT (−149f) | AAAaGTTT (−52f) | |
| AAAgGTgT (−43f) | AAACtaTT (−5f) | |||||
| AAtCGWTT (−142r) | AAACtTTT (−45r) | |||||
| AcACcTTT (−36r) | AAtaGTTT (+3r) | |||||
| GlnR (Q8DQX7) | TGTNAN7TNACA | — | — | TcTAAAAHATTKTTAgA (−166f) | TGaTATTCCCCTTGACA (−45f) | |
| TcTTATTTCATTTTACt (−115f) | TtTTAAATAAAGTGAgA (+7f) | |||||
| TcTAAMAATATTTTAgA (−150r) | TGaGAATATTAATAAtR (+19f) | |||||
| aGTAAAATGAAATAAgA (−99r) | TGTCAAGGGGAATAtCA (−29r) | |||||
| TcTCACTTTATTTAAaA (+23r) | ||||||
| YaTTATTAATATTCtCA (+35r) | ||||||
| RitR (Q04M91) | WNATTANW3RWYRR | — | — | ATATTATATTGAaAc (−201f) | TAATcAGTTTAACGG (−101f) | |
| TGATcAATTTGTCAt (−132f) | AGATaAAATTATTAt (−26f) | |||||
| TTAcTATATTtTTGG (−103f) | AAATTATTATATaAt (−21f) | |||||
| AAtaTAGTAAAATGA (−94r) | TTATTATATAATTAA (−18f) | |||||
| TAATTAAAcTATTGc (−10f) | ||||||
| AAgTgAGAATATTAA (+16f) | ||||||
| TGAgaATATTAATAA (+19f) | ||||||
| ATATTAATAAtgCAG (+24f) | ||||||
| TGATTACTTTccTAA (−96r) | ||||||
| TAATaATTTTATCtA (−13r) | ||||||
| TAATTATATAATaAt (−5r) | ||||||
| TTAaTtATATAATAA (−4r) | ||||||
| AGtTTAATTAtATAA (+1r) | ||||||
| TTATTtAAAAAgCAA (+16r) | ||||||
| ACtTTATTTAAAaAG (+19r) | ||||||
| AdcR (Q04I02) | TAACYRGTTAA | — | — | — | TAAtCAGTTtA (−91f) | |
| TAtCCcGTTAA (−83r) | ||||||
| CovR/CsrR (Q8P2J8) | DDHHATTARAR | CsrR (Q8DR53) | 45/68 (−49) | TTATATTgAAA (−198f) | GGTTAggAAAG (−112f) | |
| TGAAAcTAGAR (−192f) | AGTAATYAGtt (−103f) | |||||
| TGCTtcTAAAA (−170f) | ATCTtTTcAAA (−83f) | |||||
| HATTkTTAGAA (−159f) | TGATAcTaAGG (−70f) | |||||
| GTCTAcTAAgA (−78f) | TGACAaTAGAt (−33f) | |||||
| AGATAcTtAAA (−70f) | AATAgaTAAAA (−29f) | |||||
| GATAcTTAAAG (−69f) | TAAAATTAttA (−23f) | |||||
| AGATAgTgAAA (−54f) | AATTATTAtAt (−20f) | |||||
| GATAgTgAAAA (−53f) | ATATAaTtAAA (−13f) | |||||
| AGACATTAccG (−35f) | TATAATTAAAc (−12f) | |||||
| TTACcgTAAAA (−30f) | TGCTtTTtAAA (+3f) | |||||
| TACCgTaAAAA (−29f) | TTTAAaTAAAG (+8f) | |||||
| TTTCAaTAtAA (−188r) | TAAAgTgAGAA (+14f) | |||||
| TACTATTctAG (−177r) | GAATATTAAtA (+22f) | |||||
| ATDTtTTAGAA (−157r) | TATTAaTAAtR (+25f) | |||||
| GGACAgTyAAA (−133r) | TAATAaTRcAG (+28f) | |||||
| GAACATgAcAA (−114r) | ||||||
| TGAAAYaAGAA (−106r) | ||||||
| TAAAATgAAAt (−101r) | ||||||
| GTAAAaTgAAA (−100r) | ||||||
| ATATAgTAAAA (−95r) | ||||||
| GTATcTTAGtA (−65r) | ||||||
| ATACATTgAAc (+12r) | ||||||
| RovS (Q8E447) | AWAAWVHTDAWN6/7 WTKWWAMDWAK | SPD_0939 | 52/73 (−76) | — | ATAtAATTAAACTATTGCTTTTTAAATAa (−13f) | |
| CitT (O34534) | WWCAAA | RpsI (A5MLP7)d | 27/50 (−17) | TTgAAA (−193f) | AACAgA (−118f) | |
| TACAcA (−178f) | TTCAAA (−78f) | |||||
| TTCtAA (−167f) | TACtAA (−67f) | |||||
| TAgAAA (−153f) | cACAAA (−59f) | |||||
| ATCAAt (−130f) | AAaAAA (−55f) | |||||
| TACtAA (−75f) | AAaAAA (−54f) | |||||
| AAaAAA (−46f) | ATaAAA (−24f) | |||||
| TAaAAA (−24f) | ATtAAA (−8f) | |||||
| AAaAAA (−23f) | TTtAAA (+8f) | |||||
| TTCAAt (+3f) | AAtAAA (+12f) | |||||
| TTCAAt (−189r) | TRCAgA (+34f) | |||||
| AACAAt (−153r) | ||||||
| TTCtAA (−149r) | ||||||
| gTCAAA (−138r) | ||||||
| gACAAA (−120r) | ||||||
| ATgAAA (−105r) | ||||||
| AcCAAA (−88r) | ||||||
| AACcAA (−87r) | ||||||
| DeoR (P39140) | TTCAAT | Spr0228 (Q8DRC3)d | 28/50 (−30) | aTCAAT (−130f) | TTCAAa (−78f) | |
| TTCAWT (−109f) | TTaAAT (+9f) | |||||
| TTCAAT (+3f) | TTtAAT (−3r) | |||||
| TTCAAT (−189r) | ||||||
| TTCAcT (−45r) | ||||||
| GerE (P11470) | RWWTRGGYN2YY | RR03 (Q8DR45) | 49/67 (−6)e | GATTtGaCTGTC (−145f) | AATTAaaCTATT (−9f) | |
| ATTTGacTGTCC (−144f) | GAAaAGaTATCC (−76r) | |||||
| GTATAGGTRTTa (+10f) | ||||||
| AAATcGWTTTCT (−141r) | ||||||
| cTTTAaGTATCT (−59r) | ||||||
| GATTAtaTCACT (−7r) | ||||||
| CcpA (P25144) | WTGNAANCGNWN2CW | CcpA (Q97NM1) | 54/74 (−96) | TaGAAAWCGATTTrA (−152f) | TTcAAAGCtGATACT (−78f) | |
| ATaTAATCGTAAGaT (−14f) | ||||||
| gTyAAATCGWTTTCT (−138r) | ||||||
| Spo0A (P06534) | TGTCGAA | RR09 (Q8DQN8) | 38/62 (−10) | TaTtGAA (−195f) | TtTCaAA (−79f) | |
| TGTaGAc (−38f) | TGTaGtA (+50f) | |||||
| TGTtcAA (+1f) | TaTCaAA (−41r) | |||||
| TtTaGAA (−161r) | TGTCaAg (−29r) | |||||
| TtTCtAA (−148r) | ||||||
| aGTCaAA (−137r) | ||||||
| TGaCaAA (−119r) | ||||||
| TGTCtAc (−31r) | ||||||
a: f and r indicate whether the sequence corresponds to the forward or reverse sequence of that included in the EMBL database (AF026471 or Z47210), respectively. Unless otherwise stated, a maximum of two mismatches were allowed.
b: Sequences corresponding to those common to all strains are indicated with a gray background. Mismatches are indicated by lowercase lettering.
c: This sequence is a subset of the CcpA binding site.
d: Only one mismatch was allowed.
e: Similarity restricted to part of the protein (from residue 145 to 197 of the pneumococcal protein and 12 to 64 of that of B. subtilis).
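
The table's conventions (degenerate binding-site consensus, a maximum of two mismatches, mismatches in lowercase, f/r for strand) can be illustrated with a short Python sketch. This is not the authors' search tool, which is not identified in this record. The function names `expand`, `reverse_complement`, `scan_strand` and `scan` are hypothetical. Offsets are 0-based positions in the scanned strand, not the +1-relative coordinates used in the table, and variable-length spacers such as N6/7 are not handled.

```python
import re

# Bases allowed by each IUPAC symbol that appears in the consensus sites.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand(motif):
    """Expand run-length shorthand, e.g. 'KACAN2TGTA' -> 'KACANNTGTA'."""
    return re.sub(r"([A-Z])(\d+)",
                  lambda m: m.group(1) * int(m.group(2)), motif)


def reverse_complement(seq):
    """Reverse complement of an unambiguous (A/C/G/T) DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_strand(seq, motif, max_mismatches):
    """Yield (offset, hit) for each window within max_mismatches of motif."""
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        # Matching bases stay uppercase; mismatches are lowercased,
        # mirroring the notation used in the table.
        hit = "".join(b if b in IUPAC[m] else b.lower()
                      for b, m in zip(window, motif))
        if sum(c.islower() for c in hit) <= max_mismatches:
            yield i, hit


def scan(seq, motif, max_mismatches=2):
    """Scan both strands; 'f' is the given strand, 'r' its reverse complement."""
    seq, motif = seq.upper(), expand(motif.upper())
    hits = [(i, "f", h) for i, h in scan_strand(seq, motif, max_mismatches)]
    hits += [(i, "r", h) for i, h in
             scan_strand(reverse_complement(seq), motif, max_mismatches)]
    return hits


# Toy sequence (made up for illustration) containing a two-mismatch
# variant of the ComX1 site TACGAATA.
for offset, strand, hit in scan("GGTACTAAGAGG", "TACGAATA"):
    print(offset, strand, hit)
```

With short or highly degenerate motifs (for example the six-base WWCAAA site), allowing even one mismatch produces many hits by chance, which is consistent with the long candidate lists in the table; such matches are predictions and do not by themselves demonstrate binding.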