Christophe Pichon, Laurence du Merle, Marie Elise Caliot, Patrick Trieu-Cuot, Chantal Le Bouguénec.
Abstract
Characterization of small non-coding ribonucleic acids (sRNAs) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software has been developed based on comparative genomic analyses or mathematical/statistical models. Although the comparative genomic analyses enabled sRNAs in intergenic regions to be identified efficiently, they all failed to predict antisense sRNA genes (asRNAs), i.e. RNA genes located on the DNA strand complementary to that which encodes the protein. The statistical models can in theory analyze any genomic region, but not efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and the Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of the production of type 1 fimbriae, which are involved in the virulence of extra-intestinal pathogenic E. coli.
Year: 2011 PMID: 22139924 PMCID: PMC3326304 DOI: 10.1093/nar/gkr1141
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Strains and plasmids used in this study
| Name | Description | Genotype/Resistance | Reference |
|---|---|---|---|
| Strains | |||
| | Sepsis-associated ExPEC isolate | ( | |
| | Pyelonephritis-associated ExPEC isolate (O6:K15:H31) | ( | |
| | Deletion of the full | ( | |
| | Allelic exchange of the | This study | |
| | Human septicaemia isolate | ( | |
| | Laboratory strain | ( | |
| | ( | ||
| | This study | ||
| Plasmids | |||
| pCP20 | Thermosensitive plasmid expressing the | ( | |
| pKOBEG-Apra | Thermosensitive recombination plasmid used for allelic exchange | pSC101ts, | ( |
| pZE21- | ColE1, | ( | |
| pZE2R- | Replacement of the PLtetO-1 promoter from pZE21- | ColE1, | ( |
| pZE21-null | pZE1- | ColE1, | This study |
| pZE2R-null | pZE2R- | ColE1, | This study |
| pZE2R- | Insertion of | ColE1, | This study |
| pZE21- | Insertion of | ColE1, | This study |
| pZE2R- | Insertion of | ColE1, | This study |
| pXG-0 | Luciferase-expressing plasmid | pSC101*, | ( |
| pXG-10 | Translational fusion of | pSC101*, | ( |
| pXG | pXG10 derivative with a | pSC101*, | This study |
| pXG | pXG10 derivative with a | pSC101*, | This study |
| pTCV- | Shuttle low-copy vector to analyze regulatory elements in Gram-positive bacteria under the control of the constitutive promoter Ptet | pAMβ1, | S. Dramsi |
| pTCV-SQ18 | Insertion of the SQ18 sRNA gene into the BamHI/PstI sites of pTCVerm-Ptet plasmid. | pAMβ1, | This study |
| pTCV-SQ485 | Insertion of the SQ485 sRNA gene into the BamHI/PstI sites of pTCVerm-Ptet plasmid. | pAMβ1, | This study |
| pTCV-SQ893 | Insertion of the SQ893 sRNA gene into the BamHI/PstI sites of pTCVerm-Ptet plasmid. | pAMβ1, | This study |
a Apra, Cb, Cm, Erm and Km denote resistance to apramycin, carbenicillin, chloramphenicol, erythromycin and kanamycin, respectively.
Figure 1. UML activity diagram for our in silico sRNA prediction model. (A) The first part of this process involves the prediction of sRNA protein-binding sites (RIT prediction in this study) and extraction of the flanking sequences. (B) Core software for sRNA analysis and discovery based on a combination of comparative genomics, RNA prediction and covariation analysis.
Summary of sRNA candidates identified in silico
| Strain | Disease | IGR | asRNA | 5′ asRNA | 3′ asRNA | 5′ & 3′ asRNA | 5′ UTR | 3′ UTR | sense RNA |
|---|---|---|---|---|---|---|---|---|---|
| MG1655 | L. S. | 195 | 452 | 74 | 142 | 73 | 89 | 199 | 643 |
| UTI89 | Cys. | 199 | 398 | 66 | 95 | 77 | 96 | 170 | 527 |
| 536 | Pyl. | 191 | 388 | 66 | 107 | 54 | 73 | 140 | 496 |
| AL862 | Sep. | 9 | 6 | 2 | 2 | 0 | 3 | 3 | 4 |
| S88 | Men. | 212 | 430 | 63 | 103 | 85 | 90 | 154 | 532 |
| NEM316 | Sep. | 41 | 63 | 12 | 24 | 6 | 5 | 21 | 25 |
IGR, intergenic region; asRNA, sRNA antisense to a CDS; 5′ asRNA, antisense to the 5′-end of a CDS; 3′ asRNA, antisense to the 3′-end of a CDS; 5′ UTR, 5′ untranslated region of a CDS; 3′ UTR, 3′ untranslated region of a CDS; sense RNA (seRNA), sRNA on the same DNA strand as a CDS. To classify each sRNA candidate into one of these categories, the first nucleotide of the RIT was used as the position reference of the candidate. For candidates on the DNA strand opposite the CDS, this nucleotide had to lie between −50 nt and +15 nt around the ATG codon (5′ asRNA), between +15 nt after the ATG codon and −50 nt before the stop codon (asRNA), or between −50 nt and +15 nt around the stop codon (3′ asRNA). For candidates on the same DNA strand as the CDS, the first RIT nucleotide had to lie less than 100 nt before the ATG codon (5′ UTR), less than 200 nt after the stop codon (3′ UTR), or between +50 nt after the ATG codon and −50 nt before the stop codon (seRNA). All candidates outside a CDS and not included in one of the previous categories are referred to as IGR candidates. All candidates had to have a RIT with a score of ΔG°37 < −4 kcal/mol, and at least two covariations had to be present in the RNA structure, including the stem of the RIT. For asRNA and seRNA candidates, ΔG°37 had to be below −8 kcal/mol. L. S., laboratory strain; Cys., cystitis; Pyl., pyelonephritis; Sep., sepsis; Men., meningitis. Only the PAI-IAL862 sequence of the AL862 strain was analyzed.
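The positional rules in the footnote above can be sketched in a few lines of Python. This is only an illustrative reading of the footnote, not the authors' software: the `CDS` class, the function names and the `"unclassified"` label are hypothetical, the window boundaries are treated as inclusive by assumption, the combined "5′ & 3′ asRNA" column is not modeled, and the stricter −8 kcal/mol cutoff is assumed to apply only to candidates labelled exactly asRNA or seRNA.

```python
from dataclasses import dataclass


@dataclass
class CDS:
    start: int   # coordinate of the first nucleotide of the start codon
    stop: int    # coordinate of the stop codon
    strand: int  # +1 or -1


def classify(rit_pos: int, rit_strand: int, cds: CDS) -> str:
    """Assign a category from the first RIT nucleotide and a neighbouring CDS."""
    # Offsets in the CDS's own orientation: negative = upstream, positive = downstream.
    from_start = (rit_pos - cds.start) * cds.strand
    from_stop = (rit_pos - cds.stop) * cds.strand

    if rit_strand != cds.strand:  # candidate on the opposite strand
        if -50 <= from_start <= 15:
            return "5' asRNA"
        if -50 <= from_stop <= 15:
            return "3' asRNA"
        if from_start > 15 and from_stop < -50:
            return "asRNA"
    else:  # candidate on the same strand as the CDS
        if -100 <= from_start < 0:
            return "5' UTR"
        if 0 < from_stop <= 200:
            return "3' UTR"
        if from_start >= 50 and from_stop <= -50:
            return "seRNA"

    # The footnote only calls candidates *outside* a CDS "IGR";
    # anything left over inside a CDS gets a hypothetical placeholder label.
    inside_cds = from_start >= 0 and from_stop <= 0
    return "unclassified" if inside_cds else "IGR"


def passes_thresholds(category: str, delta_g: float, covariations: int) -> bool:
    """RIT score and covariation filter described in the footnote."""
    # Assumption: the -8 kcal/mol cutoff applies only to "asRNA" and "seRNA".
    cutoff = -8.0 if category in ("asRNA", "seRNA") else -4.0
    return delta_g < cutoff and covariations >= 2
```

For a forward-strand CDS spanning the made-up coordinates 1000 to 1900, `classify(1500, -1, CDS(1000, 1900, 1))` yields `"asRNA"`, while the same position on the same strand yields `"seRNA"`.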
Efficiency of the in silico process for predicting previously known sRNAs in six bacterial species
| Gram | Strains | Total known sRNAs | sRNA genes in IGR: known sRNA with RIT | sRNA genes in IGR: success (%) | asRNA genes in CDS: known asRNA with RIT | asRNA genes in CDS: success (%) |
|---|---|---|---|---|---|---|
| | | 101 | 60 | 86.7 | 5 | 60 |
| | | 79 | 51 | 70.6 | 0 | NA |
| | | 40 | 31 | 90.4 | 9 | 55.5 |
| | | 24 | 24 | 66.7 | 0 | NA |
| | | 55 | 38 | 76.3 | 1 | 100 |
| | | 50 | 27 | 29.6 | 10 | 70 |
a The RITs of the published asRNA genes were not characterized by the authors.
The efficiency of sRNA prediction was calculated from data for bona fide sRNA genes. Only sRNAs that had been experimentally validated by Northern blots, 5′ RACE and RT–PCR were taken into account. We excluded unconfirmed sRNAs from RNA-seq or tiling microarray data and 5′ or 3′ UTRs from mRNAs.
b NA, not applicable.
Figure 2. Comparative analysis of sRNAs identified by our in silico model based on sequence conservation among ExPEC strains. Venn diagram representations of the numbers of sRNAs predicted in IGRs (A) and of asRNAs predicted in CDSs (B).
Figure 3. Northern blot analysis of some sRNAs from E. coli AL862, 536 and S. agalactiae NEM316 strains. Expression analysis of 7 sRNA candidates co-localized with virulence factors (see Table 4) and identified in (A) E. coli AL862, (B) E. coli 536 and (C) S. agalactiae NEM316 strains. Expression was analyzed in two phases of growth (E, exponential; S, late stationary) in LB and M9 + 0.4% pyruvate (M9py) media for E. coli or TH and RPMI1640 + 0.4% glucose media for S. agalactiae. Expression of the constitutively transcribed 5S ribosomal gene was used as a loading control. The C0465 sRNA, which is expressed only in early stationary phase in the E. coli MG1655 strain, was used as a negative expression control (46). Notes: ig, sRNA gene located in an IGR; as, sRNA gene located at a position antisense to a CDS (asRNA). Black arrows indicate hybridized sRNA molecules.
List of validated sRNA genes located close to virulence-related genes
| Candidate | sRNA | Origin | Loc. | 5′-end | 3′-end | Type | Target genes | Target function | O. g. | ExPEC specific? | Score |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SQ8164 | IntP4R | | PAI-II | 4 735 462 | 4 735 232 | asRNA | | PAI DNA mobility | < > | No | 10 / −26.28 |
| SQ7560 | PrfR | | PAI-II | 4 747 389 | 4 747 630 | asRNA | | Adhesion | > < | Yes | 3 / −12.64 |
| SQ7575 | HlyR | | PAI-II | 4 763 726 | 4 763 963 | asRNA | | Hemolysis | > < | Yes | 2 / −5.76 |
| SQ7606 | HaeR | | PAI-II | 4 783 731 | 4 783 731 | asRNA | | Filamentous haemagglutinin | > < | Yes | 9 / −6.52 |
| SQ8017 | FimR | | Core | 4 852 969* | 4 852 518 | asRNA | | Adhesion | < > | No | 15 / −8.49 |
| SQ109 | AfaR | | PAI-I | 56 564* | 56 332 | IGR | | Adhesion | > < | Yes | 2 / −5.2 |
| SQ19 | IntR | | PAI-I | 58 845 | 59 076 | asRNA | | PAI DNA mobility | < > | No | 12 / −14.94 |
| SQ18 | SQ18 | | Core | 47 857* | 47 734 | asRNA | | Surface exposed protein | > < | N.A. | 3 / −10 |
| SQ340 | SQ340 | | PAI-X | 1 163 702* | 1 163 779 | IGR | | Transposase of TnGBS2 | > < | N.A. | 3 / −10.5 |
| SQ893 | SQ893 | | Core | 1 300 661 | 1 300 360 | IGR | | Fibronectin binding protein | < > | N.A. | 3 / −4 |
| SQ407 | SQ407 | | PAI-XII | 1 350 419 | 1 350 658 | asRNA | | Laminin binding protein | > < | N.A. | 11 / −11.5 |
| SQ485 | SQ485 | | Core | 1 655 610 | 1 655 852 | asRNA | | Putative ABC transporter | > < | N.A. | 9 / −10.3 |
| SQ1004 | SQ1004 | | PAI-XIII | 2 052 153 | 2 052 383 | IGR | | Streptomycin resistance | > < | N.A. | 3 / −7.6 |
a Localization of the sRNA gene. Core, core genome; PAI, pathogenicity island.
b The 5′-end of the sRNA candidate is arbitrarily located 200 bp upstream from the first nucleotide of the predicted RIT. An asterisk indicates a 5′-triphosphate RNA end determined by 5′ RACE. The 5′ ends of the SQ109 (E. coli AL862) and SQ340 (S. agalactiae NEM316) sRNAs were determined in another study (C.P., personal communication).
c The 3′-end of the sRNA candidate is defined as the last nucleotide of the RIT poly-uracil tail.
d Type of sRNA candidate gene locus. IGR, intergenic region; asRNA, sRNA antisense to a CDS.
e Predicted target mRNA of the antisense sRNA. The sRNA genes located in an IGR may regulate adjacent genes by an antisense mechanism.
f O. g., orientation of genes (order sRNA/mRNA).
g Specificity was determined by FASTA analysis against the GenBank database.
h N, number of covariations identified/RIT score in kcal/mol.
E.c., Escherichia coli; S.a., Streptococcus agalactiae.
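Footnotes b and c define the default candidate coordinates from the predicted RIT alone. A minimal sketch of that convention follows. The function name and arguments are hypothetical, and it assumes linear coordinates with no wrap-around at the origin of a circular chromosome.

```python
def candidate_bounds(rit_first: int, rit_last: int, strand: int, upstream: int = 200):
    """Return (5'-end, 3'-end) of an sRNA candidate from its RIT coordinates.

    rit_first: first nucleotide of the predicted RIT (in transcript orientation)
    rit_last:  last nucleotide of the RIT poly-uracil tail
    strand:    +1 for the forward strand, -1 for the reverse strand
    """
    # "Upstream" means lower coordinates on the forward strand
    # and higher coordinates on the reverse strand.
    five_prime = rit_first - strand * upstream
    three_prime = rit_last
    return five_prime, three_prime
```

With made-up coordinates, a forward-strand RIT spanning 1200 to 1240 gives a candidate from 1000 to 1240, and a reverse-strand RIT spanning 5000 down to 4960 gives a candidate from 5200 down to 4960. Rows marked with an asterisk in the table use the 5′ RACE end instead of this arbitrary 200 bp offset.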
Yeast agglutination assays for E. coli 536 derivatives
| Strain | Yeast agglutination titer |
|---|---|
| 536 + pZE2R-null | 1/16 |
| 536 + pZE2R- | 1/64 |
| 536 Δ | NO |
| 536 Δ | NO |
| 536 + pZE21-null | 1/16 |
| 536 + pZE21- | 1/4 |
| 536 Δ | NO |
| 536 Δ | NO |
| 536 | 1/16 |
| 536Δ | NO |
The level of expression of type 1 fimbriae was assessed in E. coli 536 wild-type and mutant strains expressing the FimR sRNA, the antiFimR sRNA or mock plasmids. None of the 536 Δfim strains agglutinated yeast, indicating that the agglutination phenotypes resulted from the expression of type 1 fimbriae. NO: not observable.
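To compare titers in the table above, it can help to convert them to fold changes against the 1/16 titer of the mock-plasmid strains. The sketch below assumes that a titer is the highest dilution at which agglutination is still observed, so that a smaller fraction means stronger agglutination. That reading is an assumption about the assay convention and is not stated in the table, and the function name is hypothetical.

```python
from fractions import Fraction


def fold_change(sample_titer: str, reference_titer: str):
    """Fold change in agglutination strength relative to a reference titer.

    Titers are strings such as "1/16"; "NO" (not observable) yields None.
    """
    if sample_titer == "NO" or reference_titer == "NO":
        return None
    # Strength is taken as the reciprocal of the titer (1/64 -> 64),
    # so the ratio of strengths equals reference / sample.
    return Fraction(reference_titer) / Fraction(sample_titer)
```

Under this assumption, the 1/64 titer corresponds to a 4-fold increase over 1/16, and the 1/4 titer to a 4-fold decrease.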
Figure 4. Over-expression of FimR and SQ18 antisense sRNAs regulates the fimD and gbs0031 target genes, respectively. (A) Analysis by Western blot and quantitative RT–PCR of gfp and FimR gene expression in E. coli strain TOP10 harboring pZE2R-fimR or pZE2R-null plasmids combined with pXG-0 (no gfp target control) or pXGfimD::gfp target expression plasmids. The four isolates were cultured in LB medium at 37°C until they reached an OD600 of 0.9. Quantitative expression of the gfp fusion gene was normalized to 1.0 for the TOP10 + pZE2R-null + pXGfimD::gfp strain. FimR expression was normalized to 1.0 for the TOP10 + pZE2R-fimR + pXG-0 strain. (B) Western blot and quantitative RT–PCR analyses were performed as described in (A) but in a Δhfq context. Asterisks indicate a significant difference between mean values in unpaired t-tests (P < 0.01).
Figure 5. FimR sRNA upregulates type 1 fimbriae gene expression in vivo. Quantitative real-time RT–PCR analysis of expression of the fimBEAICDFGH gene cluster was performed in (A) E. coli 536 + pZE2R-fimR relative to E. coli 536 + pZE2R-null, (B) E. coli 536 + pZE21-antifimR relative to E. coli 536 + pZE21-null and (C) 536 Δhfq::KmFRT relative to 536 strains, cultured statically in LB medium at 37°C for 24 h (stationary phase).
Figure 6. SQ18, SQ893 and SQ485 sRNAs control the expression of the gbs0031, gbs1263 and gbs1588 target genes, respectively. (A) Quantitative real-time RT–PCR analysis of expression of the gbs0031, gbs1263 and gbs1588 genes. The relative expression of the three mRNA genes was determined by comparing the over-expressing strains S. agalactiae NEM316 + pTCV-SQ18, pTCV-SQ485 or pTCV-SQ893 against the wild-type S. agalactiae NEM316 isolate. (B) Analysis by Western blot and quantitative RT–PCR of gfp and SQ18 gene expression in E. coli TOP10 strain harboring pZE2R-SQ18 or mock plasmids combined with pXG-0 (no gfp target control) or pXGgbs0031::gfp expression plasmids. SQ18 expression was normalized to 1.0 for the TOP10 + pZE2R-SQ18 + pXG-0 strain. Asterisks indicate a significant difference between mean values in unpaired t-tests (P < 0.01).