Jonathan Livny, Anja Brencic, Stephen Lory, Matthew K Waldor.
Abstract
sRNAs are small, non-coding RNA species that control numerous cellular processes. Although it is widely accepted that sRNAs are encoded by most if not all bacteria, genome-wide annotations for sRNA-encoding genes have been conducted in only a few of the nearly 300 bacterial species sequenced to date. To facilitate the efficient annotation of bacterial genomes for sRNA-encoding genes, we developed a program, sRNAPredict2, that identifies putative sRNAs by searching for co-localization of genetic features commonly associated with sRNA-encoding genes. Using sRNAPredict2, we conducted genome-wide annotations for putative sRNA-encoding genes in the intergenic regions of 11 diverse pathogens. In total, 2759 previously unannotated candidate sRNA loci were predicted. There was considerable range in the number of sRNAs predicted in the different pathogens analyzed, raising the possibility that there are species-specific differences in the reliance on sRNA-mediated regulation. Of 34 previously unannotated sRNAs predicted in the opportunistic pathogen Pseudomonas aeruginosa, 31 were experimentally tested and 17 were found to encode sRNA transcripts. Our findings suggest that numerous genes have been missed in the current annotations of bacterial genomes and that, by using improved bioinformatic approaches and tools, much remains to be discovered in 'intergenic' sequences.
Year: 2006 PMID: 16870723 PMCID: PMC1524904 DOI: 10.1093/nar/gkl453
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
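The abstract describes sRNAPredict2 as searching for co-localization of genetic features associated with sRNA-encoding genes. The following is only a minimal sketch of that general idea, not the program's actual algorithm: it assumes two feature types (conserved intergenic blocks and predicted terminators), ignores strand, and uses a hypothetical `max_gap` threshold. The names `Interval`, `colocalized_candidates`, and `max_gap`, and all coordinates, are illustrative assumptions.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A genomic feature given by inclusive start and end coordinates."""
    start: int
    end: int


def colocalized_candidates(
    conserved: List[Interval],
    terminators: List[Interval],
    max_gap: int = 20,  # hypothetical threshold, not taken from the paper
) -> List[Interval]:
    """Return one candidate locus per (conserved block, terminator) pair
    whose members overlap or lie within max_gap positions of each other."""
    candidates = []
    for block in conserved:
        for term in terminators:
            # Gap is <= 0 when the two features overlap or touch.
            gap = max(block.start, term.start) - min(block.end, term.end)
            if gap <= max_gap:
                # The candidate spans both features.
                candidates.append(
                    Interval(min(block.start, term.start),
                             max(block.end, term.end))
                )
    return candidates


if __name__ == "__main__":
    # Toy coordinates, invented for illustration only.
    blocks = [Interval(100, 180), Interval(500, 560)]
    terms = [Interval(185, 210), Interval(900, 930)]
    print(colocalized_candidates(blocks, terms))
```

A nested loop is adequate for a sketch; a genome-scale search would more likely sort the features and sweep through them once.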
Predicted P.aeruginosa sRNAs subjected to experimental confirmation
| sRNA name^a | Start | End | Dir. | Length^b | Approximate observed lengths^c | Condition^d | 5′ Gene length | Dir. | Distance^e | 3′ Gene length | Dir. | Distance^f |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 334 549 | 334 728 | < | 179 | 300,350 | E S | 1374 | < | 93 | 750 | > | 5 |
| P2 | 356 478 | 356 615 | > | 137 | — | — | 1227 | < | 0 | 1392 | > | 65 |
| P4 | 747 255 | 747 469 | < | 214 | — | — | 1212 | > | 298 | 1104 | > | 0 |
| P5 | 912 791 | 912 861 | > | 70 | 90 | E S | 1182 | < | 11 | 483 | > | 224 |
| P6 | 971 645 | 972 154 | < | 509 | — | — | 1953 | > | 19 | 777 | > | 11 |
| | 971 858 | 971 972 | > | 114 | 140,150 | E S | 1953 | > | 232 | 777 | > | 193 |
| P8 | 1 117 532 | 1 117 609 | > | 77 | 130 | E S | 753 | > | 141 | 1359 | > | 548 |
| P9 | 1 436 491 | 1 436 618 | > | 127 | 130 | E S | 510 | > | 93 | 405 | > | 44 |
| P10 | 1 807 682 | 1 807 753 | > | 71 | — | — | 405 | > | 33 | 1578 | > | 439 |
| P11 | 1 928 666 | 1 928 886 | < | 220 | 100 | E S | 2448 | < | 39 | 1593 | < | 7 |
| | 3 106 919 | 3 106 994 | < | 75 | 70 | E S | 1920 | < | 167 | 861 | < | 6 |
| P14 | 3 206 733 | 3 206 877 | > | 144 | 300 | — | 87 | > | 235 | 249 | > | 36 |
| | 3 299 022 | 3 299 269 | > | 247 | 180 | E S | 1014 | < | 101 | 1092 | > | 221 |
| P16 | 3 318 663 | 3 318 859 | > | 196 | 110 | S | 1131 | > | 6 | 774 | < | 22 |
| P17 | 3 677 435 | 3 677 716 | < | 281 | — | — | 312 | < | 354 | 609 | < | 2 |
| | 3 703 022 | 3 703 156 | < | 134 | 100 | E S | 855 | < | 72 | 1992 | < | 9 |
| P19 | 3 705 268 | 3 705 582 | > | 314 | — | — | 1992 | < | 107 | 600 | < | 306 |
| P20 | 3 705 315 | 3 705 622 | < | 307 | 150 | — | 1992 | < | 154 | 600 | < | 266 |
| | 4 444 696 | 4 444 952 | < | 256 | 300 | E S | 783 | > | 100 | 507 | > | 24 |
| P25 | 4 444 893 | 4 444 961 | < | 68 | — | — | 783 | > | 297 | 507 | > | 15 |
| P26 | 4 780 768 | 4 780 833 | < | 65 | 250 | E S | 4071 | < | 151 | 366 | < | 4 |
| | 4 781 786 | 4 781 978 | > | 192 | 90 | E S | 498 | < | 0 | 693 | < | 5 |
| P28 | 4 956 328 | 4 956 536 | > | 208 | 180 | E S | 453 | < | 299 | 846 | < | 196 |
| | 5 308 743 | 5 308 964 | < | 221 | 180 | E S | 1434 | > | 318 | 1401 | > | 360 |
| P31 | 5 344 903 | 5 344 981 | > | 78 | — | — | 1134 | < | 0 | 804 | < | 103 |
| P32 | 5 344 950 | 5 345 060 | < | 110 | 80 | E S | 1134 | < | 47 | 804 | < | 24 |
| P33 | 5 775 061 | 5 775 220 | > | 159 | — | — | 1428 | > | 254 | 465 | < | 398 |
| | 5 835 082 | 5 835 480 | < | 398 | 150 | E S | 2319 | > | 12 | 411 | > | 0 |
| | 4 985 782 | 4 985 843 | < | 61 | 55 | E S | 237 | < | 52 | 306 | < | 2 |
| P36 | 5 308 433 | 5 308 493 | > | 60 | 75,80 | E S | 1434 | > | 8 | 1401 | > | 831 |
| P37 | 5 672 302 | 5 672 363 | < | 61 | — | — | 4443 | < | 161 | 1653 | < | 1 |
^a The names of sRNAs tested with two independent probes are in boldface.
^b The predicted length of the candidate sRNA-encoding genes includes the length of the putative Rho-independent terminator.
^c The approximate length(s) of the major species detected in each blot.
^d Growth condition(s) under which the transcript(s) was observed (E, exponential phase; S, stationary phase). Boldface indicates that the transcript(s) was significantly more abundant under the indicated condition.
^e The distance between the end of the upstream gene and the predicted start of the candidate sRNA.
^f The distance between the start of the downstream gene and the predicted end of the candidate sRNA.
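In the table above, the Length column equals End minus Start for each candidate (for example, 334 728 minus 334 549 gives 179). A small sketch of that consistency check follows; the helper names `parse_coord` and `predicted_length` are hypothetical, and the only assumption about the data is that coordinates are written with whitespace as the thousands separator, as they appear in the table.

```python
def parse_coord(text: str) -> int:
    """Convert a coordinate such as '1 117 532' to the integer 1117532.

    str.split() with no arguments splits on any whitespace, so ordinary,
    thin, and non-breaking spaces are all handled.
    """
    return int("".join(text.split()))


def predicted_length(start: str, end: str) -> int:
    """Predicted length of a candidate, computed as End minus Start."""
    return parse_coord(end) - parse_coord(start)


if __name__ == "__main__":
    # (Start, End, Length) triples copied from three rows of the table.
    rows = [
        ("334 549", "334 728", 179),
        ("356 478", "356 615", 137),
        ("1 117 532", "1 117 609", 77),
    ]
    for start, end, length in rows:
        assert predicted_length(start, end) == length
    print("all checked rows are consistent")
```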
Figure 1. Detection of novel sRNAs by northern analysis. Total RNA was extracted from cultures of P.aeruginosa strain PAO1 grown in LB to exponential phase (first lane in each blot) or stationary phase (second lane in each blot). Blots were hybridized to radiolabeled DNA oligonucleotide probes and then exposed for varying times; thus the relative intensities of the signals do not correspond to the relative abundance of each sRNA. The approximate positions of size standards are shown on the left. Boxes are included to highlight the major species observed in each blot.
Figure 2. The accuracy (closed diamond) and sensitivity (closed square) of the predictive search increase and decrease, respectively, as the BLAST stringency is increased. The accuracy corresponds to the percentage of sRNAs predicted at the indicated BLAST stringencies that were confirmed. The sensitivity corresponds to the percentage of all 23 experimentally confirmed P.aeruginosa sRNAs that were predicted at the indicated BLAST stringencies.
Figure 3. Venn diagram showing the number of novel sRNAs confirmed per the number of novel sRNAs predicted based on conservation between P.aeruginosa and the three BLAST partner species used for comparison.
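The two quantities defined in the Figure 2 caption can be written out as a short calculation. This is a sketch under stated assumptions: candidates are represented as sets of identifiers, and the function name `accuracy_and_sensitivity` and the identifiers in the example are hypothetical and do not come from the paper.

```python
from typing import Set, Tuple


def accuracy_and_sensitivity(
    predicted: Set[str], confirmed: Set[str]
) -> Tuple[float, float]:
    """Return (accuracy, sensitivity) as percentages.

    accuracy    = share of predicted sRNAs that were confirmed
    sensitivity = share of all confirmed sRNAs that were predicted
    """
    if not predicted or not confirmed:
        raise ValueError("both sets must be non-empty")
    hits = predicted & confirmed  # predicted AND experimentally confirmed
    accuracy = 100.0 * len(hits) / len(predicted)
    sensitivity = 100.0 * len(hits) / len(confirmed)
    return accuracy, sensitivity


if __name__ == "__main__":
    # Invented identifiers for one hypothetical BLAST stringency.
    predicted = {"c1", "c2", "c3", "c4"}
    confirmed = {"c1", "c2", "c5"}
    print(accuracy_and_sensitivity(predicted, confirmed))
```

Raising the stringency shrinks the predicted set, which tends to raise the first value and lower the second, matching the trade-off the caption describes.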
Summary of sRNAPredict2 annotations for sRNA-encoding genes in 11 species of pathogens
| Species | No. of BLAST partners^a | Total no. of sRNAs predicted^b | No. of unique sRNAs predicted^c | No. of novel sRNAs predicted^d… | …conserved in >1 partner | …conserved in >2 partners | …encoding predicted conserved RNA structure | Total kb IGR conservation^e | Proportion of previously annotated sRNAs predicted | No. of IG terminators | No. of unique sRNA/Mb genome | No. of unique sRNA/10 kb IGR | No. of unique sRNAs/IG terminator | No. of total sRNAs/100 kb IGR conservation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 6 | 1828 | 947 | 889 | 688 | 23 | 858 | 1235.4 | 2/3 | 2895 | 181.1 | 12.73 | 0.327 | 14.80 |
| | 7 | 1478 | 772 | 755 | 443 | 221 | 550 | 1627.5 | 13/24 | 1201 | 167.8 | 9.45 | 0.643 | 10.62 |
| | 7 | 1284 | 599 | 572 | 321 | 145 | 460 | 1486.3 | 22/39 | 1289 | 116.8 | 7.73 | 0.465 | 8.64 |
| | 3 | 344 | 168 | 144 | 84 | 48 | 106 | 366.2 | 3/4 | 819 | 59.8 | 3.36 | 0.205 | 9.39 |
| | 4 | 426 | 161 | 138 | 108 | 77 | 127 | 534.3 | 0/3 | 852 | 56.1 | 4.22 | 0.189 | 7.97 |
| | 5 | 111 | 63 | 62 | 29 | 13 | 48 | 78.8 | 0/3 | 675 | 29.2 | 2.52 | 0.093 | 14.08 |
| | 5 | 126 | 62 | 56 | 26 | 19 | 49 | 247.1 | 2/3 | 613 | 33.5 | 2.28 | 0.101 | 5.10 |
| | 4 | 78 | 56 | 56 | 13 | 6 | 47 | 199.6 | 0/3 | 114 | 12.7 | 1.75 | 0.491 | 3.91 |
| | 6 | 130 | 50 | 50 | 42 | 29 | 8 | 114.6 | 0/3 | 126 | 29.9 | 3.60 | 0.397 | 11.35 |
| | 5 | 51 | 43 | 43 | 4 | 3 | 30 | 81.2 | 0/3 | 167 | 41.3 | 4.58 | 0.257 | 6.28 |
| | 3 | 72 | 38 | 34 | 19 | 13 | 25 | 152.3 | 5/6 | 1018 | 6.1 | 0.59 | 0.037 | 4.73 |
| Total (10 genomes^f) | | 6073 | 2912 | 2759 | 1758 | 584 | 2308 | | | | | | | |
| Max. fold difference^g | | 35.8 | 20.3 | 22.2 | | | | | | | 29.7 | 21.6 | 17.4 | 3.8 |
^a Strain names of the species annotated and of their BLAST partners as well as the sources of their genomic sequence files are listed in Supplementary Table S2.
^b The total number of sRNA-encoding genes predicted in all searches between the species of interest and each of its BLAST partners.
^c The number of distinct sRNA-encoding genes predicted in the species of interest as determined by using the sRNAPredict2 Venn diagram function.
^d The number of predicted sRNAs that do not correspond to riboswitches or sRNAs annotated in the Rfam database.
^e The amount of genome sequence conserved between the species of interest and all partner species.
^f Not including P.aeruginosa.
^g The fold difference between the largest and smallest values in each column.
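Two of the derived quantities in the summary table can be reproduced from its other columns: the number of unique sRNAs per IG terminator, and the fold difference between the largest and smallest value in a column. The sketch below uses the "No. of unique sRNAs predicted" and "No. of IG terminators" values as listed in the table; the function names are hypothetical, and applying the fold difference to the three-decimal rounded densities is an assumption made here to match the tabulated 17.4.

```python
from typing import List, Sequence

# Column values copied from the summary table, in row order.
UNIQUE_SRNAS = [947, 772, 599, 168, 161, 63, 62, 56, 50, 43, 38]
IG_TERMINATORS = [2895, 1201, 1289, 819, 852, 675, 613, 114, 126, 167, 1018]


def per_terminator(unique: Sequence[int],
                   terminators: Sequence[int]) -> List[float]:
    """Unique predicted sRNAs per IG terminator, rounded to 3 decimals."""
    return [round(u / t, 3) for u, t in zip(unique, terminators)]


def max_fold_difference(values: Sequence[float]) -> float:
    """Fold difference between the largest and smallest value."""
    return max(values) / min(values)


if __name__ == "__main__":
    densities = per_terminator(UNIQUE_SRNAS, IG_TERMINATORS)
    print(densities)
    print(round(max_fold_difference(densities), 1))
```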