Jane M Liu, Jonathan Livny, Michael S Lawrence, Marc D Kimball, Matthew K Waldor, Andrew Camilli.
Abstract
Direct cloning and parallel sequencing, an extremely powerful method for microRNA (miRNA) discovery, has not yet been applied to bacterial transcriptomes. Here we present sRNA-Seq, an unbiased method that allows for interrogation of the entire small, non-coding RNA (sRNA) repertoire in any prokaryotic or eukaryotic organism. This method includes a novel treatment that depletes total RNA fractions of highly abundant tRNAs and small subunit rRNA, thereby enriching the starting pool for sRNA transcripts with novel functionality. As a proof-of-principle, we applied sRNA-Seq to the human pathogen Vibrio cholerae. Our results provide information, at unprecedented depth, on the complexity of the sRNA component of a bacterial transcriptome. From 407 039 sequence reads, all 20 known V. cholerae sRNAs, 500 new, putative intergenic sRNAs and 127 putative antisense sRNAs were identified in a limited number of growth conditions examined. In addition, characterization of a subset of the newly identified transcripts led to the identification of a novel sRNA regulator of carbon metabolism. Collectively, these results strongly suggest that the number of sRNAs in bacteria has been greatly underestimated and that future efforts to analyze bacterial transcriptomes will benefit from direct cloning and parallel sequencing experiments aided by 5S/tRNA depletion.
Year: 2009 PMID: 19223322 PMCID: PMC2665243 DOI: 10.1093/nar/gkp080
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Results from V. cholerae sRNA-Seq experiment. (A) Length distribution of all 681 205 reads from 454 sequencing. Reads that match the V. cholerae genome are in blue; reads that represent contaminants are in yellow. (B) Breakdown of total V. cholerae reads from 454 sequencing based on their genomic origin (n = 407 039). ORF, annotated open reading frames; AS, transcripts antisense to ORFs; IGR, transcripts from intergenic regions. (C) Breakdown of sequencing reads that correspond to candidate sRNAs based on their genomic origin (n = 205 537). The IGR-derived candidates include the 92 690 reads of known or previously predicted V. cholerae sRNAs. (D) Length distribution of sRNA candidates. The 2140 sRNA candidates, corresponding to 205 537 total reads, are plotted based on the length of the most abundant sequence observed for each candidate. (E) Visual representation of the depth of the sRNA-Seq data. Approximately 50% of the reads mapping to the V. cholerae genome grouped into candidate sRNAs that were found in two or more samples (yellow). All the previously known and verified V. cholerae sRNAs (white) were amongst these candidate reads; this includes the 20 sRNAs of known function and the 5 sRNAs predicted and verified by northern blot analysis (20,21).
Breakdown of all sequences from 454 pyrosequencing
| Source | Reads | Percentage |
|---|---|---|
| V. cholerae | 407 039 | 60 |
| Yeast (rRNA) | 107 217 | 16 |
| Yeast (tRNA) | 31 586 | 5 |
| Yeast (RNA) | 27 227 | 4 |
| Other/unidentified | 57 448 | 8 |
| Short (<15 nt) | 50 688 | 7 |
| Total | 681 205 | 100 |
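The percentage column follows directly from the read counts. As a minimal sketch (the dictionary and function names are illustrative and not from the paper, and the 407 039 reads are labeled V. cholerae following the Figure 1 caption), the figures can be recomputed like this:

```python
# Read counts copied from the table above. The names used here are
# illustrative assumptions, not identifiers from the original study.
read_counts = {
    "V. cholerae": 407_039,
    "Yeast (rRNA)": 107_217,
    "Yeast (tRNA)": 31_586,
    "Yeast (RNA)": 27_227,
    "Other/unidentified": 57_448,
    "Short (<15 nt)": 50_688,
}


def percentages(counts):
    """Return each category's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


total_reads = sum(read_counts.values())
read_percentages = percentages(read_counts)

for label, n in read_counts.items():
    print(f"{label:<20} {n:>8} {read_percentages[label]:>3}%")
print(f"{'Total':<20} {total_reads:>8}")
```

Rounding each share independently happens to sum to 100 for these counts; that is not guaranteed in general.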
Analysis of candidate sRNAs
| Gene(s)ᵃ | Coordinatesᵇ | ntᵇ | NBᶜ | Rfamᵈ | blastnᵉ | Promᶠ | Termᶠ |
|---|---|---|---|---|---|---|---|
| VC0019<< VC0020<< | 16673–16801 (+) | 129 | + | – | ++ᵍ | + | – |
| VC1130<< VC1131>> | 1199011–1199144 (+) | 134 | +/− | – | + | – | – |
| VC2384<< VC2385>> | 2549444–2549562 (+) | 119 | + | msr | +ᵍ | – | + |
| VCA0518<< VCA0519>> | 449220–449323 (−) | 104 | + | – | Only Vc | + | – |
| VCA0279>> VCA0281>> | 300143–300253 (+) | 110 | + | – | +ᵍ | + | – |
| VC0175<< VC0176<< | 177132–177267 (−) | 136 | + | – | Only Vc | + | – |
| VCA1044>> VCA1045>> | 994875–994876 (−) | 120 | + | – | + | – | – |
| VC0512 | 542160–542294 (−) | 135 | – | – | + | + | – |
| VC0658 | 701570–701732 (+) | 163 | – | – | + | – | – |
| VC2063 | 2217964–2218094 (+) | 131 | – | – | ++ | – | – |
| VC2203 | 2349040–2349128 (−) | 89 | – | – | + | + | – |
| VC1225 | 1301660–1301792 (−) | 133 | + | – | +ᵍ | – | – |
| VC2332 | 2482288–2482403 (+) | 116 | – | – | + | + | – |
| VCA0644 | 578617–578785 (−) | 169 | + | – | ++ | – | – |
| VCA0676 | 611959–612076 (−) | 118 | +/− | – | + | + | – |
| VC2269 | 2424422–2424525 (+) | 104 | + | – | ++ | + | – |
| VCA1078 | 1031452–1031586 (+) | 135 | – | – | ++ | – | – |
| VC2540 | 2722407–2722525 (−) | 119 | + | – | ++ | – | – |
| VC0215 | 222692–222827 (−) | 136 | + | – | ++ | – | – |
| VC0913 | 974460–974582 (+) | 123 | + | – | + | – | – |
| VC1499 | 1611161–1611276 (+) | 116 | + | – | + | + | – |
| VC0689 | 736177–736300 (+) | 124 | – | – | + | + | – |
| VC0957 | 1021523–1021633 (+) | 111 | + | – | + | – | – |
ᵃ For IGR-sRNA candidates, the gene numbers for the up- and downstream ORF are indicated. Genes encoded on the plus strand are denoted with >> and genes encoded on the minus strand are denoted with <<.
ᵇ Coordinates and length reported here represent the most abundant sequence corresponding to each candidate transcript. (+) plus strand; (−) minus strand.
ᶜ Northern blot analysis; (+) band of expected size observed; (+/−) bands of significantly higher or lower molecular weight than expected observed; (–) no bands observed.
ᵈ Candidate was queried against the Rfam database (http://www.sanger.ac.uk/Software/Rfam/index.shtml); matches to known sRNAs are indicated.
ᵉ Sequence conservation of candidate in other microbial organisms was queried using the BLASTN algorithm. (Only Vc) sequence was not observed in other bacterial species; (+) sequence was conserved primarily in Vibrionaceae; (++) sequence was conserved in many bacterial species.
ᶠ Candidates (including the 100 nt up- and downstream) were analyzed using BPROM and FindTerm software (Softberry, Mount Kisco, NY). (+) promoter/terminator was predicted for candidate; (–) promoter/terminator not predicted for candidate.
ᵍ Multiple hits within V. cholerae N16961 genome observed by BLASTN analysis.
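The coordinates and nt columns can be cross-checked against each other. The sketch below assumes the coordinates are inclusive at both ends, so that length = end − start + 1; this convention is an assumption that agrees with the rows used here, not something the table states. The function names and the regular expression are hypothetical helpers written for illustration.

```python
import re

# Matches strings such as "16673–16801 (+)" or "449220–449323 (−)".
# Accepts an en dash or hyphen between coordinates, and "+", the Unicode
# minus sign, or a hyphen as the strand symbol.
COORD_PATTERN = re.compile(r"(\d+)\s*[–-]\s*(\d+)\s*\(([+−-])\)")


def parse_coordinates(text):
    """Split a coordinate cell into (start, end, strand)."""
    match = COORD_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate format: {text!r}")
    start, end = int(match.group(1)), int(match.group(2))
    strand = "+" if match.group(3) == "+" else "-"
    return start, end, strand


def span_length(start, end):
    """Length of an interval that includes both endpoints (assumed convention)."""
    return end - start + 1


# A few (coordinates, reported nt) pairs copied from the table above.
rows = [
    ("16673–16801 (+)", 129),
    ("2549444–2549562 (+)", 119),
    ("449220–449323 (−)", 104),
]

for cell, reported in rows:
    start, end, strand = parse_coordinates(cell)
    computed = span_length(start, end)
    status = "ok" if computed == reported else "mismatch"
    print(f"{cell:<24} strand {strand}  computed {computed}  reported {reported}  {status}")
```

Running the same check over every row would flag any entry whose reported length differs from the span implied by its coordinates.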
Figure 2. sRNA candidate IGR7. (A) Predicted secondary structure of candidate IGR7 (27,28). (B) Northern blot analysis of IGR7 expression (top panel) throughout growth of V. cholerae in LB medium. Total RNA samples were prepared at the indicated OD600. 5S rRNA is shown as a loading control (bottom panel).
Figure 3. Analysis of IGR7 sRNA. (A) Intracellular abundance of mtlA mRNA and IGR7 sRNA, relative to 4.5S, as measured by qRT-PCR. Error bars represent standard deviations of three or more independent trials. (B) Growth curves of derivatives of V. cholerae N16961 harboring empty vector (pJML01) or plasmid expressing IGR7 from an arabinose-inducible promoter (pIGR7). All samples were grown in M9 medium supplemented with 0.4% glycerol (Gly) or 0.4% mannitol (Mtl). Arabinose was added at a final concentration of 0.02% to all samples. All samples grown without arabinose grew similarly to each other and to V. cholerae without plasmid (data not shown). (C) Growth curves of derivatives of V. cholerae N16961 harboring empty vector (pJML01) or plasmid expressing AS-IGR7 from an arabinose-inducible promoter (pAS). Growth conditions were the same as in (B). Error bars represent standard deviation of three independent trials.
Figure 4. Alignment of the IGR7 homolog sequences.