Erin E Gill, Luisa S Chan, Geoffrey L Winsor, Neil Dobson, Raymond Lo, Shannan J Ho Sui, Bhavjinder K Dhillon, Patrick K Taylor, Raunak Shrestha, Cory Spencer, Robert E W Hancock, Peter J Unrau, Fiona S L Brinkman.
Abstract
BACKGROUND: Understanding the RNA processing of an organism's transcriptome is an essential but challenging step in understanding its biology. Here we investigate in unprecedented detail the transcriptome of Pseudomonas aeruginosa PAO1, a medically important and innately multi-drug resistant bacterium. We systematically mapped RNA cleavage and dephosphorylation sites that result in 5'-monophosphate terminated RNA (pRNA) using monophosphate RNA-Seq (pRNA-Seq). Transcriptional start sites (TSS) were also mapped using differential RNA-Seq (dRNA-Seq), and both datasets were compared to conventional RNA-Seq performed in a variety of growth conditions.
Keywords: Gene expression; Gene regulation; Nucleases; Pseudomonas aeruginosa; RNA processing; RNA-Seq; Transcription; dRNA-Seq; pRNA-Seq
Year: 2018 PMID: 29587634 PMCID: PMC5870498 DOI: 10.1186/s12864-018-4538-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 RNA transcription and processing. a Transcription of RNA is initiated from a promoter sequence (indicated in red) within the genome. Ribonucleoside triphosphate polymerisation results in a 5′ triphosphate at the 5′ end of the nascent transcript and a 3′ hydroxyl at its 3′ terminus. b mRNAs can be internally cleaved by endonucleases to yield two RNA fragments or can be degraded by exonucleases from either their 5′ or 3′ termini. The activity of exonucleases is often triggered by the selective dephosphorylation of a terminal triphosphate to a monophosphate by a pyrophosphatase. c This study focuses on all RNA processing events that result in either a 5′ triphosphate (dRNA-Seq) or 5′ monophosphate (pRNA-Seq) and that simultaneously contain a terminal 3′ hydroxyl
Library growth conditions and summary of the number and percentage of reads mapped to the Pseudomonas aeruginosa PAO1 reference sequence and including (+ RNA) or excluding (− RNA) rRNA, tRNA and tmRNA genes
| Sample | Total Reads | Reads Mapped (+ RNA) | Percent Mapped (+ RNA) | Reads Mapped (− RNA) | Percent Mapped (− RNA) |
|---|---|---|---|---|---|
| pRNA-Seq | 272,983,632 | 149,142,710 | 55% | 51,067,158 | 19% |
| dRNA-Seq | 131,130,793 | 108,900,693 | 83% | 29,801,108 | 23% |
| RNA-Seq | 54,000,716 | 51,910,678 | 96% | 16,003,536 | 30% |
| RNA-Seq | 63,854,706 | 61,850,374 | 97% | 6,802,731 | 11% |
| RNA-Seq | 75,437,354 | 72,729,310 | 96% | 30,068,678 | 40% |
| RNA-Seq | 20,643,394 | 17,559,382 | 85% | 3,352,489 | 16% |
In all libraries but 110817_SN865, which consists of single end reads, the mapped reads are in proper pairs with a maximum insert size < 1000. A moderate drop in read quality (55% aligned reads) was observed with the dRNA-Seq library, which we attributed to the extra RNA manipulation steps required for library construction. We employed the same rigorous alignment quality thresholds with the dRNA-Seq library as with all other libraries to ensure that only high quality reads were mapped
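The percentages in the table follow directly from the read counts. A minimal sketch, assuming each percentage is simply mapped reads divided by total reads and rounded to the nearest integer (the function name `percent_mapped` is illustrative, not taken from the study):

```python
# Read counts copied from the first three rows of the table:
# (library, total reads, reads mapped incl. rRNA/tRNA/tmRNA, reads mapped excl. them)
LIBRARIES = [
    ("pRNA-Seq", 272_983_632, 149_142_710, 51_067_158),
    ("dRNA-Seq", 131_130_793, 108_900_693, 29_801_108),
    ("RNA-Seq", 54_000_716, 51_910_678, 16_003_536),
]


def percent_mapped(mapped: int, total: int) -> int:
    """Return mapped/total as a percentage rounded to the nearest integer."""
    return round(100 * mapped / total)


for name, total, with_rna, without_rna in LIBRARIES:
    # Prints the library name followed by the (+ RNA) and (- RNA) percentages.
    print(name, percent_mapped(with_rna, total), percent_mapped(without_rna, total))
```

The gap between the two percentages for a library is the share of reads that map to rRNA, tRNA and tmRNA genes.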
Fig. 2 Identification of precise RNA processing events within the ssrA transcript (tmRNA) and tRNA operon transcripts. a A histogram showing precise processing of the ssrA gene (by RNase P), with the 5′ ends of processed pRNA-Seq paired end reads aligning exactly at a single genomic location. Transcription for this gene is initiated 60-nt upstream of the location of RNase P cleavage. b Conventional RNA-Seq suggests the possibility that complex RNA processing occurs within the tRNA operon shown. pRNA-Seq indicates that a series of precise RNA cleavages occur downstream from a single strong initial transcriptional start
Fig. 3 Circular plot showing distribution of mapped RNA-Seq reads and 5′ monophosphate cleavage sites throughout the Pseudomonas aeruginosa PAO1 genome. The outer green and blue tracks represent reverse-strand and forward-strand genes, respectively. The third track from the outside is a heat map showing log10 first base-pair coverage (100 bp window) of RNA-Seq reads where a transition from yellow to green to blue correlates with increased transcription. The histogram with a grey background shows log10 coverage of 5′ monophosphate sites on both the forward and reverse strands. This is followed by a track containing rRNA (green), tRNA (blue) and ribosomal proteins (purple). The innermost track shows G-C skew (1000 bp window)
Fig. 4 Relative cleavage site position within genes and reading frame dependent cleavage bias. a The relative position of cleavage sites within protein-coding genes was plotted as a histogram (see materials and methods). Cleavage sites are shown in turquoise; 5′ monophosphate sites occurring at the location of a TSS (i.e. potential 5′ pyrophosphatase sites) are shown in yellow. The remaining sites are distributed relatively evenly throughout the ORF. b The reading frames of cleavage products were determined for both positive (navy blue) and negative (red) stranded transcripts. Cleavage at site 1 occurs 5′ to nucleotide 1 (5′-↓N1 N2 N3), cleavage at site 2 occurs 5′ to nucleotide 2 (5′-N1↓N2 N3) and cleavage at site 3 occurs 5′ to nucleotide 3 (5′-N1N2↓N3)
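The two quantities in this caption, relative position within a gene and reading-frame site, can be sketched as below. This is an illustration only: it assumes 1-based inclusive gene coordinates, takes the cleavage site to be the genomic coordinate of the first nucleotide of the 5′ monophosphate product, and uses hypothetical function names. The study's own procedure is described in its materials and methods and may differ.

```python
def five_prime_offset(site: int, start: int, end: int, strand: str) -> int:
    """Number of nucleotides between the gene's 5' end and the cleavage site.

    `start` < `end` are genomic coordinates; on the minus strand the gene's
    5' end is at `end`, so the offset is measured from there.
    """
    return site - start if strand == "+" else end - site


def relative_position(site: int, start: int, end: int, strand: str) -> float:
    """Fractional distance of the site along the gene, measured 5' to 3'."""
    return five_prime_offset(site, start, end, strand) / (end - start + 1)


def frame_site(site: int, start: int, end: int, strand: str) -> int:
    """Return 1, 2 or 3: the codon position of the nucleotide just 3' of the cut."""
    return five_prime_offset(site, start, end, strand) % 3 + 1


# Hypothetical 300-nt gene spanning positions 100..399.
print(relative_position(250, 100, 399, "+"))  # halfway along a plus-strand gene
print(frame_site(101, 100, 399, "+"))         # cut between N1 and N2 -> site 2
print(frame_site(399, 100, 399, "-"))         # cut 5' to N1 on the minus strand -> site 1
```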
Fig. 5 5′ monophosphate processing patterns and their corresponding sequence motifs. Cleavage sites are derived from genome locations with 100 reads or more of first-base-pair coverage in our pRNA-Seq library. Peak shape refers to the number of mapped transcripts surrounding a cleavage site. Peak shapes were categorized using k-means clustering. Motifs were calculated from peak shape clusters using MEME [68]. The graph at the top of each panel shows normalized peak height. The WebLogo [69] at the bottom of each panel shows the sequence motif associated with each peak shape. a The global motif, which is derived from the entire pRNA-Seq dataset. b The “Sharp” peak shape motif, which shows strong similarity to the RNase E motif in E. coli [38]. c The “Tail L” motif. d The “Tail R” motif
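To make the peak-shape categorization concrete, here is a small pure-Python sketch of k-means (Lloyd's algorithm) applied to normalized coverage profiles. The five-position window, the toy coverage values, the normalization to a maximum of one, and the hand-picked initial centroids are all assumptions made for illustration. The caption does not state the window size, initialization, or software used.

```python
def normalize(profile):
    """Scale a coverage profile so its highest point equals 1 (assumes a nonzero peak)."""
    peak = max(profile)
    return [value / peak for value in profile]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]


def kmeans(points, centroids, max_iter=50):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else old)
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids


# Toy first-base coverage around six hypothetical cleavage sites
# (position 2 is the site itself).
profiles = [
    [0, 0, 10, 0, 0], [1, 0, 20, 1, 0],      # sharp peaks
    [5, 8, 10, 0, 0], [10, 15, 20, 1, 0],    # signal trailing to the left
    [0, 0, 10, 8, 5], [0, 1, 20, 16, 9],     # signal trailing to the right
]
points = [normalize(p) for p in profiles]
labels, centroids = kmeans(points, [points[0], points[2], points[4]])
print(labels)
```

Normalizing first means that clustering responds to the shape of the peak rather than its absolute read depth.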
Association of cleavage site motif types with KEGG functional terms
| Test by peak shape | KEGG functional terms | P-value | Number of genes in set with this function | Number of Cleavage Sites by KEGG Term |
|---|---|---|---|---|
| Global | Oxidative phosphorylation | 4.90E-06 | 14 | 54 |
| | Purine metabolism | 2.08E-03 | 13 | 37 |
| | Aminoacyl-tRNA biosynthesis | 8.94E-03 | 7 | 16 |
| | RNA polymerase | 1.72E-02 | 3 | 19 |
| | Pyrimidine metabolism | 4.67E-02 | 8 | 26 |
| | Protein export | 4.93E-02 | 5 | 37 |
| Sharp | Oxidative phosphorylation | 2.60E-05 | 12 | 54 |
| | Purine metabolism | 1.13E-03 | 12 | 37 |
| | RNA polymerase | 9.13E-03 | 3 | 19 |
| | Pyrimidine metabolism | 1.14E-02 | 8 | 26 |
| | Aminoacyl-tRNA biosynthesis | 1.89E-02 | 6 | 16 |
| Tail L | RNA polymerase | 7.11E-05 | 3 | 19 |
| | Protein export | 1.24E-02 | 3 | 37 |
| | Purine metabolism | 1.39E-02 | 5 | 37 |
| Tail R | RNA polymerase | 4.52E-02 | 3 | 54 |
| | Protein export | 1.51E-02 | 2 | 37 |
| Twin R | Oxidative phosphorylation | 9.70E-04 | 5 | 54 |
| | Protein export | 1.03E-02 | 3 | 37 |
| | Oxidative phosphorylation | 2.52E-02 | 4 | 54 |
| | Citrate cycle (TCA cycle) | 4.94E-02 | 3 | 39 |
Certain cleavage site motifs are disproportionately located within genes that are associated with certain KEGG functional terms. The table lists the KEGG terms associated with each cleavage site motif type and the p-value associated with each KEGG term. P-values were calculated from hypergeometric tests for KEGG pathways and Holm’s test was used to correct for multiple testing
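The statistical procedure named here can be sketched with the standard library alone. This is a generic illustration of a one-sided hypergeometric enrichment test followed by Holm's step-down correction, not the authors' code. The function names and all numbers in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(population N, K tagged items, n draws).

    For enrichment: N genes in the genome, K genes in the pathway,
    n genes carrying the motif, k of which are in the pathway.
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        value = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, value)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


# Hypothetical example: 3 pathway genes in a 10-gene genome, 4 motif genes,
# all 3 pathway genes among them.
p = hypergeom_sf(3, 10, 3, 4)
print(p)
print(holm_adjust([0.01, 0.04, 0.03]))
```

Holm's method controls the family-wise error rate and is uniformly at least as powerful as a plain Bonferroni correction.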
Fig. 6 Correlation between TSS occurring in the antiparallel orientation and a palindromic sequence motif. a TSS show a significant correlation with TSS in the opposite direction that are 18 bp upstream (black curve, n = −18). No statistically significant correlation was found between TSS having the same strand orientation (red curve). The correlation function is defined as C(n) = Sum[x(l)y(l + n)], l = 1..G−n, where G is the genome size, x(l) is the plus-strand TSS count at position l, and y(l) is either the minus-strand TSS count (black curve) or the plus-strand TSS count (red curve). The maximum value of the correlation function was normalized to one for each curve. b A “TATNATA” motif occurs between antiparallel TSS. Of the 105 antiparallel TSS found that were 18 bp apart, 3 match the motif exactly, 27 have one mismatch, 25 have two mismatches and 13 have three mismatches
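The correlation function and the mismatch counts in this caption can be illustrated as follows. This is a sketch under stated assumptions: positions are 0-based list indices, the genome is treated as linear (terms whose shifted index falls outside the genome are skipped, which also handles negative lags such as n = −18), and the function names and toy data are hypothetical. The caption does not say exactly where the 7-nt window sits between the two TSS, so the motif check takes an already extracted 7-mer.

```python
def tss_correlation(x, y, n):
    """C(n) = sum over l of x[l] * y[l + n], using only in-range indices."""
    G = len(x)
    return sum(x[l] * y[l + n] for l in range(G) if 0 <= l + n < G)


def normalized_curve(x, y, lags):
    """Evaluate C(n) over `lags` and scale so the maximum equals one."""
    values = [tss_correlation(x, y, n) for n in lags]
    peak = max(values)
    return [v / peak for v in values] if peak else values


def motif_mismatches(sequence, motif="TATNATA"):
    """Count mismatches against the motif; N in the motif matches any base."""
    if len(sequence) != len(motif):
        raise ValueError("sequence and motif must have the same length")
    return sum(1 for s, m in zip(sequence.upper(), motif) if m != "N" and s != m)


# Toy genome of 50 positions: plus-strand TSS at 30 and 40,
# minus-strand TSS 18 positions upstream of each (12 and 22).
G = 50
plus = [0] * G
minus = [0] * G
plus[30] = plus[40] = 1
minus[12] = minus[22] = 1

print(tss_correlation(plus, minus, -18))
print(normalized_curve(plus, minus, [-28, -18, -8, 0]))
print(motif_mismatches("TATCATA"), motif_mismatches("GATCACA"))
```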
Small RNA species detected reliably by RNA-Seq and confirmed by RT-qPCR. Differential expression in biofilms and during swarming motility
| Genomic Coordinates | Name | Size bp | Identity Gomez-Lozano, et al. | Identity Wurtzel, et al. | Complementarity (potential binding sites within other transcripts) | Fold change in biofilms | Fold change in swarming motility |
|---|---|---|---|---|---|---|---|
| 143,349–143,517 | PA0123.1 | 169 | pant15 | Not identified | None | 1 | 10 |
| 326,875–327,066 | PA0290.1 | 192 | pant37 | PA14sr_012 | | −9 | 1 |
| 334,491–334,686 | PA0296.1 | 196 | Not identified | P1 | | −2 | 1 |
| 354,527–354,742 | PA0314.1 | 216 | pant42 | Not identified | None | −5 | 1 |
| 586,867–586,990 | PA0527.1¹ | 124 | | | None | 4 | 11 |
| 720,091–720,345 | PA0667.1 | 255 | Not identified | Not identified | PA3505, PA2897, PA0690 | −3 | 1 |
| 798,865–799,255 | PA0730.1 | 391 | pant80 | PA14sr_122 | None | −3 | 44 |
| 883,307–883,582 | PA0805.1 | 276 | pant89 | PA14sr_119, PA14sr_120 | None | −5 | −5 |
| 1,045,414–1,045,733 | PA0958.1 | 320 | pant103 | PA14sr_112 | None | −6 | 2 |
| 1,182,820–1,183,057 | PA1091.1 | 238 | pant119 | Not identified | PA0588 | 3 | 5 |
| 1,254,432–1,254,698 | PA1156.1 | 267 | pant125 | PA14sr_105 | PA1123, | 1 | 1 |
| 2,761,459–2,761,704 | PA2461.1 | 246 | pant225, | PA14sr_076, | PA2460, PA2458 | −8 | −3 |
| 2,761,599–2,761,911 | PA2461.2 | 313 | pant226 | PA14sr_077 | PA2460, PA2458 | −5 | 1 |
| 2,964,898–2,965,137 | PA2461.3 | 240 | pant233 | Not identified | PA5134 | −2 | 3 |
| 2,977,373–2,977,611 | PA2633.1 | 239 | pant235 | PA14sr_067 | PA3672, | 5 | 8 |
| 3,312,577–3,312,693 | PA2952.1 | 117 | Not identified | PA14sr_061 | PA4629 | −2 | 2 |
| 3,545,572–3,545,872 | PA3159.1 | 301 | pant292 | no PA14 ortholog | None | −2 | 3 |
| 3,697,226–3,697,433 | PA3299.1 | 208 | Not identified | Not identified | PA0690, | 2 | 1 |
| 3,930,282–3,930,639 | PA3514.1 | 358 | pant326 | no PA14 ortholog | | 10 | 1 |
| 4,012,653–4,012,735 | PA3580.1 | 83 | pant337 | Not identified | None | −4 | 1 |
| 4,536,541–4,536,848 | PA4055.1 | 308 | pant373 | Not identified | PA2728, | −3 | −2 |
| 5,080,450–5,080,630 | PA4539.1 | 181 | pant415 | Not identified | | 2 | 10 |
| 5,207,898–5,208,463 | PA4639.1 | 566 | pant428 | PA14sr_139 | PA5156, PA2502, PA4510, | 5 | n/a |
| 5,224,568–5,224,795 | PA4656.1 | 228 | pant430 | Not identified | PA2038, PA3517, PA2152, | −4 | 3 |
| 5,283,960–5,284,110 | PA4704.1¹ | 151 | | | | 12 | 217 |
| 5,284,172–5,284,319 | PA4704.2¹ | 148 | | | | 20 | 141 |
| 5,283,960–5,284,319 | | 360 | | | | 13 | 163 |
| 5,309,047–5,309,325 | PA4726.3 | 279 | pant439 | PA14sr_141 | | −5 | 1 |
| 5,718,503–5,718,753 | PA5078.1 | 251 | pant465 | Not identified | PA0312, | −6 | 3 |
| 5,973,539–5,973,681 | PA5304.1 | 143 | pant487 | Not identified | None | 1 | 5 |
| 5,986,186–5,986,427 | PA5316.2 | 242 | pant488 | PA14sr_154 | PA1302, PA2933, | −7 | −2 |
¹ RsmY and PrrF1–2, included as controls, were previously known to be up-regulated during the biofilm mode of growth [24]. The two small RNAs named PrrF1 and PrrF2 are adjacent and very similar in sequence and have been reported to also form a single transcript named PrrH
² n/a indicates that no consistent amplification or expression of that transcript was detectable under the given adaptive condition in three biological replicates. Expression was observed under mid logarithmic growth conditions
Four other putative sRNAs (68,836–69,271, 707,395–707,685, 830,970–831,031, and 99,801–100,048) could not be confirmed by RT-qPCR
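The "Size bp" column of the small RNA table is consistent with inclusive genomic coordinates, that is, size = end − start + 1. A minimal sketch for checking a row or measuring the overlap between two entries (the helper names are illustrative):

```python
def parse_coordinates(text: str):
    """Parse a string such as '143,349–143,517' into integer (start, end)."""
    start, end = text.replace(",", "").split("–")  # en dash, as used in the table
    return int(start), int(end)


def interval_size(start: int, end: int) -> int:
    """Length in bp of an interval with inclusive endpoints."""
    return end - start + 1


def overlap(a, b) -> int:
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


pa0123_1 = parse_coordinates("143,349–143,517")
print(interval_size(*pa0123_1))  # size of PA0123.1

# PA2461.1 and PA2461.2 have overlapping coordinates in the table.
pa2461_1 = parse_coordinates("2,761,459–2,761,704")
pa2461_2 = parse_coordinates("2,761,599–2,761,911")
print(overlap(pa2461_1, pa2461_2))
```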