Mohamad Al Kadi, Eiji Ishii, Dang Tat Truong, Daisuke Motooka, Shigeaki Matsuda, Tetsuya Iida, Toshio Kodama, Daisuke Okuzaki.
Abstract
Conventional bacterial genome annotation provides information about coding sequences but ignores untranslated regions and operons. However, untranslated regions contain important regulatory elements as well as targets for many regulatory factors, such as small RNAs. Operon maps are also essential for functional gene analysis. In the last decade, considerable progress has been made in the study of bacterial transcriptomes through transcriptome sequencing (RNA-seq). Given the compact nature of bacterial genomes, many challenges still cannot be resolved with the short reads generated by classical RNA-seq because of fragmentation and loss of full-length information. Direct RNA sequencing is a technology that sequences the native RNA directly without information loss or bias. Here, we employed direct RNA sequencing to annotate the Vibrio parahaemolyticus transcriptome with its full features, including transcription start sites (TSSs), transcription termination sites, and operon maps. A total of 4,103 TSSs were identified. In comparison to short-read sequencing, full-length information provided a deeper view of TSS classification, showing that most internal and antisense TSSs were actually a result of gene overlap. Sequencing the transcriptome of V. parahaemolyticus grown with bile allowed us to study the landscape of pathogenicity island Vp-PAI. Some genes in this region were reannotated, providing more accurate annotation to increase precision in their characterization. Quantitative detection of operons in V. parahaemolyticus showed high complexity in some operons, shedding light on a greater extent of regulation within the same operon. Our study using direct RNA sequencing provides a quantitative and high-resolution landscape of the V. parahaemolyticus transcriptome.
IMPORTANCE Vibrio parahaemolyticus is a halophilic bacterium found in the marine environment. Outbreaks of gastroenteritis resulting from seafood poisoning by this pathogen have risen over the past 2 decades. Upon ingestion by humans, often through the consumption of raw or undercooked seafood, V. parahaemolyticus senses the host environment and expresses numerous genes, the products of which synergize to synthesize and secrete toxins that can cause acute gastroenteritis. To understand the regulation of such an adaptive response, mRNA transcripts must be mapped accurately. However, due to the limitations of common sequencing methods, not all features of bacterial transcriptomes are always reported. We applied direct RNA sequencing to analyze the V. parahaemolyticus transcriptome. Mapping the full features of the transcriptome is anticipated to enhance our understanding of gene regulation in this bacterium and to provide a data set for future work. Additionally, this study reveals a deeper view of a complicated transcriptome landscape, demonstrating the importance of applying such methods to other bacterial models.
Keywords: RNA-seq; Vibrio parahaemolyticus; nanopore; pathogenicity islands; transcriptome
Year: 2021 PMID: 34751588 PMCID: PMC8577284 DOI: 10.1128/mSystems.00996-21
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
FIG 1 Quantitative evaluation of direct RNA sequencing. (A) Correlation between direct RNA sequencing (dRNA-seq) and short-read Illumina sequencing. A correlation matrix shows a strong correlation between dRNA-seq and Illumina for the same samples, a weaker correlation for biological replicates, and the weakest correlation for different biological samples. (B) Differential gene expression analysis. A Venn diagram illustrates the overlap between the differential gene expression results from dRNA-seq and Illumina. Most of the differentially expressed genes were detected by both methods.
FIG 2 Transcription start site identification. (A) A visual example of direct RNA sequencing reads and an annotated TSS. (B) Detected transcription start sites under both conditions, growing with bile (Bile) and without bile (LB). (C) Distribution of lengths of 5′ UTRs of the annotated genes. (D) Analysis of sequences upstream of the transcription start sites showed an enriched motif for the elements of the bacterial σ70 promoter. (E) Classification of detected transcription start sites. A TSS was classified as primary if it was less than 300 bp from a start codon, secondary (alternative) if it was expressed less than another TSS for the same gene, internal if it was located in the coding region, or antisense if it was located in the coding region or was less than 100 bp from the coding region on the opposite strand.
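The classification rules in FIG 2E can be sketched in a few lines of Python. This is only an illustration of the stated thresholds (300 bp and 100 bp), not the authors' pipeline. The `Gene` record, the function name, the `unclassified` label, the order in which the rules are checked, and the requirement that a secondary TSS also lies within the 300-bp window are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are inclusive genome positions."""
    name: str
    start: int   # leftmost coordinate of the coding region
    end: int     # rightmost coordinate of the coding region
    strand: str  # "+" or "-"


def classify_tss(tss_pos, tss_strand, gene, is_strongest_for_gene=True,
                 primary_window=300, antisense_window=100):
    """Label one TSS relative to one gene using the FIG 2E thresholds.

    The rule order (antisense, internal, primary/secondary) is an assumption;
    the caption does not state how conflicts are resolved.
    """
    inside_cds = gene.start <= tss_pos <= gene.end

    if tss_strand != gene.strand:
        if inside_cds:
            return "antisense"
        # Distance from the TSS to the nearest edge of the coding region.
        gap = gene.start - tss_pos if tss_pos < gene.start else tss_pos - gene.end
        return "antisense" if gap < antisense_window else "unclassified"

    if inside_cds:
        return "internal"

    # Same strand, outside the coding region: distance upstream of the start codon.
    upstream = gene.start - tss_pos if gene.strand == "+" else tss_pos - gene.end
    if 0 < upstream < primary_window:
        return "primary" if is_strongest_for_gene else "secondary"
    return "unclassified"
```

For a plus-strand gene spanning positions 1000 to 2000, a same-strand TSS at 900 is labeled primary (or secondary if a stronger TSS exists for that gene), one at 1500 is internal, and an opposite-strand TSS at 2050 is antisense.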
FIG 3 Transcription termination site identification. (A) An example of a detected TTS. The TTS was selected as the most frequent end site among the end sites after the stop codon. (B) Number of TTSs per gene. Most genes had only one termination site, but others had up to 4 sites. (C) Distribution of lengths of 3′ UTRs for annotated genes. (D) Detection of the rho-independent motif by analyzing the sequences at the termination site.
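The selection rule in FIG 3A (the most frequent read end site after the stop codon) can be sketched as follows. The function name, its arguments, and the tie-breaking rule (prefer the candidate closest to the stop codon) are assumptions for illustration; the caption does not say how ties or strand orientation were handled.

```python
from collections import Counter


def pick_tts(read_ends, stop_pos, strand="+"):
    """Return the most frequent read 3' end located after the stop codon.

    read_ends: genome coordinates where reads assigned to the gene end.
    stop_pos:  coordinate of the last base of the stop codon in the
               direction of transcription.
    Returns None when no read ends downstream of the stop codon.
    """
    if strand == "+":
        downstream = [e for e in read_ends if e > stop_pos]
    else:
        # On the minus strand, "after the stop codon" means smaller coordinates.
        downstream = [e for e in read_ends if e < stop_pos]
    if not downstream:
        return None

    counts = Counter(downstream)
    best = max(counts.values())
    candidates = [pos for pos, n in counts.items() if n == best]
    # Assumed tie-break: take the candidate nearest to the stop codon.
    return min(candidates, key=lambda pos: abs(pos - stop_pos))
```

Read ends located before the stop codon are ignored, so truncated reads that stop inside the coding region do not contribute to the chosen site.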
FIG 4 Operon detection. (A) Operon detection and quantification. Long reads allow operons to be detected both qualitatively and quantitatively. (B) Distribution of operon gene numbers. (C) Expression level of detected operons relative to their genes’ expression. Operons have low (<20%), medium (20 to 60%) or high (>60%) expression compared to their genes’ expression. (D) Suboperon in the tatABC operon. The tatABC operon showed two transcriptional units, tatABC and tatAB.
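The three expression classes in FIG 4C can be written as a small binning function. This is a sketch under stated assumptions: the caption gives only the thresholds, so the way the ratio is computed here (reads supporting the whole operon divided by a reference read count for its genes) and all names are hypothetical.

```python
def operon_expression_class(operon_reads, gene_reads):
    """Bin an operon as low, medium, or high using the FIG 4C thresholds.

    operon_reads: number of reads supporting the full operon transcript.
    gene_reads:   reference read count for the operon's genes (how this is
                  aggregated across genes is an assumption left to the caller).
    """
    if gene_reads <= 0:
        raise ValueError("gene_reads must be positive")
    percent = 100.0 * operon_reads / gene_reads
    if percent < 20:
        return "low"
    if percent <= 60:
        return "medium"
    return "high"
```

With this binning, an operon supported by 10 reads against 100 reads for its genes falls in the low class, which is the kind of case where most transcription of those genes comes from shorter units such as suboperons.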
FIG 5 Analysis of the transcriptional landscape of the pathogenicity island. (A) Reannotation of the VP-RS21460 gene; our reconstructed transcript supports the old annotation, VPA1312. It covers the gene body and has a Shine-Dalgarno (SD) sequence near the start codon. (B) Reannotation of the VPA1369 gene. Our reconstructed transcript supports the new annotation, VP-RS21725. It covers the gene body and has a Shine-Dalgarno (SD) sequence near the start codon. (C) Suboperons in the operon VPA1321–VPA1323; all transcriptional units share the same transcription termination sites, but each of them has its own transcription start site (TSS). (D) Northern blot of VPA1321 in Vibrio parahaemolyticus in WT and ΔVPA1321 strains growing with bile and an NaCl concentration of 0.5% (left) and a WT strain growing without bile and an NaCl concentration of 3% (right).