Xin Chen, Wen-Chi Chou, Qin Ma, Ying Xu.
Abstract
A transcription unit (TU) consists of K ≥ 1 consecutive genes on the same strand of a bacterial genome that are transcribed into a single mRNA molecule under certain conditions. Identifying TUs is an essential step in elucidating transcriptional regulatory networks. We have recently developed a machine-learning method to accurately identify TUs from RNA-seq data, based on two features of the assembled RNA reads: the continuity and the stability of RNA-seq coverage across a genomic region. Although the method performed well on Escherichia coli and Clostridium thermocellum, substantial work is needed to make the program generally applicable to all bacteria, because the program requires organism-specific information. A web server, named SeqTU, was developed to automatically identify TUs from the RNA-seq data of any bacterium using a machine-learning approach. In addition to TU identification, the server provides a number of utility tools, such as data preparation, data quality checking and RNA-read mapping. SeqTU provides a user-friendly interface and automated prediction of TUs from the given RNA-seq data. The predicted TUs are displayed in HTML format along with a graphic visualization of the prediction.
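The two coverage features named in the abstract, continuity and stability, can be illustrated with a minimal Python sketch. This is not the SeqTU implementation: the function names `gap_continuity` and `coverage_stability`, the choice of "fraction of covered positions" for continuity, and the coefficient of variation for stability are all assumptions made here for illustration, and the abstract does not specify how the authors compute either feature.

```python
from statistics import mean, pstdev


def gap_continuity(coverage, gap_start, gap_end):
    """Fraction of positions in [gap_start, gap_end) covered by at least one read.

    Assumed stand-in for "continuity": 1.0 means the intergenic gap is fully
    covered, 0.0 means no read spans it.
    """
    gap = coverage[gap_start:gap_end]
    if not gap:
        # Adjacent or overlapping genes: there is no gap to break continuity.
        return 1.0
    return sum(1 for depth in gap if depth > 0) / len(gap)


def coverage_stability(coverage, start, end):
    """Coefficient of variation of read depth over [start, end).

    Assumed stand-in for "stability": lower values mean more even coverage.
    """
    region = coverage[start:end]
    if not region:
        raise ValueError("region must contain at least one position")
    avg = mean(region)
    if avg == 0:
        # No reads at all: treat the region as maximally unstable.
        return float("inf")
    return pstdev(region) / avg


# Toy per-base read depth (hypothetical data, not from the paper).
coverage = [5, 6, 5, 6, 5, 4, 5, 6, 5, 0, 0, 0, 7, 8]

# Hypothetical gene intervals as half-open (start, end) positions.
gene_a, gene_b, gene_c = (0, 4), (6, 9), (12, 14)

# Gap between gene_a and gene_b is fully covered; gap before gene_c is not.
print(gap_continuity(coverage, gene_a[1], gene_b[0]))
print(gap_continuity(coverage, gene_b[1], gene_c[0]))
print(coverage_stability(coverage, gene_a[0], gene_b[1]))
```

In a sketch like this, a fully covered gap with even depth across both genes would argue for placing them in the same TU, while an uncovered gap would argue against it. A real classifier, such as the machine-learning method the abstract describes, would learn the decision boundary from training data rather than use fixed thresholds.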
Year: 2017 PMID: 28266571 PMCID: PMC5339711 DOI: 10.1038/srep43925
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Workflow of SeqTU server.
Figure 2. Screenshots of the SeqTU server input pages.
(a) SeqTU server homepage; (b) email input page; (c) targeted genome selection page; (d) strand specificity option page; (e) input NC number of the targeted genome; and (f) page for submitting user’s RNA-seq data.
Figure 3. Screenshots of the SeqTU result pages for SRR400619.
(a) The progress report table; (b) The final TU prediction table; and (c) An example of computed expression levels over an identified TU in a non-strand-specific dataset, with the default 800 bp upstream and downstream, where the blue histogram represents the read depth over a TU, with the lower part showing the genes in a TU.
Figure 4. Screenshots of the SeqTU result pages for SRR578142.
(a) The final TU prediction table; and (b) An example of computed expression levels over an identified TU in a strand-specific dataset, where the blue histogram represents the read depth over a TU on the forward strand, the middle part shows the genes in a TU, and the red histogram represents the read depth over a TU on the reverse strand.