Thorsten Bischler, Matthias Kopf, Björn Voß.
Abstract
BACKGROUND: RNA-seq and its variant differential RNA-seq (dRNA-seq) are now routine methods for transcriptome analysis in bacteria. While expression profiling and transcriptional start site prediction are standard tasks, identifying transcriptional units genome-wide remains an unsolved problem for prokaryotic systems.
Year: 2014 PMID: 24780064 PMCID: PMC4016656 DOI: 10.1186/1471-2105-15-122
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Chromosome section 563,000 to 583,000 from H. pylori 26695. The positional log10-normalized coverage of primary read starts (black) and secondary reads (grey) is shown for the forward (top) and the reverse strand (bottom). Data for the forward strand are displayed with positive values above the annotation, and data for the reverse strand with negative values below. Annotated genes are represented by unfilled boxes. Predicted consensus transcript segments (T) are shown in grey and putative subtranscripts ( ) are shown as grey inlays. Transcripts from the maximum number of change points (T) are shown as dark grey boxes. TSSs and operons determined in [2] are indicated by filled and dotted arrows, respectively.
Figure 2. Chromosome section 71,500 to 78,500 from H. pylori 26695. Data are arranged in the same way as in Figure 1.
RNASEG results on simulated data
| t | | F-measure | | | | Recovered data (%) | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | 1 | 0.93 | 0.93 | 0.93 | 0.93 | 100.00 | 100.00 | 99.95 | 97.92 | |
| | 10 | - | 0.93 | 0.92 | 0.93 | - | 100.00 | 99.96 | 98.10 | |
| | 100 | - | - | 0.86 | 0.86 | - | - | 99.91 | 99.12 | |
| | 1,000 | - | - | - | 0.67 | - | - | - | 97.73 | |
| | 1 | 0.93 | 0.93 | 0.93 | 0.92 | 100.00 | 100.00 | 99.96 | 98.63 | |
| | 10 | - | 0.93 | 0.92 | 0.92 | - | 100.00 | 99.96 | 98.10 | |
| | 100 | - | - | 0.86 | 0.86 | - | - | 99.91 | 99.12 | |
| | 1,000 | - | - | - | 0.67 | - | - | - | 97.73 | |
| | 1 | 0.87 | 0.86 | 0.83 | 0.81 | 100.00 | 99.98 | 99.83 | 97.76 | |
| | 10 | - | 0.86 | 0.83 | 0.81 | - | 99.98 | 99.83 | 97.76 | |
| | 100 | - | - | 0.82 | 0.80 | - | - | 99.81 | 97.76 | |
| | 1,000 | - | - | - | 0.67 | - | - | - | 97.75 | |
F-measure and fraction of recovered sequencing data from the secondary library (in %) for simulated data. RNASEG was applied to the forward strand of region 684,676 to 987,046 of the H. pylori genome. We set = 20,000, = 1,000, w = 100, and varied the parameters t, a and u. We define true positives (TP) as genes that are part of a transcript segment, true negatives (TN) as intergenic regions that are not part of a transcript segment, false positives (FP) as intergenic regions that are part of a transcript segment, and false negatives (FN) as genes that are not part of a transcript segment. ‘-’ indicates parameter combinations that were not tested because they are not sensible. Numbers in brackets below t give the average runtime in CPU hours for all simulations with this value of t.
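
The TP/TN/FP/FN definitions above can be turned into an F-measure with a few lines of code. The following is a minimal sketch, not the authors' evaluation script. It assumes that "part of a transcript segment" means sharing at least one position with a predicted segment, and that the F-measure is the standard harmonic mean of precision and recall; the source states neither explicitly. The function names and the toy coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive (start, end) coordinates on one strand


def overlaps(region: Interval, segments: List[Interval]) -> bool:
    """True if the region shares at least one position with any segment.

    Assumption: any overlap counts as being "part of" a transcript segment.
    """
    start, end = region
    return any(start <= s_end and s_start <= end for s_start, s_end in segments)


def confusion_counts(genes: List[Interval],
                     intergenic: List[Interval],
                     segments: List[Interval]) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) following the definitions in the table caption."""
    tp = sum(overlaps(g, segments) for g in genes)       # genes covered
    fn = len(genes) - tp                                 # genes missed
    fp = sum(overlaps(r, segments) for r in intergenic)  # intergenic covered
    tn = len(intergenic) - fp                            # intergenic not covered
    return tp, tn, fp, fn


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Standard F1 score (assumed definition): 2PR / (P + R)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up coordinates (not taken from the paper).
genes = [(100, 200), (300, 400), (600, 700)]
intergenic = [(201, 299), (401, 599)]
segments = [(90, 200), (300, 450)]

tp, tn, fp, fn = confusion_counts(genes, intergenic, segments)
score = f_measure(tp, fp, fn)
```

In the toy example, two of three genes are covered and one of two intergenic regions is covered, so both precision and recall are 2/3. TN does not enter the F-measure, which is why it is computed but not passed on.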