Timo Lassmann, Yoshihide Hayashizaki, Carsten O Daub.
Abstract
MOTIVATION: Next-generation parallel sequencing technologies produce large quantities of short sequence reads. Due to experimental procedures, various types of artifacts are commonly sequenced alongside the targeted RNA or DNA sequences. Identification of such artifacts is important during the development of novel sequencing assays and for the downstream analysis of the sequenced libraries.
Year: 2009 PMID: 19737799 PMCID: PMC2781754 DOI: 10.1093/bioinformatics/btp527
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Percentages of reads identified as artifacts in five sequencing runs at varying FDR thresholds
| Description | Accession | Sequences | FDR 0.05 (%) | FDR 0.01 (%) | FDR 0.001 (%) | CPU sec. |
|---|---|---|---|---|---|---|
| Genomic PE (18 nt) | ERR000017 | 6 381 596 | 1.4 (98.79) | 0.4 (98.91) | 0.1 (98.61) | 28 |
| Genomic PE (36 nt) | ERR000130 | 10 209 914 | 3.2 (84.05) | 0.8 (52.72) | 0.4 (11.44) | 84 |
| Genomic (25 nt) | SRR000723 | 7 230 975 | 1.7 (57.64) | 0.5 (54.26) | 0.1 (36.44) | 45 |
| ChIP-Seq (25 nt) | SRR000731 | 6 011 079 | 3.7 (29.15) | 2.5 (12.81) | 2.0 (1.73) | 37 |
| RNA-Seq (33 nt) | SRR002052 | 12 099 833 | 1.8 (23.32) | 0.6 (22.30) | 0.1 (20.38) | 103 |
The mapping rates of the artifactual sequences to the human genome are indicated in brackets. The last column lists the runtime of TagDust in CPU seconds for the 0.05 FDR cutoff.
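To make the percentages easier to interpret, the following minimal sketch converts the FDR 0.05 column into approximate read counts. It assumes that the bracketed mapping rate applies to the artifact reads only, as the note above states. The names `ROWS`, `approx_counts`, and `summarize` are illustrative and not part of TagDust. Because the table reports rounded percentages, the resulting counts are estimates, not values from the paper.

```python
# Values copied from the table: (accession, total reads,
# % of reads flagged as artifacts at FDR 0.05,
# % of those artifacts that map to the human genome).
ROWS = [
    ("ERR000017", 6_381_596, 1.4, 98.79),
    ("ERR000130", 10_209_914, 3.2, 84.05),
    ("SRR000723", 7_230_975, 1.7, 57.64),
    ("SRR000731", 6_011_079, 3.7, 29.15),
    ("SRR002052", 12_099_833, 1.8, 23.32),
]


def approx_counts(total_reads, artifact_pct, mapping_pct):
    """Return (approx. artifact reads, approx. artifact reads that map)."""
    artifacts = round(total_reads * artifact_pct / 100)
    mapped = round(artifacts * mapping_pct / 100)
    return artifacts, mapped


def summarize(rows):
    """Map each accession to its approximate (artifacts, mapped) counts."""
    return {acc: approx_counts(total, a_pct, m_pct)
            for acc, total, a_pct, m_pct in rows}


if __name__ == "__main__":
    for acc, (artifacts, mapped) in summarize(ROWS).items():
        print(f"{acc}: ~{artifacts:,} artifact reads, ~{mapped:,} of them map")
```

Run as a script, this prints one line per accession. Swapping in the FDR 0.01 or FDR 0.001 columns only requires changing the two percentage values in each row.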