Henrik Stranneheim, Max Käller, Tobias Allander, Björn Andersson, Lars Arvestad, Joakim Lundeberg.
Abstract
MOTIVATION: New generation sequencing technologies producing increasingly complex datasets demand new efficient and specialized sequence analysis algorithms. Often, it is only the 'novel' sequences in a complex dataset that are of interest and the superfluous sequences need to be removed.
Year: 2010 PMID: 20472541 PMCID: PMC2887045 DOI: 10.1093/bioinformatics/btq230
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Timing, sensitivity and specificity of FACS versus BLAT and SSAHA2 on the synthetic metagenome dataset
| Method | K-mer size | Time (min) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| SSAHA2/454 | 12 | 32.4 | 98.6 | 98.9 |
| BLAT/11occ | 11 | 12.5 | 99.8 | 100 |
| BLAT/11occ/fastMap | 11 | 1.5 | 43.6 | 100 |
| BLAT/11occ/fastMap | 11 | 1.5 | 66.4 | 100 |
| FACS | 21 | 1.7 | 98.1 | 100 |
| FACS | 21 | 1.7 | 99.8 | 100 |
Sensitivity = true positives/(true positives + false negatives), specificity = true positives/(true positives + false positives). Classified reads were removed from further querying of subsequent reference genomes.
(a) Match cut-off: 65% sequence similarity over an alignment spanning at least 70% of the query length.
(b) Match cut-off: 45% sequence similarity.
(c) Match cut-off: 35% sequence similarity.
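The two measures defined in the table legend can be sketched in a few lines of Python. This is only an illustration of the stated formulas: the function names and the example counts are hypothetical and are not taken from the paper. Note that "specificity" as defined in the legend (true positives over true positives plus false positives) is the quantity often called precision elsewhere.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity = TP / (TP + FN), as defined in the table legend."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("sensitivity is undefined when TP + FN == 0")
    return true_pos / total


def specificity_as_defined(true_pos: int, false_pos: int) -> float:
    """Specificity = TP / (TP + FP), following the legend's definition."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("specificity is undefined when TP + FP == 0")
    return true_pos / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not data from the paper).
    tp, fn, fp = 90, 10, 0
    print(f"sensitivity = {100 * sensitivity(tp, fn):.1f}%")
    print(f"specificity = {100 * specificity_as_defined(tp, fp):.1f}%")
```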
Fig. 1. Venn diagram comparing read classification using three methods, with the human mitochondrial genome as a reference.
Classified reads from the experimental dataset using the human genome as a reference
| Method | K-mer size | Time (min) | Reads classified |
|---|---|---|---|
| SSAHA2/454 | 12 | 189 | 48 692 |
| BLAT/11occ | 11 | 129 | 39 403 |
| BLAT/11occ/fastMap | 11 | 11 | 24 244 |
| FACS | 21 | 6 | 40 074 |
(a) Match cut-off: 65% sequence similarity over an alignment spanning at least 70% of the query length.
(b) Match cut-off: 45.5% sequence similarity.
(c) Match cut-off: 45% sequence similarity.
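The first match cut-off in the footnotes above (65% similarity over an alignment spanning at least 70% of the query) could be applied to alignment hits roughly as follows. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, and similarity is assumed here to mean matching positions divided by alignment length, which the footnote does not specify.

```python
def passes_cutoff(
    matches: int,
    alignment_length: int,
    query_length: int,
    min_similarity: float = 0.65,  # 65% sequence similarity (footnote a)
    min_coverage: float = 0.70,    # alignment spans >= 70% of the query
) -> bool:
    """Return True if a hit satisfies both the similarity and coverage cut-offs.

    Assumption: similarity = matches / alignment_length.
    """
    if alignment_length <= 0 or query_length <= 0:
        return False
    similarity = matches / alignment_length
    coverage = alignment_length / query_length
    return similarity >= min_similarity and coverage >= min_coverage


if __name__ == "__main__":
    # Hypothetical hits: (matches, alignment length, query length).
    hits = [(70, 100, 120), (70, 100, 200), (60, 100, 120)]
    for hit in hits:
        print(hit, passes_cutoff(*hit))
```

The similarity-only cut-offs in the other footnotes correspond to calling the same function with `min_coverage=0.0` and the relevant `min_similarity`.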
Fig. 2. (A) Venn diagram comparing read classification using three methods with the human genome as a reference. (B) Nature of the unique reads for the three methods.