Richard CG Holland, Nick Lynch.
Abstract
Next-generation sequencing machines produce large quantities of data which are becoming increasingly difficult to move between collaborating organisations, or even to store within a single organisation. Compressing the data to assist with this is vital, but existing techniques do not perform as well as might be expected. The need for a new compression technique was identified by the Pistoia Alliance, who commissioned an open innovation contest to find one. The dynamic and interactive nature of the contest led to some novel algorithms and a high level of competition between participants.
Year: 2013 | PMID: 23596984 | PMCID: PMC3637481 | DOI: 10.1186/2047-217X-2-5
Source DB: PubMed | Journal: Gigascience | ISSN: 2047-217X | Impact factor: 6.524
A selection of entries vs baseline algorithms

| Entry (ID: author) | Category | Entry result | bzip2 result |
|---|---|---|---|
| 101: James Bonfield | Compression ratio | 0.1141 | 0.3007 |
| 61: James Bonfield | Compression time (s) | 109.9 | 1020.97 |
| 28: Ryan Braganza | Compression memory (KB) | 15040 | 5200 |
| 7: James Bonfield | Decompression time (s) | 100.91 | 104.5 |
| 28: Ryan Braganza | Decompression memory (KB) | 13472 | 5008 |
The results from running bzip2 are shown against the winning entry in each category of the contest. Full results from all entries, including links to their source code, are available on the Sequence Squeeze website [4]. Compression ratio is the ratio of compressed file size to original file size (smaller is better). Times are wall-clock seconds. Memory usage is peak usage in kilobytes. Entries with less than 100% round-trip accuracy are excluded.
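As a rough illustration of the ratio metric defined above, here is a minimal sketch (not the contest's actual scoring harness; the file paths are hypothetical) that computes the compression ratio from on-disk file sizes:

```python
import os

def compression_ratio(original_path: str, compressed_path: str) -> float:
    """Return compressed size divided by original size; smaller is better."""
    original_size = os.path.getsize(original_path)
    compressed_size = os.path.getsize(compressed_path)
    return compressed_size / original_size

# Hypothetical usage: a ratio of 0.3007 (the bzip2 baseline above) means
# the compressed file is roughly 30% of the original's size.
# print(f"{compression_ratio('reads.fastq', 'reads.fastq.bz2'):.4f}")
```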