Human genomes as email attachments.

Scott Christley, Yiming Lu, Chen Li, Xiaohui Xie.

Abstract

SUMMARY: The amount of genomic sequence data being generated and made available through public databases continues to increase at an ever-expanding rate. Downloading, copying, sharing and manipulating these large datasets are becoming difficult and time consuming for researchers. We need to consider using advanced compression techniques as part of a standard data format for genomic data. The inherent structure of genome data allows for more efficient lossless compression than can be obtained through the use of generic compression programs. We apply a series of techniques to James Watson's genome that in combination reduce it to a mere 4MB, small enough to be sent as an email attachment.
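The abstract does not list the individual techniques, so the following is only a minimal sketch of one way the structure of genome data can be exploited, not the authors' method. It assumes (hypothetically) that a genome is stored as a sorted list of single-base substitutions against a shared reference, and it delta-encodes the positions as variable-length integers with a 2-bit base code folded into each value. All function names here are illustrative.

```python
# Illustrative sketch only: the abstract does not describe the paper's actual pipeline.
# Assumption: a genome is a sorted list of (position, base) substitutions
# relative to a reference sequence that both sender and receiver already hold.

BASES = "ACGT"  # 2-bit code: A=0, C=1, G=2, T=3


def _write_varint(n, out):
    """Append n to out as a little-endian base-128 varint."""
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return


def _read_varint(data, i):
    """Read a varint starting at index i; return (value, next_index)."""
    shift = 0
    value = 0
    while True:
        byte = data[i]
        i += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, i
        shift += 7


def encode_variants(variants):
    """Store each variant as (gap from previous position << 2) | base code."""
    out = bytearray()
    prev = 0
    for pos, base in variants:
        _write_varint(((pos - prev) << 2) | BASES.index(base), out)
        prev = pos
    return bytes(out)


def decode_variants(data):
    """Invert encode_variants, recovering absolute positions."""
    variants = []
    i = 0
    prev = 0
    while i < len(data):
        value, i = _read_varint(data, i)
        prev += value >> 2
        variants.append((prev, BASES[value & 3]))
    return variants


def apply_variants(reference, variants):
    """Rebuild the individual sequence from the reference plus substitutions."""
    seq = list(reference)
    for pos, base in variants:
        seq[pos] = base
    return "".join(seq)
```

Because gaps between neighbouring variants are much smaller than absolute coordinates, most entries fit in a few bytes, and decoding followed by `apply_variants` restores the sequence exactly, which keeps the scheme lossless.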

Year:  2008        PMID: 18996942     DOI: 10.1093/bioinformatics/btn582

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  45 in total

1.  Compressive genomics.

Authors:  Po-Ru Loh; Michael Baym; Bonnie Berger
Journal:  Nat Biotechnol       Date:  2012-07-10       Impact factor: 54.908

2.  Data structures and compression algorithms for genomic sequence data.

Authors:  Marty C Brandon; Douglas C Wallace; Pierre Baldi
Journal:  Bioinformatics       Date:  2009-05-15       Impact factor: 6.937

3.  Efficient storage of high throughput DNA sequencing data using reference-based compression.

Authors:  Markus Hsi-Yang Fritz; Rasko Leinonen; Guy Cochrane; Ewan Birney
Journal:  Genome Res       Date:  2011-01-18       Impact factor: 9.043

4.  Compressing genomic sequence fragments using SlimGene.

Authors:  Christos Kozanitis; Chris Saunders; Semyon Kruglyak; Vineet Bafna; George Varghese
Journal:  J Comput Biol       Date:  2011-03       Impact factor: 1.479

5.  An extended IUPAC nomenclature code for polymorphic nucleic acids.

Authors:  Andrew D Johnson
Journal:  Bioinformatics       Date:  2010-03-03       Impact factor: 6.937

6.  ERGC: an efficient referential genome compression algorithm.

Authors:  Subrata Saha; Sanguthevar Rajasekaran
Journal:  Bioinformatics       Date:  2015-07-02       Impact factor: 6.937

7.  Quality score compression improves genotyping accuracy.

Authors:  Y William Yu; Deniz Yorukoglu; Jian Peng; Bonnie Berger
Journal:  Nat Biotechnol       Date:  2015-03       Impact factor: 54.908

8.  Compression and fast retrieval of SNP data.

Authors:  Francesco Sambo; Barbara Di Camillo; Gianna Toffolo; Claudio Cobelli
Journal:  Bioinformatics       Date:  2014-07-26       Impact factor: 6.937

9.  Personal genomic information management and personalized medicine: challenges, current solutions, and roles of HIM professionals.

Authors:  Amal Alzu'bi; Leming Zhou; Valerie Watzlaf
Journal:  Perspect Health Inf Manag       Date:  2014-04-01

10.  NRGC: a novel referential genome compression algorithm.

Authors:  Subrata Saha; Sanguthevar Rajasekaran
Journal:  Bioinformatics       Date:  2016-08-02       Impact factor: 6.937
