GTShark: genotype compression in large projects.

Sebastian Deorowicz, Agnieszka Danek.

Abstract

SUMMARY: Large sequencing projects now handle tens of thousands of individuals, and the huge files summarizing their findings require compression. We propose a tool that compresses large collections of genotypes almost 30% better than the best tool to date, squeezing a human genotype to less than 62 KB. It can also compress single samples with reference to an existing database, achieving comparable results.
AVAILABILITY AND IMPLEMENTATION: https://github.com/refresh-bio/GTShark.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Year: 2019    PMID: 31225861    DOI: 10.1093/bioinformatics/btz508

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  3 in total

1.  genozip: a fast and efficient compression tool for VCF files.

Authors:  Divon Lan; Raymond Tobler; Yassine Souilmi; Bastien Llamas
Journal:  Bioinformatics       Date:  2020-07-01       Impact factor: 6.937

2.  Sparse Project VCF: efficient encoding of population genotype matrices.

Authors:  Michael F Lin; Xiaodong Bai; William J Salerno; Jeffrey G Reid
Journal:  Bioinformatics       Date:  2021-04-01       Impact factor: 6.937

3.  XSI - A genotype compression tool for compressive genomics in large biobanks.

Authors:  Rick Wertenbroek; Simone Rubinacci; Ioannis Xenarios; Yann Thoma; Olivier Delaneau
Journal:  Bioinformatics       Date:  2022-06-24       Impact factor: 6.931
