Efficient genotype compression and analysis of large genetic-variation data sets.

Ryan M Layer, Neil Kindlon, Konrad J Karczewski, Aaron R Quinlan.

Abstract

Genotype Query Tools (GQT) is an indexing strategy that expedites analyses of genome-variation data sets in Variant Call Format based on sample genotypes, phenotypes and relationships. GQT's compressed genotype index minimizes decompression for analysis, and its performance relative to that of existing methods improves with cohort size. We show substantial (up to 443-fold) gains in performance over existing methods and demonstrate GQT's utility for exploring massive data sets involving thousands to millions of genomes. GQT can be accessed at https://github.com/ryanlayer/gqt.
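The abstract does not describe the index layout, so the following is only a minimal sketch of the general idea of querying genotypes by sample through a bitmap index. It assumes a sample-oriented layout (one bitmap per sample and genotype code) and omits compression entirely. The names `GENOTYPES`, `build_sample_bitmaps` and `variants_where_all_carry` are hypothetical and are not part of GQT.

```python
from typing import Dict, List

# Toy genotype matrix (an assumption for illustration, not real data):
# rows are variants, columns are samples.
# Codes: 0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate.
GENOTYPES: List[List[int]] = [
    [0, 1, 2, 0],
    [1, 1, 0, 0],
    [2, 0, 0, 1],
]


def build_sample_bitmaps(genotypes: List[List[int]]) -> Dict[int, Dict[int, int]]:
    """Build one integer bitmap per (sample, genotype code).

    Bit v of bitmaps[s][g] is set when sample s has genotype code g at variant v.
    """
    n_samples = len(genotypes[0])
    bitmaps = {s: {0: 0, 1: 0, 2: 0} for s in range(n_samples)}
    for v, row in enumerate(genotypes):
        for s, g in enumerate(row):
            bitmaps[s][g] |= 1 << v
    return bitmaps


def variants_where_all_carry(
    bitmaps: Dict[int, Dict[int, int]], samples: List[int], n_variants: int
) -> List[int]:
    """Return variant indices at which every listed sample has a non-reference genotype."""
    mask = (1 << n_variants) - 1  # start with all variants selected
    for s in samples:
        # A sample carries the alternate allele if it is heterozygous or homozygous alternate.
        mask &= bitmaps[s][1] | bitmaps[s][2]
    return [v for v in range(n_variants) if (mask >> v) & 1]


if __name__ == "__main__":
    index = build_sample_bitmaps(GENOTYPES)
    print(variants_where_all_carry(index, [0, 1], len(GENOTYPES)))
```

In this layout a query restricted to a few samples touches only those samples' bitmaps, and each step is a bitwise AND or OR over whole machine words rather than a scan of every genotype in every variant record.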

Year:  2015        PMID: 26550772      PMCID: PMC4697868          DOI: 10.1038/nmeth.3654

Source DB:  PubMed          Journal:  Nat Methods        ISSN: 1548-7091            Impact factor:   28.547


  25 in total

1.  Accurate, scalable cohort variant calls using DeepVariant and GLnexus.

Authors:  Taedong Yun; Helen Li; Pi-Chuan Chang; Michael F Lin; Andrew Carroll; Cory Y McLean
Journal:  Bioinformatics       Date:  2021-01-05       Impact factor: 6.937

2.  Robust and rapid algorithms facilitate large-scale whole genome sequencing downstream analysis in an integrative framework.

Authors:  Miaoxin Li; Jiang Li; Mulin Jun Li; Zhicheng Pan; Jacob Shujui Hsu; Dajiang J Liu; Xiaowei Zhan; Junwen Wang; Youqiang Song; Pak Chung Sham
Journal:  Nucleic Acids Res       Date:  2017-05-19       Impact factor: 16.971

3.  SeqArray-a storage-efficient high-performance data format for WGS variant calls.

Authors:  Xiuwen Zheng; Stephanie M Gogarten; Michael Lawrence; Adrienne Stilp; Matthew P Conomos; Bruce S Weir; Cathy Laurie; David Levine
Journal:  Bioinformatics       Date:  2017-08-01       Impact factor: 6.937

Review 4.  Advances in Genomic Discovery and Implications for Personalized Prevention and Medicine: Estonia as Example.

Authors:  Bram Peter Prins; Liis Leitsalu; Katri Pärna; Krista Fischer; Andres Metspalu; Toomas Haller; Harold Snieder
Journal:  J Pers Med       Date:  2021-04-29

5.  Sparse Allele Vectors and the Savvy Software Suite.

Authors:  Jonathon LeFaive; Albert V Smith; Hyun Min Kang; Gonçalo Abecasis
Journal:  Bioinformatics       Date:  2021-05-14       Impact factor: 6.931

Review 6.  Computational pan-genomics: status, promises and challenges.

Authors: 
Journal:  Brief Bioinform       Date:  2018-01-01       Impact factor: 11.622

Review 7.  Novel bioinformatic developments for exome sequencing.

Authors:  Stefan H Lelieveld; Joris A Veltman; Christian Gilissen
Journal:  Hum Genet       Date:  2016-04-13       Impact factor: 4.132

8.  Vcfanno: fast, flexible annotation of genetic variants.

Authors:  Brent S Pedersen; Ryan M Layer; Aaron R Quinlan
Journal:  Genome Biol       Date:  2016-06-01       Impact factor: 13.583

9.  Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes.

Authors:  Jerome Kelleher; Alison M Etheridge; Gilean McVean
Journal:  PLoS Comput Biol       Date:  2016-05-04       Impact factor: 4.475

10.  Ultrafast Comparison of Personal Genomes via Precomputed Genome Fingerprints.

Authors:  Gustavo Glusman; Denise E Mauldin; Leroy E Hood; Max Robinson
Journal:  Front Genet       Date:  2017-09-26       Impact factor: 4.599
