BACKGROUND: Whole-genome and exome analysis pipelines require steps such as annotating and filtering variants or comparing variants across different genomes. However, processing different databases and searching among millions of genomic loci is not trivial. RESULTS: gSearch compares sequence variants in the Genome Variation Format (GVF) or Variant Call Format (VCF) with a pre-compiled annotation or with variants in other genomes. Its search algorithms are optimized and implemented in a multi-threaded manner. gSearch is not a stand-alone annotation tool with its own reference databases. Rather, it is a search utility that readily accepts public or user-prepared reference files in various formats, including GVF, Generic Feature Format version 3 (GFF3), Gene Transfer Format (GTF), VCF and Browser Extensible Data (BED). gSearch runs more than 10 times faster than existing tools such as ANNOVAR. For example, it can annotate 52.8 million variants with allele frequencies in 6 min. AVAILABILITY: gSearch is available at http://ml.ssu.ac.kr/gSearch. It can be used as an independent search tool or easily integrated into existing pipelines through programming environments such as Perl, Ruby and Python.
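To make the kind of lookup described above concrete, the following is a minimal Python sketch of matching a variant position against a user-prepared BED reference. It is not gSearch's algorithm or interface, neither of which is specified here. The function names `build_index` and `annotate` are hypothetical, and the sketch assumes tab-separated BED lines with 0-based, half-open coordinates and 1-based VCF positions.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(bed_lines):
    """Group BED intervals by chromosome and sort them by start coordinate.

    Hypothetical helper for illustration only. BED coordinates are
    0-based and half-open: an interval covers start <= x < end.
    """
    index = defaultdict(list)
    for line in bed_lines:
        # Skip blank lines, comments, and UCSC header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, *rest = line.rstrip("\n").split("\t")
        name = rest[0] if rest else "."
        index[chrom].append((int(start), int(end), name))
    for chrom in index:
        index[chrom].sort()
    return index


def annotate(index, chrom, pos):
    """Return the names of all intervals containing a 1-based VCF position."""
    intervals = index.get(chrom, [])
    zero_based = pos - 1  # convert the VCF position to BED coordinates
    # Binary search: keep only intervals whose start is <= the position.
    hi = bisect_right(intervals, (zero_based, float("inf"), ""))
    # Linear filter on the end coordinate. This is simple but not optimal;
    # an interval tree would avoid scanning every earlier interval.
    return [name for start, end, name in intervals[:hi] if end > zero_based]


bed = [
    "chr1\t100\t200\tgeneA",
    "chr1\t150\t300\tgeneB",
    "chr2\t0\t50\tgeneC",
]
idx = build_index(bed)
print(annotate(idx, "chr1", 160))  # position inside both chr1 intervals
```

The sorted per-chromosome list keeps the example short. A tool handling tens of millions of variants would need a more efficient index and, as the abstract notes for gSearch, parallel execution.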