GRASP: guided reference-based assembly of short peptides.

Cuncong Zhong, Youngik Yang, Shibu Yooseph.

Abstract

Protein sequences predicted from metagenomic datasets are annotated by identifying their homologs via sequence comparisons with reference or curated proteins. However, a majority of metagenomic protein sequences are partial-length, arising as a result of identifying genes on sequencing reads or on assembled nucleotide contigs, which themselves are often very fragmented. The fragmented nature of metagenomic protein predictions adversely impacts homology detection and, therefore, the quality of the overall annotation of the dataset. Here we present a novel algorithm called GRASP that accurately identifies the homologs of a given reference protein sequence from a database consisting of partial-length metagenomic proteins. Our homology detection strategy is guided by the reference sequence, and involves the simultaneous search and assembly of overlapping database sequences. GRASP was compared to three commonly used protein sequence search programs (BLASTP, PSI-BLAST and FASTM). Our evaluations using several simulated and real datasets show that GRASP has a significantly higher sensitivity than these programs while maintaining a very high specificity. GRASP can be a very useful program for detecting and quantifying taxonomic and protein family abundances in metagenomic datasets. GRASP is implemented in GNU C++, and is freely available at http://sourceforge.net/projects/grasp-release.
© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
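
The abstract describes homology detection that is guided by the reference sequence and that searches and assembles overlapping database sequences at the same time. The toy sketch below illustrates that general idea only; it is not GRASP's algorithm or code. It assumes exact k-mer sharing as a stand-in for alignment scoring and exact suffix-prefix matches as a stand-in for overlap detection, and the names `kmers`, `suffix_prefix_overlap`, and `guided_assemble`, the parameters `k` and `min_overlap`, and the example sequences are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def suffix_prefix_overlap(left, right, min_len):
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 if no exact overlap of at least `min_len` residues exists.
    """
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def guided_assemble(reference, fragments, k=3, min_overlap=3):
    """Greedy, reference-guided extension of a seed fragment (toy model).

    Assumption: similarity to the reference is approximated by the number
    of shared k-mers; a real tool would use alignment scores instead.
    """
    ref_kmers = kmers(reference, k)

    def score(seq):
        return len(kmers(seq, k) & ref_kmers)

    # Seed with the fragment that looks most like the reference.
    seed = max(fragments, key=score)
    if score(seed) == 0:
        return "", []

    contig, used = seed, [seed]
    extended = True
    while extended:
        extended = False
        best = None  # (score, new_contig, fragment)
        for frag in fragments:
            if frag in used:
                continue
            candidates = []
            n = suffix_prefix_overlap(contig, frag, min_overlap)
            if n:  # fragment extends the contig to the right
                candidates.append(contig + frag[n:])
            n = suffix_prefix_overlap(frag, contig, min_overlap)
            if n:  # fragment extends the contig to the left
                candidates.append(frag[:-n] + contig)
            for cand in candidates:
                s = score(cand)
                # Accept only extensions that improve agreement with the reference.
                if s > score(contig) and (best is None or s > best[0]):
                    best = (s, cand, frag)
        if best is not None:
            _, contig, frag = best
            used.append(frag)
            extended = True
    return contig, used


# Hypothetical example data: three overlapping fragments plus one unrelated one.
reference = "MKTAYIAKQRQISFVKSHFSRQ"
fragments = ["MKTAYIAKQ", "IAKQRQISFV", "ISFVKSHFSRQ", "WWWWPPPPGG"]
contig, used = guided_assemble(reference, fragments)
print(contig)
print(used)
```

In this sketch a fragment is recruited only when it overlaps the growing contig and the extended contig agrees better with the reference, so short fragments that would score poorly on their own can still be collected through their neighbours. The unrelated fragment is never recruited.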

Year: 2014   PMID: 25414351   PMCID: PMC4330339   DOI: 10.1093/nar/gku1210

Source DB: PubMed   Journal: Nucleic Acids Res   ISSN: 0305-1048   Impact factor: 16.971


