
TRELLIS+: an effective approach for indexing genome-scale sequences using suffix trees.

Benjarath Phoophakdee, Mohammed J Zaki.

Abstract

With advances in high-throughput sequencing methods and the corresponding exponential growth in sequence data, it has become critical to develop scalable data management techniques for sequence storage, retrieval and analysis. In this paper we present a novel disk-based suffix tree approach, called TRELLIS+, that effectively scales to massive amounts of sequence data using only a limited amount of main memory, based on a novel string buffering strategy. We show experimentally that TRELLIS+ outperforms existing suffix tree approaches; it is able to index genome-scale sequences (e.g., the entire human genome), and it also allows rapid query processing over the disk-based index. AVAILABILITY: TRELLIS+ source code is available online at http://www.cs.rpi.edu/-zaki/software/trellis

Year:  2008        PMID: 18229678

Source DB:  PubMed          Journal:  Pac Symp Biocomput        ISSN: 2335-6928


  7 in total

1.  A hybrid, de novo based, genome-wide database search approach applied to the sea urchin neuropeptidome.

Authors:  Gerben Menschaert; Tom T M Vandekerckhove; Geert Baggerman; Bart Landuyt; Jonathan V Sweedler; Liliane Schoofs; Walter Luyten; Wim Van Criekinge
Journal:  J Proteome Res       Date:  2010-02-05       Impact factor: 4.466

2.  BiologicalNetworks--tools enabling the integration of multi-scale data for the host-pathogen studies.

Authors:  Sergey Kozhenkov; Mayya Sedova; Yulia Dubinina; Amarnath Gupta; Animesh Ray; Julia Ponomarenko; Michael Baitaluk
Journal:  BMC Syst Biol       Date:  2011-01-14

3.  Damming the genomic data flood using a comprehensive analysis and storage data structure.

Authors:  Marc Bouffard; Michael S Phillips; Andrew M K Brown; Sharon Marsh; Jean-Claude Tardif; Tibor van Rooij
Journal:  Database (Oxford)       Date:  2010-12-15       Impact factor: 3.451

4.  STEME: efficient EM to find motifs in large data sets.

Authors:  John E Reid; Lorenz Wernisch
Journal:  Nucleic Acids Res       Date:  2011-07-23       Impact factor: 16.971

5.  Diminishing return for increased Mappability with longer sequencing reads: implications of the k-mer distributions in the human genome.

Authors:  Wentian Li; Jan Freudenberg; Pedro Miramontes
Journal:  BMC Bioinformatics       Date:  2014-01-03       Impact factor: 3.169

6.  Breaking the computational barriers of pairwise genome comparison.

Authors:  Oscar Torreno; Oswaldo Trelles
Journal:  BMC Bioinformatics       Date:  2015-08-11       Impact factor: 3.169

7.  Suffix tree searcher: exploration of common substrings in large DNA sequence sets.

Authors:  David Minkley; Michael J Whitney; Song-Han Lin; Marina G Barsky; Chris Kelly; Chris Upton
Journal:  BMC Res Notes       Date:  2014-07-23