Fast sequence clustering using a suffix array algorithm.

Ketil Malde, Eivind Coward, Inge Jonassen.

Abstract

MOTIVATION: Efficient clustering is important for handling the large amount of available EST sequences. Most contemporary methods are based on some kind of all-against-all comparison, resulting in a quadratic time complexity. A different approach is needed to keep up with the rapid growth of EST data.
RESULTS: A new, fast EST clustering algorithm is presented. Sub-quadratic time complexity is achieved by using an algorithm based on suffix arrays. A prototype implementation has been developed and run on a benchmark data set. The produced clusterings are validated by comparing them to clusterings produced by other methods, and the results are quite promising.
AVAILABILITY: The source code for the prototype implementation is available under a GPL license from http://www.ii.uib.no/~ketil/bio/.
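
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of clustering via a sorted suffix list, not the authors' method. It assumes that two sequences belong to the same cluster when they share an exact substring of at least `k` characters; the function name `cluster_by_shared_substrings` and the threshold `k` are hypothetical. The naive sort used here is for illustration and does not achieve the sub-quadratic running time reported in the paper.

```python
def cluster_by_shared_substrings(seqs, k):
    """Group sequence indices that share an exact substring of length >= k.

    Illustrative assumption: a shared k-length match is enough to merge
    two sequences. Real EST clustering tools use more careful criteria.
    """
    # Every suffix start (sequence index, offset) with at least k characters left.
    suffixes = [(i, j) for i, s in enumerate(seqs) for j in range(len(s) - k + 1)]
    # Lexicographic order of suffixes: suffixes with a common prefix become neighbours.
    suffixes.sort(key=lambda p: seqs[p[0]][p[1]:])

    # Union-find over sequence indices.
    parent = list(range(len(seqs)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Adjacent suffixes whose first k characters agree link their sequences.
    # Checking neighbours suffices: all suffixes sharing a k-prefix are contiguous.
    for (i1, j1), (i2, j2) in zip(suffixes, suffixes[1:]):
        if seqs[i1][j1:j1 + k] == seqs[i2][j2:j2 + k]:
            parent[find(i1)] = find(i2)

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


example = ["ACGTACGTAA", "TTACGTACGG", "GGGGCCCCGG", "CCCCGGTT", "TTTTTTTT"]
print(cluster_by_shared_substrings(example, 6))
```

In this toy input, the first two sequences share `ACGTACG` and the next two share `CCCCGG`, while the last sequence has no 6-character match with any other and remains a singleton.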

Year:  2003        PMID: 12835265     DOI: 10.1093/bioinformatics/btg138

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  11 in total

1.  Evolutionary insights from suffix array-based genome sequence analysis.

Authors:  Anindya Poddar; Nagasuma Chandra; Madhavi Ganapathiraju; K Sekar; Judith Klein-Seetharaman; Raj Reddy; N Balakrishnan
Journal:  J Biosci       Date:  2007-08       Impact factor: 1.826

2.  Insights into pathophysiology of dystropy through the analysis of gene networks: an example of bronchial asthma and tuberculosis.

Authors:  Elena Yu Bragina; Evgeny S Tiys; Maxim B Freidin; Lada A Koneva; Pavel S Demenkov; Vladimir A Ivanisenko; Nikolay A Kolchanov; Valery P Puzyrev
Journal:  Immunogenetics       Date:  2014-06-24       Impact factor: 2.846

3.  Discovering Activities to Recognize and Track in a Smart Environment.

Authors:  Parisa Rashidi; Diane J Cook; Lawrence B Holder; Maureen Schmitter-Edgecombe
Journal:  IEEE Trans Knowl Data Eng       Date:  2011       Impact factor: 6.977

4.  PEACE: Parallel Environment for Assembly and Clustering of Gene Expression.

Authors:  D M Rao; J C Moler; M Ozden; Y Zhang; C Liang; J E Karro
Journal:  Nucleic Acids Res       Date:  2010-06-03       Impact factor: 16.971

5.  Ultrafast clustering algorithms for metagenomic sequence analysis.

Authors:  Weizhong Li; Limin Fu; Beifang Niu; Sitao Wu; John Wooley
Journal:  Brief Bioinform       Date:  2012-07-06       Impact factor: 11.622

6.  CLU: a new algorithm for EST clustering.

Authors:  Andrey Ptitsyn; Winston Hide
Journal:  BMC Bioinformatics       Date:  2005-07-15       Impact factor: 3.169

7.  Masking repeats while clustering ESTs.

Authors:  Korbinian Schneeberger; Ketil Malde; Eivind Coward; Inge Jonassen
Journal:  Nucleic Acids Res       Date:  2005-04-14       Impact factor: 16.971

8.  An overview of the wcd EST clustering tool. (Review)

Authors:  Scott Hazelhurst; Winston Hide; Zsuzsanna Lipták; Ramon Nogueira; Richard Starfield
Journal:  Bioinformatics       Date:  2008-05-14       Impact factor: 6.937

9.  A hybrid distance measure for clustering expressed sequence tags originating from the same gene family.

Authors:  Keng-Hoong Ng; Chin-Kuan Ho; Somnuk Phon-Amnuaisuk
Journal:  PLoS One       Date:  2012-10-11       Impact factor: 3.240

10.  XHM: a system for detection of potential cross hybridizations in DNA microarrays.

Authors:  Kristian Flikka; Fekadu Yadetie; Astrid Laegreid; Inge Jonassen
Journal:  BMC Bioinformatics       Date:  2004-08-27       Impact factor: 3.169
