Fast sequence clustering using a suffix array algorithm.

Ketil Malde, Eivind Coward, Inge Jonassen.

Abstract

MOTIVATION: Efficient clustering is important for handling the large amount of available EST sequences. Most contemporary methods are based on some kind of all-against-all comparison, resulting in a quadratic time complexity. A different approach is needed to keep up with the rapid growth of EST data.
RESULTS: A new, fast EST clustering algorithm is presented. Sub-quadratic time complexity is achieved by using an algorithm based on suffix arrays. A prototype implementation has been developed and run on a benchmark data set. The produced clusterings are validated by comparing them to clusterings produced by other methods, and the results are quite promising.
AVAILABILITY: The source code for the prototype implementation is available under a GPL license from http://www.ii.uib.no/~ketil/bio/.
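
To illustrate the general idea of avoiding all-against-all comparison, the following is a minimal sketch and not the authors' prototype. It assumes two sequences belong in the same cluster when they share an exact substring of at least k characters; this criterion, the function name cluster_by_shared_substring, and the parameter k are assumptions made for the example. Sorting all suffixes places suffixes with a common prefix next to each other, so only neighbouring entries need to be compared. The naive sort used here is for readability only; a practical implementation would use an efficient suffix array construction.

```python
def cluster_by_shared_substring(seqs, k):
    """Group sequence indices that share an exact substring of length >= k.

    Toy illustration: the clustering criterion is an assumption, not the
    method described in the paper.
    """
    # Collect every suffix of length >= k, tagged with its sequence index.
    suffixes = [
        (s[i:], idx)
        for idx, s in enumerate(seqs)
        for i in range(len(s) - k + 1)
    ]
    # Naive suffix-array construction: sort the suffixes lexicographically.
    suffixes.sort()

    # Union-find structure over sequence indices.
    parent = list(range(len(seqs)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Suffixes sharing a k-character prefix are contiguous after sorting,
    # so comparing neighbours is enough to link all of them.
    for (a, i), (b, j) in zip(suffixes, suffixes[1:]):
        if a[:k] == b[:k]:
            parent[find(i)] = find(j)

    clusters = {}
    for idx in range(len(seqs)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())


if __name__ == "__main__":
    example = ["ACGTACGTAA", "TTACGTACGG", "GGGGCCCCGG", "CCCCGGTTTT"]
    print(cluster_by_shared_substring(example, 6))
```

In this sketch the scan over the sorted suffixes is linear in the number of suffixes, so the pairwise comparison of every sequence against every other is never performed.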

Year: 2003 PMID: 12835265 DOI: 10.1093/bioinformatics/btg138

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

11 in total

1. Evolutionary insights from suffix array-based genome sequence analysis.

Authors: Anindya Poddar; Nagasuma Chandra; Madhavi Ganapathiraju; K Sekar; Judith Klein-Seetharaman; Raj Reddy; N Balakrishnan
Journal: J Biosci Date: 2007-08 Impact factor: 1.826

2. Insights into pathophysiology of dystropy through the analysis of gene networks: an example of bronchial asthma and tuberculosis.

Authors: Elena Yu Bragina; Evgeny S Tiys; Maxim B Freidin; Lada A Koneva; Pavel S Demenkov; Vladimir A Ivanisenko; Nikolay A Kolchanov; Valery P Puzyrev
Journal: Immunogenetics Date: 2014-06-24 Impact factor: 2.846

3. Discovering Activities to Recognize and Track in a Smart Environment.

4. PEACE: Parallel Environment for Assembly and Clustering of Gene Expression.

5. Ultrafast clustering algorithms for metagenomic sequence analysis.

6. CLU: a new algorithm for EST clustering.

7. Masking repeats while clustering ESTs.

8. An overview of the wcd EST clustering tool. (Review)

9. A hybrid distance measure for clustering expressed sequence tags originating from the same gene family.

10. XHM: a system for detection of potential cross hybridizations in DNA microarrays.