W Li, L Jaroszewski, A Godzik.
Abstract
We present a fast and flexible program for clustering large protein databases at different sequence identity levels. It takes less than 2 h for the all-against-all sequence comparison and clustering of the non-redundant protein database of over 560,000 sequences on a high-end PC. The output database, including only the representative sequences, can be used for more efficient and sensitive database searches.
Year: 2001 PMID: 11294794 DOI: 10.1093/bioinformatics/17.3.282
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937