Ying Huang, Beifang Niu, Ying Gao, Limin Fu, Weizhong Li.
Abstract
CD-HIT is a widely used program for clustering and comparing large biological sequence datasets. To further assist CD-HIT users, we have significantly improved the program, adding functions and improving its accuracy, scalability and flexibility. Most importantly, we developed a new web server, CD-HIT Suite, for clustering a user-uploaded sequence dataset or comparing it to another dataset at different identity levels. Users can now interactively explore the clusters within web browsers. We also provide downloadable clusters for several public databases (NCBI NR, Swissprot and PDB) at different identity levels. Availability: Free access at http://cd-hit.org
Year: 2010 PMID: 20053844 PMCID: PMC2828112 DOI: 10.1093/bioinformatics/btq003
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Screenshots of CD-HIT Suite. (a) Cluster Explorer for investigating clusters. (b) A cluster distribution plot to explore the global structure of a whole dataset.