
Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences.

Weizhong Li, Adam Godzik.

Abstract

MOTIVATION: In 2001 and 2002, we published two papers (Bioinformatics, 17, 282-283; Bioinformatics, 18, 77-82) describing an ultrafast protein sequence clustering program called cd-hit. This program can efficiently cluster a huge protein database with millions of sequences. However, the applications of the underlying algorithm are not limited to protein sequence clustering. Here we present several new programs that use the same algorithm: cd-hit-2d, cd-hit-est and cd-hit-est-2d. Cd-hit-2d compares two protein datasets and reports similar matches between them; cd-hit-est clusters a DNA/RNA sequence database; and cd-hit-est-2d compares two nucleotide datasets. All these programs can handle huge datasets with millions of sequences and can be hundreds of times faster than methods based on popular sequence comparison and database search tools, such as BLAST.

Year:  2006        PMID: 16731699     DOI: 10.1093/bioinformatics/btl158

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  2000 in total

1.  Alternaria alternata allergen Alt a 1: a unique β-barrel protein dimer found exclusively in fungi.

Authors:  Maksymilian Chruszcz; Martin D Chapman; Tomasz Osinski; Robert Solberg; Matthew Demas; Przemyslaw J Porebski; Karolina A Majorek; Anna Pomés; Wladek Minor
Journal:  J Allergy Clin Immunol       Date:  2012-06-02       Impact factor: 10.793

2.  iPro54-PseKNC: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition.

Authors:  Hao Lin; En-Ze Deng; Hui Ding; Wei Chen; Kuo-Chen Chou
Journal:  Nucleic Acids Res       Date:  2014-10-31       Impact factor: 16.971

3.  Extensive transcriptional response associated with seasonal plasticity of butterfly wing patterns.

Authors:  Emily V Daniels; Rabi Murad; Ali Mortazavi; Robert D Reed
Journal:  Mol Ecol       Date:  2014-12-04       Impact factor: 6.185

4.  Cryptic genetic variation promotes rapid evolutionary adaptation in an RNA enzyme.

Authors:  Eric J Hayden; Evandro Ferrada; Andreas Wagner
Journal:  Nature       Date:  2011-06-02       Impact factor: 49.962

5.  Widespread ancient whole-genome duplications in Malpighiales coincide with Eocene global climatic upheaval.

Authors:  Liming Cai; Zhenxiang Xi; André M Amorim; M Sugumaran; Joshua S Rest; Liang Liu; Charles C Davis
Journal:  New Phytol       Date:  2018-07-21       Impact factor: 10.151

6.  Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes.

Authors:  Xi Li; Baohai Hao; Da Pan; Gerald M Schneeweiss
Journal:  Front Plant Sci       Date:  2017-11-21       Impact factor: 5.753

7.  Comparative Transcriptomics Reveals Patterns of Adaptive Evolution Associated with Depth and Age Within Marine Rockfishes (Sebastes).

Authors:  Joseph Heras; Andres Aguilar
Journal:  J Hered       Date:  2019-05-07       Impact factor: 2.645

8.  Cupriavidus necator H16 Uses Flavocytochrome c Sulfide Dehydrogenase To Oxidize Self-Produced and Added Sulfide.

Authors:  Chuanjuan Lü; Yongzhen Xia; Daixi Liu; Rui Zhao; Rui Gao; Honglei Liu; Luying Xun
Journal:  Appl Environ Microbiol       Date:  2017-10-31       Impact factor: 4.792

9.  Identification of DNA-binding proteins using structural, electrostatic and evolutionary features.

Authors:  Guy Nimrod; András Szilágyi; Christina Leslie; Nir Ben-Tal
Journal:  J Mol Biol       Date:  2009-02-20       Impact factor: 5.469

10.  Developmental dynamics of the preterm infant gut microbiota and antibiotic resistome.

Authors:  Molly K Gibson; Bin Wang; Sara Ahmadi; Carey-Ann D Burnham; Phillip I Tarr; Barbara B Warner; Gautam Dantas
Journal:  Nat Microbiol       Date:  2016-03-07       Impact factor: 17.745
