Robert C. Edgar. Tiburon, CA 94920, USA. robert@drive5.com
Abstract
MOTIVATION: Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification.
RESULTS: UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity to distant protein relationships is lower. UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters. UCLUST offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets.
AVAILABILITY: Binaries are available at no charge for non-commercial use at http://www.drive5.com/usearch.