BACKGROUND: The use of high-throughput sequencing for exploring biodiversity places heavy demands on bioinformatics applications for automated data processing. Here we introduce CLOTU, an online and open-access pipeline for processing 454 amplicon reads. CLOTU has been constructed to be highly user-friendly and flexible, since different types of analyses are needed for different datasets. RESULTS: In CLOTU, the user can filter out low-quality sequences, trim tags, primers and adaptors, cluster sequence reads, and run BLAST against NCBInr or a customized database in a high-performance computing environment. The resulting data may be browsed in a user-friendly manner and easily forwarded to downstream analyses. Although CLOTU is specifically designed for analyzing 454 amplicon reads, other types of DNA sequence data can also be processed. A fungal ITS sequence dataset generated by 454 sequencing of environmental samples is used to demonstrate the utility of CLOTU. CONCLUSIONS: CLOTU is a flexible and easy-to-use bioinformatics pipeline that includes different options for filtering, trimming, clustering and taxonomic annotation of high-throughput sequence reads. Some of these options are not included in comparable pipelines. CLOTU is implemented on a Linux computer cluster and is freely accessible to academic users through the Bioportal web-based bioinformatics service (http://www.bioportal.uio.no).
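The filtering and trimming steps described in the abstract can be illustrated with a minimal Python sketch. This is not CLOTU's code: the function names (`trim_primer`, `filter_and_trim`), the thresholds (`min_length`, `max_ambiguous`), and the exact-prefix primer match are all assumptions made for illustration only.

```python
from typing import Dict, Optional


def trim_primer(seq: str, primer: str) -> Optional[str]:
    """Return seq without its leading primer, or None if the primer is absent.

    Assumption: the primer must match the start of the read exactly.
    """
    if seq.startswith(primer):
        return seq[len(primer):]
    return None


def filter_and_trim(
    reads: Dict[str, str],
    primer: str,
    min_length: int = 150,   # hypothetical threshold
    max_ambiguous: int = 0,  # hypothetical threshold on the number of N bases
) -> Dict[str, str]:
    """Keep reads that carry the primer, are long enough after trimming,
    and contain no more than max_ambiguous ambiguous bases (N)."""
    kept: Dict[str, str] = {}
    primer = primer.upper()
    for read_id, seq in reads.items():
        trimmed = trim_primer(seq.upper(), primer)
        if trimmed is None:
            continue  # primer not found at the start of the read
        if len(trimmed) < min_length:
            continue  # read too short after trimming
        if trimmed.count("N") > max_ambiguous:
            continue  # too many ambiguous bases
        kept[read_id] = trimmed
    return kept
```

Reads that pass such a filter would then go on to clustering and a BLAST search, which this sketch does not cover.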