H H Chou¹, M H Holmes. ¹Department of Zoology and Genetics, Department of Computer Science, Iowa State University, Ames, IA 50011, USA. hhchou@iastate.edu
Abstract
MOTIVATION: Most sequence comparison methods assume that the data being compared are trustworthy, but this is not the case with raw DNA sequences obtained from automatic sequencing machines. Nevertheless, sequence comparisons need to be done on them in order to remove vector splice sites and contaminants. This step is necessary before other genomic data processing stages can be carried out, such as fragment assembly or EST clustering. A specialized tool is therefore needed to solve this apparent dilemma.

RESULTS: We have designed and implemented a program that specifically addresses the problem. This program, called LUCY, has been in use since 1998 at The Institute for Genomic Research (TIGR). During this period, many rounds of experience-driven modifications were made to LUCY to improve its accuracy and its ability to deal with extremely difficult input cases. We believe we have finally obtained a useful program which strikes a delicate balance among the many issues involved in the raw sequence cleaning problem, and we wish to share it with the research community.

AVAILABILITY: LUCY is available directly from TIGR (http://www.tigr.org/softlab). Academic users can download LUCY after accepting a free academic use license. Business users may need to pay a license fee to use LUCY for commercial purposes.

CONTACT: Questions regarding the quality assessment module of LUCY should be directed to Michael Holmes (mholmes@tigr.org). Questions regarding other aspects of LUCY should be directed to Hui-Hsien Chou (hhchou@iastate.edu).