Avril Coghlan (1), Richard Durbin. (1) Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. alc@sanger.ac.uk
Abstract
MOTIVATION: Correct gene predictions are crucial for most analyses of genomes. However, in the absence of transcript data, gene prediction is still challenging. One way to improve gene-finding accuracy in such genomes is to combine the exons predicted by several gene-finders, so that gene-finders that make uncorrelated errors can correct each other. RESULTS: We present a method for combining gene-finders called Genomix. Genomix selects the predicted exons that are best conserved within and/or between species in terms of sequence and intron-exon structure, and combines them into a gene structure. Genomix was used to combine predictions from four gene-finders for Caenorhabditis elegans, by selecting the predicted exons that are best conserved with C.briggsae and C.remanei. On a set of approximately 1500 confirmed C.elegans genes, Genomix increased the exon-level specificity by 10.1% and sensitivity by 2.7% compared to the best input gene-finder. AVAILABILITY: Scripts and Supplementary Material can be found at http://www.sanger.ac.uk/Software/analysis/genomix
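The results above are stated as exon-level sensitivity and specificity, which the abstract does not define. The sketch below illustrates one common convention in gene-finder evaluation, assumed here rather than taken from the paper: a predicted exon counts as correct only if both of its boundaries exactly match a confirmed exon; sensitivity is the fraction of confirmed exons that were predicted, and specificity is the fraction of predicted exons that are correct. The function name, the exon tuple layout, and the coordinates are hypothetical and are not part of Genomix.

```python
from typing import Iterable, Tuple

# Hypothetical exon representation: (sequence name, start, end, strand).
Exon = Tuple[str, int, int, str]


def exon_level_accuracy(predicted: Iterable[Exon],
                        confirmed: Iterable[Exon]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the exon level.

    Assumed convention: an exon is correct only when sequence, start,
    end and strand all match a confirmed exon exactly.
    """
    pred = set(predicted)
    conf = set(confirmed)
    exact = pred & conf  # exons with both boundaries correct

    # Guard against empty sets to avoid division by zero.
    sensitivity = len(exact) / len(conf) if conf else 0.0
    specificity = len(exact) / len(pred) if pred else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    confirmed = [("I", 100, 200, "+"), ("I", 300, 400, "+"),
                 ("I", 500, 600, "+"), ("I", 700, 800, "+")]
    predicted = [("I", 100, 200, "+"), ("I", 300, 400, "+"),
                 ("I", 500, 600, "+"), ("I", 710, 800, "+"),
                 ("I", 900, 950, "+")]
    sn, sp = exon_level_accuracy(predicted, confirmed)
    print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

Under this convention, a combiner that discards poorly conserved exons mainly raises specificity by removing false predictions, while sensitivity rises only when a correct exon missed by the best single gene-finder is recovered from another.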