Evigan: a hidden variable model for integrating gene evidence for eukaryotic gene prediction.

Qian Liu, Aaron J Mackey, David S Roos, Fernando C N Pereira.

Abstract

MOTIVATION: The increasing diversity and variable quality of evidence relevant to gene annotation argues for a probabilistic framework that automatically integrates such evidence to yield candidate gene models.
RESULTS: Evigan is an automated gene annotation program for eukaryotic genomes, employing probabilistic inference to integrate multiple sources of gene evidence. The probabilistic model is a dynamic Bayes network whose parameters are adjusted to maximize the probability of observed evidence. Consensus gene predictions are then derived by maximum likelihood decoding, yielding n-best models (with probabilities for each). Evigan is capable of accommodating a variety of evidence types, including (but not limited to) gene models computed by diverse gene finders, BLAST hits, EST matches, and splice site predictions; learned parameters encode the relative quality of evidence sources. Since separate training data are not required (apart from the training sets used by individual gene finders), Evigan is particularly attractive for newly sequenced genomes where little or no reliable manually curated annotation is available. The ability to produce a ranked list of alternative gene models may facilitate identification of alternatively spliced transcripts. Experimental application to ENCODE regions of the human genome, and to the genomes of Plasmodium vivax and Arabidopsis thaliana, shows that Evigan achieves better performance than any of the individual data sources used as evidence.
AVAILABILITY: The source code is available at http://www.seas.upenn.edu/~strctlrn/evigan/evigan.html.
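
The idea of learning the quality of each evidence source without curated training data can be illustrated with a much simpler hidden variable model. The sketch below is not Evigan's model: it treats every genomic position independently, so it has none of the dependencies between adjacent positions that a dynamic Bayes network captures, and it reduces each evidence source to a binary coding/noncoding vote. The function names `simulate_votes` and `em_integrate`, the vote encoding, and the synthetic accuracies are all assumptions made for illustration. Expectation-maximization adjusts the per-source parameters to increase the likelihood of the observed votes, and the resulting posteriors give a probability for each consensus call.

```python
import random


def simulate_votes(n_positions, accuracies, seed=0):
    """Hypothetical helper: synthetic evidence from sources of known accuracy."""
    rng = random.Random(seed)
    # Hidden truth: True = coding, False = noncoding.
    truth = [rng.random() < 0.5 for _ in range(n_positions)]
    # Each source reports the truth with its own accuracy, else the opposite.
    votes = [
        tuple(int(t if rng.random() < acc else not t) for acc in accuracies)
        for t in truth
    ]
    return truth, votes


def em_integrate(votes, n_iter=100, pseudo=1e-3):
    """Fit per-source reliabilities by EM, using no labels.

    votes: list of tuples of 0/1, one tuple per position, one entry per source.
    Returns (prior, sens, fpr, post), where post[i] is P(coding | votes[i])
    computed in the final E-step.
    """
    n_sources = len(votes[0])
    prior = 0.5                  # P(coding)
    sens = [0.7] * n_sources     # P(vote = 1 | coding)
    fpr = [0.3] * n_sources      # P(vote = 1 | noncoding)
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability of the hidden label at each position.
        post = []
        for row in votes:
            p1, p0 = prior, 1.0 - prior
            for k, v in enumerate(row):
                p1 *= sens[k] if v else 1.0 - sens[k]
                p0 *= fpr[k] if v else 1.0 - fpr[k]
            post.append(p1 / (p1 + p0))
        # M-step: re-estimate parameters from expected counts.
        # The small pseudo-count keeps every probability strictly inside (0, 1).
        total1 = sum(post)
        total0 = len(post) - total1
        prior = (total1 + pseudo) / (len(post) + 2 * pseudo)
        for k in range(n_sources):
            on1 = sum(q for q, row in zip(post, votes) if row[k])
            on0 = sum(1.0 - q for q, row in zip(post, votes) if row[k])
            sens[k] = (on1 + pseudo) / (total1 + 2 * pseudo)
            fpr[k] = (on0 + pseudo) / (total0 + 2 * pseudo)
    return prior, sens, fpr, post


if __name__ == "__main__":
    truth, votes = simulate_votes(2000, [0.85, 0.80, 0.75])
    prior, sens, fpr, post = em_integrate(votes)
    for k in range(len(sens)):
        print(f"source {k}: P(vote=1|coding)={sens[k]:.2f}, "
              f"P(vote=1|noncoding)={fpr[k]:.2f}")
```

In this toy setting the gap between `sens[k]` and `fpr[k]` plays the role of the learned source quality, and thresholding `post` at 0.5 gives a consensus call per position. A real gene-structure model would additionally need state transitions (exon, intron, intergenic) and sequence-level decoding, which this sketch omits.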

Year:  2008        PMID: 18187439     DOI: 10.1093/bioinformatics/btn004

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  16 in total

1.  A beginner's guide to eukaryotic genome annotation. (Review)

Authors:  Mark Yandell; Daniel Ence
Journal:  Nat Rev Genet       Date:  2012-04-18       Impact factor: 53.242

2.  Meeting report: a workshop on Best Practices in Genome Annotation.

Authors:  Ramana Madupu; Lauren M Brinkac; Jennifer Harrow; Laurens G Wilming; Ulrike Böhme; Philippe Lamesch; Linda I Hannick
Journal:  Database (Oxford)       Date:  2010-02-18       Impact factor: 3.451

3.  De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

Authors:  Minou Nowrousian; Jason E Stajich; Meiling Chu; Ines Engh; Eric Espagne; Karen Halliday; Jens Kamerewerd; Frank Kempken; Birgit Knab; Hsiao-Che Kuo; Heinz D Osiewacz; Stefanie Pöggeler; Nick D Read; Stephan Seiler; Kristina M Smith; Denise Zickler; Ulrich Kück; Michael Freitag
Journal:  PLoS Genet       Date:  2010-04-08       Impact factor: 5.917

4.  Metagenomes from high-temperature chemotrophic systems reveal geochemical controls on microbial community structure and function.

Authors:  William P Inskeep; Douglas B Rusch; Zackary J Jay; Markus J Herrgard; Mark A Kozubal; Toby H Richardson; Richard E Macur; Natsuko Hamamura; Ryan deM Jennings; Bruce W Fouke; Anna-Louise Reysenbach; Frank Roberto; Mark Young; Ariel Schwartz; Eric S Boyd; Jonathan H Badger; Eric J Mathur; Alice C Ortmann; Mary Bateson; Gill Geesey; Marvin Frazier
Journal:  PLoS One       Date:  2010-03-19       Impact factor: 3.240

5.  Novel Gene Discovery in the Human Malaria Parasite using Nucleosome Positioning Data.

Authors:  N Pokhriyal; N Ponts; E Y Harris; K G Le Roch; S Lonardi
Journal:  Comput Syst Bioinformatics Conf       Date:  2010-08

6.  A novel multifunctional oligonucleotide microarray for Toxoplasma gondii.

Authors:  Amit Bahl; Paul H Davis; Michael Behnke; Florence Dzierszinski; Manjunatha Jagalur; Feng Chen; Dhanasekaran Shanmugam; Michael W White; David Kulp; David S Roos
Journal:  BMC Genomics       Date:  2010-10-25       Impact factor: 3.969

7.  RNA-Seq analysis of splicing in Plasmodium falciparum uncovers new splice junctions, alternative splicing and splicing of antisense transcripts.

Authors:  Katherine Sorber; Michelle T Dimon; Joseph L DeRisi
Journal:  Nucleic Acids Res       Date:  2011-01-17       Impact factor: 16.971

8.  Draft genome sequencing and comparative analysis of Aspergillus sojae NBRC4239.

Authors:  Atsushi Sato; Kenshiro Oshima; Hideki Noguchi; Masahiro Ogawa; Tadashi Takahashi; Tetsuya Oguma; Yasuji Koyama; Takehiko Itoh; Masahira Hattori; Yoshiki Hanya
Journal:  DNA Res       Date:  2011-06       Impact factor: 4.458

9.  Gene gain and loss during evolution of obligate parasitism in the white rust pathogen of Arabidopsis thaliana.

Authors:  Eric Kemen; Anastasia Gardiner; Torsten Schultz-Larsen; Ariane C Kemen; Alexi L Balmuth; Alexandre Robert-Seilaniantz; Kate Bailey; Eric Holub; David J Studholme; Dan Maclean; Jonathan D G Jones
Journal:  PLoS Biol       Date:  2011-07-05       Impact factor: 8.029

10.  Evaluating high-throughput ab initio gene finders to discover proteins encoded in eukaryotic pathogen genomes missed by laboratory techniques.

Authors:  Stephen J Goodswen; Paul J Kennedy; John T Ellis
Journal:  PLoS One       Date:  2012-11-30       Impact factor: 3.240
