JIGSAW: integration of multiple sources of evidence for gene prediction.

Jonathan E Allen, Steven L Salzberg.

Abstract

MOTIVATION: Computational gene finding systems play an important role in finding new human genes, although no systems are yet accurate enough to predict all or even most protein-coding regions perfectly. Ab initio programs can be augmented by evidence such as expression data or protein sequence homology, which improves their performance. The amount of such evidence continues to grow, but computational methods continue to have difficulty predicting genes when the evidence is conflicting or incomplete. Genome annotation pipelines collect a variety of types of evidence about gene structure and synthesize the results, which can then be refined further through manual, expert curation of gene models.
RESULTS: JIGSAW is a new gene finding system designed to automate the process of predicting gene structure from multiple sources of evidence, with results that often match the performance of human curators. JIGSAW computes the relative weight of different lines of evidence using statistics generated from a training set, and then combines the evidence using dynamic programming. Our results show that JIGSAW's performance is superior to ab initio gene finding methods and to other pipelines such as Ensembl. Even without evidence from alignment to known genes, JIGSAW can substantially improve gene prediction accuracy as compared with existing methods.
AVAILABILITY: JIGSAW is available as an open source software package at http://cbcb.umd.edu/software/jigsaw.
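
The weighting-then-combining idea in the RESULTS paragraph can be illustrated with a toy sketch. This is not JIGSAW's actual model or code: the function names (`train_weights`, `combine_evidence`), the two-state labeling (0 = non-coding, 1 = coding), the accuracy-based weights, and the `switch_penalty` parameter are all assumptions chosen for illustration. Each evidence source is weighted by how often it agrees with a training annotation, and a Viterbi-style dynamic program then picks the labeling with the highest total weighted support.

```python
from typing import Dict, List


def train_weights(truth: List[int], evidence: Dict[str, List[int]]) -> Dict[str, float]:
    """Weight each source by its per-position accuracy on a training annotation.

    Assumption for illustration only: the real system derives its statistics
    from a training set in its own way.
    """
    weights = {}
    for name, calls in evidence.items():
        correct = sum(1 for call, label in zip(calls, truth) if call == label)
        weights[name] = correct / len(truth)
    return weights


def combine_evidence(evidence: Dict[str, List[int]],
                     weights: Dict[str, float],
                     switch_penalty: float = 0.5) -> List[int]:
    """Dynamic programming over two states: 0 = non-coding, 1 = coding."""
    names = list(evidence)
    n = len(evidence[names[0]])

    def emit(i: int, state: int) -> float:
        # Total weight of the sources that vote for `state` at position i.
        return sum(weights[name] for name in names if evidence[name][i] == state)

    score = [emit(0, 0), emit(0, 1)]  # best score of a labeling ending in each state
    back = []                          # back-pointers for positions 1..n-1
    for i in range(1, n):
        new_score, pointers = [], []
        for state in (0, 1):
            stay = score[state]
            switch = score[1 - state] - switch_penalty
            pointers.append(state if stay >= switch else 1 - state)
            new_score.append(max(stay, switch) + emit(i, state))
        score = new_score
        back.append(pointers)

    # Trace back from the better final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical training data: a curated labeling and two sources' calls.
truth = [0, 0, 1, 1, 1, 0]
training = {"ab_initio": [0, 1, 1, 1, 0, 0], "homology": [0, 0, 1, 1, 1, 0]}
weights = train_weights(truth, training)

# Hypothetical new region where the two sources disagree at two positions.
region = {"ab_initio": [0, 0, 1, 0, 1, 1, 0], "homology": [0, 0, 1, 1, 1, 0, 0]}
prediction = combine_evidence(region, weights)
```

In this toy setup the source that was more accurate in training carries more weight, so its calls tend to win where the sources conflict, while the switch penalty discourages implausibly fragmented labelings.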

Year:  2005        PMID: 16076884     DOI: 10.1093/bioinformatics/bti609

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  60 in total

Review 1.  A beginner's guide to eukaryotic genome annotation.

Authors:  Mark Yandell; Daniel Ence
Journal:  Nat Rev Genet       Date:  2012-04-18       Impact factor: 53.242

Review 2.  Homology and phylogeny and their automated inference.

Authors:  Georg Fuellen
Journal:  Naturwissenschaften       Date:  2008-02-21

3.  Conrad: gene prediction using conditional random fields.

Authors:  David DeCaprio; Jade P Vinson; Matthew D Pearson; Philip Montgomery; Matthew Doherty; James E Galagan
Journal:  Genome Res       Date:  2007-08-09       Impact factor: 9.043

4.  Improving the assessment of the outcome of nonsynonymous SNVs with a consensus deleteriousness score, Condel.

Authors:  Abel González-Pérez; Nuria López-Bigas
Journal:  Am J Hum Genet       Date:  2011-03-31       Impact factor: 11.025

Review 5.  Between a chicken and a grape: estimating the number of human genes.

Authors:  Mihaela Pertea; Steven L Salzberg
Journal:  Genome Biol       Date:  2010-05-05       Impact factor: 13.583

6.  Realistic artificial DNA sequences as negative controls for computational genomics.

Authors:  Juan Caballero; Arian F A Smit; Leroy Hood; Gustavo Glusman
Journal:  Nucleic Acids Res       Date:  2014-05-06       Impact factor: 16.971

7.  Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis.

Authors:  Rami A Dalloul; Julie A Long; Aleksey V Zimin; Luqman Aslam; Kathryn Beal; Le Ann Blomberg; Pascal Bouffard; David W Burt; Oswald Crasta; Richard P M A Crooijmans; Kristal Cooper; Roger A Coulombe; Supriyo De; Mary E Delany; Jerry B Dodgson; Jennifer J Dong; Clive Evans; Karin M Frederickson; Paul Flicek; Liliana Florea; Otto Folkerts; Martien A M Groenen; Tim T Harkins; Javier Herrero; Steve Hoffmann; Hendrik-Jan Megens; Andrew Jiang; Pieter de Jong; Pete Kaiser; Heebal Kim; Kyu-Won Kim; Sungwon Kim; David Langenberger; Mi-Kyung Lee; Taeheon Lee; Shrinivasrao Mane; Guillaume Marcais; Manja Marz; Audrey P McElroy; Thero Modise; Mikhail Nefedov; Cédric Notredame; Ian R Paton; William S Payne; Geo Pertea; Dennis Prickett; Daniela Puiu; Dan Qioa; Emanuele Raineri; Magali Ruffier; Steven L Salzberg; Michael C Schatz; Chantel Scheuring; Carl J Schmidt; Steven Schroeder; Stephen M J Searle; Edward J Smith; Jacqueline Smith; Tad S Sonstegard; Peter F Stadler; Hakim Tafer; Zhijian Jake Tu; Curtis P Van Tassell; Albert J Vilella; Kelly P Williams; James A Yorke; Liqing Zhang; Hong-Bin Zhang; Xiaojun Zhang; Yang Zhang; Kent M Reed
Journal:  PLoS Biol       Date:  2010-09-07       Impact factor: 8.029

8.  MaizeGDB becomes 'sequence-centric'.

Authors:  Taner Z Sen; Carson M Andorf; Mary L Schaeffer; Lisa C Harper; Michael E Sparks; Jon Duvick; Volker P Brendel; Ethalinda Cannon; Darwin A Campbell; Carolyn J Lawrence
Journal:  Database (Oxford)       Date:  2009-12-07       Impact factor: 3.451

9.  The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes.

Authors:  James C Estill; Jeffrey L Bennetzen
Journal:  Plant Methods       Date:  2009-06-19       Impact factor: 4.993

10.  H-InvDB in 2009: extended database and data mining resources for human genes and transcripts.

Authors:  Chisato Yamasaki; Katsuhiko Murakami; Jun-ichi Takeda; Yoshiharu Sato; Akiko Noda; Ryuichi Sakate; Takuya Habara; Hajime Nakaoka; Fusano Todokoro; Akihiro Matsuya; Tadashi Imanishi; Takashi Gojobori
Journal:  Nucleic Acids Res       Date:  2009-11-23       Impact factor: 16.971
