
Large-scale prokaryotic gene prediction and comparison to genome annotation.

Pernille Nielsen, Anders Krogh.

Abstract

MOTIVATION: Prokaryotic genomes are sequenced and annotated at an increasing rate. The methods of annotation vary between sequencing groups, which makes genome comparison difficult and may lead to propagation of errors when questionable assignments are adopted from one genome to another. Genome comparison, on either a large or a small scale, would be facilitated by a single annotation standard that makes transparent why an open reading frame (ORF) is considered to be a gene.
RESULTS: A total of 143 prokaryotic genomes were scored with an updated version of the prokaryotic gene finder EasyGene. Comparison of the GenBank and RefSeq annotations with the EasyGene predictions reveals that in some genomes up to approximately 60% of the genes may have been annotated with a wrong start codon, especially in GC-rich genomes. The fractional difference between annotated and predicted genes confirms that too many short genes are annotated in numerous organisms. Furthermore, genes might be missing from the annotation of some genomes. We predict 41 of 143 genomes to be over-annotated by >5%, meaning that too many ORFs are annotated as genes. We also predict that 12 of 143 genomes are under-annotated. These results are based on the difference between the number of annotated genes not found by EasyGene and the number of predicted genes that are not annotated in GenBank. We argue that the average performance of our standardized and fully automated method is slightly better than that of the existing annotation.
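The comparison described in RESULTS can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each gene is keyed by strand and stop position, that the difference is normalized by the number of annotated genes, and that the 5% cutoff applies symmetrically to under-annotation. The abstract states none of these. The names `GeneSet` and `compare_annotation` are hypothetical.

```python
from typing import Dict, Tuple

# Assumption: a gene is keyed by (strand, stop_position) and maps to its start.
# Two calls with the same key but different starts disagree only on the start codon.
GeneSet = Dict[Tuple[str, int], int]


def compare_annotation(annotated: GeneSet, predicted: GeneSet,
                       threshold: float = 0.05) -> dict:
    """Compare an annotated gene set against a predicted one for a single genome."""
    if not annotated:
        raise ValueError("annotated gene set is empty")

    shared = annotated.keys() & predicted.keys()
    only_annotated = len(annotated.keys() - predicted.keys())  # annotated, not predicted
    only_predicted = len(predicted.keys() - annotated.keys())  # predicted, not annotated

    # Genes found by both methods whose start positions differ.
    start_mismatch = sum(1 for key in shared if annotated[key] != predicted[key])

    # Assumption: normalize the difference by the number of annotated genes.
    fractional_difference = (only_annotated - only_predicted) / len(annotated)

    # Assumption: the 5% cutoff is applied symmetrically for under-annotation.
    if fractional_difference > threshold:
        label = "over-annotated"
    elif fractional_difference < -threshold:
        label = "under-annotated"
    else:
        label = "consistent"

    return {
        "only_annotated": only_annotated,
        "only_predicted": only_predicted,
        "start_mismatch_fraction": start_mismatch / len(shared) if shared else 0.0,
        "fractional_difference": fractional_difference,
        "label": label,
    }
```

Keying genes by stop position keeps the two questions separate: whether an ORF is called a gene at all, and whether both methods agree on its start codon.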

Year: 2005   PMID: 16249266   DOI: 10.1093/bioinformatics/bti701

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937


  70 in total

1.  Validation of a Burkholderia pseudomallei hypothetical protein and determination of its translational start codon using chromosomal integration of His-Tag coding sequence.

Authors:  Hokchai Yam; Ainihayati Abdul Rahim; Ooi Gim Luan; Razip Samian; Uyub Abdul Manaf; Suriani Mohamad; Nazalan Najimudin
Journal:  Protein J       Date:  2012-03       Impact factor: 2.371

2.  Comprehensive proteomic analysis of membrane proteins in Toxoplasma gondii.

Authors:  Fa-Yun Che; Carlos Madrid-Aliste; Berta Burd; Hongshan Zhang; Edward Nieves; Kami Kim; Andras Fiser; Ruth Hogue Angeletti; Louis M Weiss
Journal:  Mol Cell Proteomics       Date:  2010-10-10       Impact factor: 5.911

3.  Identifying bacterial genes and endosymbiont DNA with Glimmer.

Authors:  Arthur L Delcher; Kirsten A Bratke; Edwin C Powers; Steven L Salzberg
Journal:  Bioinformatics       Date:  2007-01-19       Impact factor: 6.937

4.  Identification of prokaryotic small proteins using a comparative genomic approach.

Authors:  Josue Samayoa; Fitnat H Yildiz; Kevin Karplus
Journal:  Bioinformatics       Date:  2011-05-05       Impact factor: 6.937

5.  Transcriptome analysis of Pseudomonas syringae identifies new genes, noncoding RNAs, and antisense activity.

Authors:  Melanie J Filiatrault; Paul V Stodghill; Philip A Bronstein; Simon Moll; Magdalen Lindeberg; George Grills; Peter Schweitzer; Wei Wang; Gary P Schroth; Shujun Luo; Irina Khrebtukova; Yong Yang; Theodore Thannhauser; Bronwyn G Butcher; Samuel Cartinhour; David J Schneider
Journal:  J Bacteriol       Date:  2010-02-26       Impact factor: 3.490

6.  Proteogenomic Analysis and Discovery of Immune Antigens in Mycobacterium vaccae.

Authors:  Jianhua Zheng; Lihong Chen; Liguo Liu; Haifeng Li; Bo Liu; Dandan Zheng; Tao Liu; Jie Dong; Lilian Sun; Yafang Zhu; Jian Yang; Xiaobing Zhang; Qi Jin
Journal:  Mol Cell Proteomics       Date:  2017-07-21       Impact factor: 5.911

7.  Biosynthetic assembly of the Bacteroides fragilis capsular polysaccharide A precursor bactoprenyl diphosphate-linked acetamido-4-amino-6-deoxygalactopyranose.

Authors:  Anahita Z Mostafavi; Jerry M Troutman
Journal:  Biochemistry       Date:  2013-03-08       Impact factor: 3.162

8.  Representative transcript sets for evaluating a translational initiation sites predictor.

Authors:  Jia Zeng; Reda Alhajj; Douglas J Demetrick
Journal:  BMC Bioinformatics       Date:  2009-07-02       Impact factor: 3.169

9.  The Microbe browser for comparative genomics.

Authors:  Alexandre Gattiker; Christophe Dessimoz; Adrian Schneider; Ioannis Xenarios; Marco Pagni; Jacques Rougemont
Journal:  Nucleic Acids Res       Date:  2009-04-30       Impact factor: 16.971

10.  Rickettsia phylogenomics: unwinding the intricacies of obligate intracellular life.

Authors:  Joseph J Gillespie; Kelly Williams; Maulik Shukla; Eric E Snyder; Eric K Nordberg; Shane M Ceraul; Chitti Dharmanolla; Daphne Rainey; Jeetendra Soneja; Joshua M Shallom; Nataraj Dongre Vishnubhat; Rebecca Wattam; Anjan Purkayastha; Michael Czar; Oswald Crasta; Joao C Setubal; Abdu F Azad; Bruno S Sobral
Journal:  PLoS One       Date:  2008-04-16       Impact factor: 3.240
