
Achieving Accurate Sequence and Annotation Data for Caulobacter vibrioides CB13.

Louis Berrios, Bert Ely.

Abstract

Annotated sequence data are instrumental in nearly all realms of biology. However, the advent of next-generation sequencing has rapidly created an imbalance between accurate sequence data and accurate annotation data. To increase the annotation accuracy of the Caulobacter vibrioides CB13b1a (CB13) genome, we compared the PGAP and RAST annotations of the CB13 genome. The PGAP annotation contained 64 unique genes that were either completely or partially absent from the RAST annotation, and the RAST annotation contained 16 genes that were not included in the PGAP annotation. Moreover, PGAP identified 73 frameshifted genes and 22 genes with an internal stop codon. In contrast, RAST annotated the larger segment of these frameshifted genes without indicating that a change in reading frame may have occurred. The RAST annotation did not include any genes with internal stop codons because it chose start codons located after the internal stop. To confirm the discrepancies between the two annotations and verify the accuracy of the CB13 genome sequence data, we re-sequenced and re-annotated the entire genome and obtained an identical sequence, except in a small number of homopolymer regions. A comparison of the two versions of the genome sequence allowed us to determine the correct number of bases in each homopolymer region, which eliminated the frameshifts in 31 genes that had been annotated as frameshifted and removed 24 pseudogenes from the PGAP annotation. Both annotation systems correctly identified genes that were missed by the other system. In addition, PGAP identified conserved gene fragments that represented the beginning of genes, but it employed no corrective method to adjust the reading frame of frameshifted genes or the start sites of genes harboring an internal stop codon. As a result, the PGAP annotation reported a large number of pseudogenes, which may reflect evolutionary history but likely do not produce gene products. These results demonstrate that re-sequencing and annotation comparisons can be used to increase the accuracy of genomic data and the corresponding gene annotation.
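The kind of annotation comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' procedure: it assumes, purely for illustration, that two gene calls refer to the same gene when they share a strand and a stop-codon coordinate, and it reports calls unique to each annotation plus shared calls whose start sites differ. The `Gene` type, the `compare` function, and all coordinates below are hypothetical toy values, not data from the CB13 genome.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based leftmost coordinate
    end: int     # 1-based rightmost coordinate
    strand: str  # "+" or "-"


def stop_key(gene):
    # Assumption: the stop codon lies at the right end of a "+" gene
    # and at the left end of a "-" gene.
    position = gene.end if gene.strand == "+" else gene.start
    return (position, gene.strand)


def compare(annotation_a, annotation_b):
    """Return (only_in_a, only_in_b, shared_with_different_start)."""
    a_by_stop = {stop_key(g): g for g in annotation_a}
    b_by_stop = {stop_key(g): g for g in annotation_b}

    only_a = [a_by_stop[k].name for k in sorted(a_by_stop.keys() - b_by_stop.keys())]
    only_b = [b_by_stop[k].name for k in sorted(b_by_stop.keys() - a_by_stop.keys())]

    # Same stop and strand but different extent: the start sites disagree.
    different_start = []
    for k in sorted(a_by_stop.keys() & b_by_stop.keys()):
        ga, gb = a_by_stop[k], b_by_stop[k]
        if (ga.start, ga.end) != (gb.start, gb.end):
            different_start.append((ga.name, gb.name))
    return only_a, only_b, different_start


# Hypothetical toy annotations (not real CB13 gene calls).
pgap = [
    Gene("pgap_1", 100, 400, "+"),
    Gene("pgap_2", 500, 900, "-"),
    Gene("pgap_3", 1000, 1300, "+"),
]
rast = [
    Gene("rast_1", 130, 400, "+"),   # same stop as pgap_1, later start
    Gene("rast_2", 500, 900, "-"),   # identical to pgap_2
    Gene("rast_3", 2000, 2600, "+"),
]

only_pgap, only_rast, start_mismatch = compare(pgap, rast)
print("Only in PGAP:", only_pgap)
print("Only in RAST:", only_rast)
print("Different start sites:", start_mismatch)
```

Matching on the stop coordinate is one simple convention for this kind of comparison, chosen here because two gene callers that disagree on the start codon will usually still agree on the stop. Frameshifted genes and partially overlapping calls would need additional handling that this sketch does not attempt.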

Year:  2018        PMID: 30259084      PMCID: PMC6232633          DOI: 10.1007/s00284-018-1572-3

Source DB:  PubMed          Journal:  Curr Microbiol        ISSN: 0343-8651            Impact factor:   2.188


