
Complete Genome Sequence of Coprothermobacter proteolyticus DSM 5265.

Alexandra Alexiev¹, David A Coil¹, Jonathan H Badger², Julie Enticknap³, Naomi Ward⁴, Frank T Robb⁵, Jonathan A Eisen⁶.

Abstract

Here we present the complete 1,424,912-bp genome sequence of Coprothermobacter proteolyticus DSM 5265, isolated from a thermophilic digester fermenting tannery wastes and cattle manure.


Year: 2014 PMID: 24831154 PMCID: PMC4022818 DOI: 10.1128/genomeA.00470-14

Source DB: PubMed Journal: Genome Announc

GENOME ANNOUNCEMENT

Coprothermobacter proteolyticus is a nonmotile, non-spore-forming, rod-shaped, Gram-negative anaerobic bacterium isolated from a thermophilic consortium fermenting tannery wastes and cattle manure (1). C. proteolyticus shows increased utilization of fructose, mannose, glucose, maltose, and sucrose when yeast extract is added together with either rumen fluid or Trypticase peptone, compared to growth without these additives (1). It was first considered a member of the genus Thermobacteroides but was later reclassified as Coprothermobacter proteolyticus (2). C. proteolyticus was selected in 2002 as part of a National Science Foundation-funded “Assembling the Tree of Life” project at the Institute for Genomic Research (TIGR) to sequence the genomes of representatives of the seven phyla of bacteria that at the time had cultured representatives but no available genome sequence.

C. proteolyticus DSM 5265 was grown in DSM medium 481, and DNA was extracted using standard techniques. Sanger sequencing and genome assembly were performed as previously described for genomes sequenced by TIGR (3–5). Small- and large-insert plasmid libraries were constructed in pUC-derived vectors after random mechanical shearing (nebulization) of genomic DNA. Sequencing resulted in 14,614 reads with an average read length of 1,039 bp and a coverage estimate of 10×. Sequences were assembled using Celera Assembler (6). The coverage criteria were that every position required at least double-clone coverage (or sequence from a PCR product amplified from genomic DNA) and either sequence from both strands or two different sequencing chemistries. The sequence was edited manually, and additional PCR and sequencing reactions were done to close gaps, improve coverage, and resolve sequence ambiguities (7). All repeated DNA regions were verified by PCR amplification across the repeat and sequencing of the product. The full assembly consists of 1,424,912 bases and has a G+C content of 44.8%.
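As a rough check on the reported coverage, the read count, average read length, and assembly size quoted above can be combined into a raw-depth estimate. The sketch below is illustrative only: the function name `estimate_coverage` is hypothetical, and it assumes every sequenced base contributes to the assembly, which is probably why it lands slightly above the stated 10×.

```python
def estimate_coverage(read_count, mean_read_length_bp, genome_size_bp):
    """Return (total sequenced bases, raw depth of coverage).

    Hypothetical helper: assumes all bases in all reads are usable,
    so the result is an upper bound on the effective coverage.
    """
    total_bases = read_count * mean_read_length_bp
    return total_bases, total_bases / genome_size_bp


# Figures quoted in the text for C. proteolyticus DSM 5265
total_bases, coverage = estimate_coverage(
    read_count=14_614,
    mean_read_length_bp=1_039,
    genome_size_bp=1_424_912,
)
print(f"{total_bases:,} bp sequenced, about {coverage:.1f}x raw coverage")
```

With the published numbers this gives 15,183,946 sequenced bases and roughly 10.7× raw depth, consistent with the approximate 10× estimate once trimming or discarded reads are allowed for.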
The replication origin was determined by colocalization of genes (dnaA, dnaN, recF, and gyrA) often found near the origin in prokaryotic genomes and by G+C nucleotide skew, (G − C)/(G + C), analysis (8). Completeness of the genome was assessed using the PhyloSift software (9), which searches for 40 highly conserved, single-copy marker genes (10). Thirty-nine of these 40 markers were found in this assembly, and the missing marker (encoding porphobilinogen deaminase) was found in only 80% of the original 1,000 genomes used to generate the markers. An initial set of open reading frames likely to encode proteins (coding sequences [CDSs]) was predicted as previously described (7). All predicted proteins larger than 30 amino acids were searched against a nonredundant protein database as previously described (7). Protein membrane-spanning domains were identified by TopPred (11). The 5′ regions of the CDSs were inspected to define initiation codons using similarity searches and to identify positions of ribosomal binding sites and transcriptional terminators. Two sets of hidden Markov models were used to determine CDS membership in families and superfamilies: Pfam v11.0 (12) and TIGRFAMs 3.0 (13). Pfam v11.0 hidden Markov models were also used with a constraint of a minimum of two hits to find repeated domains within proteins and mask them. This annotation was submitted with the genome in 2008, but in 2014 we requested an in-place update of the annotation from NCBI, using its integrated PGAP pipeline (14).
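
The G+C skew analysis mentioned above for locating the replication origin can be sketched as follows. This is a minimal illustration, not the procedure actually used for this genome: the function names, the 1,000-bp window, and the use of the cumulative-skew minimum as the origin estimate are all assumptions, and the toy sequence is synthetic.

```python
def gc_skew(seq, window=1000):
    """Return (G - C)/(G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows with no G or C are given a skew of 0 to avoid division by zero
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_skew_minimum(seq, window=1000):
    """Return the coordinate (end of window) where cumulative skew is lowest.

    Assumption: the leading strand is G-rich, so the cumulative skew
    falls until the origin and rises after it.
    """
    total, best_total, best_index = 0.0, float("inf"), 0
    for i, skew in enumerate(gc_skew(seq, window)):
        total += skew
        if total < best_total:
            best_total, best_index = total, i
    return (best_index + 1) * window


# Synthetic example: 3 kb of C-rich sequence followed by 3 kb of G-rich sequence
toy = "CCCA" * 750 + "GGGA" * 750
print(gc_skew(toy))                  # per-window skew values
print(cumulative_skew_minimum(toy))  # switch point between the two halves
```

On the synthetic sequence the per-window skew is −1 for the first three windows and +1 for the last three, and the cumulative minimum falls at position 3,000, where the strand bias switches. On a real genome the signal is much noisier, which is one reason to combine it with the positions of dnaA, dnaN, recF, and gyrA as described in the text.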

Nucleotide sequence accession numbers.

This genome sequence has been deposited at DDBJ/EMBL/GenBank under the accession no. CP001145. The version described in this paper is version CP001145.1.

References

1. TIGRFAMs: a protein family resource for the functional identification of proteins.

Authors: D H Haft; B J Loftus; D L Richardson; F Yang; J A Eisen; I T Paulsen; O White
Journal: Nucleic Acids Res Date: 2001-01-01 Impact factor: 16.971

2. Optimized multiplex PCR: efficiently closing a whole-genome shotgun sequencing project.

Authors: H Tettelin; D Radune; S Kasif; H Khouri; S L Salzberg
Journal: Genomics Date: 1999-12-15 Impact factor: 5.736

3. Toward an online repository of Standard Operating Procedures (SOPs) for (meta)genomic annotation.

Authors: Samuel V Angiuoli; Aaron Gussman; William Klimke; Guy Cochrane; Dawn Field; George Garrity; Chinnappa D Kodira; Nikos Kyrpides; Ramana Madupu; Victor Markowitz; Tatiana Tatusova; Nick Thomson; Owen White
Journal: OMICS Date: 2008-06

4. Asymmetric substitution patterns in the two DNA strands of bacteria.

Authors: J R Lobry
Journal: Mol Biol Evol Date: 1996-05 Impact factor: 16.240

5. A whole-genome assembly of Drosophila.

Authors: E W Myers; G G Sutton; A L Delcher; I M Dew; D P Fasulo; M J Flanigan; S A Kravitz; C M Mobarry; K H Reinert; K A Remington; E L Anson; R A Bolanos; H H Chou; C M Jordan; A L Halpern; S Lonardi; E M Beasley; R C Brandon; L Chen; P J Dunn; Z Lai; Y Liang; D R Nusskern; M Zhan; Q Zhang; X Zheng; G M Rubin; M D Adams; J C Venter
Journal: Science Date: 2000-03-24 Impact factor: 47.728

6. TopPred II: an improved software for membrane protein structure predictions.

Authors: M G Claros; G von Heijne
Journal: Comput Appl Biosci Date: 1994-12

7. Genome sequence of the dissimilatory metal ion-reducing bacterium Shewanella oneidensis.

Authors: John F Heidelberg; Ian T Paulsen; Karen E Nelson; Eric J Gaidos; William C Nelson; Timothy D Read; Jonathan A Eisen; Rekha Seshadri; Naomi Ward; Barbara Methe; Rebecca A Clayton; Terry Meyer; Alexandre Tsapin; James Scott; Maureen Beanan; Lauren Brinkac; Sean Daugherty; Robert T DeBoy; Robert J Dodson; A Scott Durkin; Daniel H Haft; James F Kolonay; Ramana Madupu; Jeremy D Peterson; Lowell A Umayam; Owen White; Alex M Wolf; Jessica Vamathevan; Janice Weidman; Marjorie Impraim; Kathy Lee; Kristy Berry; Chris Lee; Jacob Mueller; Hoda Khouri; John Gill; Terry R Utterback; Lisa A McDonald; Tamara V Feldblyum; Hamilton O Smith; J Craig Venter; Kenneth H Nealson; Claire M Fraser
Journal: Nat Biotechnol Date: 2002-10-07 Impact factor: 54.908

8. Metabolic complementarity and genomics of the dual bacterial symbiosis of sharpshooters.

Authors: Dongying Wu; Sean C Daugherty; Susan E Van Aken; Grace H Pai; Kisha L Watkins; Hoda Khouri; Luke J Tallon; Jennifer M Zaborsky; Helen E Dunbar; Phat L Tran; Nancy A Moran; Jonathan A Eisen
Journal: PLoS Biol Date: 2006-06 Impact factor: 8.029

9. PhyloSift: phylogenetic analysis of genomes and metagenomes.

Authors: Aaron E Darling; Guillaume Jospin; Eric Lowe; Frederick A Matsen; Holly M Bik; Jonathan A Eisen
Journal: PeerJ Date: 2014-01-09 Impact factor: 2.984

10. Systematic identification of gene families for use as "markers" for phylogenetic and phylogeny-driven ecological studies of bacteria and archaea and their major subgroups.

Authors: Dongying Wu; Guillaume Jospin; Jonathan A Eisen
Journal: PLoS One Date: 2013-10-17 Impact factor: 3.240
