
Complete Genome Sequence of Sphingopyxis macrogoltabida Strain 203N (NBRC 111659), a Polyethylene Glycol Degrader.

Yoshiyuki Ohtsubo¹, Shouta Nonoyama, Yuji Nagata², Mitsuru Numata³, Keiko Tsuchikane³, Akira Hosoyama³, Atsushi Yamazoe³, Masataka Tsuda², Nobuyuki Fujita³, Fusako Kawai⁴.

Abstract

We determined the complete genome sequence of Sphingopyxis macrogoltabida strain 203N, a polyethylene glycol degrader. Because the PacBio assembly (285× coverage) seemed to be full of nucleotide-level mismatches, the Newbler assembly of MiSeq mate-pair and paired-end data was used for finishing and the PacBio assembly was used as a reference. The PacBio assembly carried 414 nucleotide mismatches over 5,953,153 bases of the 203N genome.
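The figures quoted here can be cross-checked with simple arithmetic. The sketch below is only an illustration: it assumes the 1.7 Gb PacBio read total reported in the announcement body (a rounded value) and uses hypothetical helper names that do not come from any tool used in the study.

```python
# Figures quoted in the announcement (the read total is rounded in the source).
TOTAL_PACBIO_BASES = 1.7e9   # "total length of 1.7 Gb"
GENOME_SIZE = 5_953_153      # bases of the finished 203N genome
MISMATCHES = 414             # nucleotide-level mismatches in the PacBio assembly


def fold_coverage(total_bases: float, genome_size: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome length."""
    return total_bases / genome_size


def mismatch_rate(mismatches: int, genome_size: int) -> float:
    """Fraction of genome positions at which the two assemblies disagree."""
    return mismatches / genome_size


coverage = fold_coverage(TOTAL_PACBIO_BASES, GENOME_SIZE)
rate = mismatch_rate(MISMATCHES, GENOME_SIZE)
bases_per_mismatch = 1 / rate

print(f"PacBio coverage: about {coverage:.0f}x")
print(f"Mismatch rate: {rate:.2e} per base")
print(f"One mismatch every {bases_per_mismatch:,.0f} bases on average")
```

With the rounded read total, the coverage comes out between 285× and 286×, consistent with the stated 285×, and the 414 mismatches correspond to roughly one per 14,000 bases.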


Year: 2016 PMID: 27284142 PMCID: PMC4901233 DOI: 10.1128/genomeA.00529-16

Source DB: PubMed Journal: Genome Announc

GENOME ANNOUNCEMENT

Sphingopyxis macrogoltabida strain 203 was isolated from soil as the polyethylene glycol (PEG)-utilizing Flavobacterium sp. strain 203 (1). Later, the strain was designated the type strain of Sphingomonas macrogoltabidus (2) and reidentified as Sphingopyxis macrogoltabida (3), based on the taxonomical standards proposed by Yabuuchi et al. (4). The strain was deposited at the National Institute of Technology and Evaluation (Tokyo, Japan) and stocked under the number NBRC 15033. The complete genome of NBRC 15033 was determined, but the genes for PEG utilization were missing, and repeated cultivation was assumed to be the reason for their loss (5). From a laboratory stock, we recovered a strain, designated 203N, that harbors the pegA gene (6, 7) and is capable of growing on PEG. Here, we report the complete genome sequence of S. macrogoltabida 203N.
To determine the complete sequence, we obtained PacBio data from Macrogen Japan. A total of 237,846 reads were obtained, with an N50 length of 9,733 bp and a total length of 1.7 Gb. The reads were assembled by HGAP3, yielding three circular contigs corresponding to the main chromosome and two plasmids. However, we found that the sequences differed considerably from those of NBRC 15033 (5). Besides one genomic rearrangement, which was predicted to have caused the loss of the PEG utilization genes, and 10 differences related to insertion sequences, approximately 400 nucleotide-level mismatches were counted. We also obtained Illumina MiSeq reads from the same DNA solution used for PacBio sequencing, and the assembled contig sequences suggested that the PacBio assembly was erroneous. Replacing each part of the PacBio assembly with the corresponding MiSeq contig seemed inappropriate for correcting the errors, because the nucleotide-level mismatches were located throughout the genome, and contigs derived from repeats in the genome might carry variant bases.
Therefore, we decided to start the finishing from the Newbler assembly of the MiSeq reads obtained from mate-pair and PCR-free paired-end libraries. The finishing was facilitated by ShortReadManager, GenoFinisher, and AceFileViewer (AFV) (8), which have been used to determine complete genome sequences and often enable complete in silico finishing, especially when a PCR-free Illumina library preparation kit is used. In the finishing, the PacBio assembly was used as a reference to search for the correct paths of contigs that fill each gap in the scaffolds. The correct sequences of the paths were determined with AFV. The finished sequence was confirmed with FinishChecker, which searches for genomic k-mers not found in the MiSeq reads so that they can be corrected as necessary. Thus, the complete genome sequence of 203N was determined. We found 414 nucleotide-level mismatches relative to the PacBio assembly, most of which were located in homopolymeric stretches; this is not a characteristic error pattern of Illumina sequencing reads but may be characteristic of PacBio data. The sequences were annotated by the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) and curated using GenomeMatcher (9). Referring to the annotation data obtained from the Microbial Genome Annotation Pipeline (http://www.migap.org) (10), we corrected start codon positions and added genes that were missing in the PGAP annotation.
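The k-mer confirmation step can be illustrated with a small sketch. This is not FinishChecker's implementation, which is not described here; it is a minimal, assumed version of the general idea of flagging genomic k-mers that never occur in the short reads. All function names and the toy sequences are hypothetical.

```python
# Translation table for complementing DNA; other characters (e.g. N) are left as-is.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Strand-independent form: the smaller of a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def kmers(seq: str, k: int):
    """Yield (start position, k-mer) for every window of length k."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def read_kmer_set(reads, k: int) -> set:
    """Collect the canonical k-mers observed in the reads, skipping ambiguous ones."""
    seen = set()
    for read in reads:
        for _, kmer in kmers(read.upper(), k):
            if "N" not in kmer:
                seen.add(canonical(kmer))
    return seen


def unsupported_positions(genome: str, reads, k: int) -> list:
    """Start positions of genomic k-mers that are absent from all reads."""
    seen = read_kmer_set(reads, k)
    return [i for i, kmer in kmers(genome.upper(), k) if canonical(kmer) not in seen]


# Toy example: the draft has C at index 6, while the reads support T there.
draft = "ACGTTGCAAGGCT"
reads = ["ACGTTGTAA", "TTGTAAGGCT"]
print(unsupported_positions(draft, reads, k=4))      # k-mers overlapping index 6

corrected = "ACGTTGTAAGGCT"
print(unsupported_positions(corrected, reads, k=4))  # no unsupported k-mers remain
```

In the toy example, every k-mer overlapping the wrong base is flagged, and the flags disappear once the base is corrected. A check on a real genome would need a much larger k, a dedicated k-mer counter, and some tolerance for sequencing errors in the reads.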

Nucleotide sequence accession numbers.

The genome sequence of Sphingopyxis macrogoltabida strain 203N has been deposited in NCBI/GenBank under the accession numbers CP013344 to CP013346. Sphingopyxis macrogoltabida strain 203N is available from the Biological Resource Center, National Institute of Technology and Evaluation (Tokyo, Japan). Its deposit number is NBRC 111659.

7 in total

1. Structure and conservation of a polyethylene glycol-degradative operon in sphingomonads.

Authors: Akio Tani; Jittima Charoenpanich; Terumi Mori; Mayuko Takeichi; Kazuhide Kimbara; Fusako Kawai
Journal: Microbiology Date: 2007-02 Impact factor: 2.777

2. Proposals of Sphingomonas paucimobilis gen. nov. and comb. nov., Sphingomonas parapaucimobilis sp. nov., Sphingomonas yanoikuyae sp. nov., Sphingomonas adhaesiva sp. nov., Sphingomonas capsulata comb. nov., and two genospecies of the genus Sphingomonas.

Authors: E Yabuuchi; I Yano; H Oyaizu; Y Hashimoto; T Ezaki; H Yamamoto
Journal: Microbiol Immunol Date: 1990 Impact factor: 1.955

3. Proposal of the genus Sphingomonas sensu stricto and three new genera, Sphingobium, Novosphingobium and Sphingopyxis, on the basis of phylogenetic and chemotaxonomic analyses.

Authors: M Takeuchi; K Hamana; A Hiraishi
Journal: Int J Syst Evol Microbiol Date: 2001-07 Impact factor: 2.747

4. The first step in polyethylene glycol degradation by sphingomonads proceeds via a flavoprotein alcohol dehydrogenase containing flavin adenine dinucleotide.

Authors: M Sugimoto; M Tanabe; M Hataya; S Enokibara; J A Duine; F Kawai
Journal: J Bacteriol Date: 2001-11 Impact factor: 3.490

5. Complete genome sequence of Acidovorax sp. strain KKS102, a polychlorinated-biphenyl degrader.

Authors: Yoshiyuki Ohtsubo; Fumito Maruyama; Hisayuki Mitsui; Yuji Nagata; Masataka Tsuda
Journal: J Bacteriol Date: 2012-12 Impact factor: 3.490

6. GenomeMatcher: a graphical user interface for DNA sequence comparison.

Authors: Yoshiyuki Ohtsubo; Wakako Ikeda-Ohtsubo; Yuji Nagata; Masataka Tsuda
Journal: BMC Bioinformatics Date: 2008-09-16 Impact factor: 3.169

7. Complete Genome Sequence of Sphingopyxis macrogoltabida Type Strain NBRC 15033, Originally Isolated as a Polyethylene Glycol Degrader.

Authors: Yoshiyuki Ohtsubo; Yuji Nagata; Mitsuru Numata; Kieko Tsuchikane; Akira Hosoyama; Atsushi Yamazoe; Masataka Tsuda; Nobuyuki Fujita; Fusako Kawai
Journal: Genome Announc Date: 2015-12-10

