
Complete genome sequence of Thioalkalivibrio paradoxus type strain ARh 1(T), an obligately chemolithoautotrophic haloalkaliphilic sulfur-oxidizing bacterium isolated from a Kenyan soda lake.

Tom Berben1, Dimitry Y Sorokin2, Natalia Ivanova3, Amrita Pati3, Nikos Kyrpides4, Lynne A Goodwin3, Tanja Woyke3, Gerard Muyzer1.   

Abstract

Thioalkalivibrio paradoxus strain ARh 1(T) is a chemolithoautotrophic, non-motile, Gram-negative bacterium belonging to the Gammaproteobacteria that was isolated from samples of haloalkaline soda lakes. It derives energy from the oxidation of reduced sulfur compounds and is notable for its ability to grow on thiocyanate as its sole source of electrons, sulfur and nitrogen. The full genome consists of 3,756,729 bp and comprises 3,500 protein-coding and 57 RNA-coding genes. This organism was sequenced as part of the community science program at the DOE Joint Genome Institute.


Keywords:  Haloalkaliphilic; Soda lakes; Sulfur-oxidizing bacteria; Thiocyanate

Year:  2015        PMID: 26594306      PMCID: PMC4653848          DOI: 10.1186/s40793-015-0097-7

Source DB:  PubMed          Journal:  Stand Genomic Sci        ISSN: 1944-3277


Introduction

Soda lakes are characterized by a high and stable pH (>9), caused by molar concentrations of soluble carbonates as the dominant anions, and by a moderate to high salinity [1]. They are found in arid zones in many parts of the world, for example in the Kulunda Steppe in Russia, North-Eastern China, the Rift Valley in Africa and the arid regions of California and Nevada (e.g., Mono Lake, Big Soda Lake). Despite their (extremely) haloalkaline character, these environments harbor a rich microbial diversity that drives highly active biogeochemical cycles [2], of which the sulfur cycle is the most active. Our current research focuses on a group of chemolithoautotrophic sulfur-oxidizing bacteria that belong to the genus Thioalkalivibrio in the class Gammaproteobacteria. These organisms are of interest because of their role in the oxidative part of the sulfur cycle in soda lakes [3] and their application in the sustainable removal of sulfur from wastewater and gas streams [4]. To better understand the success of this group of organisms, we have sequenced the genomes of a large number of Thioalkalivibrio isolates. Here we present the genome sequence of Thioalkalivibrio paradoxus ARh 1T (= DSM 13531 = JCM 11367).

Organism information

Classification and features

This obligately aerobic and haloalkaliphilic strain, which was isolated from a mixed sample of sediments from Kenyan soda lakes, is a non-motile coccoid rod that forms intracellular sulfur as an obligate intermediate during the oxidation of thiosulfate and thiocyanate (Fig. 1). It is an obligate chemolithoautotroph, capable of using a variety of reduced inorganic sulfur compounds, including sulfide, thiosulfate and polysulfide, as electron donors for carbon fixation. It can also oxidize carbon disulfide (CS2). Of special interest is its ability to grow with thiocyanate (NCS−) as electron donor, with a relatively high growth rate of 0.08–0.1 h−1 in continuous culture, compared to 0.01–0.015 h−1 for growth on thiosulfate [5]. Phylogenetic analysis based on 16S rRNA sequences shows that Thioalkalivibrio paradoxus is closely related to strain ALEN 2T (Fig. 2). An overview of the basic features of the organism is provided in Table 1.
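To put the specific growth rates quoted above on a more intuitive scale, they can be converted to doubling times with the standard relation t_d = ln 2 / μ for exponential growth. The sketch below is illustrative only: the function name and the dictionary labels are hypothetical, and the rate ranges are simply those given in the text.

```python
import math


def doubling_time_h(mu_per_h: float) -> float:
    """Return the doubling time (h) for a specific growth rate mu (h^-1)."""
    if mu_per_h <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu_per_h


# Growth-rate ranges (h^-1) as quoted in the text.
rates = {
    "thiocyanate, continuous culture": (0.08, 0.1),
    "thiosulfate": (0.01, 0.015),
}

for label, (low, high) in rates.items():
    # The higher growth rate corresponds to the shorter doubling time.
    fastest = doubling_time_h(high)
    slowest = doubling_time_h(low)
    print(f"{label}: doubling time {fastest:.1f} to {slowest:.1f} h")
```

Under this relation, a rate of 0.1 h−1 corresponds to a doubling time of roughly 7 h, whereas 0.01 h−1 corresponds to roughly 69 h.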
Fig. 1

Thin section electron microscopy photographs of cells of strain ARh 1T grown with thiocyanate in batch (a) and chemostat (b) cultures at pH 10 and 0.6 M total Na+. S(in) - intracellular sulfur globe; S(out) - excreted sulfur globe damaging the cell membrane and the wall; N - nucleoid; C - carboxysomes

Fig. 2

Phylogenetic tree, based on 16S rRNA sequences, of Thioalkalivibrio and various members of the Ectothiorhodospiraceae family. ARB [22] was used for tree construction and MEGA6 [23] for the bootstrap analysis. Alphaproteobacteria were used as the outgroup and pruned from the finished tree

Table 1

Classification and general features of Thioalkalivibrio paradoxus ARh 1T [24]

| MIGS ID | Property | Term | Evidence code (a) |
|---|---|---|---|
| | Classification | Domain Bacteria | TAS [25] |
| | | Phylum Proteobacteria | TAS [26, 27] |
| | | Class Gammaproteobacteria | TAS [27, 28] |
| | | Order Chromatiales | TAS [27, 29] |
| | | Family Ectothiorhodospiraceae | TAS [30] |
| | | Genus Thioalkalivibrio | TAS [31] |
| | | Species Thioalkalivibrio paradoxus | TAS [5] |
| | | Type strain: ARh 1T (DSM 13531) | |
| | Gram stain | Negative | TAS [5, 31] |
| | Cell shape | Barrel-like rods | TAS [5] |
| | Motility | Non-motile | TAS [5] |
| | Sporulation | Non-sporulating | NAS |
| | Temperature range | Mesophilic | TAS [5] |
| | Optimum temperature | 35–37 °C | TAS [5] |
| | pH range; Optimum | 8.5–10.5 | TAS [5] |
| | Carbon source | Inorganic carbon | TAS [5] |
| MIGS-6 | Habitat | Soda lakes | TAS [5] |
| MIGS-6.3 | Salinity | 0.3–1.0 M Na+ | TAS [5] |
| MIGS-22 | Oxygen requirement | Aerobe | TAS [5] |
| MIGS-15 | Biotic relationship | Free-living | NAS |
| MIGS-14 | Pathogenicity | Non-pathogenic | NAS |
| MIGS-4 | Geographic location | Kenya | TAS [5] |
| MIGS-5 | Sample collection | 1999 | TAS [5] |
| MIGS-4.1 | Latitude | Not reported | |
| MIGS-4.2 | Longitude | Not reported | |
| MIGS-4.4 | Altitude | Not reported | |

(a) Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [32]


Genome sequencing information

Genome project history

To better understand the diversity within the genus Thioalkalivibrio, as well as its biogeochemical role in soda lakes, a large number of isolates (approximately 70) were sequenced at the Joint Genome Institute (JGI). The full genome of the type strain of Thioalkalivibrio paradoxus presented here contains 3.8 million base pairs. Sequencing was performed at the JGI under project number 401912 and the sequence data were subsequently released in GenBank on December 31, 2013. A project overview is provided in Table 2.
Table 2

Project information

| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | Illumina |
| MIGS-29 | Sequencing platforms | Illumina HiSeq 2000 |
| MIGS-31.2 | Fold coverage | 1,486X |
| MIGS-30 | Assemblers | Velvet [8], ALLPATHS R39750 [7] |
| MIGS-32 | Gene calling method | Prodigal [14], GenePRIMP [15] |
| | Locus tag | THITH |
| | GenBank ID | NZ_CP007029 |
| | GenBank date of release | 2013-12-31 |
| | GOLD ID | Gp0008932 |
| | BioProject | PRJNA52643 |
| MIGS-13 | Source material identifier | DSM 13531 |
| | Project relevance | Biotechnology |

Growth conditions and genomic DNA preparation

A buffer using sodium carbonate and bicarbonate, with a total salt concentration of 0.6 M Na+, was used for cultivation of the organism; the energy source was thiosulfate (40 mM). After harvesting, the cells were stored at −80 °C for further processing. Genomic DNA was extracted using a standard chloroform-phenol-isoamyl alcohol mixture, followed by ethanol precipitation. After vacuum drying, the pellet was dissolved in water and the quantity and quality of the DNA determined using the JGI-provided Mass Standard Kit.

Genome sequencing and assembly

The draft genome of Thioalkalivibrio paradoxus ARh 1T was generated at the DOE Joint Genome Institute (JGI) using Illumina data [6]. Two libraries were constructed and sequenced: an Illumina short-insert paired-end library with an average insert size of 270 bp, which generated 18,589,770 reads, and an Illumina long-insert paired-end library with an average insert size of 7,058.67 ± 3,247.54 bp, which generated 20,051,794 reads, for a total of 5,796 Mbp of Illumina data (unpublished, Feng Chen). All general aspects of library construction and sequencing performed at the JGI can be found at http://www.jgi.doe.gov/.

The initial draft assembly contained 83 contigs in 11 scaffolds. The initial draft data were assembled with ALLPATHS [7], version 39750, and the consensus was computationally shredded into 10 Kbp overlapping fake reads (shreds). The Illumina draft data were also assembled with Velvet, version 1.1.05 [8], and the consensus sequences were computationally shredded into 1.5 Kbp overlapping fake reads (shreds). The Illumina draft data were then assembled again with Velvet, using the shreds from the first Velvet assembly to guide the second assembly. The consensus from the second Velvet assembly was likewise shredded into 1.5 Kbp overlapping fake reads. The fake reads from the ALLPATHS assembly and both Velvet assemblies, together with a subset of the Illumina CLIP paired-end reads, were assembled using parallel phrap, version 4.24 (High Performance Software, LLC). Possible mis-assemblies were corrected by manual editing in Consed [9–11]. Gap closure was accomplished using repeat resolution software (Wei Gu, unpublished) and sequencing of bridging PCR fragments with Sanger and/or PacBio (unpublished, Cliff Han) technologies. A total of 50 additional sequencing reactions were completed to close gaps and to raise the quality of the final sequence. The size of the genome is 3.8 Mb and the final assembly is based on 5,796 Mbp of Illumina draft data, which provides an average 1,486X coverage of the genome.
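The "shredding" step described above, in which a consensus sequence is cut into overlapping fake reads, can be illustrated with a short Python sketch. This is not the JGI implementation: the function name is hypothetical, and the overlap between neighbouring shreds is an assumption, since the text gives only the shred lengths (10 Kbp and 1.5 Kbp).

```python
def shred(consensus: str, length: int, overlap: int) -> list[str]:
    """Cut a consensus sequence into fixed-length, overlapping pseudo-reads.

    Consecutive shreds share `overlap` bases. The final shred may be
    shorter than `length` if the sequence does not divide evenly.
    """
    if length <= 0 or not 0 <= overlap < length:
        raise ValueError("need length > 0 and 0 <= overlap < length")
    if not consensus:
        return []
    step = length - overlap
    shreds = []
    start = 0
    while True:
        shreds.append(consensus[start:start + length])
        if start + length >= len(consensus):
            break  # this shred reached the end of the sequence
        start += step
    return shreds


# Toy example: 4-base shreds that overlap by 2 bases.
print(shred("ACGTACGTAC", length=4, overlap=2))
```

For a real contig, a call such as `shred(contig, length=10_000, overlap=1_000)` would mimic the 10 Kbp shreds, with the 1 Kbp overlap chosen here purely for illustration.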

Genome annotation

The assembled sequence was annotated using the JGI prokaryotic annotation pipeline [12] and was further reviewed using the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [13]. Genes were identified using Prodigal [14], followed by manual curation using GenePRIMP [15]. Predicted CDSs were translated and used to search the NCBI non-redundant, UniProt, TIGRFam, Pfam, KEGG, COG and InterPro databases. The tRNAscan-SE tool [16] was used to detect tRNA genes, and ribosomal RNA genes were detected using models constructed from SILVA [17]. Other RNA genes were predicted using Rfam profiles in Infernal [18]. CRISPR elements were detected using CRT [19] and PILER-CR [20]. Further annotation was performed using the Integrated Microbial Genomes (IMG) platform [21].

Genome properties

The finished genome comprises a single chromosome of approximately 3.8 Mb (Fig. 3) with a G + C content of 66.55 % (Table 3). There are 3,557 genes, of which 3,500 are protein-coding (a summary of genome properties is shown in Table 3). Approximately two-thirds of the protein-coding genes could be assigned to a COG functional category (Table 4).
Fig. 3

Genome map of Thioalkalivibrio paradoxus ARh 1T. From outer to inner ring: genes on the forward strand; genes on the reverse strand; RNA genes (tRNA: green; rRNA: red; other: black); GC content and GC skew
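The GC content and GC skew rings of a genome map like this one are typically computed over windows along the chromosome. The sketch below shows one common way to do so, taking GC skew as (G − C)/(G + C); the function name, the non-overlapping windows and the window size are assumptions for illustration, as the source does not state how the rings in Fig. 3 were calculated.

```python
from typing import Iterator


def gc_windows(seq: str, window: int) -> Iterator[tuple[int, float, float]]:
    """Yield (start, gc_content, gc_skew) for consecutive windows.

    gc_content is (G + C) / window length; gc_skew is (G - C) / (G + C),
    reported as 0.0 for a window that contains no G or C.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / len(chunk)
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        yield start, gc_content, gc_skew


# Toy example with two 4-base windows.
for start, content, skew in gc_windows("GGGCAATT", window=4):
    print(start, content, skew)
```

On a real chromosome, a window of a few kilobases would be a more sensible choice than the toy value used here.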

Table 3

Genome statistics

| Attribute | Value | % of Total |
|---|---|---|
| Genome size (bp) | 3,756,729 | 100 |
| DNA coding (bp) | 3,305,445 | 87.99 |
| DNA G + C (bp) | 2,500,004 | 66.55 |
| Total genes | 3,557 | 100 |
| Protein coding genes | 3,500 | 98.40 |
| RNA genes | 57 | 1.60 |
| Pseudo genes | 124 | 3.49 |
| Genes in internal clusters | 176 | 3.46 |
| Genes with function prediction | 2,739 | 77.00 |
| Genes assigned to COGs | 2,317 | 65.14 |
| Genes with Pfam domains | 2,835 | 79.70 |
| Genes with signal peptides | 271 | 7.62 |
| Genes with transmembrane helices | 841 | 23.64 |
| CRISPR repeats | 8 | 100 |
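Most percentages in Table 3 follow from dividing base counts by the genome size and gene counts by the total number of genes. A minimal Python sketch of that arithmetic is given below; the counts are copied from the table, while the variable and function names are illustrative.

```python
# Counts copied from Table 3 (genome statistics).
GENOME_SIZE_BP = 3_756_729
TOTAL_GENES = 3_557

base_counts = {
    "DNA coding": 3_305_445,
    "DNA G + C": 2_500_004,
}
gene_counts = {
    "Protein coding genes": 3_500,
    "RNA genes": 57,
    "Genes with function prediction": 2_739,
    "Genes assigned to COGs": 2_317,
    "Genes with Pfam domains": 2_835,
}


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


for name, bp in base_counts.items():
    print(f"{name}: {percent(bp, GENOME_SIZE_BP):.2f} % of the genome")
for name, count in gene_counts.items():
    print(f"{name}: {percent(count, TOTAL_GENES):.2f} % of all genes")
```

Recomputing the values this way is a quick consistency check when such a table has been transcribed or extracted from another format.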
Table 4

Number of genes associated with the 25 general COG functional categories

| Code | Value | Percent | Description |
|---|---|---|---|
| J | 204 | 7.95 | Translation, ribosomal structure and biogenesis |
| A | 2 | 0.08 | RNA processing and modification |
| K | 103 | 4.02 | Transcription |
| L | 96 | 3.74 | Replication, recombination and repair |
| B | 1 | 0.04 | Chromatin structure and dynamics |
| D | 32 | 1.25 | Cell cycle control, cell division, chromosome partitioning |
| V | 116 | 4.52 | Defense mechanisms |
| T | 119 | 4.64 | Signal transduction mechanisms |
| M | 201 | 7.84 | Cell wall/membrane biogenesis |
| N | 34 | 1.33 | Cell motility |
| U | 51 | 1.99 | Intracellular trafficking and secretion |
| O | 158 | 6.16 | Posttranslational modification, protein turnover, chaperones |
| C | 228 | 8.89 | Energy production and conversion |
| G | 91 | 3.55 | Carbohydrate transport and metabolism |
| E | 162 | 6.32 | Amino acid transport and metabolism |
| F | 61 | 2.38 | Nucleotide transport and metabolism |
| H | 150 | 5.85 | Coenzyme transport and metabolism |
| I | 95 | 3.70 | Lipid transport and metabolism |
| P | 178 | 6.94 | Inorganic ion transport and metabolism |
| Q | 36 | 1.40 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 237 | 9.24 | General function prediction only |
| S | 146 | 5.69 | Function unknown |
| - | 1,240 | 34.86 | Not in COGs |

The total is based on the total number of protein coding genes in the genome


Conclusions

The availability of high-quality genomic sequences of the type strains of Thioalkalivibrio, the dominant genus of sulfur-oxidizing bacteria in soda lakes, is an invaluable tool for gaining a more complete understanding of the biogeochemistry of these extreme environments. Additionally, this information may provide new insights into the mechanisms of adaptation that allow these bacteria not only to survive, but to thrive in this habitat. Finally, the genome may contain clues that will help improve the existing biotechnological applications of this organism in bioremediation.
References

1.  ARB: a software environment for sequence data.

Authors:  Wolfgang Ludwig; Oliver Strunk; Ralf Westram; Lothar Richter; Harald Meier; Arno Buchner; Tina Lai; Susanne Steppi; Gangolf Jobb; Wolfram Förster; Igor Brettske; Stefan Gerber; Anton W Ginhart; Oliver Gross; Silke Grumann; Stefan Hermann; Ralf Jost; Andreas König; Thomas Liss; Ralph Lüssmann; Michael May; Björn Nonhoff; Boris Reichel; Robert Strehlow; Alexandros Stamatakis; Norbert Stuckmann; Alexander Vilbig; Michael Lenke; Thomas Ludwig; Arndt Bode; Karl-Heinz Schleifer
Journal:  Nucleic Acids Res       Date:  2004-02-25       Impact factor: 16.971

2.  GenePRIMP: a gene prediction improvement pipeline for prokaryotic genomes.

Authors:  Amrita Pati; Natalia N Ivanova; Natalia Mikhailova; Galina Ovchinnikova; Sean D Hooper; Athanasios Lykidis; Nikos C Kyrpides
Journal:  Nat Methods       Date:  2010-05-02       Impact factor: 28.547

3.  Validation of publication of new names and new combinations previously effectively published outside the IJSEM.

Authors: 
Journal:  Int J Syst Evol Microbiol       Date:  2005-05       Impact factor: 2.747

4.  Velvet: algorithms for de novo short read assembly using de Bruijn graphs.

Authors:  Daniel R Zerbino; Ewan Birney
Journal:  Genome Res       Date:  2008-03-18       Impact factor: 9.043

5.  MEGA6: Molecular Evolutionary Genetics Analysis version 6.0.

Authors:  Koichiro Tamura; Glen Stecher; Daniel Peterson; Alan Filipski; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2013-10-16       Impact factor: 16.240

6.  Base-calling of automated sequencer traces using phred. II. Error probabilities.

Authors:  B Ewing; P Green
Journal:  Genome Res       Date:  1998-03       Impact factor: 9.043

7.  Consed: a graphical tool for sequence finishing.

Authors:  D Gordon; C Abajian; P Green
Journal:  Genome Res       Date:  1998-03       Impact factor: 9.043

8.  tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence.

Authors:  T M Lowe; S R Eddy
Journal:  Nucleic Acids Res       Date:  1997-03-01       Impact factor: 16.971

9.  The minimum information about a genome sequence (MIGS) specification.

Authors:  Dawn Field; George Garrity; Tanya Gray; Norman Morrison; Jeremy Selengut; Peter Sterk; Tatiana Tatusova; Nicholas Thomson; Michael J Allen; Samuel V Angiuoli; Michael Ashburner; Nelson Axelrod; Sandra Baldauf; Stuart Ballard; Jeffrey Boore; Guy Cochrane; James Cole; Peter Dawyndt; Paul De Vos; Claude DePamphilis; Robert Edwards; Nadeem Faruque; Robert Feldman; Jack Gilbert; Paul Gilna; Frank Oliver Glöckner; Philip Goldstein; Robert Guralnick; Dan Haft; David Hancock; Henning Hermjakob; Christiane Hertz-Fowler; Phil Hugenholtz; Ian Joint; Leonid Kagan; Matthew Kane; Jessie Kennedy; George Kowalchuk; Renzo Kottmann; Eugene Kolker; Saul Kravitz; Nikos Kyrpides; Jim Leebens-Mack; Suzanna E Lewis; Kelvin Li; Allyson L Lister; Phillip Lord; Natalia Maltsev; Victor Markowitz; Jennifer Martiny; Barbara Methe; Ilene Mizrachi; Richard Moxon; Karen Nelson; Julian Parkhill; Lita Proctor; Owen White; Susanna-Assunta Sansone; Andrew Spiers; Robert Stevens; Paul Swift; Chris Taylor; Yoshio Tateno; Adrian Tett; Sarah Turner; David Ussery; Bob Vaughan; Naomi Ward; Trish Whetzel; Ingio San Gil; Gareth Wilson; Anil Wipat
Journal:  Nat Biotechnol       Date:  2008-05       Impact factor: 54.908

10.  CRISPR recognition tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats.

Authors:  Charles Bland; Teresa L Ramsey; Fareedah Sabree; Micheal Lowe; Kyndall Brown; Nikos C Kyrpides; Philip Hugenholtz
Journal:  BMC Bioinformatics       Date:  2007-06-18       Impact factor: 3.169
