
Complete genome sequence of Salinicoccus halodurans H3B36, isolated from the Qaidam Basin in China.

Kai Jiang, Yanfen Xue, Yanhe Ma.

Abstract

Salinicoccus halodurans H3B36 is a moderately halophilic bacterium isolated from a sediment sample of the Qaidam Basin at 3.2 m vertical depth. Strain H3B36 accumulates Nα-acetyl-α-lysine as a compatible solute against salinity and heat stresses and may have potential applications in industrial biotechnology. In this study, we sequenced the genome of strain H3B36 using single-molecule, real-time sequencing technology on a PacBio RS II instrument. The complete genome of strain H3B36 was 2,778,379 bp and contained 2,853 protein-coding genes, 12 rRNA genes, and 61 tRNA genes, along with 58 tandem repeats, six minisatellite DNA sequences, 11 genome islands, and no CRISPR repeat region. Further analysis of epigenetic modifications revealed the presence of 11,000 m4C-type modified bases, 7,545 m6A-type modified bases, and 89,064 other modified bases. The genome data for this strain may provide insight into the metabolism of Nα-acetyl-α-lysine.


Keywords:  Genome sequencing; Moderately halophilic; Qaidam Basin; Salinicoccus halodurans strain; Staphylococcaceae

Year:  2015        PMID: 26634017      PMCID: PMC4667468          DOI: 10.1186/s40793-015-0108-8

Source DB:  PubMed          Journal:  Stand Genomic Sci        ISSN: 1944-3277


Introduction

Moderately halophilic bacteria are a group of halophilic microorganisms that grow optimally in media containing between 3 % and 15 % (w/v) NaCl. These bacteria exhibit strong salt tolerance and are widely distributed in different high-salt habitats, such as hypersaline soils and lakes, solar salterns, and salted foods [1, 2]. To cope with hyperosmotic conditions, these microorganisms accumulate large quantities of inorganic ions, such as K+ and Cl−, or a particular group of organic osmolytes [3, 4], such as sugars (trehalose and sucrose), sugar derivatives (glucosylglycerol and mannosylglycerate), polyols (glycerol and arabitol), phosphodiesters (di-myo-inositol phosphate), amino acids (proline, α-glutamate, and β-glutamate), and amino acid derivatives (betaine and ectoine) [5-8]. In strain H3B36, which was isolated from subsurface saline soil (3.2 m depth) in the Qaidam Basin in Qinghai province, China, we detected a special compound, Nα-acetyl-α-lysine, that acts as an organic osmolyte and thermolyte (authors’ unpublished observation). Intracellular Nα-acetyl-α-lysine increased and accumulated to a high level when strain H3B36 was subjected to salt stress or heat stress. Unlike other compatible solutes, Nα-acetyl-α-lysine has only been found in Salinibacter ruber to date, and the molecular mechanisms through which this compound is synthesized and stored are unclear [9, 10]. Based on analysis of the 16S rRNA gene sequence, this strain is most closely related to Salinicoccus halodurans W24T (= CGMCC 1.6501T = DSM 19336) [11]. The genus Salinicoccus, which was first described by Ventosa et al. [12, 13], belongs to the family Staphylococcaceae. To date, 16 validly named species of Salinicoccus have been identified; however, only six genome sequences are available. All species of the genus are defined as moderately halophilic bacteria.
These organisms may have potential applications in various fields, including as additives in the food industry; for production of polymer compounds, enzymes, and stress protectants; and in environmental protection and biodegradation [14-19]. To obtain insights into the metabolic pathway of Nα-acetyl-α-lysine and explore the genomes of Salinicoccus spp., we performed complete genome sequence analysis and annotation of S. halodurans H3B36.

Organism information

Classification and features

Strain H3B36 (Table 1) was isolated from a subsurface saline soil sample (3.2 m depth) from the Qaidam Basin of China by enrichment in liquid medium at 37 °C followed by plating on agar medium until single colonies were obtained. The 16S rRNA gene sequence of strain H3B36 and other available 16S rRNA gene sequences of closely related species collected from the EzTaxon-e database were used to construct a phylogenetic tree (Fig. 1) [20]. CLUSTAL_X was used to generate alignments [21]. After trimming, the alignments were converted to the MEGA format, and a phylogenetic tree was constructed. The evolutionary history was inferred using the maximum likelihood method based on the Kimura 2-parameter model within MEGA software version 5.10 [22, 23]. Taxonomic analysis showed that strain H3B36 was most closely related to S. halodurans W24T, with 99.9 % 16S rRNA gene sequence identity, and as such, strain H3B36 was classified as a strain of S. halodurans.
Table 1

Classification and general features of Salinicoccus halodurans H3B36 according to the MIGS recommendations [44]

| MIGS ID | Property | Term | Evidence code a |
|  | Classification | Domain Bacteria | TAS [45] |
|  |  | Phylum Firmicutes | TAS [46] |
|  |  | Class Bacilli | TAS [47, 48] |
|  |  | Order Bacillales | TAS [49, 50] |
|  |  | Family Staphylococcaceae | TAS [48, 51] |
|  |  | Genus Salinicoccus | TAS [12, 13] |
|  |  | Species Salinicoccus halodurans | TAS [11] |
|  |  | Strain H3B36 | IDA |
|  | Gram stain | Positive | TAS [11] |
|  | Cell shape | Cocci | IDA |
|  | Motility | Non-motile | TAS [11] |
|  | Sporulation | Non-sporulating | TAS [11] |
|  | Temperature range | 4-42 °C | IDA |
|  | Optimum temperature | 28-30 °C | IDA |
|  | pH range; Optimum | 5.5-9.0; 7.5 | IDA |
|  | Carbon source | Heterotroph | IDA |
| MIGS-6 | Habitat | Subsurface saline soil (3.2 m depth) | IDA |
| MIGS-6.3 | Salinity range | 2-18 % NaCl (w/v) | IDA |
| MIGS-22 | Oxygen requirement | Aerobic | IDA |
| MIGS-15 | Biotic relationship | Free-living | IDA |
| MIGS-14 | Pathogenicity | Unknown | NAS |
| MIGS-4 | Geographic location | China: Qaidam Basin | IDA |
| MIGS-5 | Sample collection | 2006 | IDA |
| MIGS-4.1 | Latitude | 37.06 N | IDA |
| MIGS-4.2 | Longitude | 94.73 E | IDA |
| MIGS-4.4 | Altitude | 2674 m | IDA |

a Evidence codes: IDA, inferred from direct assay; TAS, traceable author statement; NAS, non-traceable author statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [29]

Fig. 1

Phylogenetic tree based on the 16S rRNA gene showing the position of Salinicoccus halodurans H3B36 relative to other species in the genus Salinicoccus. Staphylococcus aureus was used as an outgroup. The analysis involved 18 nucleotide sequences, and there were a total of 1394 positions in the final dataset. GenBank accession numbers for the sequences of each strain are indicated in parentheses. The maximum likelihood algorithm based on the Kimura 2-parameter model was used to construct the phylogenetic consensus tree. All positions containing missing data and gaps were eliminated. Numbers next to the branches represent the bootstrap values obtained by repeating the analysis 1000 times, and values of less than 70 % are not shown at the nodes. The tree is drawn to scale, with branch lengths indicating the number of substitutions per site

The cell morphology of strain H3B36 was determined using scanning electron microscopy (Fig. 2). Cells of strain H3B36 were spherical and measured approximately 0.9 μm in diameter. Cells occurred singly or in pairs, tetrads, or irregular clumps at early growth stages. Colonies on GMH agar medium were white, opaque, circular, and slightly convex. Cells grew at temperatures from 4 to 42 °C, with optimum growth observed around 30 °C in GMH medium. In GMH medium with different NaCl concentrations, the strain grew well at 2 to 18 % (w/v) NaCl and could not grow in medium without NaCl or with NaCl at concentrations of more than 20 % (w/v). Optimal growth occurred between 4 % and 6 % (w/v) NaCl.
Fig. 2

Scanning electron micrographs of Salinicoccus halodurans H3B36 using field-emission scanning electron microscopy (Hitachi SU8010, Japan)


Genome sequencing information

Genome project history

H3B36 was selected for genome sequencing because it produces a unique compatible solute that protects it against stress and because of its potential industrial applications. The complete genome sequence has been deposited in GenBank under the accession number CP011366. Sequencing, annotation, and analysis were performed at Wuhan Institute of Biotechnology, China. The project information and its association with MIGS version 2.0 are shown in Table 2.
Table 2

Genome sequencing project information

| MIGS ID | Property | Term |
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | None |
| MIGS-29 | Sequencing platforms | PacBio RS II |
| MIGS-31.2 | Fold coverage | 212× |
| MIGS-30 | Assemblers | HGAP 2.2.0 workflow |
| MIGS-32 | Gene calling method | Glimmer |
|  | Locus Tag | AAT16 |
|  | GenBank ID | CP011366 |
|  | GenBank Date of Release | May 11, 2015 |
|  | GOLD ID | Gp0114775 |
|  | BioProject ID | PRJNA282445 |
| MIGS-13 | Source Material Identifier | Strain H3B36 |
|  | Project relevance | Environmental and biotechnological |

Growth conditions and genomic DNA preparation

H3B36 was grown aerobically in GMH medium containing 5 g/L casamino acids, 5 g/L yeast extract, 4 g/L MgSO4 · 7H2O, 2 g/L KCl, 0.036 g/L FeSO4 · 7H2O, 0.36 mg/L MnCl2 · 7H2O, and 60 g/L NaCl, at pH 7.0 (titrated with 1 M NaOH). Genomic DNA from freshly grown cells harvested in the exponential growth phase was extracted using the QIAGEN Genomic DNA Buffer Set and QIAGEN Genomic-tip 100/G according to the manufacturer’s protocols. The prepared DNA was run on a 0.75 % agarose gel to verify fragment integrity and molecular weight. The quality and quantity of the prepared DNA were measured with a NanoDrop instrument (Thermo Scientific, Wilmington, MA, USA) and a Qubit fluorometer (Life Technologies, Grand Island, NY, USA) to confirm that the sample was suitable for high-throughput sequencing.

Genome sequencing and assembly

The genome of H3B36 was sequenced using third-generation sequencing technology on a PacBio RS II instrument. Sequencing produced a total of 573,153,827 bp in 54,457 post-filter reads with a mean length of 10,524 bp. The Hierarchical Genome Assembly Process (HGAP) pipeline, version 2.2.0, was used to assemble the genome [24-26]. Long reads were selected as the seed sequences for constructing preassemblies, and the shorter reads were mapped to the seeds using BLASR software, which corrected the errors in the long reads and raised base accuracy to more than 99 %. This step yielded 95.7 Mb of high-quality reads with an average length of 12,910 bp. The corrected reads were then assembled with Celera Assembler software, which uses an overlap-layout-consensus (OLC) algorithm, after parameter tuning. To polish the assembly, the raw data were mapped to the assembled reference sequence and fine-scale errors were removed using Quiver software. Low-depth contigs were then removed, and the remaining contigs were joined using Minimus2 software. Finally, the data were assembled de novo into one complete 2,778,378-bp contig with 212× depth of coverage.
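The read statistics quoted above can be sanity-checked with simple arithmetic. The sketch below is only an illustration using the figures reported in the text; the constant and function names are hypothetical. Dividing total post-filter bases by genome length gives a raw estimate of roughly 206×, close to but not identical to the reported 212×, which is presumably computed by the assembly workflow from mapped reads (an assumption, since the text does not say how the depth was derived).

```python
# Figures reported in the sequencing and assembly text.
TOTAL_BASES = 573_153_827   # post-filter bases
READ_COUNT = 54_457         # post-filter reads
GENOME_SIZE = 2_778_378     # length of the final contig in bp


def mean_read_length(total_bases: int, read_count: int) -> float:
    """Average read length in bp."""
    return total_bases / read_count


def raw_fold_coverage(total_bases: int, genome_size: int) -> float:
    """Naive depth estimate: sequenced bases divided by genome length."""
    return total_bases / genome_size


if __name__ == "__main__":
    print(f"mean read length: {mean_read_length(TOTAL_BASES, READ_COUNT):.1f} bp")
    print(f"raw coverage:     {raw_fold_coverage(TOTAL_BASES, GENOME_SIZE):.1f}x")
```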

Genome annotation

The RAST Prokaryotic Genome Annotation Server was used to predict protein-coding open reading frames, tRNAs, and structural RNA genes [27]. The Clusters of Orthologous Groups, Gene Ontology, Kyoto Encyclopedia of Genes and Genomes, Swiss-Prot, and Non-Redundant Protein databases were used to annotate the predicted genes [28-32]. The Pfam database was used to identify genes with conserved domains [33]. Transmembrane helices and signal peptides were identified using TMHMM and SignalP, version 4.1, respectively [34, 35]. Tandem Repeats Finder software was used to predict tandem repeat sequences, and MISA software was used to find the minisatellite DNA sequences [36]. Genome islands were analyzed using IslandViewer software, which integrates three software programs (IslandPick, SIGI-HMM, and IslandPath-DIMOB) and combines the Virulence Factor and Antibiotic Resistance Gene databases [37, 38]. In addition, CRISPR motifs were searched for using CRISPR II software [39]. The raw data were analyzed to identify loci carrying epigenetic modifications (i.e., m4C, m6A, and other modifications) based on the dynamic (kinetic) characteristics of the raw data [40, 41]. The Restriction Enzyme Database was used to identify the genes involved in the restriction modification system [42].

Genome properties

The complete genome sequence of H3B36 was found to be 2,778,378 bp and had a G + C content of 44.54 %. No plasmids were found. RAST predicted 2,853 coding sequences, 61 tRNA genes, and 16 structural RNA genes. The predicted CDSs represented 88.79 % of the total genome sequence, with an average length of 864.72 bp. Genome analysis showed that the genome of strain H3B36 contained 58 tandem repeats, six minisatellite DNA sequences, and 11 genome islands. Further analysis of epigenetic modifications revealed 11,000 m4C-type modified bases, 7,545 m6A-type modified bases, and 89,064 other modified bases in the genome. Furthermore, several restriction modification genes were found, with eight belonging to the type I system, three belonging to the type II system, and one belonging to the type IV system. The genome statistics and gene distributions into COG functional categories are presented in Tables 3 and 4, respectively. The circular representation of the bacterial genome was drawn using CGview software (Fig. 3) [43].
Table 3

Genome statistics

| Attribute | Value | % of Total |
| Genome size (bp) | 2,778,379 | 100.00 |
| DNA coding (bp) | 2,489,753 | 89.61 |
| DNA G + C (bp) | 1,237,616 | 44.54 |
| DNA scaffolds | 1 | 100.00 |
| Total genes | 2,930 | 100.00 |
| Protein coding genes | 2,853 | 97.37 |
| RNA genes | 77 | 2.63 |
| Pseudo genes | N/D a |  |
| Genes in internal clusters | N/D a |  |
| Genes with function prediction | 2,235 | 76.28 |
| Genes assigned to COGs | 2,607 | 88.98 |
| Genes with Pfam domains | 2,458 | 83.89 |
| Genes with signal peptides | 102 | 3.48 |
| Genes with transmembrane helices | 723 | 24.68 |
| CRISPR repeats | NA |  |

a N/D, not determined
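The percentage column of Table 3 follows from two denominators: the genome size for the base-pair rows and the total gene count for the gene rows. The following Python sketch, with hypothetical names and the values copied from the table, recomputes that column as a consistency check; it is an illustration, not part of the authors' pipeline.

```python
GENOME_SIZE_BP = 2_778_379   # denominator for base-pair attributes
TOTAL_GENES = 2_930          # denominator for gene-count attributes

# Values copied from Table 3.
BP_ATTRIBUTES = {
    "DNA coding (bp)": 2_489_753,
    "DNA G + C (bp)": 1_237_616,
}
GENE_ATTRIBUTES = {
    "Protein coding genes": 2_853,
    "RNA genes": 77,
    "Genes with function prediction": 2_235,
    "Genes assigned to COGs": 2_607,
    "Genes with Pfam domains": 2_458,
    "Genes with signal peptides": 102,
    "Genes with transmembrane helices": 723,
}


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


def recompute_percentages() -> dict[str, float]:
    """Recompute the '% of Total' column for every attribute above."""
    result = {name: percent(v, GENOME_SIZE_BP) for name, v in BP_ATTRIBUTES.items()}
    result.update({name: percent(v, TOTAL_GENES) for name, v in GENE_ATTRIBUTES.items()})
    return result


if __name__ == "__main__":
    for name, value in recompute_percentages().items():
        print(f"{name:35s} {value:6.2f}")
```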

Table 4

Number of genes associated with COG functional categories of Salinicoccus halodurans H3B36

| Code | Value | % | Description |
| J | 143 | 5.0 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0 | RNA processing and modification |
| K | 206 | 7.2 | Transcription |
| L | 123 | 4.3 | Replication, recombination and repair |
| B | 2 | 0.1 | Chromatin structure and dynamics |
| D | 22 | 0.8 | Cell cycle control, cell division, chromosome partitioning |
| V | 48 | 1.7 | Defense mechanisms |
| T | 86 | 3.0 | Signal transduction mechanisms |
| M | 129 | 4.5 | Cell wall/membrane biogenesis |
| N | 13 | 0.5 | Cell motility |
| U | 17 | 0.6 | Intracellular trafficking and secretion |
| O | 90 | 3.2 | Posttranslational modification, protein turnover, chaperones |
| C | 174 | 6.1 | Energy production and conversion |
| G | 269 | 9.4 | Carbohydrate transport and metabolism |
| E | 278 | 9.7 | Amino acid transport and metabolism |
| F | 79 | 2.8 | Nucleotide transport and metabolism |
| H | 98 | 3.4 | Coenzyme transport and metabolism |
| I | 139 | 4.9 | Lipid transport and metabolism |
| P | 153 | 5.7 | Inorganic ion transport and metabolism |
| Q | 39 | 1.4 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 277 | 9.7 | General function prediction only |
| S | 222 | 7.8 | Function unknown |
| - | 246 | 8.6 | Not in COGs |

The total is based on the total number of protein coding genes in the annotated genome

Fig. 3

Circular chromosome map of Salinicoccus halodurans H3B36. From inner to outer: 1, GC skew (GC Skew is calculated using a sliding window, as (G – C) / (G + C), with the value plotted as the deviation from the average GC skew of the entire sequence); 2, GC content (plotted using a sliding window, as the deviation from the average GC content of the entire sequence); 3, tRNA/rRNA; 4 and 5, CDS (colored according to COG function categories, where 4 is the reverse strand and 5 is the forward strand); 6 and 7, m4C and m6A sites in CDS/rRNA/tRNA (6 is the reverse strand and 7 is the forward strand); and 8, m4C and m6A sites in intergene regions
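The GC skew track described in the caption can be sketched as follows. This is a minimal illustration, not the CGView implementation: the function name, the default window and step sizes, the use of non-overlapping linear windows (no wrap-around for the circular chromosome), and the choice of the whole-sequence skew as the baseline are all assumptions.

```python
def gc_skew_deviation(sequence: str, window: int = 1000, step: int = 1000):
    """Return (start, deviation) pairs for a sliding-window GC skew.

    Skew is (G - C) / (G + C); each window's value is reported as its
    deviation from the skew of the entire sequence.
    """
    seq = sequence.upper()

    def skew(fragment: str) -> float:
        g = fragment.count("G")
        c = fragment.count("C")
        # A window without G or C has no defined skew; treat it as 0.
        return (g - c) / (g + c) if (g + c) else 0.0

    baseline = skew(seq)  # assumption: baseline is the whole-sequence skew
    return [
        (start, skew(seq[start:start + window]) - baseline)
        for start in range(0, len(seq) - window + 1, step)
    ]
```

For example, `gc_skew_deviation("GGGGCCCC", window=4, step=4)` gives a positive deviation for the G-rich first window and a negative one for the C-rich second window, which is the sign change that such plots use to suggest the replication origin and terminus.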


Insights from the genome sequence

Genome analysis showed that H3B36 contained many genes related to the stress response, such as those encoding choline and betaine transporters, glycerol uptake facilitator protein, cold-shock protein, and chaperone proteins. These genes may allow the strain to cope with different environmental stresses. Experimentation and additional analysis of these genes may help to elucidate the mechanisms mediating the stress response and facilitate the development of H3B36 for use in industrial applications. In addition, several genes encoding hydrolases, including amylase (1), protease (19), pullulanase (2), lipase (3), phosphoesterase (5), and glucosidase (4), were identified in the genome. Hydrolases are highly valuable resources for some specific industrial processes, and hydrolases from various extremophiles may have many advantages [14, 19]. These results indicate that H3B36 might have potential for application in industrial biotechnology as a producer of miscellaneous hydrolases. Nα-acetyl-α-lysine was found to play a key role in protecting H3B36 cells under different stresses (unpublished observation by Kai Jiang, Yanfen Xue and Yanhe Ma). Genome annotations showed that lysine may be synthesized through the acetyl-dependent diaminopimelic acid pathway in H3B36. One 8-kb gene cluster containing eight genes was predicted to be involved in Nα-acetyl-α-lysine biosynthesis. Six genes in the cluster map to enzymes in the acetyl-dependent diaminopimelic acid pathway: the genes encoding aspartokinase, aspartate-semialdehyde dehydrogenase, dihydrodipicolinate synthase, dihydrodipicolinate reductase, 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-acetyltransferase, and diaminopimelate decarboxylase. Because Nα-acetyl-α-lysine is a derivative of lysine, this gene cluster may participate in its synthesis. Further studies are required to verify this assumption and identify the metabolic pathway mediating Nα-acetyl-α-lysine biosynthesis in H3B36.

Conclusions

This is the first report describing the genome sequence of S. halodurans. The genome of H3B36 (2.78 Mb) is larger than those of the other sequenced members of the genus Salinicoccus, including Salinicoccus sp. SV-16 (2.59 Mb) and strains DSM 17002 (2.55 Mb), DSM 19776 (2.64 Mb), CrmT (2.67 Mb), and W12 (2.56 Mb). H3B36 has a G + C content (44.5 %) higher than that of DSM 19776 but lower than those of CrmT, Salinicoccus sp. SV-16, DSM 17002, and strain W12 (47.9 %, 48.7 %, 49.1 % and 50.0 %, respectively). Further comparative genomic analysis shows that the Nα-acetyl-α-lysine-related gene cluster is also found in the other sequenced members of the genus Salinicoccus. The eight-gene clusters in Salinicoccus sp. SV-16, DSM 17002, CrmT, and W12 are similar to that in H3B36. The cluster in DSM 19776 differs slightly in that it lacks the aspartokinase gene. The genome of H3B36 provides important insights into the metabolism of Nα-acetyl-α-lysine. Furthermore, the sequence of H3B36 provides useful information that may facilitate applications of the genus Salinicoccus in industrial biotechnology.