Metagenome-Assembled Genome of Halomonas sp. Isolate SL48-SHIP-3 from the Microbial Mat of Salt Lake Number 48 (Novosibirsk Region, Russia).

Aleksandra A Shipova, Anton V Korzhuk, Valeria Shlyahtun, Alla V Bryanskaya, Aleksey S Rozanov, Sergey E Peltek.

Abstract

The Halomonas sp. isolate SL48-SHIP-3 genome was obtained by metagenomic sequencing of the microbial mat of Salt Lake Number 48 (54.201806N, 78.179194E; Novosibirsk Region, Russia). The sequenced and annotated genome is 2,575,909 bp and encodes 2,368 genes.
Copyright © 2019 Shipova et al.

Year:  2019        PMID: 31806750      PMCID: PMC6895310          DOI: 10.1128/MRA.01293-19

Source DB:  PubMed          Journal:  Microbiol Resour Announc        ISSN: 2576-098X


ANNOUNCEMENT

Halomonas is a genus of the family Halomonadaceae (Proteobacteria). These bacteria are halophiles that have previously been found in other regions of the world, including salt lakes (1). Salt Lake Number 48 (54.201806N, 78.179194E; Novosibirsk Region, Russia) is a chloride sulfate lake (pH 8.0), with salinity ranging from 190 g/liter to 230 g/liter depending on weather conditions (2). A sample of the microbial mat, which consists of floating particles 2 to 10 mm in size, was taken from the coastal part of this lake and stored in alcohol at −70°C. A metagenome was obtained from this sample, and the genomes of individual microorganisms were then recovered from it. We have already described one of these genomes in a previous article (3).
Total DNA was isolated using the NucleoSpin soil kit (genomic DNA from soil) with the default protocol. Libraries for metagenome sequencing were prepared at the Center of Genomic Studies, Institute of Cytology and Genetics of the Siberian Branch of the RAS (ICG SB RAS), using a NEBNext Ultra DNA library prep kit for Illumina; the insert size was 450 bp. Metagenome paired-end sequencing was performed on a NovaSeq platform (Illumina) at Genetico LLC using the NovaSeq 6000 S2 reagent kit (200 cycles). A total of 398,852,702 reads were sequenced; the average read length was 100 bp.
The reads were processed with Trimmomatic version 0.36 (using options MINLEN:95 and CROP:97) (4). De novo assembly of short reads into scaffolds was performed using SPAdes version 3.11.1 with the option “--only-assembler” (5). Contigs shorter than 1,000 bp were removed. Binning of the metagenomic scaffolds into separate clusters (in which one bin represents one genome) was carried out using MaxBin (version 2.2.4) with default parameters (6). Here, we describe one of the resulting genomes, which has a coverage of 629.5×.
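The read processing, assembly, and binning steps described above can be sketched as command lines. The following is a minimal illustration, not the authors' actual invocation: all file names, the jar path, and the output prefixes are hypothetical, and the order of the two Trimmomatic steps (CROP before MINLEN) is an assumption, since the text names the options but not their order. SPAdes spells its assembly-only flag with two leading hyphens. The removal of contigs shorter than 1,000 bp, which falls between assembly and binning, is not shown.

```python
import shlex
from typing import List


def trimmomatic_cmd(jar: str, r1: str, r2: str, out_prefix: str) -> List[str]:
    """Paired-end Trimmomatic call using the two steps named in the text."""
    return [
        "java", "-jar", jar, "PE",
        r1, r2,
        # Trimmomatic PE writes paired and unpaired output for each mate.
        f"{out_prefix}_1P.fq.gz", f"{out_prefix}_1U.fq.gz",
        f"{out_prefix}_2P.fq.gz", f"{out_prefix}_2U.fq.gz",
        "CROP:97",    # cut each read to at most 97 bases (order assumed)
        "MINLEN:95",  # then drop reads shorter than 95 bases
    ]


def spades_cmd(r1: str, r2: str, out_dir: str) -> List[str]:
    """SPAdes call that skips read error correction and only assembles."""
    return ["spades.py", "--only-assembler", "-1", r1, "-2", r2, "-o", out_dir]


def maxbin_cmd(contigs: str, r1: str, r2: str, out_prefix: str) -> List[str]:
    """MaxBin 2 call with otherwise default parameters."""
    return ["run_MaxBin.pl", "-contig", contigs,
            "-reads", r1, "-reads2", r2, "-out", out_prefix]


if __name__ == "__main__":
    # Hypothetical file names, chosen only for illustration.
    steps = [
        trimmomatic_cmd("trimmomatic-0.36.jar", "raw_1.fq.gz", "raw_2.fq.gz", "trimmed"),
        spades_cmd("trimmed_1P.fq.gz", "trimmed_2P.fq.gz", "spades_out"),
        maxbin_cmd("scaffolds_min1000.fasta", "trimmed_1P.fq.gz",
                   "trimmed_2P.fq.gz", "bins/SL48"),
    ]
    for argv in steps:
        print(shlex.join(argv))
```

The functions only build argument lists, so each command can be inspected before it is passed to something like `subprocess.run`.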
The genome was classified to the family level using MetaWRAP version 1.2.2 (7) with the “classify_bins” module. The genome belongs to the family Halomonadaceae (within the class Gammaproteobacteria). We aligned the amino acid sequences (which were found by MaxBin for this cluster) against the NCBI nr protein database. Most of them had their best match (with the highest score and an E value of <0.05) with Halomonas sp. strain es.049. We compared this genome with the reference genomes of the most closely related members of the genus Halomonas (those that had a match with the described genome) using the Average Nucleotide Identity (ANI) calculator (http://enve-omics.ce.gatech.edu/ani/) with default parameters. The genomes of Halomonas arcis strain CGMCC (identity, 77.78%), Halomonas campaniensis strain LS21 (identity, 78.04%), and Halomonas sp. es.049 (identity, 80.18%) were closest to our genome. These results suggest that our organism may constitute a new strain.
The genome was checked for contamination using CheckM version 1.0.13 (8) (with the option “taxonomy_wf family Halomonadaceae”); the contamination rate was 0.64%, and the completeness rate was 92.18%. The genome is 2,575,909 bp long, consists of 149 contigs, and has a GC content of 56.79% and an N50 value of 37,322 bp (evaluated using QUAST version 5.0.2 with default parameters [9]). Open reading frame (ORF) prediction and automatic annotation were performed using NCBI PGAP (version 4.8) with the default parameters (10). The annotated draft genome contains 2,446 genes, 2,388 coding sequences (CDS), 1 rRNA (23S), 53 tRNAs, and 4 noncoding RNAs (ncRNAs).
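For readers unfamiliar with the assembly statistics reported above, the following minimal sketch shows how N50 and GC content are commonly defined. It is an illustration on toy data under the usual textbook definitions, not a reimplementation of QUAST, and the function names are hypothetical.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def gc_percent(sequences: Iterable[str]) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    gc = 0
    total = 0
    for seq in sequences:
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        total += sum(upper.count(base) for base in "ACGT")
    return 100.0 * gc / total if total else 0.0


if __name__ == "__main__":
    # Toy contigs, not data from the described genome.
    toy_lengths = [10, 20, 30, 40]
    print(n50(toy_lengths))
    print(gc_percent(["GGCC", "AATT"]))
```

Applied to the 149 deposited contigs, functions like these should give values close to the reported N50 and GC content, although small differences are possible depending on how ambiguous bases are handled.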

Data availability.

The raw metagenomic data have been deposited at DDBJ/EMBL/GenBank under accession no. SRR7943696. The draft genome sequence for Halomonas sp. isolate SL48-SHIP-3 has been deposited at DDBJ/EMBL/GenBank under accession no. VMQS00000000. The 149 contigs have been deposited under accession no. VMQS01000001 to VMQS01000149 (https://www.ncbi.nlm.nih.gov/Traces/wgs/VMQS01?display=contigs).
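The contig accession range can be expanded into individual accession numbers, for example to build a download list. This is a small sketch that assumes the accessions are the prefix VMQS01 followed by a zero-padded six-digit counter, as the two endpoints given above suggest; the function name is hypothetical.

```python
from typing import List


def contig_accessions(prefix: str = "VMQS01", first: int = 1,
                      last: int = 149, width: int = 6) -> List[str]:
    """Expand a WGS contig range into individual accession strings."""
    return [f"{prefix}{number:0{width}d}" for number in range(first, last + 1)]


if __name__ == "__main__":
    accessions = contig_accessions()
    print(len(accessions), accessions[0], accessions[-1])
```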
  9 in total

1.  SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

Authors:  Anton Bankevich; Sergey Nurk; Dmitry Antipov; Alexey A Gurevich; Mikhail Dvorkin; Alexander S Kulikov; Valery M Lesin; Sergey I Nikolenko; Son Pham; Andrey D Prjibelski; Alexey V Pyshkin; Alexander V Sirotkin; Nikolay Vyahhi; Glenn Tesler; Max A Alekseyev; Pavel A Pevzner
Journal:  J Comput Biol       Date:  2012-04-16       Impact factor: 1.479

2.  MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets.

Authors:  Yu-Wei Wu; Blake A Simmons; Steven W Singer
Journal:  Bioinformatics       Date:  2015-10-29       Impact factor: 6.937

3.  QUAST: quality assessment tool for genome assemblies.

Authors:  Alexey Gurevich; Vladislav Saveliev; Nikolay Vyahhi; Glenn Tesler
Journal:  Bioinformatics       Date:  2013-02-19       Impact factor: 6.937

4.  CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes.

Authors:  Donovan H Parks; Michael Imelfort; Connor T Skennerton; Philip Hugenholtz; Gene W Tyson
Journal:  Genome Res       Date:  2015-05-14       Impact factor: 9.043

5.  MetaWRAP-a flexible pipeline for genome-resolved metagenomic data analysis.

Authors:  Gherman V Uritskiy; Jocelyne DiRuggiero; James Taylor
Journal:  Microbiome       Date:  2018-09-15       Impact factor: 14.650

6.  Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors:  Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal:  Bioinformatics       Date:  2014-04-01       Impact factor: 6.937

7.  The role of environmental factors for the composition of microbial communities of saline lakes in the Novosibirsk region (Russia).

Authors:  Alla V Bryanskaya; Tatyana K Malup; Elena V Lazareva; Oxana P Taran; Alexey S Rozanov; Vadim M Efimov; Sergey E Peltek
Journal:  BMC Microbiol       Date:  2016-01-27       Impact factor: 3.605

8.  NCBI prokaryotic genome annotation pipeline.

Authors:  Tatiana Tatusova; Michael DiCuccio; Azat Badretdin; Vyacheslav Chetvernin; Eric P Nawrocki; Leonid Zaslavsky; Alexandre Lomsadze; Kim D Pruitt; Mark Borodovsky; James Ostell
Journal:  Nucleic Acids Res       Date:  2016-06-24       Impact factor: 16.971

9.  Metagenome-Assembled Genome Sequence of Phormidium sp. Strain SL48-SHIP, Isolated from the Microbial Mat of Salt Lake Number 48 (Novosibirsk Region, Russia).

Authors:  Aleksey S Rozanov; Aleksandra A Shipova; Alla V Bryanskaya; Sergey E Peltek
Journal:  Microbiol Resour Announc       Date:  2019-08-01