Draft genome sequence data of Streptomyces sp. FH025.

Lucky Poh Wah Goh, Fauze Mahmud, Ping-Chin Lee.

Abstract

The genome of Streptomyces sp. FH025 comprises 8,381,474 bp with a high GC content of 72.51%. The genome contains 7035 coding sequences across 1261 contigs. Streptomyces sp. FH025 contains 57 secondary metabolite gene clusters, including polyketide synthase, nonribosomal peptide synthetase and other biosynthetic pathways such as amglyccycl, butyrolactone, terpenes, siderophores, lanthipeptide-class-iv, and ladderane. 16S rRNA analysis indicated that Streptomyces sp. FH025 is closely related to members of the genus Streptomyces. This whole genome project has been deposited at NCBI under the accession JAFLNG000000000.
© 2021 The Author(s). Published by Elsevier Inc.

Keywords:  Anti-malarial activity; Draft genome sequence; FH025; Secondary metabolites; Streptomyces sp.

Year:  2021        PMID: 34095378      PMCID: PMC8166745          DOI: 10.1016/j.dib.2021.107128

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specification Table

Value of the Data

The draft genome of Streptomyces strain FH025 shows that it is distinct from the other strains analyzed and has the potential to produce novel bioactive compounds. The putative secondary metabolite genes identified in the Streptomyces sp. FH025 genome could contribute to antibiotic and drug discovery for the treatment of various human diseases. Based on the genome data and a previous study, this strain is a potential candidate for the study of anti-malarial compounds and for the production of various enzymes.

Data Description

Streptomyces sp. FH025 was isolated from Likas, Sabah, Malaysia (06°2′18.4″ N, 116°7′16.6″ E). The draft genome characteristics of Streptomyces sp. FH025 are summarized in Table 1. The assembly consisted of 1261 contigs with a total contig size of 8,381,474 bp and an N50 of 10,071. The L50 value was 246 and the GC content was 72.51%. Based on genome annotation, all 1261 contigs carried protein-encoding genes, and there were 406 subsystems and 7035 coding sequences (Table 1, Fig. 1). There were 74 RNAs.
Table 1

Characteristics of draft genome assembly of Streptomyces sp. FH025.

Number of contigs                                 1261
Total contig size (bp)                            8,381,474
N50 contig number (a)                             10,071
L50                                               246
GC content (%)                                    72.51
Number of contigs (with protein encoding genes)   1261
Number of subsystems                              406
Number of coding sequences                        7035
Number of RNAs                                    74

(a) Minimum set of contigs that represent at least 50% of total genome sequence.
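
The assembly statistics in Table 1 (N50, L50 and GC content) can be recomputed from any contig set. The sketch below is a minimal illustration and not the authors' pipeline: it assumes the conventional definitions (N50 is the length of the contig at which the cumulative length of the size-sorted contigs first reaches half of the assembly, and L50 is the number of contigs needed to get there). The function names and the toy inputs are hypothetical.

```python
from typing import Iterable, Tuple


def n50_l50(contig_lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50 and 'count' is the L50.
            return length, count
    raise AssertionError("unreachable: cumulative sum always reaches half")


def gc_percent(sequences: Iterable[str]) -> float:
    """GC content (%) over all sequences, ignoring non-ACGT symbols such as N."""
    gc = at = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    if gc + at == 0:
        raise ValueError("no A/C/G/T bases found")
    return 100.0 * gc / (gc + at)


# Toy inputs (hypothetical, not FH025 data).
print(n50_l50([100, 80, 60, 40, 20]))   # (80, 2)
print(gc_percent(["GGCC", "GCAT"]))     # 75.0
```

In practice the contig lengths and sequences would be read from the assembly FASTA file; that step is omitted here to keep the sketch self-contained.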

Fig. 1

Subsystem statistics of FH025 based on RAST annotation. The subsystem categories and corresponding feature counts are shown in the legend.

Additionally, antiSMASH analysis indicated that Streptomyces sp. FH025 could produce important secondary metabolites. An estimated 57 secondary metabolite gene clusters (smCOG) were detected (Table 2). The polyketide synthase (PKS) clusters present included type I and type III PKS. There were 9 nonribosomal peptide synthetase (NRPS), 10 NRPS-like and 1 NRPS-Type I PKS clusters. In addition, several other secondary metabolite biosynthetic pathways were present, such as amglyccycl, butyrolactone, terpenes, siderophores, lanthipeptide and ladderane.
Table 2

Putative gene clusters coding for secondary metabolites detected by antiSMASH annotation of Streptomyces sp. FH025.

Features                      Number of clusters
No of smCOG (1)               57
PKS (2)
  PKS-like                    2
  Type I                      17
  Type III                    2
NRPS (3)                      9
NRPS-like                     10
NRPS-Type I PKS               1
Biosynthetic Pathways
  Amglyccycl                  1
  Butyrolactone               1
  Terpenes                    3
  Siderophores                4
  Lanthipeptide-class-iv      1
  Ladderane                   1
RiPP-like                     1
RRE-containing                2
NAPAA                         1
Others                        1

(1) Secondary metabolism Clusters of Orthologous Groups.
(2) Polyketide synthase.
(3) Nonribosomal peptide synthetase.
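
As a quick consistency check, the per-category counts in Table 2 can be tallied against the overall smCOG total. The sketch below assumes the table is read with footnote markers separated from the values (for example, NRPS with 9 clusters and Type I PKS with 17); the dictionary name is hypothetical.

```python
# Per-category cluster counts as read from Table 2 (assumed reading:
# footnote markers are not part of the numeric values).
cluster_counts = {
    "PKS-like": 2,
    "Type I PKS": 17,
    "Type III PKS": 2,
    "NRPS": 9,
    "NRPS-like": 10,
    "NRPS-Type I PKS": 1,
    "Amglyccycl": 1,
    "Butyrolactone": 1,
    "Terpenes": 3,
    "Siderophores": 4,
    "Lanthipeptide-class-iv": 1,
    "Ladderane": 1,
    "RiPP-like": 1,
    "RRE-containing": 2,
    "NAPAA": 1,
    "Others": 1,
}

total_clusters = sum(cluster_counts.values())
print(total_clusters)  # 57
```

Under this reading the categories sum to 57, which agrees with the number of secondary metabolite gene clusters given in the abstract.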

ContEst16S software analysis indicated that the draft genome assembly was not contaminated with other prokaryotic genomes. The 16S rRNA phylogenetic analysis revealed that Streptomyces sp. FH025 is closely related to members of the genus Streptomyces (Fig. 2). Furthermore, genome-based taxonomy analysis revealed that strain FH025 has the highest average nucleotide identity (ANI) value (89.42%) and the highest digital DNA-DNA hybridization (dDDH) value (38.4%) with Kitasatospora aureofaciens strain DM-1 (Table 3). However, strain FH025 was not affiliated with Kitasatospora because the ANI and dDDH values did not exceed the established species-delimitation cutoffs for ANI (>95–96%) [1] and dDDH (>70%) [2]. The low genome identity of strain FH025 with the other strains analyzed indicates that strain FH025 is unique and warrants further investigation.
Fig. 2

Neighbor-joining phylogenetic tree of FH025 based on the 16S rRNA gene sequence (947 bp), showing that FH025 is closely related to members of the genus Streptomyces. The numbers at branch nodes indicate percentages from 1000 bootstraps.

Table 3

The 16S rRNA sequence similarity, ANI and dDDH values of strain FH025 and its closely related species.

Closely related species                                                               16S rRNA sequence similarity (%)  OrthoANIu value (%)  dDDH value (%)
NC_016109.1 Kitasatospora setae KM 6054, complete sequence                            98.17                             80.54                24.6
NZ_CP020563.1 Kitasatospora albolonga strain YIM 101047 chromosome, complete genome   97.46                             75.52                21.8
NZ_CP020567.1 Kitasatospora aureofaciens strain DM-1 chromosome, complete genome      99.80                             89.42                38.4
NZ_CP025394.1 Kitasatospora sp. MMS16-BH015 chromosome, complete genome               98.67                             81.01                25.2
NZ_CP054919.1 Kitasatospora sp. NA04385 chromosome, complete genome                   98.57                             80.72                24.7
Streptomyces clavuligerus strain ATCC 27064 chromosome, complete genome               96.64                             75.89                21.8
Streptomyces galilaeus strain ATCC 14969 chromosome, complete genome                  96.13                             75.59                21.3
Streptomyces nitrosporeus strain ATCC 12769 chromosome, complete genome               97.25                             75.87                21.7
Streptomyces subrutilus strain ATCC 27467 chromosome, complete genome                 96.85                             76.23                21.5
Streptomyces tsukubensis strain NRRL 18488 chromosome, complete genome                96.95                             75.59                21.8
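
The species-delimitation reasoning applied to Table 3 can be expressed as a simple threshold check. The sketch below is illustrative only: it uses the OrthoANIu and dDDH values from Table 3 and the cutoffs quoted in the text (ANI above 95–96%, here taken at the lower bound of 95%, and dDDH above 70%). The class, constant and function names are hypothetical.

```python
from typing import List, NamedTuple

ANI_CUTOFF = 95.0   # lower bound of the 95-96% range quoted in the text
DDDH_CUTOFF = 70.0  # dDDH species cutoff quoted in the text


class Comparison(NamedTuple):
    reference: str
    ani: float   # OrthoANIu value (%)
    dddh: float  # dDDH value (%)


def same_species(c: Comparison) -> bool:
    """True only if both values exceed the species-delimitation cutoffs."""
    return c.ani > ANI_CUTOFF and c.dddh > DDDH_CUTOFF


# OrthoANIu and dDDH values of strain FH025 against each genome in Table 3.
comparisons: List[Comparison] = [
    Comparison("Kitasatospora setae KM 6054", 80.54, 24.6),
    Comparison("Kitasatospora albolonga YIM 101047", 75.52, 21.8),
    Comparison("Kitasatospora aureofaciens DM-1", 89.42, 38.4),
    Comparison("Kitasatospora sp. MMS16-BH015", 81.01, 25.2),
    Comparison("Kitasatospora sp. NA04385", 80.72, 24.7),
    Comparison("Streptomyces clavuligerus ATCC 27064", 75.89, 21.8),
    Comparison("Streptomyces galilaeus ATCC 14969", 75.59, 21.3),
    Comparison("Streptomyces nitrosporeus ATCC 12769", 75.87, 21.7),
    Comparison("Streptomyces subrutilus ATCC 27467", 76.23, 21.5),
    Comparison("Streptomyces tsukubensis NRRL 18488", 75.59, 21.8),
]

closest = max(comparisons, key=lambda c: c.ani)
print(closest.reference, same_species(closest))
print(any(same_species(c) for c in comparisons))
```

With these values the closest genome is Kitasatospora aureofaciens DM-1, and no comparison passes both cutoffs, which matches the conclusion that FH025 does not belong to any of the species compared.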

Experimental Design, Materials and Methods

Sample collection and isolation of Streptomyces

Soil samples covered with dead leaves were collected under a Shorea parvifolia tree in Likas, Sabah, Malaysia, and bacterial isolation was performed as previously described [3]. Briefly, the soil samples were serially diluted and bacteria were isolated on modified humic acid agar (with added vitamin B). Isolates were screened for anti-malarial activity as previously described [3], and FH025 was observed to exhibit anti-malarial activity. The isolate was sub-cultured on oatmeal agar (pH 7.2) at 28 °C to obtain a pure isolate named FH025. The culture was stored as a 20% glycerol stock at −80 °C.

DNA isolation, genome sequencing, assembly, and annotation

Genomic DNA was isolated using the Wizard® Genomic DNA Purification Kit (Promega, USA) according to the manufacturer's instructions. A whole-genome sequencing library was prepared using the Nextera XT DNA library preparation kit (Illumina, USA) following the manufacturer's instructions. The libraries were sequenced on the MiSeq platform (Illumina, USA) to generate 2 × 250 bp paired-end reads. Adapters were trimmed from the raw reads. Low quality sequences (4]. Primary genome assembly was performed using Unicycler version 0.4.8.0 [5]. The primary draft genome was annotated using Rapid Annotation using Subsystems Technology (RAST) [6], [7], [8]. The secondary metabolite biosynthetic gene clusters in the draft genome of strain FH025 were identified using antiSMASH version 5.0 [9].
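
For orientation, a short-read-only Unicycler assembly of paired-end reads is typically launched from the command line with the two read files and an output directory. The sketch below only builds and prints such a command; it does not run the assembler. The file names, output directory and thread count are hypothetical, and the exact options used by the authors are not stated in the text.

```python
import shlex
from typing import List


def unicycler_command(reads_1: str, reads_2: str, out_dir: str,
                      threads: int = 4) -> List[str]:
    """Build an argument list for a paired-end, short-read Unicycler run."""
    return [
        "unicycler",
        "-1", reads_1,            # first-in-pair trimmed reads
        "-2", reads_2,            # second-in-pair trimmed reads
        "-o", out_dir,            # output directory for the assembly
        "--threads", str(threads),
    ]


# Hypothetical file names for the trimmed MiSeq reads.
cmd = unicycler_command(
    "FH025_R1.trimmed.fastq.gz",
    "FH025_R2.trimmed.fastq.gz",
    "FH025_assembly",
)
print(shlex.join(cmd))

# To actually run the assembly (requires Unicycler to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the argument list separately from execution keeps the command easy to log and inspect before a long-running assembly is started.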

16S rRNA phylogenetic analysis

ContEst16S software was used to extract the Streptomyces sp. FH025 16S rRNA gene sequence (981 bp) and to check for contamination by other prokaryotic genomes [10]. A basic local alignment search tool (BLAST) search was performed against the NCBI database and the 16S rRNA gene sequences of the 20 closest species were retrieved. The sequences were aligned using ClustalW and trimmed to 947 bp [11]. The phylogenetic tree was constructed by the neighbor-joining method with 1000 bootstraps using MEGA X software [12].
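
The 16S rRNA sequence similarity values reported for the aligned sequences are percent identities. The sketch below shows one common way to compute such a value from two rows of an alignment; it is not the procedure implemented in BLAST, ClustalW or MEGA X. It assumes that columns containing a gap in either sequence are skipped, which is only one of several conventions, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences, not FH025 data):
# 9 gap-free columns, 8 of which match.
print(round(percent_identity("ACGT-ACGTA", "ACGTTACGAA"), 2))  # 88.89
```

Because gap handling and the choice of denominator differ between tools, values computed this way may differ slightly from those reported by a given program.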

Average nucleotide identity and digital DNA-DNA hybridization genome-based taxonomy analysis

The ANI between the genome of strain FH025 and those of related species with complete genomes in the NCBI database was determined using the OrthoANIu algorithm [13]. Digital DNA-DNA hybridization (dDDH) was performed using Genome BLAST Distance Phylogeny against 10 closely related species with complete genome sequences obtained from the NCBI database [14].

Data Availability

The whole genome project was deposited at NCBI under the accession JAFLNG000000000.

Ethics Statement

This study did not involve any human subjects or animal experiments. No ethical approval was required.

CRediT Author Statement

Lucky Poh Wah Goh: Formal analysis, Data curation, Writing – original draft, Writing – review & editing; Fauze Mahmud: Writing – review & editing; Ping-Chin Lee: Conceptualization, Resources, Supervision, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no conflict of interest that could influence the work reported in this paper.

Subject                         Biology
Specific subject area           Microbiology, Bacterial genomics, Biotechnology
Type of data                    Figure, Table, Draft genome sequence data
How data were acquired          Genome sequencing on MiSeq
Data format                     Raw and analyzed
Parameters for data collection  Genomic DNA was isolated from a pure culture of Streptomyces sp. FH025. AntiSMASH software predicted the putative biosynthetic gene clusters.
Description of data collection  Whole-genome sequencing, assembly, and annotation
Data source location            Soil samples used for bacteria isolation were collected at Likas, Sabah, Malaysia (06°2′18.4″ N, 116°7′16.6″ E).
Data accessibility              The data are available at NCBI GenBank from the following links:
                                http://www.ncbi.nlm.nih.gov/bioproject/705517
                                https://www.ncbi.nlm.nih.gov/biosample/18091016
                                https://www.ncbi.nlm.nih.gov/sra/PRJNA705517
1.  Microbial species delineation using whole genome sequences.

Authors:  Neha J Varghese; Supratim Mukherjee; Natalia Ivanova; Konstantinos T Konstantinidis; Kostas Mavrommatis; Nikos C Kyrpides; Amrita Pati
Journal:  Nucleic Acids Res       Date:  2015-07-06       Impact factor: 16.971

2.  ContEst16S: an algorithm that identifies contaminated prokaryotic genomes using 16S RNA gene sequences.

Authors:  Imchang Lee; Mauricio Chalita; Sung-Min Ha; Seong-In Na; Seok-Hwan Yoon; Jongsik Chun
Journal:  Int J Syst Evol Microbiol       Date:  2017-06-22       Impact factor: 2.747

3.  MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms.

Authors:  Sudhir Kumar; Glen Stecher; Michael Li; Christina Knyaz; Koichiro Tamura
Journal:  Mol Biol Evol       Date:  2018-06-01       Impact factor: 16.240

4.  Anti-malarial Activities of Two Soil Actinomycete Isolates from Sabah via Inhibition of Glycogen Synthase Kinase 3β.

Authors:  Dhiana Efani Dahari; Raifana Mohamad Salleh; Fauze Mahmud; Lee Ping Chin; Noor Embi; Hasidah Mohd Sidek
Journal:  Trop Life Sci Res       Date:  2016-08

5.  Genome sequence-based species delimitation with confidence intervals and improved distance functions.

Authors:  Jan P Meier-Kolthoff; Alexander F Auch; Hans-Peter Klenk; Markus Göker
Journal:  BMC Bioinformatics       Date:  2013-02-21       Impact factor: 3.169

6.  RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes.

Authors:  Thomas Brettin; James J Davis; Terry Disz; Robert A Edwards; Svetlana Gerdes; Gary J Olsen; Robert Olson; Ross Overbeek; Bruce Parrello; Gordon D Pusch; Maulik Shukla; James A Thomason; Rick Stevens; Veronika Vonstein; Alice R Wattam; Fangfang Xia
Journal:  Sci Rep       Date:  2015-02-10       Impact factor: 4.379

7.  Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads.

Authors:  Ryan R Wick; Louise M Judd; Claire L Gorrie; Kathryn E Holt
Journal:  PLoS Comput Biol       Date:  2017-06-08       Impact factor: 4.475

8.  The RAST Server: rapid annotations using subsystems technology.

Authors:  Ramy K Aziz; Daniela Bartels; Aaron A Best; Matthew DeJongh; Terrence Disz; Robert A Edwards; Kevin Formsma; Svetlana Gerdes; Elizabeth M Glass; Michael Kubal; Folker Meyer; Gary J Olsen; Robert Olson; Andrei L Osterman; Ross A Overbeek; Leslie K McNeil; Daniel Paarmann; Tobias Paczian; Bruce Parrello; Gordon D Pusch; Claudia Reich; Rick Stevens; Olga Vassieva; Veronika Vonstein; Andreas Wilke; Olga Zagnitko
Journal:  BMC Genomics       Date:  2008-02-08       Impact factor: 3.969

9.  The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST).

Authors:  Ross Overbeek; Robert Olson; Gordon D Pusch; Gary J Olsen; James J Davis; Terry Disz; Robert A Edwards; Svetlana Gerdes; Bruce Parrello; Maulik Shukla; Veronika Vonstein; Alice R Wattam; Fangfang Xia; Rick Stevens
Journal:  Nucleic Acids Res       Date:  2013-11-29       Impact factor: 16.971

10.  Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors:  Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal:  Bioinformatics       Date:  2014-04-01       Impact factor: 6.937
