
Complete genome sequence of the sand-sediment actinobacterium Nocardioides dokdonensis FR1436T.

Min-Jung Kwak1, Soon-Kyeong Kwon1, Jihyun F Kim1,2.   

Abstract

Nocardioides dokdonensis, belonging to the class Actinobacteria, was first isolated from sand sediment of a beach in Dokdo, Korea, in 2005. In this study, we determined the genome sequence of FR1436, the type strain of N. dokdonensis, and analyzed its gene contents. The genome sequence is the second complete one in the genus Nocardioides after that of Nocardioides sp. JS614. It is composed of a 4,376,707-bp chromosome with a G + C content of 72.26%. From the genome sequence, 4,104 CDSs, three rRNA operons, 51 tRNAs, and one tmRNA were predicted, and 71.38% of the genes were assigned putative functions. Through the sequence analysis, dozens of genes involved in steroid metabolism, especially its degradation, were detected. Most of the identified genes were located in large gene clusters, which showed high similarities with the gene clusters in Pimelobacter simplex VKM Ac-2033D. Genomic features of N. dokdonensis associated with steroid catabolism indicate that it could be used for research and application of steroids in science and industry.


Keywords:  Cholesterol; Corynebacteria; Nocardioidaceae; Propionibacteria; Steroid medicine

Year:  2017        PMID: 28770029      PMCID: PMC5526307          DOI: 10.1186/s40793-017-0257-z

Source DB:  PubMed          Journal:  Stand Genomic Sci        ISSN: 1944-3277


Introduction

Bacteria in the genus Nocardioides were first isolated from soil in 1976 [1], and more than 90 species with validly published names have since been described from diverse terrestrial and aquatic environments such as soil, wastewater, plant roots, groundwater, beach sand, and marine sediment [2-10]. The genus was originally classified as a member of the order Actinomycetales in the phylum Actinobacteria, but was recently reclassified into the order Propionibacteriales [11]. Actinobacteria, also called Gram-positive high G + C bacteria, comprise diverse bacterial groups that are capable of a variety of secondary metabolism, including biosynthesis of antibiotics and degradation of harmful compounds [12, 13]. Members of the genus Nocardioides are also known to utilize several kinds of recalcitrant compounds such as alkanes [14], atrazine [15], phenanthrene [16], trinitrophenol [17], and vinyl chloride [18]. Despite almost 100 species with validly published names and their useful features associated with secondary metabolism, only draft genome sequences are publicly available for the genus, apart from the complete genome of Nocardioides sp. JS614. N. dokdonensis was isolated from beach sand in Dokdo, a volcanic island located in the East Sea of Korea, in 2005 [19]. The East Sea is called a “mini-ocean” due to its oceanological properties [20] and is known to have a high microbial diversity [21]. To reveal distinguishing genomic features of Nocardioides species, we determined and analyzed the genome sequence of N. dokdonensis FR1436T.

Organism information

Classification and features

N. dokdonensis FR1436T, a Gram-positive, non-motile, and strictly aerobic bacterium, was isolated from sand sediment of Dokdo island in Korea [19]. The strain grows at temperatures of 4 to 30 °C (optimum, 25 °C), at pH 5.0 to 10.0 (optimum, 7.0), and at NaCl concentrations of 0 to 7% (w/v) (optimum, 0 to 3%) [19]. Its colonies are about 1.0–2.0 mm in size on TSA medium after incubation for 3 days at 25 °C. Cells are 1.2–1.8 μm long and 0.6–0.9 μm wide [19] (Fig. 1). FR1436 can utilize adonitol, glycerol, melezitose, melibiose, ribose, sodium acetate, sodium citrate, sodium propionate, and sodium pyruvate as a sole carbon source [19]. Minimum information about the genome sequence (MIGS) for FR1436 is given in Table 1.
Fig. 1

Transmission electron microscopic image of N. dokdonensis FR1436

Table 1

Classification and general features of N. dokdonensis FR1436 according to the MIGS recommendations [39]

MIGS ID | Property | Term | Evidence code (a)
 | Classification | Domain Bacteria | TAS [40]
 | | Phylum Actinobacteria | TAS [41]
 | | Class Actinobacteria | TAS [42]
 | | Order Propionibacteriales | TAS [11]
 | | Family Nocardioidaceae | TAS [11]
 | | Genus Nocardioides | TAS [43]
 | | Species Nocardioides dokdonensis | TAS [19]
 | | Strain FR1436 | TAS [19]
 | Gram stain | Gram-positive | TAS [19]
 | Cell shape | Rod | TAS [19]
 | Motility | Non-motile | TAS [19]
 | Sporulation | Nonsporulating | TAS [19]
 | Temperature range | 4 to 30 °C | TAS [19]
 | Optimum temperature | 25 °C | TAS [19]
 | pH range; Optimum | 5.0 to 10.0; 7.0 | TAS [19]
 | Carbon source | Adonitol, glycerol, melezitose, melibiose, ribose, sodium acetate, sodium citrate, sodium propionate, sodium pyruvate | TAS [19]
MIGS-6 | Habitat | Sand sediment | TAS [19]
MIGS-6.3 | Salinity | 0 to 7% (w/v) | TAS [19]
MIGS-22 | Oxygen requirement | Strictly aerobic | TAS [19]
MIGS-15 | Biotic relationship | Free-living | TAS [19]
MIGS-14 | Pathogenicity | Unknown | NAS
MIGS-4 | Geographic location | Republic of Korea | TAS [19]
MIGS-5 | Sample collection | 2008 | TAS [19]
MIGS-4.1 | Latitude | 37° 05′ N | TAS [19]
MIGS-4.2 | Longitude | 131° 13′ E | TAS [19]
MIGS-4.4 | Altitude | Not reported | NAS

(a) Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [44].

Phylogenetically, N. dokdonensis belongs to the family Nocardioidaceae of the order Propionibacteriales. A phylogenetic tree based on the 16S rRNA genes of the type strains in the genus Nocardioides (Fig. 2) shows that FR1436 forms a sister clade with a species that was isolated from soil and shares a common ancestor with three further species, one of which is N. salarius.
Fig. 2

Phylogenetic relationship of the species in Nocardioides. A neighbor-joining tree based on the 16S rRNA gene was generated using MEGA 5, and the Jukes-Cantor model was used to calculate evolutionary distances from a comparison of 1275 nucleotides. Bootstrap values (percentages of 1000 replications) greater than 50% are shown at each node; Nocardia asteroides NBRC 15531 (BAFO01000006) was used as an outgroup. The scale bar represents 0.01 nucleotide substitutions per site. Accession numbers of the 16S rRNA genes are given in parentheses. Species for which genome sequences are available are indicated in bold.
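
The Jukes-Cantor correction named in the Fig. 2 legend can be illustrated with a minimal Python sketch. It converts the observed proportion of differing sites p between two aligned sequences into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). The function name, the toy sequences, and the choice to skip gaps and ambiguous bases are assumptions made for illustration; this is not the MEGA 5 implementation that was used for the tree.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    # Assumption: only columns where both sequences carry an unambiguous
    # base are compared; gaps and ambiguity codes are skipped.
    valid = set("ACGT")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid
    ]
    if not pairs:
        raise ValueError("no comparable sites")

    # Observed proportion of differing sites (p-distance).
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")

    # Correct for multiple substitutions at the same site.
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy example (hypothetical sequences, not taken from the 16S rRNA alignment).
toy_distance = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA")
print(f"Jukes-Cantor distance: {toy_distance:.4f}")
```

For closely related sequences the corrected distance stays close to the raw proportion of differences, which is why the scale bar in Fig. 2 can be read as roughly one substitution per 100 sites.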


Genome sequencing information

Genome project history

As part of a project that investigates the genomic and metabolic features of bacterial isolates in and around Dokdo, the genome sequencing and analysis of FR1436 were performed at the Laboratory of Microbial Genomics and Systems/Synthetic Biology at Yonsei University. The complete genome sequence of FR1436T (= KCTC 19309T = JCM 14815T) has been deposited in GenBank under the accession number CP015079. The BioProject accession number is PRJNA191956. A summary of the genome project is provided in Table 2.
Table 2

Project information

MIGS ID | Property | Term
MIGS-31 | Finishing quality | Complete
MIGS-28 | Libraries used | A 20-kb library
MIGS-29 | Sequencing platforms | PacBio RS II system
MIGS-31.2 | Fold coverage | 355.4×
MIGS-30 | Assemblers | SMRTpipe HGAP 3.0
MIGS-32 | Gene calling method | Prokka
 | Locus Tag | I601
 | Genbank ID | CP015079
 | Genbank Date of Release | March 31, 2016
 | GOLD ID | Gp0037383
 | BIOPROJECT | PRJNA191956
MIGS-13 | Source Material Identifier | FR1436
 | Project relevance | Environmental, soil bacterium

Growth conditions and genomic DNA preparation

FR1436 was streaked on trypticase soy agar medium (Difco, 236950) and incubated at 25 °C for 3 days. A single colony was inoculated into trypticase soy broth and incubated at 25 °C for 2 days. Cells in the exponential phase were harvested, and genomic DNA was extracted using the Wizard Genomic DNA Purification Kit (Promega, USA) according to the manufacturer’s protocol.

Genome sequencing and assembly

Genome sequencing of FR1436 was performed using the PacBio RS II System (Macrogen, Inc., Republic of Korea) with a 20-kb library and C4-P6 chemistry. After sequencing and quality trimming of the reads, a total of 200,435 continuous long reads comprising 1,551,246,448 base pairs were obtained. De novo assembly was conducted with SMRTpipe HGAP, and scaffolding and gap filling were performed with SMRTpipe AHA. Finally, consensus sequences were generated with SMRTpipe Quiver.
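
The read statistics above can be turned into a mean read length and an approximate fold coverage with simple arithmetic. The sketch below is a back-of-the-envelope check only: it uses the read count and base total from this section and the chromosome length from the Genome properties section, and the function names are illustrative. The plain ratio of sequenced bases to genome length comes out near 354×, close to but not identical with the 355.4× listed in Table 2, which may have been calculated from a slightly different base count or denominator.

```python
# Figures reported in the text.
TOTAL_BASES = 1_551_246_448   # bases in the trimmed continuous long reads
READ_COUNT = 200_435          # number of continuous long reads
GENOME_SIZE_BP = 4_376_707    # length of the finished chromosome


def mean_read_length(total_bases: int, read_count: int) -> float:
    """Average read length in bp."""
    return total_bases / read_count


def fold_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Sequenced bases divided by genome length."""
    return total_bases / genome_size_bp


read_length = mean_read_length(TOTAL_BASES, READ_COUNT)
coverage = fold_coverage(TOTAL_BASES, GENOME_SIZE_BP)

print(f"Mean read length: {read_length:,.1f} bp")
print(f"Approximate fold coverage: {coverage:.1f}x")
```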

Genome annotation

Structural gene prediction and functional annotation were conducted using the Prokka program [22]. For more accurate annotation, we additionally performed a functional assignment of the predicted protein-coding sequences using blastp against the Pfam, UniRef90, KEGG, COG, and GenBank NR databases. tRNAscan-SE [23] and RNAmmer [24] were used for prediction of transfer RNAs and ribosomal RNAs, respectively. Assignment to the Clusters of Orthologous Groups was conducted with RPS-BLAST against the COG database with an E-value cutoff of 1e-02. Clustered regularly interspaced short palindromic repeats were predicted with CRISPRFinder [25]. Proteins containing signal peptides and transmembrane helices were predicted using SignalP [26] and TMHMM [27], respectively. Secondary metabolite biosynthetic genes were predicted using the antiSMASH program [28].
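
The E-value cutoff used for the COG assignment can be sketched as a small filter over search hits. The example below assumes that the RPS-BLAST results were saved in the standard 12-column tabular BLAST format, in which the E-value is the eleventh column; the text does not state the output format, so this is an assumption. The function name and the query and subject identifiers are hypothetical, and keeping only the lowest-E-value hit per query is one possible assignment rule, not necessarily the one applied in this study.

```python
from typing import Dict, Iterable, Tuple

EVALUE_CUTOFF = 1e-2  # cutoff stated in the text


def best_hits(
    lines: Iterable[str], cutoff: float = EVALUE_CUTOFF
) -> Dict[str, Tuple[str, float]]:
    """Return the lowest-E-value hit below the cutoff for each query."""
    best: Dict[str, Tuple[str, float]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])  # 11th column of the tabular format
        if evalue >= cutoff:
            continue  # hit does not pass the cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical hits for illustration only.
example_hits = [
    "cds_0001\tCOG0001\t45.0\t200\t110\t2\t1\t200\t5\t204\t1e-50\t180",
    "cds_0001\tCOG0002\t30.0\t150\t105\t3\t10\t160\t1\t150\t5e-03\t40",
    "cds_0002\tCOG0003\t25.0\t80\t60\t1\t1\t80\t1\t80\t0.5\t20",
]

for query, (subject, evalue) in best_hits(example_hits).items():
    print(query, subject, evalue)
```

In this toy input the second query has no hit below the cutoff, so it would be counted among the genes not assigned to a COG.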

Genome properties

FR1436 has a single chromosome of 4,376,707 bp with a G + C content of 72.26% (Fig. 3 and Table 3). The genome contains 4165 genes, comprising 4104 CDSs, three rRNA operons, 51 tRNAs, and one tmRNA. Analysis of KEGG pathways indicated that all of the genes involved in glycolysis, gluconeogenesis, and the citrate cycle are present and well conserved in the genome of FR1436. Among the predicted genes, 71.38% were assigned putative functions, and 2832 CDSs were functionally assigned to COG categories (Table 4). Ten putative CRISPR repeats were predicted using the CRISPRFinder program, but no CRISPR-associated proteins were found next to the predicted repeat sequences. Two gene clusters possibly associated with secondary metabolism were predicted using the antiSMASH program. One cluster (accession numbers ANH38050 to ANH38087) has genes associated with the phenylacetate catabolic pathway [29], and the other (accession numbers ANH40163 to ANH40204) has genes for type III polyketide synthases.
Fig. 3

Circular representation of the genome of N. dokdonensis FR1436. The first and second circles from inside indicate COG-assigned genes in color codes. Black circle represents the G + C content and red-yellow circle is for the G + C skew. Innermost, blue-scattered spots are tRNA genes and red-scattered spots indicate rRNA genes

Table 3

Genome statistics

Attribute | Value | % of total
Genome size (bp) | 4,376,707 | 100
DNA coding (bp) | 4,059,326 | 92.75
DNA G + C (bp) | 3,162,427 | 72.26
DNA scaffolds | 1 |
Total genes | 4165 | 100
Protein coding genes | 4104 | 98.54
RNA genes | 61 | 1.46
Pseudogenes | 0 | 0
Genes in internal clusters | ND* | ND*
Genes with function prediction | 2973 | 71.38
Genes assigned to COGs | 2832 | 69.01
Genes with Pfam domains | 2584 | 62.04
Genes with signal peptides | 343 | 8.24
Genes with transmembrane helices | 1011 | 24.27
CRISPR repeats | 10 | 10

*ND not determined
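
The figures in the Genome properties text and in Table 3 can be cross-checked with a few lines of Python. The sketch below recomputes the gene tally and several percentages from the reported counts. One assumption is not stated in the text and is labeled in the code: each rRNA operon is taken to carry three rRNA genes (16S, 23S, and 5S), which is consistent with the 61 RNA genes in Table 3. Variable and function names are illustrative.

```python
# Counts taken from the Genome properties text and Table 3.
GENOME_SIZE_BP = 4_376_707
CODING_BP = 4_059_326
GC_BP = 3_162_427
CDS = 4104
RRNA_OPERONS = 3
TRNA = 51
TMRNA = 1
FUNCTION_PREDICTED = 2973
COG_ASSIGNED = 2832

# Assumption (not stated in the text): one 16S, one 23S, and one 5S rRNA
# gene per operon, i.e. three rRNA genes per operon.
GENES_PER_RRNA_OPERON = 3


def percent(part: float, whole: float) -> float:
    """Percentage rounded to two decimals, as in Table 3."""
    return round(100 * part / whole, 2)


rna_genes = RRNA_OPERONS * GENES_PER_RRNA_OPERON + TRNA + TMRNA
total_genes = CDS + rna_genes

report = {
    "RNA genes": rna_genes,
    "Total genes": total_genes,
    "G + C content (%)": percent(GC_BP, GENOME_SIZE_BP),
    "Coding density (%)": percent(CODING_BP, GENOME_SIZE_BP),
    "Protein coding genes (%)": percent(CDS, total_genes),
    "Genes with function prediction (%)": percent(FUNCTION_PREDICTED, total_genes),
    "CDSs assigned to COGs (%)": percent(COG_ASSIGNED, CDS),
}

for label, value in report.items():
    print(f"{label}: {value}")
```

Under these assumptions the tally gives 61 RNA genes and 4165 genes in total, in line with Table 3. The 69.01% listed for COG-assigned genes appears to use the 4104 protein-coding genes as the denominator, whereas the other gene percentages use all 4165 genes.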

Table 4

Number of protein coding genes of N. dokdonensis FR1436 associated with the general COG functional categories

Code | Value | Percentage* | Description
J | 151 | 3.68 | Translation, ribosomal structure and biogenesis
A | 1 | 0.02 | RNA processing and modification
K | 212 | 5.17 | Transcription
L | 164 | 4.00 | Replication, recombination, and repair
B | 1 | 0.02 | Chromatin structure and dynamics
D | 25 | 0.61 | Cell cycle control, cell division and chromosome partitioning
V | 41 | 1.00 | Defense mechanisms
T | 116 | 2.83 | Signal transduction mechanisms
M | 124 | 3.02 | Cell wall/membrane/envelope biogenesis
N | 3 | 0.07 | Cell motility
U | 29 | 0.71 | Intracellular trafficking, and secretion
O | 111 | 2.70 | Posttranslational modification, protein turnover, chaperones
C | 240 | 5.85 | Energy production and conversion
G | 145 | 3.53 | Carbohydrate transport and metabolism
E | 317 | 7.72 | Amino acid transport and metabolism
F | 75 | 1.83 | Nucleotide transport and metabolism
H | 106 | 2.58 | Coenzyme metabolism
I | 232 | 5.65 | Lipid metabolism
P | 133 | 3.24 | Inorganic ion transport and metabolism
Q | 86 | 2.10 | Secondary metabolites biosynthesis, transport, and catabolism
R | 321 | 7.82 | General function prediction only
S | 199 | 4.85 | Function unknown
- | 1272 | 30.99 | Not in COGs

*The percentages are based on the total number of protein-coding genes in the genome
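
As a consistency check on Table 4, the per-category counts can be summed and converted to percentages of the 4104 protein-coding genes. The short Python sketch below does this with the counts copied from the table; the variable names are illustrative.

```python
# Counts per COG category, copied from Table 4.
COG_COUNTS = {
    "J": 151, "A": 1, "K": 212, "L": 164, "B": 1, "D": 25, "V": 41,
    "T": 116, "M": 124, "N": 3, "U": 29, "O": 111, "C": 240, "G": 145,
    "E": 317, "F": 75, "H": 106, "I": 232, "P": 133, "Q": 86,
    "R": 321, "S": 199,
}
NOT_IN_COGS = 1272
PROTEIN_CODING_GENES = 4104

assigned_total = sum(COG_COUNTS.values())

# Percentage of all protein-coding genes, as in the table footnote.
cog_percentages = {
    code: round(100 * count / PROTEIN_CODING_GENES, 2)
    for code, count in COG_COUNTS.items()
}

# Categories ranked from most to least populated.
ranked = sorted(COG_COUNTS, key=COG_COUNTS.get, reverse=True)

print("Assigned to COG categories:", assigned_total)
print("Assigned plus unassigned:", assigned_total + NOT_IN_COGS)
print("Three largest categories:", ranked[:3])
```

The category counts add up to the 2832 COG-assigned CDSs given in Table 3, and together with the 1272 genes not in COGs they account for all 4104 protein-coding genes. Apart from the general category R, amino acid transport and metabolism (E), energy production and conversion (C), and lipid metabolism (I) are the most populated categories.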


Insights from the genome sequence

In the genome of FR1436, dozens of steroid-degrading genes were detected (Additional file 1). Steroids are essential biomolecules in living organisms; their major functions include maintaining membrane fluidity as components of the cell membrane and controlling cell metabolism as signaling molecules [30]. Moreover, steroid medicines are used to treat a number of diseases ranging from inflammation to cancer [31]. The molecular backbone of steroids is composed of three cyclohexane rings and one cyclopentane ring, to which diverse side chains are attached, endowing the molecules with diverse functions [32]. Catabolic pathways of steroid degradation or modification have been analyzed in depth for some genera in the order Corynebacteriales [33-35]. Several large gene clusters, which have potential binding sites for transcriptional regulators associated with steroid catabolism in their promoters, were predicted in the genome of P. simplex VKM Ac-2033D [36]. In the genome of FR1436, gene cluster A, which is known to be involved in degrading steroid rings A/B, and gene cluster B, which is involved in degrading side chains, were detected (Fig. 4). However, in FR1436, cluster A is split into two large gene clusters, and an additional mce gene cluster, which is involved in steroid uptake [37], was detected (Additional file 1). In P. simplex VKM Ac-2033D, cluster A is located approximately 350 kb downstream of cluster B, whereas in FR1436, cluster A is located 6 kb downstream. Moreover, two kstR and 11 kstR2 genes, which encode transcriptional regulators of the TetR family and are reported to regulate cholesterol metabolism in mycobacteria [38], were detected (Additional file 1). Besides the genes in clusters A and B, genes encoding 3-beta-hydroxysteroid dehydrogenase (ANH36717 and ANH37882), 3-alpha-hydroxysteroid dehydrogenase (ANH37023 and ANH37488), and steroid delta-isomerase (ANH36955) were also detected in the genome of FR1436.
Additionally, all genes involved in the degradation of cholesterol to HIP-CoA were identified (Fig. 5). These results indicate that the genus Nocardioides can be useful for research on and utilization of steroid metabolism.
Fig. 4

Steroid-degrading gene clusters. The gene clusters are compared with those of P. simplex VKM Ac-2033D [35], whose genes associated with steroid degradation are indicated by grey arrows. Genes associated with steroid degradation in N. dokdonensis FR1436 are represented by black arrows. Sky blue indicates genes that are located in the cluster but for which little information related to steroid degradation is available. White arrows indicate genes encoding hypothetical proteins. a. Gene cluster A, involved in degradation of steroid rings A and B [35]. Accession numbers of the genes in P. simplex VKM Ac-2033D are AIY19941 to AIY17666. Accession numbers of the genes in N. dokdonensis FR1436 are ANH39848 to ANH39880 and ANH37060 to ANH37075. b. Gene cluster B, involved in degradation of the side chains of steroids [35]. Accession numbers of the genes are AIY19891 to AIY17347 for P. simplex VKM Ac-2033D and ANH39925 to ANH39888 for N. dokdonensis FR1436.

Fig. 5

Cholesterol degradation pathway. The metabolic pathway is based on KEGG pathway map 00984. Blue indicates accession numbers of the genes involved in cholesterol degradation in N. dokdonensis FR1436. DSHA, 3-hydroxy-5,9,17-trioxo-4,5:9,10-disecoandrosta-1(10),2-dien-4-oate; HIP, 9,17-dioxo-1,2,3,4,10,19-hexanorandrostan-5-oic acid.


Conclusions

Steroids are important biomolecules in living organisms and carry out diverse roles, from components of the cell membrane to signaling molecules [30]. Moreover, steroids are being used to treat various diseases ranging from inflammation to cancer [31]. These observations indicate that research on the modification of steroid compounds has considerable potential to improve human health. To date, studies on bacterial steroid metabolism have mainly focused on the order Corynebacteriales [33-35]. Recently, genome analysis of P. simplex VKM Ac-2033D in the order Propionibacteriales revealed several kinds of gene clusters associated with steroid degradation [36]. In this study, we determined the complete genome sequence of FR1436 and analyzed it for the presence of genes related to steroid metabolism. In the genome of FR1436, dozens of genes associated with steroid catabolism were detected in large gene clusters. These results suggest that bacteria in the genus Nocardioides are promising candidates for steroid research and related fields of industry.
  39 in total

Review 1.  Steroid hormone receptors and drug discovery: therapeutic opportunities and assay designs.

Authors:  Chang Bai; Azriel Schmidt; Leonard P Freedman
Journal:  Assay Drug Dev Technol       Date:  2003-12       Impact factor: 1.738

2.  Nocardioides endophyticus sp. nov. and Nocardioides conyzicola sp. nov., isolated from herbaceous plant roots.

Authors:  Ji-Hye Han; Tae-Su Kim; Yochan Joung; Mi Na Kim; Kee-Sun Shin; Taeok Bae; Seung Bum Kim
Journal:  Int J Syst Evol Microbiol       Date:  2013-08-29       Impact factor: 2.747

3.  A hidden Markov model for predicting transmembrane helices in protein sequences.

Authors:  E L Sonnhammer; G von Heijne; A Krogh
Journal:  Proc Int Conf Intell Syst Mol Biol       Date:  1998

4.  Prokka: rapid prokaryotic genome annotation.

Authors:  Torsten Seemann
Journal:  Bioinformatics       Date:  2014-03-18       Impact factor: 6.937

Review 5.  Steroidogenic enzymes: structure, function, and role in regulation of steroid hormone biosynthesis.

Authors:  I Hanukoglu
Journal:  J Steroid Biochem Mol Biol       Date:  1992-12       Impact factor: 4.292

6.  Two distinct monooxygenases for alkane oxidation in Nocardioides sp. strain CF8.

Authors:  N Hamamura; C M Yeager; D J Arp
Journal:  Appl Environ Microbiol       Date:  2001-11       Impact factor: 4.792

7.  Nocardioides dokdonensis sp. nov., an actinomycete isolated from sand sediment.

Authors:  Seong Chan Park; Keun Sik Baik; Mi Sun Kim; Jongsik Chun; Chi Nam Seong
Journal:  Int J Syst Evol Microbiol       Date:  2008-11       Impact factor: 2.747

8.  Nocardioides flava sp. nov., isolated from rhizosphere of poppy plant, Republic of Korea.

Authors:  Hina Singh; Chang Shik Yin
Journal:  Arch Microbiol       Date:  2016-01-22       Impact factor: 2.552

9.  Nocardioides basaltis sp. nov., isolated from black beach sand.

Authors:  Kyoung-Ho Kim; Seong Woon Roh; Ho-Won Chang; Young-Do Nam; Jung-Hoon Yoon; Che Ok Jeon; Hee-Mock Oh; Jin-Woo Bae
Journal:  Int J Syst Evol Microbiol       Date:  2009-01       Impact factor: 2.747

10.  Genome-wide bioinformatics analysis of steroid metabolism-associated genes in Nocardioides simplex VKM Ac-2033D.

Authors:  Victoria Y Shtratnikova; Mikhail I Schelkunov; Victoria V Fokina; Yury A Pekov; Tanya Ivashina; Marina V Donova
Journal:  Curr Genet       Date:  2016-02-01       Impact factor: 3.886
