
Analysis of the complete genome sequence of Nocardia seriolae UTF1, the causative agent of fish nocardiosis: The first reference genome sequence of the fish pathogenic Nocardia species.

Motoshige Yasuike1, Issei Nishiki1, Yuki Iwasaki1, Yoji Nakamura1, Atushi Fujiwara1, Yoshiko Shimahara2, Takashi Kamaishi3, Terutoyo Yoshida4, Satoshi Nagai1, Takanori Kobayashi5, Masaya Katoh1.   

Abstract

Nocardiosis caused by Nocardia seriolae is one of the major threats in the aquaculture of Seriola species (yellowtail, S. quinqueradiata; amberjack, S. dumerili; and kingfish, S. lalandi) in Japan. Here, we report the complete genome sequence of N. seriolae UTF1, isolated from a cultured yellowtail. The genome is a circular chromosome of 8,121,733 bp with a G+C content of 68.1% that encodes 7,697 predicted proteins. Among the predicted genes of N. seriolae UTF1, we found orthologs of virulence factors of pathogenic mycobacteria and human clinical Nocardia isolates involved in host cell invasion, modulation of phagocyte function and survival inside macrophages. These virulence factor candidates provide an essential basis for future studies by the fish nocardiosis research community on pathogenic mechanisms at the molecular level. We also found many potential antibiotic resistance genes on the N. seriolae UTF1 chromosome. Comparative analysis with the four existing complete genomes, N. farcinica IFM 10152, N. brasiliensis HUJEG-1, N. cyriacigeorgica GUH-2 and N. nova SH22a, revealed that 2,745 orthologous genes were present in all five Nocardia genomes (core genes) and 1,982 genes were unique to N. seriolae UTF1. In particular, the N. seriolae UTF1 genome contains a greater number of mobile elements and genes of unknown function, which account for the differences in structure and gene content from the other Nocardia genomes. In addition, many of the N. seriolae UTF1-specific genes were assigned to the ABC transport system. Because resources are limited in ocean environments, these N. seriolae UTF1-specific ABC transporters might facilitate adaptation strategies essential for survival in the marine environment. Thus, the availability of the complete N. seriolae UTF1 genome sequence will provide a valuable resource for comparative genomic studies of N. seriolae isolates, as well as new insights into the ecological and functional diversity of the genus Nocardia.

Year:  2017        PMID: 28257489      PMCID: PMC5336288          DOI: 10.1371/journal.pone.0173198

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Members of the genus Nocardia are Gram-positive, non-motile and aerobic actinomycetes belonging to the family Nocardiaceae. This genus contains more than 90 recognized species, which are widely distributed in both aquatic and terrestrial habitats [1]. Many species of this genus are known as causative agents of nocardiosis in humans and a variety of animals, causing various clinical diseases and, in some cases, high mortality rates [2, 3]. In aquatic environments, four species of Nocardia, N. asteroides, N. seriolae, N. salmonicida and N. crassostreae, have also been found in diseased aquatic animals [4]. In Japan, members of the genus Seriola, including yellowtail (S. quinqueradiata), amberjack (S. dumerili), and kingfish (S. lalandi), are the most produced and economically important aquaculture fish species. Nocardiosis, caused by N. seriolae [5] (initially reported as N. kampachi [6]), is one of the most serious economic threats in Seriola aquaculture. N. seriolae also infects other fish species, including both marine and freshwater fishes, and is found in other Asian countries [7]. To date, only two antibiotics, sulfamonomethoxine and sulfisozole sodium, are licensed for treatment of N. seriolae infections in Japan [8, 9]. Although these antibiotics are valuable for the control of nocardiosis, there are concerns about the emergence of antibiotic-resistant strains and about environmental impacts. Vaccination is thought to be another effective strategy for control of nocardiosis. However, the intracellular parasitic nature of N. seriolae makes development of vaccines for the disease difficult [10]. Complete genome sequences of pathogenic bacteria provide a powerful tool for understanding their biology, including mechanisms of bacterial pathogenicity and drug resistance, as well as for the development of new genetic and molecular approaches to disease control [11].

So far, four Nocardia strains have been fully sequenced, including three agents of human nocardiosis, N. farcinica IFM 10152 [12], N. brasiliensis HUJEG-1 [13] and N. cyriacigeorgica GUH-2 [14], and a rubber- and gutta-percha-degrading strain, N. nova SH22a, isolated from a root of Couma macrocarpa [15]. Although two draft sequences of N. seriolae isolates, ZJ0503 [16] and N-2927 [17], have been reported recently, these draft genome sequences consist of a large number of contigs: 319 contigs in 315 scaffolds for ZJ0503 [16] and 339 large contigs (>500 bp) for N-2927 [17]. Therefore, a complete genome sequence of N. seriolae is essential for robust annotation, for resolving the overall genome organization and for comparative genomics of this species [18]. In this study, using Single Molecule, Real-Time (SMRT) DNA sequencing [19, 20], the complete genome sequence of N. seriolae UTF1, isolated from a yellowtail that succumbed to nocardiosis in Japan, was determined and annotated. We explored the virulence factor and antibiotic resistance gene candidates in the N. seriolae UTF1 genome. In addition, to investigate genomic diversity, the N. seriolae UTF1 genome sequence was compared with the four existing complete genomes of Nocardia. This is, to the best of our knowledge, the first report of the complete genome of a fish pathogenic Nocardia species. This genomic information will serve as a reference genome data set for N. seriolae and could provide a basis for understanding the ecological and functional diversity of the genus Nocardia through future comparative studies.

Materials and methods

Genome sequencing and assembly

N. seriolae UTF1 was originally isolated from a cultured yellowtail (Seriola quinqueradiata) in 2008 (Miyazaki Prefecture, Japan). This isolate was cultured in Brain Heart Infusion Broth (Difco, Sparks, MD, USA) at 25°C for 5 days under constant shaking at 150 rpm. After treatment with lysozyme, genomic DNA was extracted using the Maxwell Cell DNA Purification Kit (Promega, Madison, WI, USA). The nucleotide sequence of N. seriolae UTF1 was determined on the Pacific Biosciences (PacBio) RS sequencing platform (Pacific Biosciences, Inc., CA, USA) at Tomy Digital Biology Co., Ltd (Tokyo, Japan). Briefly, genomic DNA (7 μg) was sheared using the g-TUBE (Covaris Inc., MA, USA) and a library was prepared using a DNA Template Prep Kit v2.0 (Pacific Biosciences) according to the manufacturer’s instructions. The library was run in four Single Molecule, Real-Time (SMRT) cells on a PacBio RS sequencer (Pacific Biosciences) using P4C2 chemistry and a 120-minute data collection mode. The PacBio RS platform generated 227,796 sequence reads (mean read length: 4,756 bp, N50 read length: 7,364 bp) totaling 1,083,416,572 bp, providing 133-fold sequencing coverage of the genome (Table 1). De novo assembly of the sequence reads was performed using the SMRT Analysis v2.2.0 software package (Pacific Biosciences) and resulted in two contigs with lengths of 6,093,704 bp and 2,035,833 bp.
Table 1

Sequencing statistics of the N. seriolae UTF1 genome.

| Statistic | Value |
|---|---|
| Number of Reads | 227,796 |
| N50 Read Length (bp) | 7,364 |
| Mean Read Length (bp) | 4,756 |
| Number of Bases | 1,083,416,572 |
| Average Reference Coverage | 113.01 |
| Number of contigs | 2 |

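
The read statistics quoted in the text can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the N50 helper uses the conventional definition (it is not taken from the SMRT Analysis software), and it does not attempt to reproduce the "Average Reference Coverage" value in Table 1, which is reported separately.

```python
def fold_coverage(total_bases, genome_size):
    """Sequencing depth as total sequenced bases divided by genome length."""
    return total_bases / genome_size


def n50(lengths):
    """Conventional N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no read lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Values taken from the text and Table 1.
TOTAL_BASES = 1_083_416_572
GENOME_SIZE = 8_121_733
NUM_READS = 227_796

coverage = fold_coverage(TOTAL_BASES, GENOME_SIZE)   # about 133-fold
mean_read_length = TOTAL_BASES / NUM_READS           # about 4,756 bp

# Toy read lengths (made up) to show how the N50 helper behaves.
toy_n50 = n50([2, 3, 4, 5, 6])

print(f"fold coverage: {coverage:.1f}")
print(f"mean read length: {mean_read_length:.0f} bp")
print(f"toy N50: {toy_n50}")
```

Dividing the total bases by the finished chromosome length gives the 133-fold figure stated in the text, and dividing by the read count gives the reported mean read length.
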
For genome finishing, the gaps between the two contigs were amplified by long-range PCR. Both gap-flanking sequences encoded ribosomal RNA (rRNA) operons; therefore, the PCR primer sets were designed to anneal outside the rRNA operon regions. Using the gap-flanking PCR primer sets shown in S1 Table, long-range PCR was conducted using Phusion High-Fidelity PCR master mix with HF buffer (Thermo Fisher Scientific, Waltham, MA) following the manufacturer’s protocol. The amplification was performed using 50 ng of extracted total genomic DNA of N. seriolae UTF1 with an initial denaturation step of 30 s at 98°C, followed by a two-step PCR procedure (35 cycles of 98°C for 10 s and 72°C for 3 min) and a final extension of 10 min at 72°C. The long-range PCR successfully amplified fragments of approximately 7–8 kb corresponding to both gap regions, and the two PCR products were purified with the QIAquick PCR Purification Kit (QIAGEN, Hilden, Germany). The purified long-range PCR products (100 ng) were shotgun-sequenced using the Ion PGM platform (Thermo Fisher Scientific) and the sequence reads were assembled using V-GAP [21]. These two assembled sequences aligned to the two contigs, and consequently the complete nucleotide sequence of N. seriolae UTF1, comprising a single circular chromosome, was determined. The final sequence was submitted to DDBJ under accession number AP017900.

Genome annotation

The complete genome sequence of N. seriolae UTF1 was annotated using the Rapid Annotations using Subsystems Technology (RAST) server v2.0 with SEED data [22] and using BLASTP [23] against the NCBI RefSeq protein data [24] (E-value threshold of 1E-5). Functional categories of the predicted genes of N. seriolae UTF1 and the four other Nocardia spp. were assigned with the Clusters of Orthologous Groups of proteins (COGs) database [25] using COGsoft [26] and with the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database using the KEGG Orthology And Links Annotation (BlastKOALA) program [27]. Virulence factors of N. seriolae UTF1 were predicted by BLASTP against the Virulence Factor Database (VFDB) set A (a core dataset that covers genes associated with experimentally verified VFs) [28] and the virulence gene data set of N. farcinica IFM 10152 at the Nocardia farcinica Genome Project Page (http://nocardia.nih.go.jp/), using an E-value cutoff of 1E-5. These BLAST results were then filtered using the criteria of query coverage per HSP (qcovhsp) greater than 80% and sequence similarity greater than 50%. The organization of the mce operon structures was identified based on the annotation result from the RAST server and visualized using Easyfig version 2.1 [29]. Genes for antibiotic resistance were predicted using the Antibiotic Resistance Genes Database (ARDB) [30] with default parameters.
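
The hit-filtering criteria described above can be sketched as a small post-processing step. This is not the authors' script: it assumes, for illustration, that BLASTP was run with the tabular format `-outfmt "6 qseqid sseqid evalue qcovhsp ppos"`, that the `ppos` column (percentage of positive-scoring matches) stands in for "sequence similarity", and that the E-value comparison is strict. The ORF and subject identifiers in the example are made up.

```python
import csv
import io

# Assumed column order of the tabular BLAST output (see lead-in).
COLUMNS = ["qseqid", "sseqid", "evalue", "qcovhsp", "ppos"]


def filter_hits(handle, max_evalue=1e-5, min_qcovhsp=80.0, min_similarity=50.0):
    """Keep hits with E-value < max_evalue, qcovhsp > 80% and similarity > 50%."""
    kept = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        passes = (
            float(hit["evalue"]) < max_evalue
            and float(hit["qcovhsp"]) > min_qcovhsp
            and float(hit["ppos"]) > min_similarity
        )
        if passes:
            kept.append(hit)
    return kept


# Four made-up hits: only the first satisfies all three criteria.
example = io.StringIO(
    "ORF-A\tvf_1\t1e-50\t95\t79.9\n"   # passes
    "ORF-B\tvf_2\t1e-3\t90\t70.0\n"    # fails the E-value cutoff
    "ORF-C\tvf_3\t1e-20\t60\t85.0\n"   # fails query coverage
    "ORF-D\tvf_4\t1e-30\t88\t45.0\n"   # fails similarity
)
passing = [hit["qseqid"] for hit in filter_hits(example)]
print(passing)
```
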

Comparative genomics

To identify and characterize the gap regions in the previously reported N. seriolae draft genome sequences, the contig sequences of N. seriolae ZJ0503 (GenBank: NZ_JNCT01000000, 319 contigs) and N-2927 (GenBank: NZ_BAWD02000000, 339 contigs) were aligned to the complete sequence of the N. seriolae UTF1 genome using MUMmer version 3.22 [31]. The sequences of the regions left uncovered in the comparisons with ZJ0503 and with N-2927 were subjected to BLASTX searches against the NCBI RefSeq database (E-value threshold of 1E-5). Four available complete genome sequences of the genus Nocardia, N. farcinica IFM 10152 (GenBank: AP006618), N. brasiliensis HUJEG-1 (GenBank: CP003876), N. cyriacigeorgica GUH-2 (GenBank: FO082843) and N. nova SH22a (GenBank: CP006850), were used for the comparative genomic analysis with N. seriolae UTF1. For visualization of circular genome comparisons, the BLASTN-based ring image was generated by BLAST Ring Image Generator (BRIG) version 0.95 [32], with N. seriolae UTF1 as the reference. Dot plots of complete nucleotide sequences were generated with MUMmer version 3.22, its mummerplot script and the Unix program gnuplot [31]. Average Nucleotide Identity (ANI) and Amino Acid Identity (AAI) were calculated using the ANI calculator and AAI calculator, respectively (default settings) [33]. Orthologous and species-specific genes were identified using OrthoMCL [34]. Functional profiles of N. seriolae UTF1 and the four other complete Nocardia genomes were compared following the method of Verma et al. (2014) [35] with some modifications. The top 50 SEED subsystems [36, 37] from the RAST analysis and the KEGG modules assigned by BlastKOALA were clustered hierarchically by the abundance of genes in each category among the Nocardia genomes using Cluster 3.0 software [38]. The results were visualized with Java Treeview version 1.1.6r4 [39].
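
The notion of core and strain-specific genes can be made concrete with a short sketch that classifies ortholog groups by the genomes they contain. It assumes the usual OrthoMCL group-file layout (`group_id: taxon|gene taxon|gene ...`); the taxon codes (`nser`, `nfar`, `nbra`, `ncyr`, `nnov`) and gene names are hypothetical placeholders, and the function names are not part of OrthoMCL. Genes that fall into no group at all (singletons) do not appear in such a file and would have to be counted separately as strain-specific.

```python
def parse_groups(lines):
    """Parse 'group_id: taxon|gene ...' lines into {group_id: {taxon: [genes]}}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        by_taxon = {}
        for member in members.split():
            taxon, gene = member.split("|", 1)
            by_taxon.setdefault(taxon, []).append(gene)
        groups[group_id.strip()] = by_taxon
    return groups


def core_groups(groups, all_taxa):
    """Groups with at least one member from every genome (core genes)."""
    required = set(all_taxa)
    return [gid for gid, taxa in groups.items() if required <= set(taxa)]


def specific_groups(groups, taxon):
    """Groups whose members all come from a single genome."""
    return [gid for gid, taxa in groups.items() if set(taxa) == {taxon}]


# Hypothetical taxon codes for the five genomes compared in the text.
TAXA = ["nser", "nfar", "nbra", "ncyr", "nnov"]

example_lines = [
    "G1: nser|a1 nfar|b1 nbra|c1 ncyr|d1 nnov|e1",  # present in all five
    "G2: nser|a2 nser|a3",                          # paralogs in one genome only
    "G3: nser|a4 nfar|b2",                          # shared by two genomes
]
groups = parse_groups(example_lines)
core = core_groups(groups, TAXA)
utf1_only = specific_groups(groups, "nser")
print(core, utf1_only)
```
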

Results and discussion

Assembly and general genomic features

The relatively large genome size (6–10 Mbp) and high GC content (approximately 70%) of the genus Nocardia [40] make the completion of their genome sequences difficult. Because of this genomic complexity, draft assemblies of N. seriolae from an Illumina MiSeq platform [16] or a Roche 454-GS Junior System combined with Illumina reads (NCBI Sequence Read Archive, DRX020602) [8, 17] consist of a large number of contigs. It should be noted that our initial draft assembly of the N. seriolae UTF1 genome using a 454 GS-FLX+ System resulted in a total of 134 scaffolds comprising 365 contigs (unpublished data). The PacBio RS platform, a third-generation sequencing technology based on single-molecule real-time (SMRT) sequencing, can achieve unbiased GC coverage with extremely long reads [19, 20]. This platform has been employed successfully for sequencing complex bacterial genomes, such as those with extremely high GC content [41] and those with multiple chromosomes containing more repetitive sequences [42]. In this study, we determined the complete genome sequence of N. seriolae UTF1 using the PacBio RS. Our de novo assembly with 133-fold genome coverage of PacBio RS long reads (mean read length: 4,756 bp, N50 read length: 7,364 bp) produced two large contigs with lengths of 6,093,704 bp and 2,035,833 bp (Table 1). The flanking sequences at both ends of the two assembled contigs encoded ribosomal RNA genes. Since each copy of the rRNA operon (16S-23S-5S rRNA) is approximately 5 kb long, the current sequence reads, with an average length of 4,756 bp, may not have been able to fully span these rRNA operon regions. It should be noted that the newest PacBio RS II sequencer generates an average read length of 10–15 kb [43], and is therefore expected to be able to assemble across these rRNA operon regions.

After closing the gaps by long-range PCR, the complete nucleotide sequence of N. seriolae UTF1, comprising a circular chromosome of 8,121,733 bp with a G+C content of 68.1%, was determined (Fig 1 and Table 2). The complete genome of N. seriolae UTF1 contains 7,697 predicted coding DNA sequences (CDSs) with an average length of 909 bp, 4 rRNA operons, and 62 transfer RNA (tRNA) sequences (Table 2). Genome sizes vary among the fully sequenced Nocardia genomes, ranging from 6,021,225 bp for N. farcinica IFM 10152 to 9,436,348 bp for N. brasiliensis HUJEG-1, and the numbers of CDSs also vary, ranging from 5,491 for N. cyriacigeorgica GUH-2 to 8,414 for N. brasiliensis HUJEG-1 (Table 2). In addition, the N. seriolae UTF1 genome has four rRNA operons and 62 tRNAs, whereas three rRNA operons and 49–53 tRNAs have been found in the other four fully sequenced Nocardia genomes (Table 2).
Fig 1

Comparative genomic map of N. seriolae UTF1 genome and the other four Nocardia complete genomes.

The BLASTN-based ring image was generated by BLAST Ring Image Generator (BRIG) version 0.95 [32]. The innermost two rings show GC content (black) and GC skew (purple/green). The remaining rings (rings 3–6) represent BLASTN comparisons with the complete genomes of N. farcinica IFM 10152 (magenta), N. brasiliensis HUJEG-1 (cyan), N. cyriacigeorgica GUH-2 (blue) and N. nova SH22a (pale blue). Bars indicate the positions of mobile element-related genes in the N. seriolae UTF1 genome, such as transposases (black), endonuclease DDE (blue) and integrase (green). The outermost red boxes highlight the 8 mce operons found in the N. seriolae UTF1 genome.
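
For readers unfamiliar with the two innermost rings, the quantities they plot can be sketched as follows. This uses the conventional definitions, GC content as (G+C)/(A+C+G+T) and GC skew as (G−C)/(G+C), evaluated in sliding windows; the window size and step that BRIG applies internally are not stated in the text, so the values used here are arbitrary, and the function names are illustrative.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")
    return (g + c) / total if total else 0.0


def gc_skew(seq):
    """Conventional GC skew: (G - C) / (G + C)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_windows(seq, size, step):
    """Yield (start, subsequence) for each full window along a linear sequence."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


# Toy sequence (made up); a real analysis would read the chromosome from FASTA.
toy = "GGGCATGCCCAT"
profile = [
    (start, round(gc_content(window), 3), round(gc_skew(window), 3))
    for start, window in sliding_windows(toy, size=6, step=3)
]
for start, content, skew in profile:
    print(start, content, skew)
```
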

Table 2

Comparison of genomic features of N. seriolae UTF1 and the other four Nocardia spp. complete genomes.

| Species | Host | Accession number | Size (bp) | GC% | CDS | rRNA | tRNA | Reference |
|---|---|---|---|---|---|---|---|---|
| N. seriolae UTF1 | Seriola quinqueradiata | AP017900 | 8,121,733 | 68.1 | 7,697 | 4 | 62 | This study |
| N. farcinica IFM 10152 | Homo sapiens | AP006618 | 6,021,225 | 70.8 | 5,674 | 3 | 53 | [12] |
| N. brasiliensis HUJEG-1 | Homo sapiens | CP003876 | 9,436,348 | 68.0 | 8,414 | 3 | 51 | [13] |
| N. cyriacigeorgica GUH-2 | Homo sapiens | FO082843 | 6,194,645 | 68.4 | 5,491 | 3 | 49 | [14] |
| N. nova SH22a | A root of Couma macrocarpa | CP006850 | 8,348,532 | 67.8 | 7,583 | 3 | 49 | [15] |

The genome sequences of N. seriolae UTF1 and the previously reported N. seriolae isolates were quite similar. The N. seriolae UTF1 genome showed 99.99% and 99.95% ANI with ZJ0503 [16] and N-2927 [17], respectively. It should be noted that the ANI between N-2927 and U-1 [8], the most recently reported draft genome of an N. seriolae isolate, was 100%. Therefore, we did not use the U-1 genome for further comparisons and analysis. To clarify the uncovered regions (gaps) in the previously reported N. seriolae genome sequences, the contig sequences of N. seriolae ZJ0503 [16] (319 contigs) and N-2927 [17] (339 contigs) were aligned to the complete sequence of the N. seriolae UTF1 genome. As a result, 300 uncovered regions (average length: 1,373 bp) were detected in the draft genome of ZJ0503, while 297 uncovered regions (average length: 1,502 bp) were detected in the draft genome of N-2927. These gap sequences were subjected to BLASTX searches against the NCBI RefSeq database (E-value threshold of 1E-5); 294 (98.0%) for ZJ0503 and 290 (97.6%) for N-2927 had significant BLAST hits. The BLAST results revealed that most assigned genes (76.9% for ZJ0503 and 73.1% for N-2927) were mobile element-related, such as transposase, endonuclease DDE and integrase genes (S1 Fig). The comparative genomic map of N. seriolae UTF1 with ZJ0503 and N-2927 also showed that these mobile element-related genes are interspersed across the N. seriolae UTF1 genome and coincide with the gap regions of ZJ0503 and N-2927 (S2 Fig). Since these genes are more than 1 kb long, such repeat sequences have a significant influence on de novo assembly when relatively short reads (several hundred bp) from MiSeq and 454-GS Junior are used [8, 16, 17]. In the present study, the PacBio long reads (mean read length: 4,756 bp, N50 read length: 7,364 bp) could span these repeated sequences and allowed completion of the N. seriolae UTF1 genome.
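
The idea of "uncovered regions" can be illustrated with a small interval computation: given the reference coordinates covered by aligned draft contigs, merge overlapping intervals and report what is left. This is a sketch under assumptions, not the authors' procedure. It presumes the aligned intervals have already been extracted from the MUMmer output (for example with its show-coords utility), uses 1-based inclusive coordinates, and treats the chromosome as linear, ignoring wrap-around at the origin. The coordinates in the example are made up.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent 1-based inclusive intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def uncovered_regions(intervals, genome_length):
    """Return the reference regions not covered by any aligned interval."""
    gaps = []
    position = 1  # next reference base not yet accounted for
    for start, end in merge_intervals(intervals):
        if start > position:
            gaps.append((position, start - 1))
        position = max(position, end + 1)
    if position <= genome_length:
        gaps.append((position, genome_length))
    return gaps


# Made-up alignment intervals on a 500 bp toy reference.
aligned = [(1, 100), (90, 200), (301, 400)]
gaps = uncovered_regions(aligned, genome_length=500)
mean_gap_length = sum(end - start + 1 for start, end in gaps) / len(gaps)
print(gaps, mean_gap_length)
```

The extracted gap sequences would then be the input to a BLASTX search such as the one described above.
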

Overview of N. seriolae UTF1 virulence factors

Like the causal agents of human nocardiosis, N. seriolae is considered to be an intracellular pathogen that invades and grows within host cells, including phagocytes [44, 45]. The intracellular nature of this bacterium makes the disease difficult to control. As a first step toward understanding the virulence factors and pathogenic properties of N. seriolae UTF1, we conducted BLASTP searches of N. seriolae UTF1 CDSs against the Virulence Factor Database (VFDB) [28] and a well-annotated virulence gene data set of N. farcinica (http://nocardia.nih.go.jp/) (E-value < 1E-5, sequence length overlap > 80% and sequence similarity > 50%). The VFDB search identified 173 CDSs as candidate virulence factors (S2 Table). In addition, almost all of the putative virulence genes of N. farcinica were found in the N. seriolae UTF1 genome (Table 3).
Table 3

Virulence factor candidates in the N. seriolae UTF1 by comparison with the virulence genes data set of N. farcinica.

| Category | N. seriolae UTF1 ORF ID | N. farcinica gene ID* | Gene* | Description* | %Identity | %Similarity | E value |
|---|---|---|---|---|---|---|---|
| Cell wall proteins | ORF-145 | nfa1810 | fbpA | mycolyltransferase | 70.9 | 79.9 | 5.00E-159 |
| | ORF-7590 | nfa1820 | fbpB | mycolyltransferase | 56.4 | 75.9 | 7.00E-67 |
| | ORF-150 | nfa1830 | fbpC | mycolyltransferase | 57.7 | 73.0 | 2.00E-37 |
| Metal importers | ORF-5098 | nfa37790 | ideR | transcriptional regulator | 66.0 | 79.4 | 4.00E-84 |
| | ORF-950 | nfa7630 | nbtA | thioesterase | 85.7 | 90.2 | 0 |
| | ORF-951 | nfa7640 | nbtB | polyketide synthase | 86.4 | 90.0 | 5.00E-60 |
| | ORF-952 | nfa7650 | nbtC | polyketide synthase | 87.3 | 93.0 | 1.00E-143 |
| | ORF-953 | nfa7660 | nbtD | non-ribosomal peptide synthetase | 88.5 | 93.8 | 0 |
| | ORF-954 | nfa7670 | nbtE | non-ribosomal peptide synthetase | 91.6 | 96.2 | 4.00E-161 |
| | ORF-750 | nfa7680 | nbtF | non-ribosomal peptide synthetase | 93.0 | 97.1 | 2.00E-108 |
| | ORF-948 | nfa7610 | nbtG | lysine-N-oxygenase | 84.7 | 93.1 | 0 |
| | ORF-747 | nfa6190 | nbtS | salicylate synthase | 81.4 | 88.5 | 4.00E-126 |
| | ORF-749 | nfa6200 | nbtT | salicylate-AMP ligase | 82.5 | 91.3 | 4.00E-126 |
| Oxidative and nitrosative stresses | ORF-5118 | nfa37890 | ahpC | alkylhydroperoxide reductase | 68.0 | 79.0 | 5.00E-149 |
| | ORF-5119 | nfa37900 | ahpD | alkylhydroperoxidase | 68.8 | 79.3 | 0 |
| | ORF-325 | nfa55390 | katC | catalase | 79.0 | 87.8 | 0 |
| | ORF-4212 | nfa29500 | katG | catalase-peroxidase | 63.8 | 77.8 | 1.00E-169 |
| | ORF-6237 | nfa45490 | narG | nitrate reductase alpha subunit | 71.3 | 84.8 | 0 |
| | ORF-6238 | nfa45500 | narH | nitrate reductase beta subunit | 72.4 | 83.3 | 1.00E-78 |
| | ORF-6240 | nfa45520 | narI | nitrate reductase gamma subunit | 75.3 | 84.6 | 0 |
| | ORF-6239 | nfa45510 | narJ | nitrate reductase delta subunit | 74.1 | 83.8 | 1.00E-157 |
| | ORF-6265 | nfa45610 | nirB | nitrite reductase (NAD(P)H) subunit | 76.9 | 86.9 | 0 |
| | ORF-6264 | nfa45600 | nirD | nitrite reductase (NAD(P)H) subunit | 75.6 | 84.0 | 1.00E-173 |
| | ORF-5117 | nfa37880 | oxyR | hydrogen peroxide sensing transcriptional regulator | 66.9 | 78.6 | 2.00E-165 |
| | ORF-418 | nfa52980 | sodC | superoxide dismutase | 78.2 | 86.4 | 1.00E-140 |
| | ORF-68 | nfa1210 | sodF | superoxide dismutase | 40.5 | 57.5 | 3.00E-112 |
| Penetration into mammalian cells | ORF-4357 | nfa34810 | inv | invasin | 65.6 | 80.4 | 2.00E-119 |
| Phagosome arresting | ORF-5937 | nfa13510 | ndk | nucleoside diphosphate kinase | 44.8 | 57.6 | 2.00E-45 |
| | ORF-5498 | nfa16310 | ptpA | protein-tyrosine phosphatase | 48.7 | 61.4 | 2.00E-88 |
| | ORF-2644 | nfa18680 | ptpB | protein-tyrosine phosphatase | 60.2 | 70.7 | 0 |
| Other | ORF-2873 | nfa19960 | tlyA | putative cytotoxin/hemolysin | 77.4 | 83.2 | 6.00E-135 |

* according to the Nocardia farcinica Genome Project Page (http://nocardia.nih.go.jp/).

Mammalian cell entry (Mce)-family proteins, virulence factors of Mycobacterium tuberculosis (class Actinobacteria), enable the bacterium to enter mammalian cells and survive inside macrophages [46]. The genome of M. tuberculosis contains four mce operons, each comprising eight genes arranged in the same manner (two yrbE genes, A and B; six mce genes, A, B, C, D, E and F) [47, 48]. Mce proteins are found in diverse Actinobacteria, including Nocardia spp. [49]. Six mce operons have been found in each of the three human clinical isolates (N. farcinica IFM 10152 [12], N. brasiliensis HUJEG-1 [13] and N. cyriacigeorgica GUH-2 [14]), whereas 14 mce operons have been found in N. nova SH22a, which was isolated from a plant root and can degrade rubber and gutta-percha [15]. Recently, Carrillo-González et al. (2016) demonstrated the importance of Mce proteins for Nocardia pathogenesis through whole-genome comparison of an attenuated N. brasiliensis HUJEG-1 and its parental strain [50]. In the N. seriolae UTF1 genome, we found eight complete mce loci with nucleotide lengths of 6,814 bp to 8,964 bp (Figs 1 and 2). It should be noted that the mce3 locus has two extra genes (endonuclease DDE and orf3406) between mce3E and mce3F (Fig 2). However, the influence of these two extra genes on the function of the mce3 operon is unclear. Amino acid sequence similarities of the N. seriolae UTF1 Mce1 proteins with those of the other four Nocardia species and two other Actinobacteria, Rhodococcus equi and M. tuberculosis, are shown in S3 Table. The Mce1C protein is highly conserved among Mce1 proteins (87.8–90.7% similarity with Nocardia spp., 82.0% similarity with R. equi and 63.8% similarity with M. tuberculosis), while the Mce1E protein exhibits relatively lower sequence homology (75.4–80.0% similarity with Nocardia spp., 75.3% similarity with R. equi and 63.3% similarity with M. tuberculosis).

Invasin also plays a role in attachment to and penetration into host cells in several bacterial species [51, 52]. Nocardia species also possess an invasin gene [12, 13], and N. seriolae UTF1 ORF-4357 is very similar to the N. farcinica IFM 10152 invasin (80.4% similarity) (Table 3). Since Nocardia species are facultative intracellular pathogens, they most likely use this protein for entry into host cells. Further studies are required to determine the function of Nocardia invasin in host cell entry.
Fig 2

The organization of 8 mce operons in N. seriolae UTF1 genome.

The figure was generated using Easyfig 2.1 [29]. Arrows represent yrbE genes, mce genes and two additional genes, DDE (endonuclease DDE) and orf3406 (hypothetical protein) in mce3. Values indicate the number of base positions. Asterisks (mce6* and mce7*) indicate reverse complement orientation.

Nocardia species can modulate phagocyte function and can grow in macrophages. It has been reported that catalase and superoxide dismutase (SOD) impair the oxidative killing mechanisms of phagocytes [53]. In the N. seriolae UTF1 genome, two catalase genes (katC, ORF-325; katG, ORF-4212) and two sod genes (sodC, ORF-418; sodF, ORF-68) are present (Table 3). We also found N. seriolae UTF1 CDSs that are homologous to the N. farcinica IFM 10152 nitrate and nitrite reductase genes narG (84.8%), narH (83.3%), narI (84.6%), narJ (83.8%), nirB (86.9%) and nirD (84.0%) (Table 3), which may contribute to survival under low-oxygen conditions in stimulated macrophages [54, 55]. Recently, it has been reported that the attenuated N. brasiliensis HUJEG-1 lost a catalase gene, a SOD gene and several nitrate reductase genes, suggesting that these genes may be associated with the pathogenesis of Nocardia spp. [50]. In addition to the above oxidative and nitrosative stress-related genes, the alkylhydroperoxidases AhpC and AhpD have a crucial role in antioxidant defense in mycobacterial species [56, 57], particularly when KatG catalase-peroxidase activity is depressed [58]. In the AhpC/AhpD antioxidant defense system, OxyR acts as a positive regulator of their expression [59]. The N. seriolae UTF1 genome contains homologs of the N. farcinica IFM 10152 ahpC (79.0% protein similarity), ahpD (79.3%) and oxyR (78.6%) genes (Table 3).

The phagosome, a cellular compartment, is essential for intracellular killing and digestion of pathogenic microorganisms [60]. Nucleoside diphosphate kinase (Ndk) and protein tyrosine phosphatase A (PtpA) arrest macrophage phagosomal maturation, allowing the intracellular survival and persistence of pathogenic mycobacteria [61]. N. seriolae UTF1 ORF-5937 and ORF-5498 are homologous to N. farcinica IFM 10152 ndk (57.6%) and ptpA (61.4%), respectively (Table 3). Since iron is present at very low concentrations and in an insoluble state within macrophages, efficient iron-acquisition systems are required for pathogenic bacteria to survive [61]. Putative metal importer-related genes of N. seriolae UTF1 (ideR and nbtA–G, S, T) were identified (Table 3), and may contribute to the ability of N. seriolae to survive in fish tissues, including within macrophages.

Other virulence factor candidates were also found in the N. seriolae UTF1 genome. The antigen 85 (Ag85) complex of M. tuberculosis is a family of fibronectin-binding proteins (Fbp) that plays an essential role in the pathogenesis of tuberculosis [62]. The Ag85 complex consists of three proteins (Ag85A, Ag85B and Ag85C, encoded by the genes fbpA, fbpB and fbpC) that possess mycolyltransferase activity involved in the final stages of mycobacterial cell wall assembly [63]. The N. seriolae UTF1 genome was found to have at least seven putative fbp genes: three fbpA (ORF-144, ORF-145 and ORF-147), two fbpB (ORF-146 and ORF-7590) and two fbpC (ORF-148 and ORF-150). Among these seven fbp gene candidates, the protein sequences of ORF-145 (fbpA), ORF-7590 (fbpB) and ORF-150 (fbpC) were most similar to N. farcinica IFM 10152 fbpA (79.9%), fbpB (75.9%) and fbpC (73.0%) (Table 3). TlyA proteins of pathogenic bacteria have hemolytic activity associated with virulence [64, 65], and N. seriolae UTF1 ORF-2873 encodes a TlyA with 83.2% similarity to that of N. farcinica IFM 10152 (Table 3). In contrast to PtpA (mentioned above), PtpB is considered to be nonessential for the phagosome-arresting function of M. tuberculosis [61]. On the other hand, Zhou et al. (2010) reported that M. tuberculosis PtpB depresses innate immune responses by inhibiting the signaling pathway involved in interleukin-6 (IL-6) production and promotes host cell survival by activating the Akt pathway, thereby supporting bacterial survival in macrophages [66]. The N. seriolae UTF1 genome contains a ptpB gene (ORF-2644) encoding a protein similar to that of N. farcinica IFM 10152 (70.7%) (Table 3).

Overall, the whole-genome analysis of N. seriolae UTF1 reveals that the genome contains known virulence genes of mycobacteria and human clinical Nocardia isolates for host cell invasion, modulation of phagocyte function and survival inside macrophages. The presence of these virulence genes in N. seriolae UTF1 may therefore explain its ability to survive intracellularly, including within macrophages. The virulence gene set of N. seriolae UTF1 presented here provides a basis for further study of its pathogenic mechanisms at the molecular level. In addition, the complete genome sequence of N. seriolae UTF1 can be used in future studies to compare the genomes of pathogenic and non-pathogenic (or less pathogenic) isolates in order to more fully resolve the genes responsible for N. seriolae pathogenesis.

Potential antibiotic resistance genes of N. seriolae UTF1

In general, Nocardia spp. are naturally resistant to many antibiotics, including most β-lactams [7, 67–69]. N. seriolae UTF1 encodes at least 14 β-lactamases, while the number of β-lactamase genes is one for N. farcinica IFM 10152 [12], 29 for N. brasiliensis HUJEG-1 [13] and 12 for N. cyriacigeorgica GUH-2 [14]. In addition, as found in the draft genomes of N. seriolae [8, 16, 17], the N. seriolae UTF1 genome has one vancomycin and two fluoroquinolone resistance gene candidates according to the RAST annotation. To explore further antibiotic resistance genes on the N. seriolae UTF1 chromosome, all of its CDSs were compared with the Antibiotic Resistance Genes Database (ARDB) [30]. This analysis identified 20 CDSs as candidate antibiotic resistance genes, which were classified into 16 antibiotic resistance gene types (Table 4). The presence of these antibiotic resistance genes in N. seriolae UTF1 may, in part, explain the difficulty in treating diseases caused by N. seriolae.
Table 4

Potential antibiotic resistance genes of N. seriolae UTF1 according to the Antibiotic Resistance Genes Database (ARDB).

| Resistance gene type | Antibiotic resistance | Description | UTF1 ORF ID | Best hit accession | E-value | Score |
|---|---|---|---|---|---|---|
| aac(6’)-Ic | Amikacin, dibekacin, isepamicin, netilmicin, sisomicin, tobramycin | Aminoglycoside N-acetyltransferase, which modifies aminoglycosides by acetylation. | ORF-1348 | AAA26549 | 6.00E-32 | 130 |
| bacA | Bacitracin | Undecaprenyl pyrophosphate phosphatase, which consists in the sequestration of Undecaprenyl pyrophosphate. | ORF-4188 | CAL13705 | 8.00E-23 | 101 |
| bl2a_iii2 | Penicillin | Class A beta-lactamase. This enzyme breaks the beta-lactam antibiotic ring open and deactivates the molecule's antibacterial properties. | ORF-6138 | YP_089932 | 8.00E-65 | 241 |
| carA | Lincosamide, macrolide, streptogramin b | ABC transporter system, Macrolide-Lincosamide-Streptogramin B efflux pump. | ORF-4637 | AAC32027 | 2.00E-10 | 58 |
| catbB1 | Chloramphenicol | Group B chloramphenicol acetyltransferase, which can inactivate chloramphenicol. Also referred to as xenobiotic acetyltransferase. | ORF-1707 | AAK88712 | 3.00E-06 | 45 |
| dfrA26 | Trimethoprim | Group A drug-insensitive dihydrofolate reductase, which can not be inhibited by trimethoprim. | ORF-6939 | CAL48457 | 8.00E-29 | 120 |
| macB | Macrolide | Resistance-nodulation-cell division transporter system. Multidrug resistance efflux pump. Macrolide-specific efflux system. | ORF-1987 | YP_001453760 | 9.00E-54 | 204 |
| | | | ORF-4813 | YP_001571041 | 1.00E-46 | 180 |
| | | | ORF-6113 | YP_001334578 | 5.00E-47 | 181 |
| | | | ORF-6376 | YP_001453760 | 2.00E-49 | 189 |
| mfpA | Fluoroquinolone | Pentapeptide repeat family, which protects DNA gyrase from the inhibition of quinolones. | ORF-3773 | ABL05132 | 6.00E-08 | 52 |
| otr(B) | Tetracycline | Major facilitator superfamily transporter, tetracycline efflux pump. | ORF-1750 | AAD04032 | 1.00E-105 | 375 |
| srmB | Lincosamide, macrolide, streptogramin b | ABC transporter system, Macrolide-Lincosamide-Streptogramin B efflux pump. | ORF-4948 | CAA45050 | 9.00E-68 | 250 |
| tcmA | Tetracenomycin c | Major facilitator superfamily transporter. Resistance to tetracenomycin C by an active tetracenomycin C efflux system which is probably energized by transmembrane electrochemical gradients. | ORF-4285 | AAA67509 | 3.00E-05 | 40 |
| tcr3 | Tetracycline | Major facilitator superfamily transporter, tetracycline efflux pump. | ORF-6293 | BAA07390 | 8.00E-99 | 355 |
| vanRB | Vancomycin | VanB type vancomycin resistance operon genes, which can synthesize peptidoglycan with modified C-terminal D-Ala-D-Ala to D-alanine-D-lactate. | ORF-5538 | ABB53368 | 1.00E-13 | 70 |
| | | | ORF-7016 | ABB53368 | 3.00E-16 | 79 |
| vanRC | Vancomycin | VanC type vancomycin resistance operon genes, which can synthesize peptidoglycan with modified C-terminal D-Ala-D-Ala to D-alanine-D-serine. | ORF-2855 | AAY67971 | 3.00E-31 | 129 |
| vatB | Streptogramin a | Virginiamycin A acetyltransferase, which can inactivate the target drug. | ORF-4237 | YP_001038094 | 9.00E-10 | 57 |
| vatC | Streptogramin a | Virginiamycin A acetyltransferase, which can inactivate the target drug. | ORF-3078 | AAG21695 | 6.00E-49 | 187 |

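The counts quoted for the ARDB search (20 candidate CDSs in 16 resistance gene types) can be re-derived from the rows of Table 4. The snippet below is only a minimal sketch of that tally: the tuples are transcribed by hand from Table 4, and the names `TABLE4_HITS` and `count_by_type` are illustrative assumptions, not part of the authors' pipeline.

```python
from collections import Counter

# (resistance gene type, UTF1 ORF ID, E-value, bit score), transcribed from Table 4.
TABLE4_HITS = [
    ("aac(6')-Ic", "ORF-1348", 6e-32, 130),
    ("bacA", "ORF-4188", 8e-23, 101),
    ("bl2a_iii2", "ORF-6138", 8e-65, 241),
    ("carA", "ORF-4637", 2e-10, 58),
    ("catbB1", "ORF-1707", 3e-06, 45),
    ("dfrA26", "ORF-6939", 8e-29, 120),
    ("macB", "ORF-1987", 9e-54, 204),
    ("macB", "ORF-4813", 1e-46, 180),
    ("macB", "ORF-6113", 5e-47, 181),
    ("macB", "ORF-6376", 2e-49, 189),
    ("mfpA", "ORF-3773", 6e-08, 52),
    ("otr(B)", "ORF-1750", 1e-105, 375),
    ("srmB", "ORF-4948", 9e-68, 250),
    ("tcmA", "ORF-4285", 3e-05, 40),
    ("tcr3", "ORF-6293", 8e-99, 355),
    ("vanRB", "ORF-5538", 1e-13, 70),
    ("vanRB", "ORF-7016", 3e-16, 79),
    ("vanRC", "ORF-2855", 3e-31, 129),
    ("vatB", "ORF-4237", 9e-10, 57),
    ("vatC", "ORF-3078", 6e-49, 187),
]


def count_by_type(hits):
    """Count candidate CDSs per resistance gene type (hypothetical helper)."""
    return Counter(gene_type for gene_type, _orf, _evalue, _score in hits)


counts = count_by_type(TABLE4_HITS)
n_cds = sum(counts.values())        # total candidate CDSs
n_types = len(counts)               # distinct resistance gene types
multi_copy = {g: n for g, n in counts.items() if n > 1}
weakest_hit = max(TABLE4_HITS, key=lambda hit: hit[2])  # largest E-value

print(n_cds, n_types, multi_copy, weakest_hit[:2])
```

Only macB and vanRB are represented by more than one CDS, and the weakest hit in the table (tcmA, ORF-4285) is the kind of candidate that would need closer inspection before being treated as a functional resistance gene.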
It has been reported that N. seriolae isolates can be divided into two phenotypic groups by α-glucosidase (α-glu) activity (α-glu-positive or α-glu-negative) [7, 68]. These two groups show different susceptibility profiles for oxytetracycline (OTC, a tetracycline antibiotic) and erythromycin (Em, a macrolide antibiotic). Most α-glu-positive isolates were OTC-resistant and Em-sensitive, while most α-glu-negative isolates were OTC-sensitive and Em-resistant [69,70]. N. seriolae UTF1 is an α-glu-positive isolate that exhibits resistance to OTC and sensitivity to Em. Ismail et al. 2011 [69] found that OTC-resistant strains of N. seriolae possess tet(K) and/or tet(L) gene(s), while Em-resistant strains possess mef(A) and msr(D) genes. The tet(K) and tet(L) genes are generally found on small transmissible plasmids [71,72], while mef(A) and msr(D) genes are encoded in the chromosomes of Gram-positive bacteria and are associated with conjugative transposons [73]. Despite obtaining 1.1 Gbp (133-fold genome coverage) of PacBio RS long reads in this study and 247 Mbp (30-fold genome coverage) of 454 reads (unpublished data), we could not find any plasmids or tet(K) and tet(L) gene sequences. The absence of these genes might have been caused by plasmid elimination during culture propagation. On the other hand, the N. seriolae UTF1 chromosome has two tetracycline resistance gene candidates, otr(B) (ORF-1750) and tcr3 (ORF-6293) (Table 4), both of which are found in Streptomyces spp. (Actinobacteria) [71], suggesting that these two genes might confer some degree of resistance to OTC in N. seriolae. As expected, mef(A) and msr(D) genes were not found in the N. seriolae UTF1 chromosome. However, the ARDB results include three candidate macrolide efflux pump genes: carA (ORF-4637), macB (ORF-1987, ORF-4813, ORF-6113 and ORF-6376) and srmB (ORF-4948) (Table 4). Since N. seriolae UTF1 is an Em-sensitive isolate, these three candidates are not likely to be involved in Em resistance. In particular, MacB requires TolC and MacA for its function in E. coli [74,75], but the N. seriolae UTF1 genome lacks both genes. Thus, antibiotic susceptibility profiles and genomic sequences of more N. seriolae isolates are needed for an accurate view of their antibiotic resistance genes, and the complete genome sequence of N. seriolae UTF1 can serve as a reference for such surveys in future studies.

Genome comparison with fully sequenced genomes of the genus Nocardia

Four available complete genome sequences of the genus Nocardia (N. farcinica IFM 10152, N. brasiliensis HUJEG-1, N. cyriacigeorgica GUH-2 and N. nova SH22a) were used to compare genomic structure with N. seriolae UTF1. From the comparative genomic map (Fig 1), we found that there were no large-scale variations among the genomes, but a considerable number of non-homologous regions are scattered around the N. seriolae UTF1 genome. Most of these non-homologous regions were linked to mobile element-related genes (transposase, endonuclease DDE and integrase) (Fig 1), suggesting that they serve as plastic and variable regions of the N. seriolae UTF1 genome. The whole-genome alignments of N. seriolae UTF1 with the other four Nocardia genomes displayed an X-shaped distribution across the origin of replication (Fig 3), which is explained by the fork replication theory [76]. By contrast, the alignments among the three agents of human nocardiosis (N. farcinica IFM 10152, N. brasiliensis HUJEG-1 and N. cyriacigeorgica GUH-2) showed only a few visible symmetric inversions and a diagonal line with a slope of approximately 1 (S3 Fig). On the other hand, the alignments between N. nova SH22a and the three human clinical isolates were arranged in an X-shaped distribution, as in the case of N. seriolae UTF1. Thus, the degree of symmetrical inversion indicates that the three human clinical isolates are genetically close to each other, while N. seriolae UTF1 and N. nova SH22a are genetically distant from these species.
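Why repeated inversions that are symmetric about the origin of replication produce an X-shaped dot plot can be illustrated with a toy model. The sketch below is an assumption-laden simplification (a 1,000-position circular genome, origin at position 0, randomly sized symmetric inversions) and is not the MUMmer-based analysis used in the paper; the names `symmetric_inversion` and `simulate` are hypothetical.

```python
import random

GENOME_LEN = 1000  # toy circular genome; origin of replication at position 0


def symmetric_inversion(positions, half_width, genome_len=GENOME_LEN):
    """Invert the segment reaching half_width positions to either side of the origin.

    A locus inside the segment moves from p to its mirror image (genome_len - p);
    loci outside the segment keep their position.
    """
    flipped = []
    for p in positions:
        distance_from_origin = min(p, genome_len - p)
        if distance_from_origin <= half_width:
            flipped.append((genome_len - p) % genome_len)
        else:
            flipped.append(p)
    return flipped


def simulate(n_inversions, seed=0, genome_len=GENOME_LEN):
    """Return (ancestral position, current position) pairs after random inversions."""
    rng = random.Random(seed)
    ancestral = list(range(genome_len))
    current = ancestral[:]
    for _ in range(n_inversions):
        half_width = rng.randrange(1, genome_len // 2)
        current = symmetric_inversion(current, half_width, genome_len)
    return list(zip(ancestral, current))


points = simulate(n_inversions=5)
on_diagonal = sum(1 for x, y in points if y == x)
on_antidiagonal = sum(
    1 for x, y in points if y != x and y == (GENOME_LEN - x) % GENOME_LEN
)
print(on_diagonal, on_antidiagonal)
```

In this model every locus ends up either at its ancestral position or at its mirror image across the origin, so the points of the dot plot fall on the diagonal or the anti-diagonal. The more symmetric inversions separate two genomes, the more evenly the two arms of the X are populated, which is the intuition behind reading the X pattern as a sign of greater divergence.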
Fig 3

Dot plot comparisons. Nucleotide-based alignments were performed with MUMmer version 3.22, and dot plots were generated by the mummerplot script and the Unix program gnuplot [31].

The ANI- and AAI-based pairwise comparisons between the genomes are shown in Table 5. Typically, ANI values between genomes of the same bacterial species are above 95%, while values below 75% indicate genomes too divergent to be compared with this measure [33]. In the latter case, AAI provides a much more robust resolution. AAI cut-offs for the genus and species boundaries have been estimated to be 55–60% and 85–90%, respectively [33]. The ANI and AAI between N. seriolae UTF1 and the other four Nocardia genomes ranged between 79.21–79.88% and 68.96–69.62%, respectively. In line with the whole-genome alignments, comparisons among the three human clinical isolates showed higher ANI and AAI values (81.19–81.61% ANI and 73.61–74.99% AAI) than their comparisons with N. nova SH22a (79.43–79.89% ANI and 68.06–69.82% AAI) and with N. seriolae UTF1 (described above). The ANI and AAI between N. seriolae UTF1 and N. nova SH22a were also relatively low, at 79.63% and 69.17%, respectively. Overall, the ANI and AAI values indicate that the three human clinical isolates form a genetically similar group, and that N. seriolae UTF1, an agent of fish nocardiosis, is genetically distant from these species as well as from N. nova SH22a, an environmental isolate.
Table 5

Average nucleotide identity (ANI, upper grids) and average amino acid identity (AAI, lower grids) values (in percent) calculated between Nocardia genomes.

| | N. seriolae UTF1 | N. farcinica IFM 10152 | N. brasiliensis HUJEG-1 | N. cyriacigeorgica GUH-2 | N. nova SH22a |
|---|---|---|---|---|---|
| N. seriolae UTF1 | - | 79.88 | 79.21 | 79.88 | 79.63 |
| N. farcinica IFM 10152 | 69.17 | - | 81.36 | 81.61 | 79.77 |
| N. brasiliensis HUJEG-1 | 68.96 | 73.61 | - | 81.19 | 79.43 |
| N. cyriacigeorgica GUH-2 | 69.62 | 74.99 | 73.68 | - | 79.89 |
| N. nova SH22a | 69.17 | 69.21 | 68.06 | 69.82 | - |

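The ranges quoted in the text can be read directly off Table 5. The following sketch, with the matrix transcribed by hand from the table (ANI above the diagonal, AAI below) and with hypothetical helper names `ani`, `aai` and `value_range`, shows one way to recompute them and to apply the cut-offs cited from [33].

```python
# Genome order follows Table 5.
GENOMES = ["UTF1", "IFM 10152", "HUJEG-1", "GUH-2", "SH22a"]

# Table 5: ANI (%) in the upper triangle, AAI (%) in the lower triangle.
TABLE5 = [
    [None, 79.88, 79.21, 79.88, 79.63],
    [69.17, None, 81.36, 81.61, 79.77],
    [68.96, 73.61, None, 81.19, 79.43],
    [69.62, 74.99, 73.68, None, 79.89],
    [69.17, 69.21, 68.06, 69.82, None],
]


def ani(i, j):
    """ANI between genomes i and j (upper triangle)."""
    low, high = sorted((i, j))
    return TABLE5[low][high]


def aai(i, j):
    """AAI between genomes i and j (lower triangle)."""
    low, high = sorted((i, j))
    return TABLE5[high][low]


def value_range(metric, group_a, group_b):
    """Minimum and maximum of a metric over all pairs drawn from two groups."""
    values = [metric(i, j) for i in group_a for j in group_b if i != j]
    return min(values), max(values)


UTF1, CLINICAL, NOVA = [0], [1, 2, 3], [4]

print("UTF1 vs others ANI:", value_range(ani, UTF1, CLINICAL + NOVA))
print("UTF1 vs others AAI:", value_range(aai, UTF1, CLINICAL + NOVA))
print("clinical isolates ANI:", value_range(ani, CLINICAL, CLINICAL))
print("clinical isolates AAI:", value_range(aai, CLINICAL, CLINICAL))
print("SH22a vs clinical ANI:", value_range(ani, NOVA, CLINICAL))
print("SH22a vs clinical AAI:", value_range(aai, NOVA, CLINICAL))

# Cut-offs cited in the text: ANI > 95% within a species; AAI 55-60% at the
# genus boundary and 85-90% at the species boundary.
all_pairs = [(i, j) for i in range(5) for j in range(i + 1, 5)]
all_different_species = all(ani(i, j) < 95 for i, j in all_pairs)
all_within_genus_range = all(60 < aai(i, j) < 85 for i, j in all_pairs)
```

Under these cut-offs every pair in Table 5 falls below the species-level ANI threshold and between the genus-level and species-level AAI boundaries, which is consistent with five distinct species of one genus.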
We also compared the functional profiles of the five complete Nocardia genomes. Fig 4 shows a dendrogram constructed from the top 50 subsystems in the RAST analysis. Interestingly, although whole-genome comparisons showed a distant genetic relationship between N. seriolae UTF1 and the three agents of human nocardiosis, the analysis revealed that the functional repertoire of N. seriolae UTF1 is closer to those of the three agents of human nocardiosis than to that of N. nova SH22a (Fig 4). Similar results were obtained with the dataset of overall subsystems (S4 Fig) and the KEGG modules assigned by BlastKOALA (S5 Fig). These findings suggest that the closer functional relationship between the agents of human and fish nocardiosis may be associated with their adaptation to infected animal hosts and with their pathogenic properties.
Fig 4

Functional profiling of the five Nocardia complete genomes.

Heat map shows the abundance of the top 50 subsystems [36, 37] enriched in the five Nocardia genomes. The color scale indicates the abundance of gene content for each category.

In summary, comparison of the five complete Nocardia genomes showed that N. seriolae UTF1 is genetically distant from the three agents of human nocardiosis but has a similar functional repertoire. Further studies are required to more fully resolve the phylogenetic relationships of Nocardia spp. [40] and the pathways differentially enriched according to habitat and lifestyle, once a greater number of completely sequenced and thoroughly annotated clinical and environmental Nocardia isolates become available.

Finding functional features in N. seriolae UTF1 genome

The N. seriolae UTF1 genome is the first complete genome sequence determined for a marine Nocardia species, and it is therefore of interest to investigate its characteristic features. According to the COG classifications, the distribution of COG categories is mostly similar among the Nocardia genomes (Fig 5). In N. seriolae UTF1, the numbers of genes in the categories ‘Mobilome: prophages, transposons (X)’ and ‘Unknown’ (not assignable to COG categories) are higher than in the other four Nocardia genomes (Fig 5). Notably, genes in the ‘Mobilome: prophages, transposons (X)’ category are especially abundant: the N. seriolae UTF1 genome contains 406 genes in this category, compared to only 71 in N. farcinica IFM 10152, 44 in N. brasiliensis HUJEG-1, 45 in N. cyriacigeorgica GUH-2 and 56 in N. nova SH22a (S4 Table). According to the comparative genomic map in Fig 1, these mobile element genes are interspersed throughout the N. seriolae UTF1 genome, and their positions correspond to the variable regions (Fig 1). Overall, the abundance of genes related to mobile element proteins and of genes of unknown function in the N. seriolae UTF1 genome can partially explain the divergence of its genome structure and gene content from the other Nocardia genomes (Figs 1 and 3).
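To put the mobilome counts in proportion, the short sketch below computes how many times more category X genes N. seriolae UTF1 carries than each of the other genomes. The counts are those quoted in the text (S4 Table); the dictionary and variable names are illustrative assumptions.

```python
# COG category X (Mobilome: prophages, transposons) gene counts quoted in the text.
MOBILOME_GENES = {
    "N. seriolae UTF1": 406,
    "N. farcinica IFM 10152": 71,
    "N. brasiliensis HUJEG-1": 44,
    "N. cyriacigeorgica GUH-2": 45,
    "N. nova SH22a": 56,
}

utf1_count = MOBILOME_GENES["N. seriolae UTF1"]

# Fold excess of UTF1 over each of the other four genomes.
fold_excess = {
    genome: utf1_count / count
    for genome, count in MOBILOME_GENES.items()
    if genome != "N. seriolae UTF1"
}

for genome, fold in fold_excess.items():
    print(f"{genome}: {fold:.1f}-fold fewer mobilome genes than UTF1")
```

By this simple ratio, UTF1 carries between roughly 5.7 and 9.2 times as many mobilome genes as the other four genomes.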
Fig 5

Comparison of COG distribution of N. seriolae UTF1 and the other four Nocardia genomes.

COG definitions are as follows: A, RNA processing and modification; B, Chromatin structure and dynamics; C, Energy production and conversion; D, Cell cycle control, cell division, chromosome partitioning; E, Amino acid transport and metabolism; F, Nucleotide transport and metabolism; G, Carbohydrate transport and metabolism; H, Coenzyme transport and metabolism; I, Lipid transport and metabolism; J, Translation, ribosomal structure and biogenesis; K, Transcription; L, Replication, recombination and repair; M, Cell wall/membrane/envelope biogenesis; N, Cell motility; O, Posttranslational modification, protein turnover, chaperones; P, Inorganic ion transport and metabolism; Q, Secondary metabolites biosynthesis, transport and catabolism; R, General function prediction only; S, Function unknown; T, Signal transduction mechanisms; U, Intracellular trafficking, secretion, and vesicular transport; V, Defense mechanisms; W, Extracellular structures; X, Mobilome: prophages, transposons; Y, Nuclear structure; Z, Cytoskeleton. ‘Uk’ indicates unknown (unassigned genes). The number of ORFs in each COG category is listed in S4 Table.

To further characterize the genomic features of N. seriolae UTF1, we focused on the functional characterization of the N. seriolae UTF1-specific genes. Based on the OrthoMCL clustering, 2,745 orthologous genes were present in all five Nocardia genomes and 1,982 genes were unique to N. seriolae UTF1 (Fig 6A). Of the 1,982 N. seriolae UTF1-specific genes, 217 genes (10.9%) were annotated in the KEGG database. The proportions of KEGG categories among the N. seriolae UTF1-specific genes show some differences compared to those of all N. seriolae UTF1 genes (Fig 6B). The ‘Environmental Information Processing’ category (21%) is the most abundant among the N. seriolae UTF1-specific genes, followed by ‘Carbohydrate metabolism’ (14%) (Fig 6B). Focusing on the abundant KEGG modules, 14 genes are assigned to the ABC transport system (Module IDs: M00254 and M00258) (Table 6). Because of limited resources in ocean environments, marine bacteria require various efficient transport systems to capture essential nutrients [77]. Therefore, the N. seriolae UTF1-specific transport system-related genes identified in the present study may have a role in adaptation to the marine environment. In addition, most of the N. seriolae UTF1-specific genes (89.1%) were not assignable to KEGG. Further study of these transport system-related genes and of the genes of unknown function will provide unique insight into the adaptation of N. seriolae to fish hosts in the aquatic environment.
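The percentages in this paragraph follow from the gene counts given in the text. As a quick arithmetic check (a sketch only; the variable names are illustrative), the KEGG-annotated and unassigned shares of the UTF1-specific genes can be recomputed as follows.

```python
# Counts quoted in the text.
UTF1_SPECIFIC_GENES = 1982   # genes unique to N. seriolae UTF1 (OrthoMCL)
KEGG_ANNOTATED = 217         # UTF1-specific genes annotated in KEGG

annotated_pct = 100 * KEGG_ANNOTATED / UTF1_SPECIFIC_GENES
unassigned_pct = 100 * (UTF1_SPECIFIC_GENES - KEGG_ANNOTATED) / UTF1_SPECIFIC_GENES

print(f"KEGG-annotated: {annotated_pct:.1f}%")
print(f"Not assignable to KEGG: {unassigned_pct:.1f}%")
```

The two shares round to the 10.9% and 89.1% reported, so roughly nine in ten UTF1-specific genes currently have no KEGG assignment.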
Fig 6

Orthologous and species-specific genes were identified using OrthoMCL [34] (A). The protein sequences of the 1,982 N. seriolae UTF1-specific genes were functionally annotated with metabolic information from the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database using the KEGG Orthology And Links Annotation (BlastKOALA) program [27] (B).

Table 6

Functional classification of unique genes of N. seriolae UTF1 by KEGG modules.

| Category | KEGG Module | Module ID | Description | Number of genes |
|---|---|---|---|---|
| Environmental information processing | ABC-2 type and other transport systems | M00254 | ABC-2 type transport system | 12 |
| | | M00258 | Putative ABC transport system | 2 |
| | Mineral and organic ion transport system | M00188 | NitT/TauT family transport system | 1 |
| | | M00190 | Iron(III) transport system | 2 |
| | | M00299 | Spermidine/putrescine transport system | 4 |
| | Phosphotransferase system (PTS) | M00273 | PTS system, fructose-specific II component | 3 |
| | Phosphate and amino acid transport system | M00233 | Glutamate transport system | 1 |
| | Two-component regulatory system | M00452 | CusS-CusR (copper tolerance) two-component regulatory system | 2 |
| | | M00475 | BarA-UvrY (central carbon metabolism) two-component regulatory system | 1 |
| | Bacterial secretion system | M00335 | Sec (secretion) system | 1 |
| | Drug efflux transporter/pump | M00713 | Fluoroquinolone resistance, efflux pump LfrA | 1 |
| | Drug resistance | M00742 | Aminoglycoside resistance, protease FtsH | 1 |
| | | M00743 | Aminoglycoside resistance, protease HtpX | 1 |
| | | M00727 | Cationic antimicrobial peptide (CAMP) resistance, N-acetylmuramoyl-L-alanine amidase AmiA and AmiC | 1 |
| | | M00745 | Imipenem resistance, repression of porin OprD | 2 |
| | | M00714 | Multidrug resistance, efflux pump QacA | 1 |
| Energy metabolism | ATP synthesis | M00159 | V-type ATPase, prokaryotes | 5 |
| | | M00149 | Succinate dehydrogenase, prokaryotes | 4 |
| | | M00151 | Cytochrome bc1 complex respiratory unit | 1 |
| | Nitrogen metabolism | M00530 | Dissimilatory nitrate reduction, nitrate => ammonia | 1 |
| | Methane metabolism | M00345 | Formaldehyde assimilation, ribulose monophosphate pathway | 1 |
| Carbohydrate and lipid metabolism | Fatty acid metabolism | M00083 | Fatty acid biosynthesis, elongation | 3 |
| | | M00082 | Fatty acid biosynthesis, initiation | 1 |
| | Central carbohydrate metabolism | M00307 | Pyruvate oxidation, pyruvate => acetyl-CoA | 3 |
| Nucleotide and amino acid metabolism | Serine and threonine metabolism | M00555 | Betaine biosynthesis, choline => betaine | 1 |
| | Cysteine and methionine metabolism | M00021 | Cysteine biosynthesis, serine => cysteine | 2 |
| | Other amino acid metabolism | M00027 | GABA (gamma-Aminobutyrate) shunt | 1 |
| Genetic information processing | Ribosome | M00178 | Ribosome, bacteria | 1 |
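The module counts in Table 6 can be tallied per top-level category to see where the UTF1-specific, module-assigned genes concentrate. The sketch below transcribes the table by hand; `TABLE6` and `genes_per_category` are illustrative names, and the tally is a reading aid rather than part of the original analysis.

```python
from collections import Counter

# (top-level category, KEGG module ID, number of UTF1-specific genes), from Table 6.
TABLE6 = [
    ("Environmental information processing", "M00254", 12),
    ("Environmental information processing", "M00258", 2),
    ("Environmental information processing", "M00188", 1),
    ("Environmental information processing", "M00190", 2),
    ("Environmental information processing", "M00299", 4),
    ("Environmental information processing", "M00273", 3),
    ("Environmental information processing", "M00233", 1),
    ("Environmental information processing", "M00452", 2),
    ("Environmental information processing", "M00475", 1),
    ("Environmental information processing", "M00335", 1),
    ("Environmental information processing", "M00713", 1),
    ("Environmental information processing", "M00742", 1),
    ("Environmental information processing", "M00743", 1),
    ("Environmental information processing", "M00727", 1),
    ("Environmental information processing", "M00745", 2),
    ("Environmental information processing", "M00714", 1),
    ("Energy metabolism", "M00159", 5),
    ("Energy metabolism", "M00149", 4),
    ("Energy metabolism", "M00151", 1),
    ("Energy metabolism", "M00530", 1),
    ("Energy metabolism", "M00345", 1),
    ("Carbohydrate and lipid metabolism", "M00083", 3),
    ("Carbohydrate and lipid metabolism", "M00082", 1),
    ("Carbohydrate and lipid metabolism", "M00307", 3),
    ("Nucleotide and amino acid metabolism", "M00555", 1),
    ("Nucleotide and amino acid metabolism", "M00021", 2),
    ("Nucleotide and amino acid metabolism", "M00027", 1),
    ("Genetic information processing", "M00178", 1),
]

# Sum gene counts per top-level category.
genes_per_category = Counter()
for category, _module_id, n_genes in TABLE6:
    genes_per_category[category] += n_genes

# ABC transport system modules highlighted in the text.
abc_genes = sum(n for _cat, module_id, n in TABLE6 if module_id in {"M00254", "M00258"})

print(genes_per_category.most_common())
print("ABC transport system genes:", abc_genes)
```

The tally reproduces the 14 ABC transport system genes mentioned in the text and shows that most module assignments in Table 6 fall under environmental information processing.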

Conclusions

The complete nucleotide sequence of N. seriolae UTF1 consists of a circular chromosome of 8,121,733 bp with a G+C content of 68.1% and 7,697 predicted CDSs. The genome possesses known bacterial virulence genes that function in host cell invasion, modulation of phagocyte function and survival within macrophages. The candidate virulence factors detected provide a novel resource for the fish nocardiosis research community to study pathogenic mechanisms at the molecular level. We also found many antibiotic resistance genes on the N. seriolae UTF1 chromosome, suggesting natural resistance of this bacterium to many drugs. Our comparative analysis with the four existing complete Nocardia spp. genomes revealed that the N. seriolae UTF1 genome structure and gene content differ from those of the other Nocardia genomes, owing to a large number of mobile element genes. In addition, the N. seriolae UTF1-specific genes include homologs of many transporters, allowing us to speculate on their role in adaptation to the marine environment. Thus, we expect that the complete genome of N. seriolae UTF1 can be used as a reference sequence not only for N. seriolae isolates but also, as an example of a marine fish pathogen, for comparative genomic studies of the genus Nocardia, providing insights into the ecological and functional diversity of this genus [1].

Identification and characterization of gap regions in the two previously reported N. seriolae draft genome sequences.

The contig sequences of N. seriolae ZJ0503 (GenBank: NZ_JNCT01000000, 319 contigs) and N-2927 (GenBank: NZ_BAWD02000000, 339 contigs) were aligned to the complete sequence of the N. seriolae UTF1 genome. Three hundred (average length: 1373 bp) and 297 (average length: 1502 bp) uncovered regions were detected in the comparisons with ZJ0503 and N-2927, respectively. These gap sequences were subjected to a BLASTX search against the NCBI RefSeq database (E-value threshold of 1E−5). As a result, 294 (98.0%) for ZJ0503 and 290 (97.6%) for N-2927 had a significant BLAST hit. Values represent the number of genes with the best BLAST hit for ZJ0503 (red) and N-2927 (blue). (PDF)

Comparison of N. seriolae UTF1 genome and two other N. seriolae draft genomes, ZJ0503 (GenBank: NZ_JNCT01000000) and N-2927 (GenBank: NZ_BAWD02000000).

The BLASTN-based ring image was generated by BLAST Ring Image Generator (BRIG) version 0.95 [32]. The innermost two rings show GC content (black) and GC skew (purple). The second and third innermost rings represent a BLASTN comparison with the draft genomes of ZJ0503 (blue) and N-2927 (red), respectively. Bars indicate the position of mobile-element related genes in the N. seriolae UTF1 genome such as transposases (black), endonuclease DDE (blue) and integrase (green). (PDF)

Dot plot analysis of the genome sequence of four Nocardia species (N. farcinica IFM 10152, N. brasiliensis HUJEG-1, N. cyriacigeorgica GUH-2 and N. nova SH22a).

Nucleotide-based alignments were performed with MUMmer version 3.22 and dot plots were generated by the mummerplot script and the Unix program gnuplot [31]. (PDF)

Functional profiling of the five Nocardia complete genomes with the dataset of overall subsystems.

(TIF)

Functional profiling of the five Nocardia complete genomes with the KEGG modules assigned by BlastKOALA.

(TIF)

Long-range PCR primers used in this study.

(XLSX)

Candidate virulence factors identified in the N. seriolae UTF1 according to VFDB.

(XLSX)

Pairwise amino acid sequence similarities (%) of N. seriolae UTF1 Mce1 proteins with the other four Nocardia species, Rhodococcus equi and Mycobacterium tuberculosis.

(XLSX)