Whole genome shotgun sequence of Bacillus amyloliquefaciens TF28, a biocontrol entophytic bacterium.

Shumei Zhang1, Wei Jiang1, Jing Li1, Liqiang Meng1, Xu Cao1, Jihua Hu2, Yushuai Liu1, Jingyu Chen2, Changqing Sha3.   

Abstract

Bacillus amyloliquefaciens TF28 is a biocontrol endophytic bacterium capable of inhibiting a broad range of plant pathogenic fungi. The strain has the potential to be developed into a biocontrol agent for use in agriculture. Here we report the whole-genome shotgun sequence of the strain. The genome of B. amyloliquefaciens TF28 is 3,987,635 bp in size and contains 3,754 protein-coding genes, 65 tandem repeat sequences, 47 minisatellite DNA, 2 microsatellite DNA, 63 tRNA, 7 rRNA, 6 sRNA, 3 prophages and CRISPR domains.

Keywords:  Bacillus amyloliquefaciens; Biocontrol; Broad spectrum; Endophytic bacterium; Genome sequence

Year:  2016        PMID: 27688836      PMCID: PMC5031281          DOI: 10.1186/s40793-016-0182-6

Source DB:  PubMed          Journal:  Stand Genomic Sci        ISSN: 1944-3277


Introduction

Bacillus amyloliquefaciens is ubiquitous in nature. Some strains are used as biocontrol agents because of their ability to produce antagonistic metabolites, plant growth promoters and plant health enhancers [1-4]. B. amyloliquefaciens is usually divided into two subspecies by genome comparison and classical bacterial taxonomy. Plant growth-promoting rhizobacterial strains are classified as B. amyloliquefaciens subsp. plantarum, while other strains are regarded as B. amyloliquefaciens subsp. amyloliquefaciens [5]. B. amyloliquefaciens TF28 is an endophytic bacterium that was isolated from soybean root. Previous studies have shown that TF28 inhibits soil-borne and air-borne plant pathogenic fungi through competition, synthesis of antifungal metabolites and induction of systemic plant resistance [6, 7]. Based on 16S rRNA, DNA gyrase subunit A (gyrA) and RNA polymerase subunit B (rpoB) gene sequence analysis, TF28 was classified as B. amyloliquefaciens subsp. plantarum. Here we present a whole-genome shotgun sequence of TF28 and its annotation to facilitate its application in the biocontrol of plant diseases.

Organism information

Classification and features

B. amyloliquefaciens TF28 was isolated from soybean root in China. It exhibited an unusual ability to inhibit a wide range of plant pathogenic fungi. The cell morphology of strain TF28 was determined using scanning electron microscopy (Fig. 1). Cells of TF28 are Gram-positive, rod-shaped, aerobic and endospore-forming. Strain TF28 utilizes glucose and lactose to produce acid and hydrolyzes gelatin and starch. Strain TF28 is positive for the Voges-Proskauer and methyl red reactions, nitrate reduction and citrate utilization. The current taxonomic classification and general features of TF28 are provided in Table 1.
Fig. 1

A scanning electron micrograph of B. amyloliquefaciens TF28 cells

Table 1

Classification and general features of B. amyloliquefaciens TF28 as per MIGS recommendation [26]

| MIGS ID | Property | Term | Evidence code a |
| | Classification | Domain Bacteria | TAS [27] |
| | | Phylum Firmicutes | TAS [28-30] |
| | | Class Bacilli | TAS [31, 32] |
| | | Order Bacillales | TAS [29, 33] |
| | | Family Bacillaceae | TAS [29, 34] |
| | | Genus Bacillus | TAS [29, 35] |
| | | Species Bacillus amyloliquefaciens | TAS [36-38] |
| | | Strain: TF28 | TAS [6] |
| | Gram stain | Positive | TAS [6] |
| | Cell shape | Rod | TAS [6] |
| | Motility | Motile | TAS [6] |
| | Sporulation | Endospore-forming | TAS [6] |
| | Temperature range | 15–37 °C | TAS [6] |
| | Optimum temperature | 30 °C | TAS [6] |
| | pH range; Optimum | 5–9, 7.5 | TAS [6] |
| | Carbon source | Glucose, lactose, starch | TAS [6] |
| MIGS-6 | Habitat | Soil, Plant | TAS [6] |
| MIGS-6.3 | Salinity | 0–3 % W/V | TAS [6] |
| MIGS-22 | Oxygen requirement | Aerobic | TAS [6] |
| MIGS-15 | Biotic relationship | Free-living | TAS [6] |
| MIGS-14 | Pathogenicity | Non-pathogen | NAS |
| MIGS-4 | Geographic location | China/Heilongjiang | TAS [6] |
| MIGS-5 | Sample collection | 2006-06-10 | TAS [6] |
| MIGS-4.1 | Latitude | Not reported | |
| MIGS-4.2 | Longitude | Not reported | |
| MIGS-4.4 | Altitude | Not reported | |

a Evidence codes - TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [39]

The 16S rRNA gene sequence of strain TF28 and the available 16S rRNA gene sequences of closely related species from the NCBI database were used to construct a phylogenetic tree (Fig. 2, Additional file 1: Table S1). The evolutionary history was inferred using the neighbor-joining method in MEGA version 5.10. BLAST analysis showed that strain TF28 shared 99.3–99.7 % 16S rRNA gene sequence identity with 14 type strains of Bacillus species. These 14 type strains fell into two groups. Strain TF28 clustered with FZB42T, CBMB205T, B. amyloliquefaciens subsp. amyloliquefaciens DSM7T and PD-A10T in one group. The other strains (NBRC 15539, DSM 11031, 10bT, 168T, DSM 10, BGSC 3A28, NBRC 101239, NBRC 15718, CR-95T and LMG 22476) formed a second group. The two type strains of the B. amyloliquefaciens subspecies, FZB42T and DSM7T, fell into different clades. Strain TF28 was most closely related to FZB42T, with 99.7 % 16S rRNA gene sequence identity, and was classified as B. amyloliquefaciens subsp. plantarum.
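The identity values quoted above are pairwise percent identities between 16S rRNA gene sequences. The following is a minimal sketch of such a calculation, assuming two sequences that have already been aligned to the same length and skipping any column in which either sequence has a gap. The function name `percent_identity` and the toy sequences are hypothetical and are not the TF28 data; BLAST computes identity over its own local alignment, so its figures may differ from this simplified version.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns, skipping columns with a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: excluded from the comparison
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 10 columns are compared, 9 of them match.
toy_a = "ACGTACGT-ACG"
toy_b = "ACGTTCGTAAC-"
toy_identity = percent_identity(toy_a, toy_b)
print(f"{toy_identity:.1f} % identity")
```

Excluding gap columns is one common convention; counting gaps as mismatches would give lower values for the same alignment.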
Fig. 2

Phylogenetic tree based on 16S rRNA gene sequences highlighting the position of B. amyloliquefaciens TF28 (shown in bold). The GenBank accession numbers are shown in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were constructed using the neighbor-joining method within the MEGA 5.10 software (Additional file 7: Table S7). Numbers at the nodes represent percentages of bootstrap values obtained by repeating the analysis 1000 times to generate a majority consensus tree. The scale bar indicates 0.0005 nucleotide changes per nucleotide position


Genome sequencing information

Genome project history

The genome of TF28 was sequenced by Huada Gene Technology Co., Ltd, Shenzhen, China. The whole-genome shotgun sequence has been deposited in the GenBank database under accession number JUDU00000000. A summary of the project information is shown in Table 2.
Table 2

Project information

| MIGS ID | Property | Term |
| MIGS-31 | Finishing quality | High-quality draft |
| MIGS-28 | Libraries used | Illumina Paired-End Library (500 bp insert size) and Mate Pair Library (6,000 bp insert size) |
| MIGS-29 | Sequencing platforms | Illumina HiSeq 2000 |
| MIGS-31.2 | Fold coverage | 150× |
| MIGS-30 | Assemblers | SOAPdenovo software 2.04 |
| MIGS-32 | Gene calling method | Glimmer |
| | Locus Tag | TH57 |
| | GenBank ID | JUDU00000000 |
| | GenBank Date of Release | 2015/01/21 |
| | GOLD ID | - |
| | BIOPROJECT | PRJNA268537 |
| MIGS-13 | Source Material Identifier | TF28 |
| | Project relevance | Biocontrol, Agriculture |

Growth conditions and genomic DNA preparation

TF28 was grown in LB medium at 30 °C for 16 h. One liter of culture in the exponential growth phase was centrifuged at 5000 rpm for 10 min at 4 °C. About 5 g of the cell pellet was used to extract genomic DNA by the CTAB method [8]. The quality of the DNA was assessed using a Qubit Fluorometer. A total of 280.6 μg of DNA was obtained for genome sequencing.

Genome sequencing and assembly

Genomic DNA was sheared randomly. DNA fragments of the required length were recovered by electrophoresis and used to construct a 500 bp paired-end library and a 6,000 bp mate-pair library (Table 2). Sequencing was performed on the Illumina HiSeq 2000 platform. Sequencing of the 500 bp library generated 6,649,820 reads (representing 554 Mbp of sequence information), while sequencing of the 6,000 bp library generated 3,633,388 reads (290 Mbp). Together the two libraries gave a genome coverage of about 190× for an estimated genome size of 4.4 Mbp. All generated reads were quality trimmed to obtain clean reads. The trimmed reads were assembled with SOAPdenovo 2.04, using the available genome sequence of FZB42T (CP000560) as a reference to guide the assembly. The final assembly yielded 182 contigs and 3 scaffolds representing 3.9 Mbp of sequence information.
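The coverage figure above can be cross-checked by dividing the total number of sequenced bases by the estimated genome size. The sketch below uses only the yields and genome size stated in the text; the function and variable names are illustrative and are not part of the authors' pipeline.

```python
def fold_coverage(total_sequenced_bp: float, genome_size_bp: float) -> float:
    """Average depth of coverage = sequenced bases / genome size."""
    return total_sequenced_bp / genome_size_bp


library_500bp_yield_bp = 554e6    # 500 bp library, as stated in the text
library_6000bp_yield_bp = 290e6   # 6,000 bp library, as stated in the text
estimated_genome_bp = 4.4e6       # estimated genome size used by the authors

coverage = fold_coverage(
    library_500bp_yield_bp + library_6000bp_yield_bp, estimated_genome_bp
)
print(f"combined coverage is about {coverage:.0f}x")
```

The combined yield of 844 Mbp gives roughly 192×, which agrees with the approximately 190× reported for the two libraries together.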

Genome annotation

The genome sequence was annotated by a combination of several annotation tools. Genes were identified by Glimmer 3.02 [9]. DNA tandem repeat sequences, minisatellite DNA and microsatellite DNA were found with Tandem Repeats Finder 4.04 [10]. Non-coding RNA was predicted using BLAST searches against an rRNA database or RNAmmer 1.2 for rRNA [11], tRNAscan-SE 1.23 for tRNA and its secondary structure [12], and Infernal with the Rfam database for sRNA [13]. Prophages were predicted using PHAST 2013.03.20 [14]. CRISPR domains were found using CRISPRFinder 0.4 [15]. Functional annotation of protein coding genes was based on comparisons with the GO database (version 1.419) [16], the KEGG database (version 59) [17], Clusters of Orthologous Groups of proteins (COG) (version 20090331) [18], the NR database (version 20121005), SwissProt (version 201206) [19] and the Pfam database (version 25) [20].

Genome properties

The genome statistics are provided in Table 3 and Fig. 3. The high quality draft genome of TF28 was distributed in 182 contigs with a total size of 3,987,635 bp and an average G + C content of 46.38 %. Genome analysis showed that the genome of strain TF28 contained 3,754 protein coding genes, 65 tandem repeat sequences, 47 minisatellite DNA, 2 microsatellite DNA, 63 tRNA, 7 rRNA, 6 sRNA, 3 prophage and 3 CRISPR domains. The predicted protein coding genes represented 89.57 % of the total genome sequence, with a total length of 3,571,596 bp. The majority of protein coding genes (76.13 %) were assigned to putative functions. The distribution of genes into COG functional categories is presented in Table 4.
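The percentages in this paragraph and in Table 3 can be recomputed from the raw counts. The short sketch below does so using only numbers stated in the text and in Table 3; the variable names and the helper `pct` are illustrative.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


genome_size_bp = 3_987_635
coding_bp = 3_571_596
gc_bp = 1_849_465             # G + C bases, from Table 3
total_genes = 3863            # from Table 3
protein_coding_genes = 3754
rna_genes = {"tRNA": 63, "rRNA": 7, "sRNA": 6}

coding_pct = pct(coding_bp, genome_size_bp)           # coding density
gc_pct = pct(gc_bp, genome_size_bp)                   # G + C content
protein_pct = pct(protein_coding_genes, total_genes)  # share of all genes
rna_total = sum(rna_genes.values())
rna_pct = pct(rna_total, total_genes)

print(coding_pct, gc_pct, protein_pct, rna_total, rna_pct)
```

The tRNA, rRNA and sRNA counts sum to the 76 RNA genes listed in Table 3, and the recomputed values match the reported 89.57 %, 46.38 %, 97.18 % and 1.97 %.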
Table 3

Genome statistics

| Attribute | Value | % of Total |
| Genome size (bp) | 3,987,635 | 100.00 |
| DNA coding (bp) | 3,571,596 | 89.57 |
| DNA G + C (bp) | 1,849,465 | 46.38 |
| DNA scaffolds | 3 | - |
| Total genes | 3863 | 100.00 |
| Protein coding genes | 3754 | 97.18 |
| RNA genes | 76 | 1.97 |
| Pseudo genes | 38 | 0.98 |
| Genes in internal clusters | 1554 | 40.23 |
| Genes with function prediction | 2941 | 76.13 |
| Genes assigned to COGs | 3218 | 83.30 |
| Genes with Pfam domains | 3292 | 85.21 |
| Genes with signal peptides | 204 | 5.28 |
| Genes with transmembrane helices | 1041 | 26.95 |
| CRISPR repeats | 3 | - |
Fig. 3

Circle map of strain TF28 genome. From outer to inner circle, circle 1 shows protein-coding genes colored by COG categories; circle 2 shows G + C% content plot; circle 3 shows GC skew

Table 4

Number of genes associated with general COG functional categories

| Code | Value | %age | Description |
| J | 136 | 3.02 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0 | RNA processing and modification |
| K | 269 | 5.97 | Transcription |
| L | 114 | 2.53 | Replication, recombination and repair |
| B | 1 | 0.02 | Chromatin structure and dynamics |
| D | 34 | 0.75 | Cell cycle control, Cell division, chromosome partitioning |
| V | 49 | 1.09 | Defense mechanisms |
| T | 134 | 2.97 | Signal transduction mechanisms |
| M | 170 | 3.77 | Cell wall/membrane biogenesis |
| N | 54 | 1.19 | Cell motility |
| U | 45 | 0.99 | Intracellular trafficking and secretion |
| O | 96 | 2.13 | Posttranslational modification, protein turnover, chaperones |
| C | 178 | 3.95 | Energy production and conversion |
| G | 243 | 5.39 | Carbohydrate transport and metabolism |
| E | 342 | 7.58 | Amino acid transport and metabolism |
| F | 78 | 1.73 | Nucleotide transport and metabolism |
| H | 123 | 2.73 | Coenzyme transport and metabolism |
| I | 116 | 2.57 | Lipid transport and metabolism |
| P | 208 | 4.61 | Inorganic ion transport and metabolism |
| Q | 122 | 2.71 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 429 | 9.51 | General function prediction only |
| S | 277 | 6.14 | Function unknown |
| - | 1291 | 28.63 | Not in COGs |

The total % age is based on the total number of protein coding genes in the annotated genome
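The counts and percentages in Table 4 can be cross-checked with a few lines of arithmetic. In the sketch below, which uses only the table values, the printed percentages are reproduced to within 0.01 when each count is divided by the sum of all rows (4,509) rather than by the 3,754 protein-coding genes. This denominator is an observation from the arithmetic, not a statement made by the authors, and the variable names are illustrative.

```python
# Counts and printed percentages copied from Table 4 (COG code -> value).
cog_counts = {
    "J": 136, "A": 0, "K": 269, "L": 114, "B": 1, "D": 34, "V": 49, "T": 134,
    "M": 170, "N": 54, "U": 45, "O": 96, "C": 178, "G": 243, "E": 342, "F": 78,
    "H": 123, "I": 116, "P": 208, "Q": 122, "R": 429, "S": 277, "-": 1291,
}
printed_pct = {
    "J": 3.02, "A": 0.0, "K": 5.97, "L": 2.53, "B": 0.02, "D": 0.75, "V": 1.09,
    "T": 2.97, "M": 3.77, "N": 1.19, "U": 0.99, "O": 2.13, "C": 3.95, "G": 5.39,
    "E": 7.58, "F": 1.73, "H": 2.73, "I": 2.57, "P": 4.61, "Q": 2.71, "R": 9.51,
    "S": 6.14, "-": 28.63,
}

row_total = sum(cog_counts.values())                # all rows, incl. "Not in COGs"
assigned_to_cogs = row_total - cog_counts["-"]      # rows J through S only

# Largest gap between a recomputed percentage and the printed one.
max_diff = max(
    abs(100.0 * cog_counts[code] / row_total - printed_pct[code])
    for code in cog_counts
)
print(row_total, assigned_to_cogs, round(max_diff, 4))
```

The rows J through S sum to 3,218, which agrees with the "Genes assigned to COGs" entry in Table 3.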


Insights from the genome sequence

GO analysis classified the protein coding genes into three main categories by function (Fig. 4): 1901, 2993 and 4309 genes were assigned to cellular component, molecular function and biological process, respectively. Metabolic pathway analysis using KEGG annotation showed that the majority of protein coding genes participated in metabolism, genetic information processing, environmental information processing and cellular processes (Fig. 5). KEGG orthology identified 154 metabolic pathways, including glycolysis, the TCA cycle and the pentose phosphate pathway; fructose, mannose and galactose metabolism; fatty acid biosynthesis and metabolism; ubiquinone and other terpenoid-quinone biosynthesis; bacterial chemotaxis; biosynthesis of siderophore group nonribosomal peptides; antibiotic biosynthesis (tetracycline, penicillin and cephalosporin, streptomycin, novobiocin and vancomycin); and degradation pathways for noxious substances (caprolactam, atrazine, ethylbenzene, toluene, polycyclic aromatic hydrocarbon, chloroalkane and chloroalkene, bisphenol, naphthalene, aminobenzoate, limonene and pinene).
Fig. 4

GO annotation of protein-coding genes

Fig. 5

KEGG annotation of protein-coding genes

Genome similarity was assessed with MUMmer by comparing the genome sequence of strain TF28 with that of the type strain FZB42T at the amino acid level [21]. The genome similarity of TF28 and FZB42T reached 98.69 %. Core and pan genes were also determined based on NCBI BLAST and MUSCLE analysis [22]. A total of 201 strain-specific genes were observed for B. amyloliquefaciens TF28, which may contribute to the species-specific features of this bacterium. Among them, 83 genes are classified into 17 COG functional categories, mainly carbohydrate transport and metabolism (6.97 %), general function prediction only (4.48 %), defense mechanisms (4.48 %), amino acid transport and metabolism (3.98 %) and signal transduction mechanisms (3.48 %). The remaining 116 unique genes (57.71 %) are not classified into any COG category (Table 5).

Comparative genome analysis revealed that TF28 possesses giant gene clusters for the non-ribosomal synthesis of the polyketides difficidin (TH57_02955-TH57_03045) and bacillaene (TH57_05575-TH57_05655), the antifungal lipopeptides surfactin (TH57_12375-TH57_12430), plipastatin (TH57_04780-TH57_04835), mycosubtilin (TH57_04955-TH57_04980), bacilysin (TH57_15685-TH57_15710) and bacillibactin (TH57_05755-TH57_05800) (Additional file 2: Table S2). Together these gene clusters account for 6.8 % of the genome, less than in strain FZB42T (8.9 %) [23]. Mycosubtilin and plipastatin synthesis gene clusters were only observed in strain TF28. These gene clusters encode non-ribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs) whose products, including peptide antibiotics, usually display antifungal and antibacterial activities [23-25]. The presence of these gene clusters indicates that strain TF28 has a high potential for biocontrol.
In addition, sporulation genes, spo0ABFJ (TH57_02695, 01435, 16015 and 14190), spoVABCDEFKSMRT (TH57_03250, 03255, 03260, 03265, 03270, 03275, 03280), SpoIIBPMERDQSASB (TH57_05470, 00570, 06315, 09520, 13850), spoIIIABCDEFGH (TH57_01370, 02065, 03200, 07810, 07815, 13800, 16095, 16205 and 16305), coaX (TH57_01485), YtrIH (TH57_00725, 009730), ylbJB (TH57_06685), ydcC (TH57_11735), ydhD (TH57_07580), cse15 (TH57_07135) and yunB (TH57_18555), and motility genes, motAB (TH57_074157, 07420) and swrABC (TH57_16980, 05970 and 10760), were found in the genome.
Table 5

Number of strain-specific genes with general COG functional categories

| Code | Value | %age | Description |
| J | 2 | 0.99 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0 | RNA processing and modification |
| K | 5 | 2.49 | Transcription |
| L | 4 | 1.99 | Replication, recombination and repair |
| D | 5 | 2.49 | Cell cycle control, Cell division, chromosome partitioning |
| V | 9 | 4.48 | Defense mechanisms |
| T | 7 | 3.48 | Signal transduction mechanisms |
| N | 1 | 0.49 | Cell motility |
| C | 3 | 1.49 | Energy production and conversion |
| G | 14 | 6.97 | Carbohydrate transport and metabolism |
| E | 8 | 3.98 | Amino acid transport and metabolism |
| F | 4 | 1.99 | Nucleotide transport and metabolism |
| H | 6 | 2.99 | Coenzyme transport and metabolism |
| I | 1 | 0.49 | Lipid transport and metabolism |
| P | 1 | 0.49 | Inorganic ion transport and metabolism |
| Q | 1 | 0.49 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 9 | 4.48 | General function prediction only |
| S | 3 | 1.49 | Function unknown |
| - | 116 | 57.71 | Not in COGs |
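The summary figures quoted for the strain-specific genes (83 genes in 17 COG categories, led by carbohydrate transport and metabolism) can be derived from the counts in Table 5. The sketch below tallies the table and ranks the categories; it uses only the table values and the total of 201 strain-specific genes stated in the text, and the variable names are illustrative.

```python
# Strain-specific gene counts per COG code, copied from Table 5
# (the "Not in COGs" row is kept separately).
specific_counts = {
    "J": 2, "A": 0, "K": 5, "L": 4, "D": 5, "V": 9, "T": 7, "N": 1, "C": 3,
    "G": 14, "E": 8, "F": 4, "H": 6, "I": 1, "P": 1, "Q": 1, "R": 9, "S": 3,
}
not_in_cogs = 116
strain_specific_total = 201   # total stated in the text

in_cogs = sum(specific_counts.values())
populated = [code for code, n in specific_counts.items() if n > 0]

# Rank categories by count (ties broken alphabetically by COG code).
ranked = sorted(specific_counts.items(), key=lambda kv: (-kv[1], kv[0]))
top5_codes = [code for code, _ in ranked[:5]]

top_pct = round(100.0 * specific_counts["G"] / strain_specific_total, 2)
print(in_cogs, len(populated), top5_codes, top_pct)
```

Dividing by 201 reproduces the 6.97 % reported for category G. The rows of the table, including "Not in COGs", sum to 199 rather than 201; the sketch does not attempt to explain that difference.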
Comparative genomic analysis of TF28 and 22 other B. amyloliquefaciens strains with complete genome sequences indicated that the genome of strain TF28 is somewhat larger than those of FZB42T and B. amyloliquefaciens subsp. amyloliquefaciens DSM7. Three strains, IT-45, NAU-B3 and TF28, possessed CRISPR domains according to CRISPRFinder online (Additional file 3: Table S3, Additional file 4: Table S4, Additional file 5: Table S5, Additional file 6: Table S6). TF28 possessed 3 CRISPR domains. The CRISPR length is 422 bp, with 81 bp direct repeat (DR) sequences separated by 5 spacers. No CRISPR-associated gene was observed, owing to the incomplete genome sequence. NAU-B3 had 1 CRISPR domain, with a CRISPR length of 67 bp and 26 bp DR sequences separated by 1 spacer. IT-45 had 2 CRISPR domains, with a CRISPR length of 129 bp and 37 bp DR sequences separated by 1 spacer.

The full-length sequences of two protein-coding genes, DNA gyrase subunit A (gyrA) and RNA polymerase subunit B (rpoB), from 22 strains of B. amyloliquefaciens were chosen for phylogenetic analysis. The neighbor-joining (NJ) phylogenetic trees revealed that strain TF28 clustered in the same group as most of the other strains, distinct from the type strain B. amyloliquefaciens subsp. amyloliquefaciens DSM 7 (Fig. 6).
Fig. 6

Phylogenetic trees based on gyrA (a) and rpoB (b). The GenBank accession numbers are shown in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were constructed using the neighbor-joining method within the MEGA 5.10 software. Numbers at the nodes represent percentages of bootstrap values obtained by repeating the analysis 1000 times to generate a majority consensus tree. The scale bars indicate 0.2 (gyrA) and 0.1 (rpoB) nucleotide changes per nucleotide position, respectively


Conclusions

In this study, we characterized the genome of B. amyloliquefaciens TF28 isolated from soybean root. Strain TF28 was classified as B. amyloliquefaciens subsp. plantarum based on comparative analysis of 16S rRNA, DNA gyrase subunit A (gyrA) and RNA polymerase subunit B (rpoB) gene sequences. The genome of strain TF28 contains giant gene clusters that are linked with biocontrol, including those for the non-ribosomal synthesis of the polyketides difficidin and bacillaene, the antifungal lipopeptides surfactin, plipastatin, mycosubtilin, bacilysin and bacillibactin. Mycosubtilin and plipastatin synthesis gene clusters were only observed in strain TF28. Pathways for ubiquinone and other terpenoid-quinone biosynthesis, bacterial chemotaxis, biosynthesis of siderophore group nonribosomal peptides, antibiotic biosynthesis and noxious substance degradation were found, reflecting a high capacity of strain TF28 to promote plant growth, inhibit pathogens and support environmental fitness. The 201 strain-specific genes found in TF28 provide information for further analysis of the strain's functions. The availability of the genome provides insights into the biocontrol mechanisms and will facilitate the utilization of the strain in the future.
References (28 in total)

1.  Complete genome sequence of Bacillus amyloliquefaciens L-H15, a plant growth promoting rhizobacteria isolated from cucumber seedling substrate.

Authors:  Yuxuan Qin; Yuzhu Han; QingMao Shang; Pinglan Li
Journal:  J Biotechnol       Date:  2015-02-25       Impact factor: 3.307

2.  Tandem repeats finder: a program to analyze DNA sequences.

Authors:  G Benson
Journal:  Nucleic Acids Res       Date:  1999-01-15       Impact factor: 16.971

3.  Complete genome sequence of Bacillus amyloliquefaciens L-S60, a plant growth-promoting and antifungal bacterium.

Authors:  Yuxuan Qin; Yuzhu Han; Yaqiong Yu; Qingmao Shang; Bao Zhang; Pinglan Li
Journal:  J Biotechnol       Date:  2015-08-20       Impact factor: 3.307

4.  Bacillus velezensis is a later heterotypic synonym of Bacillus amyloliquefaciens.

Authors:  Li-Ting Wang; Fwu-Ling Lee; Chun-Ju Tai; Hsiao-Ping Kuo
Journal:  Int J Syst Evol Microbiol       Date:  2008-03       Impact factor: 2.747

5.  UniProt Knowledgebase: a hub of integrated protein data.

Authors:  Michele Magrane
Journal:  Database (Oxford)       Date:  2011-03-29       Impact factor: 3.451

6.  PHAST: a fast phage search tool.

Authors:  You Zhou; Yongjie Liang; Karlene H Lynch; Jonathan J Dennis; David S Wishart
Journal:  Nucleic Acids Res       Date:  2011-06-14       Impact factor: 16.971

7.  From genomics to chemical genomics: new developments in KEGG.

Authors:  Minoru Kanehisa; Susumu Goto; Masahiro Hattori; Kiyoko F Aoki-Kinoshita; Masumi Itoh; Shuichi Kawashima; Toshiaki Katayama; Michihiro Araki; Mika Hirakawa
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

8.  Whole-Genome Shotgun Sequence of Bacillus amyloliquefaciens Strain UASWS BA1, a Bacterium Antagonistic to Plant Pathogenic Fungi.

Authors:  F Lefort; G Calmin; P Pelleteret; L Farinelli; M Osteras; J Crovadore
Journal:  Genome Announc       Date:  2014-02-06

9.  The COG database: an updated version includes eukaryotes.

Authors:  Roman L Tatusov; Natalie D Fedorova; John D Jackson; Aviva R Jacobs; Boris Kiryutin; Eugene V Koonin; Dmitri M Krylov; Raja Mazumder; Sergei L Mekhedov; Anastasia N Nikolskaya; B Sridhar Rao; Sergei Smirnov; Alexander V Sverdlov; Sona Vasudevan; Yuri I Wolf; Jodie J Yin; Darren A Natale
Journal:  BMC Bioinformatics       Date:  2003-09-11       Impact factor: 3.169

10.  Deciphering the conserved genetic loci implicated in plant disease control through comparative genomics of Bacillus amyloliquefaciens subsp. plantarum.

Authors:  Mohammad J Hossain; Chao Ran; Ke Liu; Choong-Min Ryu; Cody R Rasmussen-Ivey; Malachi A Williams; Mohammad K Hassan; Soo-Keun Choi; Haeyoung Jeong; Molli Newman; Joseph W Kloepper; Mark R Liles
Journal:  Front Plant Sci       Date:  2015-08-17       Impact factor: 5.753
