
Genome sequence data of Bacillus velezensis BP1.2A and BT2.4.

Christian Blumenscheit, Jennifer Jähne, Andy Schneider, Jochen Blom, Thomas Schweder, Peter Lasch, Rainer Borriss.

Abstract

Here, we report the complete genome sequences of the biocontrol strains Bacillus velezensis BP1.2A and BT2.4, isolated from Vietnamese crop plants. The genomes are 3,916,868 bp (BP1.2A) and 3,922,686 bp (BT2.4) in size. The data have been deposited at NCBI GenBank. The accession numbers for the B. velezensis strains are PRJNA634914 (BP1.2A) and PRJNA634832 (BT2.4) for the BioProjects, CP085504 (BP1.2A) and CP085505 (BT2.4) for the chromosomes, GCA_013285085.2 (BP1.2A) and GCA_013284785.2 (BT2.4) for the GenBank assemblies, and SAMN15012571 (BP1.2A) and SAMN15009897 (BT2.4) for the BioSamples. Both genomes are closely related to that of FZB42, the model strain for plant growth-promoting bacilli.
© 2022 Published by Elsevier Inc.

Keywords:  Bacillus velezensis; Complete genome; Lipopeptides; Macrolactin; Phylogenetic analysis; Polyketides

Year:  2022        PMID: 35242952      PMCID: PMC8885614          DOI: 10.1016/j.dib.2022.107978

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications Table

Data source location: BP1.2A was isolated from black pepper roots (Viet Nam: Chu Se, Gia Lei), and BT2.4 was isolated from a dragon fruit tree (Viet Nam: Ham Thuan Nam, Binh Thuan) by Le Thi Thanh Tam, PPRI, Hanoi, Viet Nam.

Value of the Data

The data in this article demonstrate that closely related Bacillus strains can be isolated from remote geographical regions with different climatic conditions: BP1.2A and BT2.4 share 99.99% identical residues with the model strain FZB42 (Table 3). The high similarity of the two novel strains to the biocontrol strain FZB42 encourages their development as biocontrol agents for sustainable agriculture in temperate as well as subtropical zones.
Table 3

Sequence comparison of BP1.2A and BT2.4 with FZB42 using blastn and ANIb [11]. The numbers set in brackets indicate the overlap of the sequences used in the comparison. Analysis of singletons was performed with the EDGAR software package [12].

| ANIb comparison | BP1.2A (CP085504.1) | BT2.4 (CP085505.1) | FZB42 (CP000560.2) |
|---|---|---|---|
| BP1.2A | * | 100 (99.74) | 100.00 (99.64) |
| BT2.4 | 100.00 (99.67) | * | 99.99 (99.58) |
| FZB42 | 100.00 (99.64) | 99.99 (99.61) | * |

| BLASTN comparison | Query BP1.2A | Query BT2.4 | Query FZB42 |
|---|---|---|---|
| BP1.2A cover | 100% | 99.854% | 98.877% |
| BP1.2A identities | 100% | 99.995% | 99.989% |
| BP1.2A different nts | 0 | 184/3,916,940 | 426/3,874,585 |
| BP1.2A gaps | 0 | 74/3,916,940 | 102/3,874,585 |
| BT2.4 cover | 100% | 100% | 99.866% |
| BT2.4 identities | 99.996% | 100% | 99.993% |
| BT2.4 different nts | 174/3,916,868 | 0 | 274/3,911,604 |
| BT2.4 gaps | 25/3,916,868 | 0 | 21/3,911,604 |
| FZB42 cover | 99.697% | 98.026% | 100% |
| FZB42 identities | 99.987% | 99.990% | 100% |
| FZB42 different nts | 490/3,904,992 | 382/3,845,221 | 0 |
| FZB42 gaps | 182/3,904,992 | 192/3,845,221 | 0 |

| Singletons (CDS) | BP1.2A | BT2.4 | FZB42 |
|---|---|---|---|
| BP1.2A | * | 1 | 41 |
| BT2.4 | 0 | * | 40 |
| FZB42 | 67 | 67 | * |
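
The identity percentages in the BLASTN block of Table 3 can be reproduced from the counts of differing nucleotides and the alignment lengths. The following is a minimal sketch, assuming that identity is simply one minus the ratio of differing nucleotides to aligned positions; this assumption matches the tabulated values but is not stated in the article, and the function name `percent_identity` is illustrative only.

```python
def percent_identity(different_nts: int, aligned_length: int) -> float:
    """Percent identity, assuming identity = 1 - differing / aligned positions."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * (1.0 - different_nts / aligned_length)


# (row strain, query strain) -> (different nts, alignment length), copied from Table 3
table3_counts = {
    ("BP1.2A", "BT2.4"): (184, 3_916_940),
    ("BP1.2A", "FZB42"): (426, 3_874_585),
    ("BT2.4", "BP1.2A"): (174, 3_916_868),
    ("BT2.4", "FZB42"): (274, 3_911_604),
    ("FZB42", "BP1.2A"): (490, 3_904_992),
    ("FZB42", "BT2.4"): (382, 3_845_221),
}

# Round to three decimals, the precision used in Table 3
identities = {
    pair: round(percent_identity(diff, length), 3)
    for pair, (diff, length) in table3_counts.items()
}

for (row, query), value in identities.items():
    print(f"{row} vs query {query}: {value:.3f}%")
```

Under this assumption, each computed value rounds to the identity reported in the corresponding table cell.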
The data demonstrate that gene clusters involved in non-ribosomal and ribosomal synthesis of antibacterial and antifungal secondary metabolites are highly conserved in different representatives of B. velezensis, regardless of their geographical origin. For the scientific community, the genome data presented here extend the resources for comparative genomic analysis among the members of the Bacillus amyloliquefaciens operational group, including Bacillus velezensis, at present the most important species used in biological plant protection. Furthermore, extended genomic analyses of closely related bacteria should reveal regions and/or genes with differing variability and might identify regions (genes) with an enhanced mutation bias.

Data Description

The draft genome sequences of 59 Gram-positive bacterial strains isolated from Vietnamese crop plants have been reported previously [1]. Two of these strains, B. velezensis BP1.2A and B. velezensis BT2.4, have now been completely sequenced using nanopore sequencing technology. Both sequences exhibited a very high degree of similarity to the model strain of plant growth-promoting Gram-positive bacteria, B. velezensis FZB42 [2]. The complete genomes consist of single circular chromosomes of 3,916,868 bp (BP1.2A) and 3,922,686 bp (BT2.4). Automatic genome annotation was performed using the RAST (Rapid Annotation using Subsystems Technology) server [3], and the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) [4] was used for the general genome annotation deposited at NCBI. As shown in Table 1, the distribution of subsystem proteins [5] in the two strains is very similar to that in FZB42 [6], indicating their close relationship. Genome mining of B. velezensis performed with antiSMASH version 6.0 [7] recovered the complete set of gene clusters and genes involved in non-ribosomal and ribosomal synthesis of secondary metabolites previously identified in FZB42. Table 2 shows the potential of B. velezensis strains BP1.2A, BT2.4, and FZB42 to synthesize an impressive number of different secondary metabolites.
Table 1

General genomic features of B. velezensis BP1.2A (CP085504.1) and BT2.4 (CP085505.1) compared with FZB42 (NC_009725.2). Methods used for generating the data are set in brackets (PGAP, RAST, EDGAR). Differences to FZB42 are labelled in red letters.

| Attributes | BP1.2A | BT2.4 | FZB42 |
|---|---|---|---|
| Genome size (bp) | 3,916,868 | 3,922,686 | 3,918,596 |
| G+C% | 46.5 | 46.5 | 46.5 |
| Number of genes (PGAP) | 3871 | 3870 | 3855 |
| CDSs total (PGAP) | 3753 | 3752 | 3734 |
| CDS core genome (EDGAR) | 3633 | 3633 | 3633 |
| CDS pan genome (EDGAR) | 3757 | 3757 | 3757 |
| RNA genes (RAST) | 118 | 118 | 118 |
| rRNAs (PGAP) | 27 | 27 | 29 |
| tRNAs (PGAP) | 86 | 86 | 88 |
| ncRNAs (PGAP) | 5 | 5 | 4 |
| Pseudo genes (PGAP) | 71 | 69 | 59 |
| Number of coding sequences (RAST) | 3939 | 3946 | 3938 |
| Number of Subsystems (RAST) | 324 | 324 | 324 |

| Subsystem Feature Counts | BP1.2A | BT2.4 | FZB42 |
|---|---|---|---|
| Cofactors, Vitamins, Prosthetic Groups, Pigments | 147 | 147 | 147 |
| Cell Wall and Capsule | 73 | 73 | 73 |
| Virulence, Disease and Defense | 38 | 38 | 38 |
| Potassium metabolism | 3 | 3 | 3 |
| Miscellaneous | 24 | 24 | 24 |
| Phages, Prophages, Transposable elements, Plasmids | 0 | 0 | 0 |
| Membrane Transport | 42 | 42 | 42 |
| Iron acquisition and metabolism | 25 | 25 | 25 |
| RNA metabolism | 63 | 63 | 64 |
| Nucleosides and Nucleotides | 95 | 95 | 95 |
| Protein Metabolism | 209 | 209 | 211 |
| Cell Division and Cell Cycle | 6 | 6 | 6 |
| Motility and Chemotaxis | 42 | 42 | 42 |
| Regulation and Cell signaling | 28 | 28 | 28 |
| Secondary Metabolism | 6 | 6 | 6 |
| DNA Metabolism | 63 | 63 | 63 |
| Fatty Acids, Lipids, and Isoprenoids | 53 | 53 | 53 |
| Nitrogen Metabolism | 20 | 20 | 20 |
| Dormancy and Sporulation | 91 | 91 | 91 |
| Respiration | 40 | 40 | 40 |
| Stress Response | 43 | 43 | 43 |
| Metabolism of Aromatic Compounds | 12 | 12 | 13 |
| Amino Acids and Derivatives | 299 | 300 | 301 |
| Sulfur Metabolism | 6 | 6 | 6 |
| Phosphorus Metabolism | 12 | 12 | 12 |
| Carbohydrates | 215 | 215 | 215 |
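
Genome size and G+C content, as listed in Table 1, can be recomputed directly from a FASTA file of the chromosome. The sketch below uses only the Python standard library; the function name `fasta_stats` and the toy record are illustrative assumptions and do not come from the deposited sequences.

```python
def fasta_stats(fasta_text: str) -> dict:
    """Return total length (bp) and G+C percentage over all records in a FASTA string."""
    sequence_parts = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and record headers
        sequence_parts.append(line.upper())
    sequence = "".join(sequence_parts)
    if not sequence:
        raise ValueError("no sequence data found")
    gc_count = sequence.count("G") + sequence.count("C")
    return {
        "length_bp": len(sequence),
        "gc_percent": round(100.0 * gc_count / len(sequence), 1),
    }


# Toy record (hypothetical; not taken from CP085504 or CP085505)
toy_fasta = ">toy_chromosome\nATGCGC\nATAT\n"
stats = fasta_stats(toy_fasta)
print(stats)
```

For a downloaded chromosome, the file contents can be passed in the same way, for example `fasta_stats(open(path).read())` with `path` pointing to the local FASTA file.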
Table 2

Detection of gene clusters involved in synthesis of secondary metabolites in the genomes of B. velezensis BP1.2A (CP085504) and B. velezensis BT2.4 (CP085505). For comparison, FZB42 (CP000560.2) was also analyzed. Similarity to known metabolites listed in the MIBiG 2.0 repository [8] is indicated. Coordinates are given as start-end positions.

| Region | CP085504 | CP085505 | CP000560.2 | Similarity |
|---|---|---|---|---|
| Surfactin | 318,208-383,067 | 318,208-383,067 | 322,723-387,582 | 95%, BGC0000433 |
| Plantazolicin | 717,159-740,336 | 717,099-740,276 | 721,674-744,851 | 100%, BGC0000569 |
| Ketoacyl:ACP synthase | 935,682-976,926 | 935,298-976,542 | 940,739-981,983 | 100%, Bacillus |
| Squalene/phytoene synthase | 1,062,552-1,079,781 | 1,062,168-1,079,397 | 1,074,783-1,075,523 | 100%, Bacillus |
| Macrolactin H | 1,366,841-1,453,226 | 1,366,457-1,452,842 | 1,371,897-1,458,282 | 100%, BGC0000181 |
| Bacillaene | 1,676,755-1,777,357 | 1,676,371-1,776,973 | 1,681,811-1,782,413 | 100%, BGC0001089 |
| Fengycin | 1,866,123-1,903,373 | 1,865,739-1,902,989 | 1,871,179-1,908,429 | 100%, BGC0001095 |
| Bacillomycin D | 1,907,878-1,963,948 | 1,918,319-1,963,564 | 1,923,759-1,969,004 | 100%, BGC0001090 |
| Squalene-hopene synthase | 2,010,880-2,032,763 | 2,010,496-2,032,379 | 2,024,219-2,026,102 | 100%, Bacillus |
| T3PKS | 2,099,249-2,140,349 | 2,098,865-2,139,965 | 2,102,588-2,143,688 | 100%, Bacillus |
| Difficidin | 2,269,142-2,362,931 | 2,268,758-2,362,547 | 2,344,012-2,286,309 | 100%, BGC0000176 |
| PK-5x Cys | 2,851,295-2,900,808 | 2,850,911-2,906,712 | 2,873,990-2,884,225 | 88%, B. velezensis |
| Bacillibactin | 3,017,800-3,024,927 | 3,023,696-3,030,823 | 3,021,021-3,033,995 | 100%, BGC0000309 |
| Amylocyclicin | 3,039,655-3,045,228 | 3,045,551-3,051,124 | 3,043,470-3,049,481 | 100%, BGC0000616 |
| Bacilysin | 3,574,134-3,615,552 | 3,580,030-3,621,448 | 3,593,882-3,599,780 | 100%, BGC0001184 |
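
One simple way to read the coordinates in Table 2 is to compare the span of each region across the three genomes. The sketch below assumes that the coordinates are 1-based and inclusive, which the article does not state explicitly; the helper `region_length` is illustrative, and only two rows of the table are included.

```python
def region_length(start: int, end: int) -> int:
    """Inclusive span in bp; abs() tolerates coordinates listed in reverse order."""
    return abs(end - start) + 1


# (start, end) per genome, copied from Table 2
regions = {
    "Surfactin": {
        "BP1.2A": (318_208, 383_067),
        "BT2.4": (318_208, 383_067),
        "FZB42": (322_723, 387_582),
    },
    "Plantazolicin": {
        "BP1.2A": (717_159, 740_336),
        "BT2.4": (717_099, 740_276),
        "FZB42": (721_674, 744_851),
    },
}

lengths = {
    name: {strain: region_length(*coords) for strain, coords in per_strain.items()}
    for name, per_strain in regions.items()
}

for name, per_strain in lengths.items():
    print(name, per_strain)
```

For these two rows, the predicted regions have the same span in all three genomes even though their absolute positions are shifted.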
The phylogenomic analysis performed with TYGS [10] shows that BP1.2A and BT2.4 are representatives of the species B. velezensis (Fig. 1). No differences from B. velezensis FZB42 were detected when the genomes were compared pairwise using ANIb [11] (Fig. 2), indicating their close relationship even though the sites of isolation (Vietnam and Germany) are very remote from each other.
Fig. 1

Phylogenetic tree of B. velezensis strains BP1.2A (CP085504) and BT2.4 (CP085505), labelled in red letters. The tree, based on whole genome sequences, was inferred with FastME 2.1.6.1 [9] from GBDP distances calculated from genome sequences. The branch lengths are scaled in terms of the GBDP distance formula. The numbers below branches are GBDP pseudo-bootstrap support values from 100 replications, with an average branch support of 57.3%.

Fig. 2

Pairwise comparison of the genomes of B. velezensis BP1.2A and BT2.4 with B. velezensis FZB42 and the type strain of B. velezensis, CCUG 50740, using ANIb [11].

Table 3 and the Venn diagram in Fig. 3 summarize the comparison of the whole genome sequences of BP1.2A and BT2.4 with FZB42. The three strains share a core genome of 3633 CDS. BP1.2A contains only one additional CDS (encoding a hypothetical protein) compared with BT2.4, suggesting that both strains are identical or nearly identical clones and that the observed difference is due to sequencing error(s). Slight differences were detected when the genomes were compared with FZB42: BP1.2A and BT2.4 harbored 41 and 40 CDS, respectively, that do not occur in the FZB42 genome. Conversely, FZB42 harbored a total of 67 singletons not present in the Vietnamese strains (Table 3). The slight differences from the numbers given in the Venn diagram (Fig. 3) are due to the different methods applied, as explained in the legend to Fig. 3.
Fig. 3

Venn diagram of the genomes of FZB42 (1), BP1.2A (2), and BT2.4 (3). Please note: the singleton numbers do not necessarily correspond to the numbers in the "Singleton" interface (Table 3). The Venn diagram constructed with EDGAR shows the number of best hits between subsets of genomes. However, a gene without a reciprocal best hit to another genome is not necessarily a singleton [12].
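
The distinction drawn in the legend to Fig. 3 can be illustrated with a toy example. The sketch below is not EDGAR's algorithm and does not reproduce its scoring criteria; the gene IDs, the `best_hits` table, and both helper functions are hypothetical. It only shows why a gene can lack a reciprocal best hit and still have a homolog in the other genome.

```python
# Toy best-hit tables; gene IDs are hypothetical and not taken from the deposited genomes.
# best_hits[(X, Y)] maps each gene of genome X to its best hit in genome Y (None = no hit).
best_hits = {
    ("A", "B"): {"a1": "b1", "a2": "b1", "a3": None},
    ("B", "A"): {"b1": "a1"},
}


def reciprocal_best_hits(src: str, dst: str) -> set:
    """Genes of src whose best hit in dst points back to the same gene."""
    forward = best_hits[(src, dst)]
    backward = best_hits[(dst, src)]
    return {
        gene
        for gene, hit in forward.items()
        if hit is not None and backward.get(hit) == gene
    }


def singletons(src: str, dst: str) -> set:
    """Genes of src with no hit at all in dst."""
    return {gene for gene, hit in best_hits[(src, dst)].items() if hit is None}


rbh = reciprocal_best_hits("A", "B")
single = singletons("A", "B")
# Genes that have a hit in B but are not part of a reciprocal pair
unpaired_non_singletons = set(best_hits[("A", "B")]) - rbh - single

print("reciprocal best hits:", rbh)
print("singletons:", single)
print("no reciprocal best hit, but not a singleton:", unpaired_non_singletons)
```

In this toy data set, `a2` hits `b1`, but `b1` pairs with `a1`, so `a2` is counted neither as a reciprocal best hit nor as a singleton. Counts derived from the two definitions can therefore differ.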


Experimental Design, Materials and Methods

Strain growth conditions and DNA isolation

Cultivation of the Bacillus strains and DNA isolation have been previously described [1].

Genome sequencing, assembly, and annotation

Short-read sequencing was conducted at LGC Genomics (Berlin, Germany) using Illumina HiSeq in 150 bp paired-end mode. Default parameters were used for all software unless otherwise specified. The short reads were trimmed and filtered using fastp [12] with default settings. Long-read sequencing was done in house on an Oxford Nanopore MinION with an R9.4.1 flow cell; libraries were prepared with the Ligation Sequencing Kit (SQK-LSK109). The samples were sequenced for 48 h and basecalled afterwards with Guppy v3.1.5. Long reads were trimmed using Porechop (https://github.com/rrwick/Porechop, v0.2.4) and filtered using Filtlong (https://github.com/rrwick/Filtlong, v0.2.0) with default settings. De novo assemblies were generated with the hybrid assembler Unicycler v0.4.8 [13]. Within Unicycler, the short-read assembly was done by SPAdes v3.13.0 [14] without read correction and with normal bridging, and the long-read assembly was done by racon v1.4.20 [15]. The quality of the assemblies was assessed by determining the ratio of falsely truncated proteins using Ideel (https://github.com/phiweger/ideel).
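
The read processing and hybrid assembly steps described above could be scripted roughly as follows. This is a sketch under stated assumptions, not the authors' actual command lines: all file names are hypothetical, the flags shown are those believed to correspond to the described settings and should be checked against the installed tool versions, and the Filtlong step is omitted because the text does not state which thresholds were applied. The function only builds the commands and does not execute them.

```python
import shlex


def build_commands(r1: str, r2: str, long_reads: str, outdir: str) -> list:
    """Build (but do not run) command lines for trimming and hybrid assembly."""
    # Hypothetical intermediate file names
    trimmed_r1 = f"{outdir}/short_R1.trimmed.fastq.gz"
    trimmed_r2 = f"{outdir}/short_R2.trimmed.fastq.gz"
    trimmed_long = f"{outdir}/long.trimmed.fastq.gz"
    return [
        # Short-read trimming and filtering with fastp defaults
        ["fastp", "-i", r1, "-I", r2, "-o", trimmed_r1, "-O", trimmed_r2],
        # Adapter trimming of nanopore reads
        ["porechop", "-i", long_reads, "-o", trimmed_long],
        # Hybrid assembly: normal bridging mode, SPAdes read correction skipped
        [
            "unicycler",
            "-1", trimmed_r1,
            "-2", trimmed_r2,
            "-l", trimmed_long,
            "-o", f"{outdir}/assembly",
            "--mode", "normal",
            "--no_correct",
        ],
    ]


commands = build_commands("R1.fastq.gz", "R2.fastq.gz", "nanopore.fastq.gz", "out")
for command in commands:
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once the tools are installed; keeping the commands as lists avoids shell quoting problems with unusual file names.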

Phylogenomics

The genome sequence data were uploaded to the Type (Strain) Genome Server (TYGS) for a whole genome-based analysis [10]. All pairwise comparisons were conducted using GBDP, with 100 distance replicates calculated for each. The resulting intergenomic distances were used to infer a balanced minimum evolution tree via FastME 2.1.6.1 [9]. The tree was visualized with iTOL (https://itol.embl.de/#).

Ethics Statements

This work did not contain human subjects, animals, cell lines or endangered species.

CRediT authorship contribution statement

Christian Blumenscheit: Investigation, Methodology, Data curation, Software, Writing – original draft. Jennifer Jähne: Investigation, Methodology, Data curation. Andy Schneider: Investigation, Methodology. Jochen Blom: Software. Thomas Schweder: Conceptualization, Supervision. Peter Lasch: Conceptualization, Methodology, Supervision. Rainer Borriss: Conceptualization, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Subject: Biological sciences
Specific subject area: Molecular Phylogenetics
Type of data: Table, Figure, genome sequencing data in FASTA format.
How the data were acquired: Short reads were generated with Illumina HiSeq at LGC Genomics (Berlin, Germany). Long reads were obtained with Oxford Nanopore MinION.
Data format: Analyzed DNA sequence data in FASTA, NEWICK and text format.
Description of data collection: Pure cultures of BP1.2A and BT2.4 were used to isolate genomic DNA and to obtain the genomic data. Genome annotation was carried out using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) and RAST.


Data accessibility: The data have been deposited at NCBI GenBank under the following accession numbers. BioProjects: PRJNA634914 (BP1.2A) and PRJNA634832 (BT2.4). BioSamples: SAMN15012571 (BP1.2A) and SAMN15009897 (BT2.4). Sequences of the chromosomes: CP085504.1 (BP1.2A) and CP085505.1 (BT2.4). GenBank assembly accessions: GCA_013285085.2 (BP1.2A) and GCA_013284785.2 (BT2.4). The SRA records for BP1.2A and BT2.4 can be accessed via the corresponding BioProject links: https://www.ncbi.nlm.nih.gov/sra/PRJNA634914 and https://www.ncbi.nlm.nih.gov/sra/PRJNA634832
With the article
L.T.T. Tam, J. Jähne, P.T. Luong, L.T.P. Thao, L.T.K. Chung, A. Schneider, C. Blumenscheit, P. Lasch, T. Schweder, R. Borriss. Draft genome sequences of 59 endospore-forming Gram-positive bacteria associated with crop plants grown in Vietnam. Microbiol. Resour. Announc. 9 (2020): e01154-20. https://doi.org/10.1128/MRA.01154-20
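
The chromosome accessions listed under data accessibility can be turned into download links for the FASTA records. The sketch below builds request URLs for the NCBI E-utilities `efetch` endpoint; the helper name `efetch_fasta_url` is illustrative, and the code only constructs the URLs without contacting NCBI.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch endpoint
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession: str) -> str:
    """Build an efetch URL that requests a nucleotide record in FASTA format."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH_BASE}?{urlencode(params)}"


# Chromosome accessions as given in the article
chromosomes = {"BP1.2A": "CP085504.1", "BT2.4": "CP085505.1"}
urls = {strain: efetch_fasta_url(acc) for strain, acc in chromosomes.items()}

for strain, url in urls.items():
    print(strain, url)
```

The resulting URLs can be opened in a browser or fetched with any HTTP client; for repeated or bulk requests, NCBI's usage guidelines for E-utilities apply.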
  15 in total

1.  SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

Authors:  Anton Bankevich; Sergey Nurk; Dmitry Antipov; Alexey A Gurevich; Mikhail Dvorkin; Alexander S Kulikov; Valery M Lesin; Sergey I Nikolenko; Son Pham; Andrey D Prjibelski; Alexey V Pyshkin; Alexander V Sirotkin; Nikolay Vyahhi; Glenn Tesler; Max A Alekseyev; Pavel A Pevzner
Journal:  J Comput Biol       Date:  2012-04-16       Impact factor: 1.479

2.  RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation.

Authors:  Wenjun Li; Kathleen R O'Neill; Daniel H Haft; Michael DiCuccio; Vyacheslav Chetvernin; Azat Badretdin; George Coulouris; Farideh Chitsaz; Myra K Derbyshire; A Scott Durkin; Noreen R Gonzales; Marc Gwadz; Christopher J Lanczycki; James S Song; Narmada Thanki; Jiyao Wang; Roxanne A Yamashita; Mingzhang Yang; Chanjuan Zheng; Aron Marchler-Bauer; Françoise Thibaud-Nissen
Journal:  Nucleic Acids Res       Date:  2020-12-03       Impact factor: 16.971

3.  The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes.

Authors:  Ross Overbeek; Tadhg Begley; Ralph M Butler; Jomuna V Choudhuri; Han-Yu Chuang; Matthew Cohoon; Valérie de Crécy-Lagard; Naryttza Diaz; Terry Disz; Robert Edwards; Michael Fonstein; Ed D Frank; Svetlana Gerdes; Elizabeth M Glass; Alexander Goesmann; Andrew Hanson; Dirk Iwata-Reuyl; Roy Jensen; Neema Jamshidi; Lutz Krause; Michael Kubal; Niels Larsen; Burkhard Linke; Alice C McHardy; Folker Meyer; Heiko Neuweger; Gary Olsen; Robert Olson; Andrei Osterman; Vasiliy Portnoy; Gordon D Pusch; Dmitry A Rodionov; Christian Rückert; Jason Steiner; Rick Stevens; Ines Thiele; Olga Vassieva; Yuzhen Ye; Olga Zagnitko; Veronika Vonstein
Journal:  Nucleic Acids Res       Date:  2005-10-07       Impact factor: 16.971

4.  FastME 2.0: A Comprehensive, Accurate, and Fast Distance-Based Phylogeny Inference Program.

Authors:  Vincent Lefort; Richard Desper; Olivier Gascuel
Journal:  Mol Biol Evol       Date:  2015-06-30       Impact factor: 16.240

5.  Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads.

Authors:  Ryan R Wick; Louise M Judd; Claire L Gorrie; Kathryn E Holt
Journal:  PLoS Comput Biol       Date:  2017-06-08       Impact factor: 4.475

6.  MIBiG 2.0: a repository for biosynthetic gene clusters of known function.

Authors:  Satria A Kautsar; Kai Blin; Simon Shaw; Jorge C Navarro-Muñoz; Barbara R Terlouw; Justin J J van der Hooft; Jeffrey A van Santen; Vittorio Tracanna; Hernando G Suarez Duran; Victòria Pascal Andreu; Nelly Selem-Mojica; Mohammad Alanjary; Serina L Robinson; George Lund; Samuel C Epstein; Ashley C Sisto; Louise K Charkoudian; Jérôme Collemare; Roger G Linington; Tilmann Weber; Marnix H Medema
Journal:  Nucleic Acids Res       Date:  2020-01-08       Impact factor: 16.971

7.  TYGS is an automated high-throughput platform for state-of-the-art genome-based taxonomy.

Authors:  Jan P Meier-Kolthoff; Markus Göker
Journal:  Nat Commun       Date:  2019-05-16       Impact factor: 14.919

8.  Biocontrol mechanism by root-associated Bacillus amyloliquefaciens FZB42 - a review.

Authors:  Soumitra Paul Chowdhury; Anton Hartmann; XueWen Gao; Rainer Borriss
Journal:  Front Microbiol       Date:  2015-07-28       Impact factor: 5.640

9.  The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST).

Authors:  Ross Overbeek; Robert Olson; Gordon D Pusch; Gary J Olsen; James J Davis; Terry Disz; Robert A Edwards; Svetlana Gerdes; Bruce Parrello; Maulik Shukla; Veronika Vonstein; Alice R Wattam; Fangfang Xia; Rick Stevens
Journal:  Nucleic Acids Res       Date:  2013-11-29       Impact factor: 16.971

10.  JSpeciesWS: a web server for prokaryotic species circumscription based on pairwise genome comparison.

Authors:  Michael Richter; Ramon Rosselló-Móra; Frank Oliver Glöckner; Jörg Peplies
Journal:  Bioinformatics       Date:  2015-11-16       Impact factor: 6.937
