Pediococcus pentosaceus IMI 507025 genome sequencing data.

Ivana Nikodinoska, Jenny Makkonen, Daniel Blande, Colm Moran.

Abstract

The genome sequence data for Pediococcus pentosaceus IMI 507025, isolated from pickled cucumbers, are reported. The raw reads and the analysed genome were deposited at NCBI under BioProject accession number PRJNA814992. The assembly contained 17 contigs before trimming and 12 after. The total size of the genome was 1,795,439 bp, containing 1,811 genes, of which 1,751 were coding sequences. The identity of IMI 507025 was determined via average nucleotide identity (ANI): the value of 99.5994% between IMI 507025 and the type strain P. pentosaceus ATCC 33316 identifies the strain as P. pentosaceus. Screening the genome of IMI 507025 for antimicrobial resistance (AMR) and virulence genes returned no hits, confirming the safety of the tested strain. No plasmids were detected.
© 2022 The Authors. Published by Elsevier Inc.

Keywords: Antimicrobial resistance; Lactic acid bacteria; Microbial genome sequencing; Search for genes of concern

Year: 2022; PMID: 35864877; PMCID: PMC9294475; DOI: 10.1016/j.dib.2022.108446

Source DB: PubMed; Journal: Data Brief; ISSN: 2352-3409


Specifications Table

Value of the Data

Members of the genus Pediococcus are strongly associated with the microbiota of various forage crops and have an important impact on the fermentation characteristics of silage. Homofermentative Pediococcus pentosaceus isolates with a safe profile, such as the absence of AMR genes, could be used to improve silage fermentation. The data reported here relate to the safety characteristics and identity of Pediococcus pentosaceus IMI 507025. The sequencing data could be used for Pediococcus comparative genomics and for evaluating genes of concern among lactic acid bacteria.

Data Description

The whole genome sequencing data of Pediococcus pentosaceus (P. pentosaceus) IMI 507025 are described, together with the taxonomic identification data and the results of genome screening for AMR genes, virulence factors, and plasmids. The whole genome sequencing coverage was 1020x. The annotated assembly consisted of 12 contigs with a total length of 1,794,629 bp, a GC content of 37.03%, and a contig N50 of 354,566 bp. The annotation produced 1,811 genes, of which 1,751 were coding sequences, 53 were RNA genes (2 ribosomal RNAs, 47 transfer RNAs, and 4 miscellaneous RNAs), and 7 were pseudogenes. The genome comparison showed the best hit (lowest distance and most matching hashes) to Pediococcus pentosaceus CGMCC 7049 (Table 1).
Table 1

Taxonomic identification of IMI 507025 via MinHash.

| Strain | Mash distance | Statistically significant differences | Matching hashes* | Assembly accession |
| --- | --- | --- | --- | --- |
| Pediococcus pentosaceus CGMCC 7049 | 0.00671909 | 0.00 | 326/400 | GCF_000708635.1 |
| Pediococcus pentosaceus IE-3 | 0.00847159 | 0.00 | 310/400 | GCF_000285875.1 |
| Pediococcus pentosaceus ATCC 25745 | 0.0129347 | 0.00 | 274/400 | GCF_000014505.1 (complete) |
| Pediococcus pentosaceus SL4 | 0.0147554 | 0.00 | 261/400 | GCF_000496265.1 (complete) |
| Fusobacterium sp. CAG:649 | 0.195209 | 1.02262e-15 | 9/400 | GCF_000433695.1 |

* Selected genomes with upper threshold of 400 hashes, available in the NCBI database, were used for comparison purposes.

The similarity between two genome sequences was assessed via average nucleotide identity (ANI) using the OrthoANI algorithm [1]. The ANI result (%) is usually approximately (1 – Mash distance) × 100 (see Table 1). The genomes included in the OrthoANI comparison are summarised in Table 2.
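The relationship between matching hashes, Mash distance, and approximate ANI can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses the published Mash distance formula, D = -(1/k) ln(2j / (1 + j)), where j is the fraction of shared sketch hashes. The k-mer size is not stated in the source; k = 16 is an assumption, chosen here because it appears to reproduce the distances listed in Table 1. The function names are hypothetical.

```python
import math


def mash_distance(shared_hashes: int, sketch_size: int, k: int) -> float:
    """Mash distance estimated from the number of shared sketch hashes."""
    if shared_hashes == 0:
        return 1.0  # no shared hashes: maximum distance
    jaccard = shared_hashes / sketch_size  # Jaccard index estimate
    return -math.log(2 * jaccard / (1 + jaccard)) / k


def approx_ani_percent(distance: float) -> float:
    """Rule of thumb from the text: ANI (%) is roughly (1 - distance) x 100."""
    return (1.0 - distance) * 100.0


# Matching hashes out of 400, taken from Table 1.
TABLE_1_SHARED = {
    "P. pentosaceus CGMCC 7049": 326,
    "P. pentosaceus IE-3": 310,
    "P. pentosaceus ATCC 25745": 274,
    "P. pentosaceus SL4": 261,
    "Fusobacterium sp. CAG:649": 9,
}

K = 16  # ASSUMPTION: k-mer size is not given in the source

for strain, shared in TABLE_1_SHARED.items():
    d = mash_distance(shared, 400, K)
    print(f"{strain}: distance={d:.6f}, approx. ANI={approx_ani_percent(d):.2f}%")
```

For CGMCC 7049 the rule of thumb gives roughly 99.3%, in the same range as the OrthoANI value of 99.6397% reported for that strain; the two methods are not expected to agree exactly.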
Table 2

Genome assemblies included in the OrthoANI and Roary calculations.

| Strain | Assembly accession | Contigs | Size (bp) | GC% |
| --- | --- | --- | --- | --- |
| Pediococcus pentosaceus ATCC 33316 (T) | GCF_004354495.1 | 19 | 1,764,498 | 37.27 |
| Pediococcus pentosaceus ATCC 25745 | GCF_000014505.1 | 1 | 1,832,387 | 37.36 |
| Pediococcus pentosaceus SL001 | GCF_007923185.1 | 2 | 1,919,175 | 37.44 |
| Pediococcus pentosaceus SL4 | GCF_000496265.1 | 1 | 1,789,138 | 37.30 |
| Pediococcus pentosaceus SRCM 100892 | GCF_002173535.1 | 7 | 2,002,472 | 37.30 |
| Pediococcus pentosaceus KCCM 40703 | GCF_002982155.1 | 1 | 1,758,362 | 37.20 |
| Pediococcus pentosaceus SRCM 100194 | GCF_002202155.1 | 3 | 1,869,792 | 37.38 |
| Pediococcus pentosaceus SS1–3 | GCF_003429405.1 | 3 | 1,844,764 | 37.28 |
| Pediococcus pentosaceus wikim20 | GCF_001411765.2 | 4 | 1,830,629 | 37.29 |
| Pediococcus pentosaceus JQI-7 | GCF_006770865.1 | 1 | 1,732,880 | 37.25 |
| Pediococcus pentosaceus CGMCC 7049 | GCA_000708635.1 | 8 | 1,751,049 | 37.30 |
| Pediococcus pentosaceus IE-3 | GCA_000285875.1 | 91 | 1,802,376 | 37.22 |
| Pediococcus parvulus strain NBRC 100673 | GCA_007990205.1 | 111 | 1,968,745 | 38.62 |
Table 3 reports the outcome of the comparison of IMI 507025 with closely related P. pentosaceus strains. The pairwise comparisons showed 99.6397% identity between the IMI 507025 and P. pentosaceus CGMCC 7049 genomes. The ANI match with the P. pentosaceus type strain ATCC 33316 was 99.5994%. The species identification cut-off is set at 95% [2].
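The species call described above amounts to comparing each OrthoANI value against the 95% cut-off. The sketch below applies that comparison to a few values from the IMI 507025 row of Table 3; the function name, the dictionary, and the choice to treat a value exactly at the cut-off as a match are illustrative assumptions, not part of the source.

```python
SPECIES_CUTOFF = 95.0  # ANI (%) cut-off cited in the text

# OrthoANI (%) of IMI 507025 against selected genomes (values from Table 3).
ANI_VS_IMI_507025 = {
    "P. pentosaceus CGMCC 7049": 99.6397,
    "P. pentosaceus ATCC 33316 (T)": 99.5994,
    "P. pentosaceus SL001": 98.7146,
    "P. parvulus NBRC 100673": 69.3296,
}


def same_species(ani_percent: float, cutoff: float = SPECIES_CUTOFF) -> bool:
    """Return True when the ANI value meets the species cut-off.

    ASSUMPTION: a value exactly equal to the cut-off counts as a match.
    """
    return ani_percent >= cutoff


for genome, ani in ANI_VS_IMI_507025.items():
    verdict = "same species" if same_species(ani) else "different species"
    print(f"{genome}: {ani:.4f}% -> {verdict}")
```

With these values, every P. pentosaceus genome falls above the cut-off and the P. parvulus genome falls well below it.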
Table 3

OrthoANI (%) calculations between IMI 507025 and selected Pediococcus strains.

| | IE-3 | CGMCC 7049 | SRCM 100892 | NBRC 100673 | ATCC 25745 | SL4 | WIKIM20 | SRCM 100194 | KCCM 40703 | SS1–3 | ATCC 33316 | JQI-7 | SL001 | IMI 507025 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IE-3 | 100 | 99.5855 | 98.6183 | 69.6613 | 98.7894 | 98.8043 | 99.006 | 98.803 | 98.8491 | 98.8682 | 99.8011 | 98.9924 | 98.8655 | 99.5911 |
| CGMCC 7049 | 99.5855 | 100 | 98.6299 | 69.7265 | 98.941 | 98.7646 | 98.9261 | 98.7455 | 98.7238 | 98.8718 | 99.6489 | 98.928 | 98.7695 | 99.6397 |
| SRCM 100892 | 98.6183 | 98.6299 | 100 | 70.0009 | 98.5077 | 98.5249 | 98.8419 | 98.636 | 98.694 | 98.3403 | 98.8348 | 98.7515 | 98.5201 | 98.6664 |
| NBRC 100673 | 69.6759 | 69.7265 | 70.0009 | 100 | 69.4997 | 69.5623 | 69.7784 | 69.8436 | 69.4379 | 69.6443 | 69.3328 | 69.6737 | 69.7063 | 69.3435 |
| ATCC 25745 | 98.7894 | 98.941 | 98.5077 | 69.496 | 100 | 98.701 | 99.0151 | 98.7317 | 99.0686 | 98.6461 | 99.0889 | 98.8409 | 98.8845 | 98.8932 |
| SL4 | 98.8043 | 98.7646 | 98.5249 | 69.5623 | 98.701 | 100 | 98.8113 | 98.5947 | 98.7895 | 98.5489 | 98.9992 | 99.0609 | 98.7396 | 98.9198 |
| WIKIM20 | 99.006 | 98.9261 | 98.8419 | 69.7784 | 99.0151 | 98.8113 | 100 | 99.8005 | 99.3719 | 98.6075 | 99.072 | 98.9113 | 98.9053 | 98.9745 |
| SRCM 100194 | 98.803 | 98.7455 | 98.636 | 69.8436 | 98.7317 | 98.5947 | 99.8005 | 100 | 99.1816 | 98.5821 | 99.0155 | 98.8542 | 98.8621 | 98.8483 |
| KCCM 40703 | 98.8491 | 98.7238 | 98.694 | 69.4379 | 99.0686 | 98.7895 | 99.3719 | 99.1816 | 100 | 98.8299 | 98.9772 | 98.9322 | 98.9049 | 98.8254 |
| SS1–3 | 98.8682 | 98.8718 | 98.3403 | 69.6399 | 98.6461 | 98.5489 | 98.6075 | 98.5821 | 98.8299 | 100 | 99.0075 | 98.9297 | 98.7307 | 98.8575 |
| ATCC 33316 | 99.8011 | 99.6489 | 98.8348 | 69.3328 | 99.0889 | 98.9992 | 99.072 | 99.0155 | 98.9772 | 99.0075 | 100 | 98.9963 | 98.9475 | 99.5994 |
| JQI-7 | 98.9924 | 98.928 | 98.7515 | 69.6737 | 98.8409 | 99.0609 | 98.9113 | 98.8542 | 98.9322 | 98.9297 | 98.9963 | 100 | 99.7332 | 98.8202 |
| SL001 | 98.8655 | 98.7695 | 98.5201 | 69.7062 | 98.8845 | 98.7396 | 98.9053 | 98.8621 | 98.9049 | 98.7307 | 98.9475 | 99.7332 | 100 | 98.7146 |
| IMI 507025 | 99.5911 | 99.6397 | 98.6664 | 69.3296 | 98.8932 | 98.9198 | 98.9745 | 98.8483 | 98.8254 | 98.8575 | 99.5994 | 98.8202 | 98.7146 | 100 |
The thresholds used for AMR and virulence gene screening were those proposed by the European Food Safety Authority (EFSA): sequences with more than 80% identity and 70% coverage should be considered for further analysis [2]. The genome searches revealed no AMR genes and no virulence or pathogenicity factors in the sequenced genome of strain IMI 507025. The bioinformatic analysis did not identify putative plasmids in the sequenced data. Based on the data presented above, strain IMI 507025 was unequivocally identified as Pediococcus pentosaceus. In addition, the safety-related data described confirm that P. pentosaceus IMI 507025 is safe and does not raise safety concerns.
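The EFSA thresholds mentioned above can be expressed as a simple filter over database hits. The sketch below is illustrative only: the `Hit` record, its field names, and the example hits are hypothetical and do not come from the study, which reported no hits at all. Only the 80% identity and 70% coverage thresholds are taken from the text, and "above" is read here as a strict inequality.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One database match (hypothetical record layout)."""
    gene: str
    identity: float  # percent identity to the reference gene
    coverage: float  # percent of the reference gene covered


def hits_of_concern(hits: List[Hit],
                    min_identity: float = 80.0,
                    min_coverage: float = 70.0) -> List[Hit]:
    """Keep hits above both thresholds, i.e. those needing further analysis."""
    return [h for h in hits
            if h.identity > min_identity and h.coverage > min_coverage]


# Hypothetical hits, for illustration only.
example_hits = [
    Hit("geneA", identity=95.2, coverage=99.0),  # above both thresholds
    Hit("geneB", identity=85.0, coverage=40.0),  # coverage too low
    Hit("geneC", identity=62.5, coverage=90.0),  # identity too low
]

flagged = hits_of_concern(example_hits)
print([h.gene for h in flagged])

# An empty result corresponds to the outcome reported for IMI 507025.
print(hits_of_concern([]))
```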

Experimental Design, Materials and Methods

DNA Extraction

For the DNA extraction, 10 mL MRS broth cultures were incubated aerobically at 30 °C for 16–17 h. The cells were centrifuged (1780 rcf, 10 min) and the pellet was used for DNA extraction according to a previously described procedure [8].

Whole Genome Sequencing, Assembly, and Annotation

The DNA was sequenced on an Illumina NovaSeq 6000 with a 150 bp paired-end library at Eurofins Genomics (Constance, Germany), yielding 6,688,243 reads. Trimmomatic v. 0.38.1 [3] was used to trim the reads and Unicycler v. 0.4.8 [4] to assemble them. The average reference coverage of the assembly (total number of bases / assembly length) was 1020-fold. Gene prediction and functional annotation were performed using the NCBI Prokaryotic Genome Annotation Pipeline v. 6.0 [5].
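The coverage definition given above (total number of bases divided by assembly length) is a one-line calculation. The sketch below uses the assembly length of 1,794,629 bp reported in the Data Description; the number of bases retained after trimming is not reported in the source, so the example only back-calculates the base count implied by the stated 1020-fold coverage. The function names are hypothetical.

```python
ASSEMBLY_LENGTH_BP = 1_794_629  # annotated assembly length from the text
REPORTED_COVERAGE = 1020        # fold coverage from the text


def mean_coverage(total_bases: int, assembly_length: int) -> float:
    """Average coverage = total sequenced bases / assembly length."""
    if assembly_length <= 0:
        raise ValueError("assembly_length must be positive")
    return total_bases / assembly_length


def implied_total_bases(coverage: float, assembly_length: int) -> float:
    """Invert the definition: bases needed to reach a given coverage."""
    return coverage * assembly_length


# ASSUMPTION: derived from the reported coverage, not a measured value.
bases = implied_total_bases(REPORTED_COVERAGE, ASSEMBLY_LENGTH_BP)
print(f"Implied total bases: {bases:,.0f}")
print(f"Coverage check: {mean_coverage(int(bases), ASSEMBLY_LENGTH_BP):.0f}x")
```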

Taxonomic Identification

Mash (MinHash) v. 0.1.1 [6] and OrthoANI v. 1.40 [7] were used for strain identification via alignment-free genome distance estimation and calculation of average nucleotide identity, respectively.

Screening for AMR and Virulence Factors Related Genes

Two databases were used to search for AMR genes: the NCBI Bacterial AMR Reference Gene Database (v. 2021-06-01.1) and the ResFinder database (downloaded on 20.04.2021). Screening for virulence factors was performed using the Virulence Factor Database (VFDB). Default parameters were used except where otherwise stated in a previously published study [8].

Screening for Plasmids

The PlasmidFinder database [9] and BLAST searches were used to look for plasmid-related contigs in the sequenced genome, and the assembly files were examined for the presence of circular contigs.

CRediT authorship contribution statement

Ivana Nikodinoska: Writing – original draft, Data curation. Jenny Makkonen: Writing – review & editing, Methodology, Software. Daniel Blande: Writing – review & editing, Software, Formal analysis. Colm Moran: Writing – review & editing, Supervision, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors I.N. and C.A.M. are employees of Alltech, which produces the Pediococcus pentosaceus IMI 507025 evaluated in this study.

Specifications Table

Subject: Microbiology
Specific subject area: Microbial genomics
Type of data: Raw reads and analysed genome of Pediococcus pentosaceus IMI 507025
How the data were acquired: Illumina NovaSeq 6000, Unicycler v 0.4.8, PGAP v6.0, NCBI Bacterial AMR Reference Gene Database v. 2021-06-01.1, ResFinder, Virulence Factor Database (VFDB), PlasmidFinder.
Data format: Raw; Analysed
Description of data collection: Pediococcus pentosaceus IMI 507025 was isolated from pickled cucumbers. The DNA extracted from pure culture was sequenced with NovaSeq 6000 Platform (Illumina) to obtain information about the strain identity and safety.
Data source location: Institution: Alltech Inc.; City/Town/Region: Nicholasville, Kentucky; Country: USA
Data accessibility: Bioproject Accession Number: PRJNA814992; NCBI GenBank Accession Number: JALBYI000000000; NCBI SRA Accession Number: SRR18325428

1. Imchang Lee; Yeong Ouk Kim; Sang-Cheol Park; Jongsik Chun. OrthoANI: An improved algorithm and software for calculating average nucleotide identity. Int J Syst Evol Microbiol. 2015-11-09.

2. Alessandra Carattoli; Ea Zankari; Aurora García-Fernández; Mette Voldby Larsen; Ole Lund; Laura Villa; Frank Møller Aarestrup; Henrik Hasman. In silico detection and typing of plasmids using PlasmidFinder and plasmid multilocus sequence typing. Antimicrob Agents Chemother. 2014-04-28.

3. Brian D Ondov; Todd J Treangen; Páll Melsted; Adam B Mallonee; Nicholas H Bergman; Sergey Koren; Adam M Phillippy. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 2016-06-20.

4. Ryan R Wick; Louise M Judd; Claire L Gorrie; Kathryn E Holt. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol. 2017-06-08.

5. Ivana Nikodinoska; Jenny Makkonen; Daniel Blande; Colm Moran. Whole genome sequence data of Lactiplantibacillus plantarum IMI 507027. Data Brief. 2022-03-06.

6. Anthony M Bolger; Marc Lohse; Bjoern Usadel. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014-04-01.

7. Tatiana Tatusova; Michael DiCuccio; Azat Badretdin; Vyacheslav Chetvernin; Eric P Nawrocki; Leonid Zaslavsky; Alexandre Lomsadze; Kim D Pruitt; Mark Borodovsky; James Ostell. NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res. 2016-06-24.

8. EFSA statement on the requirements for whole genome sequence analysis of microorganisms intentionally used in the food chain. EFSA J. 2021-07-28.