Literature DB >> 35313500

Microbiome dataset of spontaneously fermented Ethiopian honey wine, Tej.

Eskindir Getachew Fentie1,2,3, Minsoo Jeong2, Shimelis Admassu Emire3, Hundessa Dessalegn Demsash3, Min A Kim2,4, Hwang-Ju Jeon2, Sung-Eun Lee2, Setu Bazie Tagele2, Yeong-Jun Park2, Jae-Ho Shin2.   

Abstract

This dataset contains raw and analyzed microbial data for the samples of spontaneously fermented Ethiopian honey wine, Tej, collected from three locations of Ethiopia. It was generated using culture independent amplicon sequencing technique. To gain a better understanding of microbial community variance and similarity across Tej samples from the same and different locations, the raw sequenced data obtained from the Illumina Miseq sequencer was subjected to a bioinformatics analysis. Lower diversity and richness of both bacterial and fungal communities were observed for all of the Tej samples. Besides, samples collected from Debre Markos area showed a significant discriminating tax for both bacterial and fungal communities. In nutshell, this amplicon sequencing dataset provides a useful collection of data for modernizing this spontaneous fermentation into a directed inoculated fermentation. Detail discussion on microbiome of Tej samples is given in [1].
© 2022 The Author(s). Published by Elsevier Inc.

Entities:  

Keywords:  Alpha diversity; Beta diversity; Linear discriminated analysis; Tej

Year:  2022        PMID: 35313500      PMCID: PMC8933813          DOI: 10.1016/j.dib.2022.108022

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications Table

Value of the Data

Helps to identify the dominant bacterial and fungal genus found in Tej samples. Helps to understand the differences and similarities of the microbial community structure for spontaneously fermented Tej samples. Helps on the development of direct Tej fermentation system.

Data

This dataset contains the microbiome data of both bacteria and fungi communities for Tej samples collected from three different locations of Ethiopia. The raw bacterial and fungal FASTA files of each sample are made accessible via National Center for Biotechnology Information (NCBI) data repository system. These FASTA files were the original metadata that were used for the bioinformatics analysis of this study. Table 1, describes the alpha diversity indices (Chao 1, Shannon, Simpson, Evenness, InvSimpson and observed) of each sample. This table is aimed to show the differences in alpha diversity indices based on sample collecting areas. Besides, Table 2 shows the list of bacterial and fungal communities that has less than 1% relative abundance. It showed all level of taxonomical classifications (Phylum, Class, Order, Family, and Genus) alongside its relative abundance of both bacterial and fungal communities. Both tables are made accessible on Science Data Bank data repository system. Furthermore, the quantitative bacterial and fungal beta diversity of the collected Tej samples was illustrated by using weighted-Unifrac principal coordinate analysis (PCoA) plot (Fig. 1). The relative abundance of each taxon for both bacterial and fungi communities from respective sample collection areas were the major comparing factor for microbial ecology diversity analysis. The distance metrics in the weighted-Unifrac PCoA plot demonstrated differences in microbial taxon abundance between the collected Tej samples (Fig. 1). Moreover, Fig. 2 demonstrate linear discriminant analysis effect size (LefSe) of bacteria and fungi for collected Tej samples based on the sample collection area. This figure was basically used to describe the significantly higher abundant bacterial and fungi taxon found in the grouped samples. Besides, all of the identified taxon in Fig. 2 were screened out using a linear discriminant analysis score of greater than 3.0. (Fig. 2).
Table 1

Alpha diversity of bacteria and fungi communities.

Alpha diversity indices for bacteria
Alpha diversity indices for fungi
LocationsChao1ShannonSimpsonEvennessInvsimpObsChao1ShannonSimpsonEvennessInvsimpObs
A1202.5492320.9029170.85095510.300432010001.001
A2111.8171780.789340.7578224.7469811120.0009680.0001890.0013971.0001892
A3141.7459720.7724410.6615894.394461410001.001
A470.6235370.2647670.3204351.360114710001.001
A5232.4256790.8828810.7736198.538342320.0040830.0009430.0058911.0009442
A6372.7346430.8982130.7573269.8244243710001.001
A7181.6507030.7473590.5711043.9581861810001.001
Average19 ± 9.82 ± 0.70.75±0.221 ± 0.261 ± 3.418.57±9.781.29±0.491.29±0.49
B151.3964040.7478560.8676353.965989520.0568510.0201630.0820191.0205782
B2232.5997360.9071080.82913110.765162310001.001
B3141.6319960.7509320.6184014.0149621420.0013950.0002830.0020131.0002832
B471.3774630.6813230.7078763.137976740.0581880.0177960.0419731.0181184
B5121.7170670.772670.6909994.3988971210001.001
B661.3936190.7471540.7777943.954984620.0491550.0169240.0709161.0172162
B7121.8107430.7942630.7286974.8605751210001.001
Average11 ± 6.211.7 ± 0.430.8 ±0.070.7 ± 0.095.01 ± 2.5911.29±6.211.86±1.071.86±1.07
D1152.1641210.8584540.7991437.0648651520.0018060.0003770.0026061.0003772
D2161.8353790.7865760.6619734.6855111640.1591680.0632110.1148151.0674774
D3102.0836560.8602670.904927.15652310100011
D4362.3138130.8453740.6456826.4672073640.0207060.0051780.0149361.0052054
D5161.7988050.7801910.6487824.5493991650.124280.0465340.077221.0488055
D6362.4576440.8646510.6858197.3882983640.0279070.0073370.0201311.0073914
D7101.562410.7655240.6785464.26482310100011
Average20±11.322.03± 0.310.82±0.040.72±0.105.94±1.3819.86±11.323.00±1.633.00±1.63

p-valuep-value

A Vs B0.1220.4790.820.3330.4910.1220.2230.0600.0590.1640.0590.223
A Vs D0.8240.7530.4210.5490.8760.8240.0210.0840.1040.2920.1070.021
B Vs D0.1040.1310.1220.5790.420.1040.1470.3950.3790.9130.3680.147

A1- A7, B1-B2, D1-D6 are Tej sample collected from Addis Ababa (AA), Bahir Dar(BD) and Debre Markos(DM), respectively

Obs- Observed

Table 2

Bacterial and fungal community structure at the relative abundance < 1% (classified as others).

Bacterial Community structure at the relative abundance of < 1% (grouped as others)
S/NPhylumClassOrderFamilyGenusRA (%)
1ProteobacteriaGammaproteobacteriaAeromonadalesAeromonadaceaeAeromonas0.00023
2ProteobacteriaGammaproteobacteriaPseudomonadalesMoraxellaceaeEnhydrobacter7.10E-06
3ProteobacteriaGammaproteobacteriaEnterobacteralesEnterobacteriaceaeEnterobacteriaceae_Unclassified0.00666
4FirmicutesBacilliLactobacillalesLeuconostocaceaeFructobacillus0.00705
5FirmicutesBacilliLactobacillalesLeuconostocaceaeFructobacillus7.34E-05
6ProteobacteriaAlphaproteobacteriaAcetobacteralesAcetobacteraceaeGluconobacter0.00016
7FirmicutesBacilliLactobacillalesLactobacillales_UnclassifiedLactobacillales_Unclassified2.13E-05
8FirmicutesBacilliLactobacillalesLactobacillaceaeLactobacillus0.00011
9FirmicutesBacilliLactobacillalesLactobacillaceaeLactobacillus0.00018
10FirmicutesBacilliLactobacillalesLactobacillaceaeLactobacillus0.00218
11FirmicutesBacilliLactobacillalesLactobacillaceaeLactobacillus0.00771
12FirmicutesBacilliLactobacillalesStreptococcaceaeLactococcus0.00202
13FirmicutesBacilliLactobacillalesLeuconostocaceaeLeuconostoc0.00242
14FirmicutesBacilliLactobacillalesLactobacillaceaePediococcus0.00161
15FirmicutesBacilliStaphylococcalesStaphylococcaceaeStaphylococcus5.68E-05
16FirmicutesNegativicutesVeillonellales-SelenomonadalesVeillonellales-Selenomonadales_UnclassifiedVeillonellales-Selenomonadales_Unclassified0.00012
17FirmicutesBacilliLactobacillalesLeuconostocaceaeWeissella0.00025

Fungal Community structure for the relative abundance of <1% (grouped as others)

S/NPhylumClassOrderFamilyGenusRA (%)

1AscomycotaSaccharomycetesSaccharomycetalesSaccharomycetales_fam_Incertae_sedisCandida4.49E-06
2AscomycotaSaccharomycetesSaccharomycetalesPhaffomycetaceaeCyberlindnera5.39E-05
3AscomycotaSaccharomycetesSaccharomycetalesSaccharomycetaceaeKazachstania0.00233
4AscomycotaSaccharomycetesSaccharomycetalesSaccharomycetaceaeKazachstania0.00048
6AscomycotaSaccharomycetesSaccharomycetalesSaccharomycetaceaeTorulaspora4.49E-05
7AscomycotaSaccharomycetesSaccharomycetalesPhaffomycetaceaeWickerhamomyces0.00043
8AscomycotaSaccharomycetesSaccharomycetalesSaccharomycetaceaeZygosaccharomyces0.00011
Fig. 1

Principal co-ordinate analysis of weighted UniFrac distance (PCoA) plots demonstrating the beta diversity of a) bacterial and b) fungal communities. The dots on the plots represent the individual samples from respective areas. Red–Addis Ababa (AA), Orange–Bahir Dar (BD), Deep blue–Debre Markos (DM) samples.

Fig. 2

Linear discriminant analysis effect size (LefSe) for a) bacteria and b) fungi communities.

Alpha diversity of bacteria and fungi communities. A1- A7, B1-B2, D1-D6 are Tej sample collected from Addis Ababa (AA), Bahir Dar(BD) and Debre Markos(DM), respectively Obs- Observed Bacterial and fungal community structure at the relative abundance < 1% (classified as others). Principal co-ordinate analysis of weighted UniFrac distance (PCoA) plots demonstrating the beta diversity of a) bacterial and b) fungal communities. The dots on the plots represent the individual samples from respective areas. Red–Addis Ababa (AA), Orange–Bahir Dar (BD), Deep blue–Debre Markos (DM) samples. Linear discriminant analysis effect size (LefSe) for a) bacteria and b) fungi communities.

Experimental Design, Materials and Methods

Sample collection, transportation and storage

Twenty-one fully matured Tej samples were collected from Addis Ababa (lat. 8.9806, long. 38.7578), Bahir Dar (lat. 11.5742, long. 37.3614), and Debre Markos (lat. 10.3296, long. 37.7344), Ethiopia. The samples were collected from local alcohol vendors who were selected randomly based on their willingness to sell. All of the samples were collected aseptically using sterile screw cup. Besides, samples from the same locations were collected on the same day. Finally, the collected samples transported to Kyungpook National University, Korea via insulated ice box with a freezing pack. The samples that required further analysis was stored in freezer at -20 °C.

DNA extraction

About 40 mL of Tej samples were centrifuged at 3200 rpm for 20 m to harvest the highest cell concentration. The microbial DNA was then extracted from the sediment via QIAamp PowerSoil Pro Kit (QIAGEN, Germany) by following manufacturer protocol. The final concentration of the extracted microbial DNA was checked by Qubit 2.0 Fluorometer (Life Technologies, USA).

16SrRNA sequencing

Amplicon sequencing for each sample was performed using a barcode set of Nextera Library Preparation Kit (Illumina Inc., USA). The hypervariable (V4 -V5) region of 16S rRNA gene was PCR amplified by using 515F (GTGNCAGCMGCCGCGGTAA) as the forward-inner primer and 907R (CCGYCAATTYMTTTRAGTTT) as the reverse-inner primer [2]. The PCR amplifications by thermocycler (Mastercycler Nexus GSX1, Eppendorf, Germany) were performed in two phases. The first PCR was run at the condition of 95 ℃ for 5 min of pre-denaturation, followed by 15 cycles of 95 ℃ for 30 s of denaturation, 60 ℃ for 30 s of annealing, 72 ℃ for 30 s of extension, and 72 ℃ for 5 min of final extension [3]. The reaction mixtures were composed of 1 µL (1 µM) of reverse inner primer, 1 µL (1 µM) of forward inner primer, 2 µL DNA template, 25 µL Emerald Amp PCR Master Mix (Takara Co., Ltd., Japan). The total volume of the PCR reaction mixture was then adjusted to become 50 µL by sterilized distilled water (SDW). The second PCR was conducted under the same running conditions as the first, by adding bar code primers and 2 µL of first PCR amplified DNA templets. These PCR amplified products were then multiplexed to 100 ng/µL into the single product via measuring the DNA concentration. Finally, amplified and barcoded DNA having 550 bp of size were selected using AMPure XP for PCR Purification (BECKMAN COULTER Inc., USA) for further downstream procedures.

Internal transcribed spacer (ITS) sequencing

Fungal internal transcribed (ITS2) regions were targeted for amplification using the primers of ITS86F (GTGAATCATCGAATCTTTGAA) and ITS4 (TCCTCCGCTTATTGATATGC) [4,5]. The first PCR amplification was performed at a condition of 95 °C for 5 min, followed by 30 cycles of 95 °C for 30 s, 58 °C for 30 s, 72 °C for 30 s, and finally 72 for 5 min (Jung et al., 2020). The second amplification was also carried out in the same condition as it was done for the first one. The reaction mixtures for the above mentioned two PCR amplifications were composed of 1 µL (1 µM) of reverse primer, 1 µL (1 µM) of forward primer, 2 µL DNA template, 25 µL Emerald Amp PCR Master Mix, 21 µL sterilized distilled water (SDW).

High-throughput sequencing

Before high-throughput sequencing, the amplicon library size, and quality and quantity were double-checked via Agilent 2100 Bioanalyzer (Agilent Technologies Inc., USA). Then amplicon libraries were directly subjected to the Illumina MiSeq platform by following the manufacturer's instructions. The base calling and image analysis were performed using MiSeq Control Software (MCS) which is installed in the Illumina MiSeq instrument.

Bioinformatics and statistical analysis

Quantitative insights into microbial ecology 2 (QIIME2) was used for the analysis of raw sequence FASTQ data. Filtering, trimming, and denoising of the raw sequences were performed via DADA2 to obtain amplicon sequence variants (ASV) [6]. Taxonomic identification of bacterial and fungal communities, the SILVA and UNITE reference databases were utilized, respectively. Vegan package was used for alpha diversity analysis of Shannon, Chao1, Simpson, Evenness, and InvSimpson. Meanwhile, the linear discriminant analysis effect size (LEfSe) and principal coordinates of analysis (PCoA) plots were performed via Web-based Calypso and RStudio 4.0.3. All of these microbiome data analyses were performed by applying a non-parametric Kruskal–Wallis tests with alpha value of less than 0.05 to detect significant difference in microbiome features between the group of collected sample.

CRediT authorship contribution statement

Eskindir Getachew Fentie: Conceptualization, Methodology, Formal analysis, Investigation, Data curation, Writing – original draft, Visualization. Minsoo Jeong: Investigation, Software, Visualization. Shimelis Admassu Emire: Conceptualization, Writing – review & editing, Supervision. Hundessa Dessalegn Demsash: Conceptualization, Writing – review & editing, Supervision. Min A Kim: Investigation. Hwang-Ju Jeon: Investigation. Sung-Eun Lee: Supervision. Setu Bazie Tagele: Methodology. Yeong-Jun Park: Methodology. Jae-Ho Shin: Conceptualization, Writing – review & editing, Resources, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
SubjectBiological Science
Specific subject areaMicrobiome, spontaneously fermented beverage
Type of dataTable, Figure, FASTA file
How the data were acquiredIllumina MiSeq (Illumina-MiSeq-USA) platform were used for 16SrRNA and ITS amplicon sequencing. Besides, bioinformatic and statistical analysis were performed via QIIME2 and RStudio 4.0.3, respectively.
Data formatRaw, filtered and analysed
Description of data collectionThe microbial DNA of all Tej samples were extracted, amplified, sequenced and analysed sequentially.
Data source locationA total 21 Tej samples were collected from Addis Ababa (lat. 8.9806, long. 38.7578), Bahir Dar (lat. 11.5742, long. 37.3614), Debre Markos (lat. 10.3296, long. 37.7344) areasThe collected samples were analysed in:Kyungpook National University, Daegu, Korea,
Data accessibilityRepository name: National Center for Biotechnology Information (NCBI)Sequence Read Archive (SRA) data: Accession number PRJNA781236and PRJNA781563Direct URL to data: https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA781236and https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA781563Repository name: Science Data BankData identification number: 31253.11.sciencedb.01345Direct URL to data: https://www.scidb.cn/en/s/URFf2q
Related research articleE. Fentie, M. Jeong, S. Emire, H. Demsash, M.A. Kim, H.J. Jeon, S.E. Lee, S. Tagele, Y.J. Park, J.H. Shin, Physicochemical properties, antioxidant activities and microbial communities of Ethiopian honey wine, Tej, Food Res. 152 (2022) 110765. https://doi.org/10.1016/j.foodres.2021.110765
  5 in total

1.  Rapid identification of fungi by using the ITS2 genetic region and an automated fluorescent capillary electrophoresis system.

Authors:  C Y Turenne; S E Sanche; D J Hoban; J A Karlowsky; A M Kabani
Journal:  J Clin Microbiol       Date:  1999-06       Impact factor: 5.948

2.  Physicochemical properties, antioxidant activities and microbial communities of Ethiopian honey wine, Tej.

Authors:  Eskindir Getachew Fentie; Minsoo Jeong; Shimelis Admassu Emire; Hundessa Dessalegn Demsash; Min A Kim; Hwang-Ju Jeon; Sung-Eun Lee; Setu Bazie Tagele; Yeong-Jun Park; Jae-Ho Shin
Journal:  Food Res Int       Date:  2021-10-17       Impact factor: 6.475

3.  DADA2: High-resolution sample inference from Illumina amplicon data.

Authors:  Benjamin J Callahan; Paul J McMurdie; Michael J Rosen; Andrew W Han; Amy Jo A Johnson; Susan P Holmes
Journal:  Nat Methods       Date:  2016-05-23       Impact factor: 28.547

4.  Potential Association between Vaginal Microbiota and Cervical Carcinogenesis in Korean Women: A Cohort Study.

Authors:  Gi-Ung Kang; Da-Ryung Jung; Yoon Hee Lee; Se Young Jeon; Hyung Soo Han; Gun Oh Chong; Jae-Ho Shin
Journal:  Microorganisms       Date:  2021-01-31
  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.