
Soil microbiome dataset from Yok Don national park in the Central Highlands region of Vietnam.

Dinh Minh Tran, To Uyen Huynh, Thi Huyen Nguyen, Tu Oanh Do, Thi Phuong Hanh Tran, Quang-Vinh Nguyen, Anh Dzung Nguyen.

Abstract

The Central Highlands region contains most of the national parks in Vietnam, covering different ecosystems, including the national parks of Kon Ka Kinh, Chu Mon Ray, Chu Yang Sin, Yok Don, Bidoup-Nui Ba, and Ta Dung. This region is therefore considered the center of highest biodiversity in Vietnam [1]. Among these national parks, Yok Don is unique in its conservation of the dry deciduous dipterocarp forest. Furthermore, Yok Don is the second-largest park in Vietnam, and its ecosystem differs the most from those of the other national parks in this region [2]. Some studies have investigated biodiversity preservation in the region, whereas others have dealt only with medicinal plants, lichens, and the rhizospheric bacteria of cultivated black pepper [1], [3], [4], [5]. To the best of our knowledge, no research on the microbial communities in Yok Don national park or in the Central Highlands has been reported. At present, global warming and a decrease in the forest area in the Central Highlands are causing an ongoing reduction in biodiversity and microbial resources. The current study reports the microbiome dataset from a soil sample collected in Yok Don national park. Metagenomic next-generation sequencing was used to characterize the microbial communities in the sample. The metagenome dataset generated provides information on microbial diversity and its functionality and can be useful for further studies on the conservation and use of microbial genetic resources in this region.
© 2022 The Authors.

Keywords:  Metagenomic next-generation sequencing; Soil microbiome; The dry deciduous dipterocarp forest; Yok Don national park

Year:  2022        PMID: 35036494      PMCID: PMC8749163          DOI: 10.1016/j.dib.2022.107798

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications Table

Value of the Data

• The data generated provide information on the microbiome in the soil at Yok Don national park in the Central Highlands, Vietnam.
• The data could be useful for the comparative analysis of the taxonomic profiles of Yok Don national park with those of other national parks.
• The data could be useful for future studies on the conservation and use of indigenous microbial gene resources for sustainable crop production and related fields.

Data Description

The dataset describes the taxonomic and functional profiles of a metagenomic soil sample collected from Yok Don national park in the Central Highlands, Vietnam. The 16S rRNA gene amplicon sequencing was performed using the Illumina MiSeq platform (2 × 150-bp paired ends). Taxonomy was assigned using the classify-consensus-blast method in QIIME2 against the SILVA SSURef reference database (v.138), and functions were predicted using PICRUSt2 and the MetaCyc database. A total of 190,918 reads were classified out of 190,953 analyzed reads (Table 1). The data are presented as taxonomic and functional profiles in Figs. 1 and 2, respectively. Among the 29 phyla detected, Proteobacteria (24.33%) was the most dominant, followed by Actinobacteriota (20.28%), Acidobacteriota (14.26%), Myxococcota (8.23%), and Gemmatimonadota (8.09%) (Fig. 1). Of the 188 bacterial orders present, Burkholderiales (13.53%) was the most abundant, followed by Gemmatimonadales (7.7%), Gaiellales (6.8%), Rhizobiales (4.92%), and Solirubrobacterales (4.19%). Moreover, 263 families and 380 genera were identified. Additionally, biosynthesis (71.78%) was the most abundant metagenomic function of the microbiome, followed by the generation of precursor metabolites and energy (12.66%) and degradation/utilization/assimilation of inorganic nutrient metabolism (12.2%) (Fig. 2).

Table 1

Summary statistics table.

Reads                   Count
Total analyzed reads    190,953
Classified reads        190,918
Unclassified reads      35

Fig. 1

Taxonomic profile based on the 16S rRNA gene amplicon sequencing of the soil sample collected from Yok Don national park in the Central Highlands, Vietnam.

Fig. 2

Functional profile based on the 16S rRNA gene amplicon sequencing of the soil sample collected from Yok Don national park in the Central Highlands, Vietnam.
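
The read counts in Table 1 and the phylum-level percentages quoted in the Data Description can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the variable names are illustrative, and it assumes that the phylum percentages are all expressed relative to the same total.

```python
# Values copied from Table 1 and the Data Description text.
total_reads = 190_953
classified_reads = 190_918

# Reads that could not be assigned, and the classified share in percent.
unclassified_reads = total_reads - classified_reads
classified_pct = 100 * classified_reads / total_reads

# Relative abundance (%) of the five most abundant phyla (of 29 detected).
top_phyla = {
    "Proteobacteria": 24.33,
    "Actinobacteriota": 20.28,
    "Acidobacteriota": 14.26,
    "Myxococcota": 8.23,
    "Gemmatimonadota": 8.09,
}
top5_pct = sum(top_phyla.values())
# Share left for the remaining 24 phyla (assumes a common 100% total).
other_pct = 100 - top5_pct

print(f"Unclassified reads: {unclassified_reads}")
print(f"Classified share:   {classified_pct:.2f}%")
print(f"Top five phyla:     {top5_pct:.2f}%")
print(f"Remaining phyla:    {other_pct:.2f}%")
```

Under these assumptions, the five named phyla account for roughly three quarters of the community, and the unclassified count matches the 35 reads reported in Table 1.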

Experimental Design, Materials and Methods

Sample collection

A 5–30 cm deep soil sample (about 300 g) was collected from Yok Don national park in the Central Highlands, Vietnam, kept at 4°C, and transported to the laboratory within 2 h. The sample was stored at −80°C until analyzed.

DNA extraction and the 16S rRNA gene amplicon sequencing

DNA was extracted from 0.3 g of the soil sample using the DNeasy PowerSoil kit (Qiagen, Germany). The V1–V9 region of the 16S rRNA gene was amplified from the extracted DNA. Libraries of the 16S rRNA gene amplicon were prepared using the Swift amplicon 16S plus internal transcribed spacer panel kit (Swift Biosciences, USA) according to the manufacturer's instructions. The 16S rRNA gene amplicon sequencing was performed using the Illumina MiSeq platform (2 × 150-bp paired ends). Primers used for amplification are shown in Table 2.

Table 2

Primers used for amplification in this study.

Primer    Sequence (5′‒3′)
F1        GAGTTTGATCMTGGCTCAG
F2        CCTACGGGAGGCAGCAG
F3        GCCAGCAGCCGCGGTAA
F4        ATGGCTGTCGTCAGCT
F5        GYAACGAGCGCAACCC
R1        CTACCAGGGTATCTAATCC
R2        CCGTCAATTCMTTTGAGTTT
R3        GACGGGCGGTGTGTACAA
R4        TACCTTGTTACGACTT

Note: F, forward primer; R, reverse primer.
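
Several primers in Table 2 contain IUPAC degenerate bases (M = A or C, Y = C or T), so each of those entries stands for more than one concrete oligonucleotide. The sketch below is an illustration only, not the authors' tooling: it expands a degenerate primer into its concrete sequences and computes a reverse complement. The helper names `expand` and `reverse_complement` are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}
# Complement of every IUPAC code (e.g. M = A/C pairs with K = T/G).
COMPLEMENT = str.maketrans("ACGTMRWSYKVHDBN", "TGCAKYWSRMBDHVN")

# Primer sequences copied from Table 2.
PRIMERS = {
    "F1": "GAGTTTGATCMTGGCTCAG",
    "F2": "CCTACGGGAGGCAGCAG",
    "F3": "GCCAGCAGCCGCGGTAA",
    "F4": "ATGGCTGTCGTCAGCT",
    "F5": "GYAACGAGCGCAACCC",
    "R1": "CTACCAGGGTATCTAATCC",
    "R2": "CCGTCAATTCMTTTGAGTTT",
    "R3": "GACGGGCGGTGTGTACAA",
    "R4": "TACCTTGTTACGACTT",
}


def expand(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def reverse_complement(seq):
    """Return the reverse complement, keeping IUPAC codes valid."""
    return seq.translate(COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt,", len(expand(seq)), "variant(s)")
```

A reverse primer is usually matched against the forward strand of a read through its reverse complement, which is why `reverse_complement` is included alongside the expansion.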

Taxonomic and functional analyses

Taxonomic analysis was performed as described previously [6]. Briefly, the raw basecall (bcl) files were demultiplexed using bcl2fastq, allowing one mismatch in the dual-barcode sequence. Trimmomatic (v.0.39) [7] and Cutadapt (v.2.10) [8] were used to remove adapters, primers, and low-quality sequences (average quality score < 20; read length < 100 bp). The denoise-single method of the q2-dada2 plugin within the QIIME2 pipeline (v.2020.8) [9] was used to cluster and dereplicate the reads into amplicon sequence variants. Taxonomy was then assigned to the amplicon sequence variants with the classify-consensus-blast method [11] in QIIME2, using the SILVA SSURef reference database (v.138) [10]. Finally, PICRUSt2 (v.2.3.0-b) [12] and the MetaCyc database [13] were used to predict the functional profiles of the soil sample based on the 16S rRNA gene amplicon sequencing. The analyzed functional profiles included degradation/utilization/assimilation, biosynthesis, super pathways, precursor metabolite and energy generation, detoxification, glycan pathways, metabolic clusters, macromolecule modification, and activation/inactivation/interconversion.
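
The read-level quality criteria quoted above (average score below 20, read length below 100 bp) can be illustrated with a short stand-alone filter. This is a simplified sketch and not a reimplementation of Trimmomatic or Cutadapt: it assumes Phred+33 quality encoding and assumes that a read is discarded if it fails either criterion. The function names are hypothetical.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def passes_filter(sequence, quality, min_mean_q=20, min_len=100):
    """Keep a read only if it is long enough and its mean quality is high enough."""
    return len(sequence) >= min_len and mean_phred(quality) >= min_mean_q


# Synthetic reads for illustration only: (sequence, quality string).
reads = {
    "good":        ("A" * 150, "I" * 150),  # 'I' encodes Q40
    "too_short":   ("A" * 99,  "I" * 99),   # fails the length criterion
    "low_quality": ("A" * 150, "+" * 150),  # '+' encodes Q10
}

kept = [name for name, (seq, qual) in reads.items() if passes_filter(seq, qual)]
print(kept)
```

The thresholds are exposed as parameters so that the same check can be reused with other cut-offs. A read with a mean score of exactly 20 is kept, because the criterion in the text removes only scores below 20.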

Ethics Statements

None

CRediT authorship contribution statement

Dinh Minh Tran: Conceptualization, Methodology, Software, Data curation, Writing – original draft, Investigation, Formal analysis, Validation, Visualization, Writing – review & editing. To Uyen Huynh: Investigation, Formal analysis. Thi Huyen Nguyen: Investigation, Formal analysis. Tu Oanh Do: Investigation, Formal analysis. Thi Phuong Hanh Tran: Investigation, Formal analysis. Quang-Vinh Nguyen: Investigation, Formal analysis, Validation, Visualization. Anh Dzung Nguyen: Investigation, Formal analysis, Validation, Visualization, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Subject: Microbiology: Microbiome
Specific subject area: Metagenomics
Type of data: Figures, Tables, and Fastq files
How the data were acquired: Illumina MiSeq platform
Data format: Raw and Analyzed
Description of data collection: A soil sample was collected from Yok Don national park in the Central Highlands, Vietnam. Total DNA was extracted from the sample, and 16S rRNA gene amplicon sequencing was performed using the Illumina MiSeq platform (2 × 150-bp paired ends)
Data source location:
• Institution: Yok Don national park
• District/Province/Region: Buon Don, Dak Lak, the Central Highlands
• Country: Vietnam
• Latitude and longitude coordinates for collected samples: 12°58′22.82′′N, 107°49′13.96′′E
Data accessibility: Data are available at the NCBI with Bioproject PRJNA783494 and SRA accession number SRR17036647 (https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR17036647)
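
The sampling coordinates are given in degrees, minutes, and seconds, whereas most mapping tools expect decimal degrees. The conversion is a simple weighted sum, sketched below. The helper name `dms_to_decimal` is hypothetical and is not part of the dataset.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    sign = -1 if hemisphere in ("S", "W") else 1
    return sign * (degrees + minutes / 60 + seconds / 3600)


# Coordinates of the collected sample, as listed under "Data source location".
latitude = dms_to_decimal(12, 58, 22.82, "N")
longitude = dms_to_decimal(107, 49, 13.96, "E")

print(f"{latitude:.6f}, {longitude:.6f}")
```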
  7 in total

1.  Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2.

Authors:  Evan Bolyen; Jai Ram Rideout; Matthew R Dillon; Nicholas A Bokulich; Christian C Abnet; Gabriel A Al-Ghalith; Harriet Alexander; Eric J Alm; Manimozhiyan Arumugam; Francesco Asnicar; Yang Bai; Jordan E Bisanz; Kyle Bittinger; Asker Brejnrod; Colin J Brislawn; C Titus Brown; Benjamin J Callahan; Andrés Mauricio Caraballo-Rodríguez; John Chase; Emily K Cope; Ricardo Da Silva; Christian Diener; Pieter C Dorrestein; Gavin M Douglas; Daniel M Durall; Claire Duvallet; Christian F Edwardson; Madeleine Ernst; Mehrbod Estaki; Jennifer Fouquier; Julia M Gauglitz; Sean M Gibbons; Deanna L Gibson; Antonio Gonzalez; Kestrel Gorlick; Jiarong Guo; Benjamin Hillmann; Susan Holmes; Hannes Holste; Curtis Huttenhower; Gavin A Huttley; Stefan Janssen; Alan K Jarmusch; Lingjing Jiang; Benjamin D Kaehler; Kyo Bin Kang; Christopher R Keefe; Paul Keim; Scott T Kelley; Dan Knights; Irina Koester; Tomasz Kosciolek; Jorden Kreps; Morgan G I Langille; Joslynn Lee; Ruth Ley; Yong-Xin Liu; Erikka Loftfield; Catherine Lozupone; Massoud Maher; Clarisse Marotz; Bryan D Martin; Daniel McDonald; Lauren J McIver; Alexey V Melnik; Jessica L Metcalf; Sydney C Morgan; Jamie T Morton; Ahmad Turan Naimey; Jose A Navas-Molina; Louis Felix Nothias; Stephanie B Orchanian; Talima Pearson; Samuel L Peoples; Daniel Petras; Mary Lai Preuss; Elmar Pruesse; Lasse Buur Rasmussen; Adam Rivers; Michael S Robeson; Patrick Rosenthal; Nicola Segata; Michael Shaffer; Arron Shiffer; Rashmi Sinha; Se Jin Song; John R Spear; Austin D Swafford; Luke R Thompson; Pedro J Torres; Pauline Trinh; Anupriya Tripathi; Peter J Turnbaugh; Sabah Ul-Hasan; Justin J J van der Hooft; Fernando Vargas; Yoshiki Vázquez-Baeza; Emily Vogtmann; Max von Hippel; William Walters; Yunhu Wan; Mingxun Wang; Jonathan Warren; Kyle C Weber; Charles H D Williamson; Amy D Willis; Zhenjiang Zech Xu; Jesse R Zaneveld; Yilong Zhang; Qiyun Zhu; Rob Knight; J Gregory Caporaso
Journal:  Nat Biotechnol       Date:  2019-08       Impact factor: 54.908

2.  PICRUSt2 for prediction of metagenome functions.

Authors:  Gavin M Douglas; Vincent J Maffei; Jesse R Zaneveld; Svetlana N Yurgel; James R Brown; Christopher M Taylor; Curtis Huttenhower; Morgan G I Langille
Journal:  Nat Biotechnol       Date:  2020-06       Impact factor: 54.908

3.  The SILVA ribosomal RNA gene database project: improved data processing and web-based tools.

Authors:  Christian Quast; Elmar Pruesse; Pelin Yilmaz; Jan Gerken; Timmy Schweer; Pablo Yarza; Jörg Peplies; Frank Oliver Glöckner
Journal:  Nucleic Acids Res       Date:  2012-11-28       Impact factor: 16.971

4.  Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors:  Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal:  Bioinformatics       Date:  2014-04-01       Impact factor: 6.937

5.  The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases.

Authors:  Ron Caspi; Richard Billington; Luciana Ferrer; Hartmut Foerster; Carol A Fulcher; Ingrid M Keseler; Anamika Kothari; Markus Krummenacker; Mario Latendresse; Lukas A Mueller; Quang Ong; Suzanne Paley; Pallavi Subhraveti; Daniel S Weaver; Peter D Karp
Journal:  Nucleic Acids Res       Date:  2015-11-02       Impact factor: 16.971

6.  Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin.

Authors:  Nicholas A Bokulich; Benjamin D Kaehler; Jai Ram Rideout; Matthew Dillon; Evan Bolyen; Rob Knight; Gavin A Huttley; J Gregory Caporaso
Journal:  Microbiome       Date:  2018-05-17       Impact factor: 14.650

  4 in total

1.  Taxonomic and functional profiles of Coffea canephora endophytic microbiome in the Central Highlands region, Vietnam, revealed by analysis of 16S rRNA metagenomics sequence data.

Authors:  Dinh Minh Tran
Journal:  Data Brief       Date:  2022-06-10

2.  Analysis of endophytic microbiome dataset from roots of black pepper (Piper nigrum L.) cultivated in the Central Highlands region, Vietnam using 16S rRNA gene metagenomic next-generation sequencing.

Authors:  Dinh Minh Tran; Thi Huyen Nguyen; To Uyen Huynh; Tu Oanh Do; Quang-Vinh Nguyen; Anh Dzung Nguyen
Journal:  Data Brief       Date:  2022-03-28

3.  Rhizosphere microbiome dataset of Robusta coffee (Coffea canephora L.) grown in the Central Highlands, Vietnam, based on 16S rRNA metagenomics analysis.

Authors:  Dinh Minh Tran
Journal:  Data Brief       Date:  2022-03-28

4.  Metagenomic next-generation sequencing of the microbiome dataset from the surface water sample collected from Serepok River in Yok Don National Park, Vietnam.

Authors:  Dinh Minh Tran; To Uyen Huynh; Thi Huyen Nguyen; Tu Oanh Do; Quang Vinh Nguyen; Anh Dzung Nguyen
Journal:  Data Brief       Date:  2022-09-17
