
Microbiome dataset from Clara Cave and Empalme Sinkhole waters in Puerto Rico.

Luis E Rodriguez-Ramos, Carlos Rios-Velazquez.

Abstract

Camuy River Cave Park (CRCP) is an underground cave system carved by the Camuy River through the karst of the subtropical moist forest of northern Puerto Rico (Nieves-Rivera, 2003) [1]. This article presents a metagenomic dataset describing the microbial and functional diversity of water samples from Clara Cave and Empalme Sinkhole. Environmental DNA (eDNA) was extracted from the samples using a direct metagenomic DNA isolation method and sequenced with next-generation sequencing technology (Illumina MiSeq). The sequences were submitted to the MG-RAST online server to generate taxonomic profiles and an in silico functional description of the samples. By domain, the data consisted of Bacteria (96.69%), followed by Viruses (2.87%), Eukaryotes (0.37%), and Archaea (0.02%). By phylum, the most abundant groups were Proteobacteria (92.61%), Bacteroidetes (1.66%), Actinobacteria (1.12%), and Firmicutes (0.48%). The subsystem functional data showed that 12.97% of genes were related to clustering-based subsystems, 11.40% to carbohydrates, and 11.0% to amino acids and derivatives. The dataset provides a framework for understanding and comparing the microbial composition and functional diversity present in caves.


Year:  2018        PMID: 30505899      PMCID: PMC6247408          DOI: 10.1016/j.dib.2018.11.028

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

This project represents the first metagenomic dataset of water samples from a cave and a nearby sinkhole generated in Puerto Rico. The data provide insight into the microbial composition and functional diversity of cave and sinkhole water samples. The profiles will also serve for future comparisons of taxonomic and functional profiles among different types of caves. Furthermore, the data can be used to explore potential biomedical, industrial, or biotechnological applications.

Data

Metagenomics has been used to study caves, mostly speleothems [2], [3] or sediment [4], to unravel the microbial communities and activities in these ecosystems. Sequence data from the water present in caves are an important complement to speleothem and sediment data. Clara Cave and Empalme Sinkhole belong to an underground cave system and natural attraction in Camuy River Cave Park in northern Puerto Rico, which is part of a network of natural limestone caves [1]. To our knowledge, these environments have never been studied using metagenomics. The dataset obtained by shotgun sequencing of water samples from Clara Cave and Empalme Sinkhole is described as taxonomic and functional profiles in Fig. 1 and Fig. 2, respectively.
Fig. 1

Taxonomic profile of Clara Cave and Empalme Sinkhole waters. The taxonomic analysis revealed that the most abundant domain was Bacteria (96.69%), followed by Viruses (2.87%), Eukaryotes (0.37%), other sequences (0.04%), and Archaea (0.02%). The analysis identified 52 phyla, of which Proteobacteria (92.61%) was the most abundant, followed by the unclassified group for viruses (2.87%), Bacteroidetes (1.66%), Actinobacteria (1.12%), and Firmicutes (0.48%). The metagenome represents the DNA extracted from the combined water samples.
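
The percentages reported in this caption can be tabulated to check how much of the dataset they account for. The snippet below is a small illustrative sketch that uses only the values given above; the variable names are arbitrary and not part of the original analysis.

```python
# Domain-level abundances (%) as reported in the Fig. 1 caption.
domains = {
    "Bacteria": 96.69,
    "Viruses": 2.87,
    "Eukaryotes": 0.37,
    "Other sequences": 0.04,
    "Archaea": 0.02,
}

# The five most abundant phylum-level groups (%) as reported in the caption.
top_phyla = {
    "Proteobacteria": 92.61,
    "Unclassified (viruses)": 2.87,
    "Bacteroidetes": 1.66,
    "Actinobacteria": 1.12,
    "Firmicutes": 0.48,
}

# Round to two decimals to match the precision of the reported values.
domain_total = round(sum(domains.values()), 2)
phyla_listed = round(sum(top_phyla.values()), 2)
phyla_remaining = round(100.0 - phyla_listed, 2)

print(f"Domains sum to {domain_total}%")
print(f"Listed phyla cover {phyla_listed}%, leaving {phyla_remaining}%")
```

The listed domains sum to 99.99%, with the shortfall attributable to rounding, and the five named phylum-level groups cover 98.74%, leaving about 1.26% spread across the remaining phyla.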

Fig. 2

Functional in silico profile of Clara Cave and Empalme Sinkhole water samples using subsystem annotation. Subsystem functional analysis showed that 11.0% of genes are related to clustering-based subsystems, 11.0% to amino acids and derivatives, 10.0% to carbohydrates, 6.0% to miscellaneous, and 5.0% to membrane transport. In addition, genes were related to the metabolism of proteins (6.0%), DNA (4.0%), RNA (4.0%), iron (3.0%), aromatic compounds (3.0%), sulfur (2.0%), nitrogen (1.0%), phosphorus (1.0%), and potassium (0.8%), and to secondary metabolism (0.2%). The metagenome represents the DNA extracted from the combined water samples.


Experimental design, materials and methods

Sample collection

Three water samples were collected from Clara Cave (18°20′85.4′′ N, 66°49′16.0′′ W) using sterile 100 mL plastic water sampling containers. Two of the three Clara Cave samples were taken from water streaming down the cave walls; the third was collected at 0.3 m depth from stagnant water. One water sample was collected from a stream at Empalme Sinkhole, which connects directly to the Clara Cave exit. Sample temperatures ranged from 18 °C to 20 °C. The samples were transported to the laboratory, where pH was measured; values ranged from 8.38 to 9.93.

DNA extraction

For DNA extraction, approximately 50 mL of each water sample was taken and the aliquots were combined into a single sample. The protocol of the Metagenomic DNA Isolation Kit from Water (Epicentre, USA, 11) was followed for 200 mL of sample. The extracted high molecular weight (40 kbp) DNA was sent to MR DNA (http://www.mrdnalab.com), where a genomic library was prepared using the Qubit® dsDNA HS Assay Kit (Life Technologies) and the Nextera DNA Sample Preparation Kit (Illumina) following the manufacturer's instructions. The library was prepared from 50 ng of DNA, which underwent fragmentation and the addition of adapter sequences. After library preparation, the final DNA concentration was 7.22 ng/µL and the average library size was 2355 bp, as determined with the Agilent 2100 Bioanalyzer (Agilent Technologies). The library was diluted to 10.0 pM and sequenced using the 600-cycle v3 Reagent Kit (Illumina) on the MiSeq (Illumina).
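
As a worked check of the dilution step, the mass concentration and average library size reported above can be converted to molarity. The sketch below assumes the commonly used average of 660 g/mol per base pair of double-stranded DNA; that constant and the function names are illustrative assumptions and are not stated in the article.

```python
# Assumption: average molar mass of one dsDNA base pair in g/mol.
# This is a common rule-of-thumb value, not a figure given in the article.
AVG_BP_MASS = 660.0


def library_molarity_nm(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Convert a dsDNA concentration (ng/uL) to nanomolar (nM)."""
    # ng/uL equals 1e-3 g/L; dividing by g/mol gives mol/L; times 1e9 gives nM.
    # The net conversion factor is therefore 1e6.
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_size_bp)


def dilution_factor(stock_nm: float, target_pm: float) -> float:
    """Fold dilution needed to bring a stock in nM down to a target in pM."""
    return stock_nm * 1000.0 / target_pm


# Values reported for the library: 7.22 ng/uL, 2355 bp average size, 10.0 pM target.
stock_nm = library_molarity_nm(7.22, 2355)
factor = dilution_factor(stock_nm, 10.0)

print(f"Library molarity: {stock_nm:.2f} nM")
print(f"Dilution to 10.0 pM: about {factor:.0f}-fold")
```

Under that assumption the library works out to roughly 4.6 nM, so reaching 10.0 pM corresponds to a dilution of about 460-fold.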

Taxonomic and functional insight

To unravel the microbial diversity and in silico functional profile of the samples, the metagenomic sequences were examined using the Metagenomic Rapid Annotation using Subsystems Technology (MG-RAST) online server [5].

Specifications table

Subject area: Biology
More specific subject area: Metagenomics
Type of data: Text files and figures
How data was acquired: MiSeq (Illumina), MR DNA Laboratories, USA.
Data format: Raw and analyzed
Experimental factors: Environmental samples
Experimental features: The environmental DNA of water samples was extracted, sequenced and annotated.
Data source location: Camuy, Puerto Rico (18°20′85.4′′N, 66°49′16.0′′W) and (18°20′79.9′′N, 66°49′11.2′′W)
Data accessibility: Data is with this article. The data of this metagenome is available in the BioSample Submission Portal as BioProject PRJNA487362 via https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA487362 and Sequence Read Archive (SRA) accession number SRR7741201 via https://www.ncbi.nlm.nih.gov/sra/?term=SRR7741201
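
For programmatic access, the accession numbers above can also be looked up through NCBI's E-utilities. The following sketch only builds the query URLs with the standard library and sends no request; it assumes the public `esearch.fcgi` endpoint with its `db` and `term` parameters, and the helper name `esearch_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db: str, term: str) -> str:
    """Build an ESearch query URL for one database and one search term."""
    return f"{EUTILS_ESEARCH}?{urlencode({'db': db, 'term': term})}"


# Accessions taken from the data accessibility statement.
sra_url = esearch_url("sra", "SRR7741201")
bioproject_url = esearch_url("bioproject", "PRJNA487362")

print(sra_url)
print(bioproject_url)
```

Fetching either URL returns an XML listing of matching record identifiers; the raw reads themselves are typically downloaded with the SRA Toolkit.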
  4 in total

1.  Making a living while starving in the dark: metagenomic insights into the energy dynamics of a carbonate cave.

Authors:  Marianyoly Ortiz; Antje Legatzki; Julia W Neilson; Brandon Fryslie; William M Nelson; Rod A Wing; Carol A Soderlund; Barry M Pryor; Raina M Maier
Journal:  ISME J       Date:  2013-09-12       Impact factor: 10.302

2.  The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes.

Authors:  F Meyer; D Paarmann; M D'Souza; R Olson; E M Glass; M Kubal; T Paczian; A Rodriguez; R Stevens; A Wilke; J Wilkening; R A Edwards
Journal:  BMC Bioinformatics       Date:  2008-09-19       Impact factor: 3.169

3.  Illumina-based analysis of bacterial community in Khuangcherapuk cave of Mizoram, Northeast India.

Authors:  Surajit De Mandal; Amrita Kumari Panda; Esther Lalnunmawii; Satpal Singh Bisht; Nachimuthu Senthil Kumar
Journal:  Genom Data       Date:  2015-05-08

4.  Metagenomic Analysis from the Interior of a Speleothem in Tjuv-Ante's Cave, Northern Sweden.

Authors:  Marie Lisandra Zepeda Mendoza; Johannes Lundberg; Magnus Ivarsson; Paula Campos; Johan A A Nylander; Therese Sallstedt; Love Dalen
Journal:  PLoS One       Date:  2016-03-17       Impact factor: 3.240
