Takafumi Kataoka, Ryuji Kondo.
Abstract
This Data in Brief article provides supporting information for the research article entitled "Protistan community composition in anoxic sediments from three salinity-disparate Japanese lakes" by Kataoka and Kondo (2019) [1]. It summarizes 18S rRNA gene sequences obtained from the anoxic sediments of three lakes in two seasons using high-throughput sequencing (MiSeq, Illumina). Supergroup-level taxonomy was compared between a SINA search against the SILVA database and a BLASTn search against the PR2 database. Alpha diversity was calculated for each sample, and beta diversity was calculated among the six amplicon libraries. The partial sequence length between the primers 574*f and 1132R (Hugerth et al., 2015) was compared between the forward read and the combined read.
Keywords: 18S rRNA gene; High throughput sequencing (HTS); MiSeq; Protists; V4–V5 hypervariable region
Year: 2019 PMID: 31440544 PMCID: PMC6699457 DOI: 10.1016/j.dib.2019.104213
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Summary of sequence read and OTU numbers before and after singletons were removed.
| | Hiruga1 | Hiruga2 | Suigetsu1 | Suigetsu2 | Biwa1 | Biwa2 |
|---|---|---|---|---|---|---|
| Including all reads | | | | | | |
| Sequence reads | 119529 | 157402 | 63764 | 48948 | 390826 | 276815 |
| OTUs | 984 | 1086 | 426 | 391 | 4141 | 3612 |
| After singleton removal | | | | | | |
| Sequence reads | 119221 | 157176 | 63619 | 48815 | 389041 | 275292 |
| OTUs | 676 | 860 | 281 | 258 | 2356 | 2089 |
| Number of singletons | 308 | 226 | 145 | 133 | 1785 | 1523 |
| % singletons | 31.3 | 20.8 | 34.0 | 34.0 | 43.1 | 42.2 |
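The singleton rows in this table can be rederived from the OTU rows: the number of singletons is the difference between the two OTU counts, and the percentage is expressed relative to the OTU count including all reads. The short Python sketch below illustrates that arithmetic with the values copied from the table; the function and variable names are illustrative only and do not come from the original analysis.

```python
# OTU counts copied from the table (all reads vs. singletons removed).
samples = ["Hiruga1", "Hiruga2", "Suigetsu1", "Suigetsu2", "Biwa1", "Biwa2"]
otus_all = [984, 1086, 426, 391, 4141, 3612]
otus_no_singletons = [676, 860, 281, 258, 2356, 2089]


def singleton_summary(total_otus, remaining_otus):
    """Return (number of singleton OTUs, percent of all OTUs that are singletons)."""
    singletons = total_otus - remaining_otus
    percent = round(100 * singletons / total_otus, 1)
    return singletons, percent


summary = {
    name: singleton_summary(total, remaining)
    for name, total, remaining in zip(samples, otus_all, otus_no_singletons)
}

for name, (count, percent) in summary.items():
    print(f"{name}: {count} singletons ({percent}% of OTUs)")
```

The same differences also match the drop in sequence reads (for example, 119529 - 119221 = 308 for Hiruga1), as expected when each singleton OTU contains exactly one read.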
Number of OTUs whose supergroup-level identification differed between a SINA search (SILVA database ver. 132) and a BLASTn search (PR2 database ver. 4.10.0).
| Number of OTUs | SINA × SILVA identification | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Alveolata | Amoebozoa | Archaeplastida | Opisthokonta | Rhizaria | Stramenopiles | Picozoa | Centrohelida | Cryptophyceae | Haptophyta | IncertaeSedis | NAMAKO-1 | |||
| BLASTn × PR2 identification | Alveolata | 62 | – | 12 | 10 | 2 | 38 | |||||||
| Amoebozoa | 22 | – | 20 | 1 | ||||||||||
| Archaeplastida | 42 | 25 | 5 | – | 4 | 5 | 1 | 2 | 20 | |||||
| Opisthokonta | 138 | 76 | 1 | 13 | – | 4 | 18 | 12 | ||||||
| Rhizaria | 10 | 5 | 1 | – | 1 | 3 | ||||||||
| Stramenopiles | 57 | 45 | 5 | 3 | 4 | – | ||||||||
| Hacrobia | 113 | 4 | 1 | 2 | 2 | 11 | 73 | 20 | ||||||
| Apusozoa | 29 | 29 | ||||||||||||
| Unknown | 3 | 2 | 1 | |||||||||||
Fig. 1. Rarefaction curves of OTUs defined at 98% similarity for each sample, (A) including all reads and (B) with singleton reads removed.
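For readers unfamiliar with rarefaction, the following Python sketch shows one common way to compute a rarefaction curve: the expected number of OTUs in a random subsample of reads drawn without replacement (the hypergeometric expectation). The source does not state which software or method produced the curves, so the approach, the function name `expected_otus`, and the toy read counts are all assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def expected_otus(counts, depth):
    """Expected number of OTUs in a random subsample of `depth` reads."""
    total = sum(counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total read count")
    all_draws = comb(total, depth)
    expected = Fraction(0)
    for n_i in counts:
        # Probability that an OTU with n_i reads is absent from the subsample.
        p_absent = Fraction(comb(total - n_i, depth), all_draws)
        expected += 1 - p_absent
    return float(expected)


# Hypothetical read counts per OTU (not taken from the data set).
toy_counts = [5, 3, 1, 1]
curve = [(depth, expected_otus(toy_counts, depth)) for depth in range(0, 11)]

for depth, value in curve:
    print(depth, round(value, 3))
```

Exact fractions are used so that the ratio of large binomial coefficients does not overflow a float; for libraries with hundreds of thousands of reads, a log-based formulation or random subsampling would likely be faster.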
Fig. 2. Similarity profile analysis to detect significant clusters (p < 0.05). Dissimilarity was calculated from the relative abundance of sequence reads using the Bray-Curtis index, and significantly distant samples were clustered using Ward's method.
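As an illustration of the dissimilarity measure named in this caption, the Python sketch below converts read counts to relative abundances and computes the Bray-Curtis index between two samples. It covers only the pairwise index; the similarity profile test and Ward clustering are not reproduced. The read counts and function names are hypothetical and are not taken from the data set.

```python
def relative_abundance(counts):
    """Scale read counts so that they sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    numerator = sum(abs(a - b) for a, b in zip(p, q))
    denominator = sum(a + b for a, b in zip(p, q))
    return numerator / denominator


# Hypothetical read counts for three OTUs in two samples.
sample_a = relative_abundance([50, 30, 20])
sample_b = relative_abundance([10, 30, 60])

dissimilarity = bray_curtis(sample_a, sample_b)
print(round(dissimilarity, 3))
```

The index is 0 for identical compositions and 1 for samples that share no OTUs. Computing it for every pair of the six libraries would give the dissimilarity matrix that a hierarchical clustering step takes as input.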
Fig. 3. Partial sequence length between the primers 574*f and 1132R [2] for the sequences in the PR2 database that were the best BLAST hits of the OTU representatives. The labels Combined and Forward indicate the combined sequences obtained from both primers and the single sequences obtained from the forward primer, respectively. The number at the top of each plot shows the number of sequences analysed. The bar in each box indicates the median. The top and bottom of each box indicate the upper and lower quartiles, respectively.
Specifications table
| Subject area | |
| More specific subject area | |
| Type of data | |
| How data was acquired | |
| Data format | |
| Experimental factors | |
| Experimental features | |
| Data source location | |
| Data accessibility | |
| Related research article |
Comparing methods for assigning taxonomy to 18S rRNA gene sequences is valuable because the sequences in public databases are still insufficient for identifying diverse eukaryotic microbes. Information on the partial sequence length between the forward and reverse primers is valuable for understanding protistan composition in natural environments inhabited by unknown microbes. Alpha and beta diversities of protistan genotypes in lacustrine sediments are rarely reported.