Marek Gancarz, Paul J Hurd, Przemyslaw Latoch, Andrew Polaszek, Joanna Michalska-Madej, Łukasz Grochowalski, Dominik Strapagiel, Sebastian Gnat, Daniel Załuski, Robert Rusinek, Agata L Starosta, Patcharin Krutmuang, Raquel Martín Hernández, Mariano Higes Pascual, Aneta A Ptaszyńska.
Abstract
Forager Apis mellifera honeybees were collected from four European localities: London, UK; Athens, Greece; Marchamalo, Spain; and Lublin, Poland. In addition, A. mellifera and A. cerana foragers were collected in Asia, from Chiang Mai, Thailand. We used next-generation sequencing (NGS) to analyse 16S rRNA bacterial gene amplicons based on the V3-V4 region, and the ITS2 region from fungi and plants, derived from honeybee samples. Amplicon libraries were prepared using the 16S Metagenomic Sequencing Library Preparation, Preparing 16S Ribosomal RNA Gene Amplicons for the Illumina MiSeq System (Illumina®) protocol. NGS raw data are available at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA686953. Furthermore, isolated DNA was used as the template for screening for pathogens: Nosema apis, N. ceranae, N. bombi, tracheal mite (Acarapis woodi), any organism in the parasitic order Trypanosomatida, including Crithidia spp. (i.e., Crithidia mellificae), and neogregarines including Mattesia and Apicystis spp. (i.e., Apicystis bombi). The presented data can be used to compare metagenomic samples from different honeybee populations all over the world. A higher load of fungi and of bacterial groups such as Firmicutes (Lactobacillus), γ-proteobacteria, Neisseriaceae, and other unidentified bacteria was observed in honeybees infected with Nosema ceranae and neogregarines. Healthy honeybees had a higher load of plant pollens and of bacterial groups such as Orbales, Gilliamella, Snodgrassella, and Enterobacteriaceae. More details can be found in the research article [1] Ptaszyńska et al. 2021.
Keywords: Acarapis woodi; Anthropocene; Apicystis spp.; Crithidia spp.; NGS; Apis cerana; Nosema sp.; Trypanosomatida; neogregarines
Year: 2021 PMID: 33937454 PMCID: PMC8079459 DOI: 10.1016/j.dib.2021.107019
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Sequence filtering statistics.

• Input – initial number of sequences,
• Filtered – number of reads after removing low-quality data,
• Denoised – number of reads after removing data considered as noise,
• Merged – number of correctly merged forward and reverse reads,
• Non-chimeric – number of sequences after chimera removal; final number of reads.

16S – PL.

| sample-id | input | filtered | percentage of input passed filter | denoised | merged | percentage of input merged | non-chimeric | percentage of input non-chimeric |
|---|---|---|---|---|---|---|---|---|
| PL1(1) | 38239 | 20701 | 54,14 | 20637 | 20406 | 53,36 | 19902 | 52,05 |
| PL1(2) | 44576 | 29660 | 66,54 | 29562 | 29288 | 65,7 | 28528 | 64 |
| PL1(3) | 49083 | 26149 | 53,28 | 26064 | 25771 | 52,5 | 25092 | 51,12 |
| PL2(1) | 40017 | 24250 | 60,6 | 24216 | 24095 | 60,21 | 23239 | 58,07 |
| PL2(2) | 62869 | 39145 | 62,26 | 39058 | 38865 | 61,82 | 37217 | 59,2 |
| PL3(2) | 22213 | 14178 | 63,83 | 14140 | 14114 | 63,54 | 14096 | 63,46 |
| PL3(3) | 41999 | 31808 | 75,74 | 31750 | 31707 | 75,49 | 31521 | 75,05 |
| PL4(1) | 61900 | 41059 | 66,33 | 40969 | 40507 | 65,44 | 38764 | 62,62 |
| PL4(2) | 54334 | 36764 | 67,66 | 36575 | 36270 | 66,75 | 34761 | 63,98 |
| PL4(3) | 72110 | 49176 | 68,2 | 49052 | 48547 | 67,32 | 46512 | 64,5 |
| PL5(1) | 46551 | 28087 | 60,34 | 27929 | 27743 | 59,6 | 27473 | 59,02 |
| PL5(2) | 41235 | 25786 | 62,53 | 25673 | 25551 | 61,96 | 25359 | 61,5 |
| PL5(3) | 43505 | 28996 | 66,65 | 28945 | 28898 | 66,42 | 28820 | 66,25 |
| PL6(1) | 44911 | 24875 | 55,39 | 24748 | 24401 | 54,33 | 22598 | 50,32 |
| PL6(2) | 37642 | 20884 | 55,48 | 20784 | 20462 | 54,36 | 18948 | 50,34 |
| PL6(3) | 63812 | 34485 | 54,04 | 34327 | 33869 | 53,08 | 31062 | 48,68 |

ITS2 – PL.

| sample-id | input | filtered | percentage of input passed filter | denoised | merged | percentage of input merged | non-chimeric | percentage of input non-chimeric |
|---|---|---|---|---|---|---|---|---|
| PL1(1) | 160044 | 51024 | 31,88 | 50799 | 44057 | 27,53 | 43528 | 27,2 |
| PL1(2) | 147968 | 39779 | 26,88 | 39609 | 34607 | 23,39 | 34223 | 23,13 |
| PL1(3) | 166321 | 45521 | 27,37 | 45319 | 39417 | 23,7 | 38953 | 23,42 |
| PL2(1) | 161433 | 123703 | 76,63 | 123528 | 122002 | 75,57 | 113293 | 70,18 |
| PL2(2) | 120890 | 90633 | 74,97 | 90481 | 89267 | 73,84 | 82117 | 67,93 |
| PL2(3) | 140416 | 102813 | 73,22 | 102632 | 102355 | 72,89 | 100746 | 71,75 |
| PL3(1) | 147599 | 106749 | 72,32 | 106540 | 105072 | 71,19 | 102410 | 69,38 |
| PL3(2) | 125356 | 94618 | 75,48 | 94229 | 92777 | 74,01 | 90241 | 71,99 |
| PL3(3) | 91520 | 64349 | 70,31 | 64054 | 62253 | 68,02 | 60547 | 66,16 |
| PL4(1) | 157768 | 84762 | 53,73 | 84292 | 82715 | 52,43 | 81679 | 51,77 |
| PL4(2) | 129401 | 79102 | 61,13 | 78655 | 76783 | 59,34 | 75749 | 58,54 |
| PL4(3) | 113704 | 66404 | 58,4 | 66103 | 64776 | 56,97 | 63820 | 56,13 |
| PL5(1) | 170603 | 128436 | 75,28 | 128035 | 126164 | 73,95 | 123249 | 72,24 |
| PL5(2) | 162549 | 120393 | 74,07 | 120123 | 118010 | 72,6 | 115383 | 70,98 |
| PL5(3) | 144894 | 98781 | 68,17 | 98569 | 96751 | 66,77 | 94714 | 65,37 |
| PL6(1) | 148006 | 108317 | 73,18 | 107808 | 106726 | 72,11 | 105435 | 71,24 |
| PL6(2) | 141798 | 100218 | 70,68 | 100005 | 99095 | 69,88 | 97342 | 68,65 |
| PL6(3) | 130099 | 95359 | 73,3 | 95236 | 94250 | 72,44 | 93123 | 71,58 |

16S – UK, GR, ES, TAI.

| sample-id | input | filtered | percentage of input passed filter | denoised | merged | percentage of input merged | non-chimeric | percentage of input non-chimeric |
|---|---|---|---|---|---|---|---|---|
| UK-1 | 125453 | 70721 | 56,37% | 70261 | 69149 | 55,12% | 62612 | 49,91% |
| UK-2 | 113210 | 60764 | 53,67% | 60625 | 60386 | 53,34% | 58239 | 51,44% |
| GR-1 | 189774 | 101189 | 53,32% | 100943 | 98753 | 52,04% | 98281 | 51,79% |
| GR-2 | 219902 | 124908 | 56,80% | 124282 | 122054 | 55,50% | 107053 | 48,68% |
| ES-1 | 165148 | 91760 | 55,56% | 91535 | 91274 | 55,27% | 89325 | 54,09% |
| ES-2 | 135182 | 73017 | 54,01% | 72836 | 72647 | 53,74% | 72582 | 53,69% |
| TAI-1 | 205607 | 122277 | 59,47% | 122053 | 121803 | 59,24% | 120289 | 58,50% |
| TAI-2 | 275928 | 158746 | 57,53% | 158498 | 157855 | 57,21% | 156277 | 56,64% |
| TAI-3 | 247489 | 148904 | 60,17% | 148390 | 147315 | 59,52% | 136300 | 55,07% |
| TAI-4 | 233312 | 137798 | 59,06% | 137420 | 136713 | 58,60% | 132126 | 56,63% |

ITS2 – UK, GR, ES, TAI.

| sample-id | input | filtered | percentage of input passed filter | denoised | merged | percentage of input merged | non-chimeric | percentage of input non-chimeric |
|---|---|---|---|---|---|---|---|---|
| UK-1 | 60154 | 35292 | 58,67% | 35052 | 34656 | 57,61% | 34132 | 56,74% |
| UK-2 | 141920 | 76895 | 54,18% | 76768 | 74365 | 52,40% | 74196 | 52,28% |
| GR-1 | 192913 | 83547 | 43,31% | 83462 | 82158 | 42,59% | 81601 | 42,30% |
| GR-2 | 121281 | 67887 | 55,97% | 67627 | 64625 | 53,29% | 61797 | 50,95% |
| ES-1 | 263444 | 112349 | 42,65% | 112231 | 105442 | 40,02% | 105254 | 39,95% |
| ES-2 | 152574 | 108152 | 70,88% | 108133 | 107913 | 70,73% | 107913 | 70,73% |
| TAI-1 | 297119 | 148360 | 49,93% | 148075 | 121757 | 40,98% | 121337 | 40,84% |
| TAI-2 | 254112 | 138951 | 54,68% | 138582 | 136814 | 53,84% | 135843 | 53,46% |
| TAI-3 | 321713 | 160884 | 50,01% | 160527 | 133804 | 41,59% | 130792 | 40,65% |
| TAI-4 | 206686 | 123427 | 59,72% | 123298 | 121189 | 58,63% | 119821 | 57,97% |
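
The percentage columns in the filtering tables can be recomputed from the read counts, since each one is a count divided by the input count. The sketch below checks this for row PL1(1) of the 16S – PL table. The helper names `parse_percent` and `percent_of_input` are hypothetical and not part of the original pipeline; the parser only assumes that percentages are written with a decimal comma and an optional `%` sign, as they are in the tables.

```python
def parse_percent(text: str) -> float:
    """Parse a percentage written with a decimal comma, e.g. '54,14' or '56,37%'."""
    return float(text.strip().rstrip("%").replace(",", "."))


def percent_of_input(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


# Row PL1(1) from the 16S – PL table: read counts and reported percentages.
row = {
    "input": 38239,
    "filtered": 20701,
    "merged": 20406,
    "non_chimeric": 19902,
    "pct_filtered": "54,14",
    "pct_merged": "53,36",
    "pct_non_chimeric": "52,05",
}

for stage in ("filtered", "merged", "non_chimeric"):
    computed = percent_of_input(row[stage], row["input"])
    reported = parse_percent(row[f"pct_{stage}"])
    # Allow a small tolerance for rounding in the published table.
    status = "ok" if abs(computed - reported) < 0.01 else "mismatch"
    print(f"{stage}: computed {computed:.2f}, reported {reported:.2f} ({status})")
```

The same check can be applied to any other row by replacing the values in `row`.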
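Localities of investigated samples.

| Country | City | Geographical coordinates | Sample abbreviation | Time of sampling | Organisms |
|---|---|---|---|---|---|
| Poland | Lublin | 51°15′N 22°34′E | PL1 | April | |
| Poland | Lublin | 51°15′N 22°34′E | PL2 | May | |
| Poland | Lublin | 51°15′N 22°34′E | PL3 | June | |
| Poland | Lublin | 51°15′N 22°34′E | PL4 | July | |
| Poland | Lublin | 51°15′N 22°34′E | PL5 | August | |
| Poland | Lublin | 51°15′N 22°34′E | PL6 | September | |
| UK | London | 51°52′N 0°03′W | UK1 | July | |
| UK | London | 51°29′N 0°10′W | UK2 | July | |
| Greece | Athens | 37°59′N 23°42′E | GR1 | November | |
| Greece | Athens | 37°59′N 23°42′E | GR2 | November | |
| Spain | Marchamalo | 40°68′N 3°21′W | ES1 | November | |
| Spain | Marchamalo | 40°68′N 3°21′W | ES2 | November | |
| Thailand | Chiang Mai | 18°50′N 98°58′E | TAI1 | February | |
| Thailand | Chiang Mai | 18°50′N 98°58′E | TAI2 | February | |
| Thailand | Chiang Mai | 18°50′N 98°58′E | TAI3 | February | |
| Thailand | Chiang Mai | 18°50′N 98°58′E | TAI4 | February | |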
Presence of pathogens in the collected bee samples.

| Sample abbreviation | Time of sampling | Presence of pathogens based on ITS2 and PCR detection: Nosema apis, N. ceranae, N. bombi, tracheal mite (Acarapis woodi), any organism in the parasitic order Trypanosomatida, including Crithidia spp., and neogregarines including Mattesia and Apicystis spp. |
|---|---|---|
| PL1 | April | |
| PL2 | May | – |
| PL3 | June | |
| PL4 | July | neogregarines |
| PL5 | August | – |
| PL6 | September | neogregarines |
| UK1 | July | – |
| UK2 | July | neogregarines |
| GR1 | November | Cyanobacteria neogregarines |
| GR2 | November | – |
| ES1 | November | – |
| ES2 | November | neogregarines |
| TAI1 | February | neogregarines |
| TAI2 | February | – |
| TAI3 | February | – |
| TAI4 | February | neogregarines |
Pathogens were detected using ITS2 amplicon data and specific primers under standard PCR, following the methodology described for Nosema apis and Nosema ceranae by Martín-Hernández et al. [4]; Nosema bombi by Klee et al. [5]; tracheal mite (Acarapis woodi) by Yang et al. [6]; any organism in the parasitic order Trypanosomatida, including Crithidia spp. (i.e. Crithidia mellificae), by Meeus et al. [7]; and neogregarines including Mattesia and Apicystis spp. (i.e. Apicystis bombi) by Meeus et al. [7]. –: no pathogens detected.
| Subject | Biological sciences: |
| Specific subject area | |
| Dataset 3. The Excel file contains the raw original NGS data on bacterial composition (16S_taxonomyReads) from UK, Greece, Spain and Thailand honeybee samples. | |
| Type of data | Tables |
| How data were acquired | NGS sequencing. The analysis was based on the V3-V4 region of the 16S rRNA bacterial gene amplicon and on the ITS2 eukaryotic region for bee DNA samples. Sequences were assigned to taxonomy using a classifier trained on the UNITE all-eukaryotes database v8.2, with a minimum similarity of 90% between the read and the reference. |
| Data format | Raw |
| Parameters for data collection | Forager honeybees were recognized as bees returning to the hive and were captured at the hive entrance around midday. Genomic DNA was extracted from whole honeybees using the QIAamp DNA Kit according to the manufacturer's instructions. Isolates were sent to the Biobank, Poland, for NGS analysis. |
| Description of data collection | Forager honeybees were recognized as bees returning to the hive and captured at the hive entrance around midday. All foragers were captured individually, using tweezers. |
| Data source location | Data source locations are presented in |
| Data accessibility | All raw data are available at |
| Related research article |