Sam D Heraghty, John M Sutton, Meaghan L Pimsler, Janna L Fierst, James P Strange, Jeffrey D Lozier.
Abstract
Bumble bees are ecologically and economically important insect pollinators. Three abundant and widespread species in western North America, Bombus bifarius, Bombus vancouverensis, and Bombus vosnesenskii, have been the focus of substantial research relating to diverse aspects of bumble bee ecology and evolutionary biology. We present de novo genome assemblies for each of the three species using hybrid assembly of Illumina and Oxford Nanopore Technologies sequences. All three assemblies are of high quality with large N50s (> 2.2 Mb), BUSCO scores indicating > 98% complete genes, and annotations producing 13,325 - 13,687 genes, comparing favorably with other bee genomes. Analysis of synteny against the most complete bumble bee genome, Bombus terrestris, reveals a high degree of collinearity. These genomes should provide a valuable resource for addressing questions relating to functional genomics and evolutionary biology in these species.
Keywords: Bombus bifarius; Bombus vancouverensis; Bombus vosnesenskii; Illumina; MaSuRCA; Oxford Nanopore Technologies; hybrid assembly
Year: 2020 PMID: 32586847 PMCID: PMC7407468 DOI: 10.1534/g3.120.401437
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Three focal species used for genome assembly.
Summary of sample collection information
| Species | Sample | Source Locality | Latitude | Longitude | Elevation (m) |
|---|---|---|---|---|---|
| Bombus bifarius | JDL3187 | Boulder County, Colorado, US | 39.940 | −105.560 | 2,760 |
| Bombus vancouverensis | JDL1245 | Tulare County, California, US | 36.597 | −118.736 | 2,214 |
| Bombus vosnesenskii | JDL3184 | Jackson County, Oregon, US | 42.152 | −122.621 | 685 |
| Bombus vosnesenskii | JDL3185 | Jackson County, Oregon, US | 42.152 | −122.621 | 685 |
Sequencing statistics. Number of reads (read pairs for Illumina) at each stage in data filtering, including raw data, bacteria-filtered first-assembly data, and filtered data passed to the final second-round assembly. Estimated coverage is based on the number of sequence bases provided to the final assembly and an assumed genome size similar to B. impatiens (245.9 Mb)
| Species | Sequencing | No. raw reads (No. bases) | No. reads first assembly (No. bases) | No. reads second assembly (No. bases) | Estimated coverage |
|---|---|---|---|---|---|
| Bombus bifarius | Illumina | 9.00 × 10⁷ (26.98 Gb) | 8.98 × 10⁷ (26.94 Gb) | 8.49 × 10⁷ (25.48 Gb) | 103.6x |
| Bombus bifarius | ONT | 1.72 × 10⁶ (6.35 Gb) | 1.63 × 10⁶ (5.91 Gb) | 1.46 × 10⁶ (5.57 Gb) | 22.7x |
| Bombus vancouverensis | Illumina | 8.97 × 10⁷ (26.92 Gb) | 8.96 × 10⁷ (26.86 Gb) | 8.06 × 10⁷ (24.18 Gb) | 98.3x |
| Bombus vancouverensis | ONT | 2.58 × 10⁶ (10.43 Gb) | 2.53 × 10⁶ (9.92 Gb) | 2.27 × 10⁶ (9.19 Gb) | 37.4x |
| Bombus vosnesenskii | Illumina | 8.90 × 10⁷ (26.72 Gb) | 8.88 × 10⁷ (26.66 Gb) | 8.20 × 10⁷ (24.62 Gb) | 100.1x |
| Bombus vosnesenskii | ONT | 2.08 × 10⁶ (9.00 Gb) | 2.02 × 10⁶ (8.55 Gb) | 1.72 × 10⁶ (7.84 Gb) | 31.9x |
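The estimated coverage column can be reproduced from the caption's description: bases passed to the second-round assembly divided by the assumed genome size of 245.9 Mb. The following is a minimal sketch of that arithmetic, not the authors' script; the function and variable names are illustrative, and the rows are listed in table order.

```python
# Assumed genome size from the table caption (B. impatiens assembly length), in bases.
ASSUMED_GENOME_SIZE_BP = 245.9e6


def estimated_coverage(bases: float, genome_size_bp: float = ASSUMED_GENOME_SIZE_BP) -> float:
    """Return fold coverage: sequenced bases divided by genome size."""
    return bases / genome_size_bp


# (platform, second-assembly bases in Gb, coverage reported in the table), in table order.
rows = [
    ("Illumina", 25.48, 103.6),
    ("ONT", 5.57, 22.7),
    ("Illumina", 24.18, 98.3),
    ("ONT", 9.19, 37.4),
    ("Illumina", 24.62, 100.1),
    ("ONT", 7.84, 31.9),
]

for platform, gigabases, reported in rows:
    coverage = estimated_coverage(gigabases * 1e9)
    print(f"{platform:8s} {gigabases:6.2f} Gb -> {coverage:6.1f}x (table: {reported}x)")
```

Each computed value agrees with the tabulated coverage to one decimal place, which suggests the table used exactly this ratio.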
Figure 2. Blob plots (from Illumina data) showing read depth of coverage, GC content, and size for each scaffold after first-round (left-hand column) and second-round (right-hand column) MaSuRCA assemblies for A) Bombus bifarius, B) Bombus vancouverensis, C) Bombus vosnesenskii. Size of the blob corresponds to size of the scaffold and color corresponds to taxonomic assignment of BLAST (blue = Apidae). Inset statistics for each panel refer to [scaffold count, sum length, N50] associated with BLAST assignments to each taxonomic group. The top and right histograms indicate the total length of scaffolds at a given GC content or average read depth, respectively. Qualitatively similar plots were produced using the ONT data.
Assembly statistics and BUSCO analyses for the three focal species genomes in comparison to other Bombus genomes
| Species | Length (Mb) | N50 (Mb) | No. scaffolds | GC % | BUSCO complete [single, duplicated] | BUSCO fragmented | BUSCO missing |
|---|---|---|---|---|---|---|---|
| Bombus bifarius | 266.8 | 2.20 | 1,249 | 37.96 | 98.1% [97.7%, 0.4%] | 0.6% | 1.3% |
| Bombus vancouverensis | 282.1 | 3.06 | 1,162 | 38.02 | 98.4% [97.9%, 0.5%] | 0.6% | 1.0% |
| Bombus vosnesenskii | 275.6 | 2.83 | 1,429 | 37.93 | 98.2% [98.0%, 0.2%] | 0.6% | 1.2% |
| Bombus terrestris | 248.7 | 12.9 | 5,609 | 37.51 | 96.9% [96.7%, 0.2%] | 1.5% | 1.6% |
| Bombus impatiens | 245.9 | 1.41 | 2,506 | 37.76 | 98.3% [98.1%, 0.2%] | 0.7% | 1.0% |
BUSCO analysis run using the OrthoDB v.10, Hymenoptera dataset containing 5,991 genes.
Bombus terrestris genome assembly version: Bter_1.0.
Bombus impatiens genome assembly version: BIMP_2.2.
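For readers unfamiliar with the N50 statistic reported in the assembly table, the sketch below shows the standard definition: the length of the scaffold at which the cumulative length of scaffolds, sorted from longest to shortest, first reaches half of the total assembly length. The function name and the toy lengths are illustrative assumptions, not data from these assemblies.

```python
from typing import Iterable


def n50(scaffold_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of scaffold lengths.

    Scaffolds are sorted longest first; N50 is the length of the scaffold
    at which the running total first reaches half of the summed length.
    """
    lengths = sorted(scaffold_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: running total always reaches the total")


# Toy example (hypothetical lengths): total is 30, half is 15,
# and the running total reaches 18 at the second scaffold.
print(n50([10, 8, 6, 4, 2]))
```

Because the statistic is driven by the longest scaffolds, an assembly can have a large N50 and still contain many short scaffolds, as with the B. terrestris row (12.9 Mb N50 across 5,609 scaffolds).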
Figure 3. Confirming sample identity for the B. bifarius and B. vancouverensis genomes. A-B) Neighbor-joining distance trees for the A) serrate RNA effector and B) sodium/potassium transporting ATPase subunit alpha genes from Bombus bifarius and Bombus vancouverensis assemblies aligned to GenBank accessions (tip labels on tree) originally used for delimitation of these sister species (from NCBI PopSet 1803131478 and 1803131398; Ghisbain ).
Annotation statistics from the NCBI Eukaryotic Genome Annotation Pipeline for the three focal species genomes in comparison to other Bombus genomes. The three focal genomes are all on annotation release 100, whereas B. impatiens and B. terrestris are on releases 103 and 102, respectively. Details on data used for annotation and comparative statistics are available at the NCBI links listed below the table.
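| Feature | Bombus bifarius | Bombus vancouverensis | Bombus vosnesenskii | Bombus impatiens | Bombus terrestris |
|---|---|---|---|---|---|
| Protein coding genes | 11,148 | 11,338 | 11,184 | 10,632 | 10,400 |
| Non-coding genes | 1,653 | 1,802 | 1,789 | 2,293 | 607 |
| Pseudogenes | 524 | 547 | 554 | 236 | 76 |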
https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Bombus_bifarius/100/
https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Bombus_vancouverensis_nearcticus/100/
https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Bombus_vosnesenskii/100/
https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Bombus_impatiens/103/
https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Bombus_terrestris/102/
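As a quick consistency check, the per-category counts in the annotation table can be summed per column and compared with the 13,325 - 13,687 gene range quoted in the abstract. This sketch assumes that the abstract's totals combine protein-coding genes, non-coding genes, and pseudogenes; columns are identified only by their position in the table, and all names are illustrative.

```python
# Counts copied from the annotation table, one list entry per column (left to right).
protein_coding = [11_148, 11_338, 11_184, 10_632, 10_400]
non_coding = [1_653, 1_802, 1_789, 2_293, 607]
pseudogenes = [524, 547, 554, 236, 76]

# Assumption: the total is the sum of all three categories.
totals = [p + n + s for p, n, s in zip(protein_coding, non_coding, pseudogenes)]

for column, total in enumerate(totals, start=1):
    print(f"column {column}: {total:,} annotated features")

# The first three columns hold the annotation release 100 values.
focal_totals = totals[:3]
print(f"range across first three columns: {min(focal_totals):,} - {max(focal_totals):,}")
```

Under that assumption, the first three columns span exactly the range given in the abstract, which supports reading them as the three focal species.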
Figure 4. D-GENIES dot plots (using Minimap2 aligner) indicating collinearity of scaffolds (>100 kb in length) with the Bombus terrestris genome for A) Bombus bifarius, B) Bombus vancouverensis, and C) Bombus vosnesenskii.
Figure 5. Analysis of de novo genomes in a region previously identified as a candidate target of selection in these lineages. A) MAUVE alignment, across the new genomes and B. impatiens, of a focal scaffold that has repeatedly shown evidence of local adaptation in prior studies using B. impatiens as a reference genome. The bottom track indicates gene IDs from B. impatiens, with the arrow pointing to a previously discovered gene of interest in the region (LOC100741462, Xanthine dehydrogenase/oxidase-like; Pimsler ); B) MAUVE produced a single collinear orientation block across species, suggesting no major structural rearrangements in the region.