Jonas Bohn1, Reza Halabian1, Lukas Schrader2, Victoria Shabardina1, Raphael Steffen2, Yutaka Suzuki3, Ulrich R Ernst2, Jürgen Gadau2, Wojciech Makałowski1.
Abstract
The harvester ant genus Pogonomyrmex is endemic to arid and semiarid habitats and deserts of North and South America. The California harvester ant Pogonomyrmex californicus is the most widely distributed Pogonomyrmex species in North America. Pogonomyrmex californicus colonies are usually monogynous, i.e. a colony has one queen. However, in a few populations in California, primary polygyny evolved, i.e. several queens cooperate in colony founding after their mating flights and continue to coexist in mature colonies. Here, we present a genome assembly and annotation of P. californicus. The size of the assembly is 241 Mb, which is in agreement with the previously estimated genome size. We were able to annotate 17,889 genes in total, including 15,688 protein-coding ones with BUSCO (Benchmarking Universal Single-Copy Orthologs) completeness at a 95% level. The presented P. californicus genome assembly will pave the way for investigations of the genomic underpinnings of social polymorphism in the number of queens, regulation of aggression, and the evolution of adaptations to dry habitats.
Keywords: Hymenoptera; Nanopore sequencing; genome annotation; genome assembly; polygyny; social insect
Year: 2021 PMID: 33561225 PMCID: PMC8022709 DOI: 10.1093/g3journal/jkaa019
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Overview of the annotation workflow. The workflow comprises construction of the transcript assembly (upper part) and a pipeline for genome annotation (lower part). The transcript assembly and annotations from related species provide evidence for the annotation of protein-coding genes (PCGs).
Figure 2. Effect of raw read coverage on assembly size and quality. Assembly size is given in megabase pairs (Mb).
Comparison of genome assemblies of related insect species
| Parameter | P. californicus | | | | |
|---|---|---|---|---|---|
| Assembly size | 241 Mb | 236 Mb | 399 Mb | 284 Mb | 250 Mb |
| | 208,871 bp | 819,605 bp | 621,039 bp | 1,585,631 bp | 997,192 bp |
| | 16,229 bp | 117,988 bp | 1,950 bp | 211,219 bp | 147,519 bp |
| | 6,793 | 4,645 | 66,904 | 657 | 5,644 |
| | 1.15 | 6.60 | 8.31 | 0.62 | 8.45 |
| | 36.7 | 36.5 | 36.2 | 34.3 | 32.7 |
| Assembly accession | n/a | GCF_000187915.1 | GCF_000188075.2 | GCF_003227725.1 | GCF_000002195.4 |
With the exception of P. californicus, the data were taken from NCBI.
Transposable elements present in the P. californicus genome
| TE-class | Number of elements | Fraction of the genome |
|---|---|---|
| | 15,391 | 4.38% |
| | 9,525 | 1.42% |
| | 596 | 0.03% |
| | 72,737 | 8.69% |
| | 13,610 | 1.37% |
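The per-class values above can be combined into an overall estimate. The following is a minimal sketch, not part of the original analysis: it assumes that the five TE classes do not overlap and uses the 241 Mb assembly size stated in the abstract. All variable names are illustrative.

```python
# (number of elements, fraction of the genome in %) for the five TE classes,
# listed in table order.
te_rows = [
    (15_391, 4.38),
    (9_525, 1.42),
    (596, 0.03),
    (72_737, 8.69),
    (13_610, 1.37),
]

ASSEMBLY_SIZE_MB = 241  # assembly size from the abstract

total_elements = sum(count for count, _ in te_rows)
# Assumption: classes are non-overlapping, so their fractions can be added.
total_fraction = sum(fraction for _, fraction in te_rows)
total_te_mb = ASSEMBLY_SIZE_MB * total_fraction / 100

print(f"{total_elements:,} elements covering {total_fraction:.2f}% "
      f"of the genome (~{total_te_mb:.1f} Mb)")
```

If the classes overlapped in the original annotation, the summed fraction would overestimate the true TE content.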
Comparison of P. californicus genome annotation with selected Hymenopteran genomes
| Species | Assembly size | Protein coding | tRNA | lncRNA | Other RNA | Total | Assembly version | Annotation version |
|---|---|---|---|---|---|---|---|---|
| | 296 Mb | 11,219 | 159 | 1,210 | 449 | 13,037 | Aech_3.9 | 100 |
| | 233 Mb | 12,512 | 208 | 1,243 | 696 | 14,659 | Cflo_v7.5 | 102 |
| | 260 Mb | 11,048 | 212 | 570 | 493 | 12,323 | ASM131382v1 | 100 |
| | 335 Mb | 12,654 | 230 | 1,385 | 928 | 15,197 | Hsal_v8.5 | 102 |
| | 220 Mb | 11,610 | 178 | 1,411 | 655 | 13,854 | Lhum_UMD_V04 | 100 |
| | 326 Mb | 14,019 | 186 | 3,126 | 1,318 | 18,649 | ASM1337386v2 | 102 |
| | 224 Mb | 11,907 | 202 | 1,571 | 970 | 14,650 | Obir_v5.4 | 100 |
| | 236 Mb | 11,348 | 201 | 1,138 | 406 | 13,093 | Pbar_UMD_V03 | 101 |
| P. californicus | 241 Mb | 15,688 | 1,180 | 931 | 79 | 17,878 | n/a | n/a |
| | 283 Mb | 11,572 | 193 | 935 | 558 | 13,258 | ASM200609v1 | 100 |
| | 399 Mb | 14,820 | 227 | 1,376 | 691 | 17,114 | Si_gnH | 103 |
| | 225 Mb | 9,935 | 218 | 3,146 | 1,295 | 14,594 | Amel_HAv3.1 | 104 |
All the data are taken from NCBI’s genome database except P. californicus.
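In the table above, the "Total" column appears to be the sum of the four gene categories (protein coding, tRNA, lncRNA, other RNA). The following is a minimal sketch that checks this for every row. Rows are keyed by the assembly version from the table, and the label used for the P. californicus row is an illustrative placeholder because that row has no assembly version.

```python
# (assembly version, protein coding, tRNA, lncRNA, other RNA, total)
rows = [
    ("Aech_3.9", 11_219, 159, 1_210, 449, 13_037),
    ("Cflo_v7.5", 12_512, 208, 1_243, 696, 14_659),
    ("ASM131382v1", 11_048, 212, 570, 493, 12_323),
    ("Hsal_v8.5", 12_654, 230, 1_385, 928, 15_197),
    ("Lhum_UMD_V04", 11_610, 178, 1_411, 655, 13_854),
    ("ASM1337386v2", 14_019, 186, 3_126, 1_318, 18_649),
    ("Obir_v5.4", 11_907, 202, 1_571, 970, 14_650),
    ("Pbar_UMD_V03", 11_348, 201, 1_138, 406, 13_093),
    ("P. californicus", 15_688, 1_180, 931, 79, 17_878),  # placeholder key
    ("ASM200609v1", 11_572, 193, 935, 558, 13_258),
    ("Si_gnH", 14_820, 227, 1_376, 691, 17_114),
    ("Amel_HAv3.1", 9_935, 218, 3_146, 1_295, 14_594),
]

# Collect every row whose four categories do not add up to the stated total.
mismatches = [
    name for name, *categories, total in rows if sum(categories) != total
]

print("rows with inconsistent totals:", mismatches)
```

An empty list means every stated total equals the sum of its four categories.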
Figure 3. Odorant receptor (OR) gene repertoires are similar between P. californicus (N = 417 genes) and P. barbatus (N = 453 genes). Most gene models have their closest relative in the other species, and the gene tree shows no large clusters containing genes from only one of the two species. This suggests that the two species are closely related and that the two genome assemblies are of similarly high quality.
Comparison of completeness and quality of Hymenopteran insects used for the annotation of P. californicus
| Species | BUSCO genome completeness (%) | BUSCO genome duplication (%) | BUSCO transcript completeness (%) | DOGMA transcript completeness (%) |
|---|---|---|---|---|
| | 95.80 | 2.20 | 91.60 | 94.80 |
| | 94.20 | 0.10 | 95.80 | 97.60 |
| | 85.70 | 0.30 | 94.10 | 96.50 |
| | 85.90 | 0.30 | 99.20 | 98.10 |
| | 97.10 | 0.20 | 98.60 | 98.30 |
BUSCO and DOGMA analyses are based on unique sets of transcripts without duplicated sequences; soft-masked genomes were used in the BUSCO assessments.