Craig Michell, Saskia Wutke, Manuel Aranda, Tommi Nyman.
Abstract
Hymenoptera is a hyperdiverse insect order represented by over 153,000 different species. As many hymenopteran species perform various crucial roles for our environments, such as pollination, herbivory, and parasitism, they are of high economic and ecological importance. There are 99 hymenopteran genomes in the NCBI database, yet only five are representative of the paraphyletic suborder Symphyta (sawflies, woodwasps, and horntails), while the rest represent the suborder Apocrita (bees, wasps, and ants). Here, using a combination of 10X Genomics linked-read sequencing, Oxford Nanopore long-read technology, and Illumina short-read data, we assembled the genomes of two willow-galling sawflies (Hymenoptera: Tenthredinidae: Nematinae: Euurina): the bud-galling species Euura lappo and the leaf-galling species Eupontania aestiva. The final assembly for E. lappo is 259.85 Mbp in size, with a contig N50 of 209.0 kbp and a BUSCO score of 93.5%. The E. aestiva genome is 222.23 Mbp in size, with a contig N50 of 49.7 kbp and a 90.2% complete BUSCO score. De novo annotation of repetitive elements showed that 27.45% of the genome was composed of repetitive elements in E. lappo and 16.89% in E. aestiva, which is a marked increase compared to previously published hymenopteran genomes. The genomes presented here provide a resource for inferring phylogenetic relationships among basal hymenopterans, comparative studies on host-related genomic adaptation in plant-feeding insects, and research on the mechanisms of plant manipulation by gall-inducing insects.
Keywords: gall-inducing insects; genome; hybrid assembly; sawfly
Year: 2021 PMID: 33788947 PMCID: PMC8104934 DOI: 10.1093/g3journal/jkab094
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. (A) Bud gall induced by E. lappo on Salix lapponum. (B) Leaf gall induced by E. aestiva on S. myrsinifolia. (C) Larva of E. aestiva inside opened gall. (Photographs by TN).
Assembly statistics for the genomes of E. lappo and E. aestiva
| Statistic | E. lappo | E. aestiva |
|---|---|---|
| 10X linked reads coverage | 66X | 135X |
| MinION nanopore coverage | 9X | 10X |
| Illumina shotgun coverage | 169X | n.a. |
| Total length (bp) | 259,850,900 | 222,225,666 |
| Number of contigs | 2,503 | 16,952 |
| Longest contig (bp) | 1,919,081 | 797,452 |
| GC-% | 40.5 | 40.25 |
| Contig N50 (bp) | 208,956 | 49,744 |
| Contig N75 (bp) | 102,897 | 13,796 |
| L50 (number of contigs) | 329 | 1,156 |
| Complete BUSCOs—count (%) | 5,602 (93.5%) | 5,404 (90.2%) |
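The contiguity metrics in the table follow the usual definitions: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly length, while L50 is the number of contigs needed to get there (N75 uses 75% instead). The sketch below illustrates that calculation on a made-up list of contig lengths. The function name `nx_lx` and the toy lengths are assumptions for illustration and are not taken from the study's pipeline.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a collection of contig lengths.

    Nx: length of the contig at which the cumulative length, counted from
        the longest contig downward, first reaches `fraction` of the total.
    Lx: number of contigs needed to reach that point.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)

    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    # Not reachable for fraction <= 1; kept as a safeguard.
    raise RuntimeError("threshold was never reached")


# Hypothetical toy assembly (total length 300 bp), not data from the study.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]

n50, l50 = nx_lx(toy_contigs, 0.5)
n75, l75 = nx_lx(toy_contigs, 0.75)
print(n50, l50)  # 70 2
print(n75, l75)  # 40 4
```

A higher N50 together with a lower L50, as reported for E. lappo relative to E. aestiva, indicates that fewer and longer contigs cover half of the assembly.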
De novo repeat annotation of the E. lappo and E. aestiva genomes
| Repeat class | E. lappo: count | E. lappo: bp masked | E. lappo: % masked | E. aestiva: count | E. aestiva: bp masked | E. aestiva: % masked |
|---|---|---|---|---|---|---|
| DNA | ||||||
| DTA | 26,498 | 7,245,658 | 2.79 | 26,790 | 6,952,423 | 3.15 |
| DTC | 19,198 | 5,100,000 | 1.96 | 14,058 | 3,592,678 | 1.63 |
| DTH | 2,169 | 498,037 | 0.19 | 359 | 72,585 | 0.03 |
| DTM | 45,357 | 11,815,614 | 4.55 | 28,874 | 6,809,103 | 3.08 |
| DTT | 1,480 | 490,372 | 0.19 | 419 | 128,294 | 0.06 |
| Helitron | 13,267 | 4,941,840 | 1.90 | 20,362 | 4,941,319 | 2.24 |
| LTR | ||||||
| Copia | 9,789 | 4,041,679 | 1.56 | 2,175 | 783,202 | 0.35 |
| Gypsy | 34,191 | 19,359,338 | 7.45 | 5,264 | 2,092,539 | 0.95 |
| Unknown | 50,728 | 14,651,949 | 5.64 | 36,128 | 10,241,713 | 4.64 |
| MITE | ||||||
| DTA | 3,143 | 592,895 | 0.23 | 4,036 | 729,500 | 0.33 |
| DTC | 1,592 | 285,828 | 0.11 | 1,096 | 170,108 | 0.08 |
| DTH | 149 | 23,489 | 0.01 | 116 | 15,614 | 0.01 |
| DTM | 16,660 | 2,259,373 | 0.87 | 5,305 | 759,185 | 0.34 |
| DTT | 188 | 32,332 | 0.01 | 42 | 3,311 | 0 |
| Total | 224,409 | 71,338,404 | 27.45 | 145,024 | 37,291,574 | 16.89 |
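The "% masked" column can be checked against the "bp masked" column with a simple ratio. The sketch below does this for two E. lappo rows, assuming the denominator is the total assembly length from the assembly statistics table (259,850,900 bp). That denominator is an assumption, since it is not stated here, and the helper name `percent_masked` is hypothetical.

```python
def percent_masked(bp_masked, genome_length_bp, ndigits=2):
    """Share of the assembly covered by a repeat class, in percent."""
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    return round(100 * bp_masked / genome_length_bp, ndigits)


# Assumed denominator: total E. lappo assembly length from the statistics table.
E_LAPPO_LENGTH_BP = 259_850_900

# Values copied from the E. lappo columns of the repeat table.
gypsy_bp = 19_359_338
total_bp = 71_338_404

print(percent_masked(gypsy_bp, E_LAPPO_LENGTH_BP))  # 7.45
print(percent_masked(total_bp, E_LAPPO_LENGTH_BP))  # 27.45
```

Under the same assumption the E. aestiva total works out to about 16.78% rather than the listed 16.89%, so the denominator used for that column may differ from the total assembly length.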
Figure 2. UpSet plot showing the number of orthogroups shared across different partitions of the included hymenopteran protein sets. Set size reflects the total number of orthogroups contained in the protein repertoire of each species, while intersection size indicates the number of orthogroups in common among species or unique to a species. Single dots in the lower panel indicate orthogroups unique to a particular species, and dots joined by lines indicate orthogroups shared across species.
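The two quantities in an UpSet plot can be derived from a mapping of species to orthogroup identifiers. The sketch below is a minimal illustration, assuming intersections are counted exclusively (each orthogroup is assigned to exactly the combination of species that contain it), which is the common UpSet convention but is not confirmed here. The species labels, orthogroup IDs, and the function name `upset_counts` are made up for illustration.

```python
from collections import Counter


def upset_counts(orthogroups_by_species):
    """Return (set_sizes, intersection_sizes) for an UpSet-style summary.

    set_sizes: species -> number of orthogroups in its protein set.
    intersection_sizes: frozenset of species -> number of orthogroups found
        in exactly that combination of species (exclusive intersections).
    """
    set_sizes = {sp: len(ogs) for sp, ogs in orthogroups_by_species.items()}

    # Record, for each orthogroup, which species contain it.
    membership = {}
    for species, orthogroups in orthogroups_by_species.items():
        for og in orthogroups:
            membership.setdefault(og, set()).add(species)

    # Count orthogroups per exact species combination.
    intersection_sizes = Counter(frozenset(sp) for sp in membership.values())
    return set_sizes, intersection_sizes


# Hypothetical toy input, not data from the study.
toy = {
    "species_A": {"OG1", "OG2", "OG3"},
    "species_B": {"OG2", "OG3", "OG4"},
    "species_C": {"OG3", "OG5"},
}

sizes, intersections = upset_counts(toy)
print(sizes["species_A"])                                   # 3
print(intersections[frozenset({"species_A"})])              # 1 (unique: OG1)
print(intersections[frozenset({"species_A", "species_B"})]) # 1 (shared: OG2)
```

In the plot, a combination containing a single species corresponds to a single dot, and a combination of several species corresponds to dots joined by a line.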
Figure 3. Maximum-likelihood tree of 15 hymenopteran taxa and one coleopteran outgroup (T. castaneum) based on amino acid sequences of 451 BUSCOs shared by all focal taxa. Numbers below branches indicate clade support (%) according to 1000 ultrafast bootstrap iterations (* = 100%).