Francois Olivier Hebert, Luca Freschi, Gwylim Blackburn, Catherine Béliveau, Ken Dewar, Brian Boyle, Dawn E Gundersen-Rindal, Michael E Sparks, Michel Cusson, Richard C Hamelin, Roger C Levesque.
Abstract
Two subspecies of Asian gypsy moth (AGM), Lymantria dispar asiatica and L. dispar japonica, pose a serious alien invasive threat to North American forests. Despite decades of research on the ecology and biology of this pest, limited AGM-specific genomic resources are currently available. Here, we report on the genome sequences and functional content of these AGM subspecies. The genomes of L.d. asiatica and L.d. japonica are the largest lepidopteran genomes sequenced to date, totaling 921 and 999 megabases, respectively. Large genome size in these subspecies is driven by the accumulation of specific classes of repeats. Genome-wide metabolic pathway reconstructions suggest strong genomic signatures of energy-related pathways in both subspecies, dominated by metabolic functions related to thermogenesis. The genome sequences reported here will provide tools for probing the molecular mechanisms underlying phenotypic traits that are thought to enhance AGM invasiveness.
Year: 2019 PMID: 31712581 PMCID: PMC6848174 DOI: 10.1038/s41598-019-52840-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Genome annotation pipeline specifically optimized and trained to identify gene structures in Lepidopteran genomes.
Composition of the two AGM Lymantria dispar spp. genomes sequenced in this study in comparison to the sequenced genomes of EGM Lymantria dispar dispar and the well-characterized sister species Bombyx mori.
| Feature | Bmo* | Ldd† | Lda | Ldj |
|---|---|---|---|---|
| Total size (Mb) | 481 | 865 | 921 | 999 |
| Contigs | 88,637 | 194,709 | 8,190 | 11,303 |
| Scaffolds | 43,463 | 134,446 | n.a. | n.a. |
| N50 (Kb) | 4,008 | 5,068 | 212 | 137 |
| N90 (Kb) | 0.262 | 0.245 | 73 | 82 |
| Repeat (%) | 43.6 | 36 | 59.9 | 59.6 |
| GC content (%) | 37.7 | 35.2 | 38.7 | 38.5 |
| Protein-coding genes | 15,488 | 13,331 | 19,588 | 23,292 |
| Exon (%) | 4.7 | 1.8 | 2.7 | 3.0 |
| Intron (%) | 16.1 | 17 | 27.6 | 28.4 |
| BUSCO coverage (%) | 95.5 | 89.2 | 96.5 | 98.2 |
Abbreviations: Bmo, Bombyx mori; Ldd, Lymantria dispar dispar; Lda, Lymantria dispar asiatica; Ldj, Lymantria dispar japonica; n.a., not available.
*Reference genome ASM15162v1, accession number GCA_000151625.1 (http://ensembl.lepbase.org/), described in [17].
†Numbers for this genome were taken from Zhang et al. [13]. Percentages of exons and introns for Bmo and Ldd were taken from this reference, while the values for Lda and Ldj were calculated based on the GFF3 files produced in this study.
BUSCO values were obtained from BUSCO v.3.0 [15]. Reported values correspond to total BUSCO genes retrieved, i.e. complete (single + duplicated) and fragmented.
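The table footnote states that the exon and intron percentages for Lda and Ldj were calculated from GFF3 files, but the record does not describe the procedure. The following is a minimal sketch of one plausible way to do it, not the authors' script. It assumes standard nine-column, tab-separated GFF3 with `exon` in the feature-type column, merges overlapping exons so shared bases are counted once, and divides by the assembly size. The function name `exon_percent` and its parameters are hypothetical.

```python
from collections import defaultdict

def exon_percent(gff3_lines, genome_size_bp):
    """Percentage of the assembly covered by exon features (overlaps merged).

    Hypothetical helper: assumes standard tab-separated GFF3 records.
    """
    intervals = defaultdict(list)
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        # GFF3 coordinates are 1-based and inclusive.
        intervals[cols[0]].append((int(cols[3]), int(cols[4])))

    covered = 0
    for spans in intervals.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end + 1:  # overlapping or directly adjacent
                cur_end = max(cur_end, end)
            else:
                covered += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        covered += cur_end - cur_start + 1
    return 100.0 * covered / genome_size_bp
```

The same function gives an intron percentage if the annotation carries explicit `intron` features and the type filter is changed accordingly; otherwise introns would have to be derived from the gaps between exons of each transcript.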
Figure 2. Genome assembly post-processing steps. (a) Gene models were refined by (i) reducing redundancy using the program CD-HIT and (ii) assessing protein sequence similarities between L.d. asiatica and L.d. japonica using a reciprocal best hit approach with BLASTp. (b) Rooted species tree inferred from gene trees generated by OrthoFinder for all amino acid orthogroups shared among 14 Lepidopteran taxa (including AGM) encompassing the Noctuoidea (orange) and Rhopalocera (violet) superfamilies of the Obtectomera clade (dashed box). The heatmap on the right represents pairwise comparisons between all species included in the analysis, showing the number of many-to-one (N:1) orthologous genes between each species pair.
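The reciprocal best hit approach named in panel (a) can be illustrated with a short sketch. It assumes the BLASTp results from both search directions have already been reduced to `(query, subject, bitscore)` tuples. Ranking hits by bitscore and keeping the first hit seen on ties are assumptions made here, since the caption does not give the scoring criterion or any thresholds. The function names and sequence identifiers are hypothetical.

```python
def best_hits(rows):
    """Map each query to its highest-bitscore subject (first seen wins ties)."""
    best = {}
    for query, subject, bitscore in rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

# Toy input: lda2's best hit is ldj2, but ldj2's best hit is lda1,
# so only the lda1/ldj1 pair is reciprocal.
lda_vs_ldj = [("lda1", "ldj1", 200.0), ("lda1", "ldj2", 150.0), ("lda2", "ldj2", 90.0)]
ldj_vs_lda = [("ldj1", "lda1", 198.0), ("ldj2", "lda1", 150.0)]
pairs = reciprocal_best_hits(lda_vs_ldj, ldj_vs_lda)  # [("lda1", "ldj1")]
```

Requiring agreement in both directions is what makes the approach conservative: a one-directional best hit to a paralog is dropped unless the paralog also points back.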
Figure 3. High content of repeated sequences in AGM. (a) Proportion of specific repeated element categories in L.d. asiatica (black) and L.d. japonica (white) genome assemblies. (b) Total percentage of repeat elements in the genome is positively correlated with genome size across Lepidopteran species. Data on genome sizes and repeat elements from 13 Lepidopteran species (Bombyx mori, Calycopis cecrops, Danaus plexippus, Heliconius melpomene, Leptidea sinapis, Lerema accius, Manduca sexta, Melitaea cinxia, Papilio glaucus, Papilio polytes, Papilio xuthus, Phoebis sennae, Pieris rapae) were taken from Talla et al. [20] (see Supplementary Table S3). L.d. asiatica photograph taken by Alexander Schintlmeister and L.d. japonica photograph taken by Ken Walker for the Museum Victoria, PaDIL (CC BY 3.0 AU).
Figure 4. Conservation of genome-wide KEGG metabolic pathways. Characterization and comparisons of metabolic pathways across three different insect orders (Bombyx mori – Lepidoptera, Apis mellifera – Hymenoptera, Drosophila melanogaster – Diptera) and two well-characterized mammalian species (Mus musculus and Homo sapiens) in comparison with L.d. asiatica and L.d. japonica (Lepidoptera). Rows represent specific metabolic pathways grouped in three general categories, i.e. amino acid metabolism, carbohydrate metabolism, and lipid metabolism, as defined by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Colors represent the species-specific pathway conservation level, defined as the percentage of enzymes identified in the genome as compared to the KEGG reference pathway. Amel = Apis mellifera, Bmori = Bombyx mori, Dmel = Drosophila melanogaster, Lda = Lymantria dispar asiatica, Ldj = Lymantria dispar japonica, Hsap = Homo sapiens.
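The conservation level defined in this caption is a simple ratio, which the sketch below makes explicit. It assumes enzymes are represented by identifier strings such as EC numbers. The function name `pathway_conservation` is hypothetical, and the four-enzyme reference set is a toy subset chosen for illustration, not a complete KEGG pathway.

```python
def pathway_conservation(genome_enzymes, reference_enzymes):
    """Percentage of reference-pathway enzymes that are found in the genome."""
    reference = set(reference_enzymes)
    if not reference:
        raise ValueError("reference pathway has no enzymes")
    found = reference & set(genome_enzymes)
    return 100.0 * len(found) / len(reference)

# Toy reference: four EC numbers standing in for a pathway (illustrative only).
reference = {"2.7.1.1", "5.3.1.9", "2.7.1.11", "4.1.2.13"}
# Enzymes annotated in a hypothetical genome; "1.1.1.1" is outside the pathway
# and therefore does not affect the score.
genome = {"2.7.1.1", "5.3.1.9", "4.1.2.13", "1.1.1.1"}
level = pathway_conservation(genome, reference)  # 75.0
```

Because the denominator is the reference pathway, a score below 100 can reflect either a genuinely absent enzyme or a gene missed by the annotation, so the values are best read alongside the BUSCO coverage of each assembly.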