Kalle Tunstrom, Christopher W Wheat, Camille Parmesan, Michael C Singer, Alexander S Mikheyev.
Abstract
Insects have been key players in assessments of the biodiversity impacts of anthropogenically driven environmental change, including the evolutionary and ecological impacts of climate change. Populations of Edith's Checkerspot Butterfly (Euphydryas editha) adapt rapidly to diverse environmental conditions, with numerous high-impact studies documenting these dynamics over several decades. However, studies of the underlying genetic bases of these responses have been hampered by missing genomic resources, limiting the ability to connect genomic responses to environmental change. Using a combination of Oxford Nanopore long reads, haplotype merging, and HiC scaffolding, followed by Illumina polishing, we generated a highly contiguous and complete assembly (contigs n = 142, N50 = 21.2 Mb, total length = 607.8 Mb; BUSCOs n = 5,286, single copy complete = 97.8%, duplicated = 0.9%, fragmented = 0.3%, missing = 1.0%). A total of 98% of the assembled genome was placed into 31 chromosomes, which displayed large-scale synteny with other well-characterized lepidopteran genomes. The E. editha genome, annotation, and functional descriptions now fill a gap for one of the leading field-based ecological model systems in North America.
Keywords: HiC scaffolding; climate-change model; genome; long-read sequencing
Year: 2022 PMID: 35876165 PMCID: PMC9348621 DOI: 10.1093/gbe/evac113
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 4.065
Fig. 1. Genome assembly assessment for the E. editha butterfly, showing improvements during genome refinement steps, the annotation, and an estimate of genome size. (A) Assessment of the content and quality of 5,286 single-copy orthologs within Lepidoptera, beginning with the initial genome assembly (fly29_purged), followed by the result of merging the genome down to a haploid copy (fly29_purged_hap; note the decrease in the number of duplicated genes, D), the HiC scaffolded genome, and the final polished version (HiC_scaff_polished). After these are the BUSCO results for the protein sets generated from the genome annotation, both for all proteins including isoforms (protein_annotation) and for only the longest isoform per locus (protein_longest_isoform). (B) Genome size estimate using k-mer counting of Illumina sequence data, showing the estimated genome size, heterozygosity, k-mer coverage, and duplication rate.
Fig. 2. Assessment of genome contiguity, showing Hi-C scaffolding results and whole genome alignment to related species. (A) Hi-C interaction matrix of the ordered scaffolds along the 31 chromosomes. (B) Circos plot of the whole genome alignment between M. cinxia chromosomes (colored blocks along the outer edge) and E. editha chromosomes (noncolored blocks), with regions of inferred orthology indicated as colored lines between them. For example, M01_B01_H21 in maroon is M. cinxia chromosome 1, which corresponds to B. mori chromosome 1 and H. melpomene chromosome 21; these are all Z chromosomes in these species. This corresponds to E. editha scaffold 4 (Eedi_4), and each maroon line connecting the two is an aligned genomic region. The overall pattern, in which the colored lines extend primarily between single chromosomes of the two species, is consistent with the highly conserved nature of chromosome evolution in the Lepidoptera. The small discrepancies likely reflect repetitive content (or low-frequency translocation events). (C) Example of phenotypic variation between a female E. editha from Rabbit meadow (left) and a male E. editha from Tamarack (right). (D) Table of sequencing and assembly summary statistics.
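The N50 reported in the abstract is a standard contiguity statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal Python sketch of that definition, assuming a plain list of contig lengths as input. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and summed until the
    running total reaches at least half of the total length; the length
    of the sequence that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (in Mb), NOT the E. editha contigs.
toy_lengths = [30, 25, 20, 10, 8, 5, 2]
toy_n50 = n50(toy_lengths)
print(f"N50 = {toy_n50} Mb")
```

In the toy example the total length is 100 Mb, and the two longest contigs (30 + 25 = 55 Mb) are the first to cover at least half of it, so the N50 is 25 Mb. Read the same way, the reported N50 of 21.2 Mb means that at least half of the 607.8 Mb assembly lies in contigs of 21.2 Mb or longer.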