Rachel A Steward, Yu Okamura, Carol L Boggs, Heiko Vogel, Christopher W Wheat.
Abstract
We report a chromosome-level assembly for Pieris macdunnoughii, a North American butterfly whose involvement in an evolutionary trap imposed by an invasive Eurasian mustard has made it an emerging model system for studying maladaptation in plant-insect interactions. Assembled using nearly 100× coverage of Oxford Nanopore long reads, the contig-level assembly comprised 106 contigs totaling 316,549,294 bases, with an N50 of 5.2 Mb. We polished the assembly with PoolSeq Illumina short-read data, demonstrating for the first time the comparable performance of individual and pooled short reads as polishing data sets. Extensive synteny between the reported contig-level assembly and a published, chromosome-level assembly of the European butterfly Pieris napi allowed us to generate a pseudochromosomal assembly of 47 contigs, placing 91.1% of our 317 Mb genome into a chromosomal framework. Additionally, we found support for a Z chromosome arrangement in P. napi, showing that the fusion event leading to this rearrangement predates the split between European and North American lineages of Pieris butterflies. This genome assembly and its functional annotation lay the groundwork for future research into the genetic basis of adaptive and maladaptive egg-laying behavior by P. macdunnoughii, contributing to our understanding of the susceptibility and responses of insects to evolutionary traps.
Keywords: Pieris; PoolSeq; evolutionary trap; genome; long-read sequencing; polishing
Year: 2021 PMID: 33739414 PMCID: PMC8085124 DOI: 10.1093/gbe/evab053
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Genome assembly pipeline and metrics for 10 genomes show dramatic improvement during refinement steps. (A) Progressive increase in N50 and decrease in total contigs during polishing and merging of the nanopore assembly. Steps to refine the assembly included polishing with Illumina whole genome short reads (WGS) from a single individual and a pool of 18 individuals. (B) Assessment of the content and quality of 5,286 lepidopteran single copy orthologs shows complete, duplicated, fragmented or missing BUSCOs across the 10 assemblies (v.01–v.10). (C) Assessment of changes in genome quality using whole genome annotations shows similar effects of polishing using individual or PoolSeq Illumina data. (D) Polishing with Illumina short reads improved the ortholog hit ratio (OHR; values closer to 1 indicate a higher quality annotation).
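For readers unfamiliar with the N50 statistic tracked in panel A, the following is a minimal sketch of how it is conventionally computed from a list of contig lengths. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study's pipeline.

```python
def n50(lengths):
    """Return the N50: the length of the contig at which the running total,
    taken from longest to shortest, first reaches half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical toy assembly (not real data): total = 20, half = 10.
toy_contigs = [8, 5, 4, 2, 1]
print(n50(toy_contigs))  # 8 + 5 = 13 reaches half the total, so N50 = 5
```

Merging contigs raises N50 and lowers the contig count at the same time, which is the pattern the caption describes.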
Genes Identified in Braker2 Annotations of Pieris macdunnoughii Assemblies and Ortholog Hit Ratio (OHR) Analysis with Bombyx mori
| Annotation | | Unpolished | Individual Polished | Pool Polished |
|---|---|---|---|---|
| Good transcripts | NA | 19,640 | 18,347 | 18,603 |
| Genes | 14,802 | 17,362 | 16,251 | 16,496 |
| Clustered proteins (90%) | 14,439 | 17,550 | 16,260 | 16,501 |
| Total | NA | 13,599 | 13,637 | 13,669 |
| Median OHR of longest hit | NA | 0.96 | 0.98 | 0.98 |
| Longest hits with OHR > 0.95 | NA | 7,062 | 8,203 | 8,233 |
| Median identity of longest hit | NA | 63.1% | 64.6% | 64.6% |
| Longest hits with identity > 95% | NA | 240 | 277 | 283 |
| Hits in both the unpolished and polished annotation | NA | NA | 13,524 | 13,550 |
Note.—NA, not applicable.
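The OHR rows above can be read with the following minimal sketch, which assumes the common definition of the ortholog hit ratio: the length of the aligned region of the reference ortholog divided by the total length of that ortholog. The exact procedure used in the study is not given here, and the coordinates below are hypothetical values chosen only for illustration.

```python
from statistics import median


def ortholog_hit_ratio(hit_start, hit_end, ortholog_length):
    """Fraction of the reference ortholog covered by the aligned region.
    Coordinates are assumed to be 1-based and inclusive."""
    if ortholog_length <= 0:
        raise ValueError("ortholog_length must be positive")
    return (hit_end - hit_start + 1) / ortholog_length


def summarize(ohrs, threshold=0.95):
    """Median OHR and the number of hits whose OHR exceeds the threshold."""
    return median(ohrs), sum(1 for value in ohrs if value > threshold)


# Hypothetical longest hits: (start, end, ortholog length); not real data.
hits = [(1, 480, 500), (11, 500, 500), (1, 250, 500)]
ohrs = [ortholog_hit_ratio(*hit) for hit in hits]
median_ohr, n_high = summarize(ohrs)
print(median_ohr, n_high)
```

An OHR near 1 means the predicted gene spans almost the full length of its ortholog, which is why polishing that repairs frameshifting indels would be expected to raise both the median and the count above 0.95.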
Chromosome-level assessment of synteny, read depth and genetic variation in Pieris macdunnoughii. (A) A circle plot showing each contig of the P. macdunnoughii v0.10 (noncolored scaffolds) and Pieris napi (colored scaffolds, showing scaffolds > 1 Mb representing 90.9% of the 318 Mb assembly) assemblies, with lines between them showing aligned genomic regions of >5,000 bp and 90% identity (for example, P. napi Chromosome 2 is covered by four P. macdunnoughii scaffolds: Sc0000044, Sc0000004, Sc0000047, and Sc0000054). (B) Detailed assessment of the alignments between the Z chromosome of P. macdunnoughii (Sc0000000) and aligned, unplaced P. napi scaffolds modScaffold_17_1 and modScaffold_95_1, supporting their inclusion on the Z chromosome at 3,003,162–5,135,852 bp and 10,002,530–10,466,150 bp, respectively. (C) The final pseudochromosomal P. macdunnoughii assembly (v0.10_RagTag) aligned with P. napi. (D) Consistently lower read depths of PoolSeq reads mapped to the pseudochromosomal assembly support conclusions about the Z chromosome (Pmac_chromosome_1_RagTag; colors as in B). (E) Nucleotide diversity (π) varied across the genome, as seen in representative autosomes 2, 3, and 23.
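As a rough illustration of the nucleotide diversity (π) shown in panel E, the sketch below computes per-site diversity from allele counts and averages it over a window. This is a textbook estimator written under simplifying assumptions; it does not apply the sampling corrections that dedicated PoolSeq tools use, and it is not presented as the method used in the study. The function names and counts are hypothetical.

```python
def site_diversity(allele_counts):
    """Unbiased per-site heterozygosity: n/(n-1) * (1 - sum of squared
    allele frequencies), where n is the number of sampled alleles."""
    n = sum(allele_counts)
    if n < 2:
        return 0.0
    sum_sq = sum((count / n) ** 2 for count in allele_counts)
    return (n / (n - 1)) * (1.0 - sum_sq)


def window_pi(variable_sites, window_length):
    """Average diversity per base: summed site diversity divided by the
    number of bases in the window (invariant sites contribute zero)."""
    if window_length <= 0:
        raise ValueError("window_length must be positive")
    return sum(site_diversity(counts) for counts in variable_sites) / window_length


# Hypothetical 100 bp window with two covered sites; not real data.
sites = [[5, 5], [10, 0]]
print(window_pi(sites, 100))
```

Computing this statistic in sliding windows along each chromosome gives the kind of genome-wide profile that the panel summarizes for autosomes 2, 3, and 23.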