Literature DB >> 35355458

High-quality reference genomes of swallowtail butterflies provide insights into their coloration evolution.

Jin-Wu He1,2, Ru Zhang2, Jie Yang2, Zhou Chang1, Li-Xin Zhu3, Si-Han Lu2, Fei-Ang Xie4, Jun-Lai Mao4, Zhi-Wei Dong1, Gui-Chun Liu1, Ping Hu1, Yan Dong3, Wen-Ting Wan1,2, Ruo-Ping Zhao1, Tian-Zhu Xiong5, Jorge L León-Cortés6, Chu-Yang Mao1, Wei Zhang7, Shuai Zhan8, Jun Li1,9, Lei Chen10, Wen Wang1,2,11, Xue-Yan Li12.   

Abstract

Swallowtail butterflies (Papilionidae) are a historically significant butterfly group due to their colorful wing patterns, extensive morphological diversity, and phylogenetically important position as a sister group to all other butterflies and have been widely studied regarding ecological adaption, phylogeny, genetics, and evolution. Notably, they contain a unique class of pigments, i.e., papiliochromes, which contribute to their color diversity and various biological functions such as predator avoidance and mate preference. To date, however, the genomic and genetic basis of their color diversity and papiliochrome origin in a phylogenetic and evolutionary context remain largely unknown. Here, we obtained high-quality reference genomes of 11 swallowtail butterfly species covering all tribes of Papilioninae and Parnassiinae using long-read sequencing technology. Combined with previously published butterfly genomes, we obtained robust phylogenetic relationships among tribes, overcoming the challenges of incomplete lineage sorting (ILS) and gene flow. Comprehensive genomic analyses indicated that the evolution of Papilionidae-specific conserved non-exonic elements (PSCNEs) and transcription factor binding sites (TFBSs) of patterning and transporter/cofactor genes, together with the rapid evolution of transporters/cofactors, likely promoted the origin and evolution of papiliochromes. These findings not only provide novel insights into the genomic basis of color diversity, especially papiliochrome origin in swallowtail butterflies, but also provide important data resources for exploring the evolution, ecology, and conservation of butterflies.

Entities:  

Keywords:  Color evolution; High-quality reference genome; Papiliochromes; Swallowtail butterfly tribe

Mesh:

Year:  2022        PMID: 35355458      PMCID: PMC9113978          DOI: 10.24272/j.issn.2095-8137.2021.303

Source DB:  PubMed          Journal:  Zool Res        ISSN: 2095-8137


INTRODUCTION

Swallowtail butterflies (Papilionidae Latreille, 1802) account for less than 5% (570 species) of all butterfly species (18 000+ described species) (Van Nieukerken et al., 2011a) but are the most historically significant group due to their extensive morphological diversity (especially wing color patterns) (Wallace, 1865) and their phylogenetically important sister position to all other butterflies (Espeland et al., 2018). Studies on Papilionidae have provided fundamental insights into the phylogeny, biogeography, conservation biology, speciation, insect-plant interactions, evolution, and ecology of butterflies (Bonebrake et al., 2010; Condamine et al., 2018; Espeland et al., 2018). The family is distributed worldwide and currently includes 32 genera, seven tribes, and three subfamilies (Häuser et al., 2005; Van Nieukerken et al., 2011b). As the largest subfamily, Papilioninae includes about 490 species worldwide, but with the greatest diversity found in the tropics (Wallace, 1865; Zakharov et al., 2004). The subfamily Parnassiinae includes 80 species, which are mainly distributed in the Palearctic region (Nazari et al., 2007). The monotypic subfamily Baroniinae accommodates one species (Baronia brevicornis) endemic to Mexico (Scriber et al., 1995; Tyler et al., 1994). Of note, Batesian mimicry in Papilionidae represents an empirical model of natural selection (Nijhout, 1991). Like all butterflies, their color patterns consist of finely tiled scale mosaics, in which the color of each scale is determined by both pigment and structure (Nijhout, 1990). Scales develop from epidermal cells, with pigments produced through a developmental process that includes patterning (i.e., spatiotemporal positioning of pigments determined by patterning genes) and biochemical synthesis (determined by effector genes) (Matsuoka & Monteiro, 2018; Wittkopp & Beldade, 2009). Patterning genes regulate the distribution of pigments by directly and indirectly activating the expression of effector genes that encode the enzymes and cofactors required for pigment biosynthesis (Wittkopp & Beldade, 2009). Papiliochromes are a class of pigments uniquely synthesized by Papilionidae (Umebachi, 1985), which produce yellow, orange, and red scales that may be advantageous in avoiding predators and facilitating intraspecific communication (Koch et al., 2000; Umebachi, 1985; Wilts et al., 2012). Although papiliochromes are specific to Papilionidae, their precursors, i.e., intermediates derived from the melanin (tyrosine metabolism) and ommochrome (tryptophan metabolism) pathways and from β-alanine (Li et al., 2015), are ubiquitous in insects. However, the origin and evolution of papiliochromes in this phylogenetically important lineage remain unknown. High-quality reference genomes not only provide comprehensive evidence for inferring robust phylogeny, but also provide a genomic background to investigate the genetic basis of phenotypic trait innovations (Chen et al., 2019). However, previous studies using different datasets and methods have revealed phylogenetic incongruence among swallowtail butterflies at the species, genus, and tribe level (Allio et al., 2020; Condamine et al., 2012, 2018; Igarashi, 1984; Yagi et al., 1999). Furthermore, all currently available reference genomes of swallowtail butterflies are limited to Papilio species and one Parnassius butterfly (Iijima et al., 2018; Li et al., 2015; Lu et al., 2019; Nishikawa et al., 2015; Palmer & Kronforst, 2020; Podsiadlowski et al., 2021). Additional reference genomes from representative swallowtail butterfly tribes are required to help clarify the genomic and genetic basis of their adaptive phenotypic traits and diversity in a well-supported phylogenetic context. Here, we generated high-quality genome assemblies for 11 species representing all tribes in Papilioninae and Parnassiinae (except the monotypic subfamily Baroniinae) using long-read sequencing technology. Combined with published butterfly genomes, we performed comprehensive phylogenetic analyses at the tribe level. Comparative genomic analysis indicated that variation in patterning and transporter/cofactor genes and their regulatory elements likely contributed to papiliochrome origin and wing color diversity in swallowtail butterflies. These genomic resources will promote further studies on the molecular basis underlying rapid diversification in butterflies and contribute to butterfly diversity conservation.

MATERIALS AND METHODS

Butterflies

The 11 swallowtail butterfly species covered all tribes from the two main subfamilies (Parnassiinae and Papilioninae) of Papilionidae. The monotypic subfamily Baroniinae was excluded due to specimen inaccessibility. Live adults were collected from natural populations and returned to the laboratory in plastic bags. For each individual, wings were dissected and kept as voucher specimens. The body samples were flash-frozen in liquid nitrogen and stored at −80 °C until use. For each species, one individual was used for Illumina sequencing and another individual for long-read sequencing. Individuals of the same sex were used for both Illumina and long-reads sequencing, except for two cases due to difficulty in collecting the same sex in the field. Details on all species are listed in Supplementary Tables S1, S2. In addition, at least one dried specimen was retained as a voucher specimen for each sequenced species. All voucher specimens were deposited in the Kunming Institute of Zoology, Chinese Academy of Sciences, China.

Genome sequencing, assembly, and annotation

All species were sequenced to obtain long reads using PacBio SMRT or Nanopore PromethION technology and to obtain short reads using Illumina next-generation methods. Assembly was performed with WTDBG v1.2.8 (Ruan & Li, 2020), and evaluated using three methods, i.e., comparison of estimated and actual assembled genome sizes, BUSCO (Benchmarking Universal Single-Copy Orthologs) analysis, and mapping ratio calculations of both Illumina and long reads. Repeat elements were annotated by combining the results of Tandem Repeat Finder v4.07b (Benson, 1999), RepeatMasker v4.0.5, and RepeatModeler v1.0.4 (Smit et al., 2015) based on Repbase TE library v16.02 (Bao et al., 2015) and LTR_FINDER v1.0.6 (Xu & Wang, 2007). Genes were de novo annotated using SNAP (Korf, 2004), GENSCAN v1.0 (Burge & Karlin, 1997), GlimmerHMM v3.0.3 (Majoros et al., 2004), and AUGUSTUS v2.5.5 (Stanke et al., 2006), combined with homology-based analysis using TBLASTN v2.2.26 (Altschul et al., 1997). EvidenceModeler v1.1.1 (Haas et al., 2008) was used to merge de novo and homology gene sets into comprehensive, non-redundant gene sets.

Phylogenetic analyses

To investigate Papilionidae phylogeny, especially at the tribe level, 17 Papilionidae species covering all tribes of Parnassiinae and Papilioninae, including the 11 genomes sequenced in this study and six previously published genomes, were included in phylogenetic analysis (Supplementary Table S2). Kallima inachus (Nymphalidae) (Yang et al., 2020a) was used as an outgroup. The phylogenetic trees were constructed using RAxML v8.2.10 (Stamatakis, 2014) based on different datasets, including multiple whole-genome alignments (WGAs), orthologous genes and their codon partitioned datasets, and conserved non-exonic elements (CNEs). Multiple WGAs were performed using the Cactus package with default parameters (https://github.com/ComparativeGenomicsToolkit/cactus) (Armstrong et al., 2020). ASTRAL-III v5.6.3 (Zhang et al., 2018) was used to perform coalescent species tree estimation. DiscoVista v1.0 (Sayyari et al., 2018) was used to quantify and visualize gene tree discordance for alternative topologies.

Hybridization inference and incomplete lineage sorting (ILS) simulation

To investigate hybridization inference and the effect of ILS on incongruent phylogeny, a representative species was selected in each Papilionidae tribe, with K. inachus used as an outgroup for analysis. Hybridization inference among tribes was examined using PhyloNet v3.6.8 (Wen et al., 2018), PhyloNetworks (Solís-Lemus et al., 2017), and ABBA-BABA test (D-statistics) using the “qpDstat” command in the AdmixTools v1.01 (Zou & Zhang, 2015). The ILS simulation was performed following Wang et al. (2018) under the multispecies coalescent model.

Genomic features related to swallowtail butterfly evolution

PhyloFit v1.4 (Hubisz et al., 2011) and phastCons v1.4 (Siepel et al., 2005) were used to infer CNEs in the swallowtail butterflies. Positively selected genes (PSGs) and rapidly evolving genes (REGs) were identified using branch and branch site models in the PAML package (Yang, 2007). Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of the above genes/gene families was performed using the KEGG Orthology-Based Annotation System (KOBAS) v3.0 (Wu et al., 2006) or the Omicshare platform (http://www.omicshare.com/tools/).

Genomic variations related to swallowtail butterfly pigmentation

We investigated genomic variations in those genes related to pigment position in space and time (patterning genes) and to biochemical synthesis of pigments (effector genes) in swallowtail butterflies. BLASTP homologous searches (E-value<1e-5) and conserved domain scanning in HMMER v3.2.1 (Finn et al., 2015) were performed against the reference genomes of multiple butterfly species (17 swallowtail butterflies, dead-leaf butterfly (K. inachus), and three moths (Trichoplusia ni, Hyposmocoma kahamanoa, and Cydia pomonella)) to identify candidate patterning and effector genes. In addition, gene trees were constructed using RAxML v8.2.10 (Stamatakis, 2014) and homologous searches in the NCBI protein database were used to exclude those candidate genes still in doubt. Finally, the evolutionary features of the obtained candidate genes/gene families were associated with the above results, including Papilionidae-specific CNEs (PSCNEs), PSGs, and REGs. Specifically, Papilionidae-specific transcription factor binding sites (TFBSs) in the upstream regulatory regions of the genes were predicted using JASPAR 2020 (http://jaspar.genereg.net/search?q=&collection=CORE&tax_group=insects) (Fornes et al., 2020). Cytoscape v3.8.0 (Shannon et al., 2003) was used to integrate gene relationships (i.e., patterning genes, effector genes, and predicted TFs) into a network. Details on the methods mentioned above are described in the Supplementary Information.

RESULTS

We de novo assembled the genomes of 11 species representing all tribes in Papilioninae and Parnassiinae (excluding the monotypic subfamily Baroniinae) (Supplementary Table S2) using long-reads (Nanopore/PacBio: ~56–174× coverage) and short-reads (Illumina: ~75–203× coverage) sequencing technologies (Supplementary Table S1). The sizes of the assembled genomes varied from 240 Mb (Papilioninae: Papilionini: Papilio demoleus) to 1176 Mb (Parnassiinae: Parnassini:Parnassius orleans), consistent with their estimated genome sizes using k-mer (Table 1) and C-value analysis (Liu et al., 2020). The long scaffolds (N50: 2.3–14.8 Mb) and high BUSCO scores (89.9%–98.1%) suggested high contiguity and completeness of all assemblies. After masking repeats (Supplementary Table S3: 30.5%–64.8%), we used both de novo- and homology-based gene prediction to annotate the genomes. The final annotated genes among swallowtail butterflies varied from 12956 to 19416 (Supplementary Table S3). The gene structures were similar to each other (Supplementary Figure S1 and Table S3), indicating that our annotations were fairly comprehensive. In addition, our results indicated that Parnassius orleans, which lives in high altitude habitat, possessed the largest genome (1176 Mb) in Papilionidae, which is consistent with that of another Parnassius butterfly (P. apollo) (1392 Mb) (Podsiadlowski et al., 2021).
Table 1

Genome assembly and quality estimation of swallowtail butterflies

SubfamilyTribeSpeciesK-mer Genome assemblyReads mapping ratio (%)
Estimated size (Mb) Assembly size (Mb) Scaffold N50 (Mb) Complete BUSCO ratio (%) Illumina reads Long reads
ParnassiinaeParnassiniParnassius orleans 1 1151 1763.292.098.396
LuehdorfiiniLuehdorfia chinensis 6786562.390.498.993
ZerynthiniSericinus montelus 5715675.595.897.588.4
Bhutanitis thaidina 44944914.89899.199
PapilioninaeLeptocirciniLamproptera curius 5505952.589.995.885.8
TeinopalpiniTeinopalpus imperialis 53550412.595.897.193.0
TroidiniTroides helena 33633010.596.697.478.1
Byasa hedistus 2712889.298.193.498.5
PapilioniniMeandrusa payeni 40639212.596.398.897.5
Papilio demoleus 2402409.197.597.899.0
Papilio protenor 2142465.497.497.798.7

High-quality genomic data revealed a well-supported swallowtail butterfly phylogeny

Phylogeny is fundamental for understanding the evolution of the diverse phenotypes of swallowtail butterflies. High-quality genomic data have unmatched advantages in phylogenetic analysis (Chen et al., 2019; Lü et al., 2021; Yuan et al., 2021). In this study, we reconstructed the phylogenetic trees for 17 swallowtail butterflies covering all tribes of Papilionidae, except for the monotypic subfamily Baroniinae. We first established a WGA tree with RAxML v8.2.10 (Stamatakis, 2014) using K. inachus as an outgroup. In total, ~45 Mb of orthologous syntenic sequences were obtained from the WGA, yielding a whole-genome tree with 100% bootstrap support for all nodes (Figure 1A). We then performed phylogenetic analyses with different genomic partitions, including the amino acid sequences of 12 274 one-to-one orthologous protein-coding genes and their derived sequences (four-fold degenerate (4d) sites, all concatenated codons, third position only (3rd), first-second positions only (1st-2nd), third and first-second positions (1st-2nd+3rd)), and CNEs. Aside from codon position (1st-2nd and 1st-2nd+3rd) (Supplementary Figure S2B), all other tree topologies (Supplementary Figure S2A) were identical to that of the WGA tree (Figure 1A), consistent with the tree based on 6 621 orthologs (Allio et al., 2020). The discordant tree based on codon site partitions (Supplementary Figure S2B) was observed in a previous butterfly phylogenetic tree based on 350 genes (Espeland et al., 2018), which may result from high-base compositional heterogeneity of codon-positions in protein-coding genes (Jarvis et al., 2014).
Figure 1

Phylogeny of swallowtail butterflies

Phylogeny of swallowtail butterflies A: Maximum-likelihood (ML) phylogenetic tree from ~45 Mb of syntenic whole-genome alignment (WGA) sequences of 17 swallowtail butterfly species (Papilionidae) and Kallima inachus (outgroup). One hundred bootstraps were used to compute node support, with all nodes showing 100% support. B: Superimposed ultrametric gene trees in a consensus DensiTree plot. Window-based gene trees (WGTs) and multispecies coalescent simulated gene trees are shown as observed and simulated, respectively. We further assessed genome-wide tree heterogeneity by randomly extracting 2 000 genomic 5 kb windows from WGAs to generate window-based gene trees (WGTs). Phylogenetic discordance at the tribe level was also pervasive across genomic regions (Figure 1B). Accounting for WGT discordance, we performed a coalescent-based method for inferring species trees from WGTs with ASTRAL-III v5.6.3. The results (Supplementary Figure S2A) were identical to the WGA tree, thus supporting the robustness of the inferred WGA phylogeny of the swallowtail butterflies. Nevertheless, the significant differences in quartet frequencies of branch 12 (Leptocircini+all other Papilioninae) in the two alternative topologies (Supplementary Figure S3) suggested that the incongruent positioning of L. curius (Leptocircini) may be caused by ILS and hybridization (Sayyari et al., 2018). We performed ILS simulation using a multispecies coalescent-based method (Wang et al., 2018). The F-statistic (P<2.2e-16), adjustedR2 (0.791), and DensiTree plot of the topology results (Figure 1B) between the simulated and observed gene trees indicated that ILS contributed to the incongruent placement among tribes. We also performed hybridization inference using two methods. PhyloNet and PhyloNetworks analyses based on the WGTs (Supplementary Figure S4A–E) suggested the occurrence of ancient gene flow between Leptocircini (L. curius) and other tribes. The classic ABBA-BABA test (Supplementary Figure S4F, G) also suggested that phylogenetic discordance may be caused by ancient gene flow. Taken together, both ILS and gene flow may have jointly contributed to the observed topological discordances in the swallowtail butterfly phylogeny, similar to the findings reported for Heliconius (Edelman et al., 2019).

PSCNEs promote evolution of swallowtail butterflies

Using nine outgroups (Supplementary Table S2), we identified 230 528 CNEs (≥20 bp) (15.55 Mb) in Papilionidae (Figure 2A). Among them, 65% (150 029 CNEs, 10.33 Mb) were PSCNEs, including 46.53% Type I PSCNEs (107 265 CNEs, 5.58 Mb), in which sequences were not found in outgroups, and 18.55% Type II PSCNEs (42 764 CNEs, 4.75 Mb), which had orthologous sequences in at least three outgroup species but were only conserved in Papilionidae. Both CNEs and PSCNEs were mainly located in the intergenic regions, with ~10% of them observed 2 kb upstream of genes (Figure 2B). Furthermore, 95.26% (142 924 CNEs) of PSCNEs contained the TFBSs (10 248 879) of 585 known transcription factors (TFs) (Supplementary Table S4). The neighboring genes associated with these TFBSs in PSCNEs were enriched in several important signaling pathways (e.g., Wnt, Ras, and Rap1 signaling pathways), transport and catabolism (e.g., endocytosis), and endocrine system (melanogenesis) (Figure 2C; Supplementary Table S5). Wnt signaling underlies the evolution and development of butterfly wing pattern diversity (Martin et al., 2012; Martin & Reed, 2014). Both Rap and Ras are members of the small GTPase family and play important roles in the morphogenesis of fruit flies (Mishra et al., 2005; Reiner & Lundquist, 2018). Taken together, these results suggest that PSCNEs and associated genes may play important roles in trait evolution of swallowtail butterflies, especially morphological evolution.
Figure 2

Evolution of conserved non-exonic elements (CNEs) among swallowtail butterfly genomes

Evolution of conserved non-exonic elements (CNEs) among swallowtail butterfly genomes A: Length and number of CNEs and Papilionidae-specific CNEs (PSCNEs). B: Location distribution of CNEs and PSCNEs in different genomic regions. C: KEGG enrichment (P<0.05) of neighboring genes physically associated with transcription factor binding sites (TFBSs) in PSCNEs. We selected top 20 KEGG terms ranked by count in PSCNEs. Pathways possibly related to color diversity in butterflies are highlighted in bold.

Evolution of genes and gene families related to unique and diverse traits of swallowtail butterflies

We acquired a high-confidence gene set (7 126 single-copy orthologs) for the 17 swallowtail butterflies and four outgroups (Supplementary Table S2) based on conserved genome synteny (Chen et al., 2019). The lineage-specific evolutionary rates for each branch were run for each ortholog using the CODEML program in the PAML package (Yang, 2007). Results demonstrated that the average evolutionary rate (Ka/Ks) within the swallowtail butterfly species (0.056) was twice as fast as that of the outgroups (0.025) (Supplementary Figure S5A). The optimized branch-site model and branch model in the PAML package were used to detect PSGs and REGs along specific lineages, respectively. In total, 264 PSGs and 402 REGs were identified in the ancestral branch of Papilionidae (Supplementary Figure S5B) and were functionally enriched in several important signal transduction pathways (e.g., Ras, Rap1, and Notch signaling pathways) and membrane transport (e.g., adenosine triphosphate (ATP)-binding cassette (ABC) transporter) (Supplementary Table S6). Members of the small GTPase family, including Rap, Ras, and Rab, and the Notch signaling pathway play important roles during morphological development of the fruit fly eye (Mishra et al., 2005; Sundaram, 2005). ABC transporters (especially ABCG members) and Rab are crucial proteins for the transport the pigment precursors or related metabolites from the cytoplasm to subcellular pigment granules (Bhuin & Roy, 2014; Lloyd et al., 1998). Combined with the PSCNE and TFBS results mentioned above, these findings suggest that the pathways related to transportation and signaling may be related to the evolution of the unique and diverse traits of swallowtail butterflies.

Lineage-specific evolution of patterning and effector genes may contribute to papiliochrome origin in swallowtail butterflies

We explored the evolution of both cis-regulatory and protein-coding regions in patterning genes/gene families and effector genes (enzymes and transportation-related proteins) that may contribute to papiliochrome biosynthesis (Figure 3). Our results (Supplementary Figures S6–S8 and Tables S7–S9) indicated that most patterning genes (e.g., wingless (wg), wnt2,wnt6,wnt10, abdominal A (abd-A), cortex, optix, aristaless 1 and 2 (art1/art2), engrailed/invected (en/inv),hairless) contained PSCNEs in their upstream cis-regulatory regions (0–5 kb). Furthermore, Papilionidae-specific TFBSs were also predicted in the PSCNE regions of many genes (e.g., Figure 3: art1/art2,abd-A, en/inv,optix,wg/wnt10), with each TFBS having a binding site for 1–44 TFs. It is worth mentioning that en/inv are highly conserved in Lepidoptera (Figure 4A). Taking en as an example, we identified four PSCNEs in its upstream region (0–5 kb) and six Papilionidae-specific TFBSs (TTATTGAA (hematopoietically expressed homeobox (HHEX)), TAAT (ultrabithorax (Ubx)), G[T/C]AATTAAG, [G/A]AACA (araucan (ara)), [A/G][C/T]GGCGCC (brinker (brk)), and TAAC[G/A]A (ladybird early (lbe))) in its PSCNEs (Figure 4B; Supplementary Tables S8, S9). Notably, one TFBS (G[T/C]AATTAAG) could bind to 31 TFs, including abd-A, Ubx, Distal-less (Dll), en, apterous (ap), and Drop (Dr/msh), which are related to wing disc development in Drosophila and butterflies (Gorfinkiel et al., 1997; Halfon, 2017; Milán et al., 2001; Tong et al., 2014; Villa-Cuesta & Modolell, 2005). As a paralogous gene, inv also evolved many PSCNEs and Papilionidae-specific TFBSs (Supplementary Figure S6).
Figure 3

Schematic of papiliochrome biosynthesis

Figure 4

Evolution of papiliochrome biosynthesis-related genes

Schematic of papiliochrome biosynthesis Firstly, spatiotemporal expression of papiliochromes is mediated by different transcription factors (TFs) and patterning genes (above). Effector genes include two major categories. Enzymes (bottom left) that control biochemical synthesis of papiliochrome precursors. Transporters (such as ATP-binding cassette (ABC) transmembrane transporters and small G-proteins (Rab GTPases)) and their cofactors (bottom right) possibly contribute to the transport of pigment precursors to pigment granules, where different pigment precursors meet to combine into papiliochromes. art1/art2, two copies of aristaless; abd-A, abdominal A;en/inv paralogous genes engrailed andinvected; wg, wingless; ABCGs, subfamily of ABC; GEF, guanine nucleotide exchange factor; GAP, GTPase-activating protein; E3, E3 ubiquitin ligase. Evolution of papiliochrome biosynthesis-related genes A: Syntenic relationships of orthologous patterning genes (en/inv: engrailed and invected) among Lepidoptera were identified and visualized using MCScan. Species abbreviations with light blue and light brown backgrounds in tree represent species in Papilionidae (ingroups) and outgroups, respectively (same in panels B and D). B: Evolution of PSCNEs and Papilionidae-conserved TFBSs in upstream region (0–5 kb) of patterning gene (en). C: Phylogenetic tree of effector genes (small GTPases: Rab family). Papilio bianor and Drosophila melanogaster represent whole ingroup species (Papilionidae) and outgroup species, respectively. Genes labeled with gene ID evolved into Papilionidae-specific TFBSs in their upstream regions, or evolved Papilionidae-specific positively selected sites in their coding regions. Red arrow indicates Lepidoptera-specific (Lep) clade. REG, rapidly evolving gene; PSG, positively selected gene. D: Evolution of positively selected effector gene (Rab: Pb_06387) in Papilionidae. E: Network of patterning and effector genes in pigmentation and predicted TFs in their upstream regulatory regions. Size of different nodes is related to number of connections among nodes. Hub genes (number of connection nodes exceeding 10) in gene regulatory network are highlighted in red. We next investigated the effector genes/gene families encoding the enzymes in papiliochrome biosynthesis (Figure 3; Supplementary Tables S7–S9) (Li et al., 2015). Results revealed no positive selection or rapid evolution for any gene encoding the enzymes involved in papiliochrome biosynthesis. Furthermore, except forblack and laccase (Pb_03719/CG32554 and Pb_14699), no PSCNEs or Papilionidae-specific TFBSs were identified in the upstream regulatory region (0–5 kb) of the genes. These findings suggest conserved evolution in both the protein-coding and cis-regulatory regions of these enzyme-coding genes. We then investigated those effector genes/gene families that potentially function as transporters/cofactors of pigment precursors (Figure 3; Supplementary Tables S7–S9). Results demonstrated that six ABC genes (Supplementary Figure S9A), four Rab genes (Figure 4C), three guanine nucleotide exchange factor (GEF) genes, one GTPase activating protein (GAP) encoding gene, and eight E3 ubiquitin ligase (E3) encoding genes showed positive selection in Papilionidae (Supplementary Table S8). Among them, Papilionidae-specific positively selected sites were observed in five transporter/cofactor genes (ABCG: Pb_03311; Rab: Pb_06387; GEF: Pb_09814; and E3 E3: Pb_08975, and Pb_09937) (Figure 4D; Supplementary Figures S9B, S10 and Table S8). In addition, several PSCNEs and/or Papilionidae-specific TFBSs were identified in the upstream regulatory regions of certain ABCG, Rab, GEF, GAP, and E3 genes (Supplementary Figures S9, S11, S12, and Table S8). We finally explored the relationships among patterning genes, effector genes, and all predicted TFs in their upstream regulatory regions through network analysis. Results suggested that both patterning genes and transporter/cofactor genes formed a complex regulatory network connected via predicted TFs. Three patterning genes (inv, en, and art2) and four predicted TFs (Ubx and iroquois gene complex (Iro-C: ara, mirror (mirr) and caupolican (caup)) were important hubs in the gene regulatory network (Figure 4E). Ubx alters the embryonic body plan and development of eyespots and hindwing patterns in Bicyclus anynana butterflies by acquiring different sets of target genes (Matsuoka & Monteiro, 2021; Tong et al., 2014). In addition, Iro-C is an essential component of the boundary between the body wall and wings in Drosophila (Villa-Cuesta & Modolell, 2005). Four predicted TFs were also connected to many patterning genes (e.g., inv, en, art2,wnt10, optix, abd-A, anddoublesex (dsx)) and some transporters (e.g., ABCGs:Pb_11403_brown and Pb_07121; Rabs: Pb_05109 and Pb_04926; GEF: Pb_01303_DENN; and E3: Pb_11739). Among them, several Rabs (Pb_05109, Pb_12217, and Pb_06852) were located in the Lepidoptera-specific clade (Figure 4C) and all contained PSCNEs and the same TF (lbe) in their 0–5 kb upstream regions (Supplementary Figures S11D, S12A, B and Table S8). Intriguingly, lbe encodes a TF involved in regulating the embryonic expression of wg (Jagla et al., 1994). The latter is associated with color pattern induction in butterfly wing pattern evolution (Martin & Reed, 2010, 2014). Double staining experiments revealed that ladybird late (lbl) and lbe were expressed in cells just anterior to those expressing en. We also predicted that lbl/lbe was located in the 0–5 kb upstream regions of en (Figure 4B). In addition, another Lepidoptera-specific Rab (Pb_06387) evolved a Papilionidae-specific positively selected site (Figure 4D: Q>C). Network analysis, together with CNE, REG, and PSG, suggested that the divergent evolution of these patterning and transporter/cofactor effector genes may have contributed to Papilionidae-specific traits such as papiliochrome biosynthesis via involvement in a complex gene regulatory network.

DISCUSSION

The prerequisite for genetic and evolutionary studies is to clarify the historical relationships among organisms. Although there have been many systematic studies of Papilionidae, debate over historical relationships still exists among swallowtail butterflies at the tribe, genus, and species level due to the quality of datasets (morphological, genetic, mitogenomic, transcriptomic, ultra-conserved element, and genomic data) or limited key taxon sampling (Allio et al., 2020; Condamine et al., 2012, 2018; Igarashi, 1984; Yagi et al., 1999). In this study, we dissected the reference genomes of 11 swallowtail butterfly species representing all tribes in Papilioninae and Parnassiinae, except subfamily Baroniinae, using long-reads sequencing technology. Including six previously published Papilio genomes (Cong et al., 2015; Iijima et al., 2018; Li et al., 2015; Lu et al., 2019), 17 reference genomes were used to infer the phylogeny of Papilionidae and investigate the genomic and genetic basis of their phenotypic trait innovations. Based on comprehensive datasets of the high-quality reference genomes, our results supported Leptocircini as a sister clade to the remaining Papilioninae, confirming that Papilioninae is monophyletic (Allio et al., 2020). To exclude base-bias of codon positions (1st-2nd and 1st-2nd+3rd) (Supplementary Figure S2B), we generated 2 000 WGT trees. Among them, the true topology ratio (species tree) accounted for only 17% and the inconsistent Leptocircini relationship ratio accounted for 13%. Apart from phylogenetic estimation error, the many anomalous gene trees revealed that swallowtail butterflies have likely undergone complicated evolutionary processes. The topology frequencies based on DiscoVista (Supplementary Figure S3), ILS simulation (Figure 1B), PhyloNet and PhyloNetworks (Supplementary Figure S4A–E), and classic ABBA-BABA analysis results (Figure 4F, G) suggested that both ILS and gene flow may have jointly contributed to the observed topological discordances in the phylogeny of swallowtail butterflies. Insufficient sampling may also have impacted the discordant phylogenetic relationships of Leptocircini (Allio et al., 2020). Moreover, deep phylogenetic incongruences between gene and species trees are pervasive in both plants and animals (Wang et al., 2018; Yang et al., 2020b). The significance of such events (ILS and gene flow) in the evolution of swallowtail butterflies is still unknown, but similar events have been reported as ecologically adaptive (Jiao et al., 2020; Mallet et al., 2016). For example, closely related Heliconius species promiscuously exchange protective color-pattern genes to mimic adaptive radiation and speciation (Edelman et al., 2019). Butterfly wing patterns are an ideal model in the field of evolutionary developmental biology (Beldade & Brakefield, 2002; Oxford & Gillespie, 1998). An important tenet of evolutionary developmental biology is that adaptive mutations affecting morphology are more likely to occur in the cis-regulatory regions of the pleiotropic developmental regulatory loci and in the target genes within the vast networks they control than in the protein-coding regions of genes (Carroll, 2008; Hoekstra & Coyne, 2007), and conserved non-exonic genomic sequences usually function in cis regulation of neighboring genes (Nelson & Wardle, 2013; Polychronopoulos et al., 2017; Vavouri et al., 2007). Our high-quality reference genomes provide feasibility and reliability for studying genome-wide CNEs in a comparative genomic context. Many PSCNEs that contain Papilionidae-specific TFBSs were observed in the swallowtail butterfly genomes, suggesting that these CNEs have contributed to adaptive evolution. Based on our findings in Papilionidae, we anticipate that more comprehensive investigations on genome-wide CNEs of more representative butterfly taxa in other families will contribute to our understanding of butterfly diversity evolution. Papiliochromes are a unique class of pigments in swallowtail butterflies (Umebachi, 1985), despite the ubiquity of their precursors, i.e., dopamine, β-alanine, and kynurenine (Li et al., 2015). Transportation of pigment precursors or related metabolites from the cytoplasm to subcellular pigment granules is a key step in insect pigmentation and is usually executed via transporter proteins, such as ABC transmembrane transporters (especially ABCG members) and small G-proteins (e.g., Rabs) (Bhuin & Roy, 2014; Lloyd et al., 1998) and their cofactors (GEFs and E3s). The activation and inactivation of Rab is tightly controlled by specific GEFs, which activate proteins by promoting guanosine diphosphate for guanosine triphosphate exchange, and by GAPs, which cause inactivation by enhancing low intrinsic GTPase activity (Bernards, 2003; Ishida et al., 2016). Abundant E3s guarantee specific substrate recognition and activate GEFs as coordinators of vesicle traffic (Hochrainer et al., 2005; Lamber et al., 2019). Thus, we hypothesize that the evolution of transporter proteins and/or their regulatory network may play key roles in papiliochrome origin. Interestingly, our comparative genomic analyses identified lineage-specific evolution of some transportation-related proteins in both their protein-coding regions and upstream regulatory regions (Supplementary Figures S9–S12 and Tables S7–S9). With this hypothesis in mind and encouraged by our comparative genomic analysis results, we focused on the evolution of certain representative patterning genes, enzyme-coding genes related to Papiliochrome biosynthesis, and transporter/cofactor genes. Our results indicated that most patterning genes contained PSCNEs in their upstream regulatory regions (0–5 kb), and Papilionidae-specific TFBSs were also predicted in these PSCNE regions, e.g., wg, wnt10, abd-A, optix, art1, art2, en, and inv. Wg and wnt10 are the main members of the wg signaling pathway, which plays multiple roles in the regulation of tissue growth, polarity, and patterning (Swarup & Verheyen, 2012). Abd-A, together with two other homeobox genes (abd-B and clawless), plays an important role in camouflage color pre-patterning in Papilio xuthus (Jin et al., 2019). Optix and aristaless (art1 and art2), two major-effect Mendelian loci, are reused in butterfly mimicry polymorphisms (Vankuren et al., 2019). En/inv also participates in mimicry diversification in Papilio dardanus (Timmermans et al., 2020). Notably, in the current study, three patterning genes (i.e., inv, en, and art2) occupied important hubs in the gene regulatory network (Figure 4E). Thus, we speculate that diverse patterning genes contribute to the diverse color evolution in swallowtail butterflies. The PSCNEs and Papilionidae-specific TFBSs in their upstream regulatory regions revealed that Papilionidae-specific regulation in patterning genes may contribute to the convergent evolution of color, such as papiliochromes. We also identified PSCNEs and Papilionidae-specific TFBSs in the upstream cis-regulatory regions (0–5 kb) of some ABCG transporters (e.g., ABCGs:Pb_11403_brown and Pb_07121; Rab transporters (Lepidoptera-specific clade): Pb_05109 and Pb_04926; a well-known pigment transporter in the ommochrome pathway: lightoid) and their cofactor genes (GEF: Pb_01303_DENN; E3: Pb_11739) (Supplementary Table S8). Furthermore, our results indicated that one ABCG transporter (Pb_03311_white), one Rab transporter (Pb_06387), and their cofactors (GEF: Pb_09814; E3: Pb_08975 and Pb_09937) have evolved to be Papilionidae-specific positively selected sites (Figure 4D; Supplementary Figures S9B, S10 and Table S8). Among these transporters/cofactors, white, brown, and scarlet are well-studied ABC members and are involved in the uptake of pigment precursors in the ommochrome and pteridine pathways in the development cells of Malpighian tubules and compound eyes (Tatematsu et al., 2011). In addition, lightoid plays an important role in eye color via participating in the biogenesis and degradation of pigment granules (Ma et al., 2004). Although our previous gene-editing experiments suggest that these four transporters do not contribute to papiliochromes inP. xuthus, they play a key role in body color mimicry in swallowtail butterfly larvae (Liu et al., 2021). These findings also suggest that other transporters (e.g., Lepidoptera-specific Rab transporter) and their cofactors may play key roles in papiliochrome origin and evolution. Thus, we deduce that the PSCNEs and Papilionidae-specific TFBSs of patterning genes may have ensured the spatiotemporal localization of papiliochromes, while PSCNEs, Papilionidae-specific TFBSs, rapid evolution, and positive selection of transporters/cofactors such as Rabs, GEFs, and E3 may have promoted the transportation of papiliochrome precursors into pigment granules, thus promoting the origin and evolution of papiliochromes.

CONCLUSIONS

We obtained high-quality reference genomes for 11 swallowtail butterflies covering all tribes of Papilioninae and Parnassiinae, thus providing important data resources for exploring the genetic basis of morphological diversity in a comprehensive phylogenetic context. We reconstructed a robust phylogeny among tribes in the presence of ILS and gene flow. We also observed that specific mutations in both protein-coding regions and regulatory elements along the Papilionidae lineage may have contributed to morphological diversity in swallowtail butterflies, especially their wing color patterns. Furthermore, we found that PSCNEs and Papilionidae-specific TFBSs of patterning and transporter/cofactor genes as well as the rapid evolution and positive selection of certain transporters/cofactors probably promoted the origin and evolution of papiliochromes in swallowtail butterflies. These findings suggest that both the cis-regulatory and protein-coding regions play important roles in morphological evolution, with changes in the former more often observed in the patterning genes of the upstream regulatory network and changes in the latter more often found in the effector genes of the downstream regulatory network. These results provide a reference for future functional studies on coloration genetics and evolution in swallowtail butterflies.

DATA AVAILABILITY

All data on the 11 swallowtail butterflies produced from our Butterfly Genome Project were submitted to NCBI (BioProjectID PRJNA714807). Illumina short reads were submitted to SRA with accession Nos.: SRR14205975–SRR14205981, SRR14205983, SRR14205993–SRR14205995. Nanopore/PacBio long reads were submitted to SRA with accession Nos.: SRR14205982, SRR14205984–SRR14205992, SRR14209577. Genome assemblies were deposited in NCBI under accession Nos. JAGSM[P-Z]000000000 and Science Data Bank (http://www.scidb.cn/cstr/31253.11.sciencedb.o00023.00001) and GSA under accession Nos.: GWHBHRU00000000, GWHBHRV00000000, GWHBHSA00000000, GWHBHSB00000000, GWHBHSC00000000, GWHBHSD00000000, GWHBHSE00000000, GWHBHSF00000000, GWHBHSG00000000, GWHBHSH00000000, GWHBHSI00000000.

SUPPLEMENTARY DATA

Supplementary data to this article can be found online. Supplementary data to this article can be found online. Click here for additional data file.

AUTHORS’ CONTRIBUTIONS

X.Y.L. and W.W. conceived the study, designed the scientific objectives, and led the project and manuscript preparation. L.C. supervised the data analyses and helped with the manuscript. J.L.M., F.A.X., and J.Y. conducted genome assembly and annotation. J.W.H., R.Z., and J.Y. performed phylogenetic analyses. R.Z., J.Y., and J.W.H. performed evolutionary analyses. R.Z. and J.W.H. performed CNE analyses. J.W.H., X.Y.L., P.H., and J.L. performed analyses on color-related genomic features. Z.C., Z.W.D., L.X.Z., Y.D., G.C.L., W.T.W., and Z.R.Z. collected and raised butterflies and prepared DNA samples for genomic sequencing. W.Z., L.Z., C.Y.M, and S.Z. helped in data analyses and manuscript improvement. T.Z.X., J.L., and L.C. helped in designing study materials and manuscript improvement. X.Y.L., J.W.H., W.W., R.Z., J.Y., S.H.L., Z.C., and L.X.Z. wrote the manuscript. All authors read and approved the final version of the manuscript.
  72 in total

1.  The molecular basis of melanism and mimicry in a swallowtail butterfly.

Authors:  P B Koch; B Behnecke; R H ffrench-Constant
Journal:  Curr Biol       Date:  2000-05-18       Impact factor: 10.834

2.  The human HERC family of ubiquitin ligases: novel members, genomic organization, expression profiling, and evolutionary aspects.

Authors:  Karin Hochrainer; Herbert Mayer; Ulrike Baranyi; Berndr Binder; Joachim Lipp; Renate Kroismayr
Journal:  Genomics       Date:  2005-02       Impact factor: 5.736

Review 3.  The locus of evolution: evo devo and the genetics of adaptation.

Authors:  Hopi E Hoekstra; Jerry A Coyne
Journal:  Evolution       Date:  2007-05       Impact factor: 3.694

4.  Prediction of complete gene structures in human genomic DNA.

Authors:  C Burge; S Karlin
Journal:  J Mol Biol       Date:  1997-04-25       Impact factor: 5.469

5.  Lightoid and Claret: a rab GTPase and its putative guanine nucleotide exchange factor in biogenesis of Drosophila eye pigment granules.

Authors:  Jinping Ma; Heide Plesken; Jessica E Treisman; Irit Edelman-Novemsky; Mindong Ren
Journal:  Proc Natl Acad Sci U S A       Date:  2004-08-02       Impact factor: 11.205

6.  Inferring Phylogenetic Networks Using PhyloNet.

Authors:  Dingqiao Wen; Yun Yu; Jiafan Zhu; Luay Nakhleh
Journal:  Syst Biol       Date:  2018-07-01       Impact factor: 9.160

7.  Parallel evolution of conserved non-coding elements that target a common set of developmental regulatory genes from worms to humans.

Authors:  Tanya Vavouri; Klaudia Walter; Walter R Gilks; Ben Lehner; Greg Elgar
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583

8.  Outbred genome sequencing and CRISPR/Cas9 gene editing in butterflies.

Authors:  Xueyan Li; Dingding Fan; Wei Zhang; Guichun Liu; Lu Zhang; Li Zhao; Xiaodong Fang; Lei Chen; Yang Dong; Yuan Chen; Yun Ding; Ruoping Zhao; Mingji Feng; Yabing Zhu; Yue Feng; Xuanting Jiang; Deying Zhu; Hui Xiang; Xikan Feng; Shuaicheng Li; Jun Wang; Guojie Zhang; Marcus R Kronforst; Wen Wang
Journal:  Nat Commun       Date:  2015-09-10       Impact factor: 14.919

9.  JASPAR 2020: update of the open-access database of transcription factor binding profiles.

Authors:  Oriol Fornes; Jaime A Castro-Mondragon; Aziz Khan; Robin van der Lee; Xi Zhang; Phillip A Richmond; Bhavi P Modi; Solenne Correard; Marius Gheorghe; Damir Baranašić; Walter Santana-Garcia; Ge Tan; Jeanne Chèneby; Benoit Ballester; François Parcy; Albin Sandelin; Boris Lenhard; Wyeth W Wasserman; Anthony Mathelier
Journal:  Nucleic Acids Res       Date:  2020-01-08       Impact factor: 16.971

10.  Tiger Swallowtail Genome Reveals Mechanisms for Speciation and Caterpillar Chemical Defense.

Authors:  Qian Cong; Dominika Borek; Zbyszek Otwinowski; Nick V Grishin
Journal:  Cell Rep       Date:  2015-02-13       Impact factor: 9.423

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.