Martijn J T N Timmermans, Simon W Baxter, Rebecca Clark, David G Heckel, Heiko Vogel, Steve Collins, Alexie Papanicolaou, Iva Fukova, Mathieu Joron, Martin J Thompson, Chris D Jiggins, Richard H ffrench-Constant, Alfried P Vogler.
Abstract
The African Mocker Swallowtail, Papilio dardanus, is a textbook example in evolutionary genetics. Classical breeding experiments have shown that wing pattern variation in this polymorphic Batesian mimic is determined by the polyallelic H locus that controls a set of distinct mimetic phenotypes. Using bacterial artificial chromosome (BAC) sequencing, recombination analyses and comparative genomics, we show that H co-segregates with an interval of less than 500 kb that is collinear with two other Lepidoptera genomes and contains 24 genes, including the transcription factor genes engrailed (en) and invected (inv). H is located in a region of conserved gene order, which argues against any role for genomic translocations in the evolution of a hypothesized multi-gene mimicry locus. Natural populations of P. dardanus show significant associations of specific morphs with single nucleotide polymorphisms (SNPs), centred on en. In addition, SNP variation in the H region reveals evidence of non-neutral molecular evolution in the en gene alone. We find evidence for a duplication potentially driving physical constraints on recombination in the lamborni morph. Absence of perfect linkage disequilibrium between different genes in the other morphs suggests that H is limited to nucleotide positions in the regulatory and coding regions of en. Our results therefore support the hypothesis that a single gene underlies wing pattern variation in P. dardanus.
Keywords: Batesian mimicry; Lepidoptera; genotype–phenotype associations; polymorphism; supergene
Year: 2014 PMID: 24920480 PMCID: PMC4071540 DOI: 10.1098/rspb.2014.0465
Source DB: PubMed Journal: Proc Biol Sci ISSN: 0962-8452 Impact factor: 5.349
Figure 1. (a) Examples of phenotypes displayed by P. dardanus and their presumed models. The arrangement from left to right represents the female dominance hierarchy, starting with the bottom recessive f. hippocoonides and ending with the top-dominant f. poultoni. (b) A genomic map of the H region. Inferred gene products, including possible candidate genes, are mapped relative to the ACT flanking marker [17] (see the electronic supplementary material for additional information). White and red circles denote sequence markers used to test for recombination in a f. cenea–f. hippocoonides laboratory cross [17], indicating co-segregation (white) and recombination (red) with the H phenotype. Bottom: homologous regions of P. dardanus, H. melpomene (www.butterflygenome.org) and B. mori (http://sgp.dna.affrc.go.jp/KAIKO/). Predicted protein-coding genes are shown by thick red and green lines, and their directions of transcription are indicated by thin vertical lines marking the 3′-end of the coding region. The scale bar at the right shows physical distances in kilobases. Conserved gene orders in the three species are indicated by alternating red and green shading. Grey shading links several loci that are absent from the BAC tile path, but whose presence was confirmed by next generation sequencing (NGS). Numbers 1–12 refer to loci used in the analysis of SNP associations.
Number of specimens used in this study, their phenotype, subspecies and year of sampling. Papilio dardanus dardanus from Kakamega, P. dardanus polytrophus from Mt. Kenya, P. dardanus tibullus from Watamu, Shimoni, Nguruweni or Taita Hills. Full details and specimen voucher numbers are given in the electronic supplementary material.
| subspecies (year) | (proto) | total | ||||
|---|---|---|---|---|---|---|
| polytrophus (2002–2003) | 16 | 15 | 16 | 1 | 17 | 65 |
| 2 | 6 | 2 | 1 | 3 | 14 | |
| 3 | 3 | |||||
| 10 | 2 | 3 | 1 | |||
| commercial (2008) | 1 | 1 | ||||
| total | 28 | 21 | 20 | 5 | 23 | 97 |
Gene fragments used for SNP analysis and tests of molecular evolution. No., number on physical map (figure 1); length, number of base pairs of the PCR fragment; position, position of the first nucleotide of the fragment on the BAC tile path; missing, number of samples not sequenced, out of 97 in total; SNPs < 0.97, number of SNPs with a major-allele frequency smaller than 97%. The asterisk refers to a physical position approximately 3 kb outside of the BAC tile path, determined by LR-PCR. Bold letters indicate genes that are not excluded from H by recombination analyses. MK and HKA, p-values of the McDonald–Kreitman (MK) and Hudson, Kreitman and Aguadé (HKA) [34] tests with P. glaucus (left of slash) and P. polytes (right of slash). For the HKA test, the unlinked loci were used for intraspecific comparisons and all P. dardanus f. lamborni were excluded. Jukes–Cantor correction was applied to obtain the number of fixed differences between species. NA, not available. NP, not performed.
| no. | gene | length | position | missing | SNPs < 0.97 | MK ( | HKA ( | |
|---|---|---|---|---|---|---|---|---|
| 1 | 145 | — | 33 | NP/NP | NP/NP | |||
| 2 | 162 | 46 589 | 8 | 0.43/1.00 | 0.66/0.80 | |||
| 3 | 821 | 94 342 | 1 | 0.13/0.49 | ||||
| 4 | 129 | 133 078 | 7 | 0.37/0.37 | 0.46/0.88 | |||
| 5 | 513 | 192 833 | 4 | 0.34/0.17 | ||||
| 6 | 177 | 208 074 | 4 | 0.35/0.37 | 0.44/0.80 | |||
| 7 | 191 | 243 505 | 2 | 0.16/0.07 | ||||
| 8 | 335 | 286 121 | 1 | 0.52/1.00 | 0.66/0.85 | |||
| 9 | 192 | 311 476 | 0 | 0.10/1.00 | 0.68/0.67 | |||
| 10 | 140 | 319 618 | 3 | NA/0.79 | NA/0.88 | |||
| 11 | 119 | * | 1 | 0.36/0.27 | 0.37/0.48 | |||
| 12 | hypothetical protein | 202 | — | 1 | 0.70/0.33 | 0.75/0.89 | ||
| unlinked | 13 | 189 | — | 3 | 1.00/1.00 | — | ||
| 14 | 151 | — | 2 | 0.26/1.00 | — | |||
| 15 | 214 | — | 2 | 0.46/1.00 | — | |||
| 16 | 314 | — | 4 | 0.12/ | — |
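The table caption above names two calculations that can be illustrated briefly: the Jukes–Cantor correction and the McDonald–Kreitman (MK) comparison of polymorphic and fixed sites. The following is a minimal Python sketch, not the authors' pipeline. It assumes that the MK 2 × 2 table is evaluated with a two-sided Fisher's exact test (the source does not state which test statistic was used). All function names and any counts passed to them are hypothetical.

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance from the observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def mk_test(poly_syn, poly_nonsyn, fixed_syn, fixed_nonsyn):
    """MK test p-value (assumption: Fisher's exact test on the 2 x 2 table)."""
    return fisher_exact_two_sided(poly_nonsyn, poly_syn, fixed_nonsyn, fixed_syn)
```

A low p-value from `mk_test` indicates that the ratio of nonsynonymous to synonymous changes differs between polymorphisms within a species and fixed differences between species, which is the pattern the MK column of the table reports on.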
Figure 2. (a) Genetic associations of SNPs with wing pattern morphs. The significance of association for each SNP with a given colour morph was assessed separately for each morph against all morphs with a lower position in the dominance hierarchy. The horizontal axis represents 11 loci of the H region (2–12) and four unlinked genes (13–16). Locus 1 (MAD) did not contain polymorphic sites. The red horizontal line represents the significance threshold for association after Bonferroni correction for multiple testing. The grey symbols correspond to f. lamborni, which exhibits the genomic duplication that is in perfect association with the phenotype. The SNP association therefore extends over the full length of the duplicated region. The extent of the duplication is evident from the presence of three alleles at certain nucleotide positions (see (c), bottom panel). Note that the full SNP association outside of the en locus is exclusively correlated to f. lamborni and likely correlated with the duplicated copy. (b) Heat plot showing LD (r²) of SNPs within and between loci. The 11 loci linked to the BAC tile path are given on the left, the four unlinked loci on the right. Only comparisons significant after Bonferroni correction are visualized in grey-scale. In general, LD was low, with the exception of intra-locus comparisons within the Solute Carrier Family member and en, and inter-locus comparisons involving the two different exons of en. (c) Number of alleles observed in f. lamborni. Top panel: SNP association within the targeted region with f. lamborni as in panel (a). Bottom panel: the y-axis gives the total number of alleles observed in a 150-bp sliding window as inferred from 454 sequence data of LR-PCR products.
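The two quantities named in the Figure 2 caption, the LD statistic r² and the Bonferroni-corrected significance threshold, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in the study: it assumes phased haplotypes at two biallelic SNPs coded as 0/1, and the function names are hypothetical. The source does not say how r² was estimated or which nominal significance level was corrected.

```python
def ld_r_squared(hap_a, hap_b):
    """r^2 between two biallelic SNPs from phased haplotypes coded 0/1 (assumed input)."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype lists must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n  # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n  # frequency of allele 1 at SNP B
    p_ab = sum(1 for x, y in zip(hap_a, hap_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests
```

Two SNPs whose alleles always occur together give r² = 1, while independent SNPs give values near 0. In the association plot, a SNP would count as significant only if its p-value falls below `bonferroni_threshold(alpha, n_tests)` for the chosen nominal level and number of tests.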