Wei Zhang, Krushnamegh Kunte, Marcus R Kronforst.
Abstract
Hybrid speciation appears to be rare in animals, yet characterization of possible examples offers to shed light on the genomic consequences of this unique phenomenon, as well as more general processes such as the role of adaptation in speciation. Here, we first generate transcriptome assemblies for a putative hybrid butterfly species, Papilio appalachiensis, its parental species, P. glaucus and P. canadensis, and an outgroup, P. polytes. Then, we use these data to infer genome-wide patterns of introgression and genomic mosaicism using both phylogenetic and population genetic approaches. Our results reveal that there is little genetic divergence among all three of the focal species, but the subset of gene trees that strongly support a specific tree topology suggests widespread sharing of genetic variation between P. appalachiensis and both parental species, likely as a result of hybrid speciation. We also find evidence for substantial shared genetic variation between P. glaucus and P. canadensis, which may be due to gene flow or ancestral variation. Consistent with previous work, we show that P. appalachiensis is more similar to P. canadensis at Z-linked genes and more similar to P. glaucus at mitochondrial genes. We also identify a variety of targets of adaptive evolution, which appear to be enriched for traits that are likely to be important in the evolution of this butterfly system, such as pigmentation, hormone sensitivity, developmental processes, and cuticle formation. Overall, our results provide a genome-wide portrait of divergence and introgression associated with adaptation and speciation in an iconic butterfly radiation.
Keywords: Papilio; adaptation; hybrid speciation; introgression; transcriptome
Year: 2013 PMID: 23737327 PMCID: PMC3698933 DOI: 10.1093/gbe/evt090
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure. Distribution of conserved clusters among the four butterfly species. Conserved clusters were retrieved from predicted CDS data sets using Blat. A total of 3,961 clusters yielded a single sequence for each species and this set of conserved clusters was the core data set for subsequent analyses. Each species is depicted with images of female wing pattern phenotypes.
Figure. Patterson’s D-statistic suggests widespread introgression between Papilio appalachiensis and the putative parental species. We calculated a transcriptome-wide D-statistic value for each of three tree topologies (A–C) and found evidence for significant introgression in comparisons with P. appalachiensis (D). Results suggest roughly equal introgression for appalachiensis/canadensis and appalachiensis/glaucus (D1, P = 0.715), but much more introgression for both appalachiensis/canadensis and appalachiensis/glaucus than for glaucus/canadensis (D2 and D3, P < 0.01 for both).
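As a rough illustration of the statistic named in this caption, the sketch below counts ABBA and BABA site patterns in a four-taxon alignment and computes D = (nABBA − nBABA) / (nABBA + nBABA). It assumes one haploid consensus sequence per species and a fixed ordering (P1, P2, P3, outgroup); the function name `patterson_d` and the toy sequences are hypothetical, and the authors' actual pipeline and significance test are not described in this record.

```python
def patterson_d(p1, p2, p3, outgroup):
    """Return (D, n_abba, n_baba) for four aligned sequences.

    Assumed topology: (((P1, P2), P3), outgroup). The outgroup base is
    treated as the ancestral allele "A"; the other base is derived "B".
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    abba = baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        if not {a, b, c, o} <= valid:
            continue  # skip gaps and ambiguous bases
        if len({a, b, c, o}) != 2:
            continue  # keep biallelic sites only
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele

    total = abba + baba
    d = (abba - baba) / total if total else float("nan")
    return d, abba, baba


if __name__ == "__main__":
    # Toy alignment, made up for illustration only.
    d, n_abba, n_baba = patterson_d("ACTAA-", "GCAGTA", "GCTGTA", "ACAAAA")
    print(f"ABBA={n_abba} BABA={n_baba} D={d:.2f}")
```

Under these assumptions, a positive D indicates excess allele sharing between P2 and P3, a negative D indicates excess sharing between P1 and P3, and a value near zero is what incomplete lineage sorting alone would be expected to produce.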
Figure. Genome-wide distribution and clustering of genes by tree topology. We mapped conserved clusters back to the genome of Heliconius melpomene (A and C) and Bombyx mori (B and D) and compared the chromosome-level distribution of clusters with a given tree topology with the null distribution given by all mapped clusters. The results of these tests are in table 6. Tree topologies in (A) and (B) are based on nucleotide alignments, whereas those in (C) and (D) are based on peptide alignments.
Genomic Clustering of Genes Based on Inferred Tree Topology
| Tree Topology | Nucleotide Alignment | Peptide Alignment | Nucleotide Alignment | Peptide Alignment |
|---|---|---|---|---|
| ((A,C),G,P) | 0.082 | 0.001 | 0.189 | 0.011 |
| ((A,G),C,P) | 0.033 | 0.121 | 0.029 | 0.014 |
| ((C,G),A,P) | 0.005 | 0.001 | 0.149 | 0.066 |
Note.—P values reported above are based on Spearman’s rank correlation tests, comparing the chromosomal distributions of clusters with a given tree topology to the distribution of all clusters, using both H. melpomene and B. mori as a reference for chromosomal locations. Tree topologies were inferred using both nucleotide and peptide alignments.
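The comparison described in this note can be sketched as a Spearman rank correlation between two per-chromosome count vectors: clusters with one topology versus all mapped clusters. The code below is a minimal standard-library version that computes only the coefficient, using average ranks for ties. The function names and the per-chromosome counts are hypothetical. A library routine such as SciPy's `scipy.stats.spearmanr` would also report a P value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one vector is constant
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical counts per chromosome, not values from the paper.
    all_clusters = [120, 95, 140, 80, 60, 110]
    one_topology = [5, 4, 2, 3, 6, 1]
    print(round(spearman_rho(one_topology, all_clusters), 3))
```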
Transcriptome Assembly Results
| Sample | Total Transcripts | Longest Isoform | Predicted CDS | Unique Genes |
|---|---|---|---|---|
| | 102,375 | 53,198 | 36,879 | 10,179 |
| | 146,954 | 76,471 | 48,092 | 10,624 |
| | 124,664 | 57,509 | 43,843 | 10,240 |
| | 108,707 | 72,920 | 35,750 | 9,704 |
Tree Topologies of Z-linked Clusters
| Cluster ID | Topological Structure (Nucleotide Alignment) | SH P Value (Nucleotide Alignment) | Topological Structure (Peptide Alignment) | SH P Value (Peptide Alignment) | Annotation |
|---|---|---|---|---|---|
| 105 | ((A,G),C,P) | 0.94 | ((A,C),G,P) | 1 | ww domain-containing adapter protein with coiled-coil |
| 611 | ((A,C),G,P) | 0.8 | ((A,C),G,P) | 0.77 | Scabrous protein |
| 930 | ((A,G),C,P) | 0.87 | NA | — | Putative flotillin-1 |
| 1294 | ((A,C),G,P) | 0.79 | ((A,C),G,P) | 0.83 | Secernin 3 |
| 1617 | ((A,C),G,P) | 0.83 | ((A,C),G,P) | 0.96 | Catalase |
| 1660 | ((A,C),G,P) | 0.97 | NA | — | Ankyrin repeat domain-containing protein 12 |
| 2021 | ((A,C),G,P) | 0.95 | ((A,C),G,P) | 0.75 | Disulfide-isomerase a5 |
| 2055 | ((A,C),G,P) | 0.79 | ((A,C),G,P) | 0.59 | Hepatic leukemia factor |
| 3130 | ((A,C),G,P) | 0.86 | ((A,C),G,P) | 0.77 | Serine threonine-protein kinase osr1-like |
| 3347 | ((C,G),A,P) | 0.82 | ((A,C),G,P) | 0.63 | Tyrosine hydroxylase |
| 3361 | ((A,C),G,P) | 0.68 | NA | — | Y-box protein |
| 3703 | ((A,C),G,P) | 0.72 | ((A,C),G,P) | 0.9 | Tyrosine-protein kinase abl-like |
| 4566 | ((C,G),A,P) | 0.79 | ((A,C),G,P) | 0.78 | Acetyl-synthetase |
| 4569 | ((A,C),G,P) | 0.72 | ((A,C),G,P) | 1 | Dipeptidase 1-like |
| 4894 | ((A,C),G,P) | 0.87 | ((C,G),A,P) | 0.97 | Serine threonine-protein kinase osr1-like |
| 5837 | ((A,C),G,P) | 0.78 | ((A,C),G,P) | 0.83 | Protein daughter of sevenless |
| 6828 | ((A,C),G,P) | 0.81 | ((A,G),C,P) | 0.73 | Carboxypeptidase N subunit 2-like |
| 6895 | ((A,C),G,P) | 0.94 | ((A,C),G,P) | 0.88 | Kettin |
Note.—Z-linked conserved clusters were identified by comparison with predicted CDS of Z-linked genes in the Heliconius melpomene genome sequence. SH P values were calculated based on both nucleotide and peptide alignments.
NA indicates no best topology because the same highest value was assigned to more than one topological structure.
*Indicates the tree topology was also supported by the NJ method. Most of the tree structures not supported by NJ yielded an ((A,C),G,P) structure in the NJ tree.
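The NA rule in the notes above can be expressed as a small selection step: take the topology with the highest SH P value, and report NA when that highest value is shared by more than one topology. This is a sketch of that stated rule only. The function name `best_topology` and the example values are hypothetical, and the SH test itself is not implemented here.

```python
def best_topology(sh_p_values):
    """Pick the topology with the highest SH P value.

    sh_p_values maps a topology string to its SH P value. Returns
    (topology, value), or ("NA", None) when the top value is tied.
    """
    if not sh_p_values:
        raise ValueError("at least one topology is required")
    top = max(sh_p_values.values())
    winners = [topo for topo, p in sh_p_values.items() if p == top]
    if len(winners) != 1:
        return "NA", None  # no single best topology
    return winners[0], top


if __name__ == "__main__":
    # Made-up SH P values for the three candidate topologies.
    example = {"((A,C),G,P)": 0.80, "((A,G),C,P)": 0.31, "((C,G),A,P)": 0.12}
    print(best_topology(example))
```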
Tree Topologies of Mitochondrial Clusters
| Gene | Topological Structure (Nucleotide Alignment) | SH P Value (Nucleotide Alignment) | Topological Structure (Peptide Alignment) | SH P Value (Peptide Alignment) |
|---|---|---|---|---|
| | ((C,G),A,P) | 0.824 | NA | — |
| | ((A,G),C,P) | 0.744 | NA | — |
| | ((A,G),C,P) | 0.749 | ((C,G),A,P) | 0.498 |
| | ((A,G),C,P) | 0.673 | ((A,C),G,P) | 0.547 |
| | ((A,G),C,P) | 0.748 | NA | — |
| | ((C,G),A,P) | 0.711 | ((A,C),G,P) | 0.844 |
| | ((A,G),C,P) | 0.797 | ((A,G),C,P) | 0.779 |
| | ((A,G),C,P) | 0.578 | ((C,G),A,P) | 1 |
| | ((A,C),G,P) | 0.986 | ((A,G),C,P) | 1 |
| | ((C,G),A,P) | 0.866 | ((C,G),A,P) | 0.792 |
| | ((A,G),C,P) | 0.617 | ((A,G),C,P) | 1 |
| | ((A,G),C,P) | 0.763 | NA | — |
| | ((C,G),A,P) | 0.961 | ((A,G),C,P) | 0.818 |
| | ((A,G),C,P) | 0.818 | NA | — |
Note.—Mitochondrial conserved clusters were identified by comparison with predicted mitochondrial CDS or rRNA. SH P values were calculated based on both nucleotide and peptide alignments.
*Indicates the tree topology was also supported by NJ. Most of the tree structures not supported by NJ yielded an ((A,G),C,P) structure in the NJ tree.
NA indicates either no peptide alignment, because the sequence is untranslated RNA (12S and 16S rRNA), or no best topology, because the same highest value was assigned to more than one topological structure.
Number of Conserved Clusters with Well-Supported Tree Topologies
| Alignment | ((A,C),G,P) | ((A,G),C,P) | ((C,G),A,P) |
|---|---|---|---|
| Nucleotide | 71 | 58 | 50 |
| Peptide | 113 | 93 | 97 |
| Shared | 27 | 19 | 22 |
Note.—Counts were calculated based on either peptide or nucleotide alignment with the “shared” counts appearing in both groups.
Functional Enrichment of Conserved Clusters with Various Topological Structures
| Topology | GO Term | Category | Type | |
|---|---|---|---|---|
| Nucleotide alignment | | | | |
| ((A,C),G,P) | GO:0016301 | Kinase activity | F | 0.005 |
| | GO:0016772 | Transferase activity, transferring phosphorus-containing groups | F | 0.005 |
| | GO:0004672 | Protein kinase activity | F | 0.006 |
| | GO:0016773 | Phosphotransferase activity, alcohol group as acceptor | F | 0.006 |
| | GO:0016740 | Transferase activity | F | 0.016 |
| | GO:0006091 | Generation of precursor metabolites and energy | P | 0.028 |
| ((A,G),C,P) | GO:0007049 | Cell cycle | P | 0.001 |
| | GO:0006996 | Organelle organization | P | 0.001 |
| | GO:0071842 | Cellular component organization at cellular level | P | 0.001 |
| | GO:0071841 | Cellular component organization or biogenesis at cellular level | P | 0.001 |
| | GO:0051716 | Cellular response to stimulus | P | 0.002 |
| | GO:0050794 | Regulation of cellular process | P | 0.002 |
| | GO:0007165 | Signal transduction | P | 0.002 |
| | GO:0050896 | Response to stimulus | P | 0.005 |
| | GO:0009987 | Cellular process | P | 0.006 |
| | GO:0007005 | Mitochondrion organization | P | 0.008 |
| | GO:0023052 | Signaling | P | 0.009 |
| | GO:0065007 | Biological regulation | P | 0.009 |
| | GO:0006811 | Ion transport | P | 0.010 |
| | GO:0005215 | Transporter activity | F | 0.011 |
| | GO:0030234 | Enzyme regulator activity | F | 0.012 |
| | GO:0016043 | Cellular component organization | P | 0.012 |
| | GO:0071840 | Cellular component organization or biogenesis | P | 0.012 |
| | GO:0032501 | Multicellular organismal process | P | 0.020 |
| | GO:0007275 | Multicellular organismal development | P | 0.020 |
| | GO:0050789 | Regulation of biological process | P | 0.022 |
| | GO:0032502 | Developmental process | P | 0.041 |
| ((C,G),A,P) | GO:0045182 | Translation regulator activity | F | 0.027 |
| | GO:0035556 | Intracellular signal transduction | P | 0.040 |
| Peptide alignment | | | | |
| ((A,C),G,P) | GO:0005623 | Cell | C | 0.000 |
| | GO:0044464 | Cell part | C | 0.004 |
| | GO:0005622 | Intracellular | C | 0.005 |
| | GO:0007267 | Cell–cell signaling | P | 0.015 |
| | GO:0005811 | Lipid particle | C | 0.019 |
| | GO:0016209 | Antioxidant activity | F | 0.028 |
| | GO:0007154 | Cell communication | P | 0.049 |
| ((A,G),C,P) | GO:0008283 | Cell proliferation | P | 0.011 |
| | GO:0007005 | Mitochondrion organization | P | 0.022 |
| | GO:0004518 | Nuclease activity | F | 0.023 |
| | GO:0030528 | Transcription regulator activity | F | 0.027 |
| | GO:0016032 | Viral reproduction | P | 0.043 |
| | GO:0016788 | Hydrolase activity, acting on ester bonds | F | 0.046 |
| ((C,G),A,P) | GO:0007005 | Mitochondrion organization | P | 0.024 |
F, P, and C stand for molecular function, biological process, and cellular component, respectively.
Functional Enrichment of Conserved Clusters under Positive Selection between Ingroup and Outgroup
| GO Term | Category | Type | |
|---|---|---|---|
| GO:0006306 | DNA methylation | P | 0.004 |
| GO:0006305 | DNA alkylation | P | 0.004 |
| GO:0006304 | DNA modification | P | 0.004 |
| GO:0051238 | Sequestering of metal ion | P | 0.004 |
| GO:0045448 | Mitotic cell cycle, embryonic | P | 0.005 |
| GO:0043169 | Cation binding | F | 0.008 |
| GO:0043167 | Ion binding | F | 0.008 |
| GO:0071383 | Cellular response to steroid hormone stimulus | P | 0.008 |
| GO:0030003 | Cellular cation homeostasis | P | 0.012 |
| GO:0030684 | Preribosome | C | 0.012 |
| GO:0046915 | Transition metal ion transmembrane transporter activity | F | 0.012 |
| GO:0070851 | Growth factor receptor binding | F | 0.012 |
| GO:0035186 | Syncytial blastoderm mitotic cell cycle | P | 0.012 |
| GO:0008173 | RNA methyltransferase activity | F | 0.012 |
| GO:0004887 | Thyroid hormone receptor activity | F | 0.012 |
| GO:0007394 | Dorsal closure, elongation of leading edge cells | P | 0.012 |
| GO:0046914 | Transition metal ion binding | F | 0.013 |
| GO:0046872 | Metal ion binding | F | 0.013 |
| GO:0055080 | Cation homeostasis | P | 0.017 |
| GO:0008270 | Zinc ion binding | F | 0.020 |
| GO:0007050 | Cell cycle arrest | P | 0.021 |
| GO:0032870 | Cellular response to hormone stimulus | P | 0.022 |
| GO:0006726 | Eye pigment biosynthetic process | P | 0.024 |
| GO:0031163 | Metallo-sulfur cluster assembly | P | 0.024 |
| GO:0033301 | Cell cycle comprising mitosis without cytokinesis | P | 0.024 |
| GO:0031099 | Regeneration | P | 0.024 |
| GO:0016226 | Iron–sulfur cluster assembly | P | 0.024 |
| GO:0000794 | Condensed nuclear chromosome | C | 0.024 |
| GO:0072503 | Cellular divalent inorganic cation homeostasis | P | 0.024 |
| GO:0007392 | Initiation of dorsal closure | P | 0.024 |
| GO:0071495 | Cellular response to endogenous stimulus | P | 0.027 |
| GO:0043324 | Pigment metabolic process involved in developmental pigmentation | P | 0.038 |
| GO:0006497 | Protein lipidation | P | 0.038 |
| GO:0042441 | Eye pigment metabolic process | P | 0.038 |
| GO:0042158 | Lipoprotein biosynthetic process | P | 0.038 |
| GO:0042157 | Lipoprotein metabolic process | P | 0.038 |
| GO:0009826 | Unidimensional cell growth | P | 0.038 |
| GO:0072507 | Divalent inorganic cation homeostasis | P | 0.038 |
| GO:0043474 | Pigment metabolic process involved in pigmentation | P | 0.038 |
| GO:0071156 | Regulation of cell cycle arrest | P | 0.038 |
| GO:0003707 | Steroid hormone receptor activity | F | 0.039 |
| GO:0005615 | Extracellular space | C | 0.039 |
| GO:0004879 | Ligand-activated sequence-specific DNA binding RNA polymerase II transcription factor activity | F | 0.039 |
| GO:0043401 | Steroid hormone–mediated signaling pathway | P | 0.039 |
| GO:0048545 | Response to steroid hormone stimulus | P | 0.045 |
| GO:0009755 | Hormone-mediated signaling pathway | P | 0.049 |
| GO:0048066 | Developmental pigmentation | P | 0.049 |
Note.—GO terms enrichment of conserved clusters with Ka/Ks ratios above one in all three ingroup versus outgroup comparisons.
F, P, and C stand for molecular function, biological process, and cellular component, respectively.
Annotation of Clusters under Positive Selection among Ingroup Taxa
| Ka/Ks Ratio | Cluster ID | Annotation |
|---|---|---|
| A vs. C >1, A vs. G >1, C vs. G <1 | 869 | Histone h1-like |
| | 1537 | Splicing factor arginine serine-rich 6 |
| | 4761 | Cuticle protein BmorCPR83 (BmEdg84A) |
| | 5014 | Polo |
| | 6025 | Spinophilin-like |
| | 6201 | Uncharacterized protein KIAA1841-like |
| A vs. C >1, C vs. G >1, A vs. G <1 | 621 | Zinc finger protein on ecdysone puffs |
| | 1165 | Vesicle associated |
| | 1475 | Serine protease 14 |
| | 3028 | Shaker-like potassium channel |
| | 3111 | Nuclear hormone receptor |
| | 4179 | Kinase d-interacting substrate of 220 kDa-like |
| | 5888 | NA |
| | 6892 | Hypothetical protein KGM_07109 [ |
| A vs. G >1, C vs. G >1, A vs. C <1 | 153 | Pab-dependent poly-specific ribonuclease subunit 3-like |
| | 726 | Tata-binding protein-associated phosphoprotein |
| | 2649 | NA |
| | 2702 | Ecdysone-induced protein 78c |
| | 3483 | 40s ribosomal protein s3a |
| | 3565 | Pdz and lim domain protein 3 |
| | 4364 | Follistatin |
| | 4564 | Encore protein |
| | 6586 | Tyrosine-protein kinase fps85d-like isoform 1 |
| A vs. C >1, A vs. G <1, C vs. G <1 | 24 | Putative rRNA processing protein RRP7 |
| | 114 | Hypothetical protein KGM_04049 [ |
| | 497 | Cuticular protein 76bd |
| | 754 | Hexokinase |
| | 1392 | Inositol-trisphosphate 3-kinase a-like |
| | 1836 | g-protein coupled receptor mth2-like |
| | 2064 | Cuticle protein BmorCPR141 |
| | 4222 | Ankyrin repeat domain-containing protein 57 |
| | 4286 | Hypothetical protein KGM_21585 [ |
| | 4640 | Rho guanine nucleotide exchange factor 7-like isoform 1 |
| | 4973 | Adipocyte plasma membrane-associated protein |
| | 5095 | Unknown secreted protein [ |
| | 5340 | Katanin p80 wd40-containing subunit b1 |
| | 6084 | NEDD4-binding protein 2-like |
| A vs. G >1, A vs. C <1, C vs. G <1 | 22 | Elongation factor 1 delta |
| | 36 | Chondroitin 4-sulfotransferase |
| | 854 | Atp-binding cassette sub-family g member 1-like |
| | 926 | Naked cuticle-like protein |
| | 1213 | Serine proteinase-like protein 1 |
| | 2044 | xpg-like endonuclease |
| | 2069 | RNA helicase-like protein |
| | 2478 | upf0712 protein c7orf64-like |
| | 2542 | DNA topoisomerase 3-beta-1 |
| | 2668 | Mosc domain-containing protein mitochondrial-like |
| | 5154 | Lim domain-binding protein 3 |
| | 5774 | Hypothetical protein KGM_14584 [ |
| | 5993 | DNA repair protein xp-c rad4 |
| | 6095 | Tyrosine transporter |
| | 6781 | Speckle-type poz protein |
| C vs. G >1, A vs. C <1, A vs. G <1 | 733 | Serine protease |
| | 1247 | Protein sda1 homolog |
| | 1561 | Unc-isoform a |
| | 2898 | Down syndrome cell adhesion molecule isoform d |
| | 2956 | Protein lethal denticleless-like |
| | 3055 | 12 cysteine protein 1 |
| | 3313 | Nuclear protein localization protein 4 homolog |
| | 4064 | tRNA dimethylallyltransferase mitochondrial-like |
| | 5197 | Tryptophanyl-tRNA synthetase mitochondrial-like |
| | 6801 | Acyl-CoA oxidase |
Note.—Three pairwise comparisons were made among P. glaucus, P. canadensis, and P. appalachiensis and clusters with one or two ratios >1 were selected. Highlighted cluster IDs also exhibited evidence of positive selection in comparisons between ingroup and outgroup taxa (table 5).
NA indicates no BLASTX hit against NCBI’s nr protein database.
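The grouping used in this table can be sketched as a simple filter on three pairwise Ka/Ks ratios: a cluster is kept when one or two of the ratios exceed one, and it is labeled by which comparisons those are. The function name `selection_pattern` and the example ratios are hypothetical, and how the ratios themselves were estimated is not covered here.

```python
PAIRS = ("A vs. C", "A vs. G", "C vs. G")


def selection_pattern(ratios):
    """Label a cluster by which pairwise Ka/Ks ratios exceed one.

    ratios maps each label in PAIRS to a Ka/Ks value. Returns a label such
    as "A vs. C >1, A vs. G >1, C vs. G <1", or None when zero or all three
    ratios exceed one (such clusters are not selected for this table).
    """
    above = [pair for pair in PAIRS if ratios[pair] > 1]
    # Assumption: a ratio of exactly 1 is grouped with the "<1" comparisons.
    below = [pair for pair in PAIRS if ratios[pair] <= 1]
    if len(above) not in (1, 2):
        return None
    parts = [f"{pair} >1" for pair in above] + [f"{pair} <1" for pair in below]
    return ", ".join(parts)


if __name__ == "__main__":
    # Made-up ratios for a single cluster.
    print(selection_pattern({"A vs. C": 1.4, "A vs. G": 1.2, "C vs. G": 0.3}))
```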