James B Pease1, Robert J Driver2, David A de la Cerda1,3, Lainy B Day4, Willow R Lindsay4,5, Barney A Schlinger6,7,8, Eric R Schuppe1, Christopher N Balakrishnan2, Matthew J Fuxjager9. 1. Department of Biology, Wake Forest University, Winston-Salem, NC 27109. 2. Department of Biology, East Carolina University, Greenville, NC 27858. 3. Department of Molecular Genetics and Genomics, Wake Forest University School of Medicine, Winston-Salem, NC 27106. 4. Department of Biology, University of Mississippi, Oxford, MS 38677. 5. Department of Biological and Environmental Sciences, University of Gothenburg, SE 405 30 Gothenburg, Sweden. 6. Department of Integrative Biology and Physiology, University of California, Los Angeles, CA 90095. 7. Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095. 8. Smithsonian Tropical Research Institute, Panama City 0843-03092, Panama. 9. Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, RI 02912.
Abstract
Identifying the molecular process of complex trait evolution is a core goal of biology. However, pinpointing the specific context and timing of trait-associated changes within the molecular evolutionary history of an organism remains an elusive goal. We study this topic by exploring the molecular basis of elaborate courtship evolution, which represents an extraordinary example of trait innovation. Within the behaviorally diverse radiation of Central and South American manakin birds, species from two separate lineages beat their wings together using specialized “superfast” muscles to generate a “snap” that helps attract mates. Here, we develop an empirical approach to analyze phylogenetic lineage-specific shifts in gene expression in the key snap-performing muscle and then integrate these findings with comparative transcriptomic sequence analysis. We find that rapid wing displays are associated with changes to a wide range of molecular processes that underlie extreme muscle performance, including changes to calcium trafficking, myocyte homeostasis and metabolism, and hormone action. We furthermore show that these changes occur gradually in a layered manner across the species history, wherein which ancestral genetic changes to many of these molecular systems are built upon by later species-specific shifts that likely finalized the process of display performance adaptation. Our study demonstrates the potential for combining phylogenetic modeling of tissue-specific gene expression shifts with phylogenetic analysis of lineage-specific sequence changes to reveal holistic evolutionary histories of complex traits.
Identifying the molecular process of complex trait evolution is a core goal of biology. However, pinpointing the specific context and timing of trait-associated changes within the molecular evolutionary history of an organism remains an elusive goal. We study this topic by exploring the molecular basis of elaborate courtship evolution, which represents an extraordinary example of trait innovation. Within the behaviorally diverse radiation of Central and South American manakin birds, species from two separate lineages beat their wings together using specialized “superfast” muscles to generate a “snap” that helps attract mates. Here, we develop an empirical approach to analyze phylogenetic lineage-specific shifts in gene expression in the key snap-performing muscle and then integrate these findings with comparative transcriptomic sequence analysis. We find that rapid wing displays are associated with changes to a wide range of molecular processes that underlie extreme muscle performance, including changes to calcium trafficking, myocyte homeostasis and metabolism, and hormone action. We furthermore show that these changes occur gradually in a layered manner across the species history, wherein which ancestral genetic changes to many of these molecular systems are built upon by later species-specific shifts that likely finalized the process of display performance adaptation. Our study demonstrates the potential for combining phylogenetic modeling of tissue-specific gene expression shifts with phylogenetic analysis of lineage-specific sequence changes to reveal holistic evolutionary histories of complex traits.
Elaborate phenotypes can appear rapidly within a lineage, and sometimes repeatedly or without apparent trait precursors (1–5). Many studies of trait adaptation have focused on associations with mutations in genomic sequences, but evolution of genotype–phenotype relationships are also profoundly modulated by gene expression levels, developmental trajectories, and regulatory networks (6). Complex traits may also require a stepwise molecular evolutionary process, in which ancestral changes constrain or facilitate opportunities for subsequent diversification of the phenotype (7). However, few studies have specifically mapped the evolutionary timing, contingent order, and molecular systems involved in the appearance of a particular complex trait, an especially challenging task within the broader phylogenetic context of a lineage’s evolution.Courtship displays are some of the most elaborate traits in the animal world, incorporating constellations of bizarre and beautiful signals that stimulate a range of senses. Manakin birds (Pipridae), for example, evolved under intense sexual selection in the tropical forests of Central and South America and exemplify the extraordinary nature of courtship behaviors (8) (Fig. 1 and ). The most striking component of manakin courtship involves rapid forelimb gestures performed at fantastic speeds that would otherwise appear to exceed the limits of striated muscles in other volant birds (Fig. 1 ). Male golden-collared manakins (Manacus vitellinus) hammer their wings together above their backs at rates up to 70 snaps per second (9) (Fig. 1 ), resulting in a loud, buzzing “roll-snap” that penetrates the rainforest understory to help attract potential mates and fight off rivals. Similarly, male scarlet-crowned and red-capped manakins (Ceratopipra cornuta and Ceratopipra mentalis) court females by slapping their wings against the side of their bodies as rapidly as 50 hits per second (10), generating an unusual noise that sounds like loud hand-claps. Both of these “snap behaviors” (the roll-snap and clap) involve rapid wing movements that occur at frequencies nearly two to three times the maximum wingbeat frequency used by similarly sized birds to power flight (11, 12) (Fig. 1).
Fig. 1.
Comparison of SH-PEC muscle tissue differential gene expression in manakin species. (A) The three species with rapid wing-snaps (orange starbursts; C. cornuta [Cc], C. mentalis [Cm], M. vitellinus [Mv]) and their two closest relatives in the M5 clade (L. coronata [Lc] and P. pipra [Pp]) have more differentially expressed genes between the SH and PEC muscles compared to more distant relatives (X. atronitens [Xa] and M. oleaginous [Mo]). Points increase in size with distance from the origin for emphasis of highly differentially expressed genes. The M5 clade contains the vast majority of differentially expressed genes at both P < 10−5 (dashed line) and P < 0.01 (pie charts). Illustrations by J.B.P. (B) The maximum wingbeat frequencies for the SH-based displays (orange) in M. vitellinus and C. mentalis are up to twice as fast as the maximum PEC wingbeat (gray) in M. vitellinus and other birds of similar size (body weight ranges given in parentheses; see for sources). (C) Time series of still frames from a high-speed video of M. vitellinus showing rapid wing motion behavior with orange bursts added to highlight wing snaps ∼18 ms apart. Original video by L. Fusani and B. A. Schlinger (9).
Comparison of SH-PEC muscle tissue differential gene expression in manakin species. (A) The three species with rapid wing-snaps (orange starbursts; C. cornuta [Cc], C. mentalis [Cm], M. vitellinus [Mv]) and their two closest relatives in the M5 clade (L. coronata [Lc] and P. pipra [Pp]) have more differentially expressed genes between the SH and PEC muscles compared to more distant relatives (X. atronitens [Xa] and M. oleaginous [Mo]). Points increase in size with distance from the origin for emphasis of highly differentially expressed genes. The M5 clade contains the vast majority of differentially expressed genes at both P < 10−5 (dashed line) and P < 0.01 (pie charts). Illustrations by J.B.P. (B) The maximum wingbeat frequencies for the SH-based displays (orange) in M. vitellinus and C. mentalis are up to twice as fast as the maximum PEC wingbeat (gray) in M. vitellinus and other birds of similar size (body weight ranges given in parentheses; see for sources). (C) Time series of still frames from a high-speed video of M. vitellinus showing rapid wing motion behavior with orange bursts added to highlight wing snaps ∼18 ms apart. Original video by L. Fusani and B. A. Schlinger (9).Physiologists attribute these rapid wing display performances to the scapulohumeralis caudalis (SH) muscle, which has evolved “superfast” twitch kinetics to adduct the wing and retract the forelimb to the body (13) (Fig. 1 ). Indeed, in situ twitch speed recordings show that Manacus and Ceratopipra SH muscles generate remarkably fast contraction–relaxation cycling speeds (twitch speeds), which are sufficient to drive positive work that powers oscillatory wing movement at frequencies approaching 80 Hz (14). Extreme SH performance specializations are not apparent in other manakin species that do not perform displays involving rapid wing snaps, even if they still produce acrobatic courtship routines (14–16). These specializations are also absent in other muscles, like the pectoralis (PEC) and supracoracoideus muscles that lower and lift the wing during flight, respectively, and therefore set the limits of wingbeat frequency during powered locomotion.The scattered appearance of these extreme behaviors in separate lineages prompts multiple questions about the identity and evolution of their underlying molecular genetic systems. Previous work implicates fast calcium processing as a necessary component of such performance abilities (17, 18), with neuromuscular transmission, passive elasticity, tissue stress and repair, and energetic mobilization also expected to play substantial roles (19, 20). Other studies point to different hormonal regulatory systems as modulators of these effects, including the androgenic system and insulin signaling system (21–24). Here, we report results of whole-transcriptome expression-profile analyses from the SH and PEC muscles collected in situ from males in three snap-performing manakin species, three nonsnapping manakin species, and one closely related tyrant flycatcher outgroup (Tyrannidae). We analyze these expression profiles using both traditional differential gene expression analyses and a platform we developed called the Phylogenetic Differential Gene Expression Tool (PhyDGET). This framework builds on foundational work in analysis of gene expression evolution (25–27) and empirical studies in mammals and birds (28–30) to test for gene expression shifts in individual genes on individual phylogenetic branches. Using this phylotranscriptomic expression approach, we identify specific functional genes that support SH-mediated rapid wing movements in manakins related to calcium trafficking, tissue homeostasis, and hormone signaling. We show also that genes with trait-associated allele patterns and genes with trait-associated expression patterns are separate and complementary sets of identified functional genes. More broadly, we find evidence for emergence of a complex phenotype via temporal layers of historical clade-wide adaptations and lineage-specific refinements.
Results and Discussion
Phylogenomics of Manakin Species Indicates Close Genetic Relationships.
We sequenced whole-mRNA transcriptomes from the SH and PEC muscles from six manakin species and one flycatcher species outgroup (Fig. 1 and ). We uniquely mapped 85.4 to 91.8% of nonmitochondrial short-read pairs to the wire-tailed manakin genome (Pipra filicauda), covering an average of 60 Mb of the P. filicauda reference transcriptome at ≥3× depth (). A concatenated species tree from 29.1 Mb (≥10× coverage) for all seven species agrees with earlier consensus phylogenies (31, 32) (Fig. 1). Analysis of gene trees was consistent with a model of generally strong agreement between gene trees and the consensus species tree, and no evidence of introgressive hybridization. What phylogenomic diversity is present appears to be largely the result of low sequence divergence (0.46 to 1.25% among manakins, 3.21 to 3.31% to the outgroup), leading to many noninformative gene trees (). We concluded that the species tree is a robust estimate to use for analysis of individual gene expression shifts. For specificity and avoidance of confusion with other taxonomic names, we will refer to C. cornuta, C. mentalis, and M. vitellinus as the “snap-performing species” and clades of the six, five, and four manakin species most closely related to the snap-performing species as the M6, M5, and M4 clades, respectively (Fig. 1).
Muscle Differentiation Began before the Appearance of Snap Performance.
We quantified SH and PEC muscle expression levels for three individuals in each of our seven species, measuring 17,194 reference genes with ≥3 quantified read pairs in ≥3 individuals (77.7% of total reference genes). All 42 samples showed quantifiable gene expression in 12,850 genes (58.1%). Expression profiles were highly correlated between the two muscles and seven species, indicating a consistent baseline expression profile against which we can contrast expression shifts in individual genes (). We tested for differential expression between the SH and PEC muscles for each species separately. Given the physiological findings that the SH muscle functionally differs in speed from the PEC and supracoracoideus in two snap-performing species (C. mentalis and M. vitellinus) (12), we expected more SH-PEC expression differentiation in snapping species than nonsnapping species. Contrary to this expectation, all five M5 clade species had elevated numbers of SH-PEC differentially expressed genes compared to more distantly related Mionectes oleagineus and Xenopipo atronitens (Fig. 1 and ). We conclude that SH-PEC differentiation likely occurred ancestrally and affects all species in the M5 clade and not only snap-performing species. SH-PEC differentiation in the M5 clade is also detectable at various significance cutoffs, likely ruling out effects solely driven by gene outliers. Gene ontology (GO) overrepresentation tests find more differentially expressed genes among the three snap-performing species than in other species for muscle speed-related terms, such as “muscle filament sliding” and “actin filament-based movement” ().We next analyzed the similarity among the 21 individuals’ SH and PEC expression profiles and found similarity in SH profiles based on snap behavior rather than strictly phylogenetic relatedness. Principal component (PC) analysis of gene expression levels in each tissue showed 94.2% of SH and 96.3% of PEC expression profile variance was explained by the first two PCs (Fig. 2 ), and showed separation of the snap-performing species (C. cornuta, C. mentalis, and M. vitellinus) from the other species by PC1 and PC2 in the SH muscle but not in the PEC. Cluster analysis corroborates the distinctiveness of snap-performing species in SH expression profiles (Fig. 2). Notably, Pseudopipra pipra is closely related to the snap-performing species but is not known to have snap display behavior. P. pipra SH expression is more similar to X. atronitens and M. oleagineus, distantly related species with relatively simple displays (15, 33). Interestingly, Lepidothrix coronata does not snap but still performs an unusual aerial display incorporating acrobatic flight maneuvers, and also has the most similar expression profiles to the snap-performing species in both muscles (34). These differential expression patterns indicate that the enhancements observed physiologically in SH performance are also apparent at the molecular level by SH-specific similarities in expression profiles among snap-performing species.
Fig. 2.
Species with rapid wing movements cluster in SH, but not PEC expression profiles. (A and B) PC analysis of expression levels across all genes in the SH and PEC tissues shows that individuals from snap-performing species (stars) and nonsnap species (circles) form a cluster in SH but not PEC. Dashed circles show clusters. (C) SH (orange, Left) and PEC (Right, blue) expression levels for genes for each of the 21 individuals. A clustering topology is shown for each tissue based on their cpm expression values with k groups shown (dotted boxes). Abbreviations: Cc, Ceratopipra cornuta; Cm, Ceratopipra mentalis; Lc, Lepidothrix coronata; Mo, Mionectes oleaginous; Mv, Manacus vitellinus; Pp, Pseudopipra pipra; Xa, Xenopipo atronitens.
Species with rapid wing movements cluster in SH, but not PEC expression profiles. (A and B) PC analysis of expression levels across all genes in the SH and PEC tissues shows that individuals from snap-performing species (stars) and nonsnap species (circles) form a cluster in SH but not PEC. Dashed circles show clusters. (C) SH (orange, Left) and PEC (Right, blue) expression levels for genes for each of the 21 individuals. A clustering topology is shown for each tissue based on their cpm expression values with k groups shown (dotted boxes). Abbreviations: Cc, Ceratopipra cornuta; Cm, Ceratopipra mentalis; Lc, Lepidothrix coronata; Mo, Mionectes oleaginous; Mv, Manacus vitellinus; Pp, Pseudopipra pipra; Xa, Xenopipo atronitens.
Phylogenetic Tests Point to Temporally Layered Gene Expression Evolution.
Identification of genes whose expression patterns are congruent with the appearance of snap displays using SH tissues requires a single-tissue comparison of gene expression levels across species that accounts for phylogenetic nonindependence and can test specific models of expression change on different parts of the phylogeny. Therefore, we developed an analysis called PhyDGET, which blends differential gene expression data transformation and phylogenetic comparative methods to identify changes in gene expression levels on distinct branches in an evolutionary history. PhyDGET treats each gene’s expression values as a quantitative trait, transforming RNA-sequencing (RNA-seq) counts to a standardized expression level for each individual tissue sample. PhyDGET then tests the fit of each gene’s cross-species expression levels to one or more phylogenetic models of branch-specific shifts in expression, compared to a null Brownian motion model of stochastic change (Fig. 3). Using this approach, we can infer both significant improvements in model fit (Bayes Factor, BF) and the best-fit phylogenetic model for each gene’s expression values. Based on the best-fitting model, we can also infer on which branch (or branches) a significant change in expression has occurred for each gene in each tissue. When PhyDGET identifies a gene’s cross-species expression profile as a strong fit with a particular phylogenetic model, we refer to it as a phylogenetically differentially expressed gene (phyloDEG).
Fig. 3.
Layered evolution of gene expression in manakins. (A) PhyDGET detects genes with branch-specific shifts in expression level by comparing alternative models where target branches can have a faster rate of change in expression level against a null model where expression levels change under a single rate category for all branches. Many genes show significantly improved alternative model fit (BF ≥ 1.5) for expression levels in the SH muscle (blue), PEC (green), or both (magenta). (B) Seven focal alternative models are shown with target branches highlighted, underneath which are example gene expression profiles showing cpm reads in SH and PEC. The mean value (horizontal line), ± SE (shaded box), and individual data points (open circles), and BF values (corner numbers) are shown for each species. Below those, a plot shows PEC and SH BF scores for each gene (with the example gene circled), and gene counts for each quadrant in the corners. (C) Several genes for each model show strong shifts in expression, high peak expression levels, and function related to enhanced muscle contraction–relaxation. Signs before the gene names indicate increase (+) or decrease (−) in expression for the target branches. Square brackets indicate BF was between 0.3 and the 1.5 threshold, but the gene has a known role in muscle contraction. An asterix (*) indicates genes with substantially stronger BF in the SH over PEC. Subscript L (L) indicates genes not reference-annotated as a consensus vertebrate gene symbol but are a “-like” nearest-match paralog.
Layered evolution of gene expression in manakins. (A) PhyDGET detects genes with branch-specific shifts in expression level by comparing alternative models where target branches can have a faster rate of change in expression level against a null model where expression levels change under a single rate category for all branches. Many genes show significantly improved alternative model fit (BF ≥ 1.5) for expression levels in the SH muscle (blue), PEC (green), or both (magenta). (B) Seven focal alternative models are shown with target branches highlighted, underneath which are example gene expression profiles showing cpm reads in SH and PEC. The mean value (horizontal line), ± SE (shaded box), and individual data points (open circles), and BF values (corner numbers) are shown for each species. Below those, a plot shows PEC and SH BF scores for each gene (with the example gene circled), and gene counts for each quadrant in the corners. (C) Several genes for each model show strong shifts in expression, high peak expression levels, and function related to enhanced muscle contraction–relaxation. Signs before the gene names indicate increase (+) or decrease (−) in expression for the target branches. Square brackets indicate BF was between 0.3 and the 1.5 threshold, but the gene has a known role in muscle contraction. An asterix (*) indicates genes with substantially stronger BF in the SH over PEC. Subscript L (L) indicates genes not reference-annotated as a consensus vertebrate gene symbol but are a “-like” nearest-match paralog.Our primary results focus on the evaluation of expression levels for 17,416 genes under seven alternative models of expression shift (Fig. 3 and ). The “parallel” and “reversion” models test genes that are changing specifically in the snap-performing species, either as the results of parallel changes in the Manacus and Ceratopipra branches or an expression shift in the M4 ancestor followed by a reverse shift on the Pseudopipra branch. The M. vitellinus and Ceratopipra ancestor branches test the two lineages individually for specific changes in expression. Finally, the M6, M5, and M4 models test for changes on the ancestral branches of these clades. Across all seven models, PhyDGET identified 21 to 178 phyloDEGs in SH and 24 to 160 phyloDEGs in PEC with expression shifts that significantly improved fit in the alternative model compared to the null stochastic model (BF > 1.5) (). GO term overrepresentation tests were nonsignificant but included multiple genes under “response to endoplasmic reticulum stress” and “calcium ion transport from cytosol to endoplasmic reticulum,” consistent with later gene specific findings (). Numerically, most phyloDEGs are SH-specific and PEC-specific, but a statistical excess of overlap of genes changing in both muscles is observed in each alternative model (magenta points in Fig. 3) (all P < 10−50, χ2 test-of-independence, df = 1). This overlap matches a biological model, where we expect many changes to transcriptional regulation that affect both skeletal muscles, alongside SH-specific expression changes that are putatively associated with snap-displays.To validate and contextualize our findings from PhyDGET, we compared our sets of phyloDEGs to differentially expressed genes detected by pairwise approximations of each PhyDGET model (). The comparison of snap-performing species versus other species in a grouped pairwise differential expression test showed fewer overlap with phyloDEGs from the multibranch parallel or reversion models compared with more overlap for models concerning single phylogenetic branches. We conclude that PhyDGET appears particularly useful in cases where a “control versus treatment” pairwise comparison cannot encompass a more gradual trend over the phylogeny or an up-and-down dynamic in the expression levels. We also found each PhyDGET alternative model yielded distinct gene sets, except in the cases where the focal branches overlap (). The parallel and reversion models are equally parsimonious models for the same nonmonophyletic snap-associated expression pattern, but we noted that different genes were highlighted under these two model tests in a manner that appears sensitive to expression levels in the other M5 clade relatives (L. coronata and P. pipra).We conclude that PhyDGET effectively identified tissue-specific gene expression shifts and matched phylogenetic patterns less directly addressable with group-level pairwise differential expression approaches or post hoc interpretation of multiple differential gene expression tests in separate species. We pinpointed these expression shifts on both terminal and ancestral branch models throughout the manakin evolutionary history. The discovery of diverse, but functionally connected, shifts in expression throughout the manakin tree indicate that phenotypic innovations in courtship snap behavior arose through a historical accumulation of changes to gene expression profiles, some of which are shared with species that do not express the snap trait of interest. Based on these broad-scale results from PhyDGET, we further investigated the functional roles of SH muscle phyloDEGs associated with snap-performing species.
SH-Specific Expression Shifts Are Associated with Muscle Performance Genes.
To address the possible functional roles in SH-mediated rapid wing movement of the phyloDEGs identified, we focused on 64 phyloDEGs across our seven focal models that showed moderate expression levels (≥20 maximum counts-per-million reads, cpm), where detection is more robust and had an annotated gene function in vertebrates (Fig. 3 and ). Of these 64 phyloDEGs, 17 had SH-specific changes in expression, while PEC had either lower magnitude of change or different multispecies patterns of expression. In looking at changes associated with snap-performing species, we identified 19 phyloDEGs (6 SH-specific) that fit either the parallel or reversion models.We found that SH snaps likely evolved through a process in which layered, sequential changes to the expression patterns of genes associated with rapid myocytic calcium flux accrued slowly over the manakin radiation. Genes fitting the snap-specific parallel and reversion PhyDGET models include increased expression of two paralogs of ryanodine receptor 1 (RYR1-L1 and RYR-L2), triadin (TRDN), and sarcalumenin (SRL). These results indicate either parallel changes in gene expression in the two snap-performing lineages or a loss of abundant expression in the close relative without snap displays (Pseudopipra). On the M6, M5, or M4 ancestral branches, six SH-specific phyloDEGs were detected (Fig. 3), including increased expression of calmodulin 2 (CALM2) on M5 and two sarco/endoplasmic reticulum Ca(2+)-ATPase (SERCA) genes (ATP2A3 and ATP2A1-L) on M4. Finally, the Manacus and Ceratopipra branches were associated with 10 and 5 phyloDEGs, respectively, including elevated expression of calcium regulator clarin 1 (CLRN1) in Manacus and two paralogs of parvalbumin (PVALB-L1, PVALB-L2) in Ceratopipra. Changing the expression of these genes in isolation is known to be insufficient to produce superfast twitch speeds (35). Instead, trait modulation requires coordinated changes in calcium ion buffering (CALM2, SRL, PVALB), reuptake to the SR (SERCA), and release into the myoplasm (RYR1, TRDN). Therefore, our data show that factors involved in rapid contraction–relaxation cycling kinetics did not likely emerge at any one particular branch on the manakin phylogeny, but rather accumulated through the course of the history of these species.Beyond calcium flux, we also found snap display is associated with SH-specific shifts in gene expression related to a host of other functions that support muscle fiber passive elasticity, neuromuscular integrity, and energy mobilization. Our parallel and reversion models, for example, uncovered phyloDEGs for antioxidant activity (β-carotene oxygenase 2, BCO2) and muscle contractile filaments titin (TTN) and myosin (MYH3-like). We also found changes in genes associated with regeneration of blood vessels (NINJ2) in Ceratopipra, and antioxidant activity (PON2, SH-specific) and neuromuscular junctions (ANK2, STAU2) in Manacus. On the ancestral M6, M5, and M4 branches, we detected phyloDEGs with functions related to muscle growth and repair (CIAPIN1, FOXF1). In particular, we note detection of early shifts in metabolic function genes regulating glycolysis and oxidative phosphorylation (PFKP, PDK2) in the manakin ancestor (M6). This suggests that the earliest birds in this family may have evolved muscles specialized for anaerobic respiration, potentially facilitating subsequent generations of manakin taxa to evolve “athletic-like” displays that are contingent on bursts of energy. Finally, we note the presence of multiple phyloDEGs related to cell–cell junctions and the extracellular matrix, which offer intriguing candidate loci for promoting coordinated muscle action, but whose specific functional roles in the cell are less clear.Another functional set of phyloDEGs found throughout the phylogeny are related to ER and SR recovery and autophagy in muscles. ER/SR membranes are not only crucial for calcium transport in myocytes, but also serve as a key site of protein degradation during high stress (36). Several subunits of the 26S proteasome (PSMA6, PSMC7, PSMC6) and associated signaling factors (UBE3B, UBXN1, UFL1) were found as phyloDEGs changing expression on or before the M4 ancestral branch. The 26S proteasome uses a ubiquitination signaling system to recycle the vast majority of cell proteins, and therefore has a strong role in damage response (37). This layered accumulation of expression changes related to a host of homeostatic processes (neuromuscular connectivity, anaerobic energy mobilization, tissue repair, and intracellular recovery) suggests that, while the SH muscles primarily depend on calcium for their rapid action, the muscle tissue may not have adapted this level of speed and stress without preexisting enhancements to their homeostatic tissue repair and recovery pathways.Finally, we report expression increases in male hormonal systems associated with snap display. Androgen receptor (AR) was previously identified as an enhancer of muscle twitch speed in the Manacus SH (38). We found elevated AR expression in the SH muscle for snap-performing species, likely as a modulator of twitch speed (Fig. 3). Additional phyloDEGs are associated with the turnover of AR itself (lysine demethylases KDM4C-like and KDM4B on the M6 and M4 branches, respectively) (39) and the enhancement of gene regulation by AR (HIPK3 in Ceratopipra). We note that upstream regulators of AR’s transcription either precede or are concurrent with AR expression increase, while actors modulating the regulatory activity of AR protein occur later.
Genes with Past Expression Shifts Are Not under Past Positive Selection for Sequence.
We next examined our transcriptomes for coding sequence substitutions associated with snap-performing behavior and the broader relationship between genes under expression and sequence changes. We conducted branch tests of positive selection using PAML on 8,809 gene sequence alignments for the M5 and M4 ancestral branches, and the Manacus and Ceratopipra branches (Fig. 4 and ). The number of testable genes is reduced compared to the expression analysis because sequence is unavailable from species with no expression (a 100-bp protein-coding sequence alignment across all seven species was required). The highlighted phyloDEGs (Fig. 3) were nearly all analyzable for both sequence and expression, and so our overall functional conclusions here were not substantially impacted by missing sequence for the candidate genes.
Fig. 4.
Gene sequence changes and SH expression changes are not positively correlated. (A) Comparison of branch-specific positive selection on gene sequences (dN and dN/dS from PAML) and branch-specific shift in gene expression (BF from PhyDGET) for 8,809 genes in SH (Left) and PEC (Right) muscles. Shift in expression and positive selection on sequence are uncorrelated, consistent with a model of sequence shifts in regulators creating expression changes separately in their target molecules. (B) Counts of amino acids (AAs) per gene for specific taxa compared to changes in expression level (BF) for the SH show that, even with a nonphylogenetic measure of amino acid change that ignores gene length, changes in sequence and changes in expression are not positively correlated. Normally distributed “jitter” was added to x coordinates to show the BF-frequency distribution.
Gene sequence changes and SH expression changes are not positively correlated. (A) Comparison of branch-specific positive selection on gene sequences (dN and dN/dS from PAML) and branch-specific shift in gene expression (BF from PhyDGET) for 8,809 genes in SH (Left) and PEC (Right) muscles. Shift in expression and positive selection on sequence are uncorrelated, consistent with a model of sequence shifts in regulators creating expression changes separately in their target molecules. (B) Counts of amino acids (AAs) per gene for specific taxa compared to changes in expression level (BF) for the SH show that, even with a nonphylogenetic measure of amino acid change that ignores gene length, changes in sequence and changes in expression are not positively correlated. Normally distributed “jitter” was added to x coordinates to show the BF-frequency distribution.When sequence divergences of individual genes are low, accurate estimation of dN/dS becomes difficult. Few genes in our data had more than one amino acid change and even fewer had a substantial number of nonsynonymous and synonymous substitutions to estimate the dN/dS ratio properly. Additionally, the effects of local gene tree discordance can be difficult to assess when phylogenetically informative sites are limited, and thus gene tree inference is unreliable. Finally, genes with more amino acid substitutions (by count or by proportion of the total peptide length) are not necessarily more functionally relevant. To test for a correlation between amino acid changes and expression changes without considering the level of divergence or gene sequence phylogeny, we also counted the number of sites in each gene’s amino acid alignment where the following pairs of groups had different amino acid states: 1) the three snap-performing species compared with the other four species, 2) M. vitellinus compared to all other species, and 3) C. mentalis and C. cornuta compared to all other species (Fig. 4). The number of lineage-specific amino acid changes was compared to the BF expression shifts for each gene.In both the nonsynonymous substitution per nonsynonymous site (dN) and the ratio of dN to synonymous substitutions per synonymous site (dN/dS) and amino acid pattern counting methods we found no evidence of positive correlation between branch-specific rates of nonsynonymous substitutions (dN from PAML) and branch-specific rates of expression change (BF from PhyDGET) (Fig. 4). An inverse relationship, if any, is indicated generally by the distribution of the results, meaning genes are not under concurrent directional sequence selection and expression shift. In the Manacus branch test and amino acid count test, PON2, HEMK1, and TPRKB are exceptions to this trend, and have both one to two lineage-specific amino acid substitutions and shifts in expression (SH-specific in PON2). These three genes are involved in cellular stress reduction and mitochondrial function ().The parvalbumin-like calcium binding sequence (PVALB-like-1) that showed a massive expression increase in Ceratopipra (Fig. 3) showed greatest sequence similiarity to oncomodulin-3 (OCM3) (). OCM3 was previously called “avian thymic hormone” in chicken (40), and represents a distinct and ancient paralog in the parvalbumin/oncomodulin protein family that is present in most tetrapod lineages, but is apparently absent in placental mammals (). Manakin peptide sequences of OCM3 were identical to each other and nearly identical to other birds, meaning that its enhanced expression is not accompanied by functional changes to the protein itself. While parvalbumin binds calcium as part of muscle contraction–relaxation, the exact function of these members of the oncomodulin family is still unknown, particularly in birds (41).These results show for this particular trait system that genes under positive selection for coding sequence are generally not simultaneously changing expression levels, consistent with a model of regulatory signaler mutation and regulatory target expression change. In this system, amino acid changes that could affect any or all tissues and expression-level changes specific to the target tissue provide complementary information about separate sets of functional candidate genes. The tissue-specificity of the expression level shifts makes this evidence more directly interpretable for our trait of interest, while the sequence variants add corroboration and potentially represent the heritable genetic factors of expression shifts.
Rapid Wing Movements Are Associated with Temporally Layered Changes in Diverse Factors.
With wing movements three times as fast as comparable birds, Manacus and Ceratopipra attract mates through the dense tropical undergrowth by producing loud mechanical “snaps.” Far less sharp are the molecular evolutionary signals of complex traits, which hide among many molecular factors that must evolve within the context of the interlocking systems of the whole organism. In this study, we report specific genes associated with the superfast twitch kinetics of an avian wing muscle used in complex mating displays. We identify an abundance of genes related to calcium transients whose multispecies expression patterns are associated with species exhibiting rapid wing movement. Our analysis brings molecular evolutionary support to existing physiological hypotheses that implicated calcium systems as crucial for extreme display phenotypes (11, 16, 34). We also report that calcium-related gene expression changes are not only directly correlated with snap-performing species, but also appear to be shared across related nonsnapping species. Our phylogenetic analysis of expression paints a compelling picture of a temporally layered accumulation of ancestral calcium-related changes specifically to the SH muscle, which we interpret as foundational molecular shifts complemented by later species-specific expression changes to complete the molecular-physiological requirements for rapid wing movements that produce the snap displays.In contrast to the crucial role that calcium systems occupy in physiological models of extreme muscle performance, homeostatic maintenance of muscle tissue seldom is considered important for specifically mediating rapid muscle contraction–relaxation cycling. However, the attractiveness of Manacus roll-snaps is correlated with both higher speed and longer duration, meaning males who resist fatigue while engaging superfast SH motion likely maintain higher reproductive fitness (12). We report multiple expression changes in homeostatic factors that maintain tissue integrity and metabolic factors that fuel energetic capacity. Here again, some gene expression changes correlated specifically with snap-performing species while others appear to shift in their common ancestors. We conclude that tissue maintenance and metabolic systems are also likely important facilitators of exaggerated muscle performance, despite not directly influencing the kinetics that underlie either rapid muscle contraction, relaxation, or both. Of course, the performance attributes of the SH in ancestors are difficult to deduce, and thus whether the early changes were directly associated with reproductive fitness is unclear. However, the use of the SH muscle for courtship displays is consistent with a hypothesis of greater functional constraint on the PEC due to its crucial role in flight (42), compared to the SH that is relatively more free to adapt to extremes (14).Within the broader evolutionary history of the manakin family, our results suggest that ancestors of species that do and do not perform snap displays had already shifted expression of several genes likely important for superfast twitch kinetics. Increased expression of sarco/endoplasmic reticulum Ca(2+)-ATPase genes in the SH and later increases in ryanodine receptors and triadin, point to a history of adaptation where foundational molecular shifts were complemented by proximal expression changes. In this way, we have derived a model of layered evolution for a complex behavioral component using tissue-specific phylogenetic modeling of expression profiles. Our study therefore not only has revealed a complex history of molecular factors involved in an ornate and extreme sexual display phenotype, but also shows how phylotranscriptomic modeling among closely related species can be specifically applied to discover functional gene candidates. Our study establishes both a foundation for continued inquiry into the molecular nature of highly involved traits like behavioral displays, and an approach for integrating shifting levels of gene expression and patterns of sequence allele association for inquiry in complex functional genomics.
Materials and Methods
Tissue Collection.
We collected SH and PEC muscle tissues from six species within the manakin (Pipridae) family and one outgroup species within the closely related tyrant flycatcher (Tyrannidae) family. Each of these species perform an elaborate courtship display that engages both the SH and PEC (15); however, only M. vitellinus, C. mentalis, and C. cornuta also use their SH to generate rapid wing-snaps as part of their display routines (10, 21). We obtained tissue from three reproductively active adult males from each species that were captured using passive mist netting at active breeding leks. All males in this study maintained courtship display sites (e.g., arenas, courts, and so forth), and thus we could reasonably assume that these individuals were continually performing their display routines to solicit copulations from females. This means that males in our study were using their SH and PEC muscles to similar levels at the time of capture.M. vitellinus, C. mentalis, L. coronata, and M. oleagineus were captured near the Panama Canal in Gamboa, Panama, while P. pipra, C. cornuta, and X. atronitens were captured near the Boro Boro and Rupununi rivers in Guyana (). Individuals were in reproductive condition, as verified by the presence of enlarged testes observed during dissection. Once caught, males were immediately extracted from the mist net and killed via rapid decapitation. Muscle tissues (both SH and PEC) were quickly dissected and preserved in RNAlater for long-term storage, following the manufacturer’s instructions. All muscle samples were transported to the United States on dry ice and were held at −80 °C until processing. These methods were approved by the appropriate Institutional Animal Care and Use Committees (IACUCs) at the Smithsonian Tropical Institute (STRI), Brown University, Wake Forest University, and the University of Mississippi.
Extraction, Sequencing, and Quality Control.
We homogenized samples using a rotor-stator homogenizer set to medium speed. We extracted total RNA using a Zymo Direct-zol RNA miniprep kit (Zymo Research), in which we included an initial phenol-chloroform separation of RNA following the manufacturer's instructions. RNA quantity and integrity (RIN) were measured using a Qubit and BioAnalyzer, respectively. Across all samples the average RIN was 8, indicating high-quality RNA ().Library preparation and sequencing were performed at the University of Illinois Roy J. Carver Biotechnology Center. The 42 RNA-seq libraries were prepared using the Illumina TruSeq Stranded mRNAseq Sample Prep Kit. Libraries were quantitated by quantitative PCR. Samples were sequenced on two lanes of an Illumina HiSeq 4000 to produce 125-bp paired-end RNA-seq short reads. FASTQ files were generated and demultiplexed with bcl2fastq (v2.17.1.14). As expected for muscle tissue, we detected a high proportion of mitochondrial transcripts in our sequenced read sets. We assembled mitochondrial reference genomes for each species and filtered mitochondria reads into a separate set. Full details are in . Raw sequence data are available on the National Center for Biotechnology Information Short Read Archive (PRJNA807902) (43).
Transcriptome Mapping and Quantification.
Nonmitochondrial reads for each sample were mapped using STAR (v2.7.3a) (44) to the P. filicauda reference genome (v1; GenBank: GCA_003945595.1) using single-pass mode with the reference GTF and all other parameters default. The unmapped read pairs from STAR were then aligned using BWA-mem (v0.7.17-r1188) (45) using a slightly relaxed mismatch parameter (-B 2), split hits as secondary (-M), and all other command flags default.Read pairs were counted per gene using full gene coordinate boundaries using featureCounts (v2.0.1) (46). An average of 95% of uniquely mapped read pairs were assigned uniquely to a gene. On average, 13.9 million read pairs were mapped per sample (range 19.1 million to 11.8 million) with final read counts proportional to starting library size and showing no apparent sign of increase in quantification rates with genetic distance.
Phylogenetic Analysis.
Reference genome read alignments were converted to VCF files using BCFtools (45). The VCF files were converted to MVF format (47) and merged into a single MVF file. Using the “FilterMVF” module, we collapsed the sequences from the separate species into one individual sample and inferred a transcriptome-wide phylogeny. We used MVFtools “InferTrees” method to construct a RAxML-NG maximum-likelihood phylogeny of a single concatenated alignment using the GTR+Γ model () (48). We also analyzed gene tree discordance using Quartet Sampling branch supports using the gene tree mode (49) and site concordance using the sCF test (50). This procedure draws random quartets for a random gene spanning each internal branch on the phylogeny to determine the concordance of genes with the concatenated phylogeny.
Pairwise Differential Gene Expression.
We analyzed changes in pairwise differential gene expression using R (51) with the limma and voom packages (52, 53). We normalized expression data to cpm reads and filtered out genes without at least ≥3 cpm in three or more samples. A standard linear model with “tissue” (SH/PEC) as the independent variable was used for testing within each species. P values were adjusted for multiple testing using the Benjamini–Hochberg correction. Top genes were ranked by P value () and these top candidate lists tested for GO enrichment ().
Phylogenetic Differential Gene Expression.
In addition to standard differential expression techniques, we introduce a new model-testing method that uses techniques from comparative methods applied to each gene’s expression level as a quantitative trait. PhyDGET first normalizes and transforms the data into log2(cpm), which brings values that span five orders-of-magnitude into a narrower numeric space to apply phylogenetic comparative methods (https://github.com/peaselab/phydget) (54). Each unit of change equals a twofold change in expression, a common metric in differential expression analyses.We tested each gene’s log2(cpm) values across the seven species for a given tissue tested using BayesTrait (v3.0.2) (55) under a null phylogenetic model where expression values are changing at a constant rate across the entire phylogeny, and several alternate models where specific branches can change under their own separate rate parameters (Fig. 3). Three individuals were used as biological replicate for each species. The marginal log-likelihoods of these two models are compared to calculate a BF (56). PhyDGET is a Python3 script that parallelizes data extraction, BayesTrait execution, and tabulation of outputs across all genes and models. A full description of the model and simulation-based verification are available in .
Grouped Pairwise Differential Gene Expression.
We tested multispecies groups of samples to compare the phylogenetic differential gene expression model with an approximation by uncorrected pairwise differential expression. For SH and PEC tissues separately, we conducted seven tests using limma+voom, where each species’ three samples were the focal treatment and the remaining species were coded as the alternative treatment. We also tested the three snap-performing species as the focal treatment and the other species as the alternative treatment in a snap versus nonsnap pairwise comparison (). Finally, we tested groups of taxa spanning the M6, M5, and M4 branches used for the PhyDGET tests. See for additional details.
Molecular Phylogenetic Tests of Selection.
We extracted the coding sequence alignments for all reference genes sing MVFtools (47), filtering genes with insufficient data for all species, pseudogenes, and likely paralogous duplicates. We tested each gene alignment for evidence of positive selection using the branch-test in PAML (57). Using a custom Python3 script, we prepared sequence alignment files containing a random individual’s sequence from each species for each gene. We ran four PAML tests on these focal branches: 1) M5 ancestor branch, 2) the M4 ancestor branch, 3) the Ceratopipra ancestor branch, and 4) the terminal branch for M. vitellinus (Fig. 3). To identify genes experiencing positive selection in each foreground branch, we estimated branch-specific rates of nonsynonymous substitution per nonsynonymous site (dN) and the ratio of dN to synonymous substitutions per synonymous site (dN/dS) values using the four branch models listed above. For the taxon-specific amino acid pattern counts, a codon MVF file was prepared using MVFtools, and filtered for at least 10× coverage. The “InferGroupSpecificAllele” function in MVFtools was used to generate allele pattern counts. See for additional details.
Authors: David Brawand; Magali Soumillon; Anamaria Necsulea; Philippe Julien; Gábor Csárdi; Patrick Harrigan; Manuela Weier; Angélica Liechti; Ayinuer Aximu-Petri; Martin Kircher; Frank W Albert; Ulrich Zeller; Philipp Khaitovich; Frank Grützner; Sven Bergmann; Rasmus Nielsen; Svante Pääbo; Henrik Kaessmann Journal: Nature Date: 2011-10-19 Impact factor: 49.962
Authors: Sangeet Lamichhaney; Jonas Berglund; Markus Sällman Almén; Khurram Maqbool; Manfred Grabherr; Alvaro Martinez-Barrio; Marta Promerová; Carl-Johan Rubin; Chao Wang; Neda Zamani; B Rosemary Grant; Peter R Grant; Matthew T Webster; Leif Andersson Journal: Nature Date: 2015-02-11 Impact factor: 49.962
Authors: Matthew J Fuxjager; Kristy M Longpre; Jennifer G Chew; Leonida Fusani; Barney A Schlinger Journal: Endocrinology Date: 2013-06-19 Impact factor: 4.736
Authors: James B Pease; Robert J Driver; David A de la Cerda; Lainy B Day; Willow R Lindsay; Barney A Schlinger; Eric R Schuppe; Christopher N Balakrishnan; Matthew J Fuxjager Journal: Proc Natl Acad Sci U S A Date: 2022-04-01 Impact factor: 12.779