Literature DB >> 29211852

Migration-Selection Balance Drives Genetic Differentiation in Genes Associated with High-Altitude Function in the Speckled Teal (Anas flavirostris) in the Andes.

Allie M Graham¹, Philip Lavretsky², Violeta Muñoz-Fuentes^3,4, Andy J Green⁴, Robert E Wilson⁵, Kevin G McCracken^1,5,6,7.

Abstract

Local adaptation frequently occurs across populations as a result of migration-selection balance between divergent selective pressures and gene flow associated with life in heterogeneous landscapes. Studying the effects of selection and gene flow on the adaptation process can be achieved in systems that have recently colonized extreme environments. This study utilizes an endemic South American duck species, the speckled teal (Anas flavirostris), which has both high- and low-altitude populations. High-altitude speckled teal (A. f. oxyptera) are locally adapted to the Andean environment and mostly allopatric from low-altitude birds (A. f. flavirostris); however, there is occasional gene flow across altitudinal gradients. In this study, we used next-generation sequencing to explore genetic patterns associated with high-altitude adaptation in speckled teal populations, as well as the extent to which the balance between selection and migration have affected genetic architecture. We identified a set of loci with allele frequencies strongly correlated with altitude, including those involved in the insulin-like signaling pathway, bone morphogenesis, oxidative phosphorylation, responders to hypoxia-induced DNA damage, and feedback loops to the hypoxia-inducible factor pathway. These same outlier loci were found to have depressed gene flow estimates, as well as being highly concentrated on the Z-chromosome. Our results suggest a multifactorial response to life at high altitudes through an array of interconnected pathways that are likely under positive selection and whose genetic components seem to be providing an effective genomic barrier to interbreeding, potentially functioning as an avenue for population divergence and speciation.

Entities: Chemical Disease Gene Species

Keywords: hypoxia; local adaptation; waterfowl

Mesh：

Year: 2018 PMID： 29211852 PMCID： PMC5757641 DOI： 10.1093/gbe/evx253

Source DB: PubMed Journal: Genome Biol Evol ISSN： 1759-6653 Impact factor: 3.416

Introduction

Heterogeneous landscapes provide venues in which populations experiencing divergent selective pressures can differentiate into locally adapted subpopulations (Nosil et al. 2009). However, the probability for local adaptation and continued divergence depends on the strength of selection balanced by gene flow among populations; that is, migration-selection balance. Specifically, if the strength of selection is weak relative to gene flow, most local genetic variation will tend to be homogenized among populations. Alternatively, if selection is stronger than the amalgamating force of gene flow, genetic differentiation can accumulate and be maintained across specific loci experiencing strong divergent selection (Yeaman and Whitlock 2011). In essence, in the latter scenario, selection against maladapted immigrants limits the effects of gene flow and provides a means for local adaptation (Yeaman and Whitlock 2011; Feder et al. 2012). In general, studies continue to find that much of a genome is free to introgress during early stages of population divergence (Via and West 2008; Nosil et al. 2009; Feder and Nosil 2010; Via 2012). For example, specific genetic regions continue to be identified as being important in early stages of divergence and maintaining species boundaries even in the face of gene flow, including sex-linked traits (Carling and Brumfield 2008; Ellegren 2009; Storchová et al. 2010; Elgvin et al. 2011; Martin et al. 2013; Lavretsky et al. 2015), chromosomal inversions, or “supergenes” (Kirkpatrick and Barton 2006; Thompson and Jiggins 2014), and a variety of other genes associated with adaptation to extreme environments (Chapman et al. 2013; DeFaveri et al. 2013; Wit and Palumbi 2013; Soria-Carrasco et al. 2014).Thus, the probability of becoming locally adapted is influenced by this interplay between gene flow and divergent selection (Le Corre and Kremer 2012; Savolainen et al. 2013). Species that have recently invaded extreme environments are ideal systems to study the effects of selection and gene flow on the adaptation process (Feder ). Among these, high-altitude habitats offer an excellent opportunity to investigate the genetic basis of local adaptation (Storz et al. 2010; Cheviron and Brumfield 2012) as the characteristics of high-altitude habitats (e.g., low temperatures, ultraviolet radiation, increased water loss, and low oxygen or hypoxic conditions) are typically debilitating for low-altitude individuals that are unacclimated (Hopkins and Powell 2001). Therefore, high-altitude species represent an unparalleled system for studying adaptation in animals, not only because organisms have to overcome a clear challenge to survival and reproduction but also because the physiological mechanisms of oxygen transport are well understood (Beall 2001; Powell 2003), and selective pressures relatively well defined (Cheviron and Brumfield 2012). In general, high-altitude species have dynamic respiratory and circulatory systems capable of responding to changes in oxygen (O2) supply and demand (Beall 2001). Adaptation to hypoxic environments has been shown to be refereed in part through the activation of hypoxia-inducible transcription factors (HIF), which start a signaling cascade of genes involved in biological processes such as angiogenesis and erythropoiesis, and assist in promoting and increasing O2 delivery to hypoxic tissues (Gorr et al. 2006; Hoogewijs et al. 2007; Rytkönen et al. 2011; Semenza 2007, 2011). Moreover, for aerobic energy production, genes involved in mitochondrial function and energy metabolism, O2 binding and delivery, and hematopoiesis are activated (Hopkins and Powell 2001; Hoogewijs et al. 2007). Ultimately, high-altitude adaptation involves coordinated changes in the expression of many genes that are involved in interacting biochemical pathways (Cork and Purugganan 2004; Cheviron and Brumfield 2012; Cheviron et al. 2012). Moreover, with different biochemical pathways, organisms can mechanistically differ in their response to hypoxic environments (Storz et al. 2010), making it important to understand how organisms adapt in different ways to these environments. At the genomic level, determining the balance between gene flow and selection is especially important when attempting to demarcate genomic regions associated with how populations have adapted to new environments. We expect established high-altitude populations to be genetically differentiated from their low-altitude counterparts at those genes/genetic regions associated with their new environment. Additionally, if gene flow is maintained between two populations, then we expect selection against maladapted genotypes to result in signatures of divergent selection at specific genes, with other parts of the genome largely undifferentiated (Feder et al. 2012; Via 2012). In this study, we tested for genomic differentiation, and specifically for regions putatively under divergent selection between low- (<1,500 m) and high- (up to 5,000 m) altitude populations of the speckled teal (Anas flavirostris), a widespread South American waterfowl. Although high-altitude speckled teal (A. f. oxyptera) are locally adapted to the Andean environment and largely allopatric from low-altitude birds (A. f. flavirostris), speckled teal are known to disperse long-distances, which appears to result in occasional gene flow across altitudinal gradients (McCracken et al. 2009a, 2009b). Previous research suggested a degree of asymmetry in gene flow in the speckled teal, with more pronounced levels of gene flow going from low to high than high to low (McCracken et al. 2009a). Additionally, genes shown to be under positive selection such as hemoglobin (Natarajan ) had lower levels of gene flow compared with neutral loci, attributed to countervailing selection at loci associated with high-altitude adaptation (McCracken et al. 2009a, 2009c). However, studies to date have been restricted in genomic coverage, and thus the extent and influence of gene flow in regards to local, high-altitude adaption in speckled teal remains still unexplored. Here, we used Restriction Site-Associated DNA Sequencing (RAD-Seq) to subsample the genomes of low- and high-altitude speckled teal to examine whether highly differentiated loci are associated with possible selective mechanisms in high-altitude environments, or are consistent with neutral (allopatric) divergence. In addition, we also sequenced the α- and β-hemoglobin genes and the mitochondrial (mtDNA) control region. To identify highly differentiated loci, we calculated FST and performed genomic scans and outlier analyses using Bayesian and other methods at three marker-types with different patterns of inheritance, autosomal, Z-chromosomal sex-linked, and mitochondrial. If the two populations are diverged due to selective processes, we expected outliers to be associated with pathways related to hypoxia, and/or those involved with growth and development, metabolism, O2 transport (i.e., hemoglobin), energy production, and oxidative damage (Cheviron and Brumfield 2012; Storz and Cheviron 2016). Moreover, if there has been local adaptation in the face of continuous gene flow, then we expect a small number of genes/genetic regions to exhibit high levels of allelic differentiation between populations, whereas the rest of the genome to exhibit low levels of allelic differentiation. Conversely, if high-altitude populations have been largely allopatric during their divergence from lower altitude populations, then we expect to find a larger portion of loci across the genome to be differentiated, primarily through stochastic processes such as genetic drift, characteristic of later stages of population divergence (Feder et al. 2012).

Methods

Specimen Collection and DNA Extraction

A total of 20 speckled teal were collected from low- (n = 10; elevation range 77–860 m) and high-altitude (n = 10; elevation range 3,211–4,405) sites (fig. 1 and supplementary table S1, Supplementary Material online). Animal collection was performed in adherence to Guidelines to the Use of Wild Birds in Research (Fair and Jones 2010) under permits to collect specimens from Peru (2002), Bolivia (2001), and Argentina (2001, 2003). Genomic DNA was extracted from muscle or blood using a DNeasy Tissue Kit (Qiagen, Valencia, CA) and following manufacturer’s protocols.

. 1.

—Specimen collection locations of the speckled teal, Anas flavirostris (A) 10 individuals collected from high-altitude populations, A. f. oxyptera, shown in gray, and 10 individuals from low-altitude populations, A. f. flavirostris, shown in white (B) Representative photograph.

Hemoglobin and Mitochondrial Sequences

The αA-hemoglobin subunit (676 bp; HBA2), βA-hemoglobin subunit (1,578 bp; HBB), and mitochondrial sequences containing part of the control region and tRNA-Phe gene (McCracken and Wilson 2011) were obtained from GenBank, for the same 20 individuals that were used in this study (McCracken et al. 2009a), except for the high-altitude population, for which the mitochondrial sequences had not been published previously (NCBI accession numbers are provided in supplementary table S2, Supplementary Material online). For both hemoglobins, all exons and introns were sequenced. All genes were sequenced using PCR and DNA sequencing protocols as described in McCracken et al. (2009a).

RAD Sequencing and Bioinformatics

Genomic DNA was normalized to ∼2 ng/μl using a Qubit Fluorometer (Invitrogen, Grand Island, NY) and submitted to Floragenex (Eugene, OR) for RAD-Seq. In short, genomic DNA was first digested with the 8-bp SbfI restriction enzyme (CCTGCAGG) followed by barcode and adapter ligation. Individually barcoded samples were multiplexed and sequenced on the Illumina HiSeq 2000 with single-end 100-bp sequencing chemistry. Following the run, RAD sequences were demultiplexed and trimmed to yield resulting RAD sequences of 90 bp. The methods used by Floragenex are described by Miller et al. (2007), Baird et al. (2008), and Hohenlohe et al. (2010) and are summarized below. Genotypes at each variable site were determined using Floragenex’s VCF_popgen v.4.0 pipeline to generate a customized VCF 4.1 (variant call format) database with parameters set as follows: minimum AF for genotyping = 0.075, minimum Phred score = 15, minimum depth of sequencing coverage = 10×, and allowing missing genotypes from up to 2 out of 20 individuals (10%) at each site. To filter out base calls that were not useful due to low-quality scores or insufficient coverage, genotypes at each nucleotide site were inferred using the Bayesian maximum likelihood algorithm. The genotyping algorithm incorporates the site-specific sequencing error rate, and assigns the most likely diploid genotype to each site using a likelihood ratio test and significance level of P = 0.05. Sites for which the test statistic is not significant are not assigned a genotype for that base in that individual, effectively removing data from sites for which there are too few high-quality sequencing reads. The sequencing coverage and quality scores were then summarized using the software VCFtools v.0.1.11 (Danecek et al. 2011). Custom perl scripts written by M. Campbell (University of Alaska Fairbanks) were used to retain biallelic sites only and converted to VCF file format.

Population Structure

Nucleotide diversity (π) and pairwise FST (Weir and Cockerham 1984) for each SNP between low- and high-altitude populations for RAD-Seq, hemoglobin, and the mtDNA were calculated in Arlequin v. 3.5 (Excoffier and Lischer 2010). A haplotype network was used to visualize structure at mtDNA using TCS (Clement et al. 2000) and as implemented and visualized in the program Population Analysis with Reticulate Trees (PopART) (software available at: http://popart.otago.ac.nz). Analyses were done under default settings. For nuclear markers, population structure was first visualized using a principal component analysis (PCA) as implemented in the software package PLINK v1.9 (Purcell et al. 2007). The PCA used in PLINK uses a two-dimensional reduction routine based on the variance-standardization relationship matrix. The top principle components are used as covariants in association analysis regressions to help correct for population stratification, whereas multidimensional scaling (MDS) coordinates help visualize genetic distances (Compagnon and Green 1994). Next, assignment probabilities were calculated using ADMIXTURE (Alexander et al. 2009; Alexander and Lange 2011). ADMIXTURE assumes a specific number of hypothetical populations (K) and provides a maximum likelihood estimate (i.e., Q estimate) of allele frequencies for each population and admixture proportion for each individual. We analyzed values of K = 1–5 using the block-relaxation method algorithm for point estimation and terminating them when the estimates increased by <0.0001.

Identifying Outlier Loci: Hemoglobin Sequences and RAD-Seq Data

A genomic scan was performed by obtaining pairwise FST estimates calculated in Arlequin v. 3.5 (Excoffier and Lischer 2010) between low- and high-altitude populations. Instances of speciation-with-gene-flow are typically characterized by a vast majority of the genome being homogenized via gene-flow with low FST, whereas a few regions containing genes under strong divergent selection have restricted gene-flow and high FST. Therefore, empirical examples are expected to produce a characteristic L-shaped distribution of differentiation across loci in the genome, with most loci having low FST values. Whereas, populations experiencing allopatric speciation with established reproductive isolation are characterized by less gene-flow and greater divergence across much more of the genome; characterized by a distribution with less skew, more density in the center, and a more pronounced tail of higher FST values (Via 2001; Savolainen et al. 2006; Feder et al. 2012). Next, we tested for signatures of selection using two programs, which minimized the probability of detecting false positives. First, Lositan, which implements the FDIST2 function, was used to demarcate markers putatively under positive selection (Beaumont and Nichols 1996; Antao et al. 2008): analyses were based on the Infinite Alleles Model with 50,000 simulations, a confidence interval of 0.95 and a false-discovery rate (FDR) of 0.01, as well as using the neutral mean FST and forcing mean FST options. In addition, we used a Bayesian approach as implemented in BayeScan v. 2.1 to again identify putatively selected loci, using default parameters suggested for large data sets (prods = 100). BayeScan uses a logistic regression model to separate locus-specific effects of selection from demographic differences (Foll and Gaggiotti 2008). Previous tests have shown that outliers detected by BayeScan are likely to be “better” candidate regions of the genome, because of its more conservative approach (Pérez‐Figueroa et al. 2010). Foll (2012) proposed a logarithmic scale for model choice defined as: >3 substantial (log10PO > 0.5); >10 strong (log10PO > 1.0); >32 very strong (log10PO > 1.5); and >100 decisive evidence for accepting a model (log10PO > 2.0). In our genome scans, a threshold for PO > 0.5 (substantial) was used for a marker to be considered under selection. Thus, top outlier loci were considered to be those that were identified by both methods, as previously stated. General outliers were considered to be those that were identified by both methods, including markers that were significant in Lositan analyses, and were in the top 1% of Log10(PO) values in BayeScan; these data were used for gene-flow analyses (see below). Any markers identified as putatively being under divergent selection by both Lositan and BayeScan were next subjected to a BLAST search (database – nr, e-value < e − 10, annotation cut-off > 50) in Blast2GO (Conesa et al. 2005) using the taxonomy filter for birds (taxa: 8782, Aves) to determine gene identity. Additionally, they were categorized by chromosomal location, either as autosomal or Z-linked based on a BLAST search to the Chicken genome (Gallus gallus 5.0 reference) through NCBI, using stringent criteria (blastn, >75% identity, e-value < 10−3, max-score >40). Putative chromosomal locations of each of the autosomal categorized clusters were also identified; although we focus primarily on the Z-chromosome versus autosome comparison due to the tendency for sex chromosomes to house genes related to sexual selection, reproductive isolation, and speciation (Ellegren 2009; Ellegren et al. 2012; Lavretsky et al. 2015). Gene ontology (GO) terms were then assigned to each of the sequences with significant BLAST hits using PANTHER (Mi et al. 2013); the GO terms associated with our outliers were subjected to term enrichment/overrepresentation analyses and visualization in AmiGO 2.0 (Carbon et al. 2009), using both the PANTHER and Go Experimental annotation data sets. In addition, different functional classifications of the gene list were performed, using the gene list analysis tool in PANTHER to determine which categories were enriched in the confirmed hit gene list relative to their representation in the reference list. For all GO term analyses, the reference list for the Chicken genome (Gallus gallus) was utilized.

Gene Flow (∂a∂i) and Coalescent Analyses

Previous multilocus work in this species suggested asymmetric gene flow from low- to high-altitude populations (McCracken et al. 2009a). We tested for and estimated rates, as well as the directionality of gene flow, with the program ∂a∂i (Gutenkunst et al. 2009, 2010), which implements a diffusion-based approach to test against specified evolutionary models with the best-fit model determined using a log likelihood-based multinomial approach. ∂a∂i analyses were done with RAD-Seq data, which were parsed into three categories that were analyzed separately, 1) All SNPs, 2) outliers-only as determined by Lositan/BayeScan analyses (see above), and 3) nonoutliers only. Given our interest in identifying differences between migration rates between outlier and nonoutlier loci, autosomal and Z-chromosome linked SNPs were analyzed together. To estimate bidirectional migration rates, the data were tested against the Isolation-with-Migration evolutionary model. Parameters in ∂a∂i were optimized prior to calculating the likelihood, which is the product of Poisson likelihoods for each parameter given an expected value from the model frequency spectrum. In the previous study of this species, five introns were used in a multilocus coalescent analysis to estimate the effective population size parameter (θ) and bidirectional gene flow (m1 and m2) between low- and high-altitude populations (McCracken et al. 2009a). We reanalyzed these sequence data here to estimate divergence time (t) and other parameters with the IM software (Nielsen and Wakeley 2001; Hey and Nielsen 2004). Starting parameters for the IM analysis were as follows, for models incorporating and not incorporating population splitting: -q1 2 -q2 2 -qA 4 -m1 50 -m2 100 -t 2. Parameter estimates from the IM analyses were then used with MS (Hudson 2002) to create a simulated distribution of FST, conditioned on the previously estimated values of t, θ, and m, generating 1,000 randomly sampled genealogies under the Infinite Sites Model. This was performed for three different divergence estimates (-ej 0.25, 0.5, 1.0) and under three different migration rate parameters (equal, and asymmetrical in both directions). The empirical values of FST for all SNPs, outliers-only as determined by Lositan/BayeScan analyses, and nonoutliers were then compared with the distribution of simulated FST values generated by MS.

Results

A total of 159 million RAD-Seq reads were obtained with an average depth of 38.6 (±8.0 SD) million reads per sample. After filtering, an average of 19,434 RAD clusters were recovered per sample, which represents ∼1% of a typical 1.1 gigabase avian genome (Zhang et al. 2014). Finally, a total of 47,731 polymorphic SNPs were identified. Overall, FST estimates between low- and high-altitude speckled teal populations were identical for Lositan (FST = 0.065) and Arlequin (FST = 0.065). The mtDNA with FST = 0.77 exhibited a 14-fold difference in population differentiation compared with nuclear DNA (table 1).

Table 1

Divergence Measurements (FST) Associated with Different Subsets of Markers (mitochondrial vs. nuclear/RAD-Seq), and Their Chromosomal Location (putatively Z-linked vs. autosomal)

Chromosomal Location	Marker Type	F_ST
Mitochondria	—	0.770
Nuclear (RAD-seq)	All	0.065
Z-linked	All	0.123
Z-linked	Outlier	0.577
Z-linked	Nonoutlier	0.064
Autosomal	All	0.061
Autosomal	Outlier	0.442
Autosomal	Nonoutlier	0.049

Divergence Measurements (FST) Associated with Different Subsets of Markers (mitochondrial vs. nuclear/RAD-Seq), and Their Chromosomal Location (putatively Z-linked vs. autosomal)

Population Structure and Outlier Detection

For nuclear sequences, ADMIXTURE results corresponded with PCA analyses in which low- and high-altitude individuals were highly differentiated into two clusters with 99% assignment probabilities (fig. 2), despite overall low FST. The mitochondrial sequences corroborate the nuclear data showing two distinct populations, however, suggesting a much deeper level of divergence with FST = 0.77 (fig. 2).

. 2.

—Population structure between the high- and low-altitude speckled teal populations (A) mitochondrial haplotype (B) ADMIXTURE results (C) principle component analysis.

—Population structure between the high- and low-altitude speckled teal populations (A) mitochondrial haplotype (B) ADMIXTURE results (C) principle component analysis. BayeScan identified ∼1% of markers (457 SNPs associated with 420 RAD clusters), as compared with Lositan’s ∼6% (3,008 SNPs, associated with 2,664 RAD clusters) as being putatively under positive selection (supplementary table S2, Supplementary Material online). Thus, our RAD-Seq data show a FST distribution with a pronounced right-tailed L-shape, with most variants showing little-to-no population differentiation and a smaller percentage of the genome comprised of highly differentiated loci (figs. 3 and 4). Loci were classified as outliers if they were identified by both methods (Lositan and BayeScan) as being significant. Ultimately, although 356 RAD-Seq loci overlapped both programs, only 34 of them returned a significant BLAST hit, and 29 of those were to a gene or transcript with an identifier. For the RAD-Seq data, the top hits with the highest prior odds were TOPORS and NNT, whereas F8, IGF-1, and BMP-2 had slightly lower prior odds (table 2). The low hit rate for our BLAST results may, in part, be because bird genomes currently have annotated 70% of the total number genes compared with the human genome; this is likely due to an enrichment of genes in GC-rich regions which are difficult to sequence, assemble, and annotate; thus they appear missing (Botero-Castro et al. 2017).

. 3.

—RAD cluster distribution against measurement of population divergence (FST): markers with black arrows signify RAD-Seq markers and their respective gene identifier that were labeled as outliers for both LOSITAN and BayeScan.

. 4.

—Distribution of FST values based on MS simulations compared with empirical RAD-Seq data for divergence time of 0.5 Ma: stars = location of significant outliers, gray bars = RAD-Seq data, gray line = equal migration rate, black line = unequal migration rate (1 → 2, 10; 2 → 1, 100), dotted line = equal migration rate (1 → 2, 100; 2 → 1, 100).

Table 2

Gene ID	BLAST Hit Description	Min. e-Value	Mean Similarity (%)	PANTHER Family/Subfamily	Seq. Name	Chromosome
METRN	Anas platyrhynchos meteorin-like transcript variant mrna	1.54E-34	93.33	METEORIN (PTHR28593: SF4)	RADid_0003361_depth_116	Autosomal
ARHGAP44	Gallus gallus rho gtpase activating protein 44 transcript variant mrna	6.54E-33	95.19	RHO GTPASE-ACTIVATING PROTEIN 44 (PTHR14130: SF16)	RADid_0006028_depth_98	Autosomal
TOPORS^a	Anas platyrhynchos e3 ubiquitin-protein ligase topors-like partial mrna	2.97E-37	100.00	E3 UBIQUITIN-PROTEIN LIGASE TOPORS (PTHR22937: SF86)	RADid_0016436_depth_109	Z-linked
ZNF469	Anas platyrhynchos zinc finger protein 469 mrna	1.26E-35	98.00	ZINC FINGER PROTEIN 469 (PTHR21465: SF3)	RADid_0021978_depth_146	Autosomal
IL20	Anas platyrhynchos interleukin 20 mrna	6.54E-33	89.50	INTERLEUKIN-20 (PTHR10078: SF59)	RADid_0022434_depth_89	Autosomal
—	Gallus gallus bac clone tam33-29c6 from chromosome complete sequence	5.74E-21	89.00	—	RADid_0027544_depth_59	Z-linked
—	Opisthocomus hoazin bio-material lwl<deu	1.04E-17	85.00	—	RADid_0027947_depth_120	Z-linked
CAMKK1	Anas platyrhynchos calcium calmodulin-dependent protein kinase kinase alpha transcript variant mrna	1.26E-35	98.00	CALCIUM/CALMODULIN-DEPENDENT PROTEIN KINASE KINASE 1 (PTHR24347: SF330)	RADid_0032309_depth_138	Autosomal
TRIOBP	Mus musculus trio and f-actin binding protein transcript variant mrna	1.54E-34	91.78	TRIO AND F-ACTIN-BINDING PROTEIN (PTHR17271: SF15)	RADid_0035703_depth_125	Autosomal
F8^b	Anas platyrhynchos coagulation factor procoagulant component mrna	4.41E-35	92.67	COAGULATION FACTOR VIII (PTHR10127: SF725)	RADid_0042196_depth_153	Autosomal
SYDE2	Anas platyrhynchos synapse defective rho homolog 2 (elegans) transcript variant mrna	7.97E-32	95.00	RHO GTPASE-ACTIVATING PROTEIN SYDE2 (PTHR23176: SF89)	RADid_0043458_depth_294	Autosomal
LIPA	Anas platyrhynchos lysosomal acid lipase cholesteryl ester hydrolase-like transcript variant mrna	6.54E-33	98.00	LYSOSOMAL ACID LIPASE/CHOLESTERYL ESTER HYDROLASE (PTHR11005: SF61)	RADid_0044346_depth_103	Autosomal
IGF1^b	Anser anser insulin-like growth factor i (igf-i) intron 3	1.88E-14	98.00	INSULIN-LIKE GROWTH FACTOR I (PTHR11454: SF24)	RADid_0050513_depth_248	Autosomal
TRIM71	Anas platyrhynchos e3 ubiquitin-protein ligase trim71-like mrna	1.26E-35	98.00	E3 UBIQUITIN-PROTEIN LIGASE TRIM71 (PTHR24103: SF544)	RADid_0053903_depth_95	Autosomal
—	Gallus gallus bac clone ch261-26d19 from chromosome complete sequence	1.04E-17	87.00	—	RADid_0061725_depth_138	Z-linked
USP24	Anas platyrhynchos ubiquitin specific peptidase 24 mrna	1.53E-34	93.71	—	RADid_0064189_depth_48	Autosomal
CLEC2L	Anas platyrhynchos c-type lectin domain family 2 member l-like transcript variant misc_rna	1.26E-35	96.00	C-TYPE LECTIN DOMAIN FAMILY 2 MEMBER L (PTHR22800: SF216)	RADid_0066143_depth_105	Z-linked
ARHGEF10	Chlorocebus sabaeus rho guanine nucleotide exchange factor 10-like transcript variant mrna	1.75E-27	88.63	RHO GUANINE NUCLEOTIDE EXCHANGE FACTOR 18 (PTHR12673: SF204)	RADid_0068177_depth_21	Autosomal
VIPR1	Ficedula albicollis vasoactive intestinal polypeptide receptor-like mrna	9.71E-12	85.00	VASOACTIVE INTESTINAL POLYPEPTIDE RECEPTOR 1 (PTHR12011: SF363)	RADid_0069283_depth_65	Z-linked
MIEF1/MID51	Anas platyrhynchos smith-magenis syndrome chromosome candidate 7-like transcript variant mrna (mitochondrial dynamic protein MID51, mitochondrial elongation factor 1 MIEF1)	2.97E-37	100.00	MITOCHONDRIAL DYNAMICS PROTEIN MID51 (PTHR16451: SF16)	RADid_0071374_depth_108	Autosomal
—	Anser cygnoides clone zaas082 microsatellite sequence	2.43E-19	82.00	—	RADid_0073508_depth_100	Z-linked
APEH	Ficedula albicollis vasoactive intestinal polypeptide receptor-like mrna	5.03E-28	92.50	ACYLAMINO-ACID-RELEASING ENZYME (PTHR42776: SF4)	RADid_0075723_depth_129	Autosomal
TSPAN19	Anas platyrhynchos tetraspanin 19 transcript variant mrna	8.49E-19	94.07	TETRASPANIN-19-RELATED (PTHR19282: SF274)	RADid_0075864_depth_49	Autosomal
TTLL11	Anas platyrhynchos tubulin tyrosine ligase-like member 11 transcript variant mrna	2.97E-37	100.00	TUBULIN POLYGLUTAMYLASE TTLL11 (PTHR12241: SF136)	RADid_0081068_depth_86	Autosomal
DCIR	Anas platyrhynchos dendritic cell immunoreceptor complete cds dcar*null complete sequence and dendritic cell immunoactivating receptor complete cds	4.13E-29	93.00	C-TYPE LECTIN DOMAIN FAMILY 4 MEMBER A (PTHR22802: SF289)	RADid_0081629_depth_106	Z-linked
MRPL54	Anas platyrhynchos mitochondrial ribosomal protein l54 partial mrna	1.26E-35	87.50	39S RIBOSOMAL PROTEIN L54, MITOCHONDRIAL (PTHR28595: SF2)	RADid_0084279_depth_80	Autosomal
ARHGEF18	Anas platyrhynchos rho rac guanine nucleotide exchange factor 18 transcript variant mrna	2.97E-37	93.00	RHO GUANINE NUCLEOTIDE EXCHANGE FACTOR 18 (PTHR12673: SF204)	RADid_0091284_depth_213	Autosomal
MED10	Anas platyrhynchos mediator complex subunit 10 mrna	2.97E-37	100.00	MEDIATOR OF RNA POLYMERASE II TRANSCRIPTION SUBUNIT 10 (PTHR13345: SF7)	RADid_0098634_depth_34	Autosomal
BMP2^b	Anas platyrhynchos bone morphogenetic protein 2-like mrna	1.26E-35	86.50	BONE MORPHOGENETIC PROTEIN 2 (PTHR11848: SF210)	RADid_0105984_depth_195	Autosomal
DCIR	Anas platyrhynchos dendritic cell immunoreceptor complete cds dcar*null complete sequence and dendritic cell immunoactivating receptor complete cds	2.78E-31	85.70	C-TYPE LECTIN DOMAIN FAMILY 4 MEMBER A (PTHR22802: SF289)	RADid_0107007_depth_106	Autosomal
NNT^a	Chaetura pelagica nicotinamide nucleotide transhydrogenase mrna	3.39E-30	90.83	NICOTINAMIDE NUCLEOTIDE TRANSHYDROGENASE (PTHR10160: SF25)	RADid_0107938_depth_49	Z-linked
—	Gallus gallus bac clone ch261-78c3 from chromosome complete sequence	2.28E-13	91.00	—	RADid_0115589_depth_45	Z-linked
TLE2	Aquila chrysaetos canadensis transducin-like enhancer of split 2 transcript variant mrna	1.75E-27	91.48	AMINO-TERMINAL ENHANCER OF SPLIT-RELATED (PTHR10814: SF33)	RADid_0118842_depth_161	Autosomal
DCIR	Anas platyrhynchos dendritic cell immunoreceptor complete cds dcar*null complete sequence and dendritic cell immunoactivating receptor complete cds	2.60E-25	87.50	C-TYPE LECTIN DOMAIN FAMILY 4 MEMBER A (PTHR22802: SF289)	RADid_0124734_depth_48	Z-linked

Candidates of top interest.

Candidates of medium interest.

The Gene Identification Information for RAD-Seq Markers That Were Outliers from Both LOSITAN and BayeScan Analyses, with Chromosomal Location, Gene Sequence Description, Min e-value, and Mean Similarity (blastn), as well as Gene Ontology Term Links (PANTHER) Candidates of top interest. Candidates of medium interest. —RAD cluster distribution against measurement of population divergence (FST): markers with black arrows signify RAD-Seq markers and their respective gene identifier that were labeled as outliers for both LOSITAN and BayeScan. —Distribution of FST values based on MS simulations compared with empirical RAD-Seq data for divergence time of 0.5 Ma: stars = location of significant outliers, gray bars = RAD-Seq data, gray line = equal migration rate, black line = unequal migration rate (1 → 2, 10; 2 → 1, 100), dotted line = equal migration rate (1 → 2, 100; 2 → 1, 100). Outlier detection within the hemoglobin sequences was performed by calculating pairwise FST for each SNP via a locus-by-locus AMOVA. This identified four SNPs on the α-chain and four SNPs on the β-chain subunit of the major hemoglobin. All four of these variants have previously been identified, and for several on the β chain a putatively beneficial role in hemoglobin function has been identified via experimental tests revealing increased O2-binding affinity in the high-altitude population: α77 (pos 379, FST = 0.95), β116 (pos 346, FST = 1.00), β133 (pos 397, FST = 1.00), and β13 (pos 37, FST = 0.42) (Natarajan et al. 2015).

GO Terms of Outliers

Gene ontology terms were found for 27 of the 28 genes (table 2 and supplementary table S3a, Supplementary Material online); the gene list analysis summarizes the represented GO term groups in the form of those associated with pathway, protein, cellular component, biological process, and molecular function. Pathway, molecular function, and biological process yielded the most illuminating categories, whereas protein and cellular component were very general (supplementary table S3b, Supplementary Material online). The categories with the highest numbers in the “pathway” class were Gonadotropin-releasing hormone receptor pathway (2), followed by CCRK signaling, Wnt signaling pathway, Insulin/IGF pathway-protein kinase B signaling cascade, Insulin/IGF pathway-mitogen activated protein kinase kinase/MAP kinase cascade, Androgen/estrogene/progesterone biosynthesis, TGF-beta signaling pathway, and blood coagulation (1 each). In the “biological process” and “molecular function” classes, metabolic process (12), and catalytic activity (12) were their highest represented categories, respectively. However, the results of AmiGO statistical representation yielded no significantly overrepresented categories, likely due to the small size of the gene list.

Chromosomal Location

In general, 29 out of the 35 chromosomes were represented in our RAD-Seq data. The only chromosomes that were missing some cluster representation were Chromosome 16 and 30–33, which are some of the smallest microchromosomes. Approximately 4.8% of the total clusters “mapped” to still unplaced scaffolds within the Chicken genome. Of the 19,434 RAD clusters, 889 clusters were identified as putatively Z-linked (i.e., having a significant BLAST hit to the Chicken Z-chromosome); therefore, the rest of the 18,549 RAD-Seq clusters were classified as putatively autosomal, although there is no way of knowing for sure given the short length of the RAD clusters (i.e., 90 bp). Compared with the autosomal clusters, the Z-linked clusters had higher FST by a factor of two; Z-linked FST = 0.123 (1,797 SNPs from 889 seqs) and autosomal FST = 0.061 (45,944 SNPs from 18,549 seqs). This Z:Autosomal ratio of 2.01 is higher than expected (expected ≤ 1.33) under a neutral model of evolution (Whitlock and McCauley 1999). The chromosomal locations of the outliers were also identified: of the total outliers (356), 1.67% were mapped to the autosomes (310/18,549), compared with 5.17% outliers on the Z-chromosome (46/889), approximately a 3-fold difference. The Chicken Z-chromosome is 82,363,669 bp representing ∼7.68% of the total Chicken genome (1,072,544,763 bp, Gallus gallus, NCBI Annotation Release 103). Thus, given what we know about the size of the Chicken genome, out of the 356 the overlapping outliers, we classified 49 as belonging on the Z-chromosome (12.9%), which is a significantly larger portion of outliers than expected by chance (X2 = 5.5102, P value = 0.019). The two top hits in the outlier analyses were located on the Z-chromosome (NNT, TOPORS).

Gene Flow Estimates

Although low- and high-altitude populations are differentiated, the ∂a∂i results support extensive gene flow between populations under an Isolation-With-Migration model, specifically asymmetric gene flow from low to high altitude (table 3), corroborating the previous coalescent study (McCracken et al. 2009a). These asymmetric gene flow estimates correspond with the same directionality obtained from mtDNA in which 3% (n = 2) of high-altitude individuals were recovered with low-altitude haplotypes, whereas no low-altitude individuals were recovered with high-altitude haplotypes (fig. 2). It is important to note that these introgressed mitochondrial haplotype were collected from the far northwest Argentina, which is close to where the low- and high-altitude subspecies distributions are adjacent.

Table 3

Gene Flow Estimates for Both Nonoutlier and Outlier Loci with Isolation-with-Migration (IM) Model in ∂a∂i Showing Two Independent Runs between Low- and High-Altitude Populations

						N_e		Migration Rate
		Theta	Likelihood	S	Time	Low	High	High–> Low	Low–> High
Nonoutliers	IM—run 1	9598.944863	−6375.446714	0.07592961	4.99593192	1.24293272	0.46102421	0.60743074	5.5158448
	IM—run 2	9598.944863	−6375.446714	0.06175858	1.68199377	1.10955537	0.37120589	0.44283411	9.99918778
	IM average	9598.944863	—	0.068844095	3.338962845	1.176244045	0.41611505	0.525132425	7.75751629
Outliers	IM—run 1	87.12585977	−996.6523572	0.80475161	9.99245969	0.90984298	0.17140578	0.24171386	2.34512902
	IM—run 2	87.12585977	−996.6523572	0.124112	9.99098621	2.9884493	0.23662533	0.22651885	1.58623989
	IM average	87.12585977	—	0.464431805	9.99172295	1.94914614	0.204015555	0.234116355	1.965684455

Note.—Averaged migration rates between the marker types for the two populations are shown in bold text. Theta, Waterson’s θ or nucleotide diversity; Ne, effective population size; migration rate, individual migrants per generation.

Gene Flow Estimates for Both Nonoutlier and Outlier Loci with Isolation-with-Migration (IM) Model in ∂a∂i Showing Two Independent Runs between Low- and High-Altitude Populations Note.—Averaged migration rates between the marker types for the two populations are shown in bold text. Theta, Waterson’s θ or nucleotide diversity; Ne, effective population size; migration rate, individual migrants per generation. The analyses also suggest a substantial reduction in gene flow in outlier loci compared with nonoutlier loci, showing a 2-fold decrease from high- to low-altitude estimates (0.525 → 0.234 migrants per generation), and a 4-fold decrease from low- to high-altitude (7.76→1.97 migrants per generation; table 2). These outlier regions may represent those portions of the genome that are under selective constraint due to their being associated with adaptations to O2 deprivation and/or cold temperatures at high-altitude. Finally, the simulated distribution of FST in the MS analysis (fig. 4 and supplementary fig. S1, Supplementary Material online) indicates that we observed significantly more outlier loci, and a more right-shifted distribution of FST, than expected under a neutral model of evolution, in which all loci evolve per drift-mutation equilibrium with migration, but without influence of migration-selection balance.

Discussion

Our data reveal that despite low levels of genomic divergence (FST = 0.065) and high gene flow, the low- and high-altitude populations of speckled teal are genetically distinct. Evidence of this can be seen in the population structure associated with both mitochondrial and nuclear markers (table 1 and fig. 2). In the nuclear DNA, this pattern is due, in part, to relatively few regions (i.e., regions with outliers) associated with genes known to be involved in responses to ROS production/oxidative damage (TOPORS, NNT), response to hypoxia through the HIF pathway (TOPORS, IGF, BMP), and response to hypoxia through heme factors/blood (F8, HBA2, HBB, BMP-2). These genes were found to involve autosomal (IGF, BMP, HBA2, HBB, F8), Z-linked (NNT, TOPORS) and mitochondrial elements (NNT = nuclear encoded, mitochondrial embedded) (fig. 3 and table 2). As predicted, these outliers also showed depressed gene flow between the populations, compared with the rest of the genome, suggesting an important role for migration-selection balance in the evolution of these loci in these two populations (table 3). Lastly, our results also point at a dominant role for genes located on the Z-chromosome (fig. 3 and table 2).

Migration-Selection Balance and High-Altitude Adaptation in Speckled Teal

The interplay between selection and gene flow is important for population divergence in the presence of gene flow (Feder et al. 2012). Under such a scenario, the magnitude of selection dictates whether populations continue to diverge or evolve as a single group. Here, we identified a relatively small proportion of markers to be under positive/divergent selection with restricted levels of gene flow between low- and high-altitude speckled teal populations. Under neutrality, where genetic drift acts on markers with different effective population sizes (Ne) (Whitlock and McCauley 1999), there are different expectations for FST estimates associated with certain marker types (i.e., mitochondrial vs. nuclear, sex chromosome vs. autosome). The expectations associated with Ne are useful in illuminating whether the evolution of these different markers types has been shaped through different selective pressures on specific chromosome types (Charlesworth 2009). Previous work on waterfowl support that ducks fit the ¼ rule for mtDNA and ¾ rule for Z-linked markers (Lavretsky et al. 2015, 2016). Specifically, under a scenario of equal reproductive success between males and females, the mitochondria will have ¼ the effective population size of the nuclear genome, resulting in a 4-fold difference in FST between mtDNA and nuDNA (Caballero 1995; Whitlock and McCauley 1999; Dean et al. 2015). There is a similar expectation for Z chromosome/autosomal comparisons. Under equal reproductive success between males and females, the Z chromosome will have ¾ the effective population size of autosomes, resulting in an expected Z:Autosomal FST ratio of ≤1.33 (Caballero 1995; Whitlock and McCauley 1999; Dean et al. 2015). However, our results show significant deviations from predictions under neutrality, with mtDNA: nuDNA resulting in a 14-fold difference, and a 2-fold difference for Z:Autosomal (table 1), suggesting a substantial role for evolutionary forces such as selection, drift, or gene flow. There are several factors that may partly explain these patterns. First, the observed difference is most extreme for the mtDNA, which could be because of the faster sorting rate of mtDNA allowing it to accumulate fixed differences faster than nuDNA (Zink and Barrowclough 2008), or because many waterfowl including speckled teal have very large population sizes (Frankham 2007; Wilson et al. 2011), although this explanation is less likely because the high-altitude population is not small. Second, the extreme divergence in the mitochondria, despite similar estimates of gene flow, may suggest that this pattern has resulted from selective processes, since the mitochondrial genome is frequently a source for selective sweeps and strong purifying selection on genetic variation associated with adaptations to hypoxic environments (Xu et al. 2007; Scott et al. 2011; Yu et al. 2011; Li et al. 2013b; Zhang et al. 2013). This could then contribute to mito-nuclear incompatibilities hampering interbreeding between the two populations (Burton and Barreto 2012), with introgression further limited by the selective pressure exerted by hypoxia (Wolff et al. 2014). However, recent evidence suggests this is not the case, as there was no discernable evidence for positive selection on the mitochondrial genome of speckled teal, and an overall lack of nonsynonymous variation associated with any protein-coding regions (Graham AM and McCracken KG, unpublished data). Lastly, it is perhaps most likely that male-biased dispersal and female philopatry common in ducks underlies the substantial structuring of mtDNA in the speckled teal (Avise et al. 1992; Peters et al. 2007, 2012; Zink and Barrowclough 2008; Flint et al. 2009). The Z:Autosome ratio also is larger (2.01) than predicted under neutrality (1.33), although not as large as the mtDNA:nuDNA ratio, suggesting a role for selective pressures for the Z chromosome. For example, significant gene flow, predominantly from low- to high altitude, was detected by ∂a∂i analyses regardless of marker set; however, there was a substantial decrease in gene flow across outlier loci as compared with nonoutliers (table 3). Similarly restricted gene flow was detected in Hb sequences compared with other nuclear loci in speckled teal and other Andean waterfowl (McCracken et al. 2009a). We also found that more significant outliers were on the Z chromosome than expected, and such loci have been implicated in population divergence and speciation in many bird species (Saether et al. 2007; Carling and Brumfield 2009; Pryke 2010; Storchová et al. 2010; Elgvin et al. 2011; Ellegren et al. 2012; Stölting et al. 2013; Lavretsky et al. 2015). A higher density of Z-linked genes in these types of analysis is thought to be the result of a higher rate of evolution relative to autosomes (Mank et al. 2007; Ellegren 2009), as well as through its role in sexual selection, and reproductive isolation (Ritchie 2007). However, it could also be because of a general reduction in recombination rates across the Z chromosome, which can cause blocks of linkage disequilibrium, serving to further offset the disrupting force of gene flow (Saether et al. 2007; Qvarnström and Bailey 2009). Given that the two speckled teal populations are phenotypically distinct in both plumage and body size, it seems likely that the outlier analyses have pinpointed genes underlying mechanisms involved in speciation through adaptive divergence. The two genes with the highest divergence were Z-linked, and potentially have a direct link to adaptations involving high-altitude functions (see next section). Two possibilities are that 1) hypoxia and the high-altitude environment are driving divergence directly in these genes or closely linked genes, or 2) that reproductive isolation is driving phenotypic divergence and these outliers are effectively hitchhiking on that selective pressure on the sex chromosomes. However, with our current data, we are not able to clarify further which of the two possibilities is responsible, thus warranting future research. Nonetheless, in the case of the speckled teal, we see a significant role for nonneutral processes involving Z-linked and autosomal markers—a result consistent with recent findings in other birds (Dean et al. 2015; Dhami et al. 2016) as well as other duck species (Lavretsky et al. 2015, 2016). Overall, our results suggest selection is acting to prevent admixture at loci of adaptive importance, likely through selection against alleles originated from maladapted migrants (i.e., low-altitude individuals immigrating to the highlands or vice versa). This appears to be occurring despite these speckled teal populations being relatively divergent, estimated between 0.5 and 1.0 Ma or older, with gene flow likely playing a recurrent role throughout their recent history. Thus, at the genomic level, these populations do not seem to be suffering from an erosion of locally adaptive alleles through swamping. Ultimately, this matches predictions of divergent selection generating extrinsic reproductive isolation (Yeaman and Whitlock 2011; Feder et al. 2012), due to poor reproductive success of low-altitude migrants at high altitude compared with high-altitude residents with high-altitude genotypes. Extra-limital examples of low-altitude waterfowl species (e.g., A. discors, A. bahamensis) occasionally are observed at high altitude; however, low egg hatchability due to embryonic susceptibility to hypoxia appears to present a serious selective pressure potentially limiting reproductive success, which ultimately may influence a species’ ability to colonize the high-altitude environment (Visschedijk et al. 1980; Leon-Velarde et al. 1984; Monge and Leon-Velarde 1991; Carey et al. 1994). In addition, the asymmetrical gene flow found in this study corroborates prior work on speckled teal, 1) which showed that there were higher levels of gene flow from low to high altitude than the reverse (McCracken et al. 2009a), 2) and that other genes under positive selection (i.e., hemoglobin) had much lower levels of gene flow compared with neutral loci. In the latter case, this was attributed to countervailing selection at loci that have now been shown experimentally to be associated with mechanisms related to high-altitude adaptation (McCracken et al. 2009a, 2009c). Both the asymmetry of gene flow and the low hatchability of low-altitude genotypes at high-altitude suggest that crucial genes for survival at high-altitudes are thus sequestered in high-altitude locations, whereas the rest of the genome is seemingly free to admix. In addition, this asymmetry could be enhanced by the fact that the low altitude population is considered more migratory than their high-altitude counterparts (McCracken et al. 2009a). Therefore, our results support strong roles for both selection and gene flow in shaping the genomic architecture of speckled teal populations in response to genetic adaptations to the high-altitude Andean environment.

High-Altitude Adaptation and Z-Linked Genes: TOPORS and NNT

It is well known that exposure to low O2 leads to a cascade of metabolic and physiological changes. At high altitudes, oxygen utilization has crucial consequences for cellular function, especially through the production of reactive oxygen species (ROS) typically produced through inefficiencies during mitochondrial respiration (Turrens 2003; Palmer and Clegg 2014). The imbalance of ROS and antioxidant capacity is driven by hypoxic stress and increases at high altitudes; therefore, mechanisms to protect against oxidative damage or reduce ROS production would be advantageous (Storz et al. 2010). Our data present two gene regions with functions associated with a response to oxidative damage, TOPORS and NNT. The region that we identified as having the highest allelic divergence between the two populations was TOPORS (FST = 1.0), which is an E3 ubiquitin-protein ligase known to regulate p53 (Rajendra et al. 2004; Weger et al. 2005) and DNA Topoisomerase I (Haluska 1999; Weger et al. 2005). Both of these genes/pathways are utilized as part of an intracellular nonimmune surveillance mechanism that controls cellular response to a variety of stress conditions, including DNA damage and hypoxia, among others, leading to cell growth arrest and apoptosis (Levine 1997; Sermeus and Michiels 2011; Bertozzi et al. 2014). Although TOPORS has not previously been identified as candidate gene involved in adaptation to high-altitude environments, p53 has been implicated in response to low O2, including, hypoxic microenvironments created by tumors (Royds et al. 1998), underground burrows of rodents (Ashur-Fabian 2004; Quina et al. 2015), and even high-altitude environments (Zhao et al. 2013). In addition to roles previously stated, TOPORS has also been implicated in playing a key role in regulating primary cilia-dependent development and function in the retina (Chakarova et al. 2011), potentially in response to increased levels of ultraviolet radiation exposure, as well as oxidative stress, at higher altitudes (Paul and Gwynn-Jones 2003). Therefore, the role of TOPORS in hypoxia and DNA damage presents a potentially pleiotropic response to both low O2 and DNA repair from oxidative and ultraviolet radiation damage. The other gene of interest is the nuclear-encoded, but mitochondrially embedded NAD (P) transhydrogenase (NNT), which is essential for oxidative phosphorylation (OXPHOS). Under normal conditions, NNT uses energy from the mitochondrial proton gradient to produce high concentrations of NADPH; the resulting NADPH is used for biosynthesis, as well as in reactions inside the mitochondria required to remove reactive oxygen species (Figueira et al. 2013). To date, this is the first time NNT has been identified as an outlier/gene of interest in relation to high-altitude adaptation, or low-O2 environments. Previously, mutations in NNT have been shown to increase the amount of oxidative damage due to its inability to regulate ROS within the mitochondria (Freeman et al. 2006; Huang et al. 2006). Although, other mechanisms involved in DNA damage have been implicated in other high-altitude species (Subramani et al. 2015; Yang et al. 2015; Qiao et al. 2016; Zhang et al. 2016), it has been in response to ultraviolet radiation.

Autosomal Genes Associated with High-Altitude Adaptation: IGF, BMP, HBA2, and HBB

One of the biggest selective pressures for species invading high-altitude environments is the low levels of O2 in the atmosphere causing hypoxia (Storz et al. 2010). Hypoxia triggers a conservation of energy through a global reduction in protein expression, as well as a switch to anaerobic metabolism via a family of transcription factors involved in the HIF pathway (Semenza 2007). This pathway is widely considered the “master regulator” of oxygen sensing, and because of its central role in mediating a system-wide response to low O2, is frequently found to be of importance in high-altitude species (Bigham et al. 2009, 2013; Beall et al. 2010; Hanaoka et al. 2012; Li et al. 2014; Wang et al. 2014). Although our study did not identify members of the HIF pathway specifically, we did find that the insulin-like signaling pathway (IGF-1, IGFBP-5, MAPK) and skeletal morphogenesis (BMP2) are likely under positive selection between these two populations. Not only have both the IGF and BMP pathways been shown to interact with each other (Nakae et al. 2001; Guntur and Rosen 2013) but they are also involved in the system-wide response to hypoxia mediated via the HIF pathway (Feldser et al. 1999; Fukuda et al. 2002; Wang et al. 2007). Both IGF signaling (Li et al. 2013a; Welch et al. 2014; Berg et al. 2015) and bone morphogenesis (Qu et al. 2013; Yang et al. 2016) have been implicated in the acclimation to various environmental changes, including adaptations to low O2. Additionally, both the IGF and BMP pathways are known to interact with the p53 circuit (Harris and Levine 2005), suggesting the potential for multiple outliers in interconnected pathways having an effect on high-altitude adaptation in the speckled teal. Hemoglobin has frequently been of interest in relationship to adaptive responses to low O2 (Weber 2007; Storz and Moriyama 2008; Storz et al. 2009; Beall et al. 2010; Grispo et al. 2012; Projecto-Garcia et al. 2013; Revsbech et al. 2013; Cheviron et al. 2014; Tufts et al. 2015), and has been shown to be under selection in a variety of organisms, including Andean ducks (McCracken et al. 2009a, 2009b, 2009c, 2010). Our study was able to identify eight significant SNP variants between low- and high-altitude populations in the HBA2 and HBB subunits; of those genotypic variants identified, two have been experimentally characterized (βA116-Ser and βA133-Met) as causing an increase in the HbA isoform’s O2-binding affinity (McCracken et al. 2009a, 2009c) in high-altitude speckled teal (Natarajan et al. 2015). In addition to selective pressures on the hemoglobin subunits, another candidate for adaptation to high altitude is the blood coagulation factor VIII or F8, as found in our outlier analyses. This gene encodes for a large plasma glycoprotein, most notably responsible for hemophilia in humans (Gouw et al. 2012). However, it has not been identified as a candidate in other high-altitude organism studies, even though it is well established that plasma concentration of F8, and other clotting factors, are elevated in humans encountering hypoxic situations (O’Brodovich et al. 1984; Chohan 2014).

Using RAD-Seq in Outlier Analyses of the Speckled Teal

We acknowledge that our data set represents a small percentage (∼1%) of the speckled teal genome, and that there are likely many other candidate gene regions for high-altitude adaptation for this species (Lowry et al. 2017a), including genes involved in the HIF pathway, as well as mitochondrial elements (Scheinfeldt and Tishkoff 2010; Storz et al. 2010; Beall et al. 2012). Still, the incomplete nature of RAD-Seq data does not necessarily detract from the results gathered for several reasons, including the limitations associated with sequencing in nonmodel organisms, and those with large or complex genomes, as well as the fact that it is technically impossible to identify all loci subjected to selection regardless of methodology (McKinney et al. 2017). Thus, RAD-Seq still remains a powerful and efficient approach for studying selection in natural populations (Andrews et al. 2016; Catchen et al. 2017), as long as the potential limitations of such data are acknowledged, including concerns with physical linkage (Lowry et al. 2017b). It is important to note that the patterns we observed in autosomal loci (in addition to some of the putative outlier genes) might be the result of genetic hitchhiking, or genetic draft, arising from linkage to genes or regions on the same chromosome that are the “actual” targets of selection. However, in birds, linkage disequilibrium (LD) decays quickly even over short distances, reflecting high recombination rates (Wong et al. 2004; Backström et al. 2006; Balakrishnan and Edwards 2009; Stapley et al. 2010); therefore, this possibility seems less likely. Ultimately, we are careful to suggest that these outliers are not singularly causative, but instead represent just a sampling of the genes possibly under selection in high-altitude/hypoxic environments. In addition, the outlier loci identified in this study are largely not without precedent, and have been previously characterized as outliers based on prior whole-genome analyses or functional assessments in other organisms. The other loci found as candidates (those without asterisks; table 2) possibly represent those with closer potential linkage to loci not captured in the current data set. Therefore, the patterns seen in this study are likely representative of the genome as a whole, and represent a springboard for future projects in this species looking to elucidate the genetic architecture associated with selection pressures resulting from O2 deprivation.

Conclusions

Our analyses suggest that adaptations to high-altitude environments are resulting in genomic divergence, despite longstanding and recurrent gene flow between populations of speckled teal. This has created a genome-wide signature in which patterns of migration-selection balance are prevalent across various portions of the genome. We have identified a set of loci putatively under selection with differences in allele frequencies strongly correlated with high- and low-altitude habitats—most notably those involved in the insulin-like signaling pathway, bone morphogenesis, metabolic processes through the mitochondria (oxidative phosphorylation), responders to hypoxia-induced DNA damage, and feedback loops to the HIF pathway. Although gene flow was found on all linkage groups (mitochondrial, autosomal, and Z chromosome), many outlier loci in the nuclear genome were found to have depressed gene flow estimates, compared with other loci. We also discussed Z-linked loci and their role in the population differentiation of incipient diverging species; our data suggest that Z-linked loci may be simultaneously under selection due to their mechanistic role in high-altitude adaptation as well as phenotypic divergence. Together the data identify a proportion of the genome, known to be linked to previously studied examples of high-altitude adaption, that are likely under positive directional selection in the high-altitude speckled teal population. Overall, our results suggest a multifactorial response to life at high altitudes through an array of interconnected pathways, that are not only under positive selection but whose genetic components seem to be providing at least a partial genomic barrier to gene flow and continued interbreeding, functioning as an avenue for population divergence and speciation, even if the speciation process has stalled short of completion (Peters et al. 2012). Ultimately, this study illustrates another example of how populations are able to invade novel, and sometimes adapt to extreme habitats, and provides the most comprehensive genomic study of this Andean species.

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online.

Author contributions

A.M.G. and K.G.M. designed the study; V.M.F., A.J.G., and K.G.M. provided funding; K.G.M., and R.E.W. performed the research/generated the data; A.M.G. and P.L. analyzed the data and wrote the manuscript. All authors commented on the manuscript. Click here for additional data file.

143 in total

1. Regulatory changes contribute to the adaptive enhancement of thermogenic capacity in high-altitude deer mice.

Authors: Zachary A Cheviron; Gwendolyn C Bachman; Alex D Connaty; Grant B McClelland; Jay F Storz
Journal: Proc Natl Acad Sci U S A Date: 2012-05-14 Impact factor: 11.205

Review 2. Ecological genomics of local adaptation.

Authors: Outi Savolainen; Martin Lascoux; Juha Merilä
Journal: Nat Rev Genet Date: 2013-11 Impact factor: 53.242

Review 3. Mitochondrial DNA under siege in avian phylogeography.

Authors: Robert M Zink; George F Barrowclough
Journal: Mol Ecol Date: 2008-04-03 Impact factor: 6.185

4. Molecular evolution of cytochrome C oxidase underlies high-altitude adaptation in the bar-headed goose.

Authors: Graham R Scott; Patricia M Schulte; Stuart Egginton; Angela L M Scott; Jeffrey G Richards; William K Milsom
Journal: Mol Biol Evol Date: 2010-08-04 Impact factor: 16.240

5. On the effective size of populations with separate sexes, with particular reference to sex-linked genes.

Authors: A Caballero
Journal: Genetics Date: 1995-02 Impact factor: 4.562

6. Topors acts as a SUMO-1 E3 ligase for p53 in vitro and in vivo.

Authors: Stefan Weger; Eva Hammer; Regine Heilbronn
Journal: FEBS Lett Date: 2005-09-12 Impact factor: 4.124

7. Topors functions as an E3 ubiquitin ligase with specific E2 enzymes and ubiquitinates p53.

Authors: Rajeev Rajendra; Diptee Malegaonkar; Pooja Pungaliya; Henderson Marshall; Zeshaan Rasheed; James Brownell; Leroy F Liu; Stuart Lutzker; Ahamed Saleem; Eric H Rubin
Journal: J Biol Chem Date: 2004-07-09 Impact factor: 5.157

8. Parallel evolution in the major haemoglobin genes of eight species of Andean waterfowl.

Authors: K G McCracken; C P Barger; M Bulgarella; K P Johnson; S A Sonsthagen; J Trucco; T H Valqui; R E Wilson; K Winker; M D Sorenson
Journal: Mol Ecol Date: 2009-09-15 Impact factor: 6.185

9. Genomic analyses identify distinct patterns of selection in domesticated pigs and Tibetan wild boars.

Authors: Mingzhou Li; Shilin Tian; Long Jin; Guangyu Zhou; Ying Li; Yuan Zhang; Tao Wang; Carol K L Yeung; Lei Chen; Jideng Ma; Jinbo Zhang; Anan Jiang; Ji Li; Chaowei Zhou; Jie Zhang; Yingkai Liu; Xiaoqing Sun; Hongwei Zhao; Zexiong Niu; Pinger Lou; Lingjin Xian; Xiaoyong Shen; Shaoqing Liu; Shunhua Zhang; Mingwang Zhang; Li Zhu; Surong Shuai; Lin Bai; Guoqing Tang; Haifeng Liu; Yanzhi Jiang; Miaomiao Mai; Jian Xiao; Xun Wang; Qi Zhou; Zhiquan Wang; Paul Stothard; Ming Xue; Xiaolian Gao; Zonggang Luo; Yiren Gu; Hongmei Zhu; Xiaoxiang Hu; Yaofeng Zhao; Graham S Plastow; Jinyong Wang; Zhi Jiang; Kui Li; Ning Li; Xuewei Li; Ruiqiang Li
Journal: Nat Genet Date: 2013-10-27 Impact factor: 38.330

10. Comparative genomic data of the Avian Phylogenomics Project.

Authors: Guojie Zhang; Bo Li; Cai Li; M Thomas P Gilbert; Erich D Jarvis; Jun Wang
Journal: Gigascience Date: 2014-12-11 Impact factor: 6.524

8 in total

1. Pervasive Genomic Signatures of Local Adaptation to Altitude Across Highland Specialist Andean Hummingbird Populations.

Authors: Marisa C W Lim; Ke Bi; Christopher C Witt; Catherine H Graham; Liliana M Dávalos
Journal: J Hered Date: 2021-05-24 Impact factor: 2.645

2. Adaptive introgression of the beta-globin cluster in two Andean waterfowl.

Authors: Allie M Graham; Jeffrey L Peters; Robert E Wilson; Violeta Muñoz-Fuentes; Andy J Green; Daniel A Dorfsman; Thomas H Valqui; Kevin Winker; Kevin G McCracken
Journal: Heredity (Edinb) Date: 2021-04-26 Impact factor: 3.832

3. Genetic variation in PTPN1 contributes to metabolic adaptation to high-altitude hypoxia in Tibetan migratory locusts.

Authors: Ding Ding; Guangjian Liu; Li Hou; Wanying Gui; Bing Chen; Le Kang
Journal: Nat Commun Date: 2018-11-26 Impact factor: 14.919

4. Divergent Fine-Scale Recombination Landscapes between a Freshwater and Marine Population of Threespine Stickleback Fish.

Authors: Alice F Shanfelter; Sophie L Archambeault; Michael A White
Journal: Genome Biol Evol Date: 2019-06-01 Impact factor: 3.416

5. Old divergence and restricted gene flow between torrent duck (Merganetta armata) subspecies in the Central and Southern Andes.

Authors: Luis Alza; Philip Lavretsky; Jeffrey L Peters; Gerardo Cerón; Matthew Smith; Cecilia Kopuchian; Andrea Astie; Kevin G McCracken
Journal: Ecol Evol Date: 2019-08-15 Impact factor: 2.912

6. Convergent evolution on the hypoxia-inducible factor (HIF) pathway genes EGLN1 and EPAS1 in high-altitude ducks.

Authors: Allie M Graham; Kevin G McCracken
Journal: Heredity (Edinb) Date: 2019-01-10 Impact factor: 3.821

7. Convergent changes in muscle metabolism depend on duration of high-altitude ancestry across Andean waterfowl.

Authors: Neal J Dawson; Luis Alza; Gabriele Nandal; Graham R Scott; Kevin G McCracken
Journal: Elife Date: 2020-07-30 Impact factor: 8.140

8. Genomic Adaptations to Salinity Resist Gene Flow in the Evolution of Floridian Watersnakes.

Authors: Rhett M Rautsaw; Tristan D Schramer; Rachel Acuña; Lindsay N Arick; Mark DiMeo; Kathryn P Mercier; Michael Schrum; Andrew J Mason; Mark J Margres; Jason L Strickland; Christopher L Parkinson
Journal: Mol Biol Evol Date: 2021-03-09 Impact factor: 16.240

8 in total