Phylogenomics of phrynosomatid lizards: conflicting signals from sequence capture versus restriction site associated DNA sequencing.

Adam D Leaché, Andreas S Chavez, Leonard N Jones, Jared A Grummer, Andrew D Gottscho, Charles W Linkem.

Abstract

Sequence capture and restriction site associated DNA sequencing (RADseq) are popular methods for obtaining large numbers of loci for phylogenetic analysis. These methods are typically used to collect data at different evolutionary timescales; sequence capture is primarily used for obtaining conserved loci, whereas RADseq is designed for discovering single nucleotide polymorphisms (SNPs) suitable for population genetic or phylogeographic analyses. Phylogenetic questions that span both "recent" and "deep" timescales could benefit from either type of data, but studies that directly compare the two approaches are lacking. We compared phylogenies estimated from sequence capture and double digest RADseq (ddRADseq) data for North American phrynosomatid lizards, a species-rich and diverse group containing nine genera that began diversifying approximately 55 Ma. Sequence capture resulted in 584 loci that provided a consistent and strong phylogeny using concatenation and species tree inference. However, the phylogeny estimated from the ddRADseq data was sensitive to the bioinformatics steps used for determining homology, detecting paralogs, and filtering missing data. The topological conflicts among the SNP trees were not restricted to any particular timescale, but instead were associated with short internal branches. Species tree analysis of the largest SNP assembly, which also included the most missing data, supported a topology that matched the sequence capture tree. This preferred phylogeny provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus, suggesting that the earless morphology either evolved twice or evolved once and was subsequently lost in Callisaurus.

Keywords: RADseq; coalescence; ddRADseq; incomplete lineage sorting; single nucleotide polymorphism; species tree; ultraconserved elements

Substances: DNA Restriction Enzymes

Year: 2015; PMID: 25663487; PMCID: PMC5322549; DOI: 10.1093/gbe/evv026

Source DB: PubMed; Journal: Genome Biol Evol; ISSN: 1759-6653; Impact factor: 3.416

Introduction

New methods for obtaining comparative genomics data are transforming phylogenetic studies of nonmodel organisms. Sequence capture and restriction site associated DNA sequencing (RADseq) are emerging as two of the most useful reduced-representation genome sequencing methods for phylogenetic and population-level studies. Sequence capture methods use short probes (60–120 nt) to hybridize to specific genomic regions that are subsequently sequenced, and therefore these methods require some advanced level of knowledge of the genomes under investigation (Gnirke et al. 2009; Mamanova et al. 2010). Sequence capture has been applied to a variety of studies aiming to resolve phylogenetic relationships at relatively “deep” evolutionary timescales, including mammals (McCormack et al. 2012), birds (McCormack et al. 2013), turtles and archosaurs (Crawford et al. 2012), fishes (Li et al. 2013), and squamates (Leaché et al. 2014; Pyron et al. 2014). RADseq methods (Baird et al. 2008) rely on restriction enzyme digestion of genomic DNA followed by the subsequent size-selection and sequencing of fragments that are of a certain size range (Miller et al. 2007; Puritz et al. 2014). The approach requires limited to no previous knowledge of the genome, which has made it a popular choice for studying recent speciation in organisms that lack existing genomic resources, including mosquitos (Emerson et al. 2010), plants (Eaton and Ree 2013), cichlids (Wagner et al. 2013), and beetles (Cruaud et al. 2014). Sequence capture and RADseq data have great utility for phylogenetic investigations at different evolutionary timescales, yet the boundary separating the utility of each approach is unclear. Sequence capture using ultraconserved elements (UCEs) was originally described as an approach for resolving deep phylogenies (Faircloth et al. 2012); however, recently it has been shown to be useful for phylogeographic studies (Smith et al. 2014). 
Likewise, the application of RADseq methods has been extended from shallow timescales to divergences dating back to 50–60 Ma (Rubin et al. 2012; Cariou et al. 2013). Whether the two approaches provide similar results (i.e., congruent phylogenetic trees) for relationships across any particular timescale is unknown, because both data types have not been collected for the same study system (but see Harvey et al. 2013). The properties of the DNA sequence data alignments provided by the methods are quite different, which could result in different biases during phylogenetic analysis. For example, sequence capture provides relatively long loci (hundreds to thousands of nucleotides) with little missing data, whereas RADseq has the potential to recover thousands of short loci (50–150 nt, depending on sequencing effort), with large amounts of missing data resulting from allelic dropout (Arnold et al. 2013). Resolving difficult phylogenetic problems such as rapid speciation events requires sampling hundreds or thousands of loci (Liu and Edwards 2009), but whether the increased number of loci offered by RADseq methods is offset by the short length of the loci and missing data has not been explored.

The iguanian lizard family Phrynosomatidae is composed of 9 genera and 148 species and is therefore the most diverse and species-rich family of lizards in North America (Uetz 2014). This family is distributed broadly across North and Central America from southern Canada to Panama, and most diversity is centered in arid regions of the American Southwest and Mexico. The broad distribution and high species diversity of phrynosomatid lizards have made them an important focal group for comparative studies in ecology and evolutionary biology (e.g., Sinervo and Lively 1996; Lambert and Wiens 2013; Wiens et al. 2013). However, despite numerous phylogenetic studies, the relationships among the nine genera have been difficult to resolve. 
The relationships among the sand lizard genera Cophosaurus, Callisaurus, Holbrookia, and Uma are unclear, and previous studies based on morphology (de Queiroz 1989), allozymes (de Queiroz 1992), and mitochondrial DNA (mtDNA; Reeder 1995; Reeder and Wiens 1996; Wilgenbusch and de Queiroz 2000; Leaché and McGuire 2006; Wiens et al. 2010) have produced conflicting results. Identifying the order of divergence events within the sand lizards, and whether or not the two “earless” genera with concealed tympanic membranes (Cophosaurus and Holbrookia) form a clade are the two main questions that remain unanswered. Recent phylogenetic studies utilizing mitochondrial and nuclear genes converge on a common topology for these genera and support both Uma as sister to the other sand lizards and monophyly of the earless lizards (Wiens et al. 2010, 2013). The relationships among the sceloporines (Petrosaurus, Sceloporus, Urosaurus, and Uta) have been difficult to resolve due to rapid and successive speciation. These studies support a clade containing Urosaurus and Sceloporus (Wiens et al. 2010, 2013). However, determining whether Petrosaurus or Uta is the sister group to other sceloporines has remained uncertain (Wiens et al. 2010). Analyses based on concatenating independent loci differ from coalescent-based species trees, which indicates that gene tree conflict from incomplete lineage sorting could be affecting this part of the phrynosomatid tree. In this study, we use new molecular data collected using sequence capture and double digest RADseq (ddRADseq; Peterson et al. 2012) to estimate the phylogenetic relationships among phrynosomatid lizard genera. We estimate phylogenetic trees for the sequence capture data using concatenation and coalescent-based species tree inference techniques, and we examine the genome-wide support for competing phylogenetic hypotheses for phrynosomatid lizards. 
The ddRADseq data are assembled using a variety of thresholds that govern the homology, paralogy, and levels of missing data. The phylogenetic trees estimated from the ddRADseq data assemblies are compared against each other and to the sequence capture data.

Materials and Methods

Sampling

We sampled one species from each of the nine genera of the Phrynosomatidae (table 1), including Callisaurus draconoides, Cophosaurus texanus, Holbrookia maculata, Petrosaurus thalassinus, Phrynosoma sherbrookei, Sceloporus occidentalis, Uma notata, Urosaurus ornatus, and Uta stansburiana. Two additional species, Gambelia wislizenii and Liolaemus darwinii, were included as outgroups for the sequence capture experiment, and G. wislizenii was included in the ddRADseq protocol for the same purpose. DNA was extracted from tissues using a NaCl extraction method (MacManes 2013) or a Qiagen DNeasy kit.

Table 1

Species Included in the Analysis and an Overview of the Sequence Capture Data

Species	Voucher	Raw Reads	Clean Reads	Nuclear Loci Captured^a	Nuclear Loci k-mer Depth^b	mtDNA (bp)^c	mtDNA k-mer Depth^d
Phrynosomatidae
Callisaurus draconoides	MVZ 265543	9,622,116	9,035,068	575	23,280	13,106	1,502,772
Cophosaurus texanus	UWBM 7347	9,176,180	8,625,204	573	24,401	15,609	2,482,706
Holbrookia maculata	UWBM 7362	12,314,136	11,604,340	573	31,000	12,865	1,307,531
Petrosaurus thalassinus	MVZ 161183	4,500,868	3,959,796	523	8,281	7,898	248,342
Phrynosoma sherbrookei	MZFC 28101	7,634,142	6,971,920	579	14,107	12,967	47,287
Sceloporus occidentalis	UWBM 6281	13,531,214	12,733,646	540	30,235	7,422	113,757
Uma notata	SDSNH 76166	2,332,400	2,099,068	577	4,232	7,296	20,763
Urosaurus ornatus	UWBM 7587	3,427,288	3,042,766	577	6,673	6,286	28,028
Uta stansburiana	UWBM 7605	12,927,696	12,085,734	538	25,034	16,703	1,144,368
Outgroups
Gambelia wislizenii	UWBM 7353	9,874,902	7,824,714	549	5,180	15,790	581,925
Liolaemus darwinii	LJAMM-CNP 14634	3,253,800	2,935,874	581	8,715	11,751	41,572

^a Total loci targeted = 585.

^b Average number of 90-bp k-mers across all captured loci.

^c Total base pairs; aligned length = 17,187 bp.

^d Number of 90-bp k-mers.

Sequence Capture Data Collection

To obtain a large collection of homologous loci from throughout the genome, we designed a set of RNA probes specific for iguanian lizards. The probes are a subset of the 5,472 UCE probes published by Faircloth et al. (2012) with ≥99% sequence similarity to published genomes for Anolis carolinensis (Alföldi et al. 2011) and S. occidentalis (Genomic Resources Development Consortium et al. 2015). We excluded loci that were within 100 kb of one another to reduce any chance of linkage. We identified 541 UCE loci that matched both published genomes, and we tiled two 120-bp probes for each locus that overlapped by 60 bp. We included probes for 44 additional genes used in the squamate Tree of Life project (Wiens et al. 2012). These loci were included to increase the overlap between our new data and existing genetic resources for squamate reptiles. In total, we synthesized 1,170 custom probes (targeting 585 loci) using the MYbaits target enrichment kit (MYcroarray Inc., Ann Arbor, MI). Genomic DNA (400 ng) was sonicated to a target peak of 400 bp using a Bioruptor Pico (Diagenode Inc.). Genomic libraries were prepared using an Illumina Truseq Nano library preparation kit. The samples were hybridized to the RNA probes in the presence of a blocking mixture composed of forward and reverse complements of the Illumina Truseq Nano Adapters, with inosines in place of the indices, as well as chicken blocking mix (Chicken Hybloc, Applied Genetics Lab Inc.) to reduce repetitive DNA binding to beads. Libraries were incubated with the RNA probes for 24 h at 65 °C. Post-hybridized libraries were enriched using Truseq adapter primers with Phusion Taq polymerase (New England Biolabs Inc.) for 20 cycles. Enriched libraries were cleaned with AMPure XP beads. We quantified enriched libraries using quantitative polymerase chain reaction (qPCR) (Applied Biosystems Inc.) with primers targeting five loci mapping to different chromosomes in the Anolis genome. 
Library quality was verified using an Agilent TapeStation 2200 (Agilent Tech.). These samples were pooled in equimolar ratios and sequenced using an Illumina HiSeq2000 (100-bp, paired-end reads) at the QB3 facility at UC Berkeley.
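
The probe tiling arithmetic described above can be checked with a short sketch. The function name `tile_probes` and the choice to start tiling at coordinate 0 are illustrative assumptions; the actual probe coordinates are not given in the text.

```python
def tile_probes(target_start, probe_len=120, overlap=60, n_probes=2):
    """Return half-open (start, end) intervals for probes tiled across a target."""
    step = probe_len - overlap  # each probe is shifted by its non-overlapping part
    return [
        (target_start + i * step, target_start + i * step + probe_len)
        for i in range(n_probes)
    ]


probes = tile_probes(0)                    # hypothetical locus starting at 0
covered = probes[-1][1] - probes[0][0]     # span covered by the tiled pair
n_loci = 541 + 44                          # UCE loci plus Tree of Life genes
total_probes = n_loci * len(probes)

print(probes, covered, n_loci, total_probes)
```

Under these assumptions, two 120-bp probes overlapping by 60 bp span 180 bp of each target, and 585 loci with two probes each give the 1,170 probes reported.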

Sequence Capture Bioinformatics

The raw DNA sequences were processed using Casava (Illumina), which demultiplexes the sequencing run based on sequence tags. The program Trimmomatic (Bolger et al. 2014) was used to remove low-quality reads, trim low-quality ends, and remove adapter sequences. The cleaned paired-reads were organized by individual and then assembled with the de novo assembler IDBA (Peng et al. 2010). We ran IDBA iteratively over k-mer values from 50 to 90 with a step length of 10. We used phyluce (Faircloth et al. 2012) to assemble loci across species. We started by aligning species-specific assemblies to the probe sequences using the program LASTZ (available from http://www.bx.psu.edu/miller_lab/ last accessed February 20, 2015). After creating an SQL relational database of assembly-to-probe matches for each species, we queried the database for loci that were shared for a minimum of three species across all samples, and for those that were present across all species. We performed multiple sequence alignments for each locus using MAFFT (Katoh and Standley 2013), and long ragged-ends were trimmed to reduce missing or incomplete data. We authenticated the identity of each sample by aligning our new data for one of the protein-coding nuclear genes (PRLR) with data published by Wiens et al. (2010). This is an important step when using exemplar sampling to verify the identity of each sample. We conducted a multiple sequence alignment with MAFFT, and performed a maximum likelihood (ML) analysis using RAxML v8.0.2 (Stamatakis 2014) with 100 bootstrap replicates under the GTRGAMMA model. As expected, the phrynosomatid lizards in our study each formed a clade with their proper genus (results not shown).
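
The two occupancy queries described above (loci shared by at least three species, and loci present in all species) amount to a simple filter. The sketch below uses an in-memory dictionary rather than the SQL database built by phyluce; the locus names and the toy `matches` data are hypothetical.

```python
def filter_by_occupancy(locus_to_species, min_species):
    """Return sorted locus names recovered in at least `min_species` species."""
    return sorted(
        locus for locus, species in locus_to_species.items()
        if len(species) >= min_species
    )


# Toy assembly-to-probe matches (hypothetical; not data from the study).
all_species = {"Uma", "Uta", "Sceloporus", "Gambelia"}
matches = {
    "locus-1": {"Uma", "Uta", "Sceloporus", "Gambelia"},
    "locus-2": {"Uma", "Uta", "Sceloporus"},
    "locus-3": {"Uma", "Gambelia"},
}

shared_min3 = filter_by_occupancy(matches, 3)
complete = filter_by_occupancy(matches, len(all_species))
print(shared_min3, complete)
```

In the study, the first kind of query corresponds to the incomplete matrix and the second to the complete matrix.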

Sequence Capture Phylogenetic Analysis

ML phylogenetic analyses were conducted using RAxML v8.0.2 (Stamatakis 2014) with the GTRGAMMA model. We estimated gene trees for each locus separately, and also conducted an analysis of the concatenated data. Branch support was estimated using the automatic bootstrap function, which calculates a stopping rule to determine when sufficient replicates have been generated (Pattengale et al. 2010). The individual sequence capture ML trees were filtered in PAUP* v.4b10 (Swofford 2003) to calculate the number of loci that supported particular topological arrangements for phrynosomatid lizards found by previous studies using morphology, allozymes, mtDNA, or nuclear loci. The concatenated data were also analyzed using Bayesian inference (BI) with MrBayes v3.2 (Ronquist et al. 2012). The MrBayes analysis was run for 2 million generations with two independent runs (each with four chains), sampling every 1,000 generations. Summaries of the posterior distribution excluded the first 25% of samples as burn-in. We also conducted phylogenetic analyses of mtDNA genome data using ML and BI (as described above). The mtDNA genomes are present in high copy number during library preparation, and fragments of this locus are sequenced as “by-catch” along with the nuclear loci. All trees were rooted with G. wislizenii. We estimated divergence times for the concatenated sequence capture data using BEAST v1.8.1 (Drummond et al. 2012). We repeated the analysis for the mtDNA data to obtain a time-calibrated gene tree for this locus. We used marginal likelihood estimation (Baele et al. 2013) to compare a strict clock to the uncorrelated lognormal relaxed clock. Marginal likelihoods were estimated using path sampling and stepping-stone analyses (Baele et al. 2012), both with 100 sampling steps with 100,000 generations for each step. The strict clock was rejected for the sequence capture data (2 × loge Bayes Factor = 872) and for the mtDNA data (2 × loge Bayes Factor = 34). 
All analyses used an uncorrelated lognormal relaxed clock, Yule tree prior, and an HKY (Hasegawa–Kishino–Yano)+Γ model of nucleotide substitution. We applied one calibration point to obtain divergence times across the tree using the molecular dating results of previous studies that included up to four fossil calibrations (Wiens et al. 2013). We assumed that the crown group age for phrynosomatid lizards was on average 55 Ma (normal distribution, mean = 55, SD = 4), resulting in a 95% highest probability density ranging from 48.4 to 61.6 Ma. Two replicate analyses of 40 million generations each were run (2 million for the mtDNA), sampling every 4,000 steps (1,000 for the mtDNA), and discarding the first 25% prior to combining the results using LogCombiner v1.8. We calculated a maximum clade credibility tree using TreeAnnotator v1.8. We estimated a species tree using MPEST v1.4 (Liu et al. 2010). This method estimates a coalescent species tree using the gene tree topology for each locus as the starting input. Using gene tree topologies instead of DNA sequences decreases the computation time of estimating a species tree and makes the approach advantageous for large phylogenomic data sets. However, the method does not account for gene tree estimation error, and this can reduce the accuracy of the species tree. We used the best ML gene tree estimated for each locus as the input for MPEST. To obtain support measures on the species tree, we ran MPEST 100 times using each of the 100 ML bootstrap trees obtained for each locus. The support measures were obtained by calculating an extended majority-rule consensus tree for the 100 species trees estimated by MPEST. The resulting taxon bipartitions measure the percentage of times that each bipartition occurred across the 100 species trees. We also estimated a species tree for the sequence capture data using BP&P v3 (Rannala and Yang 2003; Yang and Rannala 2014). 
This method estimates a species tree using the multispecies coalescent model directly from the DNA sequence alignments while accounting for incomplete lineage sorting due to ancestral polymorphism. This full-Bayesian procedure accommodates uncertainty in gene tree estimation during species tree estimation and provides posterior probability values for species relationships. The method assumes the Jukes–Cantor model for the substitution process, with no rate variation across sites within a locus. Prior distributions are required for the population sizes and the age of the root of the tree in units of expected substitutions. A gamma prior G(2, 1,000), with mean 2/2,000 = 0.001, was used for the population size parameters. The age of the root in the species tree was assigned the gamma prior G(2, 100). After an initial burn-in of 1,000 steps we ran the analysis for 1 million generations, sampling every 100 steps. The analysis was repeated four times with random starting seeds to confirm adequate mixing and consistent results. We also estimated a species tree using SVDquartets (Chifman and Kubatko 2014). This method infers the topology among randomly sampled quartets of species using a coalescent model, and then a quartet method is used to assemble the randomly sampled quartets into a species tree. We randomly sampled 10,000 quartets from the data matrix, and used the program Quartet MaxCut v.2.1.0 (Snir and Rao 2012) to infer a species tree from the sampled quartets. We measured uncertainty in relationships using nonparametric bootstrapping with 100 replicates. The bootstrap values were mapped to the species tree estimated from the original data matrix using SumTrees v.3.3.1 (Sukumaran and Holder 2010).
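
The support values obtained from replicate species trees are, in essence, the percentage of replicates that contain a given grouping. A minimal sketch of that counting step follows, assuming each replicate tree has already been reduced to a set of clades (each clade a `frozenset` of taxon names). The function `clade_support` and the four toy replicates are illustrative only; they are not output from MPEST or SVDquartets.

```python
from collections import Counter


def clade_support(replicate_trees):
    """Percentage of replicate trees that contain each clade."""
    counts = Counter(
        clade for tree in replicate_trees for clade in set(tree)
    )
    n = len(replicate_trees)
    return {clade: 100.0 * c / n for clade, c in counts.items()}


# Toy replicates (hypothetical): three contain Sceloporus + Urosaurus,
# one contains Sceloporus + Uta instead.
su = frozenset({"Sceloporus", "Urosaurus"})
st = frozenset({"Sceloporus", "Uta"})
replicates = [{su}, {su}, {su}, {st}]

support = clade_support(replicates)
print(support[su], support[st])
```

A consensus program performs the same tally over all bipartitions before building the extended majority-rule tree.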

ddRADseq Data Collection

We collected ddRADseq data following the protocol described by Peterson et al. (2012). We double-digested 500 ng of genomic DNA for each sample with 20 units each of a rare cutter SbfI (restriction site 5′-CCTGCAGG-3′) and a common cutter MspI (restriction site 5′-CCGG-3′) in a single reaction with the manufacturer recommended buffer (New England Biolabs) for 4 h at 37 °C. Fragments were purified with Agencourt AMPure beads before ligation of barcoded Illumina adaptors onto the fragments. The oligonucleotide sequences used for barcoding and adding Illumina indexes during library preparation are provided in Peterson et al. (2012). The libraries were size-selected (between 415 and 515 bp after accounting for adapter length) on a Pippin Prep size fractionator (Sage Science). Precise size selection is critical with ddRADseq, because it minimizes variation in fragment size-based locus selection among libraries and increases the likelihood of obtaining homologous loci across samples (Puritz et al. 2014). The final library amplification used proofreading Taq and Illumina’s indexed primers. The fragment size distribution and concentration of each pool were determined on an Agilent 2200 TapeStation or 2100 Bioanalyzer, and qPCR was performed to determine sequenceable library concentrations before multiplexing equimolar amounts of each pool for sequencing on a single Illumina HiSeq 2500 lane (50-bp, single-end reads; pooled with 60 other samples) at the QB3 facility at UC Berkeley.

ddRADseq Bioinformatics

We processed raw Illumina reads using the program pyRAD v.2.17 (Eaton 2014). An advantage of pyRAD over other RADseq data set assembly tools such as Stacks (Catchen et al. 2013) is that it is designed to assemble data for phylogenetic studies containing divergent species using global alignment clustering, which may include indel variation. We demultiplexed samples using their unique barcode and adapter sequences, and sites with Phred quality scores under 99% (Phred score = 20) were changed into “N” characters, and reads with ≥10% N’s were discarded. Each locus was reduced from 50 to 39 bp after the removal of the 6-bp restriction site overhang and the 5-bp barcode. The filtered reads for each sample were clustered using the program USEARCH v.6.0.307 (Edgar 2010), and then aligned with MUSCLE (Edgar 2004). This clustering step establishes homology among reads within a species. We assembled the ddRADseq data using three different clustering thresholds (clustering = 80%, 90%, and 95%) to determine the impact of this parameter on phylogeny inference. As an additional filtering step, consensus sequences were discarded that had low coverage (<6 reads), excessive undetermined or heterozygous sites (>3), or too many haplotypes (>2 for diploids). The consensus sequences were clustered across samples using the same three thresholds used to cluster data within species (80%, 90%, and 95%). This step establishes locus homology among species. Each locus was aligned with MUSCLE, and a filter was used to exclude potential paralogs. The paralog filter removes loci with excessive shared heterozygosity among samples. The justification for this filtering method is that shared heterozygous single nucleotide polymorphisms (SNPs) across species are more likely to represent a fixed difference among paralogs than shared heterozygosity within orthologs among species. 
We applied two paralog filter levels to determine the potential impact of paralog detection on phylogeny inference, including a strict filter that allowed no shared heterozygosity (paralog = 1), and a more relaxed filter that allowed a maximum of three species to be heterozygous at a given site (paralog = 3). The final ddRADseq loci were assembled by adjusting a minimum individual (min. ind.) value, which specifies the minimum number of individuals that are required to have data present at a locus in order for that locus to be included in the final matrix. Our ddRADseq data set contains ten species (nine phrynosomatid lizard genera and one outgroup), and setting min. ind. = 10 retains loci with data present for all ten species ( = 100% complete matrix). In contrast, setting min. ind. = 3 retains any locus with data present for three or more species. We compiled data matrices with min. ind. values ranging from 3 to 10 to study the sensitivity of missing data on phylogenetic analysis.
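
The read-level quality filter described above can be sketched as follows. A Phred score Q corresponds to an error probability of 10^(-Q/10), so Q = 20 means 99% base-call accuracy. The sketch assumes Phred+33 encoded quality strings, which is an assumption about the input format, and the function names are hypothetical; this is not pyRAD's own code.

```python
def mask_low_quality(seq, qual, min_q=20, offset=33):
    """Replace bases whose Phred score is below `min_q` with 'N'."""
    return "".join(
        base if ord(q) - offset >= min_q else "N"
        for base, q in zip(seq, qual)
    )


def keep_read(masked, max_n_fraction=0.10):
    """Keep a read only if its fraction of N's is below the threshold."""
    return masked.count("N") / len(masked) < max_n_fraction


error_at_q20 = 10 ** (-20 / 10)  # error probability at Phred 20

# Toy reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
short_read = mask_low_quality("ACGTACGTAC", "IIIII#IIII")
long_read = mask_low_quality("ACGTACGTACACGTACGTAC", "IIIII#IIIIIIIIIIIIII")

print(short_read, keep_read(short_read))  # 1 N in 10 bases, so 10% N's
print(long_read, keep_read(long_read))    # 1 N in 20 bases, so 5% N's
```

With one masked base, the 10-base toy read reaches the 10% threshold and is discarded, whereas the 20-base read is retained.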

ddRADseq Phylogenetic Analysis

We estimated phylogenetic trees for the concatenated ddRADseq data using RAxML with the GTRGAMMA model. We did not attempt to estimate gene trees for the individual RAD loci, because each locus was only 39 bp after removing the 5-bp barcode and 6-bp restriction enzyme recognition sequences. The data were concatenated and branch support was estimated with the automatic bootstrap function. We estimated phylogenetic trees using 36 combinations of assembly parameters, including 1) six different min. ind. values that modulated the amount of missing data tolerated at any given locus (min. ind. values ranged from 3 to 8; higher values produce too few loci for meaningful comparisons), 2) two paralog filter values (paralogs = 1, paralogs = 3), and 3) three locus clustering thresholds (80%, 90%, and 95%). Species trees were estimated from the ddRADseq data using SVDquartets. An advantage of this approach for analyses of ddRADseq data is that it seems to be able to handle large amounts of missing data. We randomly sampled 10,000 quartets from the data matrix, and used Quartet MaxCut to infer a species tree from the sampled quartets. We used nonparametric bootstrapping with 100 replicates to measure uncertainty in the tree. The bootstrap values were mapped to the species tree estimated from the original data matrix using SumTrees.
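
The 36 assembly combinations follow from crossing the three parameters listed above. The sketch below only enumerates the grid; the variable names are illustrative and do not correspond to pyRAD parameter names.

```python
from itertools import product

min_ind_values = range(3, 9)         # min. ind. = 3 through 8
paralog_filters = (1, 3)             # strict and relaxed paralog filters
clustering_thresholds = (80, 90, 95)  # percent similarity

grid = list(product(min_ind_values, paralog_filters, clustering_thresholds))

print(len(grid))   # 6 * 2 * 3 combinations
print(grid[0], grid[-1])
```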

Results

Sequence Capture

Of the 585 loci targeted by the probes, the sequence capture protocol resulted in 584 loci shared among a minimum of three species. A total of 471 loci were shared among all phrynosomatid and outgroup species included in the study. These 584 loci provided a total of 358,363 bp for phylogenetic analysis, and they varied in length from 284 to 1,054 bp (mean = 615 bp). On average, the loci contained 11.2% variation (parsimony informative and uninformative sites; min = 0.8%; max = 31.2%; table 2). The number of parsimony informative sites ranged from 0 to 70 (mean = 20). The mtDNA data alignment was 17,187 bp in length, and these data contained 3,773 parsimony informative characters (19.4% variation; table 2).

Table 2

Characteristics of the Sequence Capture Loci

Data	Length (bp)	Variation (%)	PI
Nuclear loci^a	615 (284–1,054)^b	11.2 (0.8–31.2)	20 (0–70)
Combined nuclear loci	358,363	11.2%	11,850
Mitochondrial DNA	17,187	19.4%	3,773

Note.—PI, parsimony-informative characters.

^a Loci captured for ≥3 species = 584.

^b Mean (min–max).

Phylogenetic analyses of the concatenated sequence capture loci using ML and BI (MrBayes and BEAST) provided strong support (ML bootstraps = 100%; posterior probabilities = 1.0) for a fully resolved phylogeny (fig. 1). Within the sceloporines, Sceloporus and Urosaurus are sister taxa, and Uta is sister to this clade, followed by Petrosaurus (fig. 1). The divergence time for the sceloporine crown group is 40.1 Ma (95% highest posterior density [HPD] = 33.2–46.9), and the subsequent times between speciation events leading to Uta and the Sceloporus + Urosaurus clade are short (1.7 and 3.7 Ma, respectively; fig. 1). These short divergence times are likely responsible for the difficulties that previous studies faced when trying to resolve this phylogeny with fewer loci. Within the Phrynosomatinae, Phrynosoma is the sister taxon to the remaining genera that form the sand lizards (i.e., Uma, Callisaurus, Cophosaurus, and Holbrookia) with a divergence time estimated at 38.2 Ma (95% HPD = 31.9–45.0 Ma). Within the sand lizards, Uma is sister to the remaining genera, followed by Cophosaurus. The clade containing Callisaurus and Holbrookia results in the paraphyly of the earless genera Holbrookia and Cophosaurus (fig. 1). The internal branch separating these three genera is short (2.7 Ma).

Figure 1

Phylogenomic relationships among phrynosomatid lizards estimated with sequence capture data using BEAST. Bars on nodes indicate the 95% HPD for divergence times. Analyses using concatenation (RAxML, MrBayes, BEAST; 584 or 471 loci) and coalescent methods (SVDquartets, MPEST, BP&P; 471 loci) support the same topology. Concatenation provides absolute support on each node (bootstrap = 100%; posterior probability = 1.0), whereas the coalescent methods provide lower support for three short internal branches. Numbers on nodes are support values from SVDquartets (top), MPEST (middle), and BP&P posterior probabilities (bottom). Photographs by C.W.L., J.A.G., and A.D.G.

The coalescent-based species tree analyses supported the same topology as the concatenated data analyses, although the support was not as decisive for the shorter internal branches of the tree. Only three branches were not supported by 100% of the replicate MPEST or SVDquartets analyses. First, the clade containing Sceloporus and Urosaurus was only recovered 89% of the time using MPEST. Second, the placement of Uta sister to the Sceloporus + Urosaurus clade received 99% bootstrap support from MPEST and 91% from SVDquartets. Third, the sister group relationship between Holbrookia and Callisaurus received 92% from MPEST and 99% from SVDquartets. The species tree analyses conducted with the Bayesian method BP&P provided posterior probabilities for relationships, and all relationships received a posterior probability of 1.0 with the exception of the clade containing Uta, Sceloporus, and Urosaurus (posterior probability = 0.54). We quantified the number of gene trees that supported the estimated and alternative phylogenetic relationships to gauge the level of gene tree discordance among the sequence capture data (table 3). The relationship of Callisaurus + Holbrookia was represented by 175 loci (37.2%), the highest proportion of the possible relationships. 
The primary alternative relationship that we tested was the monophyly of the earless lizard genera, Holbrookia + Cophosaurus. A total of 103 of the sequence capture loci (21.9% of all loci examined) supported this alternative topology (table 3). An alternative that was even more common among the gene trees was a clade containing Cophosaurus + Callisaurus (120 loci), an untraditional grouping that also renders the earless lizards paraphyletic. We also quantified the number of nuclear loci that supported the alternative groupings recovered by the mtDNA gene tree (fig. 2). For example, the mtDNA clade containing Sceloporus + Petrosaurus is supported by 55 nuclear loci, and the Urosaurus + Uta clade is supported by 74 loci. The phylogenetic signal in the mtDNA gene tree is present in some of the sequence capture loci, but at very low frequency (<20% of all loci examined).

Table 3

The Number of Nuclear Gene Trees Supporting Alternative Phrynosomatid Lizard Topologies

Clade	Number of Loci	Frequency (%)^a
Holbrookia + Callisaurus^b	175	37.2
Holbrookia + Callisaurus + Cophosaurus^b	340	72.2
Sand lizards^b	210	44.6
Sand lizards + Phrynosoma^b	319	67.7
Sceloporines^b	226	48.0
Sceloporus + Urosaurus + Uta^b	91	19.3
Sceloporus + Urosaurus^b	130	27.6
Cophosaurus + Callisaurus	120	25.5
Holbrookia + Cophosaurus^c	103	21.9
Urosaurus + Uta^d	74	15.7
Sceloporus + Uta	63	13.4
Sceloporus + Petrosaurus^d	55	11.7
Uma + Cophosaurus	19	4.0

^a Calculated from complete loci only (471 total).

^b Clade supported by the sequence capture data in figure 1.

^c Earless lizard clade.

^d Mitochondrial gene tree relationship.
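
The frequencies in table 3 follow directly from the locus counts and the 471 complete loci given in footnote a. As a minimal sketch (the variable and function names are illustrative and not part of the original analysis), the calculation for a subset of the clades can be reproduced as follows:

```python
# Locus counts copied from table 3; the denominator is the number of
# complete loci stated in footnote a. Names here are illustrative only.
TOTAL_COMPLETE_LOCI = 471

clade_counts = {
    "Holbrookia + Callisaurus": 175,
    "Cophosaurus + Callisaurus": 120,
    "Holbrookia + Cophosaurus": 103,
    "Urosaurus + Uta": 74,
    "Sceloporus + Petrosaurus": 55,
}


def frequency_percent(n_loci, total=TOTAL_COMPLETE_LOCI):
    """Percentage of gene trees supporting a clade, rounded to one decimal."""
    return round(100.0 * n_loci / total, 1)


frequencies = {clade: frequency_percent(n) for clade, n in clade_counts.items()}

for clade, pct in frequencies.items():
    print(f"{clade}: {pct}%")
```

The two mitochondrial gene tree relationships (Urosaurus + Uta and Sceloporus + Petrosaurus) both fall below 20%, in line with the statement in the text.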

Gene tree estimated from mtDNA data fragments. Bars on nodes indicate the 95% HPD for divergence times. Support values are shown on branches (BEAST/MrBayes/RAxML), and the overall completeness for the mtDNA genomes is shown on the tips.

Double Digest RADseq

The number of loci assembled for each species with the ddRADseq data scales with the sequence similarity threshold used to determine homology while clustering reads (table 4). Conservative clustering (e.g., 95% clustering vs. 80% clustering) produces more loci per species, but as a consequence the mean sequencing depth per locus is reduced (table 4). The characteristics of the ddRADseq data matrices assembled using different thresholds for among-sample clustering, paralog filtering, and sequence coverage are provided in table 5. Although we recovered thousands of ddRADseq loci for each sample (table 4), there are no shared loci recovered across all ten species (i.e., min. ind. = 10) using conservative clustering. Allowing one individual to have missing data at a locus (i.e., min. ind. = 9) only increases the total number of loci to 3, which demonstrates the difficulty in obtaining homologous loci using the ddRADseq approach for distantly related species (table 5). Setting min. ind. = 3 and relaxing the clustering threshold to 80% produce over 2,600 loci containing 16,002 or 15,725 SNPs depending on the paralog filter (table 5). Increasing the stringency on the min. ind. parameter provides fewer loci and reduces the amount of missing data in the final data matrix. The coverage values for the ddRADseq assemblies are high (table 4), indicating that sequencing effort is probably not the main contributor to the high levels of missing data that we observed. It seems more likely that allelic dropout due to mutations at restriction sites (or mutations causing changes in the size of loci) is responsible for the patterns of missing data that we observed.

Table 4

Summary of ddRADseq Data within Sample Clustering

Species		Clustering^a = 80%		Clustering = 90%		Clustering = 95%
Species	Reads^b	Loci^c	Depth^d	Loci	Depth	Loci	Depth
Callisaurus draconoides	1,883,604	10,723	43.4	12,449	36.9	13,100	17.8
Cophosaurus texanus	1,452,471	8,686	41.8	10,048	35.9	10,553	18.4
Holbrookia maculata	699,921	4,657	27.5	7,880	24.2	11,156	14.2
Petrosaurus thalassinus	2,590,961	11,929	51.9	14,168	46.2	14,868	20.3
Phrynosoma sherbrookei	814,375	6,043	31.6	7,257	26.9	7,692	14.9
Sceloporus occidentalis	1,404,985	6,852	52.8	5,368	45.0	5,561	20.0
Uma notata	806,846	3,751	40.3	4,698	35.9	5,298	25.3
Urosaurus ornatus	3,465,996	7,695	122.5	9,512	102.6	8,305	28.9
Uta stansburiana	4,818,547	9,177	119.5	11,878	96.5	14,058	29.7
Gambelia wislizenii	5,406,187	14,306	88.4	19,823	66.9	23,088	23.6

^a Threshold for clustering of reads within a species.

^b Raw read counts after sample demultiplexing.

^c Loci passing quality filters.

^d Mean sequencing depth.

Table 5

The Number of Loci (and SNPs) Obtained from Different Assemblies of the ddRADseq Data

	Minimum Individuals^a
	3	4	5	6	7	8	9	10
95% clustering^b, paralog = 1^c	1,079 (2,228)	375 (841)	173 (404)	72 (182)	27 (73)	9 (26)	3 (7)	0 (0)
95% clustering, paralog = 3	1,100 (2,282)	384 (860)	177 (413)	74 (186)	28 (76)	10 (29)	3 (7)	0 (0)
90% clustering, paralog = 1	1,826 (6,506)	674 (2,637)	306 (1,212)	154 (632)	68 (306)	28 (128)	7 (27)	1 (3)
90% clustering, paralog = 3	1,856 (6,655)	693 (2,733)	312 (1,244)	158 (655)	69 (313)	29 (135)	8 (34)	1 (3)
80% clustering, paralog = 1	2,629 (15,725)	1,057 (6,893)	478 (3,037)	227 (1,409)	109 (722)	50 (348)	13 (75)	2 (13)
80% clustering, paralog = 3	2,670 (16,002)	1,083 (7,079)	493 (3,155)	234 (1,458)	113 (752)	53 (371)	13 (75)	2 (13)

^a Minimum number of individuals (min. ind.) required to retain a locus in the final alignment (out of ten sequences total).

^b Threshold for both within-sample and across-sample clustering.

^c Maximum number of shared polymorphic bases.
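
The trade-off between the min. ind. threshold, the number of retained loci, and the amount of missing data can be illustrated with a small sketch. This is not the pyRAD implementation; the function names and the toy locus-by-sample data below are hypothetical and serve only to show why raising min. ind. yields fewer loci but a more complete matrix.

```python
def filter_loci_by_min_ind(locus_samples, min_ind):
    """Keep only loci for which at least `min_ind` samples have data."""
    return {
        locus: samples
        for locus, samples in locus_samples.items()
        if len(samples) >= min_ind
    }


def missing_fraction(locus_samples, n_samples):
    """Fraction of empty cells in the locus-by-sample matrix."""
    if not locus_samples:
        return 0.0
    present = sum(len(samples) for samples in locus_samples.values())
    return 1.0 - present / (len(locus_samples) * n_samples)


# Hypothetical toy data: each locus maps to the set of samples that have it.
SAMPLES = ["Callisaurus", "Cophosaurus", "Holbrookia", "Uma", "Uta"]
toy_loci = {
    "locus_1": {"Callisaurus", "Cophosaurus", "Holbrookia", "Uma", "Uta"},
    "locus_2": {"Callisaurus", "Cophosaurus", "Holbrookia", "Uma"},
    "locus_3": {"Callisaurus", "Holbrookia", "Uta"},
    "locus_4": {"Cophosaurus", "Uma", "Uta"},
    "locus_5": {"Callisaurus", "Holbrookia"},
}

summary = {}
for min_ind in (3, 4, 5):
    kept = filter_loci_by_min_ind(toy_loci, min_ind)
    summary[min_ind] = (len(kept), missing_fraction(kept, len(SAMPLES)))
    print(f"min. ind. = {min_ind}: {len(kept)} loci, "
          f"{summary[min_ind][1]:.0%} missing data")
```

In this toy example, moving from min. ind. = 3 to min. ind. = 5 reduces the matrix from four loci to one while removing all missing data, which mirrors the pattern across the columns of table 5.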

We estimated phylogenetic trees for the ddRADseq data using concatenation and a coalescent-based species tree approach (fig. 3). We present a comparison of phylogenies estimated using three different clustering thresholds (i.e., 80%, 90%, and 95%) in figure 3. The phylogenetic trees estimated for SNP alignments assembled using different clustering thresholds, and with different methods, are in conflict. For example, the earless lizard genera, Cophosaurus and Holbrookia, form a clade with 80% and 90% clustering when using concatenation, but the species tree analysis supports a clade containing Holbrookia and Callisaurus (similar to the sequence capture and mtDNA results; figs. 1 and 2). Concatenation also supports a Holbrookia + Callisaurus clade, but only with a 95% clustering threshold (fig. 3E). The phylogenetic relationships for the sceloporine lizards are consistent and congruent with the sequence capture data when using 80% clustering (fig. 3A and B), but more conservative clustering thresholds (i.e., 90% and 95%) result in conflicting topologies, none of which are strongly supported.

Phylogenetic trees estimated from the ddRADseq data using concatenation and coalescent-based species tree inference. For each clustering threshold (80%, A and B; 90%, C and D; 95%, E and F), results are shown for concatenation with RAxML (A, C, and E) and species tree inference with SVDquartets (B, D, and F). All results are from assemblies with min. ind. = 4 (minimum needed to form a quartet) and paralog filtering assuming no shared heterozygous sites (paralog = 1). Numbers on nodes are bootstrap values.

We compared the variation in bootstrap support from the concatenation analyses for the clade containing Callisaurus and Holbrookia with that of the earless lizard clade (i.e., Cophosaurus and Holbrookia) across different pyRAD assembly parameters (fig. 4). Data assembly parameters have an influence on the topology and bootstrap support for these alternative clades. The results are most consistent when the clustering threshold is high (fig. 4C), and as expected, there is still some variation across data assemblies containing different amounts of data. The paralog filter did not play a significant role in changing the bootstrap support values when using a clustering threshold of 80% or 95% (fig. 4). However, for the intermediate clustering threshold of 90% (fig. 4B), the paralog filter introduces large differences in the support for the alternative topologies. The most stringent clustering threshold (i.e., 95%) favors the Holbrookia + Callisaurus clade over the earless clade over all parameter settings that we explored.

Variability in ddRADseq data support for monophyly of the earless lizards (Cophosaurus + Holbrookia) as a function of clustering threshold (A, 80%; B, 90%; C, 95%), minimum individuals (x axis), and paralog filtering. Results are from ML analyses of the concatenated data.

Discussion

Comparison of Approaches

Sequence capture and RADseq are two reduced-representation genome sequencing approaches for obtaining large numbers of homologous loci for phylogenetic inference. The utilities of the methods for phylogenetic inference are well established at opposite timescales, with sequence capture showing great promise for resolving relationships among distantly related species (Faircloth et al. 2012), and RADseq for phylogeographic and population-level investigations (Davey and Blaxter 2010). The methods have also been shown to work at largely overlapping timescales, but they have not been studied in a comparative manner, with the exception of a phylogeographic comparison by Harvey et al. (2013). For example, in silico studies of RADseq data have been applied to divergences dating back to 55–60 Ma in mammals, Drosophila, and fungi (Rubin et al. 2012; Cariou et al. 2013), and sequence capture has been shown to be useful for phylogeographic studies of Pleistocene divergence in birds (Smith et al. 2014).

We have conducted a comparison of these approaches using phrynosomatid lizards as a model system. We found that the sequence capture data collected here were sufficient for resolving the relationships among phrynosomatid genera with strong support whether the loci were concatenated and assumed to share the same underlying genealogical history, or whether they were allowed to have independent histories and analyzed within a coalescent framework (fig. 1). The coalescent-based analyses provided lower support for the short internal branches of the tree, but there were no biases in terms of the support at particular timescales that might be expected if these data were insufficient for resolving recent divergences. However, as a consequence of sampling only one species per genus we excluded recent divergences within genera that occurred within the last 10 million years. Therefore, the phylogeny that we investigated was skewed toward containing relatively deeper divergences.
The ddRADseq data also showed no bias at different timescales. These data were able to resolve the deepest divergence in the phylogeny, but the short internal branches caused problems for the ddRADseq data; different data assemblies and different types of analyses of the same data assembly (concatenation vs. species tree inference) resulted in different topologies (figs. 3 and 4). Incomplete lineage sorting is an important factor that can cause gene trees to conflict with the species tree. The time intervals between speciation events together with ancestral population sizes modulate the amount of incomplete lineage sorting that is expected; therefore, more data are required to resolve some speciation histories than others (Leaché and Rannala 2011). There is a substantial amount of gene tree discordance in the sequence capture loci presented here, and nearly 250 loci (approximately 50% of all loci sampled) support a topology for the sand lizards that conflicts with the estimated species tree (table 3). Gene tree discordance can cause phylogenetic inference error (Degnan and Rosenberg 2009), and the majority of gene trees could support an incorrect species tree if the phylogeny is in the anomaly zone (Degnan and Rosenberg 2006). Incidentally, the most common topology for sand lizards found across the sequence capture data supports a clade containing Holbrookia and Callisaurus (table 3). The phrynosomatid genera do not appear to be in the anomaly zone, because if they were we would expect concatenation and coalescent inference to support different topologies (Kubatko and Degnan 2007; Liu and Edwards 2009).

The large number of loci generated through RADseq approaches is particularly valuable for phylogeography, migration assessment, and phylogenetic inference among closely related species (e.g., Rheindt et al. 2014). In terms of their applications to nonmodel organisms, RADseq methods are more amenable to a broader set of evolutionary systems (Cruaud et al. 2014), since genomic resources are not needed to design probes as is the case with sequence capture. For phylogenetic investigations, ddRADseq data are most useful for studies of relatively closely related taxa, because the number of homologous loci obtained decreases in relation to time since divergence (Wagner et al. 2013). Furthermore, the pattern of missing data may be nonrandom, as the rate of allelic dropout is positively correlated with sequence divergence (Arnold et al. 2013).

A major assumption of RADseq approaches is that homologous loci are those that share a restriction site and high sequence similarity near the conserved restriction site. However, a reasonable possibility of clustering with nonhomologous genomic regions exists with this approach, particularly with short sequence reads (e.g., 50-bp single-end sequence reads, as used here). Bioinformatic postprocessing of ddRADseq data is the critical step that determines sequence homology (Ilut et al. 2014); as seen here, the thresholds selected for assembly parameters can have a strong influence on the size of the resulting data set and inferred phylogenetic relationships (table 5; fig. 4). Assembling sequence capture data is more straightforward, because we know the number of loci, and a reference sequence is available for each locus (the 180-bp probe sites).

Phylogenetic inference with RADseq is feasible at the relatively deep evolutionary timescales studied here, and these branches did not seem particularly difficult for the SNP data to resolve. However, different assemblies of the ddRADseq data provided conflicting topologies for the short internal branches of the phylogeny. This suggests that the limitations of ddRADseq data are not focused on a particular timescale in the phylogeny, but are instead related to the length of the internal branches of the phylogeny. Even for studies focusing on recent population-level divergences, current RADseq protocols (reviewed by Puritz et al. 2014; Andrews et al. 2014) are highly susceptible to allelic dropout resulting from mutations at restriction sites (Arnold et al. 2013). The problem is exacerbated when attempting to assemble ddRADseq data for distantly related species (Rubin et al. 2012). Simulation work has shown that the loci with the highest mutation rates are those that have the most missing data (Huang and Knowles 2014), but those same loci may be the least valuable for resolving relationships among distantly related species. Only two loci were recovered for all ten species included in our ddRADseq experiment; these loci were obtained when the clustering threshold was reduced to 80% similarity (table 5).

Different enzymes are expected to yield substantially different numbers of loci (Davey et al. 2011), and the enzyme combination selected here does not represent the optimum potential at which any RAD method will perform. Based on the phrynosomatid lizard data presented here, and the specific enzyme combination that we used (SbfI and MspI), there seems to be a low probability of obtaining large numbers of shared loci among distantly related species using ddRADseq. At least for phrynosomatid lizards, phylogenetic relationships are sensitive to the parameter settings used during RADseq data assembly (fig. 4), especially for the short internal branches on the tree. We found conflicting topologies and variable levels of bootstrap support when changing the clustering threshold, paralogy filter, and the minimum number of individuals needed to retain a locus in the final alignment (fig. 3). The most consistent phylogenetic signal that we recovered for the short internal branch located within the Cophosaurus, Callisaurus, and Holbrookia clade was obtained when the sequence similarity threshold was high (95%); the phylogenetic relationships and bootstrap values stabilized across the various parameter settings (fig. 4C).
Using lower sequence similarity thresholds doubled the number of loci, and this may seem beneficial, but this increase comes at the cost of introducing “RAD noise” that at worst produces conflicting topologies (fig. 3), and at best only changes the support for the topology (fig. 4). Of course, we do not necessarily know the correct phylogeny, and this is why simulation studies are needed to quantify the errors and understand the consequences resulting from RADseq data misassembly on phylogeny inference.

Overall, RADseq data can be collected faster and are less expensive than sequence capture data, and RADseq has the potential to provide an order of magnitude more SNPs for evolutionary inference. There is no limit on the number of loci that can be targeted for sequence capture experiments, and in some model systems (e.g., humans) the method is used for sequencing the entire exome (Ng et al. 2009). However, for phylogeographic studies, it is possible that the sequence capture protocols that target highly conserved genomic regions (Lemmon et al. 2012) and/or UCEs (Faircloth et al. 2012) will provide relatively few SNPs. For example, a phylogeography study of Neotropical rainforest birds using sequence capture data recovered approximately 4,500 SNPs (1,500 UCE loci containing 2–3 variable sites per locus; Smith et al. 2014). In contrast, a phylogeographic study of Zimmerius flycatchers using RADseq recovered over 37,000 SNPs (Rheindt et al. 2014). If the goal of a study is to discern fine-scale phylogeographic patterns, then RADseq methods have the potential to provide more data at lower cost and effort. Although the number of loci that we targeted using sequence capture is lower than what we obtained using ddRADseq, the loci are longer and were more straightforward to analyze under a variety of inference techniques, including coalescent-based models that benefit from complete sampling at each locus.
In the case of higher-level relationships among phrynosomatid lizard genera, we found sequence capture data to provide a more consistent phylogenetic signal compared with ddRADseq data.
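
The link drawn above between short internal branches and incomplete lineage sorting can be made concrete with the standard multispecies coalescent result for three species: with an internal branch of length T in coalescent units, a gene tree matches the species tree with probability 1 − (2/3)e^(−T), and each of the two discordant topologies occurs with probability (1/3)e^(−T). The sketch below is illustrative only; the branch lengths are arbitrary values, not estimates from the phrynosomatid data, and the function name is hypothetical.

```python
import math


def rooted_triple_probabilities(t_coalescent):
    """Gene tree topology probabilities for three species.

    `t_coalescent` is the internal branch length in coalescent units.
    Returns (P[gene tree matches species tree], P[each discordant topology]).
    """
    p_discordant_each = math.exp(-t_coalescent) / 3.0
    p_concordant = 1.0 - 2.0 * p_discordant_each
    return p_concordant, p_discordant_each


# Arbitrary illustrative branch lengths (coalescent units), shortest first.
for t in (0.1, 0.5, 1.0, 2.0):
    p_match, p_other = rooted_triple_probabilities(t)
    print(f"T = {t:>3}: concordant = {p_match:.3f}, "
          f"each discordant = {p_other:.3f}")
```

As T approaches zero the three topologies become equally frequent, so many loci are needed before the concordant topology stands out from the two alternatives, which is consistent with the difficulty of the short internal branches in both data sets.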

Phylogenomics of Phrynosomatids

The phylogenomic signal from the sequence capture data and the mtDNA data provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus (fig. 1). Determining whether these two “earless” genera with concealed tympanic membranes form a clade has been difficult to resolve. Previous studies using mtDNA have provided contradictory, ambiguous, or spurious support for the resolution of these taxa (Reeder 1995; Wilgenbusch and de Queiroz 2000; Leaché and McGuire 2006; Wiens et al. 2010). The spurious relationships for sand lizards supported by the Leaché and McGuire (2006) study were the result of sample mislabeling errors (the tissues for Uma and Callisaurus were swapped during specimen collection), and those data were removed from GenBank in 2008. The new sequence capture data and partial mtDNA genomes presented here, all collected from authenticated samples, recover a clade containing Holbrookia and Callisaurus to the exclusion of Cophosaurus. Some of the SNP assemblies also support this relationship, including the coalescent-based species tree analysis of the largest ddRADseq assembly (fig. 3B). The preferred topology suggests that the earless morphology either evolved twice independently in Holbrookia and Cophosaurus or that it evolved once in the common ancestor of Holbrookia, Callisaurus, and Cophosaurus, and was subsequently lost in Callisaurus. Either reconstruction requires the same number of character state transitions, and in the context of parsimony they are equivalent explanations for the evolution of the earless morphology.

The divergence times separating the sceloporine genera Sceloporus, Petrosaurus, Urosaurus, and Uta are on the order of 1.7–3.7 Myr (fig. 1), and these short time intervals have resulted in a difficult phylogenetic problem.
Previous studies attempting to resolve these relationships with either a single locus (mtDNA) or a handful of nuclear loci have not been able to obtain strong support for the relationships among these groups (Wiens et al. 2010). Simulation studies have shown that rapid speciation events are difficult to resolve without hundreds or thousands of loci (Liu and Edwards 2009), and the new sequence capture data collected here provide strong support for the relationships among these genera using concatenation and coalescent-based analyses. The new mtDNA data (fig. 2) continue to struggle with resolving these relationships, and although these data are still fragmentary, it is unlikely that this single locus will be sufficient for resolving this part of the tree with strong support even after being sequenced to completion. The largest SNP assembly that we analyzed supported the same topology as the sequence capture and mtDNA data. These three new data sets provide compelling evidence for a new phrynosomatid lizard phylogeny that contains a novel relationship among the sand lizards.
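
The parsimony equivalence of the two scenarios for the earless morphology can be checked by enumerating ancestral states on the preferred sand lizard topology, (Uma, (Cophosaurus, (Holbrookia, Callisaurus))). The sketch below rests on two labeled assumptions that go beyond the text: Uma is scored as eared, and the sand lizard ancestor (the root here) is fixed as eared. The node labels "A" and "B" are hypothetical names for the two internal nodes.

```python
from itertools import product

EARED, EARLESS = 0, 1

# Tip states: Holbrookia and Cophosaurus are the earless genera; scoring
# Uma as eared is an assumption made for this illustration.
tips = {
    "Uma": EARED,
    "Cophosaurus": EARLESS,
    "Holbrookia": EARLESS,
    "Callisaurus": EARED,
}

# Branches (parent, child) of (Uma, (Cophosaurus, (Holbrookia, Callisaurus))).
# "A" = ancestor of the three genera; "B" = Holbrookia + Callisaurus ancestor.
branches = [
    ("root", "Uma"),
    ("root", "A"),
    ("A", "Cophosaurus"),
    ("A", "B"),
    ("B", "Holbrookia"),
    ("B", "Callisaurus"),
]


def count_transitions(states):
    """Number of branches on which the character state changes."""
    return sum(states[parent] != states[child] for parent, child in branches)


# Enumerate states at A and B with the root fixed as eared (assumption).
reconstructions = {}
for a_state, b_state in product((EARED, EARLESS), repeat=2):
    states = dict(tips, root=EARED, A=a_state, B=b_state)
    reconstructions[(a_state, b_state)] = count_transitions(states)

best = min(reconstructions.values())
most_parsimonious = sorted(k for k, v in reconstructions.items() if v == best)
print("minimum transitions:", best)
print("equally parsimonious (A, B) states:", most_parsimonious)
```

Under these assumptions the minimum is two transitions, reached both when A and B are eared (two independent gains) and when A and B are earless (one gain followed by a loss in Callisaurus).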

61 in total

1. Bayes estimation of species divergence times and ancestral population sizes using DNA sequences from multiple loci.

Authors: Bruce Rannala; Ziheng Yang
Journal: Genetics Date: 2003-08 Impact factor: 4.562

Review 2. Gene tree discordance, phylogenetic inference and the multispecies coalescent.

Authors: James H Degnan; Noah A Rosenberg
Journal: Trends Ecol Evol Date: 2009-03-21 Impact factor: 17.712

3. Phylogenetic relationships of phrynosomatid lizards based on nuclear and mitochondrial data, and a revised phylogeny for Sceloporus.

Authors: John J Wiens; Caitlin A Kuczynski; Saad Arif; Tod W Reeder
Journal: Mol Phylogenet Evol Date: 2009-09-12 Impact factor: 4.286

4. Trade-offs and utility of alternative RADseq methods: reply to Puritz et al.

Authors: Kimberly R Andrews; Paul A Hohenlohe; Michael R Miller; Brian K Hand; James E Seeb; Gordon Luikart
Journal: Mol Ecol Date: 2014-12 Impact factor: 6.185

5. Empirical assessment of RAD sequencing for interspecific phylogeny.

Authors: Astrid Cruaud; Mathieu Gautier; Maxime Galan; Julien Foucaud; Laure Sauné; Gwenaëlle Genson; Emeric Dubois; Sabine Nidelet; Thierry Deuve; Jean-Yves Rasplus
Journal: Mol Biol Evol Date: 2014-02-03 Impact factor: 16.240

6. Resolving the phylogeny of lizards and snakes (Squamata) with extensive sampling of genes and species.

Authors: John J Wiens; Carl R Hutter; Daniel G Mulcahy; Brice P Noonan; Ted M Townsend; Jack W Sites; Tod W Reeder
Journal: Biol Lett Date: 2012-09-19 Impact factor: 3.703

7. Phylogenetic relationships among phrynosomatid lizards as inferred from mitochondrial ribosomal DNA sequences: substitutional bias and information content of transitions relative to transversions.

Authors: T W Reeder
Journal: Mol Phylogenet Evol Date: 1995-06 Impact factor: 4.286

8. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty.

Authors: Guy Baele; Philippe Lemey; Trevor Bedford; Andrew Rambaut; Marc A Suchard; Alexander V Alekseyenko
Journal: Mol Biol Evol Date: 2012-03-07 Impact factor: 16.240

9. Is RAD-seq suitable for phylogenetic inference? An in silico assessment and optimization.

Authors: Marie Cariou; Laurent Duret; Sylvain Charlat
Journal: Ecol Evol Date: 2013-02-27 Impact factor: 2.912

10. Defining loci in restriction-based reduced representation genomic data from nonmodel species: sources of bias and diagnostics for optimal clustering.

Authors: Daniel C Ilut; Marie L Nydam; Matthew P Hare
Journal: Biomed Res Int Date: 2014-06-25 Impact factor: 3.411
