Stephen J Gaughran, Maud C Quinzin, Joshua M Miller, Ryan C Garrick, Danielle L Edwards, Michael A Russello, Nikos Poulakakis, Claudio Ciofi, Luciano B Beheregaray, Adalgisa Caccone.
Abstract
High-throughput DNA sequencing allows efficient discovery of thousands of single nucleotide polymorphisms (SNPs) in nonmodel species. Population genetic theory predicts that this large number of independent markers should provide detailed insights into population structure, even when only a few individuals are sampled. Still, sampling design can have a strong impact on such inferences. Here, we use simulations and empirical SNP data to investigate the impacts of sampling design on estimating genetic differentiation among populations that represent three species of Galápagos giant tortoises (Chelonoidis spp.). Though microsatellite and mitochondrial DNA analyses have supported the distinctiveness of these species, a recent study called into question how well these markers matched with data from genomic SNPs, thereby questioning decades of studies in nonmodel organisms. Using >20,000 genomewide SNPs from 30 individuals from three Galápagos giant tortoise species, we find distinct structure that matches the relationships described by the traditional genetic markers. Furthermore, we confirm that accurate estimates of genetic differentiation in highly structured natural populations can be obtained using thousands of SNPs and 2-5 individuals, or hundreds of SNPs and 10 individuals, but only if the units of analysis are delineated in a way that is consistent with evolutionary history. We show that the lack of structure in the recent SNP-based study was likely due to unnatural grouping of individuals and erroneous genotype filtering. Our study demonstrates that genomic data enable patterns of genetic differentiation among populations to be elucidated even with few samples per population, and underscores the importance of sampling design. These results have specific implications for studies of population structure in endangered species and subsequent management decisions.
Keywords: Chelonoidis; conservation; genomics; population structure; sampling design; single nucleotide polymorphism
Year: 2017 PMID: 30026799 PMCID: PMC6050186 DOI: 10.1111/eva.12551
Source DB: PubMed Journal: Evol Appl ISSN: 1752-4571 Impact factor: 5.183
Figure 1. Distribution map of Galápagos giant tortoises throughout the archipelago. Islands with extant species are shown in gray; islands with extinct species are in white. Black triangles mark the four volcanoes on Isabela Island, each with its own locally endemic tortoise species. Extinct species are marked with a cross. Species names are in italics, with a black line pointing to the island, or the location within an island, where each occurs. The populations from the three species in this study are identified by two- or three-letter codes in bold: CRU = C. porteri, Santa Cruz Island (La Caseta); VA = C. vandenburghi, Volcano Alcedo, central Isabela Island; and PBL = C. becki, Piedras Blancas, Volcano Wolf, northern Isabela Island.
Number of polymorphic loci present in all individuals (n = 10 per species) used for analyses of each population (diagonal) and population pair (below diagonal)
| | PBL | CRU | VA |
|---|---|---|---|
| PBL | 9,580 | | |
| CRU | 19,654 | 11,703 | |
| VA | 13,520 | 16,432 | 5,732 |
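The counts in the table above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes genotypes are coded as alternate-allele counts (0, 1, or 2) with `None` for a missing call, and it reads "population pair" as the pooled individuals of both populations, so a locus fixed for different alleles in the two populations counts as polymorphic for the pair. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def polymorphic_complete_loci(individuals):
    """Return indices of loci genotyped in every individual and variable among them.

    Each individual is a list of alternate-allele counts (0, 1, or 2),
    with None marking a missing genotype (an assumed encoding).
    """
    kept = set()
    for locus in range(len(individuals[0])):
        calls = [ind[locus] for ind in individuals]
        if any(c is None for c in calls):
            continue  # locus is missing in at least one individual
        alt = sum(calls)
        if 0 < alt < 2 * len(calls):
            kept.add(locus)  # both alleles are observed in this set
    return kept


def locus_count_table(populations):
    """Counts per population (diagonal) and per pooled population pair."""
    table = {}
    for name, inds in populations.items():
        table[(name, name)] = len(polymorphic_complete_loci(inds))
    for a, b in combinations(populations, 2):
        pooled = populations[a] + populations[b]
        table[(a, b)] = len(polymorphic_complete_loci(pooled))
    return table


# Made-up toy data: 2 individuals per population, 4 loci.
toy = {
    "A": [[0, 1, 2, None], [0, 2, 2, 1]],
    "B": [[2, 0, 2, 1], [2, 0, 2, 1]],
}
counts = locus_count_table(toy)
```

Under this reading a pair can have more usable loci than either population alone, because fixed differences between populations are monomorphic within each population but polymorphic in the pooled set.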
Pairwise FST values between species pairs. Above the diagonal: values calculated from our dataset of SNPs with no missing data and common to the population pair, with 95% confidence intervals. Below the diagonal: values calculated from 12 microsatellite loci from Garrick et al. (2015) (see section VIII in Appendix S1). Data were obtained using 10 samples for each population (PBL, VA, CRU) for the three species
| | PBL | CRU | VA |
|---|---|---|---|
| PBL | | 0.169 (0.164–0.174) | 0.181 (0.175–0.187) |
| CRU | 0.137 | | 0.233 (0.226–0.240) |
| VA | 0.163 | 0.202 | |
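As a rough illustration of how a pairwise FST and its 95% confidence interval can be obtained from SNP genotypes, the sketch below uses Hudson's estimator (ratio of averages across loci) with a percentile bootstrap over loci. This is a swapped-in technique chosen for brevity: the text shown here does not state which estimator or interval method the authors used. Genotypes are assumed to be alternate-allele counts (0, 1, or 2) with no missing data, and all names and toy values are hypothetical.

```python
import random


def hudson_terms(pop1, pop2):
    """Per-locus numerator and denominator of Hudson's FST estimator."""
    n1, n2 = 2 * len(pop1), 2 * len(pop2)  # sampled chromosomes per population
    terms = []
    for locus in range(len(pop1[0])):
        p1 = sum(ind[locus] for ind in pop1) / n1
        p2 = sum(ind[locus] for ind in pop2) / n2
        num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        den = p1 * (1 - p2) + p2 * (1 - p1)
        terms.append((num, den))
    return terms


def fst_from_terms(terms):
    """Ratio of averages: sum of numerators over sum of denominators."""
    return sum(n for n, _ in terms) / sum(d for _, d in terms)


def bootstrap_ci(terms, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval from resampling loci with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        fst_from_terms(rng.choices(terms, k=len(terms))) for _ in range(n_boot)
    )
    low = estimates[int(n_boot * alpha / 2)]
    high = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return low, high


# Made-up toy genotypes: 3 individuals per population, 4 loci.
pop_x = [[0, 1, 2, 0], [0, 0, 2, 1], [1, 0, 2, 0]]
pop_y = [[2, 1, 0, 0], [2, 2, 1, 0], [1, 2, 0, 1]]
terms_xy = hudson_terms(pop_x, pop_y)
fst_xy = fst_from_terms(terms_xy)
ci_xy = bootstrap_ci(terms_xy, n_boot=200)

# Sanity check: loci fixed for different alleles give FST = 1.
fixed_a = [[0] * 5, [0] * 5]
fixed_b = [[2] * 5, [2] * 5]
fst_fixed = fst_from_terms(hudson_terms(fixed_a, fixed_b))
```

Resampling loci treats SNPs as independent, which is only an approximation when markers are linked; a block bootstrap would be a more cautious choice for dense genomic data.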
Figure 2. Principal component 1 (PC1) plotted against principal component 2 (PC2) for 30 individuals from three populations, from a PCA of 23,057 SNPs. Stars, open circles, and open triangles identify individuals from the PBL (C. becki), CRU (C. porteri), and VA (C. vandenburghi) populations, respectively. The analysis was carried out using PLINK (Chang et al., 2015)
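The PCA behind Figure 2 was run in PLINK. For readers who want to see the underlying idea, here is a minimal pure-Python sketch that is not equivalent to PLINK's implementation: it centers each SNP, builds the individuals-by-individuals matrix of dot products, and extracts the leading components by power iteration with deflation. It assumes genotypes coded as alternate-allele counts with no missing data, applies no variance scaling, and uses hypothetical names and toy data.

```python
import math


def center_columns(genotypes):
    """Subtract each SNP's mean so the PCA reflects differences among individuals."""
    n = len(genotypes)
    means = [sum(col) / n for col in zip(*genotypes)]
    return [[g - m for g, m in zip(row, means)] for row in genotypes]


def gram_matrix(rows):
    """Individuals-by-individuals matrix of dot products between centered rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]


def top_eigenvector(matrix, n_iter=500):
    """Power iteration; returns (eigenvalue, unit eigenvector)."""
    # Start from a non-constant vector: centered data map constant vectors to zero.
    vec = [float(i + 1) for i in range(len(matrix))]
    value = 0.0
    for _ in range(n_iter):
        new = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        value = math.sqrt(sum(x * x for x in new))
        if value == 0.0:
            break  # no variance left to explain
        vec = [x / value for x in new]
    return value, vec


def principal_components(genotypes, n_components=2):
    """Return one tuple of scores (PC1, PC2, ...) per individual."""
    gram = gram_matrix(center_columns(genotypes))
    size = len(gram)
    scores = []
    for _ in range(n_components):
        value, vec = top_eigenvector(gram)
        scores.append([math.sqrt(value) * v for v in vec])
        # Deflate: remove the component just found before extracting the next.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(size)]
                for i in range(size)]
    return [tuple(s[i] for s in scores) for i in range(size)]


# Made-up toy genotypes: rows 0-1 from one population, rows 2-3 from another.
toy_genotypes = [
    [0, 0, 2, 2, 1],
    [0, 0, 2, 2, 0],
    [2, 2, 0, 0, 1],
    [2, 2, 0, 1, 1],
]
toy_scores = principal_components(toy_genotypes)
```

In the toy data, PC1 separates the two groups of rows. The sign of each component is arbitrary, so only relative positions along an axis are meaningful.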
Figure 3. Boxplots of pairwise FST estimates using 1,000 randomly drawn subsamples of individuals for each sample size (n = 2, 3, or 5) from each population. PBL, CRU, and VA correspond to population samples from C. becki, C. porteri, and C. vandenburghi, respectively. (a) Pairwise comparison of PBL and CRU; (b) pairwise comparison of PBL and VA; (c) pairwise comparison of CRU and VA. The horizontal black line in each boxplot marks the FST value calculated using all 10 individuals from each population in the pairwise comparison (see Table S1). The lower hinge corresponds to the first quartile (25th percentile) and the upper hinge to the third quartile (75th percentile). Whiskers indicate points within 1.5 times the interquartile range (IQR), with outliers indicated as points beyond that range
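The subsampling design and the boxplot statistics described in the Figure 3 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: FST is computed with Hudson's estimator (the estimator actually used is not given in this text), quartiles use Python's inclusive quantile method (plotting software may define hinges slightly differently), and the simulated genotypes, allele frequencies, and function names are all made up.

```python
import random
import statistics


def hudson_fst(pop1, pop2):
    """Hudson FST (ratio of averages) from alt-allele counts coded 0/1/2."""
    n1, n2 = 2 * len(pop1), 2 * len(pop2)  # sampled chromosomes
    num = den = 0.0
    for g1, g2 in zip(zip(*pop1), zip(*pop2)):  # iterate locus by locus
        p1, p2 = sum(g1) / n1, sum(g2) / n2
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else None  # undefined if no locus varies


def subsampled_fst(pop1, pop2, n, n_draws=1000, seed=1):
    """FST for repeated random draws of n individuals from each population."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_draws):
        fst = hudson_fst(rng.sample(pop1, n), rng.sample(pop2, n))
        if fst is not None:
            values.append(fst)
    return values


def boxplot_summary(values):
    """Quartiles, whisker ends within 1.5 * IQR of the hinges, and outlier count."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    inside = [v for v in values if q1 - 1.5 * iqr <= v <= q3 + 1.5 * iqr]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whisker_low": min(inside), "whisker_high": max(inside),
        "n_outliers": len(values) - len(inside),
    }


def simulate_population(freqs, n_individuals, rng):
    """Made-up diploid genotypes drawn from per-locus allele frequencies."""
    return [[(rng.random() < p) + (rng.random() < p) for p in freqs]
            for _ in range(n_individuals)]


rng = random.Random(42)
freqs_a = [rng.uniform(0.05, 0.95) for _ in range(200)]
freqs_b = [min(0.95, max(0.05, p + rng.uniform(-0.4, 0.4))) for p in freqs_a]
pop_a = simulate_population(freqs_a, 10, rng)
pop_b = simulate_population(freqs_b, 10, rng)

full_sample_fst = hudson_fst(pop_a, pop_b)  # analogue of the horizontal line
summaries = {n: boxplot_summary(subsampled_fst(pop_a, pop_b, n)) for n in (2, 3, 5)}
```

Comparing each entry of `summaries` with `full_sample_fst` mirrors the comparison the figure makes between subsample estimates and the value from all 10 individuals per population.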