Andrew Storfer, Austin Patton, Alexandra K Fraik.
Abstract
As next-generation sequencing data become increasingly available for non-model organisms, a shift has occurred in the focus of studies of the geographic distribution of genetic variation. Whereas landscape genetics studies primarily focus on testing the effects of landscape variables on gene flow and genetic population structure, landscape genomics studies focus on detecting candidate genes under selection that indicate possible local adaptation. Navigating the transition between landscape genomics and landscape genetics can be challenging. The number of molecular markers analyzed has shifted from what used to be a few dozen loci to thousands of loci and even full genomes. Although genome scale data can be separated into sets of neutral loci for analyses of gene flow and population structure and putative loci under selection for inference of local adaptation, there are inherent differences in the questions that are addressed in the two study frameworks. We discuss these differences and their implications for study design, marker choice and downstream analysis methods. Similar to the rapid proliferation of analysis methods in the early development of landscape genetics, new analytical methods for detection of selection in landscape genomics studies are burgeoning. We focus on genome scan methods for detection of selection, and in particular, outlier differentiation methods and genetic-environment association tests because they are the most widely used. Use of genome scan methods requires an understanding of the potential mismatches between the biology of a species and assumptions inherent in analytical methods used, which can lead to high false positive rates of detected loci under selection. Key to choosing appropriate genome scan methods is an understanding of the underlying demographic structure of study populations, and such data can be obtained using neutral loci from the generated genome-wide data or prior knowledge of a species' phylogeographic history. 
To this end, we summarize recent simulation studies that test the power and accuracy of genome scan methods under a variety of demographic scenarios and sampling designs. We conclude with a discussion of additional considerations for future method development, and a summary of methods that show promise for landscape genomics studies but are not yet widely used.
Keywords: landscape genetics; landscape genomics; local adaptation; selection; spatial analyses
Year: 2018 PMID: 29593776 PMCID: PMC5859105 DOI: 10.3389/fgene.2018.00068
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
General differences between landscape genetics and landscape genomics studies.
| Study type | Question | Level | Sampling design | Analysis type |
| --- | --- | --- | --- | --- |
| Landscape genetics | Influence of landscape on gene flow | Among populations | | Mantel tests |
| | Influences of landscape on at-site variation | Within populations | | Graph models (e.g., Popgraph), GDMs, structural equation models |
| | Barriers | Among populations | | Wombling, Monmonier's maximum difference algorithm, spatial assignment tests (e.g., Geneland) |
| | Species' ecology | Within and among populations | | Ordination, least cost paths, spatial autocorrelation, spatial regression |
| | Source-sink dynamics | Among populations | | Mantel tests, genetic diversity estimates (e.g., F-statistics, bottleneck tests) |
| Landscape genomics | Spatial patterns of selection | Among populations | | Outlier differentiation methods (e.g., BayeScan, FLK, XTX); genotype-environment associations (e.g., Bayenv2, PCAdapt, LFMM, sGLMM, Samβada) |
| | Influence of landscape on local adaptation | Among populations | Transect sampling, paired sampling, stratified sampling | Outlier differentiation methods; genotype-environment associations |
Note that in a landscape genomics study, once loci under selection are removed and putatively neutral loci remain, landscape genetics questions and analyses can be conducted. Nonetheless, sampling designs generally differ between landscape genetics and landscape genomics studies, so some landscape genetics questions may not be addressable in studies with landscape genomics goals. Bolded sampling designs indicate preferred designs for that particular question. Not all analysis methods under each study type are listed, only those that are most commonly used or best suited to address the goals of the study. Note also that assignment test methods generally differ between landscape genetics and landscape genomics studies. Italicized words under analysis type indicate those commonly used in both landscape genetics studies of gene flow and landscape genomics studies of loci involved in adaptation. dbRDA, distance-based redundancy analyses; sPCA, spatial principal components analysis; MDS, multidimensional scaling; MLPE, maximum likelihood of population effects (Clarke et al.).
indicates methods that are not yet widely used but show promise; see Sections Generalized Dissimilarity Modeling (GDM) through Clinal Analyses.
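As an illustration of the Mantel tests listed in the table for questions about landscape effects on gene flow, the following is a minimal sketch of a simple permutation Mantel test in plain Python. The function names, the toy population positions and the toy distance matrices are assumptions made for this example and do not come from the article; dedicated packages should be preferred for real analyses.

```python
import math
import random


def upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, landscape, n_perm=999, seed=0):
    """One-sided permutation Mantel test between two symmetric distance matrices."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    landscape_vec = upper_triangle(landscape, identity)
    r_obs = pearson(upper_triangle(genetic, identity), landscape_vec)
    exceed = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of one matrix only
        if pearson(upper_triangle(genetic, order), landscape_vec) >= r_obs:
            exceed += 1
    # The observed statistic is counted as one of the permutations.
    p_value = (exceed + 1) / (n_perm + 1)
    return r_obs, p_value


# Hypothetical example: five populations along a line, with genetic distance
# made exactly proportional to landscape distance (isolation by distance).
positions = [0, 1, 3, 6, 10]
landscape = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.02 * d for d in row] for row in landscape]

r_obs, p_value = mantel_test(genetic, landscape, n_perm=999, seed=1)
print(f"Mantel r = {r_obs:.3f}, p = {p_value:.3f}")
```

Permuting the rows and columns of one matrix together, rather than shuffling individual distances, keeps the dependence among distances that share a population.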
Simulation studies of genome scan methods in landscape genomics.
| Study | Question | Analysis methods | Demographic model | Sampling design | Selection | Conclusions |
| --- | --- | --- | --- | --- | --- | --- |
| De Mita et al. | 1. Compare methods evaluating differences in type I/II error rates and power | Logistic regression (LR; Joost et al.) | Island Model | S1-1 individual/population | None tested | LR and GEE have high false-positive rates (FPR), but fast run time |
| Frichot et al. | 1. Identify signatures of selection controlling for population structure | LFMM (Frichot et al.) | Isolation by Distance (IBD) | None tested | P1 - Correlated with demographic history | LFMM has low FPR under IBD |
| de Villemereuil et al. | 1. Individual-based simulation comparing power and error rates of genome scan methods | Allele frequency-environmental linear regression (LRM; Storey and Tibshirani) | Hierarchical Model (HM) | None tested | P1 - Correlated with demographic history | Decrease in power of methods under polygenic vs. monogenic selection |
| Lotterhos and Whitlock | 1. Test effects of IBD and range expansion to detect spatially divergent selection among methods | Beaumont & Nichols test (FDIST2; Beaumont and Nichols) | Island Model (IM) | None tested | Soft selection | Under IBD, FDIST2 and BayeScan have low power and high FPR |
| Forester et al. | 1. Describe how variation in environment, strength of selection and dispersal affect strength of local adaptation | Principal components analysis (PCA) | IBD with varying dispersal distances | None tested | P1 - Continuous (clinal) gradient | RDA and dbRDA have highest power, low FPRs and strongest GEA indices under all scenarios; PCA, PCoA and LFMM show stronger GEA indices at intermediate dispersal levels; ordination methods broadly control for population structure due to IBD better than other techniques |
| Lotterhos and Whitlock | 1. Compare power of GEAs and outlier differentiation methods to detect loci involved in local adaptation based on: Sampling design and | XTX (Günther and Coop) | Island Model (IM) | S1 - Transect | Weak clinal selection | Pairwise sampling has high power for detecting genes under weak selection; transects are better at detecting clines |
Summarized are questions, sampling methods, analysis methods and conclusions as to which methods lead to low false positive rates and high power to detect loci under selection.
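The power and false-positive rate (FPR) figures quoted in the table can be made concrete with a small sketch. It assumes a simulation in which the truly selected loci are known, and it defines power as the fraction of selected loci that a scan flags and FPR as the fraction of neutral loci that it flags. These definitions, the function name `scan_performance` and the toy numbers are assumptions for this example; individual simulation studies may define their error rates differently.

```python
def scan_performance(flagged, selected, all_loci):
    """Power and false-positive rate of a genome scan on simulated data.

    flagged  -- loci the scan reports as candidates under selection
    selected -- loci that were truly under selection in the simulation
    all_loci -- every locus that was scanned
    """
    flagged, selected, all_loci = set(flagged), set(selected), set(all_loci)
    neutral = all_loci - selected
    true_positives = flagged & selected
    false_positives = flagged & neutral
    power = len(true_positives) / len(selected) if selected else float("nan")
    fpr = len(false_positives) / len(neutral) if neutral else float("nan")
    return {"power": power, "false_positive_rate": fpr}


# Hypothetical simulation: 1000 loci, the first 10 under selection;
# the scan flags 7 selected loci and 3 neutral ones.
all_loci = range(1000)
selected = range(10)
flagged = [0, 1, 2, 3, 4, 5, 6, 100, 200, 300]

result = scan_performance(flagged, selected, all_loci)
print(result)
```

In this toy case the scan recovers 7 of 10 selected loci and flags 3 of 990 neutral loci.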
Figure 1. An illustration of clines. X-axes correspond to position along geographic transects (ecological gradient) or hybrid indexes (genomic gradient) in the case of genomic cline analyses. (A) Illustration of the three parameters typically estimated in geographic or genomic cline analysis. Cline slope is the estimate of the rate of allele frequency turnover at the steepest point in the cline. In genomic cline analysis this corresponds to the rate of introgression. Cline center corresponds to the point along the geographic transect or hybrid index at which allele frequency turnover is greatest. Cline width corresponds to the region along the gradient at which its influence on allele frequency is greatest. (B) Three examples of clines. (i) A transect along which no selection appears to be acting, or the effects of gene flow are such that changes in allele frequency are purely a function of distance. In the case of genomic cline analyses, the locus under consideration appears to be favored equally in both parental taxa. (ii) A modest cline in which the allele favored by selection changes along the gradient. Given its shallower slope, selection may be weaker, gene flow stronger (in the case of geographic transects) or the ecotone separating the ends of the transect greater. (iii) A steep cline, often called a step cline. In the case of geographic clines, these are formed either by strong selection acting in favor of one allele along a sudden ecotone, or by extremely limited gene flow along said ecotone. In the case of genomic clines, this may be due to heterozygote disadvantage, as in the case of reinforcement.
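The three cline parameters in panel (A) can be illustrated with a minimal sketch. It assumes a sigmoid (logistic) cline and the common convention that cline width equals the total change in allele frequency divided by the maximum slope. Neither the functional form nor the names `cline_frequency` and `max_slope` come from the article, and the transect positions are hypothetical.

```python
import math


def cline_frequency(x, center, width, p_min=0.0, p_max=1.0):
    """Expected allele frequency at position x along a transect (sigmoid cline).

    The tanh form equals a logistic curve whose steepest point is at
    x == center, where the slope is (p_max - p_min) / width.
    """
    return p_min + (p_max - p_min) * 0.5 * (1.0 + math.tanh(2.0 * (x - center) / width))


def max_slope(width, p_min=0.0, p_max=1.0):
    """Slope at the cline center under the parameterization above."""
    return (p_max - p_min) / width


# Hypothetical transect positions (arbitrary distance units).
positions = [0, 25, 50, 75, 100]

# A modest cline (wide) versus a steep "step" cline (narrow), same center.
shallow = [cline_frequency(x, center=50, width=80) for x in positions]
steep = [cline_frequency(x, center=50, width=5) for x in positions]

for x, p_shallow, p_steep in zip(positions, shallow, steep):
    print(f"x={x:>3}  shallow={p_shallow:.3f}  steep={p_steep:.3f}")
```

Under this parameterization a narrower width gives a steeper slope at the center, matching the contrast between the modest cline in (ii) and the step cline in (iii).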