Gaurav Bhatia, Nick Patterson, Sriram Sankararaman, Alkes L Price.
Abstract
In a pair of seminal papers, Sewall Wright and Gustave Malécot introduced FST as a measure of structure in natural populations. In the decades that followed, a number of papers provided differing definitions, estimation methods, and interpretations beyond Wright's. While this diversity in methods has enabled many studies in genetics, it has also introduced confusion regarding how to estimate FST from available data. Considering this confusion, wide variation in published estimates of FST for pairs of HapMap populations is a cause for concern. These estimates changed, in some cases more than twofold, when comparing estimates from genotyping arrays to those from sequence data. Indeed, changes in FST from sequencing data might be expected due to population genetic factors affecting rare variants. While rare variants do influence the result, we show that this is largely through differences in estimation methods. Correcting for this yields estimates of FST that are much more concordant between sequence and genotype data. These differences relate to three specific issues: (1) estimating FST for a single SNP, (2) combining estimates of FST across multiple SNPs, and (3) selecting the set of SNPs used in the computation. Changes in each of these aspects of estimation may result in FST estimates that are highly divergent from one another. Here, we clarify these issues and propose solutions.
Year: 2013 PMID: 23861382 PMCID: PMC3759727 DOI: 10.1101/gr.154831.113
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
FST estimates for pairs of populations in 1000 Genomes
A comparison of the FST estimated using 1000 Genomes and HapMap data by either using a ratio of averages or an average of ratios
Assessing the effect of ascertainment schemes and combination methods on the resulting FST estimate for CEU and CHB
Figure 1. Allele frequency dependence of FST under different ascertainment schemes. This shows FST for CEU and CHB as a function of allele frequency when ascertaining in either CEU, CHB, or YRI. The increased FST for rare variants is consistent with bottlenecks being a stronger force on FST for CEU and CHB than recent expansion. In fact, this is consistent with a stronger bottleneck in the population history of CHB. We note that this frequency dependence disappears when ascertaining in YRI, suggesting that YRI is a reasonable outgroup for the comparison of CEU and CHB.
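The distinction between a "ratio of averages" and an "average of ratios" when combining per-SNP FST estimates can be illustrated with a minimal sketch. The abstract does not give a formula, so the per-SNP estimator below (a Hudson-style numerator and denominator) is an assumption chosen for illustration, and the function names, allele frequencies, and sample sizes are all hypothetical toy values rather than data from the study.

```python
from statistics import mean


def hudson_components(p1, p2, n1, n2):
    """Numerator and denominator of a Hudson-style per-SNP FST estimate.

    p1, p2: sample allele frequencies in the two populations.
    n1, n2: number of sampled chromosomes in each population.
    (Assumed estimator; the abstract itself does not specify one.)
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def ratio_of_averages(snps, n1, n2):
    """Average numerators and denominators separately, then divide."""
    comps = [hudson_components(p1, p2, n1, n2) for p1, p2 in snps]
    return mean(c[0] for c in comps) / mean(c[1] for c in comps)


def average_of_ratios(snps, n1, n2):
    """Compute FST per SNP, then average the per-SNP values."""
    comps = [hudson_components(p1, p2, n1, n2) for p1, p2 in snps]
    # SNPs with a zero denominator (monomorphic in both samples) are skipped.
    return mean(num / den for num, den in comps if den > 0)


# Hypothetical toy frequencies: one common differentiated SNP, three rarer ones.
snps = [(0.50, 0.10), (0.02, 0.00), (0.30, 0.35), (0.01, 0.03)]

roa = ratio_of_averages(snps, n1=100, n2=100)
aor = average_of_ratios(snps, n1=100, n2=100)
print(f"ratio of averages: {roa:.4f}")
print(f"average of ratios: {aor:.4f}")
```

On these toy values the two combination methods give clearly different numbers, because the average of ratios gives each SNP equal weight regardless of how small its denominator is, while the ratio of averages is dominated by SNPs with larger denominators. For a single SNP the two coincide.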