Wilfried M Guiblet, Kai Zhao, Stephen J O'Brien, Steven E Massey, Alfred L Roca, Taras K Oleksyk.
Abstract
BACKGROUND: Adaptive alleles may rise in frequency as a consequence of positive selection, creating a pattern of decreased variation at neighboring loci known as a selective sweep. When the region containing this pattern is compared with the same region in another population with no history of selection, a rise in the variance of allele frequencies between populations is observed. One challenge presented by large genome-wide datasets is differentiating patterns that are remnants of natural selection from those expected to arise at random and/or as a consequence of selectively neutral demographic forces acting in the population.
Keywords: Evolution; Galaxy; Genome; Population; Resampling; Selection
Year: 2015 PMID: 25838885 PMCID: PMC4382839 DOI: 10.1186/2047-217X-4-1
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1. SmileFinder program description, output and outcome examples. A. Description of the SmileFinder algorithm. The program finds chromosomal regions with patterns of selection by comparing distributions of allele frequencies chromosome-wide in two or more populations, and infers the most extreme percentile values for each SNP from a resampled distribution representing a baseline approximating the neutral scenario. The program (1) takes an input of allele frequencies from two or more populations and (2) samples sequential loci along the chromosome using a sliding window of n = 5. (3) At the same time, the program combines allele frequencies from random loci into sets using unrestricted random sampling (r = 10 K, 100 K, 1 M, 10 M or 100 M). (4) The algorithm then calculates the mean and variance of heterozygosity and FST in each window and in each resampled set, and (5) builds a frequency distribution to (6) calculate the percentiles that are (7) superimposed onto the observed distribution. (8) The inferred percentiles are deposited into the output, then the process is repeated with incrementally larger window sizes (5, 7, 9, 11, 13, … , 65). (9) Percentiles are combined across all the different-sized windows, and (10) the maximum value is chosen for the visual inspection of the data. B. The output can be plotted chromosome-wide to help find four patterns of putative regions for signatures of positive selection (modified from Oleksyk et al. [3]). Percentiles have been transformed for visualization: -log10(percentile) = log10(1/percentile). C. The outcomes of a selection scan with the SmileFinder algorithm indicating possible selection in two genes, CUL5 and TRIM5, in Biaka populations from central Africa (modified from Zhao et al. [4]). The positions of the genes on chromosome 11 are given in megabases (Mb).
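The steps in panel A can be illustrated with a minimal Python sketch. This is not the SmileFinder source code: the function names (`heterozygosity`, `fst`, `scan`), the simple two-population FST estimator, the use of the window mean only (the caption also mentions variance), sampling without replacement within each resampled set, and the pseudo-count in the percentile are all assumptions made for illustration. The default window sizes and resample count are kept small so the example runs quickly.

```python
import math
import random
from bisect import bisect_left, bisect_right
from statistics import fmean


def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) of a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(p1, p2):
    """Two-population FST as (HT - HS) / HT with equal weights (assumed estimator)."""
    h_t = heterozygosity((p1 + p2) / 2.0)
    if h_t == 0.0:
        return 0.0  # locus fixed for the same allele in both populations
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    return (h_t - h_s) / h_t


def scan(values, lower_tail=True, window_sizes=(5, 7, 9), n_resamples=2000, seed=0):
    """Return one -log10(percentile) score per SNP, maximised over window sizes.

    values      per-SNP statistic in chromosome order (e.g. heterozygosity or FST)
    lower_tail  True if small window means are the extreme ones (heterozygosity),
                False if large ones are (FST)
    """
    rng = random.Random(seed)
    n = len(values)
    best = [0.0] * n
    for w in window_sizes:
        if w > n:
            continue
        # Baseline: means of sets of w loci drawn at random, ignoring position.
        null = sorted(fmean(rng.sample(values, w)) for _ in range(n_resamples))
        half = w // 2
        for center in range(half, n - half):
            obs = fmean(values[center - half:center + half + 1])
            if lower_tail:
                extreme = bisect_right(null, obs)               # null means <= obs
            else:
                extreme = n_resamples - bisect_left(null, obs)  # null means >= obs
            # Pseudo-count keeps the percentile above zero (assumption).
            percentile = (extreme + 1) / (n_resamples + 1)
            best[center] = max(best[center], -math.log10(percentile))
    return best
```

As a usage example under the same assumptions, a run of low-frequency SNPs embedded in an otherwise variable chromosome should produce the highest scores:

```python
freqs = [0.5] * 100
freqs[40:51] = [0.01] * 11            # simulated region of reduced variation
scores = scan([heterozygosity(p) for p in freqs])
peak = scores.index(max(scores))      # expected to fall inside the region
```

For FST, the same function could be called with `lower_tail=False` on per-SNP values computed with `fst`.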