Colin A Maclean, Neil P Chue Hong, James G D Prendergast.
Abstract
Understanding how the genome is shaped by selective processes forms an integral part of modern biology. However, as genomic datasets continue to grow larger, it is becoming increasingly difficult to apply traditional statistics for detecting signatures of selection to these cohorts. There is therefore a pressing need for the development of the next generation of computational and analytical tools for detecting signatures of selection in large genomic datasets. Here, we present hapbin, an efficient multithreaded implementation of extended haplotype homozygosity-based statistics for detecting selection, which is up to 3,400 times faster than the current fastest implementations of these algorithms.
Keywords: EHH; XP-EHH; iHS; selection; software
Year: 2015 PMID: 26248562 PMCID: PMC4651233 DOI: 10.1093/molbev/msv172
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Figure. Hapbin versus selscan comparisons. (A) Time taken by hapbin and selscan to calculate iHS across chromosome 22 across 48 cores (1 node) on ARCHER and on an Amazon c3.8xlarge instance. Subsets of individuals were randomly sampled from the 1000 Genomes dataset. (B) Time taken by hapbin and selscan to calculate iHS in the 1000 Genomes GBR (Great Britain) population of 89 individuals on the Amazon c3.8xlarge instance. Runs of SNPs contiguous by location were subsampled from all of those on chromosome 22. (C) Hapbin's relative speedup versus selscan when run across chromosome 22 with varying numbers of cores and individuals on ARCHER. (D) Comparison of unstandardized iHS values output by selscan and hapbin when run across 500 randomly selected individuals and all SNPs on chromosome 22.