Simon Boitard, Robert Kofler, Pierre Françoise, David Robelin, Christian Schlötterer, Andreas Futschik.
Abstract
Due to its cost effectiveness, next generation sequencing of pools of individuals (Pool-Seq) is becoming a popular strategy for genome-wide estimation of allele frequencies in population samples. As the allele frequency spectrum provides information about past episodes of selection, Pool-Seq is also a promising design for genomic scans for selection. However, no software tool has yet been developed for selection scans based on Pool-Seq data. We introduce Pool-hmm, a Python program for the estimation of allele frequencies and the detection of selective sweeps in a Pool-Seq sample. Pool-hmm includes several options that allow a flexible analysis of Pool-Seq data, and can be run in parallel on several processors. Source code and documentation for Pool-hmm are freely available at https://qgsp.jouy.inra.fr/.
Year: 2013 PMID: 23311589 PMCID: PMC3592992 DOI: 10.1111/1755-0998.12063
Source DB: PubMed Journal: Mol Ecol Resour ISSN: 1755-098X Impact factor: 7.090
Fig. 1. Allele frequency spectrum (AFS) in a quail sample of n = 20 chromosomes, computed from a random sample of genomic positions (empty circles), genomic positions within exons (full circles, left panel), genomic positions within sweep window 1 (empty triangles, right panel) or genomic positions within sweep window 2 (plus signs, right panel). Probabilities of 0 and 20 derived alleles are not shown because they are not on the same scale. The large probability observed for 19 derived alleles may be due to the misspecification of the ancestral allele at a small proportion of segregating sites. Such errors are expected if there is shared polymorphism between quail and chicken.
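The caption's explanation for the excess of 19-derived alleles can be illustrated with a small toy model. The sketch below is not part of Pool-hmm: it assumes a standard neutral spectrum (proportional to 1/k) over segregating sites and a hypothetical 2% rate of ancestral allele misspecification, and the function names are made up for this illustration. Swapping the ancestral and derived labels at a site turns k derived alleles into n - k, so a small error rate moves mass from the abundant singleton class to the 19-derived class.

```python
def neutral_segregating_afs(n):
    """Toy neutral AFS for n chromosomes: P(k derived) proportional to 1/k
    for k = 1..n-1; the monomorphic classes 0 and n are set to zero."""
    weights = [0.0] + [1.0 / k for k in range(1, n)] + [0.0]
    total = sum(weights)
    return [w / total for w in weights]


def misoriented_afs(afs, error_rate):
    """AFS observed when a fraction `error_rate` of sites has the ancestral
    and derived alleles swapped, so that class k is read as class n - k."""
    n = len(afs) - 1
    return [(1 - error_rate) * afs[k] + error_rate * afs[n - k]
            for k in range(n + 1)]


if __name__ == "__main__":
    true_afs = neutral_segregating_afs(20)
    observed = misoriented_afs(true_afs, error_rate=0.02)  # hypothetical rate
    for k in (1, 18, 19):
        print(f"k = {k:2d}: true {true_afs[k]:.4f}, observed {observed[k]:.4f}")
```

Under these assumptions, the true spectrum decreases from 18 to 19 derived alleles, while the observed spectrum shows a bump at 19, qualitatively matching the pattern described in the caption.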
Execution time of Pool-hmm for the analysis of chromosome 1 in a quail sample of n = 20 chromosomes. Results are provided for several types of analyses and for one, four or eight available processors on a computing cluster. The Pool-hmm commands corresponding to these analyses, in the case of one available processor, are listed below the table.
| Number of processors | AFS estimation | First sweep prediction | Additional sweep prediction | Allele frequency estimation |
|---|---|---|---|---|
| 1 | 6 h 4 min 9 s | 12 h 21 min 36 s | 0 h 10 min 14 s | 31 h 7 min 47 s |
| 4 | 1 h 57 min 46 s | 4 h 32 min 57 s | 0 h 09 min 21 s | 7 h 47 min 9 s |
| 8 | 1 h 33 min 30 s | 3 h 15 min 53 s | 0 h 07 min 06 s | 4 h 17 min 14 s |
AFS estimation: `python pool-hmm.py --input-file quail -n 20 -a 'reference' --only-spectrum --theta 0.005 --ratio 50`
First sweep prediction: `python pool-hmm.py --input-file quail -n 20 -a 'reference' --pred --spectrum-file quail -k 0.0000000001`
Additional sweep prediction: `python pool-hmm.py --input-file quail -n 20 -a 'reference' --pred --emit-file -k 0.0000000001`
Allele frequency estimation: `python pool-hmm.py --input-file quail -n 20 -a 'reference' --estim --spectrum-file quail`
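To make the timing table easier to read, the following sketch converts the reported wall-clock times to seconds and derives the speedup relative to one processor, along with the parallel efficiency (speedup divided by processor count). The times are copied from the table; the helper names are chosen for this illustration and are not part of Pool-hmm.

```python
# Wall-clock times from the table as (hours, minutes, seconds),
# keyed by analysis and then by number of processors.
TIMES = {
    "AFS estimation": {1: (6, 4, 9), 4: (1, 57, 46), 8: (1, 33, 30)},
    "First sweep prediction": {1: (12, 21, 36), 4: (4, 32, 57), 8: (3, 15, 53)},
    "Additional sweep prediction": {1: (0, 10, 14), 4: (0, 9, 21), 8: (0, 7, 6)},
    "Allele frequency estimation": {1: (31, 7, 47), 4: (7, 47, 9), 8: (4, 17, 14)},
}


def to_seconds(hms):
    """Convert an (hours, minutes, seconds) tuple to seconds."""
    hours, minutes, seconds = hms
    return 3600 * hours + 60 * minutes + seconds


def speedups(times):
    """Return, per analysis, the speedup relative to the one-processor run."""
    result = {}
    for analysis, by_procs in times.items():
        baseline = to_seconds(by_procs[1])
        result[analysis] = {
            procs: baseline / to_seconds(hms) for procs, hms in by_procs.items()
        }
    return result


if __name__ == "__main__":
    for analysis, by_procs in speedups(TIMES).items():
        for procs, speedup in sorted(by_procs.items()):
            print(f"{analysis}: {procs} processor(s), "
                  f"speedup {speedup:.2f}, efficiency {speedup / procs:.2f}")
```

From these figures, allele frequency estimation scales close to linearly (roughly 4.0x on four processors and 7.3x on eight), whereas additional sweep prediction, which is already short, gains only about 1.4x on eight processors.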