Gabriel Renaud¹
¹ Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany.
Abstract
Motivation: Research projects involving population genomics routinely need to store genotyping information and population allele counts, combine files from different samples, query the data and export it to various formats. This is often done using bespoke in-house scripts, which cannot be easily adapted to new projects and seldom constitute reproducible workflows.
Results: We introduce glactools, a set of command-line utilities that can import data from genotypes or population-wide allele counts into an intermediate representation, perform various operations on it and export the data to several file formats used by population genetics software. This intermediate format can take two forms: one stores per-individual genotype likelihoods and the other stores allele counts from one or more individuals. glactools allows users to perform operations such as intersecting datasets, merging individuals into populations, creating subsets, performing queries (e.g. returning sites where a given population does not share an allele with a second one) and computing summary statistics to answer biologically relevant questions.
Availability and implementation: glactools is freely available for use under the GPL. It requires a C++ compiler and the htslib library. The source code and instructions for downloading test data are available on the website (https://grenaud.github.io/glactools/).
Contact: gabriel.reno@gmail.com.
Supplementary information: Supplementary data are available at Bioinformatics online.
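To make two of the operations named in the abstract concrete (merging individuals into a population, and querying for sites where one population does not share an allele with another), here is a minimal Python sketch. It does not use glactools, its file formats or its command-line interface. The `AlleleCount` layout, the function names and the toy data are all hypothetical and chosen only for illustration, assuming biallelic sites with one reference and one alternative allele.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class AlleleCount(NamedTuple):
    """Reference/alternative allele counts at one site (hypothetical layout)."""
    ref: int
    alt: int


Site = Tuple[str, int]            # (chromosome, position)
Record = Dict[str, AlleleCount]   # individual or population name -> counts


def merge_into_population(record: Record, members: List[str], name: str) -> Record:
    """Sum the counts of `members` into a single population called `name`."""
    merged = {k: v for k, v in record.items() if k not in members}
    merged[name] = AlleleCount(
        ref=sum(record[m].ref for m in members),
        alt=sum(record[m].alt for m in members),
    )
    return merged


def alleles_present(count: AlleleCount) -> Set[str]:
    """Return the set of alleles observed at least once."""
    present = set()
    if count.ref > 0:
        present.add("ref")
    if count.alt > 0:
        present.add("alt")
    return present


def no_shared_allele(sites: Dict[Site, Record], pop_a: str, pop_b: str) -> List[Site]:
    """Sites where both populations have data but carry disjoint allele sets."""
    result = []
    for site, record in sites.items():
        a = alleles_present(record[pop_a])
        b = alleles_present(record[pop_b])
        # Sites with no data in either population are skipped, not reported.
        if a and b and a.isdisjoint(b):
            result.append(site)
    return result


# Toy data (invented for this example): two individuals and one population.
sites = {
    ("chr1", 100): {"ind1": AlleleCount(2, 0), "ind2": AlleleCount(1, 1), "popB": AlleleCount(0, 4)},
    ("chr1", 200): {"ind1": AlleleCount(2, 0), "ind2": AlleleCount(2, 0), "popB": AlleleCount(0, 6)},
    ("chr1", 300): {"ind1": AlleleCount(0, 0), "ind2": AlleleCount(0, 0), "popB": AlleleCount(3, 1)},
}

# Merge ind1 and ind2 into "popA" at every site, then run the query.
merged_sites = {
    site: merge_into_population(record, ["ind1", "ind2"], "popA")
    for site, record in sites.items()
}
private_sites = no_shared_allele(merged_sites, "popA", "popB")
print(private_sites)
```

In this sketch a site with no observed alleles in one population is treated as missing data and excluded from the result. That is a design choice of the example, not a statement about how glactools handles missing data.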