Gerry Tonkin-Hill, John A Lees, Stephen D Bentley, Simon D W Frost, Jukka Corander.
Abstract
Identifying structure in collections of sequence data sets remains a common problem in genomics. hierBAPS, a popular algorithm for identifying population structure in haploid genomes, has previously only been available as a MATLAB binary. We provide an R implementation which is both easier to install and use, automating the entire pipeline. Additionally, we allow for the use of multiple processors, improve on the default settings of the algorithm, and provide an interface with the ggtree library to enable informative illustration of the clustering results. Our aim is that this package aids in the understanding and dissemination of the method, as well as enhancing the reproducibility of population structure analyses.
Keywords: R; clustering; population structure
Year: 2018; PMID: 30345380; PMCID: PMC6178908; DOI: 10.12688/wellcomeopenres.14694.1
Source DB: PubMed; Journal: Wellcome Open Res; ISSN: 2398-502X
Figure 1. Phylogenetic tree built using Iqtree and annotated with the top level clusters identified using rhierBAPS.
Figure 2. Phylogenetic tree focusing on the 9th cluster at the top level identified using rhierBAPS and plotted using the plot_sub_cluster function. The subsequent clustering at the 2nd level is indicated in the sub-tree to the right.
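The workflow behind the abstract and the two figures can be sketched roughly as follows. This is a minimal illustration, not the authors' exact script: the function and argument names (`load_fasta`, `hierBAPS`, `max.depth`, `n.pops`, `n.cores`, `plot_sub_cluster`) are recalled from the rhierBAPS package documentation and should be checked against the installed version. The bundled example files `seqs.fa` and `seqs.fa.treefile` and the parameter values are assumptions made for illustration.

```r
# Assumes the rhierBAPS, ape and ggtree packages are installed.
library(rhierBAPS)

# Example alignment assumed to ship with the package; substitute your own FASTA file.
fasta.file.name <- system.file("extdata", "seqs.fa", package = "rhierBAPS")

# Load the alignment as a matrix of SNP sites, one row per isolate.
snp.matrix <- load_fasta(fasta.file.name)

# Cluster hierarchically to two levels. n.pops is the upper bound on the
# number of top-level populations; n.cores > 1 uses multiple processors.
hb.results <- hierBAPS(snp.matrix, max.depth = 2, n.pops = 20,
                       n.cores = 1, quiet = TRUE)

# One row per isolate, with its cluster assignment at each level.
head(hb.results$partition.df)

# Zoom in on one top-level cluster on a precomputed tree, as in Figure 2.
# The tree file name is an assumption; any Newick tree with matching tip
# labels should work.
newick.file.name <- system.file("extdata", "seqs.fa.treefile",
                                package = "rhierBAPS")
if (nzchar(newick.file.name)) {
  tree <- ape::read.tree(newick.file.name)
  plot_sub_cluster(hb.results, tree, level = 1, sub.cluster = 9)
}
```

The tree itself is not produced by rhierBAPS; in the figures it was built separately with Iqtree and only annotated with the clustering results.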