GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research--an update.

Rod Peakall, Peter E Smouse.

Abstract

SUMMARY: GenAlEx: Genetic Analysis in Excel is a cross-platform package for population genetic analyses that runs within Microsoft Excel. GenAlEx offers analysis of diploid codominant, haploid and binary genetic loci and DNA sequences. Both frequency-based (F-statistics, heterozygosity, HWE, population assignment, relatedness) and distance-based (AMOVA, PCoA, Mantel tests, multivariate spatial autocorrelation) analyses are provided. New features include calculation of new estimators of population structure: G'(ST), G''(ST), Jost's D(est) and F'(ST) through AMOVA, Shannon Information analysis, linkage disequilibrium analysis for biallelic data and novel heterogeneity tests for spatial autocorrelation analysis. Export to more than 30 other data formats is provided. Teaching tutorials and expanded step-by-step output options are included. The comprehensive guide has been fully revised.
AVAILABILITY AND IMPLEMENTATION: GenAlEx is written in VBA and provided as a Microsoft Excel Add-in (compatible with Excel 2003, 2007, 2010 on PC; Excel 2004, 2011 on Macintosh). GenAlEx, and supporting documentation and tutorials are freely available at: http://biology.anu.edu.au/GenAlEx. CONTACT: rod.peakall@anu.edu.au.

Year:  2012        PMID: 22820204      PMCID: PMC3463245          DOI: 10.1093/bioinformatics/bts460

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

GenAlEx 6 was originally developed as a tool for teaching population genetic analysis at the graduate level (Peakall and Smouse, 2006). GenAlEx operates within Microsoft Excel, the widely used spreadsheet software that forms part of the cross-platform Microsoft Office suite. Packaging genetic analysis within a familiar and flexible environment has allowed users to understand and perform population genetic analyses quickly and effectively. Taking advantage of the rich graphical options available within Excel, GenAlEx offers a wide range of graphical outputs that aid genetic data analysis and interpretation. GenAlEx is now widely used by university teachers at both undergraduate and graduate levels around the world. Moreover, the software has also attracted a large number of researchers who utilize its unique features. Here we provide an update on the new features offered in GenAlEx 6.5 that we believe will be welcomed by students, teachers and researchers.

GenAlEx offers population genetic analysis of diploid codominant, haploid, haplotypic and binary genetic data from animals, plants and microorganisms. It accommodates a wide range of genetic markers, including microsatellites (SSRs), single-nucleotide polymorphisms (SNPs), amplified fragment length polymorphisms (AFLPs) and DNA sequences. Both allele frequency-based and distance-based analysis options are provided. The former includes estimates of heterozygosity and genetic diversity, F-statistics, Nei’s genetic distance, population assignment and relatedness. The latter includes Analysis of Molecular Variance (AMOVA), Principal Coordinates Analysis (PCoA), Mantel tests, TwoGener, multivariate and 2D spatial autocorrelation. Readers are referred to Peakall and Smouse (2006) for a more comprehensive outline of these standard procedures, data formats and data import options.

GenAlEx 6.5 maintains backward compatibility, but it provides access to the expanded spreadsheet of Excel 2007 onward. Thus, the maximum numbers of loci and samples are vastly expanded and only constrained by memory. More than 30 different Excel graphs summarize the outcomes of genetic analyses. Graphics can be further manipulated with Excel options and easily converted to PDF or other publication-quality formats.

2 NEW FEATURES

2.1 New estimators of population structure

There has been much recent debate about the utility of FST as a measure of population genetic structure (Jost, 2008; Ryman and Leimar, 2009; Whitlock, 2011). GenAlEx 6.5 offers the calculation of G′ST, G′′ST and Jost’s Dest, which provide [0,1]-standardized allele frequency-based estimators of population genetic structure, following Meirmans and Hedrick (2011). The null hypothesis is tested by random permutation, and variances are estimated via jackknifing and bootstrapping over loci. New AMOVA routines now enable the estimation of standardized F′ST, following Meirmans (2006). The calculation of these statistics was validated by comparison with the software GenoDive v2.0b22 (Meirmans and Van Tienderen, 2004).
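To make the relationship between these estimators concrete, the following is a minimal Python sketch of the single-locus formulas as they are commonly written in terms of within-population (H_S) and total (H_T) expected heterozygosity for k populations. It is not GenAlEx's VBA implementation: the function name `differentiation_stats` is hypothetical, populations are weighted equally, and the small-sample bias corrections used by estimators in practice are deliberately omitted.

```python
def differentiation_stats(pop_freqs):
    """Return G_ST, G'_ST, G''_ST and Jost's D for one locus.

    pop_freqs: list of per-population allele-frequency lists (each sums to 1).
    Assumptions (illustrative only): equal population weights and no
    small-sample bias correction of H_S or H_T.
    """
    k = len(pop_freqs)
    if k < 2:
        raise ValueError("need at least two populations")
    n_alleles = len(pop_freqs[0])

    # H_S: mean expected heterozygosity within populations.
    hs = sum(1.0 - sum(p * p for p in freqs) for freqs in pop_freqs) / k

    # H_T: expected heterozygosity of the mean allele frequencies.
    mean_freqs = [sum(freqs[i] for freqs in pop_freqs) / k
                  for i in range(n_alleles)]
    ht = 1.0 - sum(p * p for p in mean_freqs)
    if ht == 0.0:
        raise ValueError("locus is monomorphic across all populations")

    gst = (ht - hs) / ht
    # G'_ST: G_ST divided by its maximum given H_S and k.
    g_prime = gst * (k - 1 + hs) / ((k - 1) * (1.0 - hs))
    # G''_ST: variant that removes the dependence on k.
    g_dprime = k * (ht - hs) / ((k * ht - hs) * (1.0 - hs))
    # Jost's D.
    jost_d = (k / (k - 1)) * (ht - hs) / (1.0 - hs)
    return {"Gst": gst, "G'st": g_prime, "G''st": g_dprime, "D": jost_d}


# Two populations that share no alleles but are each polymorphic.
no_shared = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
print(differentiation_stats(no_shared))
```

In this example the two populations share no alleles, yet G_ST is only about one third because within-population diversity is high; the standardized estimators all reach their upper bound of 1, which is the behaviour that motivated them.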

2.2 Shannon’s information statistics

Shannon information indices have been widely used in ecology but largely overlooked in genetics despite offering a framework for quantifying biological diversity across multiple scales (genes to landscapes). GenAlEx offers the calculation of a series of Shannon indices, including the mutual information index UA, an alternative estimator of population structure. The methods follow Sherwin et al. (2006), who assessed the performance of Shannon indices for estimating genetic diversity. Smouse and Ward (1978) extend the approach to multiple hierarchical levels; GenAlEx 6.5 offers a unique three-level partition option with statistical testing by random permutation.
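As an illustration of the mutual information idea, the sketch below computes Shannon's index for each population and for the pooled allele frequencies, and takes the difference between the pooled value and the mean within-population value. This is the textbook definition of mutual information between allele identity and population membership, not GenAlEx's own code; the function names are hypothetical, populations are weighted equally, and natural logarithms are assumed.

```python
import math


def shannon(freqs):
    """Shannon index H = -sum(p * ln p), skipping absent alleles."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def mutual_information(pop_freqs):
    """Mutual information between alleles and populations at one locus.

    pop_freqs: list of per-population allele-frequency lists.
    Assumes equal population weights (an illustrative simplification).
    """
    k = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    pooled = [sum(freqs[i] for freqs in pop_freqs) / k
              for i in range(n_alleles)]
    mean_within = sum(shannon(freqs) for freqs in pop_freqs) / k
    return shannon(pooled) - mean_within


# Two populations fixed for different alleles: information equals ln(2).
print(mutual_information([[1.0, 0.0], [0.0, 1.0]]))
```

With two equally weighted populations the index ranges from 0 (identical allele frequencies) to ln 2 (no shared alleles), so knowing an allele's identity tells you at most one binary choice about its population of origin.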

2.3 Tools for comparing pairwise population statistics

The Mantel test capability of GenAlEx has been extended to allow multiple comparisons among pairwise population statistics such as FST, F′ST, G′ST, G′′ST, Dest and the Shannon mutual information index. This allows informed comparison of the new estimators of population structure.
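For readers unfamiliar with the procedure, a simple Mantel test between two pairwise population matrices can be sketched as follows. This is a generic one-tailed permutation version written for illustration, not the GenAlEx routine; the function names, the example matrices and the default of 999 permutations are all assumptions.

```python
import math
import random


def _upper(matrix, order):
    """Upper-triangle values after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(a, b, n_perm=999, seed=0):
    """Return (r, p) for a one-tailed Mantel test of symmetric matrices a, b.

    Rows and columns of `a` are permuted together; `b` is held fixed.
    """
    rng = random.Random(seed)
    identity = list(range(len(a)))
    y = _upper(b, identity)
    r_obs = _pearson(_upper(a, identity), y)
    hits = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)
        # Small tolerance guards against floating-point ties.
        if _pearson(_upper(a, order), y) >= r_obs - 1e-12:
            hits += 1
    # The observed value counts as one of the permutations.
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical pairwise matrices for four populations.
fst = [[0.00, 0.10, 0.20, 0.30],
       [0.10, 0.00, 0.15, 0.25],
       [0.20, 0.15, 0.00, 0.12],
       [0.30, 0.25, 0.12, 0.00]]
dest = [[0.00, 0.22, 0.41, 0.58],
        [0.22, 0.00, 0.30, 0.52],
        [0.41, 0.30, 0.00, 0.27],
        [0.58, 0.52, 0.27, 0.00]]
r, p = mantel(fst, dest)
print(round(r, 3), p)
```

Permuting whole rows and columns together, rather than shuffling individual cells, preserves the dependence among the pairwise values that share a population, which is why the Mantel test is used instead of an ordinary correlation test.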

2.4 Heterogeneity testing for spatial autocorrelation

GenAlEx 6.5 introduces novel heterogeneity tests (Smouse et al.), extending application of the multiallelic, multilocus spatial autocorrelation analysis methods of Smouse and Peakall (1999), Peakall et al. and Double et al. (2005). These new methods provide valuable insights into fine-scale genetic processes across a wide range of animals and plants. Banks and Peakall (2012) have confirmed the statistical power and performance of this heterogeneity test by spatially explicit computer simulations.

2.5 Linkage disequilibrium (LD) tests for biallelic data

Despite its importance, there is no universal test for disequilibrium (Slatkin, 2008). GenAlEx 6.5 offers pairwise tests for disequilibrium between biallelic markers such as SNPs. When phase is known, this includes the calculation of D, D′, r and r², following Hedrick (2005). Maximum likelihood estimation is used to calculate D and r when phase is unknown (Weir, 1990, p. 310). The results were validated against GDA (Lewis and Zaykin, 2001). Inclusion of LD fills an important technical gap, particularly for teachers. For large SNP sets, or multiallelic data, GenAlEx users are encouraged to take advantage of the options to export their data to other packages such as Arlequin 3.5 (Excoffier and Lischer, 2010).
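For the phase-known case, the four statistics can be computed directly from haplotype counts. The sketch below uses the standard textbook definitions for two biallelic loci with alleles A/a and B/b; it is an illustration rather than GenAlEx's implementation, the function name `ld_known_phase` is hypothetical, and it does not cover the maximum likelihood approach needed when phase is unknown.

```python
import math


def ld_known_phase(n_AB, n_Ab, n_aB, n_ab):
    """Return D, D', r and r^2 from counts of the four haplotypes.

    The sign of D is retained in D' and r; some texts report |D'| instead.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total

    variance_term = p_A * (1 - p_A) * p_B * (1 - p_B)
    if variance_term == 0:
        raise ValueError("both loci must be polymorphic")

    # D: deviation of the AB haplotype frequency from its expectation.
    d = p_AB - p_A * p_B

    # D': D scaled by the largest magnitude it can take for these frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max

    # r: correlation between allelic states at the two loci.
    r = d / math.sqrt(variance_term)
    return {"D": d, "D_prime": d_prime, "r": r, "r2": r * r}


# Complete association: only AB and ab haplotypes are observed.
print(ld_known_phase(50, 0, 0, 50))
```

D depends strongly on allele frequencies, which is why the scaled measures D′ and r² are usually reported alongside it when comparing pairs of loci.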

2.6 New allele frequency format

Retrospective calculation of the new estimators of population structure, such as G′ST, Dest and Shannon indices, is now possible from published allele frequency data. Teachers will also find this a helpful option for the re-analysis of textbook examples.

2.7 Import and export options

GenAlEx offers data import from several popular formats and tools for importing and manipulating raw data from DNA sequencers. Export to more than 30 other data formats is provided, enabling access to myriad other software packages. For example, direct export is offered to programs such as GENEPOP (Rousset, 2008) and STRUCTURE (Pritchard et al.), and via these same formats to many other programs, including genetic packages in R such as adegenet (Jombart, 2008) and pegas (Paradis, 2010). The full list of export options, along with notes on the export process, can be found at the website.

3 SPECIAL FEATURES FOR TEACHING

Offering a user-friendly software package for university students and teachers remains an ongoing goal of GenAlEx. We continue to expand the popular step-by-step output options that allow students to follow the steps in the analytical pathway. Teaching-specific menu options are also provided. For example, the Rand menu allows students to permute and bootstrap hypothetical datasets with color tracking, to aid an understanding of how these statistical tests work. Finally, we have made freely available a set of tutorial notes and supporting datasets drawn from the graduate workshops that we have offered (both jointly and independently) around the world.

4 DOCUMENTATION

More than 150 pages of documentation are provided. This includes Appendix 1, which outlines the statistical analyses used and their supporting references. The revised guide to GenAlEx 6.5 fully cross-links with the GenAlEx tutorials and Appendix 1.

5 CONCLUSION

GenAlEx 6.5 offers a wide range of population genetic analysis options for the full spectrum of genetic markers within the Microsoft Excel environment on both PC and Macintosh computers. When combined with its user-friendly interface, rich graphical outputs for data exploration and publication, tools for data manipulation and export options to many other software packages, we believe that GenAlEx offers an ideal launching pad for population genetic analysis by students, teachers and researchers alike.

REFERENCES (10 of 19 listed)

1.  Spatial autocorrelation analysis of individual multiallele and multilocus genetic structure.

Authors:  P E Smouse; R Peakall
Journal:  Heredity (Edinb)       Date:  1999-05       Impact factor: 3.821

2.  Dispersal, philopatry, and infidelity: dissecting local genetic structure in superb fairy-wrens (Malurus cyaneus).

Authors:  M C Double; R Peakall; N R Beck; A Cockburn
Journal:  Evolution       Date:  2005-03       Impact factor: 3.694

3.  Using the AMOVA framework to estimate a standardized genetic differentiation measure.

Authors:  Patrick G Meirmans
Journal:  Evolution       Date:  2006-11       Impact factor: 3.694

4.  Measurement of biological information with applications from genes to landscapes.

Authors:  William B Sherwin; Franck Jabot; Rebecca Rush; Maurizio Rossetto
Journal:  Mol Ecol       Date:  2006-09       Impact factor: 6.185

5.  G(ST) and its relatives do not measure differentiation.

Authors:  Lou Jost
Journal:  Mol Ecol       Date:  2008-09       Impact factor: 6.185

6.  pegas: an R package for population genetics with an integrated-modular approach.

Authors:  Emmanuel Paradis
Journal:  Bioinformatics       Date:  2010-01-14       Impact factor: 6.937

7.  adegenet: a R package for the multivariate analysis of genetic markers.

Authors:  Thibaut Jombart
Journal:  Bioinformatics       Date:  2008-04-08       Impact factor: 6.937

8.  Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows.

Authors:  Laurent Excoffier; Heidi E L Lischer
Journal:  Mol Ecol Resour       Date:  2010-03-01       Impact factor: 7.090

9.  genepop'007: a complete re-implementation of the genepop software for Windows and Linux.

Authors:  François Rousset
Journal:  Mol Ecol Resour       Date:  2008-01       Impact factor: 7.090

10.  GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research--an update.

Authors:  Rod Peakall; Peter E Smouse
Journal:  Bioinformatics       Date:  2012-07-20       Impact factor: 6.937
