A Gray (1), I Stewart, A Tenesa. (1) EPCC, The University of Edinburgh, Edinburgh EH9 3JZ, UK. a.gray@ed.ac.uk
Abstract
MOTIVATION: The Genome-wide Complex Trait Analysis (GCTA) software package can quantify the contribution of genetic variation to phenotypic variation for complex traits. However, as the datasets of interest continue to increase in size, GCTA becomes increasingly computationally prohibitive. We present an adapted version, Advanced Complex Trait Analysis (ACTA), which demonstrates dramatically improved performance.
RESULTS: We restructure the genetic relationship matrix (GRM) estimation phase of the code and introduce the highly optimized parallel Basic Linear Algebra Subprograms (BLAS) library, combined with manual parallelization and optimization. We introduce the Linear Algebra PACKage (LAPACK) library into the restricted maximum likelihood (REML) analysis stage. For a test case with 8999 individuals and 279,435 single nucleotide polymorphisms (SNPs), we reduce the total runtime on a compute node with two multi-core Intel Nehalem CPUs from ∼17 h to ∼11 min.
AVAILABILITY AND IMPLEMENTATION: The source code is fully available under the GNU Public License, along with Linux binaries. For more information see http://www.epcc.ed.ac.uk/software-products/acta.
CONTACT: a.gray@ed.ac.uk
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
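The abstract does not give the GRM formula or the restructured code, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the commonly used GRM definition, A_jk = (1/m) Σ_i (x_ij − 2p_i)(x_ik − 2p_i) / (2p_i(1 − p_i)), with no missing genotypes and no monomorphic SNPs. The function names are hypothetical and are not taken from GCTA or ACTA. The sketch contrasts a per-pair loop with a form that standardizes the genotypes once and then computes a single matrix product, which is the kind of operation a BLAS routine could carry out.

```python
from math import sqrt


def allele_frequencies(genotypes):
    """genotypes[j][i] is the allele count (0, 1 or 2) of SNP i in individual j."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[i] for row in genotypes) / (2.0 * n) for i in range(m)]


def grm_pairwise(genotypes):
    """Naive form: redo the per-SNP standardization for every pair of individuals."""
    n, m = len(genotypes), len(genotypes[0])
    p = allele_frequencies(genotypes)
    grm = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            total = 0.0
            for i in range(m):
                total += ((genotypes[j][i] - 2 * p[i])
                          * (genotypes[k][i] - 2 * p[i])
                          / (2 * p[i] * (1 - p[i])))
            grm[j][k] = total / m
    return grm


def grm_matrix_product(genotypes):
    """Restructured form: standardize once into Z, then compute Z Z^T / m."""
    n, m = len(genotypes), len(genotypes[0])
    p = allele_frequencies(genotypes)
    z = [[(row[i] - 2 * p[i]) / sqrt(2 * p[i] * (1 - p[i])) for i in range(m)]
         for row in genotypes]
    # This product is written out in plain Python for illustration only;
    # it is the step that an optimized matrix-multiply routine would replace.
    return [[sum(a * b for a, b in zip(z[j], z[k])) / m for k in range(n)]
            for j in range(n)]


# Toy data (assumption, for illustration): 4 individuals, 3 SNPs.
example = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [1, 2, 1],
]
grm_a = grm_pairwise(example)
grm_b = grm_matrix_product(example)
```

Both functions return the same matrix for the toy data. The second form isolates the expensive step as one dense matrix product, so the work can be handed to an optimized library instead of nested loops.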