Eduardo da Cruz Gouveia Pimentel, Mehdi Sargolzaei, Henner Simianer, Flávio Schramm Schenkel, Zengting Liu, Luiz Alberto Fries, Sandra Aidar de Queiroz.
Abstract
The aim of this study was to compare iterative and direct solvers for estimation of marker effects in genomic selection. One iterative and two direct methods were used: Gauss-Seidel with Residual Update, Cholesky Decomposition and Gentleman-Givens rotations. To represent different scenarios with respect to the numbers of markers and of genotyped animals, a simulated data set divided into 25 subsets was used. The number of markers ranged from 1,200 to 5,925 and the number of animals ranged from 1,200 to 5,865. The methods were also applied to real data comprising 3,081 individuals genotyped for 45,181 SNPs. Results from simulated data showed that the iterative solver was substantially faster than the direct methods for larger numbers of markers. Use of a direct solver may allow for computing (co)variances of SNP effects. When applied to real data, performance of the iterative method varied substantially, depending on the level of ill-conditioning of the coefficient matrix. From results with real data, Gentleman-Givens rotations would be the method of choice in this particular application, as it provided an exact solution within a fairly reasonable time frame (less than two hours). It would indeed be the preferred method whenever computer resources allow its use.
Keywords: breeding value; genomic selection; mixed model equations; numerical method
Year: 2010 PMID: 21637627 PMCID: PMC3036084 DOI: 10.1590/S1415-47572010005000014
Source DB: PubMed Journal: Genet Mol Biol ISSN: 1415-4757 Impact factor: 1.771
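The iterative solver named in the abstract, Gauss-Seidel with Residual Update, can be illustrated with a minimal sketch. The record above does not spell out the model, so the following assumes a plain ridge-regression (SNP-BLUP-style) system (X'X + λI)g = X'y with a single shrinkage parameter λ and a phenotype vector that is already centered. The function name `gsru`, its parameters and the stopping rule (squared relative difference between consecutive solutions) are illustrative choices, not the authors' implementation.

```python
def gsru(X, y, lam, max_iter=1000, tol=1e-16):
    """Solve (X'X + lam*I) g = X'y by Gauss-Seidel with residual update.

    X   : list of n rows, each a list of m marker covariates (assumed layout)
    y   : list of n centered phenotypes (assumption: no mean is fitted here)
    lam : shrinkage parameter added to the diagonal (assumed single value)
    Returns the solution vector g and the number of sweeps performed.
    """
    n, m = len(X), len(X[0])
    # Store columns once so each marker's covariates can be scanned directly.
    cols = [[X[i][j] for i in range(n)] for j in range(m)]
    xtx = [sum(v * v for v in c) for c in cols]  # diagonal of X'X
    g = [0.0] * m
    e = list(y)  # residuals y - Xg for the starting value g = 0
    it = 0
    for it in range(1, max_iter + 1):
        num = den = 0.0
        for j, c in enumerate(cols):
            # Right-hand side for marker j, built from the current residuals
            # instead of from an explicit row of X'X.
            rhs = sum(cv * ev for cv, ev in zip(c, e)) + xtx[j] * g[j]
            new = rhs / (xtx[j] + lam)
            delta = new - g[j]
            if delta != 0.0:
                # Residual update: keep e equal to y - Xg after the change.
                for i in range(n):
                    e[i] -= c[i] * delta
            num += delta * delta
            den += new * new
            g[j] = new
        # Stop when consecutive solutions agree (or the solution is all zero).
        if den == 0.0 or num / den < tol:
            break
    return g, it


if __name__ == "__main__":
    # Tiny made-up example: 3 records, 2 markers.
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    y = [1.0, 2.0, 3.0]
    solution, sweeps = gsru(X, y, lam=1.0)
    print(solution, sweeps)
```

Because the right-hand side for each marker is rebuilt from the residual vector, the sketch never forms the full coefficient matrix, which is consistent with the lower memory figures reported for GSRU when markers outnumber animals. The direct methods, by contrast, work on the full system.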
Computational requirements for solving the equations using Gentleman-Givens (GG), Gauss-Seidel with Residual Update (GSRU) and Cholesky decomposition (CHD), for the different numbers of animals and markers contemplated in the simulated data set.
| Number of animals | Number of markers | GG time (s) | GG memory (MB) | GSRU time (s) | GSRU memory (MB) | CHD time (s) | CHD memory (MB) |
|---|---|---|---|---|---|---|---|
| 1,200 | 1,200 | 3.1 | 2.75 | 2.3 | 2.75 | 2.6 | 2.75 |
| 1,200 | 2,400 | 11.8 | 10.99 | 4.4 | 5.49 | 14.0 | 10.99 |
| 1,200 | 3,600 | 26.3 | 24.73 | 7.0 | 8.24 | 40.4 | 24.73 |
| 1,200 | 4,800 | 47.4 | 43.95 | 9.3 | 10.99 | 87.5 | 43.95 |
| 1,200 | 5,925 | 71.9 | 66.97 | 11.5 | 13.56 | 156.0 | 66.97 |
| 2,400 | 1,200 | 6.2 | 2.75 | 8.7 | 5.49 | 4.2 | 2.75 |
| 2,400 | 2,400 | 23.6 | 10.99 | 15.4 | 10.99 | 20.0 | 10.99 |
| 2,400 | 3,600 | 52.6 | 24.73 | 25.4 | 16.48 | 54.0 | 24.73 |
| 2,400 | 4,800 | 94.9 | 43.95 | 36.7 | 21.97 | 111.2 | 43.95 |
| 2,400 | 5,925 | 143.8 | 66.97 | 43.4 | 27.12 | 191.8 | 66.97 |
| 3,600 | 1,200 | 9.4 | 2.75 | 19.0 | 8.24 | 5.7 | 2.75 |
| 3,600 | 2,400 | 35.4 | 10.99 | 35.3 | 16.48 | 26.0 | 10.99 |
| 3,600 | 3,600 | 78.9 | 24.73 | 58.9 | 24.72 | 67.2 | 24.73 |
| 3,600 | 4,800 | 142.3 | 43.95 | 83.3 | 32.96 | 134.9 | 43.95 |
| 3,600 | 5,925 | 215.6 | 66.97 | 98.4 | 40.68 | 227.8 | 66.97 |
| 4,800 | 1,200 | 12.6 | 2.75 | 30.4 | 10.99 | 7.3 | 2.75 |
| 4,800 | 2,400 | 47.2 | 10.99 | 53.9 | 21.97 | 32.6 | 10.99 |
| 4,800 | 3,600 | 105.2 | 24.73 | 96.0 | 32.96 | 82.1 | 24.73 |
| 4,800 | 4,800 | 189.7 | 43.95 | 131.6 | 43.95 | 161.4 | 43.95 |
| 4,800 | 5,925 | 287.5 | 66.97 | 167.5 | 54.24 | 268.8 | 66.97 |
| 5,865 | 1,200 | 15.3 | 2.75 | 42.9 | 13.42 | 8.7 | 2.75 |
| 5,865 | 2,400 | 57.6 | 10.99 | 87.3 | 26.85 | 38.2 | 10.99 |
| 5,865 | 3,600 | 128.2 | 24.73 | 153.2 | 40.27 | 94.2 | 24.73 |
| 5,865 | 4,800 | 232.0 | 43.95 | 192.4 | 53.70 | 182.9 | 43.95 |
| 5,865 | 5,925 | 351.4 | 66.97 | 240.2 | 66.28 | 301.4 | 66.97 |
Figure 1. Relative performance of the methods for the lowest, intermediate and the largest number of animals in the simulated data set.
Figure 2. Relative performance of the methods for the lowest, intermediate and the largest number of markers in the simulated data set.
Figure 3. Log10 of relative difference between consecutive solutions across iterations in GSRU applied to the weighted analysis of the real data.
Figure 4. Processing times (in seconds) for estimating standard errors using Gentleman-Givens/Cholesky decomposition, for different numbers of markers in the simulated data.