Kaarina Matilainen, Esa A Mäntysaari, Martin H Lidauer, Ismo Strandén, Robin Thompson.
Abstract
Estimation of variance components by Monte Carlo (MC) expectation maximization (EM) restricted maximum likelihood (REML) is computationally efficient for large data sets and complex linear mixed effects models. However, efficiency may be lost due to the need for a large number of iterations of the EM algorithm. To decrease the computing time we explored the use of faster converging Newton-type algorithms within MC REML implementations. The implemented algorithms were: MC Newton-Raphson (NR), where the information matrix was generated via sampling; MC average information (AI), where the information was computed as an average of observed and expected information; and MC Broyden's method, where the zero of the gradient was searched using a quasi-Newton-type algorithm. Performance of these algorithms was evaluated using simulated data. The final estimates were in good agreement with corresponding analytical ones. MC NR REML and MC AI REML enhanced convergence compared to MC EM REML and gave standard errors for the estimates as a by-product. MC NR REML required a larger number of MC samples, while each MC AI REML iteration demanded extra solving of mixed model equations by the number of parameters to be estimated. MC Broyden's method required the largest number of MC samples with our small data and did not give standard errors for the parameters directly. We studied the performance of three different convergence criteria for the MC AI REML algorithm. Our results indicate the importance of defining a suitable convergence criterion and critical value in order to obtain an efficient Newton-type method utilizing a MC algorithm. Overall, use of a MC algorithm with Newton-type methods proved feasible and the results encourage testing of these methods with different kinds of large-scale problem settings.
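The three Newton-type variants named in the abstract can be summarized in a common form. The notation below is an assumption introduced here for illustration and is not taken from the source: \theta is the vector of variance components, g is the gradient of the REML log-likelihood L, and I_O and I_E are the observed and expected information matrices. The Broyden step shown is the standard rank-one ("good Broyden") secant update; the exact variant used by the authors may differ.

```latex
% Newton-type step on the REML log-likelihood L(\theta)
\theta^{(k+1)} = \theta^{(k)} + \left[\mathbf{I}\big(\theta^{(k)}\big)\right]^{-1} \mathbf{g}\big(\theta^{(k)}\big),
\qquad \mathbf{g}(\theta) = \frac{\partial L}{\partial \theta}

% Newton-Raphson: observed information
\mathbf{I}_{O}(\theta) = -\frac{\partial^{2} L}{\partial \theta \, \partial \theta^{\top}}

% Average information: mean of observed and expected information
\mathbf{I}_{A}(\theta) = \tfrac{1}{2}\left[\mathbf{I}_{O}(\theta) + \mathbf{I}_{E}(\theta)\right],
\qquad \mathbf{I}_{E}(\theta) = \mathrm{E}\left[\mathbf{I}_{O}(\theta)\right]

% Broyden: B_k approximates the Jacobian of g and is updated by a secant condition
\theta^{(k+1)} = \theta^{(k)} - \mathbf{B}_{k}^{-1} \mathbf{g}\big(\theta^{(k)}\big),
\qquad
\mathbf{B}_{k+1} = \mathbf{B}_{k}
  + \frac{\left(\Delta \mathbf{g}_{k} - \mathbf{B}_{k} \Delta \theta_{k}\right) \Delta \theta_{k}^{\top}}
         {\Delta \theta_{k}^{\top} \Delta \theta_{k}}
```

Here \Delta\theta_k = \theta^{(k+1)} - \theta^{(k)} and \Delta g_k = g(\theta^{(k+1)}) - g(\theta^{(k)}). In the MC versions the gradient, and for NR and AI the information matrix, are replaced by sampling-based estimates, which is why the number of MC samples affects the stability of each step. Because Broyden's method never forms an information matrix, it does not yield standard errors directly, consistent with the abstract.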
Year: 2013 PMID: 24339886 PMCID: PMC3858226 DOI: 10.1371/journal.pone.0080821
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Variance components used for the simulation, initial values used for the analyses and estimates by analytical EM REML.
| | Genetic | Genetic | Genetic | Residual | Residual | Residual |
| Simulation value | 500.0 | 14.00 | 0.800 | 750.0 | 29.00 | 1.400 |
| Initial value | 350.3 | 12.18 | 0.599 | 615.8 | 21.34 | 1.061 |
| EM REML | 511.9 | 18.11 | 0.747 | 842.6 | 29.10 | 1.590 |
| NR REML | 512.1 | 18.20 | 0.730 | 842.3 | 29.02 | 1.607 |
| AI REML | 512.1 | 18.20 | 0.730 | 842.3 | 29.02 | 1.607 |
| BM REML | 512.6 | 18.08 | 0.751 | 841.9 | 29.13 | 1.586 |
The model includes three unique genetic and three unique residual (co)variance components. All values are presented in thousands.
Means (relative standard deviation) of estimates over the last 10 rounds by MC REML.
| Method | Genetic | Genetic | Genetic | Residual | Residual | Residual |
| EM 20 | 519.3 (0.5%) | 18.30 (0.5%) | 0.752 (0.4%) | 843.3 (1.1%) | 28.98 (1.0%) | 1.578 (1.0%) |
| NR 20 | 446.8 (60.8%) | 15.54 (71.8%) | 0.653 (67.5%) | 877.3 (32.4%) | 30.70 (35.4%) | 1.655 (25.3%) |
| NR 100 | 509.8 (5.4%) | 17.91 (6.4%) | 0.712 (7.7%) | 842.3 (2.6%) | 29.20 (3.4%) | 1.620 (3.3%) |
| NR 1000 | 510.9 (1.6%) | 18.18 (2.1%) | 0.730 (2.5%) | 843.3 (0.8%) | 29.04 (1.0%) | 1.607 (0.9%) |
| AI 20 | 495.5 (7.2%) | 17.44 (8.1%) | 0.689 (8.4%) | 855.3 (3.4%) | 29.57 (4.5%) | 1.632 (4.0%) |
| AI 100 | 513.4 (4.2%) | 18.20 (4.7%) | 0.729 (5.2%) | 839.9 (2.6%) | 28.93 (2.8%) | 1.602 (2.4%) |
| AI 1000 | 513.8 (1.6%) | 18.28 (1.9%) | 0.734 (1.9%) | 840.3 (0.9%) | 28.92 (1.1%) | 1.603 (0.8%) |
| BM 1000 | 502.1 (3.2%) | 17.73 (3.5%) | 0.758 (1.1%) | 852.7 (1.9%) | 29.48 (1.9%) | 1.581 (0.5%) |
The model includes three unique genetic and three unique residual (co)variance components. Values were calculated over REML rounds 402 to 411 for MC EM REML, 6 to 15 for MC NR and MC AI REML, and 12 to 21 for MC BM REML with 20, 100 or 1000 MC samples. Mean values are presented in thousands.
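The summary statistic reported in this table, a mean with a relative standard deviation over the last 10 REML rounds, could be computed along the following lines. This is a minimal sketch: the function name, the trace values, and the use of the sample standard deviation (n - 1 denominator) are assumptions made here, since the source does not state which form of the standard deviation was used.

```python
from statistics import mean, stdev


def summarize_last_rounds(estimates, n_last=10):
    """Return (mean, relative SD in %) over the last n_last REML rounds."""
    if n_last < 2 or len(estimates) < n_last:
        raise ValueError("need n_last >= 2 and at least n_last estimates")
    tail = estimates[-n_last:]
    m = mean(tail)
    # Assumption: sample standard deviation (n - 1 denominator).
    rel_sd = 100.0 * stdev(tail) / abs(m)
    return m, rel_sd


# Hypothetical trace of one variance component over 15 REML rounds;
# these numbers are illustrative and are not taken from the paper.
trace = [350.3, 480.0, 505.0, 511.0, 509.5,
         512.4, 510.8, 513.1, 511.7, 512.9,
         510.2, 513.6, 511.3, 512.0, 512.5]

mean_est, rel_sd_pct = summarize_last_rounds(trace)
print(f"{mean_est:.1f} ({rel_sd_pct:.1f}%)")
```

With 15 rounds in the trace, the default window covers rounds 6 to 15, matching the window the table reports for MC NR and MC AI REML.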
Figure 1. Estimates of the genetic covariance component by Newton-type methods.
Analyses by MC NR REML (Figure A), MC AI REML (Figure B) and MC BM REML (Figure C) with 20, 100 and 1000 MC samples (green, blue and red line, respectively). MC EM REML with 20 MC samples is plotted as a reference (grey line). The straight lines in the figures are the estimated genetic covariance (solid line) and plus/minus one standard deviation (dashed lines) based on standard errors by analytical AI REML.
Figure 2. Relative difference between MC AI REML estimates and the true estimate obtained by analytical AI REML.
The relative difference (%) is plotted for MC AI REML estimates with 20, 100 and 1000 MC samples (green, blue and red line, respectively) over the iterations. MC EM REML with 20 MC samples is plotted as a reference (grey line).
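As a rough illustration of the quantity plotted in Figure 2, the relative difference can be evaluated from the tabulated values. The sketch below uses the analytical AI REML estimates and the MC AI REML means with 1000 MC samples from the two tables. Note that these are means over the last 10 rounds rather than the per-iteration values shown in the figure, and the component labels and function name are placeholders chosen here, because the original column symbols are not available in this text.

```python
# Values copied from the two tables (in thousands, as stated there).
analytical_ai = [512.1, 18.20, 0.730, 842.3, 29.02, 1.607]
mc_ai_1000 = [513.8, 18.28, 0.734, 840.3, 28.92, 1.603]

# Placeholder labels (assumption): three genetic, then three residual components.
labels = ["genetic_a", "genetic_b", "genetic_c",
          "residual_a", "residual_b", "residual_c"]


def relative_difference_pct(estimate, reference):
    """Signed relative difference of an estimate from a reference, in percent."""
    return 100.0 * (estimate - reference) / reference


rel_diff = [relative_difference_pct(e, r)
            for e, r in zip(mc_ai_1000, analytical_ai)]

for label, d in zip(labels, rel_diff):
    print(f"{label}: {d:+.2f}%")
```

For these tabulated means, every component lies within 1% of the analytical estimate.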