Gustavo de Los Campos, Daniel Sorensen, Daniel Gianola.
Abstract
Whole-genome regression methods are being increasingly used for the analysis and prediction of complex traits and diseases. In human genetics, these methods are commonly used for inferences about genetic parameters, such as the amount of genetic variance among individuals or the proportion of phenotypic variance that can be explained by regression on molecular markers. This is so even though some of the assumptions commonly adopted for data analysis are at odds with important quantitative genetic concepts. In this article we develop theory that leads to a precise definition of parameters arising in high-dimensional genomic regressions; we focus on the so-called genomic heritability: the proportion of variance of a trait that can be explained (in the population) by a linear regression on a set of markers. We propose a definition of this parameter that is framed within the classical quantitative genetics theory and show that the genomic heritability and the trait heritability parameters are equal only when all causal variants are typed. Further, we discuss how the genomic variance and genomic heritability, defined as quantitative genetic parameters, relate to parameters of statistical models commonly used for inferences, and indicate potential inferential problems that are assessed further using simulations. When a large proportion of the markers used in the analysis are in linkage equilibrium (LE) with quantitative trait loci (QTL), the likelihood function can be misspecified. This can induce a sizable finite-sample bias and, possibly, lack of consistency of likelihood (or Bayesian) estimates. This situation can be encountered if the individuals in the sample are distantly related and linkage disequilibrium (LD) spans only short regions. This bias does not negate the use of whole-genome regression models as predictive machines; however, our results indicate that caution is needed when using marker-based regressions for inferences about population parameters such as the genomic heritability.
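The definition of genomic heritability given in the abstract (variance explained, in the population, by a linear regression on a set of markers) can be illustrated with a small simulation. The sketch below is a toy construction, not the authors' simulation design: the sample size, number of blocks, the `keep` probability that creates LD, and all variable names are assumptions made for illustration. It approximates the population parameter with ordinary least squares on a large sample and few loci, which is also not the likelihood-based estimator discussed in the article.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_blocks, h2 = 20_000, 20, 0.5  # hypothetical sizes and trait heritability
keep = 0.8  # assumed probability that an LD marker allele copies its QTL allele


def simulate_haplotypes():
    """One haplotype per individual: a QTL, an LD marker and an LE marker per block."""
    qtl = rng.binomial(1, 0.5, size=(n, n_blocks))
    copied = rng.random((n, n_blocks)) < keep
    # LD marker: copy of the QTL allele with probability `keep`, else an independent draw
    mrk_ld = np.where(copied, qtl, rng.binomial(1, 0.5, size=(n, n_blocks)))
    # LE marker: drawn independently of the QTL
    mrk_le = rng.binomial(1, 0.5, size=(n, n_blocks))
    return qtl, mrk_ld, mrk_le


# Diploid genotypes (0/1/2) are the sum of two independent haplotypes.
hap_a, hap_b = simulate_haplotypes(), simulate_haplotypes()
qtl, mrk_ld, mrk_le = (a + b for a, b in zip(hap_a, hap_b))

# Phenotype = genetic value from QTL only + noise scaled to give heritability h2.
effects = rng.normal(size=n_blocks)
g = qtl @ effects
e = rng.normal(scale=np.sqrt(g.var() * (1 - h2) / h2), size=n)
y = g + e


def r_squared(X, y):
    """Share of the variance of y explained by a linear regression on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()


results = {
    "QTL": r_squared(qtl, y),
    "MRK.LD": r_squared(mrk_ld, y),
    "MRK.LE": r_squared(mrk_le, y),
}
for label, value in results.items():
    print(f"{label:7s} {value:.3f}")
```

In this toy setup the allele-level correlation between a QTL and its LD marker is `keep`, so the regression on LD markers should recover roughly `h2 * keep**2` of the phenotypic variance, the regression on QTL should recover roughly `h2`, and the regression on LE markers should recover close to nothing.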
Year: 2015 PMID: 25942577 PMCID: PMC4420472 DOI: 10.1371/journal.pgen.1005048
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Mean (SD) of estimates of genomic heritability by simulation scenario (rows) and information used for analysis.
The columns $h^2$ and $h^2_g$ give parameter values; the columns QTL through MRK.LE give the average (SE) maximum likelihood estimate.

| LD Block | Trans. Prob. | $h^2$ (trait) | $h^2_g$ (genomic) | QTL | QTL+MRK.LD | ALL | MRK.LD | MRK.LE+MRK.LD | MRK.LE |
|---|---|---|---|---|---|---|---|---|---|
| Short | Fixed | .500 | .295 | .498 (.024) | .499 (.035) | .536 (.225) | .305 (.036) | .289 (.191) | .078 (.113) |
| Short | Rand. | .500 | .303 | .500 (.024) | .499 (.033) | .535 (.214) | .312 (.034) | .324 (.187) | .075 (.105) |
| Long | Fixed | .500 | .328 | .500 (.024) | .505 (.085) | .543 (.210) | .337 (.087) | .353 (.189) | .071 (.100) |
| Long | Rand. | .500 | .325 | .500 (.024) | .502 (.074) | .518 (.152) | .381 (.071) | .415 (.144) | .051 (.071) |
$h^2$: trait heritability; $h^2_g$: genomic heritability. Short/Long refer to the length of the LD blocks. Fixed/Rand. indicate whether the LD patterns were the same (Fixed) or varied (Rand.) between blocks. The genomic relationship matrix was computed using one of the following sets of loci: QTL (only QTL), QTL+MRK.LD (QTL and markers in LD with QTL), ALL (all loci), MRK.LD (only markers in LD with QTL), MRK.LE+MRK.LD (only markers, no QTL), or MRK.LE (only markers in LE with QTL).
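To make the analysis pipeline behind the table more concrete, the following is a minimal sketch of how a genomic relationship matrix can be built from a chosen set of loci and how a maximum likelihood estimate of genomic heritability can be obtained from it. It rests on assumptions that the text above does not confirm: the G-matrix is taken to be $ZZ'/p$ with centered and scaled genotypes (a common choice), the model is $y \sim N(\mu, \sigma^2[h^2 G + (1-h^2)I])$, and the likelihood is maximized by a plain grid search. The function names `genomic_relationship` and `ml_genomic_h2`, the sample sizes, and the toy data with independent loci are all hypothetical.

```python
import numpy as np


def genomic_relationship(X):
    """G = ZZ'/p, with Z the column-centered and scaled genotypes (assumed form).

    Assumes no monomorphic loci (every column of X has non-zero variance).
    """
    Z = X - X.mean(axis=0)
    Z = Z / Z.std(axis=0)
    return Z @ Z.T / Z.shape[1]


def ml_genomic_h2(y, G, grid=np.linspace(0.001, 0.999, 999)):
    """Grid-search ML estimate of h2 in y ~ N(mu, s2 * (h2 * G + (1 - h2) * I))."""
    n = len(y)
    d, U = np.linalg.eigh(G)          # G = U diag(d) U'
    d = np.clip(d, 0.0, None)         # remove tiny negative eigenvalues from round-off
    # Because G is built from centered genotypes, the ML estimate of mu is mean(y).
    yt = U.T @ (y - y.mean())
    best_h2, best_ll = grid[0], -np.inf
    for h2 in grid:
        v = h2 * d + (1.0 - h2)       # eigenvalues of h2 * G + (1 - h2) * I
        s2 = np.mean(yt**2 / v)       # ML estimate of s2 for this h2
        ll = -0.5 * (n * np.log(s2) + np.sum(np.log(v)))  # profile log-likelihood
        if ll > best_ll:
            best_h2, best_ll = h2, ll
    return best_h2


# Toy data (hypothetical sizes): 100 QTL and 100 markers in LE with them.
rng = np.random.default_rng(7)
n, p = 1_000, 100
X_qtl = rng.binomial(2, 0.5, size=(n, p)).astype(float)
X_le = rng.binomial(2, 0.5, size=(n, p)).astype(float)
g = X_qtl @ rng.normal(size=p)
y = g + rng.normal(scale=g.std(), size=n)   # simulated trait heritability of about 0.5

est_qtl = ml_genomic_h2(y, genomic_relationship(X_qtl))
est_le = ml_genomic_h2(y, genomic_relationship(X_le))
print(f"QTL: {est_qtl:.3f}   MRK.LE: {est_le:.3f}")
```

Swapping the columns passed to `genomic_relationship` (for example, stacking QTL and marker columns with `np.hstack`) mimics the different analysis scenarios in the table. The toy data here have no LD structure, so the sketch does not reproduce the bias patterns reported above.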
Fig 1. Boxplot of estimated genomic heritability (3,000 MC replicates) by simulation and analysis scenario.
Each plot presents results for one simulation scenario (5/10K: ten thousand LD blocks with 5 loci each; 50/1K: one thousand LD blocks with 50 loci each; FTP, fixed transition probability, indicates that the LD patterns were the same across LD blocks; RTP, random transition probability, is a scenario where LD patterns changed between blocks). The labels on the horizontal axis indicate what information was used to compute the G-matrix (QTL = genotypes at causal loci, MRK.LD = markers in LD with QTL, MRK.LE = markers in LE with QTL, ALL = all loci).
Fig 2. Density plot of estimated genomic heritability (1,000 MC replicates) by analysis scenario (Simulation 2).
The vertical dashed line gives the simulated heritability. QTL, MRK, and MRK+QTL indicate whether QTL genotypes, marker genotypes, or both were used to compute genomic relationships.