Abstract
Many computations with SNP data including genomic evaluation, parameter estimation, and genome-wide association studies use an inverse of the genomic relationship matrix. The cost of a regular inversion is cubic and is prohibitively expensive for large matrices. Recent studies in cattle demonstrated that the inverse can be computed in almost linear time by recursion on any subset of ∼10,000 individuals. The purpose of this study is to present a theory of why such a recursion works and its implication for other populations. Assume that, because of a small effective population size, the additive information in a genotyped population has a small dimensionality, even with a very large number of SNP markers. That dimensionality is visible as a limited number of effective SNP effects, independent chromosome segments, or the rank of the genomic relationship matrix. Decompose a population arbitrarily into core and noncore individuals, with the number of core individuals equal to that dimensionality. Then, breeding values of noncore individuals can be derived by recursions on breeding values of core individuals, with coefficients of the recursion computed from the genomic relationship matrix. A resulting algorithm for the inversion called "algorithm for proven and young" (APY) has a linear computing and memory cost for noncore animals. Noninfinitesimal genetic architecture can be accommodated through a trait-specific genomic relationship matrix, possibly derived from Bayesian regressions. For populations with small effective population size, the inverse of the genomic relationship matrix can be computed inexpensively for a very large number of genotyped individuals.
Keywords: genomic relationship matrix; genomic selection; inversion; recursion; single-step GBLUP
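The core/noncore recursion described in the abstract can be illustrated with a small numerical sketch. The abstract does not spell out the formula, so the block form used here is an assumption taken from how the APY inverse is commonly written: the core block of G is inverted directly, each noncore individual is regressed on the core individuals, and the remaining noncore variance is treated as diagonal. The function name `apy_inverse`, the random toy genotypes, the VanRaden-style scaling, and the 0.01 added to the diagonal (to make the toy matrix invertible) are all hypothetical choices for illustration only.

```python
import numpy as np


def apy_inverse(G, core):
    """Sketch of an APY-style inverse of G for the given core indices.

    Assumed block form (not taken from the abstract):
      P    = Gcc^{-1} Gcn              recursion coefficients
      m_j  = g_jj - g_jc Gcc^{-1} g_cj  per-noncore residual variance
      Ginv = [[Gcc^{-1} + P M^{-1} P', -P M^{-1}],
              [-M^{-1} P',               M^{-1}]],  M = diag(m)
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc = G[np.ix_(core, core)]
    Gcn = G[np.ix_(core, noncore)]
    Gcc_inv = np.linalg.inv(Gcc)          # the only dense inversion

    P = Gcc_inv @ Gcn                     # column j: core weights for noncore j
    m = G[noncore, noncore] - np.einsum("ij,ij->j", Gcn, P)

    Ginv = np.zeros_like(G, dtype=float)
    Ginv[np.ix_(core, core)] = Gcc_inv + (P / m) @ P.T
    Ginv[np.ix_(core, noncore)] = -P / m
    Ginv[np.ix_(noncore, core)] = (-P / m).T
    Ginv[noncore, noncore] = 1.0 / m      # noncore block stays diagonal
    return Ginv


# Toy data (hypothetical): 30 individuals, 200 random SNP genotypes coded 0/1/2.
rng = np.random.default_rng(0)
n, n_snp, n_core = 30, 200, 10
Z = rng.integers(0, 3, size=(n, n_snp)).astype(float)
p = Z.mean(axis=0) / 2.0                  # observed allele frequencies
Zc = Z - 2.0 * p                          # centered genotypes
# VanRaden-style scaling plus a small ridge so the toy G is invertible.
G = Zc @ Zc.T / (2.0 * np.sum(p * (1.0 - p))) + 0.01 * np.eye(n)

core = np.arange(n_core)                  # arbitrary choice of core individuals
G_apy_inv = apy_inverse(G, core)

# Off-diagonal nonzeros in the noncore-by-noncore block of the APY inverse.
nn = G_apy_inv[n_core:, n_core:]
print(np.count_nonzero(nn - np.diag(np.diag(nn))))
```

In this sketch only the core block is inverted densely, and each noncore individual adds one column of coefficients and one diagonal element, which is where the linear cost for noncore animals comes from. With random genotypes the result is only an approximation to the regular inverse, since the toy data do not have the low dimensionality the abstract assumes.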
Year: 2015 PMID: 26584903 PMCID: PMC4788224 DOI: 10.1534/genetics.115.182089
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562
Figure 1. Correlations between genomic estimated breeding values (GEBVs) for selection candidates using the regular and the APY inverse of the genomic relationship matrix (GRM) with various numbers of base individuals (Fragomeni ). Correlations are based on an analysis of 10,102,702 final scores on 6,930,618 Holstein cows, with genotypes available on 100,000 animals, and are computed from GEBVs for 49,611 selection candidates.
Figure 2. Sparsity pattern of a regular genomic relationship matrix (G) and its inverse (G⁻¹), and elements of the genomic relationship matrix needed for construction of the APY (G for the APY) and the APY inverse (APY G⁻¹).