Dan Jiang, Cong Xin, Jinhua Ye, Yingbo Yuan, Ming Fang.
Abstract
BACKGROUND: Genomic prediction is an advanced method for estimating genetic values and has been widely adopted for genetic evaluation in animals and disease-risk prediction in humans. It estimates genetic values from genome-wide SNPs instead of pedigree. The key step is constructing the genomic relationship matrix (GRM) from genome-wide SNPs; however, computing the GRM usually requires a large amount of computer memory, especially when the number of SNPs and the sample size are large, and it can become computationally prohibitive even on supercomputer clusters. We herein developed an integrative algorithm, ICGRM, to compute the GRM. To avoid computing the GRM over the whole genome at once, ICGRM freely divides the genome-wide SNPs into several segments and computes GRM-related summary statistics for each segment, which requires little RAM; it then integrates these summary statistics to produce the GRM for the whole genome.
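The segment-and-integrate idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say which GRM definition or which summary statistics ICGRM uses, so the sketch assumes a VanRaden-style GRM, G = ZZᵀ / Σ 2p(1−p), with genotypes coded 0/1/2 and no missing values. The function names `grm_full` and `grm_segmented` are hypothetical. Under that assumption, the per-segment summary statistics are the cross-product ZₖZₖᵀ and the scaling term Σ 2p(1−p), both of which simply add up across segments.

```python
import numpy as np


def grm_full(genotypes):
    """Reference GRM from the whole matrix at once.

    genotypes: (n_individuals, n_snps) array coded 0/1/2.
    Assumes a VanRaden-style definition (an assumption, not from the paper).
    """
    p = genotypes.mean(axis=0) / 2.0          # allele frequency per SNP
    z = genotypes - 2.0 * p                   # centre each SNP column
    return z @ z.T / np.sum(2.0 * p * (1.0 - p))


def grm_segmented(genotypes, n_segments):
    """Same GRM, accumulated segment by segment.

    Only one segment is centred at a time. In practice each segment
    would be read from disk instead of sliced from an in-memory array.
    """
    n, m = genotypes.shape
    cross = np.zeros((n, n))                  # running sum of Z_k Z_k^T
    denom = 0.0                               # running sum of 2p(1-p)
    for cols in np.array_split(np.arange(m), n_segments):
        seg = genotypes[:, cols].astype(float)
        p = seg.mean(axis=0) / 2.0
        z = seg - 2.0 * p
        cross += z @ z.T                      # summary statistic 1
        denom += np.sum(2.0 * p * (1.0 - p))  # summary statistic 2
    return cross / denom                      # integrate into one GRM


# Simulated genotypes: 20 individuals, 500 SNPs
rng = np.random.default_rng(0)
geno = rng.integers(0, 3, size=(20, 500))
assert np.allclose(grm_full(geno), grm_segmented(geno, 7))
```

Because allele frequencies are computed per SNP, the segmented result matches the whole-genome computation regardless of how the SNPs are split; only the n × n accumulator and one segment need to be held in memory at a time.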
Keywords: GBLUP; Genomic relationship matrix; Genomic selection
Year: 2019 PMID: 31878869 PMCID: PMC6933885 DOI: 10.1186/s12859-019-3319-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 The flowchart of the ICGRM
The computer memory cost at different numbers of individuals and SNPs
| Individuals | 0.1 million SNPs | 0.5 million SNPs | 1 million SNPs | 10 million SNPs |
|---|---|---|---|---|
| 5000 | 3.9G | 11.4G | 20.7G | 181.2G |
| 10,000 | 6.4G | 21.2G | 39.9G | – |
| 20,000 | 12.3G | 42.1G | 79.4G | – |
| 30,000 | 19.8G | 64.5G | 118.3G | – |
Fig. 2 The computational memory (a) and time cost (b) at different numbers of individuals and splitting parts