Luke M Evans, Rasool Tahmasbi, Scott I Vrieze, Gonçalo R Abecasis, Sayantan Das, Steven Gazal, Douglas W Bjelland, Teresa R de Candia, Michael E Goddard, Benjamin M Neale, Jian Yang, Peter M Visscher, Matthew C Keller.
Abstract
Multiple methods have been developed to estimate narrow-sense heritability, h², using single nucleotide polymorphisms (SNPs) in unrelated individuals. However, a comprehensive evaluation of these methods has not yet been performed, leading to confusion and discrepancy in the literature. We present the most thorough and realistic comparison of these methods to date. We used thousands of real whole-genome sequences to simulate phenotypes under varying genetic architectures and confounding variables, and we used array, imputed, or whole genome sequence SNPs to obtain 'SNP-heritability' estimates. We show that SNP-heritability can be highly sensitive to assumptions about the frequencies, effect sizes, and levels of linkage disequilibrium of underlying causal variants, but that methods that bin SNPs according to minor allele frequency and linkage disequilibrium are less sensitive to these assumptions across a wide range of genetic architectures and possible confounding factors. These findings provide guidance for best practices and proper interpretation of published estimates.
Year: 2018 PMID: 29700474 PMCID: PMC5934350 DOI: 10.1038/s41588-018-0108-x
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Summary of commonly applied methods and a description of findings from simulations.
| Method | Description | Major Assumptions | Simulation findings | Computational Issues |
|---|---|---|---|---|
| GREML-SC | Often called the “GCTA approach.” Originally applied to common array SNPs only. Estimates | 1) Genetic similarity is uncorrelated with environmental similarity; 2) an infinitesimal model; 3) SNP effects are normally distributed, independent of LD, and inversely proportionate to MAF (α=−1). | Biased to the degree that the average LD among SNPs is different than the average LD between SNPs and CVs. This occurs in stratified samples and when MAF & LD distributions of SNPs do not match those of CVs. | Simple model tractable with large samples (>100K). |
| GREML-MS | The first multi-component approach, usually applied by binning SNPs according to their MAF, annotation, or physical regions in order to explore genetic architecture. | Requires that the same assumptions of GREML-SC hold within each GRM. | Biased if CVs have generally higher or lower levels of LD than the SNPs used to make the GRM. Relatively large standard errors. | Run times and memory requirements higher than GREML-SC and increase as a function of the number of variance components estimated. |
| GREML-LDMS-R | A multi-component approach that bins imputed SNPs by their MAF and regional LD. | Same as GREML-MS | Use of regional LD scores can lead to biases if CVs have different LD on average than surrounding SNPs. Relatively large standard errors. | Same as GREML-MS. |
| GREML-LDMS-I | A multi-component approach introduced here that bins imputed SNPs by their MAF and individual LD. | Same as GREML-MS | Appears to be the least biased approach, even when traits have complex genetic architectures. Relatively large standard errors. | Same as GREML-MS. |
| LDAK-SC | Introduced to account for redundant tagging of CVs by common SNPs. Recently modified to incorporate error due to imputation and to alter the MAF-effect size relationship. | Same as GREML-SC, except that allelic effects are a function of LD. Extended to assume that effects are also a function of imputation quality and weakly inversely proportionate to MAF (α=−0.25). | Can correct for the overestimation observed in GREML-SC from redundant tagging of CVs, but otherwise about as biased as GREML-SC when assumptions are unmet, although the biases are sometimes in different directions. | Same as GREML-SC. |
| LDAK-MS | A multi-component extension of LDAK-SC that bins SNPs by MAF. | Requires that the same assumptions of LDAK-SC hold within each GRM. | Less biased on average than LDAK-SC, but more biased than GREML-LDMS (-I or -R). Relatively large standard errors. | Same as GREML-MS. |
| Threshold GRMs | A multi-component approach with two GRMs: the normal (unthresholded) GRM built from all SNPs, and a second GRM with entries set to 0 if below a threshold. Conducted in samples that include close relatives. | Same as GREML-SC for the unthresholded GRM. Assumes no shared environmental influences among close relatives. | Estimates associated with unthresholded GRM similar to those of GREML-SC. When used in samples that include close relatives, the second GRM captures pedigree-associated variation but can be upwardly biased by shared environmental influences. | See GREML-SC. |
| LD Score Regression | Uses the slope from χ² (from GWAS) regressed on SNPs’ LD scores to estimate the | Infinitesimal model with allelic effects normally distributed. | Largely robust to confounding due to stratification and shared environmental influences. Estimates | The most computationally efficient method of those compared and is tractable for very large datasets. |
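The LD Score Regression row describes regressing GWAS χ² statistics on LD scores and reading heritability off the slope. A minimal sketch of that idea follows, assuming the standard expectation E[χ²] = 1 + Na + (N·h²/M)·ℓ for a SNP with LD score ℓ, with N samples and M SNPs. The function name `ldsc_h2` and its arguments are hypothetical, and the unweighted least-squares fit is a simplification: published LDSC software uses weighted regression with jackknife standard errors.

```python
import numpy as np

def ldsc_h2(chi2, ld_scores, n_samples, n_snps):
    """Estimate SNP-heritability from an unweighted regression of chi-square on LD score.

    Illustrative only: assumes chi2_j ~ intercept + (N * h2 / M) * ld_score_j.
    Returns (h2_estimate, intercept).
    """
    chi2 = np.asarray(chi2, dtype=float)
    ld_scores = np.asarray(ld_scores, dtype=float)
    # Design matrix with an intercept column and the LD scores.
    design = np.column_stack([np.ones_like(ld_scores), ld_scores])
    (intercept, slope), *_ = np.linalg.lstsq(design, chi2, rcond=None)
    # Rescale the slope from "per unit LD score" to heritability.
    return slope * n_snps / n_samples, intercept
```

An intercept well above 1 in such a fit would point to confounding (for example stratification) rather than polygenic signal, which is the basis for the robustness noted in the table.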
Figure 1. Mean heritability estimates across 100 replicates from GRMs built from WGS SNPs in the least structured subsamples. Methods on the x-axis are as follows: Single-component GREML (GREML-SC) with all SNPs or only MAF > 0.01; MAF-stratified GREML (GREML-MS); LD and MAF-stratified GREML (GREML-LDMS-R [regional LD] & -I [individual SNP LD]); Single-component Linkage Disequilibrium-Adjusted Kinships (LDAK-SC) with all SNPs or only MAF > 0.01; MAF-stratified LDAK (LDAK-MS); Extended Genealogy with Thresholded GRMs with all SNPs or only common (MAF > 0.01), presenting both h and h (=h + h); LD score regression (LDSC) using no PCs as covariates in GWAS, using PCs as covariates, or partitioned using PCs with MAF-stratification. Estimates are from samples of unrelated individuals (relatedness <0.05) except for those from the Threshold GRM method, which included all individuals. Simulated (true) h² = 0.5. Colors represent the MAF range of the 1,000 randomly drawn CVs. See Online Methods for descriptions of each method, Supplementary Figures for additional estimates, and Supplementary Table 2 for numerical results. Error bars represent 95% confidence intervals.
Figure 2. Mean heritability estimates for four MAF bins across 100 replicates from multi-component approaches in unrelated individuals using WGS SNPs in the least structured subsample. See Fig. 1 for specific methods. Black lines are the true (simulated) h² values; note that in the top panel, the true h² values differ across MAF. See Online Methods for descriptions of each method, Supplementary Figures for additional estimates, and Supplementary Table 4 for numerical results. Error bars represent 95% confidence intervals.
Figure 3. Mean heritability estimates across 100 replicates from GRMs built from imputed SNPs in the least structured subsamples across different model assumptions (bars) and different ways of simulating CVs (x-axes). The x-axes of each panel show the simulated CV MAF-scaling parameter, α, and the CV effect size distribution, β. The four panels show different MAF ranges of the 1,000 randomly drawn CVs. DHS sites were randomly sampled without respect to MAF. Bar colors indicate the fitted model, with a single GRM used except for the “LDMS” models, which used 16 GRMs (α=−1) stratified by MAF and either regional (-R) or individual SNP (-I) LD score. See Online Methods for descriptions of each method, Supplementary Figures for additional estimates, and Supplementary Table 6 for numerical results. Error bars represent 95% confidence intervals.
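The fitted models in this figure differ in the MAF-scaling parameter α used to build the GRM and in whether SNPs are split into MAF- and LD-stratified bins. The sketch below illustrates both ideas under stated assumptions: each SNP is centered and scaled by [2p(1−p)]^(α/2), so α=−1 gives the usual standardized GRM and α=0 leaves allele counts unscaled. The function names, the MAF bin edges, and the choice of LD-score quantiles within each MAF bin are hypothetical and are not taken from the paper's Online Methods.

```python
import numpy as np

def grm(genotypes, alpha=-1.0):
    """Genetic relationship matrix from an (individuals x SNPs) array of 0/1/2 counts."""
    x = np.asarray(genotypes, dtype=float)
    p = x.mean(axis=0) / 2.0                      # sample allele frequency per SNP
    keep = (p > 0.0) & (p < 1.0)                  # monomorphic SNPs carry no information
    x, p = x[:, keep], p[keep]
    # Center each SNP, then scale by [2p(1-p)]^(alpha/2).
    w = (x - 2.0 * p) * (2.0 * p * (1.0 - p)) ** (alpha / 2.0)
    return w @ w.T / w.shape[1]

def stratified_grms(genotypes, ld_scores,
                    maf_edges=(0.0, 0.01, 0.05, 0.2, 0.5), n_ld_bins=4):
    """One GRM per (MAF bin, LD-score quantile) cell; bin edges are illustrative."""
    x = np.asarray(genotypes, dtype=float)
    ld = np.asarray(ld_scores, dtype=float)
    p = x.mean(axis=0) / 2.0
    maf = np.minimum(p, 1.0 - p)
    grms = {}
    for i in range(len(maf_edges) - 1):
        in_bin = np.flatnonzero((maf > maf_edges[i]) & (maf <= maf_edges[i + 1]))
        if in_bin.size == 0:
            continue
        # Order this MAF bin's SNPs by LD score and cut into equal-sized groups.
        ordered = in_bin[np.argsort(ld[in_bin])]
        for k, idx in enumerate(np.array_split(ordered, n_ld_bins)):
            if idx.size:
                grms[(i, k)] = grm(x[:, idx], alpha=-1.0)
    return grms
```

With four MAF bins and four LD groups this yields up to 16 GRMs, each of which would be fitted as a separate variance component in a multi-component model.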
Figure 4. Mean heritability estimates across 100 replicates from GRMs built from imputed SNPs in the least structured subsamples across different model assumptions (bars) and different ways of simulating CVs (x-axes). CV effect sizes were drawn from N(0,τ). The x-axes of each panel show the simulated CV MAF-scaling parameter, α. The three panels show different MAF ranges of the 1,000 randomly drawn CVs. Bar colors indicate the fitted model. See Online Methods for descriptions of each method, Supplementary Figures for additional estimates, and Supplementary Table 6 for numerical results. Error bars represent 95% confidence intervals.
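The simulations summarized here draw CV effect sizes from a normal distribution and vary the MAF-scaling parameter α. A rough sketch of one way such a phenotype could be generated is given below. It assumes effects act on genotypes scaled by [2p(1−p)]^(α/2) and that the genetic and residual components are rescaled so the genetic variance equals a target h² (0.5 in Figure 1). The function `simulate_phenotype` and its details are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

def simulate_phenotype(cv_genotypes, h2=0.5, alpha=-1.0, rng=None):
    """Simulate a phenotype from causal-variant genotypes (individuals x CVs, 0/1/2).

    Assumes all CVs are polymorphic in the sample. Returns (phenotype, genetic_value).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(cv_genotypes, dtype=float)
    p = x.mean(axis=0) / 2.0
    # alpha = -1: every CV explains equal variance in expectation;
    # alpha = 0: per-allele effects are independent of MAF.
    w = (x - 2.0 * p) * (2.0 * p * (1.0 - p)) ** (alpha / 2.0)
    beta = rng.normal(0.0, 1.0, size=x.shape[1])  # overall scale is fixed below
    g = w @ beta
    g = (g - g.mean()) / g.std() * np.sqrt(h2)            # genetic variance = h2
    e = rng.normal(size=x.shape[0])
    e = (e - e.mean()) / e.std() * np.sqrt(1.0 - h2)      # residual variance = 1 - h2
    return g + e, g
```

Fitting a model whose assumed α differs from the one used in the simulation is the kind of mismatch the bars in this figure compare.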
Figure 5. Boxplots of the absolute bias of heritability estimates across all simulated phenotypes from Supplementary Figures 24 & 26 using WGS data to estimate GRMs (top), and from Figures 3–4 using imputed variants to estimate the GRMs (bottom). The x-axis indicates the parameters for the estimation model, including the MAF scaling factor, α, and the assumed effect size distribution, β, specified in the GRM and whether imputation scores (r) were used in the GRM estimation. All used a single GRM except for LD- & MAF-stratified GREML (LDMS), which used 16 GRMs (α=−1) stratified by MAF and either regional (-R) or individual SNP (-I) LD score. * Typical GREML-SC parameters. † Typical LDAK-SC parameters. Boxplots show the median and interquartile range, with whiskers extending to 1.5 times the interquartile range and more extreme points shown individually, for N=22 (WGS) and 26 (imputed) mean estimates of heritability.
Figure 6. Heritability estimated using multiple methods with imputed variants for six complex traits in the UK Biobank. MAF>0.01 indicates common SNPs were used to create the GRMs. ∅ = information matrix was not invertible. HM3 indicates that only imputed HapMap3 sites were used in the LDSC analysis. Sample sizes are as follows: height N=94,769; BMI N=94,595; impedance N=93,451; trunk fat N=93,414; fluid intelligence N=31,724; neuroticism N=78,565. See Supplementary Table 8 for numerical results. Error bars are 1 S.E.M.