Samantha Lent, Xuan Deng, L Adrienne Cupples, Kathryn L Lunetta, C T Liu, Yanhua Zhou.
Abstract
BACKGROUND: The recent focus on rare variants makes their imputation accuracy an important issue. Many approaches have been proposed to increase imputation accuracy for rare variants, ranging from reference panel selection to combinations of existing methods to multistage analyses. We aimed to combine the strengths of these approaches with our proposed two-stage imputation for family data.
Year: 2016 PMID: 27980638 PMCID: PMC5133481 DOI: 10.1186/s12919-016-0032-y
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Distribution of family size
| Family size | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No. of families | 6 | 8 | 24 | 17 | 15 | 5 | 8 | 1 | 3 | 7 | 2 | 2 | 2 |
Summary statistics of correlation and IQS comparing imputation with dense markers and sparse markers
| Quality measurement | Scenario | Minimum | Median | Mean | Maximum | SD |
|---|---|---|---|---|---|---|
| Correlation | Masked individuals with GWA | 0.00049 | 0.6983 | 0.5766 | 1 | 0.3714 |
| Correlation | Masked individuals with GWA in LE | 0.00087 | 0.6798 | 0.5708 | 1 | 0.3729 |
| Correlation | Impute with –cluster option | 0.0000 | 0.6879 | 0.5748 | 1 | 0.3727 |
| IQS | Masked individuals with GWA | −0.04793 | 0.4046 | 0.3758 | 0.9715 | 0.3007 |
| IQS | Masked individuals with GWA in LE | −0.04758 | 0.3883 | 0.3659 | 0.9682 | 0.2980 |
| IQS | Impute with –cluster option | −0.04887 | 0.3989 | 0.3712 | 0.9682 | 0.2995 |
GWA, genome-wide association; IQS, imputation quality score; LE, linkage equilibrium
Tabulation of genotypes used for IQS calculation
| Imputed genotype | True AA | True AB | True BB | Total |
|---|---|---|---|---|
| AA | n11 | n12 | n13 | n1. |
| AB | n21 | n22 | n23 | n2. |
| BB | n31 | n32 | n33 | n3. |
| Total | n.1 | n.2 | n.3 | n.. |
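The tabulation above is the input to the IQS. As a minimal sketch, assuming the IQS takes the usual chance-corrected agreement (kappa-type) form, IQS = (Po − Pc) / (1 − Pc), with observed agreement Po = (n11 + n22 + n33) / n.. and chance agreement Pc = (n1.·n.1 + n2.·n.2 + n3.·n.3) / n..², it could be computed as follows. The function name `iqs` and the list-of-lists layout are illustrative choices, not taken from the paper.

```python
import math


def iqs(table):
    """Chance-corrected agreement for a square genotype table.

    table[i][j] is the count of individuals whose imputed genotype is i and
    whose true genotype is j, with genotypes ordered AA, AB, BB
    (assumed layout, matching the tabulation above).
    """
    k = len(table)
    total = sum(sum(row) for row in table)          # n..
    if total == 0:
        raise ValueError("table contains no observations")

    row_totals = [sum(row) for row in table]        # n1., n2., n3.
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]  # n.1, n.2, n.3

    p_observed = sum(table[i][i] for i in range(k)) / total
    p_chance = sum(row_totals[i] * col_totals[i] for i in range(k)) / total**2

    if p_chance == 1.0:
        # Both true and imputed genotypes are monomorphic: the score is undefined.
        return math.nan
    return (p_observed - p_chance) / (1.0 - p_chance)


# Perfect imputation gives a score of 1.
perfect = [[50, 0, 0], [0, 30, 0], [0, 0, 20]]
# Rare variant whose true genotype is always AA: 90% raw concordance,
# but no agreement beyond what chance alone would produce.
rare = [[90, 0, 0], [10, 0, 0], [0, 0, 0]]
```

The second table illustrates why a chance-corrected score is useful for rare variants: raw concordance is 0.9, yet the score is 0 because that level of agreement is expected by chance given the marginal totals.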
Summary of Imputation Quality by MAF
| Imputation approach | MAF (0, 0.01): #SNPp* | MAF (0, 0.01): Mean | MAF (0, 0.01): Var | MAF (0.01, 0.05): #SNPp* | MAF (0.01, 0.05): Mean | MAF (0.01, 0.05): Var | MAF (0.05, 0.4): #SNPp* | MAF (0.05, 0.4): Mean | MAF (0.05, 0.4): Var |
|---|---|---|---|---|---|---|---|---|---|
| IQS | | | | | | | | | |
| Impute2 | 3023 | 0.840 | 0.097 | 1164 | 0.772 | 0.126 | 761 | 0.899 | 0.048 |
| Merlin | 4028 | 0.348 | 0.132 | 1416 | 0.437 | 0.041 | 1142 | 0.404 | 0.006 |
| Combined (0.01) a | 4028 | 0.350 | 0.133 | 1416 | 0.965 | 0.017 | 1142 | 0.992 | 0.004 |
| Combined (0.05) a | 4028 | 0.349 | 0.133 | 1416 | 0.443 | 0.041 | 1142 | 0.992 | 0.004 |
| Correlation | | | | | | | | | |
| Impute2 | 3023 | 0.918 | 0.048 | 1164 | 0.981 | 0.006 | 761 | 0.999 | 0.00004 |
| Merlin | 4028 | 0.512 | 0.195 | 1416 | 0.663 | 0.064 | 1142 | 0.687 | 0.007 |
| Combined (0.01) a | 4028 | 0.514 | 0.196 | 1416 | 0.975 | 0.010 | 1142 | 0.994 | 0.003 |
| Combined (0.05) a | 4028 | 0.513 | 0.196 | 1416 | 0.669 | 0.063 | 1142 | 0.993 | 0.003 |
*#SNPp is the number of SNPs with a MAF greater than 0 for both real and imputed genotypes (varies by method)
a Combined (m) indicates the two-stage imputation approach with MAF cutoff m
Fig. 1 Imputation quality vs. MAF. (a) IQS for all polymorphic sequence variants. (b) Correlation between true and imputed dosages for all polymorphic sequence variants. (c) IQS for rare (MAF < 0.05) polymorphic sequence variants. (d) Correlation between true and imputed dosages for rare (MAF < 0.05) polymorphic sequence variants
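The correlation measure used in the tables and figure panels can be illustrated with a short sketch. It assumes the measure is the Pearson correlation between true and imputed dosages across individuals at a single variant, with dosages coded as allele counts between 0 and 2; the function name `dosage_correlation` is hypothetical. The sketch returns NaN when either vector is constant, which is consistent with restricting the summaries to variants that are polymorphic in both the real and imputed genotypes.

```python
import math


def dosage_correlation(true_dosage, imputed_dosage):
    """Pearson correlation between true and imputed dosages at one variant."""
    n = len(true_dosage)
    if n != len(imputed_dosage) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 entries")

    mean_true = sum(true_dosage) / n
    mean_imp = sum(imputed_dosage) / n

    # Sums of squares and cross-products around the means.
    sxy = sum((t - mean_true) * (d - mean_imp)
              for t, d in zip(true_dosage, imputed_dosage))
    sxx = sum((t - mean_true) ** 2 for t in true_dosage)
    syy = sum((d - mean_imp) ** 2 for d in imputed_dosage)

    if sxx == 0 or syy == 0:
        # A monomorphic variant has zero variance; correlation is undefined.
        return math.nan
    return sxy / math.sqrt(sxx * syy)


# Hypothetical dosages for five individuals at one variant.
true_d = [0, 1, 2, 0, 1]
imputed_d = [0.1, 0.9, 1.8, 0.0, 1.2]
r = dosage_correlation(true_d, imputed_d)
```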