Abstract
BACKGROUND: In affected sibling pair linkage analysis, linkage disequilibrium (LD) has been shown to cause overestimation of the number of alleles shared identity-by-descent (IBD) among sibling pairs when parents are ungenotyped. This inflation produces spurious evidence for linkage even when the markers and the disease locus are not linked. In our study, we first theoretically evaluate how inflation in IBD probabilities leads to overestimation of a nonparametric linkage (NPL) statistic under the assumption of linkage equilibrium. Next, we propose a two-step processing strategy to systematically evaluate approaches for handling LD. Based on the observed inflation of the expected logarithm of the odds ratio (LOD) from our theoretical exploration, we implemented the proposed two-step strategy. Step 1 applies three techniques to filter a dense set of markers. In step 2, we take the subset of markers selected in step 1 and apply four different methods of handling LD among dense markers: 1) marker thinning (MT); 2) recursive elimination (RE); 3) SNPLINK; and 4) the LD modeling approach in MERLIN. We evaluate the relative performance of each method through simulation.
Year: 2009 PMID: 19664279 PMCID: PMC2731784 DOI: 10.1186/1471-2156-10-44
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Step 1: Summary descriptive statistics of the average maximum NPL LOD scores and the average IC with ungenotyped parents.
| Filter | Ave # SNPs | 2 affected sibs: MLS (SD) | 2 affected sibs: IC | 3 affected sibs: MLS (SD) | 3 affected sibs: IC | 4 affected sibs: MLS (SD) | 4 affected sibs: IC |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Unadjusted | 6012 | 13.27 (2.77) | 0.72 | 14.41 (3.64) | 0.80 | 7.86 (2.20) | 0.84 |
| MAF ≥ 0.05 | 5387 | 13.49 (2.66) | 0.72 | 14.77 (3.64) | 0.80 | 8.06 (2.23) | 0.84 |
| MAF ≥ 0.10 | 4616 | 13.87 (2.64) | 0.72 | 15.00 (3.61) | 0.80 | 8.18 (2.25) | 0.84 |
| MAF ≥ 0.20 | 3203 | 15.01 (2.69) | 0.72 | 15.47 (3.61) | 0.79 | 7.90 (2.42) | 0.84 |
| r2 ≥ 0.95 | 4596 | 5.47 (1.79) | 0.72 | 4.93 (2.06) | 0.80 | 2.96 (1.55) | 0.85 |
| MAF ≥ 0.05 & r2 ≥ 0.95 | 3713 | 9.62 (2.75) | 0.72 | 14.01 (3.43) | 0.80 | 5.02 (1.95) | 0.85 |
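The MAF rows in the table above correspond to dropping SNPs whose minor allele frequency falls below a cutoff. A minimal sketch of such a filter follows, assuming genotypes are coded as 0, 1 or 2 copies of one allele with `None` for missing calls. The function names and the input layout are hypothetical and are not taken from the study's software.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF of one SNP.

    `genotypes` holds 0, 1 or 2 copies of one allele per individual;
    None marks a missing call (an assumption of this sketch).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0  # no data: treat as uninformative
    p = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(p, 1 - p)


def filter_by_maf(snp_genotypes, threshold):
    """Return the names of SNPs whose MAF is at least `threshold`.

    `snp_genotypes` maps a SNP name to its list of genotype codes.
    """
    return [name for name, genos in snp_genotypes.items()
            if minor_allele_frequency(genos) >= threshold]


if __name__ == "__main__":
    # Toy data with made-up SNP names, for illustration only.
    toy = {"snp_a": [0, 0, 0, 1], "snp_b": [0, 1, 2, 1], "snp_c": [2, 2, 2, 2]}
    print(filter_by_maf(toy, 0.05))
    print(filter_by_maf(toy, 0.20))
```

Raising the cutoff from 0.05 to 0.20 removes more SNPs, which matches the falling "Ave # SNPs" column in the MAF rows.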
Summary of average maximum LOD scores and average IC using the baseline marker subset from step 1 for the 9 study designs with ungenotyped parents.
| Number of Sibs | MLS (SD) | Average IC |
| --- | --- | --- |
| 2 Affected | 9.62 (2.75) | 0.72 |
| 3 Affected | 14.01 (3.43) | 0.80 |
| 4 Affected | 5.02 (1.95) | 0.85 |
| 2 Affected + 1 Unaffected | 8.66 (2.49) | 0.80 |
| 3 Affected + 1 Unaffected | 5.41 (2.12) | 0.85 |
| 4 Affected + 1 Unaffected | 3.17 (1.55) | 0.88 |
| 2 Affected + 2 Unaffected | 4.41 (1.80) | 0.85 |
| 3 Affected + 2 Unaffected | 3.18 (1.57) | 0.88 |
| 4 Affected + 2 Unaffected | 1.89 (1.29) | 0.90 |
Step 2 using D' LD threshold and MT: Summary descriptive statistics of average maximum NPL LOD scores and average IC for families with 2, 3 or 4 affected siblings and ungenotyped* parents.
| Method | LD threshold | Ave # SNPs | 2 affected sibs: MLS (SD) | 2 affected sibs: IC | 3 affected sibs: MLS (SD) | 3 affected sibs: IC | 4 affected sibs: MLS (SD) | 4 affected sibs: IC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Unadjusted | | 3713 | 9.62 (2.75) | 0.72 | 14.01 (3.43) | 0.80 | 5.02 (1.95) | 0.85 |
| MT | 8snp1cM | 480 | 1.20 (0.96) | 0.72 | 1.34 (1.01) | 0.80 | 0.36 (0.47) | 0.85 |
| MT | 4snp1cM | 259 | 0.65 (0.67) | 0.71 | 0.60 (0.63) | 0.80 | 0.22 (0.36) | 0.85 |
| MT | 2snp1cM | 135 | 0.66 (0.68) | 0.67 | 0.76 (0.74) | 0.79 | 0.26 (0.38) | 0.84 |
| MT | 1snp1cM | 68 | 0.76 (0.73) | 0.61 | 1.05 (0.88) | 0.76 | 0.34 (0.47) | 0.83 |
| RE | 0.7 | 409 | 0.44 (0.53) | 0.73 | 0.54 (0.61) | 0.80 | 0.2 (0.33) | 0.85 |
| RE | 0.5 | 309 | 0.37 (0.50) | 0.72 | 0.45 (0.55) | 0.80 | 0.18 (0.34) | 0.85 |
| RE | 0.3 | 200 | 0.43 (0.54) | 0.7 | 0.58 (0.60) | 0.80 | 0.23 (0.39) | 0.85 |
| RE | 0.1 | 62 | 0.73 (0.73) | 0.61 | 1.18 (0.96) | 0.76 | 0.44 (0.57) | 0.83 |
| SNPLINK | 0.7 | 531 | 0.65 (0.78) | 0.73 | 0.66 (0.79) | 0.80 | 0.3 (0.46) | 0.85 |
| SNPLINK | 0.5 | 401 | 0.75 (0.81) | 0.72 | 0.73 (0.83) | 0.80 | 0.31 (0.46) | 0.85 |
| SNPLINK | 0.3 | 287 | 0.85 (0.88) | 0.71 | 0.81 (0.84) | 0.80 | 0.33 (0.47) | 0.85 |
| SNPLINK | 0.1 | 120 | 0.65 (0.74) | 0.64 | 0.70 (0.81) | 0.77 | 0.31 (0.46) | 0.83 |
*With complete data, where both parents are genotyped, the unadjusted average MLS values for 2, 3 and 4 affected sibs are 0.58, 0.5 and 0.47, respectively.
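The MT rows thin the map to a fixed marker density. The sketch below assumes that a label such as `8snp1cM` means "keep at most 8 SNPs per 1 cM interval"; that reading, the binning by integer cM, and the even-spacing rule are assumptions of this illustration, not a description of the study's actual procedure. The function name `thin_markers` is hypothetical.

```python
import math
from collections import defaultdict


def thin_markers(positions_cm, max_per_cm):
    """Return sorted indices of retained SNPs.

    Markers are grouped into 1 cM bins by genetic position; within a bin
    holding more than `max_per_cm` markers, an evenly spread subset is kept.
    """
    bins = defaultdict(list)
    order = sorted(range(len(positions_cm)), key=lambda i: positions_cm[i])
    for idx in order:
        bins[math.floor(positions_cm[idx])].append(idx)

    kept = []
    for members in bins.values():
        if len(members) <= max_per_cm:
            kept.extend(members)  # sparse bin: keep everything
        else:
            step = len(members) / max_per_cm  # > 1, so picks are distinct
            kept.extend(members[int(j * step)] for j in range(max_per_cm))
    return sorted(kept)


if __name__ == "__main__":
    # Toy genetic positions in cM, for illustration only.
    positions = [0.1, 0.2, 0.3, 0.4, 1.5, 2.2, 2.9]
    print(thin_markers(positions, 2))
    print(thin_markers(positions, 1))
```

Thinning by position alone ignores the LD actually observed between neighbouring SNPs, which is the main design difference from the threshold-based RE and SNPLINK rows.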
Step 2 using r2 LD threshold: Summary descriptive statistics of average maximum NPL LOD scores and average IC for families with 2, 3 or 4 affected siblings and ungenotyped* parents.
| Method | LD threshold | Ave # SNPs | 2 affected sibs: MLS (SD) | 2 affected sibs: IC | 3 affected sibs: MLS (SD) | 3 affected sibs: IC | 4 affected sibs: MLS (SD) | 4 affected sibs: IC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Unadjusted | | 3713 | 9.62 (2.75) | 0.72 | 14.01 (3.43) | 0.80 | 5.02 (1.95) | 0.85 |
| RE | 0.7 | 1562 | 1.39 (0.98) | 0.73 | 2.00 (1.28) | 0.80 | 0.73 (0.70) | 0.85 |
| RE | 0.5 | 1240 | 1.21 (0.90) | 0.73 | 1.67 (1.17) | 0.81 | 0.59 (0.63) | 0.85 |
| RE | 0.3 | 892 | 0.45 (0.53) | 0.73 | 0.66 (0.68) | 0.81 | 0.26 (0.39) | 0.85 |
| RE | 0.1 | 435 | 0.32 (0.44) | 0.72 | 0.40 (0.49) | 0.80 | 0.18 (0.34) | 0.85 |
| MERLINLD | 0.7 | 562 | 1.35 (0.97) | 0.73 | 1.83 (1.22) | 0.80 | 0.62 (0.61) | 0.85 |
| MERLINLD | 0.5 | 575 | 1.12 (0.96) | 0.73 | 1.53 (1.19) | 0.80 | 0.53 (0.59) | 0.85 |
| MERLINLD | 0.3 | 542 | 0.55 (0.61) | 0.74 | 0.77 (0.77) | 0.80 | 0.32 (0.43) | 0.85 |
| MERLINLD | 0.1 | 423 | 0.51 (0.58) | 0.74 | 0.69 (0.70) | 0.81 | 0.29 (0.41) | 0.85 |
| SNPLINK | 0.7 | 2596 | 3.94 (1.85) | 0.73 | 5.10 (2.11) | 0.80 | 2.06 (1.23) | 0.85 |
| SNPLINK | 0.5 | 2351 | 3.63 (1.81) | 0.73 | 4.88 (2.09) | 0.80 | 1.97 (1.20) | 0.85 |
| SNPLINK | 0.3 | 2057 | 3.04 (1.73) | 0.73 | 4.10 (1.95) | 0.80 | 1.56 (1.07) | 0.85 |
| SNPLINK | 0.1 | 1519 | 2.69 (1.64) | 0.73 | 3.57 (1.87) | 0.80 | 1.28 (0.98) | 0.85 |
*With complete data, where both parents are genotyped, the unadjusted average MLS values for 2, 3 and 4 affected sibs are 0.58, 0.5 and 0.47, respectively.
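The two Step 2 tables apply the same numeric thresholds to different LD measures, D' and r2. As a reminder of how the two differ, the sketch below computes both from the standard textbook definitions for a pair of biallelic SNPs, given the frequency of the A-B haplotype and the two allele frequencies. The function name and the convention of returning zero for a monomorphic SNP are assumptions of this illustration.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (|D'|, r2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0, 0.0  # monomorphic SNP: LD undefined, zero by convention here
    r2 = d * d / denom
    # D' rescales D by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d_prime, r2


if __name__ == "__main__":
    # A rare allele B found only on A haplotypes: complete LD by D', low r2.
    print(ld_measures(p_ab=0.1, p_a=0.5, p_b=0.1))
```

When allele frequencies differ, |D'| can equal 1 while r2 stays small, so a cutoff of, say, 0.5 flags many more SNP pairs under D' than under r2. This may be one reason the same threshold retains fewer SNPs in the D' table than in the r2 table.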