Wai-Ki Yip, Gourab De, Nan Laird, Benjamin A Raby.
Abstract
Linkage- and association-based methods have been proposed for mapping disease-causing rare variants. Based on the family information provided in the Genetic Analysis Workshop 17 data set, we formulate a two-pronged approach that combines both methods. Using the identity-by-descent information provided for eight extended pedigrees (n = 697) and the simulated quantitative trait Q1, we explore various traditional nonparametric linkage analysis methods; the best result is obtained by assuming between-family heterogeneity and applying the Haseman-Elston regression to each pedigree separately. We discover strong signals from two genes in two different families and weaker signals for a third gene from two other families. As an exploratory approach, we apply an association test based on a modified family-based association test statistic to all rare variants (frequency < 1% or < 3%) designated as causal for Q1. Family-based association tests correctly identify causal single-nucleotide polymorphisms for four genes (KDR, VEGFA, VEGFC, and FLT1). Our results suggest that both linkage and association tests with families show promise for identifying rare variants.
Year: 2011; PMID: 22373204; PMCID: PMC3287856; DOI: 10.1186/1753-6561-5-S9-S21
Source DB: PubMed; Journal: BMC Proc; ISSN: 1753-6561
Pedigree information based on the combined sample
| Pedigree number | Number of nuclear families | Number of affected sibs | Total number of sib pairs | Number of affected sib pairs |
|---|---|---|---|---|
| 0 | 23 | 22 | 86 | 5 |
| 1 | 29 | 26 | 100 | 8 |
| 2 | 26 | 29 | 90 | 4 |
| 3 | 20 | 19 | 74 | 1 |
| 4 | 20 | 18 | 73 | 2 |
| 5 | 20 | 20 | 73 | 7 |
| 6 | 36 | 48 | 128 | 34 |
| 7 | 20 | 18 | 73 | 1 |
| Total | 194 | 200 | 697 | 62 |
Top candidate genes from separate pedigrees
| Pedigree 1 | Pedigree 3 | Pedigree 4 | Pedigree 5 |
|---|---|---|---|
| 0.000004 | 0.00000002 | 0.0052 | 0.0003 |
| 0.000013 | 0.00000002 | 0.0052 | 0.0009 |
| 0.000013 | 0.00000002 | 0.0052 | 0.0009 |
| 0.000013 | 0.00000002 | 0.0052 | 0.0009 |
| 0.000013 | 0.00000002 | 0.0052 | 0.0009 |
| 0.000013 | 0.00000002 | 0.0052 | 0.0009 |
| 0.000013 | 0.00000045 | 0.0052 | 0.0009 |
| 0.000013 | 0.00000045 | 0.0054 | 0.0009 |
| 0.000013 | 0.00005598 | 0.0054 | 0.0009 |
| 0.000013 | 0.00005598 | 0.0054 | 0.0009 |
| 0.000013 | 0.00047549 | 0.0054 | 0.0009 |
| 0.000013 | 0.00047549 | 0.0054 | 0.0009 |
| 0.000013 | 0.00047549 | 0.0054 | 0.0009 |
| 0.000013 | 0.00167619 | 0.0054 | 0.0009 |
| 0.000014 | 0.00167619 | 0.0054 | 0.0010 |
| 0.000014 | 0.00167619 | 0.0092 | 0.0011 |
| 0.000014 | 0.00167619 | 0.0208 | 0.0013 |
| 0.000014 | 0.00242110 | 0.0213 | 0.0013 |
| 0.000019 | 0.00242110 | 0.0229 | 0.0015 |
| 0.000019 | 0.00303729 | 0.0229 | 0.0015 |
Linkage analysis results of top candidate genes by regressing the square of the difference of Q1 against IBD for all sib pairs in a pedigree.
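The regression named in this caption can be sketched as follows. This is a minimal illustration of the classical Haseman-Elston idea (squared sib-pair trait difference regressed on IBD sharing, with a negative slope suggesting linkage), not the authors' analysis code. The function name `haseman_elston`, the toy data, and the use of a normal approximation in place of the t distribution are all assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist, mean


def haseman_elston(sq_diffs, ibd):
    """Ordinary least squares of squared trait differences on IBD sharing.

    sq_diffs: squared Q1 difference for each sib pair (hypothetical input).
    ibd:      proportion of alleles shared IBD for the same pairs (0 to 1).
    Returns (slope, t_stat, one_sided_p). The p-value uses a normal
    approximation to the t distribution, which is an assumption of this
    sketch and is only reasonable when there are many sib pairs.
    """
    n = len(sq_diffs)
    if n != len(ibd) or n < 3:
        raise ValueError("need at least 3 sib pairs with matching IBD values")

    x_bar, y_bar = mean(ibd), mean(sq_diffs)
    sxx = sum((x - x_bar) ** 2 for x in ibd)
    if sxx == 0:
        raise ValueError("IBD sharing does not vary across sib pairs")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ibd, sq_diffs))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(ibd, sq_diffs))
    se = sqrt(rss / (n - 2) / sxx)
    if se == 0:
        raise ValueError("perfect fit; standard error is zero")

    t_stat = slope / se
    # One-sided test: linkage predicts a negative slope.
    p_value = NormalDist().cdf(t_stat)
    return slope, t_stat, p_value


# Toy data for one hypothetical pedigree: pairs sharing more alleles IBD
# have smaller squared trait differences.
ibd = [0.0, 0.5, 1.0, 0.5, 0.0, 1.0]
sq_diffs = [4.1, 2.2, 0.3, 1.8, 3.9, 0.5]
slope, t_stat, p_value = haseman_elston(sq_diffs, ibd)
```

To mirror the between-family heterogeneity assumption, such a function would be called once per pedigree on that pedigree's sib pairs rather than once on the pooled sample.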
P-values corresponding to the true causal genes using Q1 as phenotype
| Chromosome | Gene | Nuclear families (1% cutoff) | Pedigrees (1% cutoff) | Nuclear families (3% cutoff) | Pedigrees (3% cutoff) |
|---|---|---|---|---|---|
| 1 | | 0.441 | 0.301 | 0.450 | 0.406 |
| 1 | | 0.447 | 0.347 | 0.952 | 0.948 |
| 4 | | 0.03 | 0.09 | 0.229 | 0.092 |
| 4 | | 0.009 | 0.317 | 0.009 | 0.317 |
| 5 | | 0.314 | 0.299 | 0.319 | 0.304 |
| 6 | | 0.0002 | 0.122 | 0.002 | 0.156 |
| 13 | | 0.076 | 0.128 | 0.0003 | 0.024 |
| 14 | | NA | NA | 0.317 | 0.317 |
| 19 | | 0.508 | 0.466 | 0.638 | 0.609 |
a Gene that has polymorphic causal SNPs. The other five causal genes (not marked by superscript a) cannot be identified by our method because the sample contains no causal SNPs for those genes.
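The two frequency cutoffs, and the footnote's point that a gene with no polymorphic causal SNPs in the sample cannot be tested, can be illustrated with a small filter. This is only a sketch under stated assumptions: genotypes are coded as allele counts (0, 1, or 2) per individual, allele frequency is estimated naively from all individuals rather than from founders, and the SNP identifiers and the function name `select_rare_polymorphic` are hypothetical.

```python
def select_rare_polymorphic(snp_genotypes, cutoff):
    """Return IDs of SNPs that are polymorphic and rarer than `cutoff`.

    snp_genotypes: dict mapping a SNP ID to a list of allele counts (0/1/2),
                   one per individual (hypothetical data layout).
    cutoff:        minor allele frequency threshold, e.g. 0.01 or 0.03.
    """
    selected = []
    for snp_id, genotypes in snp_genotypes.items():
        freq = sum(genotypes) / (2 * len(genotypes))
        maf = min(freq, 1 - freq)
        # maf == 0 means the SNP is monomorphic in this sample and
        # carries no information for an association test.
        if 0 < maf < cutoff:
            selected.append(snp_id)
    return selected


# Toy sample of 100 individuals with made-up SNP IDs.
toy = {
    "snpA": [1] + [0] * 99,        # MAF 0.005: passes both cutoffs
    "snpB": [1] * 4 + [0] * 96,    # MAF 0.02: passes only the 3% cutoff
    "snpC": [0] * 100,             # monomorphic: never selected
    "snpD": [1] * 40 + [0] * 60,   # MAF 0.20: too common
}
rare_1pct = select_rare_polymorphic(toy, 0.01)
rare_3pct = select_rare_polymorphic(toy, 0.03)
```

A gene whose causal SNPs all look like `snpC` in the sample would contribute nothing to either cutoff, which is the situation the footnote describes.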
True-positive rates corresponding to the true causal genes using Q1 as phenotype (estimated from the 200 replications provided in the GAW17 data set)
| Chromosome | Gene | 1% cutoff | 3% cutoff |
|---|---|---|---|
| 4 | | 0.085 | 0.035 |
| 4 | | 0.995 | 1 |
| 6 | | 0.995 | 0.990 |
| 13 | | 0.075 | 0.775 |
Figure 1. Detection rates from modified FBAT for all genes on chromosomes 4, 5, 6, and 13. Each bar in the graphs represents the percentage of times that the gene was significant (p < 0.05) in the 200 replicates. True-positive disease genes are labeled. Of note, the KIT locus on chromosome 4, frequently detected as a false positive, is in close proximity (394 kb) to the disease-causing KDR locus.
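The detection rate described here is the fraction of replicates in which a gene reaches p < 0.05. A minimal sketch of that calculation follows; the function name `detection_rate` is hypothetical, and treating a missing p-value as "not significant" (so that the denominator stays at the full number of replicates) is an assumption of this example rather than something the source states.

```python
def detection_rate(p_values, alpha=0.05):
    """Fraction of replicates in which the gene is significant at `alpha`.

    p_values: one p-value per replicate; None marks a replicate where the
              test could not be computed (counted as not significant here,
              which is an assumption of this sketch).
    """
    if not p_values:
        raise ValueError("no replicates supplied")
    hits = sum(1 for p in p_values if p is not None and p < alpha)
    return hits / len(p_values)


# Toy example: 200 replicates, 199 of them significant.
toy_p_values = [0.001] * 199 + [0.20]
rate = detection_rate(toy_p_values)
```

With 200 replicates the rate moves in steps of 0.005, so a tabulated true-positive rate of 0.995 corresponds to 199 significant replicates and 0.775 to 155.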