Ilja M Nolte, André R de Vries, Geert T Spijker, Ritsert C Jansen, Dumitru Brinza, Alexander Zelikovsky, Gerard J Te Meerman.
Abstract
We propose two new haplotype-sharing methods for identifying disease loci: the haplotype sharing statistic (HSS), which compares the length of shared haplotypes between cases and controls, and the CROSS test, which tests whether a case and a control haplotype show less sharing than two random haplotypes. The significance of the HSS is determined using a variance estimate from the theory of U-statistics, whereas the significance of the CROSS test is estimated from a sequential randomization procedure. Both methods are fast and hence practical, even for whole-genome screens with high marker densities. We analyzed data sets of Problems 2 and 3 of Genetic Analysis Workshop 15 and compared HSS and CROSS to conventional association methods. Problem 2 provided a data set of 2300 single-nucleotide polymorphisms (SNPs) in a 10-Mb region of chromosome 18q, which had shown linkage evidence for rheumatoid arthritis. The CROSS test detected a significant association at approximately position 4407 kb. This was supported by single-marker association and HSS. The CROSS test outperformed them both with respect to significance level and signal-to-noise ratio. A 20-kb candidate region could be identified. Problem 3 provided a simulated 10k SNP data set covering the whole genome. Three known candidate regions for rheumatoid arthritis were detected. Again, the CROSS test gave the most significant results. Furthermore, both the HSS and the CROSS test showed better fine-mapping accuracy than straightforward haplotype association. In conclusion, haplotype-sharing methods, particularly the CROSS test, show great promise for identifying disease gene loci.
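To make the notion of "length of shared haplotypes" concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not define how sharing is measured, so this example assumes that sharing between two haplotypes at a focal marker is the number of contiguous markers around it at which both carry the same allele. The function names `shared_length` and `mean_pairwise_sharing` and the toy haplotypes are hypothetical.

```python
from itertools import combinations
from statistics import mean


def shared_length(h1, h2, locus):
    """Count contiguous markers around `locus` where h1 and h2 are identical.

    Assumption: sharing is measured in number of markers, not in kb or cM.
    Returns 0 if the two haplotypes already differ at the focal marker.
    """
    if h1[locus] != h2[locus]:
        return 0
    left = locus
    while left > 0 and h1[left - 1] == h2[left - 1]:
        left -= 1
    right = locus
    while right < len(h1) - 1 and h1[right + 1] == h2[right + 1]:
        right += 1
    return right - left + 1


def mean_pairwise_sharing(haplotypes, locus):
    """Average shared length over all unordered pairs of haplotypes."""
    return mean(shared_length(a, b, locus)
                for a, b in combinations(haplotypes, 2))


# Toy data (made up for illustration): one string per haplotype.
case_haplotypes = ["AAGTC", "AAGTC", "AAGTA"]
control_haplotypes = ["CAGTC", "ATGAC"]

case_sharing = mean_pairwise_sharing(case_haplotypes, locus=2)
control_sharing = mean_pairwise_sharing(control_haplotypes, locus=2)
print(case_sharing, control_sharing)
```

A statistic in the spirit of the HSS would contrast the case and control means at each marker. The variance estimate from U-statistic theory and the CROSS randomization procedure are not specified in the abstract and are not reproduced here.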
Year: 2007 PMID: 18466471 PMCID: PMC2367507 DOI: 10.1186/1753-6561-1-s1-s129
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Figure 1. Analysis results of the NARAC data set. Results of the CROSS test (green line), the HSS (pink line) and single-marker association (black line) are plotted for all 2300 SNPs on chromosome 18q.
Figure 2. Region showing the strongest association in the NARAC data set. The -log(p-values) of the CROSS test (green line), the HSS (pink line) and single-SNP association (black line) are plotted for the 150-kb region showing the strongest association.
Figure 3. Analysis results of the simulated data set in the regions 6p21 (a), 11q23.1 (b), and 18q22.2 (c). On the x-axis, the cumulative position with respect to the start of chromosome 1 is given. The black, blue, pink, and green lines represent the mean of the -log(p-values) of the ten replicates of single-SNP association, five-SNP haplotype association, HSS, and CROSS, respectively.
Type I error and power
| Method | Type I error (a) at 0.05 | Type I error (a) at 0.005 | Type I error (a) at 0.0005 | Power (b), Chr 6 | Power (b), Chr 11 | Power (b), Chr 18 |
|---|---|---|---|---|---|---|
| Single-SNP association | 0.051 | 0.0048 | 0.00057 | 1.0 | 0.7 | 0.1 |
| Haplotype association | 0.053 | 0.0052 | 0.00051 | 1.0 | 0.8 | 0.1 |
| HSS | 0.052 | 0.0046 | 0.00031 | 1.0 | 1.0 | 0.1 |
| CROSS | 0.051 | 0.0051 | 0.00061 | 1.0 | 1.0 | 0.4 |
a Type I error rates at three different significance levels were determined from Replicates 1 to 10, using the SNPs on all chromosomes except 6, 11, and 18.
b Power was determined as the fraction of replicates showing a significant result (p < 0.05/9187) in a 1-Mb region around SNP6_153 on chromosome 6, SNP11_389 on chromosome 11, and SNP18_269 on chromosome 18.
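The power definition in the footnote can be illustrated with a short sketch. It assumes that 0.05/9187 is a Bonferroni-style correction over 9187 tests and that a replicate counts as a hit when its smallest p-value in the 1-Mb region falls below that threshold. The p-values are invented for illustration and are not results from the paper; the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni-style correction."""
    return alpha / n_tests


def empirical_power(min_p_per_replicate, threshold):
    """Fraction of replicates whose smallest regional p-value is significant."""
    hits = sum(p < threshold for p in min_p_per_replicate)
    return hits / len(min_p_per_replicate)


threshold = bonferroni_threshold(0.05, 9187)  # roughly 5.44e-6

# Hypothetical minimum p-values in one candidate region, one per replicate.
min_p_values = [1e-7, 3e-6, 2e-5, 4e-4, 9e-7, 1e-3, 6e-6, 2e-6, 5e-2, 8e-5]

print(threshold)
print(empirical_power(min_p_values, threshold))
```

With ten replicates, power can only take values in steps of 0.1, which matches the granularity of the power columns in the table.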