Pouya Khankhanian, Pierre-Antoine Gourraud, Antoine Lizee, Douglas S Goodin.
Abstract
Genome-wide association studies (GWAS), using single nucleotide polymorphisms (SNPs), have yielded 110 non-human leucocyte antigen genomic regions that are associated with multiple sclerosis (MS). Despite this large number of associations, however, only 28% of MS-heritability can currently be explained. Here we compare the use of multi-SNP-haplotypes to the use of single-SNPs as alternative methods to describe MS genetic risk. SNP-haplotypes (of various lengths from 1 up to 15 contiguous SNPs) were constructed at each of the 110 previously identified, MS-associated, genomic regions. Even after correcting for the larger number of statistical comparisons made when using the haplotype-method, in 32 of the regions, the SNP-haplotype based model was markedly more significant than the single-SNP based model. By contrast, in no region was the single-SNP based model similarly more significant than the SNP-haplotype based model. Moreover, when we included the 932 MS-associated SNP-haplotypes (that we identified from 102 regions) as independent variables into a logistic linear model, the amount of MS-heritability, as assessed by Nagelkerke's R-squared, was 38%, which was considerably better than 29%, which was obtained by using only single-SNPs. This study demonstrates that SNP-haplotypes can be used to fine-map the genetic associations within regions of interest previously identified by single-SNP GWAS. Moreover, the amount of the MS genetic risk explained by the SNP-haplotype associations in the 110 MS-associated genomic regions was considerably greater when using SNP-haplotypes than when using single-SNPs. Also, the use of SNP-haplotypes can lead to the discovery of new regions of interest, which have not been identified by a single-SNP GWAS. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Keywords: Clinical genetics; Genome-wide; Multiple sclerosis
Year: 2015 PMID: 26185143 PMCID: PMC4552900 DOI: 10.1136/jmedgenet-2015-103071
Source DB: PubMed Journal: J Med Genet ISSN: 0022-2593 Impact factor: 6.318
Figure 1. The strength of the multiple sclerosis (MS)-associations comparing single-nucleotide polymorphisms (SNPs) to multi-SNP-haplotypes in the 110 MS-associated regions identified by genome-wide association studies (GWAS) (refs 17, 18). The designations haplotype-2 and haplotype-3 refer to those circumstances in which the ‘top’ haplotype-association was more significant (adjusted) than the ‘top’ single-SNP-association by two or three orders of magnitude, respectively. Similarly, the designations singleton-2 and singleton-3 refer to circumstances in which the ‘top’ single-SNP-association was more significant than the ‘top’ haplotype-association by two or three orders of magnitude, respectively. The y-axis represents the total number of regions in which these particular circumstances occurred. The x-axis indicates the range of multi-SNP-haplotype sizes considered. In all cases SNP-haplotypes outperformed single-SNPs. Because the sets of larger haplotype sizes contained many more identified haplotypes than the sets of smaller haplotype sizes, they were subjected to a much more stringent Bonferroni correction. As a result, these larger sets did not perform as well relative to single-SNPs as the sets that included only haplotypes of nine SNPs or fewer.
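The haplotype-2, haplotype-3, singleton-2 and singleton-3 designations in the caption amount to comparing two adjusted p values on a log10 scale. The Python sketch below illustrates that comparison. The function name `classify_region`, the `"comparable"` label, and the reading of "two" and "three" as "at least two" and "at least three" orders of magnitude are assumptions made for illustration, not details taken from the paper, and the p values in the usage lines are invented.

```python
import math


def classify_region(p_haplotype_adj: float, p_single_snp_adj: float) -> str:
    """Label a region by how many orders of magnitude separate the two top hits.

    Both inputs are assumed to be multiplicity-adjusted p values in (0, 1].
    The thresholds are read as "at least 2" and "at least 3" orders of
    magnitude, which is an assumption about the caption's wording.
    """
    # Positive gap: the haplotype association is the more significant one.
    gap = math.log10(p_single_snp_adj) - math.log10(p_haplotype_adj)
    if gap >= 3:
        return "haplotype-3"
    if gap >= 2:
        return "haplotype-2"
    if gap <= -3:
        return "singleton-3"
    if gap <= -2:
        return "singleton-2"
    return "comparable"  # hypothetical label for everything in between


# Invented p values, for illustration only.
print(classify_region(1e-12, 1e-8))
print(classify_region(5e-11, 1e-8))
print(classify_region(1e-8, 3e-8))
```

Counting the regions that receive each label, separately for each range of haplotype sizes, would give the kind of bar heights described for the y-axis.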
Chromosome 6 (long arm) associations*
| SNPs† | Haplotype | Control (non-carriers) | Control (carriers) | Case (non-carriers) | Case (carriers) | OR (95% CI) | p Value‡ |
|---|---|---|---|---|---|---|---|
| Single-SNPs | | | | | | | |
| rs11154801_A | 1 | 7592 | 10 908 | 4212 | 7152 | 1.24 (1.18 to 1.30) | 2×10^−18 |
| rs1475069_C | 1 | 9444 | 9299 | 5036 | 6032 | 1.22 (1.16 to 1.28) | 3×10^−16 |
| Multiple SNPs | | | | | | | |
| rs7739635_A, rs10223338_A, rs12202212_A, rs1475069_C, rs2038551_A, rs9399161_G, rs4896180_G | 0001000 | 18 616 | 34 | 10 815 | 175 | 8.9 (6.1 to 13.2) | 5×10^−44 |
The SNP rs11154801_A is the top single-SNP ‘hit’ in the region. The SNP rs1475069_C is the top single-SNP ‘hit’ within the multi-SNP-haplotype but the fourth most significant single-SNP ‘hit’ overall. These two SNPs are separated by 317 kb of DNA. The 95% CIs for the OR are shown in parentheses.
*In all cases the model selected was dominant.
†Letters designate the minor allele nucleotide at the SNP location in the control population. Thus, the letters, which follow each SNP’s so-called ‘rs ID’ number, indicate the allele that was designated as the ‘1’ allele.
‡The p values presented have not been corrected for the total number of SNP-haplotypes or single-SNPs tested. The total number of SNP-haplotypes (of any length from 1 to 15 SNPs) was 110 310. Therefore, the Bonferroni-corrected p value for the multi-SNP-haplotype is 6×10^−39, which is still well below the uncorrected p value for the single-SNP.
SNP, single nucleotide polymorphism.
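The odds ratio and the Bonferroni correction quoted for the multi-SNP-haplotype can be checked with a few lines of arithmetic. The sketch below assumes that, within each Control and Case pair of columns, the first count is non-carriers and the second is carriers of the haplotype; that reading is inferred from the reported OR and is not stated in the table. The function names are hypothetical, and the Wald interval is a standard textbook choice, not necessarily the method the authors used.

```python
import math


def odds_ratio_wald(ctrl_non, ctrl_car, case_non, case_car, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval from a 2x2 table.

    Arguments are counts of non-carriers and carriers among controls and
    cases (column meaning is an assumption, see the lead-in).
    """
    odds_ratio = (case_car / case_non) / (ctrl_car / ctrl_non)
    # Standard error of log(OR) for a 2x2 table.
    se = math.sqrt(1 / ctrl_non + 1 / ctrl_car + 1 / case_non + 1 / case_car)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Counts for haplotype 0001000 and the test count from footnote ‡.
or_, lo, hi = odds_ratio_wald(18616, 34, 10815, 175)
p_adj = bonferroni(5e-44, 110310)
print(f"OR = {or_:.2f} ({lo:.1f} to {hi:.1f}), corrected p = {p_adj:.1e}")
```

Under these assumptions the odds ratio comes out near the reported 8.9 and the corrected p value near 6×10^−39. The Wald interval is close to, but slightly narrower than, the reported 6.1 to 13.2, which suggests the published interval came from a different procedure.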
Figure 2. The haplotype frequencies of the top ‘hits’ in cases and controls are shown on the x-axis (log-scale) in panels (A) and (B), respectively. In each panel, the frequencies are shown for single-nucleotide polymorphisms (SNPs) (SNP-haplotype length=1) and for multi-SNP-haplotypes (SNP-haplotype lengths=2–15). The y-axis shows the number (count) of different haplotypes that were present at the different mean haplotype frequencies.
Figure 3. Histogram of the different haplotype sizes for the 932 multiple sclerosis (MS)-associated single-nucleotide polymorphism (SNP)-haplotypes identified at the 110 non-major histocompatibility complex MS-associated genomic regions (see online supplementary table S1). The y-axis shows the per cent of the total haplotypes represented by each haplotype size. The x-axis shows the number of SNPs included in each identified SNP-haplotype (ie, the haplotype size).
Nagelkerke's R-squared (R2) values for the different data splits
| | R² | R² |
|---|---|---|
| Single-SNPs | | |
| Split A | 0.255 | 0.340 |
| Split B | 0.185 | 0.291 |
| SNP-haplotypes | | |
| Split A | 0.418 | 0.482 |
| Split B | 0.289 | 0.377 |
| HLA alone | | |
| Split A | – | 0.130 |
| Split B | – | 0.133 |
*The HLA models were based on previously published allelic associations (refs 4, 6, 7) in addition to an association with the DQB1*0502 allele (see online supplementary material). Consequently, the same set of alleles and allelic interactions was used for both splits.
HLA, human leucocyte antigen; SNP, single nucleotide polymorphism.
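Nagelkerke's R-squared rescales the Cox and Snell R-squared so that its maximum is 1, using only the log-likelihoods of the fitted logistic model and of the intercept-only (null) model. The sketch below shows the standard formula. The function names, the toy outcomes and the fitted probabilities are all invented for illustration and have no connection to the study data.

```python
import math


def bernoulli_loglik(y, p):
    """Log-likelihood of binary outcomes y under predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke's R-squared from null and fitted log-likelihoods."""
    # Cox and Snell: 1 - (L_null / L_model)^(2/n)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox and Snell can reach: 1 - L_null^(2/n)
    upper_bound = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / upper_bound


# Toy example: 3 cases and 5 controls with made-up fitted probabilities.
y = [1, 1, 1, 0, 0, 0, 0, 0]
fitted = [0.8, 0.7, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1]
base_rate = sum(y) / len(y)

ll_null = bernoulli_loglik(y, [base_rate] * len(y))
ll_model = bernoulli_loglik(y, fitted)
r2 = nagelkerke_r2(ll_null, ll_model, len(y))
print(f"Nagelkerke R2 = {r2:.3f}")
```

A model no better than the null gives 0 and a model that predicts every outcome perfectly gives 1, which is what allows values such as those in the table to be read on a common 0 to 1 scale.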
Chromosome 13 associations*
| SNPs† | Haplotype | Control (non-carriers) | Control (carriers) | Case (non-carriers) | Case (carriers) | OR | p Value‡ |
|---|---|---|---|---|---|---|---|
| Multiple SNPs | | | | | | | |
| rs3116605_G, rs17074558_A, rs279072_G, rs1928123_C | 1010 | 18 718 | 26 | 11 059 | 161 | 10.5 | 3×10^−43 |
| Split A | 1010 | 9558 | 12 | 5522 | 77 | 10.9 | 9×10^−22 |
| Split B | 1010 | 9360 | 14 | 5537 | 84 | 10.1 | 5×10^−23 |
*In all cases the model was: one copy of the haplotype vs zero copies of the haplotype.
†Letters designate the minor allele nucleotide at the SNP location in the control population. If the haplotype has a one at a particular location, this indicates that the haplotype has the minor SNP allele at this location; zero indicates the opposite.
‡The p values presented have not been corrected for the total number of SNP-haplotypes or single-SNPs tested. The total number of SNP-haplotypes (of any length from 1 to 15 SNPs) was 26 180. Therefore, the Bonferroni-corrected p value for the multi-SNP-haplotype is 1×10^−38, which is still well below the uncorrected p value for the single-SNP.
SNP, single nucleotide polymorphism.
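The binary haplotype strings in these tables can be read against the SNP list using the convention in footnote †. The sketch below decodes the chromosome 13 haplotype. It assumes that the digits follow the order in which the SNPs are listed in the table, which the table does not state explicitly, and the function name `decode_haplotype` is hypothetical.

```python
def decode_haplotype(snps, bits):
    """Map each SNP to the allele state encoded by a 0/1 haplotype string.

    snps: labels such as "rs3116605_G", where the suffix is the minor allele.
    bits: string of 0s and 1s, assumed to be in the same order as snps.
    """
    if len(snps) != len(bits):
        raise ValueError("haplotype string and SNP list differ in length")
    decoded = {}
    for snp, bit in zip(snps, bits):
        rs_id, minor_allele = snp.rsplit("_", 1)
        if bit == "1":
            decoded[rs_id] = f"minor ({minor_allele})"
        elif bit == "0":
            decoded[rs_id] = "non-minor"
        else:
            raise ValueError(f"unexpected haplotype digit: {bit!r}")
    return decoded


chr13_snps = ["rs3116605_G", "rs17074558_A", "rs279072_G", "rs1928123_C"]
for rs_id, state in decode_haplotype(chr13_snps, "1010").items():
    print(rs_id, state)
```

Under that ordering assumption, the haplotype 1010 carries the minor allele at rs3116605 and rs279072 and the other allele at rs17074558 and rs1928123.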