Zareen Vadva, Charles E Larsen, Bennett E Propp, Michael R Trautwein, Dennis R Alford, Chester A Alper.
Abstract
Single nucleotide polymorphisms (SNPs) are usually the most frequent genomic variants. Directly pedigree-phased multi-SNP haplotypes provide a more accurate view of polymorphic population genomic structure than individual SNPs. The former are, therefore, more useful in genetic correlation with subject phenotype. We describe a new pedigree-based methodology for generating non-ambiguous SNP haplotypes for genetic study. SNP data for haplotype analysis were extracted from a larger Type 1 Diabetes Genetics Consortium SNP dataset based on minor allele frequency variation and redundancy, coverage rate (the frequency of phased haplotypes in which each SNP is defined) and genomic location. Redundant SNPs were eliminated, overall haplotype polymorphism was optimized and the number of undefined haplotypes was minimized. These edited SNP haplotypes from a region containing HLA-DRB1 (DR) and HLA-DQB1 (DQ) both correlated well with HLA-typed DR,DQ haplotypes and differentiated HLA-DR,DQ fragments shared by three pairs of previously identified megabase-length conserved extended haplotypes. In a pedigree-based genetic association assay for type 1 diabetes, edited SNP haplotypes and HLA-typed HLA-DR,DQ haplotypes from the same families generated essentially identical qualitative and quantitative results. Therefore, this edited SNP haplotype method is useful for both genomic polymorphic architecture and genetic association evaluation using SNP markers with diverse minor allele frequencies.
Keywords: HLA polymorphism; T1DGC; disease association; haplotype; major histocompatibility complex (MHC); pedigree; phase; protocol; single nucleotide polymorphism (SNP); type 1 diabetes (T1D)
Year: 2019 PMID: 31387299 PMCID: PMC6721696 DOI: 10.3390/cells8080835
Source DB: PubMed Journal: Cells ISSN: 2073-4409 Impact factor: 6.600
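The coverage rate defined in the abstract (the frequency of phased haplotypes in which each SNP is defined) and the elimination of redundant SNPs can be illustrated with a minimal Python sketch. The data encoding (`None` for an undefined call), the function names `coverage_rates` and `redundant_snps`, and the redundancy criterion (two SNPs that split the fully defined haplotypes identically) are assumptions made for illustration, not the authors' protocol.

```python
from typing import Optional, Sequence

# Hypothetical encoding: a phased haplotype is a sequence of allele calls,
# with None marking a SNP that is undefined (untyped or unphased).
Haplotype = Sequence[Optional[str]]


def coverage_rates(haplotypes: Sequence[Haplotype]) -> list[float]:
    """Fraction of haplotypes in which each SNP position is defined."""
    if not haplotypes:
        return []
    n_snps = len(haplotypes[0])
    if any(len(h) != n_snps for h in haplotypes):
        raise ValueError("all haplotypes must cover the same SNP positions")
    total = len(haplotypes)
    return [
        sum(h[i] is not None for h in haplotypes) / total
        for i in range(n_snps)
    ]


def redundant_snps(haplotypes: Sequence[Haplotype]) -> list[int]:
    """Indices of SNPs that split the fully defined haplotypes exactly
    like an earlier SNP (assumed redundancy criterion)."""
    full = [h for h in haplotypes if all(a is not None for a in h)]
    if not full:
        return []
    seen: set[tuple[int, ...]] = set()
    redundant: list[int] = []
    for i in range(len(full[0])):
        # Relabel alleles by order of first appearance so that, for
        # example, A/G and C/T columns with the same split compare equal.
        labels: dict[str, int] = {}
        signature = tuple(labels.setdefault(h[i], len(labels)) for h in full)
        if signature in seen:
            redundant.append(i)
        else:
            seen.add(signature)
    return redundant


# Toy data, not taken from the T1DGC dataset.
example = [
    ("A", "C", "T"),
    ("G", "T", "T"),
    ("A", "C", None),
    ("G", "T", "C"),
]
rates = coverage_rates(example)
duplicates = redundant_snps(example)
```

In this toy input the second SNP carries the same information as the first, so it is flagged as redundant, while the third SNP is defined in only three of the four haplotypes.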
Figure 1. Genomic map of HLA-DR/DQ region in the human major histocompatibility complex (MHC) reference sequence. The map shows a slightly larger region than that phased in MERLIN. The two marked single nucleotide polymorphisms (SNPs) represent the boundaries of the phased 101 SNP haplotypes from which SNPs were “pre-triaged” for redundancy to create the initial 37-SNP haplotypes for further editing.
Table 1. T1DGC MHC Fine Mapping SNPs to distinguish B8,DR3 and B18,DR3 CEHs 1.
| dbSNP Variants | rs2076536 | rs3117103 | rs3135363 | rs6901541 | rs4999342 | Cell Line | ||
|---|---|---|---|---|---|---|---|---|
| B8,DR3 | T | T | G | C | C | COX | ||
| B18,DR3 | C | A | A | T | T | QBL | ||
| | | | | | | | | |
| B8,DR3 | 1 | 1 | 3 | 2 | 2 | 95.0 | 0.8 | 4.2 |
| B18,DR3 | 3 | 4 | 1 | 4 | 4 | 79.4 | 10.7 | 9.8 |
1 Seq = Sequence. Shown are the reference sequence (rs) SNP alleles for two different MHC conserved extended haplotypes (CEHs). dbSNP data were provided by NCBI (https://www.ncbi.nlm.nih.gov/snp/).
Table 2. The most frequent T1DGC ImmunoChip study edited 27-SNP haplotypes 1.
| SNP Variant Name | Total (n) | Percentage | SNP Variant Name | Total (n) | Percentage |
|---|---|---|---|---|---|
| Variant 1 | 1786 | 28.3 | Variant 19 | 45 | 0.7 |
| Variant 2 | 1064 | 16.9 | Variant 20 | 26 | 0.4 |
| Variant 3 | 517 | 8.2 | Variant 21 | 22 | 0.3 |
| Variant 4 | 467 | 7.4 | Variant 22 | 22 | 0.3 |
| Variant 5 | 400 | 6.3 | Variant 23 | 21 | 0.3 |
| Variant 6 | 308 | 4.9 | Variant 24 | 10 | 0.2 |
| Variant 7 | 296 | 4.7 | Variant 25 | | 0.1 |
| Variant 8 | 285 | 4.5 | Variant 26 | | 0.1 |
| Variant 9 | 154 | 2.4 | Variant 27 | | 0.1 |
| Variant 10 | 134 | 2.1 | Variant 28 | | 0.1 |
| Variant 11 | 112 | 1.8 | Variant 29 | 8 | 0.1 |
| Variant 12 | 96 | 1.5 | Variant 30 | 7 | 0.1 |
| Variant 13 | 79 | 1.3 | Variant 31 | 6 | 0.1 |
| Variant 14 | 77 | 1.2 | Variant 32 | 4 | 0.1 |
| Variant 15 | 71 | 1.1 | Variant 33 | 4 | 0.1 |
| Variant 16 | 59 | 0.9 | Variant 34 | 4 | 0.1 |
| Variant 17 | 56 | 0.9 | Variant 68 | 1 | 0.0 |
| Variant 18 | 55 | 0.9 | | | |
1 These edited SNP haplotype variants are those that existed at n ≥ 4 in the entire T1DGC ImmunoChip study or were otherwise named in the main text (Variant 68). The SNP haplotype sequences for all of the edited SNP haplotypes named here are given in Supplementary Table S2.
Table 3. Major edited 27-SNP haplotypes shared by both T1DGC studies.
| Edited SNP Haplo Rank | Variant | SNP Haplo | % Defined | Dominant DRB1 | Dominant DQA1 | Dominant DQB1 | HLA Abbrev. |
|---|---|---|---|---|---|---|---|
| 1 | Variant 1 | 729 | 28.5 | 04:xx | 03:01 | 03:02 | DR4,DQ8 |
| 2 | Variant 2 | 447 | 17.5 | 03:01 | 05:01 | 02:01 | B8,DR3,DQ2 |
| 3 | Variant 4 | 205 | 8.0 | 03:01 | 05:01 | 02:01 | B18,DR3,DQ2 |
| 4 | Variant 3 | 202 | 7.9 | 01:01 | 01:01 | 05:01 | DR0101,DQ5 |
| 5 | Variant 5 | 166 | 6.5 | 07:01 | 02:01 | 02:02 | DR7,DQ2 |
| 6 | Variant 6 | 124 | 4.8 | 04:xx | 03:01 | 03:01/03:04 | DR4,DQ7 |
| 7 | Variant 7 | 116 | 4.5 | 15:01 | 01:02 | 06:02 | DR15,DQ6 |
| 8 | Variant 8 | 106 | 4.1 | 11:xx | 05:01 | 03:01 | DR11,DQ3 |
| 9 | Variant 9 | 62 | 2.4 | 08:01 | 04:01 | 04:02 | DR8,DQ4 |
| 10 | Variant 10 | 53 | 2.1 | 13:02 | 01:02 | 06:04 | DR1302,DQ6 var1 |
| 11 | Variant 11 | 40 | 1.6 | 13:01 | 01:03 | 06:03 | DR1301,DQ6 var1 |
| 11 | Variant 12 | 40 | 1.6 | 16:01 | 01:02 | 05:02 | DR16,DQ5 |
| 13 | Variant 14 | 34 | 1.3 | 07:01 | 02:01 | 03:03 | DR7,DQ3 |
| 14 | Variant 13 | 30 | 1.2 | 13:01 | 01:03 | 06:03 | DR1301,DQ6 var2 |
| 15 | Variant 15 | 29 | 1.1 | 09:01 | 03:01 | 03:03 | DR9,DQ3 |
| 16 | Variant 17 | 23 | 0.9 | 12:01 | 05:01 | 03:01 | DR12,DQ3 |
| 17 | Variant 19 | 22 | 0.9 | 13:02 | 01:02 | 06:04 | DR1302,DQ6 var2 |
| 18 | Variant 18 | 21 | 0.8 | 14:01/14:04 | 01:01 | 05:03 | DR14,DQ5 |
| 19 | Variant 16 | 17 | 0.7 | 01:02 | 01:01 | 05:01 | DR0102,DQ5 |
| TOTAL | | 2466 | 96.3 | | | | |
Table 4. Major HLA-DR,DQ haplotypes shared in T1DGC studies: their dominant 27-SNP haplotype and their percentages 1.
| DR,DQ Rank | HLA Haplo | DR,DQ Total | % of All DR,DQ | Dominant SNP Haplo | 1st SNP Haplo Total | % of This DR,DQ | % of Fully-Defined |
|---|---|---|---|---|---|---|---|
| 1 | DR4,DQ8 | 1024 | 27.4 | Variant 1 | 722 | 70.5% | 99.2% |
| 2 | All DR3,DQ2 | 950 | 25.4 | Variant 2 | 441 | 46.4% | 67.7% |
| 3 | DR0101,DQ5 | 290 | 7.8 | Variant 3 | 185 | 63.8% | 99.5% |
| 4 | DR7,DQ2 | 230 | 6.2 | Variant 5 | 165 | 71.7% | 100.0% |
| 5 | DR15,DQ6 | 188 | 5.0 | Variant 7 | 110 | 58.5% | 99.1% |
| 6 | DR11,DQ3 | 182 | 4.9 | Variant 8 | 102 | 56.0% | 98.1% |
| 7 | DR4,DQ7 | 155 | 4.1 | Variant 6 | 106 | 68.4% | 98.1% |
| 8 | DR1301,DQ6 | 108 | 2.9 | Variant 11 | 40 | 37.0% | 58.8% |
| 9 | DR1302,DQ6 | 104 | 2.8 | Variant 10 | 51 | 49.0% | 65.4% |
| 10 | DR8,DQ4 | 89 | 2.4 | Variant 9 | 58 | 65.2% | 90.6% |
| 11 | DR16,DQ5 | 62 | 1.7 | Variant 12 | 40 | 64.5% | 100.0% |
| 11 | DR7,DQ3 | 54 | 1.4 | Variant 14 | 34 | 63.0% | 94.4% |
| 13 | DR9,DQ3 | 45 | 1.2 | Variant 15 | 29 | 64.4% | 100.0% |
| 14 | DR14,DQ5 | 36 | 1.0 | Variant 18 | 21 | 58.3% | 87.5% |
| 15 | DR12,DQ3 | 30 | 0.8 | Variant 17 | 22 | 73.3% | 95.7% |
| 16 | DR0102,DQ5 | 27 | 0.7 | Variant 16 | 17 | 63.0% | 100.0% |
| TOTAL | | 3574 | 95.7 | TOTAL | 2143 | | |
1 The HLA haplotype abbreviations used here are those from Table 3 with minor exceptions. Here, the test haplotype is the HLA-DR,DQ (DR,DQ) haplotype. Therefore, for example, the entire DR3,DQ2 group is analyzed. The last column gives the percentage of each DR,DQ haplotype group represented by the dominant 27-SNP haplotype among all fully-defined 27-SNP haplotypes. The second most frequent 27-SNP haplotypes and their percentages of each DR,DQ haplotype group, as well as the total untyped or unphased 27-SNP haplotypes for each DR,DQ group, are given in Supplementary Table S3.
Table 5. Dominant MHC CEHs in major 27-SNP edited haplotypes of the DR,DQ region 1.
| SNP Haplo Var. Name | Dom. DR,DQ Haplo | SNP Haplo Total (n) | Dom. DR,DQ Haplo Total (n) | Dom. CEH of DR,DQ Var. | Dom. DR,DQ CEH Total (n; %) |
|---|---|---|---|---|---|
| Variant 1 | | 729 | 722 | None | ** |
| Variant 2 | | 447 | 441 | [HLA-C7,B8,SC01,DR3] | ** |
| Variant 3 | | 202 | 185 | * | ** |
| Variant 4 | | 205 | 202 | [HLA-C5,B18,F1C30,DR3] | ** |
| Variant 5 | | 166 | 165 | * | ** |
| Variant 6 | | 124 | 106 | * | ** |
| Variant 7 | | 116 | 110 | [HLA-C7,B7,SC31,DR15] | 31; 52% *** |
| Variant 8 | | 106 | 102 | None | ** |
| Variant 9 | | 62 | 58 | [HLA-C7,B39,unk,DR8] | 11; 19% |
| Var. 10 | | 53 | 51 | [HLA-C3,B40,SC02,DR13] | 26; 51% |
| Var. 11 | | 40 | 40 | [HLA-C12,B38,SC21,DR13] | 10; 25% |
| Var. 12 | | 40 | 40 | [HLA-C12,B39,unk,DR16] | 9; 23% |
| Var. 13 | | 30 | 27 | [HLA-C3,B15,unk,DR13] | 8; 30% |
| Var. 14 | | 34 | 34 | [HLA-C6,B57,SC61,DR7] | 20; 59% |
| Var. 15 | | 29 | 29 | [HLA-C7,B7,unk,DR9] | 5; 17% |
| Var. 16 | | 17 | 17 | [HLA-C8,B14,SC2(1,2),DR1] | 11; 65% |
| Var. 17 | | 23 | 22 | [HLA-C5,B44,unk,DR12] | 4; 18% |
| Var. 18 | | 21 | 21 | [HLA-C4,B35,unk,DR14] | 6; 29% |
| Var. 19 | | 22 | 22 | [HLA-C7,B15,unk,DR13] | 6; 27% |
| TOTAL | | 2466 | 2394 | | |
1 Dom. = Dominant; Haplo = Haplotype; Var. = Variant. * The dominant CEH of this group was not determined; ** The totals for these CEHs were not determined; *** Only 60 of 110 haplotypes were evaluated.
Table 6. Analysis of equalized fully-phased 27-SNP edited disease (DIS) and family control (FC) haplotypes from the ImmunoChip study 1.
| SNP Haplo Var. Name | DIS Haplo | FC Haplo | Total | DIS/FC | DIS Haplo Rank | FC Haplo Rank | χ2 * |
|---|---|---|---|---|---|---|---|
| Variant 1 | 916 | 238 | 1154 | 3.85 | 1 | 2 | 398.34 |
| Variant 2 | 416 | 204 | 620 | 2.04 | 2 | 3 | 72.49 |
| Variant 3 | 108 | 190 | 298 | 0.57 | 4 | 5 | 22.56 |
| Variant 4 | 249 | 46 | 295 | 5.41 | 3 | 12 | 139.69 |
| Variant 7 | 5 | 254 | 259 | 0.02 | 14 | 1 | 239.39 |
| Variant 5 | 47 | 190 | 237 | 0.25 | 6 | 5 | 86.28 |
| Variant 8 | 18 | 195 | 213 | 0.09 | 10 | 4 | 147.08 |
| Variant 6 | 73 | 120 | 193 | 0.61 | 5 | 7 | 11.45 |
| Variant 11 | 8 | 71 | 79 | 0.11 | 13 | 8 | 50.24 |
| Variant 9 | 45 | 32 | 77 | 1.41 | 7 | 15 | 2.19 |
| Variant 14 | 2 | 68 | 70 | 0.03 | 19 | 9 | -- |
| Variant 10 | 25 | 43 | 68 | 0.58 | 8 | 13 | 4.76 |
| Variant 12 | 20 | 39 | 59 | 0.51 | 9 | 14 | 6.12 |
| Variant 13 | 4 | 52 | 56 | 0.08 | 17 | 10 | -- |
| Variant 18 | 1 | 49 | 50 | 0.02 | 22 | 11 | -- |
| Variant 15 | 18 | 24 | 42 | 0.75 | 10 | 17 | 0.86 |
| Variant 17 | 5 | 26 | 31 | 0.19 | 14 | 16 | 14.23 |
| Variant 16 | 9 | 17 | 26 | 0.53 | 12 | 19 | 2.46 |
| Variant 19 | 4 | 16 | 20 | 0.25 | 17 | 20 | -- |
| Variant 20 | 1 | 19 | 20 | 0.05 | 22 | 18 | -- |
| Variant 21 | 1 | 10 | 11 | 0.10 | 22 | 21 | -- |
| Variant 23 | 1 | 10 | 11 | 0.10 | 22 | 21 | -- |
| Variant 25 | 5 | 1 | 6 | 5.00 | 14 | 29 | -- |
| Variant 29 | 1 | 5 | 6 | 0.20 | 22 | 23 | -- |
| Variant 28 | 2 | 3 | 5 | 0.67 | 19 | 24 | -- |
| Variant 33 | 2 | 2 | 4 | 1.00 | 19 | 27 | -- |
| Variant 31 | 1 | 3 | 4 | 0.33 | 22 | 24 | -- |
| Variant 27 | 1 | 3 | 4 | 0.33 | 22 | 24 | -- |
| Variant 32 | 1 | 2 | 3 | 0.50 | 22 | 27 | -- |
| Misc. Haplos | 15 | 72 | 87 | | | | |
| TOTAL | 2004 | 2004 | 4008 | | | | 1198.15 |
1 Misc. = Miscellaneous; Haplo = Haplotype; Var. = Variant. * Chi-squared statistic of DIS and FC SNP haplotypes each ≥ 5 in frequency.
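The DIS/FC ratios and χ2 values tabulated above can be reproduced with a short Python sketch. The source does not state the formula; the values are consistent with a one-degree-of-freedom comparison of DIS and FC counts against an equal split, (DIS − FC)² / (DIS + FC), applied only when both counts are at least 5. That formula, and the function names `dis_fc_ratio` and `dis_fc_chi_squared`, are assumptions inferred from the table, not a description of the authors' software.

```python
from typing import Optional


def dis_fc_ratio(dis: int, fc: int) -> Optional[float]:
    """Ratio of disease (DIS) to family control (FC) haplotype counts."""
    if fc == 0:
        return None  # ratio is undefined without any control haplotypes
    return dis / fc


def dis_fc_chi_squared(dis: int, fc: int, min_count: int = 5) -> Optional[float]:
    """Chi-squared statistic against an equal DIS/FC split (assumed formula).

    With equalized totals, the expected count in each group is
    (dis + fc) / 2, which simplifies to (dis - fc)**2 / (dis + fc).
    Returns None when either count is below min_count, matching the
    "--" entries in the table.
    """
    if dis < min_count or fc < min_count:
        return None
    return (dis - fc) ** 2 / (dis + fc)


# Counts for Variant 1, Variant 7 and Variant 25 from the ImmunoChip table.
variant_1 = round(dis_fc_chi_squared(916, 238), 2)
variant_7 = round(dis_fc_chi_squared(5, 254), 2)
variant_25 = dis_fc_chi_squared(5, 1)
ratio_1 = round(dis_fc_ratio(916, 238), 2)
```

Under these assumptions, Variant 1 gives a ratio of 3.85 and a χ2 of 398.34, and Variant 25 yields no statistic because its FC count is below 5, in line with the tabulated entries.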
Table 7. Analysis of equalized DIS and FC SNP haplotypes each ≥ 5 in frequency among overlapping families in both T1DGC studies 1.
| SNP Haplo Var. Name | DIS Haplo | FC Haplo | Total | DIS/FC | DIS Haplo Rank | FC Haplo Rank | χ2 * |
|---|---|---|---|---|---|---|---|
| Variant 1 | 373 | 102 | 475 | 3.66 | 1 | 2 | 154.61 |
| Variant 2 | 169 | 84 | 253 | 2.01 | 2 | 3 | 28.56 |
| Variant 4 | 105 | 24 | 129 | 4.38 | 3 | 9 | 50.86 |
| Variant 3 | 44 | 72 | 116 | 0.61 | 4 | 5 | 6.76 |
| Variant 7 | 1 | 103 | 104 | 0.01 | 17 | 1 | -- |
| Variant 5 | 21 | 78 | 99 | 0.27 | 6 | 4 | 32.82 |
| Variant 8 | 11 | 70 | 81 | 0.16 | 8 | 6 | 42.98 |
| Variant 6 | 26 | 47 | 73 | 0.55 | 5 | 7 | 6.04 |
| Variant 9 | 17 | 16 | 33 | 1.06 | 7 | 13 | 0.03 |
| Variant 11 | 2 | 30 | 32 | 0.07 | 13 | 8 | -- |
| Variant 10 | 10 | 16 | 26 | 0.63 | 9 | 13 | 1.38 |
| Variant 12 | 6 | 18 | 24 | 0.33 | 10 | 12 | 6.00 |
| Variant 13 | 2 | 20 | 22 | 0.10 | 13 | 10 | -- |
| Variant 18 | 1 | 19 | 20 | 0.05 | 17 | 11 | -- |
| Variant 15 | 3 | 12 | 15 | 0.25 | 12 | 15 | -- |
| Variant 17 | 1 | 11 | 12 | 0.09 | 17 | 16 | -- |
| Variant 19 | 4 | 5 | 9 | 0.80 | 11 | 18 | -- |
| Variant 16 | 2 | 6 | 8 | 0.33 | 13 | 17 | -- |
| Variant 28 | 2 | 1 | 3 | 2.00 | 13 | 19 | -- |
| Variant 27 | 1 | 1 | 2 | 1.00 | 17 | 19 | -- |
| Misc. Haplos | 7 | 73 | 80 | | | | |
| TOTAL | 808 | 808 | 1616 | | | | 330.04 |
1 Misc.: Miscellaneous; Haplo: Haplotype; Var: Variant. * Chi-squared statistic of DIS and FC SNP haplotypes each ≥ 5 in frequency.
Table 8. Analysis of equalized DIS and FC HLA-DR,DQ haplotypes each ≥ 5 in frequency among overlapping families in both T1DGC studies 1.
| DR,DQ Haplo | DIS Haplo | FC Haplo | Total | DIS/FC | DIS Haplo Rank | FC Haplo Rank | χ2 * |
|---|---|---|---|---|---|---|---|
| DR4,DQ8 | 508 | 87 | 595 | 5.84 | 1 | 6 | 297.88 |
| DR3,DQ2 | 416 | 133 | 549 | 3.13 | 2 | 2 | 145.88 |
| DR0405,DQ2 | 7 | 4 | 11 | 1.75 | 12 | 16 | -- |
| DR8,DQ4 | 30 | 26 | 56 | 1.15 | 5 | 12 | 0.29 |
| DR13,DQ0604 | 21 | 30 | 51 | 0.70 | 7 | 10 | 1.59 |
| DR0901,DQ0303 | 11 | 19 | 30 | 0.58 | 10 | 14 | 2.13 |
| DR1,DQ0501 | 72 | 132 | 204 | 0.55 | 3 | 3 | 17.65 |
| DR16,DQ0502 | 13 | 25 | 38 | 0.52 | 8 | 13 | 3.79 |
| DR4,DQ7 | 27 | 68 | 95 | 0.40 | 6 | 8 | 17.69 |
| DR7,DQ2 | 31 | 118 | 149 | 0.26 | 4 | 5 | 50.80 |
| DR13,DQ0603 | 8 | 77 | 85 | 0.10 | 11 | 7 | 56.01 |
| DR11,DQ0301 | 12 | 132 | 144 | 0.09 | 9 | 3 | 100.00 |
| DR12,DQ0301 | 1 | 17 | 18 | 0.06 | 13 | 15 | -- |
| DR14,DQ0503 | 1 | 30 | 31 | 0.03 | 13 | 10 | -- |
| DR15,DQ0602 | 1 | 166 | 167 | 0.01 | 13 | 1 | -- |
| DR0701,DQ0303 | 0 | 49 | 49 | 0.00 | 16 | 9 | -- |
| Misc. Haplos | 12 | 58 | 70 | | | | |
| TOTAL | 1171 | 1171 | 2342 | | | | 693.71 |
1 Misc.: Miscellaneous; Haplo: Haplotype; Var: Variant. * Chi-squared statistic of DIS and FC HLA-DR,DQ haplotypes each ≥ 5 in frequency.