Nikolas Pontikos, Deborah J Smyth, Helen Schuilenburg, Joanna M M Howson, Neil M Walker, Oliver S Burren, Hui Guo, Suna Onengut-Gumuscu, Wei-Min Chen, Patrick Concannon, Stephen S Rich, Jyothi Jayaraman, Wei Jiang, James A Traherne, John Trowsdale, John A Todd, Chris Wallace.
Abstract
BACKGROUND: Killer Immunoglobulin-like Receptors (KIRs) are surface receptors of natural killer cells that bind to their corresponding Human Leukocyte Antigen (HLA) class I ligands, making them interesting candidate genes for HLA-associated autoimmune diseases, including type 1 diabetes (T1D). However, allelic and copy number variation in the KIR region effectively mask it from standard genome-wide association studies: single nucleotide polymorphism (SNP) probes targeting the region are often discarded by standard genotype callers because they exhibit variable cluster numbers. Quantitative Polymerase Chain Reaction (qPCR) assays address this issue. However, their cost is prohibitive at the sample sizes required for detecting effects typically observed in complex genetic diseases.
Year: 2014 PMID: 24720548 PMCID: PMC4029094 DOI: 10.1186/1471-2164-15-274
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Bivariate copy number calling of KIR3DS1 and KIR3DL1 from qPCR ΔCt. On the left, the median normalised ΔCt values for KIR3DS1 and KIR3DL1 are shown with the results of clustering into the eight copy number groups, coloured according to the group with the highest posterior probability. The three most common KIR3DS1-KIR3DL1 copy number groups are the ones with a total copy number of two: 0-2 (dark green), 1-1 (pink) and 2-0 (dark blue). The ellipses delimit the 95th percentile. On the right, the counts of the most probable copy number groups are shown for cases and controls.
Figure 2. Overlay of ImmunoChip and qPCR samples for R and θ at SNP rs592645. Samples are coloured by the most likely KIR3DS1-KIR3DL1 copy number group according to the qPCR analysis (see Figure 1). R is representative of the total copy number, whereas θ relates to the ratio of copies of KIR3DL1 to KIR3DS1. The first and second rows split the samples by the availability of qPCR data, and the third row is the overlay of the samples from the first and second rows. The first and second columns split the samples by case-control status, and the third column is the overlay of the samples from the first and second columns.
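The caption does not define R and θ. A minimal sketch, assuming the conventional polar transform of the two normalised allele intensities used for SNP arrays (R = X + Y, θ = (2/π)·arctan(Y/X)); the function name `polar_signal` and the inputs `x` and `y` are hypothetical, and which allele tracks which gene at rs592645 is not stated in the text.

```python
import math

def polar_signal(x: float, y: float) -> tuple[float, float]:
    """Convert two normalised allele intensities into (R, theta).

    Assumption: R is the summed intensity and theta is the angle between
    the two intensities, rescaled so that it lies between 0 and 1.
    """
    r = x + y                                   # total signal, tracks total copy number
    theta = (2.0 / math.pi) * math.atan2(y, x)  # 0 = only x signal, 1 = only y signal
    return r, theta

# Equal signal from both alleles gives theta = 0.5 (up to rounding).
print(polar_signal(1.0, 1.0))
# Signal from one allele only gives theta = 0.
print(polar_signal(2.0, 0.0))
```

Under this assumption, samples with the same total copy number share a similar R, and θ separates them by how the copies are split between the two alleles.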
Figure 3. Leave-one-out cross-validation error rate for k-nearest neighbour prediction. Leave-one-out cross-validation (LOOCV) error rates obtained from k-nearest neighbours (knn) prediction of KIR3DL1/3DS1 copy numbers from the R and θ signals of SNP rs592645. Each point shows the proportion of samples for which the knn predicted copy number did not match the qPCR call, averaged over ten multiply imputed qPCR call datasets (using the posterior probabilities from Figure 1). Error bars show the minimum and maximum error rates over the ten multiply imputed datasets. Knn was run separately on cases only, on controls only and on all samples together. The minimum error rate is achieved for k=8 when the prediction uses both cases and controls.
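The leave-one-out procedure in this caption can be sketched as follows. This is an illustration on synthetic data, not the authors' implementation: the function names, the cluster positions and the unscaled Euclidean distance on (R, θ) are all assumptions.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training samples closest to query in (R, theta)."""
    nearest = sorted(
        train,
        key=lambda s: (s[0] - query[0]) ** 2 + (s[1] - query[1]) ** 2,
    )[:k]
    votes = Counter(label for _, _, label in nearest)
    return votes.most_common(1)[0][0]

def loocv_error(samples, k):
    """Proportion of samples misclassified when each is held out in turn."""
    errors = 0
    for i, (r, theta, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every sample except the i-th
        if knn_predict(train, (r, theta), k) != label:
            errors += 1
    return errors / len(samples)

# Synthetic, hypothetical data: three tight clusters with the same R and
# different theta, labelled with copy number groups for illustration only.
centres = {"0-2": 0.1, "1-1": 0.5, "2-0": 0.9}
samples = [
    (2.0 + dr, centre + dt, label)
    for label, centre in centres.items()
    for dr in (-0.1, 0.0, 0.1)
    for dt in (-0.02, 0.0, 0.02)
]

for k in (1, 3, 8):
    print(k, loocv_error(samples, k))
```

Repeating `loocv_error` over a range of k, and over each multiply imputed set of labels, gives the kind of curve the figure describes.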
Figure 4. Error rate of k-nearest neighbour prediction from R and θ of rs592645 in a random subset of samples. Each panel shows the LOOCV error rates of KIR3DL1/3DS1 copy number prediction from R and θ of rs592645 in the remaining unlabelled samples when using a different size subset of the training data. The percentage of the complete training dataset and the size of the subset are given in the title of each panel. Each point represents the LOOCV error rate averaged over ten multiply imputed qPCR call datasets (using the posterior probabilities from Figure 1). Smoothing lines show the average over 25 independent random subsets of training data. The black dashed line represents the observed error rate in the complete sample. As the size of the training dataset increases, the error rate becomes less sensitive to the choice of the parameter k. Only 295 samples are required to achieve LOOCV error rates <5%, and 590 for error rates <2.5%.
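Both this caption and the previous one rely on "multiply imputed" call datasets drawn from the posterior probabilities of Figure 1. A minimal sketch of what such a draw could look like, assuming each sample carries a posterior over copy number groups; the function names and the three example posteriors are hypothetical.

```python
import random

def most_likely_call(posterior):
    """Copy number group with the highest posterior probability."""
    return max(posterior, key=posterior.get)

def impute_calls(posteriors, m=10, seed=0):
    """Draw m imputed call datasets, sampling each call from its posterior."""
    rng = random.Random(seed)
    datasets = []
    for _ in range(m):
        calls = [
            rng.choices(list(p), weights=list(p.values()), k=1)[0]
            for p in posteriors
        ]
        datasets.append(calls)
    return datasets

# Hypothetical posteriors for three samples (values are made up).
posteriors = [
    {"0-2": 0.98, "1-1": 0.02, "2-0": 0.00},
    {"0-2": 0.40, "1-1": 0.60, "2-0": 0.00},
    {"0-2": 0.00, "1-1": 0.00, "2-0": 1.00},
]
imputed = impute_calls(posteriors, m=10)
print([most_likely_call(p) for p in posteriors])
print(imputed[0])
```

Samples with a confident posterior receive the same call in every imputed dataset, while ambiguous samples vary between datasets, which is what lets the downstream error rates and association tests reflect the calling uncertainty.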
Association with T1D tested in the joint KIR3DS1-KIR3DL1 copy number group (a), and in the marginal KIR3DL1 (b) and KIR3DS1 (c) copy number groups
(a)

| qPCR case:control | qPCR total | qPCR OR | qPCR 95% CI | qPCR p-value | SNP case:control | SNP total | SNP OR | SNP 95% CI | SNP p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 444:446 | 890 | 1.00 | | | 4094:3222 | 7316 | 1 | | |
| 229:207 | 436 | 1.11 | 0.88-1.40 | 0.3673 | 2050:1628 | 3678 | 0.99 | 0.92-1.07 | 0.8349 |
| 26:28 | 54 | 0.92 | 0.52-1.61 | 0.7713 | 229:225 | 454 | 0.79 | 0.65-0.96 | 0.0193 |
| 15:16 | 31 | 0.94 | 0.46-1.93 | 0.8695 | 121:101 | 222 | 0.92 | 0.7-1.2 | 0.5246 |
| 13:14 | 27 | 0.93 | 0.43-2.01 | 0.8587 | 98:74 | 172 | 1.04 | 0.77-1.42 | 0.7822 |
| 13:11 | 24 | 1.19 | 0.53-2.68 | 0.6794 | 116:77 | 193 | 1.19 | 0.89-1.59 | 0.2535 |
| 4:3 | 7 | 1.34 | 0.30-6.02 | 0.7031 | 25:21 | 46 | 0.94 | 0.52-1.68 | 0.8255 |
| 3:2 | 5 | 1.52 | 0.27-8.62 | 0.6369 | 11:14 | 25 | 0.74 | 0.3-1.82 | 0.518 |
| 747:727 | 1474 | | | 0.9842 | 6744:5362 | 12106 | | | 0.3552 |

(b)

| qPCR case:control | qPCR total | qPCR OR | qPCR 95% CI | qPCR p-value | SNP case:control | SNP total | SNP OR | SNP 95% CI | SNP p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 457:460 | 917 | 1.00 | | | 4192:3296 | 7488 | 1 | | |
| 257:234 | 491 | 1.11 | 0.89-1.38 | 0.3702 | 2287:1806 | 4093 | 0.99 | 0.92-1.07 | 0.8883 |
| 33:33 | 66 | 1.01 | 0.61-1.66 | 0.9795 | 265:260 | 525 | 0.8 | 0.67-0.96 | 0.0151 |
| 747:727 | 1474 | | | 0.6651 | 6744:5362 | 12106 | | | 0.0506 |

(c)

| qPCR case:control | qPCR total | qPCR OR | qPCR 95% CI | qPCR p-value | SNP case:control | SNP total | SNP OR | SNP 95% CI | SNP p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 457:457 | 914 | 1.00 | | | 4210:3299 | 7509 | 1 | | |
| 246:224 | 470 | 1.10 | 0.88-1.37 | 0.4096 | 2173:1723 | 3896 | 0.99 | 0.91-1.07 | 0.7785 |
| 41:44 | 85 | 0.94 | 0.60-1.47 | 0.7787 | 350:326 | 676 | 0.83 | 0.71-0.97 | 0.0212 |
| 3:2 | 5 | 1.24 | 0.21-7.28 | 0.8084 | 11:14 | 25 | 0.74 | 0.3-1.82 | 0.5119 |
| 747:727 | 1474 | | | 0.8044 | 6744:5362 | 12106 | | | 0.1494 |
No evidence of a significant effect, joint or marginal, was detected in either the qPCR dataset (747 cases and 727 controls) or the SNP dataset (6744 cases and 5362 controls). Case-control counts shown are derived from the most likely copy number assignment across the ten multiply imputed qPCR and SNP datasets. Statistical inference for association is derived from the multiply imputed datasets using the R mitools package [13]. The last row of each table contains the pooled p-value for each association test, obtained using the R mice package [14].
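For orientation, an odds ratio and its confidence interval can be computed directly from one row of case:control counts against the reference row. The sketch below uses the standard Wald interval on the log odds ratio; this is an assumption for illustration, since the published values come from the multiply imputed analysis rather than from the most likely counts. The function name `odds_ratio_ci` is hypothetical.

```python
import math
from statistics import NormalDist

def odds_ratio_ci(case_g, ctrl_g, case_ref, ctrl_ref, level=0.95):
    """Odds ratio of a group versus the reference group, with a Wald interval."""
    odds_ratio = (case_g * ctrl_ref) / (ctrl_g * case_ref)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / case_g + 1 / ctrl_g + 1 / case_ref + 1 / ctrl_ref)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# qPCR counts from the table: 229 cases and 207 controls in the group,
# 444 cases and 446 controls in the reference group.
or_, lo, hi = odds_ratio_ci(229, 207, 444, 446)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

For these counts the result rounds to the 1.11 (0.88-1.40) reported in the corresponding qPCR row, but agreement is not guaranteed for every row because the reported inference pools over the imputed datasets.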
Association with T1D conditional on the presence of the respective HLA-Bw4 epitope, tested in the joint KIR3DS1-KIR3DL1 copy number group (a), and in the marginal KIR3DL1 (b) and KIR3DS1 (c) copy number groups
(a)

| qPCR case:control | qPCR total | qPCR OR | qPCR 95% CI | qPCR p-value | SNP case:control | SNP total | SNP OR | SNP 95% CI | SNP p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 259:286 | 545 | 1.00 | | | 1027:1157 | 2184 | 1 | | |
| 123:128 | 251 | 1.06 | 0.79-1.43 | 0.6976 | 555:582 | 1137 | 1.08 | 0.93-1.24 | 0.3119 |
| 16:15 | 31 | 1.22 | 0.58-2.57 | 0.5985 | 59:88 | 147 | 0.76 | 0.54-1.07 | 0.1133 |
| 7:13 | 20 | 0.59 | 0.23-1.51 | 0.2754 | 34:40 | 74 | 0.93 | 0.58-1.48 | 0.7529 |
| 8:8 | 16 | 1.10 | 0.41-2.98 | 0.8450 | 24:33 | 57 | 0.85 | 0.5-1.45 | 0.5502 |
| 10:7 | 17 | 1.58 | 0.59-4.20 | 0.3621 | 36:24 | 60 | 1.69 | 1-2.85 | 0.0491 |
| 2:1 | 3 | 2.21 | 0.20-24.50 | 0.5187 | 7:4 | 11 | 1.97 | 0.58-6.76 | 0.2793 |
| 3:0 | 3 | | | | 5:0 | 5 | | | |
| 428:458 | 886 | | | 0.8978 | 1747:1928 | 3675 | | | 0.2173 |

(b)

| qPCR case:control | qPCR total | qPCR OR | qPCR 95% CI | qPCR p-value | SNP case:control | SNP total | SNP OR | SNP 95% CI | SNP p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 267:294 | 561 | 1.00 | | | 1051:1190 | 2241 | 1 | | |
| 140:148 | 288 | 1.04 | 0.78-1.38 | 0.7787 | 625:646 | 1271 | 1.09 | 0.95-1.26 | 0.1975 |
| 21:16 | 37 | 1.45 | 0.74-2.83 | 0.2822 | 71:92 | 163 | 0.88 | 0.64-1.21 | 0.4181 |
| 428:458 | 886 | | | 0.5563 | 1747:1928 | 3675 | | | 0.2586 |

(c)

| qPCR case:control | qPCR total | qPCR OR | qPCR 95% CI | qPCR p-value | SNP case:control | SNP total | SNP OR | SNP 95% CI | SNP p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 159:187 | 346 | 1.00 | | | 649:733 | 1382 | 1 | | |
| 93:83 | 176 | 1.32 | 0.92-1.90 | 0.1370 | 383:366 | 749 | 1.18 | 0.99-1.41 | 0.0628 |
| 12:14 | 26 | 1.01 | 0.45-2.24 | 0.9842 | 61:75 | 136 | 0.91 | 0.64-1.3 | 0.607 |
| 2:0 | 2 | | | | 3:0 | 3 | | | |
| 266:284 | 550 | | | 0.5209 | 1096:1174 | 2270 | | | 0.2416 |
Association is tested in the subset of individuals carrying an HLA-Bw4 epitope for the joint KIR3DS1-KIR3DL1 (a) and marginal KIR3DL1 (b) copy number groups, and in the subset of individuals carrying the HLA-Bw4-80I epitope for the marginal KIR3DS1 (c) copy number group. Case-control counts shown are derived from the most likely copy number assignment across the ten multiply imputed qPCR and SNP datasets. Statistical inference for association is derived from the multiply imputed datasets using the R mitools package [13]. The last row of each table contains the pooled p-value for each association test, obtained using the R mice package [14].
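The text names the R packages used for pooling but not the procedure. A common way to combine one estimate per imputed dataset is Rubin's rules; the sketch below applies them to a log odds ratio, on the assumption that this is what the pooling amounts to here. The function name `pool_rubin` and the input numbers are hypothetical, and the interval uses a normal approximation rather than the t reference distribution that dedicated packages typically apply.

```python
import math
from statistics import NormalDist, mean, variance

def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their variances with Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total

# Hypothetical log odds ratios and squared standard errors from five
# imputed datasets (made-up numbers, for illustration only).
log_ors = [0.10, 0.12, 0.09, 0.11, 0.13]
ses_sq = [0.0137, 0.0139, 0.0136, 0.0138, 0.0140]

q, t = pool_rubin(log_ors, ses_sq)
z = NormalDist().inv_cdf(0.975)
print(f"pooled OR = {math.exp(q):.2f}")
print(f"approx. 95% CI {math.exp(q - z * math.sqrt(t)):.2f}-{math.exp(q + z * math.sqrt(t)):.2f}")
```

The between-imputation term widens the interval when the imputed datasets disagree, so uncertain copy number calls translate into less certain association estimates.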
Case-only test for interaction between KIR3DS1-KIR3DL1 and HLA-Bw4, across the ten multiply imputed qPCR and SNP datasets
(a)

| qPCR | | SNP | |
| --- | --- | --- | --- |
| 183 | 269 | 739 | 1063 |
| 12 | 21 | 40 | 71 |
| 113 | 138 | 396 | 613 |
| p-value = 0.4094 | | p-value = 0.4235 | |

(b)

| qPCR | | SNP | |
| --- | --- | --- | --- |
| 12 | 21 | 40 | 71 |
| 296 | 407 | 1135 | 1676 |
| p-value = 0.5144 | | p-value = 0.3609 | |

(c)

| qPCR | | SNP | |
| --- | --- | --- | --- |
| 293 | 159 | 1153 | 649 |
| 159 | 107 | 673 | 447 |
| p-value = 0.4922 | | p-value = 0.0353 | |
Counts in each contingency table are derived from the most likely copy number assignment across the multiply imputed datasets. To reduce the degrees of freedom and improve power, we summarise copy numbers greater than or equal to one as presence (+) and zero as absence (-). The pooled p-value of each χ² test, across the multiply imputed datasets, is given in the last row of each contingency table. We find no significant association with HLA-Bw4, within cases, in either the joint (a) or the marginal (b) (c) KIR3DS1-KIR3DL1 distributions.
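As an illustration of the test behind each contingency table, the sketch below computes Pearson's χ² statistic for independence on a single table of counts. It is a simplification: the reported p-values are pooled across imputed datasets, so a test on the most likely counts need not reproduce them. The function name is hypothetical, no continuity correction is applied, and the closed-form p-value is only implemented for one or two degrees of freedom.

```python
import math

def chi2_independence(table):
    """Pearson chi-square test of independence for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2))  # chi-square upper tail, 1 df
    elif df == 2:
        p_value = math.exp(-stat / 2)             # chi-square upper tail, 2 df
    else:
        raise ValueError("closed-form p-value only implemented for df of 1 or 2")
    return stat, df, p_value

# SNP counts from the last contingency table: 1153 and 649 in the first row,
# 673 and 447 in the second.
stat, df, p_value = chi2_independence([[1153, 649], [673, 447]])
print(f"chi2 = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

Collapsing copy numbers to presence or absence, as the footnote describes, is what keeps the tables at one or two degrees of freedom.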