Minzhong Tang, James A Lautenberger, Xiaojiang Gao, Efe Sezgin, Sher L Hendrickson, Jennifer L Troyer, Victor A David, Li Guan, Carl E McIntosh, Xiuchan Guo, Yuming Zheng, Jian Liao, Hong Deng, Michael Malasky, Bailey Kessing, Cheryl A Winkler, Mary Carrington, Guy Dé The, Yi Zeng, Stephen J O'Brien.
Abstract
Nasopharyngeal carcinoma (NPC) is an epithelial malignancy facilitated by Epstein-Barr virus infection. Here we resolve the major genetic influences on NPC incidence using a genome-wide association study (GWAS), independent cohort replication, and high-resolution molecular HLA class I gene typing, including 4,055 study participants from the Guangxi Zhuang Autonomous Region and Guangdong province of southern China. We detect and replicate strong association signals involving SNPs, HLA alleles, and amino acid (aa) variants across the major histocompatibility complex: the HLA-A, HLA-B, and HLA-C class I genes (P(HLA-A-aa-site-62) = 7.4 × 10^-29; P(HLA-B-aa-site-116) = 6.5 × 10^-19; P(HLA-C-aa-site-156) = 6.8 × 10^-8, respectively). Over 250 NPC-HLA associated variants within HLA were analyzed in concert to resolve separate and largely independent HLA-A, -B, and -C gene influences. Multivariate logistic regression analysis collapsed significant associations in adjacent genes spanning 500 kb (OR2H1, GABBR1, HLA-F, and HCG9) as proxies for peptide binding motifs carried by HLA-A*11:01. A similar analysis resolved an independent association signal driven by the HLA-B*13:01, B*38:02, and B*55:02 alleles together. NPC resistance alleles carrying the strongly associated amino acid variants implicate specific class I peptide recognition motifs in the HLA-A and -B peptide binding groove as conferring strong genetic influence on the development of NPC in China.
Year: 2012 PMID: 23209447 PMCID: PMC3510037 DOI: 10.1371/journal.pgen.1003103
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. NPC associations of GWAS and TaqMan validation.
A.) Manhattan plot of GWAS association P values for 591,458 SNP genotypes versus chromosome coordinate position (N = 1043 study participants; Row I in Table S3). Association p-values (–log10 transformed) were calculated by logistic regression under an additive model and plotted by genomic position. Association p-values for HLA SNPs that were assessed by HLA sequence-based typing for the same 1043 individuals are indicated by open red triangles (see text). B.) NPC association signal for significant HLA alleles (left) and included SNP variants on chromosome 6. The –log10 p values, calculated with the logistic regression model, in GWAS and combined association tests are shown; SNPs are ordered according to their location in the chromosome 6 HLA-A region. Color coding indicates the LD value (r2) of each variant with the most significant SNP, rs417162. C.) Disequilibrium coefficient values for SNPs genotyped in the HLA region for NPC GWAS (N = 1043), generated with Haploview software.
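The –log10 transformation used on the plot axes can be illustrated with a minimal Python sketch. The P values are the GWAS-stage values listed for three SNPs in the table that follows; the function name `neg_log10` is a hypothetical helper introduced here, not part of the authors' pipeline.

```python
import math

def neg_log10(p):
    """Return -log10(p), the scale used on Manhattan-plot y-axes."""
    return -math.log10(p)

# GWAS-stage P values as listed in the SNP association table.
gwas_p = {
    "rs417162": 1.13e-07,
    "rs2517713": 3.03e-07,
    "rs9260734": 5.90e-07,
}

for snp, p in gwas_p.items():
    print(f"{snp}: -log10(P) = {neg_log10(p):.2f}")
```

On this scale a smaller P value plots higher, so the most significant SNP forms the peak of the chromosome 6 signal.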
GWAS and validation SNP association data in two independent NPC cohorts.
| SNP name | Gene | Chr. | MA | GWAS MAF | GWAS P | GWAS OR (95% CI) | Validation MAF | Validation P | Validation OR (95% CI) | Combined P | Combined OR (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| rs417162 | | 6 | C | 0.26/0.37 | 1.13E-07 | 0.58 (0.48–0.71) | 0.26/0.35 | 3.75E-05 | 0.63 (0.50–0.78) | 1.05E-11 | 0.63 (0.53–0.75) |
| rs2517713 | | 6 | G | 0.26/0.37 | 3.03E-07 | 0.57 (0.46–0.71) | 0.26/0.35 | 2.61E-05 | 0.62 (0.50–0.78) | 1.63E-11 | 0.60 (0.52–0.70) |
| rs9260734 | | 6 | A | 0.22/0.32 | 5.90E-07 | 0.57 (0.45–0.71) | 0.21/0.32 | 1.32E-05 | 0.59 (0.47–0.75) | 2.63E-11 | 0.59 (0.50–0.69) |
| rs5009448 | | 6 | T | 0.26/0.35 | 1.09E-05 | 0.63 (0.51–0.77) | 0.24/0.35 | 4.43E-07 | 0.56 (0.45–0.70) | 6.40E-11 | 0.61 (0.53–0.71) |
| rs2267633 | | 6 | G | 0.17/0.26 | 1.43E-06 | 0.58 (0.47–0.73) | 0.17/0.24 | 2.77E-04 | 0.63 (0.49–0.81) | 1.89E-09 | 0.61 (0.52–0.72) |
| rs29230 | | 6 | C | 0.17/0.25 | 2.37E-05 | 0.60 (0.47–0.76) | 0.17/0.24 | 6.02E-04 | 0.65 (0.50–0.83) | 9.48E-09 | 0.61 (0.52–0.72) |
Replication SNPs that are not included in the Affymetrix Genome-Wide SNP Array.
MA: minor allele.
MAF: minor allele frequency.
Gene frequencies (%) of the HLA-A, -B, and -C alleles detected and association analysis (N = 4,055).#
| Allele Name | Amino Acid Binding Motifs | Allele freq.: NPC Patients | Allele freq.: EP. Controls | Allele freq.: EN. Controls | NPC vs. Control: OR (95% CI) | NPC vs. Control: P Value | NPC vs. EP. Controls: OR (95% CI) | NPC vs. EP. Controls: P Value | NPC vs. EN. Controls: OR (95% CI) | NPC vs. EN. Controls: P Value |
|---|---|---|---|---|---|---|---|---|---|---|
| | .[LV]……[LI] | 15.80(444) | 10.87(280) | 15.01(409) | Ns | Ns | 1.52 (1.29–1.79) | 4.49E-07 | Ns | Ns |
| | .[VQ]……[VS] | 2.62(139) | 2.21(57) | 3.01(82) | 1.65 (1.28–2.13) | 1.15E-04 | 1.91 (1.38–2.65) | 1.00E-04 | Ns | Ns |
| | .[L-]……[VL] | 16.48(463) | 13.32(343) | 11.56(315) | 1.38 (1.21–1.57) | 1.52E-06 | Ns | Ns | 1.49 (1.28–1.74) | 3.98E-07 |
| | .[YT]……[K-] | 20.25(569) | 30.12(776) | 29.11(793) | 0.59 (0.53–0.66) | 1.72E-19 | 0.57 (0.50–0.65) | 1.43E-16 | 0.61 (0.53–0.69) | 3.13E-14 |
| | [DE]I……[R-] | 17.69(497) | 15.76(406) | 12.37(337) | 1.36 (1.19–1.54) | 3.14E-06 | Ns | Ns | 1.57 (1.35–1.84) | 1.01E-08 |
| | | | | | | | | | | |
| | .[–]……[–] | 8.04(226) | 12.27(316) | 10.87(296) | 0.66 (0.56–0.77) | 4.52E-07 | 0.62 (0.51–0.74) | 3.15E-07 | 0.71 (0.59–0.86) | 3.16E-04 |
| | R[R-]……[LF] | 0.53(15) | 0.97(25) | 1.69(46) | Ns | Ns | Ns | Ns | 0.31 (0.17–0.55) | 8.56E-05 |
| | .[–]……[–] | 12.21(343) | 6.83(176) | 8.48(231) | 1.67 (1.43–1.94) | 6.96E-11 | 1.88 (1.55–2.28) | 1.96E-10 | 1.51 (1.26–1.80) | 5.90E-06 |
| | .[MI]……[YF] | 19.18(539) | 17.97(463) | 14.46(394) | Ns | Ns | Ns | Ns | 1.39 (1.21–1.60) | 4.77E-06 |
| | .[P-]……[AV] | 1.00(28) | 3.38(75) | 3.67(100) | 0.27 (0.18–0.40) | 1.57E-10 | 0.28 (0.18–0.43) | 7.38E-09 | 0.26 (0.17–0.40) | 5.02E-10 |
| | .[TS]……[WF] | 16.65(468) | 15.49(399) | 12.22(333) | 1.47 (1.26–1.71) | 2.38E-04 | Ns | Ns | 1.43 (1.23–1.67) | 1.47E-06 |
| | | | | | | | | | | |
| | | 22.03(619) | 21.35(550) | 17.99(490) | Ns | Ns | Ns | Ns | 1.28 (1.12–1.46) | 2.16E-04 |
| | | 16.37(460) | 15.22(392) | 11.97(326) | 1.28 (1.12–1.46) | 2.05E-04 | Ns | Ns | 1.48 (1.26–1.73) | 1.30E-06 |
| | | 21.64(608) | 16.89(435) | 19.38(528) | 1.25 (1.11–1.40) | 1.84E-04 | 1.37 (1.19–1.58) | 1.65E-05 | Ns | Ns |
| | | 0.93(26) | 1.59(41) | 2.83(77) | 0.41 (0.27–0.63) | 4.28E-05 | Ns | Ns | 0.31 (0.20–0.49) | 4.14E-07 |
| | | 0.78(22) | 2.14(55) | 1.76(48) | 0.41 (0.25–0.65) | 1.47E-04 | 0.37 (0.22–0.62) | 1.37E-04 | Ns | Ns |
Amino acid motifs are from Lund et al., 2004.
EP. controls: NPC-free controls who are EBV IgA/VCA antibody positive.
EN. controls: NPC-free controls who are EBV IgA/VCA antibody negative.
Allele also showed a significant association in the comparison of EP. controls vs. EN. controls, p = 7.81E-06, OR = 0.69 (0.59–0.81).
A preliminary analysis of HLA association in phase I was previously reported (Tang et al., 2010). HLA associations for phase I and phase II (see Materials and Methods) are now presented separately in Table S12.
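As a rough cross-check on how the tabulated odds ratios relate to the allele counts, the following Python sketch computes an unadjusted allelic odds ratio with a Woolf (log-scale) 95% confidence interval for the row with binding motif .[YT]……[K-]. This is an illustrative simplification: the published values come from logistic regression, so the result is expected to be close to, but not identical to, the reported 0.59 (0.53–0.66). The helper names and the recovery of chromosome totals from percentage and count are assumptions made for this sketch.

```python
import math

def allele_total(freq_percent, count):
    """Recover the number of chromosomes from a frequency (%) and an allele count."""
    return round(count / (freq_percent / 100.0))

def odds_ratio_woolf(case_hits, case_total, ctrl_hits, ctrl_total, z=1.96):
    """Unadjusted allelic odds ratio with a Woolf (log-scale) confidence interval."""
    a, b = case_hits, case_total - case_hits
    c, d = ctrl_hits, ctrl_total - ctrl_hits
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Row with motif .[YT]......[K-]: 20.25(569), 30.12(776), 29.11(793).
npc_total = allele_total(20.25, 569)
ep_total = allele_total(30.12, 776)
en_total = allele_total(29.11, 793)

# NPC vs. all controls (EP. and EN. controls pooled).
odds_ratio, lower, upper = odds_ratio_woolf(
    569, npc_total, 776 + 793, ep_total + en_total
)
print(f"chromosomes: NPC={npc_total}, EP={ep_total}, EN={en_total}")
print(f"unadjusted OR = {odds_ratio:.2f} ({lower:.2f}-{upper:.2f})")
```

The recovered chromosome totals sum to 8,110, which is twice the 4,055 participants stated in the table title, so the percentage and count columns are mutually consistent for this row.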
Figure 2. NPC associations of HLA alleles and amino acid variants.
A.) NPC associations of alleles and amino acid variants at the HLA-A locus; B.) NPC associations of alleles and amino acid variants at the HLA-B locus; C.) NPC associations of alleles and amino acid variants at the HLA-C locus. Genetic associations of HLA alleles and amino acid sites were calculated (N = 4055 study participants; Line V in Table S3). For amino acid positions with more than two alleles, the p-value is from the omnibus test, which tests all amino acid alleles simultaneously (with >1 degrees of freedom) for association relative to controls.
Figure 3. Proxy variant analysis for the strongest aa-variants or HLA allele in HLA-A, in HLA-B, and in HLA-C.
Genetic association of each HLA amino acid variant was calculated for study participants in both phase I and phase II (N = 4055 study participants; Line V in Table S3). Multivariate conditional logistic regression analysis was performed to compute association p-values for amino acid variants (A, C, and E) or HLA alleles (B, D, and F). The HLA typing data set was analyzed with PLINK to examine the residual effect of the index amino acid variant or HLA allele while using each other amino acid variant or HLA allele as a covariate, with adjustment for age and gender. The index amino acid variant or HLA allele is marked in bold red font. The red line indicates the unadjusted –log10 p of the index variant. The three HLA class I gene regions are separated by the light blue block of the HLA-C gene between HLA-A and HLA-B. The X-axis shows the amino acid or HLA allele covariates, ordered by coordinate; HLA alleles are grouped together and ordered by allele name within each HLA class I gene locus. The Y-axis is the –log10 p of the index variant adjusted for the covariate. Independent covariates should leave the adjusted p-value of the index variant close to its strong unadjusted value, while LD proxies would weaken it appreciably (a larger p-value, hence a lower –log10 p), depending upon the strength of LD.
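The logic of the proxy analysis can be sketched on simulated data. The Python example below is a minimal illustration under stated assumptions, not the authors' PLINK workflow: genotypes are simulated, the effect size, allele frequency, and LD strength are arbitrary choices, age and gender adjustment is omitted, and the logistic regression is a small hand-written Newton-Raphson fit. All function and variable names are hypothetical.

```python
import math
import random

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(m)
    a = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        pv = a[col][col]
        a[col] = [v / pv for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[n:] for row in a]

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson logistic regression; returns coefficients and standard errors."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, xi))))
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    info[j][k] += mu * (1.0 - mu) * xi[j] * xi[k]
        cov = invert(info)
        beta = [b + sum(cov[j][k] * grad[k] for k in range(p)) for j, b in enumerate(beta)]
    return beta, [math.sqrt(cov[j][j]) for j in range(p)]

def index_p_value(index, y, covariate=None):
    """Wald p-value of the index variant, optionally adjusted for one covariate."""
    if covariate is None:
        X = [[1.0, float(g)] for g in index]
    else:
        X = [[1.0, float(g), float(c)] for g, c in zip(index, covariate)]
    beta, se = fit_logistic(X, y)
    return math.erfc(abs(beta[1] / se[1]) / math.sqrt(2.0))

# Simulated data (assumption): a protective index variant, a proxy in strong LD
# with it, and an unlinked variant with the same allele frequency.
rng = random.Random(0)
index, proxy, unlinked, y = [], [], [], []
for _ in range(2000):
    haps = [1 if rng.random() < 0.3 else 0 for _ in range(2)]
    linked = [h if rng.random() < 0.95 else 1 - h for h in haps]
    g = sum(haps)
    index.append(g)
    proxy.append(sum(linked))
    unlinked.append(sum(1 if rng.random() < 0.3 else 0 for _ in range(2)))
    risk = 1.0 / (1.0 + math.exp(0.5 * g))  # log-odds of disease = -0.5 * g
    y.append(1 if rng.random() < risk else 0)

p_unadjusted = index_p_value(index, y)
p_given_proxy = index_p_value(index, y, proxy)
p_given_unlinked = index_p_value(index, y, unlinked)
for label, p in [("unadjusted", p_unadjusted),
                 ("adjusted for LD proxy", p_given_proxy),
                 ("adjusted for unlinked variant", p_given_unlinked)]:
    print(f"{label}: -log10(p) = {-math.log10(p):.2f}")
```

In this setup, conditioning on the unlinked variant should leave the index signal essentially unchanged, whereas conditioning on the LD proxy should pull the index –log10 p well below the unadjusted red-line value, which is the pattern the figure uses to separate independent signals from proxies.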