Ning-Yuan Lee, Melissa Hum, Pei-Yi Ong, Matthew Khine Myint, Enya H W Ong, Kar-Perng Low, Zheng Li, Boon-Cher Goh, Joshua K Tay, Kwok-Seng Loh, Melvin L K Chua, Soo-Chin Lee, Chiea-Chuen Khor, Ann S G Lee.
Abstract
The current understanding of genetic susceptibility factors for nasopharyngeal carcinoma (NPC) is still incomplete. To identify novel germline variants associated with NPC predisposition, we analysed whole-exome sequencing data from 119 NPC patients from Singapore with a family history of NPC and/or with early-onset NPC, together with 1337 Singaporean participants without NPC. Variants were prioritised by retaining those with minor allele frequencies of <1% in both the local control (n = 1337) and gnomAD non-cancer (EAS) (n = 9626) cohorts and a high pathogenicity prediction (CADD score > 20). Using single-variant testing, we identified 17 rare pathogenic variants in 17 genes that were associated with NPC. Consistent evidence of enrichment in NPC patients was observed for five of these variants (in JAK2, PRDM16, LRP1B, NIN, and NKX2-1) in an independent case-control comparison of 156 NPC patients and 9770 unaffected individuals. In a family with five siblings, a FANCE variant (p.P445S) was detected in two affected members, but not in three unaffected members. Gene-based burden testing recapitulated the association of NKX2-1 and FANCE with NPC risk. Pathway analysis showed that endocytosis and immune-modulating pathways were enriched for mutation burden. This study has identified NPC-predisposing variants and genes, which could provide new insights into the genetic predisposition to NPC.
Keywords: exome sequencing; genetic predisposition; germline variants; nasopharyngeal carcinoma
Year: 2022 PMID: 35954343 PMCID: PMC9367457 DOI: 10.3390/cancers14153680
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.575
Figure 1. Study design and steps taken for the selection of candidate NPC susceptibility variants. For each variant filtering step, the number of variants remaining after filtering is given within the curly brackets.
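The two filtering criteria stated in the abstract (minor allele frequency below 1% in both the local control and gnomAD EAS cohorts, and a CADD score above 20) can be illustrated with a minimal sketch. The `Variant` record, its field names, and the four example variants are hypothetical and are not taken from the study; only the two thresholds come from the text.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.01    # <1% minor allele frequency, as stated in the abstract
CADD_CUTOFF = 20.0   # CADD score > 20, as stated in the abstract


@dataclass
class Variant:
    """Hypothetical record; field names are illustrative only."""
    name: str
    maf_local: float       # MAF in the local control cohort
    maf_gnomad_eas: float  # MAF in the gnomAD non-cancer (EAS) cohort
    cadd: float            # CADD pathogenicity score


def passes_filters(v: Variant) -> bool:
    # A variant is kept only if it is rare in BOTH reference cohorts
    # and has a high predicted pathogenicity.
    rare_in_both = v.maf_local < MAF_CUTOFF and v.maf_gnomad_eas < MAF_CUTOFF
    return rare_in_both and v.cadd > CADD_CUTOFF


# Made-up examples, one per filtering outcome.
variants = [
    Variant("var_rare_damaging", 0.0, 0.00005, 27.1),
    Variant("var_common", 0.05, 0.04, 25.0),
    Variant("var_rare_benign", 0.001, 0.002, 8.3),
    Variant("var_rare_in_one_cohort", 0.002, 0.03, 31.0),
]

kept = [v.name for v in variants if passes_filters(v)]
print(kept)
```

Requiring rarity in both cohorts means a variant that is rare locally but common in gnomAD (EAS), or the reverse, is discarded.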
Allele frequencies and case-control association analysis of 17 variants in 17 known or candidate cancer genes identified from the discovery cohort.
| Gene | Variant | Allele Frequency in Discovery Cohort | Minor Allele Frequency, Local Control Cohort | Minor Allele Frequency, gnomAD (EAS) | Odds Ratio vs. Local Control Cohort | p-value vs. Local Control Cohort | FDR-Adjusted p-value vs. Local Control Cohort | Odds Ratio vs. gnomAD (EAS) | p-value vs. gnomAD (EAS) | FDR-Adjusted p-value vs. gnomAD (EAS) |
|---|---|---|---|---|---|---|---|---|---|---|
|  | NM_004972.3:c.1174G>A | 3.782% | 0.748% | 0.789% | 5.2 (2.1–12.1) | 0.000327 | 0.0268 | 4.9 (2.2–9.8) | 0.000165 | 0.00387 |
|  | NM_000384.3:c.6698T>C | 0.840% | 0.000% | 0.000% | Inf (2.1–Inf) | 0.00665 | 0.0606 | Inf (15.2–Inf) | 0.000149 | 0.00387 |
|  | NM_001008781.2:c.8983C>A | 0.840% | 0.000% | 0.000% | Inf (2.1–Inf) | 0.00665 | 0.0606 | Inf (14.9–Inf) | 0.000155 | 0.00387 |
|  | NM_022114.4:c.2062G>A | 1.261% | 0.037% | 0.042% | 34.0 (2.7–1770.4) | 0.00203 | 0.0606 | 30.0 (5.1–126.5) | 0.000294 | 0.00536 |
|  | NM_004329.2:c.1348G>A | 1.261% | 0.075% | 0.042% | 17.0 (1.9–204.3) | 0.00476 | 0.0606 | 30.7 (5.2–128.4) | 0.000276 | 0.00536 |
|  | NM_021798.4:c.305C>T | 0.840% | 0.000% | 0.005% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 162.3 (8.4–8835.8) | 0.000442 | 0.00725 |
|  | NM_018177.5:c.2582A>T | 0.840% | 0.000% | 0.010% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 81.3 (5.9–1093.6) | 0.000877 | 0.0131 |
|  | NM_001110556.2:c.2876G>A | 1.681% | 0.112% | 0.164% | 15.2 (2.6–104.4) | 0.00125 | 0.0606 | 10.4 (2.6–30.7) | 0.000989 | 0.0135 |
|  | NM_170606.2:c.9530G>A | 0.840% | 0.000% | 0.016% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 54.2 (4.5–482.3) | 0.00145 | 0.0171 |
|  | NM_001323674.1:c.2321C>T | 0.840% | 0.000% | 0.016% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 54.1 (4.5–481.3) | 0.00146 | 0.0171 |
|  | NM_001291285.1:c.10310C>A | 0.840% | 0.000% | 0.031% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 27.1 (2.7–152.4) | 0.00397 | 0.0367 |
|  | NM_021922.2:c.1333C>T | 0.840% | 0.000% | 0.042% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 20.4 (2.1–102.6) | 0.00626 | 0.0461 |
|  | NM_005373.2:c.1300G>A | 0.840% | 0.000% | 0.047% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 18.1 (1.9–88.2) | 0.00760 | 0.0477 |
|  | NM_018557.2:c.5837A>T | 0.840% | 0.000% | 0.047% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 18.1 (1.9–88.0) | 0.00762 | 0.0477 |
|  | NM_020921.3:c.2867G>A | 0.840% | 0.000% | 0.047% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 18.1 (1.9–88.2) | 0.00759 | 0.0477 |
|  | NM_003317.4:c.251G>A | 0.840% | 0.000% | 0.054% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 15.7 (1.7–74.3) | 0.0097 | 0.0497 |
|  | NM_001243078.1:c.298C>T | 0.840% | 0.000% | 0.088% | Inf (2.1–Inf) | 0.00665 | 0.0606 | 9.6 (1.1–40.7) | 0.0222 | 0.0865 |
Allele frequencies and case-control association analysis of 17 selected variants in the validation cohort.
| Gene | Variant | RefSNP | Allele Frequency in Validation Cohort (n = 156) | Allele Frequency in SG10K_Health (n = 9770) | Odds Ratio | p-value |
|---|---|---|---|---|---|---|
|  | NM_004972.3:c.1174G>A | rs200018153 | 5/312 (1.603%) | 180/19178 (0.939%) | 1.7 (0.5–4.1) | 0.226 |
|  | NM_000384.3:c.6698T>C | rs1176839033 | 0/312 (0.000%) | N/A | N/A | N/A |
|  | NM_001008781.2:c.8983C>A | rs62622785 | 0/312 (0.000%) | 109/19068 (0.572%) | 0.0 (0.0–2.1) | 0.426 |
|  | NM_022114.4:c.2062G>A | rs367580261 | 1/312 (0.321%) | 40/18760 (0.213%) | 1.5 (0.0–8.9) | 0.492 |
|  | NM_004329.2:c.1348G>A | rs55932635 | 0/312 (0.000%) | 5/19048 (0.026%) | 0.0 (0.0–66.8) | 1.000 |
|  | NM_021798.4:c.305C>T | rs1180106880 | 0/312 (0.000%) | 16/18962 (0.084%) | 0.0 (0.0–15.8) | 1.000 |
|  | NM_018177.5:c.2582A>T | rs61748748 | 0/312 (0.000%) | 173/19110 (0.905%) | 0.0 (0.0–1.3) | 0.121 |
|  | NM_001110556.2:c.2876G>A | rs782104597 | 0/312 (0.000%) | 14/13534 (0.103%) | 0.0 (0.0–13.1) | 1.000 |
|  | NM_170606.2:c.9530G>A | rs535118581 | 0/312 (0.000%) | 5/19106 (0.026%) | 0.0 (0.0–67.0) | 1.000 |
|  | NM_001323674.1:c.2321C>T | rs759615272 | 0/312 (0.000%) | N/A | N/A | N/A |
|  | NM_001291285.1:c.10310C>A | rs761789494 | 0/312 (0.000%) | 5/19210 (0.026%) | 0.0 (0.0–67.4) | 1.000 |
|  | NM_021922.2:c.1333C>T | rs141551053 | 0/312 (0.000%) | 16/18978 (0.084%) | 0.0 (0.0–15.8) | 1.000 |
|  | NM_005373.2:c.1300G>A | rs754296556 | 0/312 (0.000%) | 5/19024 (0.026%) | 0.0 (0.0–66.8) | 1.000 |
|  | NM_018557.2:c.5837A>T | rs776616686 | 1/312 (0.321%) | 27/19236 (0.140%) | 2.3 (0.1–14.0) | 0.363 |
|  | NM_020921.3:c.2867G>A | rs199887033 | 2/312 (0.641%) | 60/19098 (0.314%) | 2.0 (0.2–7.8) | 0.263 |
|  | NM_003317.4:c.251G>A | rs757703309 | 1/312 (0.321%) | 5/18862 (0.027%) | 12.1 (0.3–109.0) | 0.0938 |
|  | NM_001243078.1:c.298C>T | rs200936737 | 0/312 (0.000%) | 11/18986 (0.058%) | 0.0 (0.0–24.3) | 1.000 |
N/A: These variants were not reported in the SG10K_Health cohort, so their allele frequencies were unavailable and no statistical tests were performed.
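The validation table reports allele counts, so a case-control p-value can be recomputed from a 2×2 table of variant and reference alleles. The sketch below assumes a two-sided Fisher's exact test on allele counts; the source does not state which test produced this column, so this is an illustration rather than the authors' procedure. The function name is hypothetical. The example uses the rs200018153 row (5 of 312 alleles in cases, 180 of 19178 in SG10K_Health), for which the table lists 0.226.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_prob(x: int) -> float:
        # Probability that x of the col1 variant alleles fall in row 1.
        return (log_comb(row1, x) + log_comb(row2, col1 - x)
                - log_comb(total, col1))

    lowest, highest = max(0, col1 - row2), min(col1, row1)
    observed = log_prob(a)
    tolerance = 1e-7  # guards against ties broken by rounding error
    p = sum(exp(lp)
            for lp in (log_prob(x) for x in range(lowest, highest + 1))
            if lp <= observed + tolerance)
    return min(1.0, p)


# rs200018153: variant vs. reference alleles in cases and in controls.
case_alt, case_total = 5, 312
ctrl_alt, ctrl_total = 180, 19178
p_value = fisher_exact_two_sided(case_alt, case_total - case_alt,
                                 ctrl_alt, ctrl_total - ctrl_alt)
print(f"two-sided p = {p_value:.3f}")
```

Working in log space with `lgamma` avoids overflow when the control cohort contributes tens of thousands of alleles.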
Gene-based burden testing of 17 prioritised candidate genes.
| Gene | Protein Function a | Cases with Rare Pathogenic Variants (n = 119) | Controls with Rare Pathogenic Variants (n = 1337) | p-value | FDR-Adjusted p-value | β Estimate | β Standard Error |
|---|---|---|---|---|---|---|---|
|  | Cytokine and growth factor signalling | 9/119 (7.56%) | 32/1337 (2.39%) | 0.0209 | 0.998 | 0.07955 | 0.03441 |
|  | LDL receptor ligand | 8/119 (6.72%) | 46/1337 (3.44%) | 0.134 | 0.998 | 0.04427 | 0.02949 |
|  | Cell-cell adhesion (predicted) | 11/119 (9.24%) | 101/1337 (7.55%) | 0.697 | 0.998 | 0.00824 | 0.02113 |
|  | Transcription factor; involved in some MDS/AML translocations | 5/119 (4.20%) | 42/1337 (3.14%) | 0.728 | 0.998 | 0.01111 | 0.03198 |
|  | Type I transmembrane serine/threonine receptor | 5/119 (4.20%) | 7/1337 (0.52%) | 0.00299 | 0.468 | 0.18300 | 0.06152 |
|  | Cell growth; proliferation and differentiation of T cells, B cells, and NK cells | 2/119 (1.68%) | 2/1337 (0.15%) | 0.00978 | 0.674 | 0.27530 | 0.10640 |
|  | Ubiquitylation substrate (predicted); transcription-coupled DNA repair (predicted); recombination (predicted) | 2/119 (1.68%) | 33/1337 (2.47%) | 0.649 | 0.998 | −0.01689 | 0.03709 |
|  | Cytoskeleton structure and remodelling | 3/119 (2.52%) | 41/1337 (3.07%) | 0.831 | 0.998 | −0.00699 | 0.03269 |
|  | Histone methylation and transcriptional coactivation | 6/119 (5.04%) | 56/1337 (4.19%) | 0.990 | 0.998 | −0.00037 | 0.02824 |
|  | Transcription repressor of IL-2 (predicted) | 2/119 (1.68%) | 12/1337 (0.90%) | 0.347 | 0.998 | 0.06365 | 0.06773 |
|  | Regulation of planar cell polarity | 16/119 (13.45%) | 121/1337 (9.05%) | 0.154 | 0.998 | 0.02757 | 0.01935 |
|  | DNA cross-link repair | 3/119 (2.52%) | 8/1337 (0.60%) | 0.120 | 0.998 | 0.10020 | 0.06451 |
|  | Regulator of megakaryopoiesis and platelet production | 0/119 (0.00%) | 10/1337 (0.75%) | 0.507 | 0.998 | −0.04494 | 0.06773 |
|  | LDL receptor | 10/119 (8.40%) | 107/1337 (8.00%) | 0.850 | 0.998 | −0.00388 | 0.02053 |
|  | Critical for centrosome function | 7/119 (5.88%) | 68/1337 (5.09%) | 0.855 | 0.998 | 0.00464 | 0.02534 |
|  | Homeobox regulating morphogenesis and thyroid-specific genes | 2/119 (1.68%) | 0/1337 (0.00%) | 0.0000184 | 0.0144 | 0.64450 | 0.15000 |
|  | T-cell proliferation and survival; cytokine production | 3/119 (2.52%) | 1/1337 (0.07%) | 0.000775 | 0.202 | 0.35340 | 0.10490 |
a Obtained via RefGene.
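The FDR-adjusted column depends on the full set of tests over which the adjustment was made. A minimal sketch of one common adjustment, the Benjamini-Hochberg procedure, is given below. Two caveats apply: the source does not name the FDR method, and the adjusted values in the table appear to reflect a larger family of tests than the 17 genes shown, so this sketch is not expected to reproduce them. The function name and the four example p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by n / rank and
    # enforcing monotonicity with a running minimum capped at 1.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
for p, q in zip(raw, adjusted):
    print(f"p = {p:.3f} -> adjusted = {q:.3f}")
```

Because each p-value is scaled by the total number of tests, the same raw p-value yields a much larger adjusted value when many genes are tested together.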
Gene-based burden testing of 17 prioritised candidate genes in 16 probands and 38 family members (2 affected, 36 unaffected).
| Gene | Cases with Pathogenic Variants (n = 18) | Controls with Pathogenic Variants (n = 36) | p-value | β Estimate | β Standard Error |
|---|---|---|---|---|---|
|  | 0 (0.00%) | 1 (2.78%) | 0.435 | −0.37660 | 0.47920 |
|  | 18 (100.00%) | 36 (100.00%) | 0.625 | −0.06977 | 0.14200 |
|  | 4 (22.22%) | 8 (22.22%) | 0.907 | 0.01852 | 0.15700 |
|  | 2 (11.11%) | 6 (16.67%) | 0.677 | −0.07689 | 0.18360 |
|  | 1 (5.56%) | 2 (5.56%) | 0.986 | −0.00504 | 0.28290 |
|  | 0 (0.00%) | 0 (0.00%) | N/A | N/A | N/A |
|  | 1 (5.56%) | 1 (2.78%) | 0.646 | 0.15840 | 0.34280 |
|  | 1 (5.56%) | 4 (11.11%) | 0.498 | −0.15180 | 0.22260 |
|  | 18 (100.00%) | 36 (100.00%) | N/A | N/A | N/A |
|  | 0 (0.00%) | 0 (0.00%) | N/A | N/A | N/A |
|  | 15 (83.33%) | 34 (94.44%) | 0.218 | −0.27570 | 0.22130 |
|  | 2 (11.11%) | 0 (0.00%) | 0.0376 | 0.70240 | 0.32920 |
|  | 0 (0.00%) | 0 (0.00%) | N/A | N/A | N/A |
|  | 1 (5.56%) | 2 (5.56%) | 0.978 | 0.00781 | 0.28300 |
|  | 13 (72.22%) | 26 (72.22%) | 0.833 | 0.03131 | 0.14810 |
|  | 0 (0.00%) | 0 (0.00%) | N/A | N/A | N/A |
|  | 0 (0.00%) | 0 (0.00%) | N/A | N/A | N/A |
N/A: No pathogenic variants were found in these genes in either affected or unaffected individuals, so no statistical test was performed.
Top 10 canonical pathways identified by IPA Pathway Analysis.
| Top Canonical Pathways | IPA Enrichment p-value a | IPA Overlap a | Cases with Variants in Pathway Genes | Controls with Variants in Pathway Genes | Odds Ratio (95% CI) b | Odds Ratio p-value b | Gene |
|---|---|---|---|---|---|---|---|
| GM-CSF Signalling | 0.0092 | 6/70 (8.6%) | 17/119 (14.29%) | 41/1337 (3.07%) | 5.3 | 1.21 × 10⁻⁶ |  |
| Synaptogenesis Signalling Pathway | 0.0158 | 15/312 (4.8%) | 29/119 (24.37%) | 61/1337 (4.56%) | 6.7 | 6.11 × 10⁻¹² |  |
| Clathrin-mediated Endocytosis Signalling | 0.0274 | 10/192 (5.2%) | 19/119 (15.97%) | 25/1337 (1.87%) | 9.9 | 1.35 × 10⁻¹⁰ |  |
| Role of JAK1 and JAK3 in γc Cytokine Signalling | 0.0291 | 5/67 (7.5%) | 16/119 (13.45%) | 41/1337 (3.07%) | 4.9 | 4.87 × 10⁻⁶ |  |
| Agrin Interactions at Neuromuscular Junction | 0.0307 | 5/68 (7.4%) | 29/119 (24.37%) | 127/1337 (9.50%) | 3.1 | 7.13 × 10⁻⁶ |  |
| NAD biosynthesis II (from tryptophan) | 0.0312 | 2/11 (18%) | 5/119 (4.20%) | 6/1337 (0.45%) | 9.7 | 1.04 × 10⁻³ |  |
| Tryptophan Degradation III (Eukaryotic) | 0.0314 | 3/27 (11%) | 9/119 (7.56%) | 22/1337 (1.65%) | 4.9 | 5.20 × 10⁻⁴ |  |
| CREB Signalling in Neurons | 0.0355 | 23/595 (3.9%) | 57/119 (47.90%) | 160/1337 (11.97%) | 6.7 | 1.61 × 10⁻¹⁹ |  |
| IL-15 Production | 0.0357 | 7/120 (5.8%) | 30/119 (25.21%) | 88/1337 (6.58%) | 4.8 | 1.98 × 10⁻⁹ |  |
| Caveolar-mediated Endocytosis Signalling | 0.0441 | 5/75 (6.7%) | 6/119 (5.04%) | 1/1337 (0.07%) | 70.4 | 1.73 × 10⁻⁶ |  |
a The IPA enrichment p-value and IPA overlap test for over-represented biological pathways in the list of genes with significantly different germline mutation burden in cases compared with controls. The IPA enrichment p-values were calculated using Fisher’s exact test. IPA overlap is the number of genes in our dataset over the total number of genes that make up the pathway in the Ingenuity Knowledge Base.
b The odds ratio and odds ratio p-value test whether case or control individuals are over-represented among individuals with any rare pathogenic variant in each pathway.
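The pathway odds ratios can be approximated directly from the carrier counts in the table. The sketch below uses the plain sample (cross-product) odds ratio, which is an assumption: the source does not say which estimator was used, and the reported values are close to, but not always identical with, the cross-product figure. The function name is hypothetical; the counts are taken from three rows of the table.

```python
def sample_odds_ratio(case_carriers: int, n_cases: int,
                      control_carriers: int, n_controls: int) -> float:
    """Cross-product odds ratio of carrying a variant, cases vs. controls.

    Raises ZeroDivisionError when a denominator cell is empty (no control
    carriers, or every case is a carrier); such tables need an exact or
    continuity-corrected estimator instead.
    """
    case_noncarriers = n_cases - case_carriers
    control_noncarriers = n_controls - control_carriers
    return ((case_carriers * control_noncarriers)
            / (case_noncarriers * control_carriers))


# (case carriers, cases, control carriers, controls) from the table.
pathway_counts = {
    "GM-CSF Signalling": (17, 119, 41, 1337),
    "Synaptogenesis Signalling Pathway": (29, 119, 61, 1337),
    "CREB Signalling in Neurons": (57, 119, 160, 1337),
}

for pathway, counts in pathway_counts.items():
    print(f"{pathway}: OR = {sample_odds_ratio(*counts):.2f}")
```

For GM-CSF Signalling this gives (17 × 1296) / (102 × 41), which rounds to the 5.3 listed in the table.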