Yajun Yu, Salem Werdyani, Megan Carey, Patrick Parfrey, Yildiz E Yilmaz, Sevtap Savas.
Abstract
We aimed to examine the associations of a genome-wide set of single nucleotide polymorphisms (SNPs) and 254 copy number variations (CNVs) and/or insertion/deletions (INDELs) with clinical outcomes in colorectal cancer patients (n = 505). We also aimed to investigate whether their associations changed (e.g., appeared, diminished) over time. Multivariable Cox proportional hazards and piece-wise Cox regression models were used to examine the associations. The Cancer Genome Atlas (TCGA) datasets were used for replication purposes and to examine the gene expression differences between tumor and nontumor tissue samples. A common SNP (WBP11-rs7314075) was associated with disease-specific survival with a P-value of 3.2 × 10⁻⁸. Association of this region with disease-specific survival was also detected in the TCGA patient cohort. Two expression quantitative trait loci (eQTLs) that were implicated in the regulation of ERP27 expression were identified in this locus. Interestingly, expression levels of ERP27 and WBP11 were significantly different between colorectal tumors and nontumor tissues. Three SNPs predicted the risk of recurrent disease only after 5 years postdiagnosis. Overall, our study identified novel variants, one of which also showed an association in the TCGA dataset, but no CNVs/INDELs, that were associated with outcomes in colorectal cancer. Three SNPs were candidate predictors of long-term recurrence/metastasis risk.
Keywords: colorectal cancer; genetic variants; genome-wide association study; prognostic markers; proportional hazards (PH) assumption; variables with time-varying associations
Year: 2021 PMID: 34309201 PMCID: PMC8637572 DOI: 10.1002/1878-0261.13067
Source DB: PubMed Journal: Mol Oncol ISSN: 1574-7891 Impact factor: 6.603
Baseline characteristics of the SNP and CNV/INDEL analysis cohorts.
| Variable | SNP analysis cohort (n = 505) | | CNV/INDEL analysis cohort (n = 495) | |
|---|---|---|---|---|
| | Number | % | Number | % |
| Age at diagnosis | ||||
| Median (range) | 61.43 (20.70–75.01) | – | 61.40 (20.70–75.01) | – |
| Sex | ||||
| Male | 307 | 60.79 | 301 | 60.81 |
| Female | 198 | 39.21 | 194 | 39.19 |
| Tumor location | ||||
| Colon | 334 | 66.14 | 328 | 66.26 |
| Rectum | 171 | 33.86 | 167 | 33.74 |
| Stage | ||||
| I | 93 | 18.42 | 89 | 17.98 |
| II | 196 | 38.81 | 193 | 38.99 |
| III | 166 | 32.87 | 164 | 33.13 |
| IV | 50 | 9.90 | 49 | 9.90 |
| Histology | ||||
| Nonmucinous | 448 | 88.71 | 438 | 88.48 |
| Mucinous | 57 | 11.29 | 57 | 11.52 |
| Grade | ||||
| Well/moderately differentiated | 464 | 91.88 | 457 | 92.32 |
| Poorly differentiated | 37 | 7.33 | 34 | 6.87 |
| Unknown | 4 | 0.79 | 4 | 0.81 |
| MSI status | ||||
| MSI‐L/MSS | 431 | 85.35 | 421 | 85.05 |
| MSI‐H | 53 | 10.50 | 53 | 10.71 |
| Unknown | 21 | 4.16 | 21 | 4.24 |
| Adjuvant chemotherapy treatment | ||||
| No | 224 | 44.36 | 217 | 43.84 |
| Yes | 277 | 54.85 | 274 | 55.35 |
| Unknown | 4 | 0.79 | 4 | 0.81 |
| Adjuvant radiotherapy treatment | ||||
| No | 364 | 72.08 | 355 | 71.72 |
| Yes | 124 | 24.55 | 123 | 24.85 |
| Unknown | 17 | 3.37 | 17 | 3.43 |
| Follow‐up time | ||||
| Median (range) | 13.79 (0.38–19.00) | – | 13.80 (0.38–19.00) | – |
| DSS status | ||||
| Death from other causes or alive | 332 | 65.74 | 323 | 65.25 |
| Death from colorectal cancer | 99 | 19.60 | 99 | 20.00 |
| Unknown | 74 | 14.65 | 73 | 14.75 |
| Death from other causes or alive (within 5 years) | 407 | 80.59 | 398 | 80.40 |
| Death from colorectal cancer (within 5 years) | 62 | 12.28 | 62 | 12.53 |
| Unknown (within 5 years) | 36 | 7.13 | 35 | 7.10 |
| RMFS status | ||||
| Recurrence or metastasis (−) | 331 | 72.75 | 322 | 72.20 |
| Recurrence or metastasis (+) | 124 | 27.25 | 124 | 27.80 |
| Recurrence or metastasis (−) (within 5 years) | 348 | 76.48 | 339 | 76.01 |
| Recurrence or metastasis (+) (within 5 years) | 105 | 23.08 | 105 | 23.54 |
| Unknown (within 5 years) | 2 | 0.44 | 2 | 0.45 |
CNV, copy number variation; DSS, disease‐specific survival; INDEL, insertion/deletion; MSI, microsatellite instability; MSI‐H, microsatellite instability‐high; MSI‐L, microsatellite instability‐low; MSS, microsatellite stable; RMFS, recurrence/metastasis‐free survival; SNP, single nucleotide polymorphism.
All 495 patients in the CNV/INDEL analysis cohort are also among the 505 patients in the SNP analysis cohort.
RMFS status is reported for stage I–III patients only (n = 455 in the SNP analysis cohort and n = 446 in the CNV/INDEL analysis cohort).
The ‘Unknown (within 5 years)’ RMFS entries arise because two patients had unknown survival times: they experienced recurrence or metastasis, but it is not known whether the event occurred within the first 5 years postdiagnosis or later.
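The "within 5 years" rows of the table above can be read as the overall status restricted to a 5-year horizon. The following is a minimal sketch of such a derivation, not the authors' code: the function name, the status labels, and the rule that a patient with unknown status but more than 5 years of follow-up counts as event-free within 5 years are all assumptions made for illustration.

```python
from typing import Optional


def status_within(time_years: Optional[float], status: str,
                  horizon: float = 5.0) -> str:
    """Restrict an overall outcome status to a follow-up horizon.

    status is one of "event", "no_event", "unknown" (hypothetical labels).
    time_years is the event or last follow-up time; None means unknown.
    """
    if status == "no_event":
        # No event at all, so no event within the horizon either.
        return "no_event"
    if time_years is None:
        # Event (or unknown status) with unknown timing: cannot be placed
        # before or after the horizon.
        return "unknown"
    if time_years > horizon:
        # Whatever happened, it happened after the horizon (assumed rule).
        return "no_event"
    # Event or unknown status at or before the horizon is kept as is.
    return status


print(status_within(3.2, "event"))    # event inside the horizon
print(status_within(7.0, "event"))    # event after the horizon
print(status_within(None, "event"))   # event with unknown timing
```

Under this assumed rule the DSS counts in the SNP analysis cohort are internally consistent: 99 − 62 = 37 events and 74 − 36 = 38 unknowns move to the event-free row, and 37 + 38 = 75 = 407 − 332.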
Association of rs7314075 with disease‐specific survival (DSS) in multivariate analysis under the dominant and additive genetic models.
| Chr | Pos | Minor/major allele | MAF | Variant type | Info score | Genetic model | HR (95% CI) | P‐value | PH assumption P‐value | Located region |
|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 14945417 | A/G | 0.14 | Imputed | 0.964 | Dominant | 3.36 (2.18, 5.16) | 3.27 × 10⁻⁸ | 0.96 | Intron of |
| | | | | | | Additive | 2.65 (1.88, 3.75) | 3.24 × 10⁻⁸ | 0.63 | |
Models are adjusted for MSI status, disease stage, tumor location (6 years as the cutoff time point), adjuvant chemotherapy, and radiotherapy statuses (7 years as the cutoff time point for adjuvant radiotherapy).
Chr, chromosome; CI, confidence interval; HR, hazard ratio; MAF, minor allele frequency; PH, proportional hazard; Pos, position.
Hazard ratio was estimated under the dominant genetic model for [AA+AB] vs BB and under the additive genetic model for AA vs AB vs BB, where A is the minor allele and B is the major allele.
Gene annotation is obtained from the UCSC database [96] (‘UCSC genes’ from the UCSC browser [GRCh37/hg19]).
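The genotype coding described in the footnote above ([AA+AB] vs BB for the dominant model, AA vs AB vs BB for the additive model) can be sketched as follows. This is an illustrative assumption about the encoding, using the common 0/1 and 0/1/2 minor-allele-count conventions; the function name and the two-letter genotype strings are hypothetical and do not come from the paper.

```python
def encode_genotype(genotype: str, minor: str, model: str) -> int:
    """Code a biallelic genotype such as "AG" as a regression covariate.

    dominant: 1 if at least one minor allele is present, else 0
    additive: number of minor alleles (0, 1 or 2)
    """
    if len(genotype) != 2:
        raise ValueError("expected a two-letter genotype, e.g. 'AG'")
    n_minor = genotype.count(minor)
    if model == "additive":
        return n_minor
    if model == "dominant":
        return int(n_minor > 0)
    raise ValueError(f"unknown genetic model: {model}")


# rs7314075 has minor allele A and major allele G (see the table above).
for g in ("GG", "AG", "AA"):
    print(g, encode_genotype(g, "A", "dominant"), encode_genotype(g, "A", "additive"))
```

With this coding, a hazard ratio from the additive model is interpreted per additional copy of the minor allele, whereas the dominant model compares carriers of at least one minor allele with noncarriers.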
Fig. 1. Kaplan–Meier curves of rs7314075 in the disease‐specific survival (DSS) analysis under the dominant genetic model. The P‐value of the log‐rank test is 2 × 10⁻⁶.
Associations between SNPs in high LD with rs7314075 and disease‐specific survival (DSS) in multivariate analysis in the TCGA dataset under the dominant and additive genetic models.
| Genetic model | SNP | Chr | Pos | Minor/major allele | MAF | HR (95% CI) | P‐value | PH assumption P‐value |
|---|---|---|---|---|---|---|---|---|
| Dominant | rs11056174 | 12 | 14909977 | T/C | 0.14 | 2.94 (1.20, 7.20) | | 0.56 |
| | rs2041909 | 12 | 14915409 | C/T | 0.14 | 3.00 (1.23, 7.32) | | 0.58 |
| | rs2041908 | 12 | 14916150 | G/A | 0.14 | 2.32 (0.96, 5.65) | 0.063 | 0.73 |
| | rs6488711 | 12 | 14933216 | T/C | 0.14 | 2.93 (1.20, 7.17) | | 0.56 |
| | rs2241221 | 12 | 14959391 | C/T | 0.16 | 2.97 (1.23, 7.16) | | 0.47 |
| | rs11835363 | 12 | 14982700 | C/T | 0.16 | 2.42 (1.00, 5.88) | 0.050 | 0.23 |
| Additive | rs11056174 | 12 | 14909977 | T/C | 0.14 | 2.35 (1.05, 5.29) | | 0.81 |
| | rs2041909 | 12 | 14915409 | C/T | 0.14 | 2.38 (1.06, 5.32) | | 0.85 |
| | rs2041908 | 12 | 14916150 | G/A | 0.14 | 1.96 (0.87, 4.44) | 0.106 | 0.92 |
| | rs6488711 | 12 | 14933216 | T/C | 0.14 | 2.32 (1.03, 5.20) | | 0.79 |
| | rs2241221 | 12 | 14959391 | C/T | 0.16 | 2.39 (1.08, 5.31) | | 0.72 |
| | rs11835363 | 12 | 14982700 | C/T | 0.16 | 2.01 (0.90, 4.50) | 0.091 | 0.39 |
Models are adjusted for MSI status, disease stage, tumor location, and the top principal component. Bolded values are P‐values < 0.05, indicating significant associations between variants and DSS.
Chr, chromosome; CI, confidence interval; HR, hazard ratio; MAF, minor allele frequency; PH, proportional hazard; Pos, position.
Hazard ratio was estimated under the dominant genetic model for [AA+AB] vs BB and under the additive genetic model for AA vs AB vs BB, where A is the minor allele and B is the major allele.
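A reported hazard ratio and its 95% confidence interval can be cross-checked against the reported P‐value. The sketch below assumes the interval is a symmetric Wald interval on the log hazard ratio scale, which is an assumption about how the intervals were computed rather than something stated in the table; the function name is hypothetical.

```python
import math


def wald_p_from_ci(hr: float, lower: float, upper: float,
                   z_crit: float = 1.959964) -> tuple[float, float]:
    """Recover the Wald z statistic and two-sided P-value from HR and 95% CI.

    Assumes the CI is exp(log(HR) +/- z_crit * SE), i.e. symmetric on the
    log scale.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# rs2041908 under the dominant model: HR 2.32 (0.96, 5.65)
z, p = wald_p_from_ci(2.32, 0.96, 5.65)
print(f"z = {z:.2f}, p = {p:.3f}")
```

For rs2041908 under the dominant model this gives a P‐value of approximately 0.063, in line with the tabulated value. Small discrepancies are expected elsewhere because the inputs are rounded to two decimals.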
Variants in high LD with WBP11‐rs7314075 that are eQTLs.
| Outcome—genetic model | rs ID | eQTL‐associated gene (tissue)—RegulomeDB | eQTL‐associated gene (tissue)—GTEx |
|---|---|---|---|
| DSS‐dominant/additive | rs2241221 | | – |
| DSS‐dominant/additive | rs11056174 | | – |
DSS, disease‐specific survival; eQTL, expression quantitative trait locus; SNP, single nucleotide polymorphism.
Variants that are in high LD with WBP11‐rs7314075 (retrieved from Haploreg [84]) were explored in RegulomeDB [87] and GTEx [88]. Note that the GTEx data were for colon tissue, as GTEx has no data for rectal tissue. The eQTLs are all cis‐eQTLs located within ± 1 Mb of the transcription start sites of the genes shown in the table.
Fig. 2. Expression level of WBP11 in colorectal tumors and normal tissues. Analysis was done in UCSC Xena [85] using the GDC TCGA COAD and READ data. In both datasets, primary tumors and adjacent normal tissues (noted as ‘solid tissue normal’ in TCGA data) were selected (recurrent and metastatic tumors were excluded). Then, only tumors and normal tissues with their anatomical sites noted as colon (in COAD) and rectum and rectosigmoid junction (in READ) were analyzed. (A) WBP11 expression in colon tumors and normal tissues from the TCGA COAD cohort; (B) WBP11 expression in rectal tumors and normal tissues from the TCGA READ cohort. Expression of WBP11 is significantly higher in colon and rectum tumors than in normal tissues. The colon and rectum tumor datasets include more patients than the normal tissue datasets, which may explain why the gene expression levels in tumors show a higher variance than those in normal tissues.