Wensheng Zhang, Andrea Edwards, Erik K Flemington, Kun Zhang.
Abstract
The causes underlying racial disparities in cancer are multifactorial. In addition to socioeconomic issues, biological factors may contribute to these inequities, especially in disease incidence and patient survival. To date, few studies have related the disparities in these aspects to genetic aberrations. In this work, we studied the impact of race on patient survival and tumor mutation burden using data released by The Cancer Genome Atlas (TCGA). The potential relationship between mutation burden and disease incidence was further inferred by an integrative analysis of TCGA data and data from the Surveillance, Epidemiology, and End Results (SEER) Program. The results show that disparities are present (p < 0.05) in patient survival for five cancers, including head and neck squamous cell carcinoma. The numbers of tumor driver mutations differ (p < 0.05) across the racial groups in five cancers, including lung adenocarcinoma. Treating each combination of a specific cancer type and a racial group as an "experimental unit", driver mutation numbers show a significant positive correlation with cancer incidence rates (r = 0.46, p < 0.002), especially when the analysis is restricted to the five cancers with mutational disparities (r = 0.88, p < 0.00002). These results enrich our understanding of racial disparities in cancer and of the carcinogenic process.
Year: 2017 PMID: 29057889 PMCID: PMC5651797 DOI: 10.1038/s41598-017-13091-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Table 1. The summary of sample profiles‡.
| Cancer | Total samples | White | Black | Asian |
|---|---|---|---|---|
| BLCA | 382 (233) | 300 (183) | 22 (13) | 42 (26) |
| GBM | 594 (285) | 505 (256) | 50 (17) | 13 (5) |
| HNSC | 522 (504) | 447 (439) | 45 (36) | 11 (11) |
| KIRC | 533 (417) | 466 (390) | 51 (14) | 8 (7) |
| LUAD | 521 (488) | 391 (385) | 52 (29) | 8 (8) |
| LUSC | 496 (178) | 341 (111) | 31 (9) | 9 (5) |
| BRCA | 1080 (967) | 747 (698) | 172 (116) | 61 (57) |
| OV | 588 (371) | 498 (324) | 34 (17) | 20 (12) |
| UCEC | 538 (248) | 372 (193) | 104 (25) | 20 (13) |
| COAD | 455 (216) | 214 (177) | 54 (19) | 11 (7) |
| THCA | 506 (402) | 329 (263) | 27 (18) | 52 (39) |
| CESC | 305 (198) | 210 (142) | 30 (16) | 19 (19) |
| ESCA | 174 (171) | 110 (109) | 2 (2) | 41 (41) |
| KIRP | 272 (168) | 189 (108) | 60 (43) | 5 (2) |
| LIHC | 363 (197) | 175 (120) | 17 (14) | 159 (54) |
| STAD | 453 (288) | 288 (167) | 12 (4) | 89 (76) |
‡The numbers outside the brackets are the samples with clinical information. The numbers inside the brackets are the samples with both clinical and genomic information. Some samples do not belong to any of the White, Black or Asian racial groups.
Figure 1. Racial disparity in survival time of cancer patients. Red: White patients; Green: Black patients; Blue: Asian patients. Censored patients (samples), for whom follow-up after treatment ended before the event (death) occurred, are marked with vertical ticks. For each comparison, the printed p-value is the aggregated p-value, which combines the p-value obtained from the Cox-PH analysis and the p-value obtained from the RMST method by the conventional Bonferroni method.
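The caption above says that two p-values (one from the Cox-PH analysis, one from the RMST method) are aggregated by the conventional Bonferroni method. The source does not spell out the formula; a minimal sketch, assuming the usual Bonferroni combination min(1, k · min p) over k tests, could look like this. The function name and the input values are hypothetical.

```python
def bonferroni_aggregate(p_values):
    """Combine k p-values for one comparison as min(1, k * min(p)).

    Assumption: this is the standard Bonferroni combination; the
    source only names the method and does not give the formula.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p < 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    k = len(p_values)
    return min(1.0, k * min(p_values))


# Hypothetical inputs, not values taken from the paper.
p_cox = 0.012   # p-value from a Cox proportional-hazards comparison
p_rmst = 0.030  # p-value from a restricted-mean-survival-time comparison
print(bonferroni_aggregate([p_cox, p_rmst]))
```

Under this reading, the aggregated value is significant at 0.05 only if at least one of the two methods gives p < 0.025.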
Table 2. The statistics of non-synonymous somatic mutations in the pan-cancer driver (pcDriver) genes‡.
| Cancer | White Q1 | White Q2 | White Q3 | Black Q1 | Black Q2 | Black Q3 | Asian Q1 | Asian Q2 | Asian Q3 | P-value (White::Black) | P-value (White::Asian) | P-value (Black::Asian) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BLCA | 7 | 11 | 16 | 5 | 8 | 13 | 3.5 | 7 | 11 | 1.9E-01 | | 4.7E-01 |
| GBM | 2 | 4 | 5 | 3 | 4 | 6 | 2 | 4 | 7 | 2.0E-01 | 7.2E-01 | 9.7E-01 |
| HNSC | 4 | 7 | 11 | 4.75 | 8.5 | 11 | 3.5 | 6 | 13 | 5.6E-01 | 7.7E-01 | 7.2E-01 |
| KIRC | 2 | 3 | 5 | 1.25 | 3.5 | 4 | 1 | 2 | 2 | 3.0E-01 | | 2.9E-01 |
| LUAD | 5 | 9 | 15 | 10 | 13 | 18 | 5.75 | 9.5 | 18.25 | | 6.2E-01 | 4.6E-01 |
| LUSC | 6 | 10 | 14 | 6 | 10 | 14 | 4 | 5 | 10 | 1.0E+00 | 1.4E-01 | 1.6E-01 |
| BRCA | 1 | 3 | 4 | 1 | 3 | 4 | 2 | 3 | 5 | 7.9E-01 | 1.4E-01 | 1.4E-01 |
| OV | 1 | 2 | 3 | 2 | 3 | 3 | 1 | 2.5 | 4.25 | 2.4E-01 | 5.8E-01 | 8.9E-01 |
| UCEC | 5 | 8 | 15 | 4 | 5 | 12 | 5 | 7 | 60 | | 3.4E-01 | |
| COAD | 5 | 7 | 12 | 5.5 | 7 | 15.5 | 5 | 10 | 73 | 9.3E-01 | 3.5E-01 | 4.7E-01 |
| THCA | 1 | 1 | 2 | 1 | 1 | 2 | 1 | 1 | 1 | 6.7E-01 | 6.3E-01 | 5.6E-01 |
| CESC | 2 | 3 | 7 | 2 | 3.5 | 6.25 | 1 | 2 | 4.5 | 9.3E-01 | 8.1E-02 | 2.0E-01 |
| ESCA | 3 | 5 | 7 | 6.25 | 6.5 | 6.75 | 3 | 5 | 6 | 2.7E-01 | 8.4E-01 | 2.0E-01 |
| KIRP | 1 | 2 | 4 | 1 | 1 | 2.5 | 0.75 | 1.5 | 2.25 | | 5.0E-01 | 8.2E-01 |
| LIHC | 2 | 3.5 | 5 | 3 | 4 | 6.5 | 2 | 3 | 6 | 2.0E-01 | 7.9E-01 | 3.2E-01 |
| STAD | 3 | 6 | 10.5 | 4 | 19.5 | 36.25 | 2 | 4 | 11 | 3.0E-01 | 4.0E-01 | 2.2E-01 |
‡Q1, Q2 and Q3 are the first quartile, the second quartile (median) and the third quartile of the mutation numbers, respectively. The number of tumor samples in each cancer-race group is the same as that in Table 1. P-values are calculated by the Mann-Whitney test.
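The footnote above names two computations: per-group quartiles and a pairwise Mann-Whitney test. The sketch below illustrates both in plain Python. It rests on assumptions that the source does not state: quartiles use linear interpolation (fractional entries such as 4.75 and 18.25 are at least consistent with an interpolating rule), and the test uses a tie-corrected normal approximation without continuity correction, whereas an exact test may be preferable for the very small groups in the table. Function names and sample values are hypothetical.

```python
import math
from statistics import quantiles


def quartiles(values):
    """Return (Q1, Q2, Q3) using linear interpolation between order statistics."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test.

    Uses midranks for ties and a tie-corrected normal approximation
    (an assumption; small samples may call for an exact test).
    Returns (U statistic for x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                        # size of this tie group
        midrank = (i + 1 + j) / 2.0      # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Hypothetical per-tumor mutation counts for two groups (not TCGA data).
group_a = [5, 9, 15, 7, 11, 4, 20]
group_b = [10, 13, 18, 12, 25]
print(quartiles(group_a))
print(mann_whitney_u(group_a, group_b))
```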
Figure 2. The association between mutation burden and cancer incidence rate for the five cancer types that demonstrate mutational disparities between patient races. Each data point represents the combination of a racial group and a TCGA cancer. Y (incidence rate) in both plots indicates the number of new cancer cases per 100,000 individuals per year. (A) X1 indicates the median of mutation numbers in the pan-cancer driver genes. (B) X2 indicates the log2-transformed median of mutation numbers in all HUGO genes. The p-value of the Pearson correlation (r) between X1 (or X2) and Y is estimated by the t-test. The regression of Y on X1 (or X2) is denoted by the dotted red line.
Figure 3. The association between mutation burden and cancer incidence rate for all the addressed cancer types except for BRCA. Y (incidence rate) in both plots indicates the number of new cancer cases per 100,000 individuals per year. Each data point represents the combination of a racial group and a TCGA cancer. (A) X1 indicates the median of mutation numbers in the pan-cancer driver genes. (B) X2 indicates the log2-transformed median of mutation numbers in all HUGO genes. The p-value of the Pearson correlation (r) between X1 (or X2) and Y is estimated by the t-test. The regression of Y on X1 (or X2) is denoted by the dotted red line. The information of BRCA is not used in the analysis. The graphic is generated by the gap.plot() function in the R package “plotrix”.
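The figure captions state that the p-value of the Pearson correlation is estimated by the t-test. The standard test statistic for this is t = r · sqrt((n − 2) / (1 − r²)), compared against a t distribution with n − 2 degrees of freedom. A minimal sketch follows; the function names are hypothetical, and the choice n = 15 (five cancers times three racial groups) is an assumption about the number of data points rather than a figure given in the source.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1.0:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# r = 0.88 is the value reported in the abstract; n = 15 is an assumption
# (five cancers x three racial groups, if every combination has data).
print(round(t_statistic(0.88, 15), 2))
```

The two-sided p-value would then come from the t distribution's tail probability, for example through a statistics library.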
Table 3. The regression of cancer incidence on stem cell division and somatic mutation burden.
| Model | Explanatory variableᵃ | Adjusted R² | p-value (SCD) | p-value (DM) | p-value (TSM) |
|---|---|---|---|---|---|
| Model-1 | SCD, DM | 0.374 | 0.030 | 0.042 | NA |
| Model-2 | SCD, TSM | 0.340 | 0.011 | NA | 0.079 |
| Model-3 | SCD | 0.268 | 0.006 | NA | NA |
| Model-4 | DM | 0.247 | NA | 0.008 | NA |
| Model-5 | TSM | 0.139 | NA | NA | 0.041 |
ᵃSCD: the lifetime number of stem cell divisions. DM: the number of somatic mutations in the pan-cancer driver (pcDriver) genes. TSM: the number of somatic mutations present in all HUGO genes. Before the regression analysis, a logarithmic transformation is applied to SCD and TSM.
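The table above reports adjusted R² values for regressions on log-transformed predictors. As an illustration only, the sketch below fits a one-predictor least-squares model on a log-transformed variable and applies the usual adjustment 1 − (1 − R²)(n − 1)/(n − k − 1), where k is the number of predictors. The base of the logarithm (base 10 here), the function names, and the toy data are all assumptions; none of them come from the source.

```python
import math


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, R^2)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k predictors (intercept not counted)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Toy values for illustration only, NOT data from the paper.
scd = [1e8, 1e9, 1e10, 1e11, 1e12]        # hypothetical stem cell divisions
incidence = [2.0, 5.0, 9.0, 30.0, 60.0]   # hypothetical cases per 100,000
log_scd = [math.log10(v) for v in scd]    # log base is an assumption

slope, intercept, r2 = fit_simple_ols(log_scd, incidence)
print(adjusted_r2(r2, n=len(scd), k=1))
```

Because the adjustment penalizes each added predictor, a two-variable model such as Model-1 scores higher than the single-variable models only if the extra variable explains enough additional variance.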