Agustín González-Reymúndez, Ana I Vázquez.
Abstract
Despite recent advances in treatment, cancer continues to be one of the most lethal human maladies. One of the challenges of cancer treatment is the diversity among similar tumors that exhibit different clinical outcomes. Most of this variability comes from widespread molecular alterations that can be summarized by omic integration. Here, we have identified eight novel tumor groups (C1-8) via omic integration, characterized by unique cancer signatures and clinical characteristics. C3 had the best clinical outcomes, while C2 and C5 had the poorest. C1, C7, and C8 were upregulated for cellular and mitochondrial translation and had relatively low proliferation. C6 and C4, in contrast, were downregulated for cellular and mitochondrial translation and had high proliferation rates. C4 was represented by copy losses on chromosome 6 and had the highest number of metastatic samples. C8 was characterized by copy losses on chromosome 11 and also had the lowest lymphocytic infiltration rate. C6 had the lowest natural killer infiltration rate and was represented by copy gains of genes on chromosome 11. C7 was represented by copy gains on chromosome 6 and had the highest upregulation in mitochondrial translation. We believe that, since molecularly alike tumors could respond similarly to treatment, our results could inform therapeutic action.
Year: 2020 PMID: 32433524 PMCID: PMC7239905 DOI: 10.1038/s41598-020-65119-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Omic integration and feature selection method. Step 1) Singular value decomposition of a concatenated list of omic blocks and identification of major axes of variation. Step 2) Identification of omic features (expression of genes, methylation intensities, copy gains/losses) influencing the axes and mapping them onto genes and functional classes (e.g. pathways, ontologies, targets of microRNA). Step 3) Mapping major axes of variation via tSNE and cluster definition by DBSCAN. Step 4) Phenotypic characterization of each cluster of subjects.
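Step 3 defines clusters with DBSCAN on the tSNE map. The density rule behind DBSCAN can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the coordinates are hypothetical stand-ins for a tSNE embedding, and the `eps` and `min_pts` values are assumptions chosen only to make the toy example work.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but does not expand
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:     # core point: keep expanding the cluster
                queue.extend(nb)
    return labels

# Hypothetical 2-D coordinates standing in for a tSNE map (not real data).
embedding = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11),
             (50, 50)]
labels = dbscan(embedding, eps=1.5, min_pts=3)
```

Points left with the `NOISE` label belong to no cluster, which is one way samples can end up outside every group after clustering.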
Data description by cancer type after quality control.
| Code | Cancer type | n | F% | Ethnicity %* (AD) | Ethnicity %* (W) | Ethnicity %* (A) | Age | TS% (N) | TS% (M) | Surv** |
|---|---|---|---|---|---|---|---|---|---|---|
| ACC | Adrenocortical carcinoma | 23 | 61 | 0 | 100 | 0 | 48 (35–57) | 0 | 0 | 6.6 (2.5–6.6) |
| BLCA | Bladder urothelial carcinoma | 271 | 99 | 13 | 80 | 7 | 58 (49–66) | 1 | 0 | 3.0 (1.2–3.0) |
| BRCA | Breast invasive carcinoma | 639 | 69 | 18 | 75 | 7 | 58 (46–71) | 7 | 0 | 10.2 (6.5–10.2) |
| CESC | Cervical squamous cell carcinoma and endocervical adenocarcinoma | 234 | 25 | 8 | 78 | 14 | 60 (53–69) | 1 | 1 | 11.2 (3.1–11.2) |
| CHOL | Cholangiocarcinoma | 12 | 36 | 0 | 100 | 0 | 55 (46–67) | 75 | 0 | 1.7 (0.7–5.3) |
| COAD | Colon adenocarcinoma | 264 | 36 | 12 | 79 | 9 | 58 (41–66) | 7 | 0 | 8.3 (3.6–8.3) |
| DLBC | Lymphoid Neoplasm Diffuse Large B-cell Lymphoma | 26 | 54 | 19 | 81 | 0 | 60 (54–63) | 0 | 0 | 17.6 (17.6–17.6) |
| ESCA | Esophageal carcinoma | 134 | 60 | 12 | 88 | 0 | 68 (59–73) | 2 | 0 | 2.3 (1.1–4.4) |
| GBM | Glioblastoma multiforme | 49 | 23 | 12 | 78 | 10 | 66 (60–73) | 0 | 0 | 0.9 (0.4–1.2) |
| HNSC | Head and Neck squamous cell carcinoma | 89 | 48 | 8 | 91 | 1 | 61 (59–71) | 1 | 0 | 5.9 (1.2–5.9) |
| KICH | Kidney chromophobe | 2 | 0 | 0 | 100 | 0 | 52 (50–54) | 0 | 0 | — |
| KIRC | Kidney renal clear cell carcinoma | 43 | 51 | 2 | 91 | 7 | 67 (62–75) | 0 | 0 | 7.5 (7.5–7.5) |
| KIRP | Kidney renal papillary cell carcinoma | 37 | 62 | 20 | 80 | 0 | 65 (59–72) | 0 | 0 | — |
| LAML | Acute myeloid leukemia | 28 | 0 | 0 | 94 | 6 | 60 (57–67) | 0 | 0 | — |
| LGG | Brain lower grade glioma | 93 | 42 | 11 | 88 | 1 | 70 (62–75) | 0 | 0 | 9.5 (3.1–12.2) |
| LIHC | Liver hepatocellular carcinoma | 62 | 25 | 8 | 92 | 0 | 69 (61–74) | 13 | 0 | 4.6 (1.6–8.6) |
| LUAD | Lung adenocarcinoma | 381 | 29 | 6 | 90 | 5 | 66 (59–72) | 4 | 0 | 4.2 (2.1–9.2) |
| LUSC | Lung squamous cell carcinoma | 289 | 28 | 9 | 89 | 2 | 57 (46–64) | 0 | 0 | 4.7 (1.8–10.5) |
| MESO | Mesothelioma | 68 | 0 | 7 | 93 | 0 | 60 (53–66) | 0 | 0 | 1.6 (0.9–2.4) |
| OV | Ovarian serous cystadenocarcinoma | 5 | 0 | 0 | 100 | 0 | 60 (55–61) | 0 | 0 | 2.9 (2.9–2.9) |
| PAAD | Pancreatic adenocarcinoma | 151 | 24 | 4 | 76 | 20 | 67 (60–74) | 3 | 0 | 1.6 (1.0–4.1) |
| PCPG | Pheochromocytoma and paraganglioma | 144 | 0 | 0 | 100 | 0 | 61 (56–65) | 0 | 1 | — |
| PRAD | Prostate adenocarcinoma | 490 | 36 | 5 | 94 | 1 | 62 (54–70) | 6 | 0 | 9.6 (9.6–9.6) |
| READ | Rectum adenocarcinoma | 83 | 42 | 0 | 85 | 15 | 63 (54–73) | 2 | 0 | 3.9 (3.9–3.9) |
| SARC | Sarcoma | 181 | 41 | 0 | 100 | 0 | 58 (46–69) | 0 | 1 | 6.7 (3.1–6.7) |
| SKCM | Skin cutaneous melanoma | 378 | 85 | 15 | 83 | 2 | 61 (50–70) | 0 | 75 | 7.4 (2.6–20.1) |
| STAD | Stomach adenocarcinoma | 263 | 37 | 4 | 70 | 25 | 67 (58–73) | 0 | 0 | 4.6 (1.3–4.6) |
| TGCT | Testicular germ cell tumors | 134 | 0 | 4 | 92 | 4 | 31 (26–37) | 0 | 0 | — |
| THCA | Thyroid carcinoma | 501 | 73 | 6 | 80 | 13 | 46 (35–58) | 8 | 1 | — |
| THYM | Thymoma | 106 | 45 | 6 | 85 | 9 | 58 (48–68) | 1 | 0 | 9.6 (9.6–9.6) |
| UCEC | Uterine corpus endometrial carcinoma | 146 | 100 | 43 | 57 | 0 | 65 (57–72) | 14 | 0 | 9.2 (3.6–9.2) |
| UCS | Uterine carcinosarcoma | 4 | 100 | 0 | 75 | 25 | 63 (54–74) | 0 | 0 | 1.4 (0.3–2.2) |
| UVM | Uveal melanoma | 78 | 45 | 0 | 100 | 0 | 62 (51–74) | 0 | 0 | 3.8 (2.4–3.8) |
Tumor samples are described by cancer type (TCGA codes and cancer name), in terms of relative sample size (n), percent of females (F%), ethnicities (percent of Afro-descendants, AD; non-Hispanic Whites, W; and Asians, A), age (at the moment of diagnosis, in years), type of sample (TS%, as percent of normal, N, and metastatic, M, samples), and survival (Surv, as expected time to 50% survival, in years). Age and Surv are represented by median values, with first and third quartiles as measurements of dispersion. Data correspond to the alignment and intersection of all samples with information on gene expression (GE), methylation (METH), and copy number variants (CNV).
*Only the three most abundant ethnicities in the data set were considered to calculate the percent.
**Survival quantiles for cancer types with fewer than five death events were not calculated.
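The Surv column is described as the time to 50% survival and is skipped when fewer than five deaths were observed. A minimal sketch of how such a value could be obtained is given here, assuming a Kaplan-Meier estimate (the source does not name the estimator). The function name, argument names, and the toy follow-up data are hypothetical.

```python
def km_median(times, events, min_events=5):
    """Kaplan-Meier time at which estimated survival first drops to <= 0.5.

    times  : follow-up time per subject (years)
    events : 1 if death was observed, 0 if the subject was censored
    Returns None when fewer than `min_events` deaths were observed or when
    the survival curve never reaches 0.5.
    """
    if sum(events) < min_events:
        return None
    at_risk = len(times)
    surv = 1.0
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit update: fraction of those at risk who survive t.
            surv *= 1 - deaths / at_risk
            if surv <= 0.5:
                return t
        at_risk -= leaving
    return None

# Hypothetical follow-up data (not taken from the table).
toy_times = [1, 2, 3, 4, 5, 6, 7, 8]
toy_events = [1, 0, 1, 1, 1, 1, 0, 1]
toy_median = km_median(toy_times, toy_events)
too_few = km_median([1, 2, 3], [1, 0, 1])
```

Returning `None` for sparse groups mirrors the dashes shown in the Surv column for cancer types with too few death events.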
Figure 2. Pan-cancer clustering of tumor samples: tissue effects correction and selection of omic features. Tumor clusters were obtained by sequential application of the tSNE and DBSCAN algorithms to 5,408 samples across 33 cancer types. The contours reflect cluster membership, and the points’ colors and shapes represent similar anatomical site and cancer type, respectively. The two-dimensional tSNE projection was obtained from the four deep principal axes of the extended omic matrix projected outside the tissue-specific effects, after performing sSVD and removing the first two axes. After re-classifying tumors, the few samples coming from Kidney chromophobe tumors (KICH) did not map to any of the eight clusters obtained.
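One common reading of "removing the first two axes" is to project the data matrix onto the complement of its two leading singular directions. The sketch below illustrates that idea under stated assumptions: plain power iteration stands in for the sparse SVD (sSVD) used in the paper, and the toy matrix and all function names are hypothetical.

```python
from math import sqrt

def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def leading_axis(X, iters=200):
    """Top right-singular vector of X by power iteration on X^T X."""
    p = len(X[0])
    v = [1.0 / sqrt(p)] * p
    Xt = [list(col) for col in zip(*X)]
    for _ in range(iters):
        w = matvec(Xt, matvec(X, v))
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def remove_axis(X, v):
    """Project each row of X onto the complement of the unit vector v."""
    scores = matvec(X, v)
    return [[x - s * vi for x, vi in zip(row, v)] for row, s in zip(X, scores)]

def remove_top_axes(X, k=2):
    """Deflate the k leading axes one at a time."""
    for _ in range(k):
        X = remove_axis(X, leading_axis(X))
    return X

# Toy matrix (samples x features): a strong, a medium, and a weak axis.
X = [[10, 0, 0], [-10, 0, 0],
     [0, 5, 0], [0, -5, 0],
     [0, 0, 1], [0, 0, -1]]
residual = remove_top_axes(X, k=2)
```

In this toy case the two dominant directions play the role of tissue-driven variation, and only the weak third axis survives in `residual`.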
Characterization of pan-cancer clusters of tumors after removing tissue effects.
| Clusters | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| Clinical information | bc | c | d | ab | ab | ab | bc | a |
| | 5c | 4de | 3e | 17ab | 5de | 7cd | 12bc | 21a |
| | 2.2a | 2.1a | 2.8b | 1.8a | 1.5ab | 1.8ab | 2.2ab | 2.0a |
| | IVab | IVbc | IIIc | IVab | IIIabc | IIIab | IIIabc | IVab |
| | 60a | 70a | 80b | 60a | 60a | 60a | 60a | 60a |
| | 13ab | 14ab | 4d | 10c | 15a | 12abc | 14ab | 9bc |
| | 0.4a | 0.3a | −0.4b | 0.3a | 0.3a | 0.4a | 0.4a | 0.5a |
| Demographic information | 61a | 62a | 57b | 60ab | 60ab | 61ab | 62a | 57b |
| | 52ab | 54a | 50ab | 50ab | 53ab | 46b | 58a | 41b |
| Genome instability rates (as deviations from normal genome) | 1.8bc | 2.2bc | 0.7d | 3.2a | 2.0abc | 1.7c | 2.5ab | 1.8bc |
| | 12a | 12a | 3b | 10a | 14a | 11a | 12a | 10a |
| | 22ab | 16c | 8d | 23ab | 22abc | 25a | 27a | 19bc |
| Immune infiltration (as deviations from leukocytes fraction) | −5.9b | −5.7b | −3.1a | −6.6b | −8.0b | −6.7b | −5.6b | −5.8b |
| | 2.6c | 2.3c | 1.6c | 4.2ab | 5.1abc | 5.4ab | 5.2ab | 6.1a |
| | −8.8b | −7.5b | 5.4a | −14.7c | −5.4b | −4.5b | −8.5b | −9.0b |
| | 2bc | 0.2bc | 0.3a | 0.3ab | 0.2bc | 0.1c | 0.2bc | 0.2bc |
| | 4.7bc | 5.9b | 4.1a | 4.4bc | 4.6bc | 3.1bc | 4.9bc | 3.0c |
| | 1.7b | 1.7b | 1.9a | 1.7b | 1.8ab | 1.6b | 1.8b | 1.6b |
| Functional Classes (such as pathways and ontologies) ** | −0.6d | 0.6a | −0.1bc | 0.6a | 0.4ab | 0.7a | −0.3c | −0.2bc |
| | 0.4d | −0.3b | 0.0c | −0.9a | 0.3cd | −1.1a | 1.9e | 0.5d |
| | −1.1c | 0.7a | −0.1b | 0.7a | −0.2b | 0.8a | −1.1c | −0.1b |
| | −1.5f | 1.0b | −0.1d | 0.5c | 0.3c | 1.3a | −0.4e | −0.4e |

| Cluster | Cancer types (% of samples in the cluster) |
|---|---|
| C1 | COAD (14.2), LUAD (11.7), BRCA (10.7), SKCM (8.1), SARC (7.1), READ (6.4), PRAD (4.8), ESCA (4.6), CESC (4.1), LUSC (4.1), STAD (4.1), BLCA (3.8), PAAD (3.6), TGCT (2.5), ACC (2.3), MESO (2), LIHC (1.5), UCEC (1.5), PCPG (1), HNSC (0.8), KIRC (0.3), LGG (0.3), OV (0.3), and UVM (0.3). |
| C2 | BRCA (11.1), COAD (11.1), STAD (9.6), LUSC (7.4), LUAD (7.1), SKCM (6.1), CESC (5.6), BLCA (5.4), SARC (5.4), READ (4), ESCA (3.1), KIRP (2.5), PAAD (2.5), PRAD (2.5), PCPG (2.2), HNSC (1.7), LIHC (1.5), UVM (1.5), MESO (1.4), UCEC (1.4), ACC (1.3), KIRC (1.1), GBM (1), THYM (1), LGG (0.8), THCA (0.7), TGCT (0.6), DLBC (0.1), and LAML (0.1). |
| C3 | THCA (16.1), PRAD (13.2), BRCA (9.3), LUAD (6.3), SKCM (4.4), BLCA (4.3), LUSC (3.9), STAD (3.8), COAD (3.4), TGCT (3.4), UCEC (3.4), PAAD (3.3), CESC (3.2), THYM (3.2), PCPG (3.1), LGG (2.5), SARC (1.7), UVM (1.6), HNSC (1.3), LIHC (1.2), KIRC (1.1), MESO (1.1), ESCA (1), GBM (1), LAML (0.9), DLBC (0.7), READ (0.5), KIRP (0.4), CHOL (0.4), UCS (0.1), ACC (0.1), and OV (0.1). |
| C4 | SKCM (21.7), BLCA (13), CESC (9.6), LUAD (9.6), LUSC (8.7), BRCA (7.8), ESCA (4.3), UVM (4.3), MESO (3.5), HNSC (2.6), SARC (2.6), GBM (1.7), LIHC (1.7), STAD (1.7), UCEC (1.7), COAD (0.9), KIRP (0.9), PRAD (0.9), READ (0.9), TGCT (0.9), and THYM (0.9). |
| C5 | BLCA (18.4), LUAD (15.8), CESC (10.5), SKCM (10.5), PRAD (7.9), BRCA (5.3), ESCA (5.3), STAD (5.3), COAD (2.6), GBM (2.6), HNSC (2.6), LIHC (2.6), LUSC (2.6), PAAD (2.6), PCPG (2.6), and TGCT (2.6). |
| C6 | BRCA (31.5), LUSC (9.7), ESCA (8.6), SKCM (8.6), BLCA (8.2), STAD (6.5), LUAD (5.7), PRAD (5.7), HNSC (3.9), CESC (2.5), SARC (2.2), PAAD (1.8), GBM (0.7), LGG (0.7), UCEC (0.7), UVM (0.7), CHOL (0.4), DLBC (0.4), MESO (0.4), PCPG (0.4), READ (0.4), and TGCT (0.4). |
| C7 | SKCM (14.7), BRCA (11.5), LUSC (11), ESCA (8.4), STAD (7.3), SARC (6.8), CESC (5.8), LUAD (5.8), UVM (4.7), BLCA (4.2), PAAD (3.1), HNSC (2.6), COAD (2.1), PRAD (2.1), LIHC (1.6), MESO (1.6), READ (1.6), UCEC (1.6), TGCT (1), DLBC (0.5), GBM (0.5), LGG (0.5), OV (0.5), and THCA (0.5). |
| C8 | SKCM (24.8), BRCA (23.9), CESC (12.8), PCPG (6.8), BLCA (5.1), SARC (5.1), LUSC (4.3), HNSC (3.4), UCEC (2.6), COAD (1.7), ESCA (1.7), MESO (1.7), READ (1.7), TGCT (1.7), LUAD (0.9), OV (0.9), and UVM (0.9). |
The clusters produced by integration of whole-genome profiles of gene expression (GE), copy number variants (CNV), and DNA methylation (METH) were characterized in terms of clinical, demographic, immune and molecular information. The table shows those variables with significant differences in at least one cluster. For each variable, different letters represent significant differences between clusters.
*Values represent median survival times by cluster. Letters represent significant differences under the log-rank test to compare the entire survival curves of each cluster.
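The letter codes attached to each table value can be read mechanically. Assuming the usual compact-letter convention (two clusters differ significantly only when they share no letter), a short sketch of the lookup is shown here. The helper name `differ` is hypothetical; the letter codes are copied from the first row of the cluster table.

```python
def differ(letters_a, letters_b):
    """True when two groups share no letter, i.e. they differ significantly."""
    return not (set(letters_a) & set(letters_b))

# Letter codes from the first row of the cluster table (Clusters 1 to 8).
row = {1: "bc", 2: "c", 3: "d", 4: "ab", 5: "ab", 6: "ab", 7: "bc", 8: "a"}

# All cluster pairs that differ for this variable.
pairs = [(i, j) for i in row for j in row if i < j and differ(row[i], row[j])]

# Clusters that differ from Cluster 3, whose letter "d" is shared with no one.
vs_cluster3 = [j for j in row if j != 3 and differ(row[3], row[j])]
```

Under this convention Cluster 3 is separated from every other cluster for that variable, while Clusters 4, 5, and 6 (all "ab") are indistinguishable from one another.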
**Databases: GO Biological process (&), miRTarBase (▯), Reactome (¶). Functional classes significant at FDR adj. p-value < 0.05.
Overlap between our selected group of genes and databases:
(1) GINS1, POLD3, PRIM2, POLD4, PCNA, MCM8 and MCM3.
(2) MRPS26, MRPL2, MRPL51, MRPS35, MRPL16, MRPS18A, MRPS10, MRPL14, MRPL48, MRPL21 and MRPL11.
(3) PANK2, SF3B2, PCNA, HSP90AB1, NOP2, ATN1, CHD4, HOXC13, PRICKLE4, DPP3, C12ORF57, LDHB, CCND3, CCND2, STK35, RAB23, PPP6R3, IDH3B, RPS3, SIRPA, PSMF1, DNM1L, NKX2-5, PRNP, UVRAG, PPIL1, TPI1, DST, CSNK2A1, SMOX, YIPF3, DDX11, ENTPD6, MAD2L1BP, PPP2R5D, MUT, FBXL14, MRPL21, KLHL42, WNK1, RPL7L1, NCAPD2, FKBP4 and GAPDH.
(4) GINS1, POLD3, PRIM2, POLD4, PCNA, CDKN1B, CCND1, MCM8, MCM3, PSMF1 and CDC25B.
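Functional classes were retained at an FDR-adjusted p-value below 0.05. The source does not say which FDR procedure was used, so the sketch below assumes the Benjamini-Hochberg adjustment. The class names and raw p-values are hypothetical and serve only to show the mechanics.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four functional classes (not from the paper).
raw = {"class_A": 0.001, "class_B": 0.012, "class_C": 0.030, "class_D": 0.200}
adj = dict(zip(raw, bh_adjust(list(raw.values()))))
significant = [name for name, p in adj.items() if p < 0.05]
```

Each raw p-value is scaled by the number of tests divided by its rank, so a class with a modest raw p-value can still pass the 0.05 threshold when many smaller p-values rank ahead of it.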
Figure 3. Gene signatures for Clusters 1 and 4 in terms of gene expression, copy number variation, and methylation. The genes significantly de-regulated exclusively in Clusters 1 and 4 were used to define signatures (y-axis). The feature values (x-axis) of each gene are separated into gene expression (GE, first column of panels), copy number variants (CNV, second column of panels), and DNA methylation (METH, third column of panels), and summarized by Bonferroni confidence intervals (adjusting for all the 441 significant genes in at least one cluster). Dots represent the average of feature values across samples.
Figure 4. Gene signatures for Clusters 6, 7 and 8 in terms of gene expression, copy number variation, and methylation. The genes significantly de-regulated exclusively in Clusters 6, 7 and 8 were used to define signatures (y-axis). The feature values (x-axis) of each gene are separated into gene expression (GE, first column of panels), copy number variants (CNV, second column of panels), and DNA methylation (METH, third column of panels), and summarized by Bonferroni confidence intervals (adjusting for all the 441 significant genes in at least one cluster). Dots represent the average of feature values across samples.
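The figure captions summarize each gene by a Bonferroni confidence interval adjusted for 441 genes. A minimal sketch of what that adjustment does to the interval width follows. It assumes a family-wise level of 0.05 and a normal approximation for the mean, neither of which is stated in the source, and the feature values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def bonferroni_ci(values, n_tests, alpha=0.05):
    """Normal-approximation CI for the mean at family-wise level alpha."""
    # Split alpha across n_tests two-sided intervals.
    z = NormalDist().inv_cdf(1 - alpha / (2 * n_tests))
    half = z * stdev(values) / sqrt(len(values))
    center = mean(values)
    return center - half, center + half

N_GENES = 441  # number of significant genes stated in the figure captions

# Critical values without and with the Bonferroni adjustment.
z_plain = NormalDist().inv_cdf(1 - 0.05 / 2)
z_bonf = NormalDist().inv_cdf(1 - 0.05 / (2 * N_GENES))

# Hypothetical standardized feature values for one gene in one cluster.
values = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2, 1.0]
plain_ci = bonferroni_ci(values, n_tests=1)
adjusted_ci = bonferroni_ci(values, n_tests=N_GENES)
```

The adjusted interval is wider than the unadjusted one, so a gene whose adjusted interval still excludes zero remains distinguishable after accounting for all 441 comparisons.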