Mai Adachi Nakazawa, Yoshinori Tamada, Yoshihisa Tanaka, Marie Ikeguchi, Kako Higashihara, Yasushi Okuno.
Abstract
The identification of cancer subtypes is important for understanding tumor heterogeneity. In recent years, numerous computational methods have been proposed for this problem based on the multi-omics data of patients. It is widely accepted that different cancer subtypes are induced by different molecular regulatory networks. However, only a few of these methods incorporate the differences between molecular systems into the identification process. In this study, we present a novel method to identify cancer subtypes based on patient-specific molecular systems. Our method does this by quantifying patient-specific gene networks, which are estimated from transcriptome data, and then clustering the quantified networks. Comprehensive analyses applying our method to The Cancer Genome Atlas (TCGA) datasets confirmed that it identified more clinically meaningful cancer subtypes than the existing subtypes, and showed that the identified subtypes comprised different molecular features. Our findings also show that the proposed method can identify novel cancer subtypes even with single-omics data, subtypes that cannot otherwise be captured by existing methods using multi-omics data.
Year: 2021; PMID: 34880275; PMCID: PMC8654869; DOI: 10.1038/s41598-021-02394-w
Source DB: PubMed; Journal: Sci Rep; ISSN: 2045-2322; Impact factor: 4.379
Figure 1. Overview of our method.
Figure 2. (a) Heatmap showing hierarchical clustering of the ECv matrix in the STAD dataset. (b) Heatmap showing hierarchical clustering of the RNA-seq matrix in the STAD dataset. (c–e) Distribution of the ECv of edges and the absolute fold change of genes in the STAD dataset (see Supplementary S2.3). Dashed lines mark the top 1.0% of the total edges in each subtype. (f) Venn diagram of the number of edges in the STAD dataset. Colored areas represent the subtype-specific edges of each subtype.
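The hierarchical clustering step named in this caption can be illustrated with a minimal sketch. It assumes a patients-by-edges matrix of per-edge scores, Euclidean distance, and average linkage; none of these choices is confirmed by the record, and the function name `average_linkage_clusters` and the toy values are invented for illustration only.

```python
from math import dist  # Euclidean distance, Python 3.8+


def average_linkage_clusters(rows, k):
    """Agglomerative clustering with average linkage (an assumed choice).

    rows: list of equal-length numeric vectors, one per patient.
    Returns k clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the members of two clusters.
                total = sum(dist(rows[i], rows[j])
                            for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy matrix: 6 hypothetical patients x 3 hypothetical edges (invented values).
toy_scores = [
    [0.9, 0.1, 0.0],
    [1.0, 0.2, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.0, 0.9],
    [0.0, 0.1, 1.0],
]
groups = average_linkage_clusters(toy_scores, 3)
print(groups)
```

On real data the matrix would have hundreds of patients and many thousands of edges, where a dedicated library routine would be far faster than this quadratic loop.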
Figure 3. (a) Kaplan–Meier survival probability curves of patients for the multi-omics-based subtypes. Log-rank test across the four subtypes: p = 0.10. Pairwise log-rank p-values (all > 0.05): 0.45 (CIN vs EBV), 0.30 (CIN vs GS), 0.50 (CIN vs MSI), 0.31 (EBV vs GS), 0.95 (EBV vs MSI), and 0.16 (GS vs MSI). (b) Kaplan–Meier survival probability curves of patients for the identified network-based subtypes. Log-rank test across the three identified subtypes: p = 0.00011. Pairwise log-rank p-values (all < 0.05): 0.00016 (subtype 1 vs 2), 0.042 (subtype 1 vs 3), and 0.013 (subtype 2 vs 3). (c) Kaplan–Meier survival probability curves of patients for the three identified RNA-seq-based subtypes. Log-rank test across the three subtypes: p = 0.06. Pairwise log-rank p-values (all > 0.05): 0.70 (subtype 1 vs 2), 0.091 (subtype 1 vs 3), and 0.14 (subtype 2 vs 3). (d) Kaplan–Meier survival probability curves of patients for the two major identified RNA-seq-based subtypes. Log-rank test: p = 0.19 (> 0.05).
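For readers unfamiliar with the pairwise comparisons reported in this caption, the following is a minimal sketch of a standard two-group log-rank test. It is not the authors' code: the function name `logrank_two_groups` and the toy survival times are invented, and the tests across three or four subtypes in the caption would need the k-group generalization, which is not shown.

```python
from math import erfc, sqrt


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    events: 1 = event observed, 0 = censored.
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)               # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e == 1)    # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group-A event count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Toy data (invented): group A has mostly early events, group B later ones.
chi2, p_value = logrank_two_groups(
    [2, 4, 6, 8, 10], [1, 1, 1, 1, 0],
    [12, 14, 16, 18, 20], [1, 0, 1, 1, 0],
)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

The sketch assumes at least one event time at which both groups still have patients at risk; otherwise the variance is zero and the statistic is undefined.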
The relationship between the existing four molecular subtypes and our identified subtypes.
| Subtype name | CIN | EBV | GS | MSI | Unknown | All |
|---|---|---|---|---|---|---|
| Subtype 1 | 33 | 10 | 37 | 9 | 24 | 113 |
| Subtype 2 | 25 | 3 | 5 | 21 | 22 | 76 |
| Subtype 3 | 58 | 11 | 6 | 15 | 83 | 173 |
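As a quick consistency check on the table above, the sketch below recomputes each row total and expresses the counts as within-subtype shares. The counts are copied from the table; the helper name `row_fractions` is an illustrative assumption.

```python
columns = ["CIN", "EBV", "GS", "MSI", "Unknown"]

# Counts copied from the table: rows are the identified network-based subtypes.
counts = {
    "Subtype 1": [33, 10, 37, 9, 24],
    "Subtype 2": [25, 3, 5, 21, 22],
    "Subtype 3": [58, 11, 6, 15, 83],
}
reported_totals = {"Subtype 1": 113, "Subtype 2": 76, "Subtype 3": 173}


def row_fractions(row):
    """Return each count as a fraction of its row total."""
    total = sum(row)
    return [value / total for value in row]


for name, row in counts.items():
    # Each row should add up to the value in the "All" column.
    assert sum(row) == reported_totals[name]
    shares = ", ".join(
        f"{col} {frac:.1%}" for col, frac in zip(columns, row_fractions(row))
    )
    print(f"{name} (n={sum(row)}): {shares}")
```

Reading the shares row by row makes it easier to see which existing subtype labels are concentrated in which network-based subtype than the raw counts alone.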
Figure 4. Visualization of subtype-specific subnetworks in the STAD dataset. (a) Subnetworks of subtype-specific edges are highlighted against the basal network (blue). (b,c) The largest connected component in the subnetwork of subtype-specific edges in each subtype. Edges and nodes are colored by subtype: subtype 1 (gray), subtype 2 (magenta), and subtype 3 (green). Colored nodes are the hub nodes of each subtype, and the color gradient represents the outdegree of the hubs.
Figure 5. The top five terms of biological functions in the STAD dataset.