Xiucai Ye, Weihang Zhang, Yasunori Futamura, Tetsuya Sakurai.
Abstract
High-throughput sequencing technologies have enabled the generation of single-cell RNA-seq (scRNA-seq) data, which can be used to explore both genetic heterogeneity and phenotypic variation between cells. Several methods have been proposed to detect the genes that drive cell-to-cell variability, with the aim of understanding tumor heterogeneity. However, most existing methods detect these genes individually, without considering gene interactions. In this paper, we propose a novel learning framework that detects interactive gene groups in scRNA-seq data based on co-expression network analysis and subgraph learning. We first used spectral clustering to identify the subpopulations of cells. For each cell subpopulation, the differentially expressed genes were then selected to construct a gene co-expression network. Finally, the interactive gene groups were detected by learning the dense subgraphs embedded in the gene co-expression networks. We applied the proposed learning framework to a real cancer scRNA-seq dataset to detect interactive gene groups of different cancer subtypes. Systematic gene ontology enrichment analysis was performed to examine the detected gene groups by summarizing the key biological processes and pathways. Our analysis shows that different subtypes exhibit distinct gene co-expression networks and interactive gene groups with different functional enrichment. The interactive genes are expected to provide important references for understanding tumor heterogeneity.
Keywords: co-expression networks; interactive gene groups; machine learning; single-cell RNA-seq; subgraph learning
Year: 2020 PMID: 32825786 PMCID: PMC7563496 DOI: 10.3390/cells9091938
Source DB: PubMed Journal: Cells ISSN: 2073-4409 Impact factor: 6.600
Figure 1. The proposed learning framework to detect interactive gene groups. Four major steps: (a) Filtering rare, ubiquitous, and invariable genes; (b) Spectral clustering to identify cell subpopulations; (c) Constructing gene co-expression networks; (d) Detecting dense subgraphs embedded in the gene co-expression networks.
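Steps (a) and (c) of the framework can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the function names, the three filtering thresholds, and the use of Pearson correlation with a hard cutoff are illustrative choices, and the differential-expression selection that precedes network construction is omitted.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pvariance


def filter_genes(expr, min_cells=2, max_fraction=0.95, min_variance=1e-3):
    """Drop rare, ubiquitous, and invariable genes.

    expr maps gene name -> list of expression values, one per cell.
    All three thresholds are hypothetical defaults.
    """
    kept = {}
    for gene, values in expr.items():
        n_expressed = sum(1 for v in values if v > 0)
        if n_expressed < min_cells:                   # rare gene
            continue
        if n_expressed / len(values) > max_fraction:  # ubiquitous gene
            continue
        if pvariance(values) < min_variance:          # invariable gene
            continue
        kept[gene] = values
    return kept


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def coexpression_edges(expr, threshold=0.8):
    """Connect two genes when |Pearson r| reaches the (assumed) threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy expression matrix: 5 genes measured in 6 cells.
toy = {
    "geneA": [0, 1, 2, 3, 0, 5],
    "geneB": [0, 2, 4, 6, 0, 10],  # proportional to geneA
    "geneC": [5, 0, 3, 0, 1, 0],
    "rare": [0, 0, 0, 0, 0, 7],    # expressed in a single cell
    "flat": [1, 1, 1, 1, 1, 1],    # expressed in every cell, no variance
}
filtered = filter_genes(toy)
edges = coexpression_edges(filtered)
```

On this toy input the filter keeps `geneA`, `geneB`, and `geneC`, and the only edge that survives the threshold joins `geneA` and `geneB`.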
Number of cells in each sample/patient.
| Sample ID | Total Cells | Benign Cells (Percentage) | Malignant Cells (Percentage) |
|---|---|---|---|
| Melanoma_53 | 143 | 127 (88.8%) | 16 (11.2%) |
| Melanoma_58 | 142 | 142 (100%) | 0 |
| Melanoma_59 | 70 | 16 (22.9%) | 54 (77.1%) |
| Melanoma_60 | 226 | 217 (96.0%) | 9 (4.0%) |
| Melanoma_65 | 63 | 59 (93.7%) | 4 (6.3%) |
| Melanoma_67 | 95 | 95 (100%) | 0 |
| Melanoma_71 | 89 | 35 (39.3%) | 54 (60.7%) |
| Melanoma_72 | 181 | 181 (100%) | 0 |
| Melanoma_74 | 147 | 147 (100%) | 0 |
| Melanoma_75 | 344 | 341 (99.1%) | 3 (0.9%) |
| Melanoma_78 | 131 | 11 (8.4%) | 120 (91.6%) |
| Melanoma_79 | 896 | 428 (47.8%) | 468 (52.2%) |
| Melanoma_80 | 480 | 355 (74.0%) | 125 (26.0%) |
| Melanoma_81 | 205 | 72 (35.1%) | 133 (64.9%) |
| Melanoma_82 | 84 | 52 (61.9%) | 32 (38.1%) |
| Melanoma_84 | 159 | 145 (91.2%) | 14 (8.8%) |
| Melanoma_88 | 351 | 234 (66.7%) | 117 (33.3%) |
| Melanoma_89 | 475 | 377 (79.4%) | 98 (20.6%) |
| Melanoma_94 | 364 | 354 (97.3%) | 10 (2.7%) |
Figure 2. Performance comparison of different clustering methods. The adjusted Rand index (ARI) is employed to measure the accuracy of clustering results.
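For readers unfamiliar with the metric, the ARI can be computed from the contingency table of two labelings. The following is a minimal standard-library sketch of the standard formula, not the evaluation code used for the figure; the function name is illustrative.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the labelings trivially agree

    # Pair counts within contingency cells, true clusters, predicted clusters.
    contingency = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in contingency.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_rows * sum_cols / total_pairs  # chance-level agreement
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_cells - expected) / (max_index - expected)


same_up_to_renaming = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The index is 1 when two partitions agree up to a renaming of clusters, is close to 0 for random assignments, and can be negative when agreement is worse than chance, as in the second call.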
Figure 3. Three-dimensional spaces constructed by (a) the first three eigenvectors and (b) the last three eigenvectors. Different colors denote different clusters output by spectral clustering.
Figure 4. Visualization of cancer subtypes identified by spectral clustering from the human melanoma scRNA-seq dataset in two-dimensional space constructed by (a) t-SNE and (b) UMAP, respectively. Different colors denote different clusters output by spectral clustering.
Cell subpopulations present in each sample/patient. The subtype to which the majority of cells belong is highlighted in boldface.
| Sample ID | Subtype 1 | Subtype 2 | Subtype 3 | Subtype 4 | Subtype 5 | Subtype 6 |
|---|---|---|---|---|---|---|
| Melanoma_53 | 0 | 0 | | 0 | 0 | 0 |
| Melanoma_59 | | 0 | 0 | 0 | 0 | 2 (3.7%) |
| Melanoma_60 | 1 (11.1%) | 0 | 0 | | 1 (11.1%) | 1 (11.1%) |
| Melanoma_65 | | 0 | 0 | 0 | 0 | 0 |
| Melanoma_71 | | 1 (1.8%) | 0 | 0 | 0 | 3 (5.6%) |
| Melanoma_75 | 3 (100%) | 0 | 0 | 0 | 0 | 0 |
| Melanoma_78 | 6 (5.0%) | 0 | 0 | | 0 | 0 |
| Melanoma_79 | 2 (0.4%) | | 0 | 0 | 1 (0.2%) | 0 |
| Melanoma_80 | 0 | 0 | 0 | 0 | 0 | |
| Melanoma_81 | 2 (1.5%) | 0 | | 0 | 0 | 0 |
| Melanoma_82 | 0 | 0 | 32 (100%) | 0 | 0 | 0 |
| Melanoma_84 | 1 (7.1%) | 1 (7.1%) | 0 | 0 | 1 (7.1%) | |
| Melanoma_88 | | 0 | 0 | 0 | 0 | 1 (0.9%) |
| Melanoma_89 | 1 (1.0%) | 0 | 0 | 0 | | 0 |
| Melanoma_94 | | 1 (10.0%) | 2 (20.0%) | 0 | 1 (10.0%) | 0 |
Figure 5. Eigenvector norms (left column): (a) norms in Subtype 1, (c) norms in Subtype 2, and (e) norms in Subtype 3. Scatterplots of the projection into the subspace defined by the indicated eigenvectors (right column): (b) Scatterplot in Subtype 1, (d) Scatterplot in Subtype 2, and (f) Scatterplot in Subtype 3.
Figure 6. Eigenvector norms (left column): (a) norms in Subtype 4, (c) norms in Subtype 5, and (e) norms in Subtype 6. Scatterplots of the projection into the subspace defined by the indicated eigenvectors (right column): (b) Scatterplot in Subtype 4, (d) Scatterplot in Subtype 5, and (f) Scatterplot in Subtype 6.
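The captions above do not say which norm is plotted. Assuming it is the L1 norm of a unit-length (L2-normalized) eigenvector, which is an assumption and not stated here, the short sketch below shows why such a norm can flag a small group of genes: a vector concentrated on a few entries has a much smaller L1 norm than one spread evenly over all entries. The function name and the toy vectors are illustrative.

```python
from math import sqrt


def l1_norm_of_unit_vector(vec):
    """Scale vec to unit L2 length, then return its L1 norm."""
    length = sqrt(sum(x * x for x in vec))
    if length == 0:
        raise ValueError("cannot normalize the zero vector")
    return sum(abs(x) / length for x in vec)


# Weight spread evenly over 100 entries versus concentrated on 4 entries.
spread = [1.0] * 100
localized = [1.0] * 4 + [0.0] * 96

spread_norm = l1_norm_of_unit_vector(spread)
localized_norm = l1_norm_of_unit_vector(localized)
```

For a unit vector of length n the L1 norm ranges from 1 (a single non-zero entry) to sqrt(n) (all entries equal), so an unusually small value suggests that the vector is localized on a small set of vertices.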
Dense subgraphs detected by analysis.
| Subtype | Subgraph | Eigenvector | Subgraph Size | Subgraph Density |
|---|---|---|---|---|
| Subtype 1 | Subg 1 | | 20 | |
| | Subg 2 | | 9 | |
| Subtype 2 | Subg 1 | | 10 | |
| | Subg 2 | | 13 | |
| Subtype 3 | Subg 1 | | 12 | |
| | Subg 2 | | 15 | |
| Subtype 4 | Subg 1 | | 13 | |
| | Subg 2 | | 13 | |
| Subtype 5 | Subg 1 | | 12 | |
| | Subg 2 | | 9 | |
| Subtype 6 | Subg 1 | | 8 | |
| | Subg 2 | | 13 | |
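As a reading aid for the size and density columns, the sketch below computes the density of an induced subgraph under one common definition: the fraction of possible vertex pairs that are joined by an edge in an unweighted, undirected graph. Whether the table uses this exact definition is an assumption, and the function name and toy graph are illustrative.

```python
def subgraph_density(edges, nodes):
    """Density of the subgraph induced by `nodes`: 2m / (n * (n - 1)).

    edges is an iterable of (u, v) pairs from an undirected graph;
    self-loops and duplicate edges are ignored.
    """
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    internal = {
        frozenset((u, v))
        for u, v in edges
        if u != v and u in nodes and v in nodes
    }
    return 2 * len(internal) / (n * (n - 1))


# Toy graph: a triangle (a, b, c) with one extra vertex d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
triangle_density = subgraph_density(toy_edges, ["a", "b", "c"])
whole_density = subgraph_density(toy_edges, ["a", "b", "c", "d"])
```

The triangle is fully connected and has density 1, while the whole four-vertex graph has 4 of 6 possible edges.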
Figure 7. Two detected subgraphs in the gene co-expression network of cancer subtype 2. The two detected subgraphs are highlighted by red circles. Genes in the subgraphs are shown in the green squares.
Significant genes and Gene Ontology (GO) analysis of the co-expression networks of different melanoma subtypes.
| Subgraph | Gene List | Term Type & Name | |
|---|---|---|---|
| Subtype 1: Subg 1 | | BP: cell cycle (18) | 1.4 × 10 |
| | | BP: nuclear division (13) | 1.2 × 10 |
| | | CC: chromosome (11) | 6.4 × 10 |
| | | KEGG: Cell cycle (3) | 8.4 × 10 |
| Subtype 2: Subg 2 | | BP: sister chromatid segregation (8) | 5.4 × 10 |
| | | BP: mitotic cell cycle process (10) | 6.5 × 10 |
| | | BP: chromosome organization (8) | 5.2 × 10 |
| | | KEGG: Pyrimidine metabolism (3) | 8.5 × 10 |
| Subtype 3: Subg 2 | | BP: mitotic cell cycle process (13) | 6.2 × 10 |
| | | BP: cell division (10) | 4.2 × 10 |
| | | CC: spindle (9) | 8.8 × 10 |
| | | BP: microtubule-based process (9) | 3.8 × 10 |
| | | MF: microtubule binding (5) | 1.2 × 10 |
| | | KEGG: Cell cycle (2) | 1.8 × 10 |
| Subtype 4: Subg 2 | | BP: mitotic cell cycle (9) | 5.1 × 10 |
| | | BP: organelle fission (6) | 4.6 × 10 |
| | | CC: centrosome (6) | 3.7 × 10 |
| | | BP: DNA metabolic process (5) | 4.3 × 10 |
| Subtype 5: Subg 2 | | BP: DNA replication (7) | 7.2 × 10 |
| | | CC: chromosomal part (7) | 8.5 × 10 |
| | | MF: helicase activity (4) | 3.6 × 10 |
| | | BP: cellular macromolecule (9) | 6.7 × 10 |
| | | KEGG: DNA replication (2) | 5.2 × 10 |
| Subtype 6: Subg 2 | | BP: mitotic cell cycle (11) | 2.5 × 10 |
| | | BP: organelle fission (9) | 1.6 × 10 |
| | | CC: condensed chromosome (6) | 4.4 × 10 |
| | | KEGG: Pyrimidine metabolism (2) | 5.7 × 10 |