| Literature DB >> 35847558 |
Ryota Chijimatsu1,2, Shogo Kobayashi3, Yu Takeda3, Masatoshi Kitakaze3, Shotaro Tatekawa4, Yasuko Arao1, Mika Nakayama1, Naohiro Tachibana5, Taku Saito5, Daisuke Ennishi2, Shuta Tomida2, Kazuki Sasaki3, Daisaku Yamada3, Yoshito Tomimaru3, Hidenori Takahashi3, Daisuke Okuzaki6, Daisuke Motooka6, Takahito Ohshiro7, Masateru Taniguchi7, Yutaka Suzuki8, Kazuhiko Ogawa4, Masaki Mori3,9, Yuichiro Doki3, Hidetoshi Eguchi3, Hideshi Ishii1.
Abstract
Single-cell RNA sequencing (scRNAseq) has been used to assess the intra-tumor heterogeneity and microenvironment of pancreatic ductal adenocarcinoma (PDAC). However, previous knowledge is not fully universalized. Here, we built a single cell atlas of PDAC from six datasets containing over 70 samples and >130,000 cells, and demonstrated its application to the reanalysis of the previous bulk transcriptomic cohorts and inferring cell-cell communications. The cell decomposition of bulk transcriptomics using scRNAseq data showed the cellular heterogeneity of PDAC; moreover, high levels of tumor cells and fibroblasts were indicative of poor-prognosis. Refined tumor subtypes signature indicated the tumor cell dynamics in intra-tumor and their specific regulatory network. We further identified functionally distinct tumor clusters that had close interaction with fibroblast subtypes via different signaling pathways dependent on subtypes. Our analysis provided a reference dataset for PDAC and showed its utility in research on the microenvironment of intra-tumor heterogeneity.Entities:
Keywords: Cancer; Cancer systems biology; Transcriptomics
Year: 2022 PMID: 35847558 PMCID: PMC9283889 DOI: 10.1016/j.isci.2022.104659
Source DB: PubMed Journal: iScience ISSN: 2589-0042
Figure 1Integration of the six scRNAseq datasets and reinterpretation of altered gene expression in bulk transcriptomics at the cellular level
(A–C) The integrated data are summarized in the UMAP and color-coded according to cell type (A), dataset (B), and sample type (C).
(D and E) The module scores calculated using the signatures defined in the NMF method on TCGA-PDAC are plotted on UMAP (D) and their violin plot (E).
(F and G) Bubble plot showing the cell origin of the driving genes that were highly upregulated (F) and downregulated (G) in TCGA-PAAD, with cell types indicated in the rows and genes in the columns.
The size of each bubble represents the rate of cells expressing the gene, and the color represents the scaled average expression in their cell type cluster. The bar plot shows the log2 fold change in the TCGA-PAAD sample vs. the control pancreas obtained from GEPIA2.
Figure 2Cellular decomposition analysis of the bulk transcriptome using scRNAseq data
(A–D) Heatmap showing the estimated cell ratio of each patient from TCGA-PAAD (A). Hierarchical clustered heatmap for the correlation among patients based on the cell composition (B) with previously defined tumor subtypes, clinical metadata (C), and Oncoprint for major gene mutation (D).
(E) Kaplan–Meier curve for a group of five patients defined by hierarchical clustering of cell composition (p = 0.03 from likelihood ratio test in Cox proportional hazards regression model).
(F) Boxplot (25th percentile, median, and 75th percentile) showing cell composition aggregated for each patient group. Significance was assessed using Pairwise Wilcoxon Rank Sum Tests with Bonferroni correction (∗∗∗∗, p < 0.001; ∗∗∗, p < 0.05; ∗∗, p < 0.01; ∗, p < 0.05. Sample number is indicated in Figure 2E).
(G) Hierarchical clustering for the correlation among cell types based on their pattern. The red frame highlights the positive/negative correlations among stellate cells, endocrine cells, T cells, and fibroblasts.
Figure 3Estimation of cell–cell communication within PDAC using CellChat
(A) Heatmap showing the summary of the signaling pathways that contribute to outgoing or incoming communication. The color bar represents the relative signaling strength of a signaling pathway across cell types. The bars indicate the sum of the signaling strength of each cell type or pathway.
(B) Circle plots displaying putative ligand–receptor interactions, with the width of edges representing the strength of the communication.
(C) Chord diagram showing each ligand–receptor pattern and their weights in the interaction between CAFs and malignant cells.
Figure 4Re-clustering of ductal cells and their tumor-subtype analysis
(A and B) The subset of ductal cells alone was isolated and further processed using the Seurat pipeline without batch correction. UMAP color-coded according to cell type (A) and individual (B) showing high heterogeneity of malignant cell clusters.
(C) Tumor-subtype scores defined in Figure 1 plotted on UMAP showing the segmented pattern of the score.
(D–G) Pearson’s correlation analysis of the two subtypes of scores indicate weak correlation. The color coding was according to tissue type (D) or new classification (F) (E, G) Modified tumor-subtype scores are shown on the UMAP, which processed new signatures as features for PCA to classify cells according to tumor-subtype (E). Based on their score distribution (F), ductal cells were classified into five groups (G).
(H) Boxplot (25th percentile, median, and 75th percentile) of modified subtype scores aggregated for cells from each tumor sample showing their dependency on the source sample. Significance was assessed using Welch’s t-tests (∗∗∗∗, p < 0.001; ∗∗∗, p < 0.05; ∗∗, p < 0.01; ∗, p < 0.05. N = 26 to 2585 cells per patient.).
(I) Enrichment analysis of differentially expressed genes in cell clusters using Gene Ontology Biological Process. Tree plot showing the hierarchical clustering of enriched terms with word clouds.
(J and K) Tumor-subtype specific regulatory analysis conducted using SCENIC. The heatmap was colored according to the residual sum of squares (RSS) showing the specificity of the activated transcription factor (J). The representative transcript factor is plotted on the UMAP with color attributed according to regulon activity.
Figure 5Unsupervised clustering of malignant cells indicates tumor-associated features in addition to subtype-specific functions
(A and B) Unsupervised clustering of the malignant cell fraction plotted on the UMAP and color-coded according to cluster (A), tissue type, and dataset (B).
(C–E) The modified subtype score (C) and cell label defined in Figure 4E are projected on the UMAP. Violin plots showing the aggregation of subtype scores for each cluster (D).
(F) Enrichment analysis of differentially expressed genes in cell clusters using the KEGG database. Gene-Concept Network showing enriched terms of the KEGG database with nodes colored according to malignant cell clusters. The node size indicates the number of leaf nodes, and the characters show the Gene ID.
(G) Heatmap of the proportion of malignant cell clusters in each sample. The normal pancreas sample is highlighted in red.
Figure 6Re-clustering based on the fraction of cancer-associated fibroblasts and their subtype analysis
(A and B) An unsupervised clustering of the CAF fraction plotted on the UMAP, which was color-coded according to dataset (A) and tissue type (B).
(C–E) UMAP plots showing the stromal scores defined in Figure 1C, the expression of marker genes for CAF subtypes (D), and the subtype score for inflammatory/myofibroblast CAFs (E).
(F) Pearson’s correlation analysis of two CAF scores indicating a negative correlation.
(G) Circle plots displaying putative ligand–receptor interactions, with the width of edges representing the strength of the communication.
(H) Heatmap showing the summary of signaling pathways that contributed to outgoing or incoming communication. The color bar represents the relative signaling strength of a signaling pathway across cell types. The bars show the sum of signaling strength for each cell type or pathway.
(I) Sanky diagram summarizing the interaction between iCAF/myCAF-upregulated ligands and the receptor of malignant clusters. The width of the edges represents the weight value defined in NicheNet, and the color of the output of the edge represents the target malignant cluster. Bubble plot showing the expression levels in myCAFs or iCAFs. The size of each bubble represents the rate of cells expressing a gene, and the color represents the scaled average expression in their cell-type cluster.
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Surgically dissected human PDAC tissue | This paper | N/A |
| Chromium Next GEM Single Cell 3′ Kit v3.1 | 10X Genomics | Cat# PN-1000269 |
| Chromium Next GEM Chip G Single Cell Kit | 10X Genomics | Cat# PN-1000127 |
| Dual Index Kit TT Set A | 10X Genomics | Cat# PN-1000215 |
| PDAC scRNAseq (OUGS) | This paper | zenodo [10.5281/zenodo.6024273] |
| PDAC scRNAseq (PRJCA001063) | ( | zenodo [10.5281/zenodo.3969339] |
| PDAC scRNAseq (GSE155698) | ( | GEO: |
| PDAC scRNAseq (GSE111672) | ( | GEO: |
| PDAC scRNAseq (GSE154778) | ( | GEO: |
| PDAC scRNAseq (GSM4293555) | ( | GEO: |
| Upregulated gene list of TCGA-PAAD RNAseq | ( | |
| PDAC bulk RNAseq | Firehose | |
| PDAC bulk RNAseq | ICGA | |
| PDAC bulk RNAseq | ICGA | |
| PDAC bulk RNAseq | ( | ArrayExpress, E-MTAB-6830 |
| PDAC genomic mutation file | GDC via TCGAbiolinks | |
| R (v4.1.2) | R CRAN | |
| Seurat R package (v4.0.5) | ( | |
| pheatmap R package (v1.0.12) | R CRAN | |
| ComplexHeatmap R package (v2.10.0) | Bioconductor | |
| ggplot2 R package (v3.3.5) | R CRAN | |
| ktplots R package (v1.1.14) | Github | |
| survival R package (v3.2-13) | R CRAN | |
| ensembldb R package (v 2.18.1) | Bioconductor | |
| EnsDb.Hsapiens.v86 R package (v 2.99.0) | Bioconductor | |
| BisuqeRNA R package (v1.0.5) | ( | |
| TCGAbiolinks R package (v2.22.1) | ( | |
| CellChat R package (v1.1.3) | ( | |
| SCENIC R package (v1.2.4) | ( | |
| clusterProfilerR package (v4.2.0) | ( | |
| ReactomePA R package (v1.38.0) | ( | |
| DOSE R package (v3.20.0) | ( | |
| enrichplot R package (v1.14.1) | Bioconductor | |
| org.Hs.eg.db R package (v3.14.0) | Bioconductor | |
| AnnotationDbi R package (v1.56.1) | Bioconductor | |
| nichenetr R package (v1.0.0) | ( | |
| networkD3 R package (v0.4) | CRAN | |
| fastp v0.20.1 | ( | |
| STAR v2.7.9a | ( | |
| RSEM v1.3.3 | ( | |