Rui Cao1, Fan Yang2, Si-Cong Ma3, Li Liu1, Yu Zhao2,4, Yan Li5, De-Hua Wu3, Tongxin Wang6, Wei-Jia Lu2, Wei-Jing Cai7, Hong-Bo Zhu1, Xue-Jun Guo3, Yu-Wen Lu3, Jun-Jie Kuang3, Wen-Jing Huan8, Wei-Min Tang8, Kun Huang9,10, Junzhou Huang2, Jianhua Yao2, Zhong-Yi Dong3.
Abstract
Microsatellite instability (MSI) has been approved as a pan-cancer biomarker for immune checkpoint blockade (ICB) therapy. However, current MSI identification methods are not available for all patients. We proposed an ensemble multiple instance deep learning model to predict microsatellite status based on histopathology images, and interpreted the pathomics-based model with multi-omics correlation.
Keywords: colorectal cancer; ensembled patch likelihood aggregation (EPLA); microsatellite instability; multi-omics; pathomics
Year: 2020 PMID: 33042271 PMCID: PMC7532670 DOI: 10.7150/thno.49864
Source DB: PubMed Journal: Theranostics ISSN: 1838-7640 Impact factor: 11.556
Figure 1. Overview of the Ensemble Patch Likelihood Aggregation (EPLA) model. A whole slide image (WSI) of each patient was obtained and annotated to highlight the regions of carcinoma (ROIs). Patches were then tiled from the ROIs, and the MSI likelihood of each patch was predicted by ResNet-18; a heat map visualizes the patch-level predictions. Next, the PALHI and BoW pipelines each integrated the multiple patch-level MSI likelihoods into a WSI-level MSI prediction. Finally, ensemble learning combined the results of the two pipelines to make the final prediction of the MS status.
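The aggregation step in Figure 1 can be illustrated with a minimal sketch. It assumes that PALHI refers to a patch likelihood histogram (the caption does not expand the acronym), and it replaces the trained WSI-level classifiers and the ensemble learner with a plain average. The function names, the bin count, and the averaging rule are all hypothetical and are not taken from the EPLA implementation.

```python
from typing import List, Sequence


def likelihood_histogram(patch_probs: Sequence[float], n_bins: int = 10) -> List[float]:
    """Summarize patch-level MSI likelihoods as a normalized histogram.

    Assumption: this is one plausible WSI-level feature vector; the real
    pipeline would feed such features into a trained classifier.
    """
    if not patch_probs:
        raise ValueError("at least one patch likelihood is required")
    counts = [0] * n_bins
    for p in patch_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("likelihoods must lie in [0, 1]")
        # A likelihood of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(patch_probs)
    return [c / total for c in counts]


def ensemble_average(wsi_scores: Sequence[float]) -> float:
    """Combine WSI-level scores from several pipelines by simple averaging.

    Assumption: averaging stands in for the ensemble learner, whose exact
    form is not given in the caption.
    """
    if not wsi_scores:
        raise ValueError("at least one pipeline score is required")
    return sum(wsi_scores) / len(wsi_scores)
```

For example, `likelihood_histogram([0.05, 0.15, 0.95, 1.0])` places one patch in each of the first two bins and two patches in the last bin, and `ensemble_average([0.8, 0.6])` averages two hypothetical pipeline outputs into one WSI-level score.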
Figure 2. Validation of the EPLA and comparison with DL-based MV in the TCGA cohort. (A) Representative heat maps of MSI and MSS cases at the patch-level prediction stage. Color bars show the MSI likelihood of each patch. (B) Receiver operating characteristic (ROC) curve of EPLA. The P value was calculated by the Wald test. (C) Summary of EPLA and DL-based MV. DL-based MV was re-implemented from a voting-based model in Ref. 20. The last row of the table summarizes the performance of the original DL-based MV model. (D) Correlation of the degree of differentiation with EPLA-predicted MS status and MSIsensor score. DL-based MV, deep learning-based majority voting; EPLA, Ensemble Patch Likelihood Aggregation. Significance values: *** P < 0.001.
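For contrast with EPLA, a majority-voting aggregation of patch-level likelihoods can be sketched as follows. This is only an illustration of the general idea: the 0.5 threshold, the tie-breaking rule, and the function name are assumptions, and the exact voting rule used in Ref. 20 is not described here.

```python
from typing import Sequence


def majority_vote(patch_probs: Sequence[float], threshold: float = 0.5) -> str:
    """Label a slide MSI if more than half of its patches vote MSI.

    Assumptions: a patch votes MSI when its likelihood is >= threshold,
    and ties are resolved as MSS.
    """
    if not patch_probs:
        raise ValueError("at least one patch likelihood is required")
    msi_votes = sum(1 for p in patch_probs if p >= threshold)
    return "MSI" if 2 * msi_votes > len(patch_probs) else "MSS"
```

Unlike a histogram-style summary, this rule discards how confident each patch prediction is and keeps only the vote counts.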
Figure 3. Generalization performance of the EPLA in an Asian cohort. (A) Summary of the performance of EPLA in Asian-CRC with or without transfer learning. When using transfer learning, 10% of cases from Asian-CRC were used for model fine-tuning. (B) Receiver operating characteristic (ROC) curve of EPLA in the Asian-CRC after transfer learning. (C) ROC AUCs of the model in Asian-CRC with increasing proportions of cases for transfer learning. EPLA, Ensemble Patch Likelihood Aggregation; CRC, colorectal cancer.
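The ROC AUC reported in these panels equals the probability that a randomly chosen MSI case receives a higher score than a randomly chosen MSS case. A minimal sketch of that pairwise definition is given below; the function name and the 1/0 label encoding are assumptions made for illustration, and the authors' evaluation code is not shown in this record.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Compute ROC AUC from pairwise comparisons of positive and negative scores.

    Assumption: labels are 1 for MSI and 0 for MSS. Tied scores count as half
    a win, which matches the usual trapezoidal AUC.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The double loop is quadratic in the number of cases, which is acceptable for cohort-sized data; a rank-based formula would be preferable for very large sets.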
Figure 4. Identification and genomic correlation analysis of top pathological signatures. (A) Importance ranking of the top ten pathological signatures extracted from EPLA. (B) Boxplots of the five pathological signatures between MSI and MSS groups. Significance values: **** P < 0.0001. (C) Heat map with unsupervised clustering showing the correlation between genomic landscape and top pathological signatures in each patient. Each column corresponds to a patient in the TCGA-COAD cohort. All continuous variables are normalized to a range of 0 to 1. EPLA, Ensemble Patch Likelihood Aggregation; FEA, feature; INDEL, insertion-deletion; TMB, tumor mutation burden; MMR, mismatch repair; DDR, DNA damage response and repair; HRD, homologous recombination deficiency.
Figure 5. Correlation of top pathological signatures with WGCNA-identified modules and anti-tumor immunity. (A) Weighted gene co-expression network analysis (WGCNA) based on gene expression data identified gene modules with highly synergistic changes. The biological functions of these modules were annotated using Gene Ontology (GO) analyses. (B) Heat map of correlation coefficients (corresponding P values in brackets) for each pair of annotated modules and top pathological signatures. (C) Significantly enriched GO terms of ME8, ME12 and ME13. The dotted line indicates the level with an adjusted P value of 0.05. (D, E) Correlation of cytolytic activity (CYT) (D) and CD8+ T-effector genes (E) with MS status and top pathological signatures. The heat maps show Spearman's rank correlation coefficients, where a transition from red to blue represents positive to negative correlations. Significance values in boxplots: **** P < 0.0001.
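Spearman's rank correlation, used in the heat maps of panels D and E, is the Pearson correlation computed on the ranks of the two variables. The sketch below shows that definition with tied values given their average rank; the helper names are hypothetical, and the analysis software the authors used is not stated in this record.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, any monotonic relationship between a pathological signature and an immune score yields a coefficient of magnitude 1, whether or not the relationship is linear.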