Yelin Fu1, Lishuang Qi1, Wenbing Guo1, Liangliang Jin1, Kai Song1, Tianyi You1, Shuobo Zhang1, Yunyan Gu1, Wenyuan Zhao2, Zheng Guo3,4,5.
Abstract
BACKGROUND: Microsatellite instability (MSI) accounts for about 15% of colorectal cancer and is associated with prognosis. Today, MSI is usually detected by polymerase chain reaction amplification of specific microsatellite markers. However, the instability is identified by comparing the length of microsatellite repeats in tumor and normal samples. In this work, we developed a qualitative transcriptional signature to individually predict MSI status for right-sided colon cancer (RCC) based on tumor samples.
Keywords: Gene expression profiles; Microsatellite instability status; Qualitative transcriptional signature; Relative gene expression orderings; Right-sided colon cancer
Year: 2019 PMID: 31646964 PMCID: PMC6813057 DOI: 10.1186/s12864-019-6129-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 The flowchart of this study, as exemplified by the development and validation of the signature for predicting MSI status in patients with RCC (see Methods)
The Composition of 10-GPS
| signature | gene1 | gene2 | signature | gene1 | gene2 |
|---|---|---|---|---|---|
| pair1 | | | pair6 | | |
| pair2 | | | pair7 | | |
| pair3 | | | pair8 | | |
| pair4 | | | pair9 | | |
| pair5 | | | pair10 | | |
Notes: An RCC sample was classified as MSI if the REOs (gene1 > gene2) of at least 7 of the 10 gene pairs in the 10-GPS voted for MSI; otherwise it was classified as MSS
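The voting rule in the note above can be illustrated with a minimal sketch. The gene names of the 10-GPS are not given in this record, so the identifiers `G1A`, `G1B`, and so on are hypothetical placeholders, as are the function name `classify_msi` and its `min_votes` parameter. The sketch assumes each pair votes for MSI when the expression of gene1 exceeds that of gene2 within a single sample.

```python
from typing import List, Mapping, Tuple

# Hypothetical placeholder pairs (gene1, gene2); the real 10-GPS genes
# are not listed in this record.
PAIRS: List[Tuple[str, str]] = [(f"G{i}A", f"G{i}B") for i in range(1, 11)]


def classify_msi(expr: Mapping[str, float],
                 pairs: List[Tuple[str, str]] = PAIRS,
                 min_votes: int = 7) -> str:
    """Return 'MSI' if at least `min_votes` pairs have gene1 > gene2, else 'MSS'."""
    votes = sum(1 for g1, g2 in pairs if expr[g1] > expr[g2])
    return "MSI" if votes >= min_votes else "MSS"


# Toy sample: the first 7 pairs satisfy gene1 > gene2, the last 3 do not.
sample = {}
for i, (g1, g2) in enumerate(PAIRS):
    sample[g1] = 5.0 if i < 7 else 1.0
    sample[g2] = 2.0

print(classify_msi(sample))  # 7 of 10 pairs vote for MSI
```

Because the rule depends only on the within-sample ordering of each pair, it can be applied to one tumor sample at a time without a matched normal sample.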
Fig. 2 The ROC curves for the 10-GPS in the training dataset and the four independent datasets. a the training dataset. b the RCCs of GSE39084. c the RCCs of GSE18088. d the RCCs of GSE75317. e the RCCs of TCGA
The performances of the 10-GPS in RCCs of the independent datasets
| Dataset | pre-MSIa (MSI:MSS)b | pre-MSSa (MSI:MSS)b | sensitivity | specificity | F-score |
|---|---|---|---|---|---|
| GSE39084_R | 13 (13:0) | 18 (0:18) | 1 | 1 | 1 |
| GSE18088_R | 13 (13:0) | 15 (1:14) | 0.9286 | 1 | 0.9630 |
| GSE75317_R | 8 (8:0) | 18 (1:17) | 0.8889 | 1 | 0.9412 |
| TCGA_R | 15 (9:6) | 38 (1:37) | 0.900 | 0.8605 | 0.8798 |
| Total_RCCs | 49 (43:6) | 89 (3:86) | 0.9348 | 0.9348 | 0.9348 |
Notes: a, MSI status predicted by the 10-GPS; b, original MSI status; the suffix _R denotes the RCC samples of a dataset; Total_RCCs denotes the RCC samples of all four datasets combined
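The table values can be checked from the counts in each row. The sketch below assumes that pre-MSI (MSI:MSS) gives true positives and false positives, that pre-MSS (MSI:MSS) gives false negatives and true negatives, and that the F-score is the harmonic mean of sensitivity and specificity. That last definition is inferred from the reported numbers, not stated in this record, and the function name `performance` is illustrative.

```python
from typing import Tuple


def performance(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, F-score) from confusion counts.

    F-score is taken here as the harmonic mean of sensitivity and
    specificity (an assumption inferred from the table values).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f_score = 2 * sensitivity * specificity / (sensitivity + specificity)
    return sensitivity, specificity, f_score


# TCGA_R row: pre-MSI 15 (9:6), pre-MSS 38 (1:37)
sens, spec, f = performance(tp=9, fp=6, fn=1, tn=37)
print(round(sens, 4), round(spec, 4), round(f, 4))  # 0.9 0.8605 0.8798
```

The same call with the Total_RCCs counts (43, 6, 3, 86) gives 0.9348 for all three measures, matching the last row.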
Fig. 3 Complete-linkage hierarchical clustering of the RCC samples in (a) the training dataset, (b) GSE18088 and (c) GSE75317, based on the genes differentially expressed between the signature-confirmed MSI and MSS samples. In the label X -> Y, X represents the original MSI status and Y represents the MSI status reclassified by the 10-GPS
The molecular characteristics of the five signature-disconfirmed RCC samples in the training dataset
| Original MSI status | Predicted MSI status | KRAS | BRAF | CIMP |
|---|---|---|---|---|
| MSS | MSI | wild type | NA | NA |
| MSS | MSI | wild type | mutation | + |
| MSS | MSI | wild type | mutation | + |
| MSI | MSS | wild type | mutation | + |
| MSI | MSS | NA | wild type | – |
The datasets analyzed in this study from GEO and TCGA
| Characteristic | GSE39582 | GSE39084 | GSE18088 | GSE75317 | GSE13067 | GSE13294 | TCGA |
|---|---|---|---|---|---|---|---|
| Stage | | | | | | | |
| I | 33 | 8 | – | 6 | – | – | 75 |
| II | 264 | 23 | 53 | 24 | – | – | 178 |
| III | 205 | 16 | – | 17 | – | – | 130 |
| IV | 60 | 22 | – | 12 | – | – | 64 |
| Microsatellite status | | | | | | | |
| MSI | 75 | 16 | 19 | 11 | 11 | 78 | 11 |
| MSS | 444 | 54 | 34 | 48 | 63 | 77 | 81 |
| Location | | | | | | | |
| Right | 224 | 31 | 28 | 26 | – | – | 261 |
| Left | 342 | 30 | 25 | 33 | – | – | 177 |
| MSI_proportion | | | | | | | |
| Right | 57:154a (27.0%)b | 13:18 (41.9%) | 14:14 (50.0%) | 9:17 (34.6%) | – | – | 10:43 (18.9%) |
| Left | 18:290c (5.8%)d | 3:27 (10.0%) | 5:20 (20.0%) | 2:31 (6.1%) | – | – | 1:33 (2.9%) |
| MSI detection | PCR | PCR or IHC | PCR | PCR | PCR | PCR | PCR |
| Adjuvant chemotherapy | | | | | | | |
| Yes | 233 | – | – | – | – | – | – |
| No | 316 | – | 53 | – | – | – | – |
Notes: The GEO datasets were all produced on the same gene expression profiling platform (GPL570, Affy-HG-U133_Plus_2). a, numbers of MSI and MSS samples among RCCs; b, proportion of MSI among RCCs; c, numbers of MSI and MSS samples among left-sided colon cancers (LCCs); d, proportion of MSI among LCCs
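As a quick check of the MSI_proportion rows, each percentage is the MSI count divided by the sum of the MSI and MSS counts for that tumor location. A minimal sketch, with the illustrative function name `msi_proportion`:

```python
def msi_proportion(n_msi: int, n_mss: int) -> float:
    """Percentage of MSI samples among samples with known MSI status."""
    return 100.0 * n_msi / (n_msi + n_mss)


# GSE39582: right-sided 57:154, left-sided 18:290
print(f"{msi_proportion(57, 154):.1f}%")  # 27.0%
print(f"{msi_proportion(18, 290):.1f}%")  # 5.8%
```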