Karpiński Paweł, Sąsiadek Maria Małgorzata.
Abstract
The CpG island methylator phenotype (CIMP) can be regarded as the most notable emanation of epigenetic instability in cancer. Since its discovery in the late 1990s, CIMP has been extensively studied, mainly in colorectal cancers (CRC) and gliomas. Consequently, knowledge on molecular and pathological characteristics of CIMP in CRC and other tumour types has rapidly expanded. Concordant and widespread hypermethylation of multiple CpG islands observed in CIMP in multiple cancers raised hopes for future epigenetically based diagnostics and treatments of solid tumours. However, studies on CIMP in solid tumours were hampered by a lack of generalisability and reproducibility of epigenetic markers. Moreover, CIMP was not a satisfactory marker in predicting clinical outcomes. The idea of targeting epigenetic abnormalities such as CIMP for cancer therapy has not been implemented for solid tumours, either. Twenty-one years after its discovery, we aim to cover both the fundamental and new aspects of CIMP and its future application as a diagnostic marker and target in anticancer therapies.
Keywords: CIMP; DNA methylation; epigenetics; methylator phenotype
Year: 2022 PMID: 35055016 PMCID: PMC8777692 DOI: 10.3390/ijms23020830
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Graph depicting the number of PubMed articles published on CIMP each year between 1999 and 2020. We used the RISmed R package for analysis and “methylator phenotype OR CIMP” as the search term.
Figure 2. A heatmap representing the result of clustering of cancer-specific CGI methylation in gastric cancer based on TCGA Illumina 450K array data. Note the existence of two different clusters with high levels of CGI methylation: MSI-associated (MSI-CIMP) and EBV-associated (EBV-CIMP). The heatmap was generated based on unsupervised clustering (COMMUNAL R package) of CpG probes located in the proximity (±500 bp) of transcription start sites (TSS) that displayed high cancer-specific methylation [79]. The extraction of CGI methylation, together with CIMP assignments, was described previously in our work [80].
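The ±500 bp TSS window mentioned in the caption can be illustrated with a minimal Python sketch. It is not the authors' R workflow: the function name `probes_near_tss`, the input layout and the toy coordinates are all hypothetical, and only the 500 bp window comes from the caption.

```python
from bisect import bisect_left

def probes_near_tss(probe_pos, tss_pos, window=500):
    """Return IDs of probes within `window` bp of any TSS on the same chromosome.

    probe_pos: dict probe_id -> (chromosome, position)   (hypothetical layout)
    tss_pos:   dict chromosome -> list of TSS positions  (hypothetical layout)
    """
    sorted_tss = {chrom: sorted(sites) for chrom, sites in tss_pos.items()}
    selected = []
    for probe_id, (chrom, pos) in probe_pos.items():
        sites = sorted_tss.get(chrom, [])
        i = bisect_left(sites, pos)
        # The nearest TSS sits immediately left or right of the insertion point.
        neighbours = sites[max(i - 1, 0):i + 1]
        if any(abs(pos - site) <= window for site in neighbours):
            selected.append(probe_id)
    return selected

# Toy coordinates, invented for illustration only.
probes = {"cg_a": ("chr1", 1_200), "cg_b": ("chr1", 5_000), "cg_c": ("chr2", 9_600)}
tss = {"chr1": [1_000, 20_000], "chr2": [10_000]}
near = probes_near_tss(probes, tss)
```

Here `cg_a` and `cg_c` lie 200 bp and 400 bp from a TSS and are kept, while `cg_b` is 4000 bp from the nearest TSS and is dropped.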
Data on methodological approaches used to derive methylation clusters in various TCGA publications. The table summarizes 27 TCGA studies on 29 cancer types.
| Cancer | Abbreviation | Illumina Platform | CIMP or Highly Methylated Cluster | Probes Selection | No. of Probes Selected | Tumor Purity Addressed | Clustering | Clustering Method | Ref. | Publication Year |
|---|---|---|---|---|---|---|---|---|---|---|
| Breast adenocarcinoma | BRCA | 27K and 450K | Yes | most variable (highest standard deviations) | 574 | no | NA | RPMM clustering | [ | 2012 |
| Prostate adenocarcinoma | PRAD | 450K | Yes | cancer specific hypermethylation (beta value > 0.3) | 5000 | yes | binary distance clustering/Ward’s method for linkage | hierarchical clustering | [ | 2015 |
| Bladder urothelial carcinoma | BLCA | 450K | Yes | cancer specific/promoter and CpG island associated (beta value > 0.3) | 31,249 | yes | binary distance clustering/Ward’s method for linkage | hierarchical clustering | [ | 2014 |
| Ovarian serous cystadenocarcinoma | OV | 27K | Not reported | most variable (highest standard deviations) | 858 | no | K-means clustering/Euclidean distance | consensus clustering | [ | 2011 |
| Colorectal adenocarcinoma | COAD and READ | 27K | Yes | most variable (highest standard deviations) | 2758 | no | NA | RPMM clustering | [ | 2012 |
| Lung adenocarcinoma | LUAD | 27K and 450K | Yes | promoter and CpG island associated/1.0% of most variable | not provided | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2014 |
| Lung squamous cell carcinoma | LUSC | 27K and 450K | Not reported | most variable (highest standard deviations) | 8228 | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2012 |
| Uterine corpus endometrial carcinoma | UCEC | 27K and 450K | Yes | most variable (highest standard deviations) | 785 | no | NA | RPMM clustering | [ | 2013 |
| Acute Myeloid Leukemia | AML | 450K | Yes | most variable (highest standard deviations) | 1000 | no | Euclidean distance clustering/Ward’s method for linkage | hierarchical clustering | [ | 2013 |
| Glioblastoma | GBM | GoldenGate and 27K | Yes | most variable (highest standard deviations) | 370 | no | K-means clustering/Euclidean distance | consensus clustering | [ | 2008 |
| Stomach adenocarcinoma | STAD | 27K and 450K | Yes | cancer specific hypermethylation | 1375 | yes | binary distance clustering/Ward’s method for linkage | consensus clustering | [ | 2014 |
| Thyroid carcinoma | THCA | 450K | Yes | most variable 1.0% of probes | not provided | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2014 |
| Head and neck squamous cell carcinoma | HNSC | 450K | Yes | most variable 1.0% of probes | not provided | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2015 |
| Skin cutaneous melanoma | SKCM | 450K | Yes | most variable 1.0% of probes | not provided | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2015 |
| Brain lower grade glioma | LGG | 450K | Yes | cancer specific hypermethylation (beta value > 0.3) | 11,977 | yes | binary distance clustering/Ward’s method for linkage | hierarchical clustering | [ | 2015 |
| Kidney renal papillary cell carcinoma | KIRP | 27K and 450K | Yes | cancer specific hypermethylation/most variable | not provided | no | Ward’s method | hierarchical clustering | [ | 2016 |
| Adrenocortical carcinoma | ACC | 450K | Yes | promoter and CpG island associated/1.0% of most variable | not provided | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2016 |
| Uterine carcinosarcoma | UCS | 450K | Yes | cancer specific hypermethylation/most variable | 5000 | no | Ward’s method for linkage | hierarchical clustering | [ | 2017 |
| Cholangiocarcinoma | CHOL | 450K | Yes | cancer specific hypermethylation (beta value > 0.3) | 37,743 | yes | binary distance clustering/Ward’s method for linkage | hierarchical clustering | [ | 2017 |
| Cervical cancer | CESC | 450K | Yes | promoter and CpG island associated/1.0% of most variable | not provided | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2017 |
| Hepatocellular carcinoma | HCC | 450K | Yes | cancer specific hypermethylation (beta value > 0.3) | 37,848 | yes | binary distance clustering/Ward’s method for linkage | hierarchical clustering | [ | 2017 |
| Uveal melanoma | UVM | 450K | Yes | most variable 1.0% of probes | 3859 | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2017 |
| Pancreatic adenocarcinoma | PAAD | 450K | Yes | cancer specific hypermethylation (beta value > 0.25) | 31,956 | yes | binary distance clustering/Ward’s method for linkage | consensus clustering | [ | 2017 |
| Sarcoma | SARC | 450K | Yes | most variable 1.0% of probes | not provided | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2017 |
| Thymoma | THYM | 450K | Not reported | most variable 1.0% of probes | not provided | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2018 |
| Testicular cancer | TGCT | 450K | Yes | most variable (standard deviation >= 0.26) | 9614 | yes | Euclidean distance clustering/Ward’s method for linkage | hierarchical clustering | [ | 2018 |
| Malignant pleural mesothelioma | MPM | 450K | Not reported | most variable 1.0% of probes | not provided | no | PAM clustering/Euclidean distance | consensus clustering | [ | 2018 |
| Oesophageal carcinoma | ESCA | 450K | Yes | cancer specific hypermethylation (beta value > 0.25) | not provided | yes | binary distance clustering/Ward’s method for linkage | consensus clustering | [ | 2017 |
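The two probe-selection strategies that recur in the table, "most variable" and "cancer specific hypermethylation", can be contrasted with a small Python sketch. It is only an illustration and not taken from any TCGA pipeline. The function names and toy beta values are hypothetical. The 0.3 tumour threshold is quoted in the table, whereas the normal-tissue cutoff (0.2) and the minimum fraction of tumours (0.1) are assumptions introduced here.

```python
from statistics import pstdev

def most_variable_probes(tumour_beta, n_top):
    """Keep the n_top probes with the highest standard deviation across tumours."""
    ranked = sorted(tumour_beta, key=lambda p: pstdev(tumour_beta[p]), reverse=True)
    return ranked[:n_top]

def cancer_specific_probes(tumour_beta, normal_beta,
                           tumour_cut=0.3,     # beta threshold quoted in the table
                           normal_cut=0.2,     # ASSUMPTION: "unmethylated in normal" cutoff
                           min_fraction=0.1):  # ASSUMPTION: share of tumours above tumour_cut
    """Keep probes unmethylated in all normals but methylated in enough tumours."""
    selected = []
    for probe, values in tumour_beta.items():
        unmethylated_in_normal = all(b < normal_cut for b in normal_beta[probe])
        fraction_high = sum(b > tumour_cut for b in values) / len(values)
        if unmethylated_in_normal and fraction_high >= min_fraction:
            selected.append(probe)
    return selected

# Toy beta values, invented for illustration only.
tumour = {
    "cg1": [0.05, 0.06, 0.04, 0.05],  # unmethylated everywhere
    "cg2": [0.10, 0.80, 0.75, 0.12],  # hypermethylated in a subset of tumours
    "cg3": [0.90, 0.88, 0.91, 0.35],  # variable, but already methylated in normal tissue
}
normal = {"cg1": [0.04, 0.05], "cg2": [0.08, 0.10], "cg3": [0.85, 0.90]}

top_variable = most_variable_probes(tumour, n_top=2)
specific = cancer_specific_probes(tumour, normal)
```

In this toy input, `cg3` passes the variability filter but fails the cancer-specific filter, which shows how the two strategies can feed different probe sets into the same clustering step.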
Figure 3. Alluvial diagrams illustrating the influence of the choice of variable selection strategy (A) or clustering algorithm (B) on CIMP-positive assignment in colon cancer (TCGA dataset). In all cases, we used the consensus clustering approach with Euclidean distance and average linkage [120]. Consensus clustering was run using 80% sample resampling, a maximum evaluated k of 8 and 1000 resamplings. (A) Samples were clustered by partitioning around medoids (pam). The input was either the 1500, 2500 or 3500 most variable Illumina 450K probes, or the probes located in CGI that displayed cancer-specific methylation. CIMP-positive tumours identified by the last approach are represented with pink alluvia. Note changes in the flow of CIMP-positive samples from left to right. (B) For clustering, probes located in CGI that displayed cancer-specific methylation were selected. Samples were clustered by hierarchical clustering (hc), k-means (km) or partitioning around medoids (pam). CIMP-positive tumours identified by the last approach are represented with pink alluvia. Note changes in the flow of CIMP-positive samples from left to right.
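The resampling idea behind consensus clustering can be sketched in a few lines of Python. This is a simplified illustration, not the implementation cited as [120]. It uses a plain k-means as the base algorithm, fixes k at 2 instead of scanning up to 8, and runs 200 instead of 1000 resamplings. All function names and the toy data are hypothetical. Only the 80% resampling fraction is taken from the caption.

```python
import random
from itertools import combinations

def kmeans(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centres = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centres[c])))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centre
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def consensus_matrix(data, k, n_resamples=200, fraction=0.8, seed=0):
    """Fraction of resamples in which two samples co-cluster, given both were drawn."""
    rng = random.Random(seed)
    n = len(data)
    subsample_size = int(fraction * n)
    together = [[0] * n for _ in range(n)]
    co_sampled = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        idx = rng.sample(range(n), subsample_size)
        labels = kmeans([data[i] for i in idx], k, rng)
        for (a, lab_a), (b, lab_b) in combinations(zip(idx, labels), 2):
            co_sampled[a][b] += 1
            co_sampled[b][a] += 1
            if lab_a == lab_b:
                together[a][b] += 1
                together[b][a] += 1
    consensus = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in combinations(range(n), 2):
        if co_sampled[i][j]:
            consensus[i][j] = consensus[j][i] = together[i][j] / co_sampled[i][j]
    return consensus

# Toy data: five low-methylation and five high-methylation samples, three probes each.
low = [0.05, 0.08, 0.10, 0.12, 0.15]
high = [0.75, 0.80, 0.82, 0.85, 0.90]
data = [[v, v, v] for v in low + high]
cm = consensus_matrix(data, k=2)
```

With groups this well separated, samples from the same group always co-cluster and samples from different groups never do. On real methylation data the consensus values fall between 0 and 1, and they shift with the probe set and the base algorithm, which is the effect the alluvial diagrams visualise.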
Figure 4. Box plots illustrating differences in average methylation calculated for Illumina 450K array probes located in CGI for CIMP-positive tumours between 23 cancer types. Box plots were ordered in terms of increasing mean CGI methylation from bottom to top. The extraction of CGI methylation, together with CIMP assignments, was described previously in our work [80].
Figure 5. A density plot illustrating increased epigenetic drift (elevated average CGI hypermethylation and decreased average backbone methylation) in CIMP-positive tumours when compared to CIMP-negative tumours and normal adjacent tissue. To design this plot, we downloaded and normalised TCGA methylation data obtained from oesophageal adenocarcinomas, gastric carcinomas and colorectal cancers (819 samples in total). The extraction of CGI methylation and backbone methylation, together with CIMP assignments, was described previously in our work [80]. The colour scale reflects regions with a high sample density (red) and a low sample density (blue). The plot was generated using the MASS R package.
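The two axes of such a plot can be computed per sample as in the following Python sketch. It assumes that "backbone" refers to probes outside CpG islands. The exact definition is given in [80] and may differ. The function name, probe IDs and beta values are hypothetical, and the density estimation step itself is not reproduced here.

```python
from statistics import mean

def drift_coordinates(sample_beta, cgi_probes):
    """Return (mean CGI beta, mean backbone beta) for one sample.

    ASSUMPTION: every probe not listed in cgi_probes counts as backbone.
    """
    cgi = [b for probe, b in sample_beta.items() if probe in cgi_probes]
    backbone = [b for probe, b in sample_beta.items() if probe not in cgi_probes]
    return mean(cgi), mean(backbone)

# Toy values, invented for illustration only.
cgi_probes = {"cg1", "cg2"}
normal_sample = {"cg1": 0.05, "cg2": 0.10, "cg3": 0.85, "cg4": 0.90}
cimp_sample = {"cg1": 0.60, "cg2": 0.70, "cg3": 0.55, "cg4": 0.65}

normal_cgi, normal_backbone = drift_coordinates(normal_sample, cgi_probes)
cimp_cgi, cimp_backbone = drift_coordinates(cimp_sample, cgi_probes)
```

In this toy case the CIMP-like sample shows higher mean CGI methylation and lower mean backbone methylation than the normal sample, the direction of drift described in the caption.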