Stage-differentiated ensemble modeling of DNA methylation landscapes uncovers salient biomarkers and prognostic signatures in colorectal cancer progression.

Sangeetha Muthamilselvan, Abirami Raghavendran, Ashok Palaniappan.

Abstract

BACKGROUND: Aberrant DNA methylation acts epigenetically to skew the gene transcription rate up or down, contributing to cancer etiology. A gap in our understanding concerns the epigenomics of stagewise cancer progression. In this study, we have developed a comprehensive computational framework for the stage-differentiated modelling of DNA methylation landscapes in colorectal cancer (CRC).
METHODS: The methylation β-matrix was derived from the public-domain TCGA data, converted into M-value matrix, annotated with AJCC stages, and analysed for stage-salient genes using an ensemble of approaches involving stage-differentiated modelling of methylation patterns and/or expression patterns. Differentially methylated genes (DMGs) were identified using a contrast against controls (adjusted p-value <0.001 and |log fold-change of M-value| >2), and then filtered using a series of all possible pairwise stage contrasts (p-value <0.05) to obtain stage-salient DMGs. These were then subjected to a consensus analysis, followed by matching with clinical data and performing Kaplan-Meier survival analysis to evaluate the impact of methylation patterns of consensus stage-salient biomarkers on disease prognosis.
RESULTS: We found significant genome-wide changes in methylation patterns in cancer cases relative to controls agnostic of stage. The stage-differentiated models yielded the following consensus salient genes: one stage-I gene (FBN1), one stage-II gene (FOXG1), one stage-III gene (HCN1) and four stage-IV genes (NELL1, ZNF135, FAM123A, LAMA1). All the biomarkers were significantly hypermethylated in the promoter regions, indicating down-regulation of expression and implying a putative CpG island Methylator Phenotype (CIMP) manifestation. A prognostic signature consisting of FBN1 and FOXG1 survived all the analytical filters, and represents a novel early-stage epigenetic biomarker / target.
CONCLUSIONS: We have designed and executed a workflow for stage-differentiated epigenomic analysis of colorectal cancer progression, and identified several stage-salient diagnostic biomarkers, and an early-stage prognostic biomarker panel. The study has led to the discovery of an alternative CIMP-like signature in colorectal cancer, reinforcing the role of CIMP drivers in tumor pathophysiology.

Year:  2022        PMID: 35202405      PMCID: PMC8870460          DOI: 10.1371/journal.pone.0249151

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Colorectal adenocarcinoma (CRC) is a major malignant disease with devastating incidence and mortality, being the cancer with the third highest global burden of disease, after lung and breast cancers, and accounting for 1.36 million new cases annually [1]. The etiology of CRC involves chromosomal instability (involving accumulation of mutations in oncogenes and tumor suppressor genes), microsatellite instability (MSI) (leading to loss of DNA mismatch repair) and CpG island methylator phenotype (CIMP), observed in nearly 85%, 15% and 10–40% respectively of all reported sporadic cases [2-4]. Epigenetic dysregulation is a key driver of these processes, and DNA methylation is the most important epigenetic modification [5, 6].

DNA hypomethylation could cause gain-of-function of oncogenes [7], and might aid severe tumor progression [8]. It has been found that large hypomethylation blocks are a universal characteristic of colorectal cancers and other solid tumors [9]. Hypomethylation could also contribute to tumor initiation and progression by a general increase in genomic instability [10]. DNA hypermethylation could cause loss-of-function of tumor suppressor genes, and hypermethylation in the germline could cause heritable loss of gene expression through genomic imprinting [11]. Aberrant hypermethylation of specific CpG islands has been observed to occur in colorectal cancer. The CpG island methylator phenotype (CIMP) was originally discovered in a subset of colorectal cancers [12], and subsequently refined to the involvement of five genes: CACNA1G, IGF2, NEUROG1, RUNX3, and SOCS1 [13]. Methylation changes contributing to phenotypic aberrations need not be localized to promoter regions but could occur in the gene coding regions and intron-exon structures [14-17]. The persistence of such modifications throughout the tumor cell lifetime has also been demonstrated by Lengauer et al. [18], who showed that methylation aberrations and genome instability were correlated, suggesting a key role for such aberrations in tumorigenic chromosomal segregation processes.

The Cancer Genome Atlas (TCGA) is a comprehensive resource of genome-wide mutation, expression and DNA methylation profiles of 46 different types of cancers [19]. Besides the TCGA, the International Human Epigenetic Consortium is devoted to data-driven understanding of the role of epigenomics in normal vs disease states [20]. Methylation patterns constitute an emerging class of promising prognostic factors mainly due to: (i) the persistence of widespread DNA methylation changes; (ii) the occurrence of such changes much ahead of the consequent changes in gene expression; and (iii) the ability to detect these changes in body fluids and blood plasma [21]. Few methylation markers have been previously translated to clinically applicable biomarkers [22], but it is known that tumor behavior corresponds with epigenomic changes as reflected in differential DNA methylation [23].

Early detection may reduce the mortality rate via tailored adjustments to the treatment regimen, with the result of fewer side-effects and better patient compliance. Chen et al. demonstrated a method to screen multiple types of cancer using a methylation-based blood test four years before conventional diagnosis [24]. A consensus approach to identifying significant methylation signatures in each stage of colorectal cancer progression would increase the utility and reliability of putative biomarkers. This motivated our interest in a systematic investigation of stage-salient epigenetic factors using several model-driven approaches, with the main objective of obtaining diagnostic and prognostic biomarkers.

Methods

Data preprocessing

Methylation data from 27k assays was used, since it is preferentially enriched in epigenetic profiles in the proximal promoter regions (relative to 450k assays which are enriched in probes in the gene body and intergenic regions) [25]. Processed Level-3 27k CRC methylation data was retrieved from TCGA [26]. All samples in the dataset were processed and submitted by a single organization (namely 05: JHU_USC center), ensuring uniformity in data processing. MBatch analysis yielded low (<0.3) Dispersion Separability Criterion (measured as the ratio of between-batch dispersion to within-batch dispersion), indicating negligible batch effects and obviating the need for batch-correction (https://bioinformatics.mdanderson.org/public-software/mbatch/).

The data containing the methylation β-values for each probe in each sample was converted into a matrix with probes as rows and cases as columns. Each probe corresponds to one CpG site in the genome. A single gene may be under the control of multiple epigenetic sites, hence multiple probes are usually associated with the same gene. The probes which have ‘na’ values were discarded from the analysis. To transform the range of methylation values from (0,1) to (-∞,+∞), we used the following function on the β-matrix values, to obtain the M-value matrix [27]:

$$M = \log_2\left(\frac{\beta}{1-\beta}\right)$$

In our study, two M-value matrices were considered: one, where all the probes were used in the analysis; and two, where the probes corresponding to one gene were represented by an average of their values (‘avereps’), thus reducing the M-value matrix from a probe:sample matrix to a gene:sample matrix. Further, we filtered out the probes/genes showing little change in methylation (defined as σ < 1) across all cases in the M-value matrices. The latest clinical data (clinical.cases_selected.tar.gz) was obtained from the GDC by matching on the patient barcode [28]. The stages were annotated for both the β-matrix and M-value matrices using the ‘Pathologic_stage’ attribute encoded in the clinical data. Cases with unknown stage (‘NA’ values) were discarded. The stage information was mapped to the American Joint Committee on Cancer (AJCC) Tumor-Node-Metastasis (TNM) classification system [29] (Table 1).
Table 1

AJCC cancer staging.

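TCGA Stage | TNM Classification | Cases | Cases (aggregated to parent stage)
I | T1N0M0 | 50 |
II | - | 17 | 86
IIa | T3N0M0 | 64 |
IIb | T4aN0M0 | 5 |
III | - | 16 | 60
IIIa | T1-T2N1/NcM0; T1N2aM0 | 3 |
IIIb | T3-T4aN1/NcM0; T2-T3N2aM0; T1-T2N2bM0 | 21 |
IIIc | T4aN2aM0; T3-T4bN2bM0; T4bN1-N2M0 | 20 |
IV | - | 35 | 36
IVa | Any-T Any-N M1a | 1 |
CONTROL | - | 42 |
NA | - | 1 |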

The correspondence between the AJCC staging and the TCGA staging for COADREAD is noted. ‘NA’ denotes cases where the stage information is unavailable. Sample sizes are successively aggregated to the parent stage.

The final β and M-value matrices were subjected to stage-differentiated contrast analysis with a battery of six different methods, described below. All analysis was carried out on R [30].
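The preprocessing steps described in this section (β-to-M conversion, per-gene averaging of probes, and the σ filter) can be sketched as follows. The study itself used R; this is only an illustrative Python sketch on toy data. It assumes the standard logit transform M = log2(β / (1 − β)) and the sample standard deviation for σ, and all function names and probe/gene identifiers are hypothetical.

```python
import math
from statistics import mean, stdev

def beta_to_m(beta):
    """Map a beta value in (0, 1) to an M-value on (-inf, +inf)."""
    return math.log2(beta / (1.0 - beta))

def average_probes_per_gene(m_rows, probe_to_gene):
    """Collapse a probe:sample matrix to gene:sample by averaging probes per gene."""
    grouped = {}
    for probe, values in m_rows.items():
        grouped.setdefault(probe_to_gene[probe], []).append(values)
    return {gene: [mean(col) for col in zip(*rows)]
            for gene, rows in grouped.items()}

def filter_low_variance(rows, min_sd=1.0):
    """Keep rows whose standard deviation across cases is at least min_sd."""
    return {name: values for name, values in rows.items()
            if stdev(values) >= min_sd}

# Toy beta matrix: probes as rows, cases as columns (hypothetical values).
beta_matrix = {
    "probe_1": [0.5, 0.8, 0.2],
    "probe_2": [0.5, 0.5, 0.5],
    "probe_3": [0.9, 0.1, 0.5],
}
probe_to_gene = {"probe_1": "GENE_A", "probe_3": "GENE_A", "probe_2": "GENE_B"}

m_matrix = {p: [beta_to_m(b) for b in row] for p, row in beta_matrix.items()}
gene_matrix = average_probes_per_gene(m_matrix, probe_to_gene)
filtered = filter_low_variance(gene_matrix)
print(sorted(filtered))
```

In this toy example GENE_B has no variation across cases and is dropped by the filter, while GENE_A is retained.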

Modelling

To compensate for the assumptions specific to individual modelling approaches, an ensemble of models was explored.

(1) Linear modelling with M-values

Linear modelling is essential to identify linear trends in expression across cancer stages and thereby detect stage-sensitive patterns. We used the R package limma [31] for linear modelling of stagewise expression using the complete M-value matrix, with multiple probes per gene (S1 Table in S1 Text).

(2) Linear modelling with avereps matrix

This is essentially similar to the above model, except that the input was the ‘avereps’ matrix, where the methylation of each gene was represented by the average of its M-values across all its probes (S2 Table in S1 Text). Such alternative representations of the methylation data negotiate a tradeoff with respect to information loss and interpretability. In both the linear models, the controls contributed to the intercept of the design matrix, while the stages were represented as indicator variables [32]. The linear fit was subjected to empirical Bayes adjustment to obtain moderated t-statistics. These results were then used for the stage-differentiated contrast analysis.
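To illustrate the design described above, the sketch below fits one gene with controls as the intercept and stages as indicator variables. For such a one-way design the least-squares estimates reduce to group means, so the intercept is the control mean and each stage coefficient is the stage mean minus the control mean. This is a simplified Python illustration on hypothetical values; it does not reproduce limma's empirical Bayes moderation, and the function name is an assumption.

```python
from statistics import mean

def fit_indicator_model(m_values, groups, reference="control"):
    """Least-squares fit of a one-way indicator design for a single gene.

    Returns the intercept (mean of the reference group) and one coefficient
    per remaining group (group mean minus reference mean).
    """
    by_group = {}
    for value, group in zip(m_values, groups):
        by_group.setdefault(group, []).append(value)
    intercept = mean(by_group[reference])
    coefficients = {group: mean(values) - intercept
                    for group, values in by_group.items()
                    if group != reference}
    return intercept, coefficients

# Hypothetical M-values of one gene across six samples.
m_values = [-1.0, -1.2, 1.0, 1.4, 0.9, 1.1]
groups = ["control", "control", "stage I", "stage I", "stage II", "stage II"]

intercept, coefs = fit_indicator_model(m_values, groups)
# An inter-stage contrast is the difference of two stage coefficients.
contrast_II_vs_I = coefs["stage II"] - coefs["stage I"]
print(intercept, coefs, contrast_II_vs_I)
```

Each stage coefficient plays the role of the log fold-change of M-value relative to controls, and differences between coefficients give the inter-stage contrasts.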

(3) Association between methylation status and phenotype

The strength of the association between the methylation levels of CpG sites and the phenotype of interest (CRC-stage) could enable the identification of relevant markers. We used the R package CpGassoc [33] to estimate this association based on ANOVA with multiple hypothesis correction. The β-matrix was used as input, and five factors (control, stage I, stage II, stage III, stage IV) were specified as the target phenotype.

(4) The Chip Analysis Methylation Pipeline (ChAMP)

The Chip Analysis Methylation Pipeline (ChAMP) integrative analysis suite uses limma to identify differentially methylated probes (DMPs) from the β-matrix [34]. A mapping of sample IDs with the pathological stage phenotype was provided as an additional input file. In addition, the identification of differentially methylated regions (DMRs), consisting of polygenic genomic blocks, was performed using DMRcate in ChAMP (with preset p-value cutoff<0.05) [35]. GSEA was used to identify the enrichment of DMPs and DMRs in the MSigDB pathways [36], using the Fisher Exact test calculation with adjusted p-value < 0.05.
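The Fisher exact calculation used for pathway enrichment can be illustrated with a one-sided over-representation test, which is equivalent to the upper tail of a hypergeometric distribution. The sketch below is a generic Python illustration with hypothetical counts; it is not ChAMP's implementation, and the function and parameter names are assumptions.

```python
from math import comb

def enrichment_p_value(hits_in_pathway, pathway_size, n_hits, n_background):
    """One-sided Fisher exact p-value for over-representation.

    Probability of seeing at least `hits_in_pathway` pathway genes among
    `n_hits` selected genes drawn from `n_background` genes, of which
    `pathway_size` belong to the pathway.
    """
    upper = min(pathway_size, n_hits)
    total = comb(n_background, n_hits)
    tail = sum(comb(pathway_size, k) * comb(n_background - pathway_size, n_hits - k)
               for k in range(hits_in_pathway, upper + 1))
    return tail / total

# Hypothetical counts: 20 background genes, a 5-gene pathway,
# 4 differentially methylated genes, 3 of which fall in the pathway.
p = enrichment_p_value(hits_in_pathway=3, pathway_size=5, n_hits=4, n_background=20)
print(round(p, 4))
```

In practice the resulting p-values would still need multiple-testing adjustment before applying the 0.05 cutoff.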

(5) Correlation between gene methylation and expression

We used MethylMix 2.0 to estimate the correlation between the methylation and actual expression patterns of each gene [37]. The expression data for the cases of interest were retrieved from TCGA (gdac.broadinstitute.org_COADREAD.Merge_rnaseqv2_illuminaga_rnaseqv2_unc_edu_Level_3_RSEM_genes_data.Level_3.2016012800.0.0.tar.gz). MethylMix was executed with the preset correlation cutoff (|r| > 0.3), and statistical significance was assessed using the Wilcoxon Rank Sum test with adj. p-value < 0.05.
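The correlation screen can be illustrated by computing a Pearson correlation between methylation and expression for each gene and keeping genes whose absolute correlation exceeds the cutoff. This is a minimal Python sketch on hypothetical values, assuming Pearson correlation and the 0.3 cutoff as stated above; it does not reproduce MethylMix's mixture modelling or significance testing.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical per-gene (methylation beta values, expression values) across four cases.
genes = {
    "GENE_X": ([0.1, 0.2, 0.3, 0.4], [8.0, 6.0, 4.0, 2.0]),
    "GENE_Y": ([0.1, 0.2, 0.3, 0.4], [5.0, 5.2, 4.9, 5.1]),
}

cutoff = 0.3
correlations = {g: pearson(meth, expr) for g, (meth, expr) in genes.items()}
selected = [g for g, r in correlations.items() if abs(r) > cutoff]
print(correlations, selected)
```

Here GENE_X shows a strong inverse relationship between methylation and expression and is retained, while GENE_Y is essentially uncorrelated and is dropped.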

(6) Modelling expression from methylation

We used the R package BioMethyl to model the aggregate expression level of a gene from its methylation patterns [38]. The gene expression matrix was estimated using the methylation β-matrix and then subjected to linear modelling with limma, followed by stage-differentiated contrast analysis.

Stage-differentiated contrast analysis

A directed two-tier set of contrasts was performed in limma to drill down to the stage-salient genes:

Tier I: Stage-differentiated contrast against controls. Four pairwise contrasts were performed, one for each of the stages I, II, III and IV. To identify reliable DMGs, the following criteria were used: |lfc M-value| >2, and adj. p-value <0.001.

Tier II: Inter-stage contrasts. Six pairwise contrasts between the stages (namely: I-II, I-III, I-IV, II-III, II-IV, and III-IV) were performed (p-value for each contrast: <0.05).

To illustrate, a putative DMG identified in Tier I would undergo three inter-stage contrasts in Tier II, to ensure stage-salience. For example, a putative stage-II DMG established by Tier I would have to pass the following inter-stage contrasts: stage-II vs stage-I, stage-II vs stage-III and stage-II vs stage-IV, for confirmation as a stage II-salient DMG.
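The two-tier filter can be sketched as a simple predicate over precomputed contrast results. The thresholds below are the ones stated above; the data layout, function name and toy values are hypothetical, and in the study the statistics themselves came from limma.

```python
STAGES = ("I", "II", "III", "IV")

def is_stage_salient(stage, tier1, tier2,
                     lfc_cut=2.0, adj_p_cut=0.001, p_cut=0.05):
    """Apply the Tier I and Tier II criteria to one gene for one stage.

    tier1: stage -> (lfc of M-value vs control, adjusted p-value)
    tier2: frozenset({stage_a, stage_b}) -> p-value of the inter-stage contrast
    """
    lfc, adj_p = tier1[stage]
    # Tier I: contrast against controls.
    if not (abs(lfc) > lfc_cut and adj_p < adj_p_cut):
        return False
    # Tier II: the three inter-stage contrasts involving this stage.
    return all(tier2[frozenset((stage, other))] < p_cut
               for other in STAGES if other != stage)

# Hypothetical contrast results for a single gene.
tier1 = {"I": (1.2, 0.01), "II": (2.6, 1e-6), "III": (2.1, 1e-4), "IV": (1.9, 1e-5)}
tier2 = {
    frozenset(("I", "II")): 0.01,
    frozenset(("II", "III")): 0.03,
    frozenset(("II", "IV")): 0.02,
    frozenset(("I", "III")): 0.20,
    frozenset(("I", "IV")): 0.40,
    frozenset(("III", "IV")): 0.60,
}

salient_stages = [s for s in STAGES if is_stage_salient(s, tier1, tier2)]
print(salient_stages)
```

In this toy case stage III passes Tier I but fails its inter-stage contrasts, so only stage II is reported as salient.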

Identification of stage-salient biomarkers

Finding the consensus of a set of methods with different algorithms overcomes the biases specific to individual methods, and enables screening out false positives. Consensus was obtained by finding the agreement among the results of the various methods used. At least three methods should agree on a given DMG’s stage-salience, for confirmation as consensus stage-salient biomarker.
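The consensus rule amounts to a vote count over (gene, stage) calls, keeping those supported by at least three methods. The following Python sketch uses hypothetical method outputs and placeholder gene names; the function name is an assumption.

```python
from collections import Counter

def consensus_calls(calls_by_method, min_methods=3):
    """Return (gene, stage) pairs called stage-salient by at least min_methods methods."""
    votes = Counter()
    for calls in calls_by_method.values():
        # set() ensures each method votes at most once per (gene, stage) pair.
        for call in set(calls):
            votes[call] += 1
    return sorted(call for call, n in votes.items() if n >= min_methods)

# Hypothetical stage-salient calls from four methods.
calls_by_method = {
    "Mvalue": [("GENE_A", "II"), ("GENE_B", "IV")],
    "Avereps": [("GENE_A", "II"), ("GENE_C", "I")],
    "ChAMP": [("GENE_A", "II"), ("GENE_B", "IV")],
    "MethylMix": [("GENE_B", "IV"), ("GENE_C", "III")],
}

print(consensus_calls(calls_by_method))
```

A gene called for different stages by different methods (GENE_C here) does not accumulate votes, since agreement is counted on the gene and the stage together.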

Survival analysis

The survival data for each case was obtained from the following attributes encoded in the clinical data: patient.vital_status, patient.days_to_followup, and patient.days_to_death. The association between consensus stage-salient DMGs and case overall survival (OS) was evaluated by a univariate Cox proportional hazards regression model using the R survival package [39]. This uncovered potential prognostic stage-salient genes from the methylation analysis, using a significance cutoff of p-value < 0.05. Such prognostic genes were used as the independent variables in a regression model to estimate the survival risk of each case. Based on this risk score, cases with colorectal cancer were categorized into high-risk and low-risk groups using the optimal cut point determined by maxstat (maximally selected rank statistic) [40]. Kaplan-Meier estimation was then applied to the median survival times of these two groups for flagging significant differences, providing a prognostic assessment of the biomarkers of interest.
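The risk-score and Kaplan-Meier steps can be sketched as below. This is an illustrative Python sketch on hypothetical data, not the R survival/maxstat workflow used in the study: the weights, methylation values and cut point are made up, the cut point is supplied by hand rather than selected by a maximally selected rank statistic, and no significance test between groups is performed.

```python
def risk_score(weights, methylation):
    """Linear risk score: sum of model weight times methylation level per gene."""
    return sum(w * methylation[gene] for gene, w in weights.items())

def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each distinct event time.

    events: 1 for an observed death, 0 for a censored case.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather all cases sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical weights and per-case methylation levels for a two-gene signature.
weights = {"GENE_A": -0.5, "GENE_B": -1.0}
cases = [{"GENE_A": 1.0, "GENE_B": 2.0}, {"GENE_A": 3.0, "GENE_B": 0.5}]
scores = [risk_score(weights, c) for c in cases]
cut_point = -2.25  # hypothetical; the study derived this with maxstat
groups = ["high" if s > cut_point else "low" for s in scores]

# Hypothetical follow-up times (days) and event indicators for one group.
curve = kaplan_meier(times=[5, 8, 8, 12, 15], events=[1, 1, 0, 1, 0])
print(scores, groups, curve)
```

The curve for each risk group would be estimated separately in this way and the two curves compared, for example with a log-rank test.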

Results

Linear modelling with M-values (at the probe-level)

The number of significant genes present in each stage-control pair from the Tier-I contrasts is shown in Fig 1A. Using the top 100 DM genes of the linear model (given in S3 File in S1 Text), we found a clear separation between controls and stage samples (Fig 1B). The top genes in each stage (by adjusted p-value of contrast with control) are shown in Table 2, with |lfc M-value| and inferred regulation status. The top four genes of each stage were used to construct a stagewise methylation heatmap (Fig 1C). Fig 1D and 1E show boxplots of stagewise methylation levels for two representative genes: TMEM179, mutations in which could cause MSI [41]; and MEOX2 whose promoter methylation status is a known CRC marker [42]. The stagewise methylation patterns of the top linear model genes are shown in Fig 2. It is notable that a naturally occurring read-through fusion protein GPR75-ASB3 is the top linear model gene with significant differential expression in all stages relative to the control. GPR75-ASB3 is positively differentially expressed in the lung as well as different keratinocyte cell types, and evidence is emerging of its role in other cancers [43]. In this light, GPR75-ASB3 could play a significant role in colorectal cancers which are of epithelial origin. The top 100 significant stage-specific genes, listed in S3 File in S1 Text, were used in the consensus analysis.
Fig 1

Linear modelling with M-value matrix, all probes.

(A) Venn distribution of significant DM genes in each stage relative to control. (B) Distribution of samples based on the top two principal components of the top 100 genes shows a clear separation of cancer cases (labelled by stage) from controls. (C) Stagewise methylation portraits of the top four significant stage-specific DMGs. The contrast with the control is especially evident. Also shown are the stagewise methylation levels of (D) TMEM179, and (E) MEOX2.

Table 2

Top ten genes of the linear model at the probe level.

ID | Stage I lfc (β1) | Stage II lfc (β2) | Stage III lfc (β3) | Stage IV lfc (β4) | Adj. P-val | Methylation status
GPR75-ASB3 | 2.28 | 2.19 | 2.16 | 2.32 | 1E-82 | Hyper
TM4SF19 | -3.63 | -3.58 | -3.72 | -3.71 | 1E-82 | Hypo
CNRIP1 | 2.74 | 2.60 | 2.68 | 2.97 | 1E-78 | Hyper
PDE4A | 1.68 | 1.58 | 1.60 | 1.71 | 1E-71 | Hyper
KRTAP11-1 | -2.36 | -2.30 | -2.37 | -2.40 | 1E-70 | Hypo
ADHFE1 | 3.15 | 2.97 | 3.00 | 3.43 | 1E-69 | Hyper
FAM123A | 3.56 | 3.18 | 3.43 | 3.90 | 1E-69 | Hyper
KHDRBS2 | 2.30 | 2.16 | 2.10 | 2.34 | 1E-68 | Hyper
AJAP1 | 2.52 | 2.44 | 2.46 | 2.64 | 1E-68 | Hyper
NALCN | 2.96 | 2.80 | 2.94 | 3.25 | 1E-68 | Hyper

The log fold-change of M-value of the probe in each stage relative to the controls, followed by p-value adjusted for the false discovery rate, and the methylation status of the gene in the cancer stages with respect to the control.

Fig 2

Top DMGs identified from linear modelling.

(A) GPR75-ASB3, (B) TM4SF19, (C) CNRIP1, (D) KRTAP11-1, (E) ADHFE1 and (F) PDE4A. For each gene, notice that the trend in methylation could be either hyper-or hypo-methylation relative to the control. TM4SF19 and KRTAP11-1 are hypomethylated whereas CNRIP1, GPR75-ASB3, ADHFE1, PDE4A are hypermethylated.

Linear modelling with avereps matrix (at the gene-level)

The methylation levels of genes with multiple probes were averaged using limma’s avereps function, and summarized to one value. The number of genes present in each stage-control pair from the Tier-I contrasts is shown in Fig 3A. Using the top 100 genes of the linear model (given in S4 File in S1 Text), we found a clear separation between controls and stage samples (Fig 3B). The top genes in each stage (by adjusted p-value of contrast with control) are shown in Table 3, with |lfc M-value| and inferred regulation status. The top four genes of each stage were used to construct a stagewise methylation heatmap (Fig 3C). Fig 3D and 3E show the boxplots of stagewise methylation levels for two representative genes, NALCN and GLRX. Mutations in NALCN have been reported in sporadic CRC [44]; here NALCN is seen to be significantly hypermethylated, indicating that the same outcome (loss of function) could be effected in multiple ways. GLRX is a target of the activating transcription factor MEOX2 [45]. It is observed that LY6H showed both hypermethylation and hypomethylation when compared to the controls, indicating that further experimentation is necessary to clarify its role in colorectal cancer progression. The top 100 significant genes of each stage, listed in S4 File in S1 Text, were used for the consensus analysis.
Fig 3

Linear modelling with M-value matrix, avereps transformation.

(A) Venn distribution of significant DM genes in each stage relative to control. (B) Distribution of samples based on the top two principal components of the top 100 genes shows a clear separation of cancer cases (labelled by stage) and controls. (C) Stagewise methylation portraits of the top four significant stage-specific DMGs. The stark contrast with the control is especially evident. Also shown are the stagewise methylation levels of (D) NALCN, and (E) GLRX.

Table 3

Top ten genes of linear modelling with averaging of multiple probes.

ID | Stage I lfc (β1) | Stage II lfc (β2) | Stage III lfc (β3) | Stage IV lfc (β4) | Adj. P-val | Methylation status
TM4SF19 | -3.63 | -3.57 | -3.72 | -3.71 | 1E-82 | Hypo
GPR75-ASB3 | 2.28 | 2.19 | 2.15 | 2.32 | 1E-82 | Hyper
CNRIP1 | 2.74 | 2.60 | 2.67 | 2.97 | 1E-77 | Hyper
KRTAP11-1 | -2.36 | -2.30 | -2.38 | -2.40 | 1E-70 | Hypo
ADHFE1 | 3.15 | 2.96 | 2.99 | 3.43 | 1E-69 | Hyper
FAM123A | 3.56 | 3.18 | 3.42 | 3.89 | 1E-68 | Hyper
AJAP1 | 2.53 | 2.44 | 2.46 | 2.64 | 1E-67 | Hyper
NALCN | 2.96 | 2.79 | 2.95 | 3.25 | 1E-65 | Hyper
IRF4 | 1.99 | 1.83 | 1.89 | 2.13 | 1E-65 | Hyper
PRKAR1B | 3.38 | 3.13 | 3.24 | 3.50 | 1E-65 | Hyper

The log fold-change of M-value of the gene in each stage (relative to the control) is given, followed by p-value adjusted for the false discovery rate and the methylation status of the gene in the cancer stages with respect to the control. A consistent methylation pattern is observed for all the top genes.

Association with phenotype

The ANOVA from CpGassoc yielded p-values and log fold-changes, which were used to identify significant genes for each stage using the criteria given in Methods. The top 100 genes of each stage from this analysis (given in S5 File in S1 Text) were used for the consensus investigation.

DMP analysis with ChAMP

The summary features of the β matrix dataset were evaluated using ChAMP (Fig 4). The DMPs were identified using ChAMP analysis from the β matrix. All the inter-stage contrasts yielded null results (i.e., no significant genes), except for the stage-II vs stage-IV contrast. Due to this, the top 100 DMPs from the stage vs control contrasts were used for the consensus analysis directly. Contrasts that showed significant DMPs were subjected to a further DMR analysis, to enable identification of DM genes. The stage-salient DMR regions (genes) determined are provided in S6 File in S1 Text, and summarized in Table 4. The stage-II vs stage-IV DMR contrast yielded three genes, namely PLAG1, SOCS2, and NNAT. These genes might be critical players in the transition to malignancy. Interestingly, some genes were differentially methylated in all the stagewise contrasts with the control; such genes are differentially methylated agnostic of stage and could serve as valuable drug targets for CRC therapy. The top such genes included EYA4, WT1, DCC, RP11, GATA4, MSX1, DLX5, BNC1, WT1-AS, and ZIM2. A total of 31 such genes were identified and tabulated in S7 Table in S1 Text. The DMPs and DMRs from the analysis were subjected to GSEA and these results can also be found in S6 File in S1 Text. Fig 5 shows representative DMP and DMR plots using ChAMP.
Fig 4

Distribution of probes based on (A) genomic position: opensea, shore, island, shelf; (B) gene context: transcription start site (TSS), exons, untranscribed regions (UTRs), and intergenic regions (IGR).

Table 4

Contrast-wise counts of DM probes and DM regions.

Contrast | DMPs | DMRs
Control and Stage 1 | 11045 | 34
Control and Stage 2 | 11254 | 35
Control and Stage 3 | 11254 | 36
Control and Stage 4 | 11108 | 34
Stage 2 and Stage 4 | 404 | 3

No DM regions were found for the contrasts not shown, namely the stage-pairs: [(1,2), (1,3), (1,4), (2,3), (3,4)].

Fig 5

DMP and DMR plots using ChAMP.

(A) DMP plot of FCN2 for stage-I vs control illustrating significant hypomethylation (B) DMR plot of transcriptional activator EYA4 for stage-I vs control illustrating significant hypermethylation. Solid lines represent mean values while dashed lines represent the loess.

Methylation and expression correlation analysis

Differential methylation (DM) calculated from stage vs control contrasts ranged from -0.7 to +0.8, and genes could be hyper- or hypo-methylated based on the sign of the DM value. There were 209, 441, 275, and 134 driver genes in each of the contrasts with the controls (stage-I, stage-II, stage-III and stage-IV, respectively). All between-stages contrasts yielded null DM genes. The results from this analysis, including driver genes for all the contrasts, are provided in S8 File in S1 Text. It is notable that the top genes from an overall cancer vs control comparison included GATA4, CCDC88B, and WAS. Top 100 genes from each comparison with the controls were taken forward for consensus analysis. Certain genes emerged common to all the four comparisons with the controls, thereby suggesting stage-agnostic differential methylation events. The top such stage-agnostic differentially methylated genes included CCDC88B, C1orf59, CHFR, ZP2, HOXA9, ELF5, FAM50B, MUC17, TBX20, and VSIG2. Stage-agnostic genes hold promise as therapeutic targets for the treatment of colorectal cancer; the complete set of 56 stage-agnostic genes identified in this analysis is provided in S9 Table in S1 Text. Mixture models of genes, indicative of the number of methylation states, were constructed using MethylMix, and illustrated for a few stage-IV driver genes in Fig 6. The estimated correlation between the methylation levels and actual gene expression for the same genes shows the inverse relationship between methylation and gene expression, thereby highlighting the effect of epigenetic events (Fig 6).
Fig 6

Mixture models and correlation plots for (A) FAM123A, (B) LAMA1 and (C) NELL1. The x-axis indicates the level of methylation (in terms of β values); y-axis, the frequency. Mixture component curves represent density fits of the histogram. A negative correlation between methylation and expression is evident, indicating that methylation acts to repress gene transcription, though the strength of the inverse correlation varies from gene to gene. Colour indicates the mixture model fit.

BioMethyl analysis

The significant stage-specific DEGs identified by BioMethyl are shown in an UpSet plot [46] (Fig 7), and provided in S10 File in S1 Text. The top 100 genes of each stage from this analysis were taken forward for consensus analysis.
Fig 7

UpSet plot of BioMethyl-based stagewise gene expression modelling.

The intersection of all stages yielded 3268 genes, which represent consistently differentially regulated genes.

Stage-salient consensus biomarkers

The top 100 significantly differentially-expressed genes of each stage from all the methods discussed above (collated in S11 File in S1 Text) were used for the consensus determination. The consensus analysis yielded seven stage-salient DMGs: one stage-I gene (FBN1), one stage-II gene (FOXG1), one stage-III gene (HCN1) and four stage-IV genes (NELL1, ZNF135, FAM123A, LAMA1). Each of these stage-salient genes presented an |lfc M-value| > 0.4 with respect to the other stages, validating their salience. Fig 8 represents violin plots of the consensus biomarkers, and Table 5 presents a summary of the consensus analysis. Gene ontology (GO) analysis [47] of the consensus biomarkers yielded processes related to structural integrity of cell division processes, immunity dysfunction, and cell migration (Table 6). Detailed GO results are presented in the S12 File in S1 Text.
Fig 8

Violin plots of stage-salient genes.

(A) Stage-I Gene FBN1, (B) Stage-II Gene–FOXG1, (C) Stage-III Gene–HCN1 and Stage-IV genes (D) LAMA1, (E) NELL1, (F) FAM123A, (G) ZNF135.

Table 5

Stage-salient biomarkers.

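HGNC ID | Gene Name | Methods in agreement | Salience | Meth. status | Statistical significance (M value) | Statistical significance (Avereps) | Statistical significance (Cox analysis) | Statistical significance (Kaplan Meier)
3603 | FBN1 | Avereps, ChAMP | I | Hyper | 0.310 | 0.040 | 0.036 | 0.025
3811 | FOXG1 | Mvalue, Avereps, ChAMP, Methylmix | II | Hyper | 1E-16 | 0.003 | 0.019 | 0.037
4845 | HCN1 | Mvalue, Avereps, ChAMP | III | Hyper | 1E-17 | 0.022 | 0.031 | 0.059
7756 | NELL1 | Mvalue, Avereps, ChAMP | IV | Hyper | 1E-68 | 0.061 | 0.283 | 0.27
12919 | ZNF135 | Mvalue, ChAMP, Methylmix | IV | Hyper | 1E-76 | 0.062 | 0.096 | 0.084
26360 | FAM123A | Mvalue, ChAMP, Methylmix | IV | Hyper | 1E-115 | 0.097 | 0.30 | 0.28
6481 | LAMA1 | Mvalue, ChAMP, Methylmix | IV | Hyper | 1E-86 | 0.297 | 0.052 | 0.051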

The results of the consensus analysis and univariate survival analysis are summarized. All the biomarkers showed hypermethylation, reflecting downregulation of gene expression.

Table 6

GO analysis of stage-salient genes in the order of decreasing significance (i.e., increasing p-value).

GO ID | Term | Ontology | p-value
GO:1990047 | Spindle matrix | CC | 0.0001
GO:0030109 | HLA-B specific inhibitory MHC class I receptor activity | MF | 0.0003
GO:0032396 | Inhibitory MHC class I receptor activity | MF | 0.006
GO:0042609 | CD4 receptor binding | MF | 0.0012
GO:0032393 | MHC class I receptor activity | MF | 0.0013
GO:0050930 | Induction of positive chemotaxis | BP | 0.0016
GO:0050927 | Positive regulation of positive chemotaxis | BP | 0.0033
GO:0050926 | Regulation of positive chemotaxis | BP | 0.0034
GO:0008608 | Attachment of spindle microtubules to kinetochore | BP | 0.0043
GO:0007094 | Mitotic spindle assembly checkpoint | BP | 0.0044

Ontology could be Cellular Component (CC), Molecular Function (MF), or Biological Process (BP).

We constructed independent prognostic models of the stage-salient genes and identified the prognostically significant biomarkers as FBN1, FOXG1, HCN1, and LAMA1. The corresponding univariate Kaplan-Meier plots are shown in Fig 9. Rational combinations of stage-salient genes, termed ColoRectal cancer Signatures (CRS), were modelled using multivariate Cox regression to yield a risk score. Risk scores were then used to estimate survival-effect significance, as described in Methods. The results of this exercise are summarised in Table 7. We found that the CRS12 signature (consisting of FBN1 and FOXG1) yielded significant risk scores in the multivariate analysis, and both CRS12 and CRS34 (consisting of HCN1, NELL1, ZNF135, FAM123A, LAMA1) were significant in estimating overall survival (prognosis p-value ≤ 0.02) (Fig 10). S13 File in S1 Text provides survival plots of all possible signatures. At the end of our analysis pipeline, CRS12 passed all the filters and emerged as a significant early-stage panel for CRC prognosis.
Fig 9

K-M plots for the prognostically significant stage-salient genes.

(A) FBN1, (B) FOXG1, (C) HCN1, and (D) LAMA1.
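K-M plots such as those in Fig 9 are built on the product-limit (Kaplan-Meier) estimator of the survival function. The following is a minimal sketch of that estimator in plain Python, included only to make the computation concrete. The follow-up times and event indicators are hypothetical, and the function is not the authors' implementation.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends exactly at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve

# Hypothetical follow-up data (months) for six patients.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:>2}: S(t) = {s:.3f}")
```

Comparing two such curves (for example, with a log-rank test) is what yields a p-value of the kind reported in the Kaplan-Meier column of Table 5; that comparison is not shown here.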

Table 7

Summary of selected multivariate prognostic models.

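Signature | Stages | Biomarker | Weight | P-value (multivariate model) | P-value (prognosis)
CRS12 | I+II | FBN1 | -0.62 | 0.015 | 0.005
CRS12 | I+II | FOXG1 | -1.05 | |
CRS34 | III+IV | NELL1 | 0.1 | 0.172 | 0.02
CRS34 | III+IV | ZNF135 | -0.21 | |
CRS34 | III+IV | FAM123A | -0.23 | |
CRS34 | III+IV | LAMA1 | -0.39 | |
CRS34 | III+IV | HCN1 | -1.1 | |
CRS234 | II+III+IV | FOXG1 | -0.99 | 0.0877 | 0.032
CRS234 | II+III+IV | HCN1 | -1.07 | |
CRS234 | II+III+IV | NELL1 | -0.10 | |
CRS234 | II+III+IV | ZNF135 | -0.22 | |
CRS234 | II+III+IV | FAM123A | -0.37 | |
CRS234 | II+III+IV | LAMA1 | -0.27 | |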

Weight denotes the coefficient in the multivariate model. The ultimate significant signature (CRS12) is highlighted in the original table.
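A risk score of the kind summarized in Table 7 is commonly computed as the weighted sum of each biomarker's value, using the multivariate model coefficients as weights. The sketch below applies the CRS12 weights from Table 7 to hypothetical patients. The use of M-values as the model input, the median split into high- and low-risk groups, and all patient values are assumptions made for illustration; the source defers these details to Methods.

```python
from statistics import median

# Coefficients ("Weight" column) for the CRS12 signature, taken from Table 7.
CRS12_WEIGHTS = {"FBN1": -0.62, "FOXG1": -1.05}

def risk_score(values, weights):
    """Linear predictor: sum of weight * biomarker value over the signature genes."""
    return sum(weights[gene] * values[gene] for gene in weights)

def stratify(scores):
    """Split patients at the median risk score (assumed cutoff)."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Hypothetical per-patient M-values for the two signature genes (not real data).
patients = {
    "P1": {"FBN1": 1.0, "FOXG1": 2.0},
    "P2": {"FBN1": -0.5, "FOXG1": 0.0},
    "P3": {"FBN1": 2.0, "FOXG1": 1.0},
    "P4": {"FBN1": 0.0, "FOXG1": -1.0},
}

scores = {pid: risk_score(vals, CRS12_WEIGHTS) for pid, vals in patients.items()}
groups = stratify(scores)
for pid in patients:
    print(pid, round(scores[pid], 2), groups[pid])
```

The two resulting groups would then be compared with a survival analysis to obtain a prognosis p-value.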

Fig 10

Survival analysis of combination biomarker panels shows significance.

(A) Early-stage panel, CRS12; and (B) Late-stage panel, CRS34.


Discussion

CRC develops through the accumulation of genetic and epigenetic changes, among which DNA methylation is of paramount importance. DNA methylation profiles of colorectal cancer have been investigated in several previous studies using various approaches [48, 49]. It is well known that changes in methylation status correspond with CRC progression [50]. Here we have designed a comprehensive approach to systematically analyze stage-differentiated DNA methylation patterns in colorectal cancer and their relationship to patient survival. Our study has yielded consensus stage-salient significantly differentially methylated genes, and evaluated their prognostic value. Corollary insights obtained in the course of our investigations, such as stage-agnostic genes, have been documented, and would also be of interest to researchers in the field. It is significant that none of the stage-salient genes figures as a cancer gene or a hallmark gene in the Cancer Gene Census [51]; HCN1 is notably marked as a candidate cancer gene based on mouse insertional mutagenesis experiments [52]. The dominant differentially methylated CpG site in all the stage-salient genes is located within the core / proximal promoter regions (Table 8). Mixture models of methylation levels of stage-salient genes, along with their inverse correlation to corresponding expression levels, are shown in Fig 11, and support the epigenetic impact of the changes in methylation. Our findings are further discussed in the context of the existing literature, and lead us to detect a putative CpG island methylator phenotype (CIMP) signature in colorectal cancer.
Table 8

Location of the major DM CpG site in stage-salient genes.

Stage-salient gene | DM CpG site | Distance to TSS | Location in the promoter region
FBN1 | cg18671950 | 146 | Proximal
FOXG1 | cg10300684 | 36 | Core
HCN1 | cg06498267 | 298 | Proximal
NELL1 | cg17371081 | 179 | Proximal
ZNF135 | cg16638540 | 144 | Proximal
FAM123A | cg22029275 | 73 | Core
LAMA1 | cg07846220 | 133 | Proximal

All the hypermethylated CpG sites of stage-salient DMGs were found in the core/proximal promoter regions.

Fig 11

Mixture models and correlation plots of stage-salient genes.

Shown are FBN1, FOXG1, HCN1, and ZNF135. Two mixture components are seen for FBN1, HCN1, and ZNF135, and three for FOXG1. A strong inverse correlation exists for all genes, except HCN1. Other stage-salient genes are shown in Fig 6.
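The inverse relationship between methylation and expression shown in correlation plots of this kind can be quantified with a correlation coefficient. The sketch below computes a Pearson coefficient in plain Python on hypothetical paired values. The choice of Pearson (rather than, say, Spearman) and the data are assumptions; the source does not state which coefficient was used.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements for one gene across six samples (not real data).
methylation = [0.10, 0.25, 0.40, 0.60, 0.80, 0.90]  # beta-values
expression = [9.1, 8.4, 7.9, 6.0, 4.2, 3.5]         # log2 expression

r = pearson_r(methylation, expression)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to -1, as in this toy example, is what a strong inverse correlation between promoter methylation and expression would look like; a value near zero would correspond to the weaker relationship noted for HCN1.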


Stage-salient DMGs

Promoter hypermethylation of FBN1, a glycoprotein component of calcium-binding extracellular matrix microfibrils [53], is a recognized biomarker of CRC [54, 55]. Our analysis supports this literature, while pinpointing the stage I-salience in its action. FOXG1 is well known as an etiological factor in certain neurological disorders and plays a role in the epithelial-mesenchymal transition of CRC cells (a key hallmark of cancer progression), and is known to be overexpressed in CRC cases [56]. It is a nodal gene, with connections to oncogenic pathways like the WNT pathway in hepatocellular carcinoma [57] and the TGF-β pathway in ovarian cancer [58]. Interestingly, FOXG1 was found to be a hypermethylated stage-II salient gene. HCN1, coding for hyperpolarization-activated cyclic nucleotide-gated channel subunits, is associated with low survival rates in breast, brain, and colorectal cancer [59]. We have identified HCN1 as a stage-III hypermethylated gene, suggesting a loss-of-function mechanism for its tumorigenic potential. Our study has provided clear evidence that hypermethylation of LAMA1 (which codes for α-laminin of the extracellular matrix) is a stage IV-specific signature. Experimental evidence for the hypermethylation of the promoter region of LAMA1 in CRC cases is available [60]. NELL1 is a known tumor suppressor gene [61], whose hypermethylation is associated with poor survival outcomes [62]. Here it is found to be a stage IV-specific hypermethylated gene, resonating with the above findings. ZNF135 is a zinc-finger protein involved in regulation of cell morphology and cytoskeletal organization. Its expression and epigenetic regulation have been reported to be key in cancers of the cervix and esophagus, respectively [63, 64]. Here we have found that epigenetic silencing of ZNF135 is a key feature of stage-IV CRC.
It is interesting that another member of the zinc-finger protein family, ZNF726, has been recently identified as the only methylated gene significantly associated with OS in patients with CRC, without regard for pathologic stage [65]. FAM123A, also known as AMER2, is associated with microtubule proteins [66], and is a paralog of the well-documented FAM123B, a tumor-suppressor whose loss-of-function by mutation, methylation and copy-number aberrations is known to play a pivotal role in colorectal cancer, especially in older patients [67-69]. It is significant that our study has uncovered FAM123A as a hypermethylated stage IV-specific DMG, signalling the need for experimental investigations. There is very little literature on the cancer significance of any of the above stage-salient genes, marking our findings as novel and important in the context of gaps in our knowledge.

Putative CIMP signature

Aberrant methylation of CpG promoter regions causes stable repression of transcription leading to gene-silencing [70, 71]. In the context of tumorigenic processes, this is likely to lead to loss-of-function of tumor-suppressor genes. Multiple CpG islands might be methylated simultaneously in some cancers, paving the way for the CpG island methylator phenotype (CIMP), first discovered in colorectal cancer [72]. CIMP is characterised by hypermethylation of CpG islands surrounding the promoter regions of genes involved in cancer onset and progression [73]. The phenotype is heterogeneous with the type of tumor [74] and dependent on definition [75]. Table 8 suggests that the stage-salient hypermethylated biomarkers identified in our study are components constituting an aggregate novel CIMP, and there is preliminary experimental evidence in this direction. Earlier studies have identified LAMA1 as a CIMP panel constituent [50, 60]. FBN1 has been used as an epigenetic biomarker in diagnostic panels associated with CIMP-positive tumors [54, 76]. While this paper was under review, FAM123A was used in a five-marker panel to detect stage-IV CRC using blood samples [77]. The original CIMP had been associated with advanced T staging (T3/T4) [78], which accords with our finding of four hypermethylated stage IV-salient DMGs. The biomarkers from our study contributing to the putative CIMP were tested with a standard survival analysis workflow yielding significant prognostication power for five of the seven stage-salient genes (Table 5). A Cox multivariate analysis of biomarker panels uncovered two signatures, an early-stage CRS12 and a late-stage CRS34, that might be prognostically valuable. In particular, CRS12 (composed of FBN1 and FOXG1) suggests a significant early-stage biomarker panel (p-value < 0.01) for the effective prognosis and stage-sensitive detection of colorectal cancer.
Diagnostic biomarkers that are also superior in prognostication power imply methylation events that are vital to tumor-specific pathophysiology. This suggests future directions for therapeutic intervention. Epigenetic intervention for CIMP-positive cancers has been advanced as a possible treatment strategy [79]. The alternative CIMP-like biomarkers could serve to stratify the cancer subtype, thereby facilitating precision medicine. The current standard of CRC screening is colonoscopy, an invasive method with a significant rate of complications. A non-invasive method based on molecular diagnostics would improve patient satisfaction and efficiency. Several studies have been conducted to identify and/or validate biomarkers for CRC diagnosis. It is recognized that DNA methylation patterns could serve as valid biomarker candidates [80, 81]. Freitas et al. have validated the performance of a 3-gene biomarker panel for the detection of colorectal cancer irrespective of the molecular subtype [82]. However, optimal stage-salient epigenetic biomarkers have not yet been reported. Using hypermethylated DNA patterns as cancer markers offers the advantage of providing small targets with high concentrations of CpG for assays, useful for the design of analytical amplicons [83]. Hypermethylation in the gene body and upstream control regions like enhancers and insulators might affect transcription differently than hypermethylation of promoter regions [84, 85]. Further, DNA methylation patterns in noncoding RNA genes seem to be important in tumorigenesis and progression [86]. Non-coding RNAs themselves play a significant role in epigenetic modification through the phenomenon of RNA-directed DNA methylation [48]. The nuanced relationship between methylation and gene transcription signals the need for clinical validation of our results; however, ensemble approaches such as the one used here suffer fewer uncertainties with respect to translation of the identified biomarkers.
Since methylation mediates a direct epigenetic regulatory mechanism used by all life [87], it is hoped that the workflow herein designed would advance our understanding of the complex effects of methylation events, patterns, and landscapes in different settings, including in the developmental stages of life.

Conclusion

We have developed a comprehensive computational framework for the consensus identification of stage-differentiated significant differentially methylated genes, and evaluation of their prognostic significance. Our analysis has yielded seven stage-salient genes, all hypermethylated in the promoter regions and relatively unreported in the literature: one stage-I gene (FBN1), one stage-II gene (FOXG1), one stage-III gene (HCN1) and four stage-IV genes (NELL1, ZNF135, FAM123A, LAMA1). Stage-salient genes could serve as diagnostic biomarkers, and their concordant hypermethylation would signal a distinct CIMP-like character possibly promoting epigenetic destabilisation, which in turn would drive the progression of colorectal cancer. These findings lend further evidence to CIMP drivers of colorectal cancer and point more generally to a pervasive role for these aberrations in tumor biology that remains to be discovered. Independent prognostic evaluation of the stage-salient markers yielded significance for FBN1 and FOXG1. Survival analysis of biomarker signatures composed of the stage-salient genes yielded a significant early-stage panel consisting of FBN1 and FOXG1. Our study has also yielded secondary results such as stage-agnostic genes that could serve as targets for drug discovery in CRC therapy. Consensus approaches, like the one used here, are more reliable than single-method analyses, and the epigenetic biomarkers identified in our study could potentially advance the accurate early detection of colorectal cancers, their treatment and prognostic evaluation. The methods are extendable to the investigation of epigenomics in other cancers, normal/disease conditions, and developmental biology.

7 Jul 2021

PONE-D-21-08604
Stage-differentiated ensemble modeling of DNA methylation landscapes uncovers salient biomarkers and prognostic signatures in colorectal cancer progression
PLOS ONE

Dear Dr.
Palaniappan,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jul 22 2021 11:59PM.

We look forward to receiving your revised manuscript.

Kind regards,
Alessandro Weisz
Academic Editor
PLOS ONE

Additional Editor Comments:

The manuscript has been reviewed by two experts in the field, both of whom found it quite good and of interest, despite some problems, highlighted in particular by R.2, that must be addressed before it can be considered for publication.

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming.
2. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Reviewers' comments:

Reviewer's Responses to Questions

1. Is the manuscript technically sound, and do the data support the conclusions?
Reviewer #1: Yes
Reviewer #2: Partly

2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available?
Reviewer #1: Yes
Reviewer #2: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English?
Reviewer #1: Yes
Reviewer #2: No

5. Review Comments to the Author

Reviewer #1: In this paper, Muthamilselvan et al. developed a comprehensive computational framework for stage-differentiated modelling of DNA methylation landscapes in CRC, found significant changes and discovered a novel CIMP-like signature bearing potential clinical significance. The data are supported by a strong statistical analysis. The paper can be accepted for publication.

Minor point: On page 3, the word "Tomczak" must be deleted.

Reviewer #2: In this study, the Authors propose a computational workflow for the analysis of the DNA methylation aberrations in colorectal cancer (CRC) with a stage-differentiated perspective. Data have been collected from The Cancer Genome Atlas (TCGA) portal; as a result, the Authors identify 7 stage-characteristic genes that could be indicative of a novel CpG island methylator phenotype (CIMP). The manuscript provides an interesting perspective and a detailed explanation of how DNA methylation data could be analyzed to detect epigenetic signatures between normal and tumoral samples or in different stages of the disease. Bioinformatic procedures are well described and depict a useful workflow when handling big and complex data from repositories such as TCGA. Unfortunately, however, I cannot avoid pointing out that the work has serious shortcomings that do not allow to accept its publication in the present version and must be mandatorily corrected. Importantly, even if interesting, data are presented in a confused manner; in particular, the Authors should better define the aim of the study and the experimental design, carefully organize the Results, improve the Discussion and avoid exaggerated conclusions, concerning in particular inferences drawn from data that should be better supported by experimental validation. Specific comments are listed below.

Major Comments:

1. The Authors should carefully revise the text and correct some grammar mistakes.
2. Figures are in many cases blurred and not legible; their quality must be improved.
3. The Authors should pay attention to some typing mistakes and font differences (see, for example, pages 14 and 12).
4. TCGA collects data from at least 10 different studies on CRC; at least the cohort from which data have been collected should be reported in Materials and Methods.
5. TCGA collects data from 236 patients profiled with Human Methylation Bead Chip HM27, and 393 with HM450, measuring 27,000 and 480,000 CpG sites, respectively; why only the HM27 data have been used?
6. The correlation analyses between methylation and gene expression data are providing interesting information but should be better described by focusing the attention not only on the methodological procedure used but also on the biological meaning of the observed results.
7. In the Conclusions the Authors write: "All the stage-salient genes were found to be hypermethylated, indicating a novel CIMP-like character possibly promoting epigenetic destabilisation, which in turn would drive the progression of colorectal cancer". First, it is not clear where the hypermethylation associated with these stage-salient genes is located (promoter, TSS, CpG island or gene body); this should be better explained. Then, the role of the stage-salient genes identified by the Authors should be better characterized in the context of CRC to indicate a possible novel CIMP-like phenotype; I would suggest the Authors to enrich the Discussion by adding more details and experimental evidence of the involvement of these genes in CRC pathogenesis.

6. PLOS authors have the option to publish the peer review history of their article. Do you want your identity to be public for this peer review?
Reviewer #1: No
Reviewer #2: No

23 Jul 2021

>>> At the outset, we would like to thank the reviewers and the Editor for their valuable comments.

The manuscript has been reviewed by two experts in the field, both of whom found it quite good and of interest, despite some problems, highlighted in particular by R.2, that must be addressed before it can be considered for publication.
>>> This has been done. We have now substantially revised the manuscript and expanded the scope of our investigations / discussion.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming.
>>> Done.

2. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.
>>> This has been done.

Reviewer #1: In this paper, Muthamilselvan et al. developed a comprehensive computational framework for stage-differentiated modelling of DNA methylation landscapes in CRC, found significant changes and discovered a novel CIMP-like signature bearing potential clinical significance. The data are supported by a strong statistical analysis. The paper can be accepted for publication. Minor point: On page 3, the word "Tomczak" must be deleted.
>>> Thank you. We would like to thank Reviewer #1 for their time and the kind comments.

Reviewer #2: In this study, the Authors propose a computational workflow for the analysis of the DNA methylation aberrations in colorectal cancer (CRC) with a stage-differentiated perspective. Data have been collected from The Cancer Genome Atlas (TCGA) portal; as a result, the Authors identify 7 stage-characteristic genes that could be indicative of a novel CpG island methylator phenotype (CIMP). The manuscript provides an interesting perspective and a detailed explanation of how DNA methylation data could be analyzed to detect epigenetic signatures between normal and tumoral samples or in different stages of the disease. Bioinformatic procedures are well described and depict a useful workflow when handling big and complex data from repositories such as TCGA. Unfortunately, however, I cannot avoid pointing out that the work has serious shortcomings that do not allow to accept its publication in the present version and must be mandatorily corrected. Importantly, even if interesting, data are presented in a confused manner; in particular, the Authors should better define the aim of the study and the experimental design, carefully organize the Results, improve the Discussion and avoid exaggerated conclusions, concerning in particular inferences drawn from data that should be better supported by experimental validation. Specific comments are listed below.
>>> We would like to thank Reviewer #2 for the careful reading of our paper and the critical comments. We have addressed all the many valid points in the present revision. We have undertaken a major revision of the manuscript in line with the suggestions.

1. The Authors should carefully revise the text and correct some grammar mistakes.
>>> Yes.

2. Figures are in many cases blurred and not legible; their quality must be improved.
>>> All figures have been redone in high-resolution TIFF format.

3. The Authors should pay attention to some typing mistakes and font differences (see, for example, pages 14 and 12).
>>> Yes.

4. TCGA collects data from at least 10 different studies on CRC; at least the cohort from which data have been collected should be reported in Materials and Methods.
>>> This has been identified and recorded in the manuscript under Methods.

5. TCGA collects data from 236 patients profiled with Human Methylation Bead Chip HM27, and 393 with HM450, measuring 27,000 and 480,000 CpG sites, respectively; why only the HM27 data have been used?
>>> This point has also been addressed in the Methods section. Essentially, 450k chips show enrichment in gene body and intergenic regions. A distribution of the CpG sites with respect to the genomic / genic location clearly indicates this (data not shown). 27k data are enriched in CpG sites in promoter regions. Please refer to the section under Methods.

6. The correlation analyses between methylation and gene expression data are providing interesting information but should be better described by focusing the attention not only on the methodological procedure used but also on the biological meaning of the observed results.
>>> This has now been rectified. Indeed, we now show plots for only the stage-salient genes, to make the biological connections and meaning clear. We note that all the results from our investigations are available in the Supplementary Files.

7. In the Conclusions the Authors write: "All the stage-salient genes were found to be hypermethylated, indicating a novel CIMP-like character possibly promoting epigenetic destabilisation, which in turn would drive the progression of colorectal cancer". First, it is not clear where the hypermethylation associated with these stage-salient genes is located (promoter, TSS, CpG island or gene body); this should be better explained. Then, the role of the stage-salient genes identified by the Authors should be better characterized in the context of CRC to indicate a possible novel CIMP-like phenotype; I would suggest the Authors to enrich the Discussion by adding more details and experimental evidence of the involvement of these genes in CRC pathogenesis.
>>> We have now increased the literature weight for these statements and discussion. We have included a new Table 8 with the location of the DM probes, and a new Figure 19 to support all these assertions. We also found a new publication citing stage-IV specificity for FAM123A while this manuscript was under review (medRxiv preprint of our work was available in October 2020). Thank you.

Submitted filename: response2reviewers.pdf

27 Aug 2021

PONE-D-21-08604R1
PLOS ONE

Dear Dr. Palaniappan,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

============================== Please submit your revised manuscript by Oct 11 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see:  http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at  https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols . We look forward to receiving your revised manuscript. 
Kind regards,
Alessandro Weisz
Academic Editor
PLOS ONE
Journal Requirements:
Additional Editor Comments (if provided): The effort toward improving the text led to a significant improvement of the manuscript. Unfortunately, however, the quality of most figures is still very poor: the images are extremely blurred and the numbers difficult to read, which makes them in most cases useless to the reader. Unless this problem is correctly addressed and solved, the manuscript cannot be accepted for publication.
[Note: HTML markup is below. Please do not edit.]
Reviewers' comments:
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
1 Sep 2021
We would like to thank the Editor and Reviewers for their comments. We have updated all the figures again, to ensure maximum clarity, and also subjected each individual figure to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ to ensure that every individual figure meets PLOS requirements. We can assure you that all figures meet the requirements.
If there is anything wanting in any figure, please let us know the figure identification and we will rectify it immediately. As there are no other comments in the Decision Letter, we are re-submitting the manuscript for publication. Thank you.
Submitted filename: Response to Reviewers_R2.docx
6 Sep 2021
PONE-D-21-08604R2
Stage-differentiated ensemble modeling of DNA methylation landscapes uncovers salient biomarkers and prognostic signatures in colorectal cancer progression
PLOS ONE
Dear Dr. Palaniappan,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we have decided that your manuscript does not meet our criteria for publication and must therefore be rejected. Specifically: I am sorry that we cannot be more positive on this occasion, but hope that you appreciate the reasons for this decision.
Yours sincerely,
Alessandro Weisz
Academic Editor
PLOS ONE
Additional Editor Comments (if provided): The A.s failed to take any action toward improving the figure outlay, that in my opinion is not suitable for a scientific publication.
[Note: HTML markup is below. Please do not edit.]
Reviewers' comments:
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
16 Oct 2021
>>> Under "Specifically", the following was provided: "I am sorry that we cannot be more positive on this occasion, but hope that you appreciate the reasons for this decision." No concrete reason has been provided for the reject decision. There is an Additional Editor Comments (if provided): "The A.s failed to take any action toward improving the figure outlay, that in my opinion is not suitable for a scientific publication."
It is possible that the links to the high-resolution figures from within the built PDF are not being viewed. However that may be, we have carried out a restructuring of the manuscript and especially the figures, as follows:
(i) Figures 1, 2, & 3 combined into one figure
(ii) Figures 4 & 8 converted into violin plots, and combined into one figure
(iii) Figures 5, 6, & 7 combined into one figure
(iv) Figures 12 & 13 combined into one figure
(v) Figure 14 replaced with an UpSet plot
(vi) Figures 15 & 16 converted into violin plots and combined into one figure
The updated figures have again been checked with the PACE digital diagnostic tool (and run by third party readers). A detailed response is provided in the Response to Reviewers document and is reproduced here for convenience:
>>> (1) Each figure has been subjected to the PACE digital diagnostic tool, and adjusted accordingly. All figures passed the PACE test. This is the standard used by PLOS ONE for all figures. How could the figures then be of poor quality? All figures are publication-quality figures directly obtained from analysis software / algorithms. Hence the decision is a clear deviation from PLOS ONE’s editorial policy.
>>> (2) We have complied with all the suggestions made to us with respect to our submission. As a proactive measure, we have reworked the entire set of figures and replaced it with a new, more compact set of figures. This has been done in the following manner:
(i) Figures 1, 2, & 3 combined into one figure
(ii) Figures 4 & 8 converted into violin plots, and combined into one figure
(iii) Figures 5, 6, & 7 combined into one figure
(iv) Figures 12 & 13 combined into one figure
(v) Figure 14 replaced with an UpSet plot
(vi) Figures 15 & 16 converted into violin plots and combined into one figure
This resulted in 11 figures from the original 19 figures. The reworked figures have again been checked with the PACE digital diagnostic tool (and run by third party readers).
The manuscript has been accordingly updated. We request the reviewers to link to the high-resolution figures from the manuscript pdf.
>>> (3) To reflect all tracked changes since the original manuscript submission, the changes have been color-coded in the following manner: blue for revision-1, red for revision-2, and green for changes post appeal.
Revision R2: Response to Reviewers:
Academic Editor: “The effort toward improving the text led to a significant improvement of the manuscript. Unfortunately, however, the quality of most figures is still very poor: the images are extremely blurred and the numbers difficult to read, which makes them in most cases useless to the reader. Unless this problem is correctly addressed and solved, the manuscript cannot be accepted for publication.”
>>> We would like to thank the Editor and Reviewers for their comments. We have updated all the figures again, to ensure maximum clarity, and also subjected each individual figure to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ to ensure that every individual figure meets PLOS requirements. We can assure you that all figures meet the requirements. If there is anything wanting in any figure, please let us know the figure identification and we will rectify it immediately.
Revision R1: Response to reviewers
>>> At the outset, we would like to thank the reviewers and the Editor for their valuable comments.
Academic Editor: The manuscript has been reviewed by two experts in the field who both found it quite good and of interest, despite some problems, highlighted in particular by R.2, that must be addressed before it can be considered for publication.
>>> We have addressed the points raised by Reviewer #2 and have substantially revised the manuscript and expanded the scope of our investigations / discussion.
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming.
>>> Thank you, we have done the same.
2. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.
>>> Thank you, this has been done.
Reviewer #1: In this paper Muthamilselvan et al. developed a comprehensive computational framework for stage-differentiated modelling of DNA methylation landscapes in CRC, found significant changes and discovered a novel CIMP-like signature bearing potential clinical significance. The data are supported by a strong statistical analysis. The paper can be accepted for publication. Minor point: on page 3, the word "Tomczak" must be deleted.
>>> Thank you, it was inadvertent and has been deleted. We would like to thank Reviewer #1 for their time and the kind comments.
Reviewer #2: In this study, the Authors propose a computational workflow for the analysis of the DNA methylation aberrations in colorectal cancer (CRC) with a stage-differentiated perspective. Data have been collected from The Cancer Genome Atlas (TCGA) portal; as a result, the Authors identify 7 stage-characteristic genes that could be indicative of a novel CpG island methylator phenotype (CIMP). The manuscript provides an interesting perspective and a detailed explanation of how DNA methylation data could be analyzed to detect epigenetic signatures between normal and tumoral samples or in different stages of the disease. Bioinformatic procedures are well described and depict a useful workflow for handling big and complex data from repositories such as TCGA. Unfortunately, however, I cannot avoid pointing out that the work has serious shortcomings that do not allow its publication in the present version and must be mandatorily corrected.
Importantly, even if interesting, data are presented in a confused manner; in particular, the Authors should better define the aim of the study and the experimental design, carefully organize the Results, improve the Discussion and avoid exaggerated conclusions, concerning in particular inferences drawn from data that should be better supported by experimental validation. Specific comments are listed below.
>>> We would like to thank Reviewer #2 for the careful reading of our paper and the critical comments. We have addressed all the many valid points in the present revision. We have undertaken a major revision of the manuscript in line with the suggestions.
1. The Authors should carefully revise the text and correct some grammar mistakes.
>>> yes
2. Figures are in many cases blurred and not legible; their quality must be improved.
>>> all figures have been redone in high-resolution tiff format
3. The Authors should pay attention to some typing mistakes and font differences (see, for example, pages 14 and 12).
>>> yes
4. TCGA collects data from at least 10 different studies on CRC; at least the cohort from which data have been collected should be reported in Materials and Methods.
>>> this has been identified and recorded in the manuscript under Methods.
5. TCGA collects data from 236 patients profiled with Human Methylation Bead Chip HM27, and 393 with HM450, measuring 27,000 and 480,000 CpG sites, respectively; why have only the HM27 data been used?
>>> This point has also been addressed in the Methods section. Essentially, 450k chips show enrichment in gene body and intergenic regions. A distribution of the CpG sites with respect to the genomic / genic location clearly indicates this (data not shown). 27k data are enriched in CpG sites in promoter regions. Please refer to the relevant section under Methods.
6.
The correlation analyses between methylation and gene expression data are providing interesting information but should be better described by focusing the attention not only on the methodological procedure used but also on the biological meaning of the observed results.
>>> This has now been rectified. Indeed, we now show plots for only the stage-salient genes, to make the biological connections and meaning clear. We note that all the results from our investigations are available in the Supplementary Files.
7. In the Conclusions the Authors write: “All the stage-salient genes were found to be hypermethylated, indicating a novel CIMP-like character possibly promoting epigenetic destabilisation, which in turn would drive the progression of colorectal cancer”. First, it is not clear where the hypermethylation associated with these stage-salient genes is located (promoter, TSS, CpG island or gene body); this should be better explained. Then, the role of the stage-salient genes identified by the Authors should be better characterized in the context of CRC to indicate a possible novel CIMP-like phenotype; I would suggest that the Authors enrich the Discussion by adding more details and experimental evidence of the involvement of these genes in CRC pathogenesis.
>>> We have now increased the literature weight for these statements and discussion. We have included a new Table 8 with the location of the DM probes, and a new Figure 19, to support these assertions. We further found a new publication citing stage-IV specificity for FAM123A while this manuscript was under review (a medRxiv preprint of our work was available in October 2020). This has been included in the References.
>>> Thank you.
Submitted filename: Response2Reviewers_FinalSubmission.pdf
12 Jan 2022
PONE-D-21-08604R3
Stage-differentiated ensemble modeling of DNA methylation landscapes uncovers salient biomarkers and prognostic signatures in colorectal cancer progression
PLOS ONE Dear Dr. Palaniappan, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised below by the reviewers during the review process. In addition, the authors are requested to provide high quality figures.
Please ensure that your decision is justified on PLOS ONE’s publication criteria and not, for example, on novelty or perceived impact. For Lab, Study and Registered Report Protocols: These article types are not expected to include results but may include pilot data. ============================== Please submit your revised manuscript by Feb 26 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Surinder K. Batra Academic Editor PLOS ONE Journal Requirements: Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. 
If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice. Additional Editor Comments (if provided): [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #3: All comments have been addressed Reviewer #4: (No Response) ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #3: Yes Reviewer #4: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #3: Yes Reviewer #4: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. 
participant privacy or use of data from a third party—those must be specified. Reviewer #3: Yes Reviewer #4: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #3: Yes Reviewer #4: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #3: This is an interesting and potentially useful study where a comprehensive computational framework for modelling of the stage-associated DNA methylation in colorectal cancer (CRC) was developed. The authors have addressed previous critiques. Reviewer #4: While the authors have addressed few of the comments from the previous revisions, a few minor points mainly about the cohort remain to be addressed in a greater detail- 1. As pointed out previously, there are various TCGA CRC the authors however do not clearly mention the details of the cohort. 2. It is briefly mentioned how the samples were from studies from Johns Hopkins or UCSC studies but it is unclear if these are two different studies from these centers (hence raising a question of batch effect) or if this was a combined effort. The authors should consider adding more information about the cohort. ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. 
If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #3: No Reviewer #4: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
21 Jan 2022 Response to Reviewers: >>> We would like to thank the Academic Editor and the Reviewers for helping improve our manuscript. All figures have individually passed the PACE digital diagnostic tool, https://pacev2.apexcovantage.com/ to ensure meeting PLOS requirements. References have been thoroughly checked once again. Reviewer #3: This is an interesting and potentially useful study where a comprehensive computational framework for modelling of the stage-associated DNA methylation in colorectal cancer (CRC) was developed. The authors have addressed previous critiques. >>> Thanks. Reviewer #4: While the authors have addressed few of the comments from the previous revisions, a few minor points mainly about the cohort remain to be addressed in a greater detail- 1. As pointed out previously, there are various TCGA CRC the authors however do not clearly mention the details of the cohort. 2. It is briefly mentioned how the samples were from studies from Johns Hopkins or UCSC studies but it is unclear if these are two different studies from these centers (hence raising a question of batch effect) or if this was a combined effort. The authors should consider adding more information about the cohort. >>> The comments are related, and we would like to thank the reviewer for raising this. The 27k methylation data were all shipped and processed by a single organization with code:05 JHU_USC center (Johns Hopkins – Univ. Southern California). This ensures homogeneity in data processing and submission to the TCGA. But the tissue source sites could be numerous (see below), so we carried out an analysis of batch-effects (https://bioinformatics.mdanderson.org/public-software/tcga-batch-effects/). This yielded a Dispersion Separability Criterion of 0.299, which is < 0.5 and much lower than the recommended threshold of 1.0 for batch-correction. At the minimum, DSC values need to be >0.5 to consider the possibility of batch effects existing in the data. 
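For readers unfamiliar with the metric, the Dispersion Separability Criterion (DSC) quoted in the response is a ratio of between-batch dispersion to within-batch dispersion, so values well below 1 mean the batch centroids sit close together relative to the spread of samples inside each batch. The following is a minimal Python sketch of that ratio and is not the MD Anderson tool's implementation. The function name, the weighting of each batch by its share of samples, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def dispersion_separability_criterion(values, batches):
    """Illustrative DSC = D_between / D_within (hypothetical helper).

    values  : samples x features array (e.g. M-values per probe)
    batches : one batch label per sample
    Assumes each batch is weighted by its fraction of all samples.
    """
    values = np.asarray(values, dtype=float)
    batches = np.asarray(batches)
    n_samples = values.shape[0]
    overall_mean = values.mean(axis=0)

    trace_between = 0.0  # trace of the between-batch scatter matrix
    trace_within = 0.0   # trace of the within-batch scatter matrix
    for label in np.unique(batches):
        block = values[batches == label]
        weight = block.shape[0] / n_samples
        centroid = block.mean(axis=0)
        # Squared distance of the batch centroid from the global mean.
        trace_between += weight * np.sum((centroid - overall_mean) ** 2)
        # Mean squared distance of the batch's samples from its centroid.
        trace_within += weight * np.mean(np.sum((block - centroid) ** 2, axis=1))

    return np.sqrt(trace_between) / np.sqrt(trace_within)

# Toy data: two well-separated batches of two samples each.
toy_values = [[0, 0], [2, 0], [0, 4], [2, 4]]
toy_batches = ["A", "A", "B", "B"]
print(dispersion_separability_criterion(toy_values, toy_batches))  # prints 2.0
```

In the toy data the batch centroids are far apart relative to the within-batch spread, giving a DSC of 2.0; a value such as the 0.299 reported by the authors corresponds to the opposite situation.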
Thus, samples within a batch are only about as similar to one another as the batches themselves are to each other. For instance, there were 29 tissue source sites for COAD alone:
1. Albert Einstein Medical Center
2. Mary Bird Perkins Cancer Center - Our Lady of the Lake
3. Duke University
4. University of Sao Paulo
5. Christiana Healthcare
6. Indivumed
7. International Genomics Consortium
8. Cureline
9. St. Joseph's Medical Center-(MD)
10. UNC
11. University of Pittsburgh
12. ILSBio
13. Harvard
14. MSKCC
15. Greater Poland Cancer Center
16. University Of Michigan
17. Asterand
18. Roswell Park
19. Candler
20. BLN - Baylor
21. University of Chicago
22. CHI-Penrose Colorado
23. Northwestern University
24. St. Joseph's Hospital AZ
25. Medical College of Georgia
26. Molecular Response
27. Institute of Human Virology Nigeria
28. University of Kansas
29. Wake Forest University
>>> We would like to thank the AE and Reviewers again for their comments.
Submitted filename: Response to Reviewers_R4.pdf
2 Feb 2022
Stage-differentiated ensemble modeling of DNA methylation landscapes uncovers salient biomarkers and prognostic signatures in colorectal cancer progression
PONE-D-21-08604R4
Dear Dr. Palaniappan,
We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date.
If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Surinder K. Batra Academic Editor PLOS ONE Additional Editor Comments (optional): Reviewers' comments: 4 Feb 2022 PONE-D-21-08604R4 Stage-differentiated ensemble modeling of DNA methylation landscapes uncovers salient biomarkers and prognostic signatures in colorectal cancer progression Dear Dr. Palaniappan: I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Prof. Surinder K. Batra Academic Editor PLOS ONE