Changqin Liu1, Wei Wu1, Wenju Chang2, Ruijin Wu1, Xiaomin Sun1, Huili Wu3, Zhanju Liu1,3. 1. Department of Gastroenterology, The Shanghai Tenth People's Hospital of Tongji University, Shanghai 200072, P.R. China. 2. Department of General Surgery, Zhongshan Hospital of Fudan University, Shanghai 200032, P.R. China. 3. Department of Gastroenterology, Zhengzhou Central Hospital Affiliated to Zhengzhou University, Zhengzhou, Henan 450007, P.R. China.
Colorectal cancer (CRC) is one of the most frequently diagnosed malignancies and one of the leading causes of mortality worldwide (1). In 2018, there were >1.8 million new cases of CRC and 881,000 deaths worldwide, accounting for ~1 in 10 cancer cases and deaths (1). Overall, CRC ranked third in incidence and second in mortality (1). Currently, although the etiology and pathology are still not fully understood, it is generally considered that CRC is caused by multiple factors, such as environmental factors, lifestyle and genetic susceptibility (2). CRC may be caused by mutations that target oncogenes, tumor suppressor genes and genes related to DNA repair mechanisms (3). It can be classified as sporadic (70%), inherited (5%) or familial (25%) according to the origin of the mutation, and its molecular pathology is classified into three types: chromosomal instability, microsatellite instability (MSI) and CpG island methylator phenotype. In these types of CRC, common mutations, as well as chromosomal changes and translocations, have been reported to affect important pathways (such as MAPK/PI3K, WNT, TP53 and TGF-β signaling) (3). In addition to gene mutations, changes in long non-coding RNA or microRNA (miRNA/miR) are also found to be involved in different stages of carcinogenesis and may serve as predictive biomarkers (3).
The incidence of CRC has been rapidly rising in people <50 years old in the past 20 years (1,4). Moreover, early-onset colorectal cancer (EOCRC, <50 years old) differs from late-onset CRC (LOCRC, >50 years old) in numerous aspects, such as distinctive histological features, tumor location, stage at presentation and molecular profiles (5–7). Therefore, an improved understanding of the molecular mechanisms of EOCRC may help the development of precise screening and therapeutic strategies.
EOCRC can be divided into two distinct subtypes: the inherited subtype, which is a well-documented hereditary condition, and the sporadic subtype, which occurs without prior family history. Hereditary cases account for ~30% of EOCRC cases (8). The pathogenesis of the inherited subtype has been well characterized and is mainly related to Lynch syndrome (9). A previous study has reported that 16% (72/450) of patients with EOCRC have gene mutations and that Lynch syndrome germline mutations in mismatch repair (MMR) genes, including MLH1, MSH2, MSH2/monoallelic MUTYH, MSH6 and PMS2, account for nearly 50% of cases (37/72) (10). Moreover, another study using weighted gene co-expression network analysis has predicted that seven genes (SPARC, DCN, FBN1, WWTR1, TAGLN, DDX28 and CSDC2) play an important role in the pathogenesis of EOCRC (11). However, the molecular features of sporadic EOCRC (SEOCRC) are still undefined.
In the present study, the mRNA and miRNA profiles of SEOCRC and sporadic LOCRC (SLOCRC) were analyzed using next-generation sequencing (Illumina HiSeq) and bioinformatics. Differentially expressed mRNAs and miRNAs in SEOCRC and SLOCRC were identified and validated using reverse transcription-quantitative PCR (RT-qPCR). The expression of the DMD gene was further examined using immunohistochemistry, and its clinical relevance to prognosis was also evaluated.
Materials and methods
Patients and sample collection
Cohort 1
Between February and July 2019, 13 patients with primary CRC between 18 and 80 years old were recruited in the Shanghai Tenth People's Hospital of Tongji University (China), including 8 with SEOCRC (32–47 years, 4 males) and 5 with SLOCRC (60–72 years, 4 males). Tumor and pericarcinomatous tissues (5 cm away from visible tumor edges) were collected and stored at −80°C until RNA isolation. The pathological stage was defined according to the UICC/AJCC TNM classification system (https://www.uicc.org/resources/tnm). Details are shown in Table I.
Table I.
Histopathological characteristics of the patients with SEOCRC and SLOCRC.
| Patient | Age, years | Sex | Location of tumor | Dimensions, cm | TNM staging | UICC staging | Dukes' staging | MAC staging |
|---|---|---|---|---|---|---|---|---|
| Early 1 | 47 | Female | Sigmoid colon | 6×4.5×1.5 | T4aN0M0 | IIB | B | B2 |
| Early 2 | 33 | Male | Sigmoid colon | 5×4.5 | T3N2bM0 | IIIC | C | C2 |
| Early 3 | 43 | Male | Ascending colon | 4×2×2 | T1N0M0 | I | A | A |
| Early 4 | 32 | Female | Rectum | 3×2×1 | T1N1aM0 | IIIA | C | C1 |
| Early 5 | 37 | Female | Transverse colon | 6×4.5×1.1 | T4aN1aM0 | IIIB | C | C2 |
| Early 6 | 46 | Female | Ascending colon | 4×2 | T4aN1aM0 | IIIB | C | C2 |
| Early 7 | 46 | Male | Sigmoid colon | 11×10×8 | T3N0M0 | IIA | B | B2 |
| Early 8 | 42 | Male | Sigmoid colon | 2×2 | T3N2aM1a | IVA | - | - |
| Late 1 | 60 | Male | Rectum | 1.8×1.6×0.8 | T1N0M0 | I | A | A |
| Late 2 | 61 | Male | Sigmoid colon | 5×4×1 | T3N0M0 | IIA | B | B2 |
| Late 3 | 64 | Male | Transverse colon | 6.5×3.5 | T3N0M0 | IIA | B | B2 |
| Late 4 | 60 | Male | Rectum | 6×4.5×1 | T4aN0M0 | IIB | B | B2 |
| Late 5 | 72 | Female | Rectum | 6×3 | T3N0M0 | IIA | B | B2 |
Early: Tumor tissue from patients with SEOCRC. Late: Tumor tissue from patients with SLOCRC. The data were obtained from Cohort 1. SEOCRC, sporadic early-onset colorectal cancer; SLOCRC, sporadic late-onset colorectal cancer; UICC, Union for International Cancer Control; MAC, Modified Astler Coller.
Cohort 2
The present study also used publicly available mRNA and miRNA data from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/), comprising 74 tumor tissues (31–49 years; 33 males, 41 females) and 3 pericarcinomatous tissues (41–48 years; all females) from different patients with EOCRC, and 531 tumor tissues (50–90 years; 286 males, 245 females) and 8 pericarcinomatous tissues (54–90 years; 2 males, 6 females) from different patients with LOCRC.
Cohort 3
Between July and December 2019, paired tumor and paracancerous tissue samples were collected from 13 patients with SEOCRC (33–48 years; 8 males) and 11 patients with SLOCRC (53–79 years; 7 males) at the Shanghai Tenth People's Hospital of Tongji University. For each patient, one tissue section was stored at −80°C for RNA isolation and another tissue section was embedded in paraffin for immunohistochemistry.
Cohort 4
Surgical specimens of sporadic CRC tissues and adjacent normal tissues were obtained from patients with a diagnosis of primary SEOCRC who underwent surgery at the Shanghai Tenth People's Hospital of Tongji University between January 2011 and December 2015. None of the patients had received radiotherapy before surgical excision. A total of 80 tissue samples (30–48 years; 47 males) were immediately frozen in liquid nitrogen and stored at −80°C until further use.
The diagnosis of all patients was confirmed by colonoscopy and pathology. Inherited cases and patients who received radiotherapy or chemotherapy before surgery or colonoscopy were all excluded. Informed written consent was obtained from all patients, and the study was approved by the Ethics Committee of the Shanghai Tenth People's Hospital, Tongji University.
RNA isolation
RNA was isolated from tumor and pericarcinomatous tissues using TriReagent (Ambion Inc.). Agarose gel electrophoresis was performed to determine the extent of RNA degradation and contamination, and the purity of the RNA was also measured by Nanodrop (ND-1000). The concentration was precisely quantified using a Qubit3 (Thermo Fisher Scientific, Inc.), and the integrity was measured using an Agilent 2100 Bioanalyzer (Agilent Technologies, Inc.). Samples with a RIN value of 7 and above were used for further analysis.
RNA sequencing (RNA-seq)
The RNA-seq transcriptome library was prepared from 1 µg total RNA using the TruSeq RNA sample preparation kit (cat. no. RS-122-2001; Illumina, Inc.) according to the manufacturer's instructions. Libraries were size-selected for cDNA target fragments of 300 bp on 2% Low Range Ultra Agarose, followed by PCR amplification using Phusion DNA polymerase (New England Biolabs) for 15 PCR cycles. After quantification using a Qubit3, the paired-end RNA-seq library was sequenced using the HiSeq X Ten Reagent Kit v2.5 (cat. no. FC-501-2501; Illumina, Inc.) with the Illumina HiSeq X Ten (2×150 bp read length) system.
Small RNA sequencing
Small RNA sequencing libraries were created using 1 µg total RNA according to the TruSeq small RNA sample Preparation kit (cat. no. RS-200-0048; Illumina, Inc.). Reverse transcription was performed to generate cDNA libraries and PCR was used to amplify and add unique index sequences to each library. After quantification using a Qubit3, the small RNA sequencing library was sequenced using HiSeq X Ten Reagent kit v2.5 (cat. no. FC-501-2501; Illumina, Inc.) with the Illumina HiSeq X Ten system.
Identification of differentially expressed genes (DEGs)
The raw paired-end reads were trimmed and quality-controlled using SeqPrep (v1.3.2-4; http://github.com/jstjohn/SeqPrep) and Sickle (https://github.com/najoshi/sickle) with default parameters. Subsequently, clean reads were separately aligned to the reference genome (hg19) using HISAT2 (v2.1.0; http://ccb.jhu.edu/software/hisat2/index.shtml) software. The mapped reads of each sample were assembled using StringTie (v2.0.5; https://ccb.jhu.edu/software/stringtie/index.shtml) with a reference-based approach as described previously (12). Differential expression analysis was performed for the RNA-seq data using edgeR v3.26.8 in R v3.6.0 with false discovery rate (FDR) correction (13,14). Genes that met the conditions of |log2 fold-change| (|log2FC|) >2 (where FC is the fold change in expression) and P<0.01 were considered to be differentially expressed.
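The cut-offs described above can be illustrated with a minimal Python sketch. It is not the edgeR workflow used in the study; it only shows the final filtering step applied to a table of results. The sketch assumes the fold-change threshold applies to the absolute value of log2FC, since downregulated genes are reported among the DEGs. The `GeneResult` class, the function names and the `GENE_X` row are hypothetical; the DMD and MPPED2 values are those reported for the experimental group in Table III.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneResult:
    """One row of a differential expression result table (hypothetical layout)."""
    gene: str
    log2_fc: float
    p_value: float


def select_degs(results, min_abs_log2_fc=2.0, max_p=0.01):
    """Keep rows with |log2FC| above the cut-off and P below the cut-off."""
    return [
        r for r in results
        if abs(r.log2_fc) > min_abs_log2_fc and r.p_value < max_p
    ]


def direction(result):
    """Label a DEG by the sign of its log2 fold change."""
    return "Upregulated" if result.log2_fc > 0 else "Downregulated"


results = [
    GeneResult("DMD", -2.769013037, 1.37e-10),        # Table III, experimental group
    GeneResult("MPPED2", -3.151889779, 0.002553869),  # Table III, experimental group
    GeneResult("GENE_X", 1.2, 0.0004),                # hypothetical; fails the fold-change cut-off
]

degs = select_degs(results)
for r in degs:
    print(r.gene, direction(r))
```

In practice the input rows would come from the edgeR output table rather than being typed in by hand.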
Identification of differentially expressed miRNAs (DEMs)
FASTX-Toolkit (v0.0.13; http://hannonlab.cshl.edu/fastx_toolkit/) was used to trim the adapter sequences from the 3′ end of all small RNA sequencing reads. After adapter trimming, reads were aligned to the human genome build 19 (hg19) using BLAST 2.10.1 (http://blast.ncbi.nlm.nih.gov/). The number of reads mapping to each known miRNA in miRBase v22 was counted using miRDeep2 (https://drmirdeep.github.io/mirdeep2_tutorial.html). DEMs were identified using the edgeR package, with |log2 fold-change| (|log2FC|) >2 and P<0.01 as cut-offs.
Prediction of regulatory miRNAs of DEGs
According to the recognition mechanism of miRNAs and mRNAs, the DEM and DEG pairs were selected in SEOCRC by bioinformatics analysis using miRTarBase database 8.0 (http://mirtarbase.mbc.nctu.edu.tw/).
miRNA extraction and RT-qPCR
To determine miRNA levels, total RNA of colon tissue (cohort 3) was isolated with the miRcute miRNA Isolation Kit (Tiangen Biotech Co., Ltd.) according to the manufacturer's protocol. miRNA was reverse transcribed into cDNA using a miRcute miRNA First-Strand cDNA Synthesis Kit (Tiangen Biotech Co., Ltd.) at 37°C for 60 min. A miRcute miRNA qPCR Detection Kit (SYBR Green; Tiangen Biotech Co., Ltd.) was used for RT-qPCR analysis on an ABI 7500 fast real-time PCR system (Applied Biosystems) following the manufacturer's instructions. The cDNA (1 µl) was added to a 10-µl reaction system for amplification at 94°C for 2 min, followed by 42 cycles of 94°C for 20 sec and 60°C for 34 sec. All reactions were performed in triplicate. The specificity of the qPCR product was confirmed using melting curve analysis, and miRNAs with a Cq value >35 and a detection rate <75% in each group were excluded from further analysis. The relative expression of miRNA was normalized to that of the internal control U6. Relative expression was calculated using the 2^−ΔΔCq method (15). The sequences of the forward primers are shown in Table II. Universal reverse primers were obtained from Tiangen Biotech Co., Ltd.
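As an illustration of the 2^−ΔΔCq calculation, the following Python sketch computes a fold change from triplicate Cq values, with U6 as the reference and paracancerous tissue as the control condition. The function name and all Cq values are hypothetical and were chosen only to make the arithmetic easy to follow; they are not data from the study.

```python
from statistics import mean


def relative_expression(target_cq_sample, reference_cq_sample,
                        target_cq_control, reference_cq_control):
    """Fold change of a target in a sample versus a control (2^-ddCq).

    Each argument is a list of replicate Cq values; replicates are averaged
    before the reference Cq is subtracted.
    """
    delta_cq_sample = mean(target_cq_sample) - mean(reference_cq_sample)
    delta_cq_control = mean(target_cq_control) - mean(reference_cq_control)
    delta_delta_cq = delta_cq_sample - delta_cq_control
    return 2.0 ** (-delta_delta_cq)


# Hypothetical triplicate Cq values for one patient.
fold_change = relative_expression(
    target_cq_sample=[24.0, 24.5, 23.5],     # miRNA in tumor tissue
    reference_cq_sample=[18.0, 18.0, 18.0],  # U6 in tumor tissue
    target_cq_control=[26.0, 26.5, 25.5],    # miRNA in paracancerous tissue
    reference_cq_control=[18.0, 18.0, 18.0], # U6 in paracancerous tissue
)
print(fold_change)  # dCq = 6 (tumor) and 8 (control), so ddCq = -2 and the result is 4.0
```

A value above 1 indicates higher relative expression in the tumor sample than in the paired control; a value below 1 indicates lower expression.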
Table II.
Forward primers used for reverse transcription-quantitative PCR analysis of cohort 3.
| Gene name | Sequence (5′-3′) |
|---|---|
| hsa-mir-9-3p | ATAAAGCTAGATAACCGAAAGT |
| hsa-mir-10b-5p | TACCCTGTAGAACCGAATTTGTG |
| hsa-mir-31-3p | TGCTATGCCAACATATTGCCAT |
| hsa-mir-31-5p | AGGCAAGATGCTGGCATAGCT |
| hsa-mir-34b-3p | CAATCACTAACTCCACTGCCAT |
| hsa-mir-101-5p | CAGTTATCACAGTGCTGATGCT |
| hsa-mir-204-5p | TTCCCTTTGTCATCCTATGCCT |
| hsa-mir-206 | TGGAATGTAAGGAAGTGTGTGG |
| hsa-mir-592 | TTGTGTCAATATGCGATGATGT |
| U6 | CGCAAGGATGACACGCAAATTCGT |
A universal reverse primer was used (cat. no. FP401-02; Tiangen Biotech Co., Ltd.). hsa, Homo sapiens; mir, microRNA.
RNA extraction and RT-qPCR
RNA was extracted from tissue samples (cohorts 3 and 4) using the conventional TRIzol® (Invitrogen; Thermo Fisher Scientific, Inc.) method. Up to 1 µg total RNA was reverse transcribed into cDNA using the cDNA synthesis kit (Takara Bio, Inc.). The reaction conditions were 37°C for 15 min and 85°C for 5 sec. The following primer pairs were used: DMD forward, 5′-TGGGCAAACTGTATTCACTCAAAC-3′ and reverse, 5′-TTCCCTTGTGGTCACCGTAGT-3′; GAPDH forward, 5′-GGAGCGAGATCCCTCCAAAAT-3′ and reverse, 5′-GGCTGTTGTCATACTTCTCATGG-3′. qPCR assays were performed using SYBR Green qRT-PCR kits (Takara Bio, Inc.). For each sample, 10-µl reactions were set up containing 5 µl SYBR Premix, 0.2 µl ROX-2, 0.2 µl forward primer (10 µM), 0.2 µl reverse primer (10 µM), 1 µl cDNA and 3.4 µl ddH2O. All PCR reactions were performed in triplicate. The following cycling protocol was used: 95°C for 30 sec, followed by 40 cycles of 95°C for 5 sec and 60°C for 30 sec. The relative expression levels of the target gene were calculated using the 2^−ΔΔCq method (15).
Immunohistochemistry
Immunohistochemical (IHC) staining was performed on 4-µm sections of paraffin-embedded tissue samples to detect the expression levels of DMD protein from patients in Cohort 3. Paraffin-embedded tissue sections were mounted on glass slides and heated for 30 min at 55°C. Then they were dewaxed three times in xylene for 10 min each time, followed by rehydration: 100% ethanol twice for 5 min each time, 90% ethanol for 5 min, 70% ethanol for 5 min, ddH2O for 5 min. H2O2 solution (3%) was used to block endogenous peroxidase activity for 10 min at 37°C and phosphate-buffered saline (PBS; Thermo Fisher Scientific, Inc.) was used to wash the slides twice for 5 min each time. The sections were immersed in 0.01 mmol/l sodium citrate buffer solution (pH 6.0; Thermo Fisher Scientific, Inc.) and incubated at 100°C for 20 min and then they were rinsed twice with PBS (Thermo Fisher Scientific, Inc.) for 5 min each time. After incubation with 5% normal goat serum (Thermo Fisher Scientific, Inc.) for 20 min at 37°C, these sections were then incubated with anti-human DMD monoclonal antibody (Abcam; catalog no. ab15277; dilution 1:200) at 4°C overnight. After washing, the sections were incubated for 60 min with HRP-conjugated goat anti-Rabbit IgG (Abcam; catalog no. ab150077; dilution 1:400) at room temperature. The color reaction was developed with 3,3′-diaminobenzidine and the sections were counterstained with hematoxylin at room temperature for 30 sec followed by rinsing with distilled water for 30 min. Then the slides were dehydrated as follows: 70% ethanol dehydration for 3 min, 80% ethanol for 3 min, 95% ethanol for 3 min, 95% ethanol for 3 min, anhydrous ethanol for 3 min and xylene for 3 min. A light Leica microscope was used at ×100 and ×200 magnification (Leica Microsystems GmbH).
Statistical analysis
GraphPad Prism 5.0 software (GraphPad Software, Inc.) was used for statistical analysis. Depending on whether the data were normally distributed, the RT-qPCR results were analyzed using a paired t-test or Wilcoxon's signed-rank test. The association between the clinicopathological characteristics of the patients and DMD expression was analyzed using Fisher's exact test. The survival curves were plotted according to the Kaplan-Meier method, and the log-rank test was used for their statistical analysis. P<0.05 was considered to indicate a statistically significant difference.
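The Kaplan-Meier method mentioned above can be sketched in a few lines of Python. The study itself used GraphPad Prism; this is only an illustrative implementation of the product-limit estimator, and the function name, follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: True if the event (e.g. death) was observed at that time,
            False if the patient was censored
    Returns a list of (time, survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; False marks a censored patient.
months = [6, 12, 12, 20, 30]
died = [True, True, False, True, False]
for t, s in kaplan_meier(months, died):
    print(t, round(s, 3))
```

Censored patients lower the number at risk without reducing the survival estimate, which is why the curve only steps down at observed event times.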
Results
Expression profiles of DEGs in SEOCRC and SLOCRC
In the present study, 13 patients with CRC were enrolled and divided into SEOCRC (<50 years; n=8) and SLOCRC (≥50 years; n=5) groups (cohort 1). A total of 1,589 DEGs were identified between the tumor (n=8) and pericarcinomatous tissues (n=7; one sample was excluded owing to poor RNA quality) of patients with SEOCRC, including 913 upregulated genes and 676 downregulated genes (Fig. 1A). In SLOCRC, 1,383 DEGs were identified between tumor and pericarcinomatous tissues (n=5 each), including 481 upregulated genes and 902 downregulated genes (Fig. 1B). By comparing the DEGs between SEOCRC and SLOCRC, 837 DEGs were found only in SEOCRC and 631 DEGs only in SLOCRC (Fig. 1C and D).
Figure 1.
Identification of DEGs in SEOCRC and SLOCRC. (A) Heatmap of the top 25 DEGs in SEOCRC tissues compared with pericarcinomatous tissue. (B) Heatmap of the top 25 DEGs in SLOCRC tissues compared with pericarcinomatous tissue. (C) Heatmap of the top 25 DEGs in SEOCRC tissues compared with SLOCRC. (D) Venn diagram showing 837 DEGs unique to SEOCRC and 631 unique to SLOCRC of experimental group. The data were obtained from Cohort 1. (E) Venn diagram showing 125 DEGs unique to SEOCRC and 1,056 unique to SLOCRC of TCGA group. The data were obtained from Cohort 2. SEOCRC, sporadic early-onset colorectal cancer; SLOCRC, sporadic late-onset colorectal cancer; DEG, differentially expressed gene.
To confirm these results, TCGA datasets were analyzed. In the TCGA group (cohort 2), a differential analysis was performed based on mRNA profiling data of 74 EOCRC tumor tissues and 3 pericarcinomatous normal control tissues extracted from the TCGA data portal. In total, 655 DEGs were identified, including 150 upregulated genes and 505 downregulated genes. Similarly, 1,586 DEGs were identified between 531 cancer tissues and 8 pericarcinomatous tissue samples of LOCRC, including 712 upregulated genes and 874 downregulated genes. Of these 655 and 1,586 DEGs, 125 were specific to EOCRC and 1,056 were specific to LOCRC, respectively (Fig. 1E).
By combining the results of these two mRNA profiling studies, DMD and MPPED2 were identified (Table III) as the signature genes in the EOCRC group, consistent with a previous report showing MPPED2 as a hypermethylated biomarker of CRC (16). Taken together, these results indicated that the molecular mechanism of SEOCRC is different from that of SLOCRC and that DMD and MPPED2 may play a role in the onset of SEOCRC.
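The Venn-diagram logic behind these numbers amounts to set arithmetic, which the following Python sketch illustrates. The function name and the placeholder gene names (`GENE_A` to `GENE_E`) are hypothetical; only DMD and MPPED2 and the DEG counts come from the text. The sketch also checks that the reported totals and group-specific counts imply the same number of shared DEGs within each cohort.

```python
def split_specific(degs_early, degs_late):
    """Return (early-only, late-only, shared) DEG sets."""
    early, late = set(degs_early), set(degs_late)
    return early - late, late - early, early & late


# Counts reported in the text: total DEGs minus group-specific DEGs
# should give the same number of shared DEGs for both groups.
experimental_shared = (1589 - 837, 1383 - 631)
tcga_shared = (655 - 125, 1586 - 1056)
print(experimental_shared, tcga_shared)

# Toy gene lists; GENE_A to GENE_E are placeholders, not genes from the study.
exp_early_only, _, _ = split_specific(
    {"DMD", "MPPED2", "GENE_A", "GENE_B"}, {"GENE_B", "GENE_C"}
)
tcga_early_only, _, _ = split_specific(
    {"DMD", "MPPED2", "GENE_B", "GENE_D"}, {"GENE_B", "GENE_E"}
)

# Signature genes: early-onset-specific in both cohorts.
signature = sorted(exp_early_only & tcga_early_only)
print(signature)
```

Intersecting the two early-onset-specific sets is what narrows hundreds of candidates down to the genes supported by both cohorts.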
Table III.
DEGs specific to SEOCRC were shared by the experimental group and the TCGA group.
| Group | Gene | Log2 FC | P-value | FDR | Expression |
|---|---|---|---|---|---|
| Experimental group | DMD | −2.769013037 | 1.37×10^−10 | 1.43×10^−8 | Downregulated |
| Experimental group | MPPED2 | −3.151889779 | 0.002553869 | 0.009940859 | Downregulated |
| TCGA group | DMD | −2.188531129 | 0.000134702 | 0.003064467 | Downregulated |
| TCGA group | MPPED2 | −2.054712209 | 0.000520196 | 0.00879555 | Downregulated |
The data were obtained from Cohort 1. hsa, Homo sapiens; miR, microRNA; DEG, differentially expressed gene; DEM, differentially expressed microRNA; SEOCRC, sporadic early-onset colorectal cancer; FC, fold change; FDR, false discovery rate.
Expression profiles of DEMs in SEOCRC and SLOCRC
In the experimental group, 116 DEMs were identified between 8 tumor and 7 pericarcinomatous tissue samples of SEOCRC, including 68 upregulated and 48 downregulated miRNAs (Fig. 2A). A total of 99 DEMs were identified between 5 tumor tissues and 5 pericarcinomatous tissues of SLOCRC, including 25 upregulated and 74 downregulated miRNAs (Fig. 2B). Among these DEMs, 78 were specific to SEOCRC while 61 were specific to SLOCRC (Fig. 2C).
Figure 2.
Identification of DEMs in SEOCRC and SLOCRC. (A) Heatmap of the top 25 DEMs in SEOCRC tissue compared with pericarcinomatous tissue. (B) Heatmap of the top 25 DEMs in SLOCRC compared with pericarcinomatous tissue. (C) Venn diagram showing 78 DEMs unique to SEOCRC and 61 unique to SLOCRC of experimental group. The data were obtained from Cohort 1. (D) Venn diagram showing 22 DEMs unique to SEOCRC and 130 unique to SLOCRC in TCGA data. The data were obtained from Cohort 2. SEOCRC, sporadic early-onset colorectal cancer; SLOCRC, sporadic late-onset colorectal cancer; DEM, differentially expressed microRNA.
In the TCGA group (cohort 2), a differential analysis was also carried out based on miRNA profiling data of 74 tumor and 3 pericarcinomatous tissue samples from patients with EOCRC. In total, 217 DEMs were identified, including 137 upregulated miRNAs and 80 downregulated miRNAs. Similarly, 325 DEMs were identified between 531 cancer and 8 pericarcinomatous tissue samples from patients with LOCRC, including 198 upregulated miRNAs and 127 downregulated miRNAs. Among these DEMs, 22 were specific to EOCRC while 130 were specific to LOCRC (Fig. 2D).
By combining the results of these two miRNA profiling studies, miR-31-5p and miR-31-3p were identified as the distinctive miRNAs in the EOCRC group (Table IV); both have previously been reported to be associated with CRC (17–21). Taken together, these results further indicated that the molecular mechanism of SEOCRC is different from that of SLOCRC and that miR-31-5p and miR-31-3p may take part in the pathogenesis of SEOCRC.
Table IV.
DEMs specific to SEOCRC were shared by the experimental group and the TCGA group.
| Group | Gene | Log FC | P-value | FDR | Type |
|---|---|---|---|---|---|
| Experimental group | hsa-miR-31-5p | 3.43133174 | 0.001694065 | 0.018612422 | Upregulated |
| Experimental group | hsa-miR-31-3p | 6.375239183 | 0.001308524 | 0.017072145 | Upregulated |
| TCGA group | hsa-miR-31 | 5.590640822 | 0.003093682 | 0.009872114 | Upregulated |
The data were obtained from Cohort 1. hsa, Homo sapiens; miR, microRNA; DEG, differentially expressed gene; DEM, differentially expressed microRNA; SEOCRC, sporadic early-onset colorectal cancer; FC, fold change; FDR, false discovery rate.
Identification of key tumor-related genes and their regulatory miRNAs in SEOCRC
All SEOCRC-specific DEGs and DEMs in the experimental (cohort 1) and TCGA (cohort 2) groups were matched using the miRTarBase database. There were 10 matched DEM-DEG pairs in the experimental group, namely CDK4 with miR-34b-3p, DMD with miR-31-5p, DMD with miR-9-3p, TFAP2C with miR-10b-5p, NECTIN4 with miR-31-3p, IGFBP2 with miR-204-5p, SOX9 with miR-206, SOX9 with miR-101-5p, SOX9 with miR-592 and CSMD1 with miR-10b-5p. Moreover, there were 2 matched DEM-DEG pairs in the TCGA group, namely DMD with miR-31-5p and SOX4 with miR-31-5p. Interestingly, DMD was observed to be downregulated while miR-31-5p was upregulated in both the experimental and TCGA groups (Table V) (22–31). Therefore, the miR-31-5p-DMD pair was selected as a candidate biomarker for SEOCRC.
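The matching step can be sketched as a lookup of validated miRNA-target interactions against the two lists of differentially expressed molecules. The Python sketch below is an assumption about how such matching might be done and does not use the miRTarBase interface. The function name is hypothetical, and only the fold changes reported for the experimental group in Tables III and IV are included, so pairs whose values are not given in the text drop out of this example for that reason alone.

```python
def match_pairs(interactions, dem_log2fc, deg_log2fc):
    """Match validated miRNA-target interactions to DEMs and DEGs.

    interactions: iterable of (miRNA, target gene) pairs
    dem_log2fc:   {miRNA: log2 fold change} for the DEMs
    deg_log2fc:   {gene: log2 fold change} for the DEGs
    Returns (miRNA, gene, inverse) tuples, where inverse is True when the
    miRNA and its target change in opposite directions.
    """
    matched = []
    for mirna, gene in interactions:
        if mirna in dem_log2fc and gene in deg_log2fc:
            inverse = dem_log2fc[mirna] * deg_log2fc[gene] < 0
            matched.append((mirna, gene, inverse))
    return matched


# A few of the interaction pairs named in the text for the experimental group.
interactions = [
    ("hsa-miR-31-5p", "DMD"),
    ("hsa-miR-9-3p", "DMD"),
    ("hsa-miR-31-3p", "NECTIN4"),
]
# Fold changes reported for the experimental group in Tables III and IV.
dem_log2fc = {"hsa-miR-31-5p": 3.43133174, "hsa-miR-31-3p": 6.375239183}
deg_log2fc = {"DMD": -2.769013037}

pairs = match_pairs(interactions, dem_log2fc, deg_log2fc)
print(pairs)
```

The `inverse` flag captures the pattern highlighted in the text: an upregulated miRNA paired with a downregulated target gene.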
Table V.
Key DEGs and paired DEMs identified in SEOCRC.
A, TCGA group

| First author, year | miRTarBase ID | miRNA | Target gene | Target Entrez Gene ID | Experiments | Support type | (Refs.) |
|---|---|---|---|---|---|---|---|
| Cacchiarelli et al, 2011 | MIRT005456[a] | hsa-miR-31-5p | DMD | 1756 | Luciferase reporter assay, RT-qPCR, western blotting | Functional MTIs | (22) |
| Koumangoye et al, 2015 | MIRT733212 | hsa-miR-31-5p | SOX4 | 6659 | Chromatin immunoprecipitation, immunoprecipitation, RT-qPCR, western blotting | Functional MTIs | (23) |

B, Experimental group

| First author, year | miRTarBase ID | miRNA | Target gene | Target Entrez Gene ID | Experiments | Support type | (Refs.) |
|---|---|---|---|---|---|---|---|
| Suzuki et al, 2010 | MIRT003450 | hsa-miR-34b-3p | CDK4 | 1019 | Microarray, western blotting, RT-qPCR | Functional MTIs | (24) |
| Cacchiarelli et al, 2011 | MIRT005456[a] | hsa-miR-31-5p | DMD | 1756 | Luciferase reporter assay, RT-qPCR, western blotting | Functional MTIs | (22) |
| Gabriely et al, 2011 | MIRT006367 | hsa-miR-10b-5p | TFAP2C | 7022 | Luciferase reporter assay, western blotting | Functional MTIs | (25) |
| Geekiyanage et al, 2016 | MIRT731898 | hsa-miR-31-3p | NECTIN4 | 81607 | Luciferase reporter assay, western blotting | Functional MTIs | (26) |
| Chen et al, 2016 | MIRT732358 | hsa-miR-204-5p | IGFBP2 | 3485 | Western blotting, luciferase reporter assay, microarray, RT-qPCR | Functional MTIs | (27) |
| Sim et al, 2016 | MIRT733192 | hsa-miR-9-3p | DMD | 1756 | Luciferase reporter assay | Functional MTIs | (28) |
| Zhang et al, 2015 | MIRT733693 | hsa-miR-206 | SOX9 | 6662 | Luciferase reporter assay, western blotting | Functional MTIs | (29) |
| Liu et al, 2017 | MIRT734338 | hsa-miR-101-5p | SOX9 | 6662 | Luciferase reporter assay, RT-qPCR, western blotting | | |
[a]Indicates the DEM-DEG pairs that are shared between the experimental group and the TCGA dataset. The data were obtained from Cohort 1. hsa, Homo sapiens; miR, microRNA; DEG, differentially expressed gene; DEM, differentially expressed microRNA; SEOCRC, sporadic early-onset colorectal cancer; MTI, miRNA-target interaction; RT-qPCR, reverse transcription-quantitative PCR.
miR-31-5p acts as biomarker in patients with SEOCRC
To validate the expression of these nine miRNAs in patients with CRC, miRNA levels were determined using RT-qPCR in 13 tumor and 13 paracancerous tissue samples from patients with SEOCRC, and 11 tumor and 11 paracancerous tissue samples from patients with SLOCRC (Cohort 3). As shown in Fig. 3, the levels of miR-31-5p were significantly upregulated in tumor tissues compared with paracancerous tissues in patients with SEOCRC (P=0.020), whereas no significant difference was observed in the SLOCRC group (P=0.465; Fig. 3A). The level of miR-592 was significantly increased in tumor compared with paracancerous tissue samples in both the SEOCRC (P<0.001) and the SLOCRC group (P=0.003; Fig. 3B). No statistically significant difference was observed in the levels of miR-9-3p, miR-34b-3p and miR-101-5p between tumor and paracancerous tissue samples in either the SEOCRC group (P=0.376, P=0.787 and P=0.138, respectively) or the SLOCRC group (P=0.276, P=0.131 and P=0.765, respectively; Fig. 3C). No statistically significant difference was observed in the levels of miR-31-3p and miR-10b-5p between tumor and paracancerous tissue samples in the SEOCRC group (P=0.058 and P=0.132, respectively). In the SLOCRC group, however, the level of miR-31-3p was significantly increased (P=0.002) and the level of miR-10b-5p was significantly decreased (P=0.042) in tumor compared with paracancerous tissue samples (Fig. 3D). By contrast, the levels of miR-204-5p and miR-206 were significantly downregulated in tumor compared with paracancerous tissue samples in both the SEOCRC (P<0.001 and P=0.049, respectively) and the SLOCRC group (P=0.001 and P=0.031, respectively; Fig. 3E).
Figure 3.
Expression levels of nine candidate miRNAs in tumor and paracancerous tissue samples from patients with SEOCRC or SLOCRC. (A) miR-31-5p level was significantly increased in tumor (n=13) compared with paracancerous tissue (n=13) in patients with SEOCRC (P=0.020). There were no significant differences between the tumor (n=11) and paracancerous tissue (n=11) in the SLOCRC group. (B) miR-592 levels were significantly increased in tumor compared with paracancerous tissue samples in both the SEOCRC (P<0.001) and the SLOCRC group (P=0.003). (C) No statistically significant difference was observed in the levels of miR-9-3p, miR-34b-3p and miR-101-5p between tumor and paracancerous tissue samples in either the SEOCRC group or the SLOCRC group. (D) No statistically significant difference was observed in the levels of miR-31-3p and miR-10b-5p between tumor and paracancerous tissue samples in the SEOCRC group. The level of miR-31-3p was significantly increased in tumor compared with paracancerous tissue samples in the SLOCRC group (P=0.002), while the level of miR-10b-5p was significantly decreased in tumor compared with paracancerous tissue samples in the SLOCRC group (P=0.042). (E) miR-204-5p and miR-206 levels were significantly downregulated in tumor compared with paracancerous tissue samples in both the SEOCRC (P<0.001 and P=0.049, respectively) and the SLOCRC group (P=0.001 and P=0.031, respectively). The data were obtained from Cohort 3. *P<0.05, **P<0.01. SEOCRC, sporadic early-onset colorectal cancer; SLOCRC, sporadic late-onset colorectal cancer.
DMD is downregulated in patients with SEOCRC
In order to verify the expression levels of DMD in SEOCRC, RT-qPCR was performed in 13 tumor and 13 paracancerous tissue samples from patients with SEOCRC, as well as 11 tumor tissues and 11 paracancerous tissue samples from patients with SLOCRC (cohort 3). The results demonstrated that the expression of DMD was downregulated in tumor tissue compared with paracancerous tissue samples of SEOCRC (P=0.040; Fig. 4A). However, there was no significant difference in DMD gene expression between cancer and paracancerous tissue of patients with SLOCRC (P=0.896; Fig. 4B).
Figure 4.
Expression of DMD in SEOCRC and SLOCRC. (A) DMD expression was significantly downregulated in tumor (n=13) compared with pericarcinomatous tissue samples (n=13) from patients with SEOCRC (P=0.040). (B) No statistical difference in expression of DMD between tumor (n=11) and pericarcinomatous tissue (n=11) in SLOCRC (P=0.896). The data were obtained from Cohort 3. *P<0.05. SEOCRC, sporadic early-onset colorectal cancer; SLOCRC, sporadic late-onset colorectal cancer; DMD, dystrophin.
Consistent with the aforementioned results, the expression of DMD at the protein level was also assessed using IHC staining in 13 paired tumor tissues and paracancerous tissues of SEOCRC, and 11 paired tumor tissues and paracancerous tissues of SLOCRC, respectively. As shown in Fig. 5, DMD protein expression was markedly decreased in tumor tissues compared with that in paired paracancerous tissue samples from patients with SEOCRC. However, there was no difference between tumors and paired paracancerous tissues of SLOCRC with respect to DMD expression. Collectively, these results indicate that a decrease of DMD may be associated with the development of SEOCRC.
Figure 5.
Protein expression of DMD in SEOCRC and SLOCRC. (A) In situ expression of DMD was observed in paracancerous epithelia but was faint in colorectal cancer cells in SEOCRC by immunostaining. (B) In situ expression of DMD was observed in paracancerous epithelia and colorectal cancer cells in SLOCRC by immunostaining. The data were obtained from Cohort 3. SEOCRC, sporadic early-onset colorectal cancer; SLOCRC, sporadic late-onset colorectal cancer.
Correlation of DMD expression with clinicopathological features of SEOCRC
To evaluate the association between DMD expression and clinicopathological variables, 80 patients with SEOCRC (cohort 4) were divided into a high-expression and a low-expression group (n=40 in each group) according to the median value of DMD expression. As shown in Table VI, low expression of DMD was significantly associated with advanced pathological stage and increased incidence of lymph node metastasis (P=0.007 and P=0.008, respectively). However, no significant associations between DMD expression and other patient characteristics were observed. Moreover, in a Kaplan-Meier survival analysis, patients with low DMD expression had a significantly poorer prognosis than those with high DMD expression in terms of overall survival (P=0.011; Fig. 6A), cancer-specific survival (P=0.009; Fig. 6B) and recurrence-free survival (P=0.014; Fig. 6C).
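The median-based grouping can be illustrated with a short Python sketch. The patient identifiers and expression values are hypothetical, and the handling of values exactly equal to the median (assigned here to the low-expression group) is an assumption, since the text does not specify it.

```python
from statistics import median


def split_by_median(expression):
    """Split patients into low and high groups at the median expression.

    expression: {patient ID: relative expression value}
    Values equal to the median are placed in the low group (an assumption).
    """
    cutoff = median(expression.values())
    low = sorted(p for p, v in expression.items() if v <= cutoff)
    high = sorted(p for p, v in expression.items() if v > cutoff)
    return cutoff, low, high


# Hypothetical relative DMD expression values for six patients.
dmd_expression = {"P1": 0.2, "P2": 0.9, "P3": 0.5, "P4": 1.4, "P5": 0.7, "P6": 1.1}

cutoff, low_group, high_group = split_by_median(dmd_expression)
print(low_group, high_group)
```

With an even number of patients and no tied values, this produces two groups of equal size, as in the 40/40 split described above.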
Table VI.
Association between DMD expression in sporadic colorectal cancer tissue with different clinicopathological features
| Clinicopathological characteristic | Low DMD expression (n=40), n (%) | High DMD expression (n=40), n (%) | P-value |
|---|---|---|---|
| Sex | | | 0.259 |
| Male | 26 (65.0) | 21 (52.5) | |
| Female | 14 (35.0) | 19 (47.5) | |
| Tumor size, cm | | | 0.182 |
| <5 | 18 (45.0) | 24 (60.0) | |
| ≥5 | 22 (55.0) | 16 (40.0) | |
| Histological grade | | | 0.052 |
| Good or moderate | 24 (60.0) | 32 (80.0) | |
| Poor | 16 (40.0) | 8 (20.0) | |
| TNM stage | | | 0.007[a] |
| II | 17 (42.5) | 29 (72.5) | |
| III | 23 (57.5) | 11 (27.5) | |
| Lymph node metastasis | | | 0.008[a] |
| Yes | 25 (62.5) | 13 (32.5) | |
| No | 15 (37.5) | 27 (67.5) | |
[a]P<0.01. The data were obtained from Cohort 4. DMD, dystrophin.
Figure 6.
Kaplan-Meier survival curves of patients with sporadic early-onset colorectal cancer stratified according to DMD expression. (A) Patients with low expression had significantly poorer overall survival than those with high expression (P=0.011). (B) Patients with low expression had significantly poorer cancer specific survival than those with high expression (P=0.009). (C) Patients with low expression had significantly poorer recurrence free survival than those with high expression (P=0.014). The data were obtained from Cohort 4. DMD, dystrophin.
Discussion
The incidence of SEOCRC is currently increasing worldwide. Although its pathogenesis has been studied intensively, it remains unclear. The origin of the disease may be attributed to the presence of a large number of common, low-penetrance genetic variants, each exerting a small influence on risk (9). Accumulating evidence has also shown that 80% of sporadic EOCRCs tend to be microsatellite-stable and do not feature the CpG island methylator phenotype (32). In a study involving 18,218 clinical specimens, alterations of TP53 and CTNNB1 were found to be more common in younger patients (<40 years) in the microsatellite-stable group, while APC, KRAS, BRAF and FAM123B were more frequently altered in older patients (≥50 years) with CRC. In the MSI-high cohort, the majority of genes had similar alteration rates across all age groups, although significant differences were observed for APC, BRAF and KRAS (33). However, the younger group in that study included both inherited and sporadic CRC. Additionally, another study identified ten candidate heterozygous variants (BMPR1A, BRIP1, SRC, CLSPN, SEC24B, SSH2, ACACA, NR2C2, INPP4A and DIDO1) and five possibly biallelic autosomal recessive candidate genes (ATP10B, PKHD1, UGGT2, MYH13 and TFF3) through exome sequencing in 51 early-onset non-familial CRC cases (34).
In the present study, the roles of key genes and their regulatory miRNAs in the development of SEOCRC were examined by NGS and bioinformatics. Clinical samples (cohort 1) and TCGA datasets (cohort 2) were examined, and the miR-31-5p-DMD axis was shown to be altered in SEOCRC in both cohorts. The expression of miR-31-5p was upregulated whereas that of DMD was downregulated in SEOCRC, which was further verified by qPCR and IHC. Therefore, the miR-31-5p-DMD axis may serve as a novel potential biomarker in the pathogenesis of SEOCRC.
miR-31-5p has been proposed as a novel biomarker for the diagnosis and treatment of many types of cancer, including oral cancer, renal cell carcinoma, CRC, nasopharyngeal carcinoma and hepatocellular carcinoma (35–39). miR-31 plays an intricate role in human cancer, functioning as both an onco-miR and a tumor suppressor miR (19). Moreover, it can influence drug sensitivity and the efficacy of chemotherapy in colorectal cancer and hepatocellular carcinoma cells (18,39). It has also been considered a target of long non-coding or circular RNA in cardiomyocyte hypertrophy and pre-eclampsia (40,41), and a shared regulator of chronic mucus hypersecretion in asthma and chronic obstructive pulmonary disease (42). In the present study, miR-31 was upregulated in SEOCRC and DMD was downregulated.
The DMD gene encodes the dystrophin protein, which forms a component of the dystrophin-glycoprotein complex (DGC) bridging the inner cytoskeleton and the extracellular matrix. Deletion, duplication and point mutation of the DMD gene may cause Duchenne muscular dystrophy, Becker muscular dystrophy (BMD) or cardiomyopathy (43–45). Altered DMD expression is also linked to the onset and progression of cancer, including both myogenic and non-myogenic tumors (46–49), and DMD is considered a new regulatory factor in tumor development and a new prognostic factor for tumor progression and survival. However, the molecular mechanism underlying DMD dysregulation in cancer remains unclear, and the relationship between the DMD gene and CRC has not previously been reported. In the present study, DMD was found to be downregulated in patients with SEOCRC, and its expression was associated with tumor stage, lymph node metastasis and patient survival.
In summary, the present findings reveal a reduction of DMD and an increase of miR-31-5p in SEOCRC, suggesting that the miR-31-5p-DMD axis may contribute to the occurrence of SEOCRC and may serve as a new biomarker in the diagnosis and treatment of SEOCRC.