Sung Won Han1, Ji Young Ahn1, Soobin Lee1, Young Seon Noh1, Hee Chan Jung2, Min Hyung Lee1, Hae Jun Park1, Hoon Jai Chun3, Seong Ji Choi3, Eun Sun Kim4, Ji-Yun Lee5.
Abstract
Colon cancer has been well studied using a variety of molecular techniques, including whole genome sequencing. However, genetic markers that could be used to predict lymph node (LN) involvement, which is the most important prognostic factor for colon cancer, have not been identified. In the present study, we compared LN(+) and LN(−) colon cancer patients using differential gene expression and network analysis. Colon cancer gene expression data were obtained from The Cancer Genome Atlas and divided into two groups, LN(+) and LN(−). Gene expression networks were constructed using LASSO (Least Absolute Shrinkage and Selection Operator) regression. We identified hub genes, such as APBB1, AHSA2, ZNF767, and JAK2, that were highly differentially expressed. Survival analysis using selected hub genes, such as AHSA2, CDK10, and CWC22, showed that their expression levels were significantly associated with the survival rate of colon cancer patients, which indicates their possible use as prognostic markers. In addition, protein-protein interaction network, GO enrichment, and KEGG pathway analyses were performed with selected hub genes from each group to investigate the regulatory relationships between hub genes and LN involvement in colon cancer; these analyses revealed differences between the LN(−) and LN(+) groups. Our network analysis may help narrow down the search for novel candidate genes for the treatment of colon cancer, in addition to improving our understanding of the biological processes underlying LN involvement. All R implementation code is available on the journal website as Supplementary Materials.
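The abstract says the networks were built with LASSO regression and that hub genes were then identified from those networks, but this record gives no implementation details. As a rough illustration only, the sketch below assumes a neighborhood-selection scheme: each gene is regressed on all other genes with an L1 penalty, every nonzero coefficient becomes an edge, and a gene's degree is its number of edges. It is written in Python rather than the authors' R; the gene names, the simulated data, the penalty `lam`, and all function names are hypothetical.

```python
import random

def standardize(values):
    """Center to mean 0 and scale to unit variance."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(columns, y, lam, n_iter=100):
    """Coordinate-descent LASSO for standardized predictor columns."""
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)
    for _ in range(n_iter):
        for k, col in enumerate(columns):
            # Correlation of predictor k with the partial residual.
            rho = sum(c * (r + c * beta[k]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam)
            if new != beta[k]:
                delta = new - beta[k]
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[k] = new
    return beta

def lasso_network(expr, lam):
    """Edge between two genes if either regression gives a nonzero weight."""
    genes = list(expr)
    z = {g: standardize(expr[g]) for g in genes}
    edges = set()
    for target in genes:
        others = [g for g in genes if g != target]
        beta = lasso_cd([z[g] for g in others], z[target], lam)
        for g, b in zip(others, beta):
            if b != 0.0:
                edges.add(frozenset((target, g)))
    return edges

def degree_centrality(edges):
    """Count the edges attached to each gene."""
    degree = {}
    for edge in edges:
        for gene in edge:
            degree[gene] = degree.get(gene, 0) + 1
    return degree

# Simulated data (hypothetical): gene_hub co-varies with gene_a, gene_b, gene_c.
random.seed(0)
n_samples = 200
a = [random.gauss(0, 1) for _ in range(n_samples)]
b = [random.gauss(0, 1) for _ in range(n_samples)]
c = [random.gauss(0, 1) for _ in range(n_samples)]
expr = {
    "gene_a": a,
    "gene_b": b,
    "gene_c": c,
    "gene_hub": [x + y + w + random.gauss(0, 0.5) for x, y, w in zip(a, b, c)],
    "gene_d": [random.gauss(0, 1) for _ in range(n_samples)],
}
edges = lasso_network(expr, lam=0.2)
degree = degree_centrality(edges)
print(sorted(degree.items(), key=lambda kv: -kv[1]))
```

In this toy setup the simulated hub gene should collect edges to the three genes it depends on. A real analysis would also need a principled choice of penalty (for example by cross-validation), which the sketch leaves out.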
Year: 2020 PMID: 32345988 PMCID: PMC7189385 DOI: 10.1038/s41598-020-63806-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Clinical characteristics of the LN(+) and LN(−) patient groups with colorectal cancer, collected from TCGA.
| COAD | | Lymph node positive | Lymph node negative | Total | LN(+) vs. LN(−) (p-value) |
|---|---|---|---|---|---|
| | | Value (%) | Value (%) | Value (%) | |
| | | 179 (45) | 216(55) | 395(100) | |
| Age | mean (SD) | 64.54 ± 13.4 | 68.29 ± 12.37 | 66.58 ± 13 | 0.005 |
| | median | 66 | 70 | 68 | 0.006 |
| Gender | FEMALE | 89(50) | 90(42) | 179(45) | 0.134 |
| | MALE | 90(50) | 126(58) | 216(55) | 0.134 |
| | NA | 0 | 0 | 0 | |
| Status | Alive | 123(69) | 185(86) | 308(78) | 8.83E-05 |
| | Dead | 56(31) | 31(14) | 87(22) | 8.83E-05 |
| | NA | 0 | 0 | 0 | |
| Race | WHITE | 87(74) | 93(76) | 180(75) | 0.014 |
| | BLACK OR AFRICAN AMERICAN | 29(25) | 19(16) | 48(20) | 0.014 |
| | ASIAN | 1(1) | 10(8) | 11(5) | 0.014 |
| | AMERICAN INDIAN OR ALASKA NATIVE | 1(1) | 0(0) | 1(0) | 0.014 |
| | NA | 61 | 94 | 155 | |
| Radiation Therapy | NO | 148(97) | 178(98) | 326(98) | 0.549 |
| | YES | 5(3) | 3(2) | 8(2) | 0.549 |
| | NA | 26 | 35 | 0 | |
| Stage | I | 0(0) | 66(31) | 66(17) | 3.30E-79 |
| | II | 0(0) | 142(66) | 142(36) | 3.30E-79 |
| | III | 125(70) | 0(0) | 125(32) | 3.30E-79 |
| | IV | 54(30) | 8(4) | 62(16) | 3.30E-79 |
| | NA | 0 | 0 | 0 | |
| T stage | t1 | 1(1) | 8(4) | 9(2) | 7.16E-16 |
| | t2 | 9(5) | 57(26) | 66(17) | 7.16E-16 |
| | t3 | 129(72) | 149(69) | 278(70) | 7.16E-16 |
| | t4 | 40(22) | 1(0) | 41(10) | 7.16E-16 |
| | tis | 0(0) | 0 | 1(0) | 7.16E-16 |
| | NA | 0 | 0 | 0 | |
| N stage | n0 | 0(0) | 216(100) | 216(55) | 1.69E-86 |
| | n1 | 101(56) | 0(0) | 101(26) | 1.69E-86 |
| | n2 | 78(44) | 0(0) | 78(20) | 1.69E-86 |
| | NA | 0 | 0 | 0 | |
| M stage | m0 | 102(57) | 207(96) | 309(78) | 2.50E-20 |
| | m1 | 54(30) | 8(4) | 62(16) | 2.50E-20 |
| | mx | 23(13) | 0(0) | 23(6) | 2.50E-20 |
| | NA | 0 | 1 | 0 | |
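This record does not say which statistical test produced the p-values in the table. As a sanity check on one row, the sketch below assumes a chi-square test with Yates' continuity correction on the 2×2 Gender counts (LN(+): 89 female, 90 male; LN(−): 90 female, 126 male). The function name is hypothetical and the choice of test is an assumption, not something stated in the source.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Yates-corrected chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability of chi-square equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    statistic = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Gender row of the table: rows are LN(+) and LN(-), columns are female and male.
statistic, p_value = chi2_yates_2x2(89, 90, 90, 126)
print(f"chi2 = {statistic:.3f}, p = {p_value:.3f}")
```

Under this assumption the p-value comes out close to the 0.134 reported for Gender, although that agreement alone does not establish which test the authors used.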
Figure 1. Representative hub genes with their edge genes, identified by degree centrality analysis of the LN(+) and LN(−) groups. (A) PCNP and (B) HEG1 as hub genes. Green fill: genes downregulated in the DEG analysis; red fill: genes upregulated in the DEG analysis; red font: genes common to both groups; edge width: coefficient power.
Figure 2. Representative hub-of-hubs genes with their edge genes, identified by degree centrality analysis of the LN(+) and LN(−) groups. (A) PCNP and (B) HEG1 as the hub-of-hubs genes. Green fill: genes downregulated in the DEG analysis; red fill: genes upregulated in the DEG analysis; red font: genes common to both groups; edge width: coefficient power.
Figure 3. Degree centrality analysis of the top 240 hub genes in the LN(+) group. (A) The 240 hub genes in the LN(+) group. (B) The same 240 hub genes in the LN(−) group. Each gene occupies the same position in (A) and (B). Green fill: genes downregulated in the DEG analysis; red fill: genes upregulated in the DEG analysis; edge width: coefficient power.
Figure 4. Venn diagram of genes shared between the set of 1918 DEGs (p < 0.005) and the hub genes (CV ≤ 20%) from each group. The numbers indicate gene counts; the genes themselves are listed in Table 2.
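The three gene groups reported in Table 2 correspond to regions of this Venn diagram. A minimal sketch of that set logic follows, using only a handful of gene names taken from Table 2 rather than the full DEG and hub gene lists; the function and variable names are hypothetical.

```python
def venn_regions(deg, hub_ln_pos, hub_ln_neg):
    """Split the DEGs by hub status in the LN(+) and LN(-) networks."""
    return {
        "DEG&LN(+)&LN(-)": deg & hub_ln_pos & hub_ln_neg,
        "DEG&LN(+)": (deg & hub_ln_pos) - hub_ln_neg,
        "DEG&LN(-)": (deg & hub_ln_neg) - hub_ln_pos,
    }

# Small illustrative subsets only; the real sets are far larger.
deg = {"APBB1", "AHSA2", "TMTC3", "FXYD6", "JAK2", "CDK10"}
hub_ln_pos = {"APBB1", "AHSA2", "TMTC3", "FXYD6"}
hub_ln_neg = {"APBB1", "AHSA2", "JAK2", "CDK10"}

for region, genes in venn_regions(deg, hub_ln_pos, hub_ln_neg).items():
    print(region, sorted(genes))
```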
Hub genes selected by comparing the DEG sets with the hub genes from each group.
| Group | Gene | Degree centrality, LN(+) (n = 179) | Degree centrality, LN(−) (n = 216) | Median of expression (log2), LN(+) | Median of expression (log2), LN(−) | Relative gene expression level [LN(+)/LN(−)] | p-value |
|---|---|---|---|---|---|---|---|
| DEG&LN(+)&LN(−) [26] | SLC22A17 | 40 | 28 | 6.456 | 5.991 | up | 0.0000 |
| | APBB1 | 37 | 43 | 7.009 | 6.599 | up | 0.0001 |
| | SLC7A14 | 28 | 31 | 0.952 | 0 | up | 0.0002 |
| | JAM3 | 26 | 36 | 7.998 | 7.722 | up | 0.0002 |
| | PRELP | 30 | 27 | 7.495 | 6.542 | up | 0.0004 |
| | LYSMD3 | 28 | 29 | 8.554 | 8.757 | down | 0.0004 |
| | RBPMS2 | 29 | 34 | 4.736 | 4.345 | up | 0.0008 |
| | LMOD1 | 32 | 30 | 8.205 | 7.599 | up | 0.0008 |
| | GEFT | 35 | 26 | 7.006 | 6.459 | up | 0.0010 |
| | SALL2 | 26 | 27 | 4.872 | 4.601 | up | 0.0010 |
| | TNS1 | 32 | 46 | 10.254 | 9.675 | up | 0.0010 |
| | EFEMP2 | 31 | 40 | 9.071 | 8.798 | up | 0.0014 |
| | SYDE1 | 27 | 34 | 7.844 | 7.579 | up | 0.0016 |
| | CLIP3 | 48 | 37 | 7.515 | 7.128 | up | 0.0016 |
| | MRVI1 | 30 | 35 | 8.567 | 8.244 | up | 0.0016 |
| | PKN2 | 26 | 36 | 9.935 | 10.147 | down | 0.0018 |
| | AHSA2 | 34 | 51 | 8.660 | 8.378 | up | 0.0022 |
| | AKAP11 | 28 | 28 | 10.494 | 10.250 | up | 0.0026 |
| | TIMP2 | 36 | 35 | 12.056 | 11.636 | up | 0.0026 |
| | CDK1 | 33 | 32 | 10.289 | 10.405 | down | 0.0029 |
| | ABCE1 | 28 | 34 | 10.849 | 10.973 | down | 0.0030 |
| | PKD1 | 27 | 27 | 10.563 | 10.364 | up | 0.0031 |
| | SGMS2 | 26 | 32 | 9.014 | 9.180 | down | 0.0033 |
| | MGP | 41 | 39 | 9.336 | 8.834 | up | 0.0034 |
| | HSPB8 | 30 | 28 | 6.676 | 6.234 | up | 0.0037 |
| | BOC | 41 | 53 | 6.997 | 6.573 | up | 0.0039 |
| DEG&LN(+) [ | TMTC3 | 28 | 24 | 8.432 | 8.568 | down | 0.0004 |
| | FXYD6 | 28 | 20 | 8.095 | 7.513 | up | 0.0004 |
| | PDZD4 | 33 | 10 | 4.063 | 3.538 | up | 0.0007 |
| | SLC35A3 | 26 | 19 | 9.293 | 9.592 | down | 0.0009 |
| | TMED7 | 28 | 25 | 11.003 | 11.198 | down | 0.0014 |
| | SCAF1 | 33 | 16 | 10.830 | 10.641 | up | 0.0017 |
| | TUB | 27 | 15 | 4.819 | 4.296 | up | 0.0019 |
| | MYH11 | 29 | 21 | 10.712 | 10.094 | up | 0.0023 |
| | C14orf132 | 31 | 20 | 6.466 | 5.961 | up | 0.0026 |
| | SPARCL1 | 28 | 21 | 10.075 | 9.712 | up | 0.0034 |
| | TRO | 31 | 15 | 5.470 | 5.182 | up | 0.0035 |
| DEG&LN(−) [27] | C12orf48 | 17 | 26 | 8.204 | 8.499 | down | 0.0000 |
| | C14orf129 | 15 | 30 | 10.160 | 10.653 | down | 0.0000 |
| | C18orf32 | 20 | 33 | 9.154 | 9.436 | down | 0.0002 |
| | PDLIM7 | 16 | 34 | 9.937 | 9.549 | up | 0.0004 |
| | COPS4 | 19 | 28 | 9.258 | 9.414 | down | 0.0004 |
| | ADAMTSL3 | 19 | 26 | 4.943 | 4.269 | up | 0.0005 |
| | FHL1 | 23 | 30 | 8.568 | 8.263 | up | 0.0005 |
| | GPRASP1 | 19 | 40 | 5.955 | 5.530 | up | 0.0006 |
| | HMCN1 | 20 | 29 | 6.655 | 6.041 | up | 0.0007 |
| | GBP4 | 14 | 26 | 8.569 | 9.076 | down | 0.0010 |
| | JAK2 | 18 | 26 | 7.856 | 8.159 | down | 0.0011 |
| | MXRA8 | 23 | 28 | 9.792 | 9.426 | up | 0.0012 |
| | SETD1A | 14 | 28 | 9.872 | 9.755 | up | 0.0012 |
| | RAB27B | 14 | 26 | 4.289 | 5.011 | down | 0.0013 |
| | TNRC6A | 10 | 26 | 9.662 | 9.499 | up | 0.0014 |
| | NUMA1 | 21 | 26 | 12.548 | 12.352 | up | 0.0014 |
| | MRPL50 | 11 | 28 | 9.049 | 9.150 | down | 0.0022 |
| | ZNF24 | 21 | 28 | 9.947 | 10.147 | down | 0.0026 |
| | LONRF2 | 22 | 27 | 2.224 | 1.616 | up | 0.0034 |
| | ZNF767 | 16 | 26 | 7.424 | 7.248 | up | 0.0036 |
| | ARFIP1 | 22 | 40 | 10.082 | 10.215 | down | 0.0037 |
| | USP33 | 10 | 26 | 9.917 | 10.092 | down | 0.0037 |
| | C5orf44 | 19 | 34 | 8.568 | 8.676 | down | 0.0042 |
| | ZNF720 | 16 | 26 | 7.938 | 7.768 | up | 0.0045 |
| | UBA3 | 13 | 39 | 9.826 | 9.953 | down | 0.0046 |
| | LDB2 | 19 | 26 | 6.801 | 6.555 | up | 0.0048 |
| | CDK10 | 14 | 33 | 10.293 | 10.166 | up | 0.0049 |
Figure 5. Representative Kaplan-Meier survival curves of selected hub genes: AHSA2, ZNF767, CDK10, and CWC22.
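For readers unfamiliar with how such curves are computed, the following is a minimal sketch of the Kaplan-Meier (product-limit) estimator. The follow-up times, event flags, and the split into "high" and "low" expression groups are invented for illustration and are not taken from the study; the function name is hypothetical. The record does not state how patients were grouped or which test compared the curves, so neither is modeled here.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if the patient died at that time, 0 if censored
    Returns [(time, survival_probability)] for each time with a death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up data (months) for two expression groups.
high_expression = kaplan_meier([5, 8, 12, 20, 30], [1, 1, 1, 0, 0])
low_expression = kaplan_meier([10, 15, 25, 32, 40], [0, 1, 0, 0, 0])
print("high:", high_expression)
print("low: ", low_expression)
```

Censored patients leave the risk set without lowering the curve, which is why the estimate only steps down at times where a death is recorded.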
Figure 6. Protein-protein interaction network among the hub genes from the LN(−) and LN(+) groups, restricted to interactions with a confidence score above 0.95, as analyzed by STRING. Balls represent proteins, and lines represent interactions between proteins. A red circle around a ball indicates a gene shared by both groups. Red arrows indicate upregulation; green arrows indicate downregulation.
Figure 7. (A) Top 10 enriched GO terms. (B) KEGG pathways involving more than 0.025% of the hub genes [searched using 353 hub genes from LN(−) and 240 hub genes from LN(+)]. *Indicates the proportion of genes: [number of hub genes involved in the pathway / total number of hub genes from LN(+) or LN(−)] × 100.
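The proportion defined in the caption is a simple percentage. A short worked example follows, using the hub gene totals given in the caption (240 for LN(+), 353 for LN(−)) and a hypothetical count of 12 pathway genes; the function name is also hypothetical.

```python
def hub_gene_proportion(n_in_pathway, n_total_hub_genes):
    """Percentage of a group's hub genes that belong to one pathway."""
    return 100 * n_in_pathway / n_total_hub_genes

TOTAL_HUB_GENES = {"LN(+)": 240, "LN(-)": 353}  # totals from the caption
HYPOTHETICAL_PATHWAY_COUNT = 12                 # invented for illustration

for group, total in TOTAL_HUB_GENES.items():
    share = hub_gene_proportion(HYPOTHETICAL_PATHWAY_COUNT, total)
    print(f"{group}: {share:.2f}% of hub genes in the pathway")
```

Because the two groups have different totals, the same gene count yields a different proportion in each group, so proportions rather than raw counts are the quantity to compare across LN(+) and LN(−).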