Qiang Wang, Jianchang Wei, Zhuanpeng Chen, Tong Zhang, Junbin Zhong, Bingzheng Zhong, Ping Yang, Wanglin Li, Jie Cao.
Abstract
The current study aimed to develop multiple diagnosis models for colorectal cancer (CRC), based on data from The Cancer Genome Atlas database analyzed with artificial neural networks, in order to enhance CRC diagnosis methods. A genetic algorithm and mean impact value were used to select genes as numerically encoded parameters reflecting cancer metastasis or aggression. Back propagation and learning vector quantization neural networks were used to build four diagnosis models: Cancer/Normal, M0/M1, carcinoembryonic antigen (CEA) <5/≥5 and Clinical stage I-II/III-IV. The performance of each model was evaluated by predictive accuracy (ACC), the area under the receiver operating characteristic curve (AUC) and a 10-fold cross-validation test. The ACC and AUC of the Cancer/Normal, M0/M1, CEA and Clinical stage models were 100%, 1.000; 87.14%, 0.670; 100%, 1.000; and 100%, 1.000, respectively. In the 10-fold cross-validation test, the ranges of ACC and sensitivity for the four models were 93.75-99.39%, 1.0000; 80.58-88.24%, 0.9286-1.0000; 67.21-92.31%, 0.7091-1.0000; and 59.13-68.85%, 0.6017-0.6585, respectively. The diagnosis models developed in the current study combined gene expression profiling data and artificial intelligence algorithms to create tools for improved diagnosis of CRC.
Keywords: artificial neural networks; colorectal cancer; diagnosis model
Year: 2019 PMID: 30867765 PMCID: PMC6396131 DOI: 10.3892/ol.2019.10010
Source DB: PubMed Journal: Oncol Lett ISSN: 1792-1074 Impact factor: 2.967
Datasets used in the four diagnosis models.
| Datasets | Cancer/normal, n | M0/M1, n | CEA <5/≥5, n | Clinical stage I–II/III–IV, n |
|---|---|---|---|---|
| TCGA_colorectal cancer | 287/41 | 189/39 | 79/43 | 155/101 |
TCGA, The Cancer Genome Atlas; M0, without distant metastasis; M1, with distant metastasis; CEA, carcinoembryonic antigen.
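The class counts above are unbalanced, so it helps to know what accuracy a trivial classifier would reach by always predicting the larger class. The following is a minimal sketch of that arithmetic using only the counts in the table; the function name `majority_baseline` is introduced here for illustration and is not part of the study.

```python
# Class counts copied from the dataset table (first class / second class).
counts = {
    "Cancer/normal": (287, 41),
    "M0/M1": (189, 39),
    "CEA <5/≥5": (79, 43),
    "Clinical stage I–II/III–IV": (155, 101),
}


def majority_baseline(n_a: int, n_b: int) -> float:
    """Accuracy of a classifier that always predicts the larger class."""
    return max(n_a, n_b) / (n_a + n_b)


baselines = {name: majority_baseline(*pair) for name, pair in counts.items()}

for name, acc in baselines.items():
    print(f"{name}: {acc:.2%}")
```

Under these counts the baseline is 87.50% for Cancer/normal, 82.89% for M0/M1, 64.75% for CEA and 60.55% for Clinical stage, which gives a reference point for reading the reported ACC values.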
Number of samples used in training sets and test sets.
| Use | Total sample, n | Cancer/normal, n | M0/M1, n | CEA <5/≥5, n | Clinical stage I–II/III–IV, n |
|---|---|---|---|---|---|
| Training set | 246 | 215/31 | – | – | – |
| Test set | 82 | 72/10 | – | – | – |
| Training set | 171 | – | 140/31 | – | – |
| Test set | 57 | – | 49/8 | – | – |
| Training set | 92 | – | – | 60/32 | – |
| Test set | 30 | – | – | 19/11 | – |
| Training set | 208 | – | – | – | 118/90 |
| Test set | 69 | – | – | – | 37/32 |
M0, without distant metastasis; M1, with distant metastasis; CEA, carcinoembryonic antigen.
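The table does not state how samples were assigned to the training and test sets. As an illustration only, the sketch below assumes a stratified 75/25 split, which reproduces the Cancer/normal counts (215/31 and 72/10); the counts for the other models differ from a strict stratified split, so the procedure actually used may have been different. The function `stratified_split` and its parameters are hypothetical.

```python
import random


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), splitting each class at roughly test_fraction."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Cancer/normal counts from the dataset table: 287 cancer, 41 normal.
labels = ["cancer"] * 287 + ["normal"] * 41
train_idx, test_idx = stratified_split(labels)

n_test_cancer = sum(labels[i] == "cancer" for i in test_idx)
n_test_normal = len(test_idx) - n_test_cancer
print(len(train_idx), len(test_idx), n_test_cancer, n_test_normal)
```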
Figure 1. Flow chart for building colorectal cancer diagnosis models. TCGA, The Cancer Genome Atlas; GA, genetic algorithm; MIV, mean impact value; BP, back propagation; LVQ, learning vector quantization; M0, without distant metastasis; M1, with distant metastasis; CEA, carcinoembryonic antigen.
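For readers unfamiliar with learning vector quantization, the following is a minimal sketch of the basic LVQ1 update rule on toy two-dimensional data. It is not the network configuration used in the study, which this record does not describe; the prototype initialisation, learning rate, epoch count and data are all assumptions chosen for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(x, prototypes):
    """Index of the prototype closest to x."""
    return min(range(len(prototypes)), key=lambda i: squared_distance(x, prototypes[i]))


def lvq1_train(samples, labels, prototypes, proto_labels, lr=0.1, epochs=20):
    """LVQ1: pull the winning prototype toward a same-class sample, push it away otherwise."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            k = nearest(x, protos)
            sign = 1.0 if proto_labels[k] == y else -1.0
            protos[k] = [w + sign * lr * (xi - w) for xi, w in zip(x, protos[k])]
    return protos


def lvq1_predict(x, prototypes, proto_labels):
    return proto_labels[nearest(x, prototypes)]


# Toy data: class 0 near the origin, class 1 near (5, 5).
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 4.0], [4.0, 5.0]]
labels = [0, 0, 0, 1, 1, 1]
proto_labels = [0, 1]
protos = lvq1_train(samples, labels, [[1.0, 1.0], [4.0, 4.0]], proto_labels)
predictions = [lvq1_predict(x, protos, proto_labels) for x in samples]
```

In a gene expression setting each sample would be a vector of expression values for the selected genes, and one or more prototypes would be kept per class.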
Diagnostic genes used in diagnosis models.
| Diagnosis model | Gene symbol | mRNA description | Ratio | Regulation |
|---|---|---|---|---|
| Cancer/normal | MT1M | Metallothionein 1M | 57.35 | Down |
| Cancer/normal | ATP1A2 | ATPase Na+/K+ Transporting Subunit Alpha 2 | 45.00 | Down |
| Cancer/normal | ALPI | Alkaline Phosphatase, Intestinal | 43.49 | Down |
| Cancer/normal | LOC646627 | Uncharacterized LOC646627 | 43.23 | Down |
| Cancer/normal | TMEM72 | Transmembrane Protein 72 | 34.31 | Down |
| Cancer/normal | CPNE7 | Copine 7 | 33.87 | Up |
| M0/M1 | ALPPL2 | Alkaline Phosphatase, Placental Like 2 | 2.08 | Down |
| M0/M1 | ALPP | Alkaline Phosphatase, Placental | 2.35 | Down |
| M0/M1 | CACNG4 | Calcium Voltage-Gated Channel Auxiliary Subunit Gamma 4 | 2.60 | Down |
| M0/M1 | CAMK2B | Calcium/Calmodulin Dependent Protein Kinase II Beta | 2.30 | Down |
| M0/M1 | DLX3 | Distal-Less Homeobox 3 | 2.87 | Down |
| M0/M1 | FREM2 | FRAS1 Related Extracellular Matrix Protein 2 | 2.52 | Down |
| M0/M1 | GPR81 | Hydroxycarboxylic Acid Receptor 1 | 2.14 | Down |
| M0/M1 | HEPHL1 | Hephaestin Like 1 | 2.25 | Down |
| M0/M1 | KRT6A | Keratin 6A | 2.24 | Down |
| M0/M1 | LOC100133545 | MRPL23 antisense RNA 1 | 2.62 | Down |
| M0/M1 | LOC440173 | Uncharacterized LOC440173 | 2.03 | Down |
| M0/M1 | MAP7D2 | MAP7 Domain Containing 2 | 2.21 | Down |
| M0/M1 | MSLN | Mesothelin | 2.05 | Down |
| M0/M1 | PSCA | Prostate Stem Cell Antigen | 2.19 | Down |
| M0/M1 | SCEL | Sciellin | 2.39 | Down |
| M0/M1 | SLC14A1 | Solute Carrier Family 14 Member 1 (Kidd Blood Group) | 3.47 | Down |
| M0/M1 | SLC15A1 | Solute Carrier Family 15 Member 1 | 2.08 | Down |
| CEA <5/≥5 | ADH6 | Alcohol Dehydrogenase 6 (Class V) | 2.17 | Down |
| CEA <5/≥5 | AHSG | Alpha-2-HS-Glycoprotein | 2.05 | Down |
| CEA <5/≥5 | CCL25 | C-C Motif Chemokine Ligand 25 | 3.34 | Down |
| CEA <5/≥5 | CPLX2 | Complexin 2 | 2.36 | Down |
| CEA <5/≥5 | DEFA5 | Defensin Alpha 5 | 4.40 | Down |
| CEA <5/≥5 | DKK4 | Dickkopf WNT Signaling Pathway Inhibitor 4 | 2.25 | Down |
| CEA <5/≥5 | ELF5 | E74 Like ETS Transcription Factor 5 | 2.39 | Up |
| CEA <5/≥5 | EMX1 | Empty Spiracles Homeobox 1 | 3.04 | Down |
| CEA <5/≥5 | FABP4 | Fatty Acid Binding Protein 4 | 2.19 | Up |
| CEA <5/≥5 | GNG4 | G Protein Subunit Gamma 4 | 2.58 | Up |
| CEA <5/≥5 | IGFL2 | IGF Like Family Member 2 | 2.08 | Down |
| CEA <5/≥5 | NOS2 | Nitric Oxide Synthase 2 | 2.31 | Down |
| CEA <5/≥5 | SVOPL | SVOP Like | 2.07 | Up |
| CEA <5/≥5 | TNFRSF6B | Tumor Necrosis Factor Receptor Superfamily 6b | 2.11 | Down |
| Clinical stage I–II/III–IV | LY6G6D | Lymphocyte Antigen 6 Complex, Locus G6D | 2.01 | Down |
| Clinical stage I–II/III–IV | PALM3 | Paralemmin 3 | 2.23 | Down |
| Clinical stage I–II/III–IV | PRKAA2 | Protein Kinase AMP-Activated Catalytic Subunit Alpha 2 | 2.14 | Down |
M0, without distant metastasis; M1, with distant metastasis; CEA, carcinoembryonic antigen.
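The abstract names mean impact value (MIV) as one of the gene selection criteria. A common formulation of MIV scales each input up and down by a fixed fraction and averages the resulting change in model output; the sketch below follows that formulation. The 10% perturbation, the function names and the toy linear model standing in for a trained network are assumptions, not details taken from the study.

```python
def mean_impact_values(predict, samples, delta=0.10):
    """MIV per feature: mean change in output when the feature is scaled by (1 ± delta)."""
    n_features = len(samples[0])
    mivs = []
    for j in range(n_features):
        diffs = []
        for row in samples:
            up = list(row)
            up[j] = row[j] * (1 + delta)
            down = list(row)
            down[j] = row[j] * (1 - delta)
            diffs.append(predict(up) - predict(down))
        mivs.append(sum(diffs) / len(diffs))
    return mivs


def toy_model(x):
    """Hypothetical stand-in for a trained network: a fixed linear score."""
    return 2.0 * x[0] - 0.5 * x[1] + 0.0 * x[2]


samples = [[1.0, 2.0, 3.0], [2.0, 1.0, 0.5], [3.0, 3.0, 1.0]]
mivs = mean_impact_values(toy_model, samples)

# Rank features by the magnitude of their impact, largest first.
ranking = sorted(range(len(mivs)), key=lambda j: abs(mivs[j]), reverse=True)
print(mivs, ranking)
```

The sign of each MIV indicates the direction of the effect on the output and the absolute value its size, so features with the largest absolute MIV would be retained.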
Diagnosis model testing results.
| Diagnosis model | Use | ACC, % | AUC |
|---|---|---|---|
| Cancer/Normal | Training set | 100.00 | 1.0000 |
| Cancer/Normal | Test set | 100.00 | 1.0000 |
| M0/M1 | Training set | 87.14 | 0.6700 |
| M0/M1 | Test set | 92.98 | 0.8550 |
| CEA <5/≥5 | Training set | 100.00 | 1.0000 |
| CEA <5/≥5 | Test set | 80.00 | 0.8708 |
| Clinical stage I–II/III–IV | Training set | 100.00 | 1.0000 |
| Clinical stage I–II/III–IV | Test set | 65.22 | 0.6419 |
M0, without distant metastasis; M1, with distant metastasis; CEA, carcinoembryonic antigen; ACC, accuracy; AUC, area under curve.
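ACC and AUC in the table measure different things: ACC depends on a single decision threshold, whereas AUC summarises ranking quality over all thresholds. The sketch below computes both on toy scores, using the pairwise (Mann-Whitney) definition of AUC. The data and the 0.5 threshold are illustrative assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"ACC = {acc:.4f}, AUC = {auc:.4f}")
```

Because the two metrics answer different questions, a model can show a fairly high ACC alongside a modest AUC, particularly when one class dominates the sample.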
Figure 2. Training set and test set receiver operating characteristic curves for the four colorectal cancer diagnosis models. (A) Cancer/normal. (B) M0/M1 (without distant metastasis/with distant metastasis). (C) Carcinoembryonic antigen <5/≥5. (D) Clinical stage I–II/III–IV. AUC, area under curve.
10-fold cross validation of diagnosis model TCGA training sets.
| 10-fold cross | Cancer/normal ACC | Cancer/normal Sen | M0/M1 ACC | M0/M1 Sen | CEA <5/≥5 ACC | CEA <5/≥5 Sen | Clinical stage I–II/III–IV ACC | Clinical stage I–II/III–IV Sen |
|---|---|---|---|---|---|---|---|---|
| 10-1 | 0.9375 | 1.0000 | 0.8824 | 1.0000 | 0.9231 | 1.0000 | 0.6000 | 0.6364 |
| 10-2 | 0.9692 | 1.0000 | 0.8824 | 1.0000 | 0.8000 | 0.9375 | 0.6829 | 0.6087 |
| 10-3 | 0.9796 | 1.0000 | 0.8431 | 0.9762 | 0.7568 | 0.8750 | 0.6885 | 0.6176 |
| 10-4 | 0.9847 | 1.0000 | 0.8088 | 0.9496 | 0.7551 | 0.8438 | 0.6707 | 0.6304 |
| 10-5 | 0.9878 | 1.0000 | 0.8118 | 0.9429 | 0.7377 | 0.7750 | 0.6602 | 0.6552 |
| 10-6 | 0.9898 | 1.0000 | 0.8058 | 0.9286 | 0.6986 | 0.7234 | 0.6452 | 0.6429 |
| 10-7 | 0.9913 | 1.0000 | 0.8083 | 0.9388 | 0.6744 | 0.7091 | 0.6414 | 0.6585 |
| 10-8 | 0.9924 | 1.0000 | 0.8102 | 0.9464 | 0.6735 | 0.7143 | 0.6024 | 0.6170 |
| 10-9 | 0.9923 | 1.0000 | 0.8117 | 0.9524 | 0.6909 | 0.7324 | 0.5936 | 0.6038 |
| 10-10 | 0.9939 | 1.0000 | 0.8187 | 0.9571 | 0.6721 | 0.7215 | 0.5913 | 0.6017 |
M0, without distant metastasis; M1, with distant metastasis; CEA, carcinoembryonic antigen; ACC, accuracy; Sen, sensitivity.
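This record does not explain how the rows 10-1 to 10-10 were computed. As a generic illustration of 10-fold cross-validation with ACC and sensitivity reported per fold, a minimal sketch follows. The single-feature threshold classifier `fit_threshold` is a hypothetical stand-in for the BP and LVQ networks, and the toy data are invented for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def acc_and_sensitivity(y_true, y_pred, positive=1):
    """Accuracy, and sensitivity = true positives / all actual positives."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    n_pos = sum(t == positive for t in y_true)
    sens = tp / n_pos if n_pos else float("nan")
    return correct / len(y_true), sens


def cross_validate(features, labels, fit, k=10):
    """Train on k-1 folds, evaluate on the held-out fold, repeat for every fold."""
    results = []
    for held_out in k_fold_indices(len(labels), k):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        preds = [model(features[i]) for i in held_out]
        results.append(acc_and_sensitivity([labels[i] for i in held_out], preds))
    return results


def fit_threshold(xs, ys):
    """Hypothetical classifier: cut halfway between the class means of feature 0."""
    pos = [x[0] for x, y in zip(xs, ys) if y == 1]
    neg = [x[0] for x, y in zip(xs, ys) if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x[0] > cut else 0


features = [[float(i)] for i in range(40)]
labels = [0] * 20 + [1] * 20
results = cross_validate(features, labels, fit_threshold)
for fold, (acc, sens) in enumerate(results, start=1):
    print(f"fold {fold}: ACC = {acc:.4f}, Sen = {sens:.4f}")
```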
Figure 3. Kaplan-Meier survival curves for training set patients with colorectal cancer. (A) Cancer/Normal. (B) M0/M1 (without distant metastasis/with distant metastasis). (C) Carcinoembryonic antigen <5/≥5. (D) Clinical stage I–II/III–IV. CEA, carcinoembryonic antigen.
Figure 4. Kaplan-Meier survival curves for test set patients with colorectal cancer. (A) Cancer/Normal. (B) M0/M1 (without distant metastasis/with distant metastasis). (C) Carcinoembryonic antigen <5/≥5. (D) Clinical stage I–II/III–IV. CEA, carcinoembryonic antigen.