Wei-Chuan Shangkuan, Hung-Che Lin, Yu-Tien Chang, Chen-En Jian, Hueng-Chuen Fan, Kang-Hua Chen, Ya-Fang Liu, Huan-Ming Hsu, Hsiu-Ling Chou, Chung-Tay Yao, Chi-Ming Chu, Sui-Lung Su, Chi-Wen Chang.
Abstract
BACKGROUND: Colorectal cancer (CRC) is one of the leading cancers worldwide. Several studies have performed microarray data analyses for cancer classification and prognostic analyses. Microarray assays also enable the identification of gene signatures for molecular characterization and treatment prediction.
Keywords: Cancer; Gene expression; Gene ontology; Microarray analysis; Prediction analysis for microarrays
Year: 2017 PMID: 28229027 PMCID: PMC5314952 DOI: 10.7717/peerj.3003
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Process of pooling the 11 microarray gene expression datasets.
GEO, Gene Expression Omnibus; GSE, GEO series.
GSE datasets included in our study.
| GSE | Tissue (n) | Total numbers | Total number of genes on chips | Type of gene chips |
|---|---|---|---|---|
| | 53 (one tissue type) | 53 | 33727 | HG-U133_Plus_2 |
| | 115 tumor, 30 normal | 145 | 33727 | HG-U133_Plus_2 |
| | 105 tumor, 43 normal | 148 | 33727 | HG-U133_Plus_2 |
| | 59 (one tissue type) | 59 | 33727 | HG-U133_Plus_2 |
| | 65 (one tissue type) | 65 | 33727 | HG-U133_Plus_2 |
| | 37 (one tissue type) | 37 | 33727 | HG-U133_Plus_2 |
| | 17 tumor, 17 normal | 34 | 33727 | HG-U133_Plus_2 |
| | 90 tumor, 6 normal | 96 | 33727 | HG-U133_Plus_2 |
| | 27 (one tissue type) | 27 | 33727 | HG-U133_Plus_2 |
| | 130 (one tissue type) | 130 | 33727 | HG-U133_Plus_2 |
| | 19 tumor, 38 normal | 57 | 14713 | HG-U133A |
The centroid scores and frequency of the colorectal cancer genes in the 100 repeated samplings using the PAM method.
| Gene | Frequency | CRC mean | CRC SD | CRC max | CRC min | NOR mean | NOR SD | NOR max | NOR min | Diff score (max) |
|---|---|---|---|---|---|---|---|---|---|---|
| ABCG2 | 100 | −0.023285 | 0.005258 | −0.0121 | −0.0406 | 0.183034 | 0.041345 | 0.3191 | 0.0949 | 0.3597 |
| AQP8 | 100 | −0.024511 | 0.005812 | −0.0096 | −0.0372 | 0.192729 | 0.045658 | 0.2925 | 0.0754 | 0.3297 |
| SPIB | 100 | −0.034727 | 0.004733 | −0.0207 | −0.0456 | 0.273003 | 0.037222 | 0.3582 | 0.1625 | 0.4038 |
| CA7 | 99 | −0.051488 | 0.005233 | −0.0429 | −0.0666 | 0.404711 | 0.041172 | 0.5239 | 0.3369 | 0.5905 |
| CLDN8 | 89 | −0.010152 | 0.004605 | −0.0015 | −0.026 | 0.079792 | 0.036207 | 0.2044 | 0.0118 | 0.2304 |
| SCNN1B | 62 | −0.004138 | 0.00235 | | −0.0098 | 0.032498 | 0.018485 | 0.0771 | 0.0002 | 0.0869 |
| SLC30A10 | 29 | −0.004566 | 0.002946 | −0.0003 | −0.0102 | 0.035979 | 0.023184 | 0.0804 | 0.0024 | 0.0906 |
| CD177 | 5 | −0.00254 | 0.002319 | −0.0004 | −0.0051 | 0.02006 | 0.018175 | 0.0403 | 0.0034 | 0.0454 |
| PADI2 | 2 | −0.00265 | 0.001768 | −0.0014 | −0.0039 | 0.0208 | 0.013859 | 0.0306 | 0.011 | 0.0345 |
| TGFBI | 2 | 0.00045 | 0.000354 | 0.0007 | 0.0002 | −0.0033 | 0.002687 | −0.0014 | −0.0052 | 0.0059 |
Notes.
CRC, colorectal cancer tissue; NOR, normal tissue.
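The Diff score column in the table above is not defined in this record. The values are, however, consistent with the spread between the largest and smallest centroid score across both classes. The following sketch checks that reading against the tabulated rows. The formula and the function name `score_range` are assumptions made for illustration, not the authors' stated definition. SCNN1B is left out because one of its cells is missing in the table.

```python
# gene -> (CRC max, CRC min, NOR max, NOR min, reported Diff score),
# copied from the centroid score table.
rows = {
    "ABCG2":    (-0.0121, -0.0406, 0.3191, 0.0949, 0.3597),
    "AQP8":     (-0.0096, -0.0372, 0.2925, 0.0754, 0.3297),
    "SPIB":     (-0.0207, -0.0456, 0.3582, 0.1625, 0.4038),
    "CA7":      (-0.0429, -0.0666, 0.5239, 0.3369, 0.5905),
    "CLDN8":    (-0.0015, -0.0260, 0.2044, 0.0118, 0.2304),
    "SLC30A10": (-0.0003, -0.0102, 0.0804, 0.0024, 0.0906),
    "CD177":    (-0.0004, -0.0051, 0.0403, 0.0034, 0.0454),
    "PADI2":    (-0.0014, -0.0039, 0.0306, 0.0110, 0.0345),
    "TGFBI":    (0.0007, 0.0002, -0.0014, -0.0052, 0.0059),
}


def score_range(crc_max, crc_min, nor_max, nor_min):
    """Assumed definition: overall max minus overall min over both classes."""
    return max(crc_max, nor_max) - min(crc_min, nor_min)


for gene, (crc_max, crc_min, nor_max, nor_min, reported) in rows.items():
    computed = score_range(crc_max, crc_min, nor_max, nor_min)
    print(f"{gene:9s} computed={computed:.4f} reported={reported:.4f}")
```

Under this reading, TGFBI is the only listed gene whose CRC scores lie above its NOR scores, so the spread runs from the NOR minimum to the CRC maximum rather than the other way round.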
Figure 2. (A) The lowest threshold between normal tissue and colorectal tumor tissue is 14; (B) the number of genes needed is between four and eight.
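PAM (prediction analysis for microarrays) is generally described as a nearest shrunken centroid classifier: class centroids are pulled toward the overall centroid by a threshold, and genes whose standardized differences shrink to zero in every class drop out of the model. The sketch below illustrates how the threshold controls the number of retained genes. It is a simplified illustration, not the authors' pipeline. The function names, the toy expression values, and the choice of the class-size factor `m_k = sqrt(1/n_k - 1/n)` (one common convention) are all assumptions.

```python
import math
import statistics


def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become 0."""
    return math.copysign(max(abs(d) - delta, 0.0), d)


def fit_shrunken_centroids(X, y, delta):
    """X: list of samples (lists of gene values); y: class labels."""
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    members = {k: [row for row, lab in zip(X, y) if lab == k] for k in classes}
    overall = [sum(row[j] for row in X) / n for j in range(p)]
    centroid = {k: [sum(r[j] for r in rows) / len(rows) for j in range(p)]
                for k, rows in members.items()}
    # Pooled within-class standard deviation of each gene.
    s = []
    for j in range(p):
        ss = sum((r[j] - centroid[k][j]) ** 2
                 for k in classes for r in members[k])
        s.append(math.sqrt(ss / (n - len(classes))))
    s0 = statistics.median(s)  # offset that guards against tiny variances
    scale = [v + s0 for v in s]
    shrunken, d_shrunk = {}, {}
    for k in classes:
        m_k = math.sqrt(1.0 / len(members[k]) - 1.0 / n)  # assumed convention
        d = [(centroid[k][j] - overall[j]) / (m_k * scale[j]) for j in range(p)]
        d_shrunk[k] = [soft_threshold(v, delta) for v in d]
        shrunken[k] = [overall[j] + m_k * scale[j] * d_shrunk[k][j]
                       for j in range(p)]
    # A gene stays in the model if it survives shrinkage in at least one class.
    active = [j for j in range(p)
              if any(d_shrunk[k][j] != 0.0 for k in classes)]
    priors = {k: len(members[k]) / n for k in classes}
    return {"centroids": shrunken, "scale": scale,
            "priors": priors, "active": active}


def predict(model, x):
    """Assign x to the class with the nearest shrunken centroid."""
    def score(k):
        c = model["centroids"][k]
        dist = sum(((x[j] - c[j]) / model["scale"][j]) ** 2
                   for j in range(len(x)))
        return dist - 2.0 * math.log(model["priors"][k])
    return min(model["centroids"], key=score)


# Hypothetical toy data: gene 0 separates the classes, genes 1 and 2 do not.
X = [[5.0, 1.0, 2.1], [5.2, 1.2, 1.9], [4.8, 0.9, 2.0],
     [1.0, 1.1, 2.0], [1.2, 0.8, 2.2], [0.9, 1.0, 1.8]]
y = ["CRC", "CRC", "CRC", "NOR", "NOR", "NOR"]

for delta in (0.1, 1.0, 20.0):
    print(delta, fit_shrunken_centroids(X, y, delta)["active"])

model = fit_shrunken_centroids(X, y, 1.0)
print(predict(model, [4.9, 1.0, 2.0]), predict(model, [1.1, 1.0, 2.0]))
```

Raising the threshold removes weakly discriminating genes first, which is the trade-off between threshold and gene count that the figure above summarizes.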
Figure 3. PAM model accuracy rates.
The average PAM model accuracy rate was 95% (SD = 0.44). The average validation accuracy rate was 95.2% (SD = 1.33). The average number of significant genes was 5.9.
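The reported average of 5.9 significant genes can be cross-checked against the selection frequencies in the centroid score table. This assumes that each gene's frequency is the number of the 100 repeated samplings in which it was selected, so the frequencies summed over genes and divided by the number of samplings give the mean number of genes per model.

```python
# Selection frequency of each gene over the 100 repeated samplings,
# copied from the centroid score table.
frequency = {
    "ABCG2": 100, "AQP8": 100, "SPIB": 100, "CA7": 99, "CLDN8": 89,
    "SCNN1B": 62, "SLC30A10": 29, "CD177": 5, "PADI2": 2, "TGFBI": 2,
}
n_runs = 100

# Each selection contributes one gene to one model, so the total number of
# selections divided by the number of models is the mean model size.
mean_genes_per_run = sum(frequency.values()) / n_runs
print(mean_genes_per_run)  # 5.88
```

The result, 5.88, rounds to the reported 5.9, so the two figures agree under this assumption.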
The GO terms, GO molecular function, GO biological process, and GO cellular component of the 10 significant colorectal cancer genes.
| Gene | GO terms | GO molecular function | GO biological process | GO cellular component |
|---|---|---|---|---|
| CA7 | Carbonic anhydrase 7 | Hydro-lyase activity | Metabolic process | |
| SCNN1B | Amiloride-sensitive sodium channel subunit beta | Ion channel activity | Sensory perception of taste | |
| SPIB | Transcription factor Spi-B | Sequence-specific DNA binding transcription factor activity | B cell mediated immunity | |
| CD177 | CD177 antigen | | | |
| SLC30A10 | Zinc transporter 10 | Transmembrane transporter activity | Cation transport | |
| TGFBI | Transforming growth factor-beta-induced protein ig-h3 | Receptor binding | Cell communication | |
| PADI2 | Protein-arginine deiminase type-2 | Hydrolase activity | Cellular protein modification process | |
| ABCG2 | ATP-binding cassette sub-family G member 2 | ATPase activity, coupled to transmembrane movement of substances | Lipid metabolic process | |
| CLDN8 | Claudin-8 | | Cellular process | Plasma membrane |
| AQP8 | Aquaporin-8 | Transmembrane transporter activity | Transport | |