Kun Xu, Juan Cui, Victor Olman, Qing Yang, David Puett, Ying Xu.
Abstract
A comparative study of public gene-expression data of seven types of cancers (breast, colon, kidney, lung, pancreatic, prostate and stomach cancers) was conducted with the aim of deriving marker genes, along with associated pathways, that are either common to multiple types of cancers or specific to individual cancers. The analysis results indicate that (a) each of the seven cancer types can be distinguished from its corresponding control tissue based on the expression patterns of a small number of genes, e.g., 2, 3 or 4; (b) the expression patterns of some genes can distinguish multiple cancer types from their corresponding control tissues, potentially serving as general markers for all or some groups of cancers; (c) the proteins encoded by some of these genes are predicted to be blood secretory, thus providing potential cancer markers in blood; (d) the numbers of differentially expressed genes across different cancer types in comparison with their control tissues correlate well with the five-year survival rates associated with the individual cancers; and (e) some metabolic and signaling pathways are abnormally activated or deactivated across all cancer types, while other pathways are more specific to certain cancers or groups of cancers. The novel findings of this study offer considerable insight into these seven cancer types and have the potential to provide exciting new directions for diagnostic and therapeutic development.
Year: 2010 PMID: 21060876 PMCID: PMC2965162 DOI: 10.1371/journal.pone.0013696
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Classification accuracies by the top 100 k-gene markers, k = 1, 2, 3, 4, on the training and the test sets of breast cancer.
For each panel, the x-axis is the list of 100 k-gene markers ordered by their classification performance on the training datasets, and the y-axis represents the classification accuracy. (A) classification accuracies by the top 100 k-gene combinations between breast cancer and reference samples in the training set, and (B) and (C) on the two test sets; (D) classification accuracies by top 100 k-gene combinations between early breast cancer and corresponding reference samples in the training set and (E) on the test set.
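The ranking described in this caption can be sketched as an exhaustive search over k-gene combinations, each scored by its classification accuracy on the training samples. The sketch below is only an illustration under stated assumptions: the nearest-centroid rule is a simple stand-in and is not claimed to be the classifier used in the study, and the function names, gene names (`g1`, `g2`, `g3`) and expression values are hypothetical toy inputs.

```python
from itertools import combinations

def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def nearest_centroid_accuracy(samples, labels, gene_idx):
    """Training accuracy of a nearest-centroid rule restricted to the genes in gene_idx.

    labels: 1 = cancer, 0 = reference. This classifier is an assumption made
    for illustration, not the method reported by the authors.
    """
    sub = [[s[g] for g in gene_idx] for s in samples]
    c1 = centroid([v for v, y in zip(sub, labels) if y == 1])
    c0 = centroid([v for v, y in zip(sub, labels) if y == 0])

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    correct = sum(
        (1 if dist2(v, c1) < dist2(v, c0) else 0) == y
        for v, y in zip(sub, labels)
    )
    return correct / len(labels)

def top_k_gene_markers(samples, labels, gene_names, k, top=100):
    """Score every k-gene combination and return the `top` best, highest accuracy first."""
    scored = []
    for idx in combinations(range(len(gene_names)), k):
        acc = nearest_centroid_accuracy(samples, labels, idx)
        scored.append((acc, tuple(gene_names[i] for i in idx)))
    # Sort by descending accuracy; ties are broken alphabetically by gene names.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored[:top]

# Hypothetical toy data: 4 samples x 3 genes (not taken from the study).
gene_names = ["g1", "g2", "g3"]
samples = [[5.0, 1.0, 3.0], [6.0, 1.2, 2.9], [1.0, 1.1, 3.1], [0.5, 0.9, 3.0]]
labels = [1, 1, 0, 0]

for acc, genes in top_k_gene_markers(samples, labels, gene_names, k=2):
    print("+".join(genes), round(100 * acc, 1))
```

The number of combinations grows quickly with k, so on a real expression matrix the candidate genes would presumably have to be pre-filtered (for example to differentially expressed genes) before a search like this is feasible.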
Figure 2. Correlation between 5-year survival rate and the number of differentially expressed genes in each cancer type.
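One way to quantify a relationship of this kind is a correlation coefficient between the per-cancer survival rates and the per-cancer counts of differentially expressed genes. The sketch below assumes Pearson's r, which the caption does not specify, and the function name `pearson_r` is illustrative. No values from the study are reproduced here; the two input lists would have to be supplied by the reader.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```

With only seven cancer types, a coefficient computed this way rests on seven points, so it is best read as descriptive.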
A list of 19 genes that are differentially expressed in more than 4 cancer types and their relevance to different cancer types.
| Gene ID | Direction of regulation | | | | | | | Reported to be related to cancers | | | | | | | |
| | Breast | Colon | Kidney | Lung | Pancreas | Prostate | Stomach | B. | C. | K. | L. | Pa. | Pr. | S. | Other cancer types |
| CDC2 | | | | | | | | | | | | | | | liver cancer; squamous cell carcinoma; nasopharyngeal carcinoma |
| AURKA | | | | | | | | | | | | | | | ovarian cancer; esophageal squamous cancer; uterine cancer; bladder cancer |
| ABCA8 | | | | | | | | | | | | | | | |
| DPT | | | | | | | | | | | | | | | |
| TOP2A | | | | | | | | | | | | | | | bladder cancer; ovarian cancer; squamous cell carcinoma |
| MMP7 | | | | | | | | | | | | | | | ovarian cancer; oral cancer; rectal cancers; bladder cancer; liver cancer |
| MAD2L1 | | | | | | | | | | | | | | | thyroid carcinomas; oesophageal squamous cancer |
| KLF4 | | | | | | | | | | | | | | | esophageal cancer; bladder cancer |
| MELK | | | | | | | | | | | | | | | brain cancer; endometrial cancer |
| C7 | | | | | | | | | | | | | | | uterine cervical cancers |
| ECT2 | | | | | | | | | | | | | | | |
| PRC1 | | | | | | | | | | | | | | | |
| RRM2 | | | | | | | | | | | | | | | |
| ALDH1A1 | | | | | | | | | | | | | | | non-small cell bronchopulmonary cancer; liver cancer; T-cell leukemia |
| PMAIP1 | | | | | | | | | | | | | | | |
| FABP4 | | | | | | | | | | | | | | | bladder cancer |
| COL11A1 | | | | | | | | | | | | | | | adenomas |
| TTK | | | | | | | | | | | | | | | |
| CENPF | | | | | | | | | | | | | | | |
“↑” indicates up-regulated gene expression in the corresponding cancer type, while “↓” indicates down-regulation. “*” indicates that a gene has been reported as relevant to the corresponding cancer type. “B.” stands for breast cancer, “C.” for colon cancer, “K.” for kidney cancer, “L.” for lung cancer, “Pa.” for pancreatic cancer, “Pr.” for prostate cancer and “S.” for stomach cancer.
The top 2-gene markers for multiple cancer types with each numerical value showing the classification accuracy between a cancer and its corresponding control tissue.
| Count | Markers | Breast | | | Colon | | | Kidney | | | Lung | | | Pancreas | | | Prostate | | | Stomach | | |
| | | train | test | test2 | train | test | test2 | train | test | test2 | train | test | test2 | train | test | test2 | train | test | test2 | train | test | test2 |
| 5 | CDC2+TOP2A | 72.4 | 94.8 | 95.3 | 75.0 | 100 | 64.3 | _ | _ | _ | 85.2 | 85.2 | 79.5 | 71.2 | 71.2 | 87.5 | _ | _ | _ | 78.3 | 85.5 | 85.2 |
| 4 | CDC2+DPT | 70.7 | 94.8 | 96.1 | 91.7 | 97.9 | 64.3 | _ | _ | _ | 88.9 | 92.6 | 82.1 | _ | _ | _ | _ | _ | _ | 66.7 | 85.5 | 85.2 |
| 4 | CDC2+ECT2 | _ | _ | _ | 85.4 | 97.9 | 69.0 | _ | _ | _ | 83.3 | 77.8 | miss | 78.8 | 86.5 | 87.5 | _ | _ | _ | 75.4 | 78.3 | 65.6 |
| 4 | ABCA8+AURKA | 81.0 | 96.6 | 99.2 | 91.7 | 100 | N.A. | _ | _ | _ | 94.4 | 94.4 | 92.3 | _ | _ | _ | _ | _ | _ | 75.4 | 92.8 | 74.1 |
| 4 | ABCA8+FABP4 | 79.3 | 96.6 | 91.5 | 89.6 | 97.9 | 85.7 | _ | _ | _ | 96.3 | 98.1 | 94.9 | _ | _ | _ | _ | _ | _ | 79.7 | 84.1 | 66.7 |
| 4 | DPT+FABP4 | 79.3 | 87.9 | 85.2 | 95.8 | 89.6 | 65.2 | _ | _ | _ | 94.4 | 96.3 | 94.9 | _ | _ | _ | _ | _ | _ | 82.6 | 75.4 | 81.5 |
| 4 | FABP4+TOP2A | 77.6 | 94.8 | 93.0 | 85.4 | 100 | 67.6 | _ | _ | _ | 96.3 | 94.4 | 92.3 | _ | _ | _ | _ | _ | _ | 78.3 | 85.5 | 88.9 |
| 3 | CDC2+SULF1 | _ | _ | _ | _ | _ | _ | _ | _ | _ | 90.7 | 88.9 | 76.9 | 96.2 | 90.4 | 87.5 | _ | _ | _ | 95.7 | 88.4 | 77.8 |
Each entry represents the classification accuracy (in percent) between a cancer set and its corresponding reference set on the training (train) and the two testing (test, test2) datasets. N.A.: the platform of the extra test data does not cover the corresponding gene.
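The Count column appears to equal the number of cancer types for which a marker has reported accuracies. That reading is inferred from the rows and is not stated in the table. A minimal sketch that checks it against the CDC2+TOP2A row is shown below; the helper name `parse_row` is hypothetical, and cell values are kept as strings because some cells hold non-numeric entries such as "N.A.".

```python
CANCERS = ["Breast", "Colon", "Kidney", "Lung", "Pancreas", "Prostate", "Stomach"]

def parse_row(cells):
    """Group 21 accuracy cells into {cancer: (train, test, test2)}.

    A cancer type is kept only if at least one of its three cells is not "_".
    """
    if len(cells) != 3 * len(CANCERS):
        raise ValueError("expected three cells per cancer type")
    covered = {}
    for i, cancer in enumerate(CANCERS):
        triple = tuple(cells[3 * i: 3 * i + 3])
        if any(c != "_" for c in triple):
            covered[cancer] = triple
    return covered

# Accuracy cells of the CDC2+TOP2A row, copied from the table above.
row = ("72.4 | 94.8 | 95.3 | 75.0 | 100 | 64.3 | _ | _ | _ | 85.2 | 85.2 | 79.5 | "
       "71.2 | 71.2 | 87.5 | _ | _ | _ | 78.3 | 85.5 | 85.2")
cells = [c.strip() for c in row.split("|")]

covered = parse_row(cells)
count = len(covered)
print(count, sorted(covered))
```

For this row the computed count matches the value 5 listed in the Count column.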
Top k-gene discriminators whose encoded proteins are predicted to be blood secretory.
| Count | Markers | Breast | | | Colon | | | Kidney | | | Lung | | | Pancreas | | | Prostate | | | Stomach | | |
| | | train | test | test2 | train | test | test2 | train | test | test2 | train | test | test2 | train | test | test2 | train | test | test2 | train | test | test2 |
| 3 | GREM1+MMP7 | _ | _ | _ | _ | _ | _ | _ | _ | _ | 88.9 | 79.6 | 64.1 | 92.3 | 73.1 | 87.5 | _ | _ | _ | 89.9 | 75.4 | 63.0 |
| 3 | MMP7+MMP9 | _ | _ | _ | _ | _ | _ | _ | _ | _ | 75.9 | 79.6 | 87.2 | 96.2 | 78.9 | 87.5 | _ | _ | _ | 77.9 | 76.8 | 70.4 |
| 3 | MMP11+MMP7+MMP9+RRM2 | _ | _ | _ | _ | _ | _ | _ | _ | _ | 85.2 | 96.3 | 87.2 | 96.2 | 88.5 | 87.5 | _ | _ | _ | 88.1 | 88.4 | 84.4 |
| 3 | CCL18+TGFBI | _ | _ | _ | _ | _ | _ | 74.5 | 80.9 | 80.0 | _ | _ | _ | 82.7 | 82.7 | 87.5 | _ | _ | _ | 71.0 | 75.4 | 77.8 |
| 3 | DPT+MMP7 | _ | _ | _ | 97.9 | 89.6 | 76.2 | _ | _ | _ | 85.2 | 88.9 | 87.2 | _ | _ | _ | _ | _ | _ | 84.2 | 81.2 | 74.1 |
| 3 | FAM107A+KLF4 | _ | _ | _ | 87.5 | 100 | miss | _ | _ | _ | 94.4 | 92.6 | 94.9 | _ | _ | _ | _ | _ | _ | 91.3 | 92.8 | 70.4 |
| 3 | FAM107A+KLF4+MMP7+PAICS | _ | _ | _ | 100 | 100 | | _ | _ | _ | 94.4 | 94.4 | 94.9 | _ | _ | _ | _ | _ | _ | 91.3 | 91.3 | 84.4 |
| 3 | INHBA+RRM2 | 74.1 | 100 | 98.4 | _ | _ | _ | _ | _ | _ | _ | _ | _ | 94.2 | 88.5 | 87.5 | _ | _ | _ | 78.3 | 81.2 | 74.1 |
| 3 | GPX3+RRM2 | 81.0 | 96.6 | 96.1 | _ | _ | _ | _ | _ | _ | 88.9 | 94.4 | 94.9 | _ | _ | _ | _ | _ | _ | 85.5 | 81.2 | 77.8 |
| 3 | COL11A1+DPT | 72.4 | 96.6 | 96.9 | 97.9 | 89.6 | 57.1 | _ | _ | _ | 92.6 | 94.4 | 87.2 | _ | _ | _ | _ | _ | _ | _ | _ | _ |
| 3 | MMP11+RRM2 | _ | _ | _ | _ | _ | _ | _ | _ | _ | 88.9 | 90.7 | 69.2 | 86.5 | 88.5 | 87.5 | _ | _ | _ | 75.0 | 82.6 | 63.0 |
Each numerical value represents the classification accuracy (by percentage) between cancer tissues and their corresponding reference tissues.