V V Galatenko, M Yu Shkurnikov, T R Samatov, A V Galatenko, I A Mityakina, A D Kaprin, U Schumacher, A G Tonevitsky.
Abstract
Genes with significant differential expression are traditionally used to reveal the genetic background underlying phenotypic differences between cancer cells. We hypothesized that informative marker sets can be obtained by combining genes with a relatively low degree of individual differential expression. We developed a method for constructing highly informative gene combinations that maximizes their cumulative informative power, and identified sets of 2-5 genes that efficiently predict recurrence for ER-positive breast cancer patients. The gene combinations constructed on the basis of microarray data were successfully applied to data acquired by RNA-seq. The developed method provides the basis for the generation of highly efficient prognostic and predictive gene signatures for cancer and other diseases. The identified gene sets can potentially reveal novel essential segments of gene interaction networks and pathways implicated in cancer progression.
Year: 2015 PMID: 26446398 PMCID: PMC4597361 DOI: 10.1038/srep14967
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Synthetic example of the contribution of a gene with a low degree of differential expression to the cumulative informative power of a gene pair.
(a) Expression of gene 1 in Groups A and B; (b) expression of gene 2 in Groups A and B; (c) the joint distribution of the expression of genes 1 and 2 in Groups A and B.
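The idea behind this synthetic example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' method: the Gaussian distributions, the sample size, the noise level, and the simple difference score `gene 1 - gene 2` are all chosen here for demonstration, and `auc` is a hypothetical helper that estimates the area under the ROC curve by pairwise comparison.

```python
import random

def auc(pos, neg):
    """Estimate AUC as the probability that a random score from `pos`
    exceeds a random score from `neg` (ties count as 0.5)."""
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 300                 # assumed number of samples per group

# Group A: gene 2 closely tracks gene 1.
a_g1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
a_g2 = [x + rng.gauss(0.0, 0.3) for x in a_g1]

# Group B: gene 1 is shifted up by 1, while gene 2 tracks (gene 1 - 1),
# so the marginal distribution of gene 2 is the same as in Group A.
b_g1 = [rng.gauss(1.0, 1.0) for _ in range(n)]
b_g2 = [x - 1.0 + rng.gauss(0.0, 0.3) for x in b_g1]

# Individual genes versus a simple combined score (assumed: gene 1 - gene 2).
auc_g1 = auc(b_g1, a_g1)
auc_g2 = auc(b_g2, a_g2)
auc_pair = auc([x - y for x, y in zip(b_g1, b_g2)],
               [x - y for x, y in zip(a_g1, a_g2)])

print(f"gene 1 alone: AUC = {auc_g1:.3f}")
print(f"gene 2 alone: AUC = {auc_g2:.3f}")
print(f"gene pair:    AUC = {auc_pair:.3f}")
```

In this construction gene 2 is close to uninformative on its own, yet adding it to gene 1 separates the two groups far better than gene 1 alone, because the groups differ in the relationship between the two genes rather than in the level of gene 2.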
Figure 2. Scheme of gene pair analysis.
Genes included in at least 20 informative gene pairs (out of 547).
| Gene Symbol | Number of informative pairs |
|---|---|
| SQLE | 109 |
| DSCC1 | 85 |
| CTTN | 43 |
| TOMM70A | 37 |
| TTK | 34 |
| RACGAP1 | 29 |
| ELOVL5 | 29 |
| KIF4A | 27 |
| CCNA2 | 23 |
Informative gene pairs that satisfied the additional constraints on sensitivity and specificity.
| Gene Symbol 1 | Gene Symbol 2 | AUC | Sensitivity | Specificity |
|---|---|---|---|---|
| IGFBP6 | ELOVL5 | 0.749 | 81.8% | 62.5% |
| HSPD1 | ELOVL5 | 0.756 | 72.7% | 62.5% |
| TTK | CADPS2 | 0.721 | 78.8% | 57.8% |
| RUNX1 | SQLE | 0.645 | 72.7% | 51.6% |
| ELOVL5 | PPIA | 0.721 | 81.8% | 56.3% |
| PSMD2 | TTK | 0.679 | 66.7% | 59.3% |
| LGR4 | KIF4A | 0.678 | 69.7% | 64.8% |
| DCTD | SQLE | 0.635 | 54.5% | 65.6% |
| ELP4 | KIF4A | 0.717 | 78.8% | 65.6% |
| BTN3A3 | RACGAP1 | 0.668 | 60.6% | 71.9% |
| TTK | DIRAS3 | 0.745 | 75.8% | 61.7% |
| HSPD1 | IL6ST | 0.733 | 75.8% | 62.5% |
| REST | SQLE | 0.637 | 63.6% | 52.3% |
| DLG3 | PRC1 | 0.664 | 60.6% | 54.7% |
AUC values, sensitivity and specificity are shown for the validation dataset.
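For readers less familiar with these metrics, the sketch below shows how sensitivity and specificity follow from a classifier's confusion counts. The function name `sensitivity_specificity` and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: patients with recurrence predicted as recurrence
    fn: patients with recurrence predicted as no recurrence
    tn: patients without recurrence predicted as no recurrence
    fp: patients without recurrence predicted as recurrence
    """
    sensitivity = tp / (tp + fn)  # fraction of recurrences detected
    specificity = tn / (tn + fp)  # fraction of non-recurrences correctly cleared
    return sensitivity, specificity

# Hypothetical counts, not data from the paper.
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=15, fp=5)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Sensitivity and specificity depend on the chosen decision threshold, whereas the AUC summarizes performance across all thresholds, which is why the table reports all three.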
Figure 3. Properties of the classifier based on the gene pair IGFBP6 and ELOVL5 for the validation dataset.
(a) ROC curve. (b) Kaplan-Meier curves. RFS: recurrence-free survival.
Figure 4. Expression levels of IGFBP6 and ELOVL5 for patients with and without recurrence, measured by RNA sequencing.
(a) Log-scaled expression of IGFBP6. (b) Log-scaled expression of ELOVL5. (c) The joint distribution of the expression of IGFBP6 and ELOVL5.