Salvatore Alaimo1, Rosalba Giugno2, Mario Acunzo3, Dario Veneziano3, Alfredo Ferro1, Alfredo Pulvirenti1.
Abstract
MOTIVATION: Prediction of phenotypes from high-dimensional data is a crucial task in precision biology and medicine. Many technologies employ genomic biomarkers to characterize phenotypes. However, such elements are not sufficient to explain the underlying biology. To improve this, pathway analysis techniques have been proposed. Nevertheless, such methods have shown a lack of accuracy in phenotype classification.
Keywords: RNA-Seq; microRNAs; pathway analysis; phenotype classification
Year: 2016 PMID: 27275538 PMCID: PMC5342365 DOI: 10.18632/oncotarget.9788
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
List of cancer types extracted from the cancer genome atlas (TCGA) with their codes, number of case and control samples, and subcategories
| Code | Cancer type | Control samples | Case samples | Case samples categories |
|---|---|---|---|---|
| BLCA | Bladder Urothelial Carcinoma | 19 | 193 | Stage I, II, III, IV |
| BRCA | Breast invasive carcinoma | 86 | 642 | Stage I, II, III, IV, X |
| COAD | Colon adenocarcinoma | 8 | 389 | Stage I, II, III, IV |
| KICH | Kidney Chromophobe | 25 | 66 | Stage I, II, III, IV |
| KIRC | Kidney renal clear cell carcinoma | 71 | 224 | Stage I, II, III, IV |
| LUAD | Lung adenocarcinoma | 19 | 388 | Stage I, II, III, IV |
| LUSC | Lung squamous cell carcinoma | 37 | 247 | Stage I, II, III, IV |
| PRAD | Prostate adenocarcinoma | 50 | 191 | Category 6, 7, 8, 9, 10 |
| READ | Rectum adenocarcinoma | 3 | 150 | Stage I, II, III, IV |
| UCEC | Uterine Corpus Endometrial Carcinoma | 14 | 231 | Stage I, II, III, IV |
| All Samples |
Figure 1. Performance comparison between MITHrIL, SPIA, micrographite and PARADIGM by means of the average area under the ROC curve (AUC)
Each box in the figure represents the variability range of AUC values for a specific methodology.
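As a rough illustration of the metric summarized in Figure 1, the area under the ROC curve can be computed as the fraction of (case, control) pairs in which the case sample receives the higher score, with ties counted as one half. The sketch below assumes binary labels (1 for case, 0 for control) and one score per sample; the function name `roc_auc` and the input layout are illustrative assumptions and do not reproduce the evaluation pipeline of the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Assumption: labels are 1 (case) or 0 (control); a higher score means
    the sample is more likely to be a case.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    case_scores = [s for s, y in zip(scores, labels) if y == 1]
    control_scores = [s for s, y in zip(scores, labels) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("both classes must be present")
    # Count pairs where the case outranks the control; ties count as 0.5.
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

Averaging this value over repeated runs or over cancer types would give one point of the kind of distribution that each box in the figure summarizes.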
Classification results for the tumor samples in our dataset, obtained by training the PAMR algorithm on Log-Fold-Change, SPIA total accumulation, PARADIGM scores, MITHrIL accumulators, and MITHrIL endpoint perturbations
| Data | Log-FC | Perturb. of random nodes | Endpoints | Pathway-level statistics | |||||
|---|---|---|---|---|---|---|---|---|---|
| MITHrIL | MITHrIL no miRNA | MITHrIL | MITHrIL no miRNA | PARADIGM | MITHrIL Acc. | SPIA Acc. | PARADIGM scores | ||
| 3.11% | 9.59% | 6.58% | 2.60% | 2.08% | 12.95% | 49.74% | 82.38% | ||
| 1.86% | 2.12% | 3.97% | 2.00% | 2.34% | 13.08% | 8.25% | 73.05% | ||
| 2.31% | 7.81% | 3.10% | 0.77% | 32.90% | |||||
| 3.03% | 1.67% | 3.03% | 3.03% | 4.54% | 3.03% | 31.81% | |||
| 3.12% | 2.68% | 2.77% | 2.68% | 3.13% | 5.80% | 2.67% | 35.26% | ||
| 4.89% | 8.61% | 1.80% | 2.06% | 5.41% | 4.38% | 2.83% | 64.43% | ||
| 6.07% | 1.78% | 6.92% | 2.02% | 6.91% | 5.26% | 4.04% | 71.54% | ||
| 0.37% | 1.26% | 0.52% | 2.61% | 30.89% | 18.94% | ||||
| 3.33% | 9.40% | 4.00% | 0.66% | 96.66% | |||||
| 1.73% | 1.39% | 1.13% | 0.43% | 0.09% | 4.32% | 1.29% | 46.32% | ||
| 2.90% | 1.75% | 5.38% | 1.50% | 3.20% | 6.40% | 8.80% | 57.60% | ||
Each element in the table corresponds to the classification error for a specific cancer type using one algorithm. Although the reference classification based on Log-Fold-Change yields a low average error (2.90%), using the perturbations computed for each endpoint provides a significant improvement in classification accuracy.
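To make the reported quantity concrete, a classification error of this kind can be read as the percentage of samples whose predicted class differs from the true class. The following is a minimal sketch under that assumption; the function name `classification_error_percent` and the example labels are hypothetical, and the sketch does not reproduce the PAMR training or validation scheme used in the study.

```python
from typing import Hashable, Sequence


def classification_error_percent(
    true_labels: Sequence[Hashable], predicted_labels: Sequence[Hashable]
) -> float:
    """Percentage of samples whose predicted class differs from the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    # A sample counts as an error when prediction and truth disagree.
    errors = sum(1 for t, p in zip(true_labels, predicted_labels) if t != p)
    return 100.0 * errors / len(true_labels)


# Hypothetical example: one of four samples is misclassified.
example_error = classification_error_percent(
    ["case", "case", "control", "control"],
    ["case", "control", "control", "control"],
)
```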
Figure 2. Significance of adding miRNAs to our model, assessed by comparing the percentage of correctly predicted endpoints for each sample between our method with and without miRNAs
Each box in the figure represents the variability range of the percentage of correctly predicted endpoints for the patients of a specific tumor type. A prediction is correct when the deregulation observed in the original data corresponds to the one inferred by our algorithm, that is, when the sign of an endpoint's Log-Fold-Change matches the sign of its perturbation value.
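The correctness criterion described in this caption can be sketched for a single sample as follows. The code assumes two equally long lists, one holding the observed Log-Fold-Change of each endpoint and one holding the perturbation inferred for the same endpoint; the function names and the treatment of zeros (a zero matches only another zero) are illustrative assumptions, not details taken from the study.

```python
from typing import Sequence


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def percent_correct_endpoints(
    log_fold_changes: Sequence[float], perturbations: Sequence[float]
) -> float:
    """Percentage of endpoints whose perturbation sign matches the Log-FC sign.

    Assumption: a value of exactly zero only matches another zero.
    """
    if len(log_fold_changes) != len(perturbations):
        raise ValueError("inputs must have the same length")
    if not log_fold_changes:
        raise ValueError("at least one endpoint is required")
    matches = sum(
        1
        for lfc, pert in zip(log_fold_changes, perturbations)
        if sign(lfc) == sign(pert)
    )
    return 100.0 * matches / len(log_fold_changes)


# Hypothetical sample with four endpoints; the third one disagrees in sign.
example_percent = percent_correct_endpoints(
    [1.2, -0.5, 0.3, -2.0], [0.4, -0.1, -0.7, -1.5]
)
```

Computing this percentage for every patient of a tumor type, once with and once without miRNAs in the model, would yield the two distributions that the boxes compare.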