Shengjun Hong, Yi Huang, Yaqiang Cao, Xingwei Chen, Jing-Dong J Han.
Abstract
The recent rapid development of high-throughput technology enables the study of molecular signatures for cancer diagnosis and prognosis at multiple levels, from genomic and epigenomic to transcriptomic. These unbiased large-scale scans provide important insights into the detection of cancer-related signatures. Beyond single-layer signatures, such as gene expression and somatic mutations, systematically integrating data from multiple heterogeneous platforms has proven particularly effective for identifying classification markers. This approach not only helps to uncover the essential driver genes and pathways in the cancer network that underlie the mechanisms of cancer development, but also brings us closer to the ultimate goal of personalized cancer therapy.
Keywords: Bayesian network; cancer diagnosis; cancer prognosis; molecular signature; subtype
Year: 2014 PMID: 27308330 PMCID: PMC4905187 DOI: 10.4161/23723548.2014.957981
Source DB: PubMed Journal: Mol Cell Oncol ISSN: 2372-3556
Summary of cancer data resources
| Web Resource | Raw data | Preprocessed data | Features | Clinical information |
|---|---|---|---|---|
| TCGA and cBioPortal | Yes | Yes | Genomic, transcriptomic, epigenomic, proteomic | Yes |
| COSMIC & COSMICMart | No | Yes | Mutation | No |
| Oncomine | Yes | Yes | Microarray-based gene expression and copy number variation | Yes |
| Oncotator | No | Yes | Gene, mutation, cancer amplification, and deletion region | No |
| Tumorscape | No | Yes | Copy number variation | No |
| TCIA | Yes | Yes | Medical images | Yes |
Figure 1. Overview of strategies to detect cancer subtypes related to cancer diagnosis and prognosis.
Methods for the detection of molecular signatures of cancer diagnosis and prognosis
| Approach | Method or tool | Summary | Data type |
|---|---|---|---|
| Associative inference–Supervised learning | | | |
| GWAS | PLINK | An open-source whole-genome association tool, including statistics such as the Chi-square test, Cochran-Armitage test, and Fisher's exact test | Genotype, CNV, and haplotype |
| GWAS | SNPTEST | Incorporates imputation methods for genotype association tests | Genotype |
| GWAS | PheWAS | Investigates the association between SNPs and phenotypes | Genotype and phenotype |
| DEGs | Student's t-test, SAM, limma, edgeR, DESeq, Cuffdiff | Identifies DEGs, assuming homogeneity of examined samples | Gene expression |
| DEGs | COPA, OS, ORT, MOST, GTI, SVM-RFE | Identifies DEGs, robust to heterogeneity of examined samples | Gene expression |
| Noise reduction | Z-score normalization | Preprocesses expression data with relative intensities | Multiple-layer data integration |
| Noise reduction | Quantile normalization | Preprocesses expression data with relative intensities | Multiple-layer data integration |
| Noise reduction | ComBat | Handles known confounding factors such as batch effects | Single-layer data |
| Noise reduction | SVA and ISVA | Excludes unknown confounding factors | Single-layer data |
| Associative inference–Unsupervised learning | | | |
| Clustering analysis | Hierarchical clustering, k-means clustering, SOM | Partitions features or samples into subgroups | Single-layer data |
| Clustering analysis | Biclustering, iCluster, iClusterPlus, PSDF, MDI, JIVE, SNF, Super k-means | Discovers subtypes with clinical outcomes, integrating multiple types of data | Multiple-layer data integration |
| Mechanism inference–Modularity analysis | | | |
| Subnetwork function analysis | IntOGen | Evaluates the contribution of biological modules to a cancer | Gene-gene association |
| Subnetwork function analysis | DAVID, GSEA, PAGE | Reveals whether a cancer-related module is significantly enriched for a known pathway | Gene expression |
| Subnetwork function analysis | ARACNe | Infers regulatory interactions based on mutual information between genes | Gene expression |
| Subnetwork function analysis | Cytoscape | Network and modularity analysis | Multiple-layer data integration |
| Mechanism inference–Bayesian network and causality inference | | | |
| De novo network | Bayesian network | Infers a network by detecting potential causal relationships between genes | Multiple-layer data integration |
| De novo network | Dynamic BN | Allows feedback relationships, unlike a regular Bayesian network | Multiple-layer data integration |
Abbreviations: BN, Bayesian network; CNV, copy number variation; COPA, cancer outlier profile analysis; DEG, differentially expressed gene; GSEA, gene set enrichment analysis; PAGE, parametric analysis of gene set enrichment.
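
The two DEG rows of the table distinguish tests that assume homogeneous samples from tests that are robust to heterogeneity. The contrast can be illustrated with a minimal sketch, assuming toy expression values invented for this example. `welch_t` is a plain Welch t statistic, and `copa_score` is a simplified COPA-style score (median centering, scaling by the median absolute deviation, then a high quantile of the tumor values). Both function names, the quantile rule, and the data are assumptions for illustration, not the published implementations of the tools listed.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        var_a / len(a) + var_b / len(b)
    )


def copa_score(normal, tumor, q=0.9):
    """Simplified COPA-style score for one gene.

    Values are centered on the median of all samples and scaled by the
    median absolute deviation (MAD); the score is the q-th quantile of the
    scaled tumor values, so a few extreme tumor samples are enough to
    produce a high score.
    """
    pooled = list(normal) + list(tumor)
    med = statistics.median(pooled)
    mad = statistics.median([abs(x - med) for x in pooled])
    if mad == 0:
        raise ValueError("MAD is zero; gene has no spread to scale by")
    ranked = sorted(tumor)
    # Nearest-rank quantile: the smallest value with at least q of the data at or below it.
    idx = max(0, math.ceil(q * len(ranked)) - 1)
    return (ranked[idx] - med) / mad


# Hypothetical expression values for two genes (illustration only).
normal = [5.0, 5.2, 4.8, 5.1, 4.9, 5.0, 5.3, 4.7, 5.1, 4.9]
# Gene A: every tumor sample is shifted upward (homogeneous change).
tumor_shift = [6.0, 6.2, 5.8, 6.1, 5.9, 6.0, 6.3, 5.7, 6.1, 5.9]
# Gene B: only two tumor samples are strongly overexpressed (heterogeneous change).
tumor_outlier = [5.0, 5.2, 4.8, 5.1, 4.9, 5.0, 5.3, 4.7, 12.0, 12.5]

t_shift = welch_t(tumor_shift, normal)
t_outlier = welch_t(tumor_outlier, normal)
copa_shift = copa_score(normal, tumor_shift)
copa_outlier = copa_score(normal, tumor_outlier)

print(f"gene A (shift):   t = {t_shift:.2f}, COPA-style = {copa_shift:.2f}")
print(f"gene B (outlier): t = {t_outlier:.2f}, COPA-style = {copa_outlier:.2f}")
```

On this toy data the uniformly shifted gene receives the larger t statistic, while the gene altered in only a subset of tumors receives the larger COPA-style score. This is the kind of subtype-specific signal that a mean-based test can miss.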