Satoshi Takahashi, Masamichi Takahashi, Shota Tanaka, Shunsaku Takayanagi, Hirokazu Takami, Erika Yamazawa, Shohei Nambu, Mototaka Miyake, Kaishi Satomi, Koichi Ichimura, Yoshitaka Narita, Ryuji Hamamoto.
Abstract
Although the incidence of central nervous system (CNS) cancers is not high, these cancers significantly reduce patients' quality of life and result in high mortality rates. A low incidence also means a small number of cases, which in turn means a limited amount of information. To compensate, researchers have tried to increase the amount of information available from a single test using high-throughput technologies. This approach, referred to as single-omics analysis, has been only partially successful, as one type of data may not adequately describe all the characteristics of a tumor. It is presently unclear which type of data best describes a particular clinical situation. One way to solve this problem is to use multi-omics data: when many types of data are available, a selected data type or a combination of them may effectively resolve a clinical question. Hence, we conducted a comprehensive survey of papers in the field of neuro-oncology that used multi-omics data for analysis and found that most of the papers utilized machine learning techniques. This finding suggests that machine learning techniques are useful in multi-omics analysis. In this review, we discuss the current status of multi-omics analysis in the field of neuro-oncology and the importance of using machine learning techniques.
Keywords: glioma; machine learning; multi-omics analysis; neuro-oncology
Year: 2021 PMID: 33921457 PMCID: PMC8070530 DOI: 10.3390/biom11040565
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
Figure 1. A chronological table of the history of neuro-oncology research. Abbreviations: BCNU, 1,3-bis(2-chloroethyl)-1-nitrosourea; MGMT, O6-methylguanine–DNA methyltransferase; IDH, Isocitrate Dehydrogenase; CNS WHO, World Health Organization Classification of Tumors of the Central Nervous System.
Figure 2. Expected signaling pathway changes in neuro-oncology. This result was obtained by mapping the unequivocal genetic alterations onto major pathways that are already known to be implicated in glioblastoma. Abbreviations: EGFR, Epidermal Growth Factor Receptor; ERBB2, Erb-B2 Receptor Tyrosine Kinase 2; PDGFRA, Platelet-Derived Growth Factor Receptor Alpha; MET, MET Proto-Oncogene, Receptor Tyrosine Kinase; NF1, Neurofibromin 1; PI(3)K, Phosphatidylinositol-3 Kinase; PTEN, Phosphatase and Tensin Homolog; AKT, AKT Serine/Threonine Kinase; FOXO, Forkhead Box O; CDKN2A, Cyclin-Dependent Kinase Inhibitor 2A; MDM2, MDM2 Proto-Oncogene; MDM4, MDM4 Regulator of P53; TP53, Tumor Protein P53; CDKN2B, Cyclin-Dependent Kinase Inhibitor 2B; CDKN2C, Cyclin-Dependent Kinase Inhibitor 2C; CDK4, Cyclin-Dependent Kinase 4; CCND2, Cyclin D2; CDK6, Cyclin-Dependent Kinase 6; RB1, RB Transcriptional Corepressor 1.
Summary of the studies short-listed for this review.
| No. | Year | Title | Dataset | Input Data Category | Tumor Type | Output Category | Analysis Method |
|---|---|---|---|---|---|---|---|
| 1 | 2008 | Comprehensive genomic characterization defines human glioblastoma genes and core pathways | TCGA | Somatic mutation, copy number change profiles | GBM | Pathway and network | Genomic Identification of Significant Targets in Cancer (GISTIC) algorithm and Genome Topography Scan (GTS) utilizing polynomial regression* |
| 2 | 2013 | Joint and individual variation explained (JIVE) for integrated analysis of multiple data types | TCGA | Gene expression, miRNA expression | GBM | Pathway and network | Joint and Individual Variation Explained (JIVE), which is an extension of PCA* or the SVD* that decomposes the data into low-rank and orthogonal joint and individual components |
| 3 | 2015 | Integrative multi-omics module network inference with Lemon-Tree | TCGA | Gene expression, copy number change profiles | GBM | Pathway and network | The module network method, a special type of Bayesian network* algorithm, with Lemon-Tree |
| 4 | 2015 | Identifying core gene modules in glioblastoma based on multilayer factor-mediated dysfunctional regulatory networks through integrating multi-dimensional genomic data | TCGA | Gene expression, copy number change profiles, somatic mutation, DNA methylation, miRNA expression | GBM | Pathway and network | Core Modules Driving Dysregulation in cancer (CMDD) using PLSR* |
| 5 | 2016 | Causal mechanistic regulatory network for glioblastoma deciphered using systems genetics network analysis | TCGA | Somatic mutation, gene expression, miRNA expression | GBM | Pathway and network | Systems Genetics Network Analysis (SYGNAL) pipeline using cMonkey2 biclustering algorithm* |
| 6 | 2016 | MONGKIE: an integrated tool for network analysis and visualization for multi-omics data | TCGA | Somatic mutation, copy number change profiles | GBM | Pathway and network | Modular Network Generation and Visualization with Knowledge Integration Environments (MONGKIE) using graph clustering* |
| 7 | 2017 | Incorporating prior information into differential network analysis using non-paranormal graphical models | TCGA | Gene expression, copy number change profiles | GBM | Pathway and network | Prior information-dependent differential network analysis (pDNA) using GGM* |
| 8 | 2017 | A systemic analysis of transcriptomic and epigenomic data to reveal regulation patterns for complex disease | TCGA | Gene expression, DNA methylation, miRNA expression | GBM | Pathway and network | Integrative analysis framework incorporating a sparse model, multivariate analysis, elastic net penalized regression*, GGM*, and network analysis |
| 9 | 2018 | Repression of Septin9 and Septin2 suppresses tumor growth of human glioblastoma cells | GEO + cell line | Gene expression, protein expression | GBM | Pathway and network | Multiple analyses combining GBM expression studies from the GEO repository |
| 10 | 2019 | Integrated proteomic and metabolomic profiling the global response of rat glioma model by temozolomide treatment | Rat model (cell line) | Protein expression, metabolomic profiling | GBM | Pathway and network | Ingenuity pathway analysis |
| 11 | 2019 | A multi-cohort and multi-omics meta-analysis framework to identify network-based gene signatures | TCGA, GEO, CGGA | Gene expression, DNA methylation | GBM, LGG | Pathway and network | Multi-cohort and multi-omics meta-analysis framework using perturbation clustering* |
| 12 | 2020 | Identifying cancer driver lncRNAs bridged by functional effectors through integrating multi-omics data in human cancers | TCGA | Gene expression, copy number change profiles, somatic mutation, DNA methylation, miRNA expression | GBM | Pathway and network | DriverLncNet, which integrates multi-omics data to identify lncRNAs as drivers of human cancer, using PLSR* |
| 13 | 2016 | Integrated multi-omics analysis of oligodendroglial tumors identifies three subgroups of 1p/19q co-deleted gliomas | POLA | Gene expression, DNA methylation, miRNA expression | OT | Clinical status | k-means clustering* |
| 14 | 2018 | Whole-genome multi-omic study of survival in patients with glioblastoma | TCGA | Gene expression, DNA methylation, somatic mutation, copy number change profiles | GBM | Clinical status | Multi-layered Bayesian regression* |
| 15 | 2019 | Group lasso regularized deep learning for cancer prognosis from multi-omics and clinical features | TCGA | Gene expression, copy number change profiles, somatic mutation, protein expression | GBM | Clinical status | Group lasso regularized deep learning* |
| 16 | 2019 | A novel MKL Method for GBM prognosis prediction by integrating histopathological image and multi-omics data | TCGA | Histopathological images, gene expression, copy number change profiles, mRNA expression | GBM | Clinical status | Multiple kernel learning* |
| 17 | 2020 | Integration of radiomic and multi-omic analyses predicts survival of newly diagnosed | TCIA, TCGA, MUHC | MRI, gene expression, somatic mutation, clinical, protein expression | | Clinical status | Random forest* |
| 18 | 2020 | Multi-dimensional omics characterization in glioblastoma identifies the purity-associated pattern and prognostic gene signatures | TCGA, GEO, CGGA | Gene expression, copy number change profiles, somatic mutation, DNA methylation | GBM | Clinical status | LASSO* |
| 19 | 2020 | Integrating genomic data with transcriptomic data for improved survival prediction for adult diffuse glioma | TCGA | Gene expression, DNA methylation, somatic mutation, copy number change profiles | GBM, LGG | Clinical status | Random forest* |
| 20 | 2017 | Multi-omics analysis of primary glioblastoma cell lines shows recapitulation of pivotal molecular features of parental tumors | Private dataset | Gene expression, somatic mutation, copy number change profiles | GBM | Miscellaneous | Global Parameters Hidden Markov Model (GPHMM) algorithm* |
| 21 | 2018 | A mechanistic pan-cancer pathway model informed by multi-omics data interprets stochastic cell fate responses to drugs and mitogens | Cell line | Gene expression, copy number change profiles, protein expression | GBM | Miscellaneous | LASSO* and support vector machine* |
| 22 | 2019 | Reduced neoantigen expression revealed by longitudinal multiomics as a possible immune evasion mechanism in glioma | Private dataset | Gene expression, WES | GBM, LGG | Miscellaneous | NetMHCpan using artificial neural networks* |
| 23 | 2020 | Computational identification and characterization of glioma candidate biomarkers through multi-omics integrative profiling | GTEx, TCGA, CGGA, GEO, Ivy GAP | Gene expression, DNA methylation, somatic mutation, protein expression | Glioma | Miscellaneous | Computational integrative multi-omics data analysis |
Abbreviations: ML, Machine Learning; TCGA, The Cancer Genome Atlas; TCIA, The Cancer Imaging Archive; GEO, Gene Expression Omnibus; GTEx, Genotype-Tissue Expression; CGGA, Chinese Glioma Genome Atlas; LGG, Lower-Grade Glioma; GBM, Glioblastoma multiforme; POLA, Prise en charge des OLigodendrogliomes Anaplasiques; OT, Oligodendroglial Tumors; MUHC, McGill University Health Centre; Ivy GAP, Ivy Glioblastoma Atlas Project; PCA, Principal Component Analysis; SVD, Singular Value Decomposition; PLSR, Partial Least Squares Regression; GGM, Gaussian Graphical Model. *: Machine learning method.
Figure 3. Creation of subnetworks consisting of methylation-driven genes, differentially expressed genes, and known interactions using a network propagation algorithm (modified from Reference [48]). As shown here, some pathway and network studies attempt to discover networks in which genes are nodes and connections between genes are edges. Shafi et al. detected differentially expressed genes and methylated genes using the leave-one-out method. They then combined the results, as shown in the figure.
Figure 4. Summary of References [53,54]. Chaddad et al. treated MRI (A) and Zhang et al. treated histopathological images (B) as data comparable to high-throughput data, rather than merely as pictures, to obtain clinical data.
Figure 5. Concept art for future multi-omics analysis. Various types of data are obtained to better understand the nature of the disease and are integrated by machine learning models as well as by humans.