High-Throughput Mutation Data Now Complement Transcriptomic Profiling: Advances in Molecular Pathway Activation Analysis Approach in Cancer Biology.

Anton Buzdin1,2,3, Maxim Sorokin1,2,3, Elena Poddubskaya1,4, Nicolas Borisov1,2.   

Abstract

We recently reviewed the current progress in the use of high-throughput molecular "omics" data for the quantitative analysis of molecular pathway activation. These quantitative metrics may be used in many ways, and we focused on their application as tumor biomarkers. Here, we provide an update of the most recent conceptual findings related to pathway analysis in tumor biology, which were not included in the previous review. The major novelties include a method enabling calculation of pathway-scale tumor mutation burden termed "Pathway Instability" and its application for scoring of anticancer target drugs. A new technique termed Shambhala emerged that enables accurate common harmonization of any number of gene expression profiles obtained using any number of experimental platforms. This may be helpful for merging various gene expression data sets and for comparing their pathway activation characteristics. Another recent bioinformatics method, termed FLOating-Window Projective Separator (FloWPS), has the potential to significantly enhance the value of pathway activation profiles as biomarkers of cancer response to treatments. It reduces the minimum required number of training samples needed to construct a machine-learning-based classifier. Finally, several documented clinical cases have been recently published, in which gene-expression-based pathway analysis was successfully used for personalized off-label prescription of target drugs to metastatic cancer patients.

Keywords:  bioinformatics; cancer; machine learning; mutation profiling; signaling pathways

Year:  2019        PMID: 30936679      PMCID: PMC6434430          DOI: 10.1177/1176935119838844

Source DB:  PubMed          Journal:  Cancer Inform        ISSN: 1176-9351


Comment On: Buzdin A, Sorokin M, Garazha A, et al. Molecular pathway activation—new type of biomarkers for tumor morphology and personalized selection of target drugs. Semin Cancer Biol. 2018;53:110-124. doi:10.1016/j.semcancer.2018.06.003. PubMed PMID: 29935311. https://www.ncbi.nlm.nih.gov/pubmed/29935311.

In the recent paper by Buzdin et al,[1] we reviewed methods of molecular pathway analysis and their applications in basic research and biomedicine, primarily as cancer biomarkers. These methods deal with high-throughput gene expression and epigenetic data: mRNA and proteomic profiles are used for gene expression analysis, whereas microRNA profiles and densities of transcription factor binding sites (TFBS) are the epigenetic landmarks reported as useful sources of primary data for pathway analyses. Surprisingly, no genetic data, such as cancer mutation profiles, had been reported as material for high-throughput modeling of molecular pathway activation. Accordingly, no such approaches were mentioned in the review.[1]

This changed recently when Zolotovskaia et al[2,3] published 2 research papers in which cancer exome sequencing profiles were used to aggregate DNA mutation data into a value termed “pathway instability” (PI). PI reflects the overall mutation burden of a pathway. It can be calculated for either all mutations or any specific group of them, such as truncating mutations that abrogate protein functions.[3] In both cases, PI serves as a significantly better type of biomarker than mutations in individual genes.[3] A study of 5956 individual mutation profiles from 15 cancer types, encompassing 2 316 670 mutations in 19 872 genes and 1748 molecular pathways, demonstrated a marked advantage of pathway-based mutation biomarkers (PI) over mutation profiles of individual genes.
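To make the idea of a pathway-scale mutation burden concrete, the following is a minimal sketch and not the published PI formula. It assumes that PI for one sample can be approximated as the mean, over the genes of a pathway, of the mutation count normalized to coding sequence length. The function name, the per-1000-nucleotide scaling, and the toy gene names and lengths are illustrative assumptions; the exact definition is given in the cited papers.[2,3]

```python
from typing import Dict, Iterable


def pathway_instability(mutation_counts: Dict[str, int],
                        cds_lengths: Dict[str, int],
                        pathway_genes: Iterable[str]) -> float:
    """Mean length-normalized mutation count over the genes of one pathway.

    Assumption: normalization is mutations per 1000 coding nucleotides.
    Genes with no recorded mutation contribute a rate of 0.
    Genes without a known coding length are skipped.
    """
    genes = [g for g in pathway_genes if g in cds_lengths]
    if not genes:
        return 0.0
    rates = [1000.0 * mutation_counts.get(g, 0) / cds_lengths[g] for g in genes]
    return sum(rates) / len(genes)


# Toy data (hypothetical genes and lengths, for illustration only).
sample_mutations = {"GENE_A": 1, "GENE_B": 2}
coding_lengths = {"GENE_A": 1000, "GENE_B": 2000, "GENE_C": 500}
pi_value = pathway_instability(sample_mutations, coding_lengths,
                               ["GENE_A", "GENE_B", "GENE_C"])
print(round(pi_value, 4))
```

Restricting `mutation_counts` to a subset of mutations, such as truncating ones, would give the corresponding subset-specific value under the same assumptions.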
This was reflected by numbers of high-quality biomarkers (area under the receiver operating characteristic curve [AUC ROC] > 0.75) that were more than 2 orders of magnitude greater. For example, the number of such biomarkers distinguishing between different cancer types in “one versus all” comparisons was only 6 (1 for truncating mutations only) for the gene mutations, but 660 (21 for truncating mutations) for the PI values. Similarly, in pairwise comparisons of the above 15 cancer types, a total of 32 594 (1024 for truncating mutations) good-quality PI biomarkers were identified versus only 226 (24 for truncating mutations) gene mutation biomarkers.[3] Considering that the potential number of PI biomarkers (1748) was already 1 order of magnitude lower than the potential number of gene biomarkers (19 872 for all mutations and 16 760 for truncating mutations), this provides even stronger evidence for the superiority of the PI-based approach over mutations in individual genes.[3]

Furthermore, a molecular pathway-based algorithm for the scoring of anticancer target drugs (ATDs) was developed that uses PI values to predict the clinical efficacy of drugs. It quantifies mutation enrichment in the molecular pathways that contain molecular targets of the drug under consideration. The rationale was that the greater the mutation level of the respective pathways, the higher the expected drug efficacy. The output value, termed Mutation Drug Scoring (MDS), positively correlates with the expected efficacy of drugs for a specific tumor. Among 10 versions tested, the best version of this algorithm was selected and validated using 3800 exome mutation profiles for 128 ATDs by finding correlations of MDS values with the known drug efficacies established in clinical trials.[2] The same approach was next applied to simulate all known protein-coding genes as putative drug targets using 18 273 mutation profiles for 8 cancer types.
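The biomarker counts quoted earlier in this section rest on a simple screening criterion: a feature qualifies when its AUC ROC exceeds 0.75 in a given comparison. The sketch below illustrates a “one versus all” screen of that kind on toy data. The function names and the toy values are assumptions made for illustration, and the AUC is computed through its rank interpretation rather than with any particular library.

```python
from typing import Dict, List, Sequence, Tuple


def auc_roc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative.

    Ties count as half a win (rank-based definition of AUC ROC).
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_all_hits(values: Dict[str, List[float]],
                    labels: List[str],
                    threshold: float = 0.75) -> List[Tuple[str, str]]:
    """Return (feature, cancer type) pairs whose AUC exceeds the threshold."""
    hits = []
    for cancer_type in sorted(set(labels)):
        for name, vals in values.items():
            pos = [v for v, lab in zip(vals, labels) if lab == cancer_type]
            neg = [v for v, lab in zip(vals, labels) if lab != cancer_type]
            if auc_roc(pos, neg) > threshold:
                hits.append((name, cancer_type))
    return hits


# Toy data: six samples from three hypothetical cancer types.
labels = ["A", "A", "B", "B", "C", "C"]
features = {
    "pathway_1": [0.9, 0.8, 0.1, 0.2, 0.15, 0.1],  # elevated in type A only
    "pathway_2": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],   # uninformative
}
hits = one_vs_all_hits(features, labels)
print(hits)
```

The same `auc_roc` helper could be applied to pairs of cancer types for pairwise comparisons.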
Indeed, the MDS-predicted hits very frequently coincided with those already used as targets of the existing ATDs[2] (Figure 1). However, several novel candidate genes that can be considered promising targets for new drug development were identified. The MDS approach, therefore, is applicable to both the ranking of drugs and the identification of molecular targets for biopharmaceuticals. These applications of mutation data analysis are presented in detail in a recent methods paper.
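The drug scoring rationale described above can be sketched as follows. This is a simplified illustration, assuming that a drug's score is the sum of PI values over all pathways containing at least one of its molecular targets. The source reports that 10 versions of the algorithm were tested, and this sketch does not claim to reproduce the selected one; all names and numbers are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def mutation_drug_score(drug_targets: Iterable[str],
                        pathways: Dict[str, Set[str]],
                        pi_values: Dict[str, float]) -> float:
    """Sum PI over the pathways that include at least one target of the drug."""
    targets = set(drug_targets)
    return sum(pi_values[name]
               for name, genes in pathways.items()
               if targets & genes)


def rank_drugs(drugs: Dict[str, List[str]],
               pathways: Dict[str, Set[str]],
               pi_values: Dict[str, float]) -> List[str]:
    """Order drug names from highest to lowest score."""
    scores = {d: mutation_drug_score(t, pathways, pi_values)
              for d, t in drugs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical pathways, genes, and drugs).
pathways = {"P1": {"G1", "G2"}, "P2": {"G3"}, "P3": {"G2", "G4"}}
pi_values = {"P1": 0.5, "P2": 2.0, "P3": 0.25}
drugs = {"drug_x": ["G2"], "drug_y": ["G3"]}
ranking = rank_drugs(drugs, pathways, pi_values)
print(ranking)
```

Treating every protein-coding gene as the single target of a hypothetical drug, as in the target simulation described above, would amount to calling `mutation_drug_score` once per gene.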
Figure 1.

Dependence of MDS and occurrence of molecular targets in approved cancer drugs. Distribution of MDS values among the potential molecular drug targets. The color scale on the graph indicates densities of clinically approved cancer drugs exploiting the respective molecular targets. MDS indicates Mutation Drug Scoring.

A pioneering application of the molecular pathway approach to TFBS data was reviewed previously[1,4]; here, we only mention that it was used in a recent high-throughput study of human genes with rapidly or slowly evolving regulatory modules. Using a complete set of the ENCODE project TFBS data for 563 transcription factors in 13 human cell lines,[5] the authors found that the major quickly reshaping processes in human evolution deal with gene regulation by microRNAs, olfaction, color vision, fertilization, cellular immune response, detoxification, and amino acid and fatty acid metabolism. Genes from the “slow” group were involved in protein translation, RNA transcription and processing, chromatin organization, and various aspects of intracellular molecular signaling.[5]

Another recent methods paper summarizes details of the study of molecular pathway activation using RNA sequencing data. At the first stage, this analysis compares gene expression in a sample with a pool of reference (“normal”) expression profiles. Several large collections of normal human tissue expression data sets that can be used for such analyses were published in the last year.
Moreover, a novel version of the Oncobox pathway analysis algorithm[7] adds a new step of transcriptomic data processing, termed “Shambhala harmonization,” which makes gene expression profiles of normal and case biosamples compatible.[6]

As previously shown for cell culture gene expression data, molecular pathway activation data may be used to train machine learning algorithms to predict the efficacy of ATDs.[7-9] However, these algorithms have very limited applicability to clinical molecular data because, to avoid overtraining, they require large cohorts of ATD responder/non-responder molecular profiles, which are not available in the overwhelming majority of cases. Although this major limitation persists, a novel approach termed FLOating-Window Projective Separator (FloWPS), which may significantly reduce the number of samples needed for a statistically significant analysis, was recently published.[10] Briefly, it is a data trimming procedure tailored for personalized predictions based on molecular data. In its initial application, it uses support vector machines (SVMs), a machine learning method that is highly popular for the analysis of biomedical data.[10] FloWPS prevents the SVM from extrapolating by excluding non-informative features. It requires training on data from individuals with known clinical outcomes to create a clinically relevant classifier. Its distinctive property is that irrelevant features of a validation data point, that is, those without a significant number of neighboring hits in the training data set, are removed from further analysis. Next, for each point of a validation data set, FloWPS takes into account only the proximal points of the training data set.
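The two trimming steps just described can be illustrated with a short sketch. It is not the published FloWPS implementation. It assumes that a feature is kept for a given validation point only if at least `m` training values lie on each side of that point along the feature axis, and that the `k` nearest training points are then selected by Euclidean distance over the kept features. The parameter names `m` and `k`, the distance metric, and the toy data are assumptions; the classifier itself (an SVM in the original application) would be trained separately on the returned subset.

```python
import math
from typing import List, Sequence, Tuple


def floating_window(train_x: Sequence[Sequence[float]],
                    query: Sequence[float],
                    m: int,
                    k: int) -> Tuple[List[int], List[int]]:
    """Select features and training points relevant to one validation point.

    Returns (indices of kept features, indices of the k nearest training rows).
    """
    kept = []
    for f in range(len(query)):
        below = sum(1 for row in train_x if row[f] < query[f])
        above = sum(1 for row in train_x if row[f] > query[f])
        # Keep the feature only if the query is surrounded by training data,
        # so that a downstream classifier does not have to extrapolate.
        if below >= m and above >= m:
            kept.append(f)

    def distance(row: Sequence[float]) -> float:
        return math.sqrt(sum((row[f] - query[f]) ** 2 for f in kept))

    order = sorted(range(len(train_x)), key=lambda i: distance(train_x[i]))
    return kept, order[:k]


# Toy training set: four samples, two features (hypothetical values).
train_x = [[1.0, 10.0], [2.0, 11.0], [3.0, 12.0], [4.0, 13.0]]
query = [2.5, 20.0]  # the second feature lies outside the training range
kept_features, neighbors = floating_window(train_x, query, m=1, k=2)
print(kept_features, sorted(neighbors))
```

In this toy case the second feature is dropped because no training value exceeds the query along that axis, and the two training samples closest to the query along the first feature form the window.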
Thus, for every point of a validation data set, the training data set is adjusted to form a floating window.[10] Validation of the method on 992 cancer patient expression data sets confirmed that FloWPS can significantly increase the quality of a classifier.[10] So far, FloWPS has been tested on single-gene expression data only. We anticipate that applying this method to molecular pathway activation levels will be fruitful for developing novel robust tools for classifying responses to drugs, treatment methods, and their combinations.

Finally, a second-opinion platform termed Oncobox has been developed for clinical oncologists; it uses the original pathway activation scoring algorithm working with RNA-seq data and personalizes the selection of ATDs for individual cancer patients. Pathway activation levels are calculated and, along with the concentrations of molecular target gene products, are used as predictors of tumor response to ATDs, expressed as the Balanced Drug Efficiency Score (BES) index. Previously, a pilot prospective clinical study was performed for a cohort of 23 recurrent/metastatic solid tumor patients using microarray gene expression data.[11] The objective response rate for the Oncobox-guided ATD prescriptions was ~61% (complete + partial response, RECIST). In April 2018, a new trial using RNA-seq data for recurrent/metastatic solid tumors was started; it already includes 239 patients (trial ID NCT03724097). RNAs were extracted from formalin-fixed, paraffin-embedded (FFPE) tissue blocks. Following the test, 130 ATDs were rated according to their predicted effectiveness. After therapy is prescribed, patients are divided into 3 observation groups.
The first group consists of patients receiving target drugs in agreement with the Oncobox drug efficiency prediction, as monotherapy or in combination(s); the second group receives only drugs not recommended according to the Oncobox tests; and the third group receives palliative care. The trial is ongoing and no preliminary results are available so far, but several recently published clinical case studies using the same system look promising and suggest that the Oncobox system may be a useful tool for predicting the efficacy of tyrosine kinase inhibitor ATDs in advanced metastatic tumors. For example, Oncobox-guided off-label prescription of imatinib (Figure 2) was effective for a patient with recurrent granulosa cell ovarian tumor,[12] and sequential prescription of sorafenib and pazopanib (Figure 3) to a patient with unresectable metastatic cholangiocarcinoma also showed promising results.[13]
Figure 2.

ERK signaling pathway was hyperactivated in the patient’s tumor tissue. Visualization was provided by Oncobox software. The pathway is shown as an interacting network, where green arrows indicate activation and red arrows indicate inhibition. Color depth of each node of the network corresponds to the logarithms of the case-to-normal (CNR) expression rate for each node, where “normal” is a geometric average between normal tissue samples, and the scale represents extent of up-/down-regulation. The molecular targets of Imatinib are shown by black arrows. ERK indicates extracellular signal-regulated kinase.

Figure 3.

(A) ERK and (B) Ras signaling pathways were hyperactivated in the biopsy CCA tissue. Visualization was provided by Oncobox software. The pathways are shown as an interacting network, where green arrows indicate activation and red arrows indicate inhibition. Color depth of each node of the network corresponds to the logarithms of the case-to-normal (CNR) expression rate for each node, where “normal” is a geometric average between normal tissue samples, and the scale represents extent of up-/down-regulation. The molecular targets of sorafenib and pazopanib are shown by black arrows. ERK indicates extracellular signal-regulated kinase; CCA, cholangiocarcinoma.
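The node coloring in Figures 2 and 3 is based on the logarithm of the case-to-normal ratio (CNR), where “normal” is the geometric average over normal tissue samples. A minimal sketch of that calculation is given below. The base of the logarithm is not stated in the captions, so base 2 is an assumption here, as are the function names and the toy expression values. Expression values are assumed to be strictly positive.

```python
import math
from typing import Sequence


def geometric_mean(values: Sequence[float]) -> float:
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def log_cnr(case_value: float, normal_values: Sequence[float]) -> float:
    """Base-2 logarithm of case expression over the geometric mean of normals.

    Positive results indicate up-regulation in the case sample,
    negative results indicate down-regulation.
    """
    return math.log2(case_value / geometric_mean(normal_values))


# Toy example: two normal samples with expression 2 and 8 (geometric mean 4).
up = log_cnr(16.0, [2.0, 8.0])
down = log_cnr(1.0, [2.0, 8.0])
print(round(up, 6), round(down, 6))
```

A gene with zero measured expression would need a pseudocount or similar handling before this calculation, which the sketch leaves out.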

An alternative line of biomedical investigation may deal with pathway activation-guided personalized selection of ATD combinations. In theory, the same pathway approaches that were shown to be effective with single ATDs may also work for their combinations. However, these studies are at an early stage, with 1 recent report predicting and experimentally validating various effective ATD combinations for cancer cell lines based on molecular pathway activation data.[14] We expect intensive development of this field, assuming that ATDs may work synergistically in a patient-specific manner.
We conclude, therefore, that the field covered by the topic “Molecular pathway activation—New type of biomarkers for tumor morphology and personalized selection of target drugs” is evolving so rapidly that the landscape depicted in our previous review[1] is already hardly recognizable less than a year later.