Molecular profiles to biology and pathways: a systems biology approach.

Steven Van Laere, Luc Dirix, Peter Vermeulen.

Abstract

Interpreting molecular profiles in a biological context requires specialized analysis strategies. Initially, lists of relevant genes were screened to identify enriched concepts associated with pathways or specific molecular processes. However, the shortcoming of interpreting gene lists by using predefined sets of genes has resulted in the development of novel methods that heavily rely on network-based concepts. These algorithms have the advantage that they allow a more holistic view of the signaling properties of the condition under study as well as that they are suitable for integrating different data types like gene expression, gene mutation, and even histological parameters.

Keywords:  Data integration; Networks; Pathways; Systems biology; Topology

Year:  2016        PMID: 27311441      PMCID: PMC4910225          DOI: 10.1186/s40880-016-0112-4

Source DB:  PubMed          Journal:  Chin J Cancer        ISSN: 1944-446X


For many researchers worldwide, unravelling the biology of human tumors, either primary or metastatic, is a daily practice. The development of high-throughput technologies, such as microarrays and next-generation sequencing, has greatly accelerated our understanding of the complex molecular underpinnings of cancer development and progression. Large sequencing consortia like The Cancer Genome Atlas and the International Cancer Genome Consortium provide the research community with an unprecedented wealth of genomic, epigenomic, transcriptomic, and proteomic details of various tumor and cell types [1-8]. The articles published so far represent only the initial analyses, and answers to many other important questions may well be buried deep inside the reported data. Many cancer profiling experiments have the common goal of identifying the signal transduction pathways and processes that characterize tumor biology, with the ultimate aim of discovering novel targets for treatment. The current mass production of molecular cancer profiles has spurred the development of novel tools designed specifically for this purpose. Most of these tools build on the concept of gene set enrichment analysis (GSEA), which evaluates whether the overlap between two gene sets is greater than that expected by chance [9]. The analyzed gene sets are lists of significantly mutated or overexpressed genes on the one hand and lists of gene-associated pathways or processes on the other hand. The latter are based on prior knowledge gained through decades of basic and translational research and can be found in various publicly available databases, such as the Gene Ontology, the Kyoto Encyclopedia of Genes and Genomes, and BioCarta.
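The overlap test mentioned above can be illustrated with a minimal sketch. It uses a one-sided hypergeometric test, the statistic behind simple over-representation analysis, rather than the rank-based GSEA algorithm itself. The function name, the gene identifiers, and the toy gene universe are hypothetical and are not taken from the article.

```python
from math import comb


def overlap_enrichment_pvalue(hits, pathway, universe):
    """Return (overlap size, upper-tail hypergeometric p-value).

    hits     -- genes of interest (e.g., overexpressed genes); hypothetical input
    pathway  -- genes annotated to one pathway or process
    universe -- all genes that could have been selected (the background)
    """
    universe = set(universe)
    hits = set(hits) & universe        # ignore genes outside the background
    pathway = set(pathway) & universe
    N, K, n = len(universe), len(pathway), len(hits)
    k = len(hits & pathway)
    # P(X >= k) when n genes are drawn without replacement from N genes,
    # K of which belong to the pathway.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, tail / comb(N, n)


# Toy example: 20 background genes, a 5-gene pathway, 4 genes of interest.
universe = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
hits = ["g0", "g1", "g2", "g10"]
k, p = overlap_enrichment_pvalue(hits, pathway, universe)
print(f"overlap={k}, p={p:.4f}")  # overlap=3, p=0.0320
```

In practice, the p-values obtained for many pathways would also need correction for multiple testing, which this sketch leaves out.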
However, novel biological insights have increasingly called into question the classical representation of processes, and particularly pathways, as hierarchically structured and mostly linear diagrams of protein–protein interactions that are sharply delineated from the broader cellular signal transduction network. Instead, systems biologists now tend to think of pathways and processes as warm and fuzzy clouds: warm since their representation is close to the truth but not necessarily exact; fuzzy since the membership of components in a pathway is graded and dynamic, so that not all components are equally important and their importance may vary; and a cloud since the boundaries of a pathway are not sharply defined because, among other reasons, many pathways and processes connect to form a network [10]. This new definition has inspired the development of novel algorithms that incorporate network statistics to derive biological knowledge from molecular profiles. Two important steps can be discerned: (1) network inference, which is the process of building networks from molecular data [11], and (2) network enrichment analysis, which uses the topological information present in the network to identify which pathways and processes are relevant and how they are associated with each other in the context of the network. Network analysis enables researchers to develop a more holistic view of cell signaling patterns and their interactions [12].
At this point, it is important to introduce two conceptually different approaches to deriving biological meaning from molecular profiles. To illustrate, consider a gene expression profile from cancer cells treated with a ligand (e.g., vascular endothelial growth factor). The gene expression profile can be regarded as a functional read-out of a set of pathways that are activated upon ligand-receptor interaction and that culminate in the activation of transcription factors, which cause expression changes. Methods that reveal these upstream signaling pathways based on expression changes exist and are henceforth termed “bottom-up” approaches. In turn, expression changes elicit a biological response, for example the induction of angiogenesis. Methods that translate gene expression changes, or any other molecular profile, into downstream biological responses are henceforth termed “top-down” approaches.
The top-down approach starts from the identified molecular profile (e.g., the expression profile from the hypothetical experiment described above). The list of genes then goes through a sequence of (1) network inference, (2) network topology analysis, (3) pathway identification through GSEA, and (4) pathway prioritization using the network and pathway statistics obtained in steps 2 and 3. Pathway mutual exclusivity and co-occurrence, revealed by evaluating overlaps between sets of pathway-specific genes, may provide extra guidance during pathway prioritization [13]. In the bottom-up approach, the same sequence is used, but the genes from which the sequence starts are the transcription factors identified through target gene enrichment analysis, that is, the transcription factors that drive the molecular profile. Prior to target gene enrichment analysis, gene clustering can assist in finding sets of co-regulated genes, and identifying transcription factors based on co-regulated target gene expression may lead to more biologically meaningful results [14].
With respect to the analysis strategies outlined above, four remarks should be made. First, intuitively, the bottom-up approach is better for identifying pathways, whereas the top-down approach is more appropriate for evaluating biological processes. Nevertheless, the top-down approach performs equally well in identifying signal transduction pathway activation secondary to the prior molecular changes. Therefore, these two approaches should not be regarded as mutually exclusive with respect to the pathway-process distinction. Second, networks are extremely well suited for data integration [12]. Therefore, the outlined analysis strategies can be used to perform pathway analysis based on multiple molecular profiles (e.g., mutational and expression data). The dynamic properties of signaling networks (e.g., feed-back and feed-forward loops) can be visualized, for example by incorporating gene expression fold-changes during the network analysis stage. Similarly, the contributions of different cell types to the signaling network can be evaluated by supplying metagene expressions, which can be regarded as the expression contributions of independent cell populations that co-exist in a tumor and which can be identified through dimensionality reduction techniques, such as principal component analysis. Relevant network interactions can then be analyzed using hidden Markov models or conditional random field models [15]. Third, although the bottom-up approach can be applied to all molecular profiles, the description provided in the previous paragraph is only truly applicable to transcriptional profiles, because it includes the target gene enrichment analysis step. Last, deciphering tumor biology from molecular profiles critically depends on the quality of the tissue sampling procedure. From this perspective, the “garbage in, garbage out” principle from computer science, which states that computers fed flawed input data will produce nonsensical output, also applies in this setting. Several technical aspects that may have deleterious effects on data quality should be considered, including ischemia and fixation time, the fixatives used, and potential sampling bias. No matter how sophisticated the analysis pipeline is, such quality issues cannot be resolved afterwards. Thus, researchers who are developing cancer profiling studies should focus primarily on establishing a rigorous tissue sampling protocol. In addition, the tissue sampling protocol could take tumor heterogeneity into account by providing multiple samples from the same tumor for molecular analysis. This may enable researchers to analyze in detail the different biological processes and signal transduction mechanisms operating in one tumor; it may also enable them to understand how the integration of these signals drives tumor biology, such as metastatic progression and therapy resistance.
Finally, the outlined analysis strategy does not represent a novel algorithm. Rather, it presents an analysis philosophy that builds on already existing tools, most of which are freely accessible online, such as Bioconductor and R packages, Cytoscape plugins, and Java tools like Expression2Kinases [14].
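As a rough illustration of the top-down sequence described above, the following sketch strings the four steps together on toy data. Everything in it is an assumption made for illustration and none of it is the authors' method: correlation thresholding stands in for network inference, node degree stands in for topology analysis, pathway membership is supplied directly rather than found through GSEA, and the summed-degree score is a hypothetical prioritization statistic. A Jaccard index between pathway gene sets is included as one simple way to look at pathway overlap.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def infer_network(expression, threshold=0.8):
    """Step 1 (assumed method): link genes with |correlation| >= threshold."""
    edges = set()
    for g1, g2 in combinations(sorted(expression), 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


def degree(edges):
    """Step 2: a basic topology statistic, the number of neighbours per gene."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


def prioritize(pathways, deg):
    """Steps 3-4 (hypothetical score): rank pathways by summed member degree."""
    scores = {name: sum(deg.get(g, 0) for g in genes)
              for name, genes in pathways.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def jaccard(a, b):
    """Overlap between two pathway gene sets (0 = disjoint, 1 = identical)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Toy expression profiles (gene -> values across four samples).
expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],   # rises with A
    "C": [4, 3, 2, 1],   # falls as A rises
    "D": [2, 1, 2, 1],   # only weakly related to the others
}
pathways = {"P1": {"A", "B"}, "P2": {"C", "D"}, "P3": {"D"}}

edges = infer_network(expression)
ranking = prioritize(pathways, degree(edges))
print(sorted(edges))   # [('A', 'B'), ('A', 'C'), ('B', 'C')]
print(ranking)         # [('P1', 4), ('P2', 2), ('P3', 0)]
```

A bottom-up variant would keep the same functions but start from the transcription factors returned by a target gene enrichment step instead of from the profiled genes themselves.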
  12 in total

1.  Expression2Kinases: mRNA profiling linked to multiple upstream regulatory layers.

Authors:  Edward Y Chen; Huilei Xu; Simon Gordonov; Maribel P Lim; Matthew H Perkins; Avi Ma'ayan
Journal:  Bioinformatics       Date:  2011-11-10       Impact factor: 6.937

2.  Pathways of Toxicity.

Authors:  Andre Kleensang; Alexandra Maertens; Michael Rosenberg; Suzanne Fitzpatrick; Justin Lamb; Scott Auerbach; Richard Brennan; Kevin M Crofton; Ben Gordon; Albert J Fornace; Kevin Gaido; David Gerhold; Robin Haw; Adriano Henney; Avi Ma'ayan; Mary McBride; Stefano Monti; Michael F Ochs; Akhilesh Pandey; Roded Sharan; Rob Stierum; Stuart Tugendreich; Catherine Willett; Clemens Wittwehr; Jianguo Xia; Geoffrey W Patton; Kirk Arvidson; Mounir Bouhifd; Helena T Hogberg; Thomas Luechtefeld; Lena Smirnova; Liang Zhao; Yeyejide Adeleye; Minoru Kanehisa; Paul Carmichael; Melvin E Andersen; Thomas Hartung
Journal:  ALTEX       Date:  2013-10-15       Impact factor: 6.043

3.  Pathway Distiller - multisource biological pathway consolidation.

Authors:  Mark S Doderer; Zachry Anguiano; Uthra Suresh; Ravi Dashnamoorthy; Alexander J R Bishop; Yidong Chen
Journal:  BMC Genomics       Date:  2012-10-26       Impact factor: 3.969

4.  Integrative genomic profiling of human prostate cancer.

Authors:  Barry S Taylor; Nikolaus Schultz; Haley Hieronymus; Anuradha Gopalan; Yonghong Xiao; Brett S Carver; Vivek K Arora; Poorvi Kaushik; Ethan Cerami; Boris Reva; Yevgeniy Antipin; Nicholas Mitsiades; Thomas Landers; Igor Dolgalev; John E Major; Manda Wilson; Nicholas D Socci; Alex E Lash; Adriana Heguy; James A Eastham; Howard I Scher; Victor E Reuter; Peter T Scardino; Chris Sander; Charles L Sawyers; William L Gerald
Journal:  Cancer Cell       Date:  2010-06-24       Impact factor: 31.743

Review 5.  Boosting signal-to-noise in complex biology: prior knowledge is power.

Authors:  Trey Ideker; Janusz Dutkowski; Leroy Hood
Journal:  Cell       Date:  2011-03-18       Impact factor: 41.582

6.  Comprehensive molecular characterization of human colon and rectal cancer.

Authors: 
Journal:  Nature       Date:  2012-07-18       Impact factor: 49.962

7.  Higher order structure in the cancer transcriptome and systems medicine.

Authors:  Edison T Liu; Thomas Lemberger
Journal:  Mol Syst Biol       Date:  2007-03-13       Impact factor: 11.429

8.  How to infer gene networks from expression profiles.

Authors:  Mukesh Bansal; Vincenzo Belcastro; Alberto Ambesi-Impiombato; Diego di Bernardo
Journal:  Mol Syst Biol       Date:  2007-02-13       Impact factor: 11.429

9.  Detection and characterization of regulatory elements using probabilistic conditional random field and hidden Markov models.

Authors:  Hongyan Wang; Xiaobo Zhou
Journal:  Chin J Cancer       Date:  2012-12-07

10.  Comprehensive molecular portraits of human breast tumours.

Authors: 
Journal:  Nature       Date:  2012-09-23       Impact factor: 49.962

  3 in total

1.  Regulatory interactions between long noncoding RNA LINC00968 and miR-9-3p in non-small cell lung cancer: A bioinformatic analysis based on miRNA microarray, GEO and TCGA.

Authors:  Dong-Yao Li; Wen-Jie Chen; Jun Shang; Gang Chen; Shi-Kang Li
Journal:  Oncol Lett       Date:  2018-04-12       Impact factor: 2.967

2.  Mechanisms of action of sacubitril/valsartan on cardiac remodeling: a systems biology approach.

Authors:  Oriol Iborra-Egea; Carolina Gálvez-Montón; Santiago Roura; Isaac Perea-Gil; Cristina Prat-Vidal; Carolina Soler-Botija; Antoni Bayes-Genis
Journal:  NPJ Syst Biol Appl       Date:  2017-04-18

3.  Identification of the difference in the pathogenesis in heart failure arising from different etiologies using a microarray dataset.

Authors:  Guodong Yang; Shuping Chen; Aiqun Ma; Jun Lu; Tingzhong Wang
Journal:  Clinics (Sao Paulo)       Date:  2017-10       Impact factor: 2.365
