
HitWalker2: visual analytics for precision medicine and beyond.

Daniel Bottomly, Shannon K McWeeney, Beth Wilmot.

Abstract

SUMMARY: The lack of visualization frameworks to guide interpretation and facilitate discovery is a potential bottleneck for precision medicine, systems genetics and other studies. To address this, we have developed an interactive, reproducible, web-based prioritization approach that builds on our earlier work. HitWalker2 is highly flexible and can use many data types and prioritization methods, depending on the available data and the questions being asked, which allows it to be applied in a diverse range of studies such as cancer, infectious disease and psychiatric disorders.
AVAILABILITY AND IMPLEMENTATION: Source code is freely available at https://github.com/biodev/HitWalker2. HitWalker2 is implemented using Python/Django, Neo4j and JavaScript (D3.js and jQuery). We support major open source browsers (e.g. Firefox and Chromium/Chrome).
CONTACT: wilmotb@ohsu.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Additional information/instructions are available at https://github.com/biodev/HitWalker2/wiki.

Year: 2015; PMID: 26708334; PMCID: PMC4824131; DOI: 10.1093/bioinformatics/btv739

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937

1 Introduction

Across domains, the need to integrate and prioritize genes or variants is a common theme, both for therapeutic selection and for mechanistic and perturbation studies. Network-context methods can facilitate the ranking of genes and their associated genetic variants/mutations. For instance, some rank the variant genes of individual subjects relative to orthogonal biological assay data (Bottomly et al.; Jia and Zhao, 2014), whereas others can be used for GWAS or QTL studies (Köhler et al.). As a whole, these approaches can integrate different data types and statically report results, but until now they have not focused on making the data ‘accessible’ with respect to discovery, interpretation and knowledge acquisition. To this end, we developed HitWalker2, a highly customizable approach both to producing a ranked list of genes using network and external information and to exploring these results through graph-centric interactive visualizations. Details of the HitWalker2 workflow and of the framework itself can be found in the Supplementary Material.
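To make the idea of network-context ranking concrete, the following is a minimal sketch of random walk with restart, one widely used network propagation technique of the kind applied in "walking the interactome" approaches. It is an illustration only and is not taken from the HitWalker2 source; the function name, the restart probability of 0.3, and the toy network with nodes A to E are all assumptions.

```python
def random_walk_with_restart(edges, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes (hypothetical helper)."""
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    nodes = sorted(neighbors)

    # Restart distribution: uniform over the seeds present in the network.
    present = [s for s in seeds if s in neighbors]
    if not present:
        raise ValueError("none of the seeds are in the network")
    p0 = {n: (1.0 / len(present) if n in present else 0.0) for n in nodes}

    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbor m passes on its probability split evenly over its edges.
            inflow = sum(p[m] / len(neighbors[m]) for m in neighbors[n])
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy network (assumed): A-B, B-C, C-D, B-E, with A as the only seed.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]
scores = random_walk_with_restart(toy_edges, seeds=["A"])
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In a setting like the one described here, the seeds would stand for genes supported by assay data and the resulting scores would be used to rank a subject's variant genes; that mapping is an assumption of this sketch.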

2 Description of software

2.1 Overview

The original HitWalker R package provided a means to prioritize variants stored in an SQL database and to display the results as a static image of a relevant subnetwork together with a text summary that could be exported to an Excel document. Interaction with the program was via R syntax, so it required a working R environment (or access to a server that provided one) and some familiarity with the language (or additional training time). HitWalker2 contains all of the features of the original but was additionally designed to let users access meaningful aspects of their data through interactions with genes, subjects and groups, presented as a series of graphs organized as panels. The vast majority of these interactions involve mouse clicks and dragging operations, a paradigm familiar to bench scientists and clinicians. HitWalker2 extends the static subnetwork image of HitWalker to allow user interactivity at multiple levels. This interactivity goes beyond relatively basic features such as reorganizing nodes and retrieving information about them. More importantly, it provides the ability to ask biologically meaningful questions with or without the use of a prioritization approach.

2.2 Functionality

In the original HitWalker framework, the key focus was on prioritizing genes in individual samples. In HitWalker2 we have extended this to include: (i) summarization and subsetting of cohort-level phenotypic attributes, (ii) cohort-level identification of recurrent genes based on ‘hits’ (as defined by thresholding multiple biological assay results), (iii) identification of the subset of the cohort with the same hit results in a given set of genes, (iv) pathway context for results (individual, subset and full cohort), (v) the ability to group subjects and genes by specified relationships/queries, (vi) the ability to export results via CSV/PDF and (vii) a fully interactive web-based visualization platform.
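Items (ii) and (iii) above can be illustrated with a short sketch: assay values are thresholded into per-subject hits, hits are counted across the cohort, and subjects sharing hits in a chosen gene set are retrieved. This is not HitWalker2 code; the function names, the threshold of 2.0, and the subject and gene labels are hypothetical placeholders.

```python
from collections import Counter


def call_hits(assay_values, threshold):
    """Return the genes whose assay value meets or exceeds the threshold."""
    return {gene for gene, value in assay_values.items() if value >= threshold}


def recurrent_genes(cohort_hits, min_subjects=2):
    """List (gene, subject count) pairs for genes hit in at least min_subjects subjects."""
    counts = Counter(gene for hits in cohort_hits.values() for gene in hits)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(gene, n) for gene, n in ordered if n >= min_subjects]


def subjects_sharing_hits(cohort_hits, genes):
    """Return the subjects that have a hit in every gene of the given set."""
    wanted = set(genes)
    return sorted(s for s, hits in cohort_hits.items() if wanted <= hits)


# Placeholder assay values per subject (assumed, not real data).
cohort_assays = {
    "S1": {"GENE_A": 2.5, "GENE_B": 0.4},
    "S2": {"GENE_A": 3.1, "GENE_C": 2.2},
    "S3": {"GENE_B": 2.9, "GENE_C": 0.1},
}
cohort_hits = {s: call_hits(values, threshold=2.0) for s, values in cohort_assays.items()}
recurrent = recurrent_genes(cohort_hits)
sharing = subjects_sharing_hits(cohort_hits, ["GENE_A"])
print(recurrent, sharing)
```

A single threshold is used here for brevity; with multiple assay types, each would presumably carry its own threshold before the hits are combined.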

3 Applications

3.1 Overview

We created a base database for human samples consisting of high confidence STRING interactions (Jensen et al.), pathways from Pathway Commons (Cerami et al.) and gene symbol mapping information from NCBI (Maglott et al.). This database is available on our GitHub wiki, along with the steps used in its creation.
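As an illustration of what selecting high confidence interactions might involve, the sketch below filters a space-delimited interaction table by its combined score. The cutoff used for the base database is not stated here; 700 (on the 0 to 1000 scale that STRING uses for combined scores) is STRING's conventional "high confidence" level and is an assumption in this sketch, as are the column layout, the function name and the protein identifiers P1 to P4.

```python
import csv
import io

HIGH_CONFIDENCE = 700  # assumed cutoff on a 0-1000 combined score scale


def high_confidence_edges(handle, min_score=HIGH_CONFIDENCE):
    """Return sorted, de-duplicated undirected edges with score >= min_score."""
    reader = csv.DictReader(handle, delimiter=" ")
    edges = set()
    for row in reader:
        if int(row["combined_score"]) >= min_score:
            # Sort each pair so that A-B and B-A collapse into one edge.
            edges.add(tuple(sorted((row["protein1"], row["protein2"]))))
    return sorted(edges)


# Placeholder table (assumed layout: protein1 protein2 combined_score).
sample = io.StringIO(
    "protein1 protein2 combined_score\n"
    "P1 P2 950\n"
    "P2 P1 950\n"
    "P1 P3 420\n"
    "P3 P4 700\n"
)
edges = high_confidence_edges(sample)
print(edges)
```

The retained edges could then be loaded into a graph database as gene-gene relationships; the loading step is omitted because it depends on the schema documented on the project wiki.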

3.1.1 Precision medicine use case

One of the major translational challenges is target classification and prioritization in the clinical setting (Andre et al.). In this use case, the goal is to rank cancer variants relative to expression and drug sensitivity data in a way that could guide therapeutic selection (Fig. 1). To illustrate this use case, we use cancer expression, variant and drug data from the Cancer Cell Line Encyclopedia (Barretina et al.). An expanded demo and walk-through of this example are provided on the GitHub wiki. We note that HitWalker2 provides context that allows the user to move beyond individualized prioritization. For the HepG2 example, we have identified a prioritized gene set derived from the corresponding inhibitor assay results. We can then determine whether other samples or members of the cohort have mutations in the same genes (Supplementary Fig. S1). The interactive panels provide both individual and cohort-level summaries. For example, at the cohort level, one can rapidly determine the most recurrently mutated genes in the whole cohort or in a selected subset of it. Any overexpressed genes or drug sensitivity hits are automatically displayed as part of the result. Pathways containing genes of interest can then be visualized (Supplementary Fig. S2). These results can be exported as images of the graph panel(s) as well as text files.

Fig. 1

Mutations in the TLR3 Cascade pathway among NRAS-mutant skin cell lines. The set of cell lines with NRAS mutations in the exome data was first retrieved and then subsetted to only those cell lines derived from skin. Drug treatment data were used to identify MAPK7 as the most frequently perturbed gene target (GeneScore). The user can then explore the mutational burden of the pathways containing MAPK7, in this example the Toll-Like Receptor 3 Cascade pathway from Reactome.
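The selection steps described in the caption can be sketched as a pair of filters followed by a count of mutations per pathway gene. This is a simplified illustration, not the queries HitWalker2 issues against its database; the record layout, function names, cell line names and the pathway members GENE_X to GENE_Z are hypothetical, and only the gene NRAS and the tissue "skin" come from the text.

```python
from collections import Counter


def select_lines(cell_lines, mutated_gene, tissue):
    """Keep the cell lines from the given tissue that carry a mutation in mutated_gene."""
    return [
        line for line in cell_lines
        if mutated_gene in line["mutations"] and line["tissue"] == tissue
    ]


def pathway_burden(selected, pathway_genes):
    """Count, for each pathway gene, how many selected cell lines have it mutated."""
    counts = Counter()
    for line in selected:
        for gene in line["mutations"] & pathway_genes:
            counts[gene] += 1
    return counts


# Placeholder records (assumed, not taken from the Cancer Cell Line Encyclopedia).
cell_lines = [
    {"name": "LINE_1", "tissue": "skin", "mutations": {"NRAS", "GENE_X"}},
    {"name": "LINE_2", "tissue": "skin", "mutations": {"NRAS", "GENE_X", "GENE_Y"}},
    {"name": "LINE_3", "tissue": "lung", "mutations": {"NRAS", "GENE_Y"}},
    {"name": "LINE_4", "tissue": "skin", "mutations": {"GENE_X"}},
]
pathway = {"GENE_X", "GENE_Y", "GENE_Z"}

selected = select_lines(cell_lines, mutated_gene="NRAS", tissue="skin")
burden = pathway_burden(selected, pathway)
print([line["name"] for line in selected], dict(burden))
```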

3.1.2 GWAS evidence-based visualization use case

Attention Deficit Hyperactivity Disorder (ADHD) GWAS results (Psychiatric Genomics Consortium; http://www.med.unc.edu/pgc/) were combined with study genotypes (unpublished data, JN and BW) and DNA methylation P-values (Wilmot et al.). In addition to examining the top SNPs identified from a single GWAS, it is useful to combine GWAS results from multiple datasets with additional data types to strengthen the support for biologically important SNPs. For this use case, the focus is on rapidly identifying candidate genes with support across multiple studies and data types (Supplementary Fig. S3). P-values from a case/control meta-analysis of four ADHD datasets were compared with SNPs from a candidate gene study. This use case could also allow the identification of inconsistencies across studies and provide network context for GWAS results from both a patient and a cohort perspective.
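One simple way to express "support across multiple studies and data types" is to count, for each gene, the evidence sources in which it passes a significance threshold. The sketch below does this under stated assumptions: the function name, the source labels, the placeholder P-values and the thresholds (alpha of 0.05, at least two sources) are all hypothetical, and it is not the procedure used by the authors.

```python
def supported_genes(evidence, alpha=0.05, min_sources=2):
    """Map each gene to the sources supporting it, keeping genes with enough support."""
    support = {}
    for source, pvalues in evidence.items():
        for gene, p in pvalues.items():
            if p <= alpha:
                support.setdefault(gene, []).append(source)
    return {
        gene: sorted(sources)
        for gene, sources in support.items()
        if len(sources) >= min_sources
    }


# Placeholder gene-level P-values per evidence source (assumed values).
evidence = {
    "gwas_meta": {"GENE_A": 1e-6, "GENE_B": 0.2},
    "candidate_study": {"GENE_A": 0.01, "GENE_C": 0.03},
    "methylation": {"GENE_B": 0.04, "GENE_C": 0.001},
}
candidates = supported_genes(evidence)
print(candidates)
```

Genes that pass the threshold in one source but not in another in which they were also measured (GENE_B in this toy input) are the kind of inconsistency across studies mentioned above.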

4 Discussion

HitWalker2 provides a visual, reproducible and flexible framework for prioritization that is applicable to a large number of clinical, translational and basic science use cases. The interactive framework facilitates discovery and guides interpretation in a robust and scalable way. This is timely given the growing recognition of the need for more emphasis on human–data interactions. The HitWalker2 framework is datatype agnostic and can include any type of experiment or assay that ultimately suggests an effect on a gene, such as variant/mutation data, copy number variation and expression data, as well as siRNA or drug screens.

References

1. Walking the interactome for prioritization of candidate disease genes.

Authors: Sebastian Köhler; Sebastian Bauer; Denise Horn; Peter N Robinson
Journal: Am J Hum Genet Date: 2008-03-27 Impact factor: 11.025

2. Prioritizing targets for precision cancer medicine. [Review]

Authors: F Andre; E Mardis; M Salm; J-C Soria; L L Siu; C Swanton
Journal: Ann Oncol Date: 2014-10-24 Impact factor: 32.976

3. Methylomic analysis of salivary DNA in childhood ADHD identifies altered DNA methylation in VIPR2.

Authors: Beth Wilmot; Rebecca Fry; Lisa Smeester; Erica D Musser; Jonathan Mill; Joel T Nigg
Journal: J Child Psychol Psychiatry Date: 2015-08-25 Impact factor: 8.982

4. Pathway Commons, a web resource for biological pathway data.

Authors: Ethan G Cerami; Benjamin E Gross; Emek Demir; Igor Rodchenkov; Ozgün Babur; Nadia Anwar; Nikolaus Schultz; Gary D Bader; Chris Sander
Journal: Nucleic Acids Res Date: 2010-11-10 Impact factor: 16.971

5. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity.

Authors: Jordi Barretina; Giordano Caponigro; Nicolas Stransky; Kavitha Venkatesan; Adam A Margolin; Sungjoon Kim; Christopher J Wilson; Joseph Lehár; Gregory V Kryukov; Dmitriy Sonkin; Anupama Reddy; Manway Liu; Lauren Murray; Michael F Berger; John E Monahan; Paula Morais; Jodi Meltzer; Adam Korejwa; Judit Jané-Valbuena; Felipa A Mapa; Joseph Thibault; Eva Bric-Furlong; Pichai Raman; Aaron Shipway; Ingo H Engels; Jill Cheng; Guoying K Yu; Jianjun Yu; Peter Aspesi; Melanie de Silva; Kalpana Jagtap; Michael D Jones; Li Wang; Charles Hatton; Emanuele Palescandolo; Supriya Gupta; Scott Mahan; Carrie Sougnez; Robert C Onofrio; Ted Liefeld; Laura MacConaill; Wendy Winckler; Michael Reich; Nanxin Li; Jill P Mesirov; Stacey B Gabriel; Gad Getz; Kristin Ardlie; Vivien Chan; Vic E Myer; Barbara L Weber; Jeff Porter; Markus Warmuth; Peter Finan; Jennifer L Harris; Matthew Meyerson; Todd R Golub; Michael P Morrissey; William R Sellers; Robert Schlegel; Levi A Garraway
Journal: Nature Date: 2012-03-28 Impact factor: 49.962

6. STRING 8--a global view on proteins and their functional interactions in 630 organisms.

Authors: Lars J Jensen; Michael Kuhn; Manuel Stark; Samuel Chaffron; Chris Creevey; Jean Muller; Tobias Doerks; Philippe Julien; Alexander Roth; Milan Simonovic; Peer Bork; Christian von Mering
Journal: Nucleic Acids Res Date: 2008-10-21 Impact factor: 16.971

7. Entrez Gene: gene-centered information at NCBI.

Authors: Donna Maglott; Jim Ostell; Kim D Pruitt; Tatiana Tatusova
Journal: Nucleic Acids Res Date: 2005-01-01 Impact factor: 16.971

8. HitWalker: variant prioritization for personalized functional cancer genomics.

Authors: Daniel Bottomly; Beth Wilmot; Jeffrey W Tyner; Christopher A Eide; Marc M Loriaux; Brian J Druker; Shannon K McWeeney
Journal: Bioinformatics Date: 2013-01-09 Impact factor: 6.937

9. VarWalker: personalized mutation network analysis of putative cancer genes from next-generation sequencing data.

Authors: Peilin Jia; Zhongming Zhao
Journal: PLoS Comput Biol Date: 2014-02-06 Impact factor: 4.475
