
The MI bundle: enabling network and structural biology in genome visualization tools.

Abstract

SUMMARY: Prioritization of candidate genes emanating from large-scale screens requires integrated analyses at the genomics, molecular, network and structural biology levels. We have extended the Integrated Genome Browser (IGB) to facilitate these tasks. The graphical user interface greatly simplifies building disease networks and zooming in at atomic resolution to identify variations in molecular complexes that may affect molecular interactions in the context of genomic data. All results are summarized in genome tracks and can be visualized and analyzed at the transcript level.
AVAILABILITY AND IMPLEMENTATION: The MI Bundle is a plugin for the IGB. The plugin, help, video and tutorial are available at http://cru.genomics.iit.it/igbmibundle/ and https://github.com/CRUiit/igb-mi-bundle/wiki. The source code is released under the Apache License, Version 2.
CONTACT: arnaud.ceol@iit.it
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year: 2015 PMID: 26209801 PMCID: PMC4817051 DOI: 10.1093/bioinformatics/btv431

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 Introduction

Large-scale genomics initiatives aim to systematically catalogue the complete spectrum of genomic changes in diverse biological and clinical conditions and to elucidate the contribution of those changes to the pathogenesis of disease. These initiatives routinely produce long lists of candidate genes that must be prioritized before more detailed functional studies are warranted. Ideally, prioritization should take advantage of the entire range of available data and produce an integrated picture that permits a data-driven evaluation of the contribution of individual genes to the pathogenic process. However, the heterogeneous nature of the data and of the bioinformatics tools used to manipulate them places numerous hurdles in the way, in particular for biologists with limited bioinformatics skills. We have developed a plugin for the Integrated Genome Browser (IGB) (Nicol et al., 2009) to facilitate candidate prioritization based on network and structural biology criteria in the context of diverse genomic data types that can be loaded as browser tracks. The plugin permits analyzing how genomic variations, in particular missense mutations, may affect the interaction between gene products and other molecules, as well as their interference with biological pathways, based on the most up-to-date standards and resources available for network and structural biology.

2 Molecular networks and structures

The MI Bundle interrogates online interaction databases to identify binding molecules. The query benefits from the adoption of the PSICQUIC (Aranda et al.) standard web service implementation by the major databases. The structures and models for the interactions are obtained either from PDB (Velankar et al.) or Interactome3D (Mosca et al.). PDB provides structures for species and interactions that are not covered by Interactome3D, as well as protein–DNA, protein–RNA and protein–ligand interactions. Interactome3D, on the other hand, increases the coverage of the network with high-quality models. The genomic regions of interest may be associated with one or more transcripts, whose sequences are translated and mapped to the UniProt (Magrane and UniProt Consortium, 2011) sequences of the splicing variants; these are then aligned to the associated chains in the PDB files. The atoms of each structure are scanned, first to extract the residues encoded by any of the selected genomic regions, and second to identify which of those residues lie at the interface between two chains [residues that lose one Å² of accessible surface area upon binding, calculated with the BioJava library (Prlić et al., 2012)]. Alternatively, it is possible to use the dSysMap database (Mosca et al., 2015), which relies on a pre-defined set of missense mutations. The structures and the contact residues can be selected and displayed in a Jmol frame (The Jmol Team, 2007) (Fig. 1d).
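The interface criterion described above can be illustrated with a minimal Python sketch. It is not the plugin's code (the plugin relies on BioJava): it assumes that per-residue accessible surface areas have already been computed for the isolated chain and for the complex, it reads "lose one Å²" as "lose at least 1 Å²", and the function names and surface values are hypothetical toy inputs.

```python
from typing import Dict, Set, Tuple

ResidueKey = Tuple[str, int]  # (chain identifier, residue number)


def interface_residues(
    asa_free: Dict[ResidueKey, float],
    asa_complex: Dict[ResidueKey, float],
    min_loss: float = 1.0,
) -> Set[ResidueKey]:
    """Return residues whose accessible surface area (in square angstroms)
    drops by at least `min_loss` between the free chain and the complex.

    The ">=" comparison is an assumption about how the threshold is applied.
    """
    return {
        key
        for key, free in asa_free.items()
        # A residue missing from the complex is treated as unchanged.
        if free - asa_complex.get(key, free) >= min_loss
    }


def affected_interface(
    interface: Set[ResidueKey], chain: str, encoded_positions: Set[int]
) -> Set[ResidueKey]:
    """Keep interface residues of `chain` that are encoded by the
    selected genomic regions (given here as residue numbers)."""
    return {
        (c, pos) for (c, pos) in interface
        if c == chain and pos in encoded_positions
    }


# Hypothetical toy values, not taken from any real structure.
asa_free = {("A", 83): 42.0, ("A", 84): 15.5, ("A", 120): 60.0}
asa_complex = {("A", 83): 5.0, ("A", 84): 15.0, ("A", 120): 60.0}

iface = interface_residues(asa_free, asa_complex)
hits = affected_interface(iface, "A", {83, 84})
```

In this toy input only residue 83 loses more than the threshold, so it is the only residue reported both as interface and as encoded by the selected region.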

Fig. 1.

Mapping RUNX1 variations to molecular interactions. (a) Genomic variations for RUNX1 are loaded from ClinVar and cBioPortal (purple tracks). (b) Some variations are identified at the interface with CBFB (yellow tracks), DNA (orange track) and RUNX1 (homodimer, blue track). (c) Network representation: the black circles on the edges indicate a variation on the interaction interface. (d) Structure visualization: affected residues in contact with DNA are displayed in red (PDB:1HD9).

Although the number of available protein structures is considerable, coverage is far from complete (Mosca et al.). Nevertheless, molecular interaction networks that are not based on structures remain valuable tools for gleaning insight into the complex relationships between genotypes, network properties and phenotypes (Ideker and Sharan, 2008; Vidal et al., 2011). With the MI Bundle, it is possible to build such networks directly from the genomic data and either display them within IGB (Fig. 1c) or export them in a standard format that can be analyzed using appropriate software such as Cytoscape (Saito et al.).
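As a rough illustration of building a network from interaction records and exporting it for Cytoscape, the sketch below parses PSI-MI TAB 2.5 (MITAB) rows, the tabular format served by PSICQUIC services, and writes Cytoscape's Simple Interaction Format (SIF). The choice of SIF is an assumption made for this example, since the text does not say which format the plugin exports; the function names are hypothetical and the rows are toy records with every column except the two identifiers left empty.

```python
from typing import Iterable, Set, Tuple


def parse_mitab_pairs(lines: Iterable[str]) -> Set[Tuple[str, str]]:
    """Extract undirected interactor pairs from MITAB 2.5 rows.

    Columns 1 and 2 hold the interactor identifiers as 'database:accession',
    with alternatives separated by '|'. Only the first identifier is used.
    """
    pairs = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15:
            continue  # MITAB 2.5 rows have 15 tab-separated columns
        a = cols[0].split("|")[0].split(":", 1)[-1]
        b = cols[1].split("|")[0].split(":", 1)[-1]
        pairs.add(tuple(sorted((a, b))))  # A-B and B-A collapse to one edge
    return pairs


def to_sif(pairs: Set[Tuple[str, str]], relation: str = "pp") -> str:
    """Serialize pairs as SIF lines: source, relationship type, target."""
    return "\n".join(f"{a}\t{relation}\t{b}" for a, b in sorted(pairs))


def toy_row(id_a: str, id_b: str) -> str:
    """Build an illustrative MITAB row with the 13 other columns set to '-'."""
    return "\t".join([id_a, id_b] + ["-"] * 13)


rows = [
    toy_row("uniprotkb:Q01196", "uniprotkb:Q13951"),  # RUNX1 - CBFB
    toy_row("uniprotkb:Q13951", "uniprotkb:Q01196"),  # same pair, reversed
    toy_row("uniprotkb:Q01196", "uniprotkb:Q01196"),  # RUNX1 homodimer
]
sif = to_sif(parse_mitab_pairs(rows))
```

The resulting string can be saved with a .sif extension and imported into Cytoscape as a network. Duplicate and reversed records collapse into a single edge because pairs are stored in sorted order.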

3 Contact residues and diseases

Recently, Mosca et al. have shown that disease-causing mutations are more likely to affect protein–protein interaction interfaces. We loaded known genomic variations of RUNX1 from ClinVar (Landrum et al.), a repository of variations and associated phenotypes, observed principally in patients with acute myelogenous leukemia (Fig. 1a and b). In the MI Bundle, a new track can be created for each molecular interaction, enabling the comparison of the interfaces of a single molecule with different partners at the genomic level. In Figure 1b, we compare the interactions that may be affected by the different variations: two of those (positions K83 and R174) are identified at the interface with DNA only. Previous studies have shown that, in the presence of mutations at those sites, no DNA binding is observed while the homodimerization capability is preserved (Michaud et al., 2002). Another mutation, at position 107, is mapped to the interface with CBFB and may, as suggested by Walker et al. (2002), impair this interaction, destabilizing the binding of RUNX1 to DNA and leading to RUNX1 degradation. We loaded additional mutations from cBioPortal (Cerami et al.). Several of those were identified at the interface with RUNX1 (14, of which 10 are new), CBFB (10, of which 9 are new) and DNA (5, of which 4 are new), suggesting how those variations may interfere with the molecular network and cause or predispose to disease. Further description of the mapping of RUNX1 variations is available in the Supplementary Material.
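The per-partner comparison described here amounts to intersecting the variant positions with each partner's set of interface residues. The following Python sketch shows that logic under stated assumptions: the function names are hypothetical, the interface sets contain only the positions named in the text (83 and 174 for DNA, 107 for CBFB) rather than complete interfaces, and position 300 is a made-up variant added to show a residue that maps to no interface.

```python
from typing import Dict, Iterable, List, Set


def variants_by_partner(
    variant_positions: Iterable[int], interfaces: Dict[str, Set[int]]
) -> Dict[str, List[int]]:
    """For each partner, list the variant positions that fall on its
    interface. Each list corresponds to one interaction track."""
    positions = list(variant_positions)
    return {
        partner: sorted(p for p in positions if p in residues)
        for partner, residues in interfaces.items()
    }


def exclusive_to(partner: str, hits: Dict[str, List[int]]) -> List[int]:
    """Positions found on `partner`'s interface and on no other interface."""
    others: Set[int] = set()
    for name, positions in hits.items():
        if name != partner:
            others.update(positions)
    return sorted(set(hits[partner]) - others)


# Partial interface sets: only the positions mentioned in the text.
interfaces = {"DNA": {83, 174}, "CBFB": {107}}
# 300 is a hypothetical position used only to show a non-interface variant.
variants = [83, 107, 174, 300]

hits = variants_by_partner(variants, interfaces)
dna_only = exclusive_to("DNA", hits)
```

With these inputs, positions 83 and 174 are reported for DNA only and position 107 for CBFB, mirroring the comparison made in Figure 1b.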

4 Discussion

Based on their observation of this property of disease-causing mutations, Mosca et al. developed dSysMap, a web server that allows mapping mutations (provided as amino acid positions) to the structures and models of human protein–protein interactions. Structure-PPi (Vázquez et al., 2015) proposes a similar strategy, which Mechismo (Betts et al.) extends to protein–nucleic acid and protein–ligand interactions and to an assessment of the impact of the mutation on the binding properties of the molecules. The integration of our plugin into a genome browser gives genome biologists access to network and structural analysis. The analyses start from genomic regions, which allows them to be integrated with sequencing pipelines managed by IGB and makes them independent of any preliminary mapping to protein sequences. The possibility to select the source database extends the range of possible analyses: even when no structures are available, it helps identify new connections and functional relationships between target genes. Moreover, each new session queries public online databases, so each query can be repeated to benefit from new data released in molecular interaction and structure databases. Finally, the bundle benefits from all the features and future developments of IGB, including the many species available (e.g. mouse, A. thaliana and E. coli).
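Starting from genomic regions rather than amino acid positions implies a coordinate conversion from a genomic position to a residue number. A minimal Python sketch of that conversion follows. It is an illustration and not the plugin's implementation: it assumes 1-based inclusive coordinates, coding segments already trimmed of untranslated regions, and a reading frame that starts at the first coding base; the function name and exon coordinates are hypothetical.

```python
from typing import List, Optional, Tuple


def residue_index(
    genomic_pos: int, cds_exons: List[Tuple[int, int]], strand: str
) -> Optional[int]:
    """Map a genomic position to a 1-based residue number.

    `cds_exons` lists the coding segments as (start, end) pairs with
    start <= end. Returns None when the position is not coding
    (intron, UTR or outside the transcript).
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # Walk the segments in transcription order.
    ordered = sorted(cds_exons, reverse=(strand == "-"))
    offset = 0  # number of coding bases in the segments already passed
    for start, end in ordered:
        if start <= genomic_pos <= end:
            if strand == "+":
                within = genomic_pos - start
            else:
                within = end - genomic_pos
            return (offset + within) // 3 + 1
        offset += end - start + 1
    return None


# Hypothetical two-exon coding region of 15 bases (5 codons).
exons = [(100, 105), (200, 208)]

forward = [residue_index(p, exons, "+") for p in (100, 105, 200, 208, 150)]
reverse = [residue_index(p, exons, "-") for p in (208, 200, 105, 100)]
```

On the forward strand the first exon covers residues 1 and 2 and the second covers residues 3 to 5; position 150 lies in the intron and yields None. On the reverse strand the numbering runs from the highest coordinate downwards.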

References

1. Mechismo: predicting the mechanistic impact of mutations and modifications on molecular interactions.

Authors: Matthew J Betts; Qianhao Lu; YingYing Jiang; Armin Drusko; Oliver Wichmann; Mathias Utz; Ilse A Valtierra-Gutiérrez; Matthias Schlesner; Natalie Jaeger; David T Jones; Stefan Pfister; Peter Lichter; Roland Eils; Reiner Siebert; Peer Bork; Gordana Apic; Anne-Claude Gavin; Robert B Russell
Journal: Nucleic Acids Res Date: 2014-11-11 Impact factor: 16.971

2. Protein networks in disease.

Authors: Trey Ideker; Roded Sharan
Journal: Genome Res Date: 2008-04 Impact factor: 9.043

3. Interactome networks and human disease.

Authors: Marc Vidal; Michael E Cusick; Albert-László Barabási
Journal: Cell Date: 2011-03-18 Impact factor: 41.582

4. dSysMap: exploring the edgetic role of disease mutations.

Authors: Roberto Mosca; Jofre Tenorio-Laranga; Roger Olivella; Victor Alcalde; Arnaud Céol; Montserrat Soler-López; Patrick Aloy
Journal: Nat Methods Date: 2015-03 Impact factor: 28.547

5. In vitro analyses of known and novel RUNX1/AML1 mutations in dominant familial platelet disorder with predisposition to acute myelogenous leukemia: implications for mechanisms of pathogenesis.

Authors: Joëlle Michaud; Feng Wu; Motomi Osato; Gregory M Cottles; Masatoshi Yanagida; Norio Asou; Katsuya Shigesada; Yoshiaki Ito; Kathleen F Benson; Wendy H Raskind; Colette Rossier; Stylianos E Antonarakis; Sara Israels; Archie McNicol; Harvey Weiss; Marshall Horwitz; Hamish S Scott
Journal: Blood Date: 2002-02-15 Impact factor: 22.113

6. The Integrated Genome Browser: free software for distribution and exploration of genome-scale datasets.

Authors: John W Nicol; Gregg A Helt; Steven G Blanchard; Archana Raja; Ann E Loraine
Journal: Bioinformatics Date: 2009-08-04 Impact factor: 6.937

7. A novel inherited mutation of the transcription factor RUNX1 causes thrombocytopenia and may predispose to acute myeloid leukaemia.

Authors: Logan C Walker; Jane Stevens; Hamish Campbell; Rob Corbett; Ruth Spearing; David Heaton; Donald H Macdonald; Christine M Morris; Peter Ganly
Journal: Br J Haematol Date: 2002-06 Impact factor: 6.998

8. BioJava: an open-source framework for bioinformatics in 2012.

Authors: Andreas Prlić; Andrew Yates; Spencer E Bliven; Peter W Rose; Julius Jacobsen; Peter V Troshin; Mark Chapman; Jianjiong Gao; Chuan Hock Koh; Sylvain Foisy; Richard Holland; Gediminas Rimsa; Michael L Heuer; H Brandstätter-Müller; Philip E Bourne; Scooter Willis
Journal: Bioinformatics Date: 2012-08-09 Impact factor: 6.937

9. UniProt Knowledgebase: a hub of integrated protein data.

Authors: Michele Magrane
Journal: Database (Oxford) Date: 2011-03-29 Impact factor: 3.451

10. Structure-PPi: a module for the annotation of cancer-related single-nucleotide variants at protein-protein interfaces.

Authors: Miguel Vázquez; Alfonso Valencia; Tirso Pons
Journal: Bioinformatics Date: 2015-03-11 Impact factor: 6.937
