Joan Segura1, Yana Rose1, Sebastian Bittrich1, Stephen K Burley1,2,3,4,5, Jose M Duarte1. 1. Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center University of California, La Jolla, CA, 92093, USA. 2. Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers The State University of New Jersey, Piscataway, NJ, 08854, USA. 3. Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA. 4. Department of Chemistry and Chemical Biology, Rutgers The State University of New Jersey, Piscataway, NJ, 08854, USA. 5. Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, 08901, USA.
Abstract
MOTIVATION: Mapping positional features from one-dimensional (1D) sequences onto three-dimensional (3D) structures of biological macromolecules is a powerful tool to show geometric patterns of biochemical annotations and provide a better understanding of the mechanisms underpinning protein and nucleic acid function at the atomic level. RESULTS: We present a new library designed to display fully customizable interactive views between 1D positional features of protein and/or nucleic acid sequences and their 3D structures as isolated chains or components of macromolecular assemblies. AVAILABILITY: https://github.com/rcsb/rcsb-saguaro-3d. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
1 Introduction
Mapping positional features from 1D sequences onto 3D structures of biological macromolecules facilitates interrogation of relationships between shape and function. Sequence to structure mapping enables identification of spatial correlations and geometric patterns among protein or nucleic acid annotations that would be obscured if they were analyzed solely using linear polymer sequences. Over the last few years, various libraries, web applications and software tools have been developed to compute alignments among structures, and protein and gene sequences, and visualize positional features over the different levels of molecular organization from genome to macromolecular assemblies. 3DBIONOTES (Segura ), for example, is a stand-alone web application that integrates biochemical annotations from multiple resources and visualizes them at 1D sequence and 3D structure levels. Similarly, MolArt (Hoksza ) is a JavaScript library that integrates and visualizes UniProt (UniProt Consortium, 2021) annotations with protein structural data. Finally, the RCSB PDB 1D coordinate server (Segura ) provides alignments and mapping of annotations between genome and protein sequence resources, including structures of macromolecular assemblies.
In this work, we present a new TypeScript/JavaScript module designed to create custom interactive views between 1D sequence positional features and 3D structures of macromolecules over the web. The main motivation behind this development is to provide the structural bioinformatics community with a flexible and fully customizable tool that can be adapted for use in various contexts. To achieve this end, the library exposes multiple event callbacks that allow software developers to define bidirectional interactions between 1D positional features and 3D atomic coordinates of experimental structures (from PDB) or computed structure models [from AlphaFold2 (Jumper ) or RoseTTAFold (Baek ), etc.].
Moreover, it allows arrangement of positional features in multiple sequence viewers and defining many-to-many relationships between 3D structure information and 1D viewers (see Supplementary Fig. S1). The module was built atop the open-source Mol* Viewer (Sehnal ) and the RCSB PDB Feature Viewer (Segura ). Currently, the library is used at the RCSB PDB rcsb.org web portal (Burley ) to display a bidirectional interactive view of mappings between sequence annotations and 3D macromolecular structures.
2 Materials and methods
The RCSB PDB 1D3D module is an open-source library written in TypeScript that is designed to create interactive views between 1D positional features and 3D biostructures. The library comprises a collection of React (https://reactjs.org/) components that integrate the Mol* Viewer and the RCSB PDB Feature Viewer (see Supplementary Section S1). 1D positional features and 3D structures are rendered in separate components that communicate with each other when external events (clicking or hovering) occur. These events trigger a set of configurable callback functions that define how 1D features and 3D atomic coordinates interact. Moreover, the APIs of the 1D and 3D viewers are accessible from the event callback functions, allowing modification of viewer content or representation of displayed elements.
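The callback-driven communication described above can be illustrated with a minimal, self-contained sketch. It does not use the library itself: `EventHub`, `StructureViewerStub`, `SequenceViewerStub` and `ResidueSelection` are hypothetical names introduced here only for illustration, and the real components are React components wrapping the Mol* Viewer and the RCSB PDB Feature Viewer.

```typescript
// Hypothetical sketch: none of these names come from the RCSB PDB 1D3D API.

// A residue range on a polymer chain, understood by both viewers.
interface ResidueSelection {
  chainId: string;
  start: number; // first residue index (inclusive)
  end: number;   // last residue index (inclusive)
}

type Source = "sequence" | "structure";
type EventKind = "click" | "hover";
type Callback = (selection: ResidueSelection) => void;

// Minimal event hub that lets two independent components talk to each other.
class EventHub {
  private listeners = new Map<string, Callback[]>();

  on(source: Source, kind: EventKind, callback: Callback): void {
    const key = `${source}:${kind}`;
    const registered = this.listeners.get(key) ?? [];
    registered.push(callback);
    this.listeners.set(key, registered);
  }

  emit(source: Source, kind: EventKind, selection: ResidueSelection): void {
    for (const callback of this.listeners.get(`${source}:${kind}`) ?? []) {
      callback(selection);
    }
  }
}

// Stand-ins for the 3D and 1D viewers: they only record what is highlighted.
class StructureViewerStub {
  highlighted: ResidueSelection | null = null;
  highlight(selection: ResidueSelection): void {
    this.highlighted = selection;
  }
}

class SequenceViewerStub {
  highlighted: ResidueSelection | null = null;
  highlight(selection: ResidueSelection): void {
    this.highlighted = selection;
  }
}

const hub = new EventHub();
const structureViewer = new StructureViewerStub();
const sequenceViewer = new SequenceViewerStub();

// Configurable callbacks: hovering a 1D feature highlights the matching
// residues in 3D, and hovering a residue in 3D highlights it in 1D.
hub.on("sequence", "hover", (selection) => structureViewer.highlight(selection));
hub.on("structure", "hover", (selection) => sequenceViewer.highlight(selection));

// Simulated user interaction in each component.
hub.emit("sequence", "hover", { chainId: "A", start: 10, end: 25 });
hub.emit("structure", "hover", { chainId: "A", start: 42, end: 42 });
```

Because the callbacks receive the viewer objects through closures, they can also change viewer content or representation, which mirrors the access to viewer APIs mentioned in the text.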
2.1 Structure component
The structure component integrates the Mol* Viewer for the 3D visualization of macromolecular structures. The component configuration tool allows choices as to how structure data is loaded. The exposed loading configuration accepts different types of parameters, including individual or multiple PDB IDs, a URL pointing to a computed structure model from resources such as AlphaFoldDB (Varadi ) or the ModelArchive (Schwede ), or a plain string encoding the 3D structure information. In addition, the configuration includes multiple options to modify the Mol* graphical user interface. (See Supplementary Section S2 for a detailed description of the structure component configuration interface.)
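One way to picture the three kinds of loading parameters is as a discriminated union that is resolved into uniform load requests. The sketch below is an assumption made for illustration only: `LoadConfig`, `LoadRequest`, `resolveLoadConfig` and the download URL template are not taken from the library, whose actual configuration interface is described in Supplementary Section S2.

```typescript
// Hypothetical sketch: these types are NOT the library's configuration interface.

type StructureFormat = "mmcif" | "pdb";

// The three loading modes named in the text.
type LoadConfig =
  | { kind: "pdbIds"; ids: string[] }                          // one or more PDB IDs
  | { kind: "url"; url: string; format: StructureFormat }      // e.g. a computed structure model
  | { kind: "data"; data: string; format: StructureFormat };   // structure encoded as a plain string

// Uniform description of what a viewer would have to load.
interface LoadRequest {
  source: string;   // URL to fetch, or the raw data when inline is true
  format: StructureFormat;
  inline: boolean;  // true when source already holds the structure data
}

// Assumed URL template for fetching an entry by PDB ID (illustrative only).
const pdbEntryUrl = (id: string): string =>
  `https://files.rcsb.org/download/${id.toUpperCase()}.cif`;

function resolveLoadConfig(config: LoadConfig): LoadRequest[] {
  switch (config.kind) {
    case "pdbIds":
      return config.ids.map((id): LoadRequest => ({
        source: pdbEntryUrl(id),
        format: "mmcif",
        inline: false,
      }));
    case "url":
      return [{ source: config.url, format: config.format, inline: false }];
    case "data":
      return [{ source: config.data, format: config.format, inline: true }];
  }
}

// Usage: two experimental entries, one model served from a placeholder URL.
const fromIds = resolveLoadConfig({ kind: "pdbIds", ids: ["4hhb", "1tup"] });
const fromUrl = resolveLoadConfig({
  kind: "url",
  url: "https://example.org/model.cif", // placeholder, not a real model URL
  format: "mmcif",
});
```

Modelling the options as a union keeps each loading mode explicit and lets the compiler check that every mode is handled.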
2.2 Sequence component
The sequence component integrates the RCSB PDB Feature Viewer. This component is responsible for displaying the 1D positional features and encoding the logic that enables interoperability between 1D features and 3D structure information. Positional features are organized in two levels. First, a specific feature viewer allocates a collection of features as part of its configuration. Second, multiple feature viewers can be grouped into different blocks (see Supplementary Fig. S1). Thus, the sequence component contains a collection of blocks, wherein each block encodes the configuration for one or more feature viewers, including associated 1D positional features. Feature viewers belonging to the same block are displayed simultaneously. However, only a single block can be active at any given time. The sequence component configuration includes different mechanisms to define how blocks can be activated or deactivated (see Supplementary Section S3).
Interaction of positional features and 3D structures is configured through different callback functions that are triggered when mouse click or hover events occur on 3D structures or 1D features. These functions are defined at the feature viewer level. Hence, each feature viewer in each block may implement its own event callback configuration. When an event (mouse click or hover) occurs within the structure component, the callback functions defined in all the feature viewers belonging to the active block are triggered. Event data and all information needed to identify the relevant polymer component (i.e. amino acid or nucleotide) or ligand, including requisite identifiers, are passed to the callback as state parameters. Thereafter, based on callback parameter information, each feature viewer determines whether to process or ignore the call. For a detailed description of the interoperation configuration between sequence and structure components see Supplementary Section S3.
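The two-level organization and the dispatch rule described above can be sketched as follows. This is a simplified model under stated assumptions, not the library's implementation: `Block`, `FeatureViewerConfig`, `StructureEvent` and `SequenceComponentSketch` are hypothetical names, and the real configuration is documented in Supplementary Section S3.

```typescript
// Hypothetical sketch; names are illustrative and not taken from the library.

// Data passed to callbacks when a click or hover occurs in the 3D view.
interface StructureEvent {
  kind: "click" | "hover";
  entryId: string;
  chainId: string;
  residueIndex: number;
}

interface Feature {
  label: string;
  start: number; // inclusive
  end: number;   // inclusive
}

// Level 1: a feature viewer owns its features and its own event callback.
interface FeatureViewerConfig {
  id: string;
  chainId: string;
  features: Feature[];
  // Returns true when the viewer processed the event, false when it ignored it.
  onStructureEvent: (event: StructureEvent, viewer: FeatureViewerConfig) => boolean;
}

// Level 2: a block groups feature viewers that are displayed together.
interface Block {
  id: string;
  viewers: FeatureViewerConfig[];
}

class SequenceComponentSketch {
  private readonly blocks: Block[];
  private activeBlockId: string;

  constructor(blocks: Block[]) {
    const first = blocks[0];
    if (first === undefined) throw new Error("at least one block is required");
    this.blocks = blocks;
    this.activeBlockId = first.id; // exactly one block is active at a time
  }

  activate(blockId: string): void {
    if (!this.blocks.some((block) => block.id === blockId)) {
      throw new Error(`unknown block: ${blockId}`);
    }
    this.activeBlockId = blockId;
  }

  // Forward the event to every viewer of the active block only and
  // return the ids of the viewers that chose to process it.
  dispatch(event: StructureEvent): string[] {
    const active = this.blocks.find((block) => block.id === this.activeBlockId);
    if (active === undefined) return [];
    return active.viewers
      .filter((viewer) => viewer.onStructureEvent(event, viewer))
      .map((viewer) => viewer.id);
  }
}

// Example callback: process only events on the viewer's chain that hit a feature.
const processIfOnFeature = (event: StructureEvent, viewer: FeatureViewerConfig): boolean =>
  event.chainId === viewer.chainId &&
  viewer.features.some((f) => f.start <= event.residueIndex && event.residueIndex <= f.end);

const component = new SequenceComponentSketch([
  {
    id: "chains",
    viewers: [
      { id: "chain-A", chainId: "A", features: [{ label: "binding site", start: 40, end: 50 }], onStructureEvent: processIfOnFeature },
      { id: "chain-B", chainId: "B", features: [{ label: "helix", start: 5, end: 20 }], onStructureEvent: processIfOnFeature },
    ],
  },
  {
    id: "domains",
    viewers: [
      { id: "domain-A", chainId: "A", features: [{ label: "domain", start: 1, end: 100 }], onStructureEvent: processIfOnFeature },
    ],
  },
]);

const clickOnA45: StructureEvent = { kind: "click", entryId: "demo", chainId: "A", residueIndex: 45 };
const handledInFirstBlock = component.dispatch(clickOnA45);  // only viewers in "chains" are called
component.activate("domains");
const handledInSecondBlock = component.dispatch(clickOnA45); // only viewers in "domains" are called
```

In this sketch the decision to process or ignore an event lives entirely in each viewer's callback, so viewers in the same block can react differently to the same 3D event.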
3 Summary
Herein, we present the RCSB Protein Data Bank 1D3D module, a novel open-source library designed for visualizing interactive environments between 1D positional features and 3D structures of biological macromolecules. The library exposes a rich and flexible configuration interface that allows developers to define interoperation between multiple 1D positional feature viewers and multiple 3D atomic coordinate models.
The library is publicly available on GitHub and published as an npm module. It is easy to install and reusable within any web resource. Currently, the RCSB PDB rcsb.org web portal uses this tool to display an interactive mapping between 1D protein features and the 3D structures of biomolecules.
Funding
This work was supported by the National Science Foundation [DBI-1832184]; the US Department of Energy [DE-SC0019749]; and the National Cancer Institute, National Institute of Allergy and Infectious Diseases and National Institute of General Medical Sciences of the National Institutes of Health [R01GM133198] (Principal Investigator: Stephen K. Burley).
Conflict of Interest: none declared.
Authors: Torsten Schwede; Andrej Sali; Barry Honig; Michael Levitt; Helen M Berman; David Jones; Steven E Brenner; Stephen K Burley; Rhiju Das; Nikolay V Dokholyan; Roland L Dunbrack; Krzysztof Fidelis; Andras Fiser; Adam Godzik; Yuanpeng Janet Huang; Christine Humblet; Matthew P Jacobson; Andrzej Joachimiak; Stanley R Krystek; Tanja Kortemme; Andriy Kryshtafovych; Gaetano T Montelione; John Moult; Diana Murray; Roberto Sanchez; Tobin R Sosnick; Daron M Standley; Terry Stouch; Sandor Vajda; Max Vasquez; John D Westbrook; Ian A Wilson Journal: Structure Date: 2009-02-13 Impact factor: 5.006
Authors: David Sehnal; Sebastian Bittrich; Mandar Deshpande; Radka Svobodová; Karel Berka; Václav Bazgier; Sameer Velankar; Stephen K Burley; Jaroslav Koča; Alexander S Rose Journal: Nucleic Acids Res Date: 2021-07-02 Impact factor: 16.971
Authors: Minkyung Baek; Frank DiMaio; Ivan Anishchenko; Justas Dauparas; Sergey Ovchinnikov; Gyu Rie Lee; Jue Wang; Qian Cong; Lisa N Kinch; R Dustin Schaeffer; Claudia Millán; Hahnbeom Park; Carson Adams; Caleb R Glassman; Andy DeGiovanni; Jose H Pereira; Andria V Rodrigues; Alberdina A van Dijk; Ana C Ebrecht; Diederik J Opperman; Theo Sagmeister; Christoph Buhlheller; Tea Pavkov-Keller; Manoj K Rathinaswamy; Udit Dalwadi; Calvin K Yip; John E Burke; K Christopher Garcia; Nick V Grishin; Paul D Adams; Randy J Read; David Baker Journal: Science Date: 2021-07-15 Impact factor: 47.728
Authors: Joan Segura; Ruben Sanchez-Garcia; Marta Martinez; Jesus Cuenca-Alba; Daniel Tabas-Madrid; C O S Sorzano; J M Carazo Journal: Bioinformatics Date: 2017-11-15 Impact factor: 6.937
Authors: Stephen K Burley; Charmi Bhikadiya; Chunxiao Bi; Sebastian Bittrich; Li Chen; Gregg V Crichlow; Cole H Christie; Kenneth Dalenberg; Luigi Di Costanzo; Jose M Duarte; Shuchismita Dutta; Zukang Feng; Sai Ganesan; David S Goodsell; Sutapa Ghosh; Rachel Kramer Green; Vladimir Guranović; Dmytro Guzenko; Brian P Hudson; Catherine L Lawson; Yuhe Liang; Robert Lowe; Harry Namkoong; Ezra Peisach; Irina Persikova; Chris Randle; Alexander Rose; Yana Rose; Andrej Sali; Joan Segura; Monica Sekharan; Chenghua Shao; Yi-Ping Tao; Maria Voigt; John D Westbrook; Jasmine Y Young; Christine Zardecki; Marina Zhuravleva Journal: Nucleic Acids Res Date: 2021-01-08 Impact factor: 16.971
Authors: John Jumper; Richard Evans; Alexander Pritzel; Tim Green; Michael Figurnov; Olaf Ronneberger; Kathryn Tunyasuvunakool; Russ Bates; Augustin Žídek; Anna Potapenko; Alex Bridgland; Clemens Meyer; Simon A A Kohl; Andrew J Ballard; Andrew Cowie; Bernardino Romera-Paredes; Stanislav Nikolov; Rishub Jain; Demis Hassabis; Jonas Adler; Trevor Back; Stig Petersen; David Reiman; Ellen Clancy; Michal Zielinski; Martin Steinegger; Michalina Pacholska; Tamas Berghammer; Sebastian Bodenstein; David Silver; Oriol Vinyals; Andrew W Senior; Koray Kavukcuoglu; Pushmeet Kohli Journal: Nature Date: 2021-07-15 Impact factor: 49.962