MOTIVATION: Complementing structural information with biochemical and biomedical annotations is a powerful approach to explore the biological function of macromolecular complexes. However, currently the compilation of annotations and structural data is a feature only available for those structures that have been released as entries to the Protein Data Bank. RESULTS: To help researchers in assessing the consistency between structures and biological annotations for structural models not deposited in databases, we present 3DBIONOTES v2.0, a web application designed for the automatic annotation of biochemical and biomedical information onto macromolecular structural models determined by any experimental or computational technique. AVAILABILITY AND IMPLEMENTATION: The web server is available at http://3dbionotes-ws.cnb.csic.es. CONTACT: jsegura@cnb.csic.es. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Structural biology is a fundamental tool for understanding the mechanisms that control protein function. Experimental and computational techniques for structure determination are continuously evolving, and new macromolecular structures are submitted every day. In much the same way, the amount of biochemical and biomedical data available for genes and proteins grows rapidly, and new databases appear every year. In this context, mapping and analysing biomedical and biochemical knowledge onto the residues of a newly proposed structural model is a great help in understanding its function and cellular role. This task was greatly facilitated by our previously published web application 3DBIONOTES (Tabas-Madrid ); however, the tool was only available for structures that had already been released in structural databases. Consequently, if a structural model was still under analysis by the researcher and had not yet been submitted to any structural database, there was no automatic way to compile the annotations associated with it.
To address this need, in this work we present a new version of 3DBIONOTES that accepts structural models submitted directly by the user and annotates them automatically. Additionally, the range of annotations handled by the application has increased significantly, with a new panel on genomic variants organized by pathologies. Beyond other tools (O'Donoghue ; Stank ), 3DBIONOTES v2.0 has been designed for the automatic mapping of biomedical and biochemical information onto new structures, and it integrates a large collection of annotation sources (see Supplementary Material Section S1).
2 Methods
2.2 The web server
The web server has been implemented using the Ruby on Rails application framework (http://rubyonrails.org). The server performs three major tasks: first, it identifies the UniProt accessions of the different subunits contained in the submitted structure; second, it aligns the sequences of the subunits with their corresponding UniProt entries; and finally, it maps the protein annotations. Supplementary Figure S1 shows a schema of the communication and data processing between client and server. When a macromolecular structure is submitted, the server performs a BLAST search (Boratyn ) against all UniProt sequences to identify its different protein subunits; then, the best hits are sent back to the client and the user is asked to select the corresponding UniProt accessions for each chain of the submitted structure. Once each chain is identified, this information is returned to the server and the Smith–Waterman algorithm (Smith and Waterman, 1981) is used to align the sequence chains with their respective UniProt sequences. Finally, the server collects the biochemical and biomedical data from the different sources of information and the submitted structure is annotated and returned to the client.
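The alignment and mapping steps described above can be illustrated with a minimal Python sketch. It is not the server's code (the server is written in Ruby on Rails, and its scoring scheme is not stated here): the match, mismatch and linear gap scores, the toy sequences, and the function names `smith_waterman` and `map_annotation` are all assumptions made for illustration. The sketch aligns a chain sequence to a UniProt sequence with Smith–Waterman local alignment and then transfers an annotation given in UniProt numbering onto the chain.

```python
def smith_waterman(chain, uniprot, match=2, mismatch=-1, gap=-2):
    """Local alignment with a linear gap penalty (illustrative scores).

    Returns (best_score, pairs), where pairs is a list of 0-based
    (chain_index, uniprot_index) tuples for the aligned residues.
    """
    rows, cols = len(chain) + 1, len(uniprot) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    def sub(i, j):
        # Substitution score for chain[i-1] versus uniprot[j-1].
        return match if chain[i - 1] == uniprot[j - 1] else mismatch

    # Fill the scoring matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub(i, j),
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + sub(i, j):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


def map_annotation(pairs, start, end):
    """Map a 1-based UniProt range [start, end] to 1-based chain positions."""
    uniprot_to_chain = {u: c for c, u in pairs}
    return [uniprot_to_chain[u] + 1
            for u in range(start - 1, end)
            if u in uniprot_to_chain]


# Toy data (hypothetical sequences, not real proteins).
chain_seq = "MKTAYIAK"
uniprot_seq = "GGSMKTAYIAKQR"
score, pairs = smith_waterman(chain_seq, uniprot_seq)
chain_positions = map_annotation(pairs, 5, 7)
```

UniProt positions that fall outside the aligned region are simply skipped by `map_annotation`, so an annotation on a segment missing from the structure produces an empty or partial mapping rather than an error.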
2.3 The client
The web client provides an interactive environment linking protein sequences, structures and annotations. The client comprises three major panels (Supplementary Fig. S2): the structural panel, the annotation panel and the sequence panel. The structural panel uses the NGL viewer (Rose and Hildebrand, 2015) to display protein structures and cryo-electron microscopy maps. The annotation panel was built using a bespoke version of the UniProt annotation viewer (Watkins ). The sequence panel shows the alignment between the sequence chains of the structure and their respective UniProt sequences. The alignment is displayed using a modified version of the BioJS ‘Sequence’ package (Gomez and Jimenez, 2014). All panels are interconnected, allowing graphic interactivity among them. Thus, selections in the annotation or sequence panels are simultaneously highlighted in the three panels.
3 Human AKT1/PIN1 interaction
In this example, we illustrate how 3DBIONOTES v2.0 can be used together with protein structural docking to analyse and eventually select models consistent with other sources of biological knowledge. To model the structure of the AKT1/PIN1 interaction complex, we generated 50 potential models using the GRAMM-X docking web server (Tovchigrechko and Vakser, 2006). Then, we used 3DBIONOTES v2.0 to visualize how well the biological annotations matched the different solutions.
Among the biological annotations retrieved for the AKT1 protein, we analysed the short linear motifs (SLiMs) (Supplementary Fig. S2B, ‘Domains & sites’ section). SLiMs are short conserved segments of residues involved in the targeting and recognition of other macromolecules, which mediate many protein–protein interactions. Clicking the second SLiM displays a panel that shows the SLiM information from the Eukaryotic Linear Motif database (ELM DB) (Dinkel ). According to this information, the AKT1 SLiM spanning residues I447 to P452 (Fig. 1, purple spheres) may interact with modular protein domains of type WW (Aragon ). Given that the PIN1 protein contains a WW domain between residues L7 and P37 (Fig. 1, pink spheres), we explored all 50 docking models, searching for solutions that involved contacts between the I447-P452 region of AKT1 and the L7-P37 residues of the PIN1 WW domain. Notably, model number 33, displayed in Figure 1, was the only solution that satisfied this restraint. Other relevant annotations showed that the phosphorylation site T450 of AKT1 (Fig. 1, blue spheres) is involved in the interaction with the PIN1 protein (Liao ) and that mutations of the W34 PIN1 residue (Fig. 1, orange spheres) disrupt interactions with other proteins (Min ). A more detailed analysis of the biochemical annotations, along with another example, is available in Supplementary Material.
This use case illustrates how 3DBIONOTES v2.0 associates biological annotations with macromolecular structures, helping to select the interaction model that best fits those annotations.
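The restraint used in this section (a contact between AKT1 residues 447 to 452 and PIN1 residues 7 to 37) can also be expressed programmatically. The following Python sketch is only an illustration under stated assumptions and is not the procedure used in this work, where models were inspected in the viewer: each model is represented by one coordinate per residue (for example a C-alpha atom), the 8 Å cutoff is an arbitrary choice, and the chain identifiers, model names, coordinates and function names are hypothetical toy values.

```python
import math


def has_contact(model, chain_a, range_a, chain_b, range_b, cutoff=8.0):
    """Return True if any residue of range_a in chain_a lies within
    `cutoff` angstroms of any residue of range_b in chain_b.

    `model` maps (chain_id, residue_number) to an (x, y, z) coordinate,
    assumed here to be one representative atom per residue.
    """
    def select(chain, bounds):
        low, high = bounds
        return [xyz for (ch, num), xyz in model.items()
                if ch == chain and low <= num <= high]

    region_a = select(chain_a, range_a)
    region_b = select(chain_b, range_b)
    return any(math.dist(p, q) <= cutoff for p in region_a for q in region_b)


def filter_models(models, chain_a, range_a, chain_b, range_b, cutoff=8.0):
    """Return the names of the models that satisfy the contact restraint."""
    return [name for name, model in models.items()
            if has_contact(model, chain_a, range_a, chain_b, range_b, cutoff)]


# Toy docking models with made-up coordinates (chain A = AKT1, chain B = PIN1).
toy_models = {
    "model_1": {("A", 450): (0.0, 0.0, 0.0), ("B", 34): (30.0, 0.0, 0.0)},
    "model_2": {("A", 450): (0.0, 0.0, 0.0), ("B", 34): (5.0, 0.0, 0.0)},
    "model_3": {("A", 100): (0.0, 0.0, 0.0), ("B", 34): (3.0, 0.0, 0.0)},
}
selected = filter_models(toy_models, "A", (447, 452), "B", (7, 37))
```

In the toy data only `model_2` places a residue of the 447 to 452 region close to the WW domain; `model_3` has a close contact, but it involves a residue outside the annotated motif and is therefore rejected.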
Fig. 1
Docking model of the AKT1/PIN1 human complex. Purple: the AKT1 SLiM region from residue I447 to P452. Blue: the AKT1 phosphorylation site T450. Pink: the PIN1 WW domain, comprising residues L7 to P37. Orange: the W34 residue of PIN1; note that the W34A mutation disrupts the interaction of PIN1 with phosphorylated proteins.
Funding
This work was supported by Instituto de Salud Carlos III, project number PT13/0001/0009 funding the Spanish National Institute of Bioinformatics, the Spanish Ministry of Economy and Competitiveness through grants AIC-A-2011-0638, BIO2013-44647-R and BIO2016-76400-R, together with the European Union (EU) and Horizon 2020 through grants CORBEL (INFRADEV-1-2014-1, Proposal: 654248), ELIXIR-EXCELERATE (INFRADEV-1-2015-1, Proposal: 676559) and West-Life (EINFRA-2015-1, Proposal: 675858). J.S. is recipient of a ‘Juan de la Cierva’ fellowship and R.S.-G. is recipient of a FPU fellowship.
Conflict of Interest: none declared.
Authors: D Tabas-Madrid; J Segura; R Sanchez-Garcia; J Cuenca-Alba; C O S Sorzano; J M Carazo Journal: J Struct Biol Date: 2016-02-10 Impact factor: 2.867
Authors: Sang-Hyun Min; Alan W Lau; Tae Ho Lee; Hiroyuki Inuzuka; Shuo Wei; Pengyu Huang; Shavali Shaik; Daniel Yenhong Lee; Greg Finn; Martin Balastik; Chun-Hau Chen; Manli Luo; Adriana E Tron; James A Decaprio; Xiao Zhen Zhou; Wenyi Wei; Kun Ping Lu Journal: Mol Cell Date: 2012-05-17 Impact factor: 17.970
Authors: Eric Aragón; Nina Goerner; Alexia-Ileana Zaromytidou; Qiaoran Xi; Albert Escobedo; Joan Massagué; Maria J Macias Journal: Genes Dev Date: 2011-06-15 Impact factor: 11.361
Authors: Y Liao; Y Wei; X Zhou; J-Y Yang; C Dai; Y-J Chen; N K Agarwal; D Sarbassov; D Shi; D Yu; M-C Hung Journal: Oncogene Date: 2009-05-18 Impact factor: 9.867
Authors: Grzegorz M Boratyn; Alejandro A Schäffer; Richa Agarwala; Stephen F Altschul; David J Lipman; Thomas L Madden Journal: Biol Direct Date: 2012-04-17 Impact factor: 4.540
Authors: Holger Dinkel; Kim Van Roey; Sushama Michael; Manjeet Kumar; Bora Uyar; Brigitte Altenberg; Vladislava Milchevskaya; Melanie Schneider; Helen Kühn; Annika Behrendt; Sophie Luise Dahl; Victoria Damerell; Sandra Diebel; Sara Kalman; Steffen Klein; Arne C Knudsen; Christina Mäder; Sabina Merrill; Angelina Staudt; Vera Thiel; Lukas Welti; Norman E Davey; Francesca Diella; Toby J Gibson Journal: Nucleic Acids Res Date: 2015-11-28 Impact factor: 16.971
Authors: Chris Morris; Paolo Andreetto; Lucia Banci; Alexandre M J J Bonvin; Grzegorz Chojnowski; Laura Del Cano; José Marıa Carazo; Pablo Conesa; Susan Daenke; George Damaskos; Andrea Giachetti; Natalie E C Haley; Maarten L Hekkelman; Philipp Heuser; Robbie P Joosten; Daniel Kouřil; Aleš Křenek; Tomáš Kulhánek; Victor S Lamzin; Nurul Nadzirin; Anastassis Perrakis; Antonio Rosato; Fiona Sanderson; Joan Segura; Joerg Schaarschmidt; Egor Sobolev; Sergio Traldi; Mikael E Trellet; Sameer Velankar; Marco Verlato; Martyn Winn Journal: J Struct Biol X Date: 2019-02-26
Authors: Sumaiya Iqbal; David Hoksza; Eduardo Pérez-Palma; Patrick May; Jakob B Jespersen; Shehab S Ahmed; Zaara T Rifat; Henrike O Heyne; M Sohel Rahman; Jeffrey R Cottrell; Florence F Wagner; Mark J Daly; Arthur J Campbell; Dennis Lal Journal: Nucleic Acids Res Date: 2020-07-02 Impact factor: 16.971