Socket2: A Program for Locating, Visualising, and Analysing Coiled-coil Interfaces in Protein Structures.

Prasun Kumar, Derek N Woolfson.

Abstract

MOTIVATION: Protein-protein interactions are central to all biological processes. One frequently observed mode of such interactions is the α-helical coiled coil (CC). Thus, an ability to extract, visualise, and analyse CC interfaces quickly and without expert guidance would facilitate a wide range of biological research. In 2001, we reported Socket, which locates and characterises CCs in protein structures based on the knobs-into-holes (KIH) packing between helices in CCs. Since then, studies of natural and de novo designed CCs have boomed, and the number of CCs in the RCSB PDB has increased rapidly. Therefore, we have updated Socket and made it accessible to expert and non-expert users alike.
RESULTS: The original Socket only classified CCs with up to 6 helices. Here, we report Socket2, which rectifies this oversight to identify CCs with any number of helices, and KIH interfaces with any of the 20 proteinogenic residues or incorporating non-natural amino acids. In addition, we have developed a new and easy-to-use web server with additional features. These include the use of NGL Viewer for instantly visualising CCs, and tabs for viewing the sequence repeats, helix-packing angles, and core-packing geometries of CCs identified and calculated by Socket2.
AVAILABILITY AND IMPLEMENTATION: Socket2 has been tested on all modern browsers. It can be accessed freely at http://coiledcoils.chm.bris.ac.uk/socket2/home.html. The source code is distributed using an MIT license and available to download under the Downloads tab of the Socket2 home page.
© The Author(s) 2021. Published by Oxford University Press.

Year:  2021        PMID: 34498035      PMCID: PMC8652024          DOI: 10.1093/bioinformatics/btab631

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

α-Helical coiled-coil domains (CCs) are found widely in proteins from all kingdoms of life, where they mediate protein–protein interactions and protein assemblies (Lupas and Bassler, 2017). CCs account for ≈5% of all known protein sequences (Rackham ). In structural terms, CCs comprise two or more α helices that wrap around each other in a rope-like fashion. The helices can be assembled in parallel or antiparallel arrangements, and as homo- or heteromeric complexes (Lupas ). In addition to their importance in biology, CCs are productive targets for de novo protein design (Korendovych and DeGrado, 2020; Woolfson, 2017, 2021), leading to applications in cell biology, synthetic biology and biotechnology (Beesley and Woolfson, 2019; Dawson ; Lapenta ). The interactions between CC helices are tight and well-defined. These are known as knobs-into-holes (KIH) interactions as first proposed by Crick (1953). A ‘knob’ is defined as a side chain that projects from one helix and packs into a ‘hole’ formed by four side chains of an adjacent helix. These interactions are exploited by the program Socket (Walshaw and Woolfson, 2001) to identify CCs in the 3D structures of proteins deposited in the RCSB PDB (Burley ). On this basis, Socket also identifies the underlying, usually 7-residue (heptad), repeats characteristic of CC sequences, assigning these to an a-to-g register (Lupas, 1996). Socket has been used by us to construct databases of CCs (Heal ; Moutevelis and Woolfson, 2009; Testa ) and tools for CC design and modelling (Wood and Woolfson, 2018; Wood ), and by others in a wide variety of CC-based research and applications (Walshaw and Woolfson, 2001). Socket has also been adopted and used widely, as evidenced by ≈300 and ≈400 citations in Web of Science and Google Scholar, respectively. CC research has advanced considerably over the past 20 years, and there are now many more CC structures and sequences to explore and examine (Lupas ).
Notably, an important class of CCs, the α-helical barrels (Woolfson ), has emerged that Socket does not identify. This issue is addressed by iSocket, a Python-based application programming interface (Heal ). Nonetheless, we felt that an updated Socket webserver that is accessible to non-expert users was needed. Therefore, we have upgraded Socket to Socket2, which can identify all CC architectures, and we have developed a Socket2 webserver with a built-in visualizer and improved presentation of the CC metadata that Socket generates.

2 Methods and implementation

Socket2 recognizes KIH packing to identify CCs in proteins using structural criteria alone. For this, two files are required: (i) a 3D coordinate file in PDB format (Burley ) and (ii) a DSSP output file (Joosten ; Kabsch and Sander, 1983). Details of the full methodology and parameters used are given in the original publication (Walshaw and Woolfson, 2001) and in the ‘Help’ tab of the Socket2 home page.
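To make the KIH idea concrete, the following is a minimal Python sketch of a distance-based knob search. It is a simplified illustration and not the Socket2 implementation: it assumes each residue is reduced to a single side-chain centre, treats a residue as a knob when at least four side-chain centres on another helix lie within the packing cut-off (7 Å by default, as on the webserver), and omits the further criteria described in the original publication. The function name `find_knobs` and the input layout are hypothetical.

```python
import math

def find_knobs(helices, cutoff=7.0):
    """Return (knob_helix, knob_residue, hole_helix, hole_residues) tuples.

    `helices` maps a helix label to a list of (residue_number, (x, y, z))
    pairs, where the coordinates are side-chain centres in angstroms.
    This layout is an assumption made for the sketch.
    """
    knobs = []
    for knob_helix, knob_residues in helices.items():
        for knob_number, knob_centre in knob_residues:
            for hole_helix, partner_residues in helices.items():
                if hole_helix == knob_helix:
                    continue  # a hole must lie on a different helix
                # Collect partner residues within the packing cut-off.
                near = []
                for number, centre in partner_residues:
                    d = math.dist(knob_centre, centre)
                    if d <= cutoff:
                        near.append((d, number))
                if len(near) >= 4:
                    # Keep the four closest residues as the hole.
                    near.sort()
                    hole = tuple(sorted(n for _, n in near[:4]))
                    knobs.append((knob_helix, knob_number, hole_helix, hole))
    return knobs

# Toy coordinates (invented for illustration, not taken from a real structure).
example = {
    "A": [(5, (0.0, 0.0, 0.0))],
    "B": [(1, (4.0, 0.0, 0.0)), (4, (0.0, 4.0, 0.0)),
          (5, (-4.0, 0.0, 0.0)), (8, (0.0, -4.0, 0.0)),
          (12, (20.0, 0.0, 0.0))],
}
print(find_knobs(example))
```

In this toy input, residue 5 of helix A sits within the cut-off of four residues on helix B and is reported as a knob. Lowering `cutoff` makes the search stricter.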

2.1 Architecture

The Socket2 webserver has three layers: the frontend, the backend and the software itself. The frontend is written in HTML, JavaScript and CSS. The home page provides various available options for running the program. Users can either provide a 4-character PDB ID or upload a .pdb/.cif/.mmcif file containing the 3D coordinates for a protein of interest. Any uploaded files are kept confidential and deleted within 12 h of upload. Users can also select the Socket parameters ‘packing cut-off’ and ‘helix extension’ from drop-down menus; otherwise, the default values of ‘7 Å’ and ‘0’, respectively, are used. The home page also provides background and related information under different tabs. The frontend transfers the requests to the backend that runs DSSP and Socket2. The backend is written in CGI/Perl, HTML, JavaScript and CSS. Every successful run creates an output ‘Results’ page (Fig.  1A) with two parts: (i) a molecular visualizer and (ii) tabs detailing each identified CC. The webserver uses NGL Viewer (Rose ) to display the identified CCs. Sequences and heptad registers for each CC helix are also displayed (Fig.  1B). The webserver also uses Matplotlib (Hunter, 2007) to generate plots for helix–helix angles (Fig.  1C), and core-packing angles for the KIH interactions (Fig.  1D). Users can return to the home page to run further queries by clicking the Socket2 icon.
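The input options described above can be summarised as a small validation routine. The Python sketch below is illustrative only. The actual backend is written in CGI/Perl, and the class name `SocketJob`, the field names and the error messages are hypothetical. Only the accepted file extensions, the 4-character PDB ID, the default parameter values and the restriction on biological assemblies for uploads (noted under Features) come from the text.

```python
import re
from dataclasses import dataclass
from pathlib import PurePath
from typing import List, Optional

ALLOWED_EXTENSIONS = {".pdb", ".cif", ".mmcif"}  # upload formats named in the text
DEFAULT_PACKING_CUTOFF = 7.0  # angstroms, default stated in the text
DEFAULT_HELIX_EXTENSION = 0   # default stated in the text

@dataclass
class SocketJob:
    """Hypothetical container for one webserver request."""
    pdb_id: Optional[str] = None
    upload_name: Optional[str] = None
    packing_cutoff: float = DEFAULT_PACKING_CUTOFF
    helix_extension: int = DEFAULT_HELIX_EXTENSION
    biological_assembly: bool = False

def validate(job: SocketJob) -> List[str]:
    """Return a list of problems; an empty list means the job looks acceptable."""
    problems = []
    if (job.pdb_id is None) == (job.upload_name is None):
        problems.append("provide exactly one of a PDB ID or an uploaded file")
    if job.pdb_id is not None and not re.fullmatch(r"[A-Za-z0-9]{4}", job.pdb_id):
        problems.append("PDB ID must be 4 alphanumeric characters")
    if job.upload_name is not None:
        suffix = PurePath(job.upload_name).suffix.lower()
        if suffix not in ALLOWED_EXTENSIONS:
            problems.append("upload must be a .pdb, .cif or .mmcif file")
        if job.biological_assembly:
            problems.append("biological assembly is not available for uploads")
    return problems

print(validate(SocketJob(pdb_id="6G67", biological_assembly=True)))
print(validate(SocketJob(upload_name="model.txt")))
```

Returning a list of problems rather than raising on the first one lets a frontend report every issue with a submission at once.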
Fig. 1.

Overview of the pages of the Socket2 webserver. (A) ‘Results’ page with NGL visualizer and links to different output files that can be downloaded as a zipped file. (B) Part of the tabulated information for sequences of participating α helices in identified CCs. Distributions of (C) angles between pairs of helices of the CC and (D) packing angles of each identified knob residue. Example: the biological assembly from PDB ID 6G67 (Rhys ).

2.2 Features

The Socket2 web application has the following key features.
Biological assemblies: Some PDB entries have different asymmetric units and biological assemblies. The latter can be important for capturing full protein assemblies such as CCs. The webserver allows the biological assembly to be used as the input by checking the box provided. This option is not available for uploaded files.
mmCIF files: In 2019, wwPDB made the mmCIF file format compulsory for depositions of structures determined by crystallographic methods. The webserver handles uploaded mmCIF files with MAXIT (https://sw-tools.rcsb.org/apps/MAXIT/index.html).
Modified residues: The MODRES record can be used to handle any modified residues or to rename a residue. The webserver searches the input for modified residues and, where a corresponding MODRES record is not present, adds one to the input file, allowing the Socket2 program to run smoothly.
Visualization of CCs: Use of NGL Viewer allows an immediate inspection of any identified CCs, giving users an advantage over using the standalone version of Socket2. Each participating helix of the CC is initially displayed in a different colour. Knob residues can be highlighted in ball-and-stick representation. Residues can then be rainbow-colour-coded according to their heptad register a-to-g.
Data representation: Socket2 assigns a-to-g heptad registers to each chain of each identified CC. The webserver tabulates the name, number and heptad position for every residue (Fig. 1B), allowing quick inspection of sequence-to-structure relationships. Using Matplotlib, the webserver also plots interhelix angles for each CC (Fig. 1C), and core-packing angles for every knob residue (Fig. 1D).
Separate tabs for each CC: Structures may have one or more CCs. The webserver generates a ‘Results’ tab for each CC to aid quick switching, inspection and analysis of these in large protein structures.
Metadata: The ‘Results’ tab also provides links to text files giving the detailed Socket outputs, a PyMol script allowing off-line visualization of the annotated CCs in PyMol (Schrödinger, 2021), and helix and core-packing angles (Fig. 1B). These will be particularly useful to those wishing to visualize and analyze sets of CC structures.
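The per-residue table of name, number and heptad position can be illustrated with a short Python sketch. This is not how Socket2 derives registers (it assigns them from the KIH packing it finds in the structure). The sketch assumes an unbroken heptad repeat whose register at the first residue is already known. The function names and the example sequence are hypothetical.

```python
HEPTAD = "abcdefg"

def tabulate_register(sequence, first_number, first_position):
    """Return (residue_name, residue_number, heptad_position) rows.

    Assumes a canonical, uninterrupted heptad repeat starting at
    `first_position` (one of 'a' to 'g') for the first residue.
    """
    start = HEPTAD.index(first_position)
    return [
        (residue, first_number + i, HEPTAD[(start + i) % 7])
        for i, residue in enumerate(sequence)
    ]

def core_positions(table):
    """Select rows at the a and d positions, which typically form the CC core."""
    return [row for row in table if row[2] in ("a", "d")]

# Invented one-heptad example, numbered from residue 2 and starting at position g.
table = tabulate_register("EIAALKQ", 2, "g")
for name, number, position in table:
    print(name, number, position)
print(core_positions(table))
```

Filtering the table by heptad position, as `core_positions` does, is one simple way to start examining sequence-to-structure relationships across sets of CCs.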

3 Applications

We anticipate that Socket2 and data generated from it will be of use in gathering CC sequence statistics and structural parameters to improve sequence-to-structure relationships for CC-prediction (Ludwiczak ), modelling (Guzenko and Strelkov, 2018) and design (Korendovych and DeGrado, 2020; Woolfson, 2017, 2021). It will also facilitate the development and population of sequence and structural databases such as CC+ (Testa ), which, likewise, can be used to test CC-prediction algorithms and to develop rules for CC design. We envisage that the Socket2 webserver will provide a useful gateway to such studies for experienced and new users alike.

4 Conclusions

Socket has been upgraded to Socket2 to allow the identification of all possible CC architectures in multiple structure-file formats containing protein chains with proteinogenic or modified amino acids. The Socket2 program is freely available to download under an MIT licence from http://coiledcoils.chm.bris.ac.uk/socket2/home.html. In addition, a user-friendly, interactive, and freely available webserver has been designed to run the program, and to allow quick visual inspection of the identified CCs and associated structural and sequence data. We anticipate that these tools will be useful to new and experienced cell, chemical, structural and synthetic biologists interested in natural and designed CC domains.
References (24 in total)

1.  A periodic table of coiled-coil protein structures.

Authors:  Efrosini Moutevelis; Derek N Woolfson
Journal:  J Mol Biol       Date:  2008-11-25       Impact factor: 5.469

2.  NGL viewer: web-based molecular graphics for large complexes.

Authors:  Alexander S Rose; Anthony R Bradley; Yana Valasatava; Jose M Duarte; Andreas Prlic; Peter W Rose
Journal:  Bioinformatics       Date:  2018-11-01       Impact factor: 6.937

Review 3.  Towards functional de novo designed proteins.

Authors:  William M Dawson; Guto G Rhys; Derek N Woolfson
Journal:  Curr Opin Chem Biol       Date:  2019-07-20       Impact factor: 8.822

Review 4.  Coiled Coils - A Model System for the 21st Century.

Authors:  Andrei N Lupas; Jens Bassler
Journal:  Trends Biochem Sci       Date:  2016-11-21       Impact factor: 13.807

5.  DeepCoil-a fast and accurate prediction of coiled-coil domains in protein sequences.

Authors:  Jan Ludwiczak; Aleksander Winski; Krzysztof Szczepaniak; Vikram Alva; Stanislaw Dunin-Horkawicz
Journal:  Bioinformatics       Date:  2019-08-15       Impact factor: 6.937

Review 6.  A brief history of de novo protein design: minimal, rational, and computational.

Authors:  Derek N Woolfson
Journal:  J Mol Biol       Date:  2021-07-20       Impact factor: 5.469

Review 7.  Coiled coil protein origami: from modular design principles towards biotechnological applications.

Authors:  Fabio Lapenta; Jana Aupič; Žiga Strmšek; Roman Jerala
Journal:  Chem Soc Rev       Date:  2018-05-21       Impact factor: 54.564

8.  CCBuilder 2.0: Powerful and accessible coiled-coil modeling.

Authors:  Christopher W Wood; Derek N Woolfson
Journal:  Protein Sci       Date:  2017-09-15       Impact factor: 6.725

Review 9.  The Structure and Topology of α-Helical Coiled Coils.

Authors:  Andrei N Lupas; Jens Bassler; Stanislaw Dunin-Horkawicz
Journal:  Subcell Biochem       Date:  2017

10.  Applying graph theory to protein structures: an Atlas of coiled coils.

Authors:  Jack W Heal; Gail J Bartlett; Christopher W Wood; Andrew R Thomson; Derek N Woolfson
Journal:  Bioinformatics       Date:  2018-10-01       Impact factor: 6.937
