Xwalk: computing and visualizing distances in cross-linking experiments.

Abdullah Kahraman, Lars Malmström, Ruedi Aebersold.

Abstract

MOTIVATION: Chemical cross-linking of proteins or protein complexes, combined with the mass spectrometry-based localization of the cross-linked amino acids in peptide sequences, is a powerful method for generating distance restraints on the substrate's topology.
RESULTS: Here, we introduce the algorithm Xwalk for predicting and validating these cross-links on existing protein structures. Xwalk calculates and displays non-linear distances between chemically cross-linked amino acids on protein surfaces, while mimicking the flexibility and non-linearity of cross-linker molecules. It returns a 'solvent accessible surface distance', which corresponds to the length of the shortest path between two amino acids, where the path leads through solvent occupied space without penetrating the protein surface.
AVAILABILITY: Xwalk is freely available as a web server or stand-alone JAVA application at http://www.xwalk.org.

Year:  2011        PMID: 21666267      PMCID: PMC3137222          DOI: 10.1093/bioinformatics/btr348

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

In computational structural biology, distance restraints from chemical cross-linking experiments have so far been employed as an upper limit on the Euclidean distance between a pair of cross-linked amino acids (Kaimann et al.; Shandiz et al.). However, deducing the ‘cross-linkability’ of an amino acid pair from the length of a Euclidean distance vector disregards the fact that the vector often penetrates segments of the protein. Potluri et al. recognized this problem and implemented a short-cut algorithm that computes the shortest path between two cross-linked amino acids using vertices from a protein surface triangulation and convex hull, while Zelter et al. explicitly modeled the cross-linker molecule onto existing protein structures. We have implemented Xwalk, which resembles the approach taken by Potluri et al., but instead uses grids and a search algorithm to compute the length of the shortest path (Fig. 1), which we refer to as the solvent accessible surface distance (SASD). Our code is the only one of its kind that is open source and available as a web server.
Fig. 1.

(a) Shortest SASD path illustrated on the example of human prothrombin (PISA-Id: 1dx5, chain E). The Cβ atoms of Lys-70 (orange sphere) and Lys-81 (green sphere) have a Euclidean distance of 9.1 Å (yellow vector), which by value alone would fall within the cross-link range of DSS or BS3. However, the shortest path, with an SASD of 59.2 Å, reveals that the Euclidean distance vector actually penetrates the protein, so the only way to connect both amino acids is a long detour over the protein surface (chain of spheres colored blue to red for distances of 0–59 Å). (b) Argonaut protein from the RNA-induced silencing complex (RISC) with 271 virtual intra-protein cross-links. Both figures were rendered with PyMOL (http://www.pymol.org).

2 IMPLEMENTATION

Xwalk was written in the JAVA programming language. It is based on the CleftXplorer modelling package (Kahraman et al.) and uses the breadth-first search algorithm on local grid representations of the protein and its surrounding solvent to calculate the shortest SASD path between two atoms of two amino acids on the protein surface (Fig. 1). Xwalk can run in two modes (Supplementary Material): in validation mode, Xwalk verifies experimentally measured cross-links on an existing protein structure; in production mode, it reports a list of in silico predicted, theoretically possible virtual cross-links (vXL) that might be observed in a cross-linking experiment. Both modes are identical except for step 1.c in the Supplementary Material, which in production mode is replaced by the specification of generic identifiers of amino acids to be cross-linked in silico. Xwalk checks that the cross-linked amino acids and the entire SASD path are solvent accessible.
Furthermore, Xwalk takes the dynamic disorder of protein segments within X-ray structures into account: it increases the maximum distance range of a cross-linker spacer arm by the sum of the mean atomic displacements of the cross-linked amino acids. The mean atomic displacement $\langle x\rangle$ of a single amino acid is inferred from the Debye–Waller formula $B = 8\pi^2\langle x^2\rangle$, where $B$ is the atomic B factor of the cross-linked amino acid as given in a PDB file. Moreover, Xwalk offers the option to discard all side chains from the distance calculation to account for their conformational change when reacting with the cross-linker molecule. At the same time, the solvent accessible surface area is expanded by increasing the solvent radius to 2.0 Å to avoid path calculations through molecular ‘tunnels’ that arise from the side chain depletion.
The output of Xwalk is either a list of vXL or a PyMOL script (http://www.pymol.org) displaying the shortest SASD path as a list of dummy atom entries in a PDB file (Fig. 1). The list of vXL is a list of atom pairs sorted by SASD with information on their amino acid number and name, chain identifier and atom name, along with their distance in the PDB sequence, their Euclidean distance and their SASD. Furthermore, an in silico trypsin digestion can be requested, in which case the associated shortest tryptic peptide sequences are reported. The source code of Xwalk is available under a Creative Commons license together with the executable at http://www.xwalk.org. The same site also provides an easy-to-use web interface to the basic functionalities of Xwalk, with a Jmol viewer applet (http://www.jmol.org) as a visualization tool for the shortest paths.
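The grid search and the B-factor correction described in this section can be illustrated with a minimal Python sketch. It is not Xwalk's JAVA code: the function names, the 6-connected neighborhood, the 1 Å grid spacing and the reading of $\langle x\rangle$ as $\sqrt{\langle x^2\rangle}$ are all assumptions made for illustration, and the toy grid stands in for a real protein structure.

```python
import math
from collections import deque

# The six axis-aligned neighbor offsets of a grid cell (an assumption of this sketch).
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def mean_displacement(b_factor):
    """Displacement in Å from B = 8*pi^2*<x^2>, taking <x> as sqrt(<x^2>)."""
    return math.sqrt(b_factor / (8 * math.pi ** 2))


def effective_cutoff(spacer_max, b_factor_a, b_factor_b):
    """Spacer-arm cut-off widened by the displacements of both residues."""
    return spacer_max + mean_displacement(b_factor_a) + mean_displacement(b_factor_b)


def grid_sasd(blocked, shape, start, goal, spacing=1.0):
    """Breadth-first search that only steps through solvent cells.

    blocked: set of (i, j, k) cells occupied by protein
    shape:   grid dimensions (nx, ny, nz)
    Returns the path length in Å, or None if no solvent path exists.
    """
    if start in blocked or goal in blocked:
        return None
    steps = {start: 0}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            return steps[cell] * spacing
        for offset in NEIGHBORS:
            nxt = tuple(c + d for c, d in zip(cell, offset))
            inside = all(0 <= n < s for n, s in zip(nxt, shape))
            if inside and nxt not in blocked and nxt not in steps:
                steps[nxt] = steps[cell] + 1
                queue.append(nxt)
    return None


# Toy example: a one-cell-thick slab with a protein "wall" between two cells.
wall = {(2, y, 0) for y in range(4)}
start, goal = (1, 0, 0), (3, 0, 0)
print("Euclidean:", math.dist(start, goal))  # straight line through the wall
print("Grid SASD:", grid_sasd(wall, (5, 5, 1), start, goal))
print("Cut-off:", effective_cutoff(22.4, 30.0, 30.0))  # hypothetical B factors
```

In this toy grid the two cells are 2.0 Å apart in a straight line, but the only solvent route runs around the end of the wall and is 10.0 Å long, which mirrors the situation shown in Fig. 1a on a much smaller scale. Counting axis-aligned steps overestimates true path lengths; it is used here only to keep the sketch short.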

3 CROSS-LINKING THE PDB

A cross-linking experiment can only yield cross-links if the proteins under study have particular amino acids that are within a certain distance from each other and solvent accessible. However, the number of cellular proteins that have such characteristics is not known, nor is the number of cross-links one can expect per protein or protein complex. To estimate these numbers, we have run Xwalk in production mode on a non-homologous protein dataset and simulated the most common cross-linking reagents DSS and BS3, both with a maximum distance cut-off of 22.4 Å (11.42 Å N–N distance in DSS + 2×5 Å CB-NZ distance in lysine), discarding side chains and measuring distances between Cβ atoms. The protein dataset consisted of 1621 X-ray protein structures from the PISA server (Krissinel and Henrick, 2007), where protein homology was defined by the H-level or the superfamily-level in the CATH (Orengo et al.) or SCOP database (Murzin et al.), respectively. Each protein in the dataset was selected to have the highest annotated domain coverage and the highest number of protein chains within its homology class, while setting an upper bound of 20 protein chains for oligomeric protein complexes. In the entire dataset, we calculated 30 266 unique vXL (excluding vXL that are found between equivalent amino acids in homomers, as these cannot be distinguished in real cross-linking experiments). Of these, 25 751 were intra-protein and 4515 were inter-protein vXL. As the number of unique protein chains grows from one to five, the number of unique intra- and inter-protein vXL increases from 15 to 45 and from 2 to 24, respectively. In all, 18% of proteins had no vXL at all, while 40 protein structures had more than 100 vXL. The highest number of unique vXL in the dataset, namely 271 vXL, was found in the monomeric structure of the RNA-induced silencing complex (RISC) associated argonaut protein (PDB-Id: 1u04, see Fig. 1b) and in the bacteriophage DNA polymerase–DNA terminal protein complex (PDB-Id: 2ex3).
The benefit of Xwalk and SASD becomes apparent when the above analysis is repeated with the conventional Euclidean distance. The repetition with a 22.4 Å Euclidean distance cut-off resulted in 65 447 vXL, i.e. more than twice as many as with SASD. Of these, 35 181 vXL had an SASD larger than 22.4 Å, with the SASD exceeding the Euclidean distance by >8 Å on average. Among those, >100 vXL had a distance difference of >50 Å (see the example in Fig. 1a). These numbers suggest that Xwalk is able to reduce the false positive prediction of cross-links by >50%. The large discrepancy emphasizes the importance of an adequate model for a cross-linker molecule in cross-linking experiments. Despite the smaller number of false positives with SASD, we have observed that the number of vXL usually exceeds the number of experimental cross-links by at least one order of magnitude (Leitner et al.). Most of the theoretically predicted but experimentally unobserved cross-links may be missed because of their low abundance, their unfavorable chromatographic, ionization and fragmentation properties, or their unsuitable peptide length. Another issue arises when segments of the protein structure are missing, such as in intrinsically disordered proteins or proteins with flexible loops. These regions will have missing atom coordinates that are currently ignored by Xwalk and may lead to lower SASD than expected.
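The counts reported in this section can be cross-checked with a few lines of arithmetic. The short Python sketch below uses only numbers quoted in the text; the variable names are illustrative.

```python
# Counts quoted in the text for the 22.4 Å cut-off.
sasd_vxl = 30266            # unique vXL found with SASD
intra, inter = 25751, 4515  # intra- and inter-protein vXL
euclid_vxl = 65447          # vXL found with the Euclidean distance
euclid_over_sasd = 35181    # Euclidean vXL whose SASD exceeds the cut-off

# Intra- and inter-protein vXL add up to the SASD total.
assert intra + inter == sasd_vxl

ratio = euclid_vxl / sasd_vxl                    # "more than twice as many"
rejected_fraction = euclid_over_sasd / euclid_vxl  # basis of the ">50%" statement
remaining = euclid_vxl - euclid_over_sasd        # Euclidean vXL that pass the SASD cut-off

print(f"Euclidean / SASD ratio: {ratio:.2f}")
print(f"Fraction rejected by SASD: {rejected_fraction:.1%}")
print(f"Remaining vXL: {remaining}")
```

The ratio is about 2.16 and the rejected fraction about 53.8%, in line with the "more than twice" and ">50%" statements. The Euclidean vXL that remain after removing those with an SASD above the cut-off number exactly 30 266, the SASD total.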

REFERENCES

1. S Potluri, A A Khan, A Kuzminykh, J M Bujnicki, A M Friedman, C Bailey-Kellogg. Geometric analysis of cross-linkability for protein fold discrimination. Pac Symp Biocomput, 2004.

2. Ali T Shandiz, Benjamin R Capraro, Tobin R Sosnick. Intramolecular cross-linking evaluated as a structural probe of the protein folding transition state. Biochemistry, 2007-11-07.

3. Evgeny Krissinel, Kim Henrick. Inference of macromolecular assemblies from crystalline state. J Mol Biol, 2007-05-13.

4. T Kaimann, S Metzger, K Kuhlmann, B Brandt, E Birkmann, H-D Höltje, D Riesner. Molecular model of an alpha-helical prion protein dimer and its monomeric subunits as derived from chemical cross-linking and molecular modeling calculations. J Mol Biol, 2007-11-21.

5. Alex Zelter, Michael R Hoopmann, Robert Vernon, David Baker, Michael J MacCoss, Trisha N Davis. Isotope signatures allow identification of chemically cross-linked peptides by mass spectrometry: a novel method to determine interresidue distances in protein structures through cross-linking. J Proteome Res, 2010-07-02.

6. Abdullah Kahraman, Richard J Morris, Roman A Laskowski, Angelo D Favia, Janet M Thornton. On the diversity of physicochemical environments experienced by identical ligands in binding pockets of unrelated proteins. Proteins, 2010-04.

7. C A Orengo, A D Michie, S Jones, D T Jones, M B Swindells, J M Thornton. CATH--a hierarchic classification of protein domain structures. Structure, 1997-08-15.

8. A G Murzin, S E Brenner, T Hubbard, C Chothia. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol, 1995-04-07.

9. Alexander Leitner, Thomas Walzthoeni, Abdullah Kahraman, Franz Herzog, Oliver Rinner, Martin Beck, Ruedi Aebersold. Probing native protein structures by chemical cross-linking, mass spectrometry, and bioinformatics. Mol Cell Proteomics, 2010-03-31.