
Automated protein resonance assignments of magic angle spinning solid-state NMR spectra of β1 immunoglobulin binding domain of protein G (GB1).

Hunter N B Moseley, Lindsay J Sperling, Chad M Rienstra.

Abstract

Magic-angle spinning solid-state NMR (MAS SSNMR) represents a fast developing experimental technique with great potential to provide structural and dynamics information for proteins not amenable to other methods. However, few automated analysis tools are currently available for MAS SSNMR. We present a methodology for automating protein resonance assignments of MAS SSNMR spectral data and its application to experimental peak lists of the β1 immunoglobulin binding domain of protein G (GB1) derived from a uniformly ¹³C- and ¹⁵N-labeled sample. This application to the 56 amino acid GB1 produced an overall 84.1% assignment of the N, CO, CA, and CB resonances with no errors using peak lists from NCACX 3D, CANcoCA 3D, and CANCOCX 4D experiments. This proof of concept demonstrates the tractability of this problem.


Year:  2010        PMID: 20931264      PMCID: PMC2962796          DOI: 10.1007/s10858-010-9448-2

Source DB:  PubMed          Journal:  J Biomol NMR        ISSN: 0925-2738            Impact factor:   2.835


Introduction

Magic-angle spinning solid-state NMR (MAS SSNMR) is a fast-developing experimental method with great potential to provide structural and dynamics information for proteins not amenable to solution NMR or X-ray crystallography. Many technical aspects of MAS SSNMR are rapidly developing, among them: (i) improvements in nano/microcrystalline and membrane protein sample preparation (Frericks et al. 2006; Li et al. 2007; Lorch et al. 2005), (ii) improvements in commercially available hardware, and (iii) development of pulse sequences for new and improved experiments (Sun et al. 1997; Li et al. 2007; Franks et al. 2007; Zhong et al. 2007; Hong 1999; Bockmann et al. 2003; Rienstra et al. 2000; Pauli et al. 2001; Igumenova et al. 2004; Astrof et al. 2001). In many cases, adaptation of tools and techniques from solution NMR has fueled this rapid development. However, the development of analysis software for MAS SSNMR lags far behind. In particular, the more sophisticated automated protein resonance assignment programs for solution NMR cannot be used directly on SSNMR data lacking hydrogen resonances. This is because leading protein resonance assignment programs (Zimmerman et al. 1997; Leutner et al. 1998; Atreya et al. 2000; Bartels et al. 1996, 1997, 2004; Moseley et al. 2001; Moseley and Montelione 1999; Moseley et al. 2004; Huang et al. 2005; Coggins and Zhou 2003; Jung and Zweckstetter 2004; Eghbalnia et al. 2005; Hyberts and Wagner 2003) are hard-wired with an amide ¹⁵N-¹H double resonance spin system root definition (Fig. 1) and require hydrogen-based experiments. To address this deficiency, we present a methodology for automating protein resonance assignments of MAS SSNMR spectral data and its practical application to an experimental peak list dataset of the β1 immunoglobulin binding domain of protein G (GB1) as a proof of concept.
Our goals are: (i) to eventually provide the necessary software tools to automate the MAS SSNMR protein resonance assignment process, (ii) to improve the quality of this analysis, and (iii) to make this analysis more objective and reproducible.
Fig. 1

Standard dipeptide spin system definitions for sequential protein resonance assignments in solution and solid state NMR. Spin system root resonances are in red. The solid red box indicates that the root resonances are found in all standard experiments used in dipeptide spin system assembly. The dashed red boxes indicate pairs of root resonances are found in only a subset of the experiments used in dipeptide spin system assembly

Figure 2 shows the protein resonance assignment problem represented as a bipartite graph. This assignment problem is essentially the same for both solution and solid-state NMR (Tycko 1996; Hong 1999) and is solved in seven basic steps (Table 1). However, one of the critical differences between solution and solid-state NMR is the set of root resonances used to group peaks into spin systems. These resonances are dictated by the set of NMR experiments (i.e., the experimental strategy) used to solve the assignment problem. As shown in Fig. 1, common MAS SSNMR protein resonance assignment strategies use a partial triple resonance spin system root definition (Pauli et al. 2001; Igumenova et al. 2004; Franks et al. 2005; Balayssac et al. 2007; Hong 1999; Sperling et al. 2010), since not all three resonances may be present within each experiment in a given strategy. MAS SSNMR experimental strategies naturally group into three categories of assignment strategies (Table 2). In category I, two sets of experiments containing either Ni-C′i−1 or Ni-Cαi root resonances are combined into complete dipeptide spin systems using the single common amide nitrogen root resonance. In categories IIa and IIb, experiments containing either Ni-C′i−1 or Ni-Cαi root resonances are combined into complete dipeptide spin systems using two common root resonances. In category III, the listed 4D experiments contain all three root resonances, which represent a complete triple resonance spin system root definition.
Labs have published assignment results using category I strategies, but only on small proteins (Hong 1999; Pauli et al. 2001; Igumenova et al. 2004; Franks et al. 2005; Balayssac et al. 2007). Labs are starting to use category II strategies for larger proteins (Frericks et al. 2006; Li et al. 2007; Li et al. 2008). In the future, labs will probably explore category III strategies using newer G-matrix Fourier transformation (GFT) experiments (Szyperski et al. 1993a; Szyperski et al. 1993b; Kim and Szyperski 2003; Kim and Szyperski 2004; Astrof et al. 2001; Luca and Baldus 2002). Moreover, category II and III strategies have strengths that could make them better suited to automation than even solution NMR strategies. First, the chemical shift dispersion in Euclidean space of Ni-Cαi, and especially C′i−1-Ni-Cαi, root resonance tuples is significantly greater than for Ni-Hi root resonance tuples. In other words, if the Ni-Cαi chemical shift pairs of a folded protein are plotted on a 2D graph as small circles whose radius represents the uncertainty in the chemical shift values, they form less dense clumps (i.e., fewer overlapping circles) than Ni-Hi pairs plotted in the same way. This helps prevent the non-unique grouping of peaks into spin systems, which severely complicates resonance assignments. Second, category IIa and IIb strategies can be combined into a single strategy represented as a merged double bipartite graph. This representation may lead to the development of superior grouping and linking algorithms.
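The dispersion argument above can be illustrated with a small counting sketch. It scales each root dimension by its tolerance and counts pairs of root tuples whose uncertainty regions overlap. The helper name count_overlaps, the shift values, and the tolerances are all invented for illustration and are not taken from the GB1 data or from the authors' software.

```python
import math
from itertools import combinations

def count_overlaps(roots, tolerances):
    """Count pairs of root tuples whose uncertainty regions overlap."""
    overlaps = 0
    for a, b in combinations(roots, 2):
        # Express each coordinate difference in units of that dimension's
        # tolerance, then take the Euclidean distance in the scaled space.
        d = math.sqrt(sum(((x - y) / t) ** 2 for x, y, t in zip(a, b, tolerances)))
        # Two unit-radius regions overlap when their centers are closer than 2.
        if d < 2.0:
            overlaps += 1
    return overlaps

# Hypothetical root shifts (ppm) for the same three residues.
n_h = [(120.1, 8.20), (120.3, 8.25), (118.0, 8.21)]    # (N, HN)
n_ca = [(120.1, 56.0), (120.3, 61.5), (118.0, 53.2)]   # (N, CA)

print(count_overlaps(n_h, (0.3, 0.03)))   # N-H roots, assumed tolerances
print(count_overlaps(n_ca, (0.3, 0.3)))   # N-CA roots, assumed tolerances
```

With these invented values, the N-H roots give one overlapping pair and the N-Cα roots give none, which is the qualitative behavior the paragraph describes.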
Fig. 2

Bipartite graph representing the protein resonance assignment problem. Amino acid typing limits the edges present. Red highlights represent spin system linking into a uniquely mapped segment

Table 1

Protein resonance assignment process

Step
1. Peak list registration
2. Peak list quality assessment
3. Spin system grouping
4. Amino acid typing
5. Linking
6. Mapping
7. Resonance assignment quality assessment
Table 2

MAS SSNMR experimental strategies for protein resonance assignment

Category I
Cαi-Ni-C′i−1 (a,b)
Ni-Cαi-CXi (d,e,f)
Ni-C′i−1-CXi−1 (d,e,h)
Ni-C′i−1-Cαi−1 (f,i)
Ni-Cαi-Cαii (i,j)
Ni-Cαi-Cβi (b)
Ni-C′i−1-(Cαi−1)-Cαi−1i−1 (j)
Ni-C′i−1-(Cαi−1)-Cβi−1
Ni-Cαi-C′i (f)

Category IIa
C′i−1-Ni-Cαi (a,b)
C′i−1-Ni-(Cαi)-CXi (g)
Ni-C′i−1-CXi−1 (d,e,h)
Ni-C′i−1-Cαi−1 (f,i)
C′i−1-Ni-(Cαi)-Cβi
C′i−1-Ni-(Cαi)-C′i
Ni-C′i−1-(Cαi−1)-Cαi−1i−1 (j)
Ni-C′i−1-(Cαi−1)-Cβi−1

Category IIb
Cαi-Ni-C′i−1 (a,b)
Cαi-Ni-(C′i−1)-CXi−1 (c)
Ni-Cαi-CXi (d,e,f,h)
Ni-Cαi-Cαii (i,j)
Ni-Cαi-Cβi (b)
Ni-Cαi-C′i (f)
Cαi-Ni-(C′i−1)-Cαi−1

Category III
Cαi-Ni-C′i−1-CXi−1 (c)
C′i−1-Ni-Cαi-CXi
Cαi-Ni-C′i−1-Cαi−1 (c)
C′i−1-Ni-Cαi-Cβi
C′i−1-Ni-Cαi-Cαii
C′i−1-Ni-Cαi-C′i

Experiments refer to the detected nuclei and magnetization transfer and not to specific pulse sequence implementations. Experiments in bold are required.

a Astrof et al. (2001), b Li et al. (2007), c Franks et al. (2007), d Sun et al. (1997), e Rienstra et al. (2000), f Igumenova et al. (2004), g Zhong et al. (2007), h Pauli et al. (2001), i Hong (1999), j Bockmann et al. (2003)

However, MAS SSNMR spectra, especially of membrane proteins, often lack significant numbers of resonances at a given experimental condition (Andronesi et al. 2005; Li et al. 2007), which can especially confuse both global optimization and exhaustive search mapping algorithms. But spectroscopists are finding clever ways to optimize their experiments for higher sensitivity. For instance, dropping the temperature below 0°C can improve signal intensity several-fold (Kloepper et al. 2007). Moreover, experiments can be collected under multiple conditions to improve detection of all resonances. Another historical problem in SSNMR experiments is large spectral line widths, which increase spectral crowding and peak overlap. However, improvements in magic-angle spinning techniques, pulse sequences, and micro/nanocrystalline sample preparations are greatly reducing observed line widths into the sub-ppm range (Franks et al. 2005; Pauli et al. 2000; McDermott et al. 2000; Martin and Zilm 2003).
For example, a recent MAS SSNMR resonance assignment of the 20 kDa membrane protein DsbB had average ¹⁵N and ¹³C line widths of 0.7 and 0.5 ppm, respectively (Li et al. 2007, 2008). Furthermore, several labs have recently developed and used 3D and 4D experiments to reduce peak overlap in spectra of membrane proteins (Zhong et al. 2007; Kijac et al. 2007; Li et al. 2007, 2008; Frericks et al. 2006; Franks et al. 2007).

Materials and methods

We have implemented a prototype of alignment, grouping, and typing algorithms and combined them with the linking and mapping algorithms from the solution NMR assignment package AutoAssign (Moseley et al. 2001; Moseley and Montelione 1999; Moseley et al. 2004; Baran et al. 2004; Huang et al. 2005; Zimmerman et al. 1997) to provide a proof of concept. The alignment algorithm constructs and compares Euclidean distance matrices for “input” and “root” peak lists and is similar to the point pattern matching algorithm pioneered by Ranade and Rosenfeld (1980) and later improved for use in Landsat image registration (Ton and Jain 1989). We made three improvements to their algorithm: (i) the use of the Jaccard coefficient (i.e., set intersection divided by set union) in place of a simple support list count as the robustness score; (ii) the multiplication of the Jaccard coefficient by the probability of a support pair’s registration; and (iii) the use of a weighted standard deviation of registration in deriving support tolerances. The latter two improvements convert the algorithm into a stationary iterative method. The algorithm is optimized to a computational complexity of O(mn² log n), where m and n represent the lengths of the root and input peak lists, respectively, and we see a clear path to improving this to O(mn²). This alignment algorithm provides: (i) the best mapping of peaks from an “input” peak list to peaks in a “root” peak list for their comparable spectral dimensions; (ii) the registration needed to translate the input peak list to the root peak list in their comparable dimensions; and (iii) the standard deviation of this registration, which is needed to calculate match tolerances. While the alignment step is the most computationally intensive step, it only has to be performed once and provides the first set of major quality control measures for the given dataset.
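The three outputs of the alignment step (peak mapping, registration, and its standard deviation) can be illustrated with a deliberately simplified sketch. It is not the distance-matrix algorithm described above: it tries every input-to-root pairing as a candidate offset, scores each by a plain Jaccard coefficient, and omits the probability weighting, the weighted standard deviation, and the iteration. The function names match_pairs and register, and the single Euclidean tolerance tol, are assumptions made for this example.

```python
import math
from statistics import mean, stdev

def match_pairs(input_peaks, root_peaks, offset, tol):
    """Greedily pair each shifted input peak with its nearest unused root peak."""
    used, pairs = set(), []
    for i, p in enumerate(input_peaks):
        shifted = tuple(x + o for x, o in zip(p, offset))
        best, best_d = None, tol
        for j, r in enumerate(root_peaks):
            d = math.dist(shifted, r)
            if j not in used and d <= best_d:
                best, best_d = j, d
        if best is not None:
            used.add(best)
            pairs.append((i, best))
    return pairs

def register(input_peaks, root_peaks, tol):
    """Return (jaccard, registration, spread, pairs) for non-empty peak lists."""
    best_score, best_pairs = 0.0, []
    for p in input_peaks:
        for r in root_peaks:
            # Candidate offset that would move input peak p onto root peak r.
            offset = tuple(b - a for a, b in zip(p, r))
            pairs = match_pairs(input_peaks, root_peaks, offset, tol)
            # Jaccard coefficient: matched peaks / peaks in either list.
            score = len(pairs) / (len(input_peaks) + len(root_peaks) - len(pairs))
            if score > best_score:
                best_score, best_pairs = score, pairs
    # Registration and its spread, per dimension, over the supporting pairs.
    dims = range(len(input_peaks[0]))
    diffs = [[root_peaks[j][k] - input_peaks[i][k] for i, j in best_pairs] for k in dims]
    registration = [mean(d) for d in diffs]
    spread = [stdev(d) if len(d) > 1 else 0.0 for d in diffs]
    return best_score, registration, spread, best_pairs

# Invented (N, CA) peak positions in ppm; the last peak of each list has no partner.
root = [(120.0, 55.0), (118.5, 60.2), (125.3, 52.1), (110.0, 45.0)]
inp = [(119.71, 55.19), (118.19, 60.41), (124.99, 52.30), (130.0, 70.0)]
print(register(inp, root, tol=0.1))
```

In this toy case three of the four peaks in each list support the same offset, so the registration is roughly +0.30 ppm in the first dimension and −0.20 ppm in the second. The per-dimension spread plays the role of the standard deviation of registration from which match tolerances would be derived.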
The next step involves grouping of peaks into dipeptide spin systems using root resonances that all the peaks in the spin system have in common. Each dipeptide spin system is composed of intra-residue resonances and sequential-residue resonances organized as ladders. Our grouping algorithm uses a new bottom-up approach to dipeptide spin system grouping in contrast to the common top-down algorithms that use a single root spectrum as seeds for spin system creation. In this grouping algorithm, peak list-based and ladder-based groupings are done first before building the dipeptide spin systems. Peaks from a single spectrum are more self-consistent in their values than peaks between spectra. The new algorithm can use narrower tolerances to group peaks within a spectrum first and then average the root resonances of these intra-spectra peaks to improve their standard error. The same logic is applied to groups of peaks in the same ladder. The number of complete spin systems derived from the grouping algorithm provides the second major quality control measure for the given dataset. For the typing algorithm, we introduce the concept of a chemical shift tuple or ordered list of chemical shifts that have some support for being in the same ladder or dipeptide spin system. Using a heuristic, the algorithm constructs a set of possible carbon chemical shift tuples to calculate Bayesian typing probabilities. Doing so minimizes the deleterious effects of resonance misclassification, which can arise from a multitude of situations including overlapped spin systems, noise peaks, and missing peaks. Furthermore, we can constrain tuple creation using 4D information from category III experiments (Table 2) and bottom-up grouping. However, the probability densities are no longer comparable in this Bayesian statistical framework because the probability density function changes with the number of carbon chemical shifts or independent variables used. 
This variation in the number of independent variables across the 20 residue types requires the use of chi-square probabilities, or p-values of a chi-square statistic, instead of probability densities. In the future, we can use the tuple concept to improve the linking and mapping algorithms.
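A minimal sketch of why chi-square p-values make tuples of different lengths comparable is given below. It assumes independent, normally distributed shifts and uses a small table of approximate mean and standard deviation values that are illustrative only and not taken from this paper. The names REFERENCE, chi2_sf, and type_tuple are hypothetical, and the sketch omits the Bayesian priors and the heuristic tuple construction described above.

```python
import math

# Approximate, illustrative (mean, sd) carbon shift statistics in ppm.
# These numbers are assumptions for the example, not values from the paper.
REFERENCE = {
    "Ala": {"CA": (53.1, 2.0), "CB": (19.0, 1.8)},
    "Ser": {"CA": (58.7, 2.1), "CB": (63.8, 1.5)},
    "Thr": {"CA": (62.2, 2.6), "CB": (69.7, 1.7)},
    "Gly": {"CA": (45.4, 1.3)},
}

def chi2_sf(x, dof):
    """Upper-tail probability of a chi-square statistic x with integer dof >= 1."""
    t = x / 2.0
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-t)               # closed form for dof = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))    # closed form for dof = 1
    # Recurrence Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1).
    for _ in range((dof - 1) // 2 if dof % 2 else (dof - 2) // 2):
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q

def type_tuple(shifts):
    """Map each compatible residue type to the p-value of the observed tuple."""
    result = {}
    for residue, stats in REFERENCE.items():
        if not set(shifts) <= set(stats):
            continue  # residue type lacks an observed atom (e.g., Gly has no CB)
        chi2 = sum(((shifts[a] - stats[a][0]) / stats[a][1]) ** 2 for a in shifts)
        # Degrees of freedom equal the number of shifts in the tuple.
        result[residue] = chi2_sf(chi2, len(shifts))
    return result

print(type_tuple({"CA": 52.5, "CB": 19.5}))  # two-shift tuple
print(type_tuple({"CA": 45.0}))              # one-shift tuple
```

Because each p-value already accounts for the number of shifts used, a one-shift glycine candidate and a two-shift alanine candidate can be ranked on the same scale, which raw probability densities would not allow.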

Results and discussion

Currently, our implementation handles only a limited set of experimental peak lists: (i) NCACX 3D (with 35 ms DARR mixing), (ii) CANcoCA 3D, and (iii) CANCOCX 4D (Franks et al. 2005; Franks et al. 2007). These peak lists represent a category IIb assignment strategy (Table 2), which uses a Ni-Cαi root to create dipeptide spin systems. The implementation takes these peak lists, aligns them, groups peaks into dipeptide spin systems in a bottom-up strategy, and then types each ladder to probable amino acids using the carbon shift tuples. The implementation then simulates a set of Ni-Hi rooted peak lists for AutoAssign with an artificial HN shift equal to the observed CA shift divided by 6 (HN = CA/6). This creation of artificial HN shifts is necessary because AutoAssign requires Ni-Hi rooted peak lists. We then use AutoAssign to perform the linking and mapping steps. From this, we obtain an overall 84.1% assignment of the N, CO, CA, and CB resonances with no errors (Fig. 3), as compared to manually determined and verified assignments (BMRB entry 15156). These results demonstrate the feasibility of automating protein resonance assignments of MAS SSNMR spectral data. They are easily reproduced by the software and lack significant human subjectivity in the grouping and typing of spin systems. Moreover, the input peak lists are not perfect; they are realistic peak lists that a spectroscopist used for manual assignment. There are only enough matching peaks to form 52 of the 56 dipeptide spin systems, and some CB peaks are simply missing. Since the CANCOCX experiment is a 4D experiment, the resolution of the CA dimension is very low, causing a matching standard deviation of ~0.5 ppm when aligned to the other two peak lists. Nevertheless, our implementation handled the missing information and resolution issues and assigned 43 of the 52 dipeptide spin systems.
There are three main reasons for these results: (i) better dispersion with a Ni-Cαi root; (ii) an improved bottom-up grouping algorithm that especially allows CANCOCX peaks to group around a common C’i-1-Ni-Cαi root before grouping with peaks from other peak lists; and (iii) improved amino acid typing algorithms that shrank the average “possible residue type list” to 5.7 residues with 0.9999 confidence (normally ~8 residues with Cα/Cβ typing). We expect even better results once improved linking and mapping algorithms are implemented, allowing the development of software that will improve the quality of analysis over manual assignment alone. This software is available at http://bioinformatics.chem.louisville.edu.
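The HN = CA/6 conversion described above is simple enough to sketch directly. The peak representation (a dictionary of shifts keyed by atom name) and the function name simulate_nh_rooted are assumptions made for this example; the actual peak list format passed to AutoAssign is not specified here.

```python
def simulate_nh_rooted(peaks):
    """Return copies of the peaks with an artificial HN shift equal to CA / 6."""
    simulated = []
    for peak in peaks:
        new_peak = dict(peak)               # leave the input peak untouched
        new_peak["HN"] = peak["CA"] / 6.0   # artificial amide proton shift (ppm)
        simulated.append(new_peak)
    return simulated

# Invented N-CA rooted peaks (ppm).
peaks = [{"N": 120.0, "CA": 54.0, "CB": 19.2}, {"N": 109.5, "CA": 45.0}]
for peak in simulate_nh_rooted(peaks):
    print(peak)
```

Dividing by 6 maps CA shifts between roughly 42 and 66 ppm onto 7 to 11 ppm, which is presumably why this factor gives values that resemble amide proton shifts while preserving the relative dispersion of the CA dimension.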
Fig. 3

Automated resonance assignments of β1 immunoglobulin binding domain of protein G. Resonances derived from intra experiments are indicated in red. Resonances derived from sequential experiments are indicated in blue

Below is the link to the electronic supplementary material. Supplementary material 1 (PDF 42 kb)
References

1. H N Moseley; G T Montelione. Automated analysis of NMR assignments and structures for proteins (review). Curr Opin Struct Biol, 1999-10.
2. H N Moseley; D Monleon; G T Montelione. Automatic determination of protein backbone resonance assignments from triple resonance nuclear magnetic resonance data. Methods Enzymol, 2001.
3. Brian E Coggins; Pei Zhou. PACES: Protein sequential assignment by computer-assisted exhaustive search. J Biomol NMR, 2003-06.
4. Hunter N B Moseley; Gurmukh Sahota; Gaetano T Montelione. Assignment validation software suite for the evaluation and presentation of protein resonance assignment data. J Biomol NMR, 2004-04.
5. C Bartels; M Billeter; P Güntert; K Wüthrich. Automated sequence-specific NMR assignment of homologous proteins using the program GARANT. J Biomol NMR, 1996-05.
6. J Pauli; M Baldus; B van Rossum; H de Groot; H Oschkinat. Backbone and side-chain 13C and 15N signal assignments of the alpha-spectrin SH3 domain by magic angle spinning solid-state NMR at 17.6 Tesla. Chembiochem, 2001-04-02.
7. Lindsay J Sperling; Deborah A Berthold; Terry L Sasser; Victoria Jeisy-Scott; Chad M Rienstra. Assignment strategies for large proteins by magic-angle spinning NMR: the 21-kDa disulfide-bond-forming enzyme DsbA. J Mol Biol, 2010-04-13.
8. M Leutner; R M Gschwind; J Liermann; C Schwarz; G Gemmecker; H Kessler. Automated backbone assignment of labeled proteins using the threshold accepting algorithm. J Biomol NMR, 1998-01.
9. Tatyana I Igumenova; A Joshua Wand; Ann E McDermott. Assignment of the backbone resonances for microcrystalline ubiquitin. J Am Chem Soc, 2004-04-28.
10. W Trent Franks; Kathryn D Kloepper; Benjamin J Wylie; Chad M Rienstra. Four-dimensional heteronuclear correlation experiments for chemical shift assignment of solid proteins. J Biomol NMR, 2007-08-09.