Adilakshmi Dwarasala, Mehdi Rahimi, John L Markley, Woonghee Lee.
Abstract
The heightened dipolar interactions in solids render solid-state NMR (ssNMR) spectra more difficult to interpret than solution NMR spectra. On the other hand, ssNMR does not suffer from the severe molecular weight limitations of solution NMR. In recent years, ssNMR has undergone rapid technological developments that have enabled structure-function studies of increasingly large biomolecules, including membrane proteins. Current methodology includes stable isotope labeling schemes, non-uniform sampling with spectral reconstruction, faster magic angle spinning, and innovative pulse sequences that capture different types of interactions among spins. However, computational tools for the analysis of complex ssNMR data from membrane proteins and other challenging protein systems have lagged behind those for solution NMR. Before a structure can be determined, thousands of signals from individual types of multidimensional ssNMR spectra of samples, which may differ in isotopic composition, must be recognized, correlated, categorized, and eventually assigned to atoms in the chemical structure. To automate these tedious steps, we have developed an assignment algorithm for ssNMR spectra called "ssPINE". The ssPINE software accepts the sequence of the protein plus peak lists from a variety of ssNMR experiments as inputs and offers automated backbone and side-chain assignments. The alpha version of ssPINE, which we describe here, is freely available through a web submission form.
Keywords: MAS-NMR; assignment; automation; membrane proteins; solid-state NMR; ssPINE
Year: 2022; PMID: 36135853; PMCID: PMC9503581; DOI: 10.3390/membranes12090834
Source DB: PubMed; Journal: Membranes (Basel); ISSN: 2077-0375
Figure 1. Spin system matrix assembly in ssPINE. (a) Peaks from the strip of the NCOCACB experiment containing the CA(i − 1) and CB(i − 1) resonances are inserted in the row of the table corresponding to the ith residue (IDX). Note that the peak is selected using the CO(i − 1) and N(i) values in the root information. (b) Similarly, peaks from the strip of the NCACB experiment containing the CA(i) and CB(i) resonances are inserted in the row of the table corresponding to the ith residue. The N(i) and CA(i) values in the root information are used to select a peak from NCACB. (c) The process is repeated for all residues in the peptide sequence.
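The row-filling step described in the Figure 1 caption can be illustrated with a minimal Python sketch. This is not the ssPINE implementation: the peak tuples, the dictionary keys (`N`, `CA`, `CO_prev`, `prev_CX`, `own_CX`), the function name `assemble_rows`, the tolerance values, and all chemical shifts are hypothetical and chosen only for illustration. Each row holds root shifts for N(i), CA(i), and CO(i − 1); NCOCACB peaks matching N(i) and CO(i − 1) supply the carbons of residue i − 1, and NCACB peaks matching N(i) and CA(i) supply the carbons of residue i.

```python
def within(a, b, tol):
    """Return True when two chemical shifts (ppm) agree within a tolerance."""
    return abs(a - b) <= tol


def assemble_rows(roots, ncocacb, ncacb, tol_n=0.5, tol_c=0.3):
    """Fill one table row per root entry (hypothetical data layout).

    roots   : list of dicts with root shifts 'N', 'CA', 'CO_prev'
    ncocacb : list of (N(i), CO(i-1), CX(i-1)) peaks
    ncacb   : list of (N(i), CA(i), CX(i)) peaks
    tol_n, tol_c : assumed matching tolerances in ppm (not from the source)
    """
    rows = []
    for idx, root in enumerate(roots, start=1):
        # (a) NCOCACB strip selected by N(i) and CO(i-1): carbons of residue i-1
        prev_cx = [cx for (n, co, cx) in ncocacb
                   if within(n, root["N"], tol_n)
                   and within(co, root["CO_prev"], tol_c)]
        # (b) NCACB strip selected by N(i) and CA(i): carbons of residue i
        own_cx = [cx for (n, ca, cx) in ncacb
                  if within(n, root["N"], tol_n)
                  and within(ca, root["CA"], tol_c)]
        rows.append({"IDX": idx, **root,
                     "prev_CX": sorted(prev_cx), "own_CX": sorted(own_cx)})
    return rows  # (c) one row per root entry


# Made-up shifts for two root entries
roots = [{"N": 120.0, "CA": 55.0, "CO_prev": 175.0},
         {"N": 115.0, "CA": 60.0, "CO_prev": 172.0}]
ncocacb = [(120.1, 175.1, 58.0), (120.0, 174.9, 30.0), (115.0, 172.0, 55.1)]
ncacb = [(120.0, 55.1, 55.0), (119.9, 55.0, 40.0),
         (115.1, 60.0, 60.1), (115.0, 59.9, 70.0)]

rows = assemble_rows(roots, ncocacb, ncacb)
for row in rows:
    print(row)
```

The sketch only groups peaks into rows; it does not decide which matched carbon is CA and which is CB, and it does not score competing matches.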
ssNMR experiments supported by ssPINE with their dimensionality and connectivity profiles. CX(i) represents carbon A, B, D, E, G, or H atoms of the ith residue; N(i) represents the nitrogen atom of the ith residue; and CO(i − 1) represents the carbon atom of the carboxyl group of the preceding residue. The minimum set of experiments needed is indicated by asterisks.
| Experiment | Dimension | Profile |
|---|---|---|
| CC * | 2D | CX/O(i)-CX/O(i) |
| NCA * | 2D | N(i)-CA(i) |
| NCACB | 2D | N(i)-CA/B(i) |
| NCO * | 2D | N(i)-CO(i − 1) |
| NCACO | 3D | N(i)-CA(i)-CO(i) |
| NCACB | 3D | N(i)-CA(i)-CA/B(i) |
| NCACX * | 3D | N(i)-CA(i)-CX(i) |
| NCOCX * | 3D | N(i)-CO(i − 1)-CX/C(i − 1) |
| NCOCA | 3D | N(i)-CO(i − 1)-CA(i − 1) |
| NCOCACB | 3D | N(i)-CO(i − 1)-CA/B(i − 1) |
| CANCO | 3D | CA(i)-N(i)-CO(i − 1) |
| CANCOCX * | 3D | CA(i)-N(i)-CX/O(i − 1) |
| CANCOCA | 3D | CA(i)-N(i)-CA/O(i − 1) |
| CANCOCACB | 3D | CA(i)-N(i)-CO/A/B(i − 1) |
* Minimum experiments to run ssPINE.
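As a small illustration of how the asterisked entries might be used before submitting a job, the sketch below checks a set of peak-list names against the experiments marked with an asterisk in the table. The function name `missing_minimum` and the idea of identifying peak lists by experiment name are assumptions made for this example, not part of the ssPINE interface.

```python
# Experiments marked with an asterisk in the table above
MINIMUM_EXPERIMENTS = {"CC", "NCA", "NCO", "NCACX", "NCOCX", "CANCOCX"}


def missing_minimum(provided_names):
    """Return the asterisked experiments absent from the provided names.

    provided_names : iterable of experiment names (hypothetical input format)
    """
    provided = {name.strip().upper() for name in provided_names}
    return sorted(MINIMUM_EXPERIMENTS - provided)


# Example: only the 2D experiments have been collected so far
still_needed = missing_minimum(["CC", "NCA", "NCO"])
print(still_needed)
```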
Figure 2. Bar graphs indicating the correct assignment probability (p) for each residue of GBI resulting from ssPINE analysis. Green indicates p greater than 0.99; cyan indicates p = 0.85–0.99; yellow indicates p = 0.5–0.84; red indicates p less than 0.5; and gray indicates no assignment (not seen with these test sets). (a) Unrefined GBI data as input. (b) Refined GBI data as input.
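The color scheme in the Figure 2 caption amounts to binning each residue's probability. A minimal sketch follows; the function name `confidence_color` is hypothetical, and the handling of values that fall between the stated ranges (for example, 0.845) is an assumption, since the caption does not specify it.

```python
def confidence_color(p):
    """Map an assignment probability to the Figure 2 color categories.

    p is a float in [0, 1], or None when the residue has no assignment.
    Assumed boundary handling: values from 0.85 up to and including 0.99
    are cyan; values from 0.5 up to (but not including) 0.85 are yellow.
    """
    if p is None:
        return "gray"
    if p > 0.99:
        return "green"
    if p >= 0.85:
        return "cyan"
    if p >= 0.5:
        return "yellow"
    return "red"


# Made-up probabilities for a few residues
colors = [confidence_color(p) for p in (0.995, 0.90, 0.60, 0.30, None)]
print(colors)
```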
Figure 3. Results from ssPINE analysis of synthetic ssNMR data as averages for the 82 proteins studied. (a) Chemical shift assignment probabilities returned by ssPINE for all assignment candidates (x-axis) versus assignment type (y-axis). All (dashed black), given (dashed blue), and correct (solid green) assignments are represented by the numbers on the left side, whereas the incorrect assignments (solid red) are represented by the numbers on the right side. (b) Data from the assignment candidate for each protein with the highest assignment probability. Completeness (solid blue) and correctness (solid green) are plotted as a function of that assignment probability.