NQ-Flipper: recognition and correction of erroneous asparagine and glutamine side-chain rotamers in protein structures.

Christian X Weichenberger, Manfred J Sippl.

Abstract

The current Protein Data Bank (PDB) contains about 40 000 protein structures with approximately half a million incorrect atom positions resulting from erroneously assigned asparagine (Asn) and glutamine (Gln) rotamers. These errors affect applications in protein structure analysis, modeling and docking and therefore the detection, correction and prevention of such errors is highly desirable. We present NQ-Flipper, a web service based on mean force potentials to automatically detect and correct erroneous Asn and Gln rotamers. The service accepts protein structure files formatted in PDB style or PDB codes. For an Asn/Gln side-chain amide NQ-Flipper computes the total interaction energy with the surrounding atoms as the sum of pairwise atom-atom interaction energies. The energy difference between the original and the alternative rotamers identifies the correct configuration of the amide group. The web service lists the interaction energies of all Asn/Gln residues found in a PDB file and shows the structure and offending residues in an interactive 3D viewer. The corrected protein structure is available for download in various compression formats. The web service is accessible at http://flipper.services.came.sbg.ac.at.

Year:  2007        PMID: 17478502      PMCID: PMC1933125          DOI: 10.1093/nar/gkm263

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The side-chains of the amino acids asparagine (Asn) and glutamine (Gln) terminate in an amide group. The amide group may form several hydrogen bonds with the surrounding atoms and thus frequently plays an important role in protein structure stability and in intermolecular interactions (1–8). A single amide group may form up to four hydrogen bonds, two donated by the nitrogen and two accepted by the oxygen. Hence, an incorrect configuration generally results in highly unfavorable interaction energies, and in view of the refinement protocols used in X-ray analysis, which frequently employ energy calculations, the occurrence of errors may seem unlikely. Nevertheless, the average error rate found in the current Protein Data Bank (PDB) (9) is of the order of 20% (10–14).
The high error rate arises from limitations encountered in X-ray analysis. The amide nitrogen and oxygen atoms have similar electron densities and are indistinguishable in electron density maps at moderate and low resolutions, and it is generally thought that structures of higher resolution (1.5 Å or better) are largely error free. From this point of view the Asn/Gln rotamer problem seems to be a peculiarity of X-ray analysis. However, a similar error rate is found for solution structures of proteins determined by NMR (10).
Presently two web services provide information on Asn/Gln rotamers. The PDBREPORT database/WHATIF server (15) presents a static view of a large number of automatically generated quality scores, among them a listing of incorrect Asn/Gln rotamers. The WHATIF server displays this information but does not provide a mechanism to correct erroneous Asn/Gln rotamers. The MolProbity server (16) is dedicated to protein structure analysis and offers a variety of interactive tools to validate either an uploaded coordinate file or a structure from the PDB. One special subtask is the validation and correction of Asn/Gln rotamers by adding hydrogens and picking the rotamer with fewer steric clashes.
Here we present NQ-Flipper, a web service for interactive validation and correction of Asn/Gln amide rotamers. The server operates in three steps. A protein coordinate file in PDB format is uploaded and scores based on potentials of mean force (10, 17) are computed. In the second step, the results are displayed in a table where incorrect Asn/Gln rotamers are marked and the respective structure is shown in an interactive 3D Java applet (http://www.jmol.org) where any offending Asn/Gln residues are highlighted. In the third step a coordinate file with corrected atom positions is produced, which may be downloaded from the server in various compression formats. If desired any changes suggested by the server may be edited and overruled by the user.

METHODS

NQ-Flipper employs knowledge-based potentials of mean force (17–20) derived from complete crystal structures. The atom types consist of all non-hydrogen atoms of standard amino acids found in the ATOM records of PDB files, with the backbone atoms N, Cα, C and O of each individual amino acid treated as distinct atom types. Potentials of mean force do not require hydrogen atom positions. This is an advantage over potential functions where hydrogen atoms are needed for energy evaluation. Since hydrogen atoms are generally not visible in electron densities, their coordinates have to be inferred from the attached heavy atoms. Derived hydrogen positions therefore do not provide new information, and since such positions are frequently ambiguous they may actually falsify experimental data. Rare atom types frequently found in ligands and non-standard groups, as well as water molecules, whose positions are frequently unreliable, are presently omitted from NQ-Flipper calculations. The mean force potentials are derived from a non-redundant database of protein structures and are refined to a stable self-consistent set of potentials by repeated rotamer correction and potential recompilation (10).
For any two atoms k and i of atom types a and b separated by a distance r, the mean force potential is given by ε(a,b,r). The interaction at r is attractive if ε(a,b,r) < 0 and repulsive if ε(a,b,r) > 0. We denote by R1 the original rotamer found in crystal structures and by N1 and O1 the associated side-chain amide atoms. The total interaction energy is computed by
$$\varepsilon(R_1) = \sum_i \left[ \varepsilon(a_k, b_i, r_{ki}) + \varepsilon(a_l, b_i, r_{li}) \right]$$
where k = N1 and l = O1, a_k, a_l and b_i denote the atom types of atoms k, l and i, r_ki and r_li are the respective distances, and the summation is over all atoms i. For the alternative rotamer R2 the nitrogen and oxygen atoms swap their positions and the respective energy ε(R2) is computed analogously with k = N2 and l = O2. The energy difference Δε := ε(R1) − ε(R2) serves as a score and indicates the choice of the correct rotamer. The probability, or expected relative occupancy, of R1 is derived from the energy difference.
A probability close to unity indicates that R1, the configuration found in the crystal structure, corresponds to the correct rotamer (11). Conversely, when this probability approaches zero the alternative rotamer R2 is highly favored and the configuration found in the crystal structure is incorrect. Probabilities between 0.1 and 0.9 indicate that both rotamers may be occupied and they most likely coexist in the crystal structure. A comparison with the reference data set of Word et al. results in an overall accuracy of 95.8% (10, 11).
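The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_potential` is a hypothetical stand-in for the real knowledge-based potential ε(a,b,r), atom types are reduced to element symbols, and the two-state Boltzmann form used in `occupancy_r1` (with Δε taken in units of kT) is an assumption, since the exact expression is not reproduced in the text above.

```python
import math

def toy_potential(type_a, type_b, r):
    """Hypothetical stand-in for eps(a, b, r); NOT the NQ-Flipper potential.

    Negative values are attractive, positive values repulsive. Within 3.5 A,
    N...O contacts are rewarded and like pairs (N...N, O...O) are penalized.
    """
    if r > 3.5:
        return 0.0
    return -2.0 if {type_a, type_b} == {"N", "O"} else 2.0

def amide_energy(n_pos, o_pos, environment, potential=toy_potential):
    """Sum pairwise energies of the amide N and O with all surrounding atoms i."""
    total = 0.0
    for atom_type, pos in environment:
        total += potential("N", atom_type, math.dist(n_pos, pos))
        total += potential("O", atom_type, math.dist(o_pos, pos))
    return total

def delta_epsilon(n_pos, o_pos, environment, potential=toy_potential):
    """Score eps(R1) - eps(R2); R2 has the N and O positions swapped."""
    e_r1 = amide_energy(n_pos, o_pos, environment, potential)
    e_r2 = amide_energy(o_pos, n_pos, environment, potential)
    return e_r1 - e_r2

def occupancy_r1(delta_eps):
    """Assumed two-state Boltzmann occupancy of R1 (delta_eps in units of kT)."""
    return 1.0 / (1.0 + math.exp(delta_eps))

# Made-up environment: the amide N faces a nitrogen, the amide O faces an oxygen.
environment = [("N", (3.5, 0.0, 0.0)), ("O", (-3.5, 0.0, 0.0))]
score = delta_epsilon((1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), environment)
print(score, occupancy_r1(score))
```

With this toy environment the original rotamer places like atoms next to each other, so the score is positive and the assumed occupancy of R1 is close to zero, which matches the interpretation of probabilities given above.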

WEB SERVER USAGE

The NQ-Flipper web service is a computational tool to validate and correct Asn/Gln rotamers in protein structures. Rotamers are reported with their associated Δε-score, and erroneous rotamers are replaced by their corrected conformations, resulting in a separate coordinate file available for download. Protein structures are entered on the main page together with a small set of parameters that the user may optionally change, as described in the following paragraphs.

PDB code

A valid PDB four letter code specifies the protein structure to be processed. The repository of coordinate files maintained by NQ-Flipper is concurrent with PDB (9).

File name

Any coordinate file compliant with the PDB file format can be uploaded. When determining a new structure, this allows rotamer validation at any stage of the crystal structure refinement process. Crystallographic symmetry is used to generate the complete crystal structure, so that rotamer assignment is based on the complete structure. Valid input file compression formats include gzip and Unix compress.

Model number

Computations for coordinate files containing more than one model are restricted to a single model identified by this number.

Altloc indicator

Treatment of alternate location (altloc) indicators is similar to model numbers in that coordinates are retrieved for a certain altloc character only. The usage of altloc indicators is not clearly defined by PDB. In extreme cases all possible combinations of atoms with alternate locations may have to be considered. Depending on the particular PDB file this may result in an enormous number of possible combinations. A consistent treatment of alternate locations requires the submission of a complete model for each variant.
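The selection of a single model and a single altloc character can be illustrated with a short Python sketch. It is not the server's code: it relies only on the fixed-column PDB layout (record name in columns 1–6, altloc indicator in column 17), and it assumes that atoms with a blank altloc field are kept alongside those carrying the requested character. The helper `atom_line` is a hypothetical convenience for building well-aligned example records.

```python
def atom_line(serial, name, altloc, resname, chain, resseq, x, y, z):
    """Hypothetical helper: build a minimal, column-aligned ATOM record."""
    return (f"ATOM  {serial:5d} {name:<4s}{altloc:1s}{resname:>3s} {chain:1s}"
            f"{resseq:4d}    {x:8.3f}{y:8.3f}{z:8.3f}")

def select_atoms(pdb_lines, model=1, altloc="A"):
    """Keep ATOM records of one model whose altloc is blank or the requested one."""
    selected = []
    current_model = 1  # files without MODEL records are treated as model 1
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "MODEL":
            current_model = int(line[6:].split()[0])
        elif record == "ATOM" and current_model == model:
            if line[16] in (" ", altloc):  # column 17 holds the altloc character
                selected.append(line)
    return selected

# Made-up coordinates for illustration only.
lines = [
    "MODEL        1",
    atom_line(1, " CA ", " ", "ASN", "A", 20, 0.0, 0.0, 0.0),
    atom_line(2, " ND2", "A", "ASN", "A", 20, 1.0, 0.0, 0.0),
    atom_line(3, " ND2", "B", "ASN", "A", 20, 1.2, 0.3, 0.0),
    "ENDMDL",
    "MODEL        2",
    atom_line(4, " CA ", " ", "ASN", "A", 20, 0.1, 0.0, 0.0),
    "ENDMDL",
]
kept = select_atoms(lines, model=1, altloc="A")
print(len(kept))
```

Picking one altloc character per file is simple but, as noted above, it does not guarantee a consistent model when a file combines alternate locations in unusual ways.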

Threshold

A threshold v applied to Δε-scores is used to distinguish rotamers with single and multiple occupations. The larger the value of v the larger is the number of Asn/Gln amides considered to occupy both rotamers. The NQ-Flipper web page provides a statistical analysis of the agreement of NQ-Flipper results with a refined reference data set (12) as a function of the threshold v.
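The role of the threshold can be made concrete with a small Python sketch that applies the rule to the Δε values reported for 1mc2 in Figure 1. The function name `classify` and the category labels are illustrative choices, and the treatment of scores exactly equal to ±v (counted here as ambiguous) is an assumption, since the text only specifies the strict inequalities.

```python
def classify(delta_eps, v=6.0):
    """Classify a rotamer from its score; v = 6 is the default threshold."""
    if delta_eps > v:
        return "incorrect"   # alternative rotamer R2 is favored
    if delta_eps < -v:
        return "correct"     # original rotamer R1 is favored
    return "ambiguous"       # both rotamers may be occupied

# Scores quoted in the Figure 1 caption for PDB entry 1mc2.
scores = {
    "Asn-1114": 43.3,
    "Asn-1088": 39.9,
    "Asn-1020": 12.3,
    "Gln-1133": -5.1,
    "Asn-1016": -24.0,
}

for residue, score in scores.items():
    print(residue, classify(score))

# A larger threshold widens the ambiguous band.
print(classify(scores["Asn-1020"], v=15.0))
```

Raising v from 6 to 15 moves Asn-1020 from the incorrect class into the ambiguous band, which illustrates why a larger threshold increases the number of amides considered to occupy both rotamers.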

Rotate/Flip

Coordinates for the alternative rotamer atoms are derived by a 180° rotation about the Cβ–Cγ (Asn) or Cγ–Cδ (Gln) axis. This preserves bond angles and distances, which is not guaranteed for the ‘Flip’ option, where only atom identities are swapped. The latter option has been applied to correct Asn/Gln rotamers in the reference set (12). It is offered here for comparison and for the computation of the statistical analysis, but its use for correcting Asn/Gln rotamers is strongly discouraged, since it results in improper covalent geometry of the amide atoms.
For a moderately sized protein the Asn/Gln rotamers are validated within seconds. The results are presented in tabular form listing all Asn/Gln residues sorted and color-coded by their associated Δε-score. Entries highlighted in red refer to rotamers with unfavorable interaction energies. An interactive 3D viewer based on Jmol (http://www.jmol.org) displays the Cα backbone trace of the protein structure with side-chain atoms of incorrect rotamers rendered as spheres. Multiple occupancies (|Δε| < v) are indicated in orange and are left in their original conformation. Residues with side-chain amides closer than 8 Å to atom centers of non-standard groups are colored in blue. These are frequently residues participating in functional sites and may therefore be particularly important. The assignment produced by NQ-Flipper may be edited by the user. Corrected coordinate files can be downloaded in various compression formats.
Figure 1 provides an example of acutohaemolysin, PDB code 1mc2 (alternate location indicator A) (21), a phospholipase A2 at a resolution of 0.85 Å directly solved by dual-space Shake-and-Bake refinement. The structure contains eleven Asn/Gln residues. Of these, NQ-Flipper flags three rotamers as erroneous because they disagree with empirical statistics, a finding that is also in line with basic physico-chemical principles.
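The difference between the ‘Rotate’ and ‘Flip’ options described above can be sketched in Python. The coordinates below are idealized, made-up values for an Asn side-chain (a C=O bond of about 1.23 Å and a C–N bond of about 1.33 Å), and the function `rotate_180` is an illustrative implementation of a half-turn about an axis, not code taken from NQ-Flipper.

```python
import math

def rotate_180(p, a, b):
    """Rotate point p by 180 degrees about the axis through points a and b."""
    axis = [bi - ai for ai, bi in zip(a, b)]
    rel = [pi - ai for pi, ai in zip(p, a)]
    # Foot of the perpendicular from p onto the axis.
    t = sum(r * c for r, c in zip(rel, axis)) / sum(c * c for c in axis)
    foot = [ai + t * c for ai, c in zip(a, axis)]
    # A half-turn mirrors p through its foot point on the axis.
    return tuple(2.0 * f - pi for f, pi in zip(foot, p))

# Idealized Asn side-chain atoms (made-up coordinates, in Angstrom).
cb = (0.0, 0.0, 0.0)
cg = (1.52, 0.0, 0.0)
od1 = (2.135, 1.065, 0.0)    # about 1.23 A from CG
nd2 = (2.185, -1.152, 0.0)   # about 1.33 A from CG

# 'Rotate': both amide atoms turn about the CB-CG axis; bond lengths are kept.
rot_od1 = rotate_180(od1, cb, cg)
rot_nd2 = rotate_180(nd2, cb, cg)

# 'Flip': only the atom identities are exchanged; positions stay where they were.
flip_od1, flip_nd2 = nd2, od1

print(math.dist(cg, od1), math.dist(cg, rot_od1), math.dist(cg, flip_od1))
```

In this example the rotated oxygen keeps its original distance to Cγ, whereas the flipped oxygen inherits the longer C–N distance, which is the kind of improper covalent geometry the text warns about.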
Figure 1.

NQ-Flipper analysis of Asn/Gln rotamers. We provide an example of the application of NQ-Flipper to acutohaemolysin (PDB code 1mc2) determined at 0.85 Å resolution (21). The PDB file contains a single chain (identifier A) of 122 amino acid residues. The molecule contains eleven Asn/Gln residues. Three of these residues are flagged as incorrect rotamers, which is confirmed by a detailed examination based on simple physico-chemical principles. The structure is determined to a very high resolution where errors in rotamer configuration may seem unlikely. However, the result shown here is quite common. Generally protein structures, even at very high resolution, contain several incorrect rotamers. (a) The protein backbone Cα trace with unfavorable rotamers rendered as a ball-and-stick model. The crystal structure contains two heterogeneous groups termed IPA. (b) Table listing all Asn/Gln rotamers and their respective Δε-values as displayed on the NQ-Flipper web page. Coordinates were derived for alternate location indicator A. The software detects three Asn residues with unfavorable Δε-scores, i.e. rotamers with Δε > v, where the threshold value is set to v = 6 (the default). These incorrect rotamers are highlighted in red and include Asn-1114 (Δε = +43.3), Asn-1088 (Δε = +39.9) and Asn-1020 (Δε = +12.3). One residue, Gln-1133 (Δε = -5.1), is considered ‘ambiguous’ since its Δε-score is in the range where both rotamers are occupied with non-negligible probability (|Δε| < 6). Such residues of multiple occupancy are colored orange. All remaining Asn/Gln residues have Δε-scores below -6, indicating correct rotamers. Residue Asn-1016 (Δε = -24.0), shown in blue, is in the vicinity of an isopropanol buffer molecule, IPA. The minimum distance between atoms from the IPA molecule and the respective amide group in Asn-1016 is 5.7 Å. The residue is colored blue to indicate that it is close to a heterogeneous group, so that some important interactions may not be included in the score, although even in such cases a score greater than v generally indicates an incorrect rotamer. Figure 1c and d provide a detailed view of two of the incorrect rotamers in the conformation R1, as found in the PDB file, interacting with their respective chemical environment. The atoms are colored by atom type: carbon, gray; nitrogen, blue; oxygen, red. The dashed lines indicate distances in Å between atoms. (c) In the original conformation residue Asn-1020 (Δε = +12.3) is unable to participate in a hydrogen bond network. In the alternative conformation R2 the amide nitrogen and oxygen atoms swap positions, enabling hydrogen bond formation between the amide nitrogen and the carbonyl oxygen of Asn-1016, the amide oxygen and the side-chain nitrogen of Lys-1015 and a nitrogen of the guanidinium group of Arg-1118. Note that by flipping the amide group several ‘anti-hydrogen bonds’ of very high energy are replaced by genuine hydrogen bonds of very low energy. (d) Residue Asn-1088 (Δε = +39.9), where the amide nitrogen and oxygen atoms have unfavorable interactions with the respective main chain nitrogen and oxygen atoms of Ser-1074. The amide oxygen of this residue additionally interacts unfavorably with Gln-1093, a correct rotamer since it forms a hydrogen bond to the carbonyl group of Gly-1086. Flipping Asn-1088 to rotamer R2 results in the formation of three hydrogen bonds. Figure 1a, c and d were generated using PyMOL (http://pymol.sourceforge.net).


CONCLUSION

The NQ-Flipper web service provides an interactive tool for the detection and correction of unfavorable Asn/Gln rotamers utilizing knowledge-based potentials of mean force derived from high resolution protein structures. Except for very large crystal structures the response time of the server is immediate (i.e. within seconds). The NQ-Flipper pages provide an easy to use and robust interface. Different colors relate Δε-scores to correct and incorrect rotamers, and to amides having multiple occupations. Any assignment made by the program can be edited by the user and the corrected coordinate file can be downloaded. The results obtained from NQ-Flipper largely agree with data on correct and incorrect Asn/Gln rotamers validated by experts (10, 11). The software is available freely as a web service at http://flipper.services.came.sbg.ac.at. For the integration of NQ-Flipper with X-ray analysis or NMR refinement protocols, a stand-alone Linux version is available for download on the home page. Protein structures available from the PDB database are analyzed by entering the PDB four letter code, model number and alternate location indicator of the respective file. The service may also be accessed without entering data in the HTML form by directly supplying the PDB four letter code and optional model numbers and alternate location indicators as part of the URL. For example, PDB code 1ra9 is validated using the URL http://flipper.services.came.sbg.ac.at/cgi-bin/flipper.php?PDBCode=1ra9. A detailed description is provided by the NQ-Flipper online help page.
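The URL-based access described above can be sketched with the Python standard library. Only the `PDBCode` parameter appears in the text; the parameter names for model number and alternate location indicator are not given here, so they are deliberately left out rather than guessed. The function name `flipper_url` and the input check are illustrative assumptions.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the text above.
BASE_URL = "http://flipper.services.came.sbg.ac.at/cgi-bin/flipper.php"

def flipper_url(pdb_code):
    """Build a validation URL for a four-character PDB code."""
    if len(pdb_code) != 4 or not pdb_code.isalnum():
        raise ValueError(f"not a four-character PDB code: {pdb_code!r}")
    return f"{BASE_URL}?{urlencode({'PDBCode': pdb_code})}"

print(flipper_url("1ra9"))
```

Building the query string with `urlencode` rather than by string concatenation keeps the sketch safe if further parameters documented on the online help page are added later.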
  20 in total

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  MOLPROBITY: structure validation and all-atom contact analysis for nucleic acids and their complexes.

Authors:  Ian W Davis; Laura Weston Murray; Jane S Richardson; David C Richardson
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

3.  Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge-based prediction of local structures in globular proteins.

Authors:  M J Sippl
Journal:  J Mol Biol       Date:  1990-06-20       Impact factor: 5.469

4.  Structural basis for oligosaccharide recognition by Pyrococcus furiosus maltodextrin-binding protein.

Authors:  A G Evdokimov; D E Anderson; K M Routzahn; D S Waugh
Journal:  J Mol Biol       Date:  2001-01-26       Impact factor: 5.469

5.  New domain motif: the structure of pectate lyase C, a secreted plant virulence factor.

Authors:  M D Yoder; N T Keen; F Jurnak
Journal:  Science       Date:  1993-06-04       Impact factor: 47.728

6.  Self-consistent assignment of asparagine and glutamine amide rotamers in protein crystal structures.

Authors:  Christian X Weichenberger; Manfred J Sippl
Journal:  Structure       Date:  2006-06       Impact factor: 5.006

7.  X-ray structures of the maltose-maltodextrin-binding protein of the thermoacidophilic bacterium Alicyclobacillus acidocaldarius provide insight into acid stability of proteins.

Authors:  Karsten Schäfer; Ulrika Magnusson; Frank Scheffel; André Schiefner; Mats O J Sandgren; Kay Diederichs; Wolfram Welte; Anja Hülsmann; Erwin Schneider; Sherry L Mowbray
Journal:  J Mol Biol       Date:  2004-01-02       Impact factor: 5.469

8.  The crystal structure of a novel, inactive, lysine 49 PLA2 from Agkistrodon acutus venom: an ultrahigh resolution, AB initio structure determination.

Authors:  Qun Liu; Qingqiu Huang; Maikun Teng; Charles M Weeks; Christian Jelsch; Rongguang Zhang; Liwen Niu
Journal:  J Biol Chem       Date:  2003-07-19       Impact factor: 5.157

9.  Structures of active conformations of Gi alpha 1 and the mechanism of GTP hydrolysis.

Authors:  D E Coleman; A M Berghuis; E Lee; M E Linder; A G Gilman; S R Sprang
Journal:  Science       Date:  1994-09-02       Impact factor: 47.728

10.  Structural and functional roles of asparagine 175 in the cysteine protease papain.

Authors:  T Vernet; D C Tessier; J Chatellier; C Plouffe; T S Lee; D Y Thomas; A C Storer; R Ménard
Journal:  J Biol Chem       Date:  1995-07-14       Impact factor: 5.157
