
NMR Exchange Format: a unified and open standard for representation of NMR restraint data.

Aleksandras Gutmanas, Paul D Adams, Benjamin Bardiaux, Helen M Berman, David A Case, Rasmus H Fogh, Peter Güntert, Pieter M S Hendrickx, Torsten Herrmann, Gerard J Kleywegt, Naohiro Kobayashi, Oliver F Lange, John L Markley, Gaetano T Montelione, Michael Nilges, Timothy J Ragan, Charles D Schwieters, Roberto Tejero, Eldon L Ulrich, Sameer Velankar, Wim F Vranken, Jonathan R Wedell, John Westbrook, David S Wishart, Geerten W Vuister.

Year:  2015        PMID: 26036565      PMCID: PMC4546829          DOI: 10.1038/nsmb.3041

Source DB:  PubMed          Journal:  Nat Struct Mol Biol        ISSN: 1545-9985            Impact factor:   15.369


We present here a unified, easily adaptable, open-source NMR exchange format (NEF) for NMR restraints and associated data. Atomic-resolution, three-dimensional structures of macromolecules have been determined by NMR spectroscopy since the late 1980s. In 2013, the number of NMR-derived structures in the Protein Data Bank (PDB)[1] passed the milestone of 10,000 entries (Fig. 1), and they currently account for approximately 10% of the total number of structures in the PDB. To improve the quality and integrity of the archive, the Worldwide Protein Data Bank (wwPDB)[2], the consortium that manages the PDB archive, made the deposition of the underlying experimental data mandatory and established expert validation task forces (VTFs) to provide consensus recommendations for validating the structures and accompanying experimental data for entries determined by X-ray, NMR or cryo-EM techniques. The initial recommendations of the NMR VTF[3] have been implemented in a software pipeline that will be used to produce validation reports during structure deposition and annotation.
Figure 1

Growth in the number of NMR entries in the PDB archive.

NMR data and restraints are diverse in their nature: they are typically derived from various kinds of NMR experiments, and they may be interpreted differently by different software programs, even when the same spectral data are used as input. In addition, almost all NMR programs rely on a variety of formats, thus necessitating conversions when multiple programs are used in structure determination and analysis, with a concomitant risk of information loss or misinterpretation. Two software projects, NMR-STAR[4], developed at the Biological Magnetic Resonance Bank (BMRB)[5] with input from the NMR community, and the Collaborative Computational Project for NMR (CCPN)[6], provide systematic and comprehensive data models for storing and accessing NMR data. Unfortunately, neither of these two approaches has been widely adopted by the developers of popular software tools for NMR structure determination, refinement and validation, partly because both data models suffer from substantial and similar drawbacks: their data structures are large (more extensive and more complex than any single program would typically require), and they are not easily and independently adapted and extended for any specific program.
NMR restraint data are currently deposited in a variety of software-specific formats that have to be curated by the BMRB into a common format for deposition in the NMR Restraints Grid (NRG)[7], thus enabling many useful applications. Efforts to develop universal restraint converters have nevertheless been hampered because some restraint formats omit information required by other restraint formats[8], and full parsing of each software-specific format has proven to be impossible. The current situation hampers the proper archiving and use of biomolecular NMR data, and prevents the routine inclusion of NMR restraint validation in the wwPDB NMR validation pipeline.
For these reasons, the wwPDB partners, together with CCPN, organized a series of consultations and two workshops that included developers of key software packages used for NMR structure determination and refinement (Table 1), with the aim of attaining a unified approach to represent NMR restraints and associated data. Together, they agreed on and successfully implemented and tested an NMR data representation, denoted the NEF, and devised a governance structure for its maintenance and further development. Importantly, the different program developers committed to the ambitious goal of making their software capable of both reading and writing NEF-compliant files.
Table 1

Software packages implementing the NEF

Software package | Category | Principal investigator or representative
AMBER | Molecular dynamics (with NMR restraints) | D.A. Case
CYANA | Automated assignment and structure determination | P. Güntert
UNIO | Automation from spectral acquisition to structure | T. Herrmann
CS-ROSETTA | Structure determination from chemical shifts | O. Lange
NMR-STAR converter | Format conversion | J.L. Markley, E.L. Ulrich
ASDP | Automated NOESY cross-peak assignment | G.T. Montelione, Y.J. Huang
PSVS and PDBStat | Structure validation | G.T. Montelione, R. Tejero, Y.J. Huang
ARIA and CNS | Structure determination and refinement | M. Nilges, B. Bardiaux
XPLOR-NIH | Structure determination and refinement | C.D. Schwieters
CCPN FormatConverter | Format conversion | W.F. Vranken
CCPN | Data modeling, spectral analysis, format conversion, integration of other NMR software | G.W. Vuister, R.H. Fogh
CING | Structure validation | G.W. Vuister
CS23D | Structure determination from chemical shifts | D. Wishart
PROSESS and RESPROX | Structure validation | D. Wishart
The detailed specifications of the NEF (https://github.com/NMRExchangeFormat/NEF/) are based on the consensus that emerged during the consultations and workshops: the format accommodates a variety of restraint types and is extensible beyond the common agreed-upon elements, so that new science can be easily incorporated. The NEF is self-contained, so that unambiguous interpretation of the data does not require any auxiliary software-specific files, and is readable by both machines and humans. In addition to the restraints data, NEF requires polymer sequence information and chemical-shift assignments, and allows inclusion of peak lists. A compliant NEF file contains all the data in a single, appropriately sectioned file, implemented with the STAR syntax[9] and controlled by a versioned dictionary of tag names. Developers can extend the standard dictionary to accommodate their own new data or experimental practices, which need not be supported by other software packages, by simply registering an individual dictionary namespace. Thus, the NEF is inherently flexible and extensible, and it allows for unlimited program-specific additional data without the need for any adaptation of the format. Importantly, it has been anticipated that such initially nonstandard additions might evolve into the general practice and be adopted by other programs. A mechanism to incorporate such developments is part of the management of the NEF specification.
All authors of this Correspondence have been involved in the planning and development of the NEF, and they include representatives of all major packages for NMR structure determination, refinement and validation (Table 1). The program developers have agreed to release updated versions of their software capable of handling the NEF by the end of September 2015. After a transition period, the wwPDB partners are expected to accept only NEF-formatted NMR data for deposition into the PDB.
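To make the file layout described above more concrete, the following is a minimal Python sketch of how a program might read a sectioned, STAR-style file, check that the required sections are present, and tell standard tags apart from program-specific ones. It is not the reference implementation. The example text, the category names in `REQUIRED`, the `_nef_` prefix convention and the `_myprog_` namespace are all illustrative assumptions rather than quotations from the NEF dictionary, and the parser handles only simple tag/value pairs (no `loop_` tables, quoted strings or multi-line values). The GitHub specification remains the authority on actual tag names.

```python
# Hypothetical, heavily abbreviated NEF-like text. Tag and category names
# are assumptions for illustration, not copied from the official dictionary.
EXAMPLE = """\
data_example

save_nef_nmr_meta_data
   _nef_nmr_meta_data.sf_category   nef_nmr_meta_data
   _nef_nmr_meta_data.format_name   nmr_exchange_format
save_

save_nef_molecular_system
   _nef_molecular_system.sf_category   nef_molecular_system
save_

save_nef_chemical_shift_list_1
   _nef_chemical_shift_list.sf_category   nef_chemical_shift_list
save_

save_nef_distance_restraint_list_1
   _nef_distance_restraint_list.sf_category   nef_distance_restraint_list
   _myprog_distance_restraint_list.weight     1.0
save_
"""

# Assumed category names for the sections the text calls mandatory
# (polymer sequence and chemical-shift assignments).
REQUIRED = ("nef_molecular_system", "nef_chemical_shift_list")


def parse_saveframes(text):
    """Return {frame_name: {tag: value}} for simple tag/value saveframes."""
    frames = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line == "save_":
            current = None  # end of the current section
        elif line.startswith("save_"):
            current = frames.setdefault(line[len("save_"):], {})
        elif line.startswith("_") and current is not None:
            parts = line.split(None, 1)
            current[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return frames


def missing_sections(frames, required=REQUIRED):
    """List required categories that no saveframe declares."""
    present = {
        value
        for tags in frames.values()
        for tag, value in tags.items()
        if tag.endswith(".sf_category")
    }
    return [category for category in required if category not in present]


def split_by_namespace(frames, standard_prefix="_nef_"):
    """Separate standard tags from program-specific (namespaced) tags."""
    standard, extensions = [], []
    for tags in frames.values():
        for tag in tags:
            if tag.startswith(standard_prefix):
                standard.append(tag)
            else:
                extensions.append(tag)
    return standard, extensions


frames = parse_saveframes(EXAMPLE)
standard, extensions = split_by_namespace(frames)
print("sections:", list(frames))
print("missing required sections:", missing_sections(frames))
print("program-specific tags:", extensions)
```

In this sketch a reader that does not recognise the `_myprog_` namespace can simply skip those tags and still interpret the standard sections, which is the behaviour the extensibility mechanism is meant to allow.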
The efforts presented here show that the biological NMR community is ready to resolve the issues of representation and exchange of experimental NMR data. We encourage developers of current and future NMR software to support the NEF, and we invite the wider community of NMR-software developers and other stakeholders to participate in its development and maintenance.
References

Review 1.  Macromolecular structure determination by NMR spectroscopy.

Authors:  John L Markley; Eldon L Ulrich; William M Westler; Brian F Volkman
Journal:  Methods Biochem Anal       Date:  2003

2.  The Protein Data Bank: a computer-based archival file for macromolecular structures.

Authors:  F C Bernstein; T F Koetzle; G J Williams; E F Meyer; M D Brice; J R Rodgers; O Kennard; T Shimanouchi; M Tasumi
Journal:  J Mol Biol       Date:  1977-05-25       Impact factor: 5.469

3.  Recommendations of the wwPDB NMR Validation Task Force.

Authors:  Gaetano T Montelione; Michael Nilges; Ad Bax; Peter Güntert; Torsten Herrmann; Jane S Richardson; Charles D Schwieters; Wim F Vranken; Geerten W Vuister; David S Wishart; Helen M Berman; Gerard J Kleywegt; John L Markley
Journal:  Structure       Date:  2013-09-03       Impact factor: 5.006

4.  PDBStat: a universal restraint converter and restraint analysis software package for protein NMR.

Authors:  Roberto Tejero; David Snyder; Binchen Mao; James M Aramini; Gaetano T Montelione
Journal:  J Biomol NMR       Date:  2013-07-30       Impact factor: 2.835

5.  The CCPN data model for NMR spectroscopy: development of a software pipeline.

Authors:  Wim F Vranken; Wayne Boucher; Tim J Stevens; Rasmus H Fogh; Anne Pajon; Miguel Llinas; Eldon L Ulrich; John L Markley; John Ionides; Ernest D Laue
Journal:  Proteins       Date:  2005-06-01

6.  The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data.

Authors:  Helen Berman; Kim Henrick; Haruki Nakamura; John L Markley
Journal:  Nucleic Acids Res       Date:  2006-11-16       Impact factor: 16.971

7.  The NMR restraints grid at BMRB for 5,266 protein and nucleic acid PDB entries.

Authors:  Jurgen F Doreleijers; Wim F Vranken; Christopher Schulte; Jundong Lin; Jonathan R Wedell; Christopher J Penkett; Geerten W Vuister; Gert Vriend; John L Markley; Eldon L Ulrich
Journal:  J Biomol NMR       Date:  2009-10-07       Impact factor: 2.835

8.  BioMagResBank.

Authors:  Eldon L Ulrich; Hideo Akutsu; Jurgen F Doreleijers; Yoko Harano; Yannis E Ioannidis; Jundong Lin; Miron Livny; Steve Mading; Dimitri Maziuk; Zachary Miller; Eiichi Nakatani; Christopher F Schulte; David E Tolmie; R Kent Wenger; Hongyang Yao; John L Markley
Journal:  Nucleic Acids Res       Date:  2007-11-04       Impact factor: 16.971

