Glycoconjugate Data Bank: Structures--an annotated glycan structure database and N-glycan primary structure verification service.

Taku Nakahara, Ryo Hashimoto, Hiroaki Nakagawa, Kenji Monde, Nobuaki Miura, Shin-Ichiro Nishimura

Abstract

Glycobiology has been brought to public attention as a frontier in the post-genomic era. Structural information about glycans has been accumulating in the Protein Data Bank (PDB) for years. It has been recognized, however, that there are many questionable glycan models in the PDB. A tool for verifying the primary structures of glycan 3D structures is evidently required, yet there have been no such publicly available tools. The Glycoconjugate Data Bank:Structures (GDB:Structures, http://www.glycostructures.jp) is an annotated glycan structure database, which also provides an N-glycan primary structure (or glycoform) verification service. All the glycan 3D structures are detected and annotated by an in-house program named 'getCARBO'. When an N-glycan is detected in a query coordinate by getCARBO, the primary structure of the glycan is compared with the most similar entry in the glycan primary structure database (KEGG GLYCAN), and unmatched substructure(s) are indicated if observed. The results of getCARBO are stored and presented in GDB:Structures.

Year:  2007        PMID: 17933765      PMCID: PMC2238941          DOI: 10.1093/nar/gkm833

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Glycans (carbohydrate chains) are one of the major classes of biological molecules, like nucleic acids and proteins. There is a substantial amount of structural information about glycan 3D structures in the Protein Data Bank (PDB) (1), and this is the major resource in the structural biology of glycans. Most of these glycan structures coexist with proteins as ligands or modifications of glycoproteins. There are two major classes of protein-linked glycans, namely N-glycans (asparagine-linked glycans) and O-glycans (serine- or threonine-linked glycans). The accuracy of the glycan structures in the PDB has been in question for years. It was reported that there are some hundreds of asparagine-linked N-acetylglucosamines with inaccurate anomeric forms (2,3). Furthermore, ∼30% of saccharide units in the PDB contain errors, mainly in monosaccharide nomenclature (4). So far, there is only one web service for checking the correspondence between the structure and nomenclature of a saccharide unit in a PDB file (5). Recently, Crispin et al. (6) pointed out that glycan structures with unreported primary structure motifs were observed in the PDB. Models of glycan 3D structures built by X-ray crystallography should be supported by glycan primary structure data derived by other methods (NMR, HPLC, mass spectrometry and so on). Our recent comprehensive survey revealed that 13.6% of glycans in the PDB contain substructure(s) that are not found in the glycan primary structure database (7), which suggests that the PDB contains a substantial number of glycans whose primary structures cannot be synthesized by known biological pathways. A tool to verify the glycan primary structures of 3D models is urgently needed. The Glycoconjugate Data Bank:Structures (GDB:Structures) is a database of annotated glycan structures. Glycans in the PDB were detected and annotated by a computer program named ‘getCARBO’.
Furthermore, N-glycans were compared with entries in the glycan primary structure database, KEGG GLYCAN (8). The KEGG GLYCAN entry most similar in structure to the PDB glycan is selected by a glycan structure search function implemented in getCARBO, and unmatched substructure(s) between the glycan found in the PDB and the most similar KEGG GLYCAN entry are marked graphically. GDB:Structures also provides a web service that uses getCARBO to check glycan structures in a user-uploaded structure file. As far as we know, there have been no verification services for glycan primary structures. The widely used site pdb-care (5), which verifies the nomenclature of a saccharide unit, cannot verify a glycan primary structure. GDB:Structures verifies not only monosaccharide structures but also glycan primary structures, and will compensate for the lack of tools for modeling biologically meaningful glycan 3D structures.
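
The comparison step described above can be illustrated with a small sketch. The search algorithm inside getCARBO is not specified in this article, so the following Python code is only an assumption about how such a check might work: each glycan is a tree rooted at the reducing end, the tree is flattened into root-to-residue paths, and the reference with the highest Jaccard similarity is taken as the most similar entry. The reference identifiers (`REF_CORE`, `REF_CHITOBIOSE`) and all function names are hypothetical; they are neither KEGG GLYCAN accessions nor getCARBO functions.

```python
from typing import Dict, List, Set, Tuple

# A glycan tree: (residue name, [(bond to parent, child subtree), ...]).
# The root is the reducing-end residue attached to the protein.
Tree = Tuple[str, list]


def paths(tree: Tree, prefix: str = "") -> Set[str]:
    """Flatten a glycan tree into the set of root-to-residue paths."""
    name, children = tree
    here = prefix + name
    found = {here}
    for bond, child in children:
        found |= paths(child, f"{here}({bond})")
    return found


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Similarity measure chosen for this sketch (an assumption)."""
    return len(a & b) / len(a | b)


def verify(query: Tree, references: Dict[str, Tree]) -> Tuple[str, float, List[str]]:
    """Return the best reference id, its score, and the unmatched query paths."""
    q = paths(query)
    best_id = max(references, key=lambda rid: jaccard(q, paths(references[rid])))
    ref = paths(references[best_id])
    return best_id, jaccard(q, ref), sorted(q - ref)


# Toy reference set (hypothetical ids, not KEGG GLYCAN accessions).
REFERENCES: Dict[str, Tree] = {
    # Man3GlcNAc2, the common N-glycan core.
    "REF_CORE": ("GlcNAc", [("b1-4", ("GlcNAc", [("b1-4", ("Man", [
        ("a1-3", ("Man", [])),
        ("a1-6", ("Man", [])),
    ]))]))]),
    "REF_CHITOBIOSE": ("GlcNAc", [("b1-4", ("GlcNAc", []))]),
}

# Query model in which one mannose branch carries an a1-2 bond instead of a1-3.
QUERY: Tree = ("GlcNAc", [("b1-4", ("GlcNAc", [("b1-4", ("Man", [
    ("a1-2", ("Man", [])),
    ("a1-6", ("Man", [])),
]))]))])

if __name__ == "__main__":
    ref_id, score, unmatched = verify(QUERY, REFERENCES)
    print(ref_id, round(score, 2))
    for path in unmatched:
        print("?", path)  # plays the role of the question mark in the report
```

In this toy case the query is matched to the core structure and the single path ending in the a1-2 mannose is reported as unmatched. A reported path only means that the substructure is absent from the reference set; it does not by itself show that the 3D model is wrong.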

STRUCTURE AND CONTENT OF GDB:STRUCTURES

GDB:Structures runs on the Ruby on Rails framework with the lighttpd web server and the MySQL database management system. The glycan annotation program (getCARBO) was written in Java. The web interface for glycan structure search was developed with Flash ActionScript 2.0. All the glycans in GDB:Structures are classified into three types: ligands, N-glycans and O-glycans. Users can retrieve lists of glycans by specifying the type and length of the glycans. Glycan information can also be accessed by PDB ID. A glycan primary structure search application is also available: users can draw query glycan primary structures on the Flash web interface and obtain a list of similar glycans in GDB:Structures. The contents of GDB:Structures will be updated regularly, and statistics of the current content are also presented. Each glycan information page presents the primary structure of the glycan and annotations for the glycan and its monosaccharide units. For N-glycans, the most similar KEGG GLYCAN entry is also presented. For each monosaccharide unit, the identifier in the PDB, IUPAC nomenclature (9), common name, anomeric form and chair conformation are presented. The KEGG GLYCAN entries are linked to the original data in KEGG (10). The three-letter identifier of each monosaccharide unit is linked to Het-PDB Navi (11). Each PDB entry in GDB:Structures is reciprocally linked with the corresponding entry in PDBj (1). Users can interactively browse the 3D structural model of the PDB entry by using a Jmol applet (www.jmol.org). Details of the content of this database are described elsewhere (7).
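
The annotation fields and the retrieval by type and length listed above can be pictured as a simple record structure. The actual database schema is not given in this article, so the class names, field names and the reading of "length" as the number of monosaccharide units are assumptions made for this sketch, and the PDB IDs `XXX1` and `XXX2` are placeholders rather than real entries.

```python
from dataclasses import dataclass, field
from typing import List

# The three glycan types named in the text.
GLYCAN_TYPES = ("ligand", "N-glycan", "O-glycan")


@dataclass
class MonosaccharideUnit:
    pdb_identifier: str  # three-letter residue identifier used in the PDB
    iupac_name: str
    common_name: str
    anomer: str          # "alpha" or "beta"
    chair: str           # chair conformation, e.g. "4C1"


@dataclass
class GlycanEntry:
    pdb_id: str
    glycan_type: str
    units: List[MonosaccharideUnit] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.glycan_type not in GLYCAN_TYPES:
            raise ValueError(f"unknown glycan type: {self.glycan_type}")

    @property
    def length(self) -> int:
        # Assumption: glycan length = number of monosaccharide units.
        return len(self.units)


def find_glycans(entries: List[GlycanEntry], glycan_type: str, length: int) -> List[GlycanEntry]:
    """Retrieve entries of a given type and length, as in the list view."""
    return [e for e in entries if e.glycan_type == glycan_type and e.length == length]


NAG = MonosaccharideUnit(
    "NAG", "2-acetamido-2-deoxy-beta-D-glucopyranose", "GlcNAc", "beta", "4C1"
)
BMA = MonosaccharideUnit("BMA", "beta-D-mannopyranose", "Man", "beta", "4C1")

# Placeholder entries for illustration only.
ENTRIES = [
    GlycanEntry("XXX1", "N-glycan", [NAG, NAG]),
    GlycanEntry("XXX2", "N-glycan", [NAG, NAG, BMA]),
    GlycanEntry("XXX2", "ligand", [BMA]),
]

if __name__ == "__main__":
    for entry in find_glycans(ENTRIES, "N-glycan", 2):
        print(entry.pdb_id, [u.pdb_identifier for u in entry.units])
```

Keeping the type as a validated string mirrors the fixed three-way classification, while `length` is derived from the unit list so that the two cannot disagree.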

GLYCAN STRUCTURE ANNOTATION SERVICE

Users can upload a structure file (PDB format) to GDB:Structures to have the glycan structure(s) in the file annotated by getCARBO. The results of the annotation are sent back to the user by email, normally within a few minutes of uploading. The results are summarized in an HTML file (Figure 1), which presents them in the same format as the glycan information page of GDB:Structures. For N-glycans, the most similar KEGG GLYCAN entry is presented, and unmatched substructures are marked by red question marks. O-glycans are not checked in GDB:Structures, because studies on the structures of O-glycans (especially non-mammalian ones) have not reached the critical mass needed for database searches, in contrast to N-glycans, whose primary structures and biosynthetic pathways are well known.
Figure 1. An example of the glycan structure annotation service. An HTML file containing the results of glycan annotation is sent back to the user who submitted a structure file (PDB format). The glycan structure detected in the user-uploaded file is shown in the ‘Structure’ record and the most similar KEGG GLYCAN entry is shown in the ‘KEGG analogue’ record. Unmatched substructures (glycosidic bonds between mannose residues) are indicated by red question marks.
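
To make the input side of the service more concrete, the sketch below shows one way a PDB-format file could be scanned for protein-linked glycans. It is not getCARBO's detection method, which is described elsewhere (7). As an assumption, this example relies only on `LINK` records, reads the fixed columns defined by the PDB file format, and recognizes a small, incomplete set of sugar residue names. The function names are hypothetical.

```python
from typing import List, Tuple

# Residue names treated as sugars in this sketch (a small, incomplete set).
SUGARS = {"NAG", "NDG", "A2G", "NGA", "MAN", "BMA", "FUC", "GAL"}

# (amino acid, atom) pairs that define each protein-linked glycan class.
ATTACHMENT = {
    ("ASN", "ND2"): "N-glycan",
    ("SER", "OG"): "O-glycan",
    ("THR", "OG1"): "O-glycan",
}


def parse_partner(line: str, offset: int) -> Tuple[str, str, str, str]:
    """Read one LINK partner starting at a 0-based column offset."""
    return (
        line[offset:offset + 4].strip(),        # atom name
        line[offset + 5:offset + 8].strip(),    # residue name
        line[offset + 9],                       # chain identifier
        line[offset + 10:offset + 14].strip(),  # residue sequence number
    )


def glycosylation_sites(pdb_text: str) -> List[Tuple[str, str, str]]:
    """Return (class, protein residue, sugar residue) per sugar-protein LINK."""
    sites = []
    for raw in pdb_text.splitlines():
        if raw[:6].strip() != "LINK":
            continue
        line = raw.ljust(80)  # guard against short lines
        first = parse_partner(line, 12)   # partner 1 starts at column 13
        second = parse_partner(line, 42)  # partner 2 starts at column 43
        for protein, sugar in ((first, second), (second, first)):
            kind = ATTACHMENT.get((protein[1], protein[0]))
            if kind and sugar[1] in SUGARS and sugar[0] == "C1":
                sites.append((
                    kind,
                    f"{protein[1]} {protein[2]}{protein[3]}",
                    f"{sugar[1]} {sugar[2]}{sugar[3]}",
                ))
    return sites


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for site in glycosylation_sites(handle.read()):
            print(*site)
```

Sugar-to-sugar `LINK` records are skipped here, and files without `LINK` records would yield no sites; a fuller tool would need to infer bonds from atomic coordinates and would treat sugars with no protein attachment as ligands.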


CONCLUSION AND FUTURE PERSPECTIVES

GDB:Structures is not only an annotated glycan structure database, but also a web service for verifying the primary structures of glycan 3D structures, which is in high demand among structural biologists (12). This database will help users to determine the characteristics of the glycan structures of interest to them and also allow structural biologists to verify the glycan structures they have determined. Our N-glycan checking process is not always valid, because it is limited by the nature of the reference glycan primary structure database. If the glycan primary structure is not found in the reference database, the annotation by getCARBO will point out unmatched substructures on the query glycan structure, even when the 3D structure is built on X-ray diffraction data fine enough to assign unreported motifs. We believe that such incidents should be rare. The entire data of GDB:Structures will be made publicly available following the establishment of an international consensus on the glycan description format (13–15).

REFERENCES (12 in total)

Review 1.  Conformational studies of oligosaccharides and glycopeptides: complementarity of NMR, X-ray crystallography, and molecular modelling.

Authors:  Mark R Wormald; Andrei J Petrescu; Ya-Lan Pao; Ann Glithero; Tim Elliott; Raymond A Dwek
Journal:  Chem Rev       Date:  2002-02       Impact factor: 60.622

2.  Announcing the worldwide Protein Data Bank.

Authors:  Helen Berman; Kim Henrick; Haruki Nakamura
Journal:  Nat Struct Biol       Date:  2003-12

3.  Het-PDB Navi.: a database for protein-small molecule interactions.

Authors:  Akihiro Yamaguchi; Kei Iida; Nobuaki Matsui; Shirou Tomoda; Kei Yura; Mitiko Go
Journal:  J Biochem       Date:  2004-01       Impact factor: 3.387

4.  Data mining the protein data bank: automatic detection and assignment of carbohydrate structures.

Authors:  Thomas Lütteke; Martin Frank; Claus-W von der Lieth
Journal:  Carbohydr Res       Date:  2004-04-02       Impact factor: 2.104

5.  The carbohydrate sequence markup language (CabosML): an XML description of carbohydrate structures.

Authors:  Norihiro Kikuchi; Akihiko Kameyama; Shuuichi Nakaya; Hiromi Ito; Takashi Sato; Toshihide Shikanai; Yoriko Takahashi; Hisashi Narimatsu
Journal:  Bioinformatics       Date:  2004-11-25       Impact factor: 6.937

6.  A statistical analysis of N- and O-glycan linkage conformations from crystallographic data.

Authors:  A J Petrescu; S M Petrescu; R A Dwek; M R Wormald
Journal:  Glycobiology       Date:  1999-04       Impact factor: 4.313

7.  GLYDE-an expressive XML standard for the representation of glycan structure.

Authors:  Satya S Sahoo; Christopher Thomas; Amit Sheth; Cory Henson; William S York
Journal:  Carbohydr Res       Date:  2005-10-20       Impact factor: 2.104

8.  LINUCS: linear notation for unique description of carbohydrate sequences.

Authors:  A Bohne-Lang; E Lang; T Förster; C W von der Lieth
Journal:  Carbohydr Res       Date:  2001-11-01       Impact factor: 2.104

9.  Building meaningful models of glycoproteins.

Authors:  Max Crispin; David I Stuart; E Yvonne Jones
Journal:  Nat Struct Mol Biol       Date:  2007-05       Impact factor: 15.369

10.  From genomics to chemical genomics: new developments in KEGG.

Authors:  Minoru Kanehisa; Susumu Goto; Masahiro Hattori; Kiyoko F Aoki-Kinoshita; Masumi Itoh; Shuichi Kawashima; Toshiaki Katayama; Michihiro Araki; Mika Hirakawa
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

