IMGT/LIGM-DB, the IMGT comprehensive database of immunoglobulin and T cell receptor nucleotide sequences.

Véronique Giudicelli, Patrice Duroux, Chantal Ginestoux, Géraldine Folch, Joumana Jabado-Michaloud, Denys Chaume, Marie-Paule Lefranc.

Abstract

IMGT/LIGM-DB is the IMGT comprehensive database of immunoglobulin (IG) and T cell receptor (TR) nucleotide sequences from human and other vertebrate species. It was created in 1989 by LIGM, Montpellier, France and is the oldest and the largest database of IMGT. IMGT/LIGM-DB includes all germline (non-rearranged) and rearranged IG and TR genomic DNA (gDNA) and complementary DNA (cDNA) sequences published in generalist databases. IMGT/LIGM-DB allows searches from the Web interface according to biological and immunogenetic criteria through five distinct modules depending on the user interest. For a given entry, nine types of display are available including the IMGT flat file, the translation of the coding regions and the analysis by the IMGT/V-QUEST tool. IMGT/LIGM-DB distributes expertly annotated sequences. The annotations hugely enhance the quality and the accuracy of the distributed detailed information. They include the sequence identification, the gene and allele classification, the constitutive and specific motif description, the codon and amino acid numbering, and the sequence obtaining information, according to the main concepts of IMGT-ONTOLOGY. They represent the main source of IG and TR gene and allele knowledge stored in IMGT/GENE-DB and in the IMGT reference directory. IMGT/LIGM-DB is freely available at http://imgt.cines.fr.

Year:  2006        PMID: 16381979      PMCID: PMC1347451          DOI: 10.1093/nar/gkj088

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

IMGT/LIGM-DB is the comprehensive IMGT® database of immunoglobulin (IG) and T cell receptor (TR) nucleotide sequences from human and other vertebrate species. It was created in 1989 by Marie-Paule Lefranc, LIGM, Montpellier, France, and has been on the Web since July 1995 (1–3). IMGT/LIGM-DB is the first and the largest database of IMGT®, the international ImMunoGeneTics information system® (4,5). It provides standardized and detailed immunogenetics annotations. Owing to the complexity of IG and TR molecular genetics (6,7), which is unique to vertebrate genomes, IMGT/LIGM-DB has to deal with (i) large germline (non-rearranged) genomic DNA (gDNA) sequences, which may cover a complete locus of several hundred kilobases to one (or more) megabase(s); (ii) rearranged gDNA sequences resulting from the recombination of V (variable), D (diversity) and J (joining) genes (V-J genes and V-D-J genes); and (iii) rearranged V-J-C (constant) and V-D-J-C complementary DNA (cDNA, designated as ‘mRNA’ in generalist databases) sequences. The complexity is further increased by the characteristics of the loci and chain types in the different species (reviewed in the IMGT Repertoire) and by the mechanisms of diversity such as combinatorial diversity, N diversity, somatic hypermutation and gene conversion (6,7). Detailed sequence annotation is therefore a large and complex task that requires the interpretation of DNA rearrangements and recombination, of sequence polymorphisms, of nucleotide deletions and insertions at the V-J and V-D-J junctions and, for IG, of somatic hypermutations (6,7). Annotations rely on the accuracy and coherence of IMGT-ONTOLOGY (8), the first ontology in the field of immunogenetics, which has made it possible to set up rules for standardized sequence identification (9), gene and allele classification (6,7), constitutive and specific motif description, amino acid numbering (10–13) and sequence obtaining information.

IMGT/LIGM-DB DATA SOURCE AND CONTENT

The sole source of IMGT/LIGM-DB nucleotide sequences is EMBL (14). Before being entered in IMGT/LIGM-DB, IG and TR sequences must be submitted to EMBL, GenBank or DDBJ in order to obtain a unique accession number, which is also the entry identifier in IMGT/LIGM-DB. EMBL then automatically sends the IG and TR sequences (new entries and updates) to LIGM. Sequences belonging to the human (HUM), mouse (MUS), primate (PRI), other mammals (MAM) and vertebrate (VRT) divisions, which are sufficiently reliable, are managed in IMGT/LIGM-DB, together with IG and TR-related sequences from the synthetic (SYN) and unclassified (UNC) divisions. Sequences from the other EMBL divisions (CON, GSS, HTG, HTC, STS and EST) are not included. The new sequences and updates received at LIGM amount to >700 sequences a week. In November 2005, IMGT/LIGM-DB contained 98 800 sequences from 150 vertebrate species. They comprise germline gDNA, rearranged gDNA, a few germline cDNA and, for half of the database content, rearranged cDNA (or ‘mRNA’). Almost three quarters of the sequences are from human and mouse.
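The division rule described above can be illustrated with a minimal sketch. It only models the division check, not the curators' relevance check, and the function names and the (accession, division) tuple layout are assumptions made for illustration, not part of any IMGT software.

```python
# EMBL division codes exactly as listed in the text above.
INCLUDED_DIVISIONS = {"HUM", "MUS", "PRI", "MAM", "VRT", "SYN", "UNC"}
EXCLUDED_DIVISIONS = {"CON", "GSS", "HTG", "HTC", "STS", "EST"}


def is_managed_division(division: str) -> bool:
    """Return True if the division code is one the text lists as managed."""
    return division.strip().upper() in INCLUDED_DIVISIONS


def select_entries(entries):
    """Keep the (accession, division) pairs whose division is managed.

    `entries` is a hypothetical iterable of (accession, division) tuples.
    """
    return [(acc, div) for acc, div in entries if is_managed_division(div)]


# Placeholder accessions, not real database entries.
SAMPLE_ENTRIES = [("ACC_A", "HUM"), ("ACC_B", "EST"), ("ACC_C", "vrt")]
MANAGED = select_entries(SAMPLE_ENTRIES)
```

With the placeholder input, the EST entry is dropped and the other two are kept; the comparison is case-insensitive only as a convenience of this sketch.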

IMGT/LIGM-DB ANNOTATIONS

On receipt at LIGM, data are checked by LIGM curators for relevance. Data are then scanned to store sequences, bibliographical references and taxonomic data, whereas standardized IMGT/LIGM-DB keywords are assigned mainly manually. Based on expert analysis, specific detailed annotations are added in a second step. They follow the concepts of IMGT-ONTOLOGY (8) and the rules of the IMGT Scientific chart (9). For the sequence shown in Figure 1, for example, this allows the precise sequence identification with the characterization of the nature of the molecule, the configuration, the structure, the functionality, the species, the chain type and the gene type (IDENTIFICATION concept); the characterization of the group and subgroup, and the classification of the gene and allele according to the IMGT nomenclature (CLASSIFICATION concept) (15); the description of the constitutive and specific immunogenetics motifs (DESCRIPTION concept); the codon and amino acid numbering (NUMEROTATION concept); and the sequence obtaining information (OBTENTION concept, currently in development, with particular attention to the biological origin of the sequence, the clinical specification and the description of the methodology used). Most of the annotations are performed manually with the help of the IMGT® tools IMGT/V-QUEST (16) and IMGT/JunctionAnalysis (17). However, some human and mouse cDNA sequences have been automatically annotated by the internal tool IMGT/Automat (18,19).
Figure 1

Part of a fully annotated IMGT/LIGM-DB entry according to the IMGT Scientific chart rules (5,9). The corresponding five main concepts of IMGT-ONTOLOGY (8) have been added on the right-hand side.
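The five annotation concepts named above can be pictured as a small data structure. The sketch below is only an illustration: the class and field names are hypothetical, they merely mirror the items listed in the text, and they do not reproduce the IMGT flat file or the IMGT-ML schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Identification:
    """Items the text lists under the IDENTIFICATION concept."""
    molecule_nature: str      # e.g. gDNA or cDNA
    configuration: str        # e.g. germline or rearranged
    structure: str
    functionality: str
    species: str
    chain_type: str
    gene_type: str


@dataclass
class Classification:
    """Items the text lists under the CLASSIFICATION concept."""
    group: str
    subgroup: Optional[str]
    gene: str
    allele: str


@dataclass
class AnnotatedEntry:
    """Hypothetical container for one annotated entry."""
    accession: str
    identification: Identification
    classification: Classification
    description_labels: List[str] = field(default_factory=list)  # DESCRIPTION
    numbering: Dict[int, str] = field(default_factory=dict)      # NUMEROTATION
    obtention: Dict[str, str] = field(default_factory=dict)      # OBTENTION

    def covered_concepts(self) -> List[str]:
        """Return the names of the concepts for which data are present."""
        concepts = ["IDENTIFICATION", "CLASSIFICATION"]
        if self.description_labels:
            concepts.append("DESCRIPTION")
        if self.numbering:
            concepts.append("NUMEROTATION")
        if self.obtention:
            concepts.append("OBTENTION")
        return concepts
```

Keeping the concepts as separate objects reflects the two-step annotation described in the text: an entry can carry identification and classification data before the detailed description and numbering are added.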

IMGT/LIGM-DB SEARCH AND DISPLAY

The IMGT/LIGM-DB data are provided through a user-friendly Web interface. The interface allows searches according to immunogenetics-specific criteria, requires no knowledge of a programming language, and can be reached from any type of platform using free browsers. All IMGT/LIGM-DB information is available through five search modules: Catalogue, Taxonomy and Characteristics, Keywords, Annotation labels and References. The selection is displayed at the top of the ‘results of your search’ page, so that users can check their own queries (20). Users can then modify their request or consult the results. They can (i) add new conditions to increase or decrease the number of resulting sequences; (ii) view details: selecting this ‘View’ option provides a list of resulting sequences, and selecting one sequence in the list offers nine displays: annotations, IMGT flat file, coding regions with protein translation, catalogue and external references, sequence in dump format, sequence in FASTA format, sequence with three reading frames, EMBL flat file, IMGT/V-QUEST (16); or (iii) search for sequence fragments: selecting this ‘Subsequences’ option allows users to search for sequence fragments (subsequences) corresponding to a particular label in the resulting sequences (available for fully annotated sequences) (20).
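As an illustration of what the ‘sequence with three reading frames’ display computes, the following sketch translates a nucleotide sequence in the three forward frames using the standard genetic code. It is not the IMGT implementation; the function names are hypothetical, and unknown or ambiguous codons are rendered as 'X' by assumption.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids ordered by codon: TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate `seq` starting at offset `frame` (0, 1 or 2).

    Incomplete trailing codons are ignored; '*' marks a stop codon and
    'X' marks a codon that is not in the standard table.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(seq: str) -> dict:
    """Return the translations of the three forward reading frames, keyed 1 to 3."""
    return {frame + 1: translate(seq, frame) for frame in range(3)}
```

For a rearranged cDNA, typically only one of the three frames is free of premature stop codons across the coding region, which is why showing all three side by side helps when the correct frame is not yet annotated.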

IMGT/LIGM-DB DISTRIBUTION

IMGT/LIGM-DB flat files are available from anonymous FTP servers at CINES (), at EBI (), and at IGH () and from many SRS (Sequence Retrieval System) sites. IMGT/LIGM-DB can be searched by BLAST or FASTA on different servers (e.g. CINES, EBI, INFOBIOGEN and Institut Pasteur). IMGT/LIGM-DB data can also be retrieved through Web services, which are developed and implemented with Axis (5). For instance, they include ‘queryKnowledge’, which provides the lists of instances for the IMGT-ONTOLOGY concepts, and ‘querySeqData’, which allows the retrieval of any sequence-related data identified, classified and described according to the IMGT® concepts, such as the nucleotide sequence, the description labels, the literature references, the metadata, etc. The result is a list of data entries, in IMGT-ML format, that share the given values (5). IMGT/LIGM-DB data are cross-referenced in the EMBL databank (14), in IMGT/GENE-DB (15), which links gene entries with the corresponding genomic reference sequences and with the known expressed cDNAs, and in IMGT/PRIMER-DB (21), in order to display the oligonucleotide primers within the sequences.
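A downloaded flat file could be read with a few lines of code. The sketch below assumes that the flat files follow the general EMBL flat file layout (a two-letter line code, data starting at column 6, sequence lines after an 'SQ' line, and '//' closing each entry); this layout is an assumption here and should be checked against the distributed files. The function name and the sample record are hypothetical placeholders.

```python
from collections import defaultdict


def parse_flat_file(text: str) -> list:
    """Split EMBL-style flat file text into a list of entry dictionaries.

    Each entry maps a two-letter line code (e.g. 'ID', 'AC', 'DE') to the
    joined content of its lines, plus a 'sequence' key with the residues.
    """
    entries = []
    fields, seq_parts, in_sequence = defaultdict(list), [], False
    for line in text.splitlines():
        if line.startswith("//"):
            # End of entry: flush what has been collected and reset.
            entry = {code: " ".join(vals) for code, vals in fields.items()}
            entry["sequence"] = "".join(seq_parts)
            entries.append(entry)
            fields, seq_parts, in_sequence = defaultdict(list), [], False
        elif in_sequence:
            # Sequence lines carry residues plus spacing and position counts.
            seq_parts.append("".join(ch for ch in line if ch.isalpha()))
        elif line[:2] == "SQ":
            in_sequence = True
        elif line[:2] == "XX":
            continue  # spacer line
        elif len(line) >= 2 and line[:2].isalpha() and line[:2].isupper():
            fields[line[:2]].append(line[5:].strip())
    return entries


# Made-up placeholder record, not a real database entry.
SAMPLE = "\n".join([
    "ID   PLACEHOLDER1",
    "XX",
    "AC   PLACEHOLDER1;",
    "XX",
    "DE   Illustrative description line one",
    "DE   continued on a second line",
    "XX",
    "SQ   Sequence 12 BP;",
    "     atggcc taagga        12",
    "//",
])
PARSED = parse_flat_file(SAMPLE)
```

Feature table ('FT') lines, which carry the annotation labels, are simply concatenated by this sketch; a real parser would need to handle their multi-column structure separately.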

CONCLUSION AND PERSPECTIVES

IMGT/LIGM-DB manages all published vertebrate IG and TR nucleotide sequences. Despite the complexity of these sequences and their variability across species (sequences from 150 species are dealt with in IMGT/LIGM-DB), the detailed annotations are all performed according to the concepts of IMGT-ONTOLOGY. The organization of the concepts has been formalized, with XML Schema, in IMGT-ML (5). A new IMGT/LIGM-DB interface, available in 2006, will allow queries according to these concepts and the retrieval of entries as XML files, in IMGT-ML format.

CITATION

Users of IMGT/LIGM-DB are requested to cite this article in their publications and to quote the IMGT® home page URL ().
  17 in total

1.  Ontology for immunogenetics: the IMGT-ONTOLOGY.

Authors:  V Giudicelli; M P Lefranc
Journal:  Bioinformatics       Date:  1999-12       Impact factor: 6.937

2.  IMGT/V-QUEST, an integrated software program for immunoglobulin and T cell receptor V-J and V-D-J rearrangement analysis.

Authors:  Véronique Giudicelli; Denys Chaume; Marie-Paule Lefranc
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

3.  IMGT-ONTOLOGY for immunogenetics and immunoinformatics.

Authors:  Marie-Paule Lefranc; Véronique Giudicelli; Chantal Ginestoux; Nathalie Bosc; Géraldine Folch; Delphine Guiraudou; Joumana Jabado-Michaloud; Séverine Magris; Dominique Scaviner; Valérie Thouvenin; Kora Combres; David Girod; Stéphanie Jeanjean; Céline Protat; Mehdi Yousfi-Monod; Elodie Duprat; Quentin Kaas; Christelle Pommié; Denys Chaume; Gérard Lefranc
Journal:  In Silico Biol       Date:  2003-11-22

4.  IMGT unique numbering for immunoglobulin and T cell receptor constant domains and Ig superfamily C-like domains.

Authors:  Marie-Paule Lefranc; Christelle Pommié; Quentin Kaas; Elodie Duprat; Nathalie Bosc; Delphine Guiraudou; Christelle Jean; Manuel Ruiz; Isabelle Da Piédade; Mathieu Rouard; Elodie Foulquier; Valérie Thouvenin; Gérard Lefranc
Journal:  Dev Comp Immunol       Date:  2005       Impact factor: 3.636

5.  Immunogenetics Sequence Annotation: the Strategy of IMGT based on IMGT-ONTOLOGY.

Authors:  Véronique Giudicelli; Denys Chaume; Joumana Jabado-Michaloud; Marie-Paule Lefranc
Journal:  Stud Health Technol Inform       Date:  2005

6.  IMGT-Choreography for immunogenetics and immunoinformatics.

Authors:  Marie-Paule Lefranc; Oliver Clement; Quentin Kaas; Elodie Duprat; Patrick Chastellan; Isabelle Coelho; Kora Combres; Chantal Ginestoux; Veronique Giudicelli; Denys Chaume; Gerard Lefranc
Journal:  In Silico Biol       Date:  2005

7.  IMGT, the international ImMunoGeneTics database.

Authors:  M P Lefranc; V Giudicelli; C Ginestoux; J Bodmer; W Müller; R Bontrop; M Lemaitre; A Malik; V Barbié; D Chaume
Journal:  Nucleic Acids Res       Date:  1999-01-01       Impact factor: 16.971

8.  Unique database numbering system for immunogenetic analysis.

Authors:  M P Lefranc
Journal:  Immunol Today       Date:  1997-11

9.  IMGT/JunctionAnalysis: the first tool for the analysis of the immunoglobulin and T cell receptor complex V-J and V-D-J JUNCTIONs.

Authors:  Mehdi Yousfi Monod; Véronique Giudicelli; Denys Chaume; Marie-Paule Lefranc
Journal:  Bioinformatics       Date:  2004-08-04       Impact factor: 6.937

10.  IMGT/GENE-DB: a comprehensive database for human and mouse immunoglobulin and T cell receptor genes.

Authors:  Véronique Giudicelli; Denys Chaume; Marie-Paule Lefranc
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971
