
OryGenesDB 2008 update: database interoperability for functional genomics of rice.

Gaëtan Droc, Christophe Périn, Sébastien Fromentin, Pierre Larmande.

Abstract

OryGenesDB (http://orygenesdb.cirad.fr/index.html) is a database developed for rice reverse genetics. OryGenesDB contains FSTs (flanking sequence tags) of various mutagens and functional genomics data, collected from both international insertion collections and the literature. The current release of OryGenesDB contains 171,000 FSTs, and annotations divided among 10 specific categories, totaling 78 annotation layers. Several additional tools have been added to the main interface; these tools enable the user to retrieve FSTs and design probes to analyze insertion lines. The major innovation of OryGenesDB 2008, besides updating the data and tools, is a new tool, Orylink, which was developed to speed up rice functional genomics by taking advantage of the resources developed in two related databases, Oryza Tag Line and GreenPhylDB. Orylink was designed to field complex queries across these three databases and store both the queries and their results in an intuitive manner. Orylink offers a simple and powerful virtual workbench for functional genomics. Alternatively, the Web services developed for Orylink can be used independently of its Web interface, increasing the interoperability between these different bioinformatics applications.

Year:  2008        PMID: 19036791      PMCID: PMC2686528          DOI: 10.1093/nar/gkn821

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Reverse genetics is a powerful way to discover the biological role of thousands of genes whose functions are currently unknown. Currently, large-scale random insertion mutagenesis is one of the most powerful gene inactivation methods for reverse genetics (1). Several laboratories around the world have developed T-DNA insertion libraries with various constructs [see (1) for a review]. The systematic sequencing of DNA flanking the mutagens is a simple way to identify disrupted lines for virtually any rice gene. OryGenesDB is the most complete open-access database for rice reverse genetics and compiles large-scale insertion mutagenesis data from several laboratories. OryGenesDB is now a standard resource for rice reverse genetics and has greatly expanded since its first release in 2006 (2). Several additional databases of interest for functional genomics have recently been released, including Oryza Tag Line (3), which contains phenotypic descriptions of the Genoplante insertion lines, and GreenPhylDB (4), a comprehensive platform designed to facilitate comparative functional genomics between the Oryza sativa and Arabidopsis thaliana genomes. In order to facilitate user navigation between these three databases, and as a first step towards database interoperability for rice functional genomics, we developed a specific tool called Orylink. Orylink is a personalized integrated system for functional genomics analysis and is fully integrated into OryGenesDB. This modular system was developed to carry out complex queries across these three databases using different entry points, including TIGR (5), InterPro (6), KEGG id (7) or keywords. We developed Web services for interoperability between the OryGenesDB, Oryza Tag Line and GreenPhylDB databases, which can also be used as a stand-alone system to query across these three databases using BioMOBY (8,9). 
Using Orylink and its associated Web services, a user can retrieve and store information, including FSTs, mutant phenotypes and A. thaliana orthologs, across these three databases on a project basis. A classical query example is to start from a candidate rice gene, look for the A. thaliana orthologs (GreenPhylDB), identify the corresponding rice mutant insertion lines (OryGenesDB), and finally check if any of these lines has an already characterized phenotype (Oryza Tag Line). Orylink can field such classical queries, as well as more complex queries, across these three databases and others. Here, we present a major expansion of OryGenesDB that includes Orylink as well as some new tools and data.
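The classical query described above can be pictured as three chained lookups. The following minimal Python sketch uses small in-memory dictionaries as stand-ins for GreenPhylDB, OryGenesDB and Oryza Tag Line. Every identifier and record in it is a hypothetical placeholder, and the names `GeneReport` and `classical_query` are illustrative; they are not part of the Orylink Web services.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the three databases; all identifiers are placeholders.
ORTHOLOGS = {"rice_gene_A": ["ath_gene_1", "ath_gene_2"]}  # GreenPhylDB-like lookup
FSTS = {"rice_gene_A": ["line_001", "line_002"]}           # OryGenesDB-like lookup
PHENOTYPES = {"line_002": "dwarf"}                         # Oryza Tag Line-like lookup


@dataclass
class GeneReport:
    locus: str
    orthologs: list = field(default_factory=list)
    insertion_lines: list = field(default_factory=list)
    phenotypes: dict = field(default_factory=dict)


def classical_query(locus):
    """Chain the three lookups: orthologs, insertion lines, then phenotypes."""
    lines = FSTS.get(locus, [])
    return GeneReport(
        locus=locus,
        orthologs=ORTHOLOGS.get(locus, []),
        insertion_lines=lines,
        # Keep only the insertion lines that already have a described phenotype.
        phenotypes={line: PHENOTYPES[line] for line in lines if line in PHENOTYPES},
    )


report = classical_query("rice_gene_A")
print(report)
```

A locus that is absent from all three stand-ins simply yields an empty report, which mirrors a query that returns no orthologs, no insertion lines and no phenotypes.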

RESULTS

Expansion of OryGenesDB

FST

OryGenesDB is specifically dedicated to rice reverse genetics and attempts to map FSTs from international laboratories. Since the release of OryGenesDB in 2006, when around 45 000 FSTs were available, more than 125 000 additional FSTs have been added, for a current total of 171 000 FSTs from 10 laboratories (Table 1). OryGenesDB is regularly updated; updates include genome annotations [version 5 (10) and 7 (11) of the TIGR (O. sativa) and TAIR (A. thaliana) annotations, respectively] and the GMOD browser, which offers new user-friendly interfaces through AJAX technologies (12). A new interface with a tab-like organization of sub-menus was developed to simplify navigation across OryGenesDB tools.
Table 1. Rice insertion resources integrated in OryGenesDB

Institution | Mutagen | No. of flanking sequences | No. of mapped sequences
CerealGene Tags, European Union | Ac/Ds | 1380 | 1380
CIRAD, Genoscope | Tos17 | 13 745 | 13 622
CIRAD-INRA-IRD-CNRS, Genoplante | T-DNA | 14 137 | 13 384
CSIRO | T-DNA | 787 | 787
National Center of Plant Gene Research | T-DNA | 16 158 | 15 807
National Institute of Agrobiological Sciences | Tos17 | 18 024 | 17 955
Plant Functional Genomics Laboratory | T-DNA | 80 006 | 78 709
PMBBRC | Ac/Ds | 1072 | 1044
Taiwan Rice Insertional Mutant Program | T-DNA | 11 799 | 11 754
University of California at Davis | Ac/Ds | 13 922 | 12 927
Total | | 171 030 | 167 369
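
The totals in Table 1 can be rechecked with a few lines of Python. The sketch below copies the ten rows of the table and sums them. The per-mutagen breakdown and the mapped fraction are derived here for illustration and are not figures stated in the text.

```python
# (institution, mutagen, flanking sequences, mapped sequences), copied from Table 1
TABLE_1 = [
    ("CerealGene Tags, European Union", "Ac/Ds", 1380, 1380),
    ("CIRAD, Genoscope", "Tos17", 13745, 13622),
    ("CIRAD-INRA-IRD-CNRS, Genoplante", "T-DNA", 14137, 13384),
    ("CSIRO", "T-DNA", 787, 787),
    ("National Center of Plant Gene Research", "T-DNA", 16158, 15807),
    ("National Institute of Agrobiological Sciences", "Tos17", 18024, 17955),
    ("Plant Functional Genomics Laboratory", "T-DNA", 80006, 78709),
    ("PMBBRC", "Ac/Ds", 1072, 1044),
    ("Taiwan Rice Insertional Mutant Program", "T-DNA", 11799, 11754),
    ("University of California at Davis", "Ac/Ds", 13922, 12927),
]

total_flanking = sum(row[2] for row in TABLE_1)
total_mapped = sum(row[3] for row in TABLE_1)
mapped_fraction = total_mapped / total_flanking

# Number of flanking sequences contributed by each class of mutagen.
by_mutagen = {}
for _, mutagen, flanking, _ in TABLE_1:
    by_mutagen[mutagen] = by_mutagen.get(mutagen, 0) + flanking

print(total_flanking, total_mapped, round(mapped_fraction, 3))
print(by_mutagen)
```

The sums reproduce the "Total" row (171 030 flanking and 167 369 mapped sequences), which corresponds to roughly 97.9% of the flanking sequences being mapped.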
When the phenotypic characterization of 30 000 T-DNA enhancer trap lines from the Genoplante library was recently completed, the data were stored in Oryza Tag Line (OTL) (3). The corresponding FSTs are stored in OryGenesDB, and we cross-linked OTL lines and the corresponding FSTs in OryGenesDB. The direct link to OTL appears in the pop-up window to allow a direct observation of the line phenotype, if any (3). Cross links are also available through Orylink (see below).

Automatic design of primers for FST validation

A new tool, called Primer blaster, was designed to test the specificity of any primer pair on the rice genome. The user can paste or upload primer sequences as multifasta files, with the reverse and forward primers in the same order in the two files. In the next step, the target database (usually the pseudochromosomes) is selected and the expected amplicon size is set. After the query is submitted, a table lists every primer pair tested with the primer name, the chromosome, the primer position, the amplicon size and the number of hits for the forward and reverse primers on the rice genome. The status of a pair is set to ‘ok’ if the pair is specific.
Designing probes to analyze insertion lines is a common and very repetitive task. Therefore, we developed the ‘Primer Designer’ tool to help users design probes for Southern blots. Users enter the candidate FST or the plant name, a sequence size from the FST or, alternatively, a sequence size range, and the restriction enzyme to be used for the Southern blot. Primer Designer then searches around the FST, according to the user's search parameters, for a probe that does not contain the chosen restriction site. The probe can then be used to check whether there is a rearrangement at the expected locus and to determine whether the line is homozygous or heterozygous for the given insertion.
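
The kind of specificity check reported by Primer blaster (number of hits per primer, amplicon size, status) can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: it uses exact string matching on a single strand of a toy sequence, whereas the actual tool presumably relies on a sequence similarity search against the pseudochromosomes. The function names and the example sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_all(genome, pattern):
    """Return every 0-based start position of pattern in genome (overlaps allowed)."""
    hits = []
    start = genome.find(pattern)
    while start != -1:
        hits.append(start)
        start = genome.find(pattern, start + 1)
    return hits


def check_primer_pair(genome, forward, reverse, min_size, max_size):
    """Count exact hits for both primers and flag the pair as 'ok' if it is specific."""
    genome = genome.upper()
    fwd_hits = find_all(genome, forward.upper())
    # The reverse primer anneals to the opposite strand, so on the strand
    # searched here its binding site appears as its reverse complement.
    rev_site = reverse_complement(reverse)
    rev_hits = find_all(genome, rev_site)
    amplicon_size = None
    if len(fwd_hits) == 1 and len(rev_hits) == 1:
        amplicon_size = rev_hits[0] + len(rev_site) - fwd_hits[0]
    specific = amplicon_size is not None and min_size <= amplicon_size <= max_size
    return {
        "forward_hits": len(fwd_hits),
        "reverse_hits": len(rev_hits),
        "amplicon_size": amplicon_size,
        "status": "ok" if specific else "not specific",
    }


# Toy sequence (hypothetical): forward site, spacer, reverse-primer site.
genome = "TTTT" + "ACGTACGGTA" + "C" * 20 + "GGATCCATTG" + "AAAA"
result = check_primer_pair(genome, "ACGTACGGTA", "CAATGGATCC", 30, 50)
print(result)
```

In this model a pair is reported as ‘ok’ only when each primer has exactly one hit and the resulting amplicon falls within the expected size range; any additional hit marks the pair as not specific.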

New sub-databases

OryGenesDB is not only a repository for FSTs but was also designed to store more specialized data related to functional genomics. Hence, several sub-databases were developed, including Archipelago, which stores information on more than 2500 genes of the rice defense arsenal (13) obtained from more than 70 publications. Similarly, an extensive cataloguing of rice blast resistance loci identified 85 blast resistance genes and around 350 QTL that were mapped on the rice genome and could be incorporated into OryGenesDB (14). These sub-databases are now accessible as specific categories in the Genome Browser tracks through ‘Resistance Gene Analogues’ and ‘Rice Defense genes’ for Archipelago and ‘Blast QTL and R-Genes’ for the Meta-QTL analysis.

New annotation layers

A total of 10 specific categories are accessible in OryGenesDB, totaling 78 annotation layers. For instance, the ‘Oryza Map Alignment Project’ (OMAP) aims to construct BAC/STC-based physical maps of 11 wild and one cultivated rice species and to align them to the reference genome of cv. Nipponbare, in order to exploit the potential of the wild species of the genus Oryza for breeding cultivated rice (15). To provide simple and fast access to this resource, all STCs identified in the OMAP project were mapped to the Nipponbare pseudomolecules and are visible as a supplemental layer of annotation.

Extended functionalities

Orylink: comparative functional genomics across databases in a ‘click and view’ manner

From the user's point of view, Orylink is a virtual workbench that helps biologists retrieve all kinds of information linked to a gene locus. Figure 1 shows an example of a biological query process.
Figure 1. Data search process using Orylink. This figure illustrates the various steps to initiate a user query. Starting with the project framework (A), users can create new projects (B) and display the results (C and D) after providing a login and a password. (A) Project management interface. (B) Project creation interface. (C) A synthetic result obtained for the execution of a given workflow and (D) gene report for the corresponding locus name ‘Os10g36924.1’. In this view, the data are organized into broad categories like gene, phenotype and phylogenomic predictions reports. Cross-references are built into all the data to link the summary data to their original sources.

Orylink is available through the tool menu of the main OryGenesDB tool bar (16). Starting with the project framework (Figure 1A), users start a new project (e.g. Aquaporin) (Figure 1B) after providing a login and password. Each project is focused on one species, either O. sativa or A. thaliana. Biologists can build their own queries according to seven different data types [TIGR ID (5), InterPro ID (6), KEGG ID (7), Enzyme Commission number (17), Gene Ontology ID (18), Germplasm ID (3) and genomic location]. In this case, a list of loci is submitted. Figure 1C shows the results of the Aquaporin project. In this synthetic view, 14 genes are retrieved. Users can quickly observe the locations of the genes and their annotations and whether the genes have either KO lines (identified in the ‘Have FST’ column) or reporter gene expressions and phenotypes (identified in the ‘Have expression’ and ‘Have phenotype’ columns, respectively). The column ‘supported by evidence’ represents the presence of cDNA and EST for the locus in question. The results can also be downloaded in the Excel file format. By clicking on the locus entry, users can display detailed results (Figure 1D). For these detailed results, Orylink first provides a GBrowse image of the genomic location corresponding to the locus entry.
It then lists all FSTs that disrupt the gene, with details of their origins (column source) and features (e.g. location, orientation and type of insert). It provides numerous features of the phenotype observations extracted from Oryza Tag Line, for example, the phenotype name with its description. The phenotypic class is also displayed together with the trait ontology ID. Orylink extracts the corresponding A. thaliana orthologs from GreenPhylDB. This information opens the way to comparative genomics and may provide a bridge between A. thaliana and O. sativa functional genomics data. EST and cDNA features are displayed with their annotations to support putative gene functions. Finally, a list of InterPro names is provided to identify all protein domain families. One of the most important benefits of Web services is that access to the original data sources guarantees up-to-date information. Another benefit is the ability to launch massive queries recursively with free access. Moreover, bioinformaticians can easily chain Web services with a minimum of programming (see Materials and Methods section of Supplementary data for the tools we used).

CONCLUSIONS AND FUTURE DIRECTIONS

OryGenesDB is now not only the most popular and complete database for rice reverse genetics, but it is also a user-oriented web application that automates and organizes Web queries in rice functional genomics. Biologists can greatly reduce the time spent assembling data from heterogeneous data sources. In addition to complex queries across databases, OryGenesDB offers a simple way to store and organize hypotheses as projects through Orylink. Users can run their queries and store the output, and can also update the results as the source databases are continuously and independently updated. OryGenesDB thus offers a way to accelerate the manual process of integrating and compiling data from heterogeneous sources. As a whole, this database is guided by the biologists' need for automation and integration. In the future, new workflows involving resources such as the Nottingham Arabidopsis Stock Centre (NASC) (19) and the Munich Information center for Protein Sequences (MIPS) (20) will be developed or integrated into Orylink. Depending on the Web services available, new queries will be implemented, for instance, to assign rice gene functions using the available compiled gene-oriented literature on A. thaliana genes. Starting from a gene of interest in rice, an Orylink user will soon be able to find the corresponding A. thaliana genes and identify the putative functions of the rice gene using TAIR data. This functionality will greatly help any biologist with an interest in comparative functional genomics between rice and A. thaliana, the two plant models. Lastly, the integration of genomics data in OryGenesDB will continue, including new FSTs and new annotation layers. Users are also encouraged to submit proposals for interface modifications and data of interest that they want to integrate into OryGenesDB at orygenesdb@cirad.fr.

SUPPLEMENTARY DATA

Supplementary data are available at NAR Online.

FUNDING

Funding for the open access charge: Centre de coopération internationale en recherche agronomique pour le développement.
  16 in total

1.  The ENZYME database in 2000.

Authors:  A Bairoch
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Seed and molecular resources for Arabidopsis.

Authors:  R L Scholl; S T May; D H Ware
Journal:  Plant Physiol       Date:  2000-12       Impact factor: 8.340

3.  BioMOBY: an open source biological web services proposal.

Authors:  Mark D Wilkinson; Matthew Links
Journal:  Brief Bioinform       Date:  2002-12       Impact factor: 11.622

Review 4.  Interoperability with Moby 1.0--it's better than sharing your toothbrush!

Authors:  Mark D Wilkinson; Martin Senger; Edward Kawas; Richard Bruskiewich; Jerome Gouzy; Celine Noirot; Philippe Bardou; Ambrose Ng; Dirk Haase; Enrique de Andres Saiz; Dennis Wang; Frank Gibbons; Paul M K Gordon; Christoph W Sensen; Jose Manuel Rodriguez Carrasco; José M Fernández; Lixin Shen; Matthew Links; Michael Ng; Nina Opushneva; Pieter B T Neerincx; Jack A M Leunissen; Rebecca Ernst; Simon Twigger; Bjorn Usadel; Benjamin Good; Yan Wong; Lincoln Stein; William Crosby; Johan Karlsson; Romina Royo; Iván Párraga; Sergio Ramírez; Josep Lluis Gelpi; Oswaldo Trelles; David G Pisano; Natalia Jimenez; Arnaud Kerhornou; Roman Rosset; Leire Zamacola; Joaquin Tarraga; Jaime Huerta-Cepas; Jose María Carazo; Joaquin Dopazo; Roderic Guigo; Arcadi Navarro; Modesto Orozco; Alfonso Valencia; M Gonzalo Claros; Antonio J Pérez; Jose Aldana; M Mar Rojano; Raul Fernandez-Santa Cruz; Ismael Navas; Gary Schiltz; Andrew Farmer; Damian Gessler; Heiko Schoof; Andreas Groscurth
Journal:  Brief Bioinform       Date:  2008-01-31       Impact factor: 11.622

5.  A genome-wide meta-analysis of rice blast resistance genes and quantitative trait loci provides new insights into partial and complete resistance.

Authors:  Elsa Ballini; Jean-Benoît Morel; Gaétan Droc; Adam Price; Brigitte Courtois; Jean-Loup Notteghem; Didier Tharreau
Journal:  Mol Plant Microbe Interact       Date:  2008-07       Impact factor: 4.171

Review 6.  ARCHIPELAGO: a dedicated resource for exploiting past, present, and future genomic data on disease resistance regulation in rice.

Authors:  E Vergne; E Ballini; G Droc; D Tharreau; J-L Nottéghem; J-B Morel
Journal:  Mol Plant Microbe Interact       Date:  2008-07       Impact factor: 4.171

7.  The Gene Ontology (GO) project: structured vocabularies for molecular biology and their application to genome and expression analysis.

Authors:  Judith A Blake; Midori A Harris
Journal:  Curr Protoc Bioinformatics       Date:  2008-09

8.  Using the TIGR gene index databases for biological discovery.

Authors:  Yuandan Lee; John Quackenbush
Journal:  Curr Protoc Bioinformatics       Date:  2003-11

9.  The oryza map alignment project: the golden path to unlocking the genetic potential of wild rice species.

Authors:  Rod A Wing; Jetty S S Ammiraju; Meizhong Luo; Hyeran Kim; Yeisoo Yu; Dave Kudrna; Jose L Goicoechea; Wenming Wang; Will Nelson; Kiran Rao; Darshan Brar; Dave J Mackill; Bin Han; Cari Soderlund; Lincoln Stein; Phillip SanMiguel; Scott Jackson
Journal:  Plant Mol Biol       Date:  2005-09       Impact factor: 4.076

10.  Rice mutant resources for gene discovery.

Authors:  Hirohiko Hirochika; Emmanuel Guiderdoni; Gynheung An; Yue-Ie Hsing; Moo Young Eun; Chang-Deok Han; Narayana Upadhyaya; Srinivasan Ramachandran; Qifa Zhang; Andy Pereira; Venkatesan Sundaresan; Hei Leung
Journal:  Plant Mol Biol       Date:  2004-02       Impact factor: 4.076
