
MEPD: medaka expression pattern database, genes and more.

Juan I Alonso-Barba, Raza-Ur Rahman, Joachim Wittbrodt, Juan L Mateo.

Abstract

The Medaka Expression Pattern Database (MEPD; http://mepd.cos.uni-heidelberg.de/) is designed as a repository of medaka expression data for the scientific community. In this update we present two main improvements. First, we have changed the previous clone-centric view of in situ data to a gene-centric view. This is possible because we have now linked all the data in MEPD to the medaka gene annotation in ENSEMBL. In addition, we have connected the medaka genes in MEPD to their orthologous genes in zebrafish, again using the ENSEMBL database. Based on this, we provide a link to the Zebrafish Model Organism Database (ZFIN) so that researchers can compare expression data between these two fish model organisms. As a second major improvement, we have modified the design of the database so that, in addition to gene expression, it can host the expression patterns of regulatory elements (promoters or enhancers). Combining gene expression, determined by traditional in situ, with regulatory element expression, typically determined with a fluorescent reporter gene, within the same platform ensures consistent annotation. In our opinion, this will allow researchers to gain new insights into the relationship between the expression domains of genes and their regulatory landscape.
© The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2015        PMID: 26450962      PMCID: PMC4702773          DOI: 10.1093/nar/gkv1029

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Medaka (Oryzias latipes) is an established model organism in developmental biology (1,2). Key properties behind this status are extra-embryonic development and the transparency of the embryo. In addition, the possibility of obtaining hundreds of embryos per day makes medaka amenable to high-throughput screens of gene expression. The Medaka Expression Pattern Database (MEPD) was initiated more than 10 years ago (3,4) with the aim of serving the scientific community as a central repository for gene expression patterns. At that time the medaka genome sequence was not yet available, and all the information was therefore based on expressed sequence tags (ESTs). Since then, the medaka genome has been sequenced (5) by applying a shotgun approach to the Hd-rR inbred line, with 10.6-fold coverage. The N50 value is 5.1 Mb excluding gaps. This assembly is labelled as a draft version, but at the time of writing there is already a preliminary version of a new assembly that aims to be ‘near-complete’ (Kiyoshi Naruse, personal communication). The availability of the genome sequence, together with the gene annotation produced by the ENSEMBL team (6), implies a major change in the structure of MEPD. We have now implemented this change by shifting from the previous clone-centric view to a gene-centric view. However, this is not the only update. Medaka is also a good model organism for studying transcriptional regulation and the expression domains of regulatory elements, namely promoters and enhancers. Many published studies have already used medaka to analyse the spatio-temporal activity domains of such elements (7–10). Other model organisms serve a similar purpose, and online databases make these data available, such as REDfly (11) for the fruit fly, the Expression disruption screen (12) or the Enhancer screen (http://www.upo.es/CABD/EnhancerScreen/) for zebrafish, and the VISTA Enhancer Browser (13) for mouse.
Nonetheless, no site yet integrates both sets of expression data, for genes and for regulatory elements, using the same vocabulary and ontology for annotation. This is the motivation that led us to incorporate this second type of expression information into MEPD. With this update we envision that, as the data hosted here grow, MEPD will become a valuable resource for analysing the logic and rules of transcriptional gene regulation.

DATABASE CONTENT

We have maintained the basic structure of the database from previous versions. There is, however, one main change: new tables that accommodate regulatory element expression data. We have replicated the structure that holds gene expression patterns in a parallel set of tables for regulatory elements. For in situ data (tables with the GE suffix in Figure 1), the information is distributed over the tables clone, sequence, picture and expression. The same structure now holds the information for regulatory elements (RE suffix), with construct taking the place of clone. As represented in Figure 1, a construct is linked to two genes, which are the closest upstream and downstream genes relative to the locus of the element. This assignment is made with respect to the direction in which the regulatory sequence was tested. If the element was not tested in a specific orientation, we consider the forward strand, regardless of which sequence (forward or reverse strand) was added to the database. If a sequence overlaps a gene, or the element is intragenic, that gene is set as both the upstream and the downstream gene.
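The assignment rule for flanking genes can be illustrated with a minimal Python sketch. It is not the MEPD implementation: the `Gene` class, the `flanking_genes` function, the coordinate convention (forward-strand, inclusive) and the tie-break for several overlapping genes are all assumptions made for illustration, and the sketch considers a single chromosome only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record: forward-strand coordinates, inclusive."""
    gene_id: str
    start: int
    end: int


def flanking_genes(
    genes: List[Gene],
    elem_start: int,
    elem_end: int,
    tested_strand: Optional[str] = None,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (upstream_gene_id, downstream_gene_id) for one element.

    tested_strand is "+", "-" or None. None means the element was not
    tested in a specific orientation, so the forward strand is assumed.
    """
    # An element overlapping a gene (including an intragenic element)
    # gets that gene as both upstream and downstream.
    overlapping = [g for g in genes if g.start <= elem_end and g.end >= elem_start]
    if overlapping:
        # Assumed tie-break: pick the gene with the largest overlap.
        best = max(
            overlapping,
            key=lambda g: min(g.end, elem_end) - max(g.start, elem_start),
        )
        return best.gene_id, best.gene_id

    # Closest gene ending before the element and closest gene starting after it.
    left = max((g for g in genes if g.end < elem_start), key=lambda g: g.end, default=None)
    right = min((g for g in genes if g.start > elem_end), key=lambda g: g.start, default=None)
    left_id = left.gene_id if left else None
    right_id = right.gene_id if right else None

    # On the reverse strand, "upstream" lies at higher forward coordinates.
    if tested_strand == "-":
        return right_id, left_id
    return left_id, right_id
```

Under these assumptions, an element tested on the reverse strand simply swaps the two neighbours, while an untested element is treated like a forward-strand one.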
Figure 1.

Simplified entity–relationship model of MEPD. The structure of the database is maintained from previous releases (see tables with ‘GE’ suffix for in situ data), but we have added the tables for regulatory element data, with ‘RE’ suffix, as a clone of those for gene expression data. For simplicity the tables are shown without fields.

This association with genes is only for search purposes, i.e. to make it possible to retrieve expression patterns of regulatory elements near a given gene. However, validating the interaction between a regulatory element and its target gene(s) is not straightforward; the linked genes should therefore not be taken, in general, as the genes on which the enhancer or promoter acts. With the availability of the genome sequence and gene annotation, it is no longer necessary to rely on blast hits to identify the target gene of an in situ probe, as it was in previous versions. We have therefore eliminated the blast and cluster tables, and the gene is now the central entity of the database. We use ENSEMBL (6) as our reference for gene annotation, although for clone sequences that do not match a gene annotated in this database we use an automatic annotation of RNA-seq data from different stages of development (Mateo et al., in preparation). Taking advantage of the orthology relationships between medaka and zebrafish defined by the ENSEMBL database, we have included a new table containing this information. As a result, the result page for gene or regulatory element expression provides a link to the corresponding gene in the Zebrafish Model Organism Database (ZFIN) (14) for comparison. This functionality makes it easy to identify conserved or diverging expression patterns of orthologous genes between these two fish species. At the time of writing, MEPD contains expression data for 947 genes and 7476 images, up from 623 and 3863, respectively, in the previous update.
For regulatory elements, the database currently contains 56 elements with 69 pictures. Even with this content it is possible to illustrate the power of MEPD, for instance by comparing the precisely overlapping expression patterns of the RX2 gene (15) and its direct upstream regulatory element (promoter) (16).
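To make the gene-centric layout more concrete, the following sketch builds a heavily reduced version of it and runs a gene-centred query. Everything here is an assumption for illustration: the table and column names are hypothetical and not the actual MEPD schema, the sequence and picture tables are omitted, the identifiers are placeholders, and SQLite (from the Python standard library) stands in for the MySQL database used by MEPD.

```python
import sqlite3

# Hypothetical, simplified schema: GE tables for in situ data, RE tables for
# regulatory elements, with the gene as the central entity.
SCHEMA = """
CREATE TABLE gene (gene_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE orthology (
    medaka_gene_id TEXT REFERENCES gene(gene_id),
    zebrafish_gene_id TEXT
);
CREATE TABLE clone_ge (
    clone_id INTEGER PRIMARY KEY, name TEXT,
    gene_id TEXT REFERENCES gene(gene_id)
);
CREATE TABLE expression_ge (
    clone_id INTEGER REFERENCES clone_ge(clone_id),
    stage INTEGER, structure TEXT
);
CREATE TABLE construct_re (
    construct_id INTEGER PRIMARY KEY, name TEXT,
    upstream_gene_id TEXT REFERENCES gene(gene_id),
    downstream_gene_id TEXT REFERENCES gene(gene_id)
);
CREATE TABLE expression_re (
    construct_id INTEGER REFERENCES construct_re(construct_id),
    stage INTEGER, structure TEXT
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database filled with placeholder records."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany("INSERT INTO gene VALUES (?, ?)",
                    [("MEDAKA_GENE_1", "geneA"), ("MEDAKA_GENE_2", "geneB")])
    con.execute("INSERT INTO orthology VALUES ('MEDAKA_GENE_1', 'ZEBRAFISH_GENE_1')")
    con.execute("INSERT INTO clone_ge VALUES (1, 'clone_1', 'MEDAKA_GENE_1')")
    con.execute("INSERT INTO expression_ge VALUES (1, 24, 'eye')")
    con.execute("INSERT INTO construct_re VALUES "
                "(1, 'construct_1', 'MEDAKA_GENE_1', 'MEDAKA_GENE_2')")
    con.executemany("INSERT INTO expression_re VALUES (?, ?, ?)",
                    [(1, 24, "eye"), (1, 30, "brain")])
    return con


def patterns_for_gene(con, gene_id, structure=None):
    """Expression records of a gene and of regulatory elements linked to it."""
    query = """
    SELECT 'gene' AS source, c.name, e.stage, e.structure
    FROM clone_ge c JOIN expression_ge e ON e.clone_id = c.clone_id
    WHERE c.gene_id = :gid
      AND (:structure IS NULL OR e.structure = :structure)
    UNION ALL
    SELECT 'regulatory element', r.name, e.stage, e.structure
    FROM construct_re r JOIN expression_re e ON e.construct_id = r.construct_id
    WHERE (r.upstream_gene_id = :gid OR r.downstream_gene_id = :gid)
      AND (:structure IS NULL OR e.structure = :structure)
    ORDER BY 1, 2, 3
    """
    return con.execute(query, {"gid": gene_id, "structure": structure}).fetchall()


def zebrafish_orthologs(con, gene_id):
    """Zebrafish gene IDs recorded as orthologs of a medaka gene."""
    rows = con.execute(
        "SELECT zebrafish_gene_id FROM orthology WHERE medaka_gene_id = ? "
        "ORDER BY zebrafish_gene_id", (gene_id,))
    return [r[0] for r in rows]
```

Calling `patterns_for_gene(build_demo_db(), "MEDAKA_GENE_1", structure="eye")` returns the gene and regulatory element records annotated to the same structure side by side, which is the kind of comparison the shared annotation vocabulary is meant to support.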

DATA ACCESS

The MEPD data are stored in a MySQL database and accessed through a Java web application, built with JavaServer Faces and JBoss RichFaces, that runs on a Tomcat server. Gene expression and regulatory element expression data are queried via separate forms, although the same filters can be applied in both cases, namely gene, annotated anatomical structure and stage. Data can also be submitted through the web interface. This functionality is available only to registered users, but anyone can register an account. We have significantly improved the data entry forms with a tabular view of the clones or constructs. This view allows sorting and multi-column filtering by gene name or ID and by the name of the clone or construct. After an item is selected in the table, the information related to that item is shown in the right panel, where the corresponding fields can be modified or completed. To ease the submission of large amounts of expression data we have also implemented a bulk upload function. In this case, the user fills in the required information in a spreadsheet, which can be sent to us together with the corresponding pictures. We will manually check the data for consistency and coherence and then include them on behalf of the submitter. The MEPD user manual and a template spreadsheet for bulk data upload can be downloaded from the ‘Links’ section of MEPD online (http://mepd.cos.uni-heidelberg.de/mepd/forms/links.jsf).

AUTOMATIC ANNOTATION

We have used an automated pipeline to associate each clone sequence with the proper ENSEMBL gene. This pipeline is based on blat alignments (17) of the sequences in MEPD to the cDNA sequences of the genes annotated in ENSEMBL, or to our unpublished annotation based on RNA-seq data. In the future we will use the same pipeline to update the gene references when new versions of the medaka genome assembly or the ENSEMBL gene annotation are published. For new data entries, the user is responsible for assigning the proper gene name and ID to each record. However, when an automatic update produces conflicting cases, in which the gene assigned by the user and the automatically assigned gene differ, we will perform a manual evaluation and, if possible, contact the responsible user.
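A pipeline of this kind could be sketched as follows. This is not the authors' pipeline: the text does not state how hits are scored or filtered, so the rule used here (keep the hit with the most matching bases, provided it covers an assumed minimum fraction of the clone sequence) is an illustrative choice, and the function names are hypothetical. The sketch reads BLAT's tab-separated PSL output and, for simplicity, assumes that target sequence names are already gene identifiers.

```python
from typing import Dict, Iterable, List, Tuple


def best_hits(psl_lines: Iterable[str], min_match_fraction: float = 0.9) -> Dict[str, str]:
    """Map each query (clone) name to the target name of its best PSL hit.

    PSL columns used (0-based): 0 = matches, 9 = qName, 10 = qSize, 13 = tName.
    The 0.9 threshold is an assumed default, not a value from the paper.
    """
    best: Dict[str, Tuple[str, int]] = {}
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the optional PSL header and any malformed line.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches = int(fields[0])
        q_name, q_size, t_name = fields[9], int(fields[10]), fields[13]
        if q_size == 0 or matches / q_size < min_match_fraction:
            continue
        if q_name not in best or matches > best[q_name][1]:
            best[q_name] = (t_name, matches)
    return {q_name: t_name for q_name, (t_name, _) in best.items()}


def find_conflicts(
    user_assigned: Dict[str, str], auto_assigned: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """List (clone, user_gene, auto_gene) where the two assignments differ.

    Clones without an automatic assignment are not reported as conflicts.
    """
    return sorted(
        (clone, user_gene, auto_assigned[clone])
        for clone, user_gene in user_assigned.items()
        if clone in auto_assigned and auto_assigned[clone] != user_gene
    )
```

The list returned by `find_conflicts` corresponds to the cases that would be passed on for manual evaluation.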

FUTURE DIRECTIONS

With the improved data entry interface and bulk uploads we expect the volume of data uploaded to MEPD to increase significantly. In particular, we aim to host an amount of expression data for regulatory sequences comparable to that for gene expression. This information will be very important for researchers who want to create transgenic fish lines with specific spatio-temporal expression domains. To ease this task, we also plan to include in MEPD information about stable transgenic lines from laboratories willing to share them with the rest of the community. As mentioned before, we foresee that, by combining gene and regulatory element expression patterns, MEPD, with medaka as a model organism, can become a primary resource for deciphering and understanding transcriptional regulation in vertebrates.
  17 in total

1.  Six3 overexpression initiates the formation of ectopic retina.

Authors:  F Loosli; S Winkler; J Wittbrodt
Journal:  Genes Dev       Date:  1999-03-15       Impact factor: 11.361

2.  BLAT--the BLAST-like alignment tool.

Authors:  W James Kent
Journal:  Genome Res       Date:  2002-04       Impact factor: 9.043

Review 3.  Medaka--a model organism from the far East.

Authors:  Joachim Wittbrodt; Akihiro Shima; Manfred Schartl
Journal:  Nat Rev Genet       Date:  2002-01       Impact factor: 53.242

4.  MEPD: a resource for medaka gene expression patterns.

Authors:  Thorsten Henrich; Mirana Ramialison; Beate Wittbrodt; Beatrice Assouline; Franck Bourrat; Anja Berger; Heinz Himmelbauer; Takashi Sasaki; Nobuyoshi Shimizu; Monte Westerfield; Hisato Kondoh; Joachim Wittbrodt
Journal:  Bioinformatics       Date:  2005-05-06       Impact factor: 6.937

5.  MEPD: a Medaka gene expression pattern database.

Authors:  Thorsten Henrich; Mirana Ramialison; Rebecca Quiring; Beate Wittbrodt; Makoto Furutani-Seiki; Joachim Wittbrodt; Hisato Kondoh
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

6.  The medaka draft genome and insights into vertebrate genome evolution.

Authors:  Masahiro Kasahara; Kiyoshi Naruse; Shin Sasaki; Yoichiro Nakatani; Wei Qu; Budrul Ahsan; Tomoyuki Yamada; Yukinobu Nagayasu; Koichiro Doi; Yasuhiro Kasai; Tomoko Jindo; Daisuke Kobayashi; Atsuko Shimada; Atsushi Toyoda; Yoko Kuroki; Asao Fujiyama; Takashi Sasaki; Atsushi Shimizu; Shuichi Asakawa; Nobuyoshi Shimizu; Shin-Ichi Hashimoto; Jun Yang; Yongjun Lee; Kouji Matsushima; Sumio Sugano; Mitsuru Sakaizumi; Takanori Narita; Kazuko Ohishi; Shinobu Haga; Fumiko Ohta; Hisayo Nomoto; Keiko Nogata; Tomomi Morishita; Tomoko Endo; Tadasu Shin-I; Hiroyuki Takeda; Shinichi Morishita; Yuji Kohara
Journal:  Nature       Date:  2007-06-07       Impact factor: 49.962

7.  Combining computational prediction of cis-regulatory elements with a new enhancer assay to efficiently label neuronal structures in the medaka fish.

Authors:  Emmanuel Mongin; Thomas O Auer; Franck Bourrat; Franziska Gruhl; Ken Dewar; Mathieu Blanchette; Joachim Wittbrodt; Laurence Ettwiller
Journal:  PLoS One       Date:  2011-05-27       Impact factor: 3.240

8.  REDfly v3.0: toward a comprehensive database of transcriptional regulatory elements in Drosophila.

Authors:  Steven M Gallo; Dave T Gerrard; David Miner; Michael Simich; Benjamin Des Soye; Casey M Bergman; Marc S Halfon
Journal:  Nucleic Acids Res       Date:  2010-10-21       Impact factor: 16.971

9.  De novo genesis of enhancers in vertebrates.

Authors:  Michael P Eichenlaub; Laurence Ettwiller
Journal:  PLoS Biol       Date:  2011-11-01       Impact factor: 8.029

10.  VISTA Enhancer Browser--a database of tissue-specific human enhancers.

Authors:  Axel Visel; Simon Minovitsky; Inna Dubchak; Len A Pennacchio
Journal:  Nucleic Acids Res       Date:  2006-11-27       Impact factor: 16.971
