FlyTED: the Drosophila Testis Gene Expression Database.

Jun Zhao, Graham Klyne, Elizabeth Benson, Elin Gudmannsdottir, Helen White-Cooper, David Shotton.

Abstract

FlyTED, the Drosophila Testis Gene Expression Database, is a biological research database for gene expression images from the testis of the fruit fly Drosophila melanogaster. It currently contains 2762 mRNA in situ hybridization images and ancillary metadata revealing the patterns of gene expression of 817 Drosophila genes in testes of wild type flies and of seven meiotic arrest mutant strains in which spermatogenesis is defective. This database has been built by adapting a widely used digital library repository software system, EPrints (http://eprints.org/software/), and provides both web-based search and browse interfaces, and programmatic access via an SQL dump, OAI-PMH and SPARQL. FlyTED is available at http://www.fly-ted.org/.

Year:  2009        PMID: 19934263      PMCID: PMC2808924          DOI: 10.1093/nar/gkp1006

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Our activities

We have determined the mRNA expression patterns of genes involved in spermatogenesis in the Drosophila testis, including many that show differences in expression level between wild type flies and meiotic arrest mutant strains exhibiting abnormal spermatogenesis. Gene expression studies in the Drosophila testis have the advantage of a clear correlation between the position of the developing germ cell within the elongated testis and its developmental stage (see diagram at http://www.fly-ted.org/images/Spermatogenesis_diagram.png). It is thus possible to infer at what developmental stage a particular gene product is likely to act, simply by observing its mRNA expression pattern. We have created the Drosophila Testis Gene Expression Database (FlyTED; http://www.fly-ted.org) to provide web access to these gene expression images and their metadata, including the primer sequences that ultimately define the gene product being localized (1). This research and development, ongoing since 2003 with funding from BBSRC and the JISC, has led to the publication of expression images of 817 genes, about 10% of all genes expressed in the testis and the male genital tract. It has also given new understanding of post-meiotic gene expression through the discovery of two hitherto unknown classes of Drosophila genes, named ‘comets’ and ‘cups’, whose expression shows characteristic sub-cellular localization patterns that demonstrated previously unknown post-meiotic gene expression (2,3).

Other Drosophila image databases

There are a number of other public databases of Drosophila gene expression images, mostly showing expression in the Drosophila embryo, that complement FlyTED. These include the Berkeley Drosophila Genome Project expression database (BDGP; http://www.fruitfly.org/) (4); Fly-FISH (http://fly-fish.ccbr.utoronto.ca/), a new database of mRNA localization patterns at the subcellular level during early Drosophila embryogenesis determined by fluorescence in situ hybridization (5); FlyView (http://flyview.uni-muenster.de/), that contains pictures from enhancer-trap lines; FlyMove (http://flymove.uni-muenster.de/) providing didactic images, movies and interactive diagrams of the embryonic development of Drosophila melanogaster (6); FlyEx (http://flyex.ams.sunysb.edu/FlyEx/), showing embryonic segmentation gene expression patterns (7); FlyBrain (http://flybrain.neurobio.arizona.edu/), an online Drosophila nervous system atlas that contains some gene expression data revealed by antibody labelling; FlyProt (http://www.flyprot.org/), an exon trap database containing Drosophila gene expression images from all stages of development and tissue types; and FlyTrap (http://flytrap.med.yale.edu/), a protein trap database (8). FlyBase (http://flybase.org/), the definitive global database of information concerning the genes and genomes of several Drosophila species including D. melanogaster, while containing no gene expression image data, is of relevance to all of the aforementioned image databases. Sites that integrate these distributed heterogeneous resources include FlyMine (http://www.flymine.org/) and 4DXpress (http://4dx.embl.de/4DXpress/) (9,10). Additionally, we have created a small demonstration system, OpenFlyData (http://www.openflydata.org), to show how Semantic Web technologies can be used to integrate information from FlyTED, FlyBase, FlyAtlas (http://www.flyatlas.org) (11) and BDGP into a single user interface ‘on the fly’ (12). 
A number of global testis-specific microarray studies have been published (13–15). We have conducted our own microarray analysis to compare gene expression in wild type testes with that in several meiotic arrest mutants (White-Cooper,H., unpublished data), and have used these data, in conjunction with the published array and EST data, to identify testis-expressed genes. A subset of these testis-expressed genes was selected for analysis by mRNA in situ hybridization. Most of the genes selected for analysis were dependent on the meiotic arrest genes for full expression in testes, while others were expressed independently of the meiotic arrest genes.

DATABASE METHODS

About the dataset

The primary spermatocyte stage of Drosophila spermatogenesis lasts ∼3.5 days, and is characterized by extensive cell growth, associated with activation of expression of a large repertoire of testis-specific genes. Typically, primary spermatocytes transcribe genes required in the primary spermatocytes themselves and in the spermatids that develop from them. The transcripts required after meiosis are stabilized and stored in the cytoplasm in a translationally repressed state for up to 4 days (16). In Drosophila spermatogenesis, meiotic cell cycle progression is linked to spermatid differentiation by the function of the meiotic arrest genes. Mature primary spermatocytes in testes from a meiotic arrest mutant male arrest during differentiation, and show no signs of entering either the meiotic divisions or spermatid differentiation (17,18). These meiotic arrest genes fall into two phenotypic classes—aly-class (aly, comr, topi, tomb and achi/vis) (19–24), and can-class (can, mia, sa, nht, rye) (25,26). The failure of meiotic arrest mutant germ cells to progress past the mature primary spermatocyte stage is due to failure to activate expression of genes required for meiotic cell cycle progression (e.g. twine) and for spermatid differentiation (e.g. fzo) (27). One of our aims has been to determine the expression patterns for genes that require the meiotic arrest genes for their expression, in comparison with those whose expression in testes is independent of the meiotic arrest genes. To achieve this, since the mutant testes are morphologically easily distinguished from wild type testes, the two genotypes are mixed and stained in the same hybridization well. Thus, although mRNA in situ hybridization is not quantitative, qualitative judgements of gene expression level in mutant versus wild type can be made on the basis of side-by-side comparisons.

Data acquisition

The experimental methodology used to obtain the gene expression images within FlyTED is summarized at http://www.fly-ted.org/meth.html, and is more fully documented by White-Cooper (28). In brief, testes from young male Drosophila (0–1 day old) were dissected, hybridized to probes specific for the gene under study, stained and then examined using DIC microscopy, typically using a 10 × objective magnification. Images were captured with a digital colour camera and were not subjected to post-capture digital manipulations. For each gene, pictures were taken of at least one wild type and one mutant strain testis, with additional pictures, including higher magnification views, being taken if the staining pattern looked interesting. Images were also acquired if staining occurred in the somatic cells of the testis.

Metadata structure

Every FlyTED image was annotated manually at the time of capture by the biologist concerned, with metadata that is compliant with the emerging MISFISHIE standard (29). The gene expression pattern revealed in each image is described using controlled vocabulary terms from the Drosophila Anatomy Ontology (http://www.obofoundry.org/cgi-bin/detail.cgi?id=fly_anatomy). In addition to the gene name, each image is also annotated with the FlyBase identification number (gene id) that uniquely identifies each gene, which is linked to the corresponding FlyBase gene report page, and with the CG (Computed Gene) number, by which biologists can search FlyTED if they are not familiar with the gene name used.
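
As an illustration of the annotation described above, the following minimal Python sketch models one image record. The class name, field names and example values are assumptions made for this sketch only; they are not the actual FlyTED or EPrints schema, and the FlyBase identifier shown is a placeholder rather than a real id.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ImageRecord:
    """One annotated image (illustrative fields, not the real FlyTED schema)."""

    gene_name: str
    cg_number: str        # CG (Computed Gene) number
    flybase_gene_id: str  # placeholder value below, not a real FlyBase id
    strain: str           # e.g. "wild type" or a meiotic arrest mutant such as "aly"
    expression_terms: Tuple[str, ...] = ()  # controlled terms from the anatomy ontology

    def search_keys(self) -> Tuple[str, ...]:
        """Identifiers by which a biologist could look this record up."""
        return (self.gene_name, self.cg_number, self.flybase_gene_id)


# Made-up example values, for illustration only.
record = ImageRecord(
    gene_name="CG18628",
    cg_number="CG18628",
    flybase_gene_id="FBGN_PLACEHOLDER",
    strain="wild type",
    expression_terms=("terminal epithelial cell",),
)
```

Keeping the gene name, CG number and FlyBase id side by side in each record is what allows a search by any one of them to reach the same image.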

Database content

FlyTED, the Drosophila Testis Gene Expression Database, was constructed by customizing an instance of the EPrints open source repository software system (http://eprints.org/software/), as detailed on the ‘About the Database’ page of the FlyTED website. Currently, the database contains 2762 mRNA in situ hybridization images and ancillary data revealing the patterns of expression of 817 individual genes involved in spermatogenesis in the testis of the fruit fly, D. melanogaster, both in flies with normal spermatogenesis (wild type; typically, we used the strain red e), and in seven meiotic arrest mutant strains of flies exhibiting abnormal spermatogenesis: aly (always early), achi/vis (achintya and vismay), can (cannonball), comr (cookie monster), nht (no hitter), tomb (tombola) and topi (matotopetli). Full details of the alleles used are at http://www.fly-ted.org/meth.html#strain (19,21,22,24,30). The database also contains a small number of images of testis gene expression in Drosophila pseudoobscura. For most genes, the PCR primer sequences used (designed from genomic sequences) and the predicted sequence of the PCR reaction are also included in the database. The Drosophila thumbnail images and accompanying metadata contained within the FlyTED Database are published as Open Access Data under the Creative Commons CC0 1.0 Universal License, in conformity with the Science Commons Protocol for Implementing Open Access Data (http://sciencecommons.org/projects/publishing/open-access-data-protocol/), and may be freely downloaded and reused for any purpose, or aggregated with third party metadata using tools such as OpenFlyData.org, without attribution. In contrast, the higher resolution Drosophila testis gene expression images contained within the FlyTED Database are published under the Creative Commons Attribution 2.0 UK: England & Wales License, which permits reuse only if each image is attributed to Dr Helen White-Cooper. 
Further details concerning licensing and attribution are given on the ‘About the Database Licenses’ page of the FlyTED website. The database is not open for public data submission.

DATABASE ACCESS METHODS

Browsing interface

Users can browse FlyTED by gene name, strain name or gene expression location. For example, the ‘Browse by Gene Name’ view groups all the FlyTED images according to the gene names associated with the images, and constructs a Web page listing the 817 gene names currently in our database, in alphabetical order. Each name links to one of 817 Web pages in which thumbnails of all images recorded for that gene name are displayed, ordered by strain name, each annotated with a caption containing its gene and strain name. This allows users to compare the images from different strains relating to a single gene together in a single page. Hovering the mouse over the centre of a thumbnail gives a pop-up box containing a description of the staining pattern. Once an individual image record has been selected, the user can view both the full-size image (by clicking on the thumbnail) and also a detailed information page containing an intermediate-sized image, descriptive metadata about the image and a link to the FlyBase database (by clicking on the image caption). A similar presentation of images is given in three other FlyTED browse views: the ‘Browse by CG Number’ view, which groups images by the CG number; the ‘Browse by Strain Name’ view, which groups images by the strain of fly from which the images were acquired; and the ‘Browse by Expression Location’ view, which groups images by the pattern of gene expression revealed in the images. In the last case, because our images are annotated using controlled terms from the extended Fly Anatomy Ontology, users can browse images using the hierarchical structure of the ontology, as shown in Figure 1. The number next to each term indicates how many images in FlyTED are annotated using that term and its sub-terms.
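
The per-term counts described above (images annotated with a term or any of its sub-terms) can be sketched as follows. This is a minimal illustration, not the FlyTED implementation: the hierarchy fragment, the image ids and the function names are all hypothetical, and each term is assumed to have a single parent, whereas a real ontology may give a term several parents.

```python
from collections import defaultdict

# Hypothetical fragment of a term hierarchy: child term -> parent term.
PARENT = {
    "elongating spermatid": "spermatid",
    "spermatid": "germ cell",
    "primary spermatocyte": "germ cell",
}

# Hypothetical annotations: image id -> set of terms used to annotate it.
ANNOTATIONS = {
    "img1": {"elongating spermatid"},
    "img2": {"spermatid"},
    "img3": {"primary spermatocyte", "elongating spermatid"},
}


def term_and_ancestors(term):
    """Return the term followed by all of its ancestors, nearest first."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT.get(term)
    return chain


def count_images_per_term(annotations):
    """Count images annotated with each term or with any of its sub-terms."""
    images_by_term = defaultdict(set)
    for image_id, terms in annotations.items():
        for term in terms:
            for covered in term_and_ancestors(term):
                # A set ensures an image is counted once per term, even if
                # it reaches that term through several of its annotations.
                images_by_term[covered].add(image_id)
    return {term: len(ids) for term, ids in images_by_term.items()}


counts = count_images_per_term(ANNOTATIONS)
```

Collecting image ids in sets before counting matters for images such as `img3`, which reaches the term "germ cell" through two different annotations but should contribute only once to its count.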

Figure 1. FlyTED browse results, presented as an array of captioned image thumbnails, for the gene expression location Cup-like_pattern_of_distal_end_of_elongating_spermatids.

On the FlyTED home page, in addition to a general description of the database and a few exemplar images, users can also find links to pages providing details of the dataset and the database, other Drosophila resources such as FlyBase, and further relevant information. The footer on all pages of the database displays the license statements given above.

Search interfaces

We provide both a simple and an advanced search interface to permit users to make specific queries across the database content. The simple search interface allows querying for images by gene name, CG number or FlyBase gene id. Queries for multiple genes can be achieved by separating the names with commas. The advanced search interface (Figure 2A) supports more complex queries, allowing users to search across multiple gene names, and/or strain names, and/or gene expression locations. The image results are presented as a tiled array of captioned thumbnails (Figure 2B), allowing users to compare them side by side. Again, enlarged images can be viewed by clicking on the thumbnail images, and metadata can be displayed by clicking on the captions.

Figure 2. Example of an advanced search: a search for images of genes CG18628 and MtnA that are expressed in a terminal epithelial cell. (A) The interface for entering the search conditions. (B) The search results with image thumbnails.
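
The behaviour of the two search interfaces can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the EPrints search code: the record layout, the placeholder identifiers and the function names are hypothetical, the record values are made up, and the advanced search is assumed to combine the different fields with AND while treating comma-separated values within one field as alternatives.

```python
# Hypothetical records with made-up values; PLACEHOLDER identifiers are not real.
RECORDS = [
    {"gene_name": "MtnA", "cg_number": "CG_PLACEHOLDER_1",
     "flybase_id": "FBGN_PLACEHOLDER_1", "strain": "wild type",
     "locations": {"terminal epithelial cell"}},
    {"gene_name": "CG18628", "cg_number": "CG18628",
     "flybase_id": "FBGN_PLACEHOLDER_2", "strain": "aly",
     "locations": {"primary spermatocyte"}},
    {"gene_name": "GENE_PLACEHOLDER", "cg_number": "CG_PLACEHOLDER_3",
     "flybase_id": "FBGN_PLACEHOLDER_3", "strain": "wild type",
     "locations": {"terminal epithelial cell"}},
]


def split_terms(query):
    """Split a comma-separated query into a set of lower-cased terms."""
    return {term.strip().lower() for term in query.split(",") if term.strip()}


def simple_search(query, records):
    """Match each term against gene name, CG number or FlyBase gene id."""
    wanted = split_terms(query)
    hits = []
    for record in records:
        keys = {record["gene_name"].lower(),
                record["cg_number"].lower(),
                record["flybase_id"].lower()}
        if wanted & keys:
            hits.append(record)
    return hits


def advanced_search(records, genes="", strains="", locations=""):
    """Filter by gene, strain and location; empty criteria are ignored."""
    gene_terms = split_terms(genes)
    strain_terms = split_terms(strains)
    location_terms = split_terms(locations)
    hits = []
    for record in records:
        if gene_terms and record["gene_name"].lower() not in gene_terms:
            continue
        if strain_terms and record["strain"].lower() not in strain_terms:
            continue
        record_locations = {loc.lower() for loc in record["locations"]}
        if location_terms and not (location_terms & record_locations):
            continue
        hits.append(record)
    return hits


simple_hits = simple_search("CG18628, MtnA", RECORDS)
advanced_hits = advanced_search(
    RECORDS, genes="CG18628, MtnA", locations="terminal epithelial cell"
)
```

With these made-up records, the simple search returns both named genes, while the advanced search narrows the result to the records that also carry the requested expression location.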

Programmatic access

In addition to the conventional Web interfaces permitting human access to the FlyTED Database, programmatic access is provided as detailed in the ‘How to use the Database’ page on the FlyTED website, involving either a database SQL dump, OAI-PMH access (http://www.openarchives.org/OAI/openarchivesprotocol.html), or queries against a SPARQL endpoint (http://openflydata.org/query/flyted) (31).
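
As a rough illustration of the last two access routes, the sketch below builds request URLs for an OAI-PMH `ListRecords` request and for a SPARQL query sent by HTTP GET. The SPARQL endpoint is the one quoted above; the OAI-PMH base URL is an assumption (a path commonly used by EPrints installations) and should be checked against the ‘How to use the Database’ page. The query itself is deliberately generic, because the vocabulary used in the FlyTED RDF data is not described here. The code only constructs URLs and does not contact any server.

```python
from urllib.parse import urlencode

# Endpoint quoted in the text above.
SPARQL_ENDPOINT = "http://openflydata.org/query/flyted"

# ASSUMPTION: a typical EPrints OAI-PMH path, not confirmed for FlyTED.
OAI_BASE_URL = "http://www.fly-ted.org/cgi/oai2"


def oai_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL.

    'verb' and 'metadataPrefix' are standard OAI-PMH arguments, and
    'oai_dc' (unqualified Dublin Core) is the format every OAI-PMH
    repository is required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def sparql_query_url(endpoint, query):
    """Build a SPARQL protocol GET URL carrying the query in 'query'."""
    return endpoint + "?" + urlencode({"query": query})


# A generic query that makes no assumption about the data vocabulary.
GENERIC_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

oai_url = oai_list_records_url(OAI_BASE_URL)
sparql_url = sparql_query_url(SPARQL_ENDPOINT, GENERIC_QUERY)
```

The resulting URLs could be fetched with any HTTP client; whether the endpoints still respond, and which result formats they return, would need to be verified separately.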

DATABASE INTEROPERABILITY

In FlyTED, we provide links to FlyBase, the central Drosophila genomic database. More flexible cross searching of Drosophila information is available in OpenFlyData (http://openflydata.org), our demonstration Drosophila data web application, which allows scientists to cross search for Drosophila gene expression information from FlyTED, BDGP and FlyAtlas using any synonyms of a gene, either individually, by a batch of gene names or by gene expression profiles. Data integration between distributed resources containing heterogeneous data is a difficult task for which various approaches have previously been proposed (32). Our novel use of Semantic Web technologies in OpenFlyData has proven their value in promoting interoperability between the data resources, and in lowering the cost of development. For this, accurate cross-database mapping of gene names and identifiers was a key prerequisite (12). However, the maintenance of such mapping between different identifiers in a reliable way, during the ongoing churn of database revisions and updates so eloquently described by Stein (32), presents a separate problem. In recent papers (33,34), we have proposed methods employing a set of RDF patterns called Named Graphs (http://www.w3.org/2004/03/trix/) (35) that can be adopted to express provenance information about data identifier mappings and to record nomenclature changes. Adoption of these patterns would permit database updates to be documented in machine-processable ways, and would allow third-party annotations made using an old nomenclature to be interpreted correctly in terms of a revised or updated nomenclature.
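
The idea of recording identifier changes in named graphs can be illustrated with a small Python sketch. It is not the set of RDF patterns proposed in the cited papers: the quad representation, the `ex:` identifiers, the predicate names and the function name are all hypothetical. Each rename statement is placed in a named graph, and the graph name is what provenance statements (who asserted the mapping, in which release) can refer to.

```python
from typing import List, NamedTuple, Tuple


class Quad(NamedTuple):
    """A statement (subject, predicate, obj) held in a named graph."""
    graph: str
    subject: str
    predicate: str
    obj: str


RENAMED_TO = "ex:renamedTo"  # hypothetical predicate for a nomenclature change

# Hypothetical data: two successive renames, each in its own named graph,
# plus provenance statements made about those graphs.
QUADS = [
    Quad("ex:graph/release-2", "ex:gene/OLD_ID", RENAMED_TO, "ex:gene/NEW_ID"),
    Quad("ex:graph/release-3", "ex:gene/NEW_ID", RENAMED_TO, "ex:gene/NEWER_ID"),
    Quad("ex:graph/provenance", "ex:graph/release-2", "ex:assertedBy", "ex:curator/A"),
    Quad("ex:graph/provenance", "ex:graph/release-3", "ex:assertedBy", "ex:curator/B"),
]


def resolve_current(identifier: str, quads: List[Quad]) -> Tuple[str, List[str]]:
    """Follow rename statements from an old identifier to the current one.

    Returns the current identifier and the named graphs that justified
    each step, so the provenance of the mapping can be inspected.
    """
    renames = {q.subject: (q.obj, q.graph) for q in quads if q.predicate == RENAMED_TO}
    trail = []
    seen = {identifier}
    while identifier in renames:
        identifier, graph = renames[identifier]
        if identifier in seen:
            raise ValueError("cyclic rename chain at " + identifier)
        seen.add(identifier)
        trail.append(graph)
    return identifier, trail


current, graphs_used = resolve_current("ex:gene/OLD_ID", QUADS)
```

An annotation made against `ex:gene/OLD_ID` can then be reinterpreted against `current`, and `graphs_used` identifies the graphs whose provenance statements explain why.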

CONCLUSION

We report the creation by our Image Bioinformatics Research Group of FlyTED, a biological database for images of gene expression in the testis of D. melanogaster obtained by our Drosophila Spermatogenesis Research Group, the biological significance of which has been reported in the papers referenced above. FlyTED was created by adapting an existing software system, EPrints, and provides both human and programmatically accessible interfaces to the images and their metadata. In collaboration with the curators of the Fly Anatomy Ontology, we have corrected and expanded that section of the ontology dealing with the male reproductive system, to permit appropriate descriptions to be made in FlyTED of the germ cell developmental stages in which particular genes are expressed. FlyTED data acquisition is largely complete, although we are continuing work on a number of genes not yet characterized in testis that will be added to FlyTED at a later date. This work has prompted us to consider novel solutions to problems of database interoperability, including the creation of OpenFlyData, a data web to integrate Drosophila gene expression information.

FUNDING

UK Biotechnology and Biological Sciences Research Council (BBSRC BB/C503903/1 to H.W.-C. and D.S., BB/E018068/1 to D.S. and H.W.-C., BB/D009324/1 to H.W.-C.), Joint Information Systems Committee for the Defining Image Access Project and the FlyWeb Project (grant numbers not used). Funding for open access charge: Departmental funds. Conflict of interest statement. None declared.
  30 in total

1.  Spermatogenesis: analysis of meiosis and morphogenesis.

Authors:  Helen White-Cooper
Journal:  Methods Mol Biol       Date:  2004

2.  Regulation of transcription of meiotic cell cycle and terminal differentiation genes by the testis-specific Zn-finger protein matotopetli.

Authors:  Lucia Perezgasga; JianQiao Jiang; Benjamin Bolival; Mark Hiller; Elizabeth Benson; Margaret T Fuller; Helen White-Cooper
Journal:  Development       Date:  2004-04       Impact factor: 6.868

3.  Linked data and provenance in biological data webs.

Authors:  Jun Zhao; Alistair Miles; Graham Klyne; David Shotton
Journal:  Brief Bioinform       Date:  2008-12-06       Impact factor: 11.622

4.  Post-meiotic transcription in Drosophila testes.

Authors:  Carine Barreau; Elizabeth Benson; Elin Gudmannsdottir; Fay Newton; Helen White-Cooper
Journal:  Development       Date:  2008-04-23       Impact factor: 6.868

5.  Flytrap, a database documenting a GFP protein-trap insertion screen in Drosophila melanogaster.

Authors:  Reed J Kelso; Michael Buszczak; Ana T Quiñones; Claudia Castiblanco; Stacy Mazzalupo; Lynn Cooley
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

6.  Requirement for two nearly identical TGIF-related homeobox genes in Drosophila spermatogenesis.

Authors:  Zhaohui Wang; Richard S Mann
Journal:  Development       Date:  2003-07       Impact factor: 6.868

7.  Comet and cup genes in Drosophila spermatogenesis: the first demonstration of post-meiotic transcription.

Authors:  Carine Barreau; Elizabeth Benson; Helen White-Cooper
Journal:  Biochem Soc Trans       Date:  2008-06       Impact factor: 5.407

Review 8.  FlyMove--a new way to look at development of Drosophila.

Authors:  Katrin Weigmann; Robert Klapper; Thomas Strasser; Christof Rickert; Gerd Technau; Herbert Jäckle; Wilfried Janning; Christian Klämbt
Journal:  Trends Genet       Date:  2003-06       Impact factor: 11.639

9.  4DXpress: a database for cross-species expression pattern comparisons.

Authors:  Yannick Haudry; Hugo Berube; Ivica Letunic; Paul-Daniel Weeber; Julien Gagneur; Charles Girardot; Misha Kapushesky; Detlev Arendt; Peer Bork; Alvis Brazma; Eileen E M Furlong; Joachim Wittbrodt; Thorsten Henrich
Journal:  Nucleic Acids Res       Date:  2007-10-04       Impact factor: 16.971

10.  FlyMine: an integrated database for Drosophila and Anopheles genomics.

Authors:  Rachel Lyne; Richard Smith; Kim Rutherford; Matthew Wakeling; Andrew Varley; Francois Guillier; Hilde Janssens; Wenyan Ji; Peter Mclaren; Philip North; Debashis Rana; Tom Riley; Julie Sullivan; Xavier Watkins; Mark Woodbridge; Kathryn Lilley; Steve Russell; Michael Ashburner; Kenji Mizuguchi; Gos Micklem
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583
