
SuperFly: a comparative database for quantified spatio-temporal gene expression patterns in early dipteran embryos.

Damjan Cicin-Sain, Antonio Hermoso Pulido, Anton Crombach, Karl R Wotton, Eva Jiménez-Guri, Jean-François Taly, Guglielmo Roma, Johannes Jaeger.

Abstract

We present SuperFly (http://superfly.crg.eu), a relational database for quantified spatio-temporal expression data of segmentation genes during early development in different species of dipteran insects (flies, midges and mosquitoes). SuperFly has a special focus on emerging non-drosophilid model systems. The database currently includes data of high spatio-temporal resolution for three species: the vinegar fly Drosophila melanogaster, the scuttle fly Megaselia abdita and the moth midge Clogmia albipunctata. At this point, SuperFly covers up to 9 genes and 16 time points per species, with a total of 1823 individual embryos. It provides an intuitive web interface, enabling the user to query and access original embryo images, quantified expression profiles, extracted positions of expression boundaries and integrated datasets, plus metadata and intermediate processing steps. SuperFly is a valuable new resource for the quantitative comparative study of gene expression patterns across dipteran species. Moreover, it provides an interesting test set for systems biologists interested in fitting mathematical gene network models to data. Both of these aspects are essential ingredients for progress toward a more quantitative and mechanistic understanding of developmental evolution.
© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2014        PMID: 25404137      PMCID: PMC4383950          DOI: 10.1093/nar/gku1142

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

One of the central challenges of current biology is to understand how differences in developmental processes between species lead to the wide variety of organismic patterns and forms we observe in nature. Developmental processes are influenced by gene regulatory networks and their associated expression dynamics (see, for example, (1)). A mechanistic understanding of developmental evolution therefore requires systematic and quantitative comparative studies of gene regulatory networks based on a combination of experimental data and mathematical modeling (2). An important prerequisite for this kind of study is to carefully quantify gene expression patterns in different species of interest. Here, we focus on spatio-temporal gene expression patterns during early development in several dipteran insects (dipterans include flies, midges and mosquitoes). Specifically, our SuperFly database (http://superfly.crg.eu) includes expression data for segmentation genes during the cleavage and blastoderm stages of development in three species: our reference, the vinegar fly Drosophila melanogaster (family: Drosophilidae) (3–5), as well as two emerging non-drosophilid model organisms, the scuttle fly Megaselia abdita (Phoridae) (6–11) and the moth midge Clogmia albipunctata (Psychodidae) (10,12–14). Currently, SuperFly contains spatio-temporal expression data for three maternal co-ordinate genes (bicoid, bcd; hunchback, hb; and caudal, cad), as well as the trunk gap genes (hb; Krüppel, Kr; knirps, kni/knirps-like, knl; and giant, gt), and terminal gap genes tailless (tll) and huckebein (hkb). Maternal co-ordinate and gap genes constitute the top-most regulatory tier of the segmentation gene network that lays down the molecular pre-pattern underlying the dipteran body plan along the anterior-posterior (A–P) axis of the embryo (15–17). The gap gene network in D. melanogaster is ideally suited as a starting point for quantitative, comparative approaches, since it is one of the most thoroughly studied developmental gene regulatory networks today (18).
Our expression data are of high spatial and temporal accuracy and resolution (19,20). They consist of (i) carefully curated and staged images of blastoderm embryos stained against one or two mRNA gene products by whole-mount in situ hybridization (13,19,21); (ii) the associated quantified spatial gene expression profiles classified by gene (see above) and developmental time point (cleavage cycles 1–14A, C1–C14A; cleavage cycle C14A is subdivided into eight time classes, C14-T1–T8) (3,11,22); and (iii) integrated spatio-temporal expression data, averaged across all embryos stained for a given gene at a given time point (19,21). The current numbers of genes, time points and embryos in SuperFly are shown in Table 1. As a representative example, our D. melanogaster dataset consists of expression patterns for 9 genes at 15 developmental time points, comprising a total of 762 stained embryos.
Table 1.

SuperFly database content: number of genes, time points and embryos for each species.

Species        D. melanogaster    M. abdita    C. albipunctata
Genes          9                  8            5
Time points    15                 16           NA (5) [a]
Embryos        762                938          123

[a] NA means ‘not available’. Embryos of C. albipunctata (13) were collected prior to the establishment of a time classification scheme that is comparable to the other two species (14). For this reason, we used an alternative scheme for classifying embryos into five time classes based on carefully timed embryo fixations (21).

SuperFly distinguishes itself from existing fly-oriented databases, such as FlyBase (with its associated BDGP in situ database) (23), FlyMine (24), FlyAtlas (25), FlyEx (26,27) and FlyView (28), because of its unique combination of quantitative spatio-temporal gene expression patterns and its focus on emerging non-drosophilid model systems. SuperFly has a higher temporal resolution than many drosophilid datasets (e.g. the BDGP in situ database, and the 3D expression atlas presented in (29)). Our data are based on enzymatic in situ hybridization and wide-field microscopy (19,20). These do not quite reach the accuracy of FlyEx, which contains data based on immunofluorescence in combination with confocal microscopy (22,30), nor of a related spatio-temporal expression dataset in C. albipunctata (31). On the other hand, SuperFly provides equivalent, high spatial and temporal resolution compared to these datasets, and a much wider, more comprehensive, coverage of genes in non-drosophilid flies. Other cross-species databases either remain limited to drosophilids (29), or only contain microarray and/or RNA-seq data without a spatial component (e.g. FlyBase). For these reasons, we believe that SuperFly constitutes a valuable new resource for the communities of fly developmental geneticists and evolutionary developmental biologists.

DATABASE STRUCTURE AND CONTENT

SuperFly is implemented as a MySQL (http://www.mysql.com) relational database, embedded in a standard LAMP (Linux, Apache, MySQL and PHP) environment. The database is composed of 3 core and 16 auxiliary tables (Figure 1a). Core tables store embryo images, extracted expression profiles and position of expression boundaries. Auxiliary tables contain additional information concerning data annotation, classification and processing.
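The relational layout described above can be illustrated with a minimal sketch. Only the three core table names (embryo, profile, manual_slope) are taken from the text; every column name, the helper functions and the toy rows are assumptions made for illustration, and SQLite (via Python's standard library) stands in for the MySQL server that SuperFly actually uses.

```python
import sqlite3

# Hypothetical, simplified schema: table names follow the text,
# all column names are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE embryo (
    id INTEGER PRIMARY KEY,
    species TEXT NOT NULL,
    gene TEXT NOT NULL,
    time_class TEXT NOT NULL
);
CREATE TABLE profile (
    id INTEGER PRIMARY KEY,
    embryo_id INTEGER NOT NULL REFERENCES embryo(id),
    ap_position REAL NOT NULL,
    intensity REAL NOT NULL
);
CREATE TABLE manual_slope (
    id INTEGER PRIMARY KEY,
    profile_id INTEGER NOT NULL REFERENCES profile(id),
    boundary_position REAL NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding a few toy embryo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO embryo (id, species, gene, time_class) VALUES (?, ?, ?, ?)",
        [
            (1, "Drosophila melanogaster", "Kr", "C14-T7"),
            (2, "Drosophila melanogaster", "Kr", "C14-T8"),
            (3, "Drosophila melanogaster", "hb", "C14-T7"),
            (4, "Megaselia abdita", "Kr", "C14-T7"),
        ],
    )
    return conn


def find_embryos(conn, species, genes, time_classes):
    """Return ids of embryos matching one species and any of the given
    genes and time classes, using bound parameters throughout."""
    gene_marks = ",".join("?" * len(genes))
    time_marks = ",".join("?" * len(time_classes))
    sql = (
        "SELECT id FROM embryo WHERE species = ? "
        f"AND gene IN ({gene_marks}) "
        f"AND time_class IN ({time_marks}) ORDER BY id"
    )
    rows = conn.execute(sql, [species, *genes, *time_classes]).fetchall()
    return [row[0] for row in rows]
```

A call such as `find_embryos(conn, "Drosophila melanogaster", ["Kr"], ["C14-T7", "C14-T8"])` mirrors the kind of multi-selection query the web interface composes, although the real query logic is not described in the text.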
Figure 1.

SuperFly database schema. (a) Cyan boxes indicate core tables, pink boxes are auxiliary tables that store metadata in a structured manner. (b) Core tables: embryo represents a single stained embryo (shown with lateral orientation: anterior is to the left, dorsal up); profile represents a one-dimensional A–P expression profile extracted from a specific staining; manual_slope stores information on clamped splines that mark the position of expression domain boundaries. See text for details.

Embryo images, expression profiles and expression domain boundaries

The embryo core table contains wide-field microscopy images of laterally aligned dipteran embryos at blastoderm stage (Figure 1b). These embryos are stained for one or two segmentation gene products (mRNA) using an enzymatic (colorimetric) whole-mount in situ hybridization protocol as previously described (19). Images of stained embryos were processed using our FlyGUI quantification tool (20) in order to extract mRNA expression profiles along the A–P axis. The corresponding tabulated expression data are stored in the profile core table (Figure 1b). Finally, we determined the position of expression domain boundaries by fitting clamped spline curves to the extracted expression profiles (20). Spline data are stored in the manual_slope core table (Figure 1b). Information on data annotation and classification (species and gene names, developmental time class, genotype: wild-type or mutant/RNAi knock-down, comments on embryo morphology and alignment/orientation, etc.) as well as intermediate processing steps (nuclear stains used for time classification, binary embryo masks, lateral band along the midline used for profile extraction, etc.) are stored in auxiliary tables (Figure 1a).
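To give a feel for what a clamped spline boundary looks like, the sketch below uses the simplest possible case: a single cubic segment between two knots with zero slope at both ends. The knot placement, the zero end slopes and the half-height readout of the boundary position are all assumptions made for this illustration; the text does not specify how FlyGUI parameterizes its splines, and the function names are hypothetical.

```python
def clamped_cubic(x, x0, y0, x1, y1):
    """Cubic through (x0, y0) and (x1, y1) with zero slope at both knots.

    Outside the knot interval the curve is held flat at the end values.
    """
    if x <= x0:
        return y0
    if x >= x1:
        return y1
    t = (x - x0) / (x1 - x0)
    # Cubic Hermite interpolation with both end derivatives set to zero.
    return y0 + (y1 - y0) * (3 * t**2 - 2 * t**3)


def half_height_position(x0, y0, x1, y1, tol=1e-9):
    """Bisect for the position where the curve is halfway between y0 and y1.

    Assumes y0 != y1, so the segment is strictly monotonic between the knots.
    """
    target = (y0 + y1) / 2
    lo, hi = x0, x1
    rising = y1 > y0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        value = clamped_cubic(mid, x0, y0, x1, y1)
        # Keep the half-interval that still contains the crossing.
        if (value < target) == rising:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For a boundary rising from 0 to 1 between 40% and 50% A–P position, the half-height point lies at the midpoint of the segment, because this particular cubic is symmetric about its centre.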

Integrated expression data

For each of the three dipteran species that are currently available in SuperFly, we integrated expression domain boundaries of the four trunk gap genes (hb, Kr, gt, kni) (18) across embryos for each gene and time class. This was achieved by pooling data points into 13 (C11), 25 (C12), 50 (C13) and 100 (C14A) bins along the A–P axis and averaging the resulting sets of measurements (19,30). These data were further post-processed and re-scaled to facilitate comparison between species as previously described (19). The resulting data are not part of the database schema described above, but can be accessed through a separate tab on the SuperFly web site (see below).
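The pooling and averaging step can be sketched as follows. The bin counts per cleavage cycle come from the text; the function name, the representation of data points as (position, value) pairs, the 0–100% axis convention and the use of equal-width bins are assumptions, and the subsequent post-processing and re-scaling described in (19) are not reproduced here.

```python
# Number of A-P bins per cleavage cycle, as stated in the text.
BINS_PER_CYCLE = {"C11": 13, "C12": 25, "C13": 50, "C14A": 100}


def integrate_profiles(profiles, n_bins, axis_length=100.0):
    """Pool (position, value) points from several embryos into equal-width
    bins along the A-P axis and return the mean value per bin.

    Bins that receive no data points are reported as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    width = axis_length / n_bins
    for profile in profiles:
        for position, value in profile:
            if not 0.0 <= position <= axis_length:
                raise ValueError(f"position {position} lies outside the axis")
            # A point exactly at the posterior end falls into the last bin.
            index = min(int(position / width), n_bins - 1)
            sums[index] += value
            counts[index] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

With real data one would call, for example, `integrate_profiles(embryo_profiles, BINS_PER_CYCLE["C13"])` once per gene and time class.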

DATABASE ACCESS AND WEB SITE FEATURES

The web interface of SuperFly is implemented using PHP (http://php.net), JavaScript and GNUPlot (http://www.gnuplot.info). It provides easy access to embryo images/quantified expression profiles, measured boundary positions and integrated datasets through three corresponding tabs in the header area (Figure 2a). Basic and advanced search criteria are provided on the left. A fourth tab, named ‘Data Overview’, allows the user to generate a summary of the database contents per species. In addition, the main page provides a basic introduction to the species and the data in SuperFly, with brief instructions on how to query the database. Additional help and documentation are accessible via the ‘Help’ and ‘About’ hyperlinks next to the four tabs in the header area.
Figure 2.

Example of a SuperFly embryo search. (a) After selecting Drosophila melanogaster as the species, the gene Krüppel (Kr), and two time classes (C14-T7, T8), 22 embryos are retrieved and shown. (b) For each embryo, the user can display and download detailed information via the ‘Details...’ hyperlink. Selecting the first embryo brings up a panel containing annotation metadata, intermediate processing results and the gene expression boundaries extracted from the selected expression profile. (c) It is possible to visually compare expression boundaries between different species via the ‘Draw all in one’ option in the ‘Boundaries’ tab.

For embryo images and boundaries, search queries are composed by selecting a species, genotype (RNAi knock-down), gene and time class from the lists displayed on the left-hand side of the screen. In order to allow for a flexible workflow, the user can select multiple genes, knock-down backgrounds and/or time classes. Additional search options are available under the ‘Toggle extra options...’ hyperlink, where the user can restrict the search, for example, to embryo images with stains of the highest quality only. For basic queries, we have provided default values for these advanced options, which should suffice for regular use of SuperFly. Query results for the embryo search are displayed in a single column and divided over multiple pages when there are more than 15 items (Figure 2a). For each embryo matching a query, we show a brightfield image, the extracted gene expression profile and the corresponding expression boundaries (fitted splines). Further information on each embryo is available by following the ‘Details...’ hyperlink (Figure 2b).
This brings up an overlaying panel that contains all available embryo images (brightfield and DIC images, nuclear counterstain and membrane morphology for time classification, as well as binary embryo masks with and without the position of the midline area), together with annotation on the gene that is displayed, the quality of the embryo (e.g. is it damaged or misshapen? is it oriented laterally?), the quality of the binary mask, the quality of the stain (background levels, strength of signal, etc.) and the extracted one-dimensional expression profile with corresponding expression domain boundaries represented by clamped splines (Figure 2b). For the boundary search, a single plot displays all selected boundaries at a given time point that match a given query. In this plot, boundaries from individual embryos are shown in black, while median boundary positions are colored by gene. The boundaries of multiple species can be plotted and compared in a single graph (per time class), by ticking the option ‘Draw all in one’. In this manner, the user can explore common features and differences between species in a visual, qualitative manner (Figure 2c). For further off-line analysis, the user can download results on selected embryos by means of compressed ZIP archives. Individual images and on-the-fly generated plots of expression profiles and boundaries are also available for download in PDF format by clicking on the corresponding icon next to the displayed data. Finally, integrated datasets are available for download as simple text tables and have been visualized by 2D and 3D plots for use in presentations and lectures.
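The median boundary positions shown in the boundary plot can be computed along the following lines. This is only a sketch: the function name and input format are hypothetical, the numbers in the usage note are made up, and a real dataset would need a label that distinguishes the individual boundaries of each expression domain (for instance anterior versus posterior) rather than the gene name alone.

```python
from collections import defaultdict
from statistics import median


def median_boundaries(measurements):
    """Group (label, boundary_position) pairs from individual embryos of one
    time class by label and return the median position for each label."""
    by_label = defaultdict(list)
    for label, position in measurements:
        by_label[label].append(position)
    return {label: median(positions) for label, positions in by_label.items()}
```

For example, `median_boundaries([("Kr", 42.0), ("Kr", 44.0), ("Kr", 49.0)])` returns `{"Kr": 44.0}`; the median is less sensitive than the mean to a single outlying embryo.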

FUTURE DEVELOPMENT OF SuperFly

In terms of content, SuperFly contains expression data for maternal co-ordinate and trunk gap genes in three dipteran species. Apart from wild-type patterns, we also provide a range of RNAi knock-down experiments for M. abdita. We plan to extend the range of our data in various directions in the future. First of all, we will include additional non-drosophilid dipteran species. Suitable candidates are the marmalade hoverfly Episyrphus balteatus (32,33), the midge Chironomus riparius or the malaria mosquito Anopheles gambiae (34). Second, we will extend SuperFly with data for additional segmentation genes—such as pair-rule and segment-polarity genes—and factors involved in dorso-ventral patterning (35–37). Finally, SuperFly will be extended to include genes that are active during later stages of development, after the onset of gastrulation (38–41). In terms of the web interface, we plan to add enhanced visualization tools for quantitative and qualitative comparisons of gene expression domains, as the database comes to include more genes and species.

CONCLUSION

SuperFly (http://superfly.crg.eu) provides a detailed and accurate dataset on spatio-temporal expression of segmentation genes in various dipteran species. It contains expression data from an established laboratory model, the vinegar fly D. melanogaster, plus data from two emerging non-drosophilid models: the scuttle fly M. abdita and the moth midge C. albipunctata. SuperFly is the first multi-species database for the comparative study of spatio-temporal expression data that includes non-drosophilid dipterans. It allows evolutionary and developmental biologists to assess quantitative differences in timing and position of expression features between species. This is of central importance for understanding the evolution of developmental processes. In addition, our dataset provides an interesting resource for systems biologists interested in network inference from complex spatio-temporal expression data. It is unique in its high temporal and spatial resolution, and the large number of replicate measurements for each boundary providing estimates of positional variability. This provides an ideal test case for developing model-fitting algorithms and methods for parameter identifiability analysis (42–45). Finally, SuperFly aims to be a prototype for systematic, rigorous solutions to the growing need for structured image storage and analysis. With its intuitive web interface, we provide the communities of fly developmental geneticists and evolutionary developmental biologists with access not only to integrated, processed datasets, but also to the raw imaging data and intermediate processing results generated in our laboratory.