Andrew J Robinson1, Muluneh Tamiru2, Rachel Salby3, Clayton Bolitho3, Andrew Williams3, Simon Huggard3, Eva Fisch3, Kathryn Unsworth4, James Whelan2, Mathew G Lewsey5.
Abstract
BACKGROUND: The genome-wide expression profile of genes in different tissues/cell types and developmental stages is a vital component of many functional genomic studies. Transcriptome data obtained by RNA-sequencing (RNA-Seq) is often deposited in public databases that are made available via data portals. Data visualization is one of the first steps in assessment and hypothesis generation. However, these databases do not typically include visualization tools and establishing one is not trivial for users who are not computational experts. This, as well as the various formats in which data is commonly deposited, makes the processes of data access, sharing and utility more difficult. Our goal was to provide a simple and user-friendly repository that meets these needs for data-sets from major agricultural crops.

DESCRIPTION: AgriSeqDB (https://expression.latrobe.edu.au/agriseqdb) is a database for viewing, analysing and interpreting developmental and tissue/cell-specific transcriptome data from several species, including major agricultural crops such as wheat, rice, maize, barley and tomato. The disparate manner in which public transcriptome data is often warehoused and the challenge of visualizing raw data are both major hurdles to data reuse. The popular eFP browser does an excellent job of presenting transcriptome data in an easily interpretable view, but previous implementation has been mostly on a case-by-case basis. Here we present an integrated visualisation database of transcriptome data-sets from six species that did not previously have public-facing visualisations. We combine the eFP browser, for gene-by-gene investigation, with the Degust browser, which enables visualisation of all transcripts across multiple samples. The two visualisation interfaces launch from the same point, enabling users to easily switch between analysis modes.
The tools allow users, even those without bioinformatics expertise, to mine into data-sets and understand the behaviour of transcripts of interest across samples and time. We have also incorporated an additional graphic download option to simplify incorporation into presentations or publications.
Entities:
Keywords: Agriculture; Barley; Database; Gene-expression; Maize; RNA-Seq; Rice; Tomato; Transcriptomics; Visualisation; Wheat
Year: 2018 PMID: 30231853 PMCID: PMC6146512 DOI: 10.1186/s12870-018-1406-2
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Fig. 1 High-level structure of AgriSeqDB showing the linkage between data browsers and the central Landing Portal. The Landing Portal provides a central place to access all data-sets and provides meta-data that isn’t provided by the data browsers. The data browsers provide access to the same data in various forms to enable greater insight
Fig. 2 AgriSeqDB home screen showing the six data-sets currently in the database, from species that include crops of major agricultural importance
RNA-Seq data-sets included in AgriSeqDB
| Data-set | Species | Tissue/cell type | Developmental stage/treatment | Data source | Reference |
|---|---|---|---|---|---|
| Seed germination | Arabidopsis | Whole seed | 0 to 48 h post stratification | GSE94459 | [ |
| Seed germination | Barley | aleurone, starchy endosperm, embryo, scutellum, pericarptesta, husk and crushed cell layers | 0 to 24 h | PRJNA378132 | [ |
| Endosperm development | Maize | Different cell types of endosperm (embryo, nucellus, placento-chalazal region, pericarp, and the vascular region of the pedicel) | 8 d after pollination | GSE62778 | [ |
| Seed germination and coleoptile growth | Rice | Embryo and coleoptile | 0 h to 4 d | GSE115373 | [ |
| Grain/endosperm development | Bread wheat | starchy | 10, 20, or 30 days post anthesis (DPA) | E-MTAB-2137 | [ |
| Fruit development | Tomato | Fruit | Mature ripe fruits | GSE75273 | [ |
Fig. 3 Full screenshot showing AT2G40170 gene expression in the GeneView (eFP) browser. The user uses the search form at the top to select the gene of interest and the mode of operation: (1) absolute, which shows the counts as stored in the database for the primary gene; (2) relative, which shows the counts relative to the control for the primary gene; and (3) compare, which shows counts as a ratio between the primary and secondary genes. Clicking the view button updates the figure below to show the expression level of each sample by colour coding the fill area on a red-yellow scale (for absolute) or a red-grey-blue scale (for relative and compare). Alternatively, the user can click the download button (indicated by a green arrowhead) to download the expression image at twice the on-screen resolution (ready for publication). Data is from the transcriptome of Arabidopsis seeds during germination [20]
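The three display modes described for GeneView can be illustrated with a small sketch. This is not the eFP browser's code: the function names, the pseudocount and the choice of a log2 ratio for the relative and compare modes are assumptions made for illustration (the caption only says "relative to the control" and "ratio"; a log2 ratio is one common way to obtain the signed values that a red-grey-blue scale implies).

```python
import math

# Assumption: a small pseudocount keeps ratios defined when a count is zero.
PSEUDOCOUNT = 1.0


def log2_ratio(numerator, denominator, pseudocount=PSEUDOCOUNT):
    """Signed log2 ratio of two counts (hypothetical helper)."""
    return math.log2((numerator + pseudocount) / (denominator + pseudocount))


def display_value(mode, primary, control=None, secondary=None):
    """Return the value to colour-code for one sample.

    mode      -- "absolute", "relative" or "compare"
    primary   -- count of the primary gene in this sample
    control   -- count of the primary gene in the control sample
    secondary -- count of the secondary gene in this sample
    """
    if mode == "absolute":
        # Counts as stored, no transformation.
        return primary
    if mode == "relative":
        if control is None:
            raise ValueError("relative mode needs a control count")
        return log2_ratio(primary, control)
    if mode == "compare":
        if secondary is None:
            raise ValueError("compare mode needs a secondary gene count")
        return log2_ratio(primary, secondary)
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    print(display_value("absolute", 7.0))
    print(display_value("relative", 7.0, control=1.0))
    print(display_value("compare", 3.0, secondary=3.0))
```

Under these assumptions a value of zero in the relative or compare mode means no difference, which would map to the grey midpoint of a diverging colour scale.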
Fig. 4 Screenshot of the GeneExplore (Degust) browser and subsequent result pages. a The Arabidopsis time-series data-set is shown here as an example, displaying transcripts that are up-regulated in S samples but down-regulated in SL samples (Top panel). The user can select which samples they wish to see with the checkboxes in the top left of the screen, along with the method of analysis (voom/limma, edgeR, or voom). In the top right, the user can control the rendering and thresholds using the Options dialog. All genes that match the filters above are shown in a heat-map, which clusters genes with similar levels of expression (Middle panel). Moving the mouse over each gene highlights it in the plots above. All matching genes are also listed in a table with the expression levels for each sample, the false discovery rate and any extra annotation columns provided in the data-set (Lower panel). In the top centre the user can limit genes by using one of three interactive plots, and the parallel coordinates plot allows the user to limit genes by their log fold gene expression (per sample). b Example of an MA plot. Users can limit genes by drawing a box around genes on the MA plot; the two samples used for the MA plot are specified in the Options dialog (top right). c An MDS plot showing groupings of the individual replicates of each sample. Data is from the transcriptome of Arabidopsis seed during germination [20]
Fig. 5 The process of uploading a data-set to a local installation of AgriSeqDB consists of three steps. First, the user selects a unique identifier and display name for the data-set (a). Second, the user chooses the count file and the various eFP images used to display expression values from their PC to upload (b). Finally, the user can alter many settings that control how the data-set is displayed in the landing portal and each data-viewer, as shown in Additional file 1 (see Supplementary Fig. 1). Most settings have sensible defaults; where user input is required, there are time-saving tools (such as a colour picker and a clickable image for eFP settings) or spreadsheet import/export (sample settings)
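For readers unfamiliar with the quantities behind the MA plot and the threshold filters, the following is a minimal sketch using the standard definitions: M is the log2 fold change between two samples and A is their mean log2 expression. It does not reproduce Degust's own implementation, which relies on the voom/limma and edgeR methods; the function names, the pseudocount, the default thresholds and the row layout (`fdr`, `logfc` keys) are assumptions for illustration only.

```python
import math


def ma_values(count_a, count_b, pseudocount=1.0):
    """Return (M, A) for one gene measured in two samples.

    M = log2(b) - log2(a)        (log fold change, y-axis of an MA plot)
    A = (log2(a) + log2(b)) / 2  (mean log expression, x-axis)
    The pseudocount is an assumption that keeps log2 defined for zero counts.
    """
    log_a = math.log2(count_a + pseudocount)
    log_b = math.log2(count_b + pseudocount)
    return log_b - log_a, (log_a + log_b) / 2.0


def filter_genes(rows, max_fdr=0.05, min_abs_logfc=1.0):
    """Keep rows passing both an FDR cut-off and a fold-change threshold.

    Each row is a dict with hypothetical keys "gene", "fdr" and "logfc".
    """
    return [
        row for row in rows
        if row["fdr"] <= max_fdr and abs(row["logfc"]) >= min_abs_logfc
    ]


if __name__ == "__main__":
    # Placeholder rows, not data from AgriSeqDB.
    rows = [
        {"gene": "gene_a", "fdr": 0.01, "logfc": 2.5},
        {"gene": "gene_b", "fdr": 0.20, "logfc": 3.0},
        {"gene": "gene_c", "fdr": 0.01, "logfc": -0.4},
    ]
    print(ma_values(3, 15))
    print([row["gene"] for row in filter_genes(rows)])
```

Drawing a box on an MA plot amounts to keeping genes whose (A, M) pair falls inside a rectangle, which could be expressed as an additional range check on the values returned by `ma_values`.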