Peng Ken Lim, Xinghai Zheng, Jong Ching Goh, Marek Mutwil.
Abstract
There are now more than 300 000 RNA sequencing samples available, stemming from thousands of experiments capturing gene expression in organs, tissues, developmental stages, and experimental treatments for hundreds of plant species. The expression data have great value, as they can be re-analyzed by others to ask and answer questions that go beyond the aims of the study that generated the data. Because gene expression provides essential clues to where and when a gene is active, the data provide powerful tools for predicting gene function, and comparative analyses allow us to study plant evolution from a new perspective. This review describes how we can gain new knowledge from gene expression profiles, expression specificities, co-expression networks, differential gene expression, and experiment correlation. We also introduce and demonstrate databases that provide user-friendly access to these tools.
Keywords: co-expression; comparative transcriptomics; databases; differential expression; gene expression; gene function
Year: 2022 PMID: 35605200 PMCID: PMC9284291 DOI: 10.1016/j.xplc.2022.100323
Source DB: PubMed Journal: Plant Commun ISSN: 2590-3462
Figure 1. Data mining workflows using the different onboard functions/tools of plant transcriptomic databases.
Colored edges connect different aims and databases to one type of function/tool.
Summary of the featured databases
| Databases | Notable onboard functionalities/statistical methods |
|---|---|
| ePlant by BAR | gene expression profiles as eFP heatmaps |
| Expression Atlas | gene expression profiles as heatmaps |
| GENEVESTIGATOR | gene expression profiles in the form of boxplots, scatterplots, or heatmaps |
| CoNekT-Plants | gene expression profiles as bar charts and heatmaps |
| Expression Angler by BAR | hosts thematic co-expression data from well-curated samples of relevant studies |
| ATTED-II | hosts comprehensive co-expression data amalgamated from both RNA-seq and microarray samples |
| CoNekT-Plants | hosts comprehensive co-expression data from RNA-seq samples |
| AtCAST | enables users to search for control/treatment DGE comparisons based on single- or multi-gene query |
| Expression Atlas | enables users to search for control/treatment DGE comparisons based on single- or multi-gene query |
| GENEVESTIGATOR | enables users to search for control/treatment DGE comparisons based on single- or multi-gene query |
| Rice Expression Database | enables housekeeping and specific genes to be identified based on the stability of expression across different conditions |
| CoNekT-Plants | features a tool to identify specific genes in a tissue of choice |
| GENEVESTIGATOR | features a tool to identify housekeeping genes based on the stability of expression across a set of user-defined conditions |
| AtCAST | enables users to search for correlated experiments based on an experimental condition or user-uploaded gene expression data |
| AtCAST | hosts species data for |
| ATTED-II | hosts species data for |
| CoNekT-Plants | hosts species data for |
| ePlant by BAR | hosts species data for |
| Expression Angler by BAR | hosts species data for |
| Expression Atlas | hosts species data for |
| GENEVESTIGATOR | hosts species data for |
| Rice Expression Database | hosts species data for |
DGE, differential gene expression; eFP, electronic pictograph; PCC, Pearson’s correlation coefficient; MR, mutual rank; HRR, highest reciprocal rank; GO, Gene Ontology; DEGs, differentially expressed genes; SPM, specificity measure.
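The co-expression measures abbreviated above (PCC, MR, HRR) can be illustrated with a small sketch. The record only names these measures, so the definitions used here are assumptions based on their common usage: MR is taken as the geometric mean of the two directional ranks, and HRR as the larger of the two. The function names and the toy matrix are hypothetical and do not reflect how any of the listed databases implement their pipelines.

```python
import numpy as np

def pcc_matrix(expr):
    """Pearson correlation between all gene pairs; expr is genes x samples."""
    return np.corrcoef(np.asarray(expr, dtype=float))

def rank_matrix(pcc):
    """ranks[a, b] = position of gene b in gene a's co-expression list.

    Rank 1 is the most strongly correlated partner. A gene is pushed to the
    end of its own list so that it never counts as its own best partner.
    """
    scores = pcc.copy()
    np.fill_diagonal(scores, -np.inf)
    order = np.argsort(-scores, axis=1)          # best partner first
    ranks = np.empty_like(order)
    rows = np.arange(scores.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, scores.shape[1] + 1)
    return ranks

def mutual_rank(ranks):
    """Assumed definition: geometric mean of rank(a->b) and rank(b->a)."""
    return np.sqrt(ranks * ranks.T)

def highest_reciprocal_rank(ranks):
    """Assumed definition: the larger (worse) of rank(a->b) and rank(b->a)."""
    return np.maximum(ranks, ranks.T)

# Toy expression matrix (4 hypothetical genes x 5 samples).
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 5.0, 4.0],
])
ranks = rank_matrix(pcc_matrix(expr))
mr = mutual_rank(ranks)
hrr = highest_reciprocal_rank(ranks)
```

Rank-based scores such as MR and HRR are symmetric by construction, whereas the directional ranks they are built from are not. Lower values indicate stronger mutual co-expression.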
Figure 2. Methods to visualize gene expression profiles, demonstrated using AG (AT4G18960).
(A) ePlant: Plant Viewer eFP. Each organ is colored to indicate gene expression level; yellow and red colors indicate low and high transcripts per million (TPM) values, respectively.
(B) CoNekT-Plants. The samples and gene expression levels are shown on the x and y axes, respectively. The bars and points indicate the mean and maximum/minimum expression.
(C) GENEVESTIGATOR developmental atlas. Points indicate mean values, and whiskers indicate standard errors.
(D) GENEVESTIGATOR: anatomy search tool. The different organs, tissues, and cell types have been arranged into logical hierarchies. The gene expression levels are indicated by a heatmap, where white and dark beige colors indicate low and high expression, respectively.
(E) Visualization of scRNA-seq data. Single-cell RNA-seq of A. thaliana cellulose synthase-like D3 (AT3G03050) visualized in ePlant. Each point depicts a single cell, and low and high expression values of the gene are represented by yellow and red colors, respectively.
Figure 3. Comparative gene expression analyses.
(A) Comparative expression heatmap of AG. Rows correspond to genes, and columns represent organs. Low and high expression values are represented by green and red cells, respectively. Black cells indicate missing data. Each row has been scaled so that its highest value is one.
(B) ePlant Navigator viewer. The cladogram captures phylogenetic relationships among genes, and the bars depict sequence and expression similarities of the genes to the query (AG).
(C) Phylogenetic tree of the AG orthogroup. The different species are represented by color-coded leaves (species without expression data are black). The expression levels in the organs are visualized as a heatmap, where low and high expression are indicated by yellow and dark blue colors, respectively. The different clades and their expression are indicated by colored boxes. The species are indicated by gene names: Zm (Zea mays), LOC (Oryza sativa), Bradi (Brachypodium distachyon), Zosma (Zostera marina), GS (Vitis vinifera), Sol (Solanum lycopersicum), AM (Amborella trichopoda), MA (Picea abies), and Gb (Ginkgo biloba).
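The row scaling described for the comparative heatmap in panel (A) can be sketched as follows. This is a minimal illustration: the function name is hypothetical, and representing missing data as NaN is an assumption made here to mirror the black cells, not a description of how the databases store their values.

```python
import numpy as np

def scale_rows_to_max(expr):
    """Divide each gene (row) by its own maximum so the row peaks at one.

    Missing values (NaN) are ignored when finding the maximum and stay NaN
    in the output. Rows whose maximum is zero are returned unchanged to
    avoid division by zero.
    """
    expr = np.asarray(expr, dtype=float)
    row_max = np.nanmax(expr, axis=1, keepdims=True)
    divisor = np.where(row_max > 0, row_max, 1.0)
    return expr / divisor

# Toy matrix: 3 hypothetical genes (rows) x 3 organs (columns).
scaled = scale_rows_to_max([
    [2.0, 4.0, 8.0],
    [0.0, 0.0, 0.0],
    [1.0, np.nan, 5.0],
])
```

Scaling each row independently makes expression patterns comparable across genes and species with very different absolute expression levels, at the cost of hiding those absolute differences.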
Figure 4. Different visualizations of co-expression networks.
(A) CoNekT-Plants: co-expression cluster 69 that contains AG and 74 other genes and is enriched in genes with GO:0080086 (stamen filament development), GO:0048443 (stamen development), GO:0048441 (petal development), and GO:0009733 (response to auxin). For brevity, only genes involved in flower development (yellow rectangle) and response to auxin (blue squares) are displayed. Genes from the same gene family are labeled with the same node shape and color.
(B) ATTED-II: co-expression network from the local co-expression neighborhood of AG. A thicker edge indicates a stronger correlation between genes. Red dotted lines indicate protein-protein interactions.
(C) CoNekT-Plants: cross-species comparison between cluster 69 in A. thaliana (green rectangle) and cluster 51 in V. vinifera (red rectangle). Blue dotted edges connect genes from the same gene family. Genes from the same gene family are labeled with the same node shape and color.