Sebastian Proost, Marek Mutwil.
Abstract
The recent accumulation of gene expression data in the form of RNA sequencing creates unprecedented opportunities to study gene regulation and function. Furthermore, comparative analysis of expression data from multiple species can reveal which functional gene modules are conserved across species, allowing the study of the evolution of these modules. However, performing such comparative analyses on raw data is not feasible for many biologists. Here, we present CoNekT (Co-expression Network Toolkit), an open source web server that contains user-friendly tools and interactive visualizations for comparative analyses of gene expression data and co-expression networks. These tools allow analysis and cross-species comparison of (i) gene expression profiles; (ii) co-expression networks; (iii) co-expressed clusters involved in specific biological processes; (iv) tissue-specific gene expression; and (v) expression profiles of gene families. To demonstrate these features, we constructed CoNekT-Plants for green alga, seed plants and flowering plants (Picea abies, Chlamydomonas reinhardtii, Vitis vinifera, Arabidopsis thaliana, Oryza sativa, Zea mays and Solanum lycopersicum) and thus provide a web tool with the broadest available collection of plant phyla. CoNekT-Plants is freely available from http://conekt.plant.tools, while the CoNekT source code and documentation can be found at https://github.molgen.mpg.de/proost/CoNekT/.
Year: 2018 PMID: 29718322 PMCID: PMC6030989 DOI: 10.1093/nar/gky336
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Gene page contents exemplified with Arabidopsis PsaD-2. The gene page provides information (as tables) and links (in blue) specific to the gene. The links allow quick access to the co-expression neighborhood, cluster, gene family and phylogenetic tree of PsaD-2.
Species included in CoNekT-Plants
| Organism | Genome source | Class | Number of samples (retained) | Number of nodes | Number of HCCA clusters |
|---|---|---|---|---|---|
| Arabidopsis thaliana | TAIR10 | Eudicot | 913 | 27 172 | 479 |
| Chlamydomonas reinhardtii | Phytozome v5.5 | Chlorophyceae | 605 | 17 741 | 273 |
| Oryza sativa | Phytozome v7.0 | Monocot | 750 | 39 717 | 662 |
| Picea abies | ConGenIE v1.0 | Pinopsida | 148 | 66 632 | 1814 |
| Solanum lycopersicum | ITAG 3.10 | Eudicot | 706 | 34 879 | 612 |
| Vitis vinifera | Genescope 12x | Eudicot | 612 | 26 346 | 499 |
| Zea mays | Ensembl Plants AGPv4 | Monocot | 574 | 39 000 | 728 |
The table indicates the genome source, phylogenetic class, number of RNA-seq samples that passed the LSTrAP quality control, number of nodes (genes) and number of co-expression clusters identified by the HCCA algorithm.
Figure 2. Comparative expression analysis. (A) Expression profiles of the Arabidopsis (genes starting with AT), rice (LOC), maize (Zm) and tomato (Solyc) PsaD gene family in roots, leaves, female reproduction (e.g. ovaries, stigma), stems, male reproduction (e.g. pollen, anthers) and fruits. The expression values of each gene were normalized by dividing by their maximum, and range from 1 (red, maximum expression) to 0 (green, no expression). Missing expression data is shown with a black box (e.g. female reproduction and stems for rice). (B) Phylogenetic tree of the Cellulose Synthase-like D (CSLD) gene family. The heatmap shows the expression level in different tissues, full red dots show genes with low expression and the bar on the right indicates the maximum expression level (TPM). The color of a gene identifier indicates the species. The added gray box contains genes that have shifted towards being expressed in roots. Note that OrthoFinder tree nodes do not contain bootstrap values and should be interpreted with care. Missing data is indicated by an absent box; for example, spruce has insufficient expression data to provide an informative expression profile.
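The per-gene normalization described for panel (A) can be illustrated with a minimal sketch. This is not the CoNekT implementation; the function name `normalize_by_max`, the tissue keys and the TPM values are hypothetical, and missing tissues are assumed to be represented as `None`.

```python
def normalize_by_max(profile):
    """Divide each tissue value by the gene's maximum expression.

    `profile` maps tissue name -> expression value (e.g. TPM), with None
    marking missing data. Missing values are passed through unchanged.
    """
    observed = [v for v in profile.values() if v is not None]
    peak = max(observed, default=0.0)
    if peak == 0:
        # Gene is not expressed anywhere (or has no data): avoid dividing by zero.
        return {t: (None if v is None else 0.0) for t, v in profile.items()}
    return {t: (None if v is None else v / peak) for t, v in profile.items()}


# Hypothetical values for a single gene; "stems" has no data.
example = {"roots": 2.0, "leaves": 80.0, "stems": None, "fruits": 20.0}
normalized = normalize_by_max(example)
```

After scaling, the tissue with the highest expression has the value 1, every other tissue falls between 0 and 1, and missing tissues remain missing so they can be drawn differently from true zeros.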
Figure 3. Comparative network analysis of AtPsaD-2 and Solyc06g054260. AtPsaD-2 and its tomato ortholog Solyc06g054260 are shown together with their co-expression neighborhoods (co-expressed genes are connected using solid gray lines). Nodes with the same shape and color are members of the same OrthoGroup. Orthologs found in both neighborhoods are connected with dashed blue lines. The indicated menus are used to change the node and edge labels, change the network layout and export the networks as images and Cytoscape-compatible data. For clarity, non-conserved nodes were made semi-transparent, and the two query genes are connected by a solid edge.
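The idea of linking orthologs found in both neighborhoods can be sketched as a lookup on shared OrthoGroup membership. The snippet below is only an illustration under stated assumptions, not the code used by CoNekT: the function `conserved_neighbors`, the gene identifiers and the OrthoGroup labels are all made up for the example.

```python
def conserved_neighbors(neighbors_a, neighbors_b, orthogroup):
    """Pair genes from two co-expression neighborhoods that share an OrthoGroup.

    `orthogroup` maps gene identifier -> OrthoGroup label; genes without an
    assignment are ignored.
    """
    # Index the second neighborhood by OrthoGroup for quick lookup.
    by_group = {}
    for gene in neighbors_b:
        group = orthogroup.get(gene)
        if group is not None:
            by_group.setdefault(group, []).append(gene)

    pairs = []
    for gene in neighbors_a:
        group = orthogroup.get(gene)
        for partner in by_group.get(group, []):
            pairs.append((gene, partner))
    return pairs


# Hypothetical identifiers and OrthoGroup labels, for illustration only.
orthogroup = {"At_g1": "OG1", "At_g2": "OG2", "Sl_g1": "OG1", "Sl_g2": "OG3"}
pairs = conserved_neighbors(["At_g1", "At_g2", "At_g3"], ["Sl_g1", "Sl_g2"], orthogroup)
```

Each returned pair corresponds to one dashed edge between the two neighborhoods; genes that appear in no pair are the non-conserved nodes.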