Francisco J Romero-Campero1, Ignacio Perez-Hurtado1, Eva Lucas-Reina2, Jose M Romero2, Federico Valverde3.
Abstract
BACKGROUND: Chlamydomonas reinhardtii is the model organism that serves as a reference for studies in algal genomics and physiology. It is of special interest in the study of the evolution of regulatory pathways from algae to higher plants. Additionally, it has recently gained attention as a potential source for bio-fuel and bio-hydrogen production. The genome of Chlamydomonas is available, facilitating the analysis of its transcriptome using RNA-seq data. This has produced a massive amount of data that remains fragmented, making it necessary to apply integrative approaches based on molecular systems biology.
Keywords: Chlamydomonas reinhardtii; RNA-seq; gene co-expression networks; green algae; light-regulated transcription factors and transcriptional regulators; molecular systems biology; transcriptomics
Year: 2016 PMID: 26968660 PMCID: PMC4788957 DOI: 10.1186/s12864-016-2564-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Network Construction and Topology Analysis. a Correlation Threshold Selection. The blue line shows that the scale-free model fit reaches a maximum at an absolute correlation value of 0.90. The red line shows that the average connectivity of the network decreases as the correlation threshold increases; nonetheless, at 0.90 it is still high. Accordingly, the correlation threshold used to generate ChlamyNET was fixed at 0.90. b The degree distribution of a scale-free network follows a power law with a negative exponent. The scale-free topology fit of ChlamyNET was computed using linear regression over the logarithmic transform of its degree distribution. c The clustering coefficient of a node or gene represents the degree of co-expression or correlation between its neighbours. Genes with a high clustering coefficient possess a high degree of co-expression or coordination among their co-expressed genes. ChlamyNET exhibits a high average clustering coefficient of 0.66. d ChlamyNET constitutes a small-world network, that is, a scale-free network with a high clustering coefficient. This is reflected in the fact that the average minimal path length between genes is 7.5
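The scale-free fit described in panel b can be illustrated with a minimal sketch. It assumes the fit is an ordinary least-squares line through log10 p(k) versus log10 k, computed on unbinned degree frequencies; the function name `scale_free_fit` and the toy degree list are hypothetical, and the authors' actual procedure (for example, any binning of degrees) is not specified in the caption.

```python
import math
from collections import Counter


def scale_free_fit(degrees):
    """Fit log10 p(k) = intercept + slope * log10 k by least squares.

    Returns (slope, r_squared). p(k) is the fraction of nodes with degree k.
    Nodes of degree 0 are ignored because log10(0) is undefined.
    Requires at least two distinct positive degrees.
    """
    positive = [k for k in degrees if k > 0]
    counts = Counter(positive)
    total = len(positive)
    ks = sorted(counts)
    xs = [math.log10(k) for k in ks]
    ys = [math.log10(counts[k] / total) for k in ks]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation of the fit
    return slope, r_squared


# Toy degree list (illustrative only): node counts proportional to k^-2.
toy_degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
slope, r_squared = scale_free_fit(toy_degrees)
print(f"slope = {slope:.2f}, R^2 = {r_squared:.2f}")
```

In a threshold scan such as the one in panel a, a function of this kind would be evaluated once per candidate correlation threshold, and the threshold with the best fit that still leaves a well-connected network would be retained.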
Fig. 2 Network Visualization, Hubs and Clustering Coefficient. a Graphical representation of ChlamyNET, consisting of 9171 genes or nodes and 139019 co-expression relationships or edges. It is organized into a major connected component, where most of the genes are located, and a multitude of small components. b Network hubs. Hub genes, characterized by being co-expressed with a large number of other genes, are represented in yellow. Note that hub genes are located in specific regions of the network. c Authoritative hubs. Hubs whose neighbours are themselves highly connected are mainly located at the core of the network. These authoritative hubs are represented in red. d The clustering coefficient of a gene measures the degree of co-expression among its co-expressed genes. Genes with a high clustering coefficient are coloured in darker blue than those with a low clustering coefficient. Notice that regions of genes with a high clustering coefficient overlap with those where hubs are located
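As a rough illustration of the clustering coefficient in panel d, the sketch below uses the standard local definition: the fraction of pairs of a gene's neighbours that are themselves connected. The adjacency dictionary and gene names are hypothetical toy data, and assigning 0 to genes with fewer than two neighbours is an assumed convention.

```python
from itertools import combinations


def clustering_coefficient(adjacency, node):
    """Local clustering coefficient of `node` in an undirected graph.

    `adjacency` maps each node to the set of its neighbours.
    Nodes with fewer than two neighbours get 0.0 (assumed convention).
    """
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    # Count neighbour pairs that are connected to each other.
    linked_pairs = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2.0 * linked_pairs / (k * (k - 1))


# Toy co-expression graph (illustrative only).
toy_network = {
    "geneA": {"geneB", "geneC", "geneD"},
    "geneB": {"geneA", "geneC"},
    "geneC": {"geneA", "geneB"},
    "geneD": {"geneA"},
}

for gene in sorted(toy_network):
    degree = len(toy_network[gene])
    print(gene, degree, round(clustering_coefficient(toy_network, gene), 2))
```

Here `geneA` has three neighbours, of which only one pair (`geneB`, `geneC`) is connected, so its coefficient is 1/3, while `geneD` has a single neighbour and therefore a coefficient of 0.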
Biological Process GO terms significantly enriched in the 1000 most authoritative hub genes in ChlamyNET
| GO term | Representative Genes | Potential Arabidopsis ortholog | Number of neighbours |
|---|---|---|---|
| protein phosphorylation GO:0006468 (p-value 2.6 × 10⁻¹¹) | Cre02.g108700 - Serine/Threonine Protein Kinase | At5g08160 | 245 |
| transmembrane transport GO:0055085 (p-value 3 × 10⁻⁷) | Cre09.g396000 - Nitrate Transporter | At1g12940 | 250 |
| response to light stimulus GO:0009416 (p-value 2 × 10⁻⁵) | Cre03.g182700 - Bbox Protein; Cre02.g118000 - Photolyase; Cre12.g510200 - bZIP Protein; g6302 - Constans-like; Cre06.g295200 - Cryptochrome | At5g15850 | 259 |
| carbohydrate metabolic process GO:0005975 (p-value 8 × 10⁻⁵) | Cre07.g336950 - Alpha-glucan phosphorylase; Cre08.g362450 - Alpha Amylase; g3160 - Isoamylase; Cre04.g215150 - Soluble Starch Synthase | At3g46970 | 257 |
| nitrogen compound metabolic process GO:0006807 (p-value 5.2 × 10⁻⁵) | Cre09.g410950 - Nitrate reductase; Cre09.g410750 - Nitrite Reductase; Cre03.g207250 - Glutamine synthetase | At1g37130 | 251 |
Fig. 3 Selection of the Clustering Algorithm and Number of Clusters Using the Clustering Silhouette as Criterion. a Algorithm and number of clusters selection. The absolute value of the Pearson correlation coefficient between gene expression profiles was used as the gene similarity measure in our clustering analysis. The performance of hierarchical clustering (HCLUST, red triangles) and partitioning around medoids (PAM, blue squares) was compared using the clustering silhouette for numbers of clusters ranging from 4 to 20. The highest silhouette value was reached by the PAM algorithm with nine clusters (marked with an arrow). b Silhouette for PAM with nine clusters. The silhouette of a clustering measures both the inter- and intra-cluster similarities. The best clustering silhouette, obtained with the PAM algorithm for nine clusters, is shown. Each horizontal line represents a gene in a given cluster. A high positive value indicates a gene with a high intra-cluster similarity and a low inter-cluster similarity, whereas a negative value indicates a gene with a low intra-cluster similarity and a high inter-cluster similarity. Genes belonging to the same cluster are represented with the same colour. For each cluster from one to nine, the number of genes and its average silhouette are specified
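The silhouette criterion in this caption can be sketched as follows. The example assumes that the dissimilarity between two genes is 1 - |r|, with r the Pearson correlation of their expression profiles; the caption only states that the absolute correlation was used as the similarity, so this conversion, the function names and the toy profiles are all assumptions. At least two clusters are required.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def silhouette_widths(profiles, labels):
    """Silhouette width of every gene, using 1 - |Pearson r| as distance."""
    def dist(a, b):
        return 1.0 - abs(pearson(profiles[a], profiles[b]))

    clusters = {}
    for gene in profiles:
        clusters.setdefault(labels[gene], []).append(gene)

    widths = {}
    for gene in profiles:
        own = [h for h in clusters[labels[gene]] if h != gene]
        if not own:
            widths[gene] = 0.0  # usual convention for singleton clusters
            continue
        # a: mean distance to the gene's own cluster (intra-cluster).
        a = sum(dist(gene, h) for h in own) / len(own)
        # b: mean distance to the closest other cluster (inter-cluster).
        b = min(
            sum(dist(gene, h) for h in members) / len(members)
            for label, members in clusters.items()
            if label != labels[gene]
        )
        denom = max(a, b)
        widths[gene] = (b - a) / denom if denom > 0 else 0.0
    return widths


# Toy expression profiles (illustrative only).
toy_profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [1, -1, 1, -1],
    "g4": [2, -2, 2, -2],
}
toy_labels = {"g1": "A", "g2": "A", "g3": "B", "g4": "B"}
widths = silhouette_widths(toy_profiles, toy_labels)
print(sum(widths.values()) / len(widths))  # average silhouette of the clustering
```

Comparing the average width across algorithms and numbers of clusters is the kind of selection the caption describes; a gene placed in the wrong cluster would receive a negative width.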
Fig. 4 ChlamyNET Clustering and Cluster Functional Annotation. In ChlamyNET each node represents a gene, and an edge between two genes indicates that the corresponding gene expression profiles exhibit an absolute Pearson correlation coefficient greater than 0.90. Edges therefore represent co-expression relationships. Blue edges stand for positive correlations, whereas pink edges stand for negative correlations. The nine gene clusters are identified by numbers and colours corresponding to the code in Fig. 3. Clusters are also annotated with the biological processes in which the corresponding genes are potentially involved
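The edge rule stated in this caption (an absolute Pearson correlation greater than 0.90, with the sign of the correlation kept as an edge attribute) can be sketched as below. The function name `build_coexpression_edges` and the toy profiles are hypothetical; real expression profiles would come from the RNA-seq data sets used to build ChlamyNET.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def build_coexpression_edges(profiles, threshold=0.90):
    """Return {(gene_a, gene_b): sign} for every pair with |r| > threshold.

    Each unordered pair is reported once, with gene_a < gene_b alphabetically.
    """
    genes = sorted(profiles)
    edges = {}
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = pearson(profiles[gene_a], profiles[gene_b])
            if abs(r) > threshold:
                edges[(gene_a, gene_b)] = "positive" if r > 0 else "negative"
    return edges


# Toy expression profiles across five conditions (illustrative only).
toy_profiles = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 11],   # rises with geneA
    "geneC": [5, 4, 3, 2, 1],    # mirrors geneA
    "geneD": [1, 5, 2, 4, 3],    # weakly related to the others
}

for pair, sign in build_coexpression_edges(toy_profiles).items():
    print(pair, sign)
```

The pairwise loop is quadratic in the number of genes, which is acceptable for a sketch; for thousands of genes a vectorised correlation matrix would be the more practical choice.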
Biological Process GO terms significantly enriched in the clusters of the gene co-expression network ChlamyNET and the Metabolic and Signalling Pathways contained in each cluster
| Cluster | Functional Annotation | Representative Genes | Metabolic/Signalling Pathways |
|---|---|---|---|
| Cluster 2 (Brown), 535 genes, Silhouette 0.44 | DNA replication (GO:0006260) | Cre01.g015250 - POLD1; Cre16.g651000 - RFA1 | Pyrimidine deoxyribonucleotides de novo biosynthesis pathway: Cre16.g667850 - DUT, Cre17.g715900 - THY, Cre03.g190800 - TMPK |
| | Chromosome organization (GO:0051276) | Cre02.g086650 - SMC2; Cre12.g493400 - SMC4 | |
| | Regulation of Cell Cycle (GO:0010564) | Cre10.g466200 - CYCAB1; Cre03.g207900 - CYCA1 | |
| Cluster 9 (Blue), 1058 genes, Silhouette 0.40 | protein phosphorylation (GO:0006468) | Cre17.g742400 - PTK17; Cre12.g537400 - CrAUR3 | Starch Biosynthetic Pathway: Cre04.g215150 - SSS; Sucrose Biosynthetic Pathway: Cre06.g283400 - SPP; Nitrogen Assimilation Pathway: Cre09.g410750 - NII1 |
| | carbohydrate metabolic process (GO:0005975) | Cre08.g384750 - AMY; Cre10.g444700 - SBE3 | |
| | transmembrane transport (GO:0055085) | Cre09.g396000 - NRT2.3; Cre13.g564650 - MRS5 | |
| Cluster 1 (Orange), 824 genes, Silhouette 0.38 | vesicle-mediated transport (GO:0016192) | Cre17.g728150 - Yky6; Cre16.g676650 - AP1G1 | TAG Biosynthetic Pathway: Cre02.g106400 - PDAT; Phospholipid Biosynthetic Pathway: Cre01.g035500 - PI3K; Coenzyme A Biosynthetic Pathway: Cre01.g048050 - COAB |
| | GTPase activity (GO:0043087) | Cre12.g532600 - CGL44; Cre07.g315350 - RABGAP | |
| | Autophagy (GO:0006914) | Cre09.g391500 - APG9 | |
| Cluster 3 (Red), 1723 genes, Silhouette 0.28 | protein phosphorylation (GO:0006468) | Cre02.g145500 - PTK24; Cre12.g498650 - ALK3 | TAG Biosynthetic Pathway: g9572 - DGAT1; Hydrogen production: Cre09.g396600 - HYDA2; MAP kinase cascade: Cre10.g461150 - CrMAPKKK |
| | ribosome biogenesis (GO:0042254) | Cre12.g532550 - RPL13a; Cre09.g400650 - RPS6 | |
| | macromolecule biosynthesis (GO:0009059) | Cre03.g207250 - GLN4 | |
| Cluster 4 (Purple), 1174 genes, Silhouette 0.26 | translation (GO:0006412) | Cre03.g199900 - EIF4E; Cre02.g117900 - RH | tRNA Charging Pathway: g2951 - TrpS; Amino Acid Biosynthesis: Cre03.g161400 - WSN2; Pentose Phosphate Non-oxidative: Cre12.g511900 - RPE1; TAG Biosynthetic Pathway: Cre03.g205050 - DGAT2 |
| | RNA processing (GO:0006396) | Cre16.g653050 - SpoU; Cre10.g421600 - ThrRS; g4679 - RNase P | |
| | lipid metabolism (GO:0006629) | Cre09.g397250 - FAD5; Cre06.g295250 - PAP | |
| Cluster 7 (Green), 909 genes, Silhouette 0.25 | protein complex assembly (GO:0006461) | g9912 - CSN5; Cre16.g663500 - CrRPN10 | Aerobic Respiration Pathway: Cre15.g638500 - CYC1; COP9 Signalling: g11578 - CSN6 |
| | response to misfolded protein (GO:0051788) | Cre06.g280850 - PSMB4; Cre12.g501200 - SKP1 | |
| Cluster 6 (Yellow), 1351 genes, Silhouette 0.24 | chromatin organization (GO:0006325) | g11636 - HDA; Cre13.g590750 - HTB37 | Chromatin Remodelling: Cre13.g591200 - HTB38, Cre13.g562400 - ABI3 |
| | posttranscriptional regulation (GO:0010608) | g7250 - DCL | |
| Cluster 5 (Dark Green), 567 genes, Silhouette 0.21 | response to heat (GO:0009408) | Cre14.g617400 - HSP22F; Cre08.g372100 - HSP70A | Stress Response: Cre02.g098800 - ERP29, g9861 - TOR |
| | protein folding (GO:0006457) | g9881 - FKBP; Cre01.g047700 - CYP40 | |
| Cluster 8 (Turquoise), 1030 genes, Silhouette 0.10 | photosynthesis (GO:0015979) | Cre09.g412100 - PSAF; Cre10.g440450 - PSB28 | Calvin Cycle: Cre12.g554800 - PRK1; TCA Cycle: Cre02.g143250 - IDH2 |
| | hexose metabolic process (GO:0019318) | Cre17.g725550 - GLD1; Cre02.g093450 - FBA2 | |
Fig. 5 Location of Transcription Factors and Transcriptional Regulators in ChlamyNET. a Transcription Factors in ChlamyNET. We identified 118 different TFs classified into 28 different families, represented using symbols with different colours and shapes. The distribution of the TFs over the clusters of ChlamyNET is not uniform. Clusters 9 (blue) and 3 (red) are enriched in TFs, with p-values of 2.62 × 10⁻³ and 2.37 × 10⁻³, respectively, obtained using Fisher's exact test. b Transcriptional Regulators in ChlamyNET. We identified 109 different TRs classified into 17 different families, represented using symbols with different colours and shapes. The distribution of the TRs over ChlamyNET is uniform. Our analysis based on Fisher's exact test did not identify any cluster significantly enriched in TRs
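The enrichment test mentioned in this caption can be sketched as a one-sided Fisher's exact test, computed as the upper tail of the hypergeometric distribution. Whether the authors used a one-sided or two-sided test is not stated, so the one-sided form is an assumption, as are the function name and the toy counts, which are not taken from the paper.

```python
from math import comb


def enrichment_p_value(hits_in_cluster, cluster_size, hits_total, population_size):
    """One-sided Fisher's exact test for over-representation.

    Probability of observing at least `hits_in_cluster` annotated genes
    (for example, TFs) in a cluster of `cluster_size` genes drawn without
    replacement from `population_size` genes, `hits_total` of which are
    annotated.
    """
    max_hits = min(cluster_size, hits_total)
    tail = sum(
        comb(hits_total, k) * comb(population_size - hits_total, cluster_size - k)
        for k in range(hits_in_cluster, max_hits + 1)
    )
    return tail / comb(population_size, cluster_size)


# Toy numbers (illustrative only): 20 genes, 4 of them TFs,
# and a cluster of 5 genes that contains 3 of the TFs.
p_value = enrichment_p_value(hits_in_cluster=3, cluster_size=5,
                             hits_total=4, population_size=20)
print(f"p = {p_value:.4f}")
```

Exact integer arithmetic with `math.comb` avoids rounding problems for large networks; the division at the end is the only floating-point step.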
Fig. 6 Transcription Factors and Transcriptional Regulators Clustering and Functional Annotation. According to the similarity between their gene expression profiles, the TFs and TRs in ChlamyNET can be classified into 13 different groups identified by different symbols, colours and letters. The analysis of the GO terms over-represented in the neighbourhood of each group suggests the biological processes that they might be regulating
Biological processes and transcription binding sites significantly over-represented in the neighbourhood of the TFs and TRs groups in ChlamyNET
Fig. 7 Potentially Light-regulated Transcription Factors and Transcriptional Regulators in ChlamyNET. Twenty-one TFs and TRs exhibiting high similarity to light-regulated TFs and TRs in Arabidopsis were identified in ChlamyNET. These genes are not uniformly distributed over ChlamyNET. Clusters 9 (blue), 3 (red) and 7 (green) were significantly enriched in these potentially light-regulated TFs and TRs, so they are expected to be involved mainly in carbon/nitrogen metabolism, signalling by phosphorylation and protein degradation. The central location of several light-regulated TFs and TRs such as CrHY5 (Cre12.g510200) and CrCRY1 (Cre06.g295200) suggests that they are highly authoritative hub genes. Indeed, CrHY5 and CrCRY1 have 133 and 57 neighbouring genes, respectively
Potentially light-regulated TFs and TRs in ChlamyNET. Their putative Arabidopsis orthologs and topological indices are also indicated
| Chlamydomonas gene | Putative Arabidopsis Ortholog | Number of neighbours | Normalized hub score | Clustering coefficient |
|---|---|---|---|---|
| Cre06.g295200 CPH1 / CrCRY1 | At4g08920 CRYPTOCHROME 1 | 57 | 8.12 × 10⁻⁵ | 0.39 |
| Cre01.g043150 CrGBF1 | At4g36730 G-BOX BINDING FACTOR 1 | 182 | 7.36 × 10⁻⁷ | 0.31 |
| Cre12.g510200 | At5g11260 ELONGATED HYPOCOTYL 5 | 133 | 0.32 | 0.47 |
| Cre06.g310500 | At3g17609 HY5-HOMOLOG | 39 | 0.12 | 0.71 |
| Cre12.g521150 | At5g39660 CYCLING DOF FACTOR 2 | 27 | 3.79 × 10⁻⁸ | 0.34 |
| g6302 | At5g15840 CONSTANS | 58 | 2.29 × 10⁻⁴ | 0.40 |
| Cre02.g094150 | At2g46790 PSEUDO-RESPONSE REGULATOR 9 | 1 | 7.72 × 10⁻⁷ | 0 |
| Cre06.g275350 | At1g01060 LATE ELONGATED HYPOCOTYL | 78 | 5.86 × 10⁻⁷ | 0.38 |
| g1542 | At3g46640 LUX | 38 | 2.23 × 10⁻⁷ | 0.40 |
| Cre06.g277350 | At4g38130 HISTONE DEACETYLASE 1 | 1 | 5.12 × 10⁻¹⁸ | 0 |
| g16739 | At5g61380 TIMING OF CAB EXPRESSION 1 | 3 | 1.49 × 10⁻⁵ | 0.33 |
| Cre14.g617350 | At5g63860 UVB-RESISTANCE 8 | 1 | 4.09 × 10⁻¹⁴ | 0 |
| Cre05.g234300 | At3g61140 CONSTITUTIVE PHOTOMORPHOGENIC 11 | 36 | 7.81 × 10⁻¹⁴ | 0.37 |
| Cre14.g608850 | At4g14110 CONSTITUTIVE PHOTOMORPHOGENIC 9 | 11 | 1.26 × 10⁻¹² | 0.27 |
| Cre17.g708300 | At1g64520 REGULATORY PARTICLE NON-ATPASE 12A | 57 | 7.44 × 10⁻¹⁴ | 0.39 |
| Cre10.g439150 | At3g05530 REGULATORY PARTICLE TRIPLE-A ATPASE 5A | 51 | 7.49 × 10⁻¹⁴ | 0.45 |
| Cre13.g581450 | At4g24820 REGULATORY PARTICLE NON-ATPASE 7 | 59 | 1.52 × 10⁻¹³ | 0.28 |
| Cre06.g275650 | At1g20200 REGULATORY PARTICLE NON-ATPASE 3A | 14 | 2.80 × 10⁻¹⁵ | 0.41 |
| Cre16.g663500 | At4g38630 REGULATORY PARTICLE NON-ATPASE 10 | 4 | 3.48 × 10⁻¹⁴ | 0 |
| Cre04.g216600 | At5g19990 REGULATORY PARTICLE TRIPLE-A ATPASE 6A | 19 | 1.74 × 10⁻¹² | 0.25 |
| Cre07.g329700 | At4g29040 REGULATORY PARTICLE TRIPLE-A ATPASE 2A | 3 | 3.44 × 10⁻¹⁶ | 0.33 |
Fig. 8 Heatmap Representing the Co-expression Patterns among the Potentially Light-regulated TFs and TRs in ChlamyNET. High positive correlation between the corresponding gene profiles is represented by red/yellow colours; low negative correlation is represented by blue/purple colours. Three different groups are apparent. The first group can be divided into two subgroups. We can observe negative correlations between genes in subgroup 1b and genes in the second group, such as between CrLHY and CrTOC1, which indicates that these two genes may be true orthologs of the Arabidopsis circadian clock genes LHY/CCA1 and TOC1. Very low negative correlations are observed between genes in Group 1 and genes in Group 3. Genes coding for different 26S proteasome and signalosome subunits, such as CrRPN7 and CrCOP9, can be found in Group 3. Their putative Arabidopsis orthologs have been described to degrade proteins involved in light response that exhibit high sequence similarity to those coded by genes in Group 1
Fig. 9 Experimental Cross-validation of the Predictive Power of ChlamyNET using RNA-seq Data from Algae Overexpressing the CrDOF gene. a The CrDOF gene (identified as a green diamond in ChlamyNET) has a neighbourhood at distance two consisting of 216 genes, represented in yellow. These genes showed an average fold-change increase of 2.7, which is significantly higher than the fold-change in the rest of ChlamyNET, with a p-value of 5.63 × 10⁻³. b Genes whose expression level in LD conditions increases at least four-fold in the CrDOF genotype compared to the wild type CW15 are represented in red. Note that the neighbourhood of the CrDOF gene, represented in green, is enriched in these genes according to a p-value of 0.029 obtained using Fisher's exact test. c Genes whose expression level in SD conditions increases at least four-fold in the CrDOF genotype compared to the wild type CW15 are represented in red. These genes tend to group around the CrDOF gene, represented in green. d Genes inhibited at least four-fold in LD conditions in the CrDOFin genotype compared to the wild type CW15 are represented in blue. Note that cluster 2 (brown), involved in DNA replication and cell cycle processes, is significantly enriched in these genes
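The "neighbourhood at distance two" used in panel a can be illustrated with a breadth-first search that stops after two steps. This is a minimal sketch: the function name `neighbourhood` and the toy adjacency dictionary are hypothetical, and the gene labels other than CrDOF are placeholders.

```python
from collections import deque


def neighbourhood(adjacency, source, max_distance=2):
    """Genes reachable from `source` in at most `max_distance` edges.

    `adjacency` maps each gene to the set of genes it is co-expressed with.
    The source gene itself is excluded from the result.
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        gene = queue.popleft()
        if distance[gene] == max_distance:
            continue  # do not expand beyond the requested radius
        for neighbour in adjacency[gene]:
            if neighbour not in distance:
                distance[neighbour] = distance[gene] + 1
                queue.append(neighbour)
    return {gene for gene in distance if gene != source}


# Toy network (illustrative only): a short chain plus one extra neighbour.
toy_network = {
    "CrDOF": {"gene1", "gene4"},
    "gene1": {"CrDOF", "gene2"},
    "gene2": {"gene1", "gene3"},
    "gene3": {"gene2"},
    "gene4": {"CrDOF"},
}

print(sorted(neighbourhood(toy_network, "CrDOF")))
```

In this toy network `gene3` lies three edges away from `CrDOF` and is therefore excluded; the fold-changes of the genes inside and outside such a neighbourhood could then be compared with a statistical test of choice.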
Fig. 10 Expression Levels (FPKM) of Several CrDOF Neighbouring Genes and Distant Genes in the CrDOFin and CW15 Genotypes Grown in LD and SD Conditions. Three genes in the neighbourhood at distance two from CrDOF were chosen to illustrate the correct prediction provided by ChlamyNET of their increase in expression level in the CrDOFin genotype compared to CW15. These genes are fatty acid desaturase FAD6 (Cre13.g590500), involved in carbon metabolism; glutamate dehydrogenase GDH2 (Cre05.g232150), involved in nitrogen metabolism; and serine/threonine kinase MAPKKK2 (Cre16.g684450), possibly involved in cell cycle regulation. Additionally, we selected three genes from the purple cluster, located far away from the CrDOF gene in ChlamyNET, to show that the expression of distant genes is not substantially affected by CrDOF overexpression. These genes are glycinamide ribonucleotide synthetase CrGARS (g18106) and phosphoribosylglycinamide formyltransferase PGFT (Cre12.g550700), both involved in purine biosynthesis, and plastid transcription factor PTAC3 (Cre12.g497350), involved in the regulation of plastid genes