Wieslawa I Mentzen, Eve Syrkin Wurtele.
Abstract
BACKGROUND: Despite mounting research on the Arabidopsis transcriptome and the powerful tools available to explore the biology of this model plant, the organization of expression of the Arabidopsis genome is only partially understood. Here, we create a coexpression network from a dataset of 22,746 Affymetrix probes derived from 963 microarray chips that query the transcriptome in response to a wide variety of environmentally, genetically, and developmentally induced perturbations.
Year: 2008 PMID: 18826618 PMCID: PMC2567982 DOI: 10.1186/1471-2229-8-99
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Figure 1. Data processing for construction of the transcriptional network. Filters were applied to the original probe sets on the ATH1 chip to remove genes with mean expression below 100 and genes whose correlation to other genes was below the threshold of 0.7. The network was then constructed and its largest connected component was retained; smaller connected components, as well as genes with only one neighbor in the giant connected component, were filtered out. The resulting network, containing 13,456 genes, was then clustered. Enrichment of Gene Ontology terms in groups of filtered genes is indicated.
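The filtering steps in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`build_network`, `giant_component`, `drop_single_neighbor_genes`) are hypothetical, Pearson correlation is assumed as the correlation measure, only positive correlations are counted, and single-neighbor genes are removed in one pass. The caption itself specifies only the thresholds (100 and 0.7).

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumed measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation
    return sxy / sqrt(sxx * syy)


def build_network(expr, min_mean=100.0, min_corr=0.7):
    """expr maps gene -> expression values (one per chip).

    Drops genes with mean expression below min_mean, then links every
    remaining pair whose correlation reaches min_corr.
    """
    kept = {g: v for g, v in expr.items() if sum(v) / len(v) >= min_mean}
    adj = {g: set() for g in kept}
    for a, b in combinations(kept, 2):
        if pearson(kept[a], kept[b]) >= min_corr:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def giant_component(adj):
    """Return the adjacency restricted to the largest connected component."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:  # depth-first traversal from this start gene
            for nb in adj[stack.pop()]:
                if nb not in comp:
                    comp.add(nb)
                    stack.append(nb)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return {g: adj[g] & best for g in best}


def drop_single_neighbor_genes(adj):
    """Remove genes with at most one neighbor (single pass, an assumption)."""
    keep = {g for g, nbs in adj.items() if len(nbs) > 1}
    return {g: adj[g] & keep for g in keep}
```

Chaining the three calls, `drop_single_neighbor_genes(giant_component(build_network(expr)))`, yields the kind of graph that would then be passed to a clustering step. The all-pairs loop is quadratic in the number of genes, so a real dataset of this size would need a vectorized correlation matrix instead.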
Figure 2. Statistical significance of Markov chain graph clustering results. The best p-values for over-representation of Gene Ontology (GO) terms, averaged over all clusters (S score, denoted by a colored arrow), are compared to the analogous values for 100 randomly obtained clusterings (histogram). GO categories: (A) Molecular Function, (B) Biological Process, (C) Cellular Component. In each case, the actual clustering scored significantly better than any of the 100 randomly obtained ones (Wilcoxon test p-value < 2.2 × 10⁻¹⁶). In a comparison of the MCL (Markov clustering) and k-means clustering results (the latter denoted by black arrows), MCL had better (lower) S scores for GO term over-representation than the k-means method (0.0016 versus 0.0020 for the "Molecular Function" category; 0.0016 versus 0.0026 for "Biological Process"; and 0.0044 versus 0.0050 for "Cellular Component").
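As a rough illustration of the S score described in this caption, the sketch below takes the smallest GO-term over-representation p-value in each cluster and averages these minima over all clusters. The hypergeometric test used for the p-values, the choice of the clustered genes as the background population, and the names `hypergeom_tail` and `s_score` are assumptions made for the example; the caption does not state which test was applied.

```python
from math import comb


def hypergeom_tail(k, cluster_size, term_total, population):
    """P(X >= k) when cluster_size genes are drawn from population genes,
    term_total of which carry the GO term (assumed over-representation test)."""
    upper = min(cluster_size, term_total)
    total = comb(population, cluster_size)
    hits = sum(
        comb(term_total, i) * comb(population - term_total, cluster_size - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def s_score(clusters, annotations):
    """Mean over clusters of the best (smallest) GO-term p-value.

    clusters: list of sets of genes; annotations: gene -> iterable of GO terms.
    Lower scores indicate stronger functional enrichment.
    """
    population = set().union(*clusters)
    term_genes = {}
    for gene in population:
        for term in annotations.get(gene, ()):
            term_genes.setdefault(term, set()).add(gene)
    best = []
    for cluster in clusters:
        pvals = [
            hypergeom_tail(len(cluster & genes), len(cluster),
                           len(genes), len(population))
            for genes in term_genes.values()
        ]
        best.append(min(pvals, default=1.0))
    return sum(best) / len(best)
```

Computing `s_score` for the real clustering and for many random partitions with the same cluster sizes gives the kind of comparison shown in the histogram: a clustering that groups functionally related genes should score lower than the random ones.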
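Table 1. Predominant functions of the regulons
| Regulon | Number of genes | Predominant function (prevalent location of expression) | Functional coherence (%) |
| --- | --- | --- | --- |
| 1 | 1629 | mixed (tricellular and mature pollen-specific) | ND |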
| 2 | 1136 | photosynthesis | 29 |
| 3 | 869 | protein synthesis | 65 |
| 4 | 583 | mitosis | 49 |
| 5 | 507 | membrane transporters - metal, toxins removal (root-preferential) | 77 |
| 6 | 417 | embryo maturation (fruit and seed-preferential) | 25 |
| 7 | 330 | developmental regulation (leaf apex-preferential) | 38 |
| 8 | 281 | information (uninucleate microspore and bicellular pollen-specific) | 64 |
| 9 | 234 | response to environmental stimuli | 26 |
| 10 | 223 | protein modification, defense response | 66 |
| 11 | 215 | nuclear, others with very low expression | ND |
| 12 | 182 | mixed (fruit-preferential) | ND |
| 13 | 154 | upregulated in 'response to CO2 levels' experiment | ND |
| 14 | 140 | regulation of organ development | 61 |
| 15 | 138 | plastid stress and circadian rhythm | 56 |
| 16 | 121 | information | 58 |
| 17 | 115 | information | 51 |
| 18 | 100 | cell wall, respiration/catabolism (pollen-specific; highest in tricellular pollen) | 46 |
| 19 | 96 | mixed (flower-preferential) | ND |
| 20 | 94 | information | 80 |
| 21 | 92 | secondary products, secondary wall (flower-specific, mostly tapetum) | 54 |
| 22 | 81 | cell wall biosynthesis, carbohydrate metabolism | 47 |
| 23 | 77 | membrane proteins | 69 |
| 24 | 71 | defense response | 70 |
| 25 | 70 | defense response | 77 |
| 26 | 68 | information | 79 |
| 27 | 68 | regulation, root (root- and hypocotyl-preferential) | 73 |
| 28 | 66 | nucleic acid binding, regulation | 60 |
| 29 | 63 | aerobic respiration in mitochondria | 92 |
| 30 | 56 | signalling | 89 |
| 31 | 52 | defense response | 25 |
| 32 | 48 | nuclear genes, RNA processing, DNA replication | 70 |
| 33 | 48 | chloroplast organization and biogenesis | 62 |
| 34 | 47 | mitochondrial genes | 96 |
| 35 | 45 | kinases, signaling, disease resistance | 69 |
| 36 | 43 | lipid modification and cuticular wax synthesis (flowers and shoot apex-specific) | 54 |
| 37 | 42 | heat shock response | 60 |
| 38 | 40 | RNA processing, translation, transcription regulation | 82 |
| 39 | 40 | catabolic processes deriving energy | 51 |
| 40 | 40 | transcription, translation, protein folding and transport | 86 |
| 41 | 36 | regulation, information | 83 |
| 42 | 34 | regulation | 78 |
| 43 | 33 | flower/fruit, cell wall depositions (flower/fruit-preferential) | 48 |
| 44 | 31 | metabolic processes in flowers/fruit (flower/fruit-specific) | 22 |
| 45 | 30 | proteasome complex | 87 |
| 46 | 29 | defense response | 50 |
| 47 | 29 | nuclear, replication, chromosome organization, cell cycle | 67 |
| 48 | 28 | cell culture and tumor specific | ND |
| 49 | 27 | chloroplast-encoded | 100 |
| 50 | 27 | signalling | 90 |
| 51 | 26 | organ specification in shoot (leaf apex- and hypocotyl-preferential) | 35 |
| 52 | 26 | endoplasmic reticulum: protein folding and secretion/redox function | 73 |
| 53 | 25 | fatty acid biosynthesis | 83 |
| 54 | 23 | protein degradation and lipid modification | 59 |
| 55 | 23 | epidermal/cuticular deposits | 43 |
| 56 | 22 | nectaries/carpel specific function (carpel-specific) | 29 |
| 57 | 22 | phloem specific (vasculature tissues-specific) | 26 |
| 58 | 22 | transposases, mostly CACTA-type | 100 |
| 59 | 21 | metabolism and transport of triterpenoids (root hairs-preferential) | 71 |
| 60 | 21 | ubiquitin ligase | 53 |
| 61 | 20 | metabolism of glutathione and glutamate, redox | 56 |
| 62 | 20 | information, nuclear | 78 |
| 63 | 20 | stress-induced catabolism, mediated by jasmonic acid | 72 |
| 64 | 20 | information, nuclear | 87 |
| 65 | 20 | secondary metabolism/pathogen infection | 61 |
| 66 | 20 | exocytosis | 29 |
| 67 | 20 | Ca2+-triggered exocytosis (pathogen response?) | 60 |
| 68 | 20 | shoot meristem development and nucleic acid binding (leaf apex- and hypocotyl-preferential) | 69 |
| 69 | 20 | leucine/glucosinolates metabolism | 65 |
Regulons with 20 or more genes are shown. Annotations are postulated based on GO terms supplemented with information from the published literature.
Functional coherence is calculated as the percentage of annotated genes whose TAIR annotation is consistent with the cluster's functional classification. (Genes designated "hypothetical" or "unknown" are not included in this calculation.) ND: not determined.
Prevalent locations of expression are indicated in parentheses. "Specific" refers to virtually all expression being in the given location; "preferential" refers to most expression being in the given location.
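The functional coherence figure defined in the table notes reduces to a simple percentage. The sketch below assumes that each gene in a regulon arrives as a pair of its annotation text and a flag recording whether that annotation was judged consistent with the regulon's classification; that judgment is made by a curator and is not computed here. The function name and the input format are hypothetical.

```python
EXCLUDED = ("hypothetical", "unknown")


def functional_coherence(genes):
    """Percentage of annotated genes consistent with the regulon's function.

    genes: iterable of (annotation_text, is_consistent) pairs for one regulon.
    Genes annotated as "hypothetical" or "unknown" are left out of both the
    numerator and the denominator. Returns None if no gene is annotated.
    """
    annotated = [
        ok for text, ok in genes
        if not any(word in text.lower() for word in EXCLUDED)
    ]
    if not annotated:
        return None
    return 100.0 * sum(annotated) / len(annotated)
```

For example, a regulon with four annotated genes, three of them consistent, plus any number of hypothetical or unknown genes, would score 75.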
Figure 3. Higher-order structure in the coexpression network. All regulons containing at least 20 genes are depicted; these comprise a total of 9,436 genes. Regulons are represented by ovals numbered 1 through 69. A linkage between two clusters means that one or more genes in one of the clusters are correlated with one or more genes in the other cluster. The proximity of regulons with similar broad functional categories reveals three super-clusters of regulons: those related to information functions (purple), plastidic functions (green), and defense response functions (yellow). The predominant functionality of each regulon is defined in Table 1. The network was visualized using the GraphExplore tool [118].
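The linkage rule in this caption (two regulons are linked when at least one gene in one is correlated with at least one gene in the other) can be sketched as follows. The gene-level adjacency and the gene-to-regulon mapping are assumed to be available from earlier steps; `regulon_links` is a hypothetical name.

```python
def regulon_links(adj, membership):
    """Collapse a gene-level network into links between regulons.

    adj: gene -> set of correlated genes.
    membership: gene -> regulon number (genes outside any regulon are absent).
    Returns a set of (smaller_id, larger_id) pairs, one per linked pair.
    """
    links = set()
    for gene, neighbors in adj.items():
        for nb in neighbors:
            a, b = membership.get(gene), membership.get(nb)
            # Keep only edges that cross between two different regulons.
            if a is not None and b is not None and a != b:
                links.add((min(a, b), max(a, b)))
    return links
```

A single cross-regulon edge is enough to create a link under this rule, so the resulting graph says nothing about how strongly two regulons are connected; counting the crossing edges per pair would be a natural extension.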
Figure 4. Regulons with organelle-specific functions and organelle-encoded genes. Regulon 2, photosynthesis (for clarity, representative expression profiles of 200 randomly chosen genes from this regulon are shown) (A); Regulon 49, plastid-encoded genes (B); Regulon 29, mitochondrial respiration (C); Regulon 34, mitochondrion-encoded genes (D). The plots on the right side show expression profiles of the genes in the respective regulon (each gene depicted with a different color) across the 424 samples in the dataset. The samples have been arranged according to plant tissue. Pie charts are based on manual annotations from published data. RNA profiles were plotted using MetaOmGraph [24,124].
Figure 5. Regulons with developmental and metabolic functions. Regulon 4, cell division (for clarity, representative expression profiles of 200 randomly chosen genes are shown) (A); Regulon 20, nuclear regulation (B); Regulon 35, protein kinases, signaling and defense response (C); Regulon 69, glucosinolate biosynthesis (D); Regulon 25, defense response (E); and Regulon 1, pollen-specific (200 randomly chosen genes) (F). Pie charts are based on manual annotation. RNA profiles were plotted using MetaOmGraph [24,124].
Figure 6. Coexpressed neighboring genes are absent from a region of the long arm of chromosome 4. Distribution of the coexpressed neighboring genes (marked in yellow) on the five Arabidopsis chromosomes (visualized with the Chromosome Map Tool [126,115]). Domains of coexpressed neighbors are absent from a large part of the long arm of chromosome 4, adjacent to the pericentromeric region, and are very rare in the analogous area of chromosome 2.
Figure 7. Functional assignments and expression profiles of genes with the most and the least variable expression across multiple conditions. (A) The 100 genes with the most variable expression (highest standard deviation of logE). (B) The 100 genes with the steadiest expression (lowest standard deviation of logE). The scale along the Y axis (expression values) is the same for both plots to facilitate comparison of the expression profiles between them. The inset shows a version of plot B with a magnified expression scale.
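Ranking genes by the standard deviation of their log expression, as in this caption, might look like the sketch below. The base of the logarithm (base 2 here), the use of the population standard deviation, and the name `rank_by_variability` are assumptions; the caption states only that the standard deviation of logE was used. All expression values are assumed to be positive.

```python
from math import log2
from statistics import pstdev


def rank_by_variability(expr, n=100):
    """Return (most_variable, least_variable) gene lists of length up to n.

    expr: gene -> list of positive expression values across samples.
    Variability is the standard deviation of log2-transformed expression.
    """
    sd = {g: pstdev([log2(v) for v in vals]) for g, vals in expr.items()}
    ordered = sorted(sd, key=sd.get)  # ascending standard deviation
    most = ordered[-n:][::-1]         # highest first
    least = ordered[:n]               # lowest first
    return most, least
```

Working on the log scale keeps highly expressed genes from dominating the ranking simply because their absolute values are larger.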