Paul M Magwene, Junhyong Kim.
Abstract
We describe a computationally efficient statistical framework for estimating networks of coexpressed genes. This framework exploits first-order conditional independence relationships among gene-expression measurements to estimate patterns of association. We use this approach to estimate a coexpression network from microarray gene-expression measurements from Saccharomyces cerevisiae. We demonstrate the biological utility of this approach by showing that a large number of metabolic pathways are coherently represented in the estimated network. We describe a complementary unsupervised graph search algorithm for discovering locally distinct subgraphs of a large weighted graph. We apply this algorithm to our coexpression network model and show that subgraphs found using this approach correspond to particular biological processes or contain representatives of distinct gene families.
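To make the idea of first-order conditional independence concrete, the following is a minimal sketch and not the authors' implementation. It assumes that an edge between two genes is kept only when both their marginal correlation and every first-order partial correlation (conditioning on one other gene at a time) stay above a cutoff. The function names, the fixed `threshold`, and the use of the smallest absolute correlation as the edge weight are illustrative assumptions; the paper's actual test and weighting may differ.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given a single variable z."""
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        # Degenerate case (z perfectly correlated with x or y).
        # Assumption of this sketch: treat it as no residual association.
        return 0.0
    return (r_xy - r_xz * r_yz) / denom


def foci_edges(data, threshold=0.3):
    """Return {(gene_a, gene_b): weight} for pairs that remain associated
    after conditioning on every other single gene.

    `data` maps gene name -> list of expression values.
    `threshold` is a hypothetical cutoff, not a value from the paper.
    """
    names = list(data)
    r = {}
    for a, b in combinations(names, 2):
        r[a, b] = r[b, a] = pearson(data[a], data[b])

    edges = {}
    for a, b in combinations(names, 2):
        # Start from the marginal correlation, then check each conditioner z.
        strengths = [abs(r[a, b])]
        for z in names:
            if z != a and z != b:
                strengths.append(abs(partial_corr(r[a, b], r[a, z], r[b, z])))
        weight = min(strengths)
        if weight > threshold:
            edges[tuple(sorted((a, b)))] = weight
    return edges
```

With three genes where `x` and `y` are both driven by `z`, the pair `(x, y)` is strongly correlated marginally but is dropped once `z` is conditioned on, while the edges to `z` survive.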
Year: 2004 PMID: 15575966 PMCID: PMC545795 DOI: 10.1186/gb-2004-5-12-r100
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Simplification of the yeast FOCI coexpression network constructed by retaining the 4,000 strongest edges (= 1,729 vertices). The colored vertices represent a subset of the locally distinct subgraphs of the FOCI network; letters are as in Table 2, and further details can be found there. Some of the locally distinct subgraphs of Table 2 are not represented in this figure because they involve subgraphs whose edge weights are not in the top 4,000 edges.
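The simplification described in this caption amounts to keeping the top-ranked edges and the vertices they touch. A minimal sketch follows, assuming edges are stored as a dictionary from vertex pairs to weights and that "strongest" means largest absolute weight; both the data layout and the function name `strongest_edges` are assumptions made for illustration.

```python
import heapq


def strongest_edges(weighted_edges, k):
    """Keep the k edges with the largest absolute weight.

    `weighted_edges` maps (u, v) tuples to numeric weights.
    Returns the retained edges and the set of vertices they cover.
    """
    top = heapq.nlargest(k, weighted_edges.items(), key=lambda item: abs(item[1]))
    kept = dict(top)
    vertices = {v for edge in kept for v in edge}
    return kept, vertices
```

Calling `strongest_edges(edges, 4000)` on a full network would yield the kind of reduced graph the figure shows; the vertex count then follows from the retained edges rather than being chosen in advance.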
Figure 2. Topological properties of the yeast FOCI coexpression network. Distribution of (a) vertex degrees and (b) path lengths for the network.
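The two distributions in this figure can be computed from an edge list with standard graph traversal. The sketch below is illustrative only: it treats the network as unweighted and undirected and measures path length as the number of edges on a shortest path found by breadth-first search. Whether the authors used weighted or unweighted paths is not stated here, so that choice is an assumption.

```python
from collections import Counter, deque


def build_adjacency(edges):
    """Build an undirected adjacency map from an iterable of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """Map each degree to the number of vertices having that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())


def path_length_distribution(adj):
    """Map each shortest-path length to the number of unordered vertex
    pairs at that distance (unreachable pairs are ignored)."""
    ordered = Counter()
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for d in dist.values():
            if d > 0:
                ordered[d] += 1
    # Each unordered pair was counted once from each endpoint.
    return Counter({d: c // 2 for d, c in ordered.items()})
```

Running one breadth-first search per vertex costs time proportional to vertices times edges, which is manageable for a sparse network of a few thousand genes.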
Table 1. Summary of queries for 38 metabolic pathways against the yeast FOCI coexpression network
| Pathway | Number of genes in GCC (total in KEGG) | Size of largest coherent subnetwork(s) |
| Glycolysis/gluconeogenesis | 41 (47) | 18* |
| Citrate cycle (TCA cycle) | 27 (30) | 18* |
| Pentose phosphate pathway | 20 (27) | 6* |
| Fructose and mannose metabolism | 39 (46) | 4 |
| Galactose metabolism | 25 (30) | 8* |
| Ascorbate and aldarate metabolism | 11 (13) | 3 |
| Pyruvate metabolism | 32 (34) | 8* |
| Glyoxylate and dicarboxylate metabolism | 12 (14) | 6* |
| Butanoate metabolism | 27 (30) | 7* |
| Oxidative phosphorylation | 53 (76) | 31* |
| ATP synthesis | 21 (30) | 7* |
| Nitrogen metabolism | 24 (27) | 3 |
| Fatty acid metabolism | 13 (17) | 3 |
| Purine metabolism | 87 (99) | 34* |
| Pyrimidine metabolism | 72 (80) | 15* |
| Nucleotide sugars metabolism | 11 (14) | 2 |
| Glutamate metabolism | 25 (27) | 3 |
| Alanine and aspartate metabolism | 26 (27) | 7* |
| Glycine, serine and threonine metabolism | 36 (42) | 7* |
| Methionine metabolism | 13 (14) | 6* |
| Valine, leucine and isoleucine biosynthesis | 15 (16) | 10* |
| Lysine biosynthesis | 16 (20) | 3 |
| Lysine degradation | 26 (30) | 4 |
| Arginine and proline metabolism | 20 (24) | 5* |
| Histidine metabolism | 20 (25) | 3 |
| Tyrosine metabolism | 27 (34) | 2 |
| Tryptophan metabolism | 20 (25) | 2 |
| Phenylalanine, tyrosine and tryptophan biosynthesis | 21 (23) | 6* |
| Starch and sucrose metabolism | 118 (139) | 29 |
| N-Glycans biosynthesis | 43 (49) | 13* |
| O-Glycans biosynthesis | 18 (20) | 2 |
| Aminosugars metabolism | 16 (20) | 2 |
| Keratan sulfate biosynthesis | 18 (20) | 2 |
| Glycerolipid metabolism | 56 (68) | 12* |
| Inositol phosphate metabolism | 87 (103) | 10 |
| Sphingophospholipid biosynthesis | 101 (118) | 11 |
| Vitamin B6 metabolism | 11 (14) | 2 |
| Folate biosynthesis | 14 (17) | 1 |
The values in the second column give the number of pathway genes present in the giant connected component (GCC) of the yeast FOCI graph, with the total number of genes assigned to the given pathway in parentheses. The third column gives the number of pathway genes in the largest coherent subgraph resulting from each pathway query. Pathways represented by coherent subgraphs significantly larger than expected at random (p < 0.05) are marked with asterisks.
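One way to assess whether a coherent subgraph is "larger than expected at random" is a permutation test. The sketch below is a simplified stand-in and not the procedure from the paper: it defines coherence as the largest connected component among the pathway genes alone (the induced subgraph), and it compares that size with random gene sets of the same size drawn from the network. The actual pathway query may also admit intermediate non-pathway genes, so treat the definitions, the function names, and the default `n_perm` as assumptions.

```python
import random


def largest_component_size(adj, genes):
    """Size of the largest connected component of the subgraph induced
    by `genes`. `adj` maps vertex -> set of neighboring vertices."""
    genes = set(genes)
    seen = set()
    best = 0
    for start in genes:
        if start in seen or start not in adj:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in genes and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def coherence_p_value(adj, pathway_genes, n_perm=1000, seed=0):
    """Return (observed size, permutation p-value).

    The p-value is the fraction of random gene sets whose largest
    component is at least as large as the observed one, with the usual
    +1 correction so it is never exactly zero.
    """
    rng = random.Random(seed)
    universe = list(adj)
    k = sum(1 for g in set(pathway_genes) if g in adj)
    observed = largest_component_size(adj, pathway_genes)
    hits = 0
    for _ in range(n_perm):
        if largest_component_size(adj, rng.sample(universe, k)) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Under this sketch, a pathway would earn an asterisk when the returned p-value falls below 0.05.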
Figure 3. Largest connected subgraph resulting from combined query on four pathways involved in carbohydrate metabolism: glycolysis/gluconeogenesis (red); pyruvate metabolism (yellow); TCA cycle (green); and the glyoxylate cycle (pink). Genes encoding proteins involved in more than one pathway are highlighted with multiple colors. Uncolored vertices represent non-pathway genes that were recovered in the combined pathway query. See text for further details.
Table 2. Summary of locally distinct subgraphs of the yeast FOCI coexpression network
| Subgraph | Number of genes | Number unknown | Major GO terms | p-value |
| A | 33 | 0 | Protein biosynthesis (32) | |
| B | 67 | 2 | Protein biosynthesis (64) | |
| C | 124 | 26 | Ribosome biogenesis and assembly (74) | |
| D | 10 | 0 | Glycolysis/gluconeogenesis (8) | |
| E | 7 | 1 | Carboxylic/organic acid metabolism (4) | |
| F | 41 | 7 | Ubiquitin dependent protein catabolism (21) | |
| G | 14 | 4 | Cell organization and biogenesis (7) | 1.60e-04 |
| H | 7 | 0 | Main pathways of carbohydrate metabolism (4) | |
| I | 13 | 0 | Electron transport (7) | |
| J | 13 | 0 | Glutamate biosynthesis/TCA cycle (4) | |
| K | 71 | 25 | Response to stress (17); carbohydrate metabolism (13) | |
| L | 10 | 4 | Response to stress (2) | 3.35e-02 |
| N | 149 | 51 | Sporulation (27) | |
| M | 5 | 2 | Mitochondrial matrix (5); mitochondrial ribosome (4) | |
| O | 7 | 2 | Meiosis (4) | |
| P | 52 | 13 | Cell proliferation (32); DNA replication and chromosome cycle (28) | |
| Q | 26 | 21 | Telomerase-independent telomere maintenance (5) | |
| R | 7 | 0 | Chromatin assembly/disassembly (7) | |
| S | 14 | 5 | Cell wall (4); bud (4) | |
| T | 24 | 8 | Cell proliferation (15); mitotic cell cycle (9) | |
| U | 21 | 4 | Cell separation during cytokinesis (4); cell proliferation (9); cell wall organization and biogenesis (5) | |
| V | 12 | 4 | Metabolism (7) | 2.48e-02 |
| W | 10 | 9 | Nine of ten are members of the seripauperin gene family | NA |
| X | 9 | 0 | Sulfur amino acid metabolism (6); amino acid metabolism (3) | |
| Y | 7 | 1 | Cell growth and maintenance (6) | 7.50e-04 |
| Z | 19 | 2 | Conjugation with cellular fusion (13) | |
| AA | 8 | 4 | Biotin biosynthesis (2) | |
| BB | 7 | 0 | Response to abiotic stimulus (2) | 1.48e-02 |
| CC | 9 | 5 | Six of nine members belong to COS family of subtelomerically encoded proteins | NA |
| DD | 18 | 7 | Cell growth and/or maintenance (8) | 4.43e-03 |
| EE | 11 | 3 | Vitamin B6 metabolism (2) | |
| FF | 7 | 0 | Ty element transposition (7) | |
The columns of the table summarize the total size of the locally distinct subgraph, the number of genes in the subgraph that are unannotated (according to the GO Slim annotation from the Saccharomyces Genome Database of December 2003), the primary GO term(s) associated with the subgraph, and a p-value indicating the frequency at which one would expect to find the same number of genes assigned to the given GO term in a random assemblage of the same size.
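A common way to obtain a p-value of this kind is the hypergeometric distribution, which gives the chance of drawing a given number of annotated genes when sampling a gene set of fixed size without replacement. The sketch below computes the upper tail, that is, the probability of seeing at least `k` annotated genes. Whether the authors used this formula, a simulation, or the probability of exactly `k` genes is not stated in this excerpt, so the function and its parameter names are assumptions.

```python
from math import comb


def hypergeom_tail(population, annotated, sample_size, k):
    """P(X >= k) when `sample_size` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the GO term."""
    total = comb(population, sample_size)
    upper = min(annotated, sample_size)
    favorable = sum(
        comb(annotated, i) * comb(population - annotated, sample_size - i)
        for i in range(k, upper + 1)
    )
    return favorable / total
```

For a subgraph row in the table, `sample_size` would be the number of genes in the subgraph and `k` the count shown in parentheses after the GO term; `population` and `annotated` would come from the genome-wide annotation, which is not reproduced here.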