Gregory W Carter, David J Galas, Timothy Galitski.
Abstract
Extraction of all the biological information inherent in large-scale genetic interaction datasets remains a significant challenge for systems biology. The core problem is essentially that of classification of the relationships among phenotypes of mutant strains into biologically informative "rules" of gene interaction. Geneticists have determined such classifications based on insights from biological examples, but it is not clear that there is a systematic, unsupervised way to extract this information. In this paper we describe such a method that depends on maximizing a previously described context-dependent information measure to obtain maximally informative biological networks. We have successfully validated this method on two examples from yeast by demonstrating that more biological information is obtained when analysis is guided by this information measure. The context-dependent information measure is a function only of phenotype data and a set of interaction rules, involving no prior biological knowledge. Analysis of the resulting networks reveals that the most biologically informative networks are those with the greatest context-dependent information scores. We propose that these high-complexity networks reveal genetic architecture at a modular level, in contrast to classical genetic interaction rules that order genes in pathways. We suggest that our analysis represents a powerful, data-driven, and general approach to genetic interaction analysis, with particular potential in the study of mammalian systems in which interactions are complex and gene annotation data are sparse.
Year: 2009 PMID: 19343223 PMCID: PMC2659753 DOI: 10.1371/journal.pcbi.1000347
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Complexity and number of biological statements obtained for the two genetic interaction networks for various interaction classifications.
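| Classification | Ψ (invasion) | Statements (invasion) | Ψ (MMS-growth) | Statements (MMS-growth) |
| --- | --- | --- | --- | --- |
| Drees et al. | 0.57 | 68 | 0.27 | 41 |
| Segré et al. | 0.52 | 60 | 0.32 | 23 |
| St Onge et al. | - | - | 0.16 | 9 |
| Maximal Ψ | 0.79 | 93 | 0.62 | 43 |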
*: Subclassification of alleviating interactions could not be performed for the invasion network since this scheme does not have rules to classify every inequality in the invasion data.
The first three classification schemes are from the publications cited. Optimized classifications were determined from unsupervised maximization of complexity Ψ, which has a theoretical maximum of 1. The optimized classification scheme for the invasion data was the greatest found via sampling, whereas the optimal MMS-growth network is the network of absolute maximum complexity, found by exhaustive calculation.
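The difference between exhaustive calculation and sampling can be illustrated with a minimal sketch. It assumes, hypothetically, that a classification scheme is a partition of phenotype inequalities into rules. The `placeholder_score` function is a stand-in objective that only mimics having a maximum of 1; it is not the set complexity Ψ, whose definition is not given here. The inequality names and counts in `toy_counts` are invented for illustration.

```python
import math
import random


def set_partitions(items):
    """Yield every partition of `items` into non-empty blocks (rules)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing block in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or give it a block of its own.
        yield [[first]] + partition


def placeholder_score(partition, counts):
    """Stand-in objective (NOT Psi): 4*h*(1-h), h = normalized entropy of rule frequencies."""
    totals = [sum(counts[x] for x in block) for block in partition]
    n = sum(totals)
    entropy = -sum(t / n * math.log2(t / n) for t in totals if t > 0)
    h = entropy / math.log2(len(counts))
    return 4 * h * (1 - h)


def exhaustive_best(counts):
    """Score every possible partition and return the best one."""
    items = sorted(counts)
    return max(set_partitions(items), key=lambda p: placeholder_score(p, counts))


def sampled_best(counts, n_samples, seed=0):
    """Score randomly labelled partitions and return the best one seen."""
    rng = random.Random(seed)
    items = sorted(counts)
    best, best_score = None, -1.0
    for _ in range(n_samples):
        blocks = {}
        for item in items:
            blocks.setdefault(rng.randrange(len(items)), []).append(item)
        partition = list(blocks.values())
        score = placeholder_score(partition, counts)
        if score > best_score:
            best, best_score = partition, score
    return best


# Hypothetical inequality counts, for illustration only.
toy_counts = {"i1": 40, "i2": 25, "i3": 15, "i4": 10, "i5": 6, "i6": 4}
best_exhaustive = exhaustive_best(toy_counts)
best_sampled = sampled_best(toy_counts, n_samples=50)
```

The number of partitions grows very quickly with the number of inequalities, which is one plausible reason an exhaustive search is feasible for a smaller dataset while a larger one requires sampling. Random labelling as used here does not sample partitions uniformly; it is only the simplest scheme to write down.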
Rules for maximal complexity in the MMS-growth network.
| Rule | Rule frequency | Constituent frequency | Classical interpretation |
| --- | --- | --- | --- |
| 1 | 120 | 120 | epistatic |
| 2 | 55 | 55 | additive |
| 3 | 92 | 92 | additive |
| 4 | 30 | 24 | asynthetic |
| | | 6 | non-interactive |
| 5 | 26 | 14 | conditional |
| | | 4 | epistatic |
| | | 4 | single-nonmonotonic |
| | | 3 | non-interactive |
| | | 1 | synthetic |
Frequencies refer to the number of occurrences of the rule in the full network of 323 interactions, and classical interpretations follow Drees et al. [5].
Figure 1. MMS-growth network with maximal set complexity, Ψ.
(A) The complete network. Sub-networks of the relationships shown in Table 2: (B) Rule 1, (C) Rule 2, (D) Rule 3, (E) Rule 4, and (F) Rule 5. The same color codes are used in Figure 3.
Figure 3. Examples of biological information extracted from the maximally complex MMS-growth network.
(A) Deletion of PSY3 interacts via Rule 1 (red edges) with meiotic recombination gene deletions. (B) Deletion of SGS1 interacts via Rule 5 (green edges) with four error-free DNA repair gene deletions. Deletion of SWC5 interacts with the same genes via Rule 2 (orange edges). These four genes interact via Rule 4 (violet edges), significantly for CSM2 and SHU2 deletions. (C) Deletion of HPR5 interacts via Rule 3 (blue edges) with genes involved in negative regulation of DNA transposition and via Rule 1 (red edges) with genes involved in gene conversion at the mating-type locus. Deletion of RTT101 interacts via Rule 2 (orange edges) with heteroduplex formation genes.
Rules for high complexity in the invasion network.
| Rule | Rule frequency | Constituent frequency | Classical interpretation |
| --- | --- | --- | --- |
| 1 | 312 | 146 | suppression |
| | | 79 | epistatic |
| | | 38 | conditional |
| | | 17 | conditional |
| | | 14 | single-nonmonotonic |
| | | 8 | asynthetic |
| | | 4 | double-nonmonotonic |
| | | 4 | double-nonmonotonic |
| | | 2 | epistatic |
| 2 | 325 | 97 | epistatic |
| | | 56 | suppression |
| | | 50 | single-nonmonotonic |
| | | 47 | epistatic |
| | | 41 | conditional |
| | | 29 | synthetic |
| | | 4 | double-nonmonotonic |
| | | 1 | additive |
| 3 | 398 | 143 | non-interactive |
| | | 103 | asynthetic |
| | | 38 | epistatic |
| | | 33 | synthetic |
| | | 32 | non-interactive |
| | | 14 | double-nonmonotonic |
| | | 13 | double-nonmonotonic |
| | | 11 | conditional |
| | | 8 | epistatic |
| | | 2 | double-nonmonotonic |
| | | 1 | double-nonmonotonic |
| 4 | 356 | 176 | additive |
| | | 105 | additive |
| | | 67 | conditional |
| | | 8 | double-nonmonotonic |
| 5 | 418 | 268 | non-interactive |
| | | 72 | conditional |
| | | 40 | additive |
| | | 19 | additive |
| | | 11 | single-nonmonotonic |
| | | 5 | additive |
| | | 1 | additive |
| | | 1 | double-nonmonotonic |
| | | 1 | double-nonmonotonic |
Frequencies refer to the number of occurrences of the rule in the full network of 1809 interactions, and classical interpretations follow Drees et al. [5].
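The frequencies in the two rule tables can be checked for internal consistency with a short worked calculation. The sketch below transcribes the numbers from the tables; the dictionary layout and the `is_consistent` helper are illustrative choices, not part of the original analysis.

```python
# Each rule maps to (rule frequency, constituent frequencies), as listed in the tables.
mms_growth_rules = {
    1: (120, [120]),
    2: (55, [55]),
    3: (92, [92]),
    4: (30, [24, 6]),
    5: (26, [14, 4, 4, 3, 1]),
}

invasion_rules = {
    1: (312, [146, 79, 38, 17, 14, 8, 4, 4, 2]),
    2: (325, [97, 56, 50, 47, 41, 29, 4, 1]),
    3: (398, [143, 103, 38, 33, 32, 14, 13, 11, 8, 2, 1]),
    4: (356, [176, 105, 67, 8]),
    5: (418, [268, 72, 40, 19, 11, 5, 1, 1, 1]),
}


def is_consistent(rules, network_size):
    """True if constituents sum to each rule frequency and rules sum to the network size."""
    parts_ok = all(sum(parts) == freq for freq, parts in rules.values())
    total_ok = sum(freq for freq, _ in rules.values()) == network_size
    return parts_ok and total_ok


mms_ok = is_consistent(mms_growth_rules, 323)        # MMS-growth: 323 interactions
invasion_ok = is_consistent(invasion_rules, 1809)    # invasion: 1809 interactions
```

Both checks pass: the five rule frequencies sum to 323 for the MMS-growth network and to 1809 for the invasion network, and the constituent frequencies sum to their rule totals.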
Biological statements extracted from the maximally complex MMS-growth network.
| Gene perturbation | Interaction rule | Biological function | P-value |
| --- | --- | --- | --- |
| SGS1 | Rule 5 | error-free DNA repair | 7.91E-05 |
| SWC5 | Rule 2 | error-free DNA repair | 0.00040 |
| RAD51 | Rule 4 | heteroduplex formation | 0.00043 |
| CLA4 | Rule 3 | developmental process | 0.00047 |
| PSY3 | Rule 1 | meiosis I | 0.00047 |
| CLA4 | Rule 3 | DNA recombination | 0.00098 |
| CSM2 | Rule 1 | meiosis I | 0.0012 |
| PSY3 | Rule 3 | negative regulation of transposition, RNA-mediated | 0.0017 |
| MPH1 | Rule 4 | error-free DNA repair | 0.0017 |
| CSM2 | Rule 4 | error-free DNA repair | 0.0017 |
| SHU2 | Rule 4 | error-free DNA repair | 0.0020 |
| HPR5 | Rule 1 | mitotic recombination | 0.0022 |
| CLA4 | Rule 3 | reproductive developmental process | 0.0024 |
| PSY3 | Rule 1 | reproductive developmental process | 0.0024 |
| SHU1 | Rule 1 | reproductive developmental process | 0.0024 |
| MAG1 | Rule 3 | reproductive developmental process | 0.0024 |
| RAD52 | Rule 1 | double-strand break repair via single-strand annealing | 0.0026 |
| HPR5 | Rule 1 | cellular component organization and biogenesis | 0.0026 |
| MPH1 | Rule 1 | heteroduplex formation | 0.0028 |
| CLA4 | Rule 3 | mitotic recombination | 0.0029 |
| MAG1 | Rule 3 | mitotic recombination | 0.0029 |
| SHU2 | Rule 1 | meiosis I | 0.0030 |
| HPR5 | Rule 3 | negative regulation of transposition, RNA-mediated | 0.0043 |
| SHU1 | Rule 4 | error-free DNA repair | 0.0043 |
| RAD59 | Rule 1 | postreplication repair | 0.0043 |
| RAD52 | Rule 1 | non-recombinational repair | 0.0044 |
| CSM2 | Rule 1 | reproductive developmental process | 0.0047 |
| HPR5 | Rule 1 | reproductive developmental process | 0.0055 |
| HPR5 | Rule 4 | error-free DNA repair | 0.0067 |
| RTT101 | Rule 2 | heteroduplex formation | 0.0067 |
| MPH1 | Rule 3 | DNA recombination | 0.0071 |
| MMS1 | Rule 2 | telomere maintenance via recombination | 0.0087 |
| HPR5 | Rule 1 | meiotic DNA recombinase assembly | 0.0087 |
| RAD51 | Rule 4 | double-strand break repair via single-strand annealing | 0.0087 |
| MMS4 | Rule 3 | reproductive developmental process | 0.0087 |
| RTT107 | Rule 2 | DNA recombination | 0.0097 |
| CLA4 | Rule 3 | heteroduplex formation | 0.0100 |
| CSM3 | Rule 1 | heteroduplex formation | 0.0100 |
| PSY3 | Rule 1 | heteroduplex formation | 0.0100 |
| SHU1 | Rule 1 | heteroduplex formation | 0.0100 |
| MAG1 | Rule 3 | heteroduplex formation | 0.0100 |
| MUS81 | Rule 3 | heteroduplex formation | 0.0100 |
| MUS81 | Rule 1 | error-free DNA repair | 0.0100 |
Figure 2. Biological information as a function of set complexity Ψ in the MMS-growth networks.
Average number of biological statements (significance P<0.01) for binned complexity calculated from all possible networks. Error bars denote the standard deviation of binned data points.
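The binning described in this caption can be sketched as follows. The sketch assumes each candidate network is summarized as a `(psi, n_statements)` pair, and it uses the population standard deviation for the error bars; the caption does not say which form of standard deviation or which bin width was used, so both are assumptions. The data points are invented for illustration.

```python
import statistics


def binned_stats(points, bin_width):
    """Group (psi, n_statements) pairs by complexity bin; return {bin_index: (mean, sd)}."""
    bins = {}
    for psi, n_statements in points:
        bins.setdefault(int(psi // bin_width), []).append(n_statements)
    return {
        index: (statistics.fmean(values), statistics.pstdev(values))
        for index, values in sorted(bins.items())
    }


# Hypothetical (psi, number of statements) pairs, for illustration only.
toy_points = [(0.05, 2), (0.12, 10), (0.18, 14), (0.25, 20), (0.27, 24)]
stats_by_bin = binned_stats(toy_points, bin_width=0.1)
```

Bin index `k` covers complexities from `k * bin_width` up to, but not including, `(k + 1) * bin_width`.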
Figure 4. Networks of mutual information for yeast invasion data.
Nodes represent alleles and edges represent significant mutual information between the connected alleles. (A) Mutual information network obtained using the classification scheme of Drees et al. [5], showing all pairs with significance p<0.001. (B) Mutual information network obtained using the maximally complex classification scheme on the same data, showing all pairs with significance p<0.0001. The maximally complex classification scheme produces more pairs and higher significance.
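As a rough illustration of the quantity on the edges, the sketch below computes the mutual information between two alleles. It assumes each allele is described by the sequence of interaction rules it shows with a common, ordered set of partner alleles; that representation, and the toy profiles, are assumptions made for the example. How significance was assessed is not covered.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length label sequences."""
    if len(xs) != len(ys):
        raise ValueError("profiles must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum of p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed pairs.
    return sum(
        count / n * math.log2(count * n / (px[x] * py[y]))
        for (x, y), count in joint.items()
    )


# Hypothetical rule profiles of three alleles against four shared partners.
allele_a = [1, 1, 2, 2]
allele_b = [1, 1, 2, 2]   # identical pattern to allele_a
allele_c = [1, 2, 1, 2]   # statistically independent of allele_a

mi_same = mutual_information(allele_a, allele_b)
mi_independent = mutual_information(allele_a, allele_c)
```

Two alleles with identical rule patterns share the full entropy of the pattern (1 bit in this toy case), whereas independent patterns share none.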
Figure 5. Network modularity of genetic interactions.
(A) A simple, hypothetical genetic interaction network of seven genes with three biological functions. (B) An example biological statement inferred from the genetic interaction network, establishing a coherent interaction rule between gene perturbation F and Function 1. (C) Inferred mutual information network that exhibits the functional modularity of the genetic network.
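One common way to attach a P-value to a statement of this kind (a gene perturbation interacting through one rule with genes of a shared function) is a hypergeometric enrichment test. Whether the authors used this particular test is not stated here, so the following is only a sketch under that assumption; the function name, parameters, and example numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(n_genes, n_annotated, n_partners, n_hits):
    """P(X >= n_hits) for a hypergeometric draw.

    n_genes:     genes in the network (population size)
    n_annotated: genes carrying the functional annotation
    n_partners:  genes linked to the perturbation by the rule of interest
    n_hits:      linked genes that carry the annotation
    """
    upper = min(n_annotated, n_partners)
    favourable = sum(
        comb(n_annotated, i) * comb(n_genes - n_annotated, n_partners - i)
        for i in range(n_hits, upper + 1)
    )
    return favourable / comb(n_genes, n_partners)


# Hypothetical example: 10 genes, 3 annotated, 3 rule partners, all 3 annotated.
p_all_hits = enrichment_p_value(10, 3, 3, 3)
```

In this toy case only one of the 120 possible partner sets consists entirely of annotated genes, so the P-value is 1/120.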
Biological statements extracted from the high-complexity invasion network.
| Gene perturbation | Interaction rule | Biological function | P-value |
| --- | --- | --- | --- |
| HOG1 | Rule 4 | invasive growth in response to glucose limitation | 0.00027 |
| TEC1 | Rule 2 | regulation of cellular process | 0.00031 |
| STE11 | Rule 3 | organelle organization and biogenesis | 0.00053 |
| HOG1 | Rule 4 | positive regulation of biological process | 0.00064 |
| XBP1 | Rule 5 | reproduction | 0.00070 |
| ROX1 | Rule 5 | conjugation with cellular fusion | 0.0010 |
| HOG1 | Rule 4 | positive regulation of transcription from RNA polymerase II promoter | 0.0011 |
| RAS2 | Rule 2 | positive regulation of biological process | 0.0011 |
| SPO12 | Rule 5 | reproduction | 0.0012 |
| MSN1 | Rule 1 | cell surface receptor linked signal transduction | 0.0014 |
| KSS1 | Rule 2 | positive regulation of catalytic activity | 0.0014 |
| KSS1 | Rule 4 | regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process | 0.0014 |
| HSL1 | Rule 1 | cell wall organization and biogenesis | 0.0014 |
| HOG1 | Rule 4 | conjugation with cellular fusion | 0.0015 |
| CDC42 | Rule 4 | positive regulation of metabolic process | 0.0015 |
| SRL1 | Rule 3 | conjugation with cellular fusion | 0.0015 |
| MSN1 | Rule 1 | positive regulation of catalytic activity | 0.0018 |
| COD4 | Rule 5 | reproduction | 0.0019 |
| RAS2 | Rule 4 | regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process | 0.0019 |
| HOG1 | Rule 4 | response to pheromone | 0.0021 |
| HOG1 | Rule 4 | pheromone-dependent signal transduction during conjugation with cellular fusion | 0.0021 |
| RAS2 | Rule 2 | response to pheromone | 0.0022 |
| GLN3 | Rule 2 | protein targeting | 0.0023 |
| KSS1 | Rule 2 | intracellular signaling cascade | 0.0023 |
| URE2 | Rule 3 | reproduction | 0.0025 |
| URE2 | Rule 3 | response to pheromone | 0.0025 |
| YPS1 | Rule 3 | G-protein coupled receptor protein signaling pathway | 0.0025 |
| KTR2 | Rule 3 | G-protein coupled receptor protein signaling pathway | 0.0025 |
| TEC1 | Rule 2 | positive regulation of biological process | 0.0025 |
| DIG2 | Rule 2 | osmosensory signaling pathway | 0.0027 |
| PBS2 | Rule 4 | replicative cell aging | 0.0031 |
| XBP1 | Rule 5 | cell communication | 0.0031 |
| XBP1 | Rule 5 | response to chemical stimulus | 0.0031 |
| CDC42 | Rule 4 | regulation of transcription, DNA-dependent | 0.0032 |
| YPS1 | Rule 3 | filamentous growth | 0.0038 |
| STE12 | Rule 4 | positive regulation of metabolic process | 0.0038 |
| STE20 | Rule 4 | positive regulation of metabolic process | 0.0038 |
| DSE1 | Rule 5 | cellular component organization and biogenesis | 0.0039 |
| TEC1 | Rule 4 | regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process | 0.0039 |
| HSL1 | Rule 5 | pheromone-dependent signal transduction during conjugation with cellular fusion | 0.0040 |
| STE12 | Rule 3 | cell wall organization and biogenesis | 0.0040 |
| STE20 | Rule 3 | cell wall organization and biogenesis | 0.0040 |
| PRY2 | Rule 2 | osmosensory signaling pathway | 0.0041 |
| IPK1 | Rule 4 | invasive growth in response to glucose limitation | 0.0043 |
| KSS1 | Rule 2 | protein amino acid phosphorylation | 0.0043 |
| YPL114W | Rule 5 | developmental process | 0.0045 |
| HOG1 | Rule 5 | protein targeting | 0.0048 |
| PDE2 | Rule 3 | intracellular signaling cascade | 0.0048 |
| DIG2 | Rule 2 | response to chemical stimulus | 0.0048 |
| ROX1 | Rule 5 | response to pheromone during conjugation with cellular fusion | 0.0048 |
| TPK1 | Rule 2 | cell surface receptor linked signal transduction | 0.0049 |
| YJL017W | Rule 5 | establishment of cell polarity | 0.0049 |
| YJL017W | Rule 1 | positive regulation of catalytic activity | 0.0049 |
| MIH1 | Rule 5 | cell communication | 0.0051 |
| STE20 | Rule 2 | growth | 0.0052 |
| RSR1 | Rule 1 | sporulation | 0.0054 |
| STE20 | Rule 2 | filamentous growth | 0.0055 |
| MSN1 | Rule 2 | M phase of mitotic cell cycle | 0.0055 |
| MSN1 | Rule 2 | conjugation with cellular fusion | 0.0055 |
| RSR1 | Rule 3 | cellular metabolic process | 0.0057 |
| DBR1 | Rule 2 | mitotic cell cycle | 0.0059 |
| TEC1 | Rule 4 | negative regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process | 0.0060 |
| MSN1 | Rule 2 | invasive growth in response to glucose limitation | 0.0062 |
| SRL1 | Rule 3 | response to pheromone during conjugation with cellular fusion | 0.0065 |
| MSN1 | Rule 3 | cell communication | 0.0068 |
| TEC1 | Rule 2 | positive regulation of metabolic process | 0.0071 |
| CTS1 | Rule 2 | osmosensory signaling pathway | 0.0071 |
| PHD1 | Rule 5 | protein localization | 0.0073 |
| RIM13 | Rule 5 | conjugation with cellular fusion | 0.0076 |
| RIM13 | Rule 5 | invasive growth in response to glucose limitation | 0.0076 |
| CLN3 | Rule 3 | response to pheromone | 0.0076 |
| DIA3 | Rule 5 | response to pheromone | 0.0076 |
| BUD4 | Rule 1 | regulation of molecular function | 0.0077 |
| IME2 | Rule 3 | G-protein coupled receptor protein signaling pathway | 0.0077 |
| PAM1 | Rule 3 | cell morphogenesis | 0.0077 |
| KSS1 | Rule 1 | biopolymer metabolic process | 0.0081 |
| GPR1 | Rule 1 | response to pheromone | 0.0082 |
| BEM1 | Rule 1 | nitrogen utilization | 0.0085 |
| STE12 | Rule 1 | intracellular signaling cascade | 0.0087 |
| STE12 | Rule 3 | nitrogen utilization | 0.0090 |
| IPK1 | Rule 4 | conjugation with cellular fusion | 0.0090 |
| CDC42 | Rule 3 | cell wall organization and biogenesis | 0.0090 |
| URE2 | Rule 3 | signal transduction | 0.0090 |
| YPS1 | Rule 3 | reproduction | 0.0090 |
| YPS1 | Rule 3 | invasive growth in response to glucose limitation | 0.0090 |
| YPS1 | Rule 3 | pseudohyphal growth | 0.0090 |
| KTR2 | Rule 3 | reproduction | 0.0090 |
| MSN1 | Rule 1 | G-protein coupled receptor protein signaling pathway | 0.0092 |
| RSR1 | Rule 3 | cellular macromolecule metabolic process | 0.0093 |
| CDC42 | Rule 4 | regulation of transcription from RNA polymerase II promoter | 0.0093 |
| FLO8 | Rule 5 | filamentous growth | 0.0094 |
| GLN3 | Rule 2 | macromolecule metabolic process | 0.0095 |
| IPK1 | Rule 3 | cell communication | 0.0095 |
Figure 6. Set complexity Ψ as a function of the number of interaction rules in the MMS-growth networks.
Average complexity as a function of number of rules for all possible networks. Error bars denote the standard deviation of binned data points.
Figure 7. Set complexity Ψ as a function of the standard deviation of interaction rule frequencies in the MMS-growth networks.
Average complexity for binned standard deviations of rule frequencies calculated from all possible networks. Error bars denote the standard deviation of binned data points.
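As a worked example of the quantity on the horizontal axis, the standard deviation of rule frequencies can be computed for the maximal-complexity MMS-growth classification, whose five rules occur 120, 55, 92, 30, and 26 times. The sketch uses the population standard deviation of raw counts; whether the figure uses the population or sample form, or normalized frequencies, is not stated here and is an assumption.

```python
import statistics

# Rule frequencies of the maximal-complexity MMS-growth network (323 interactions).
rule_frequencies = [120, 55, 92, 30, 26]

mean_frequency = statistics.fmean(rule_frequencies)   # 323 / 5 = 64.6
sd_frequency = statistics.pstdev(rule_frequencies)    # population standard deviation
```

The mean is 64.6 interactions per rule and the population variance is 1319.84, giving a standard deviation of roughly 36.3 interactions.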