Ghislain Bidaut, Karsten Suhre, Jean-Michel Claverie, Michael F Ochs.
Abstract
BACKGROUND: As numerous diseases involve errors in signal transduction, modern therapeutics often target proteins involved in cellular signaling. Interpretation of the activity of signaling pathways during disease development or therapeutic intervention would assist in drug development, design of therapy, and target identification. Microarrays provide a global measure of cellular response; however, linking these responses to signaling pathways requires an analytic approach tuned to the underlying biology. An ongoing issue in pattern recognition in microarrays has been how to determine the number of patterns (or clusters) to use for data interpretation. This is critical because measures of statistical significance in gene ontology or pathways rely on proper separation of genes into groups.
Year: 2006 PMID: 16507110 PMCID: PMC1413561 DOI: 10.1186/1471-2105-7-99
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Data analysis flowchart. The data was downloaded from Rosetta Inpharmatics and filtered to include only genes and experiments that showed significant variation. Bayesian Decomposition analysis generated patterns and associated gene lists for all dimensionalities between 3 and 25. ClutrFree was used to interpret these results, including use of the MIPS database of ontologies.
Figure 2. The average persistence across all dimensions. The average persistence is plotted for 3 to 25 dimensions. The significant drop between 15 and 16 dimensions suggests that 15 patterns provide the correct dimensionality for analysis.
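The selection rule implied by this caption, choosing the dimensionality just before the largest drop in average persistence, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name is hypothetical and the numbers in `toy_persistence` are invented purely to exercise the code; they are not values from the study.

```python
def dimension_before_largest_drop(avg_persistence):
    """Return the dimension d with the largest decrease from d to the next dimension.

    avg_persistence: dict mapping dimensionality -> average persistence.
    """
    dims = sorted(avg_persistence)
    # Drop in average persistence when moving from each dimension to the next one.
    drops = {d: avg_persistence[d] - avg_persistence[nxt]
             for d, nxt in zip(dims, dims[1:])}
    return max(drops, key=drops.get)


# Invented toy values (NOT data from the paper), shaped to have one clear drop.
toy_persistence = {13: 0.62, 14: 0.60, 15: 0.59, 16: 0.41, 17: 0.40}
chosen = dimension_before_largest_drop(toy_persistence)
```

With real data, the judgment that a drop is "significant" would still need a criterion beyond simply taking the maximum.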
Figure 3. Yeast MAPK signaling for mating and filamentation. The strongly linked MAPK signaling pathways for mating and filamentation are shown schematically with black arrows indicating mating pathway signaling and gray arrows showing filamentation pathway signaling. The mating pathway is initiated by binding to Ste2p or Ste3p receptors, while the causative molecular trigger for filamentation is unclear. The pathways share many components.
The most enhanced gene ontology terms in patterns 13 and 15. Each term is presented together with a measure of how overrepresented it is compared to a random draw of the same number of genes. These were also confirmed to be significant by hypergeometric tests.
| Pattern 13 term | Overrepresentation | Pattern 15 term | Overrepresentation |
|---|---|---|---|
| Pheromone response, mating-type determination | 6.31 | Transposable elements, viral and plasmid | 8.42 |
| development | 6.09 | Pheromone response, mating-type determination | 7.26 |
| Fungal/microorganism development | 6.09 | Transmembrane signal transduction | 6.41 |
| Mating (fertilization) | 6.09 | G-protein mediated signal transduction | 6.10 |
| Transmembrane signal transduction | 4.98 | development | 5.69 |
| Chemoperception and response | 4.47 | Fungal/microorganism development | 5.69 |
| Cellular sensing and response | 4.25 | Mating (fertilization) | 5.69 |
| Interaction with cellular environment | 2.94 | Chemoperception and response | 5.47 |
| Meiosis | 2.86 | Cellular sensing and response | 5.21 |
| Cell growth/morphogenesis | 2.77 | Cellular communications | 4.45 |
| Cellular communications | 2.77 | Enzyme mediated signal transduction | 3.81 |
| Development of asco- basidio- or zygospore | 2.57 | Interaction with cellular environment | 3.61 |
| Enzyme mediated signal transduction | 2.37 | Enzyme activator | 3.56 |
| Protein kinase cascades | 2.37 | Intracellular signaling | 3.14 |
| G-protein mediated signal transduction | 2.37 | Budding, cell polarity, filamentation | 2.87 |
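The two quantities named in the table caption can be sketched as below. The record does not give the exact formula for the overrepresentation measure, so the ratio of observed to expected term frequency used here is an assumption, and both function names are hypothetical. The hypergeometric check is written as the standard upper-tail probability of drawing at least `k` annotated genes.

```python
from math import comb


def enrichment_ratio(k, n, K, N):
    """Observed fraction of term-annotated genes in a pattern over the genome-wide fraction.

    k: annotated genes in the pattern, n: genes in the pattern,
    K: annotated genes overall,        N: genes overall.
    (Assumed definition; the source does not state the formula.)
    """
    return (k / n) / (K / N)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N, K of which carry the term."""
    total = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total
```

A small p-value from `hypergeom_upper_tail` together with a ratio well above 1 would indicate a term that appears in the pattern more often than a random draw of the same size would explain.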
The genes most strongly associated with patterns 13 and 15, in order of strength of association. For pattern 13, it is noted whether the genes are known to be regulated in the mating process and whether the gene is known to be directly regulated by Ste12p. For pattern 15, the gene function is shown. All data is from the Saccharomyces Genome Database [73, 74].
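| Pattern 13 gene | Regulated in mating | Directly regulated by Ste12p | Pattern 15 gene | Function |
|---|---|---|---|---|
| Fig1 | Yes | No | YCL019W | Transposable element gene |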
| Prm6 | Yes | Yes | YER138C | Transposable element gene |
| Fus1 | Yes | Yes | YER117C | Verified ORF, Prm5 |
| Ste2 | Yes | Yes | YER160C | Transposable element gene |
| Aga1 | Yes | No | YJR029W | Transposable element gene |
| Fus3 | Yes | Yes | YBR012W | Removed from SGD |
| Pes4 | No | No | YML045W | Transposable element gene |
| Prm1 | Yes | Yes | YAR009C | Transposable element gene |
| ORF | -- | -- | YJR027W | Transposable element gene |
| Bar1 | Yes | No | YLR334C | Hypothetical ORF |
Figure 4. Relationship of patterns across dimensionalities. The results for all patterns identified in all runs of Bayesian Decomposition are summarized here. The top row shows three patterns from an analysis with 3 dimensions, while the bottom row shows 25 dimensions. The highlighted node is pattern 13 in 15 dimensions, which is the pattern identified as the mating response. Nodes are connected as described in the text using Pearson correlation measures. The numbers within the nodes are indices and have no intrinsic meaning. Each number provides the row index for P and column index for A for the analysis at that level.
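One plausible way to build such a tree is to connect each pattern at one dimensionality to its most correlated pattern at the previous dimensionality. The exact linking rule is given in the paper's main text, which is not part of this record, so the best-match rule below is an assumption, and `pearson` and `link_levels` are hypothetical helper names.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def link_levels(parents, children):
    """For each child pattern, return the index of the best-correlated parent pattern.

    parents, children: lists of patterns (rows of P) from adjacent dimensionalities.
    """
    return [max(range(len(parents)), key=lambda i: pearson(parents[i], child))
            for child in children]
```

Applying `link_levels` to each adjacent pair of dimensionalities from 3 to 25 would yield one edge per pattern below the top row.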
Figure 5. A sample calculation of the average persistence for a single node. The average persistence is calculated by comparing the persistence at each node in the tree given in Figure 4. Each assignment of each mutant (4 are shown here) to a pattern is binarized as described in the text; the average persistence for a node is then calculated by counting the number of times a mutant assigned to the pattern occurs in the connected nodes. The mutant can occur in any branch below the node of interest to be considered present. If it occurs in multiple child nodes at a single level, that is still treated as a single occurrence for that level. The average for a dimension is then the average of the persistence of all nodes at that level.
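The counting rule in this caption can be sketched on a toy tree. The data layout (a dict of nodes with `level`, `mutants`, and `children`), the function names, and the choice to report the unnormalized mean count of levels are all assumptions made for illustration; the caption does not specify a normalization, and the toy tree is invented.

```python
from collections import defaultdict


def mutants_below(tree, node):
    """Map each deeper level to the union of mutants found in descendants of `node`."""
    by_level = defaultdict(set)
    stack, visited = list(tree[node]["children"]), set()
    while stack:
        child = stack.pop()
        if child in visited:
            continue
        visited.add(child)
        by_level[tree[child]["level"]] |= tree[child]["mutants"]
        stack.extend(tree[child]["children"])
    return by_level


def node_persistence(tree, node):
    """Mean, over the node's mutants, of the number of deeper levels where the mutant recurs."""
    mutants = tree[node]["mutants"]
    if not mutants:
        return 0.0
    by_level = mutants_below(tree, node)
    # Taking the union per level means several child nodes at one level count once.
    counts = [sum(m in present for present in by_level.values()) for m in mutants]
    return sum(counts) / len(counts)


def level_persistence(tree, level):
    """Average node persistence over all nodes at one dimensionality."""
    nodes = [n for n, info in tree.items() if info["level"] == level]
    return sum(node_persistence(tree, n) for n in nodes) / len(nodes)


# Invented toy tree: m1 sits in both level-4 children of "a" but counts once there.
tree = {
    "a": {"level": 3, "mutants": {"m1", "m2"}, "children": ["b", "c"]},
    "b": {"level": 4, "mutants": {"m1"}, "children": ["d"]},
    "c": {"level": 4, "mutants": {"m1", "m3"}, "children": []},
    "d": {"level": 5, "mutants": {"m1", "m2"}, "children": []},
}
```

In this toy tree, `m1` recurs at levels 4 and 5 and `m2` only at level 5, so node `a` has a persistence of 1.5 under the assumed definition.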