David Gutiérrez-Avilés, Cristina Rubio-Escudero
Abstract
Microarray technology is widely used in biological research because it can monitor RNA concentration levels. Analyzing the resulting data is computationally challenging because of the characteristics of these data. Clustering techniques are widely applied to group genes that exhibit similar behavior. Biclustering relaxes the grouping constraints by allowing genes to be evaluated under only a subset of the conditions. Triclustering extends this to longitudinal experiments, in which genes are evaluated under certain conditions at several time points. Triclusters reveal hidden information in the form of behavior patterns from temporal microarray experiments, relating subsets of genes, experimental conditions, and time points. We present an evaluation measure for triclusters called Multi Slope Measure, based on the similarity among the angles of the slopes of the profiles formed by the genes, conditions, and times of the tricluster.
Keywords: angular comparison; fitness function; genetic algorithms; microarrays; time series; triclustering
Year: 2015 PMID: 26124630 PMCID: PMC4479169 DOI: 10.4137/EBO.S25822
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Figure 1. Tricluster representation.
Figure 2. Graphic representation of a tricluster.
Figure 3. Angles for TRI graphic view.
Figure 4. AC(TRI) example.
Figure 5. TriGen algorithm flowchart.
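The abstract describes the measure as comparing the angles of the slopes of the profiles in a tricluster. The following is a minimal sketch of that general idea, not the authors' definition of the Multi Slope Measure: the function names, the unit spacing between points (`step`), and the choice of a plain average of absolute angle differences are all assumptions made for illustration.

```python
import math
from itertools import combinations


def slope_angles(profile, step=1.0):
    """Angle in radians of each segment between consecutive points.

    `step` is the assumed horizontal distance between two points.
    """
    return [math.atan2(b - a, step) for a, b in zip(profile, profile[1:])]


def mean_angle_difference(profiles, step=1.0):
    """Average absolute angle difference over all profile pairs and segments.

    A value of 0.0 means every profile has the same shape, regardless of
    vertical offset. This aggregation is an illustrative assumption.
    """
    angles = [slope_angles(p, step) for p in profiles]
    diffs = [
        abs(x - y)
        for a, b in combinations(angles, 2)
        for x, y in zip(a, b)
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


# Hypothetical profiles: same shape shifted vertically, and mirrored shapes.
parallel = [[1.0, 2.0, 1.5, 3.0], [4.0, 5.0, 4.5, 6.0]]
opposed = [[1.0, 2.0, 1.0], [2.0, 1.0, 2.0]]

print(mean_angle_difference(parallel))
print(mean_angle_difference(opposed))
```

Because only angles are compared, two profiles with the same shape at different expression levels score as identical, while mirrored profiles score far apart.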
TriGen algorithm parameters.
| PARAMETER | DESCRIPTION |
|---|---|
| N | Number of triclusters extracted |
| G | Number of generations |
| I | Number of individuals in the population |
| Ale | Randomness rate |
| Sel | Selection rate |
| Mut | Mutation probability |
| wf | Weight for |
| wg | Weight for the number of genes |
| wc | Weight for the number of conditions |
| wt | Weight for the number of times |
| wog | Weight for the overlap among genes |
| woc | Weight for the overlap among conditions |
| wot | Weight for the overlap among times |
TriGen algorithm control parameters for yeast cell cycle dataset.
| PARAMETER | VALUES |
|---|---|
| N | 20 |
| G | 150 |
| I | 200 |
| Ale | 0.9 |
| Sel | 0.4 |
| Mut | 0.9 |
| wf | 0.8 |
| wg | 0.05 |
| wc | 0 |
| wt | 0 |
| wog | 0.05 |
| woc | 0.05 |
| wot | 0.05 |
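The control parameters above can be held in a small configuration object. This is a hedged sketch: the class name `TriGenParams`, its field names, and both checks are hypothetical and not part of the TriGen implementation. The values are the ones listed for the yeast cell cycle dataset. The seven weights in that table add up to 1; whether the algorithm requires this is not stated here, so the check only reports the sum.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TriGenParams:
    """Hypothetical container for the control parameters in the table."""
    n_triclusters: int  # N
    generations: int    # G
    individuals: int    # I
    randomness: float   # Ale
    selection: float    # Sel
    mutation: float     # Mut
    wf: float
    wg: float
    wc: float
    wt: float
    wog: float
    woc: float
    wot: float

    def weight_sum(self):
        """Sum of the seven weights."""
        return math.fsum(
            [self.wf, self.wg, self.wc, self.wt, self.wog, self.woc, self.wot]
        )

    def rates_valid(self):
        """True when the three rates lie in [0, 1] (an assumed sanity check)."""
        return all(
            0.0 <= r <= 1.0
            for r in (self.randomness, self.selection, self.mutation)
        )


# Values copied from the yeast cell cycle table.
yeast = TriGenParams(
    n_triclusters=20, generations=150, individuals=200,
    randomness=0.9, selection=0.4, mutation=0.9,
    wf=0.8, wg=0.05, wc=0.0, wt=0.0, wog=0.05, woc=0.05, wot=0.05,
)

print(yeast.weight_sum(), yeast.rates_valid())
```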
Correlation results for triclusters from the yeast cell cycle dataset.
| PEARSON | SPEARMAN |
|---|---|
| 0.97 | 0.98 |
| 0.97 | 0.99 |
| 0.96 | 0.99 |
| 0.96 | 0.98 |
| 0.97 | 0.99 |
| 0.96 | 0.99 |
| 0.96 | 0.99 |
| 0.96 | 0.98 |
| 0.96 | 0.98 |
| 0.96 | 0.99 |
| 0.96 | 0.99 |
| 0.95 | 0.98 |
| 0.96 | 0.98 |
| 0.96 | 0.98 |
| 0.97 | 1 |
| 0.96 | 0.98 |
| 0.96 | 0.98 |
| 0.96 | 0.99 |
| 0.96 | 0.99 |
| 0.96 | 0.98 |
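For readers unfamiliar with the two coefficients in the table, the sketch below computes Pearson and Spearman correlation for a single pair of profiles in plain Python. How the per-tricluster values above were aggregated is not described in this record, so this is only an illustration of the coefficients themselves; the function names and the sample profiles are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical profiles that rise together but not in a straight line.
a = [1.0, 2.0, 4.0, 8.0]
b = [1.0, 3.0, 4.0, 10.0]

print(round(pearson(a, b), 2), round(spearman(a, b), 2))
```

Profiles that move in the same direction at every step give a Spearman value of 1 even when their Pearson value is below 1, which is the pattern seen in the row reporting 0.97 and 1.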
Figure 6. TRI11 graphic representations from yeast cell cycle results: (A) sample curves, (B) time curves, (C) gene curves.
GO analysis for tricluster TRI11 found in the yeast cell cycle dataset.
| ID | NAME | P-VALUE |
|---|---|---|
| GO:0043605 | Cellular amide catabolic process | 1.98E-09 |
| GO:0000255 | Allantoin metabolic process | 1.17E-08 |
| GO:0000256 | Allantoin catabolic process | 1.17E-08 |
| GO:0004523 | RNA-DNA hybrid ribonuclease activity | 4.72E-05 |
| GO:0016893 | Endonuclease activity, active with either ribo- or deoxyribonucleic acids and producing 5′-phosphomonoesters | 7.50E-05 |
| GO:0043603 | Cellular amide metabolic process | 1.46E-04 |
| GO:0006144 | Purine nucleobase metabolic process | 3.91E-04 |
| GO:0044419 | Interspecies interaction between organisms | 4.03E-04 |
| GO:0009112 | Nucleobase metabolic process | 6.08E-04 |
| GO:0016891 | Endoribonuclease activity, producing 5′-phosphomonoesters | 1.01E-03 |
Comparison of MSR, LSL, and MSL yeast cell cycle results.
| | MSR | LSL | MSL |
|---|---|---|---|
| Max Pearson | 1 | 0.79 | 0.97 |
| Min Pearson | 0.31 | 0.58 | 0.95 |
| Mean Pearson | 0.47 | 0.69 | 0.96 |
| Max Spearman | 1 | 0.82 | 1 |
| Min Spearman | 0.31 | 0.54 | 0.98 |
| Mean Spearman | 0.45 | 0.67 | 0.99 |
| Max p-value | 1.04E-02 | 7.53E-05 | 1.01E-03 |
| Min p-value | 1.97E-03 | 4.35E-06 | 1.98E-09 |
| Mean p-value | 5.68E-03 | 4.29E-05 | 2.68E-04 |
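The last three entries in the final column of this comparison (MSL, going by the caption order) can be reproduced from the ten values in the GO analysis table for TRI11. The short check below recomputes the maximum, minimum, and mean; the variable names are illustrative, and the values are copied from that table.

```python
# Values from the third column of the GO analysis table for TRI11.
p_values = [
    1.98e-09, 1.17e-08, 1.17e-08, 4.72e-05, 7.50e-05,
    1.46e-04, 3.91e-04, 4.03e-04, 6.08e-04, 1.01e-03,
]

summary = {
    "max": max(p_values),
    "min": min(p_values),
    "mean": sum(p_values) / len(p_values),
}

for name, value in summary.items():
    print(f"{name}: {value:.2e}")
```

The results agree with the reported 1.01 × 10⁻³, 1.98 × 10⁻⁹, and 2.68 × 10⁻⁴.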
TriGen algorithm control parameters for mouse GDS4510 dataset.
| PARAMETER | VALUES |
|---|---|
| N | 20 |
| G | 150 |
| I | 500 |
| Ale | 0.5 |
| Sel | 0.5 |
| Mut | 0.5 |
| wf | 0.8 |
| wg | 0 |
| wc | 0 |
| wt | 0.1 |
| wog | 0 |
| woc | 0 |
| wot | 0.1 |
Correlation results for tricluster mouse GDS4510 dataset.
| PEARSON | SPEARMAN |
|---|---|
| 0.93 | 0.9 |
| 0.58 | 0.57 |
| 0.92 | 0.89 |
| 0.65 | 0.67 |
| 0.61 | 0.65 |
| 0.63 | 0.60 |
| 0.54 | 0.62 |
| 0.59 | 0.63 |
| 0.63 | 0.65 |
| 0.6 | 0.56 |
| 0.95 | 0.9 |
| 0.89 | 0.85 |
| 0.93 | 0.89 |
| 0.95 | 0.89 |
| 0.95 | 0.9 |
| 0.56 | 0.62 |
| 0.93 | 0.87 |
| 0.92 | 0.85 |
| 0.94 | 0.89 |
| 0.94 | 0.89 |
Figure 7. TRI10 graphic representations from mouse GDS4510 results: (A) sample curves, (B) time curves, (C) gene curves.
GO analysis for tricluster TRI10 found in the mouse GDS4510 dataset.
| ID | NAME | P-VALUE |
|---|---|---|
| GO:0007606 | Sensory perception of chemical stimulus | 7.92E-30 |
| GO:0004984 | Olfactory receptor activity | 4.98E-25 |
| GO:0050911 | Detection of chemical stimulus involved in sensory perception of smell | 4.98E-25 |
| GO:0050907 | Detection of chemical stimulus involved in sensory perception | 1.17E-24 |
| GO:0007186 | G-protein coupled receptor signaling pathway | 7.72E-23 |
| GO:0007608 | Sensory perception of smell | 3.62E-22 |
| GO:0009593 | Detection of chemical stimulus | 7.82E-22 |
| GO:0050906 | Detection of stimulus involved in sensory perception | 1.43E-20 |
| GO:0004888 | Transmembrane signaling receptor activity | 2.52E-20 |
| GO:0038023 | Signaling receptor activity | 5.35E-19 |
| GO:0007600 | Sensory perception | 7.33E-18 |
| GO:0004872 | Receptor activity | 7.21E-17 |
| GO:0051606 | Detection of stimulus | 1.90E-16 |
| GO:0050877 | Neurological system process | 5.97E-15 |
| GO:0004871 | Signal transducer activity | 6.91E-15 |
| GO:0060089 | Molecular transducer activity | 1.48E-13 |
| GO:0004930 | G-protein–coupled receptor activity | 4.69E-13 |
| GO:0003008 | System process | 1.07E-11 |
| GO:0007166 | Cell surface receptor signaling pathway | 4.10E-11 |
| GO:0016503 | Pheromone receptor activity | 1.15E-09 |
| GO:0019236 | Response to pheromone | 4.42E-09 |
| GO:0005550 | Pheromone binding | 5.88E-09 |
| GO:0005549 | Odorant binding | 1.59E-08 |
| GO:0042221 | Response to chemical | 1.51E-07 |
| GO:0016021 | Integral component of membrane | 3.76E-07 |
| GO:0031224 | Intrinsic component of membrane | 7.52E-07 |
Comparison of MSR, LSL, and MSL GDS4510 results.
| | MSR | LSL | MSL |
|---|---|---|---|
| Max Pearson | 1 | 1 | 0.95 |
| Min Pearson | 0.52 | 0.64 | 0.54 |
| Mean Pearson | 0.91 | 0.89 | 0.78 |
| Max Spearman | 1 | 1 | 0.9 |
| Min Spearman | 0.5 | 0.6 | 0.56 |
| Mean Spearman | 0.91 | 0.89 | 0.77 |
| Max p-value | 7.34E-04 | 7.40E-08 | 7.52E-07 |
| Min p-value | 1.53E-06 | 8.79E-21 | 7.92E-30 |
| Mean p-value | 3.33E-04 | 8.02E-09 | 5.02E-08 |
TriGen algorithm control parameters for human GDS4472 dataset.
| PARAMETER | VALUES |
|---|---|
| N | 20 |
| G | 700 |
| I | 500 |
| Ale | 0.8 |
| Sel | 0.2 |
| Mut | 0.9 |
| wf | 0.8 |
| wg | 0.01 |
| wc | 0.045 |
| wt | 0.045 |
| wog | 0.01 |
| woc | 0.045 |
| wot | 0.045 |
Correlation results for tricluster human GDS4472 dataset.
| PEARSON | SPEARMAN |
|---|---|
| 0.96683677 | 1 |
| 0.48750048 | 0.47676471 |
| 0.47592126 | 0.47121212 |
| 0.75202621 | 0.79373957 |
| 0.95392167 | 1 |
| 0.94749964 | 1 |
| 0.74349846 | 0.80536153 |
| 0.47685516 | 0.46931529 |
| 0.95370964 | 1 |
| 0.95102082 | 1 |
| 0.94672151 | 1 |
| 0.48325216 | 0.47366678 |
| 0.72992209 | 0.80331134 |
| 0.95754954 | 1 |
| 0.95225892 | 1 |
| 0.95262299 | 1 |
| 0.95113522 | 1 |
| 0.9536509 | 1 |
| 0.47363796 | 0.45628145 |
| 0.95549408 | 1 |
Figure 8. TRI19 graphic representations from human GDS4472 results: (A) sample curves, (B) time curves, (C) gene curves.
GO analysis for tricluster TRI19 found in the human GDS4472 dataset.
| ID | NAME | P-VALUE |
|---|---|---|
| GO:0006614 | SRP-dependent cotranslational protein targeting to membrane | 3.33E-60 |
| GO:0006613 | Cotranslational protein targeting to membrane | 6.24E-60 |
| GO:0045047 | Protein targeting to ER | 1.15E-59 |
| GO:0022626 | Cytosolic ribosome | 2.83E-59 |
| GO:0072599 | Establishment of protein localization to endoplasmic reticulum | 3.80E-59 |
| GO:0000184 | Nuclear-transcribed mRNA catabolic process, nonsense-mediated decay | 1.19E-58 |
| GO:0070972 | Protein localization to endoplasmic reticulum | 4.72E-57 |
| GO:0003735 | Structural constituent of ribosome | 1.91E-55 |
| GO:0044391 | Ribosomal subunit | 8.37E-55 |
| GO:0006415 | Translational termination | 3.29E-53 |
| GO:0006612 | Protein targeting to membrane | 8.04E-53 |
| GO:0000956 | Nuclear-transcribed mRNA catabolic process | 3.17E-52 |
| GO:0019083 | Viral transcription | 1.18E-51 |
| GO:0006402 | mRNA catabolic process | 2.22E-51 |
| GO:0044033 | Multi-organism metabolic process | 5.60E-51 |
| GO:0019080 | Viral gene expression | 5.60E-51 |
| GO:0044445 | Cytosolic part | 2.46E-50 |
| GO:0005840 | Ribosome | 1.00E-49 |
| GO:0006401 | RNA catabolic process | 1.32E-49 |
| GO:0006413 | Translational initiation | 5.99E-48 |
Comparison of MSR, LSL, and MSL GDS4472 results.
| | MSR | LSL | MSL |
|---|---|---|---|
| Max Pearson | 0.95 | 0.83 | 0.96 |
| Min Pearson | 0.8 | 0.76 | 0.47 |
| Mean Pearson | 0.89 | 0.8 | 0.8 |
| Max Spearman | 0.95 | 0.81 | 1 |
| Min Spearman | 0.8 | 0.74 | 0.45 |
| Mean Spearman | 0.89 | 0.78 | 0.83 |
| Max p-value | 3.30E-03 | 6.9E-32 | 5.99E-48 |
| Min p-value | 6.48E-04 | 1.15E-44 | 3.33E-60 |
| Mean p-value | 1.64E-03 | 7.88E-33 | 3.1E-49 |