Qihua Tan, Mads Thomassen, Jacob V B Hjelmborg, Anders Clemmensen, Klaus Ejner Andersen, Thomas K Petersen, Matthew McGue, Kaare Christensen, Torben A Kruse.
Abstract
Identifying the various gene expression response patterns is a challenging issue in expression microarray time-course experiments. Because the regulatory response is heterogeneous across the thousands of genes tested, it is impossible to manually specify a parametric form for each time-course pattern gene by gene. We introduce a growth curve model with fractional polynomials to automatically capture the various time-dependent expression patterns while efficiently handling missing values due to incomplete observations. For each gene, our procedure compares the performance of fractional polynomial models whose power terms are drawn from a fixed set of values offering a wide range of curve shapes, and suggests the best-fitting model. After a limited simulation study, the model was applied to our human in vivo irritated epidermis data with missing observations to investigate time-dependent transcriptional responses to a chemical irritant. Our method was able to identify the various nonlinear time-course expression trajectories. The integration of growth curves with fractional polynomials provides a flexible way to model different time-course patterns, together with model selection and significant gene identification strategies, that can be applied in microarray-based time-course gene expression experiments with missing observations.
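The per-gene model comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it fits a plain least-squares first-degree fractional polynomial on pooled observations and ignores the subject-level structure of a growth curve model. The power set is the conventional one from the fractional polynomial literature (an assumption, since the abstract only says "a set of fixed values"), and the function names and the `shift` added to time so that hour 0 becomes positive are hypothetical choices made for illustration.

```python
import numpy as np

# Conventional fractional-polynomial powers (assumed; not stated in the abstract).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(t, power):
    """First-degree fractional polynomial term: t**power, with power 0 meaning log(t)."""
    t = np.asarray(t, dtype=float)
    return np.log(t) if power == 0 else t ** power


def best_fp1(time, expr, shift=1.0, powers=FP_POWERS):
    """Fit expr ~ b0 + b1 * FP(time + shift) for each power; return the lowest-RSS fit.

    All candidate models have two coefficients, so ranking by residual sum of
    squares is equivalent to ranking by a likelihood-based criterion here.
    """
    time = np.asarray(time, dtype=float)
    expr = np.asarray(expr, dtype=float)
    keep = ~np.isnan(expr)      # missing observations are simply dropped
    t = time[keep] + shift      # hypothetical shift so that time 0 is strictly positive
    y = expr[keep]
    best = None
    for p in powers:
        X = np.column_stack([np.ones_like(t), fp_transform(t, p)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        if best is None or rss < best["rss"]:
            best = {"power": p, "intercept": float(coef[0]),
                    "slope": float(coef[1]), "rss": rss}
    return best


# Illustrative data only: three subjects, four time points, some values missing.
hours = np.tile([0, 0.5, 4, 24], 3)
values = 2 + 3 * np.log(hours + 1)
values[[1, 6, 11]] = np.nan
print(best_fp1(hours, values))
```

The selected power and the sign of the slope together describe the shape of the fitted trajectory. A fuller treatment would add subject-level random effects and a significance test per gene.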
Year: 2011 PMID: 21966290 PMCID: PMC3182337 DOI: 10.1155/2011/261514
Source DB: PubMed Journal: Adv Bioinformatics ISSN: 1687-8027
Power assessment for different proportions of randomly missing observations. Columns give the proportion of missing observations.
| Fold change | 0% | 5% | 10% | 25% | 50% | 70% |
|---|---|---|---|---|---|---|
| 2 | 0.62 | 0.60 | 0.53 | 0.40 | 0.19 | 0.08 |
| 2.5 | 0.97 | 0.97 | 0.96 | 0.91 | 0.64 | 0.27 |
| 3 | 1.00 | 1.00 | 1.00 | 1.00 | 0.94 | 0.58 |
| 3.5 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.83 |
The incomplete structure of the example data. Many observations were missing during the time-course experiment, so gene expression data were not available for them. A total of 21 observations were available from 9 subjects. Columns give time in hours; × marks an available observation.
| Subject | 0 | 0.5 | 4 | 24 |
|---|---|---|---|---|
| 1 | × | × | × | |
| 2 | × | × | × | |
| 3 | × | × | | |
| 4 | × | × | | |
| 5 | × | × | | |
| 6 | × | × | × | |
| 7 | × | × | | |
| 8 | × | × | × | |
| 9 | × | | | |
Figure 1. Heatmap showing the mean expression levels for the 15 genes with significant time-course patterns, clustered using the hierarchical clustering method and ordered sequentially with time.
Figure 2. Time-course expression patterns for the 15 significant genes, plotted according to the estimated power for transformation and the sign of the regression coefficient.