Abstract
The microarray technique allows the simultaneous measurement of the expression levels of thousands of mRNAs. By mining these data, one can identify the dynamics of gene expression time series. Detecting genes that are periodically expressed is an important step toward studying the regulatory mechanisms associated with the circadian cycle. Finding periodicity in biological time series is challenging because the observed series usually exhibit non-idealities such as noise, short length, outliers, and unevenly sampled time points. Consequently, a method for finding periodicity should be robust against such anomalies in the data. In this paper, we propose a general and robust procedure for identifying genes with a periodic signature at a given significance level. The identification method is based on autoregressive models and information theory. Using simulated data, we show that the suggested method can identify rhythmic profiles even in the presence of noise and when the number of data points is small. Applying our analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles of the cyanobacterium Synechocystis.
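As a rough illustration of how an autoregressive model can expose periodicity, the sketch below fits a second-order model x[t] = a1*x[t-1] + a2*x[t-2] by least squares and reads a period off the complex roots of its characteristic polynomial. The model order, the function name `ar2_period`, and the use of plain least squares are assumptions made for illustration only. The abstract does not state the model order or the information-theoretic criterion, so this should not be read as the paper's procedure.

```python
import math

def ar2_period(series):
    """Estimate the dominant period (in samples) from a least-squares AR(2) fit.

    Hypothetical helper for illustration; returns None when the fitted
    model has no oscillatory (complex) roots.
    """
    mean = sum(series) / len(series)
    x = [v - mean for v in series]  # remove the mean before fitting

    # Normal equations for x[t] = a1*x[t-1] + a2*x[t-2], t = 2..n-1
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(2, len(x)):
        s11 += x[t - 1] * x[t - 1]
        s12 += x[t - 1] * x[t - 2]
        s22 += x[t - 2] * x[t - 2]
        b1 += x[t] * x[t - 1]
        b2 += x[t] * x[t - 2]
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        return None  # degenerate input, e.g. a constant series
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (b2 * s11 - b1 * s12) / det

    # Roots of z^2 - a1*z - a2 = 0 are complex only if a1^2 + 4*a2 < 0
    if a1 * a1 + 4.0 * a2 >= 0.0:
        return None
    radius = math.sqrt(-a2)  # modulus of the complex root pair
    cos_theta = max(-1.0, min(1.0, a1 / (2.0 * radius)))
    theta = math.acos(cos_theta)  # root angle in radians per sample
    if theta == 0.0:
        return None
    return 2.0 * math.pi / theta
```

With a 2 h sampling interval, an estimated period of 12 samples would correspond to a 24 h rhythm. On short, noisy series such an estimate is unreliable by itself, which is why a significance test of some kind is needed on top of it.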
Year: 2011 PMID: 22028849 PMCID: PMC3196541 DOI: 10.1371/journal.pone.0026291
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Dynamic parameters of simulated data.
Scatter plot of the dynamic parameters corresponding to 24-point time series (A) and 12-point time series (B). Dark gray down-triangles correspond to sawtooth signals, gray up-triangles to square-step signals, light gray circles to sinusoidal signals with a 48 h period, and black squares to sinusoidal signals with a 24 h period. The small gray points correspond to surrogate time series, and the ellipsoid is the quantile contour at level 0.9. Open symbols represent time series contaminated with a low level of noise (standard deviation of 5% of the signal amplitude), while filled symbols represent time series contaminated with a high level of noise (standard deviation of 15%). For the 24-point time series, the symbols corresponding to each of the four oscillatory patterns form distinguishable clusters at low noise. The sampling frequency also affects the characterization.
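The kind of test signals named in this caption can be generated with a few lines of code. The following is a minimal sketch, not the authors' simulation code: the function name `make_series`, the unit amplitude, the Gaussian noise model, and the reading of the noise levels 5 and 15 as percentages of the amplitude (0.05 and 0.15) are all assumptions. Pairing 24 points with 2 h sampling and 12 points with 4 h sampling is likewise assumed.

```python
import math
import random

def make_series(shape, n_points, step_h, period_h=24.0, noise_frac=0.05, seed=0):
    """Return a noisy periodic series sampled every step_h hours.

    shape is "sine", "square", or "sawtooth"; noise_frac is the noise
    standard deviation as a fraction of the (unit) signal amplitude.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    amplitude = 1.0
    series = []
    for i in range(n_points):
        # Position within the current cycle, in [0, 1)
        phase = (i * step_h / period_h) % 1.0
        if shape == "sine":
            value = amplitude * math.sin(2.0 * math.pi * phase)
        elif shape == "square":
            value = amplitude if phase < 0.5 else -amplitude
        elif shape == "sawtooth":
            value = amplitude * (2.0 * phase - 1.0)  # ramps from -1 up to +1
        else:
            raise ValueError("unknown shape: " + shape)
        series.append(value + rng.gauss(0.0, noise_frac * amplitude))
    return series

# Example: a 24-point, low-noise sinusoid with a 24 h period, and a
# 12-point, high-noise sinusoid with a 48 h period.
low_noise_24h = make_series("sine", 24, 2.0, period_h=24.0, noise_frac=0.05)
high_noise_48h = make_series("sine", 12, 4.0, period_h=48.0, noise_frac=0.15)
```

Under these assumptions both example series span 48 h, so the 48 h sinusoid completes only a single cycle.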
Figure 2. True positive rate and false positive rate.
The true positive rate (filled symbols) decays after a significance level of 0.01. At this level the percentage of false positives is and for 2 h and 4 h sampling rates, respectively. Two sampling resolutions are used: 2 h (circles) and 4 h (triangles).
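For reference, the two rates plotted in this figure can be computed from labeled simulation results as sketched below. The function name `detection_rates` and the assumption that each simulated series yields a p-value compared against the significance level are illustrative; the caption does not describe how the paper's test statistic is formed.

```python
def detection_rates(p_values, is_periodic, alpha):
    """Return (true positive rate, false positive rate) at level alpha.

    p_values[i] is the test result for series i and is_periodic[i] says
    whether that series was generated as periodic. Both classes must be
    present, otherwise a rate would be undefined.
    """
    true_pos = false_pos = n_periodic = n_null = 0
    for p, periodic in zip(p_values, is_periodic):
        called = p <= alpha  # series is declared periodic at this level
        if periodic:
            n_periodic += 1
            true_pos += called
        else:
            n_null += 1
            false_pos += called
    return true_pos / n_periodic, false_pos / n_null
```

Sweeping `alpha` over a grid of significance levels and plotting both rates would give curves of the kind shown in the figure, one pair per sampling resolution.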
Figure 3. Dynamic parameters of microarray data.
Scatter plot of the dynamic parameters corresponding to each gene profile of Exp. 1 (A) and Exp. 2 (B). The parameter space was arbitrarily divided into two regions. Gray dots correspond to genes whose dynamic parameter values are and and are significant at the 0.9 level. The analysis reports 587 genes with an oscillatory pattern in Exp. 1 and 514 in Exp. 2.
Figure 4. Expression profiles of the 63 genes with oscillatory patterns in both experiments.
Genes are sorted by their phase in Exp. 1. Expression was normalized to the mean expression over all time points and is represented on a gray scale.
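The normalization and phase ordering described in this caption can be sketched as follows. Several choices here are assumptions rather than details from the paper: normalization is taken to mean dividing by the mean over time points, the phase is estimated as the peak time of the first harmonic at a fixed 24 h period, and the helper names are hypothetical. The paper may define phase differently.

```python
import math

def normalize_to_mean(profile):
    """Divide each value by the profile's mean over all time points."""
    mean = sum(profile) / len(profile)
    return [value / mean for value in profile]

def phase_hours(profile, step_h, period_h=24.0):
    """Estimate the peak time (hours) of the period_h harmonic of a profile."""
    mean = sum(profile) / len(profile)
    cos_sum = sin_sum = 0.0
    for i, value in enumerate(profile):
        angle = 2.0 * math.pi * i * step_h / period_h
        cos_sum += (value - mean) * math.cos(angle)
        sin_sum += (value - mean) * math.sin(angle)
    # atan2 gives the angle of the peak; convert it to hours in [0, period_h)
    peak_angle = math.atan2(sin_sum, cos_sum)
    return (peak_angle / (2.0 * math.pi) * period_h) % period_h

def sort_by_phase(profiles, step_h, period_h=24.0):
    """Return gene names ordered by estimated phase.

    profiles maps a gene name to its list of expression values.
    """
    return sorted(profiles, key=lambda name: phase_hours(profiles[name], step_h, period_h))
```

The projection onto a single harmonic is exact only when the samples are evenly spaced and cover whole cycles; for other sampling schemes a least-squares fit would be the safer choice.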
Figure 5. Scatter plot of the 63 selected genes.
The horizontal axis corresponds to the phase shift between gene profiles from the experiments, while the vertical axis corresponds to the dynamic distance between gene profiles from the different experiments. Black dots correspond to three circadian clock genes. KaiA (slr0756) and KaiC (slr0758) genes have a similar phase and similar dynamics in both replicates.
Figure 6. Expression profiles of the four genes.
Expression of kaiA (slr0756), kaiB3 (sll0486), kaiC1 (slr0758) and kaiC3 (slr1942), corresponding to the circadian clock machinery of cyanobacteria. Black squares correspond to Exp. 1, while circles correspond to Exp. 2.
Some genes with circadian expression.
| Index | Accession | Enzyme Name | Function |
|---|---|---|---|
| 30 F | sll0329 | 6-phosphogluconate dehydrogenase | Pentose phosphate pathway |
| 31 F | sll1196 | Phosphofructokinase | Glycolysis |
| 32 F | sll1234 | Adenosylhomocysteinase | Amino acids and amines |
| 33 F | sll1479 | 6-phosphogluconolactonase | Pentose phosphate pathway |
| 34 F | slr0301 | Phosphoenolpyruvate synthase | Pyruvate metabolism |
| 35 F | slr0394 | Phosphoglycerate kinase | Glycolysis |
| 36 F | slr0884 | GAPDH 1 | Glycolysis |
| 37 F | slr1705 | Aspartoacylase | Amino acids and amines |
| 38 F | slr1734 | G6PDN assembly protein | Pentose phosphate pathway |
| 39 F | slr1793 | Transaldolase | Pentose phosphate pathway |
| 40 F | slr1843 | Glucose 6-phosphate dehydrogenase (G6PDN) | Pentose phosphate pathway |
| 41 F | slr2094 | Fructose-1,6-/sedoheptulose-1,7-bisphosphatase | Other |
| 44 H | sll0741 | Pyruvate flavodoxin oxidoreductase | soluble electron carriers |
| 45 H | sll1220 | Diaphorase su. of the bidirectional hydrogenase | Hydrogenase |
| 46 H | sll1223 | Diaphorase su. of the bidirectional hydrogenase | Hydrogenase |
| 47 H | sll1484 | Type 2 NADH dehydrogenase | NADH dehydrogenase |
| 48 H | sll1899 | Cytochrome c oxidase folding protein | Respiratory terminal oxidases |
| 49 H | slr1136 | Cytochrome c oxidase su. II | Respiratory terminal oxidases |
| 50 H | slr1137 | Cytochrome c oxidase su. I | Respiratory terminal oxidases |
| 51 H | slr1138 | Cytochrome c oxidase su. III | Respiratory terminal oxidases |
| 52 H | slr2034 | Putative homolog of plant HCF136 | Photosystem II |
| 53 J | sll1330 | Two-component response regulator OmpR sf. | Regulatory functions |
| 54 J | slr0081 | Two-component response regulator OmpR sf. | Regulatory functions |
| 55 J | slr0312 | Two-component response regulator NarL sf. | Regulatory functions |
| 56 J | slr0947 | Response regulator for energy transfer from phycobilisomes to photosystems | Regulatory functions |
| 57 J | slr1416 | similar to MorR protein | Regulatory functions |
| 58 J | slr1738 | Transcription regulator Fur family | Regulatory functions |
| 59 J | slr1983 | Two-component hybrid sensor and regulator | Regulatory functions |
Genes exhibiting circadian rhythm in both experiments that are linked to energy metabolism (F), to photosynthesis and respiration (H) and to regulatory functions (J). The functional categories of the genes are according to KEGG.
*Corresponds to genes detected by the cosinor method, and **corresponds to genes detected by the cosinor method with relaxed filtering conditions, as reported by Kucho et al. 2004.
Corresponds to genes whose expression is influenced by light, as reported by Stephanopoulos et al. 2004.
su. denotes subunit, and sf. denotes subfamily.
Figure 7. Phase diagrams of the genes listed in the table above.
Group F corresponds to genes related to energy metabolism. Group H corresponds to genes related to photosynthesis and respiration. Group J corresponds to genes related to regulatory functions. and correspond to the horizontal axis and to the vertical axis, respectively (see details in the text).