Literature DB >> 22135306

Identification of a core set of signature cell cycle genes whose relative order of time to peak expression is conserved across species.

Miguel A Fernández1, Cristina Rueda, Shyamal D Peddada.   

Abstract

A cell division cycle is a well-coordinated process in eukaryotes with cell cycle genes exhibiting a periodic expression over time. There is considerable interest among cell biologists to determine genes that are periodic in multiple organisms and whether such genes are also evolutionarily conserved in their relative order of time to peak expression. Interestingly, periodicity is not well-conserved evolutionarily. A conservative estimate of a number of periodic genes common to fission yeast (Schizosaccharomyces pombe) and budding yeast (Saccharomyces cerevisiae) ('core set FB') is 35, while those common to fission yeast and humans (Homo sapiens) ('core set FH') is 24. Using a novel statistical methodology, we discover that the relative order of peak expression is conserved in ∼80% of FB genes and in ∼40% of FH genes. We also discover that the order is evolutionarily conserved in six genes which are potentially the core set of signature cell cycle genes. These include ace2 (a transcription factor) and polo-kinase plo1, which are well-known hubs of early M-phase clusters, cdc18 a key component of pre-replication complexes, mik1 which is critical for the establishment and maintenance of DNA damage check point, and histones hhf1 and hta2.

Entities:  

Mesh:

Year:  2011        PMID: 22135306      PMCID: PMC3326295          DOI: 10.1093/nar/gkr1077

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

A cell division cycle among eukaryotes consists of a sequence of four major phases, namely, the G1, S, G2 and the M phase. The G1 phase (also known as the Gap 1 phase) is a resting phase where the cells grow in size and prepare for synthesis during the S phase. Furthermore, G1 phase also serves as one of two major check points, where if DNA damage is detected then the cell is prevented from proceeding to the S phase (1,2). Cells which pass the G1 check point proceed to S phase where the DNA replication takes place. This phase is followed by the G2 or Gap 2 phase. Similar to G1, this phase serves as a check point to ensure cells with damaged DNA do not proceed to the M phase (or mitosis) where the cells divide to form two daughter cells. Genes participating in a cell division cycle have a cyclical pattern of expression with peak attained just before their function (3). There are several intrinsic differences among organisms in various aspects of cell division cycle. For instance, the amount of time spent by a cell in different phases varies. Fission yeast, Schizosaccharomyces pombe (S. pombe), cell spends almost 70% of its time in the G2 phase, whereas the budding yeast, Saccharomyces cerevisiae (S. cerevisiae), cell spends roughly quarter of its time in G2 phase. Arbidopsis thaliana has a relatively small G2 phase but a long S phase. In contrast to S. pombe, a human cell may spend substantially more time in the G1 phase than in G2 phase (See www.cyclebase.org). Also, the proportion of genes that are known to participate in cell division cycle varies with organisms. For example, it is estimated that there are twice as many periodic genes in S. cerevisiae as in S. pombe (4). Despite many such differences, researchers are interested in (i) identifying genes that are periodic in multiple organisms (referred to as ‘periodically conserved’ genes) (Figure 1); (ii) among periodically conserved genes, identifying those that are also conserved in their phase of peak expression (Figure 1). This has been an active area of research over the past several decades [cf. (3,5)]. With the advent of microarray technology, numerous microarray studies have been conducted on several model organisms such as S. cerevisiae (6–9), S. pombe (4,10–12), Homo sapiens (13) and Arabidopsis (14,15). Such large-scale genome wide data on multiple organisms provide an excellent opportunity to determine genes involved in cell cycle and study their functions. It also allows one to understand the similarities and dissimilarities in the cell cycle of various organisms. A useful database containing results from various cell cycle microarray experiments is available at www.cyclebase.org (16), henceforth referred as ‘cyclebase’. These microarray data allow biologists to debate the conservation of genes participating in a cell cycle and their times of peak expression (3,4,10,11, 12,17). Based on a comprehensive analysis of the above microarray data and other published data, Jensen et al. (3) concluded that both periodicity as well as phase of peak expression are evolutionarily poorly conserved. Earlier a similar conclusion was drawn by Rustici et al. (4) who concluded that only 40 or so orthologs are periodic in both species of yeasts.
Figure 1.

Conservation of periodicity and phase. In each panel, the vertical hash mark on the time axis represents the boundary of a phase.

Conservation of periodicity and phase. In each panel, the vertical hash mark on the time axis represents the boundary of a phase. Although the poor conservation of periodicity and the phase of peak expression for most cell cycle genes may be biologically plausible, as evolutionarily functions of some genes may have changed, one cannot ignore variability between and within studies that may have contributed to these findings. For instance, even within the same organism there are major differences among studies published in the literature. Recently, three different groups of researchers conducted a total of 10 microarray experiments on S. pombe [5 by Rustici et al. (4), 3 by Oliva et al. (11) and 2 by Peng et al. (12)]. As summarized by (11) and by Caretta-Cartozo et al. (17), the three studies disagreed considerably on the number of periodic genes. According to Caretta-Cartozo et al. (17), only 156 out of ∼5000 genes in S. pombe genome were declared to be periodic by all three studies, although individually each group identified at least three times as many periodic genes. Furthermore, even among genes that were found to be periodic in at least two of the three studies, there were disagreements among the studies in terms of time to peak expression of some genes. For instance, according to Peng et al. (12) cdc18, mob1, imp2 and cig2 peak during the G1 phase, whereas Oliva et al. (11), Rustici et al. (4) and cyclebase suggest that these are M phase genes. The above discrepancies among studies even in the same organism are not surprising and can be attributed to various factors such as, natural variability in the data, experimental conditions, etc. Factors such as these result in statistical variability and uncertainty in estimates of time to peak expression. Not much has been discussed in the literature regarding these issues. The problem becomes even more challenging when comparisons across multiple organisms are to be made. In such comparisons, the biological differences among organisms may be confounded by statistical uncertainties due to variability in the data. For example, Rustici et al. (4) concluded that cell cycle regulation of majority of genes is not conserved between S. pombe and S. cerevisiae. On the other hand, Oliva et al. (11) and Peng et al. (12) suggest greater amount of similarities between the two species of yeasts and infer conservation of regulatory mechanisms. Since cell division cycle is a carefully orchestrated process, it is reasonable to hypothesize that the relative order of peak expression among a core set of cell cycle genes may be conserved even if the phase of some genes may have been evolutionarily modified (Figure 2). Genes whose relative order of time to peak expression is conserved may not only have a well-defined function in cell division cycle, but may also have potential interactions or associations with each other. In this article, we develop a formal statistical methodology to test the hypothesis that the relative order of peak expression among a core set of cell cycle genes is conserved in a pair of organisms. Using this methodology, we investigate if the ‘core set FB’ of fission yeast genes have the same relative order of peak expression as their budding yeast orthologs. Similarly, we investigate if the ‘core set FH’ of fission yeast genes have the same relative order of peak expression as their human orthologs. The statistical procedure developed in this article is novel and could help biologists formulate and test other similar hypotheses.
Figure 2.

(I) Relative order of peak expression of genes A,B and C is not conserved in Species 1 and 2. (II) Relative order of peak expression of genes D, E and F is conserved in Species 1 and 2. Orthologs in Species 2 are denoted by lower case letters a, b, c, d, e and f. In each panel, the vertical hash mark on the time axis represents the boundary of a phase. Insets in each panel represent the time to peak expression in terms of phase of cell cycle.

(I) Relative order of peak expression of genes A,B and C is not conserved in Species 1 and 2. (II) Relative order of peak expression of genes D, E and F is conserved in Species 1 and 2. Orthologs in Species 2 are denoted by lower case letters a, b, c, d, e and f. In each panel, the vertical hash mark on the time axis represents the boundary of a phase. Insets in each panel represent the time to peak expression in terms of phase of cell cycle.

MATERIALS AND METHODS

Relative order of peak expression among cell cycle genes

For a gene D, the time to its peak expression can be described in terms of an angle on a unit circle, known as the phase angle, which is denoted by ϕ. Suppose D, E and F are three cell cycle genes where D is an S phase gene, E is an early G2 phase gene and F is a mid-G2 phase gene, then they satisfy the relative order; D followed by E which is followed by F and which is followed by D. We represent this relative order of peak expression among the three genes by ϕ ≺ ϕ ≺ ϕ ≺ ϕ (or equivalently, D ≺ E ≺ F ≺ D). Suppose S. cerevisiae genes D, E and F are the orthologs of S. pombe genes d, e and f, respectively. Suppose D ≺ E ≺ F ≺ D, then we say that the relative order is conserved among the orthologs if d ≺ e ≺ f ≺ d (see Figure 2). In the ideal setting, if functions of all genes in the core set are conserved through evolution and if cell cycle is a well-ordered mechanism of nature, then it is reasonable to hypothesize that the relative order of expression of genes in the core set is conserved between the two organisms. Note that the relative order is invariant of the location of the pole of the circle. This is important for several reasons. First, biologically, the order of genes around the circle has no bearing on where the pole of the circle is established (i.e. it is rotation invariant). Secondly, a common challenge with time course cell cycle experiments is that one cannot be sure about the exact biological time when the cells were arrested to define the pole precisely. Also, different labs and experiments may arrest cells during different phases of cell cycle. Consequently, it can be challenging to compare phase angles across experiments since each experiment may have a different pole. However, the relative order of genes should be invariant of the location of the pole. Also, it is important to note that our definition of conservation of relative order does not require the orthologs pairs (D, d ), (E, e), (F, f ), etc. to have same phase angles or even the same phases (see the right panels in Figure 2). We just require d, e, f to satisfy the same relative order as D, E, F. Using Rustici et al. (4), Oliva et al. (11) and cyclebase, we arrived at the core set FB of 35 S. pombe cell cycle genes that are periodic in both yeasts (Table 1). Similarly, using cyclebase we arrived at the core set FH of 24 S. pombe cell cycle genes that are periodic in both S. pombe as well as H. sapiens (Table 2). We limited our core sets to include only those genes whose cycelbase periodicity rank is ≤500. The rank cut-off of 500 was arbitrarily chosen. Our point is that genes with higher ranks are less likely to be periodic with estimated phase angles subject to small concentration parameter, resulting in large uncertainty estimates.
Table 1.

Saccharomyces pombe genes in the core set FB arranged according to the relative order of S. cerevisiae orthologs

Relative orderSaccharomyces cerevisiae
Saccharomyces pombe
GeneCBase RankPeaktime (CBase) (deg.)Phase (CBase)GeneCBase RankFSA order
1CDC6500349.2Mcdc181218
2RFA1668.4G1ssb15612
3RNR15472G1cdc2282
4MSH61672G1msh65010
5MRC119272G1mrc13323
6POL111272G1pol16127
7SMC35572G1psm37314
8MCD11175.6G1rad219416
9RAD513475.6G1rhp5127534
10CLN2879.2G1cig2389
11POL28479.2G1pol212832
12SWE17797.2Smik15111
13HHT228133.2Sh3.3205
14HHF130136.8Shhf1173
15HHT113140.4Shht3266
16HTA212144Shta2317
17HTB25144Shtb1194
18HTZ1188176.4Spht112231
19KIP3390165.6Sklp56928
20FKH185180S/G2fkh2358
21SWI5124234G2ace21519
22BUD4229234G2mid22822
23CDC5117234G2plo14624
24MOB1177248.4G2mob111930
25ASE1111252G2mcp135935
26MYO1253252G2myo34926
27CHS287252G2chs25913
28HOF1113255.6G2cdc151117
29HOF1113255.6G2imp27629
30KIN3104291.6G2/Mfin14725
31DBF272298.8Msid28315
32SST2471324Mrgs115933
33CDC2024338.4Mslp121
34PST11000M/G1SPAC1705.03C2421
35DSE443521.6G1eng12120

CBase = Cyclebase

Table 2.

Saccharomyces pombe genes arranged in the core set FH according to the relative order of H. sapiens orthologs

Relative orderHomo sapiens
Saccharomyces pombe
GeneCBase RankPeaktime (CBase) (deg.)Phase (CBase)GeneCBase rankFSA order
1IFIT231310.8G1ssn624919
2LRRC5649643.2G1sds2238422
3ZNF36748140.4G1ace2*153
4CDC640165.6G1cdc18*122
5PKMYT1231187.2Smik1*5113
6HIST2H4B219194.4Shhf1*1711
7DHFRL1213208.8Sdfr147324
8SH3GL2332219.6SSPBC19C2.1031021
9H2AFX83259.2S/G2hta2*314
10HRSP12116262.8G2mug7115514
11PCBP1189266.4G2rnc118616
12TOP2A14291.6G2top222618
13PIF135302.4G2pif1888
14FOXM1107309.6G2fkh2*3512
15Hsp40203320.4G2SPCC63.1321617
16CDC25B64324G2/Mcdc2515815
17AURKA4324G2/Mark126520
18KIF107327.6Mklp5*697
19CCNB149334.8Mcig2*385
20PLK11334.8Mplo1*466
21MAPK133334.8Mspm19710
22CDC2024334.8G2slp1*21
23API438338.4Mbir145323
24RAD2180345.6Mrad21*949

Genes with * have periodic S. cerevisiae orthologs, CBase = Cyclebase

Saccharomyces pombe genes in the core set FB arranged according to the relative order of S. cerevisiae orthologs CBase = Cyclebase Saccharomyces pombe genes arranged in the core set FH according to the relative order of H. sapiens orthologs Genes with * have periodic S. cerevisiae orthologs, CBase = Cyclebase In the case of human orthologs, we relied completely on the peaktime specified by the cyclebase database to arrive their relative order (Table 2). However, in the case of budding yeast orthologs, in addition to cyclebase, we used published literature and Saccharomyces Genome Database (http://www.yeastgenome.org/) to arrive at the relative order (Table 1). We now describe the relative order of the 35 S. cerevisiae orthologs. Since the mRNA level as well as its protein level peaks during the early stages of G1 phase and is the precursor for DNA synthesis, therefore we begin with CDC6. This gene is followed by several G1 phase genes such as those involved in DNA repair, replication and check point (Replication Factor Alpha, RNR1, MSH6, MRC1 and POL1), cohesion of sister chromatids (SMC3 and MCD1), recombinational repair of double-strand breaks in DNA (RAD51), activation of Cdc28p to promote transition from G1 to S phase (CLN2, a late G1 phase cyclin) and DNA synthesis during DNA repair (POL2). The S phase genes that followed the G1 phase genes are those involved in: regulation of G2/M transition by inhibition of Cdc28p kinase activity (SWE1), chromatin assembly (histones such as HHT2, HHF1, HHT1, HTA2, HTB2, HTZ1) and mitotic spindle position (KIP3). These are followed by G2 phase transcription factors such as FKH1 and SWI5. Several G2 phase genes considered here have proteins involved in important functions such as: bud site selection (BUD4), cytokinesis and septation (CDC5, MOB1, ASE1, MYO1, CHS2, HOF1). Note that we considered both S. pombe orthologs of HOF1, namely, cdc15 and imp2 in our analysis. These genes are followed by KIN3, G2/M check point gene whose protein Kin3 plays a critical role in DNA damage recognition before the cell enters M phase (18). Among the M phase genes in the proposed relative order, DBF2 and CDC20 have a function for cells to exit from mitosis, while PST1 has function in the construction of cell wall. Our proposed relative order concluded with DSE4, a daughter cell-specific protein which degrades the cell wall causing the daughter cell to separate from the mother cell. Hence, it is logical that DSE4 was the last gene in our proposed relative order before returning to CDC6. According to cyclebase, some of the S. cerevisiae orthologs in the core set FB have identical peaktimes hence we assigned identical phase angles to all such genes in the null hypothesis. Specifically, following genes within parenthesis were hypothesized to have the same phase angles: (cdc22, msh6, mrc1, pol1, psm3), (rad21, rhp51), (cig2, pol2), (ace2, mid2, plo1), (mcp1, myo3, chs2) and (cdc15, imp2). The resulting relative order is described in Figure 3, where genes that were hypothesized to have same phase angle are along the same ray from the center of the circle (ray not drawn). See also Table 1. Known biological functions of genes in the core set FB are provided in Supplementary Table S1. The goal of this study is to test whether the S. pombe genes satisfy the relative order specified by the S. cerevisiae orthologs. Thus we tested the null hypothesis that the phase angle of cdc18 is followed by the phase angle of ssb1, … , SPAC1705.03c which in turn is followed by the phase angle of eng1 which in turn is followed by the phase angle of cdc18 against the alternative hypothesis that this order is not true. More precisely:
Figure 3.

Saccharomyces pombe genes arranged according to the relative order and approximate locations of their S. cerevisiae orthologs (in parenthesis). Sectors drawn are according to S. pombe cell cycle.

H1 :H0 is not true Saccharomyces pombe genes arranged according to the relative order and approximate locations of their S. cerevisiae orthologs (in parenthesis). Sectors drawn are according to S. pombe cell cycle. Similarly, we tested the following hypotheses to see whether the S. pombe genes satisfy the relative order specified by the 24 H. sapiens orthologs: H1 :H0 is not true There are two reasons (biological and statistical) for the above formulation of null and alternative hypotheses. First, as stated earlier, the cell division cycle is a fundamental process in eukaryotes and one would expect various aspects of this process to be conserved through evolution. This is the basic premise of many recent papers [e.g. (4)] which tried to identify genes that are periodic in multiple organisms. Since our investigation is based on such conserved genes, the conservation of the relative order should be the null hypothesis rather than the alternative hypothesis. Basically, among genes that are declared to be conserved between species, our null hypothesis states that their relative order is also conserved. There is also a statistical reason for our choice of null and alternative hypotheses. If the null hypothesis was that the relative order is not conserved among the q genes in the two organisms, then the null hypotheses would contain (q − 1)! configurations of parameters and the alternative hypothesis contains only one configuration. As q increases, the null hypothesis is too large and is never likely to be rejected. For example, if q = 35 the total number of possible null configurations are of the order 1040, which is extremely large. No statistical test would have sufficient power to reject the null hypothesis in such situations. In fact, the power will go to zero as q increases!

Statistical test

For each gene i, i = 1, 2, … , q and experiment j, j = 1, 2, … , E, we model the unconstrained estimator of phase angle of peak expression θ, obtained from the Random Period Model (RPM) (19), using the von Mises distribution (VM). This distribution plays an important role in circular data analysis similar to the normal distribution for Euclidean data. Thus, we assume that where ϕ is the true unknown phase angle of peak expression of gene i in the j-th experiment. The concentration parameter κ represents the uncertainty associated with θ. We assume that κ depends on experiment j but not on gene i. There are two sources of uncertainty associated with the phase angle estimate of each gene, one is specific to the gene and the other is due to the experiment (which is common to all genes within the experiment). This resembles the classical mixed effects linear model in Euclidean space data. Since the number of time points used in each of the time course experiments considered in this article is fairly large, for any specific gene, the uncertainty associated with the estimator of the phase angle based on the RPM is negligible relative to the uncertainty due to the experiment. For this reason, we only retained the uncertainty component corresponding to the experiment. For each experiment j, j = 1, 2, … , E, our problem of interest is to the test the following hypotheses: H1 :H0 is not true. Let be a simple order constraint starting at index I and let ϕ = (ϕ1, ϕ2, … , ϕ)′. Then, for each j = 1, 2, …, E, the above null hypothesis can be rewritten as ϕ ∈ C, where . Let denote the restricted maximum likelihood estimator of ϕ subject to the constraint ϕ ∈ C (20). determines a partition f = {1, … , I} into sets of consecutive coordinates on which is constant. These sets are called level sets. Then, we construct the following test statistic to test the above hypotheses Notice that T is a measure of the angular distance between θ and . Our conditional test T, described in detail in the Supplementary Data, rejects the null hypothesis whenever T ≥ c(m), where m is the number of level sets for , c(m) is chosen so that , and ϕ0 is any point in the null hypotheses for which all coordinates are equal. Since in practice, κ are unknown, in order to derive a proper value for these parameters, we obtained its maximum likelihood estimator using an analysis of variance approach based on the von Mises distribution (see Section 1.3 in Supplementary Data). Theoretical details of T are given in Section 1.4 of Supplementary Data. There we demonstrate that our proposed test is an asymptotic α level test. The derivation of this latter property is not straightforward as several statistical issues arise as a result of the specific characteristic of the testing problem, namely, a complicated null hypothesis for a directional parametric model.

Lack of fit criterion for a given relative order

For a relative order specified by the null hypothesis, let p denote the P-value associated with the j-th experiment, j = 1, 2, … , E, and let . Note that L always lies between 0 and ∞, with smaller value corresponding to better fit. In the extreme case, if the presumed relative order is perfectly satisfied within each experiment with P-value of 1, then L = 0. Thus, among a collection of plausible orders for a set of cell cycle genes, a biologist may choose the order that corresponds to the smallest value of L. Note that under the null hypothesis, if the P-values are independently and uniformly distributed in the interval (0, 1), then 2L is distributed as a central χ2 random variable with E degrees of freedom. This is often known as Fisher's method of combining P-values and yields a formal statistical test.

RESULTS

Using the 10 S. pombe time course experimental data (4,11,12), we first obtained the unconstrained phase angle estimates of genes in the core sets FB and FH (Supplementary Tables S2 and S3) which are then used for testing various hypotheses described in this article. Using the estimates in Supplementary Table S2, we tested the hypotheses appearing in Equation (1) that all 35 S. pombe genes in FB satisfy the relative order specified by the S. cerevisiae orthologs against the alternative hypothesis that they are not. The null hypothesis is rejected at P ≤ 0.15 in 5 out of 10 experiments (Table 3). Of these five experiments, two have a P < 0.0001. If the null hypothesis was true in each of the 10 experiments, then the binomial probability of observing two or more experiments (out of 10) with a P = 0.0001 is 4.49 × 10−7, which is extremely small. This suggests that the relative order hypothesized in Equation (1) may not be true and thus the 35 S. pombe genes do not follow the same relative order as their S. cerevisiae orthologs. Of course, in the above argument we implicitly assume that the outcomes of the 10 experiments are identically and independently distributed. Although this is a commonly made assumption, we acknowledge that it may be restrictive.
Table 3.

Test for relative order of S. pombe genes in the core set FB (Order specified by S. cerevisiae orthologs)

ExperimentP-values
Based on all 35Based on final 28
Saccharomyces pombe genesSaccharomyces pombe genes
Oliva cdc0.080.53
Oliva elut10.690.98
Oliva elut20.140.41
Peng cdc0.240.99
Peng elut0.570.99
Rust cdc15.37E-110.34
Rust cdc20.190.86
Rust elut10.070.88
Rust elut20.240.99
Rust elut32.88E-050.53
Lack of fit46.753.57
Test for relative order of S. pombe genes in the core set FB (Order specified by S. cerevisiae orthologs) A question of interest is whether we can identify a subset of the 35 genes that conserve the relative order between the two yeasts. Since the number of all possible subsets (of various sizes) is extremely large, it would be practically impossible to enumerate all possible subsets of all sizes and then test the null hypotheses such as the one appearing in Equation (1) for each subset. This problem resembles the classical problem of selection of variables (or model selection) in linear regression analysis. Accordingly, we developed a Forward Selection Algorithm (FSA), which is described in the Supplementary Data. Similar to forward selection procedure in classical linear regression analysis, the FSA proceeds systematically by entering one gene at a time into the test for relative order according to its periodicity rank assigned by the cyclebase. Smaller the rank, the more periodic the gene is and hence its phase angle estimate is more likely to be reliable. The proposed FSA begins with all ortholog pairs that have a cyclebase rank <100. Thus, a gene is included in Step 1 of FSA if both fission yeast as well as the budding yeast orthologs of the gene has a rank <100. Details of the subsequent steps and the implementation of FSA are provided in the Supplementary Data. Using FSA (Supplementary Table S4), we discover that 28 out of 35 S. pombe genes, namely, cdc18, ssb1, cdc22, msh6, mrc1, pol1, psm3, rad21, cig2, pol2, mik1, h3.3, hhf1, hht3, hta2, htb1, pht1, klp5, fkh2, ace2, plo1, chs2, cdc15, imp2, sid2, slp1, SPAC1705.03C, eng1, potentially satisfy the same order as their S. cerevisiae orthologs. Thus, the relative order of these 28 genes seems to be conserved between the two species of yeasts. For these genes, the null hypothesis is rejected in none of the experiments even at a level of significance as high as 0.30 (Table 3). It is also interesting to note that the lack of fit criterion L based on all 35 genes was 46.75 and it dropped to 3.57 for the above 28 genes selected by FSA. Similar to genes in FB, we also tested the Equation (2) for genes in the core set FH and found that the relative order was rejected in 6 out of 10 experiments at a P < 0.001 (Table 4). Using FSA we found ace2, cdc18, mik1, histones (hhf1, hta2), rnc1, top2, cdc25, plo1 and slp1, to satisfy the same relative order as their human orthologs (Table 4). Among these 10 genes, ace2, cdc18, mik1, histones (hhf1, hta2) and plo1 also satisfied the relative order specified by their S. cerevisiae orthologs. Recall that, evolutionarily, humans and fungi are ∼1.5 billion years apart and budding yeast and fission yeasts are nearly billion years apart (21). Thus, it appears that the above six genes are evolutionarily conserved in their relative order of peak expression during the cell division cycle (Figure 4 and Table 5). These six genes are well known in the literature to play a critical role during cell division cycle. For example, the transcription factor ace2 and the polo-kinase plo1 are well-known hubs of early M phase clusters (22), the cell cycle gene cdc18 is a key component of pre-replication complexes for the onset of S phase (23), histones hhf1, hta2 play an important role during the S phase and mik1 is critical in the establishment and maintenance of DNA damage check point (24).
Table 4.

Test for relative order of S. pombe genes in the core set FH in the 10 experiments (Order specified by H. sapiens orthologs)

ExperimentP-values
Based on all 24Based on final 10
Saccharomyces pombe genesSaccharomyces pombe genes
Oliva cdc0.030.60
Oliva elut10.100.93
Oliva elut20.190.95
Peng cdc1.10E-030.95
Peng elut0.070.83
Rust cdc14.12E-101
Rust cdc22.37E-060.93
Rust elut100.92
Rust elut21.90E-130.93
Rust elut30.061
Lack of fit>100 0001.10
Figure 4.

A core set of signature cell cycle genes with relative order of time to peak expression conserved among S. pombe, S. cerevisae and H. sapiens. Sectors and approximate locations of genes are drawn according to S. pombe. S. pombe genes are in green, S. cerevisiae orthologs are in red and H. sapiens orthologs are in blue.

Table 5.

A core set of signature cell cycle genes

Saccharomyces pombe gene (Saccharomyces cerevisiae, Homo sapiens orthologs)Saccharomyces pombeSaccharomyces cerevisiaeHomo sapiens
plo1 (CDC5, PLK1)G2G2M
ace2 (SWI5, ZNF367)G2/MG2G1
cdc18 (CDC6, CDC6)MMG1/S
mik1 (SWE1, PKMYT1)MG1/SS
hhf1 (HHF1, HIST2H4B)G1/SSS
hta2 (HTA2, H2AFX)G1/SSS/G2

Cell cycle phases are obtained from cyclebase

A core set of signature cell cycle genes with relative order of time to peak expression conserved among S. pombe, S. cerevisae and H. sapiens. Sectors and approximate locations of genes are drawn according to S. pombe. S. pombe genes are in green, S. cerevisiae orthologs are in red and H. sapiens orthologs are in blue. Test for relative order of S. pombe genes in the core set FH in the 10 experiments (Order specified by H. sapiens orthologs) A core set of signature cell cycle genes Cell cycle phases are obtained from cyclebase To ensure that our statistical test has sufficient power to detect the alternative hypothesis, i.e. reject the null hypothesis that the genes in both species satisfy the same relative order, we conducted a simulation study for the fission and budding yeast data by randomly permuting the order of the genes in Step 1 of FSA and applied the algorithm. We considered 100 permutations and performed the first step of FSA on each permuted data. The null was rejected for all 100 permutations. We also found that in at least 5 out of the 10 experiments the P <0.05 and this occurred in every one of the 100 random permutations we considered. Note that the binomial probability of observing a P-value of 0.05 in at least 5 experiments out of 10 experiments by random chance is 6.36 × 10−5, which is a very unlikely event. Yet, in all 100 random permutations we found 5 out of 10 experiments to have a P <0.05, thus suggesting that our test is reasonably powerful to reject the null hypothesis of relative order if the hypothesis is not true. In our simulation study, we did not investigate the power of our test for alternatives where the order among the genes is not well conserved but not entirely random order. As with any statistical test, there will be a reduction in power as we get closer to the null hypothesis. In other words, if the true order is a very minor perturbation of the null hypothesis then probability of rejecting the null hypothesis would be smaller than when true order is substantially different from the null hypothesis. In a future project, we plan to investigate this problem in greater detail.

DISCUSSION

Since cell cycle genes follow a synchronized pattern of expression (25), one may speculate that some of the cell cycle genes are functionally conserved through evolution. There is an intrinsic order to the peak expression among the cell cycle genes so that they are converted into proteins in a well-synchronized manner to execute their respective functions during cell cycle. Consequently, the relative timing of peak expression of some of the cell cycle genes must be conserved through evolution. Often the order among genes is determined using heat maps and published literature. There does not seem to exist a formal statistical methodology to test hypothesis regarding the order among genes in a given experiment. In this article, we have developed a novel statistical methodology that can be used for testing relative order among the phase angles of cell cycle genes. Using the methodology developed in this article, we demonstrated that a core subset of 28 S. pombe genes have the same relative timing of peak expression as their S. cerevisiae orthologs. This number increases to 32 if we reduce the stringency of our criterion. Thus, it may be reasonable to infer that among the 35 genes in the core set FB, at least 80% satisfy the same relative order of peak expression as their S. cerevisiae orthologs. Similarly, ∼40% of the FH core set genes (10 out of 24) satisfy the same relative order of peak expression as their H. sapiens orthologs. Although this article takes the first step toward a formal statistical methodology for answering questions about conservation of the relative order cell cycle genes, it is important to acknowledge that, analogous to classical linear regression analysis, one may consider other alternatives to FSA and derive an improved algorithm. In this article, we took a ‘conservative approach’ when formulating the hypotheses appearing in Equations (1) and (2). Note that the average cyclebase ranks of S. cerevisiae and H. sapiens orthologs used in this article are 126.57 and 121.21, respectively. These are almost twice the average cyclebase rank of S. pombe genes, which is 66.17. Since higher cyclebase rank corresponds to poorer periodicity, therefore potentially, there is greater uncertainty in the phase angles of S. cerevisiae and H. sapiens orthologs in comparison to S. pombe genes. Since we formulated our null hypotheses using S. cerevisiae and H. sapiens orthologs and used S. pombe data to test, the FSA is more likely to select fewer genes than otherwise. The above formulation resembles the classical ‘single sample’ statistical hypothesis testing problem. It would be useful to extend our procedure so that uncertainties in both orthologs are taken into account when formulating the testing problem, resembling the classical ‘two sample’ testing problem. Note that the ‘two-sample’ problem for testing the equality of two sets of orderings is not well developed even for Euclidean space data, and the problem is substantially more complicated for circular data. The proposed relative order for S. cerevisiae and H. sapiens were determined using the peak times reported in cyclebase and the published literature. We recognize that the exact order among some of the ‘neighboring’ genes is difficult to ascertain. Thus, there is a potential for misspecification of the relative order. This resembles the classical problem of model misspecification that occurs so commonly in a variety of situations. If a biologist chooses to refine our proposed relative order based on her/his understanding of the functions and order of the genes, then she/he may explore such alternative orders and test them using our proposed methodology. A biologist could also select best fitting relative orders using the lack of fit criterion introduced in this article. Hence in this article we have provided a general methodology that would allow biologists to hypothesize a sequential order of peak expression for cell cycle genes and test it. A freely downloadable SAS based user-friendly software can be obtained by either contacting the first author or by visiting www.niehs.nih.gov/research/atniehs/labs/bb/staff/peddada/index.cfm. An R package containing this and other circular data analysis routines is being developed and will soon be made available at the above address.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online. Supplementary Tables S1–S4. Supplementary Information and Supplementary References [26-31].

FUNDING

Spanish Ministerio de Ciencia e Innovación (MTM2009-11161 to M.A.F. and C.R.); Intramural Research Program of the National Institutes of Health, National Institute of Environmental Health Sciences (Z01 ES101744-04). Funding for open access charge: Biostatistics Branch, NIEHS, National Institutes of Health. Conflict of interest statement. None declared.
  24 in total

1.  A random-periods model for expression of cell-cycle genes.

Authors:  Delong Liu; David M Umbach; Shyamal D Peddada; Leping Li; Patrick W Crockett; Clarice R Weinberg
Journal:  Proc Natl Acad Sci U S A       Date:  2004-05-03       Impact factor: 11.205

2.  Genome-wide gene expression in an Arabidopsis cell suspension.

Authors:  Margit Menges; Lars Hennig; Wilhelm Gruissem; James A H Murray
Journal:  Plant Mol Biol       Date:  2003-11       Impact factor: 4.076

3.  Co-evolution of transcriptional and post-translational cell-cycle regulation.

Authors:  Lars Juhl Jensen; Thomas Skøt Jensen; Ulrik de Lichtenberg; Søren Brunak; Peer Bork
Journal:  Nature       Date:  2006-09-27       Impact factor: 49.962

Review 4.  Cell cycle checkpoints: preventing an identity crisis.

Authors:  S J Elledge
Journal:  Science       Date:  1996-12-06       Impact factor: 47.728

5.  New weakly expressed cell cycle-regulated genes in yeast.

Authors:  Ulrik de Lichtenberg; Rasmus Wernersson; Thomas Skøt Jensen; Henrik Bjørn Nielsen; Anders Fausbøll; Peer Schmidt; Flemming Bryde Hansen; Steen Knudsen; Søren Brunak
Journal:  Yeast       Date:  2005-11       Impact factor: 3.239

Review 6.  Cell cycle control and cancer.

Authors:  L H Hartwell; M B Kastan
Journal:  Science       Date:  1994-12-16       Impact factor: 47.728

7.  Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization.

Authors:  P T Spellman; G Sherlock; M Q Zhang; V R Iyer; K Anders; M B Eisen; P O Brown; D Botstein; B Futcher
Journal:  Mol Biol Cell       Date:  1998-12       Impact factor: 4.138

8.  Estimation of Parameters Subject to Order Restrictions on a Circle With Application to Estimation of Phase Angles of Cell Cycle Genes.

Authors:  Cristina Rueda; Miguel A Fernández; Shyamal Das Peddada
Journal:  J Am Stat Assoc       Date:  2009-03-01       Impact factor: 5.033

9.  Dissecting the fission yeast regulatory network reveals phase-specific control elements of its cell cycle.

Authors:  Pierre R Bushel; Nicholas A Heard; Roee Gutman; Liwen Liu; Shyamal D Peddada; Saumyadipta Pyne
Journal:  BMC Syst Biol       Date:  2009-09-16

10.  Periodic gene expression program of the fission yeast cell cycle.

Authors:  Gabriella Rustici; Juan Mata; Katja Kivinen; Pietro Lió; Christopher J Penkett; Gavin Burns; Jacqueline Hayles; Alvis Brazma; Paul Nurse; Jürg Bähler
Journal:  Nat Genet       Date:  2004-06-13       Impact factor: 38.330

View more
  8 in total

1.  Regulation of Ace2-dependent genes requires components of the PBF complex in Schizosaccharomyces pombe.

Authors:  M Belén Suárez; María Luisa Alonso-Nuñez; Francisco del Rey; Christopher J McInerny; Carlos R Vázquez de Aldana
Journal:  Cell Cycle       Date:  2015-08-03       Impact factor: 4.534

2.  Order restricted inference for oscillatory systems for detecting rhythmic signals.

Authors:  Yolanda Larriba; Cristina Rueda; Miguel A Fernández; Shyamal D Peddada
Journal:  Nucleic Acids Res       Date:  2016-09-04       Impact factor: 16.971

3.  Circular piecewise regression with applications to cell-cycle data.

Authors:  Cristina Rueda; Miguel A Fernández; Sandra Barragán; Kanti V Mardia; Shyamal D Peddada
Journal:  Biometrics       Date:  2016-03-17       Impact factor: 2.571

4.  isocir: An R Package for Constrained Inference using Isotonic Regression for Circular Data, with an Application to Cell Biology.

Authors:  Sandra Barragán; Miguel A Fernández; Cristina Rueda; Shyamal Das Peddada
Journal:  J Stat Softw       Date:  2013-08       Impact factor: 6.440

5.  Determination of Temporal Order among the Components of an Oscillatory System.

Authors:  Sandra Barragán; Cristina Rueda; Miguel A Fernández; Shyamal D Peddada
Journal:  PLoS One       Date:  2015-07-07       Impact factor: 3.240

6.  Integrated time course omics analysis distinguishes immediate therapeutic response from acquired resistance.

Authors:  Genevieve Stein-O'Brien; Luciane T Kagohara; Sijia Li; Manjusha Thakar; Ruchira Ranaweera; Hiroyuki Ozawa; Haixia Cheng; Michael Considine; Sandra Schmitz; Alexander V Favorov; Ludmila V Danilova; Joseph A Califano; Evgeny Izumchenko; Daria A Gaykalova; Christine H Chung; Elana J Fertig
Journal:  Genome Med       Date:  2018-05-23       Impact factor: 11.117

7.  Linear mixed effects models under inequality constraints with applications.

Authors:  Laura Farnan; Anastasia Ivanova; Shyamal D Peddada
Journal:  PLoS One       Date:  2014-01-21       Impact factor: 3.240

8.  Yeast gene CMR1/YDL156W is consistently co-expressed with genes participating in DNA-metabolic processes in a variety of stringent clustering experiments.

Authors:  Basel Abu-Jamous; Rui Fa; David J Roberts; Asoke K Nandi
Journal:  J R Soc Interface       Date:  2013-01-24       Impact factor: 4.118

  8 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.