
A novel biclustering approach with iterative optimization to analyze gene expression data.

Sawannee Sutheeworapong, Motonori Ota, Hiroyuki Ohta, Kengo Kinoshita.

Abstract

OBJECTIVE: With the dramatic increase in microarray data, biclustering has become a promising tool for gene expression analysis. Biclustering has been shown to be superior to conventional clustering in identifying multifunctional genes and in searching for genes that are co-expressed under only a few specific conditions; that is, under a subgroup of all conditions. Biclustering based on a genetic algorithm (GA) has shown better performance than greedy algorithms, but the overlap state for biclusters must be treated more systematically.
RESULTS: We developed a new biclustering algorithm (binary-iterative genetic algorithm [BIGA]), based on an iterative GA, by introducing a novel, ternary-digit chromosome encoding function. BIGA searches for a set of biclusters by iterative binary divisions that allow the overlap state to be explicitly considered. In addition, the average of the Pearson's correlation coefficient was employed to measure the relationship of genes within a bicluster, instead of the mean square residual, the popular classical index. Compared with six existing algorithms, BIGA found highly correlated biclusters, with large gene coverage and reasonable gene overlap. The gene ontology (GO) enrichment showed that most of the biclusters are significant, with at least one GO term overrepresented.
CONCLUSION: BIGA is a powerful tool to analyze large amounts of gene expression data, and will facilitate the elucidation of the underlying functional mechanisms in living organisms.

Keywords:  Pearson’s correlation coefficient; biclustering; genetic algorithm; microarray data

Year:  2012        PMID: 23055751      PMCID: PMC3459542          DOI: 10.2147/AABC.S32622

Source DB:  PubMed          Journal:  Adv Appl Bioinform Chem        ISSN: 1178-6949


Background

The complete sequencing of the genomes of many organisms has led to the launch of various omics studies. Among these, the advent of deoxyribonucleic acid (DNA) microarray technology has enabled the monitoring of the expression levels of numerous genes at a time, under many different growth conditions. This technique is now widely used in diverse types of biological research, such as identifying disease markers, reconstructing cellular signaling pathways, and inferring gene regulatory networks. DNA microarray technology has also provided numerous biological insights.1–3 Data generated from even a few array measurements are quite complex, and the amount of microarray data available in public databases is increasing dramatically, due to the efficiency and rapid improvement of DNA microarray technologies. As a result, the interpretation of DNA microarray data obtained under a large number of conditions has become a challenging problem. In the analysis of a large dataset, researchers usually begin by searching for similar patterns within the data. In the case of DNA microarray data, similar patterns of gene expression are often investigated by using cluster analyses, such as K-means clustering4 and hierarchical clustering.5 Although clustering can provide considerable biological information, conventional clustering algorithms may not be suitable for some analyses of microarray data, for the following two reasons. First, many genes encode proteins involved in several functional activities at a time, but conventional clustering methods cannot identify these genes, because they allow a gene to belong to only one cluster, rather than to multiple clusters.
Second, it is difficult to find genes that are co-expressed under a few specific conditions but are differently expressed under other conditions, because the similarity of genes in conventional clustering is determined by the entire expression data.6,7 With respect to these shortcomings, biclustering is more effective than conventional clustering, since it can cluster both genes and conditions simultaneously, and a gene (or a condition) can be involved in multiple clusters at a time.7 The concept of biclustering was first proposed by Hartigan,8 and Cheng and Church9 applied it to search for the most homogeneously expressed genes over certain sets of conditions by using greedy search algorithms. Most biclustering algorithms have been implemented with greedy search algorithms,1,10,11 to reduce the calculation costs. Finding a maximum bicluster is known to be a nondeterministic polynomial time (NP)-complete problem, which can be solved in polynomial time on a nondeterministic Turing machine,12 so a greedy search algorithm is required in actual applications to provide efficient approximations. Usually, one greedy search results in one bicluster, and the greedy search is applied repeatedly to the data, while preventing the reproduction of similar biclusters, to obtain a set of various biclusters as the final output. Biclustering has also been implemented by using a genetic algorithm (GA), to find a practical solution that balances bicluster quality and calculation cost. A GA emulates an evolutionary process to obtain nearly optimal solutions.13 Initially, a set of candidate solutions is prepared, each solution being called a chromosome. The chromosomes evolve by exchanging their parts and changing some elements into a different state, and elite chromosomes are selected to survive as the parents of the next generation.
This evolution and selection process is repeated over a number of generations to yield an optimal solution.13 Bleuler et al14 first applied GA to biclustering, whereby a binary string (representing whether or not a gene or a condition belongs to a bicluster) was employed as the representation of a chromosome. To avoid any redundancy of the resulting biclusters, Bleuler et al introduced a special selection operator called environment selection. Chakraborty and Maka15 developed a similar GA-based biclustering method that differs in the chromosome initialization: the initial chromosomes are prepared by K-means clustering. These methods find an optimum set of biclusters from one GA search. For such methods, it would be difficult to obtain a set of various, nonredundant biclusters, because only better chromosomes can survive the selection process of GA, and thus the resulting biclusters tend to converge into similar results in the later generations.14,15 Another type of GA-based biclustering, Sequential Evolutionary Biclustering (SEBI), has a distinct strategy. SEBI initially applies GA to select the optimal bicluster, and then this process is repeated so that the genes and the conditions in the biclusters already selected are less likely to be selected again. In other words, although SEBI would generate a set of diverse biclusters, it de-emphasizes the overlap of biclusters, a significant feature of biclustering.16 In the present study, we propose BIGA as the basis of a novel biclustering approach. In BIGA, an attempt is made to progressively divide the large amounts of input data into small datasets by iteratively using GA, as SEBI does. Instead of evaluating a set of biclusters, GA is applied to each division process. Therefore, the resulting biclusters are substantially diverse. In addition, BIGA introduces the overlap state, explicitly defined in the ternary digit (or trit) encoding of the chromosome.
In this study, the algorithm is described, the performance of BIGA is compared with those of six existing biclustering algorithms, and the biological relevance of BIGA is evaluated by using gene ontology (GO) enrichment analyses. Finally, we conclude that BIGA is a powerful and practical solution for biclustering with high-dimensional data.

Material and methods

Definition of biclusters

BIGA accepts a set of gene expression data in the matrix form D = (G, C), comprising N rows of genes G = {g_1, g_2, …, g_N} and M columns of conditions or samples C = {c_1, c_2, …, c_M}, where N and M are the total numbers of genes and conditions, respectively. All genes will be clustered into K overlapping biclusters B = {B_1, B_2, …, B_K}, and each bicluster B_k corresponds to a submatrix B_k = (X_k, Y_k) of D, where X_k ⊆ G and Y_k ⊆ C. The sizes of X_k and Y_k, ie, the numbers of genes and conditions of a bicluster, are denoted by n and m, respectively, with n ≤ N and m ≤ M.
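As a small illustration of these definitions, the following Python sketch treats a bicluster as a submatrix of D selected by a gene subset X and a condition subset Y. The toy matrix, the gene and condition labels, and the helper name `submatrix` are hypothetical and are not taken from the study.

```python
# Toy expression matrix D: rows are genes, columns are conditions (made-up values).
D = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [5.0, 1.0, 0.0, 2.0],
]
genes = ["g1", "g2", "g3"]
conditions = ["c1", "c2", "c3", "c4"]


def submatrix(D, genes, conditions, X, Y):
    """Return the rows of D for the genes in X, restricted to the conditions in Y."""
    rows = [i for i, g in enumerate(genes) if g in X]
    cols = [j for j, c in enumerate(conditions) if c in Y]
    return [[D[i][j] for j in cols] for i in rows]


# Two overlapping biclusters: gene g2 belongs to both.
B1 = submatrix(D, genes, conditions, X={"g1", "g2"}, Y={"c1", "c3", "c4"})
B2 = submatrix(D, genes, conditions, X={"g2", "g3"}, Y={"c2"})
```

Because X and Y are chosen independently for each bicluster, a gene such as g2 can appear in several biclusters, which conventional clustering does not allow.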

Binary-iterative genetic algorithm

In order to decompose D into B systematically, a binary tree was introduced. Generally, a binary tree comprises nodes and directed edges, in which each node can be extended to at most two child nodes.17 In this work, we regarded each node as a bicluster and each edge as a parent–child relationship between a pair of biclusters. We designated the method as BIGA. BIGA consists of the following three steps. A schematic diagram of BIGA is shown in Figure 1.
Figure 1

Schematic diagram of binary-iterative genetic algorithm. (A) Decomposition of a parent bicluster into two child biclusters encoded in a string (left panel). The string indicates that a parent bicluster (middle panel) is divided into two child biclusters (right panel). The red, blue, and violet cells in the biclusters belong to b_left, b_right, and both, respectively. (B) Decoding rule of a string. (C) Binary division performed by genetic algorithm (GA). The best string is underlined in the rectangle. For each GA, the generated biclusters (b_left and b_right) are evaluated to determine their states: continue the decomposition (*), quit the decomposition and accept (+), or quit the decomposition and discard (−). (D) Flow diagram of the bicluster evaluation.

Step 1: A division of microarray data is represented by a string, a sequence of trits (0, 1, 2) of length n + m, where n and m are the numbers of genes and conditions in the parent bicluster, respectively. The trits 0, 1, and 2 mean that the associated gene or condition is contained in one of the two child biclusters, b_left or b_right, or in both, respectively. Thus, one string can encode the division of one bicluster into two biclusters, while allowing overlap. An example of this encoding is shown in Figure 1A. The “|” symbol serves as a spacer between the genes and the conditions for clarity. The string is equivalent to the division illustrated by the matrix (microarray data, or a bicluster) in the middle of Figure 1A. In the matrix, the rows and the columns correspond to the genes and the conditions, respectively. Each cell of the matrix belongs to b_right (blue cell), b_left (red), or both (violet), under the decoding rule shown in Figure 1B. The white cells are ignored because they are not co-expressed with the colored cells. Consequently, the bicluster shown in the middle of Figure 1A represents the division into the two biclusters on the right of Figure 1A.

Step 2: To search for the best chromosome (the best trit string) representing the optimal division of a bicluster, GA is performed (rectangles in Figure 1C). In the GA procedure, a mutation and a crossover are introduced into each chromosome. In a mutation, a trit on a chromosome is altered to 0, 1, or 2, whereas in a crossover, two chromosomes exchange corresponding parts with each other. Chromosomes with higher fitness scores (described in the following section) survive into the next generation, and all other chromosomes are discarded. GA was implemented with the Java Genetic Algorithms Package (JGAP),18 with a mutation rate of 0.01 and a crossover rate of 0.5. Finally, the best chromosome after 100 generations of GA (the underlined string in the rectangle) is selected, based on the fitness score (see the next section).
The best chromosome is then decoded into two biclusters (b_left and b_right). We decide whether to continue with further decompositions after the evaluation of the biclusters, as follows.

Step 3: Evaluation of biclusters. For each child bicluster, the numbers of genes and conditions (n′ and m′), the average Pearson’s correlation coefficient (PCC), and the parent–child redundancy are examined to decide whether we should quit or continue the decomposition. Subsequently, the bicluster is either accepted as an element of the final biclusters, B, or discarded. We calculate the PCC of every gene pair in a bicluster, and average them (the average PCC). The parent–child redundancy is defined as the ratio of the number of genes of the child bicluster (n′) to that of the parent bicluster (n). Therefore, a small parent–child redundancy indicates that the child bicluster contains far fewer genes than the parent, and a large parent–child redundancy means that the number of genes in the child bicluster is almost the same as that of the parent. The average PCC and the parent–child redundancy are abbreviated as C and R, respectively. The decision process is illustrated in Figure 1D. Briefly, the process employs four rules: (I) we quit the decomposition and accept the bicluster if C is higher than the correlation threshold τc; (II) we quit the decomposition and discard the bicluster if the bicluster is “small,” which is judged by two size thresholds, one for n′ and one for m′; (III) we also quit the decomposition and discard the bicluster if the redundancy, R, is small (R < τr) or large (R > 1 − τr); and (IV) otherwise, we continue the decomposition. The large-redundancy condition in rule III was employed to reduce the calculation cost, because a child bicluster that is similar to its parent bicluster and has a low C is not expected to produce promising results. The four thresholds, ie, the size thresholds for n′ and m′, τc, and τr, were empirically determined as 30, 10, 0.65, and 0.15, respectively (see Table S1).
The Roman numerals in Figure 1D indicate the rule applied in each decision. In Figure 1C, the accepted and discarded biclusters are marked by + and − symbols, respectively, and the bicluster to be decomposed is marked by a * symbol. Figure 1C indicates that four biclusters are accepted.
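The decoding in Step 1 and the decision rules in Step 3 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation. It assumes that trit 0 maps to the left child and trit 1 to the right child, and that the rules are checked in the order I, II, III, IV, because the exact flow of Figure 1D is not recoverable from the text. The function and parameter names (`decode`, `decide`, `min_genes`, `min_conds`, `tau_c`, `tau_r`) are hypothetical.

```python
def decode(trits, n):
    """Split a trit string of length n + m into two child biclusters.

    Assumed mapping: 0 -> left child only, 1 -> right child only, 2 -> both.
    Returns ((left_genes, left_conds), (right_genes, right_conds)) as index lists.
    """
    gene_part, cond_part = trits[:n], trits[n:]

    def members(part, own):
        # A trit equal to `own` or to 2 (both) places the item in this child.
        return [i for i, t in enumerate(part) if t in (own, 2)]

    left = (members(gene_part, 0), members(cond_part, 0))
    right = (members(gene_part, 1), members(cond_part, 1))
    return left, right


def decide(n_child, m_child, n_parent, avg_pcc,
           min_genes=30, min_conds=10, tau_c=0.65, tau_r=0.15):
    """Return 'accept', 'discard', or 'continue' for one child bicluster.

    The order in which the rules are checked is an assumption.
    """
    if avg_pcc > tau_c:                                  # rule I
        return "accept"
    if n_child < min_genes or m_child < min_conds:       # rule II: too small
        return "discard"
    redundancy = n_child / n_parent                      # R = n'/n
    if redundancy < tau_r or redundancy > 1 - tau_r:     # rule III
        return "discard"
    return "continue"                                    # rule IV


# Five genes followed by three conditions (the "|" spacer is omitted here).
left, right = decode([0, 2, 1, 1, 0, 0, 2, 1], n=5)
```

In this toy string, gene 1 and condition 1 carry trit 2 and therefore appear in both children, which is how the encoding makes overlap explicit.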

Fitness function

In general, large biclusters that include genes co-expressed across many specific conditions are preferable. The average PCC of a bicluster was employed to evaluate the gene co-expression. Furthermore, the relative area A of the bicluster, defined by (n′/n)(m′/m), using the gene and condition numbers of the parent and child biclusters, was used to evaluate the size of a bicluster. Two parameters, the gene-weight (α) and the condition-weight (β), were introduced to control the balance between the number of genes and that of the conditions (0 < α, β < 1) in the relative area, A. The fitness function of a chromosome, f(c), was defined by Equation 1, where c, b_i (i = left or right), A(b_i), and C(b_i) denote a chromosome, one of the child biclusters, the relative area of child bicluster b_i, and the average PCC of child bicluster b_i, respectively. The balance between α and β was important in order to select biologically meaningful biclusters when using f(c). Since a high average PCC for a large number of genes is obtained rather easily when only a small number of conditions are considered, a certain number of conditions should be required for each bicluster, to ensure the biological significance. The values of α and β were varied and assessed empirically, and finally 0.3 and 0.5 were chosen, respectively (see the results in Table S1).
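The two ingredients of the fitness score, the average PCC and the relative area, can be sketched in Python as follows. Equation 1 itself is not reproduced in this text, so the way the ingredients are combined here is an assumption and not the authors' formula: α and β are applied as exponents on n′/n and m′/m, and the scores of the two children are summed. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles.

    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_pcc(rows):
    """Average PCC over every pair of gene rows (needs at least two rows)."""
    pairs = list(combinations(rows, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


def relative_area(n_child, m_child, n_parent, m_parent, alpha=0.3, beta=0.5):
    """Assumed weighted relative area: (n'/n)**alpha * (m'/m)**beta."""
    return (n_child / n_parent) ** alpha * (m_child / m_parent) ** beta


def fitness(children, n_parent, m_parent, alpha=0.3, beta=0.5):
    """Assumed fitness: sum over both children of A(b_i) * C(b_i)."""
    total = 0.0
    for rows in children:
        area = relative_area(len(rows), len(rows[0]), n_parent, m_parent, alpha, beta)
        total += area * average_pcc(rows)
    return total


# Toy child bicluster: three genes measured under three conditions.
child = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
score = average_pcc(child)
```

Under this assumed form, exponents smaller than 1 soften the penalty for a small child, and a larger exponent on the condition ratio penalizes biclusters with few conditions more strongly, which matches the stated purpose of the weights.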

Assessment procedure

Six existing methods were compared with BIGA to evaluate its performance: the Cheng and Church algorithm (CC),9 Statistical-Algorithmic Method for Bicluster Analysis (SAMBA),19,20 order-preserving submatrix (OPSM),1 iterative signature algorithm (ISA),11 binary inclusion-maximal biclustering algorithm (BIMAX),21 and SEBI.16 SEBI was selected as a representative of the GA-based biclustering approaches,15,16 because SEBI adopts an outstanding system to reduce the redundancy of biclusters and performs iterative evolutionary searches like BIGA. The five other methods are based on greedy searches. The Saccharomyces cerevisiae data provided by Gasch et al22 were used for the analyses. The dataset contains 2993 genes and 173 stress conditions, so the data size is large and abundant annotations are available. Prelic et al21 used this dataset to evaluate algorithms, and the resultant sets of biclusters for the five greedy-search algorithms are publicly available. These bicluster sets were obtained for comparison with our results. Neither the results of SEBI for the data nor SEBI itself is publicly available. Therefore, the framework of SEBI was re-implemented.16 Note that there might be some minor differences between SEBI and the re-implemented SEBI. Henceforth, we denote our implementation as mySEBI. The sets of biclusters were evaluated in terms of the following four points. Since PCC is a widely used parameter to assess the similarity of expression patterns, the distribution of the average PCC of all biclusters was examined.
One may consider the mean square residual (MSR) of biclusters9 to be useful as an indicator of the coherence of biclusters, but in much biological data PCC is better than MSR at finding the functional relevance of genes,23–26 for example, involvement in the same pathway or participation in the same protein complex.27,28 The existing methods do not necessarily optimize the correlation of biclusters, and the results of some other algorithms can contain biclusters showing strong anti-correlation (ie, genes expressed inversely). The absolute value of PCC was used to evaluate such biclusters for comparisons. Coverage and overlap are also important measures to evaluate the biclustering, as higher coverage and lower overlap are preferable for further biological analyses. Previous studies29 used “cell coverage,” by calculating the percentage of the area (genes × conditions) covered by the biclusters, and “cell overlap,” by measuring the intersection areas of the biclusters. In this study, “gene coverage” and “gene overlap” were adopted, because high cell coverage can be achieved even by a high coverage of conditions and a low coverage of genes, and this result is not biologically significant. In addition, cell overlap ignores the genes shared by any two biclusters if the conditions in the biclusters are completely different. Gene coverage is defined as the ratio of the genes that are assigned to any bicluster to all genes, and gene overlap is the ratio of the total number of genes overlapping on multiple biclusters to the number of genes assigned to any bicluster (Equation 2):

\[
\text{Gene coverage} = \frac{\left|\bigcup_{k=1}^{K} X_k\right|}{N},
\qquad
\text{Gene overlap} = \frac{\sum_{k=1}^{K} |X_k| - \left|\bigcup_{k=1}^{K} X_k\right|}{\left|\bigcup_{k=1}^{K} X_k\right|}
\]

where X_k is the set of genes in the k-th bicluster, K is the number of biclusters, and N is the total number of genes. Gene coverage can evaluate the ability of an algorithm to decide the cluster for each gene, and gene overlap can measure the ability of an algorithm to specify the clusters for genes that are not necessarily involved in multiple biological processes. The biological significance of the results was also evaluated by measuring the GO enrichment.
More precisely, FuncAssociate (2.0; Roth Laboratories, Harvard University, Boston, MA), a tool for finding overrepresented GO terms in a set of genes, was used. Using this tool, we performed Fisher’s exact test to determine the probability of the appearance of genes associated with a GO term in each bicluster.30 FuncAssociate calculates an adjusted P-value (Padj) from simulations, instead of applying corrections for multiple tests. Padj is the probability of obtaining at least one false positive for any desired cutoff. We considered a biologically significant bicluster to be one that is relevant to at least one GO term with a statistically significant appearance (namely, Padj less than the significance level). The number of such biclusters, relative to the total number of biclusters (the GO enrichment), was used to evaluate each algorithm. A previous study by Prelic et al21 evaluated the biological relevance of existing algorithms using the GO enrichment.
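Gene coverage and gene overlap, as described in words above, can be computed with a few set operations. The Python sketch below is one reading of those definitions and is an assumption: overlap counts every extra bicluster membership of a gene beyond its first, divided by the number of covered genes, which allows values above 1 such as the 6.29 reported for BIGA. The function names and the toy gene sets are hypothetical.

```python
def gene_coverage(biclusters, total_genes):
    """Fraction of all genes that are assigned to at least one bicluster."""
    covered = set().union(*biclusters)
    return len(covered) / total_genes


def gene_overlap(biclusters):
    """Extra memberships per covered gene (assumed reading of the definition)."""
    covered = set().union(*biclusters)
    memberships = sum(len(gene_set) for gene_set in biclusters)
    return (memberships - len(covered)) / len(covered)


# Toy example: three biclusters drawn from a 10-gene dataset.
gene_sets = [{"a", "b", "c"}, {"b", "c", "d"}, {"c"}]
coverage = gene_coverage(gene_sets, total_genes=10)
overlap = gene_overlap(gene_sets)
```

Here four distinct genes are covered (coverage 0.4) and the seven memberships exceed the four covered genes by three (overlap 0.75).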

Results and discussion

Biclusters for the Saccharomyces cerevisiae microarray data

With the selected parameters and thresholds, BIGA found 164 biclusters from the S. cerevisiae microarray data. The average numbers of genes and conditions in the biclusters are 92.25 and 23.65, respectively (Table 1). The detailed statistics of each bicluster are provided in Table S2. The properties of the biclusters obtained by other methods are also summarized in Table 1.
Table 1

Comparing quantitative metrics among biclustering algorithms

Properties | CC | SAMBA | ISA | OPSM | BIMAX | mySEBI | BIGA
Number of biclusters | 100 | 100 | 66 | 12 | 101 | 100 | 164
Average gene number | 82.01 | 911.52 | 76.27 | 95.58 | 24.03 | 74.98 | 92.25
Average condition number | 19.85 | 25.15 | 8.71 | 12.50 | 3.00 | 80.5 | 23.65

Abbreviations: BIGA, binary-iterative genetic algorithm; BIMAX, binary inclusion-maximal biclustering algorithm; CC, Cheng and Church algorithm; ISA, iterative signature algorithm; OPSM, order-preserving submatrix; mySEBI, the Sequential Evolutionary Biclustering method used in this work; SAMBA, Statistical-Algorithmic Method for Bicluster Analysis.

Performance evaluation

The distribution of the average PCCs of the biclusters obtained by each biclustering algorithm is shown as a boxplot (Figure 2A). The thick line around the middle of each box indicates the median of the average PCCs. The top and bottom of the box indicate the upper and lower quartiles, respectively. The circles show the outliers (values lying more than 1.5 times the interquartile range above the upper quartile or below the lower quartile). The whiskers span the range of the data between the maximum and minimum values, excluding the outliers. According to the plots, OPSM performs best, with a very small deviation in the average PCCs. Apart from OPSM, BIGA outperforms the other methods when compared by the median of the average PCC. One may argue that, because the fitness function of BIGA takes the average PCC into account (Equation 1), it is obvious that the average PCC of BIGA is good. However, the results are not necessarily satisfactory if the optimization procedure does not work well, or if the balance between the average PCC and the area of the bicluster in Equation 1 is inappropriate. Next, using the Wilcoxon signed-rank test, we examined whether the distribution of the average PCCs of BIGA is significantly better than those of the other algorithms.31 The results showed that BIGA detects significantly more co-expressed genes in biclusters than the other methods, except for OPSM (the highest P-value is only 5.4 × 10^−6, against SAMBA). To illustrate the performance, the expression profiles of the four biclusters with the highest average PCCs are shown in Figure S1. The reason for the highest performance of OPSM is related to the gene coverage, which is discussed later.
Figure 2

(A) Distribution of the average Pearson correlation coefficients for each biclustering algorithm, represented by a boxplot. (B) Histogram of gene coverage for each biclustering algorithm. The y-axis represents the coverage ratio between the union of genes appearing on biclusters and all analyzed genes. Higher coverage shows higher performance. (C) Histogram of gene overlap for each biclustering algorithm. The y-axis shows the gene overlap defined by (Equation 2). Lower overlap shows higher performance.

Abbreviations: CC, Cheng and Church algorithm; SAMBA, Statistical-Algorithmic Method for Bicluster Analysis; OPSM, order-preserving submatrix; ISA, iterative signature algorithm; BIMAX, binary inclusion-maximal biclustering algorithm; mySEBI, the Sequential Evolutionary Biclustering method used in this work; BIGA, binary-iterative genetic algorithm.

The gene coverage and the gene overlap are shown in Figure 2B and 2C, respectively. BIGA achieved the fourth-highest gene coverage among the seven algorithms (Figure 2B). SAMBA could classify almost 100% of the genes into biclusters, but each bicluster contained more than 900 genes on average (Table 1), with extremely high overlap (Figure 2C), which will make the succeeding experimental or bioinformatics analyses difficult. mySEBI could produce a set of biclusters that included 95% of all genes with a small amount of overlap. CC showed the best gene coverage (highest) and overlap (lowest). The results indicate that the techniques to reduce the redundancy of biclusters in SEBI and CC are efficient for gaining high coverage and low overlap. However, the average PCCs of the biclusters from both algorithms were very low (Figure 2A). OPSM produced biclusters with the highest correlation (Figure 2A), but failed to achieve higher gene coverage due to the small number of biclusters (Table 1). The average PCCs of OPSM and BIGA are high, because both methods adopt gene co-expression in the target function. By contrast, CC and SEBI adopt MSR instead of PCC. Although MSR can sometimes identify coherent biclusters, it is not necessarily efficient at achieving higher correlations of genes. BIGA yielded the second-largest gene overlap, 6.29 (Figure 2C), which may imply that the biclusters of BIGA are mutually similar. To examine the similarity of the biclusters more directly, the pairwise overlap (PO) of two biclusters, defined by |X_i ∩ X_j| / |X_i ∪ X_j|, where X_i and X_j are the gene sets of biclusters B_i and B_j, respectively, was measured and plotted in Figure 3A. The median of the POs for BIGA was not very large, as compared with those of the other methods, indicating that the biclusters determined by BIGA are not necessarily similar. Moreover, the variety of the biclusters was investigated by using the single-linkage clustering method, where the distance between two biclusters was defined by 1.0 − PO.
At each cut-off distance, the number of clusters was counted and normalized by the total number of biclusters, which we call the fraction of independent biclusters (FIB). When the cut-off distance is sufficiently small, no biclusters are merged and FIB is 1.0. This state indicates that the biclusters are independent and diverse. On the other hand, when the cut-off distance is sufficiently large, most of the biclusters may be merged together, and FIB will approach 0.0. This state means that all of the biclusters are judged as being similar to each other. We consider a higher FIB to be an indicator of the variety of the resultant biclusters. According to the plot (Figure 3B), the FIBs of SAMBA and ISA are obviously low over almost the whole cut-off distance range, showing that their biclusters are rather similar. The FIBs of OPSM show that its ability to detect diverse biclusters is moderate. CC, mySEBI, BIMAX, and BIGA provided a wider variety of biclusters than the other algorithms when the cut-off distance was less than 0.5. In summary, the average bicluster determined by BIGA contains many genes that are shared with other biclusters (Figure 2C); however, when focusing on each pair of biclusters, only a small number of genes are shared (Figure 3A). Consequently, the biclusters determined by BIGA seem to be independent (Figure 3B), and cover most of the genes efficiently (Figure 2B).
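The pairwise overlap and the fraction of independent biclusters can be sketched in Python as follows. Cutting a single-linkage clustering at a given distance is equivalent to taking the connected components of the graph that links every pair of biclusters whose distance does not exceed the cut-off, so a small union-find structure is sufficient. Whether a distance exactly equal to the cut-off causes a merge is an assumption here, and the function names are hypothetical.

```python
def pairwise_overlap(x, y):
    """Jaccard-style overlap |X_i ∩ X_j| / |X_i ∪ X_j| of two gene sets."""
    return len(x & y) / len(x | y)


def fraction_independent(biclusters, cutoff):
    """Number of single-linkage clusters at `cutoff`, divided by the number of biclusters."""
    parent = list(range(len(biclusters)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(biclusters)):
        for j in range(i + 1, len(biclusters)):
            distance = 1.0 - pairwise_overlap(biclusters[i], biclusters[j])
            if distance <= cutoff:            # assumed: ties are merged
                parent[find(i)] = find(j)

    clusters = {find(i) for i in range(len(biclusters))}
    return len(clusters) / len(biclusters)


gene_sets = [{"a", "b", "c"}, {"a", "b", "d"}, {"x", "y"}]
fib_small = fraction_independent(gene_sets, cutoff=0.0)
fib_mid = fraction_independent(gene_sets, cutoff=0.5)
fib_large = fraction_independent(gene_sets, cutoff=1.0)
```

With K biclusters, the smallest value this function can return is 1/K, which approaches 0 as K grows.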
Figure 3

(A) Distribution of pairwise overlap (PO) of biclusters, shown in boxplots for each algorithm. Thick lines, boxes, whiskers, and circles indicate the same things as in (Figure 2A). (B) The fraction of independent biclusters (FIB) over the cut-off distance.

Abbreviations: CC, Cheng and Church algorithm; SAMBA, Statistical-Algorithmic Method for Bicluster Analysis; OPSM, order-preserving submatrix; ISA, iterative signature algorithm; BIMAX, binary inclusion-maximal biclustering algorithm; mySEBI, the Sequential Evolutionary Biclustering method used in this work; BIGA, binary-iterative genetic algorithm.

Evaluation of biological relevance by gene ontology enrichment analyses

In the study by Prelic et al21 on the evaluation of existing methods using GO enrichment, OPSM showed the best performance (100% of the biclusters were significant at the 0.05 significance level). However, it only produced twelve biclusters (Table 1), and thus the gene coverage was the lowest (Figure 2B). Less than half of the biclusters produced by CC were judged to be significant,21 probably because CC cannot detect biclusters with a higher average PCC (Figure 2A). The percentages of significant biclusters from mySEBI are 93%, 81%, 69%, and 42% at the 0.05, 0.01, 0.005, and 0.001 significance levels, respectively. By contrast, 94.5% of the biclusters produced by BIGA were judged to be significant at the 0.05 significance level. This value changed to 88.4%, 86.0%, and 79.3% at the 0.01, 0.005, and 0.001 significance levels, respectively. The performance of BIGA is almost the same as those of BIMAX and ISA in GO enrichment,21 but BIGA outperforms them in gene coverage (Figure 2B). There was a functional relationship among the resultant biclusters of BIGA, based on the enriched GO terms at the 0.001 significance level. Among the 122 GO-enriched terms, ribosome-related terms (ribosome GO:0005840, ribosomal subunit GO:0033279, etc) are abundant in many biclusters (50 biclusters). This observation was consistent with the fact that 60% of transcription is devoted to ribosomal ribonucleic acid (RNA),32 because genes with higher expression levels tend to be clustered. Apart from the ribosome-related terms, primary metabolic (GO:0044238), translation (GO:0006412), protein-related (GO:0044267, GO:0019538), macromolecule-related (GO:0009059, GO:0034645, GO:0044260, GO:0043170), and biopolymer-related (GO:0043283, GO:0034960, GO:0043284, GO:0034961) processes also frequently appeared in several biclusters. This indicated that the genes involved in these terms are primary or essential in many biological processes.
The five GO terms that are most enriched at the 0.001 significance level for each bicluster, and the five most specific GO terms among them, are shown in Table S2. Furthermore, the novel aspects of the biclusters identified by BIGA were examined. For each bicluster defined by BIGA, the PO against all biclusters identified by the other five methods was measured and the maximum PO was derived (Table S2). The highest of the maximum POs was 0.12, indicating that the biclusters defined by BIGA are quite different from those determined by the other methods. To explore the relationships of the genes that were detected only by BIGA, we examined the biclusters of BIGA that were not similar to any of the other biclusters; that is, the biclusters with maximum pairwise similarity scores < 0.05. In bicluster 109 (maximum PO = 0.039, with bicluster 29 of CC), 16 out of 86 genes, eg, SAS3 (YBL052C), TEF2 (YBR118W), and SWD3 (YBR175W), are involved in a cellular nitrogen metabolic process (GO:0034641) and are co-expressed under twelve conditions. In bicluster 118 (0.037, with bicluster 56 of CC), 26 out of 66 genes, eg, RRN6 (YBL014C), ORC2 (YBR060C), and PAF1 (YBR279W), are involved in an RNA metabolic process (GO:0016070). In bicluster 160 (0.037, with bicluster 24 of ISA), 33 out of 74 genes, such as HEK2 (YBL032W), ROX3 (YBL093C), and SIF2 (YBR103W), are related to a nucleic acid metabolic process (GO:0090304). These results demonstrate that BIGA is useful for revealing the functional relevance underlying the biclusters. Furthermore, some genes belonged to the same bicluster even though they lacked known co-functional evidence (see the biclusters in Table S2 without significant GO terms). These genes represent promising experimental targets that bridge biological processes exhibiting co-expression under specific conditions.
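The per-term significance test behind these enrichment results can be illustrated with a one-sided Fisher's exact test, which for this design equals the upper tail of a hypergeometric distribution. The Python sketch below computes only the unadjusted P-value; it does not reproduce the simulation-based Padj of FuncAssociate. The bicluster size (86 genes, 16 annotated) and the dataset size (2993 genes) come from the text, whereas the number of annotated genes in the whole dataset (300) is a made-up placeholder, and the function name is hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(population N, annotated K, sample n).

    k: annotated genes in the bicluster, n: genes in the bicluster,
    K: annotated genes in the dataset,   N: genes in the dataset.
    """
    upper = min(n, K)
    # math.comb returns 0 when more items are requested than are available.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Bicluster 109: 16 of 86 genes carry the term; K = 300 is a hypothetical count.
p = enrichment_p_value(k=16, n=86, K=300, N=2993)
```

A term would then be called enriched when its P-value, after adjustment for the many terms tested, falls below the chosen significance level.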

Conclusion

The development of biclustering algorithms has allowed biologists to start unraveling the underlying functional mechanisms in living organisms. We propose BIGA as an alternative biclustering technique, since it was designed to address the conventional problems of the pre-existing methods. Biclustering is obviously advantageous in accounting for the overlap state among clusters, but the suitable amount of overlap is still ambiguous and different algorithms often produce solutions with various degrees of overlap. We tried to develop a novel chromosome-encoding mode that explicitly defines the overlap between biclusters. BIGA revealed that the most frequently appearing genes express their functions in fundamental and essential biological processes, such as translation. A microarray often consists of relatively few conditions, with respect to a large number of genes. The weighting of genes and conditions diminishes the bias between the number of genes and conditions, which helps to eliminate unreliable results, such as biclusters with very few conditions. We also applied an alternative index, the average PCC, which impacts the biological meaning, rather than the MSR, to measure the goodness of a bicluster. The analysis of GO enrichment demonstrated that most of our biclusters were significant, with one or more enriched GO terms. When evaluated with the five pre-existing algorithms, BIGA performed well in most of the properties with good balance, although it did not show the best performance for all criteria. A pair-wise comparison of our biclusters with those obtained by the other algorithms revealed the novel aspects of the biclusters that are distinct from those of the other methods. Since biological systems are quite complicated, resulting in high-dimensional data, it is quite difficult to answer all biological questions with a single approach. For new discoveries, we recommend the application of several approaches, including BIGA. 
Expression profiles of biclusters 1 (A), 2 (B), 3 (C), and 4 (D), in the descending order of the average Pearson’s correlation coefficient. Note: The x-axis represents the series of conditions; eg, the number 8 denotes the 8th condition.

Notes to Table S2 (detailed statistics of resulting biclusters): The steps to select specific GO terms from each cluster. (1) We hypothesise that if a GO term appears in only a small number of biclusters (ie, 1 of 4 biclusters), it is specific to those biclusters. (2) We have 164 biclusters. By the proportion test, 1 of 4 biclusters corresponds to 31 of 164 biclusters at the 0.05 significance level. (3) Therefore, GO terms that appear fewer than 32 times are specific terms.
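The proportion-test step in the notes on GO-term specificity can be checked numerically. The exact test is not spelled out here; one reading that reproduces the stated count of 31 is a one-sided normal-approximation test of the proportion 1/4 at the 0.05 level, and the sketch below assumes that reading. The function name is hypothetical.

```python
from math import floor, sqrt
from statistics import NormalDist


def specific_term_cutoff(n_biclusters, proportion=0.25, alpha=0.05):
    """Largest count significantly below `proportion` of `n_biclusters`.

    Assumes a one-sided z-test with the normal approximation:
    lower bound = p - z_(1-alpha) * sqrt(p * (1 - p) / n).
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    standard_error = sqrt(proportion * (1.0 - proportion) / n_biclusters)
    lower_bound = proportion - z * standard_error
    return floor(lower_bound * n_biclusters)


cutoff = specific_term_cutoff(164)
print(cutoff)  # 31, so GO terms appearing fewer than 32 times count as specific
```

Under this assumption the lower bound is about 0.194, or roughly 31.9 of 164 biclusters, which rounds down to the 31 given in the notes.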
Table S1

Parameter determination

Goodness of biclusters

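| Parameter | Value | Genes | Conditions | Correlation | Biclusters | Coverage | Overlap |
|---|---|---|---|---|---|---|---|
| α | 0.1 | 72.15 | 22.84 | 0.74 | 111 | 0.59 | 3.53 |
| α | 0.3 | 92.25 | 23.65 | 0.71 | 164 | 0.69 | 6.29 |
| α | 0.5 | 102.22 | 24.42 | 0.7 | 252 | 0.67 | 11.82 |
| τr | 0.1 | 81.22 | 21.51 | 0.73 | 355 | 0.74 | 11.97 |
| τr | 0.15 | 92.25 | 23.65 | 0.71 | 164 | 0.69 | 6.29 |
| τr | 0.2 | 109.86 | 25.07 | 0.69 | 57 | 0.58 | 2.59 |
| τr | 0.25 | 128.13 | 32.5 | 0.71 | 8 | 0.22 | 0.53 |
| τr | 0.3 | 163 | 45 | 0.67 | 1 | 0.05 | 0 |
| τc | 0.60 | 100.62 | 22.17 | 0.69 | 145 | 0.71 | 5.9 |
| τc | 0.65 | 92.25 | 23.65 | 0.71 | 164 | 0.69 | 6.29 |
| τc | 0.70 | 83.84 | 22.69 | 0.74 | 178 | 0.61 | 7.09 |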

Notes: (A) Impact of the gene-weight parameter (α) on the goodness of biclusters (τ = 30, τ = 10, τc = 0.65, τr = 0.15, and β = 0.5). (B) Impact of the redundant threshold (τr) on the goodness of biclusters (τ = 30, τ = 10, τc = 0.65, and α = 0.3, β = 0.5). (C) Impact of the correlation threshold (τc) on the goodness of biclusters (τ = 30, τ = 10, τr = 0.15, and α = 0.3, β = 0.5).

Table S2

Detailed statistics of resulting biclusters (sorted by descending order of average PCC)

| Bicluster ID | Number of genes | Number of conditions | Average PCC | Minimum adjusted P-value of GO enrichment | Number of enriched GO terms | GO terms as listed (five most significant, followed by five most specific) | Highest pairwise similarity score |
|---|---|---|---|---|---|---|---|
| 1 | 47 | 10 | 0.87 | <0.001 | 2 | GO:0003674 molecular_function; GO:0032991 macromolecular complex | 0.044 |
| 2 | 74 | 28 | 0.81 | <0.001 | 3 | GO:0003674 molecular_function; GO:0032991 macromolecular complex; GO:0043234 protein complex | 0.067 |
| 3 | 85 | 21 | 0.80 | <0.001 | 14 | GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0022618 ribonucleoprotein complex assembly; GO:0007114 cell budding; GO:0022618 ribonucleoprotein complex assembly; GO:0032505 reproduction of a single-celled organism; GO:0042257 ribosomal subunit assembly; GO:0043933 macromolecular complex subunit organization | 0.070 |
| 4 | 71 | 32 | 0.80 |  | 12 | GO:0030529 ribonucleoprotein complex; GO:0032991 macromolecular complex; GO:0005840 ribosome; GO:0044445 cytosolic part; GO:0006412 translation; GO:0022625 cytosolic large ribosomal subunit | 0.093 |
| 5 | 74 | 18 | 0.80 | 0.001 | 1 | GO:0005737 cytoplasm; GO:0005737 cytoplasm | 0.050 |
| 6 | 50 | 7 | 0.80 |  | 0 |  | 0.043 |
| 7 | 79 | 24 | 0.80 | <0.001 | 8 | GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0005840 ribosome; GO:0009072 aromatic amino acid family metabolic process | 0.073 |
| 8 | 52 | 16 | 0.79 |  | 0 |  | 0.032 |
| 9 | 56 | 4 | 0.79 |  | 0 |  | 0.041 |
| 10 | 87 | 21 | 0.79 | <0.001 | 5 | GO:0003674 molecular_function; GO:0006412 translation; GO:0009987 cellular process; GO:0009058 biosynthetic process; GO:0044249 cellular biosynthetic process; GO:0044249 cellular biosynthetic process; GO:0009058 biosynthetic process | 0.068 |
| 11 | 72 | 20 | 0.79 | <0.001 | 5 | GO:0032991 macromolecular complex; GO:0003674 molecular_function; GO:0009987 cellular process; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle | 0.060 |
| 12 | 78 | 26 | 0.79 | <0.001 | 6 | GO:0032040 small-subunit processome; GO:0030686 90S preribosome; GO:0042254 ribosome biogenesis; GO:0030684 preribosome; GO:0022613 ribonucleoprotein complex biogenesis; GO:0032040 small-subunit processome; GO:0022613 ribonucleoprotein complex biogenesis; GO:0042254 ribosome biogenesis; GO:0030684 preribosome; GO:0030686 90S preribosome | 0.074 |
| 13 | 74 | 14 | 0.79 | <0.001 | 1 | GO:0003674 molecular_function | 0.048 |
| 14 | 83 | 33 | 0.78 | <0.001 | 19 | GO:0044445 cytosolic part; GO:0006412 translation; GO:0022625 cytosolic large ribosomal subunit; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0015934 large ribosomal subunit; GO:0022625 cytosolic large ribosomal subunit; GO:0044249 cellular biosynthetic process; GO:0009058 biosynthetic process | 0.080 |
| 15 | 86 | 23 | 0.78 | <0.001 | 2 | GO:0003674 molecular_function; GO:0032991 macromolecular complex | 0.056 |
| 16 | 49 | 18 | 0.78 | <0.001 | 10 | GO:0044238 primary metabolic process; GO:0016070 RNA metabolic process; GO:0044260 cellular macromolecule metabolic process; GO:0043283 biopolymer metabolic process; GO:0030529 ribonucleoprotein complex; GO:0008152 metabolic process; GO:0016070 RNA metabolic process; GO:0034960 cellular biopolymer metabolic process; GO:0044260 cellular macromolecule metabolic process; GO:0044237 cellular metabolic process | 0.059 |
| 17 | 92 | 23 | 0.78 | <0.001 | 12 | GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0034621 cellular macromolecular complex subunit organization; GO:0034621 cellular macromolecular complex subunit organization; GO:0034660 ncRNA metabolic process; GO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process”; GO:0016070 RNA metabolic process; GO:0044237 cellular metabolic process | 0.072 |
| 18 | 77 | 25 | 0.78 | <0.001 | 4 | GO:0003674 molecular_function; GO:0044445 cytosolic part; GO:0009987 cellular process; GO:0032991 macromolecular complex | 0.050 |
| 19 | 77 | 21 | 0.78 | <0.001 | 5 | GO:0003674 molecular_function; GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0030529 ribonucleoprotein complex; GO:0015935 small ribosomal subunit; GO:0015935 small ribosomal subunit | 0.062 |
| 20 | 59 | 12 | 0.78 | <0.001 | 1 | GO:0044238 primary metabolic process | 0.046 |
| 21 | 84 | 30 | 0.77 | <0.001 | 10 | GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0005840 ribosome; GO:0005737 cytoplasm | 0.073 |
| 22 | 53 | 11 | 0.77 | 0.001 | 1 | GO:0044238 primary metabolic process | 0.058 |
| 23 | 81 | 28 | 0.77 | <0.001 | 11 | GO:0032991 macromolecular complex; GO:0043283 biopolymer metabolic process; GO:0034960 cellular biopolymer metabolic process; GO:0043234 protein complex; GO:0043170 macromolecule metabolic process; GO:0051246 regulation of protein metabolic process; GO:0034960 cellular biopolymer metabolic process; GO:0044260 cellular macromolecule metabolic process; GO:0032268 regulation of cellular protein metabolic process; GO:0043234 protein complex | 0.059 |
| 24 | 61 | 21 | 0.77 |  | 0 |  | 0.039 |
| 25 | 82 | 13 | 0.77 | <0.001 | 1 | GO:0003674 molecular_function | 0.045 |
| 26 | 103 | 24 | 0.76 | <0.001 | 9 | GO:0044238 primary metabolic process; GO:0003674 molecular_function; GO:0009987 cellular process; GO:0005840 ribosome; GO:0003735 structural constituent of ribosome; GO:0045182 translation regulator activity; GO:0003743 translation initiation factor activity; GO:0045182 translation regulator activity; GO:0008135 “translation factor activity, nucleic acid binding”; GO:0032268 regulation of cellular protein metabolic process; GO:0043234 protein complex | 0.077 |
| 27 | 93 | 27 | 0.76 | <0.001 | 19 | GO:0044238 primary metabolic process; GO:0003735 structural constituent of ribosome; GO:0009987 cellular process; GO:0005840 ribosome; GO:0003735 structural constituent of ribosome; GO:0015935 small ribosomal subunit; GO:0008152 metabolic process; GO:0043229 intracellular organelle; GO:0043226 organelle; GO:0022627 cytosolic small ribosomal subunit | 0.098 |
| 28 | 65 | 11 | 0.76 | <0.001 | 1 | GO:0003674 molecular_function | 0.045 |
| 29 | 78 | 32 | 0.76 | <0.001 | 2 | GO:0003674 molecular_function; GO:0032991 macromolecular complex | 0.077 |
| 30 | 62 | 19 | 0.76 | <0.001 | 6 | GO:0009058 biosynthetic process; GO:0044249 cellular biosynthetic process; GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0044249 cellular biosynthetic process; GO:0009058 biosynthetic process | 0.056 |
| 31 | 89 | 19 | 0.76 | <0.001 | 12 | GO:0009058 biosynthetic process; GO:0044249 cellular biosynthetic process; GO:0043284 biopolymer biosynthetic process; GO:0009059 macromolecule biosynthetic process; GO:0044238 primary metabolic process; GO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process”; GO:0034961 cellular biopolymer biosynthetic process; GO:0034645 cellular macromolecule biosynthetic process; GO:0016070 RNA metabolic process; GO:0009059 macromolecule biosynthetic process | 0.063 |
| 32 | 91 | 30 | 0.76 | <0.001 | 10 | GO:0017111 nucleoside-triphosphatase activity; GO:0016462 pyrophosphatase activity; GO:0016817 “hydrolase activity, acting on acid anhydrides”; GO:0016818 “hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides”; GO:0044238 primary metabolic process; GO:0017111 nucleoside-triphosphatase activity; GO:0016462 pyrophosphatase activity; GO:0016817 “hydrolase activity, acting on acid anhydrides”; GO:0016818 “hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides”; GO:0034470 ncRNA processing | 0.081 |
| 33 | 105 | 34 | 0.76 | <0.001 | 8 | GO:0009058 biosynthetic process; GO:0032991 macromolecular complex; GO:0009987 cellular process; GO:0006412 translation; GO:0044445 cytosolic part; GO:0009058 biosynthetic process | 0.098 |
| 34 | 105 | 28 | 0.75 | <0.001 | 16 | GO:0032991 macromolecular complex; GO:0044267 cellular protein metabolic process; GO:0006412 translation; GO:0009987 cellular process; GO:0043234 protein complex; GO:0044444 cytoplasmic part; GO:0044424 intracellular part; GO:0043234 protein complex; GO:0009058 biosynthetic process | 0.088 |
| 35 | 110 | 25 | 0.75 | <0.001 | 29 | GO:0032991 macromolecular complex; GO:0016070 RNA metabolic process; GO:0044238 primary metabolic process; GO:0009987 cellular process; GO:0005198 structural molecule activity; GO:0019438 aromatic compound biosynthetic process; GO:0006396 RNA processing; GO:0034470 ncRNA processing; GO:0034660 ncRNA metabolic process; GO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process” | 0.085 |
| 36 | 66 | 16 | 0.75 | <0.001 | 8 | GO:0032991 macromolecular complex; GO:0003735 structural constituent of ribosome; GO:0033279 ribosomal subunit; GO:0005198 structural molecule activity; GO:0006412 translation; GO:0022627 cytosolic small ribosomal subunit | 0.069 |
| 37 | 71 | 10 | 0.75 | 0.001 | 1 | GO:0044085 cellular component biogenesis; GO:0044085 cellular component biogenesis | 0.068 |
| 38 | 59 | 14 | 0.74 | <0.001 | 3 | GO:0003674 molecular_function; GO:0005198 structural molecule activity; GO:0032991 macromolecular complex | 0.040 |
| 39 | 58 | 16 | 0.74 | <0.001 | 13 | GO:0044249 cellular biosynthetic process; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0009058 biosynthetic process; GO:0043284 biopolymer biosynthetic process; GO:0000462 “maturation of SSU-rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA)”; GO:0030490 maturation of SSU-rRNA; GO:0034961 cellular biopolymer biosynthetic process; GO:0034645 cellular macromolecule biosynthetic process; GO:0022627 cytosolic small ribosomal subunit | 0.048 |
| 40 | 83 | 36 | 0.74 | <0.001 | 8 | GO:0044445 cytosolic part; GO:0006412 translation; GO:0043229 intracellular organelle; GO:0043226 organelle; GO:0043228 nonmembrane-bounded organelle; GO:0043229 intracellular organelle; GO:0043226 organelle | 0.076 |
| 41 | 78 | 23 | 0.74 | <0.001 | 5 | GO:0032991 macromolecular complex; GO:0043234 protein complex; GO:0003674 molecular_function; GO:0044238 primary metabolic process; GO:0009987 cellular process; GO:0043234 protein complex | 0.069 |
| 42 | 113 | 26 | 0.74 | <0.001 | 23 | GO:0044445 cytosolic part; GO:0030529 ribonucleoprotein complex; GO:0005198 structural molecule activity; GO:0033279 ribosomal subunit; GO:0006412 translation; GO:0006913 nucleocytoplasmic transport; GO:0051169 nuclear transport; GO:0005622 intracellular; GO:0005737 cytoplasm; GO:0010608 posttranscriptional regulation of gene expression | 0.080 |
| 43 | 90 | 22 | 0.74 | <0.001 | 18 | GO:0032991 macromolecular complex; GO:0022627 cytosolic small ribosomal subunit; GO:0030684 preribosome; GO:0030686 90S preribosome; GO:0030529 ribonucleoprotein complex; GO:0044249 cellular biosynthetic process; GO:0009058 biosynthetic process | 0.081 |
| 44 | 89 | 25 | 0.74 | <0.001 | 6 | GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0009987 cellular process; GO:0034621 cellular macromolecular complex subunit organization; GO:0016070 RNA metabolic process; GO:0034621 cellular macromolecular complex subunit organization; GO:0016070 RNA metabolic process | 0.061 |
| 45 | 92 | 28 | 0.74 | <0.001 | 8 | GO:0019538 protein metabolic process; GO:0044267 cellular protein metabolic process; GO:0032268 regulation of cellular protein metabolic process; GO:0005737 cytoplasm; GO:0051246 regulation of protein metabolic process; GO:0005737 cytoplasm; GO:0010608 posttranscriptional regulation of gene expression; GO:0051246 regulation of protein metabolic process; GO:0006417 regulation of translation; GO:0032268 regulation of cellular protein metabolic process | 0.057 |
| 46 | 106 | 28 | 0.74 | <0.001 | 12 | GO:0009987 cellular process; GO:0006412 translation; GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0044238 primary metabolic process; GO:0022627 cytosolic small ribosomal subunit | 0.089 |
| 47 | 106 | 36 | 0.74 | <0.001 | 14 | GO:0030529 ribonucleoprotein complex; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0005840 ribosome; GO:0032991 macromolecular complex; GO:0016462 pyrophosphatase activity; GO:0016817 “hydrolase activity, acting on acid anhydrides”; GO:0016818 “hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides” | 0.100 |
| 48 | 109 | 25 | 0.74 | <0.001 | 23 | GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0044445 cytosolic part; GO:0009987 cellular process; GO:0005840 ribosome; GO:0005622 intracellular; GO:0022625 cytosolic large ribosomal subunit; GO:0010608 posttranscriptional regulation of gene expression; GO:0051246 regulation of protein metabolic process; GO:0006417 regulation of translation | 0.083 |
| 49 | 99 | 27 | 0.74 | <0.001 | 24 | GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0005840 ribosome; GO:0005198 structural molecule activity; GO:0006412 translation; GO:0034961 cellular biopolymer biosynthetic process; GO:0034645 cellular macromolecule biosynthetic process; GO:0022627 cytosolic small ribosomal subunit; GO:0034960 cellular biopolymer metabolic process; GO:0009059 macromolecule biosynthetic process | 0.082 |
| 50 | 89 | 24 | 0.73 | <0.001 | 10 | GO:0030529 ribonucleoprotein complex; GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0005840 ribosome; GO:0043228 nonmembrane-bounded organelle; GO:0005488 binding | 0.074 |
| 51 | 86 | 15 | 0.73 | <0.001 | 3 | GO:0003674 molecular_function; GO:0009987 cellular process; GO:0000166 nucleotide binding; GO:0000166 nucleotide binding | 0.065 |
| 52 | 141 | 35 | 0.73 | <0.001 | 18 | GO:0006412 translation; GO:0032991 macromolecular complex; GO:0009058 biosynthetic process; GO:0009987 cellular process; GO:0044249 cellular biosynthetic process; GO:0006082 organic acid metabolic process; GO:0019752 carboxylic acid metabolic process; GO:0005737 cytoplasm; GO:0009059 macromolecule biosynthetic process; GO:0043284 biopolymer biosynthetic process | 0.119 |
| 53 | 107 | 31 | 0.73 | <0.001 | 20 | GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0005198 structural molecule activity; GO:0007010 cytoskeleton organization; GO:0015935 small ribosomal subunit; GO:0022627 cytosolic small ribosomal subunit; GO:0006417 regulation of translation; GO:0032268 regulation of cellular protein metabolic process | 0.062 |
| 54 | 68 | 24 | 0.73 | 0.001 | 6 | GO:0009987 cellular process; GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0043229 intracellular organelle; GO:0043226 organelle; GO:0043229 intracellular organelle; GO:0043226 organelle | 0.045 |
| 55 | 128 | 26 | 0.73 | <0.001 | 21 | GO:0032991 macromolecular complex; GO:0006412 translation; GO:0044267 cellular protein metabolic process; GO:0019538 protein metabolic process; GO:0044238 primary metabolic process; GO:0016043 cellular component organization; GO:0065007 biological regulation; GO:0050789 regulation of biological process; GO:0050794 regulation of cellular process; GO:0009059 macromolecule biosynthetic process | 0.089 |
| 56 | 101 | 32 | 0.73 | <0.001 | 15 | GO:0032991 macromolecular complex; GO:0030529 ribonucleoprotein complex; GO:0044445 cytosolic part; GO:0009987 cellular process; GO:0005840 ribosome; GO:0022625 cytosolic large ribosomal subunit; GO:0044424 intracellular part | 0.099 |
| 57 | 107 | 32 | 0.73 | <0.001 | 11 | GO:0032991 macromolecular complex; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0044238 primary metabolic process; GO:0009987 cellular process; GO:0043170 macromolecule metabolic process | 0.091 |
| 58 | 111 | 33 | 0.72 | <0.001 | 11 | GO:0032991 macromolecular complex; GO:0009987 cellular process; GO:0019538 protein metabolic process; GO:0006412 translation; GO:0043228 nonmembrane-bounded organelle; GO:0043234 protein complex | 0.099 |
| 59 | 92 | 27 | 0.72 | <0.001 | 11 | GO:0009987 cellular process; GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0032268 regulation of cellular protein metabolic process; GO:0044445 cytosolic part; GO:0010608 posttranscriptional regulation of gene expression; GO:0016070 RNA metabolic process; GO:0051246 regulation of protein metabolic process; GO:0006417 regulation of translation; GO:0044424 intracellular part | 0.106 |
| 60 | 111 | 33 | 0.72 | <0.001 | 7 | GO:0032991 macromolecular complex; GO:0009987 cellular process; GO:0044445 cytosolic part; GO:0044238 primary metabolic process; GO:0006412 translation | 0.078 |
| 61 | 76 | 15 | 0.72 | <0.001 | 2 | GO:0003674 molecular_function; GO:0009987 cellular process | 0.050 |
| 62 | 94 | 20 | 0.72 | <0.001 | 6 | GO:0032991 macromolecular complex; GO:0032268 regulation of cellular protein metabolic process; GO:0044238 primary metabolic process; GO:0051246 regulation of protein metabolic process; GO:0009987 cellular process; GO:0051246 regulation of protein metabolic process; GO:0032268 regulation of cellular protein metabolic process | 0.057 |
| 63 | 83 | 24 | 0.72 | <0.001 | 13 | GO:0022627 cytosolic small ribosomal subunit; GO:0032991 macromolecular complex; GO:0015935 small ribosomal subunit; GO:0044445 cytosolic part; GO:0030686 90S preribosome; GO:0030686 90S preribosome; GO:0015935 small ribosomal subunit; GO:0044422 organelle part; GO:0044446 intracellular organelle part; GO:0022627 cytosolic small ribosomal subunit | 0.083 |
| 64 | 126 | 28 | 0.72 | <0.001 | 39 | GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0044238 primary metabolic process; GO:0005840 ribosome; GO:0030529 ribonucleoprotein complex; GO:0015934 large ribosomal subunit; GO:0044464 cell part; GO:0034961 cellular biopolymer biosynthetic process; GO:0034645 cellular macromolecule biosynthetic process; GO:0022625 cytosolic large ribosomal subunit | 0.094 |
| 65 | 45 | 12 | 0.72 |  | 0 |  | 0.045 |
| 66 | 100 | 32 | 0.72 | <0.001 | 8 | GO:0005198 structural molecule activity; GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0006412 translation; GO:0009987 cellular process | 0.080 |
| 67 | 124 | 29 | 0.72 | <0.001 | 15 | GO:0032991 macromolecular complex; GO:0043234 protein complex; GO:0009058 biosynthetic process; GO:0009987 cellular process; GO:0043284 biopolymer biosynthetic process; GO:0010608 posttranscriptional regulation of gene expression; GO:0006417 regulation of translation; GO:0009059 macromolecule biosynthetic process; GO:0043284 biopolymer biosynthetic process; GO:0044424 intracellular part | 0.097 |
| 68 | 111 | 37 | 0.72 | <0.001 | 9 | GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0006412 translation; GO:0009987 cellular process; GO:0043228 nonmembrane-bounded organelle | 0.099 |
| 69 | 51 | 21 | 0.71 |  | 0 |  | 0.059 |
| 70 | 106 | 30 | 0.71 | <0.001 | 21 | GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0044267 cellular protein metabolic process; GO:0019538 protein metabolic process; GO:0005198 structural molecule activity; GO:0034960 cellular biopolymer metabolic process; GO:0009059 macromolecule biosynthetic process; GO:0043284 biopolymer biosynthetic process; GO:0044260 cellular macromolecule metabolic process; GO:0043234 protein complex | 0.065 |
| 71 | 46 | 12 | 0.71 |  | 0 |  | 0.047 |
| 72 | 126 | 36 | 0.71 | <0.001 | 17 | GO:0009987 cellular process; GO:0044238 primary metabolic process; GO:0016462 pyrophosphatase activity; GO:0016817 “hydrolase activity, acting on acid anhydrides”; GO:0016818 “hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides”; GO:0017076 purine nucleotide binding; GO:0032553 ribonucleotide binding; GO:0032555 purine ribonucleotide binding; GO:0000166 nucleotide binding; GO:0017111 nucleoside-triphosphatase activity | 0.101 |
| 73 | 87 | 25 | 0.71 | <0.001 | 8 | GO:0032991 macromolecular complex; GO:0009987 cellular process; GO:0044238 primary metabolic process; GO:0030529 ribonucleoprotein complex; GO:0016070 RNA metabolic process; GO:0016070 RNA metabolic process; GO:0043170 macromolecule metabolic process | 0.070 |
| 74 | 112 | 30 | 0.71 | <0.001 | 18 | GO:0032991 macromolecular complex; GO:0006412 translation; GO:0044238 primary metabolic process; GO:0044424 intracellular part; GO:0009058 biosynthetic process; GO:0010468 regulation of gene expression; GO:0010556 regulation of macromolecule biosynthetic process; GO:0010608 posttranscriptional regulation of gene expression; GO:0006417 regulation of translation; GO:0044424 intracellular part | 0.085 |
| 75 | 116 | 31 | 0.71 | <0.001 | 13 | GO:0032991 macromolecular complex; GO:0005198 structural molecule activity; GO:0044445 cytosolic part; GO:0044238 primary metabolic process; GO:0009987 cellular process; GO:0005737 cytoplasm; GO:0043234 protein complex | 0.093 |
| 76 | 68 | 14 | 0.71 | <0.001 | 7 | GO:0022627 cytosolic small ribosomal subunit; GO:0015935 small ribosomal subunit; GO:0006412 translation; GO:0044445 cytosolic part; GO:0003735 structural constituent of ribosome; GO:0015935 small ribosomal subunit; GO:0022627 cytosolic small ribosomal subunit | 0.074 |
| 77 | 86 | 20 | 0.71 | <0.001 | 3 | GO:0003674 molecular_function; GO:0022627 cytosolic small ribosomal subunit; GO:0032991 macromolecular complex; GO:0022627 cytosolic small ribosomal subunit | 0.052 |
| 78 | 104 | 39 | 0.71 | <0.001 | 23 | GO:0032991 macromolecular complex; GO:0030529 ribonucleoprotein complex; GO:0006412 translation; GO:0044238 primary metabolic process; GO:0008135 “translation factor activity, nucleic acid binding”; GO:0003743 translation initiation factor activity; GO:0045182 translation regulator activity; GO:0008135 “translation factor activity, nucleic acid binding”; GO:0016070 RNA metabolic process; GO:0034960 cellular biopolymer metabolic process | 0.108 |
| 79 | 90 | 23 | 0.71 | <0.001 | 9 | GO:0006412 translation; GO:0044267 cellular protein metabolic process; GO:0019538 protein metabolic process; GO:0032991 macromolecular complex; GO:0005840 ribosome | 0.060 |
| 80 | 108 | 36 | 0.71 | <0.001 | 7 | GO:0032991 macromolecular complex; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0005198 structural molecule activity; GO:0030529 ribonucleoprotein complex | 0.078 |
| 81 | 90 | 24 | 0.71 | <0.001 | 11 | GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0022625 cytosolic large ribosomal subunit; GO:0044238 primary metabolic process; GO:0043283 biopolymer metabolic process; GO:0022625 cytosolic large ribosomal subunit; GO:0043234 protein complex; GO:0043170 macromolecule metabolic process | 0.067 |
| 82 | 106 | 33 | 0.71 | <0.001 | 21 | GO:0044238 primary metabolic process; GO:0034960 cellular biopolymer metabolic process; GO:0009987 cellular process; GO:0043283 biopolymer metabolic process; GO:0044260 cellular macromolecule metabolic process; GO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process”; GO:0008152 metabolic process; GO:0043229 intracellular organelle; GO:0043226 organelle; GO:0034960 cellular biopolymer metabolic process | 0.084 |
| 83 | 129 | 31 | 0.71 | <0.001 | 18 | GO:0032991 macromolecular complex; GO:0006412 translation; GO:0005198 structural molecule activity; GO:0005840 ribosome; GO:0044445 cytosolic part; GO:0005488 binding; GO:0005622 intracellular; GO:0022625 cytosolic large ribosomal subunit; GO:0044422 organelle part; GO:0044446 intracellular organelle part | 0.091 |
| 84 | 129 | 28 | 0.71 | <0.001 | 22 | GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0009058 biosynthetic process; GO:0044249 cellular biosynthetic process; GO:0006412 translation; GO:0009059 macromolecule biosynthetic process; GO:0043284 biopolymer biosynthetic process; GO:0044424 intracellular part; GO:0044237 cellular metabolic process; GO:0044249 cellular biosynthetic process | 0.098 |
| 85 | 77 | 38 | 0.71 | <0.001 | 12 | GO:0030529 ribonucleoprotein complex; GO:0044445 cytosolic part; GO:0032991 macromolecular complex; GO:0033279 ribosomal subunit; GO:0043228 nonmembrane-bounded organelle; GO:0005622 intracellular; GO:0022625 cytosolic large ribosomal subunit | 0.074 |
| 86 | 109 | 28 | 0.70 | <0.001 | 6 | GO:0009987 cellular process; GO:0006412 translation; GO:0044445 cytosolic part; GO:0044238 primary metabolic process | 0.090 |
| 87 | 78 | 21 | 0.70 | 0.001 | 8 | GO:0010468 regulation of gene expression; GO:0010556 regulation of macromolecule biosynthetic process; GO:0060255 regulation of macromolecule metabolic process; GO:0031326 regulation of cellular biosynthetic process; GO:0009889 regulation of biosynthetic process; GO:0019222 regulation of metabolic process; GO:0060255 regulation of macromolecule metabolic process; GO:0009889 regulation of biosynthetic process; GO:0031323 regulation of cellular metabolic process; GO:0031326 regulation of cellular biosynthetic process | 0.055 |
| 88 | 100 | 24 | 0.70 | <0.001 | 19 | GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0006412 translation; GO:0005840 ribosome; GO:0003735 structural constituent of ribosome; GO:0044422 organelle part; GO:0044446 intracellular organelle part; GO:0044260 cellular macromolecule metabolic process; GO:0044237 cellular metabolic process | 0.073 |
| 89 | 82 | 24 | 0.70 | <0.001 | 13 | GO:0044445 cytosolic part; GO:0006417 regulation of translation; GO:0010608 posttranscriptional regulation of gene expression; GO:0032268 regulation of cellular protein metabolic process; GO:0051246 regulation of protein metabolic process; GO:0009889 regulation of biosynthetic process; GO:0031323 regulation of cellular metabolic process; GO:0031326 regulation of cellular biosynthetic process; GO:0010468 regulation of gene expression; GO:0010556 regulation of macromolecule biosynthetic process | 0.060 |
| 90 | 77 | 27 | 0.70 | <0.001 | 5 | GO:0003674 molecular_function; GO:0005198 structural molecule activity; GO:0009987 cellular process; GO:0044238 primary metabolic process; GO:0044445 cytosolic part | 0.050 |
| 91 | 97 | 22 | 0.70 | <0.001 | 17 | GO:0044445 cytosolic part; GO:0006412 translation; GO:0032991 macromolecular complex; GO:0009987 cellular process; GO:0044238 primary metabolic process; GO:0009059 macromolecule biosynthetic process; GO:0043284 biopolymer biosynthetic process; GO:0044249 cellular biosynthetic process | 0.088 |
| 92 | 110 | 28 | 0.70 | <0.001 | 6 | GO:0009987 cellular process; GO:0032991 macromolecular complex; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle | 0.090 |
| 93 | 94 | 29 | 0.70 | <0.001 | 15 | GO:0032991 macromolecular complex; GO:0006417 regulation of translation; GO:0010608 posttranscriptional regulation of gene expression; GO:0032268 regulation of cellular protein metabolic process; GO:0051246 regulation of protein metabolic process; GO:0005083 small GTPase regulator activity; GO:0030695 GTPase regulator activity; GO:0005737 cytoplasm; GO:0010608 posttranscriptional regulation of gene expression; GO:0051246 regulation of protein metabolic process | 0.067 |
| 94 | 113 | 34 | 0.70 | <0.001 | 32 | GO:0009058 biosynthetic process; GO:0044249 cellular biosynthetic process; GO:0006412 translation; GO:0009987 cellular process; GO:0044238 primary metabolic process; GO:0019222 regulation of metabolic process; GO:0060255 regulation of macromolecule metabolic process; GO:0009889 regulation of biosynthetic process; GO:0031323 regulation of cellular metabolic process; GO:0031326 regulation of cellular biosynthetic process | 0.075 |
| 95 | 94 | 23 | 0.70 | <0.001 | 4 | GO:0006412 translation; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle | 0.062 |
| 96 | 104 | 31 | 0.70 | <0.001 | 10 | GO:0006412 translation; GO:0009987 cellular process; GO:0044238 primary metabolic process; GO:0016070 RNA metabolic process; GO:0034660 ncRNA metabolic process; GO:0034470 ncRNA processing; GO:0034660 ncRNA metabolic process; GO:0016070 RNA metabolic process | 0.107 |
| 97 | 51 | 13 | 0.70 | <0.001 | 1 | GO:0003674 molecular_function | 0.043 |
| 98 | 154 | 32 | 0.70 | <0.001 | 14 | GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0005840 ribosome; GO:0032991 macromolecular complex; GO:0030529 ribonucleoprotein complex; GO:0005575 cellular_component; GO:0044464 cell part; GO:0010556 regulation of macromolecule biosynthetic process; GO:0005737 cytoplasm; GO:0010608 posttranscriptional regulation of gene expression | 0.115 |
| 99 | 117 | 30 | 0.70 | <0.001 | 11 | GO:0005622 intracellular; GO:0009987 cellular process; GO:0044238 primary metabolic process; GO:0006412 translation; GO:0019538 protein metabolic process; GO:0005622 intracellular; GO:0022627 cytosolic small ribosomal subunit; GO:0032268 regulation of cellular protein metabolic process | 0.100 |
| 100 | 110 | 28 | 0.70 | <0.001 | 8 | GO:0009987 cellular process; GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0044445 cytosolic part; GO:0009058 biosynthetic process; GO:0044249 cellular biosynthetic process | 0.092 |
| 101 | 139 | 34 | 0.70 | <0.001 | 16 | GO:0005198 structural molecule activity; GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0006412 translation; GO:0009987 cellular process; GO:0044422 organelle part; GO:0044446 intracellular organelle part; GO:0051246 regulation of protein metabolic process; GO:0006417 regulation of translation; GO:0032268 regulation of cellular protein metabolic process | 0.100 |
| 102 | 98 | 28 | 0.69 | <0.001 | 48 | GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0006412 translation; GO:0043284 biopolymer biosynthetic process; GO:0005840 ribosome; GO:0006333 chromatin assembly or disassembly; GO:0006446 regulation of translational initiation; GO:0003743 translation initiation factor activity; GO:0019222 regulation of metabolic process; GO:0045182 translation regulator activity | 0.085 |
| 103 | 71 | 18 | 0.69 | <0.001 | 1 | GO:0003674 molecular_function | 0.053 |
| 104 | 105 | 21 | 0.69 | <0.001 | 5 | GO:0008150 biological_process; GO:0009987 cellular process; GO:0003674 molecular_function; GO:0032991 macromolecular complex; GO:0043234 protein complex; GO:0043234 protein complex | 0.058 |
| 105 | 140 | 32 | 0.69 | <0.001 | 16 | GO:0032991 macromolecular complex; GO:0043234 protein complex; GO:0044238 primary metabolic process; GO:0009987 cellular process; GO:0044445 cytosolic part; GO:0005575 cellular_component; GO:0044464 cell part; GO:0010608 posttranscriptional regulation of gene expression; GO:0043226 organelle; GO:0051246 regulation of protein metabolic process | 0.098 |
| 106 | 41 | 12 | 0.69 |  | 0 |  | 0.035 |
| 107 | 101 | 25 | 0.69 | <0.001 | 24 | GO:0044238 primary metabolic process; GO:0005198 structural molecule activity; GO:0032991 macromolecular complex; GO:0005840 ribosome; GO:0044445 cytosolic part; GO:0034645 cellular macromolecule biosynthetic process; GO:0022625 cytosolic large ribosomal subunit; GO:0034960 cellular biopolymer metabolic process; GO:0044260 cellular macromolecule metabolic process; GO:0009059 macromolecule biosynthetic process | 0.080 |
| 108 | 99 | 21 | 0.69 | <0.001 | 9 | GO:0032991 macromolecular complex; GO:0019538 protein metabolic process; GO:0044267 cellular protein metabolic process; GO:0044238 primary metabolic process; GO:0006412 translation; GO:0044424 intracellular part | 0.080 |
| 109 | 86 | 12 | 0.69 | <0.001 | 7 | GO:0044267 cellular protein metabolic process; GO:0009987 cellular process; GO:0019538 protein metabolic process; GO:0032991 macromolecular complex; GO:0043229 intracellular organelle; GO:0043229 intracellular organelle; GO:0043226 organelle | 0.039 |
| 110 | 118 | 30 | 0.69 | <0.001 | 17 | GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0044445 cytosolic part; GO:0032991 macromolecular complex; GO:0009987 cellular process; GO:0044422 organelle part; GO:0044446 intracellular organelle part; GO:0044424 intracellular part | 0.093 |
| 111 | 98 | 15 | 0.69 | <0.001 | 5 | GO:0016043 cellular component organization; GO:0009987 cellular process; GO:0006996 organelle organization; GO:0032991 macromolecular complex; GO:0008150 biological_process; GO:0006996 organelle organization; GO:0016043 cellular component organization | 0.041 |
| 112 | 157 | 43 | 0.69 | <0.001 | 38 | GO:0044238 primary metabolic process; GO:0030529 ribonucleoprotein complex; GO:0009987 cellular process; GO:0032991 macromolecular complex; GO:0006412 translation; GO:0015934 large ribosomal subunit; GO:0030686 90S preribosome; GO:0044464 cell part; GO:0034961 cellular biopolymer biosynthetic process; GO:0015935 small ribosomal subunit | 0.108 |
| 113 | 116 | 34 | 0.68 | <0.001 | 21 | GO:0009058 biosynthetic process; GO:0032991 macromolecular complex; GO:0044249 cellular biosynthetic process; GO:0006412 translation; GO:0009987 cellular process; GO:0000105 histidine biosynthetic process; GO:0006547 histidine metabolic process; GO:0009075 histidine family amino acid metabolic process; GO:0009076 histidine family amino acid biosynthetic process; GO:0009059 macromolecule biosynthetic process | 0.084 |
| 114 | 69 | 13 | 0.68 | 0.001 | 1 | GO:0009987 cellular process | 0.053 |
| 115 | 96 | 21 | 0.68 | <0.001 | 5 | GO:0003674 molecular_function; GO:0009987 cellular process; GO:0022627 cytosolic small ribosomal subunit; GO:0044267 cellular protein metabolic process; GO:0019538 protein metabolic process; GO:0022627 cytosolic small ribosomal subunit | 0.050 |
| 116 | 38 | 9 | 0.68 |  | 0 |  | 0.041 |
| 117 | 109 | 30 | 0.68 | <0.001 | 9 | GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0009987 cellular process; GO:0006412 translation; GO:0005198 structural molecule activity; GO:0043234 protein complex | 0.076 |
| 118 | 66 | 17 | 0.68 | 0.001 | 1 | GO:0009987 cellular process | 0.037 |
| 119 | 104 | 27 | 0.68 | <0.001 | 5 | GO:0003674 molecular_function; GO:0009987 cellular process; GO:0044445 cytosolic part; GO:0032991 macromolecular complex; GO:0008150 biological_process | 0.072 |
| 120 | 122 | 36 | 0.68 | <0.001 | 38 | GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0033279 ribosomal subunit; GO:0008152 metabolic process; GO:0043283 biopolymer metabolic process; GO:0022613 ribonucleoprotein complex biogenesis; GO:0042254 ribosome biogenesis; GO:0044085 cellular component biogenesis; GO:0034961 cellular biopolymer biosynthetic process; GO:0015935 small ribosomal subunit | 0.097 |
| 121 | 74 | 16 | 0.68 | 0.001 | 8 | GO:0022627 cytosolic small ribosomal subunit; GO:0044445 cytosolic part; GO:0032991 macromolecular complex; GO:0043332 mating projection tip; GO:0044463 cell projection part; GO:0043332 mating projection tip; GO:0044463 cell projection part; GO:0022627 cytosolic small ribosomal subunit | 0.089 |
| 122 | 126 | 38 | 0.68 | <0.001 | 35 | GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0006412 translation; GO:0009987 cellular process; GO:0043234 protein complex; GO:0008135 “translation factor activity, nucleic acid binding”; GO:0034961 cellular biopolymer biosynthetic process; GO:0034645 cellular macromolecule biosynthetic process; GO:0043229 intracellular organelle; GO:0044422 organelle part | 0.106 |
| 123 | 83 | 18 | 0.68 | <0.001 | 3 | GO:0003674 molecular_function; GO:0043234 protein complex; GO:0032991 macromolecular complex; GO:0043234 protein complex | 0.053 |
| 124 | 119 | 31 | 0.67 | <0.001 | 8 | GO:0032991 macromolecular complex; GO:0006412 translation; GO:0009987 cellular process; GO:0005488 binding; GO:0044422 organelle part; GO:0005488 binding; GO:0044422 organelle part; GO:0044446 intracellular organelle part | 0.093 |
| 125 | 133 | 41 | 0.67 | <0.001 | 27 | GO:0009987 cellular process; GO:0032991 macromolecular complex; GO:0033279 ribosomal subunit; GO:0044238 primary metabolic process; GO:0006412 translation; GO:0015935 small ribosomal subunit; GO:0043229 intracellular organelle; GO:0044422 organelle part; GO:0044446 intracellular organelle part; GO:0043226 organelle | 0.092 |
126132250.67<0.00118GO:0044238 primary metabolic processGO:0016070 RNA metabolic processGO:0044237 cellular metabolic processGO:0009987 cellular processGO:0008152 metabolic processGO:0031125 rRNA 3′-end processingGO:0043628 ncRNA 3′-end processingGO:0034660 ncRNA metabolic processGO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process”GO:0008152 metabolic process0.080
12757140.6700.042
12851180.67<0.0011GO:0003674 molecular_function0.044
12977250.67<0.0015GO:0009987 cellular processGO:0043933 macromolecular complex subunit organizationGO:0034621 cellular macromolecular complex subunit organizationGO:0003674 molecular_functionGO:0034622 cellular macromolecular complex assemblyGO:0034622 cellular macromolecular complex assemblyGO:0043933 macromolecular complex subunit organizationGO:0034621 cellular macromolecular complex subunit organization0.048
13075220.67<0.0014GO:0044238 primary metabolic processGO:0016070 RNA metabolic processGO:0003674 molecular_functionGO:0043283 biopolymer metabolic processGO:0016070 RNA metabolic process0.067
131106260.67<0.0016GO:0003674 molecular_functionGO:0043229 intracellular organelleGO:0032991 macromolecular complexGO:0043226 organelleGO:0044238 primary metabolic processGO:0043229 intracellular organelleGO:0043226 organelle0.076
132133250.67<0.00121GO:0032991 macromolecular complexGO:0044238 primary metabolic processGO:0009987 cellular processGO:0044445 cytosolic partGO:0005198 structural molecule activityGO:0005488 bindingGO:0005622 intracellularGO:0044424 intracellular partGO:0044249 cellular biosynthetic process0.097
133128350.67<0.00122GO:0032991 macromolecular complexGO:0006412 translationGO:0005198 structural molecule activityGO:0005840 ribosomeGO:0043284 biopolymer biosynthetic processGO:0034961 cellular biopolymer biosynthetic processGO:0034645 cellular macromolecule biosynthetic processGO:0009059 macromolecule biosynthetic processGO:0043284 biopolymer biosynthetic processGO:0043234 protein complex0.096
134107280.67<0.00119GO:0005198 structural molecule activityGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0044445 cytosolic partGO:0005840 ribosomeGO:0005737 cytoplasmGO:0015935 small ribosomal subunitGO:0022627 cytosolic small ribosomal subunitGO:0032268 regulation of cellular protein metabolic process0.074
135109240.66<0.00117GO:0009058 biosynthetic processGO:0044238 primary metabolic processGO:0044249 cellular biosynthetic processGO:0032991 macromolecular complexGO:0043284 biopolymer biosynthetic processGO:0003676 nucleic acid bindingGO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process”GO:0008152 metabolic processGO:0006417 regulation of translationGO:0009059 macromolecule biosynthetic process0.078
13672160.66<0.0019GO:0000462 “maturation of SSU-rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA)”GO:0030490 maturation of SSU-rRNAGO:0022627 cytosolic small ribosomal subunitGO:0006412 translationGO:0043228 nonmembrane-bounded organelleGO:0000462 “maturation of SSU-rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA)”GO:0030490 maturation of SSU-rRNAGO:0022627 cytosolic small ribosomal subunit0.050
137113240.66<0.00111GO:0044238 primary metabolic processGO:0030529 ribonucleoprotein complexGO:0005840 ribosomeGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0008152 metabolic processGO:0044237 cellular metabolic process0.080
13848120.6600.033
13958130.6600.041
140135370.66<0.00114GO:0044238 primary metabolic processGO:0032991 macromolecular complexGO:0009987 cellular processGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0005488 bindingGO:0044424 intracellular partGO:0044237 cellular metabolic processGO:0043170 macromolecule metabolic process0.101
141103210.66<0.00110GO:0009987 cellular processGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0043229 intracellular organelleGO:0043226 organelleGO:0065007 biological regulationGO:0050789 regulation of biological processGO:0050794 regulation of cellular processGO:0043229 intracellular organelleGO:0043226 organelle0.063
142164320.66<0.00126GO:0044238 primary metabolic processGO:0032991 macromolecular complexGO:0009987 cellular processGO:0006396 RNA processingGO:0016070 RNA metabolic processGO:0003824 catalytic activityGO:0006396 RNA processingGO:0030684 preribosomeGO:0030686 preribosomeGO:0034470 ncRNA processing0.091
14390180.66<0.00121GO:0032991 macromolecular complexGO:0019538 protein metabolic processGO:0044238 primary metabolic processGO:0043283 biopolymer metabolic processGO:0044267 cellular protein metabolic processGO:0008152 metabolic processGO:0034960 cellular biopolymer metabolic processGO:0044260 cellular macromolecule metabolic processGO:0044237 cellular metabolic processGO:0043234 protein complex0.064
144101200.66<0.0013GO:0009987 cellular processGO:0003674 molecular_functionGO:0008150 biological_process0.052
14512240.66<0.0012GO:0008150 biological_processGO:0003674 molecular_function0.045
146121320.66<0.00114GO:0032991 macromolecular complexGO:0044238 primary metabolic processGO:0009058 biosynthetic processGO:0044249 cellular biosynthetic processGO:0009987 cellular processGO:0009059 macromolecule biosynthetic processGO:0043284 biopolymer biosynthetic processGO:0044249 cellular biosynthetic process0.061
147121300.66<0.0016GO:0003824 catalytic activityGO:0032991 macromolecular complexGO:0044238 primary metabolic processGO:0030684 preribosomeGO:0003824 catalytic activityGO:0030684 preribosome0.088
148104220.66<0.00123GO:0044238 primary metabolic processGO:0034660 ncRNA metabolic processGO:0034470 ncRNA processingGO:0031125 rRNA 3′-end processingGO:0009987 cellular processGO:0000459 exonucleolytic trimming during rRNA processingGO:0000467 “exonucleolytic trimming to generate mature 3′-end of 5.8S rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA)”GO:0000469 cleavages during rRNA processingGO:0006364 rRNA processingGO:0016072 rRNA metabolic process0.070
149140190.66<0.00114GO:0044238 primary metabolic processGO:0019538 protein metabolic processGO:0044267 cellular protein metabolic processGO:0032991 macromolecular complexGO:0005737 cytoplasmGO:0044464 cell partGO:0005737 cytoplasmGO:0006417 regulation of translationGO:0032268 regulation of cellular protein metabolic processGO:0043170 macromolecule metabolic process0.069
150116300.65<0.00114GO:0019538 protein metabolic processGO:0032991 macromolecular complexGO:0044267 cellular protein metabolic processGO:0044445 cytosolic partGO:0005198 structural molecule activityGO:0022625 cytosolic large ribosomal subunitGO:0043234 protein complex0.079
15161210.65<0.0011GO:0003674 molecular_function0.051
15262150.65<0.0011GO:0003674 molecular_function0.041
15385270.65<0.0015GO:0016070 RNA metabolic processGO:0003674 molecular_functionGO:0044238 primary metabolic processGO:0009987 cellular processGO:0034660 ncRNA metabolic processGO:0034660 ncRNA metabolic processGO:0016070 RNA metabolic process0.072
154142330.65<0.00112GO:0030529 ribonucleoprotein complexGO:0044445 cytosolic partGO:0032991 macromolecular complexGO:0033279 ribosomal subunitGO:0043228 nonmembrane-bounded organelleGO:0005622 intracellularGO:0043229 intracellular organelleGO:0044422 organelle partGO:0044446 intracellular organelle partGO:0043226 organelle0.099
15554120.6500.039
15671150.65<0.0016GO:0043283 biopolymer metabolic processGO:0044238 primary metabolic processGO:0034960 cellular biopolymer metabolic processGO:0043170 macromolecule metabolic processGO:0044260 cellular macromolecule metabolic processGO:0034960 cellular biopolymer metabolic processGO:0044260 cellular macromolecule metabolic processGO:0043170 macromolecule metabolic process0.052
157103340.65<0.00121GO:0032991 macromolecular complexGO:0009987 cellular processGO:0044445 cytosolic partGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0015934 large ribosomal subunitGO:0022625 cytosolic large ribosomal subunitGO:0051246 regulation of protein metabolic processGO:0044424 intracellular partGO:0032268 regulation of cellular protein metabolic process0.079
15884190.65<0.0016GO:0005198 structural molecule activityGO:0005488 bindingGO:0044445 cytosolic partGO:0009987 cellular processGO:0032991 macromolecular complexGO:0005488 binding0.074
159103200.65<0.00110GO:0032991 macromolecular complexGO:0034621 cellular macromolecular complex subunit organizationGO:0044238 primary metabolic processGO:0009987 cellular processGO:0043933 macromolecular complex subunit organizationGO:0065003 macromolecular complex assemblyGO:0034622 cellular macromolecular complex assemblyGO:0043933 macromolecular complex subunit organizationGO:0034621 cellular macromolecular complex subunit organization0.063
1607470.650.0013GO:0044422 organelle partGO:0044446 intracellular organelle partGO:0009987 cellular processGO:0044422 organelle partGO:0044446 intracellular organelle part0.037
1615770.64<0.0011GO:0003674 molecular_function0.048
1628760.63<0.0011GO:0003674 molecular_function0.048
1637550.61<0.0012GO:0032991 macromolecular complexGO:0003674 molecular_function0.045
16456100.5400.033

Notes: Steps for selecting the specific GO terms of each bicluster. (1) We hypothesize that a GO term appearing in only a small proportion of the biclusters (ie, 1 of 4 biclusters) is specific to those biclusters. (2) There are 164 biclusters. By the proportion test at the 0.05 significance level, 1 of 4 biclusters corresponds to 31 of 164 biclusters. (3) Therefore, GO terms that appear fewer than 32 times are regarded as specific terms.
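The selection rule in the notes can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that the "proportion test" is a one-sided test against a null proportion of 0.25 using the normal approximation, which reproduces the stated cutoff of 31 of 164 biclusters. The function names `specificity_cutoff` and `specific_terms`, and the input layout (one list of GO terms per bicluster), are hypothetical.

```python
from collections import Counter
from math import floor, sqrt
from statistics import NormalDist


def specificity_cutoff(n_biclusters, p0=0.25, alpha=0.05):
    """Largest occurrence count that is significantly below p0 * n_biclusters.

    Assumption: one-sided proportion test with the normal approximation.
    """
    z = NormalDist().inv_cdf(alpha)           # about -1.645 for alpha = 0.05
    se = sqrt(p0 * (1 - p0) / n_biclusters)   # standard error under the null
    return floor(n_biclusters * (p0 + z * se))


def specific_terms(bicluster_terms, p0=0.25, alpha=0.05):
    """Return the GO terms that occur in at most `cutoff` biclusters."""
    cutoff = specificity_cutoff(len(bicluster_terms), p0, alpha)
    # Count each term once per bicluster, even if it is listed twice there.
    counts = Counter(t for terms in bicluster_terms for t in set(terms))
    return {term for term, count in counts.items() if count <= cutoff}


if __name__ == "__main__":
    print(specificity_cutoff(164))
```

With 164 biclusters the cutoff evaluates to 31, so a term counts as specific when it appears in fewer than 32 biclusters, in agreement with step (3).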
