Kengo Kinoshita, Takeshi Obayashi.
Abstract
BACKGROUND: Recent improvements in DNA microarray techniques have made a large variety of gene expression data available in public databases. These data can be used to evaluate the strength of gene coexpression by calculating the correlation of expression patterns between genes across many experiments. However, gene expression levels differ significantly across tissues in higher organisms, and across cellular locations and cell states in eukaryotes. Thus, the usual correlation measure can evaluate only the differences among tissues or cellular localizations, and cannot adequately elucidate functional relationships from the coexpression of genes.
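The problem described in the abstract can be illustrated with a minimal sketch. It assumes a plain Pearson correlation coefficient computed with NumPy; the expression values, the tissue labels `A` and `B`, and the helper name `pearson` are made up for illustration and are not taken from the paper.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Made-up expression values for two genes in six samples (illustration only).
# Samples 0-2 come from hypothetical tissue "A", samples 3-5 from tissue "B".
gene_x = np.array([1.0, 2.0, 3.0, 11.0, 12.0, 13.0])
gene_y = np.array([3.0, 2.0, 1.0, 13.0, 12.0, 11.0])
tissue = np.array(["A", "A", "A", "B", "B", "B"])

# Correlation over all samples pooled together.
pooled = pearson(gene_x, gene_y)

# Correlation restricted to the samples of each tissue.
within = {t: pearson(gene_x[tissue == t], gene_y[tissue == t]) for t in ("A", "B")}

print(f"pooled PCC: {pooled:.3f}")
for t, r in within.items():
    print(f"PCC within tissue {t}: {r:.3f}")
```

In this toy data set the pooled correlation is strongly positive (about 0.95) even though the two genes are perfectly anti-correlated within each tissue, so the pooled value mostly reflects the expression difference between the tissues.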
Year: 2009 PMID: 19620096 PMCID: PMC2759550 DOI: 10.1093/bioinformatics/btp442
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
GO prediction performance by AUC
| Method | CC | BP | MF |
|---|---|---|---|
| PCC | 0.694±0.0086 | 0.609±0.015 | 0.603±0.018 |
| SCC | 0.688±0.0009 | 0.628±0.0001 | 0.607±0.0002 |
| SVM | 0.733±0.014 (6) | 0.645±0.029 (6) | 0.641±0.024 (3) |
Prediction performance based on PCC, SCC and SVM. The number in parentheses is the number of correlations used for the multidimensional correlation. See section ‘Materials and methods’ for details.
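As a reading aid for the AUC values in the table, the following is a minimal sketch of how an AUC can be computed from scores and binary labels. It assumes that the score of a gene pair is its correlation and that the label says whether the pair shares a GO annotation; the record does not state how the AUC was actually computed, and the function name `auc_from_scores` and the example values are hypothetical.

```python
import numpy as np

def auc_from_scores(scores, labels):
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative count as half a win.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one positive and one negative example")
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

# Made-up correlations for six gene pairs (illustration only).
# label 1: the pair shares an annotation; label 0: it does not.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

auc = auc_from_scores(scores, labels)
print(f"AUC: {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for the values between 0.60 and 0.73 reported in the table.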
Fig. 1. (A) Frequently observed patterns of correlation changes revealed by rough clustering (rmsd <0.1). (B) Schematic explanation of the stability value, S, for the correlation change. (C) Distribution of stability values for all pairs considered (black), the pairs in the decrease cases (green) and the pairs in the increase cases (red). The number of probe pairs, the mean of the S values and its standard deviation are 430 866 (76.5%), 0.533 and 0.159, respectively, for the decrease cases, and 132 719 (23.5%), 0.547 and 0.199 for the increase cases. (D) Frequency plot showing the relationship between correlation and stability; frequency is shown on a base-10 logarithmic scale. Stability tends to be high for gene pairs with correlations >0.7, whereas correlations <0.7 are usually fragile, as indicated by low stability.
Fig. 2. Correlation changes (A) among the genes in Photosystem II and (B) among NDH-related genes. Each line corresponds to a cluster with rmsd <0.05; the gene pairs in each cluster are listed in Table 1 for (A) and specified in the figure for (B).
Gene list of each cluster involved in Photosystem II
| Cluster number | List of coexpressed genes |
|---|---|
| | PsbO1–PsbR, PsbP1–PsbX, PsbY–PsbP1, PsbY–PsbX, PsbP1–PsbR, PsbY–PsbO1, PsbQ1–PsbTn |
| | PsbY–PsbW, PsbQ-2–PsbTn, PsbQ1–PsbW, PsbY–PsbQ1, PsbTn–PsbX, PsbQ-2–PsbP1, PsbY–PsbTn, PsbO1–PsbO2, PsbP1–PsbW, PsbO1–PsbX, PsbQ1–PsbX, PsbO1–PsbQ2, PsbY–PsbQ2, PsbTn–PsbW |
| | PsbR–PsbX, PsbX–PsbW, PsbQ2–PsbR, PsbO2–PsbR, PsbQ2–PsbX, PsbY–PsbR |
| | PsbO-2–PsbX, PsbTn–PsbR, PsbR–PsbW, PsbO-1–PsbQ-1, PsbQ-1–PsbR, PsbO-1–PsbTn |
| | PsbO2– |
| | PsbY– |
The upper four and lower three clusters exhibit stable and fragile coexpression, respectively. Underlined genes are those with fragile coexpression. c′ indicates clusters obtained with rmsd <0.05.
Fig. 3. (A) Correlation change for genes in glycerolipid metabolism (ath00561) and (B) their locations in the KEGG pathway.
Fig. 4. Factor loading of each sample in (A) the first PC, (B) the second PC, with the plot expanded for samples 1–200, and (C) the third PC. The red dots indicate the samples that contribute most, i.e. those with factor loadings >0.5. The background colors in the plots correspond to the rough classifications: developmental stage (1–237), time course samples (238–771) and others (772–1388).