John Tomfohr, Jun Lu, Thomas B Kepler.
Abstract
BACKGROUND: A promising direction in the analysis of gene expression focuses on the changes in expression of specific predefined sets of genes that are known in advance to be related (e.g., genes coding for proteins involved in cellular pathways or complexes). Such an analysis can reveal features that are not easily visible from the variations in the individual genes and can lead to a picture of expression that is more biologically transparent and accessible to interpretation. In this article, we present a new method of this kind that operates by quantifying the level of 'activity' of each pathway in different samples. The activity levels, which are derived from singular value decompositions, form the basis for statistical comparisons and other applications.
Year: 2005 PMID: 16156896 PMCID: PMC1261155 DOI: 10.1186/1471-2105-6-225
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Outline of pathway level analysis of gene expression.
Figure 2. Pathway activity levels. Schematic illustration of our approach to quantifying the activity level of a pathway. (A) A colormap of the expression levels for the genes in a hypothetical pathway after standardizing the expression levels to have zero mean and unit variance over samples. This represents the matrix Y described in the text. (B) The main component of the variation in the expression matrix depicted in (A). This representation is determined by the activity levels c and weights w (see Methods) associated with the first metagene in the singular value decomposition (SVD) of Y. The activity level in a sample (one column of the expression matrix) can be thought of as specifying a location in the range of expression profiles shown in (C). Positive activity levels here indicate relatively high (low) expression for genes with positive (negative) weight. For example, the expression profile (column) furthest to the left in the expression matrix is in the high positive region of the range of expression profiles. The colormaps in (A) and (B) show the samples divided into two hypothetical groups (e.g., case samples and control samples). We note, however, that the matrix Y contains expression values for all samples: the activity levels are determined by performing SVD using expression data from all samples without regard to how the samples are classified.
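The construction described in the Figure 2 caption can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `pathway_activity`, the synthetic data, and the choice to scale the activity levels by the first singular value are all assumptions made here for the example.

```python
import numpy as np

def pathway_activity(expr):
    """Return (weights, activity, explained fraction) for one pathway.

    expr: genes x samples array of expression values for the pathway.
    Assumes every gene varies across samples (no zero-variance rows).
    """
    expr = np.asarray(expr, dtype=float)
    # Standardize each gene (row) to zero mean and unit variance over samples.
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, ddof=1, keepdims=True)
    Y = (expr - mean) / std
    # Thin SVD over ALL samples, ignoring any grouping: Y = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    w = U[:, 0]          # gene weights of the first metagene (unit length)
    c = s[0] * Vt[0]     # activity level per sample (scaling is an assumption)
    frac = s[0] ** 2 / np.sum(s ** 2)  # share of variance in the first metagene
    return w, c, frac

# Synthetic pathway: 8 genes driven by one latent factor, 4 up and 4 down.
rng = np.random.default_rng(0)
n_genes, n_samples = 8, 20
latent = rng.normal(size=n_samples)
loadings = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
expr = np.outer(loadings, latent) + 0.3 * rng.normal(size=(n_genes, n_samples))

w, c, frac = pathway_activity(expr)
```

The outer product of `w` and `c` is the rank-one approximation corresponding to panel (B). The overall sign of the pair `(w, c)` is arbitrary, as with any SVD.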
Pathways correlated with a type 2 diabetic phenotype. The table shows p-values for the three pathways most correlated with blood glucose concentration as measured two hours after an oral glucose tolerance test (OGTT). r is the Pearson correlation. Also shown are p-values, generally indicating low significance levels, for these pathways determined from t-statistic comparisons between DM2 (type 2 diabetic) and NGT (normal glucose tolerance).
| (genes in data set/total genes in pathway) Pathway | p-value, glucose after OGTT (r) | p-value, DM2 vs. NGT |
| (96/123) Oxidative phosphorylation | 0.031 ( | 0.565 (down in DM2) |
| (6/6) Activation of cAMP-dependent protein kinase | 0.062 ( | 0.395 (up in DM2) |
| (35/40) ATP synthesis | 0.166 ( | 0.855 (down in DM2) |
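The two kinds of comparison in this table, correlation of activity levels with a continuous phenotype and a two-group t-statistic, can be sketched as follows. The helper names and toy numbers are hypothetical, and the use of the unequal-variance (Welch) form of the t-statistic is an assumption; the caption does not say which form was used.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two equal-length vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def welch_t(a, b):
    """Two-sample t-statistic with unequal variances (an assumed choice)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return float((a.mean() - b.mean()) / se)

# Hypothetical activity levels and glucose values, for illustration only.
activity = np.array([-1.5, -0.5, 0.0, 0.5, 1.5, 2.0])
glucose = 10.0 - 2.0 * activity        # perfectly anti-correlated toy data
r = pearson_r(activity, glucose)

t = welch_t([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
```

A continuous phenotype uses all the information in the measurement, which is one reason a correlation can reach a lower p-value than a comparison between diagnostic groups.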
Figure 3. Negative correlation between oxidative phosphorylation and blood glucose levels after OGTT. Scatter plot of blood glucose levels 2 hours after OGTT vs. oxidative phosphorylation activity levels. The three subject groups – type 2 diabetic (DM2), normal glucose tolerance (NGT), and impaired glucose tolerance (IGT) – are distinguished by color; solid lines show the first principal component for each group independent of the others. Group means are shown in black squares. The inset shows the 95% confidence intervals for the linear correlation coefficients for each group. Negative correlation between glucose levels and oxidative phosphorylation reaches statistical significance only within DM2 subjects.
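One common way to obtain a 95% confidence interval for a correlation coefficient is the Fisher z-transform, sketched below. The caption does not state which method produced the intervals in the inset, so this is an assumed technique, and the values of `r` and `n` are hypothetical.

```python
import math

def pearson_ci95(r, n):
    """Approximate 95% CI for a Pearson correlation via the Fisher z-transform."""
    if n <= 3:
        raise ValueError("need more than 3 samples")
    z = math.atanh(r)                            # Fisher z of the correlation
    half = 1.959963984540054 / math.sqrt(n - 3)  # 97.5% normal quantile / sqrt(n - 3)
    return math.tanh(z - half), math.tanh(z + half)

# Hypothetical within-group correlation and group size.
lo, hi = pearson_ci95(-0.6, 18)
# The correlation is called significant at the 5% level if the CI excludes zero.
significant = hi < 0 or lo > 0
```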
Top pathways identified in comparisons between current (C), former (F), and never (N) smokers. Pathways identified as differentially expressed with p < 0.05 using pathway activity levels in comparisons between smokers and non-smokers. The p-values were determined using 10,000 random permutations as described in the Methods section. p < 0.0001 means no pathway in any of the 10,000 permutations showed higher significance. We note that the change (up or down) is determined, somewhat arbitrarily, by the average expression level captured by the first metagene. Specifically, the pathway is called 'up' if the average of $\sum_i w_i c_j$ (summed over genes i, averaged over the samples j in a group) is greater in the first group (e.g., C in 'C vs. N') than in the second. A given pathway, however, will typically have some genes with higher and some with lower mean expression in one group as compared to another.
| Comparison | (genes in data set/total genes in pathway) Pathway | p-value | change |
| C vs. N | (14/45) gamma-Hexachlorocyclohexane degradation | <0.0001 | down |
| | (15/39) Prostaglandin and leukotriene metabolism | <0.0001 | up |
| | (11/24) O-Glycans biosynthesis | <0.0001 | up |
| | (6/21) Pentose and glucuronate interconversions | <0.0001 | up |
| | (24/34) Glutathione metabolism | <0.0001 | up |
| | (3/12) Lectin Induced Complement Pathway | 0.0004 | down |
| | (11/19) Chaperones modulate interferon Signaling Pathway | 0.0006 | up |
| | (6/15) TACI and BCMA stimulation of B cell immune responses. | 0.0044 | down |
| | (3/6) Tetrachloroethene degradation | 0.0054 | up |
| | (3/6) FXR and LXR Regulation of Cholesterol Metabolism | 0.0062 | down |
| | (4/7) TSP-1 Induced Apoptosis in Microvascular Endothelial Cell | 0.0065 | down |
| | (16/28) Galactose metabolism | 0.0067 | down |
| | (13/20) Biosynthesis of steroids | 0.0164 | up |
| | (7/11) Map Kinase Inactivation of SMRT Corepressor | 0.0274 | down |
| | (25/68) Nicotinate and nicotinamide metabolism | 0.0279 | up |
| | (4/14) Classical Complement Pathway | 0.0314 | down |
| | (6/19) Complement Pathway | 0.0364 | down |
| | (9/13) Nucleotide sugars metabolism | 0.0369 | up |
| | (3/3) Degradation of the RAR and RXR by the proteasome | 0.0396 | down |
| F vs. C | (3/12) Lectin Induced Complement Pathway | 0.0068 | up |
| | (13/20) Biosynthesis of steroids | 0.0083 | down |
| | (24/34) Glutathione metabolism | 0.0403 | down |
| F vs. N | (14/45) gamma-Hexachlorocyclohexane degradation | 0.0321 | down |
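A permutation p-value and the up/down call described in the table caption can be sketched as below. This is a simplified single-pathway version under stated assumptions: the Methods section is not reproduced here, so the test statistic (absolute Welch t on activity levels), the function names, and the toy data are illustrative choices, and the caption's wording suggests the paper's procedure compares against all pathways in each permutation rather than one.

```python
import numpy as np

def abs_t(c, in_first):
    """Absolute two-sample t-statistic between the two groups of samples."""
    a, b = c[in_first], c[~in_first]
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return abs(a.mean() - b.mean()) / se

def permutation_pvalue(c, in_first, n_perm=10_000, seed=0):
    """Fraction of label shuffles whose statistic is at least the observed one."""
    rng = np.random.default_rng(seed)
    c = np.asarray(c, dtype=float)
    in_first = np.asarray(in_first, dtype=bool)
    observed = abs_t(c, in_first)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(in_first)   # shuffle group labels only
        if abs_t(c, shuffled) >= observed:
            hits += 1
    return hits / n_perm   # 0 hits would be reported as p < 1/n_perm

def direction(w, c, in_first):
    """'up' if the group average of sum_i w_i c_j is greater in the first group."""
    w = np.asarray(w, dtype=float)
    c = np.asarray(c, dtype=float)
    in_first = np.asarray(in_first, dtype=bool)
    per_sample = np.sum(w) * c                 # sum over genes i of w_i * c_j
    first, second = per_sample[in_first].mean(), per_sample[~in_first].mean()
    return "up" if first > second else "down"

# Toy activity levels: 8 samples in the first group, 8 in the second.
c = np.array([2.1, 1.9, 2.3, 1.8, 2.0, 2.2, 1.7, 2.4,
              -0.1, 0.2, 0.0, -0.3, 0.1, 0.3, -0.2, 0.0])
in_first = np.array([True] * 8 + [False] * 8)
w = np.array([0.5, 0.5, 0.5, -0.5])            # hypothetical gene weights

p = permutation_pvalue(c, in_first, n_perm=2000)
call = direction(w, c, in_first)
```

Because the call depends on the sign of the summed weights, flipping the sign of `w` alone reverses it, which illustrates why the caption describes the direction as somewhat arbitrary.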
Figure 4. Expression profiles in airway epithelia of current (C), former (F), and never (N) smokers. Top: colormap of pathway activity levels for the highest ranking pathways in the comparison between current smokers and never smokers. Bottom: colormap for genes in the KEGG glutathione metabolism pathway. Glutathione is an important anti-oxidant known to be increased in the lungs of smokers. The genes with the highest weights in this pathway, GCLM and GCLC, encode the subunits of glutamate cysteine ligase (GCL), the rate-limiting enzyme in the synthesis of glutathione [24].
Weights for genes in two pathways that show an association with smoking status. SVD weights for genes in pathways identified as significantly differentially expressed between current and never smokers. The gene with highest absolute weight in the gamma-Hexachlorocyclohexane degradation pathway, CYP2A6, plays a key role in nicotine metabolism and has been linked to nicotine dependence [21]. GALNT3, a gene with relatively high weight in the O-Glycans biosynthesis pathway, initiates mucin-type O-glycosylation [22], suggesting a connection with the increased sputum production observed in smokers. The overall sign of the weights has here been chosen so that a positive weight implies relatively higher expression in current as compared to never smokers.
| Pathway | gene | weight |
| gamma-Hexachlorocyclohexane degradation | | -0.42 |
| | | -0.38 |
| | | 0.38 |
| | | -0.38 |
| | | 0.28 |
| | | -0.27 |
| | | -0.27 |
| | | -0.23 |
| | | 0.19 |
| | | 0.15 |
| | | 0.15 |
| | | 0.15 |
| | | 0.09 |
| | | -0.02 |
| O-Glycans biosynthesis | | 0.50 |
| | | 0.42 |
| | | 0.42 |
| | | 0.38 |
| | | 0.28 |
| | | 0.23 |
| | | 0.21 |
| | | 0.20 |
| | | -0.15 |
| | | 0.11 |
| | | 0.05 |
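The sign convention mentioned in the caption of this table can be made concrete with a short sketch. Since an SVD fixes the weights and activity levels only up to a joint sign, one way to orient them (assumed here; the helper `orient` and the toy values are hypothetical) is to flip both together whenever mean activity is lower in current than in never smokers.

```python
import numpy as np

def orient(w, c, is_current, is_never):
    """Flip (w, c) jointly so mean activity is higher in current smokers.

    After orientation, a positive weight implies relatively higher
    expression in current as compared to never smokers.
    """
    w = np.asarray(w, dtype=float)
    c = np.asarray(c, dtype=float)
    if c[is_current].mean() < c[is_never].mean():
        return -w, -c      # joint flip leaves the product w_i * c_j unchanged
    return w, c

# Toy example: two genes, four samples (two current, two never smokers).
w = np.array([0.8, -0.6])
c = np.array([-1.0, -2.0, 1.5, 1.5])
is_current = np.array([True, True, False, False])
is_never = ~is_current

w2, c2 = orient(w, c, is_current, is_never)
```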