Joshua W K Ho, Maurizio Stefani, Cristobal G dos Remedios, Michael A Charleston.
Abstract
MOTIVATION: Current microarray analyses focus on identifying sets of genes that are differentially expressed (DE) or differentially coexpressed (DC) in different biological states (e.g. diseased versus non-diseased). We observed that in many human diseases, some genes have a significant increase or decrease in expression variability (variance). As these observed changes in expression variability may be caused by alteration of the underlying expression dynamics, such differential variability (DV) patterns are also biologically interesting.
Year: 2008 PMID: 18586739 PMCID: PMC2718620 DOI: 10.1093/bioinformatics/btn142
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. An illustration of the concept of (a) DE, (b) DC and (c) DV. The x-axes represent individual samples and the y-axes represent gene expression level.
Summary of the eight tests of differential variability
| Test | Statistic | Distribution |
|---|---|---|
| Empirical | ||
| SD Diff, permutation | Empirical | |
| Empirical | ||
| Empirical |
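The table names a permutation test on the difference in standard deviations ("SD Diff, permutation") with an empirical null distribution. A minimal sketch of such a test follows. The exact statistic is an assumption, since the table does not define it: here it is taken as sd(disease) minus sd(normal), and the function name `sd_diff_permutation_test` and its parameters are illustrative rather than the authors' implementation.

```python
import random
import statistics


def sd_diff_permutation_test(normal, disease, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in sample SDs.

    Assumption: the statistic is sd(disease) - sd(normal). The source table
    only names the test, so this definition is illustrative.
    """
    rng = random.Random(seed)
    observed = statistics.stdev(disease) - statistics.stdev(normal)
    pooled = list(normal) + list(disease)
    n = len(normal)
    extreme = 0
    for _ in range(n_perm):
        # Shuffling the pooled values reassigns group labels at random,
        # which builds the empirical null distribution of the statistic.
        rng.shuffle(pooled)
        perm_stat = statistics.stdev(pooled[n:]) - statistics.stdev(pooled[:n])
        if abs(perm_stat) >= abs(observed):
            extreme += 1
    # The add-one correction keeps the empirical P-value strictly positive.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical expression values for one gene (not taken from the paper).
normal = [10.0, 10.1, 9.9, 10.05, 9.95, 10.0, 10.1, 9.9, 10.02, 9.98]
disease = [10.0, 14.0, 6.0, 12.0, 8.0, 15.0, 5.0, 11.0, 9.0, 13.0]
observed, p_value = sd_diff_permutation_test(normal, disease, n_perm=2000)
```

Because the null distribution is built from the data, this kind of test does not rely on a normality assumption, at the cost of more computation per gene.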
Summary of the microarray dataset used
| Dataset | Disease | n | m | Probes | Platform | \|DV\| | \|DE\| | \|DV ∩ DE\| |
|---|---|---|---|---|---|---|---|---|
| Stearman | Lung adenocarcinoma | 19 | 20 | 12 625 | HG-U95Av2 | 1292 | 4668 | 854 |
| Haslett | Duchenne muscular dystrophy | 12 | 12 | 12 625 | HG-U95Av2 | 12 | 1567 | 12 |
| Hong | Colorectal cancer | 10 | 12 | 54 675 | HG-U133 Plus 2.0 | 35 | 5118 | 27 |
| CardioGenomics | Dilated cardiomyopathy | 14 | 27 | 54 675 | HG-U133 Plus 2.0 | 248 | 10 532 | 126 |
All datasets were generated from Affymetrix arrays. n is the number of arrays from non-diseased samples and m is the number of arrays from diseased samples. |DV|, |DE| and |DV ∩ DE| represent the number of DV genes, DE genes and genes that are both DE and DV, respectively.
Comparison of the performance of differential variability detection methods using simulated datasets
| DV | Distribution | SD Diff, perm.(%) | |||||||
|---|---|---|---|---|---|---|---|---|---|
| No | Normal | 10 | 6.33 | ||||||
| No | Normal, 1 outlier | 44.33 | 9 | 6.33 | 1.33 | ||||
| No | Uniform | 8.33 | 2 | 1.33 | 1.67 | 1.33 | |||
| No | Gamma | 3 | 13.33 | 6.33 | 2 | 1.33 | |||
| Yes | Normal | 81 | 97 | 87.33 | 87 | ||||
| Yes | Normal, 1 outlier | 78.67 | 78.67 | 55.33 | 55.67 | 76 | 78.67 | ||
| Yes | Uniform | 95.33 | 97.33 | 87.33 | 81.33 | ||||
| Yes | Gamma | 95 | 49.67 | 49.67 | 38.67 | 89.33 | 90.67 | 14 | 4.67 |
The values represent the percentage of 300 genes that were identified as differentially variable (significance level 0.01). All results that have low false-positive rates (<1%) or high true-positive rates (>99%) are shown in bold.
Concordance of the 200 most highly ranked DV genes (genes with the lowest P-values) from the Stearman et al. (2005) dataset preprocessed by five different preprocessing methods
| | DFW | FARMS | RMA | GCRMA |
|---|---|---|---|---|
| MAS 5.0 | 0.460 | 0.450 | 0.395 | 0.295 |
| GCRMA | 0.395 | 0.460 | 0.515 | |
| RMA | 0.650 | 0.680 | | |
| FARMS | 0.695 | | | |
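The record does not define the concordance measure. Every value in the table is a multiple of 1/200, which is consistent with the fraction of genes shared between two top-200 lists. A minimal sketch under that assumption follows; the function name `top_k_concordance` and the toy P-values are hypothetical.

```python
def top_k_concordance(p_values_a, p_values_b, k=200):
    """Fraction of genes shared by the k lowest-P-value genes of two rankings.

    Assumption: concordance = |top_k(A) ∩ top_k(B)| / k. The source does not
    state the formula, so this is one plausible reading.
    """
    top_a = set(sorted(p_values_a, key=p_values_a.get)[:k])
    top_b = set(sorted(p_values_b, key=p_values_b.get)[:k])
    return len(top_a & top_b) / k


# Toy P-values for four genes under two hypothetical preprocessing methods.
method_a = {"g1": 0.001, "g2": 0.01, "g3": 0.2, "g4": 0.5}
method_b = {"g1": 0.3, "g2": 0.002, "g3": 0.001, "g4": 0.9}
print(top_k_concordance(method_a, method_b, k=2))  # 0.5
```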
Distribution of genes with significant (P ≤ 0.05) increasing (inc.), decreasing (dec.) or non-significant (n.c.) DE or DV in the four human disease datasets
| Dataset | DE | DV dec. | DV n.c. | DV inc. |
|---|---|---|---|---|
| (a) Stearman | dec. | 18 | 1815 | 259 |
| | n.c. | 10 | 7519 | 428 |
| | inc. | 1 | 1999 | 576 |
| (b) Haslett | dec. | 0 | 677 | 0 |
| | n.c. | 0 | 11 058 | 0 |
| | inc. | 0 | 878 | 12 |
| (c) Hong | dec. | 1 | 2792 | 9 |
| | n.c. | 0 | 49 549 | 8 |
| | inc. | 0 | 2299 | 17 |
| (d) CardioGenomics | dec. | 16 | 5009 | 7 |
| | n.c. | 44 | 44 021 | 78 |
| | inc. | 1 | 5397 | 102 |
Relationship between DE, DV and DC
| Dataset | Pattern | Normal neg. | Normal pos. | Disease neg. | Disease pos. |
|---|---|---|---|---|---|
| Stearman | DV dec. | 0 | 149 | 1 | 16 |
| | DV inc. | 0 | 2 | 49 | 183 |
| | DE dec. | 0 | 38 | 0 | 512 |
| | DE inc. | 1 | 11 | 0 | 58 |
| Haslett | DV dec. | 49 | 111 | 19 | 33 |
| | DV inc. | 19 | 21 | 140 | 382 |
| | DE dec. | 11 | 39 | 25 | 76 |
| | DE inc. | 6 | 144 | 2 | 379 |
| Hong | DV dec. | 44 | 56 | 29 | 25 |
| | DV inc. | 65 | 86 | 324 | 4356 |
| | DE dec. | 35 | 176 | 1 | 1528 |
| | DE inc. | 17 | 404 | 0 | 290 |
| CardioGenomics | DV dec. | 78 | 484 | 0 | 1 |
| | DV inc. | 6 | 82 | 706 | 12207 |
| | DE dec. | 1 | 68 | 0 | 761 |
| | DE inc. | 5 | 32 | 0 | 43 |
The top ranking 200 genes with increasing/decreasing DV/DE are tested for DC. neg. = negatively coexpressed (r < −0.85), pos. = positively coexpressed (r > 0.85).
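The counts in this table can be read as numbers of gene pairs whose correlation crosses the ±0.85 thresholds. A minimal sketch of that tally follows, assuming r is the Pearson correlation coefficient (the record gives only the symbol r). The function names and the toy profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def count_coexpressed_pairs(profiles, threshold=0.85):
    """Return (negative, positive) counts of gene pairs with |r| above threshold."""
    neg = pos = 0
    for gene_a, gene_b in combinations(profiles, 2):
        r = pearson(profiles[gene_a], profiles[gene_b])
        if r > threshold:
            pos += 1
        elif r < -threshold:
            neg += 1
    return neg, pos


# Toy expression profiles across four samples (not taken from the paper).
profiles = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "c": [4, 3, 2, 1],
    "d": [1, -1, 1, -1],
}
print(count_coexpressed_pairs(profiles))  # (2, 1)
```

With 200 genes there are 19 900 distinct pairs, so each count in the table is bounded by that number within a single group of samples.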
Fig. 2. Distribution of pairwise correlation coefficients among the 200 top-ranking increasing/decreasing DV and DE genes in the CardioGenomics dataset. There is a marked increase in coexpression in profiles with higher variability.
Some significant DV genes discovered in the four human disease datasets
| Dataset | Gene | Description | DV adjusted P-value | DE adjusted P-value |
|---|---|---|---|---|
| Stearman | IL1RL1* | Interleukin 1 receptor-like 1 | 0.000220 | 9.64E-05 ↘ |
| | IL6* | Interleukin 6 (interferon, β 2) | 0.004523 | 0.001240 ↘ |
| | IL8RA* | Interleukin 8 receptor, α | 0.006749 | 0.000307 ↘ |
| | STARD7* | START domain containing 7 | 0.020005 | 0.770919 |
| | JUNB* | Jun B proto-oncogene | 0.048983 | 0.002005 ↘ |
| | ADCY9 | Adenylate cyclase 9 | 8.45E-07 | 0.002343 ↘ |
| | IFI16 | Interferon, γ-inducible protein 16 | 0.000117 | 0.665773 |
| | IGF2 | Insulin-like growth factor 2 (somatomedin A) | 0.020126 | 0.792114 |
| | MTSS1 | Metastasis suppressor | 0.015878 | 0.550911 |
| Haslett | SPP1 | Secreted phosphoprotein 1 | 4.15E-05 | 0.002375 ↗ |
| | PLA2G2A | Phospholipase A2, group IIA (platelets, synovial fluid) | 0.000546 | 0.003778 ↗ |
| | TIMP1 | TIMP metallopeptidase inhibitor | 0.019086 | 0.000242 ↗ |
| | PDIA3 | Protein disulfide isomerase family A, member 3 | 0.030963 | 0.008129 ↗ |
| | FRZB | Frizzled-related protein | 0.030963 | 0.006435 ↗ |
| | MYL4 | Myosin, light chain 4, alkali; atrial, embryonic | 0.043187 | 0.002082 ↗ |
| Hong | G6PC* | Glucose-6-phosphatase, catalytic subunit | 0.046837 | 0.014506 ↘ |
| | FOSB | FBJ murine osteosarcoma viral oncogene homolog B | 0.000436 | 0.001841 ↗ |
| | CYR61 | Cysteine-rich, angiogenic inducer, 61 | 0.000436 | 0.000151 ↗ |
| | EGR1 | Early growth response 1 | 0.009421 | 0.001968 ↗ |
| | FIGF | c-Fos induced growth factor (vascular endothelial growth factor D) | 0.009421 | 0.067034 |
| | MCAM | Melanoma cell adhesion molecule | 0.031857 | 0.005266 ↗ |
| CardioGenomics | LIMS1* | LIM and senescent cell antigen-like domains 1 | 0.005508 | 0.011308 ↘ |
| | MCM4* | Minichromosome maintenance complex component 4 | 0.008197 | 0.040694 ↗ |
| | SMAD3* | SMAD family member 3 | 0.009093 | 0.079661 |
| | EPHB4 | EPH receptor B4 | 0.000965 | 0.243504 |
| | TRPC4 | Transient receptor potential cation channel, subfamily C, member 4 | 0.005356 | 0.057685 |
| | ZBP1 | Z-DNA-binding protein 1 | 0.033411 | 0.111145 |
These DV genes were selected based on biological relevance to the disease under consideration. Genes marked with an asterisk have decreased expression variability in diseased patients, while unmarked genes have increased variability. The adjusted P-values for DV and DE are shown. Significant up- and downregulation are marked next to the DE P-value by an up-arrow and a down-arrow, respectively.
Fig. 3. Some typical genes with statistically significant DV in the Stearman dataset. Expression values are sorted within each group, independently of other genes, to better visualize the variability among samples. IL1RL1, IL6 and STARD7 are examples of genes with decreased variability in lung cancer patients. ADCY9, IFI16 and IGF2 are examples of genes with increased variability in lung cancer patients.
Fig. 4. Power curves for the F-test at different significance levels (sig.) and sample sizes. The curves assume that the true population variance of the ‘disease’ samples is five times higher than that of the ‘normal’ group.
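The figure itself is not reproduced in this record. As a rough illustration of what such a power curve measures, the sketch below estimates the power of a two-sided variance-ratio test by Monte Carlo simulation, reading "five times higher" as a variance ratio of 5. This is a simulation-based approximation under normality, not the analytic F-distribution calculation presumably behind the figure, and `simulate_f_test_power` and its parameters are hypothetical names.

```python
import random
import statistics


def simulate_f_test_power(n, m, variance_ratio=5.0, alpha=0.05,
                          n_sim=4000, seed=0):
    """Monte Carlo estimate of the power of a two-sided variance-ratio test.

    n and m are the numbers of 'normal' and 'disease' samples. Critical
    values are taken from a simulated null distribution rather than from
    F tables, so the result is approximate.
    """
    rng = random.Random(seed)

    def variance_ratio_stat(sd_disease):
        normal = [rng.gauss(0.0, 1.0) for _ in range(n)]
        disease = [rng.gauss(0.0, sd_disease) for _ in range(m)]
        return statistics.variance(disease) / statistics.variance(normal)

    # Empirical critical values under the null (equal variances).
    null_stats = sorted(variance_ratio_stat(1.0) for _ in range(n_sim))
    lower = null_stats[int(n_sim * alpha / 2)]
    upper = null_stats[int(n_sim * (1 - alpha / 2)) - 1]

    # Fraction of simulated alternative datasets that fall in the rejection region.
    sd_alt = variance_ratio ** 0.5
    rejected = sum(
        1 for _ in range(n_sim)
        if not lower <= variance_ratio_stat(sd_alt) <= upper
    )
    return rejected / n_sim


# Power at two hypothetical sample sizes; larger groups give higher power.
power_small = simulate_f_test_power(5, 5, n_sim=2000)
power_large = simulate_f_test_power(30, 30, n_sim=2000)
```

Evaluating the function over a grid of sample sizes and significance levels yields curves of the kind the caption describes.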