Joel T Dudley, Robert Tibshirani, Tarangini Deshpande, Atul J Butte.
Abstract
Meta-analyses combining gene expression microarray experiments offer new insights into the molecular pathophysiology of disease not evident from individual experiments. Although the established technical reproducibility of microarrays serves as a basis for meta-analysis, pathophysiological reproducibility across experiments is not well established. In this study, we carried out a large-scale analysis of disease-associated experiments obtained from NCBI GEO, and evaluated their concordance across a broad range of diseases and tissue types. On evaluating 429 experiments, representing 238 diseases and 122 tissues from 8435 microarrays, we find evidence for a general, pathophysiological concordance between experiments measuring the same disease condition. Furthermore, we find that the molecular signature of disease across tissues is overall more prominent than the signature of tissue expression across diseases. The results offer new insight into the quality of public microarray data using pathophysiological metrics, and support new directions in meta-analysis that include characterization of the commonalities of disease irrespective of tissue, as well as the creation of multi-tissue systems models of disease pathology using public data.
Year: 2009 PMID: 19756046 PMCID: PMC2758720 DOI: 10.1038/msb.2009.66
Source DB: PubMed Journal: Mol Syst Biol ISSN: 1744-4292 Impact factor: 11.429
Figure 1. Boxplots comparing the distributions of the Fisher's z transformed correlation coefficients across the four disease/tissue categories. The boxplot in (A) shows a pipeline routing using parameters (NoNorm/NoCollapse/NoAggregate/SubtractiveDiff), which resulted in a significant separation of the same disease, different tissue category (D+/T−) from the different disease, same tissue category (D−/T+). The boxplot in (B) shows a pipeline routing using parameters (NoNorm/NoCollapse/NoAggregate/TtestDiff), which resulted in a distribution similar to that produced by the randomized data using the same pipeline routing parameters.
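The Fisher's z transformation mentioned in the Figure 1 caption maps a correlation coefficient r to z = atanh(r) = 0.5 * ln((1 + r) / (1 - r)), which makes the sampling distribution of correlations closer to normal and so easier to compare across categories. The following is a minimal Python sketch of that standard formula only; the function names and the example values are hypothetical and are not taken from the study's pipeline.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's z transformation of a correlation coefficient r.

    z = atanh(r) = 0.5 * ln((1 + r) / (1 - r)), defined for -1 < r < 1.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.atanh(r)


def inverse_fisher_z(z: float) -> float:
    """Map a z value back to the correlation scale: r = tanh(z)."""
    return math.tanh(z)


# Hypothetical correlation values for illustration only (not study data).
example_rs = [-0.5, 0.0, 0.3, 0.9]
example_zs = [fisher_z(r) for r in example_rs]
```

Values near zero are left almost unchanged, while values near ±1 are stretched, so distributions of transformed correlations can be summarized with boxplots without the compression that occurs at the ends of the raw correlation scale.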
Figure 2. An aggregate view of the median correlation across the four disease/tissue categories for all 84 possible pipeline routings. Vertical black bars represent the s.e.m. of the correlation. The colored lines connect disease/tissue category medians computed using the same pipeline routing. Although certain pipeline routings perform better than others at establishing disease concordance, we observe a general trend indicating that the disease signal is stronger than the tissue signal regardless of the analytical methods used.
Figure 3. Symmetry of disease-state gene expression for the same disease in different tissues (D+/T−) versus different diseases in the same tissue (D−/T+). The colors indicate the direction of change in the expression of a gene in the disease state relative to the normal control state: green indicates upregulation in disease, and red indicates downregulation in disease. Here we observe that the differential expression concordance between Huntington's disease in the brain (GDS2169) and blood (GDS1331) is much more extensive than that observed between type 2 diabetes (GDS162) and Duchenne's muscular dystrophy (GDS214) in skeletal muscle.