Shicheng Wu, Yawen Xu, Zeny Feng, Xiaojian Yang, Xiaogang Wang, Xin Gao.
Abstract
BACKGROUND: It is desirable in genomic studies to select biomarkers that differentiate between normal and diseased populations based on related data sets from different platforms, including microarray expression and proteomic data. Most recently developed integration methods focus on correlation analyses between gene and protein expression profiles. The correlation methods select biomarkers with concordant behavior across two platforms but do not directly select differentially expressed biomarkers. Other integration methods have been proposed to combine statistical evidence in terms of ranks and p-values, but they do not account for the dependency relationships among the data across platforms.
Year: 2012 PMID: 23198695 PMCID: PMC3770449 DOI: 10.1186/1471-2105-13-320
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
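As an illustration of the p-value-combination approaches the abstract refers to, the sketch below implements Fisher's method, one well-known way to pool per-platform p-values for a single biomarker. The abstract does not name the specific methods it has in mind, so the choice of Fisher's method and the function name `fisher_combine` are assumptions made here for illustration. Fisher's method assumes the p-values are independent, which is the kind of limitation the abstract points out for data measured on related platforms.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (illustrative sketch).

    Hypothetical helper, not taken from the paper. Under the null hypothesis,
    X = -2 * sum(log p_i) follows a chi-square distribution with 2k degrees
    of freedom, where k is the number of p-values.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # equals X / 2

    # For even degrees of freedom 2k the chi-square survival function has a
    # closed form: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Example: one biomarker with a p-value from each of two platforms.
combined = fisher_combine([0.05, 0.05])
```

Two moderately small p-values combine into a smaller one here, but that gain is only justified when the platforms are independent; with correlated platforms the combined value would overstate the evidence.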
The simulation settings and results for two platforms with continuous data
| | | | | |
|---|---|---|---|---|
| Scenario 1: | ||||
| Right-side | Experiment1: | e = 0.5 for | ||
| | Experiment2: | e = 1.5 for | ||
| | 0.7895 | 0.5372 | 0.5588 | |
| | 0.0007 | 0.0007 | 0.0010 | |
| | 0.1907 | 0.2680 | 0.2600 | |
| | 0.0007 | 0.0013 | 0.0009 | |
| Left-side | Experiment1: | e = -0.5 for | ||
| | Experiment2: | e = -1.5 for | ||
| | 0.7908 | 0.5330 | 0.5556 | |
| | 0.0006 | 0.0006 | 0.0012 | |
| | 0.1891 | 0.2673 | 0.2649 | |
| | 0.0006 | 0.0009 | 0.0011 | |
| Two-sided | Experiment1: | e = -1 for | ||
| | Experiment2: | e = 2 for | ||
| | 0.6988 | 0.4113 | 0.5403 | |
| | 0.0011 | 0.0011 | 0.0010 | |
| | 0.2145 | 0.3202 | 0.2694 | |
| | 0.0007 | 0.0016 | 0.0012 | |
| Scenario 2: | ||||
| Right-side | Experiment1: | e = 0.5 for | ||
| | Experiment2: | e = 1.5 for | ||
| | 0.9405 | 0.6319 | 0.7819 | |
| | 0.0003 | 0.0005 | 0.0007 | |
| | 0.1560 | 0.2410 | 0.2051 | |
| | 0.0005 | 0.0009 | 0.0007 | |
| Left-side | Experiment1: | e = -0.5 for | ||
| | Experiment2: | e = -1.5 for | ||
| | 0.9400 | 0.6316 | 0.7871 | |
| | 0.0002 | 0.0004 | 0.0006 | |
| | 0.1605 | 0.2419 | 0.2024 | |
| | 0.0005 | 0.0007 | 0.0006 | |
| Two-sided | Experiment1: | e = -1 for | ||
| | Experiment2: | e = 2 for | ||
| | 0.9377 | 0.6670 | 0.7327 | |
| | 0.0003 | 0.0010 | 0.0007 | |
| | 0.1622 | 0.2270 | 0.2122 | |
| | 0.0005 | 0.0009 | 0.0007 | |
The simulation settings and results for five platforms with continuous data
| Scenario 1: | ||||||
| | Exp1: | e = 1.5 for g = 200 | ||||
| | Exp2: | e = 1.5 for | ||||
| | Exp3: | e = -0.5 for | ||||
| | Exp4: | e = -1 for | ||||
| | Exp5: | e = 2 for | ||||
| 0.9517 | 0.5601 | 0.4130 | 0.4464 | 0.4213 | 0.4471 | |
| 0.0002 | 0.0012 | 0.0011 | 0.0004 | 0.0010 | 0.0005 | |
| 0.1572 | 0.2605 | 0.3299 | 0.3108 | 0.3205 | 0.2727 | |
| 0.0004 | 0.0011 | 0.0018 | 0.0009 | 0.0010 | 0.0010 | |
| Scenario 2: | ||||||
| | Exp1: | e = 1.5 for g = 200 | ||||
| | Exp2: | e = 1.5 for | ||||
| | Exp3: | e = -0.5 for | ||||
| | Exp4: | e = -1 for | ||||
| | Exp5: | e = 2 for | ||||
| 0.9998 | 0.8360 | 0.6655 | 0.5682 | 0.6712 | 0.5699 | |
| 2.7e-06 | 0.0006 | 0.0010 | 0.0004 | 0.0010 | 0.0008 | |
| 0.1281 | 0.1898 | 0.2217 | 0.2593 | 0.2314 | 0.2093 | |
| 0.0004 | 0.0006 | 0.0009 | 0.0007 | 0.0007 | 0.0008 | |
The simulation settings and results for two platforms with continuous data and discrete data
| | | ||
|---|---|---|---|
| Experiment1: | Continuous; | ||
| Experiment2: | Discrete; | ||
| 0.7356 | 0.5327 | 0.5228 | |
| 0.0008 | 0.0004 | 0.0012 | |
| 0.1967 | 0.2702 | 0.2763 | |
| 0.0008 | 0.0012 | 0.0012 | |
Figure 1. Decision lines for comparing methods. Vertical lines use data from the first individual platform, horizontal lines use data from the second individual platform, and dashed lines use our multi-platform integration method. Circles represent non-differentially expressed biomarkers and triangles represent differentially expressed biomarkers. Plots are based on one simulated data set and 100 permutations.
True positives and false discovery rates with π = 0.8
| | ||||
| multi-platform | 224 | 165 | 143 | |
| | ( | 6.5547 | 6.0820 | 5.5202 |
| | 44.8125 | 8.0250 | 3.8375 | |
| | ( | 7.3348 | 3.4778 | 2.263 |
| | 0.1563 | 0.0386 | 0.0214 | |
| | ( | 0.0219 | 0.0161 | 0.0125 |
| | 0.1428 | 0.0388 | 0.0225 | |
| | ( | 0.0041 | 0.0014 | 0.0009 |
| 1st individual | 165 | 107 | 91 | |
| | ( | 8.8797 | 5.3066 | 4.9031 |
| | 50.5125 | 9.9000 | 4.6500 | |
| | ( | 8.9101 | 3.4982 | 2.1766 |
| | 0.2431 | 0.0736 | 0.0406 | |
| | ( | 0.0326 | 0.0246 | 0.0183 |
| | 0.1940 | 0.0600 | 0.0353 | |
| | ( | 0.0103 | 0.0030 | 0.0019 |
| 2nd individual | 197 | 106 | 79 | |
| | ( | 7.2442 | 8.2303 | 6.3222 |
| | 48.9250 | 9.6000 | 5.000 | |
| | ( | 7.1862 | 3.5750 | 2.5376 |
| | 0.1986 | 0.0721 | 0.0506 | |
| | ( | 0.0245 | 0.0258 | 0.0251 |
| | 0.1630 | 0.0607 | 0.0408 | |
| | ( | 0.0060 | 0.0048 | 0.0033 |
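For readers who want to compute summaries of this kind on their own simulations, the following sketch counts true positives and the empirical false discovery rate for one selected set of biomarkers. It assumes the usual definitions (FDR as the fraction of selected biomarkers that are not truly differentially expressed); the row labels of the table did not survive in this record, so the exact quantities tabulated there are not confirmed. The function and variable names are hypothetical.

```python
def tp_and_fdr(selected, truly_de):
    """Summarise one selection against a known ground truth (illustrative).

    selected : iterable of biomarker IDs flagged by a method
    truly_de : iterable of biomarker IDs that are truly differentially expressed
    Returns (true positives, false positives, true positive rate, empirical FDR).
    """
    selected = set(selected)
    truly_de = set(truly_de)

    tp = len(selected & truly_de)  # correctly flagged biomarkers
    fp = len(selected - truly_de)  # flagged but not truly differential

    # Conventions for empty sets: report 0.0 instead of dividing by zero.
    fdr = fp / len(selected) if selected else 0.0
    tp_rate = tp / len(truly_de) if truly_de else 0.0
    return tp, fp, tp_rate, fdr


# Example with made-up IDs: four selected, three truly differential.
tp, fp, tp_rate, fdr = tp_and_fdr({"g1", "g2", "g3", "g4"}, {"g1", "g2", "g5"})
```

In a simulation study these per-data-set values would typically be averaged over replicates, with a standard error reported alongside each mean.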
Figure 2. Decision lines for real data. Vertical lines use the mRNA data, horizontal lines use the protein data, and dashed lines use our multi-platform integration method.
SCO summaries for the 9 genes identified by the multi-platform integration method but not by individual-platform analysis
| SCO1958 | uvrA | ABC excision | Macromolecule | DNA-replication, | excinuclease ABC, | [ |
| | | nuclease subunit A | metabolism | repair, restr./modific’n | A subunit | [ |
| SCO2940 | other | putative | Not classified | Not classified | xanthine | |
| | | oxidoreductase | (included putative | (included putative | dehydrogenase, | |
| | | | assignments) | assignments) | putative | |
| SCO2951 | other | putative malate | Central intermediary | Other central | malate | [ |
| | | oxidoreductase | metabolisms | intermediary metabolism | oxidoreductase | |
| SCO3094 | other | conserved | hypothetical | Conserved in | conserved | |
| | | hypothetical | protein | organism other than | hypothetical | |
| | | protein | protein | Escherichia coli | protein | |
| SCO4661 | fusA | elongation | Macromolecule | Proteins - | translation | [ |
| | | factor G | metabolism | translation and | elongation | |
| | | | | modification | factor G | |
| SCO5072 | actVIORF1 | hydroxylacyl-CoA | Secondary | PKS | hydroxylacyl-CoA | [ |
| | | dehydrogenase | metabolism | PKS | dehydrogenase | |
| SCO5080 | actVA5 | putative | Secondary | PKS | putative | [ |
| | | hydrolase | metabolism | PKS | hydrolase | |
| SCO6219 | Other | putative ATP/GTP | Protein | Serine/ | | [ |
| | | binding protein, | kinases | threonine | | |
| | | putative serine | | | | |
| SCO6222 | other | putative | Not classified | Not classified | aminotransferase, | [ |
| | | aminotransferase | (included putative | (included putative | class I | |
| | | | assignments) | assignments) | | |
Additional simulations
| Scenario 1: | Extremely small sample size | ||
| | two measurements from each group | ||
| 0.3022 | 0.2363 | 0.2179 | |
| 0.0009 | 0.0006 | 0.0007 | |
| 0.3782 | 0.4436 | 0.4694 | |
| 0.0023 | 0.0025 | 0.0027 | |
| Scenario 2: | Correlation among platforms set to 0.5 | ||
| | Disease and normal groups are independent | ||
| 0.6689 | 0.5365 | 0.5578 | |
| 0.0009 | 0.0008 | 0.0011 | |
| 0.2255 | 0.2690 | 0.2641 | |
| 0.0008 | 0.0010 | 0.0010 | |
| Scenario 3: | Non-standardized version of | ||
| | i.e. | ||
| 0.8142 | 0.5479 | 0.5992 | |
| 0.0009 | 0.0005 | 0.0010 | |
| 0.1586 | 0.2358 | 0.2235 | |
| 0.0006 | 0.0011 | 0.0010 | |
Comparison with the quadratic test statistic t
| 0.9377 | 0.9155 | |
| 0.0003 | 0.0004 | |
| 0.1622 | 0.1804 | |
| 0.0005 | 0.0005 | |
| Quadratic: | Exp1: | e = -1 for |
| | Exp2: | e = 2 for |
Comparison with Robust Rank Aggregation Method
| 1. | ||||
| | Exp1: e = 1.5 for g = 200 | 1.000 | 0.7497 | |
| | Exp2: e = 1.5 for | 1.98e-6 | 0.0012 | |
| | Exp3: e = -0.5 for | 0.2803 | 0.0912 | |
| | Exp4: e = -1 for | 0.0011 | 0.0003 | |
| | Exp5: e = 2 for | | | |
| 2. | ||||
| | Exp1: e = 1.5 for g = 100 | 0.9995 | 0.4995 | |
| | Exp2: e = 1.5 for | 0.23e-06 | 0.0008 | |
| | Exp3: e = -0.5 for | 0.1399 | 0.0823 | |
| | Exp4: e = -1 for | 0.0004 | 0.0004 | |
| | Exp5: e = 2 for | | | |
| 3. | ||||
| | Exp1: e = 1.5 for g = 100 | 0.9992 | 0.1133 | |
| | Exp2: e = 1.5 for | 2.23e-6 | 0.0002 | |
| | Exp3: e = -0.5 for | 0.0402 | 0.0796 | |
| | Exp4: e = -1 for | 0.0001 | 0.0015 | |
| | Exp5: e = 2 for | | | |