Rui Song, Jian Huang, Shuangge Ma.
Abstract
BACKGROUND: In high-throughput cancer genomic studies, results from the analysis of single datasets often suffer from a lack of reproducibility because of small sample sizes. Integrative analysis can effectively pool and analyze multiple datasets and provides a cost-effective way to improve reproducibility. In integrative analysis, simultaneously analyzing all profiled genes may incur a high computational cost. A computationally affordable remedy is prescreening, which fits marginal models, can be conducted in parallel, and has a low computational cost.
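As a rough illustration of what marginal prescreening across several datasets can look like, here is a minimal Python sketch. The statistic (a sum of squared Welch-type two-sample statistics across datasets), the function names `marginal_scores` and `integrative_prescreen`, and the simulated data are all assumptions made for this example; they are not taken from the paper and may differ from the authors' procedure.

```python
import numpy as np

def marginal_scores(X, y):
    """Welch-type two-sample statistic for each gene (column of X).

    Each gene is scored on its own (a marginal model), so the genes could be
    split across workers; here NumPy vectorizes over all columns at once.
    """
    cases, controls = X[y == 1], X[y == 0]
    se = np.sqrt(cases.var(axis=0, ddof=1) / len(cases)
                 + controls.var(axis=0, ddof=1) / len(controls))
    return (cases.mean(axis=0) - controls.mean(axis=0)) / se

def integrative_prescreen(datasets, n_keep):
    """Rank genes by the sum of squared marginal statistics over datasets.

    This pooling rule is an illustrative assumption, not the paper's statistic.
    """
    combined = sum(marginal_scores(X, y) ** 2 for X, y in datasets)
    return np.argsort(combined)[::-1][:n_keep]

# Hypothetical data: 4 datasets, 15 subjects per group, 200 genes,
# with genes 0..4 shifted in the case group of every dataset.
rng = np.random.default_rng(0)
datasets = []
for _ in range(4):
    y = np.repeat([0, 1], 15)
    X = rng.normal(size=(30, 200))
    X[y == 1, :5] += 3.0
    datasets.append((X, y))

selected = integrative_prescreen(datasets, n_keep=5)
print(sorted(selected.tolist()))
```

Because every gene is scored independently within each dataset, the work parallelizes trivially over genes and over datasets; only the final ranking needs the pooled scores.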
Year: 2012 PMID: 22799431 PMCID: PMC3436748 DOI: 10.1186/1471-2105-13-168
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Simulation: summary based on 1000 replicates
| | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 200 | 0.3 | 0.3 | 19 | 55 | 29 | 40 | 58 | 73 | 2200 | 78 | |
| 15 | 200 | 0.6 | 0.3 | 48 | 88 | 90 | 97 | 99 | 99 | 600 | 98 | |
| 15 | 200 | 0.3 | 0.6 | 20 | 56 | 30 | 43 | 59 | 74 | 1800 | 72 | |
| 15 | 200 | 0.6 | 0.6 | 50 | 94 | 91 | 98 | 98 | 100 | 400 | 98 | |
| 15 | 200 | 0.3 | 0.9 | 19 | 55 | 30 | 44 | 59 | 75 | 1900 | 72 | |
| 15 | 200 | 0.6 | 0.9 | 51 | 94 | 90 | 100 | 100 | 100 | 300 | 100 | |
| 15 | 400 | 0.3 | 0.3 | 19 | 48 | 31 | 43 | 58 | 74 | 3400 | 73 | |
| 15 | 400 | 0.6 | 0.3 | 54 | 87 | 89 | 97 | 97 | 97 | 500 | 94 | |
| 15 | 400 | 0.3 | 0.6 | 21 | 58 | 32 | 43 | 60 | 76 | 3700 | 75 | |
| 15 | 400 | 0.6 | 0.6 | 53 | 93 | 92 | 98 | 100 | 100 | 500 | 97 | |
| 15 | 400 | 0.3 | 0.9 | 20 | 57 | 30 | 45 | 61 | 78 | 3500 | 74 | |
| 15 | 400 | 0.6 | 0.9 | 53 | 95 | 93 | 98 | 99 | 100 | 700 | 98 | |
| 30 | 200 | 0.3 | 0.3 | 30 | 71 | 49 | 59 | 73 | 84 | 2100 | 86 | |
| 30 | 200 | 0.6 | 0.3 | 68 | 92 | 96 | 99 | 100 | 100 | 400 | 99 | |
| 30 | 200 | 0.3 | 0.6 | 31 | 72 | 50 | 60 | 74 | 86 | 1900 | 85 | |
| 30 | 200 | 0.6 | 0.6 | 69 | 96 | 96 | 99 | 100 | 100 | 400 | 99 | |
| 30 | 200 | 0.3 | 0.9 | 30 | 72 | 50 | 60 | 73 | 85 | 1900 | 84 | |
| 30 | 200 | 0.6 | 0.9 | 69 | 96 | 97 | 100 | 100 | 100 | 700 | 100 | |
| 30 | 400 | 0.3 | 0.3 | 31 | 68 | 50 | 62 | 74 | 85 | 3700 | 80 | |
| 30 | 400 | 0.6 | 0.3 | 70 | 93 | 96 | 99 | 100 | 100 | 400 | 97 | |
| 30 | 400 | 0.3 | 0.6 | 31 | 72 | 50 | 62 | 74 | 85 | 3600 | 82 | |
| 30 | 400 | 0.6 | 0.6 | 71 | 94 | 97 | 99 | 100 | 100 | 400 | 98 | |
| 30 | 400 | 0.3 | 0.9 | 32 | 73 | 49 | 62 | 75 | 87 | 3500 | 84 | |
| 30 | 400 | 0.6 | 0.9 | 71 | 96 | 97 | 100 | 100 | 100 | 600 | 99 | |
| 60 | 200 | 0.3 | 0.3 | 50 | 90 | 80 | 90 | 95 | 98 | 800 | 93 | |
| 60 | 200 | 0.6 | 0.3 | 92 | 96 | 100 | 100 | 100 | 100 | 900 | 100 | |
| 60 | 200 | 0.3 | 0.6 | 50 | 91 | 79 | 89 | 95 | 98 | 800 | 95 | |
| 60 | 200 | 0.6 | 0.6 | 92 | 96 | 100 | 100 | 100 | 100 | 1100 | 100 | |
| 60 | 200 | 0.3 | 0.9 | 50 | 91 | 80 | 90 | 96 | 99 | 2000 | 99 | |
| 60 | 200 | 0.6 | 0.9 | 92 | 96 | 100 | 100 | 100 | 100 | 2100 | 100 | |
| 60 | 400 | 0.3 | 0.3 | 50 | 92 | 80 | 91 | 95 | 98 | 800 | 88 | |
| 60 | 400 | 0.6 | 0.3 | 92 | 96 | 100 | 100 | 100 | 100 | 800 | 100 | |
| 60 | 400 | 0.3 | 0.6 | 49 | 89 | 80 | 90 | 95 | 98 | 800 | 89 | |
| 60 | 400 | 0.6 | 0.6 | 92 | 93 | 100 | 100 | 100 | 100 | 1000 | 100 | |
| 60 | 400 | 0.3 | 0.9 | 48 | 92 | 81 | 91 | 96 | 99 | 1400 | 93 | |
| 60 | 400 | 0.6 | 0.9 | 92 | 96 | 100 | 100 | 100 | 100 | 1600 | 100 |
Figure 1. Operating characteristics of different approaches for a simulated dataset. Black lines: prescreening each dataset separately; blue line: meta analysis; light blue line: intensity approach; red line: integrative analysis.
Figure 2. Analysis of pancreatic cancer data (upper panel) and liver cancer data (lower panel): score as a function of the number of genes selected.
Pancreatic cancer studies
| Author | Logsdon | Friess | Iacobuzio-Donahue | Crnogorac-Jurcevic |
|---|---|---|---|---|
| PDAC | 10 | 8 | 9 | 8 |
| N | 5 | 3 | 8 | 5 |
| Array | Affy. HuGeneFL | Affy. HuGeneFL | cDNA Stanford | cDNA Sanger |
| UG | 5521 | 5521 | 29621 | 5794 |
Liver cancer studies
| Experimenter | Hospital A | Hospital B | Hospital C | Hospital C |
|---|---|---|---|---|
| Tumor | 16 (14) | 23 | 29 | 12 (10) |
| Normal | 16 (14) | 23 | 5 | 9 (7) |
| Chip type | cDNA(Ver.1) | cDNA(Ver.1) | cDNA(Ver.1) | cDNA(Ver.2) |
| (Cy5:Cy3) | sample:normal liver | sample:placenta | sample:placenta | sample:sample |
Data analysis: number of overlapped genes selected using different approaches
| | S1 | S2 | S3 | S4 | Intensity | Meta | Integrative |
|---|---|---|---|---|---|---|---|
| S1 | 117 | 10 | 11 | 11 | 20 | 24 | 27 |
| S2 | | 117 | 6 | 23 | 35 | 34 | 36 |
| S3 | | | 117 | 8 | 29 | 31 | 41 |
| S4 | | | | 117 | 37 | 38 | 43 |
| Intensity | | | | | 117 | 94 | 91 |
| Meta | | | | | | 117 | 92 |
| Integrative | | | | | | | 117 |

| | S1 | S2 | S3 | S4 | Intensity | Meta | Integrative |
|---|---|---|---|---|---|---|---|
| S1 | 873 | 120 | 77 | 85 | 245 | 252 | 227 |
| S2 | | 873 | 114 | 107 | 398 | 293 | 223 |
| S3 | | | 873 | 93 | 205 | 253 | 341 |
| S4 | | | | 873 | 167 | 247 | 382 |
| Intensity | | | | | 873 | 372 | 395 |
| Meta | | | | | | 873 | 492 |
| Integrative | | | | | | | 873 |
Analysis of each dataset separately (S1…S4), intensity approach, meta analysis, and integrative prescreening.
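Overlap counts of the kind tabulated above are simply the sizes of pairwise set intersections between the gene lists produced by each approach. The sketch below shows that arithmetic in Python; the function name `overlap_matrix` and the toy gene identifiers are hypothetical and are not data from the study.

```python
def overlap_matrix(selections):
    """Return the upper-triangular overlap counts between named gene lists.

    `selections` maps an approach name to the genes it selected. The result
    maps each ordered pair (a, b), with a listed no later than b, to the
    number of genes the two approaches share.
    """
    names = list(selections)
    sets = {name: set(selections[name]) for name in names}
    return {
        (a, b): len(sets[a] & sets[b])
        for i, a in enumerate(names)
        for b in names[i:]
    }

# Toy example with made-up gene identifiers (illustration only).
toy = {
    "S1": ["g1", "g2", "g3"],
    "Meta": ["g2", "g3", "g4"],
    "Integrative": ["g3", "g4", "g5"],
}
counts = overlap_matrix(toy)
for pair, n in counts.items():
    print(pair, n)
```

Each diagonal entry is the size of one list, so when every approach selects the same number of genes the diagonal is constant, as it is in each block of the table.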