Mulin Jun Li, Pak Chung Sham, Junwen Wang.
Abstract
MOTIVATION: Resampling methods, such as permutation and bootstrap, have been widely used to generate an empirical distribution for assessing the statistical significance of a measurement. However, obtaining a very low P-value requires a very large number of resamples, at which point computing speed, memory and storage consumption become bottlenecks, and the computation can become infeasible even on a computer cluster.
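To make the resolution limit concrete, the following minimal sketch computes a right-tailed empirical P-value from a sorted set of resampled statistics. It is a generic illustration, not the FastPval algorithm, which this record does not describe. The function names, the `+1` correction and the simulated normal null distribution are assumptions chosen for the example.

```python
import bisect
import random


def empirical_p_value(sorted_null, observed):
    """Right-tailed empirical P-value from a sorted list of resampled statistics.

    Assumption: uses the common (count + 1) / (n + 1) convention so that the
    estimate is never exactly zero.
    """
    n = len(sorted_null)
    # bisect_left gives the index of the first value >= observed,
    # so n minus that index counts the resampled values >= observed.
    n_at_least = n - bisect.bisect_left(sorted_null, observed)
    return (n_at_least + 1) / (n + 1)


def smallest_attainable_p(n_resamples):
    """Lowest P-value that n_resamples can resolve under the same convention."""
    return 1 / (n_resamples + 1)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, illustration only
    # Hypothetical null distribution: 100 000 standard normal draws.
    null = sorted(rng.gauss(0.0, 1.0) for _ in range(100_000))
    print(empirical_p_value(null, 3.0))
    print(smallest_attainable_p(len(null)))
```

Under this convention the smallest P-value that can be reported shrinks only in proportion to the number of resamples, and every resampled value has to be stored and searched. That is why very low P-values drive up memory, storage and running time.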
Year: 2010 PMID: 20861029 PMCID: PMC2971576 DOI: 10.1093/bioinformatics/btq540
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Comparison of the FastPval and Exact methods in memory consumption, storage consumption and running time
| Resampling size (×1 000 000) | Memory, Exact (MB) | Memory, FastPval (MB) | Storage, Exact (MB) | Storage, FastPval (MB) (b) | Running time, model building, Exact (s) | Running time, model building, FastPval (s) | Running time, loading + searching, Exact (s) (a) | Running time, loading + searching, FastPval (s) (a) |
|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 0.39 | 12.1 | 1.3 + 0.013 | 1.10 | 1.05 | 0.74 + 2.33 | 0.08 + 1.53 |
| 10 | 38 | 0.39 | 121.4 | 1.3 + 0.131 | 11.21 | 9.29 | 7.61 + 29.88 | 0.09 + 16.07 |
| 100 | 373 | 0.78 | 1200 | 1.3 + 1.3 | 116.73 | 91.46 | 77.13 + 332.13 | 0.14 + 249.44 |
| 500 | 1900 | 2 | 5900 | 1.3 + 6.0 | 677.58 | 455.23 | 380.47 + 1885.12 | 0.40 + 1297.44 |
| 1000 | 3700 | 4 | 11 900 | 1.3 + 11.9 | 1409.61 | 919.45 | 761.52 + 4019.77 | 0.72 + 2530.65 |
| 5000 | N/A (c) | 19 | 59 900 | 1.3 + 59.8 | N/A | 5475.32 | N/A | 3.34 + 12885.87 |

(a) Model loading + searching time.
(b) Sizes of first model + second model.
(c) Exact method failed to load due to large size of the dataset.
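As a worked reading of the table, the sketch below compares the two methods at a resampling size of 1000, which appears to be in units of one million. The figures are copied from that row; the variable names and the `ratio` helper are only illustrative. Summing the two parts of each `x + y` entry is an assumption about how the combined cost should be read.

```python
# Values copied from the table row for resampling size 1000.
exact_memory_mb = 3700
fastpval_memory_mb = 4

exact_storage_mb = 11_900
fastpval_storage_mb = 1.3 + 11.9        # first model + second model

exact_query_s = 761.52 + 4019.77        # model loading + searching
fastpval_query_s = 0.72 + 2530.65       # model loading + searching


def ratio(exact, fastpval):
    """How many times larger the Exact figure is than the FastPval figure."""
    return exact / fastpval


memory_ratio = ratio(exact_memory_mb, fastpval_memory_mb)
storage_ratio = ratio(exact_storage_mb, fastpval_storage_mb)
query_ratio = ratio(exact_query_s, fastpval_query_s)

if __name__ == "__main__":
    print(f"memory:  {memory_ratio:.0f}x")
    print(f"storage: {storage_ratio:.0f}x")
    print(f"loading + searching time: {query_ratio:.2f}x")
```

On these figures the savings are mostly in memory and storage, each differing by roughly three orders of magnitude. Loading plus searching time differs by less than a factor of two.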