George Teodoro1,2, Tahsin M Kurç2,3, Luís F R Taveira1, Alba C M A Melo1, Yi Gao2, Jun Kong4, Joel H Saltz2.
Abstract
Motivation: Sensitivity analysis and parameter tuning are important processes in large-scale image analysis. They are very costly because the image analysis workflows must be executed several times to systematically correlate output variations with parameter changes or to tune parameters. An integrated solution with minimum user interaction that uses effective methodologies and high performance computing is required to scale these studies to large imaging datasets and expensive analysis workflows.
Year: 2017 PMID: 28062445 PMCID: PMC5409344 DOI: 10.1093/bioinformatics/btw749
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Parameters and their value ranges for two example workflows
| Parameter | Description | Range value |
|---|---|---|
| (a) Parameters of the Watershed based segmentation workflow. The parameter search space contains about 21 trillion parameter points | ||
| B/G/R | Background detection thresholds | [210, 220,…, 240] |
| T1/T2 | Red blood cell thresholds | [2.5, 3.0,…, 7.5] |
| G1/G2 | Thresholds to identify candidate nuclei | [5, 10,…, 80], [2, 4,…, 40] |
| MinSize | Area threshold of candidate nuclei | [2, 4,…, 40] |
| MaxSize | Area threshold of candidate nuclei | [900,…, 1500] |
| MinSizePl | Area threshold before watershed | [5, 10,…, 80] |
| MinSizeSeg | Area threshold from final segmentation | [2, 4,…, 40] |
| MaxSizeSeg | Area threshold from final segmentation | [900,…, 1500] |
| FillHoles | Propagation neighborhood | [4-conn, 8-conn] |
| MorphRecon | Propagation neighborhood | [4-conn, 8-conn] |
| Watershed | Propagation neighborhood | [4-conn, 8-conn] |
| (b) Parameters of the Level Set based segmentation workflow. The parameter search space contains about 2.8 billion parameter points | ||
| OTSU | OTSU threshold value | [0.3, 0.4,…, 1.3] |
| Curvature Weight | Curvature weight (CW) for level-set | [0.0, 0.05,…, 1.0] |
| MinSize | Minimum object size | [1, 2,…, 20] |
| MaxSize | Maximum object size | [50, 55,…, 400] |
| MsKernel | Radius in Mean-Shift calculation | [5, 6,…, 30] |
| LevelSetIt | Number of iterations of the level set computation | [5, 6,…, 150] |
Fig. 1. The parameter study framework. A parameter study process (SA or auto-tuning) is selected by an investigator. The analysis workflow is executed on a parallel machine multiple times while input parameters are systematically varied. The analysis results are compared to a set of reference results to compute a new set of parameters. This iterative process continues until the process has converged in the case of the parameter tuning process or collected enough data in the case of the sensitivity analysis process. (Color version of this figure is available at Bioinformatics online.)
Fig. 2. Evaluation of multiple parameter sets. The replica based composition scheme executes independent instances of the same workflow, whereas the compact composition scheme merges the multiple instances to eliminate duplicate computations and data storage.
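
The difference between the two composition schemes in Fig. 2 can be sketched in a few lines. This is only an illustration under stated assumptions, not the framework's implementation: the three-stage workflow, the stage names (`normalize`, `segment`, `filter_objects`), the parameter names and the helper functions are all hypothetical, and the compact scheme is modeled as a cache keyed by a stage's name, its upstream result and its own parameter values.

```python
from itertools import product

# Hypothetical three-stage workflow; each stage reads only its own parameters.
STAGES = [
    ("normalize", ("bg",)),
    ("segment", ("g1", "g2")),
    ("filter_objects", ("min_size",)),
]

def run_stage(name, upstream, values):
    """Stand-in for a real stage: returns a token describing the computation."""
    return (name, upstream, values)

def run_replica(param_sets):
    """Replica scheme: every parameter set runs the whole workflow independently."""
    executed = 0
    results = []
    for params in param_sets:
        data = "image"
        for name, keys in STAGES:
            data = run_stage(name, data, tuple(params[k] for k in keys))
            executed += 1
        results.append(data)
    return results, executed

def run_compact(param_sets):
    """Compact scheme: stage instances with identical input and parameters run once."""
    cache = {}
    results = []
    for params in param_sets:
        data = "image"
        for name, keys in STAGES:
            key = (name, data, tuple(params[k] for k in keys))
            if key not in cache:
                cache[key] = run_stage(*key)
            data = cache[key]
        results.append(data)
    # Each cache entry corresponds to one stage execution.
    return results, len(cache)

# Eight illustrative parameter sets (2 x 2 x 1 x 2 combinations).
param_sets = [
    {"bg": bg, "g1": g1, "g2": g2, "min_size": m}
    for bg, g1, g2, m in product([210, 220], [5, 10], [2], [2, 4])
]
replica_results, replica_runs = run_replica(param_sets)
compact_results, compact_runs = run_compact(param_sets)
print(replica_runs, compact_runs)
```

In this toy setup both schemes return the same results, but the compact scheme executes fewer stage instances because parameter sets that agree on the early stages share those computations.
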
MOAT analysis for the watershed workflow with r values of 5, 10 and 15
We classify in green, yellow and red, respectively, those parameters having high, medium and low effect on the output. (Color version of this table is available at Bioinformatics online.)
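
MOAT (Morris one-at-a-time) screening changes one parameter at a time and summarizes, per parameter, how strongly the output responds. The sketch below is a simplified radial variant rather than the exact Morris trajectory design, and it is not taken from the paper's code: the function `toy_workflow`, the scaling of all inputs to [0, 1], and the step `delta` are assumptions made for illustration. With k parameters and r repetitions it performs r × (k + 1) model evaluations.

```python
import random

def moat_screening(model, k, r, delta=0.25, seed=0):
    """Return the mean absolute elementary effect of each of k inputs in [0, 1]."""
    rng = random.Random(seed)
    totals = [0.0] * k
    evaluations = 0
    for _ in range(r):
        # Base point chosen so that base + delta stays inside [0, 1].
        base = [rng.uniform(0.0, 1.0 - delta) for _ in range(k)]
        y_base = model(base)
        evaluations += 1
        for i in range(k):
            moved = list(base)
            moved[i] += delta  # change one input only
            effect = (model(moved) - y_base) / delta
            totals[i] += abs(effect)
            evaluations += 1
    return [t / r for t in totals], evaluations

def toy_workflow(x):
    """Hypothetical stand-in for a segmentation quality metric."""
    return 5.0 * x[0] + 0.5 * x[1] * x[2] + 0.0 * x[3]

mu_star, n_runs = moat_screening(toy_workflow, k=4, r=10)
print(mu_star, n_runs)
```

A large mean absolute effect marks a parameter as influential, a value near zero marks it as a candidate for being fixed in later, more expensive analyses.
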
VBD results (main (S_i) and total (S_Ti) effects)
| (a) Results for the watershed based segmentation workflow | ||||||
|---|---|---|---|---|---|---|
| Parameters | n = 50 (main) | n = 50 (total) | n = 100 (main) | n = 100 (total) | n = 200 (main) | n = 200 (total) |
| T2 | −1.25e-05 | 1.32e-07 | 2.86e-05 | 6.36e-08 | 1.67e-03 | 2.81e-04 |
| G1 | 3.52e-02 | 7.57e-02 | −1.88e-03 | 1.44e-01 | 5.95e-02 | 9.07e-02 |
| G2 | 7.80e-01 | 9.46e-01 | 5.28e-01 | 7.57e-01 | 5.39e-01 | 8.67e-01 |
| MinSize | 1.73e-02 | 3.92e-02 | 1.67e-02 | 4.13e-02 | 1.34e-02 | 1.58e-02 |
| MaxSize | 4.76e-03 | 2.80e-04 | 1.65e-03 | 1.70e-03 | 1.29e-04 | 5.39e-04 |
| MinSizePl | −5.48e-04 | 4.80e-02 | 2.31e-02 | 2.67e-02 | 1.39e-02 | 1.99e-02 |
| MinSizeSeg | 1.69e-01 | 1.95e-01 | 1.38e-01 | 1.08e-01 | 8.99e-02 | 9.37e-02 |
| Recon | −2.24e-02 | 2.22e-01 | −2.70e-02 | 3.21e-01 | 2.16e-02 | 2.06e-01 |
| Sum | 1.0 | | 0.73 | | 0.74 | |
| (b) Results for the level set based segmentation workflow | ||||||
| OTSU | 8.91e-01 | 8.97e-01 | 9.23e-01 | 9.42e-01 | 9.25e-01 | 9.32e-01 |
| CW | 7.33e-02 | 7.53e-02 | 1.05e-02 | 1.48e-02 | 5.31e-02 | 5.51e-02 |
| MinSize | 1.29e-03 | 2.84e-03 | 1.84e-03 | 2.61e-03 | 9.51e-04 | 9.46e-04 |
| MsKernel | 3.15e-02 | 2.56e-02 | 3.09e-02 | 3.11e-02 | 1.71e-02 | 1.95e-02 |
| LevelSetIt | 4.88e-03 | 5.05e-03 | 1.03e-03 | 1.05e-03 | 2.90e-03 | 2.12e-04 |
| Sum | 1.0 | | 0.96 | | 0.99 | |
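
Main and total effects of this kind are commonly estimated from two independent sample matrices. The sketch below uses the Saltelli estimator for main effects and the Jansen estimator for total effects; the source does not state which estimators were used, so this choice, the function names, the uniform [0, 1] inputs and the toy model are assumptions. The scheme needs n × (k + 2) model evaluations for k parameters, and for parameters with a weak effect the main-effect estimate can come out slightly negative at small n.

```python
import random

def sobol_indices(model, k, n, seed=0):
    """Estimate main and total effects for k inputs uniform on [0, 1]."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(k)] for _ in range(n)]
    b = [[rng.random() for _ in range(k)] for _ in range(n)]
    f_a = [model(row) for row in a]
    f_b = [model(row) for row in b]

    # Output variance estimated from both base samples.
    all_f = f_a + f_b
    mean = sum(all_f) / len(all_f)
    var = sum((y - mean) ** 2 for y in all_f) / (len(all_f) - 1)

    main, total = [], []
    for i in range(k):
        # Evaluate rows of A with column i replaced by column i of B.
        f_ab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        # Saltelli estimator of the main (first-order) effect.
        v_main = sum(fb * (fab - fa) for fa, fb, fab in zip(f_a, f_b, f_ab)) / n
        # Jansen estimator of the total effect.
        v_total = sum((fa - fab) ** 2 for fa, fab in zip(f_a, f_ab)) / (2 * n)
        main.append(v_main / var)
        total.append(v_total / var)
    return main, total

def toy_workflow(x):
    """Hypothetical additive model: exact main effects are 0.2 and 0.8."""
    return x[0] + 2.0 * x[1]

main, total = sobol_indices(toy_workflow, k=2, n=20000)
print(main, total)
```

For an additive model such as `toy_workflow` the main and total effects coincide and the main effects sum to one; a sum of main effects well below one, as in some columns of the table, indicates that part of the output variance comes from parameter interactions.
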
Sensitivity analysis execution times (secs) using 128 computing nodes
| Application | Method (Sample Size) | ||
|---|---|---|---|
| | MOAT (240) | Importance Measures (400) | VBD (2000) |
| Watershed | 15 681 | 27 078 | 150 890 |
| Level Set | 6 825 | 22 696 | 211 912 |
Results (average of each metric over the 15 images) using the application default parameters and those selected by the tuning algorithms
| Workflow | Dice | Jaccard | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Default | NM | PRO | GA | GLCCLUSTER | DIRECT | Spearmint | Default | NM | PRO | GA | GLCCLUSTER | DIRECT | Spearmint | |
| Watershed | 0.71 | 0.80 | 0.80 | 0.80 | 0.78 | 0.80 | 0.57 | 0.67 | 0.67 | 0.67 | 0.64 | |||
| Level Set | 0.61 | 0.75 | 0.74 | 0.82 | 0.61 | 0.75 | 0.50 | 0.71 | 0.65 | 0.70 | 0.47 | 0.63 | ||
The best result for each pair of segmentation algorithm and metric of interest is highlighted in bold.
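
For reference, the two metrics in the table compare a computed segmentation with a reference segmentation by their overlap. A minimal sketch using the standard definitions, with each mask represented as a set of foreground pixel coordinates; the function names, the set representation and the example masks are illustrative assumptions, not the paper's code.

```python
def dice(a, b):
    """Dice coefficient: 2 |A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # two empty masks are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))

def jaccard(a, b):
    """Jaccard index: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0  # two empty masks are treated as identical
    return len(a & b) / len(a | b)

# Hypothetical masks given as (row, column) pixel coordinates.
reference = {(0, 0), (0, 1), (1, 0), (1, 1)}
computed = {(0, 1), (1, 1), (1, 2)}

d = dice(reference, computed)
j = jaccard(reference, computed)
print(d, j)
```

For a single pair of masks the two metrics are tied by J = D / (2 − D), so they always rank segmentations of one image identically. Averages over several images, as reported in the table, need not satisfy this identity exactly.
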
Fig. 3. Two image patches are presented with human segmentation and the level set workflow segmentation using default and tuned parameter values. The first image (Image 04) has Dice values of 0.34 and 0.92 with default and tuned parameters, respectively. For the second image (Image 08), the Dice value is 0.77 with default parameters and 0.86 after tuning. (Color version of this figure is available at Bioinformatics online.)
Fig. 4. Scalability and performance with different optimizations