Kuo-Chan Huang, Wei-Ya Wu, Feng-Jian Wang, Hsiao-Ching Liu, Chun-Hao Hung.
Abstract
Parallel computation has been widely applied in a variety of large-scale scientific and engineering applications. Many studies indicate that exploiting both task and data parallelism, i.e. mixed-parallel workflows, solves large computational problems more efficiently than either pure task parallelism or pure data parallelism. Scheduling traditional workflows of pure task parallelism on parallel systems has long been known to be an NP-complete problem. Mixed-parallel workflow scheduling has to deal with the additional challenge of processor allocation. In this paper, we explore the processor allocation issue in scheduling mixed-parallel workflows of moldable tasks, called M-tasks, and propose an Iterative Allocation Expanding and Shrinking (IAES) approach. Compared to previous approaches, IAES has two distinguishing features. The first is allocating more processors to the tasks on allocated critical paths, which effectively reduces the makespan of workflow execution. The second is allowing the processor allocation of an M-task to shrink during the iterative procedure, resulting in a more flexible and effective search for a better allocation. The proposed IAES approach has been evaluated with a series of simulation experiments and compared to several well-known previous methods, including CPR, CPA, MCPA, and MCPA2. The experimental results indicate that IAES outperforms those previous methods significantly in most situations, especially when nodes of the same layer in a workflow have unequal workloads.
Keywords: Mixed parallelism; Moldable task; Processor allocation; Workflow scheduling
Year: 2016 PMID: 27504236 PMCID: PMC4954800 DOI: 10.1186/s40064-016-2808-y
Source DB: PubMed Journal: Springerplus ISSN: 2193-1801
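The abstract describes allocating more processors to M-tasks that lie on the critical path. The following is a minimal Python sketch of that general idea only, not the authors' IAES procedure. It assumes a hypothetical Amdahl-style execution-time model, hypothetical names (`Task`, `exec_time`, `expand_allocation`), and a stopping rule loosely modelled on CPA-style methods (stop once the critical path is no longer than the average processor-time area). The shrinking step is not shown, because its rule is not given in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    work: float              # sequential execution time (hypothetical units)
    serial_fraction: float   # assumed Amdahl serial fraction, not from the paper
    preds: list = field(default_factory=list)  # names of predecessor tasks


def exec_time(task, p):
    """Assumed moldable-task model: time on p processors under Amdahl's law."""
    return task.work * (task.serial_fraction + (1.0 - task.serial_fraction) / p)


def critical_path(tasks, alloc):
    """Longest path through the DAG; `tasks` must be in topological order."""
    finish, prev = {}, {}
    for name, t in tasks.items():
        latest = max(t.preds, key=lambda q: finish[q], default=None)
        start = finish[latest] if latest is not None else 0.0
        finish[name] = start + exec_time(t, alloc[name])
        prev[name] = latest
    node = max(finish, key=finish.get)
    length, path = finish[node], []
    while node is not None:
        path.append(node)
        node = prev[node]
    return length, path[::-1]


def average_area(tasks, alloc, num_procs):
    """Total processor-time area divided by the number of processors."""
    return sum(exec_time(t, alloc[n]) * alloc[n] for n, t in tasks.items()) / num_procs


def expand_allocation(tasks, num_procs):
    """Give one more processor at a time to the critical-path task that gains most."""
    alloc = {name: 1 for name in tasks}
    while True:
        length, path = critical_path(tasks, alloc)
        if length <= average_area(tasks, alloc, num_procs):
            break
        candidates = [n for n in path if alloc[n] < num_procs]
        if not candidates:
            break
        best = max(
            candidates,
            key=lambda n: exec_time(tasks[n], alloc[n]) - exec_time(tasks[n], alloc[n] + 1),
        )
        alloc[best] += 1
    return alloc


# Hypothetical diamond-shaped workflow: A -> {B, C} -> D, with B much heavier than C.
demo = {
    "A": Task(10, 0.10),
    "B": Task(100, 0.05, ["A"]),
    "C": Task(20, 0.05, ["A"]),
    "D": Task(10, 0.10, ["B", "C"]),
}
demo_alloc = expand_allocation(demo, num_procs=4)
print(demo_alloc, critical_path(demo, demo_alloc)[0])
```

In this toy workflow the heavy task B sits on the critical path and receives extra processors, while C, which never becomes critical, keeps its single processor. This mirrors the unequal-workload situation the abstract highlights.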
Fig. 1 Task parallelism represented by a workflow
Fig. 2 An example mixed-parallel workflow
Fig. 3 Schedule generated by CPA
Fig. 4 Schedule generated by MCPA
Fig. 5 Schedule generated by MCPA2
Fig. 6 Schedule generated by CPR
Fig. 7 Allocated critical path
Fig. 8 Schedule generated by IAES
Fig. 9 Workflow structure of Matmul
Average makespan (s) for Matmul structure of equal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 188,162 | 136,645 | 111,903 | 78,397 |
| MCPA | | | | |
| MCPA2 | | | | |
| CPR | | | | |
| IAES | | | | |
Average SLR for Matmul structure of equal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 0.30 | 0.21 | 0.18 | 0.12 |
| MCPA | | | | |
| MCPA2 | | | | |
| CPR | | | | |
| IAES | | | | |
Average makespan (s) for Matmul structure of unequal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 17,301 | 13,217 | 11,480 | 11,132 |
| MCPA | 13,857 | 10,006 | 8,299 | 7,642 |
| MCPA2 | 16,351 | 10,410 | 8,835 | 8,329 |
| CPR | 14,064 | 10,920 | 9,433 | 8,788 |
| IAES | | | | |
Average SLR for Matmul structure of unequal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 0.31 | 0.24 | 0.21 | 0.20 |
| MCPA | 0.25 | 0.18 | 0.15 | 0.14 |
| MCPA2 | 0.29 | 0.19 | 0.16 | 0.15 |
| CPR | 0.25 | 0.20 | 0.17 | 0.16 |
| IAES | | | | |
Fig. 10 Workflow structure of Strassen
Average makespan (s) for Strassen structure of equal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 33,888 | 20,723 | 15,153 | 12,587 |
| MCPA | 39,202 | 17,844 | 14,544 | 13,985 |
| MCPA2 | | 19,826 | 13,275 | 10,841 |
| CPR | 31,363 | | | |
| IAES | 32,483 | 20,444 | 12,506 | 9,984 |
Average SLR for Strassen structure of equal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 0.45 | 0.28 | 0.20 | 0.17 |
| MCPA | 0.52 | 0.24 | 0.19 | 0.19 |
| MCPA2 | | 0.27 | 0.18 | 0.15 |
| CPR | 0.42 | | | |
| IAES | 0.43 | 0.27 | 0.17 | 0.13 |
Average makespan (s) for Strassen structure of unequal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 47,882 | 28,738 | 20,614 | 18,996 |
| MCPA | 45,577 | 27,320 | 19,208 | 19,041 |
| MCPA2 | 47,682 | 28,658 | 19,748 | 17,952 |
| CPR | 49,590 | 26,893 | 18,717 | 16,233 |
| IAES | | | | |
Average SLR for Strassen structure of unequal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 0.60 | 0.38 | 0.27 | 0.22 |
| MCPA | 0.57 | 0.37 | 0.26 | 0.22 |
| MCPA2 | 0.60 | 0.38 | 0.27 | 0.21 |
| CPR | 0.63 | | 0.25 | 0.19 |
| IAES | | | | |
Average makespan (s) for synthetic workflows of nodes with equal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 43,932 | 34,206 | 28,855 | 26,166 |
| MCPA | 44,658 | 25,712 | 18,960 | 15,569 |
| MCPA2 | 42,623 | 27,291 | 20,688 | 17,398 |
| CPR | | 20,879 | 18,165 | 16,810 |
| IAES | 30,606 | | | |
Average SLR for synthetic workflows of nodes with equal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 0.42 | 0.33 | 0.28 | 0.25 |
| MCPA | 0.42 | 0.24 | 0.18 | 0.15 |
| MCPA2 | 0.41 | 0.26 | 0.20 | 0.17 |
| CPR | | 0.20 | 0.17 | 0.16 |
| IAES | 0.30 | | | |
Average makespan (s) for synthetic workflows of nodes with unequal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 23,687 | 18,618 | 16,086 | 15,139 |
| MCPA | 27,295 | 16,600 | 12,664 | 10,839 |
| MCPA2 | 23,696 | 17,712 | 13,789 | 11,995 |
| CPR | 19,285 | 14,135 | 12,496 | 11,814 |
| IAES | | | | |
Average SLR for synthetic workflows of nodes with unequal workloads
| | np = 8 | np = 16 | np = 32 | np = 64 |
|---|---|---|---|---|
| CPA | 0.33 | 0.26 | 0.23 | 0.21 |
| MCPA | 0.37 | 0.23 | 0.17 | 0.15 |
| MCPA2 | 0.33 | 0.24 | 0.19 | 0.17 |
| CPR | 0.27 | 0.20 | 0.17 | 0.16 |
| IAES | | | | |
Average makespan (s) for synthetic workflows of nodes with equal workloads
| | 30 nodes | 40 nodes | 50 nodes | 60 nodes |
|---|---|---|---|---|
| CPA | 39,209 | 44,023 | 49,117 | 68,521 |
| MCPA | 30,496 | 43,750 | 48,942 | 73,699 |
| MCPA2 | 35,115 | 39,740 | 48,655 | 54,891 |
| CPR | 46,649 | 54,030 | 56,507 | 66,616 |
| IAES | | | | |
Average SLR for synthetic workflows of nodes with equal workloads
| | 30 nodes | 40 nodes | 50 nodes | 60 nodes |
|---|---|---|---|---|
| CPA | 0.27 | 0.25 | 0.28 | 0.32 |
| MCPA | 0.21 | 0.25 | 0.28 | 0.34 |
| MCPA2 | 0.25 | 0.23 | 0.28 | 0.26 |
| CPR | 0.32 | 0.30 | 0.32 | 0.31 |
| IAES | | | | |
Average makespan (s) for synthetic workflows of nodes with unequal workloads
| | 30 nodes | 40 nodes | 50 nodes | 60 nodes |
|---|---|---|---|---|
| CPA | 21,229 | 22,829 | 25,162 | 26,118 |
| MCPA | 19,166 | 22,717 | 23,861 | 28,512 |
| MCPA2 | 20,704 | 22,337 | 23,999 | 25,879 |
| CPR | 27,211 | 27,923 | 27,179 | 35,546 |
| IAES | | | | |
Average SLR for synthetic workflows of nodes with unequal workloads
| | 30 nodes | 40 nodes | 50 nodes | 60 nodes |
|---|---|---|---|---|
| CPA | 0.22 | 0.23 | 0.25 | 0.25 |
| MCPA | 0.20 | | 0.23 | 0.28 |
| MCPA2 | 0.21 | | 0.23 | 0.25 |
| CPR | 0.27 | 0.28 | 0.28 | 0.33 |
| IAES | | | | |
Algorithm computation time (s)
| | CPA | MCPA | MCPA2 | CPR | IAES |
|---|---|---|---|---|---|
| Time | 0.0019 | 0.0015 | 0.0020 | 0.0101 | 0.0172 |