Honggang Zhao, Benjamin Beck, Adam Fuller, Eric Peatman.
Abstract
STRUCTURE and NEWHYBRIDS are widely used population genetics programs for addressing questions of genetic structure, admixture, and hybridization. Both usually require a large number of independent runs with many iterations to provide robust data for downstream analyses, which significantly increases computation time. Programs such as Structure_threader and parallelnewhybrid were previously developed to address this problem by processing tasks in parallel on a multi-threaded processor; however, running them requires some programming knowledge (e.g., R, Bash). We developed EasyParallel as a community resource to facilitate practical and routine population structure and hybridization analyses. Its multi-threaded parallelization processes large genetic datasets efficiently, and its point-and-click GUI gives ready access to users with little scripting experience. Performance evaluation of EasyParallel using simulated datasets showed speed-up and parallel execution times similar to those of Structure_threader and parallelnewhybrid. EasyParallel is written in Python 3 and is freely available on GitHub at https://github.com/hzz0024/EasyParallel.
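The idea of spreading independent runs across threads can be illustrated with a minimal Python sketch. This is not EasyParallel's actual code: the function names `run_command` and `run_in_parallel` are hypothetical, and the jobs are placeholder Python one-liners standing in for real STRUCTURE or NEWHYBRIDS command lines, which a user would supply. It assumes only that each run is an independent external process.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(command):
    """Run one external command and return its exit code."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode


def run_in_parallel(commands, threads):
    """Run independent commands concurrently, at most `threads` at a time."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in the same order as the input commands.
        return list(pool.map(run_command, commands))


# Placeholder jobs (assumption): each stands in for one independent
# STRUCTURE or NEWHYBRIDS replicate launched as its own process.
placeholder_jobs = [
    [sys.executable, "-c", f"print('replicate {i} done')"] for i in range(4)
]
exit_codes = run_in_parallel(placeholder_jobs, threads=2)
```

Threads are sufficient in a sketch like this because the heavy computation happens in the child processes; each Python thread only waits for its process to finish.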
Year: 2020 PMID: 32330179 PMCID: PMC7182190 DOI: 10.1371/journal.pone.0232110
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. A screenshot of EasyParallel running the STRUCTURE and NEWHYBRIDS analyses in parallel.
(a) The EasyParallel main window lets the user choose the STRUCTURE or NEWHYBRIDS module for data analysis. (b) The module panel helps the user set the major parameters (e.g., the number of threads or runs) and select the input and parameter files. A progress bar at left shows the status of parallelization. A command window at top right shows the commands being run. (c) A message window shows the folder storing the outputs and the time taken to complete the analysis.
Fig 2. Speed gain obtained by parallelization in EasyParallel and its comparison with Structure_threader and Parallelnewhybrid.
The speed gain was calculated by dividing the execution time on a single thread (sequential run) by the execution time obtained with each number of threads. i7 4700MQ ‒ Lenovo Y510, Windows 10, 2.4 GHz Intel Core i7-4700MQ with 8 GB RAM and 4 physical cores (8 logical threads); i7 6700HQ ‒ Lenovo Y700, Windows 10, 2.6 GHz Intel Core i7-6700HQ with 8 GB RAM and 4 physical cores (8 logical threads); MacPro i5 ‒ MacBook Pro, OS 10.14, 2.7 GHz Intel Core i5 with 16 GB RAM and 2 physical cores; MacPro i7 ‒ MacBook Pro, OS 10.14, 2.6 GHz Intel Core i7 with 16 GB RAM and 6 physical cores.
Computational time (s) required to complete STRUCTURE and NEWHYBRIDS analyses in series compared to in parallel using EasyParallel, Structure_threader, and Parallelnewhybrid.
The speed gain (in parentheses) was calculated by dividing the execution time on a single thread (sequential run) by the execution time obtained with each number of threads. The analyses were repeated on different operating systems and CPU architectures: i7 4700MQ ‒ Lenovo Y510, Windows 10, 2.4 GHz Intel Core i7-4700MQ with 8 GB RAM and 4 physical cores (8 logical threads); i7 6700HQ ‒ Lenovo Y700, Windows 10, 2.6 GHz Intel Core i7-6700HQ with 8 GB RAM and 4 physical cores (8 logical threads); MacPro i5 ‒ MacBook Pro, OS 10.14, 2.7 GHz Intel Core i5 with 16 GB RAM and 2 physical cores; MacPro i7 ‒ MacBook Pro, OS 10.14, 2.6 GHz Intel Core i7 with 16 GB RAM and 6 physical cores.
| Threads | i7 6700HQ | i7 4700MQ | MacPro i5 | MacPro i7 |
|---|---|---|---|---|
| EasyParallel (STRUCTURE) | | | | |
| 1 | 14711 | 14943 | 8226 | 5307 |
| 2 | 7772 (1.89) | 7929 (1.88) | 4143 (1.99) | 2785 (1.91) |
| 4 | 4052 (3.63) | 5212 (2.87) | ‒ | 1561 (3.40) |
| 6 | 3617 (4.07) | 5106 (2.93) | ‒ | 1300 (4.08) |
| 8 | 3049 (4.82) | 4733 (3.16) | ‒ | ‒ |
| Structure_threader | | | | |
| 1 | 14688 | 14980 | 8193 | 5328 |
| 2 | 7762 (1.89) | 7808 (1.92) | 4145 (1.98) | 2811 (1.90) |
| 4 | 4040 (3.64) | 5255 (2.85) | ‒ | 1551 (3.44) |
| 6 | 3597 (4.08) | 5099 (2.94) | ‒ | 1282 (4.16) |
| 8 | 2999 (4.90) | 4708 (3.18) | ‒ | ‒ |
| EasyParallel (NEWHYBRIDS) | | | | |
| 1 | 1574 | 1594 | 793 | 683 |
| 2 | 810 (1.94) | 820 (1.94) | 418 (1.90) | 375 (1.82) |
| 4 | 489 (3.22) | 606 (2.63) | ‒ | 206 (3.32) |
| 6 | 480 (3.28) | 551 (2.89) | ‒ | 205 (3.33) |
| 8 | 330 (4.77) | 407 (3.92) | ‒ | ‒ |
| Parallelnewhybrid | | | | |
| 1 | 1500 | 1617 | 828 | 710 |
| 2 | 800 (1.87) | 837 (1.91) | 445 (1.86) | 377 (1.88) |
| 4 | 477 (3.08) | 562 (2.81) | ‒ | 208 (3.39) |
| 6 | 478 (3.19) | 553 (2.86) | ‒ | 206 (3.36) |
| 8 | 323 (4.69) | 403 (3.92) | ‒ | ‒ |
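
The speed gain defined in the table caption (single-thread time divided by multi-thread time) can be recomputed from the timings with a short Python sketch. The helper name `speed_gain` and the dictionary layout are illustrative assumptions; the timings are copied from the EasyParallel (STRUCTURE) rows for the i7 6700HQ column.

```python
def speed_gain(sequential_seconds, parallel_seconds):
    """Return single-thread time divided by multi-thread time, rounded to 2 places."""
    return round(sequential_seconds / parallel_seconds, 2)


# EasyParallel (STRUCTURE) timings in seconds on the i7 6700HQ, keyed by thread count.
structure_i7_6700hq = {1: 14711, 2: 7772, 4: 4052, 6: 3617, 8: 3049}

baseline = structure_i7_6700hq[1]
gains = {
    threads: speed_gain(baseline, seconds)
    for threads, seconds in structure_i7_6700hq.items()
    if threads > 1
}
```

For this column, `gains` matches the values shown in parentheses in the table. The gain stays well below the thread count beyond 4 threads, which is consistent with the machine having 4 physical cores and 8 logical threads.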