Louis Lagardère1,2,3, Luc-Henri Jolly2, Filippo Lipparini4, Félix Aviat3, Benjamin Stamm5, Zhifeng F Jing6, Matthew Harger6, Hedieh Torabifard7, G Andrés Cisneros8, Michael J Schnieders9, Nohad Gresh3, Yvon Maday10,11,12, Pengyu Y Ren6, Jay W Ponder13, Jean-Philip Piquemal3,6,11.
Abstract
We present Tinker-HP, a massively parallel MPI package dedicated to classical molecular dynamics (MD) and to multiscale simulations, using advanced polarizable force fields (PFF) encompassing distributed multipole electrostatics. Tinker-HP is an evolution of the popular Tinker package that preserves its ease of use and its reference double-precision implementation for CPUs. Grounded in interdisciplinary efforts with applied mathematics, Tinker-HP allows for long polarizable MD simulations on large systems of up to millions of atoms. We detail in the paper the newly developed extension of massively parallel 3D spatial decomposition to point dipole polarizable models, as well as its coupling to efficient Krylov iterative and non-iterative polarization solvers. The design of the code allows the use of various computer systems ranging from laboratory workstations to modern petascale supercomputers with thousands of cores. Tinker-HP therefore offers the first high-performance scalable CPU computing environment for the development of next-generation point dipole PFFs and for production simulations. Strategies linking Tinker-HP to quantum mechanics (QM) in the framework of multiscale polarizable self-consistent QM/MD simulations are also provided. The capabilities, performance and scalability of the software are demonstrated via benchmark calculations using the polarizable AMOEBA force field on systems ranging from large water boxes of increasing size and ionic liquids to (very) large biosystems encompassing several proteins as well as the complete satellite tobacco mosaic virus and ribosome structures. For small systems, Tinker-HP appears to be competitive with the Tinker-OpenMM GPU implementation of Tinker. As the system size grows, Tinker-HP remains operational thanks to its access to distributed memory and takes advantage of its new algorithms, which enable stable long-timescale polarizable simulations. Overall, a several thousand-fold acceleration over a single-core computation is observed for the largest systems. The extension of the present CPU implementation of Tinker-HP to other computational platforms is discussed.
Year: 2017 PMID: 29732110 PMCID: PMC5909332 DOI: 10.1039/c7sc04531j
Source DB: PubMed Journal: Chem Sci ISSN: 2041-6520 Impact factor: 9.825
Fig. 1 Example of 3D spatial decomposition.
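The figure itself is not reproduced here. As a rough illustration of the idea, the sketch below assigns each atom to the process whose subdomain contains it, assuming a cubic periodic box cut into a regular grid of subdomains. The function name `owner_rank`, the rank numbering and the example coordinates are hypothetical and are not taken from the Tinker-HP source.

```python
def owner_rank(position, box_edge, procs_per_dim):
    """Return the rank of the process whose subdomain contains `position`.

    Assumes a cubic periodic box of edge `box_edge` split into
    px * py * pz equal subdomains (hypothetical layout).
    """
    px, py, pz = procs_per_dim
    cell = []
    for coord, n in zip(position, (px, py, pz)):
        wrapped = coord % box_edge                 # wrap into [0, box_edge)
        index = min(int(wrapped / box_edge * n), n - 1)  # guard the upper edge
        cell.append(index)
    ix, iy, iz = cell
    # Flatten (ix, iy, iz) into one rank, with x varying slowest.
    return (ix * py + iy) * pz + iz


# Three illustrative atoms in a 10 Angstrom box shared by 2 x 2 x 2 processes.
atoms = [(1.0, 1.0, 1.0), (9.0, 1.0, 6.0), (-1.0, 5.5, 0.0)]
ranks = [owner_rank(atom, 10.0, (2, 2, 2)) for atom in atoms]
print(ranks)
```

Each process then only stores and updates the atoms it owns, which is what lets memory and work scale with the number of subdomains.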
Fig. 2 Illustration of the midpoint rule in 2D: the square of edge d represents a subdomain assigned to a process and the blue line delimits the area that has to be imported by the latter.
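In the midpoint method as it is usually described, a pair interaction is computed by the process whose subdomain contains the midpoint of the pair, so a process only needs to import atoms lying within half the cutoff of its subdomain. The 2D sketch below illustrates this under those assumptions. The helper names, the box size and the cutoff are hypothetical, and the code is not taken from Tinker-HP.

```python
def minimum_image(delta, box_edge):
    """Shortest periodic displacement along one axis."""
    return delta - box_edge * round(delta / box_edge)


def pair_midpoint(a, b, box_edge):
    """Midpoint of the pair (a, b) under periodic boundary conditions."""
    mid = []
    for xa, xb in zip(a, b):
        d = minimum_image(xb - xa, box_edge)
        mid.append((xa + 0.5 * d) % box_edge)
    return tuple(mid)


def owning_cell(point, box_edge, cells_per_dim):
    """Index of the square subdomain (edge d) that contains `point`."""
    d = box_edge / cells_per_dim
    return tuple(min(int(x / d), cells_per_dim - 1) for x in point)


def must_import(atom, cell, box_edge, cells_per_dim, cutoff):
    """True if `atom` is outside `cell` but within cutoff / 2 of it."""
    d = box_edge / cells_per_dim
    dist2 = 0.0
    for x, c in zip(atom, cell):
        center = (c + 0.5) * d
        offset = abs(minimum_image(x - center, box_edge))
        gap = max(0.0, offset - 0.5 * d)   # distance to the cell along this axis
        dist2 += gap * gap
    return 0.0 < dist2 <= (0.5 * cutoff) ** 2


# Illustrative values: 20 x 20 box, 2 x 2 subdomains (d = 10), cutoff of 4.
box, cells, cutoff = 20.0, 2, 4.0
mid = pair_midpoint((9.0, 5.0), (12.0, 5.0), box)
print(mid, owning_cell(mid, box, cells))
print(must_import((11.5, 5.0), (0, 0), box, cells, cutoff))
```

The import region is thinner than in a scheme where each process computes every interaction of the atoms it owns, which would require a halo of one full cutoff.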
Fig. 3 Schematic representation of a velocity Verlet step.
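For reference, a serial velocity Verlet step can be sketched as follows. This is the textbook form of the integrator applied to a 1D harmonic oscillator with unit mass and unit force constant. It leaves out the inter-process communication that a parallel step involves, and the function names and test system are illustrative assumptions.

```python
def velocity_verlet_step(x, v, a, dt, accel):
    """Advance position x and velocity v by one time step dt.

    `a` is the acceleration at the current position and `accel` is a
    callable returning the acceleration at a given position.
    """
    v_half = v + 0.5 * dt * a          # first half kick
    x_new = x + dt * v_half            # drift
    a_new = accel(x_new)               # one force evaluation per step
    v_new = v_half + 0.5 * dt * a_new  # second half kick
    return x_new, v_new, a_new


def harmonic_accel(x):
    """Acceleration of a unit-mass particle in the potential x**2 / 2."""
    return -x


# Integrate 1000 steps and monitor the total energy (initially 0.5).
x, v = 1.0, 0.0
a = harmonic_accel(x)
for _ in range(1000):
    x, v, a = velocity_verlet_step(x, v, a, 0.01, harmonic_accel)
energy = 0.5 * v * v + 0.5 * x * x
print(energy)
```

The acceleration returned by one step is reused by the next, so each step costs a single force evaluation.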
Fig. 4 Schematic representation of an iteration of a polarization solver.
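The abstract mentions Krylov iterative polarization solvers. As a minimal sketch of what one such iteration involves, the code below runs an unpreconditioned conjugate gradient on a small symmetric positive definite system T mu = E. The 3 x 3 matrix and right-hand side are toy numbers chosen for illustration, not a real polarization matrix, and the preconditioner used in PCG is omitted.

```python
def matvec(matrix, vector):
    """Dense matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(matrix, rhs, tol=1e-10, max_iter=100):
    """Solve matrix @ x = rhs for a symmetric positive definite matrix."""
    x = [0.0] * len(rhs)
    r = list(rhs)            # residual for the zero initial guess
    p = list(r)              # first search direction
    rs = dot(r, r)
    for iteration in range(max_iter):
        if rs ** 0.5 < tol:
            return x, iteration
        Ap = matvec(matrix, p)        # the costly step of each iteration
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x, max_iter


# Toy stand-ins for the polarization matrix T and the field E.
T = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
E = [1.0, 2.0, 3.0]
dipoles, n_iter = conjugate_gradient(T, E)
print(dipoles, n_iter)
```

In a spatially decomposed code, the matrix-vector product is the step that needs the current dipoles of neighbouring subdomains, which is why each solver iteration involves communication.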
Fig. 5 Schematic representation of the computation of the reciprocal part of the electrostatic energy and forces with SPME.
Best production time (ns per day) for the different test systems (AMOEBA force field) using various methods. The number of atoms and the optimal number of cores are given for each system. All timings are given for Intel Haswell processors. Reference canonical Tinker CPU times are given for OpenMP computations using 8 cores. All computations were performed using a RESPA (2 fs) integrator unless otherwise specified. ASPC = Always Stable Predictor Corrector (ref. 53). N.A. = not applicable due to memory limitations. GPU production times were obtained using the Tinker-OpenMM software (ref. 34) with CUDA 7.5, the JI/DIIS solver and an NVIDIA GTX 1080 card.
| Systems | Ubiquitin | DHFR | COX-2 | STMV | Ribosome |
| Number of atoms | 9737 | 23 558 | 174 219 | 1 066 628 | 3 484 755 |
| Tinker-HP number of CPU cores | 480 | 680(960) | 2400 | 10 800 | 10 800 |
| PCG (10⁻⁵ D, ASPC) | 8.4 | 6.3(7.2) | 1.6 | 0.45 | 0.18 |
| TPCG2 | 10.42 | 7.81(8.93) | 1.98 | 0.56 | 0.22 |
| TPCG2/RESPA (3 fs) | 15.62 | 11.71(13.39) | 2.98 | 0.84 | 0.34 |
| CPU OpenMP (8 cores) | 0.43 | 0.21 | 0.024 | 0.0007 | N.A. |
| GPU (GTX 1080) | 10.97 | 7.85 | 1.15 | N.A. | N.A. |
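The table values can be turned into speed ratios with a few lines of arithmetic. The sketch below divides the Tinker-HP PCG production times by the 8-core OpenMP reference times for the systems where both are available, using the 680-core figure for DHFR. These ratios are relative to the 8-core reference, so they are not the single-core acceleration quoted in the abstract. The variable and function names are illustrative.

```python
# Production times in ns per day, copied from the table above.
systems = ["Ubiquitin", "DHFR", "COX-2", "STMV"]
tinker_hp_pcg = [8.4, 6.3, 1.6, 0.45]
openmp_8_cores = [0.43, 0.21, 0.024, 0.0007]


def speedups(fast, reference):
    """Element-wise ratio of two lists of production times."""
    return [f / r for f, r in zip(fast, reference)]


ratios = speedups(tinker_hp_pcg, openmp_8_cores)
for name, ratio in zip(systems, ratios):
    print(f"{name}: {ratio:.1f}x faster than the 8-core OpenMP reference")
```

The ratio grows with system size, consistent with the abstract's statement that the largest systems benefit most from the distributed-memory implementation.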
Water boxes used for benchmark purposes
| System | Puddle | Pond | Lake | Sea | Ocean |
| Number of atoms | 96 000 | 288 000 | 864 000 | 7 776 000 | 23 328 000 |
| Box edge length (Å) | 98.5 | 145 | 205.19 | 426.82 | 615.57 |
| PME grid size (points per edge) | 120 | 144 | 250 | 432 | 648 |
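As a quick consistency check on the table, the PME grid spacing of each box is simply the edge length divided by the number of grid points along that edge. The short sketch below computes it from the values above. The dictionary layout and names are illustrative.

```python
# (edge length in Angstroms, PME grid points per edge), copied from the table.
water_boxes = {
    "Puddle": (98.5, 120),
    "Pond": (145.0, 144),
    "Lake": (205.19, 250),
    "Sea": (426.82, 432),
    "Ocean": (615.57, 648),
}

# Grid spacing in Angstroms along one edge of each cubic box.
spacing = {name: edge / grid for name, (edge, grid) in water_boxes.items()}
for name, value in spacing.items():
    print(f"{name}: {value:.3f} Angstroms per grid point")
```

All five boxes come out at roughly 0.8 to 1.0 Å per grid point.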
Fig. 6 Performance gain for the [dmim⁺][Cl⁻] ionic liquid system (A) and the Puddle (B), Pond (C) and Lake (D) water boxes.
Fig. 7 Performance gain for the ubiquitin protein in water (A), the dihydrofolate reductase protein (DHFR) in water (B), the COX-2 system in water (C), the satellite tobacco mosaic virus in water (D) and the ribosome in water (E).
Biosystems used for benchmark purposes
| Systems | Ubiquitin | DHFR | COX-2 | STMV | Ribosome |
| Number of atoms | 9737 | 23 558 | 174 219 | 1 066 628 | 3 484 755 |
| Box dimensions (Å) | 54.99 × 41.91 × 41.91 | 62.23 | 120 | 223 | 327.1 |
| PME grid size (points per edge) | 72 × 54 × 54 | 64 | 128 | 270 | 360 |