SCCS Colloquium - Sep 16, 2019
Date: September 16, 2019
Time: 15:00 - 16:00
Julian Spahl: Extending AutoPas to GPUs
The AutoPas particle simulation library uses automatic tuning to optimize node-level performance. This work implements short-range particle interactions on GPUs using NVIDIA's CUDA. A software architecture and a build system that support GPUs while retaining flexibility are introduced. Furthermore, performance metrics will be presented.
Keywords: AutoPas, molecular dynamics, GPU, CUDA
Maximilian Geitner: Parallelizing Particle Simulations with Kokkos
This is a Bachelor's thesis submission talk, in German. Maximilian is advised by Fabio Gratl.
The naive approach and the linked cell method are two approaches for computing short-range interactions, such as those described by the Lennard-Jones potential, in MD simulations. To compute the interactions between particle pairs efficiently, the computations need to be parallelized. Tools such as OpenMP or CUDA are therefore useful for parallel execution. However, each toolkit provides its own directives and requires algorithms to be re-implemented for each target platform.
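As a rough illustration of the two approaches named above, the following C++ sketch sums the Lennard-Jones potential U(r) = 4 * epsilon * ((sigma/r)^12 - (sigma/r)^6) over all pairs within a cutoff radius, once with the naive all-pairs loop and once with linked cells. It is not the AutoPas API: all names (Vec3, ljPair, potentialNaive, potentialLinkedCells) are hypothetical, and it assumes a cubic, non-periodic box [0, boxLength)^3 and serial execution.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch, not taken from AutoPas.
using Vec3 = std::array<double, 3>;

// Lennard-Jones potential of one pair, given the squared distance r2.
inline double ljPair(double r2, double epsilon, double sigma) {
  const double s2 = sigma * sigma / r2;
  const double s6 = s2 * s2 * s2;
  return 4.0 * epsilon * (s6 * s6 - s6);
}

inline double dist2(const Vec3 &a, const Vec3 &b) {
  double sum = 0.0;
  for (std::size_t d = 0; d < 3; ++d) sum += (a[d] - b[d]) * (a[d] - b[d]);
  return sum;
}

// Naive approach: check all N(N-1)/2 pairs against the cutoff.
double potentialNaive(const std::vector<Vec3> &pos, double cutoff,
                      double epsilon, double sigma) {
  const double rc2 = cutoff * cutoff;
  double sum = 0.0;
  for (std::size_t i = 0; i < pos.size(); ++i)
    for (std::size_t j = i + 1; j < pos.size(); ++j) {
      const double r2 = dist2(pos[i], pos[j]);
      if (r2 <= rc2) sum += ljPair(r2, epsilon, sigma);
    }
  return sum;
}

// Linked cells: sort particles into cells with side length >= cutoff, then
// only compare particles in the same or in directly adjacent cells.
// Assumption: all positions lie inside the non-periodic box [0, boxLength)^3.
double potentialLinkedCells(const std::vector<Vec3> &pos, double boxLength,
                            double cutoff, double epsilon, double sigma) {
  assert(cutoff > 0.0 && boxLength >= cutoff);
  const int n = static_cast<int>(boxLength / cutoff);  // cells per dimension
  const double cellLen = boxLength / n;                // >= cutoff
  auto coord = [&](double x) {
    const int c = static_cast<int>(x / cellLen);
    return c < 0 ? 0 : (c >= n ? n - 1 : c);  // clamp rounding at the border
  };
  auto index = [&](int x, int y, int z) {
    return static_cast<std::size_t>((z * n + y) * n + x);
  };

  std::vector<std::vector<std::size_t>> cells(static_cast<std::size_t>(n) * n * n);
  for (std::size_t i = 0; i < pos.size(); ++i)
    cells[index(coord(pos[i][0]), coord(pos[i][1]), coord(pos[i][2]))].push_back(i);

  const double rc2 = cutoff * cutoff;
  double sum = 0.0;
  // The loop over cells is where a parallel version would distribute work.
  for (int z = 0; z < n; ++z)
    for (int y = 0; y < n; ++y)
      for (int x = 0; x < n; ++x)
        for (int dz = -1; dz <= 1; ++dz)
          for (int dy = -1; dy <= 1; ++dy)
            for (int dx = -1; dx <= 1; ++dx) {
              const int nx = x + dx, ny = y + dy, nz = z + dz;
              if (nx < 0 || ny < 0 || nz < 0 || nx >= n || ny >= n || nz >= n) continue;
              for (std::size_t i : cells[index(x, y, z)])
                for (std::size_t j : cells[index(nx, ny, nz)]) {
                  if (i >= j) continue;  // count every pair exactly once
                  const double r2 = dist2(pos[i], pos[j]);
                  if (r2 <= rc2) sum += ljPair(r2, epsilon, sigma);
                }
            }
  return sum;
}
```

Both functions should agree up to floating-point summation order. The linked cell variant avoids the quadratic pair search when the cutoff is small compared to the box.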
This thesis describes the use of the Kokkos library in the AutoPas project (https://github.com/AutoPas/AutoPas) from the chair of scientific computing at the Technical University of Munich and compares it to the existing OpenMP implementations. Kokkos is a C++ programming model developed at Sandia National Laboratories that targets performance-portable applications on all major HPC platforms. Kokkos provides its own data management and its own constructs for parallel execution, which are adapted at compile time and optimized for the specific target platform. Kokkos currently offers implementations for OpenMP, Pthreads and CUDA. AutoPas offers many different configurations of traversal methods and other options; the goal is to select the most efficient configuration for a given particle simulation.
Two different Kokkos implementation strategies are tested in AutoPas with OpenMP as the execution platform, in several configurations, on CoolMUC2, a Linux cluster based on the Haswell architecture at the LRZ, and compared to an already existing OpenMP implementation.
The detailed analysis of the results shows the weaknesses of each Kokkos implementation and how well the parallelization of the N-body algorithms works in practice. The performance depends strongly on the chosen configuration, but there are cases in which Kokkos can compete with the native OpenMP implementation.
Keywords: AutoPas, Molecular Dynamics, OpenMP, Linked-Cell Algorithm