SC²S Colloquium - July 23, 2013
Date: July 23, 2013
Room: 02.07.023
Time: 2:30 pm, s.t.
Roman Karlstetter: Parameter-Optimization and Parallelization of a Regressor based on spatially-adaptive Sparse Grids for huge Datasets
Extracting knowledge from vast datasets is a central challenge in today's data-driven applications. Sparse grids provide a numerical method for both classification and regression in data mining that scales only linearly in the number of data points and is thus well suited to huge amounts of data. Building on an existing parallelization for multi-core CPUs with vector units, GPUs, and hybrid systems, an MPI version for compute clusters was developed within SGpp. Several communication strategies were implemented and evaluated with strong-scaling benchmarks on a number of clusters. Furthermore, a performance model for two of the strategies was developed, using microbenchmarks to estimate the expected communication times. In addition, the existing codebase was refactored to be more modular, which makes it easy to port existing vectorizations to the cluster version.