SC²S Colloquium - May 4, 2016

Date: May 4, 2016
Room: 02.07.023
Time: 3:00 pm, s.t.

André Malcher: Performance Optimization of Riemann Solvers for the Shallow Water Equations

This thesis deals with the single-core performance optimization of Riemann solvers for the two-dimensional Shallow Water Equations. Specifically, we analyze and improve solvers provided by GeoClaw, a software package for geophysical flows. To this end, a performance assessment of the existing solvers is conducted to identify bottlenecks and define possible optimization approaches. The Roofline model, an easy-to-grasp but powerful theoretical tool, supports this task. The optimization focuses mainly on the auto-vectorization of the Riemann solvers. The chosen approach will be shown and speedup results will be presented.
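As a rough illustration of the Roofline model mentioned above: attainable performance is bounded by the smaller of the peak compute rate and the memory bandwidth multiplied by the kernel's arithmetic intensity. The following is a minimal sketch; the function names and the machine numbers in the usage note are hypothetical and are not taken from the thesis.

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Upper bound on attainable performance (GFLOP/s) under the Roofline model.

    peak_gflops: peak floating-point rate of the core (GFLOP/s)
    bandwidth_gbs: sustained memory bandwidth (GB/s)
    intensity_flop_per_byte: arithmetic intensity of the kernel (FLOP/byte)
    """
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_gflops / bandwidth_gbs
```

For a hypothetical core with a peak of 100 GFLOP/s and 20 GB/s bandwidth, a kernel with an intensity of 2 FLOP/byte is memory-bound at 40 GFLOP/s, while any kernel above the ridge point of 5 FLOP/byte is limited by the compute peak. In the compute-bound regime, vectorization matters because it determines how close a kernel can get to that peak.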

Matthias Fischer: A Recommender System using Clustering with Sparse Grid Density Estimation

This thesis is part of a project with the Q&A website gutefrage.net. The two main goals of the project addressed here are an online clustering of the questions and the development of a new recommender system. Clustering and the implementation of recommender systems are both subjects of active research, especially in the context of "Big Data". Currently, there are over 16 million questions, and about eight thousand more are added each day. It is therefore not feasible to reprocess the whole data set every time; the processing of new questions has to build on the questions already processed. An algorithm for this was developed previously: the questions are processed batch-wise and transformed into a global feature space. The actual clustering is based on density estimation, identifying clusters as regions of high density in the feature space. Sparse grids are used to estimate the density of a batch and to update the global density estimate accordingly. While this approach is efficient, the analysis will show that the quality of the clustering is not yet acceptable, the main problem being that there is no relation between questions in different batches. We present a solution that relates the batches by using the clustering of the previously processed batches to add virtual questions with fixed positions. We also provide an improved dissimilarity measure for the questions, especially for the question titles, which is needed for the feature space transformation. Furthermore, we extend the flat clustering to a hierarchical clustering algorithm that is able to detect more clusters and subdivide larger clusters at deeper levels. In contrast to the flat clustering algorithm, it is also independent of the choice of a high-density threshold. The clustering is then used to implement the recommender system, which already yields acceptable recommendations for moderately large data sets.
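The idea of flat density-based clustering described above, where clusters are connected regions whose estimated density exceeds a threshold, can be sketched as follows. This is only an illustration: a plain histogram on a regular full grid stands in for the sparse grid density estimation used in the thesis, the feature space is assumed to be two-dimensional, and all names are hypothetical.

```python
from collections import deque


def grid_density(points, bins, lo=0.0, hi=1.0):
    """Count points per cell on a regular bins x bins grid over [lo, hi]^2.

    Assumes every point lies inside [lo, hi]^2.
    """
    counts = [[0] * bins for _ in range(bins)]
    width = (hi - lo) / bins
    for x, y in points:
        i = min(int((x - lo) / width), bins - 1)
        j = min(int((y - lo) / width), bins - 1)
        counts[i][j] += 1
    return counts


def dense_clusters(counts, threshold):
    """Label 4-connected regions of cells with count >= threshold.

    Returns (labels, k): labels[i][j] is 0 for low-density cells and
    a cluster id in 1..k otherwise.
    """
    bins = len(counts)
    labels = [[0] * bins for _ in range(bins)]
    k = 0
    for i in range(bins):
        for j in range(bins):
            if counts[i][j] < threshold or labels[i][j]:
                continue
            k += 1
            labels[i][j] = k
            queue = deque([(i, j)])
            while queue:  # breadth-first flood fill of one dense region
                a, b = queue.popleft()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    c, d = a + da, b + db
                    if (0 <= c < bins and 0 <= d < bins
                            and counts[c][d] >= threshold
                            and not labels[c][d]):
                        labels[c][d] = k
                        queue.append((c, d))
    return labels, k
```

Running `dense_clusters` on the same counts with different thresholds generally yields different numbers of clusters, which illustrates the threshold dependence of the flat approach that the hierarchical extension is meant to avoid.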