SCCS Colloquium - Nov 21, 2019

Date: November 21, 2019
Room: 00.08.053
Time: 15:00 - 16:00

Vincent Bennet Bautista Anguiano: Integration and Visualization of Sparse-Grid based Clustering Methods in the SG++ DataMining Pipeline

Master's thesis introduction talk. Vincent is advised by Paul Sarbu and Kilian Röhner.

The SG++ DataMining Pipeline is a component of the SG++ Toolbox whose main purpose is to provide an interface for generating machine learning models based on sparse grid methods. So far, the pipeline supports generating density estimation, classification and regression models. Sparse grid methods can significantly mitigate the curse of dimensionality while processing large amounts of data, which makes them an attractive option for generating machine learning models.
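To make the point about the curse of dimensionality concrete, the following minimal sketch compares the number of interior points of a full grid with that of a regular sparse grid without boundary points. The function names are hypothetical and this is not SG++ code; it only counts the points implied by the standard hierarchical construction, in which the subspace with level vector l contributes the product of 2^(l_i - 1) points and the sparse grid keeps the subspaces with |l|_1 <= n + d - 1.

```python
from itertools import product


def full_grid_points(level, dim):
    """Interior points of a full grid with mesh width 2**-level per dimension."""
    return (2 ** level - 1) ** dim


def sparse_grid_points(level, dim):
    """Interior points of a regular sparse grid of the given level.

    Sums the sizes of all hierarchical subspaces whose level vector l
    (with every l_i >= 1) satisfies |l|_1 <= level + dim - 1.
    """
    total = 0
    for levels in product(range(1, level + 1), repeat=dim):
        if sum(levels) <= level + dim - 1:
            subspace_size = 1
            for l_i in levels:
                subspace_size *= 2 ** (l_i - 1)
            total += subspace_size
    return total


if __name__ == "__main__":
    for dim in (1, 2, 3, 4):
        print(dim, full_grid_points(5, dim), sparse_grid_points(5, dim))
```

In one dimension both counts coincide; as the dimension grows, the full grid count grows exponentially in the dimension, while the sparse grid count grows much more slowly.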

Clustering is another common machine learning task, but it is not yet supported by the pipeline. The objective of this Master's thesis is therefore to implement a density-based clustering model based on the algorithm developed by Peherstorfer [Model Order Reduction of Parametrized Systems with Sparse Grids Techniques, TUM, (2013)], which makes use of sparse grid numerical methods.
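As a rough illustration of what a density-based clustering model does, the sketch below follows one common scheme: estimate a density, build a nearest-neighbor graph over the data, discard points and edges that lie in low-density regions, and report the remaining connected components as clusters. This is a sketch under stated assumptions, not the procedure from the cited work or the SG++ API: a plain (unnormalized) Gaussian kernel density estimate stands in for the sparse grid density estimate, and all function names, the parameters `k`, `bandwidth` and `threshold`, and the midpoint test for edges are hypothetical choices made for illustration.

```python
import math


def make_density(points, bandwidth):
    """Return an unnormalized Gaussian kernel density estimate.

    Assumption: this stands in for a sparse grid density estimate.
    """
    def density(x):
        total = 0.0
        for p in points:
            dist_sq = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-dist_sq / (2.0 * bandwidth ** 2))
        return total / len(points)
    return density


def knn_edges(points, k):
    """Undirected edges linking every point to its k nearest neighbors."""
    edges = set()
    for i, p in enumerate(points):
        neighbors = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for _, j in neighbors[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def density_clustering(points, k=3, bandwidth=0.5, threshold=0.2):
    """Label each point with a cluster id, or -1 for low-density noise."""
    density = make_density(points, bandwidth)
    dense = {i for i, p in enumerate(points) if density(p) >= threshold}

    # Union-find over the points; only edges through dense regions merge sets.
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in knn_edges(points, k):
        midpoint = [(a + b) / 2.0 for a, b in zip(points[i], points[j])]
        if i in dense and j in dense and density(midpoint) >= threshold:
            parent[find(i)] = find(j)

    # Number the connected components in order of first appearance.
    labels, cluster_ids = [], {}
    for i in range(len(points)):
        if i not in dense:
            labels.append(-1)
            continue
        root = find(i)
        labels.append(cluster_ids.setdefault(root, len(cluster_ids)))
    return labels


if __name__ == "__main__":
    data = [
        (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # first blob
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),  # second blob
        (2.5, 10.0),                                     # isolated point
    ]
    print(density_clustering(data))
```

The number of clusters is not fixed in advance; it follows from the density threshold and the graph connectivity, which is the main practical difference from centroid-based methods such as k-means.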

Additionally, different dimensionality reduction algorithms for visualizing high-dimensional clustering models will be explored and compared to the ones already implemented within the pipeline.

Keywords: Clustering, Machine Learning, Sparse Grids, Density-based Clustering