SC²S Colloquium - Apr 28, 2018
Date: Apr 28, 2018
Time: 15:00 - 15:30
Eric Koepke: Optimizing Hyperparameters in the SG++ Datamining Pipeline
In Machine Learning, there are often parameters of the model or the training algorithm that must be set before the actual learning begins. These hyperparameters can make a big difference to the success of a machine learning model, especially as models grow more complex with advancing research. Automatic hyperparameter optimization algorithms have been developed to find good hyperparameters as quickly as possible. I implement and compare two different approaches in the context of SG++, a toolbox that uses sparse grids to perform classical machine learning tasks: Harmonica successively reduces the optimization search space, while Bayesian Optimization evaluates the most promising hyperparameter setting based on previous results. I test both on regression and density estimation tasks and discuss their strengths and weaknesses to show different use cases. Harmonica requires more resources but is trivial to parallelize and more thorough in its search; Bayesian Optimization converges faster and finds the optimal solution as long as certain conditions are met.
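To make the contrast concrete, the following toy sketch shows the general idea of successive search-space reduction on a single continuous hyperparameter: evaluate a batch of candidates, then shrink the search interval around the best one. This is only an illustration of the stage-wise strategy, not the actual Harmonica algorithm (which works on Boolean hyperparameter encodings with a sparse Fourier basis and Lasso regression); the objective function is a hypothetical stand-in for a real training run.

```python
import random

def objective(x):
    # Toy 1D stand-in for a validation error; in the real pipeline this
    # would train an SG++ model with hyperparameter x and return its score.
    return (x - 0.3) ** 2

def shrink_search(lo, hi, rounds=4, samples=8, seed=0):
    """Sample the current interval, then shrink it around the best point
    found so far. Illustrative only: this mimics the stage-wise
    search-space reduction idea, not Harmonica itself."""
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    for _ in range(rounds):
        # All evaluations within a round are independent, so a stage-wise
        # scheme like this is trivial to parallelize.
        for _ in range(samples):
            x = rng.uniform(lo, hi)
            y = objective(x)
            if y < best_y:
                best_x, best_y = x, y
        radius = (hi - lo) / 4  # halve the interval around the incumbent
        lo, hi = max(lo, best_x - radius), min(hi, best_x + radius)
    return best_x

print(shrink_search(0.0, 1.0))
```

Bayesian Optimization differs in that it replaces the fixed shrinking rule with a probabilistic surrogate model: each new hyperparameter setting is chosen one at a time to maximize expected improvement under the surrogate, which is why it converges in fewer evaluations but is inherently sequential.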