SCCS Colloquium - Jul 22, 2020
Date: July 22, 2020
Room: Online (password: SCCS)
Time: 15:00 - 16:00
Mihai Zorca: Training Deep Convolutional Neural Networks on the GPU Using a Second-Order Optimizer
Bachelor's thesis submission talk. Mihai is advised by Severin Reiz.
Deep Convolutional Neural Networks (CNNs) are a prominent class of powerful and flexible machine learning models. Training such networks requires vast compute resources, so many specialized algorithms have been developed to speed up learning. First-order methods (using just the gradient) are the most popular, but second-order algorithms (using Hessian information) are gaining importance. In this thesis we give an overview of the most common first-order optimizers and how they are used to train networks. We then build upon a sample second-order algorithm, which we call EHNewton. By integrating it into the TensorFlow platform, the new method can act as a drop-in replacement for standard optimizers. We make use of this by training one CNN model from each of the Inception, ResNet, and MobileNet architectures. Due to technical limitations we train only the last layers; nevertheless, EHNewton compares favorably to the first-order algorithms when training all three CNNs on ImageNet.
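To illustrate the distinction between the two classes of optimizers discussed in the talk, the sketch below contrasts a first-order gradient step with a second-order Newton step on a toy quadratic objective. This is only a minimal illustration of the general principle (rescaling the gradient by the inverse Hessian); it is not the thesis's EHNewton method, and all names are illustrative.

```python
import numpy as np

# Toy objective: f(w) = 0.5 * w^T A w - b^T w, with A symmetric positive definite.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])

def gradient(w):
    # First-order information: the gradient of f at w.
    return A @ w - b

def hessian(w):
    # Second-order information: the Hessian of f (constant for a quadratic).
    return A

w = np.zeros(2)

# First-order step: move along the negative gradient, scaled by a learning rate.
w_first_order = w - 0.1 * gradient(w)

# Second-order (Newton) step: rescale the gradient by the inverse Hessian.
# For a quadratic objective this lands exactly on the minimizer in one step.
w_newton = w - np.linalg.solve(hessian(w), gradient(w))

print(w_newton)
```

For non-quadratic objectives such as CNN losses, the Hessian changes at every point and is far too large to invert exactly, which is why practical second-order methods rely on approximations of the Hessian or of the Newton step.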
Keywords: Convolutional Neural Networks, Second-Order Optimization, Machine Learning, TensorFlow