Improving the adaptive moment estimation optimization methods for modern machine learning
Date
2020
Abstract
Optimization algorithms for Neural Networks (NNs) have become a crucial component of Artificial Intelligence. Within neural network architectures, adapting learning rates via an optimizer is a fundamental step in training, and the choice of optimizer can substantially affect a network's performance. The Adaptive Moment Estimation method (ADAM) is one of the most widely used and effective algorithms for training neural networks. However, ADAM has shown substandard convergence in certain cases due to fluctuating and unstable learning rates. In this thesis, we integrate two state-of-the-art techniques: the Adaptive Moment Estimation method and a normalized momentum. We apply the normalized momentum of the previous gradients to the current ADAM update in order to preserve and stabilize the direction of the learning rate. We evaluate the proposed method, ADAM Plus, on the MNIST digit recognition database, the CIFAR-10 database, a Bayesian Neural Network, and U-Net models. The empirical results show that the proposed method achieves better convergence and accuracy, and ADAM Plus outperforms several adaptive optimization algorithms across multiple convolutional neural network applications.
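The abstract describes the idea only at a high level: a normalized momentum of past gradients is folded into the standard ADAM update to steady its direction. The sketch below is a minimal, hypothetical illustration of that idea in NumPy, not the thesis's exact formulation; the decay rate `beta3`, the unit-norm averaging of past gradients, and the way the normalized momentum is combined with the Adam step are all assumptions made here for illustration.

```python
import numpy as np

def adam_plus_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
                   beta3=0.9, eps=1e-8):
    """One hypothetical ADAM-Plus-style update: the standard Adam moment
    estimates plus a normalized momentum of previous gradients mixed into
    the update direction (illustrative only)."""
    m, v, d, t = state["m"], state["v"], state["d"], state["t"] + 1

    # Standard Adam first and second moment estimates with bias correction.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)

    # Assumed normalized momentum: an exponential average of unit-norm
    # gradients, intended to stabilize the direction of the update.
    d = beta3 * d + (1 - beta3) * grad / (np.linalg.norm(grad) + eps)

    # Combine the usual Adam step with the normalized momentum direction.
    update = m_hat / (np.sqrt(v_hat) + eps) + d
    new_param = param - lr * update

    state.update(m=m, v=v, d=d, t=t)
    return new_param, state

# Usage: initialize the optimizer state once, then call the step per iteration.
w = np.zeros(5)
state = {"m": np.zeros(5), "v": np.zeros(5), "d": np.zeros(5), "t": 0}
g = np.random.randn(5)          # stand-in for a computed gradient
w, state = adam_plus_step(w, g, state)
```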
Keywords
Machine learning, Optimization algorithms, Neural networks