Paper
21 December 2021 A comparative study of recently deep learning optimizers
Yan Liu, Maojun Zhang, Zhiwei Zhong, Xiangrong Zeng, Xin Long
Proceedings Volume 12156, International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2021); 121560F (2021) https://doi.org/10.1117/12.2626430
Event: International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2021), 2021, Sanya, China
Abstract
Deep learning has achieved great success in computer vision, natural language processing, recommendation systems, and other fields. However, deep neural network (DNN) models are highly complex, often containing millions of parameters and tens or even hundreds of layers. Optimizing DNN weights easily falls into local optima, making good performance hard to achieve. Choosing an effective optimizer that yields networks with higher precision and stronger generalization ability is therefore of great significance. In this article, we review popular historical and state-of-the-art optimizers and classify them into three main streams: first-order optimizers, which accelerate the convergence of stochastic gradient descent and/or adaptively adjust learning rates; second-order optimizers, which exploit second-order information about the loss landscape to help escape local optima; and proxy optimizers, which handle non-differentiable loss functions by combining with a proxy algorithm. We also summarize the first- and second-order moments used by different optimizers. Moreover, we provide an insightful comparison of several optimizers on image classification. The results show that first-order optimizers such as AdaMod and Ranger not only have low computational cost but also converge quickly, while optimizers that introduce curvature information, such as AdaBelief and Apollo, generalize better, especially when optimizing complex networks.
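The first- and second-order moments mentioned in the abstract can be illustrated with a minimal sketch of an Adam-style update (a representative first-order adaptive optimizer, not the paper's own code): the first moment is an exponential moving average of gradients, and the second moment tracks their uncentered variance, which rescales the learning rate per parameter.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. m is the first-order moment (momentum),
    v the second-order moment (adaptive learning rate scaling);
    both are bias-corrected before the parameter step."""
    m = beta1 * m + (1 - beta1) * grad        # first moment: EMA of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment: EMA of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for zero-initialized m
    v_hat = v / (1 - beta2 ** t)              # bias correction for zero-initialized v
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy example: minimize f(x) = x^2 starting from x = 1.0
theta = np.array([1.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2 * theta              # analytic gradient of x^2
    theta, m, v = adam_step(theta, grad, m, v, t)
```

Variants surveyed in the paper differ chiefly in how these moments are computed or used: AdaMod clips the per-parameter learning rate, while AdaBelief replaces `v` with an EMA of the squared deviation of the gradient from `m`, injecting a notion of curvature.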
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yan Liu, Maojun Zhang, Zhiwei Zhong, Xiangrong Zeng, and Xin Long "A comparative study of recently deep learning optimizers", Proc. SPIE 12156, International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2021), 121560F (21 December 2021); https://doi.org/10.1117/12.2626430
KEYWORDS
Stochastic processes, Lithium, Image classification, Neural networks, Optimization (mathematics), Network architectures, Quantization