SAdam: A Variant of Adam for Strongly Convex Functions

Guanghui Wang; Shiyin Lu; Quan Cheng; Wei-wei Tu; Lijun Zhang

SAdam: A Variant of Adam for Strongly Convex Functions

Guanghui Wang, Shiyin Lu, Quan Cheng, Wei-wei Tu, Lijun Zhang

Published: 20 Dec 2019, Last Modified: 22 Jun 2025ICLR 2020 Conference Blind SubmissionReaders: Everyone

Keywords: Online convex optimization, Adaptive online learning, Adam

TL;DR: A variant of Adam for strongly convex functions

Abstract: The Adam algorithm has become extremely popular for large-scale machine learning. Under convexity condition, it has been proved to enjoy a data-dependent $O(\sqrt{T})$ regret bound where $T$ is the time horizon. However, whether strong convexity can be utilized to further improve the performance remains an open problem. In this paper, we give an affirmative answer by developing a variant of Adam (referred to as SAdam) which achieves a data-dependent $O(\log T)$ regret bound for strongly convex functions. The essential idea is to maintain a faster decaying yet under controlled step size for exploiting strong convexity. In addition, under a special configuration of hyperparameters, our SAdam reduces to SC-RMSprop, a recently proposed variant of RMSprop for strongly convex functions, for which we provide the first data-dependent logarithmic regret bound. Empirical results on optimizing strongly convex functions and training deep networks demonstrate the effectiveness of our method.

Code: https://github.com/SAdam-ICLR2020/codes

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/sadam-a-variant-of-adam-for-strongly-convex/code)

Original Pdf: pdf

12 Replies

Loading