The Anytime Convergence of Stochastic Gradient Descent with Momentum: From a Continuous-Time Perspective

Yasong Feng, Yifan Jiang, Tianyu Wang, Zhiliang Ying
Feb 2024
Abstract
In this paper, we study the stochastic optimization problem from a continuous-time perspective. We propose a stochastic first-order algorithm, called Stochastic Gradient Descent with Momentum (SGDM), and show that the trajectory of SGDM, despite its stochastic nature, converges to a deterministic second-order Ordinary Differential Equation (ODE) in $L_2$-norm, as the stepsize goes to zero. The connection between the ODE and the algorithm results in delightful patterns in the discrete-time convergence analysis. More specifically, we develop convergence results for the ODE through a Lyapunov function, and translate the whole argument to the discrete-time case. This approach yields a novel anytime convergence guarantee for stochastic gradient methods. Precisely, we prove that, for any $\beta$, there exists $k_0$ such that the sequence $\{ x_k \}$ governed by running SGDM on a smooth convex function $f$ satisfies $\mathbb{P}\left(\mathop{\bigcap}\limits_{k=k_0}^\infty\bigg\{f (x_k) - f^* \leq \frac{C\log k\log(2/\beta)}{\sqrt{k}}\bigg\}\right)\geq 1-\beta$, where $f^*=\min_{x\in\mathbb{R}^n} f(x)$. Our contribution is significant in that it better captures the convergence behavior across the entire trajectory of the algorithm, rather than at a single iterate.
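The abstract does not spell out the exact SGDM update rule or stepsize schedule, so the following is only a minimal sketch of a momentum-type stochastic gradient iteration of the kind the result concerns. The stochastic gradient oracle `stoch_grad`, the stepsize `eta`, and the momentum coefficient `beta` are illustrative assumptions, not the paper's parameterization.

```python
import numpy as np

def sgdm(stoch_grad, x0, eta=1e-2, beta=0.9, num_iters=1000):
    """Sketch of stochastic gradient descent with momentum.

    stoch_grad(x) is assumed to return an unbiased noisy gradient of f at x.
    The constant stepsize eta and momentum coefficient beta are illustrative
    choices, not the schedule analyzed in the paper.
    """
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)            # momentum (velocity) buffer
    trajectory = [x.copy()]         # keep the whole trajectory: the guarantee
                                    # above is "anytime", i.e. it holds jointly
                                    # for all iterates k >= k_0
    for _ in range(num_iters):
        g = stoch_grad(x)           # noisy gradient estimate
        v = beta * v + g            # heavy-ball style momentum accumulation
        x = x - eta * v             # descent step
        trajectory.append(x.copy())
    return x, trajectory

# Usage example: noisy quadratic f(x) = 0.5 * ||x||^2, so grad f(x) = x.
rng = np.random.default_rng(0)
noisy_grad = lambda x: x + 0.1 * rng.standard_normal(x.shape)
x_final, traj = sgdm(noisy_grad, x0=np.ones(5))
```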