
An Adaptive Deep RL Method for Non-Stationary Environments with Piecewise Stable Context

Xiaoyu Chen, Xiangming Zhu, Yufeng Zheng, ... +7, Tie-Yan Liu
Dec 2022
Abstract
One of the key challenges in deploying RL to real-world applications is adapting to variations in unknown environment contexts, such as changing terrain in robotic tasks and fluctuating bandwidth in congestion control. Existing work on adaptation to unknown environment contexts assumes either that the context is the same for the whole episode or that the context variables are Markovian. However, in many real-world applications, the environment context usually stays stable for a stochastic period and then changes abruptly and unpredictably within an episode, resulting in a segment structure that existing work fails to address. To leverage the segment structure of piecewise stable context in real-world applications, in this paper we propose a Segmented Context Belief Augmented Deep (SeCBAD) RL method. Our method jointly infers the belief distribution over the latent context and the posterior over segment length, and it performs more accurate belief context inference by using only the data observed within the current context segment. The inferred belief context can be used to augment the state, leading to a policy that can adapt to abrupt variations in context. We demonstrate empirically that SeCBAD infers context segment length accurately and outperforms existing methods on a toy grid world environment and MuJoCo tasks with piecewise-stable context.
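To make the idea of jointly tracking a context belief and a segment-length posterior concrete, here is a minimal tabular sketch. It is not the SeCBAD implementation, which the abstract does not specify. It assumes a small set of discrete contexts, a known Gaussian observation mean per context, and a constant per-step probability (`hazard`) that the context changes. All names (`gaussian_pdf`, `update`, `run_filter`, `hazard`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def gaussian_pdf(x, means, sigma):
    """Likelihood of scalar observation x under each candidate context mean."""
    z = (x - means) / sigma
    return np.exp(-0.5 * z * z) / (sigma * np.sqrt(2.0 * np.pi))


def update(joint, x, means, sigma, hazard, prior):
    """One filtering step for the joint posterior over (segment length, context).

    joint[r, c] is the probability that the current segment already holds
    r earlier observations and that its context is c.
    """
    lik = gaussian_pdf(x, means, sigma)
    new = np.zeros((joint.shape[0] + 1, joint.shape[1]))
    # Segment continues: its length grows by one and the context is unchanged.
    new[1:] = joint * (1.0 - hazard) * lik
    # Segment restarts: length resets and the context is redrawn from the prior.
    new[0] = hazard * joint.sum() * prior * lik
    return new / new.sum()


def run_filter(xs, means, sigma=0.5, hazard=0.05, prior=None):
    """Filter a sequence of observations and return the final joint posterior."""
    means = np.asarray(means, dtype=float)
    if prior is None:
        # Assumption: uniform prior over contexts at every segment start.
        prior = np.full(means.shape, 1.0 / means.size)
    # The first observation opens the first segment (zero earlier observations).
    joint = prior * gaussian_pdf(xs[0], means, sigma)
    joint = (joint / joint.sum())[None, :]
    for x in xs[1:]:
        joint = update(joint, x, means, sigma, hazard, prior)
    return joint


# Toy episode: the context switches abruptly halfway through.
xs = [0.0] * 20 + [3.0] * 20
joint = run_filter(xs, means=[0.0, 3.0])
context_belief = joint.sum(axis=0)            # marginal belief over context
segment_length_posterior = joint.sum(axis=1)  # marginal over segment length
print(context_belief, int(np.argmax(segment_length_posterior)))
```

Marginalizing the joint table over segment length gives the context belief that could be appended to the state, and marginalizing over context gives the segment-length posterior. Because long segments receive negligible weight right after an abrupt change, the belief is effectively driven by observations from the current segment only.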
