This website requires JavaScript.

Distilling Conditional Diffusion Models for Offline Reinforcement Learning through Trajectory Stitching

Shangzhe LiXinhua Zhang
Feb 2024
0被引用
1笔记
摘要原文
Deep generative models have recently emerged as an effective approach to offline reinforcement learning. However, their large model size poses challenges in computation. We address this issue by proposing a knowledge distillation method based on data augmentation. In particular, high-return trajectories are generated from a conditional diffusion model, and they are blended with the original trajectories through a novel stitching algorithm that leverages a new reward generator. Applying the resulting dataset to behavioral cloning, the learned shallow policy whose size is much smaller outperforms or nearly matches deep generative planners on several D4RL benchmarks.
展开全部
机器翻译
AI理解论文&经典十问
图表提取
参考文献
发布时间 · 被引用数 · 默认排序
被引用
发布时间 · 被引用数 · 默认排序
社区问答