
Toward Self-learning End-to-End Task-Oriented Dialog Systems

Xiaoying Zhang, Baolin Peng, Jianfeng Gao, Helen Meng
Jan 2022
Abstract
End-to-end task bots are typically learned over a static and usually limited-size corpus. However, when deployed in dynamic, changing, and open environments to interact with users, task bots tend to fail when confronted with data that deviate from the training corpus, i.e., out-of-distribution samples. In this paper, we study the problem of automatically adapting task bots to changing environments by learning from human-bot interactions with minimum or zero human annotations. We propose SL-AGENT, a novel self-learning framework for building end-to-end task bots. SL-AGENT consists of a dialog model and a pre-trained reward model to predict the quality of an agent response. It enables task bots to automatically adapt to changing environments by learning from the unlabeled human-bot dialog logs accumulated after deployment via reinforcement learning with the incorporated reward model. Experimental results on four well-studied dialog tasks show the effectiveness of SL-AGENT to automatically adapt to changing environments, using both automatic and human evaluations. We will release code and data for further research.
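The abstract's core idea, updating a dialog policy from unlabeled logs using scores from a reward model, can be illustrated with a toy sketch. This is not the authors' implementation: the three candidate responses, the hard-coded `reward_model` scores, the learning rate, and the baseline are all hypothetical, and a plain REINFORCE-style update over a softmax policy stands in for whatever training procedure the paper actually uses.

```python
import math


def softmax(logits):
    """Convert logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reward_model(response):
    """Hypothetical stand-in for a pre-trained reward model.

    A real reward model would score a response given the dialog context;
    here the scores are hard-coded for illustration only.
    """
    scores = {"book_table": 0.9, "ask_again": 0.4, "off_topic": 0.1}
    return scores[response]


def reinforce_update(logits, responses, logged_response, lr=0.5, baseline=0.5):
    """One REINFORCE-style step on a single logged response.

    The gradient of log pi(a) with respect to the logits of a softmax
    policy is onehot(a) - probs; it is scaled by (reward - baseline).
    """
    probs = softmax(logits)
    advantage = reward_model(logged_response) - baseline
    chosen = responses.index(logged_response)
    return [
        logit + lr * advantage * ((1.0 if i == chosen else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


def adapt(logits, responses, dialog_logs, epochs=20):
    """Replay unlabeled logs; the reward model supplies the only signal."""
    for _ in range(epochs):
        for logged_response in dialog_logs:
            logits = reinforce_update(logits, responses, logged_response)
    return logits


# Hypothetical candidate responses and post-deployment logs (no human labels).
RESPONSES = ["book_table", "ask_again", "off_topic"]
LOGS = ["book_table", "off_topic", "ask_again", "book_table", "off_topic"]

final_probs = softmax(adapt([0.0, 0.0, 0.0], RESPONSES, LOGS))
```

In this toy setup the policy shifts probability toward the response the reward model scores highest, without any human annotation of the logs. A real task bot would replace the three-way softmax with a generative dialog model and the lookup table with a learned scorer.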