Efficient Dialogue Complementary Policy Learning via Deep Q-network Policy and Episodic Memory Policy

Zhao, Yangyang; Wang, Zhenyu; Zhu, Changxi; Wang, Shihan

DOI: https://doi.org/10.18653/v1/2021.emnlp-main.354

(2021) Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 4311 - 4323

(Part of book)

Abstract

Deep reinforcement learning has shown great potential in training dialogue policies. However, its favorable performance comes at the cost of many rounds of interaction. Most of the existing dialogue policy methods rely on a single learning system, while the human brain has two specialized learning and memory systems, supporting to ...

Publisher: Association for Computational Linguistics (ACL)

(Peer reviewed)
