Abstract
Reinforcement learning (RL) has emerged as a key technique for designing dialogue policies. However, action space inflation in dialogue tasks has led to a heavy decision burden and incoherence problems for dialogue policies. In this paper, we propose a novel decomposed deep Q-network (D2Q) that exploits the natural structure of
... read more