Abstract
Model-Based Reinforcement Learning (MBRL) can
greatly profit from using world models for estimating the consequences of selecting particular actions: an animat can construct such a model from its experiences and use it for computing rewarding behavior. We study the problem of collecting useful experiences through exploration in stochastic environments. Towards this
... read more