Reinforcement Learning (RL) is a promising approach for solving various control, optimization, and sequential decision-making tasks. But designing reward functions for complex tasks (e.g., with multiple objectives and safety constraints) can be challenging for most users. In this paper we propose a specification language (Inkling Goal Specification) for complex control and optimization tasks, which is very close to natural language and allows a practitioner to focus on problem specification instead of reward function hacking. We include a set of experiments showing that the proposed method makes it easy to specify a wide range of real-world tasks, and that the generated reward is able to drive policy training to achieve the specified goal. The proposal also includes a novel automaton-guided dense reward generation that can be used to drive

Author(s) : Xuan Zhao, Marcos Campos

Keywords : tasks - reward - training - specification - complex
