Behavior-Guided Actor-Critic (BAC) is an off-policy actor-critic deep RL algorithm. BAC mathematically formulates the behavior of the policy through autoencoders. The agent is encouraged to change its behavior consistently towards less-visited state-action pairs while attaining good performance by maximizing the expected discounted sum of rewards. Results show that BAC performs considerably better than several cutting-edge learning algorithms. One prominent aspect of our approach is that, in contrast to maximum entropy deep reinforcement learning algorithms, it is applicable to both stochastic and deterministic actors.
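The abstract does not say how the autoencoder signal enters the objective, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that the reconstruction error of an autoencoder trained on visited state-action pairs can serve as a proxy for how rarely a pair has been visited, and that this error is added to the reward with a weight `beta`. The tiny linear autoencoder, the function names, `beta`, and the toy data are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only, NOT the BAC implementation.
# ASSUMPTION: autoencoder reconstruction error on a (state, action) vector
# is used as a proxy for how rarely that pair has been visited.

def encode(w, x):
    # Linear encoder: project the pair onto a single latent value.
    return sum(wi * xi for wi, xi in zip(w, x))

def decode(v, z):
    # Linear decoder: map the latent value back to the input space.
    return [vi * z for vi in v]

def reconstruction_error(w, v, x):
    # Squared error between the pair and its reconstruction.
    x_hat = decode(v, encode(w, x))
    return sum((h - xi) ** 2 for h, xi in zip(x_hat, x))

def train_autoencoder(data, epochs=500, lr=0.05):
    # Plain per-sample gradient descent on the squared reconstruction error.
    # Fixed initial weights for 2-D inputs keep the example deterministic.
    w = [0.5, 0.1]
    v = [0.1, 0.5]
    for _ in range(epochs):
        for x in data:
            z = encode(w, x)
            err = [vi * z - xi for vi, xi in zip(v, x)]
            grad_v = [2.0 * e * z for e in err]
            back = 2.0 * sum(e * vi for e, vi in zip(err, v))
            grad_w = [back * xi for xi in x]
            v = [vi - lr * g for vi, g in zip(v, grad_v)]
            w = [wi - lr * g for wi, g in zip(w, grad_w)]
    return w, v

def shaped_reward(reward, w, v, x, beta=0.1):
    # ASSUMPTION: the bonus is simply added to the environment reward.
    return reward + beta * reconstruction_error(w, v, x)

# Toy "visited" pairs, each a [state, action] vector along one direction.
visited = [[t, t] for t in (-1.0, -0.5, 0.5, 1.0)]
w, v = train_autoencoder(visited)

familiar = [1.0, 1.0]   # resembles the visited pairs
novel = [1.0, -1.0]     # unlike anything the autoencoder was trained on

print(reconstruction_error(w, v, familiar))
print(reconstruction_error(w, v, novel))
```

In this toy setting the pair unlike the training data reconstructs poorly and so receives the larger bonus, which is the sense in which such a signal could steer an agent towards less-visited state-action pairs. Nothing in the sketch depends on the actor being stochastic.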

Author(s) : Ammar Fayad, Majd Ibrahim

Keywords : behavior - learning - critic - bac - guided

