Transient Non-stationarity and Generalisation in Deep Reinforcement Learning

Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments. We propose Iterated Relearning (ITER) to improve the generalisation of deep RL agents. ITER augments standard RL training by repeated knowledge transfer of the current policy into a freshly initialised network. Experimentally, we show that ITER improves performance on the challenging generalisation benchmarks ProcGen and Multiroom.
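
The repeated transfer step can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a tabular softmax policy (one row of logits per state) instead of a deep network, treats the RL phase as an opaque callback, and runs the transfer as plain distillation that minimises KL(teacher || student). All names here (`fresh_policy`, `distill`, `iter_training`, `rl_update`) are hypothetical and chosen for illustration only.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into action probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_kl(teacher, student):
    """Average KL(teacher || student) over all states."""
    total = 0.0
    for t_logits, s_logits in zip(teacher, student):
        p, q = softmax(t_logits), softmax(s_logits)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return total / len(teacher)


def fresh_policy(n_states, n_actions, rng):
    """Freshly initialised policy: small random logits for every state."""
    return [[rng.uniform(-0.01, 0.01) for _ in range(n_actions)]
            for _ in range(n_states)]


def distill(teacher, student, steps=500, lr=0.5):
    """Transfer the teacher policy into the student by gradient descent.

    For softmax policies, the gradient of KL(teacher || student) with
    respect to the student logits is softmax(student) - softmax(teacher).
    """
    for _ in range(steps):
        for t_logits, s_logits in zip(teacher, student):
            p, q = softmax(t_logits), softmax(s_logits)
            for a in range(len(s_logits)):
                s_logits[a] -= lr * (q[a] - p[a])
    return student


def iter_training(rl_update, n_generations, n_states, n_actions, seed=0):
    """Alternate an RL phase with transfer into a fresh policy.

    `rl_update` is a placeholder (an assumption of this sketch) standing in
    for any standard RL training phase; it takes and returns a policy.
    """
    rng = random.Random(seed)
    policy = fresh_policy(n_states, n_actions, rng)
    for _ in range(n_generations):
        policy = rl_update(policy)                        # standard RL training
        student = fresh_policy(n_states, n_actions, rng)  # fresh initialisation
        policy = distill(policy, student)                 # knowledge transfer
    return policy
```

Calling `distill(teacher, fresh_policy(...))` should drive `mean_kl` towards zero, so the student reproduces the teacher's behaviour while starting from new parameters. How the transfer is scheduled relative to RL training in the actual method is not specified in this summary, so the strictly sequential loop above is only one possible reading.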

Code: None

Keywords: performance - generalisation - iter - improve - deep
