from Demonstrations (fD)
Seungjae Ryan Lee edited this page Mar 30, 2019 · 1 revision
DDPGfD (Vecerik et al., 2017) is an imitation learning algorithm that adds demonstration data to the experience replay buffer. DDPGfD also improves on DDPG by (1) using prioritized experience replay (Schaul et al., 2015), (2) adding n-step returns, (3) performing multiple learning updates per environment step, and (4) adding L2 regularizers to the actor and critic losses. We incorporated these improvements into TD3 and SAC and found that they dramatically improve the performance of both algorithms.
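To make improvements (1) and (2) concrete, here is a minimal sketch in plain Python. It is not this repository's implementation: the names `n_step_return`, `priority`, and `sample_batch` are hypothetical, and the priority rule is a simplified assumption (absolute TD error plus a small constant, plus a fixed bonus for demonstration transitions), not necessarily the exact form used in the paper.

```python
import random


def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Discounted sum of n rewards plus the discounted bootstrap value.

    rewards: [r_t, ..., r_{t+n-1}]
    bootstrap_value: critic estimate for the state reached after n steps
    """
    g = bootstrap_value
    # Fold from the last reward backwards: g = r + gamma * g
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def priority(td_error, is_demo, eps=1e-3, demo_bonus=1.0):
    """Simplified priority (assumption): |TD error| + eps, plus a bonus for demos."""
    return abs(td_error) + eps + (demo_bonus if is_demo else 0.0)


def sample_batch(buffer, priorities, batch_size, alpha=0.6, rng=random):
    """Sample transitions with probability proportional to priority ** alpha."""
    weights = [p ** alpha for p in priorities]
    return rng.choices(buffer, weights=weights, k=batch_size)


# Usage: demonstration and agent transitions share one buffer.
buffer = ["demo_0", "demo_1", "agent_0", "agent_1"]
is_demo = [True, True, False, False]
td_errors = [0.2, 0.1, 0.2, 0.1]
prios = [priority(td, d) for td, d in zip(td_errors, is_demo)]
batch = sample_batch(buffer, prios, batch_size=3, rng=random.Random(0))
```

Because of the bonus, a demonstration transition always receives a higher priority than an agent transition with the same TD error, so demonstrations keep being replayed throughout training.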