from Demonstrations (fD)

DDPGfD (Vecerik et al., 2017) is an imitation learning algorithm that adds demonstration data to the experience replay buffer. DDPGfD also improves on DDPG by (1) using prioritized experience replay (Schaul et al., 2015), (2) adding n-step returns, (3) performing multiple learning updates per environment step, and (4) adding L2 regularizers to the actor and critic losses. We incorporated these improvements into TD3 and SAC and found that they dramatically improve performance.
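
To make two of the ingredients above concrete, here is a minimal sketch of an n-step return and of a replay buffer that keeps demonstrations permanently and samples by priority. It is not the code used in this repository: the names `n_step_return`, `DemoReplayBuffer`, `demo_bonus`, `alpha`, and `eps` are hypothetical, the priority is simplified to the absolute TD error plus a small constant (plus a bonus for demonstration transitions), and importance-sampling weights are left out.

```python
import random
from collections import deque


def n_step_return(rewards, gamma, bootstrap_value=0.0):
    """Return r_0 + gamma*r_1 + ... + gamma^n * bootstrap_value."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g


class DemoReplayBuffer:
    """Hypothetical buffer: demos are never evicted, agent data is FIFO."""

    def __init__(self, capacity, demos, alpha=0.6, eps=1e-3, demo_bonus=1.0):
        self.demos = list(demos)             # kept for the whole run
        self.agent = deque(maxlen=capacity)  # oldest agent data is dropped
        self.alpha, self.eps, self.demo_bonus = alpha, eps, demo_bonus
        self.demo_prios = [1.0 + demo_bonus] * len(self.demos)
        self.agent_prios = deque(maxlen=capacity)

    def add(self, transition):
        # New transitions get the current maximum priority so that they
        # are likely to be replayed at least once.
        prios = self.demo_prios + list(self.agent_prios)
        self.agent.append(transition)
        self.agent_prios.append(max(prios, default=1.0))

    def sample(self, batch_size):
        items = self.demos + list(self.agent)
        prios = self.demo_prios + list(self.agent_prios)
        weights = [p ** self.alpha for p in prios]
        idxs = random.choices(range(len(items)), weights=weights, k=batch_size)
        return idxs, [items[i] for i in idxs]

    def update_priorities(self, idxs, td_errors):
        # Assumes this is called right after sample(), before any add(),
        # so that the sampled indices still point at the same transitions.
        n_demo = len(self.demos)
        for i, err in zip(idxs, td_errors):
            if i < n_demo:
                self.demo_prios[i] = abs(err) + self.eps + self.demo_bonus
            else:
                self.agent_prios[i - n_demo] = abs(err) + self.eps
```

In a training loop, learning multiple times per environment step would amount to calling `sample` and `update_priorities` several times after each `add`. The demonstration bonus keeps demo transitions from fading out of the sampled batches once their TD errors become small.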

