This repository contains code for various reinforcement learning (RL) algorithms that I reproduced while learning RL. The code was tested on Colab.
If GitHub fails to render the Jupyter notebooks (a known GitHub issue), click here to view them on Jupyter's nbviewer.
| Algorithms | Discrete | Continuous | Multithreaded | Multiprocessing | Tested on |
|---|---|---|---|---|---|
| DQN | ✔️ | | | | CartPole-v0 |
| Double DQN (DDQN) | ✔️ | | | | CartPole-v0 |
| Dueling DDQN | ✔️ | | | | CartPole-v0 |
| Dueling DDQN + PER | ✔️ | | | | CartPole-v0 |
| A3C (1) | ✔️ | ✔️ | ✔️ | ✔️(3) | CartPole-v0, Pendulum-v0 |
| DPPO (2) | | ✔️ | | ✔️(3) | Pendulum-v0 |
| RND + PPO | | ✔️ | | | MountainCarContinuous-v0 (4), Pendulum-v0 (5) |
(1): N-step returns are used for the critic's target.
(2): GAE is used to compute the TD(λ) return (for the critic's target) and the policy's advantage.
(3): Distributed TensorFlow and Python's multiprocessing package are used.
(4): State featurization (approximating the feature map of an RBF kernel) is used.
(5): A fast-slow LSTM with an overly simplified VAE-like "variational unit" (VU) is used.
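
As a rough illustration of notes (1) and (2), here is a minimal sketch in plain Python of n-step returns and GAE. It is not taken from the notebooks: the function names, the default `gamma` and `lam` values, and the assumption that a rollout contains no episode boundary are all illustrative.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns for one rollout, usable as critic targets.

    bootstrap_value is the critic's estimate V(s_n) for the state after the
    last step; pass 0.0 if the rollout ended in a terminal state.
    """
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def gae(rewards, values, bootstrap_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimation for one rollout.

    Returns (advantages, lambda_returns), where lambda_returns[t] is
    advantages[t] + values[t], i.e. the TD(lambda) target for the critic.
    Assumes no episode boundary inside the rollout (hypothetical simplification).
    """
    advantages = [0.0] * len(rewards)
    next_value = bootstrap_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    lambda_returns = [a + v for a, v in zip(advantages, values)]
    return advantages, lambda_returns


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    values = [0.5, 0.5, 0.5]
    print(n_step_returns(rewards, bootstrap_value=0.0, gamma=0.5))
    print(gae(rewards, values, bootstrap_value=0.0, gamma=0.5, lam=1.0))
```

With `lam=1.0` the TD(λ) returns from `gae` coincide with the n-step returns, and with `lam=0.0` the advantages reduce to one-step TD errors, which is a handy sanity check when comparing the two targets.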
The misc folder contains related example code that I put together while learning RL. See the README.md in the misc folder for more details.
Check out my blog for more information on my repositories.