Implementation of Deep Deterministic Policy Gradient (DDPG) with Prioritized Experience Replay (PER)

Jonathan-Pearce/DDPG_PER

Continuous Control With Deep Reinforcement Learning - DDPG with Prioritized Experience Replay

Jonathan Pearce, McGill University

Comp 767 Reinforcement Learning, Final Project (Winter 2020)

Deep Deterministic Policy Gradient (DDPG) is currently one of the most popular deep reinforcement learning algorithms for continuous control. Inspired by the Deep Q-network (DQN) algorithm, which works with discrete action spaces, DDPG uses a replay buffer to stabilize Q-learning. It has been demonstrated that prioritized experience replay (PER) can improve the performance of DQN. We investigate whether prioritized experience replay can have a similar effect on a continuous control algorithm such as DDPG. In this project we reproduced the DDPG algorithm, integrated prioritized experience replay with DDPG, and evaluated both algorithms on two popular benchmark tasks for continuous control methods. Our experiments show that prioritized experience replay can improve the performance of the DDPG algorithm.
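The core change PER makes to DDPG's replay buffer can be sketched as follows. This is a minimal, list-based illustration of proportional prioritization (sampling probability P(i) ∝ p_i^α, with importance-sampling weights to correct the resulting bias) and is not taken from this repository's code; the class and method names are illustrative, and a practical implementation would typically use a sum-tree for O(log N) sampling.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Sketch of proportional prioritized experience replay.

    Transitions are sampled with probability proportional to priority^alpha,
    and each sampled transition carries an importance-sampling weight
    (N * P(i))^(-beta), normalized by the batch maximum.
    """

    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling
        self.beta = beta        # how strongly IS weights correct the bias
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current max priority so each is
        # replayed at least once before its TD error is known.
        max_p = self.priorities[:len(self.data)].max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        n = len(self.data)
        probs = self.priorities[:n] ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(n, batch_size, p=probs)
        # Importance-sampling weights correct for non-uniform sampling.
        weights = (n * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        batch = [self.data[i] for i in idx]
        return batch, idx, weights

    def update_priorities(self, idx, td_errors):
        # After the critic update, priorities become |TD error| + eps.
        self.priorities[idx] = np.abs(td_errors) + self.eps
```

In the DDPG critic update, the returned weights multiply the per-sample squared TD error (loss = mean(w_i * delta_i^2)), and the absolute TD errors are fed back via `update_priorities` so that surprising transitions are replayed more often.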

For the full project report, see report.pdf in this repository.
