Implementation of REINFORCE with Baseline and Monte-Carlo Tree Search algorithms along with Multi-Armed Bandits.

razor08/RL-Project

We present implementations of the REINFORCE with Baseline and Monte-Carlo Tree Search algorithms on three MDPs: Cartpole, CS687-Gridworld and Mountain Car. Mountain Car, a previously unexplored MDP, was implemented for extra credit. Also for extra credit, we present a performance analysis of four algorithms on multi-armed bandits: Epsilon Greedy, Epsilon Decreasing Greedy, Upper Confidence Bound (UCB) and Thompson Sampling.
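To make the four bandit strategies named above concrete, here is a minimal sketch on a Bernoulli bandit. It is not taken from this repository: the function names (`run_bandit`, `epsilon_greedy`, `epsilon_decreasing`, `ucb1`, `thompson`), the arm means, and the decay schedule `min(1, decay / t)` are all assumptions chosen for illustration, and the UCB variant shown is the standard UCB1 rule.

```python
import math
import random


def run_bandit(select_arm, true_means, steps=2000, seed=0):
    """Play a Bernoulli bandit and return the number of pulls per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k      # pulls per arm
    successes = [0] * k   # reward-1 outcomes per arm
    for t in range(1, steps + 1):
        arm = select_arm(t, counts, successes, rng)
        reward = 1 if rng.random() < true_means[arm] else 0
        counts[arm] += 1
        successes[arm] += reward
    return counts


def _greedy_arm(counts, successes):
    # Arm with the highest empirical mean; unpulled arms count as 0.0.
    means = [s / c if c else 0.0 for s, c in zip(successes, counts)]
    return max(range(len(counts)), key=means.__getitem__)


def epsilon_greedy(epsilon=0.1):
    # Explore uniformly with fixed probability epsilon, otherwise exploit.
    def select(t, counts, successes, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(counts))
        return _greedy_arm(counts, successes)
    return select


def epsilon_decreasing(decay=50.0):
    # Assumed schedule: exploration probability min(1, decay / t).
    def select(t, counts, successes, rng):
        if rng.random() < min(1.0, decay / t):
            return rng.randrange(len(counts))
        return _greedy_arm(counts, successes)
    return select


def ucb1():
    # Pull every arm once, then maximise mean + sqrt(2 ln t / n).
    def select(t, counts, successes, rng):
        for arm, c in enumerate(counts):
            if c == 0:
                return arm
        scores = [s / c + math.sqrt(2.0 * math.log(t) / c)
                  for s, c in zip(successes, counts)]
        return max(range(len(counts)), key=scores.__getitem__)
    return select


def thompson():
    # Sample each arm's mean from its Beta(1 + wins, 1 + losses) posterior.
    def select(t, counts, successes, rng):
        samples = [rng.betavariate(1 + s, 1 + c - s)
                   for s, c in zip(successes, counts)]
        return max(range(len(counts)), key=samples.__getitem__)
    return select


if __name__ == "__main__":
    true_means = [0.2, 0.5, 0.8]  # hypothetical arm success probabilities
    policies = [("epsilon-greedy", epsilon_greedy()),
                ("epsilon-decreasing", epsilon_decreasing()),
                ("UCB1", ucb1()),
                ("Thompson", thompson())]
    for name, policy in policies:
        print(name, run_bandit(policy, true_means))
```

Each policy is a closure with the same signature, so the same `run_bandit` loop can compare them by how often they pull the best arm. The repository's own analysis may use different reward distributions, hyperparameters, and metrics.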
