ReinforcementLearning

This project implements value iteration and Q-learning to solve Markov Decision Processes (MDPs).
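As a rough illustration of the value iteration half, here is a minimal sketch on a made-up three-state chain. The names `value_iteration`, `STATES`, `actions`, `transitions`, and `reward` are assumptions for this example, not this project's actual interfaces, and the toy MDP is hypothetical.

```python
# Hypothetical toy MDP: s0 -> s1 -> goal, with reward 1.0 for entering "goal".
STATES = ["s0", "s1", "goal"]


def actions(state):
    """Legal actions in a state; the terminal state has none."""
    return [] if state == "goal" else ["forward"]


def transitions(state, action):
    """List of (next_state, probability) pairs; deterministic here."""
    next_state = {"s0": "s1", "s1": "goal"}[state]
    return [(next_state, 1.0)]


def reward(state, action, next_state):
    return 1.0 if next_state == "goal" else 0.0


def value_iteration(states, gamma=0.9, iterations=50):
    """Batch value iteration: each sweep reads only the previous sweep's values."""
    values = {s: 0.0 for s in states}
    for _ in range(iterations):
        new_values = {}
        for s in states:
            # Expected return of each action under the current value estimates.
            q_values = [
                sum(p * (reward(s, a, s2) + gamma * values[s2])
                    for s2, p in transitions(s, a))
                for a in actions(s)
            ]
            # States with no legal actions keep a value of zero.
            new_values[s] = max(q_values) if q_values else 0.0
        values = new_values
    return values


VALUES = value_iteration(STATES)
```

With a discount of 0.9, the state next to the goal should settle at a value of about 1.0 and the state before it at about 0.9, since its reward arrives one step later.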

The generic agents created here implement these algorithms to maximize their long-term reward in three settings: a simple gridworld (Sutton and Barto, 1998), a simulated robot controller (Crawler), and Pac-Man.
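One way such a generic agent could look is sketched below: a tabular Q-learning agent with epsilon-greedy action selection that knows nothing about the environment beyond the states, actions, and rewards it is handed. The class name `QLearner` and its method names and default parameters are assumptions for illustration, not the agents actually defined in this project.

```python
import random
from collections import defaultdict


class QLearner:
    """Hypothetical tabular Q-learning agent; environment-agnostic by design."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = defaultdict(float)  # (state, action) -> estimated return
        self.rng = random.Random(seed)

    def value(self, state, legal_actions):
        """Best Q-value available in a state; 0.0 if there are no legal actions."""
        return max((self.q[(state, a)] for a in legal_actions), default=0.0)

    def choose(self, state, legal_actions):
        """Epsilon-greedy choice; returns None in a terminal state."""
        if not legal_actions:
            return None
        if self.rng.random() < self.epsilon:
            return self.rng.choice(legal_actions)
        return max(legal_actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, next_state, reward, next_legal_actions):
        """Move Q(state, action) toward the one-step sampled target."""
        target = reward + self.gamma * self.value(next_state, next_legal_actions)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

Because the agent only sees opaque state and action values, the same class could in principle be driven by a gridworld, the Crawler, or Pac-Man, provided each environment supplies its own legal actions and rewards.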