The main goal of this project is to provide examples of Q-Value and Q-learning, two Reinforcement Learning algorithms.
The first is demonstrated through a path planning experiment, in which the robot has to reach the ball. Because it is a Model-Based process, the robot computes all of its moves before acting.
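As a rough illustration of model-based planning, here is a minimal sketch that assumes "Q-Value" refers to Q-value iteration over a known model. The 4x4 grid, the ball position, the rewards, and every name below (`step_model`, `q_value_iteration`, `plan_path`) are hypothetical and are not taken from this project's code.

```python
import itertools

# Hypothetical 4x4 grid; the ball sits in the bottom-right corner (assumption).
ROWS, COLS = 4, 4
BALL = (3, 3)
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
GAMMA = 0.9          # discount factor (assumed value)
STEP_REWARD = -1.0   # small cost per move (assumed value)
GOAL_REWARD = 10.0   # reward for reaching the ball (assumed value)


def step_model(state, move):
    """Known, deterministic model: return (next_state, reward)."""
    d_row, d_col = MOVES[move]
    row = min(max(state[0] + d_row, 0), ROWS - 1)  # walls clamp the position
    col = min(max(state[1] + d_col, 0), COLS - 1)
    nxt = (row, col)
    return nxt, (GOAL_REWARD if nxt == BALL else STEP_REWARD)


def q_value_iteration(sweeps=100, tol=1e-9):
    """Sweep every (state, move) pair until the Q-table stops changing."""
    states = list(itertools.product(range(ROWS), range(COLS)))
    q = {(s, m): 0.0 for s in states for m in MOVES}
    for _ in range(sweeps):
        delta = 0.0
        for s in states:
            if s == BALL:
                continue  # terminal state: nothing to plan from here
            for m in MOVES:
                nxt, reward = step_model(s, m)
                future = 0.0 if nxt == BALL else max(q[(nxt, b)] for b in MOVES)
                new_value = reward + GAMMA * future
                delta = max(delta, abs(new_value - q[(s, m)]))
                q[(s, m)] = new_value
        if delta < tol:
            break
    return q


def plan_path(q, start):
    """Follow the greedy move in each state; the whole path is known up front."""
    path, state = [start], start
    while state != BALL and len(path) <= ROWS * COLS:
        best_move = max(MOVES, key=lambda m: q[(state, m)])
        state, _ = step_model(state, best_move)
        path.append(state)
    return path
```

Calling `plan_path(q_value_iteration(), (0, 0))` returns the full sequence of cells from the start to the ball without the robot ever moving, which is the sense in which the moves are calculated before acting.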
The second is demonstrated with a basic crawler created with Pybox2D, for which the Q-learning algorithm learns a policy.
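The learning loop can be sketched with tabular Q-learning on a toy stand-in for the crawler. This is not the Pybox2D simulation: the two-joint state, the reward rule, the hyperparameters, and the names `toy_crawler_step`, `q_learning`, and `greedy_rollout` are all assumptions made for illustration. Unlike the model-based case, the agent never inspects the transition rule and only learns from the rewards it observes.

```python
import random

# Hypothetical toy crawler: state = (arm, hand).
# arm: 0 = forward, 1 = back; hand: 0 = lifted, 1 = on the ground.
CRAWLER_ACTIONS = ("arm", "hand")  # each action toggles one joint


def toy_crawler_step(state, action):
    """Toy dynamics (assumption): sweeping the arm back while the hand is on
    the ground pulls the body forward (+1); sweeping it forward pushes the
    body back (-1); everything else gives no reward."""
    arm, hand = state
    if action == "hand":
        return (arm, 1 - hand), 0.0
    new_arm = 1 - arm
    if hand == 1:
        reward = 1.0 if new_arm == 1 else -1.0
    else:
        reward = 0.0
    return (new_arm, hand), reward


def q_learning(steps=5000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {((arm, hand), a): 0.0
         for arm in (0, 1) for hand in (0, 1) for a in CRAWLER_ACTIONS}
    state = (0, 0)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(CRAWLER_ACTIONS)  # explore
        else:
            action = max(CRAWLER_ACTIONS, key=lambda a: q[(state, a)])  # exploit
        nxt, reward = toy_crawler_step(state, action)
        # Q-learning update: bootstrap from the best action in the next state.
        target = reward + gamma * max(q[(nxt, b)] for b in CRAWLER_ACTIONS)
        q[(state, action)] += alpha * (target - q[(state, action)])
        state = nxt
    return q


def greedy_rollout(q, steps=40):
    """Run the learned greedy policy and return the total forward reward."""
    state, total = (0, 0), 0.0
    for _ in range(steps):
        action = max(CRAWLER_ACTIONS, key=lambda a: q[(state, a)])
        state, reward = toy_crawler_step(state, action)
        total += reward
    return total
```

In this toy setting the learned policy settles into a four-step gait (lower the hand, sweep the arm back, lift the hand, sweep the arm forward), so `greedy_rollout(q_learning())` accumulates one unit of forward reward per cycle.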