Code solutions to the Practical Reinforcement Learning course by the National Research University Higher School of Economics (HSE).
- Week1: using the gym interface to interact with environments; cross-entropy method; deep cross-entropy method.
- Week2, value-based methods: the action-value function.
- Week3, model-free methods: Q-learning; SARSA; expected-value SARSA; experience replay.
- Week4, approximating Q-values: deep Q-network (DQN) implementations.
- Week5, policy-based methods: REINFORCE and advantage actor-critic implementations.
- Week6, uncertainty-based exploration: multi-armed bandits; Monte Carlo tree search (MCTS); seq2seq with RL.
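
To give a flavor of the Week3 material, here is a minimal sketch of tabular Q-learning with an epsilon-greedy policy. It is not taken from the course solutions: `ChainEnv` is a hypothetical five-state toy environment written for this example (it only imitates a gym-style `reset`/`step` interface, with a simplified three-value return from `step`), and the hyperparameters `alpha`, `gamma`, and `epsilon` are illustrative assumptions.

```python
import random
from collections import defaultdict


class ChainEnv:
    """Hypothetical 5-state chain, not part of the course or of gym.

    Action 1 moves right, action 0 moves left. Reaching the last
    state yields reward 1 and ends the episode.
    """

    n_states = 5
    n_actions = 2

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        if action == 1:
            self.state = min(self.state + 1, self.n_states - 1)
        else:
            self.state = max(self.state - 1, 0)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def q_learning(env, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of Q-values with the standard Q-learning update."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * env.n_actions)
    for _ in range(episodes):
        state = env.reset()
        done = False
        steps = 0
        while not done and steps < 100:  # cap episode length
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.randrange(env.n_actions)
            else:
                # Exploit: pick a greedy action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best]
                )
            next_state, reward, done = env.step(action)
            # Q-learning target: bootstrap from the best next action,
            # with no bootstrapping past a terminal state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            steps += 1
    return q


q_table = q_learning(ChainEnv())
# Greedy action in each non-terminal state.
greedy_policy = [
    max(range(ChainEnv.n_actions), key=lambda a: q_table[s][a])
    for s in range(ChainEnv.n_states - 1)
]
print(greedy_policy)
```

In this toy setting the learned greedy policy should move right in every non-terminal state. SARSA differs only in the target: it bootstraps from the action the policy actually takes next rather than from the maximum over next actions.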