This repository contains an unofficial implementation of Adaptive Approximate Policy Iteration (AAPI) and its application to the DeepSea environment, as described in :
- Paper : Adaptive Approximate Policy Iteration
- Authors : B. Hao, N. Lazic, Y. Abbasi-Yadkori, P. Joulani, C. Szepesvari
- Date : 2021
- Environment : DeepSea environment (Paper, Page 7) using bsuite
- Features : One-hot encoding (Paper, Page 7)
- Evaluation method : Least-squares Monte Carlo (Paper, Page 7) using JAX
- Agent : AAPI (Paper, Algorithm 1) using JAX
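To illustrate how one-hot features and least-squares Monte Carlo evaluation fit together, here is a minimal sketch. It is not the code from this repository: it uses NumPy rather than JAX for brevity, and the function names, the toy episode, the `discount` parameter, and the `ridge` regularizer are all assumptions made for illustration. The estimator used in the paper may differ in its details.

```python
import numpy as np


def one_hot_features(row, col, action, size, num_actions=2):
    """One-hot encoding of a (row, col, action) triple on a size x size grid."""
    phi = np.zeros(size * size * num_actions)
    phi[(row * size + col) * num_actions + action] = 1.0
    return phi


def monte_carlo_returns(rewards, discount=1.0):
    """Return-to-go at every step of one episode (discount is an assumption)."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + discount * running
        returns[t] = running
    return returns


def least_squares_mc(features, returns, ridge=1e-3):
    """Ridge-regularized least-squares fit of returns on features."""
    dim = features.shape[1]
    gram = features.T @ features + ridge * np.eye(dim)
    return np.linalg.solve(gram, features.T @ returns)


# Hypothetical toy episode on a 3x3 grid: (row, col, action, reward) per step.
# The agent moves right (action 1) at every step and is rewarded at the end.
size = 3
episode = [(0, 0, 1, 0.0), (1, 1, 1, 0.0), (2, 2, 1, 1.0)]

features = np.stack([one_hot_features(r, c, a, size) for r, c, a, _ in episode])
returns = monte_carlo_returns([rew for *_, rew in episode])
weights = least_squares_mc(features, returns, ridge=1e-6)

# Estimated action value of moving right from the top-left cell.
q_start = one_hot_features(0, 0, 1, size) @ weights
print(q_start)
```

With one-hot features, the least-squares solution reduces to a (slightly shrunk) average of the observed returns for each visited state-action pair, so unvisited pairs keep a weight of zero.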
To run the experiments :
- Option 1 : Open in Colab.
- Option 2 : Run on a local machine. First, clone this repository, then execute the following commands to install the required packages :
$ cd adaptive-policy-iteration
$ pip install -r requirements.txt
You can then run an experiment using the following commands :
$ cd src
$ python deepsea.py