
Code for Expert Supervised Reinforcement Learning


asonabend/ESRL


Expert-Supervised Reinforcement Learning (ESRL)

Code for Expert-Supervised Reinforcement Learning (ESRL). If you use our code, please cite our paper, Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation.

The repo is set up for the RiverSwim environment and will work for any episodic environment with discrete state and action spaces.
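For readers who have not seen it, an episodic, discrete environment of this kind can be described by a transition tensor and a reward matrix. The following is a minimal sketch, not the repo's implementation: the function names `make_riverswim` and `rollout` are hypothetical, and the transition probabilities and rewards are one commonly used RiverSwim-style parameterization chosen for illustration.

```python
import numpy as np

LEFT, RIGHT = 0, 1


def make_riverswim(n_states=6):
    """Build (P, R) for a RiverSwim-style chain with at least 2 states.

    P[s, a, s'] is a transition probability, R[s, a] an expected reward.
    All numeric values are illustrative assumptions, not taken from the repo.
    """
    P = np.zeros((n_states, 2, n_states))
    R = np.zeros((n_states, 2))
    last = n_states - 1
    for s in range(n_states):
        # Swimming left (with the current) always succeeds.
        P[s, LEFT, max(s - 1, 0)] = 1.0
        # Swimming right (against the current) is stochastic.
        if s == 0:
            P[s, RIGHT, 0] = 0.4
            P[s, RIGHT, 1] = 0.6
        elif s == last:
            P[s, RIGHT, last] = 0.6
            P[s, RIGHT, last - 1] = 0.4
        else:
            P[s, RIGHT, s - 1] = 0.05
            P[s, RIGHT, s] = 0.6
            P[s, RIGHT, s + 1] = 0.35
    R[0, LEFT] = 0.005      # small, safe reward at the left end
    R[last, RIGHT] = 1.0    # large reward at the right end
    return P, R


def rollout(P, R, policy, horizon, rng, start_state=0):
    """Run one episode. policy is an int array of shape (horizon, n_states)."""
    s = start_state
    trajectory = []
    for t in range(horizon):
        a = int(policy[t, s])
        r = float(R[s, a])
        s_next = int(rng.choice(P.shape[2], p=P[s, a]))
        trajectory.append((s, a, r, s_next))
        s = s_next
    return trajectory
```

Any environment that can be written in this `(P, R)` form, with a fixed horizon, fits the episodic, discrete setting described above.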

Overview

Running main.py will:

  1. Train an expert behavior policy function with PSRL if one is not already present
  2. Generate a training dataset using an epsilon-greedy behavior policy
  3. Train an ESRL policy
  4. Evaluate the policy online
  5. Run offline policy evaluation with step-importance sampling (IS), step-weighted importance sampling (WIS), and model-based ESRL to obtain reward estimates
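Steps 2 and 5 can be illustrated with a short sketch. It assumes the textbook per-step (per-decision) IS and WIS estimators, stationary tabular policies, equal-length episodes, and an epsilon-greedy rule that spreads epsilon uniformly over all actions. The function names are hypothetical and the repo's own estimators may differ in detail.

```python
import numpy as np


def epsilon_greedy_probs(greedy_actions, n_actions, epsilon):
    """Action probabilities of an epsilon-greedy policy.

    greedy_actions: int array (n_states,) with the expert's action per state.
    Assumes epsilon is spread uniformly over all actions (an assumption).
    """
    n_states = len(greedy_actions)
    probs = np.full((n_states, n_actions), epsilon / n_actions)
    probs[np.arange(n_states), greedy_actions] += 1.0 - epsilon
    return probs


def cumulative_ratios(states, actions, pi_e, pi_b):
    """Cumulative importance ratios rho[i, t] for episode i up to step t.

    states, actions: int arrays of shape (n_episodes, horizon).
    pi_e, pi_b: (n_states, n_actions) evaluation and behavior probabilities.
    """
    ratios = pi_e[states, actions] / pi_b[states, actions]
    return np.cumprod(ratios, axis=1)


def step_is(rho, rewards, gamma=1.0):
    """Per-step IS: average over episodes of sum_t gamma^t * rho_t * r_t."""
    discounts = gamma ** np.arange(rewards.shape[1])
    return float(np.mean(np.sum(discounts * rho * rewards, axis=1)))


def step_wis(rho, rewards, gamma=1.0):
    """Per-step WIS: ratios are normalized across episodes at each step."""
    discounts = gamma ** np.arange(rewards.shape[1])
    totals = rho.sum(axis=0)
    # Steps where every ratio is zero contribute nothing to the estimate.
    weights = np.divide(rho, totals, out=np.zeros_like(rho), where=totals > 0)
    return float(np.sum(discounts * np.sum(weights * rewards, axis=0)))
```

When the evaluation policy equals the behavior policy, every ratio is 1 and both estimators reduce to the average observed return, which is a useful sanity check.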

Results are saved in a new dictionary, or added to the existing results dictionary if one is already present.
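The save-or-append behavior can be sketched as follows. This is an illustration only: the file name `results.pkl`, the use of pickle, and the `save_results` helper are assumptions, not details taken from the repo.

```python
import os
import pickle


def save_results(new_results, path="results.pkl"):
    """Merge new_results into the dictionary stored at path, creating it if needed.

    The path and the pickle format are hypothetical choices for this sketch.
    """
    if os.path.exists(path):
        with open(path, "rb") as f:
            results = pickle.load(f)
    else:
        results = {}
    # Keys already present are overwritten by the new run's values.
    results.update(new_results)
    with open(path, "wb") as f:
        pickle.dump(results, f)
    return results
```

Keying each entry by its run settings (for example, the seed) keeps results from different runs from overwriting one another.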

Usage:

To begin the process with the default settings, run:

python main.py

The following argument options are available:

python main.py --seed 0 --episodes 300 --risk_aversion .1 --epsilon .1 --MDP_samples_train 250 --MDP_samples_eval 500
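For orientation, a command-line interface with these flags could be declared as in the sketch below. The flag names come from the command above; the types, the defaults (which simply mirror the example values), and the `build_parser` helper are assumptions, so the actual parser in main.py may differ.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the example command."""
    parser = argparse.ArgumentParser(description="ESRL pipeline (sketch)")
    # Defaults below copy the example command; the repo's defaults may differ.
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--episodes", type=int, default=300)
    parser.add_argument("--risk_aversion", type=float, default=0.1)
    parser.add_argument("--epsilon", type=float, default=0.1)
    parser.add_argument("--MDP_samples_train", type=int, default=250)
    parser.add_argument("--MDP_samples_eval", type=int, default=500)
    return parser


# Parse an explicit argument list; pass no list to read from the command line.
args = build_parser().parse_args(["--seed", "0", "--episodes", "300"])
```

Values written as `.1` on the command line are parsed as the float 0.1.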

See the ESRL paper or our ESRL video for details on the arguments and the method.

BibTeX

@inproceedings{ASW2020expertsupervised,
 author = {Sonabend, Aaron and Lu, Junwei and Celi, Leo Anthony and Cai, Tianxi and Szolovits, Peter},
 booktitle = {Advances in Neural Information Processing Systems},
 pages = {18967--18977},
 title = {Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation},
 url = {https://proceedings.neurips.cc/paper/2020/file/daf642455364613e2120c636b5a1f9c7-Paper.pdf},
 volume = {33},
 year = {2020}
}
