
[TSMC] Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework


Official codebase for the paper Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework. This codebase is built on the open-source stable-baselines3 framework; please refer to that repository for further documentation.

Overview

TLDR: Our contribution is a dedicated initiative advisor-in-the-loop actor-critic framework for interactive reinforcement learning, which enables two-way message passing and seeks advisor assistance only on demand. The proposed Ask-AC substantially reduces the advisor's participation effort and is readily applicable to various discrete actor-critic architectures.

Abstract: Despite the promising results achieved, state-of-the-art interactive reinforcement learning schemes rely on passively receiving supervision signals from advisor experts, in the form of either continuous monitoring or pre-defined rules, which inevitably results in a cumbersome and expensive learning process. In this paper, we introduce a novel initiative advisor-in-the-loop actor-critic framework, termed Ask-AC, that replaces the unilateral advisor-guidance mechanism with a bidirectional learner-initiative one, and thereby enables a customized and efficacious message exchange between learner and advisor. At the heart of Ask-AC are two complementary components, namely the action requester and the adaptive state selector, that can be readily incorporated into various discrete actor-critic architectures. The former component allows the agent to proactively seek advisor intervention in the presence of uncertain states, while the latter identifies the unstable states potentially missed by the former, especially when the environment changes, and then learns to promote the ask action on such states. Experimental results on both stationary and non-stationary environments and across different actor-critic backbones demonstrate that the proposed framework significantly improves the learning efficiency of the agent, and achieves performance on par with that obtained by continuous advisor monitoring.

(Figure: overview of the proposed Ask-AC framework.)
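
To make the two components concrete, below is a minimal, illustrative Python sketch of the ask mechanism. It is not the repo's actual implementation: the normalized-entropy criterion, the thresholds, and the `advisor_policy` callable are assumptions for exposition; the paper's exact requester and selector rules live in the codebase.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of action logits."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def select_action(logits, state, advisor_policy, ask_threshold=0.8, rng=None):
    """Action-requester sketch: act alone when confident, otherwise ask.

    `advisor_policy` is a hypothetical callable mapping a state to an
    action; `ask_threshold` is an illustrative normalized-entropy cutoff.
    """
    rng = rng if rng is not None else np.random.default_rng()
    probs = softmax(np.asarray(logits, dtype=float))
    entropy = -np.sum(probs * np.log(probs + 1e-8))
    max_entropy = np.log(len(probs))  # entropy of a uniform policy
    if entropy / max_entropy > ask_threshold:
        # Uncertain state: initiate a request and follow the advisor.
        return advisor_policy(state), True
    # Confident state: sample from the agent's own policy.
    return int(rng.choice(len(probs), p=probs)), False

def flag_unstable_state(td_error, instability_threshold=1.0):
    """Adaptive-state-selector sketch: flag states whose value estimate
    swings sharply (approximated here by a large TD error) so the ask
    action can be promoted on them; the criterion and threshold are
    assumptions, not the paper's exact rule."""
    return abs(td_error) > instability_threshold
```

For instance, `select_action(np.array([2.0, 1.9, 2.1]), s, advisor)` would likely trigger an ask, since the three logits are nearly tied and the resulting policy entropy is close to its maximum.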

Prerequisites

Install dependencies

See the requirements.txt file for the full list of dependencies.
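
For example, with pip in a fresh Python environment:

# install the dependencies
pip install -r requirements.txt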

Usage

Please follow the instructions below to replicate the results in the paper.

# test the game-playing performance of the human advisor
python test_human.py


# train the agent with the human advisor
python train_human.py --exp run


Citation

If you find this work useful for your research, please cite our paper:

@article{liu2023AskAC,
  title={Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework},
  author={Liu, Shunyu and Wang, Xinchao and Yu, Na and Song, Jie and Chen, Kaixuan and Feng, Zunlei and Song, Mingli},
  journal={IEEE Transactions on Systems, Man, and Cybernetics: Systems},
  year={2023},
  volume={53},
  number={12},
  pages={7403--7414},
  doi={10.1109/TSMC.2023.3296773}
}

Contact

Please feel free to contact me via email ([email protected]) if you are interested in my research :)