
[TSMC] Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework


Official codebase for the paper Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework. This codebase is built on the open-source stable-baselines3 framework; please refer to that repository for further documentation.

Overview

TLDR: Our contribution is a dedicated initiative advisor-in-the-loop actor-critic framework for interactive reinforcement learning, which enables two-way message passing and seeks advisor assistance only on demand. The proposed Ask-AC substantially reduces the advisor's participation effort and is readily applicable to various discrete actor-critic architectures.

Abstract: Despite the promising results achieved, state-of-the-art interactive reinforcement learning schemes rely on passively receiving supervision signals from advisor experts, in the form of either continuous monitoring or pre-defined rules, which inevitably results in a cumbersome and expensive learning process. In this paper, we introduce a novel initiative advisor-in-the-loop actor-critic framework, termed Ask-AC, that replaces the unilateral advisor-guidance mechanism with a bidirectional learner-initiative one, and thereby enables a customized and efficacious message exchange between learner and advisor. At the heart of Ask-AC are two complementary components, namely the action requester and the adaptive state selector, that can be readily incorporated into various discrete actor-critic architectures. The former component allows the agent to proactively seek advisor intervention in the presence of uncertain states, while the latter identifies the unstable states potentially missed by the former, especially when the environment changes, and then learns to promote the ask action on such states. Experimental results on both stationary and non-stationary environments and across different actor-critic backbones demonstrate that the proposed framework significantly improves the learning efficiency of the agent, and achieves performance on par with that obtained by continuous advisor monitoring.

(Figure: overview of the proposed Ask-AC framework.)
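
To make the two components concrete, below is a minimal, illustrative Python sketch of the ask mechanism. It is not the repo's actual implementation: the normalized-entropy criterion, the thresholds, and the `advisor_policy` callable are assumptions for exposition; the paper's exact requester and selector rules live in the codebase.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of action logits."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def select_action(logits, state, advisor_policy, ask_threshold=0.8, rng=None):
    """Action-requester sketch: act alone when confident, otherwise ask.

    `advisor_policy` is a hypothetical callable mapping a state to an
    action; `ask_threshold` is an illustrative normalized-entropy cutoff.
    """
    rng = rng if rng is not None else np.random.default_rng()
    probs = softmax(np.asarray(logits, dtype=float))
    entropy = -np.sum(probs * np.log(probs + 1e-8))
    max_entropy = np.log(len(probs))  # entropy of a uniform policy
    if entropy / max_entropy > ask_threshold:
        # Uncertain state: initiate a request and follow the advisor.
        return advisor_policy(state), True
    # Confident state: sample from the agent's own policy.
    return int(rng.choice(len(probs), p=probs)), False

def flag_unstable_state(td_error, instability_threshold=1.0):
    """Adaptive-state-selector sketch: flag states whose value estimate
    swings sharply (approximated here by a large TD error) so the ask
    action can be promoted on them; the criterion and threshold are
    assumptions, not the paper's exact rule."""
    return abs(td_error) > instability_threshold
```

For instance, `select_action(np.array([2.0, 1.9, 2.1]), s, advisor)` would likely trigger an ask, since the three logits are nearly tied and the resulting policy entropy is close to its maximum.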

Prerequisites

Install dependencies

See the requirements.txt file for the full list of dependencies.
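
For example, with pip in a fresh Python environment:

# install the dependencies
pip install -r requirements.txt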

Usage

Please follow the instructions below to replicate the results in the paper.

# test the game-playing performance of the human advisor
python test_human.py


# train the agent with the human advisor
python train_human.py --exp run


Citation

If you find this work useful for your research, please cite our paper:

@article{liu2023AskAC,
  title={Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework},
  author={Liu, Shunyu and Wang, Xinchao and Yu, Na and Song, Jie and Chen, Kaixuan and Feng, Zunlei and Song, Mingli},
  journal={IEEE Transactions on Systems, Man, and Cybernetics: Systems},
  year={2023},
  volume={53},
  number={12},
  pages={7403--7414},
  doi={10.1109/TSMC.2023.3296773}
}

Contact

Please feel free to contact me via email ([email protected]) if you are interested in my research :)