Skip to content

franck-ledoux/SPAM

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

36 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SPAM

SPAM yields for Single PlAyer Monte carlo tree search. It is an implementation of Monte Carlo Tree Search that focuses on single-player games.

MCTS Implementation

We provide a single-player MCTS implementation in the directory mcts. Running this MCTS on your problem requires to implement three interfaces, which are:

  • IState gathers the requirement your state must fulfill;
  • IAcation indicates how to compare actions;
  • IRewardFunction contains the reward evaluation of a given state.

The memory management of the MCTS is based on smart (std::shared_ptr) and raw pointers. Memory check is regularly performed and there is no memory leak in the current implementation. In order to avoid memory issues as much as possible, previous interfaces only use smart pointers.

An example of problem solving using this MCTS implementation is given in takuzu. The takuzu is a Sudoku-like problem that is played by a single player.

Without getting in the details, the two main classes that are implemented the MCTS algorithm are MCTSAgent en MCTSTree. In practice, when using SPAM, you do not need to look at the MCSTree class. On its side, the MCTSAgent class provides the run method that implement the iterative procedure depicted below.

MCTS_pipeline

When instantiating an MCTSAgent agent, we must give it the reward function that will be used during the process, but also the selection function (here UCB is provided), the maximum number of iterations we accept, and the maximun number of seconds.

The other important class is MCTSLoop. Its role is to integrate the MCTS pipeline execution inside a loopStarting from an initial state s0 and a given agent a, we will iteratively ask a to run an MCTS pipeline to generate a new state. More specifically, the input of an iteration is a state si, we run a with si as the root of the MCTS tree. When it is done, we get the best child of this tree root, or the best node in the tree as an output. It will become the input of the next iteration.

MCTS Installation

We use CMake to build our project, which depends on the nlohmann-json library for handling json files. When building the dev environnement using cmake, you will have to previously install nlohmann-json, and to add its location into your CMAKE_PREFIX_PATH.

Using spack to install dependencies, you can do the following commands

spack install nlohmann-json

And add such a cmake option:

-DCMAKE_PREFIX_PATH=/home/.../spack/opt/spack/linux-ubuntu22.04-icelake/gcc-11.4.0/nlohmann-json-3.11.2-xpjtkhg6i3k5mbh2x4yrkqsh4dxs3rl6

MCTS explained

There exist many ways to implement a MCTS. The version proposed here is mainly inspired by https://gibberblot.github.io/rl-notes/single-agent/mcts.html.

Json format for exporting trees from SPAM

The MCTSAgent class allows to automatically export the current state of the MCTS tree in a json file. In the following pieces of codes, we activate the debug mode of the agent a before running the MCTS algorithm. The option OUT_END_ONLY indicates that we will only export the final tree. It will be written in the file "takuzu_0.json".

MCTSAgent a(...)
a.activate_debug_mode("takuzu", MCTSAgent::OUT_END_ONLY);
a.run(s);

With the following options, one file will be written every two cycles

a.activate_debug_mode("takuzu", MCTSAgent::OUT_ITERATION,2);

The generated json files will look like this.

{
  "links": [
    {"child": 2, "parent": 1}, {"child": 3, "parent": 1},
    {"child": 4, "parent": 1}, {"child": 5, "parent": 1},
    ...
  ],
  "nodes": [
    {"id": 1, "reward": -11.0, "visits": 11},
    {"id": 2, "reward": -1.0, "visits": 1},
    {"id": 3, "reward": -1.0, "visits": 1
    },
    ...
  ]
}

Further readings

About

Single player MCTS

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published