- This is a Gomoku AI based on curriculum learning and AlphaGo methods.
The AI adopts a deterministic policy with 400 simulations per move.
The Tencent Gomoku AI plays black. AlphaGomoku adopts a deterministic policy with 400 simulations per move.
- Python 3.6
pip install -r requirements.txt
- tensorflow
- keras
- pygame
- numpy
- matplotlib
- easygui (optional)
- Execute run.py.
- Select mode 2 (AI vs Human).
- You can also compete with different versions of AlphaGomoku by switching the network.
- Execute run.py.
- Select mode 13.
All important parameters are in AlphaGomoku/config.py. Some of them are listed below:
- simulation_times: the number of game-tree explorations (simulations) performed for each move.
- c_puct: in general, the larger c_puct is, the more the move selection relies on the prior probability.
- initial_tau: the temperature coefficient. The smaller it is, the more deterministic the policy becomes.
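
To make the effect of c_puct and initial_tau more concrete, here is a minimal sketch of the usual AlphaGo-style formulas: a PUCT score used to pick a child during a simulation, and a temperature-scaled conversion of visit counts into move probabilities. The function names `puct_score` and `visit_counts_to_policy` are hypothetical and chosen for illustration only; this is not the code in AlphaGomoku/config.py or elsewhere in this repository, and the actual implementation may differ.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct):
    """Value estimate plus an exploration bonus weighted by the prior.

    Hypothetical helper: a larger c_puct makes the prior-driven bonus
    dominate the value estimate q.
    """
    bonus = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + bonus


def visit_counts_to_policy(visit_counts, tau):
    """Turn visit counts into move probabilities proportional to N ** (1 / tau).

    Hypothetical helper: a smaller tau concentrates probability on the
    most-visited move.
    """
    if sum(visit_counts) == 0:
        raise ValueError("at least one move must have been visited")
    if tau < 1e-3:
        # Treat a very small temperature as fully deterministic.
        best = max(range(len(visit_counts)), key=lambda i: visit_counts[i])
        return [1.0 if i == best else 0.0 for i in range(len(visit_counts))]
    # Work in log space so a large 1 / tau does not overflow.
    logs = [math.log(n) / tau if n > 0 else -math.inf for n in visit_counts]
    peak = max(logs)
    weights = [math.exp(x - peak) for x in logs]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    counts = [300, 80, 20]  # illustrative visit counts after 400 simulations
    for tau in (1.0, 0.5, 1e-4):
        print(tau, visit_counts_to_policy(counts, tau))
    for c in (1.0, 5.0):
        print(c, puct_score(q=0.0, prior=0.5, parent_visits=100, child_visits=0, c_puct=c))
```

With tau = 1.0 the probabilities are simply proportional to the visit counts. As tau shrinks, the most-visited move takes almost all of the probability mass, which corresponds to the deterministic policy mentioned above.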
- Zheng Xie
- XingYu Fu
- JinYuan Yu
- Likelihood Lab
- Vthree.AI
- Sun Yat-sen University
We would like to thank Andrew Chen from Vthree.AI and MingWen Liu from ShiningMidas Investment for their generous help throughout the research. We are also grateful to ZhiPeng Liang and Hao Chen from Sun Yat-sen University for their support during the training of our Gomoku AI. Without their support, it would have been hard for us to finish such a complicated task.