Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Theoretically Optimal Play #5

Open
Casper-Guo opened this issue Jul 13, 2024 · 4 comments
Open

Theoretically Optimal Play #5

Casper-Guo opened this issue Jul 13, 2024 · 4 comments

Comments

@Casper-Guo
Copy link

Casper-Guo commented Jul 13, 2024

I am making an effort to use policy iteration for achieving optimal play under the Finkel ruleset. I have ran the search algorithm and there is ~137M non-end board states. With some computing power it should be feasible to run many iterations of the algorithm and achieve convergence to the optimal strategy.

I am currently bottlenecked by the RAM required to produce the full game graph necessary for running the algorithm but I expect to have access to some high-performance computing capacity in the near future. You can find my plans in planning.md in my repo.

I am wondering if anyone else have attempted a similar approach and if there is any pitfall I should watch out for. I suppose you can say this is just expectimax on steroids given the unlimited backup depth. Any critique is appreciated!

@Sothatsit
Copy link
Collaborator

Hi @Casper-Guo, I think we actually have done something similar to what you suggest! We used value iteration to solve the game for the Finkel, Blitz, Masters and Aseb rulesets.

I have been working on a blog post that details how this works, and another member of our Discord server, Jeroen, is planning to write a paper on it!

It's not very well documented at the moment, but the Lut class in RoyalUr-Java can read a table of all game states and associated chances of light winning (https://github.com/RoyalUr/RoyalUr-Java/blob/main/src/main/java/net/royalur/lut/Lut.java). That can be used to play optimally, as you can just pick the move that leads to the state with the highest chance of winning. We have also uploaded the model files for each of these on HuggingFace: https://huggingface.co/sothatsit/RoyalUrModels/tree/main. Their implementation for solving the game using Julia is also open-source: https://github.com/JeroenOlieslagers/game_of_ur

I'd say the main pitfalls in writing an implementation for solving the game is managing the win percentages. In RoyalUr-Java, we just calculate all win percentages relative to the light player (i.e., the chance that light wins). This makes it easier, as swapping the chance back and forth between light/dark can be an easy place for subtle bugs to creep in. This is pretty important for the greedy selection.

Good luck with your implementation! If you're interested, we have discussed a lot of things related to this on our Discord server: https://discord.gg/FbCp76esgb. It would be great to say hi if you're interested!

@Casper-Guo
Copy link
Author

Hi, thanks for the pointer! I will direct some more detailed questions to the dedicated repo. I have taken a look and the code organization and some decisions we made are eerily similar haha.

In RoyalUr-Java, we just calculate all win percentages relative to the light player (i.e., the chance that light wins). This makes it easier, as swapping the chance back and forth between light/dark can be an easy place for subtle bugs to creep in.

I am taking a similar approach. The game is seen purely from the white perspective and so are the win rates. It is trivially easy to use this model to play as black anyways.

Can you please resend the discord invite? It doesn't seem to be working

@Sothatsit
Copy link
Collaborator

Can you please resend the discord invite? It doesn't seem to be working

Here's a new invite link: https://discord.gg/wYcR9MrQs2

What happened with the old one? If it's broken I should probably update the link in a few places! Thanks :D

@Casper-Guo
Copy link
Author

This new one doesn't work either so it's probably something on my end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants