-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Theoretically Optimal Play #5
Comments
Hi @Casper-Guo, I think we actually have done something similar to what you suggest! We used value iteration to solve the game for the Finkel, Blitz, Masters and Aseb rulesets. I have been working on a blog post that details how this works, and another member of our Discord server, Jeroen, is planning to write a paper on it! It's not very well documented at the moment, but the I'd say the main pitfalls in writing an implementation for solving the game is managing the win percentages. In RoyalUr-Java, we just calculate all win percentages relative to the light player (i.e., the chance that light wins). This makes it easier, as swapping the chance back and forth between light/dark can be an easy place for subtle bugs to creep in. This is pretty important for the greedy selection. Good luck with your implementation! If you're interested, we have discussed a lot of things related to this on our Discord server: https://discord.gg/FbCp76esgb. It would be great to say hi if you're interested! |
Hi, thanks for the pointer! I will direct some more detailed questions to the dedicated repo. I have taken a look and the code organization and some decisions we made are eerily similar haha.
I am taking a similar approach. The game is seen purely from the white perspective and so are the win rates. It is trivially easy to use this model to play as black anyways. Can you please resend the discord invite? It doesn't seem to be working |
Here's a new invite link: https://discord.gg/wYcR9MrQs2 What happened with the old one? If it's broken I should probably update the link in a few places! Thanks :D |
This new one doesn't work either so it's probably something on my end |
I am making an effort to use policy iteration for achieving optimal play under the Finkel ruleset. I have ran the search algorithm and there is ~137M non-end board states. With some computing power it should be feasible to run many iterations of the algorithm and achieve convergence to the optimal strategy.
I am currently bottlenecked by the RAM required to produce the full game graph necessary for running the algorithm but I expect to have access to some high-performance computing capacity in the near future. You can find my plans in
planning.md
in my repo.I am wondering if anyone else have attempted a similar approach and if there is any pitfall I should watch out for. I suppose you can say this is just expectimax on steroids given the unlimited backup depth. Any critique is appreciated!
The text was updated successfully, but these errors were encountered: