原文
原始链接: https://news.ycombinator.com/item?id=40851919
该用户分享了他们将蒙特卡洛搜索方法应用于战略纸牌游戏 Magic: The Gathering 的经验,特别是在程序 Llama.cpp 和 Belcher 的背景下。 他们讨论了将他们的技术与作者的技术进行比较,特别是关于树和有向无环图(DAG)结构之间的选择。 最初,作者遇到了许多重复的游戏位置,并选择了 DAG 而不是树,这导致了严重的复杂性,但节省了大量的内存。 用户质疑这种转变的复杂性,并描述了通过更简单的方法无需进行重大重构即可获得可比较的结果。 此外,他们还强调了通过定制哈希函数和基于游戏文本摘要或二进制表示的编码密钥所实现的改进,从而带来了相当大的加速。 最终,他们对开发人员的成就表示钦佩,并计划将这些见解应用到他们的项目中。 [ [链接 1]:GitHub – HanClinto/mtg_belcher_montecarlo\ [链接2]:GitHub –ggerganov/llama.cpp/pull/6616]
I've been doing a fair bit with Monte Carlo search lately to explore different spaces, but mine has been in the context of Llama.cpp and Magic: The Gathering. [1] I'm going to compare and contrast my approaches with the approach used by the authors.
First, I want to ask a question (either to the authors, or to the general audience), about trees vs. DAGs. The author wrote:
> It was only after the initial tree implementation that we considered that there would be a lot of repeated wells: after all, in this game you can reach the same position in a number of different ways. A lot of deliberation and a re-factored codebase later, we opted for a directed acyclic graph (DAG) instead, taking identical positions and merging them into the same entry in a graph, rather than making them distinct entries on a tree. This complicated things significantly, but reduced our memory needs by an order of magnitude, at least. Refactoring from tree searches to DAG searches was more work than we’d expected to put in to the project, but was yielding promising results.
I'm curious about why this was such a large refactor. I've made this change in two different projects now, and in both cases, I was able to change from an exhaustive tree structure to what is essentially a DAG by simply searching for duplicate states and culling any branches that are equivalent. For instance, I implemented this change in Llama.cpp and got a >400% speedup -- and it was essentially a two-line change -- no drastic refactoring needed. [2]
Am I missing something here? How is a culled tree structure fundamentally different from a "true" DAG -- and more importantly -- would I stand to gain even more performance by switching to something else? I don't see what's so complicated about this, but I feel like I must be missing something.
> we discovered that the majority of our time was now spent accessing the hash of positions, since whenever a new positions was explored, we had to check if it already existed. A bit of a further dig and we discovered that most of that time was spent running a hashing algorithm on the positions data for comparison. Which again, made sense…but we knew all our positions were unique among each other. We didn’t need to do any hashing, we could just use the position’s representation as a key directly, if we could find a reasonable way to encode it. So we replaced the hashing algorithm with our own custom version that encoded the position as a binary representation of position, rotation and piece type. This was much faster than having to hash a position, and we knew it guaranteed uniqueness. This doubled the speed of our move finder.
In the case of my Magic: The Gathering searcher, I am using a text summary of the game state (essentially `gameState.toString()`) -- what's on the battlefield, what's in the hand, what's in the graveyard, how much mana is available, etc -- as the key. I sort the cards in each location alphabetically, so that it doesn't matter which order they were put there -- only what's actually there. This GREATLY increased the speed of execution, but again -- it was just a simple change.
That said, the fire graph showing how much time is spent in string processing is eye-opening, and it makes me think that I should probably profile my Python code with a similar tool, because a binary representation might be worth it.
It's also fascinating to me that the author attempted to train an ML model to play the game. My initial thought was that an exhaustive search was the only way to approach this (because it's perhaps the only way to arrive at a max score that is provably correct), but I think I am not truly appreciating the vastness of the search space at play here, and exhaustively searching all possible inputs might just be too unimaginably large.
All that said, this read has kept me on the edge of my seat. Kudos to the developers and authors for writing this up for us -- and CONGRATULATIONS ON CAPTURING THE RECORD!!! Even though it was later surpassed, that should in no way diminish the satisfaction of the accomplishments here.
It's an absolute blast reading this. I've enjoyed it, I've learned a ton, and I'm going to try implementing some of their suggestions in my own projects!! Thank you!! It's articles like this that keep me coming back to Hacker News day after day. :)
* [1] - https://github.com/HanClinto/mtg_belcher_montecarlo
* [2] - https://github.com/ggerganov/llama.cpp/pull/6616