Comments

LGS

It does do (a variant of) MCTS. Check it out for yourself. The paper is here:

https://arxiv.org/pdf/1911.08265.pdf

Appendix B, page 12:

"We now describe the search algorithm used by MuZero. Our approach is based upon Monte-Carlo tree search with upper confidence bounds, an approach to planning that converges asymptotically to the optimal policy in single agent domains and to the minimax value function in zero sum games [22]."
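For readers who want a concrete picture of "upper confidence bounds" in tree search, here is a minimal sketch of the textbook UCB1 selection rule. It is not MuZero's exact rule: the paper's own selection formula also weights a learned policy prior. The function name `ucb1_select`, the `(visit_count, total_value)` representation, and the exploration constant `c=1.4` are all assumptions made for illustration.

```python
import math

def ucb1_select(children, c=1.4):
    """Pick the index of the child to explore next using plain UCB1.

    children: list of (visit_count, total_value) pairs, one per child.
    c: exploration constant (hypothetical default, not from the paper).
    """
    parent_visits = sum(n for n, _ in children)
    best_index, best_score = -1, -math.inf
    for i, (n, w) in enumerate(children):
        if n == 0:
            # An unvisited child is always tried first.
            return i
        # Mean value plus an exploration bonus that shrinks as n grows.
        score = w / n + c * math.sqrt(math.log(parent_visits) / n)
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

A rarely visited child can win even with a poor average, because its exploration bonus is large; that is the mechanism that makes the search revisit neglected branches.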

LGS

You are aware that MuZero has tree search hardcoded into it, yes? How does that contradict claim 1?