AI finally masters Stratego—on a budget
Stratego has long been a worst-case test for AI: you must plan deeply while most pieces are hidden. Big-budget attempts still fell short of top humans.
This paper reports a step change: an AI that reaches vastly superhuman Stratego performance—trained for only a few thousand dollars.
- Self-play reinforcement learning: the system learns by playing itself, discovering tactics and long-term plans without human data.
- Test-time search with hidden info: a lookahead that reasons about uncertainty, not just perfect knowledge.
- General methods: designed for imperfect-information problems beyond games.
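To make "reasoning about uncertainty" concrete, here is a toy sketch in Python. It is not the paper's algorithm. It only illustrates the basic idea of scoring a move against a belief over hidden information rather than against a single known state. The function names, the payoff values (+1 win, -1 loss, 0 tie or hold), and the simplified "higher rank wins" rule are all assumptions made for illustration; real Stratego has special pieces that this ignores.

```python
from fractions import Fraction

def attack_value(my_rank, belief):
    """Expected payoff of attacking a hidden piece.

    belief maps a possible enemy rank to its probability.
    Assumed payoffs: +1 if we win, -1 if we lose, 0 on a tie.
    """
    total = Fraction(0)
    for enemy_rank, prob in belief.items():
        if my_rank > enemy_rank:
            total += prob   # we capture the enemy piece
        elif my_rank < enemy_rank:
            total -= prob   # we lose our piece
    return total

def best_action(my_rank, belief):
    """Attack only if the expected payoff beats holding (assumed payoff 0)."""
    ev = attack_value(my_rank, belief)
    if ev > 0:
        return ("attack", ev)
    return ("hold", Fraction(0))

# Hypothetical belief about one hidden enemy piece.
belief = {3: Fraction(1, 2), 6: Fraction(1, 4), 9: Fraction(1, 4)}

print(best_action(7, belief))  # attacking is favorable under this belief
print(best_action(5, belief))  # expected payoff is zero, so hold
```

A real system would have to maintain and update such beliefs over the whole board and look many moves ahead; this sketch covers a single one-step decision.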
Why it matters: strong decision-making under uncertainty, at low cost, opens doors for research and applications in security, logistics, and robotics.
Authors: Samuel Sokota, Eugene Vinitsky, Hengyuan Hu, J. Zico Kolter, Gabriele Farina. Paper: http://arxiv.org/abs/2511.07312v1
AI ReinforcementLearning Stratego GameAI MachineLearning ImperfectInformation Research SelfPlay Arxiv