AI finally masters Stratego—on a budget
Big win for AI in hidden-information games. Stratego has long resisted top-tier AI—even with training runs costing millions. This new study cracks it.
Researchers achieved vastly superhuman Stratego play using self-play reinforcement learning plus a smart test-time search designed for imperfect information. The surprise: it worked with only a few thousand dollars of compute, not an industrial budget.
- Why it matters: Stratego is a tough proxy for real-world decision-making under uncertainty.
- How they did it: agents learn by playing against copies of themselves, then search at move time while reasoning about the opponent's hidden pieces.
- So what: cheaper, stronger methods could spread to many uncertain, adversarial settings.
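To make "planning while accounting for what's hidden" concrete, here is a minimal toy sketch. It is not the paper's algorithm. It only illustrates one generic idea: sample guesses for the hidden piece, score each move against every guess, and pick the move with the best average. The function names, the two-action setup, and the uniform belief are all assumptions made for illustration.

```python
import random

def sample_hidden_ranks(possible_ranks, n_samples, rng):
    """Draw guesses for one hidden enemy piece from a uniform belief (an assumption)."""
    return [rng.choice(possible_ranks) for _ in range(n_samples)]

def payoff(action, my_rank, hidden_rank):
    """Toy payoff: attacking scores +1 on a win, -1 on a loss, 0 on a tie; retreating scores 0."""
    if action == "retreat":
        return 0.0
    if my_rank > hidden_rank:
        return 1.0
    if my_rank < hidden_rank:
        return -1.0
    return 0.0

def choose_action(my_rank, possible_ranks, n_samples=1000, seed=0):
    """Average each action's payoff over sampled hidden ranks and return the best action."""
    rng = random.Random(seed)
    samples = sample_hidden_ranks(possible_ranks, n_samples, rng)
    totals = {"attack": 0.0, "retreat": 0.0}
    for hidden_rank in samples:
        for action in totals:
            totals[action] += payoff(action, my_rank, hidden_rank)
    values = {action: total / n_samples for action, total in totals.items()}
    best = max(values, key=values.get)
    return best, values

# Hypothetical usage: a strong piece facing a piece known to be weak should attack.
best, values = choose_action(my_rank=9, possible_ranks=[1, 2, 3])
print(best, values)
```

Plain averaging like this ignores that the opponent is also reasoning strategically, which is one reason imperfect-information games need more careful search than this sketch suggests.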
Read the paper: http://arxiv.org/abs/2511.07312v1
Authors: Samuel Sokota, Eugene Vinitsky, Hengyuan Hu, J. Zico Kolter, Gabriele Farina
AI ReinforcementLearning Stratego Games MachineLearning ImperfectInformation Research