Cogito, Ergo Ludo: An Agent that Learns to Play by Reasoning and Planning
CEL learns rules and strategies from scratch by reflecting on its own trajectories.
Instead of memorizing behaviors from vast experience, CEL (Cogito, Ergo Ludo) learns to play by reasoning and planning in language. Starting tabula rasa (knowing only the action set), the agent interacts with an environment, then reflects on the full episode to update two explicit, human-readable artifacts: (1) a rule model of environment dynamics and (2) a strategic playbook distilled from experience.
This interaction–reflection cycle powers two concurrent processes, sketched in code below:
- Rule Induction: Discover and refine the environment’s mechanics from observed trajectories.
- Strategy and Playbook Summarization: Extract reusable, actionable tactics for future episodes.
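In code, the cycle might look like the minimal sketch below. It assumes an `llm(prompt) -> str` completion function and a gym-style `env` with `reset()` and `step()`; all class, prompt, and helper names here are illustrative, not taken from the paper.

```python
# A minimal sketch of CEL's interaction-reflection loop (illustrative, not
# the paper's implementation). Assumes `llm(prompt) -> str` and a gym-like env.
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class CELAgent:
    llm: Callable[[str], str]          # language model used for planning and reflection
    actions: List[str]                 # the only prior knowledge: the action set
    rules: str = "No rules known yet." # artifact 1: human-readable environment dynamics
    playbook: str = "No tactics yet."  # artifact 2: distilled strategic advice

    def act(self, observation: str) -> str:
        """Plan in language: pick an action using the current rules and playbook."""
        prompt = (
            f"Known rules:\n{self.rules}\n\nPlaybook:\n{self.playbook}\n\n"
            f"Observation:\n{observation}\n\n"
            f"Choose one action from {self.actions}. Reply with the action only."
        )
        choice = self.llm(prompt).strip()
        return choice if choice in self.actions else self.actions[0]  # fall back if malformed

    def reflect(self, trajectory: List[Tuple[str, str, float]]) -> None:
        """Post-episode reflection: refine both artifacts from the full trajectory."""
        transcript = "\n".join(f"obs={o} act={a} reward={r}" for o, a, r in trajectory)
        # Rule induction: revise the dynamics model against the observed evidence.
        self.rules = self.llm(
            f"Current rules:\n{self.rules}\n\nEpisode:\n{transcript}\n\n"
            "Rewrite the rules so they are consistent with this episode."
        )
        # Strategy summarization: distill reusable tactics for future episodes.
        self.playbook = self.llm(
            f"Current playbook:\n{self.playbook}\n\nEpisode:\n{transcript}\n\n"
            "Update the playbook with reusable, actionable tactics."
        )

def run_episode(agent: CELAgent, env) -> None:
    """One interaction-reflection cycle: act for a full episode, then reflect."""
    trajectory, obs, done = [], env.reset(), False
    while not done:
        action = agent.act(obs)
        next_obs, reward, done = env.step(action)  # assumed gym-like step signature
        trajectory.append((obs, action, reward))
        obs = next_obs
    agent.reflect(trajectory)
```

Note that both artifacts live entirely in text: reflection is a prompt that rewrites them, which is what makes the learned knowledge inspectable and editable.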
Evaluated on grid-world tasks such as Minesweeper, Frozen Lake, and Sokoban, CEL autonomously uncovers the rules and learns effective policies from sparse rewards, with no external annotations or prior task knowledge required. Ablations confirm that iterative reflection is essential for sustained improvement, suggesting a promising route to interpretable, general agents that explain not just what they do, but why.
Why it matters: By encoding knowledge in language, CEL offers transparency, editability, and transferability. Designers can inspect the learned rules and strategies, adapt them to new settings, or deliberately scaffold learning, bridging the gap between black-box performance and human-understandable intelligence.
Paper: Cogito, Ergo Ludo (arXiv)
Register: https://www.AiFeta.com
#Agents #Planning #Reasoning #GameAI #Interpretability #LLM #SelfReflection #Explainability