TowerMind: A lightweight tower-defense testbed for AI agents

TowerMind is a new, lightweight tower-defense game environment for testing AI agents, especially large language models (LLMs), on planning and decision-making.

  • Low compute cost and easy to run
  • Multimodal observations: pixels, text, and structured game state (see the sketch after this list)
  • Customizable levels and rules
  • Built-in tests for model hallucination
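To make the multimodal-observation point concrete, here is a minimal, hypothetical sketch of driving such an environment through a Gym-style loop. The class name, observation fields, and action strings below are illustrative assumptions, not the actual TowerMind API.

```python
import random

class FakeTowerMindEnv:
    """Stand-in for a TowerMind level; all names and fields are illustrative only."""

    def reset(self):
        # A multimodal observation: rendered frame, textual summary, and raw game state.
        return {
            "pixels": [[0] * 84 for _ in range(84)],  # placeholder frame
            "text": "Wave 1 incoming: 5 grunts on the north path.",
            "state": {"gold": 100, "lives": 20, "towers": []},
        }

    def step(self, action):
        # Placeholder: a real environment would advance the simulation based on `action`.
        obs = self.reset()
        reward, done = 1.0, False
        return obs, reward, done, {}

def choose_action(obs):
    # An agent (LLM or RL policy) would map the observation to an action here;
    # we pick randomly from a made-up discrete action set for illustration.
    return random.choice(["build_arrow_tower", "upgrade", "wait"])

env = FakeTowerMindEnv()
obs = env.reset()
for _ in range(3):
    action = choose_action(obs)
    obs, reward, done, info = env.step(action)
    if done:
        break
```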

The authors build five benchmark levels and evaluate popular LLMs under different input settings, alongside classic RL baselines (Ape-X DQN and PPO). Results show a clear gap between current agents and human experts. Common failure modes include weak plan validation, committing to a single line of play (a lack of multifinality), and inefficient use of actions.

Why it matters: RTS-style games demand both long-term strategy and quick tactical decisions, which makes them well suited to probing agent capabilities. TowerMind offers a practical, open benchmark that complements heavier RTS testbeds.

Paper: https://arxiv.org/abs/2601.05899
Code: https://github.com/tb6147877/TowerMind

#AI #LLM #ReinforcementLearning #GameAI #RTS #Benchmark #OpenSource #TowerDefense