ToolOrchestra: Small AI that smartly manages tools
Large models are powerful, but using them to solve deep, multi-step problems is pricey. ToolOrchestra trains a small “orchestrator” that decides which tools to use, when, and how, much like a smart project manager for AI.
It is trained with reinforcement learning, using rewards for outcomes, efficiency, and user preferences. The result: an 8B-parameter Orchestrator that beats bigger tool-use agents on tough benchmarks.
- On Humanity’s Last Exam (HLE), it scores 37.1%, edging out GPT-5 at 35.1%, while being ~2.5x more efficient.
- On tau2-Bench and FRAMES, it outperforms GPT-5 at ~30% of the cost.
- It generalizes to new tools and respects user tool choices.
Bottom line: coordinating many tools with a lightweight model can be smarter and cheaper than relying on one giant model.
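To make the reward idea concrete, here is a minimal sketch of how outcome, efficiency, and user-preference signals might be folded into one scalar reward. Everything in it is an assumption for illustration: the `Episode` fields, the weights, and the simple linear combination are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """One finished task attempt (hypothetical fields, for illustration only)."""
    solved: bool               # did the final answer pass the check?
    cost_usd: float            # total spend on tool and model calls
    latency_s: float           # wall-clock time for the whole attempt
    tools_used: list[str]      # tools the orchestrator actually called
    preferred_tools: set[str]  # tools the user said they would like used


def reward(ep: Episode, w_outcome: float = 1.0, w_cost: float = 0.1,
           w_latency: float = 0.01, w_pref: float = 0.2) -> float:
    """Combine the three signals into one scalar (assumed linear form)."""
    # Outcome: 1 if the task was solved, else 0.
    outcome = 1.0 if ep.solved else 0.0
    # Efficiency: penalize money spent and time taken.
    efficiency_penalty = w_cost * ep.cost_usd + w_latency * ep.latency_s
    # Preference: fraction of tool calls that hit a user-preferred tool.
    if ep.tools_used:
        hits = sum(tool in ep.preferred_tools for tool in ep.tools_used)
        preference = hits / len(ep.tools_used)
    else:
        preference = 0.0
    return w_outcome * outcome - efficiency_penalty + w_pref * preference


# Two attempts that both solve the task: one cheap, one expensive.
cheap = Episode(True, 0.5, 10.0, ["search", "calculator"], {"search"})
expensive = Episode(True, 5.0, 60.0, ["big_llm"], {"search"})
print(reward(cheap) > reward(expensive))
```

Under these made-up weights, two attempts that both solve the task are ranked by cost, latency, and how well they follow the user's tool preferences, which is the kind of trade-off the post describes.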
Paper by Hongjin Su et al. Details: https://arxiv.org/abs/2511.21689
AI MachineLearning LLM Tools ReinforcementLearning Efficiency Research NLP