PerfDojo: An AI coach for faster ML on any chip
In a nutshell
Making ML run fast on every chip is hard. Different CPUs, GPUs, and accelerators need different tricks, especially with features like sparsity and quantization. Manual tuning is slow; many automatic tools are opaque.
What’s new
- PerfLLM + PerfDojo turn optimization into a reinforcement learning game guided by large language models.
- Programs are written in a human-readable, math-inspired code representation. Every transformation applied to it preserves the program's meaning, so the search for faster variants only ever visits valid code.
- Works without prior hardware-specific knowledge, enabling both human insight and effective agent training.
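To make the "optimization as a game" idea concrete, here is a minimal toy sketch in Python. It is not PerfDojo's actual representation or API. The state is the loop order of a small matrix multiply, each move is a loop interchange (which leaves the result unchanged), and the reward is the improvement under a made-up cost model. The names `matmul`, `actions`, `cost`, and `play` are hypothetical, and a simple greedy player stands in for the LLM-guided RL agent.

```python
import itertools

def matmul(order, A, B):
    """Compute C = A @ B with the three loops nested in the given order, e.g. "ikj"."""
    n, m, p = len(A), len(B), len(B[0])
    extent = {"i": n, "k": m, "j": p}
    C = [[0] * p for _ in range(n)]
    for a in range(extent[order[0]]):
        for b in range(extent[order[1]]):
            for c in range(extent[order[2]]):
                idx = dict(zip(order, (a, b, c)))
                i, j, k = idx["i"], idx["j"], idx["k"]
                C[i][j] += A[i][k] * B[k][j]
    return C

def actions(order):
    """Legal moves: interchange two adjacent loops."""
    return [order[:p] + order[p + 1] + order[p] + order[p + 2:]
            for p in range(len(order) - 1)]

def cost(order):
    """Hypothetical cost model (an assumption, not a measurement):
    count the row-major arrays that the innermost loop walks with a stride."""
    return {"j": 0, "k": 1, "i": 2}[order[-1]]

def play(start, max_steps=5):
    """Greedy stand-in for an agent: take the best-rewarded move until none helps."""
    state, total = start, 0
    for _ in range(max_steps):
        best = max(actions(state), key=lambda s: cost(state) - cost(s))
        reward = cost(state) - cost(best)
        if reward <= 0:
            break
        state, total = best, total + reward
    return state, total

if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[1, 0], [0, 1], [2, 2]]
    ref = matmul("ijk", A, B)
    # Every loop order yields the same result, so every move is semantically valid.
    assert all(matmul("".join(p), A, B) == ref
               for p in itertools.permutations("ijk"))
    print(play("ijk"))
```

Starting from `"ijk"`, the greedy player moves to `"ikj"` and stops. Starting from `"jik"`, it finds no single move that helps and stays put, even though a better order is two moves away. That kind of local optimum is one reason a learned policy can be more attractive than a greedy rule.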
Why it matters
- Portable speed-ups: the paper reports gains across CPUs (x86, Arm, RISC-V) and GPUs.
- Fewer black-box heuristics; more interpretable optimization.
- Automated performance tuning without arcane hardware hacks.
By Andrei Ivanov, Siyuan Shen, Gioele Gottardo, Marcin Chrapek, Afif Boudaoud, Timo Schneider, Luca Benini, Torsten Hoefler. Read more: http://arxiv.org/abs/2511.03586v1
Register: https://www.AiFeta.com
#MachineLearning #AI #Performance #LLM #ReinforcementLearning #CPUs #GPUs #RISCV #Arm #Systems #HPC #Research