PerfDojo: An AI coach for faster ML on any chip

In a nutshell

Making ML run fast on every chip is hard. Different CPUs, GPUs, and accelerators need different tricks, especially with features like sparsity and quantization. Manual tuning is slow; many automatic tools are opaque.

What’s new

  • PerfLLM + PerfDojo turn optimization into a reinforcement learning game guided by large language models.
  • They use a human-readable, math-inspired code representation; every transformation preserves the program's semantics, so the search only explores valid variants while looking for faster ones.
  • The approach works without prior hardware-specific knowledge, and the readable representation supports both human insight and effective agent training.
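
To make the "optimization as a game" idea concrete, here is a toy sketch in Python. It is not the paper's implementation: the function names (matmul, swap, toy_cost, search), the hand-written cost model, and the random policy are all assumptions made for illustration. The state is a loop order for a matrix multiply, each action is a loop interchange (which leaves the result unchanged), and a random policy stands in for the LLM-guided agent.

```python
import itertools
import random


def matmul(a, b, order):
    """Compute C = A @ B with the three loops nested in the given order."""
    n, m, p = len(a), len(b), len(b[0])
    extent = {"i": n, "j": p, "k": m}
    c = [[0] * p for _ in range(n)]
    for x in range(extent[order[0]]):
        for y in range(extent[order[1]]):
            for z in range(extent[order[2]]):
                idx = dict(zip(order, (x, y, z)))
                i, j, k = idx["i"], idx["j"], idx["k"]
                c[i][j] += a[i][k] * b[k][j]
    return c


def swap(order, pos):
    """Action: interchange the loops at depth pos and pos + 1."""
    o = list(order)
    o[pos], o[pos + 1] = o[pos + 1], o[pos]
    return tuple(o)


def toy_cost(order):
    """Hypothetical cost model (an assumption, not a measurement).

    With row-major storage, an innermost j loop walks B and C contiguously,
    an innermost k loop strides through B, and an innermost i loop strides
    through both A and C.
    """
    return {"j": 1, "k": 2, "i": 3}[order[2]]


def all_variants_valid(a, b):
    """Check that every loop order produces the same result as the first."""
    orders = list(itertools.permutations("ijk"))
    reference = matmul(a, b, orders[0])
    return all(matmul(a, b, o) == reference for o in orders)


def search(start, steps, seed=0):
    """Random walk over loop orders; keep the lowest-cost variant seen."""
    rng = random.Random(seed)
    best = order = start
    for _ in range(steps):
        order = swap(order, rng.randrange(2))  # pick one of the two actions
        if toy_cost(order) < toy_cost(best):   # reward is lower cost
            best = order
    return best
```

Calling search(("i", "j", "k"), 200) walks through loop orders and returns the cheapest one it visited under the toy cost model. In a real system the reward would come from measured runtime on the target chip, and a trained agent would replace the random policy.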

Why it matters

  • Portable speed-ups: the paper reports gains across CPUs (x86, Arm, RISC-V) and GPUs.
  • Fewer black-box heuristics; more interpretable optimization.
  • Automated performance without arcane hardware hacks.

By Andrei Ivanov, Siyuan Shen, Gioele Gottardo, Marcin Chrapek, Afif Boudaoud, Timo Schneider, Luca Benini, Torsten Hoefler.

Paper: http://arxiv.org/abs/2511.03586v1

#MachineLearning #AI #Performance #LLM #ReinforcementLearning #CPUs #GPUs #RISCV #Arm #Systems #HPC #Research
