AI that writes the rules: LEARN-Opt designs rewards for robots—no code needed

Kari Jaaskelainen

25 Nov 2025 — 1 min read

TL;DR

Teaching robots with reinforcement learning hinges on the reward function: the 'scorecard' that tells them how well they are doing the task. Designing good rewards by hand is difficult and time-consuming.
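To make the 'scorecard' idea concrete, here is a minimal sketch of a hand-written reward for a hypothetical reaching task. The function name, the inputs, and every weight are illustrative assumptions and are not taken from the paper. The point is only that each term and each weight is a design decision someone has to make and tune.

```python
import math


def reach_reward(gripper_pos, target_pos, action,
                 dist_weight=1.0, effort_weight=0.01,
                 success_bonus=10.0, success_radius=0.05):
    """Hypothetical hand-designed reward for a reaching task (illustration only)."""
    # Penalize distance between the gripper and the target.
    dist = math.dist(gripper_pos, target_pos)
    # Penalize large actions so the robot does not thrash.
    effort = sum(a * a for a in action)
    reward = -dist_weight * dist - effort_weight * effort
    # Add a bonus once the gripper is close enough to count as success.
    if dist < success_radius:
        reward += success_bonus
    return reward
```

Changing any of the weights above changes what behavior the robot ends up learning, which is why manual reward design tends to take many iterations.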

LEARN-Opt is a new, fully autonomous approach that lets large language models write, run, and evaluate reward functions from plain-English task descriptions—no environment code or prebuilt metrics needed.

Autonomously derives performance metrics from the task goal.
Matches or outperforms state-of-the-art methods (like EUREKA) with less prior knowledge.
Finds strong solutions even with low-cost LLMs.
Reveals reward design is high-variance—multiple runs help surface the best candidates.
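The write, run, evaluate loop and the point about multiple runs can be sketched roughly as follows. This is not the paper's implementation: `propose_and_evaluate` is a hypothetical stand-in for "an LLM writes a reward, RL trains on it, and a derived metric scores the result", and it simply returns a random score to mimic high variance. The names `Candidate` and `search` are likewise assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    score: float  # task-level metric; higher is better


def propose_and_evaluate(task, rng, run_id, idx):
    """Placeholder for: LLM writes a reward, a policy is trained, a metric scores it.

    A random score stands in for the real (expensive, noisy) evaluation.
    """
    return Candidate(name=f"run{run_id}-cand{idx}", score=rng.random())


def search(task, runs=3, candidates_per_run=4, seed=0):
    """Run several independent searches and keep the best candidate overall."""
    rng = random.Random(seed)
    best_per_run = []
    for run_id in range(runs):
        candidates = [
            propose_and_evaluate(task, rng, run_id, i)
            for i in range(candidates_per_run)
        ]
        best_per_run.append(max(candidates, key=lambda c: c.score))
    overall_best = max(best_per_run, key=lambda c: c.score)
    return overall_best, best_per_run


best, per_run = search("balance the pole on the cart", runs=3)
print(best.name, [c.name for c in per_run])
```

Because each run's best score varies, taking the maximum over several runs can only match or improve on a single run, at the cost of extra compute.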

Bottom line: fewer engineering bottlenecks, more generalizable RL control. Paper: https://arxiv.org/abs/2511.19355v1


Register: https://www.AiFeta.com

#AI #ReinforcementLearning #LLM #Robotics #MachineLearning #Automation #Research
