Teaching AI to Finish Long, Complex Tasks with Timely Human Help

AI agents are getting good at coding and research, but they still struggle with long, domain-specific projects. Training them is hard: full-time human annotation is too costly, while pure trial-and-error rarely finds successful paths.

Apollo is a new sampling framework that blends lightweight, asynchronous human guidance with strict action-level filtering.

Humans step in only when the agent drifts, offering tips or domain knowledge, which makes 30+ hour sessions feasible.
Action-level filtering removes weak steps so that errors do not compound and the retained trajectories stay useful for training.
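
To make the filtering idea concrete, here is a minimal sketch, not Apollo's actual implementation. This summary does not say how steps are scored, so the `Step` class, the `score` field, and the `threshold` value are all hypothetical stand-ins. It shows only the general pattern of dropping low-quality actions from a trajectory before it is used as training data.

```python
from dataclasses import dataclass


@dataclass
class Step:
    action: str    # what the agent did at this step
    score: float   # hypothetical quality score in [0, 1]; the scoring method is assumed


def filter_steps(trajectory, threshold=0.5):
    """Keep only the steps whose score meets the (assumed) threshold."""
    return [step for step in trajectory if step.score >= threshold]


# Toy trajectory with one weak step in the middle.
trajectory = [
    Step("read task description", 0.9),
    Step("edit unrelated file", 0.2),
    Step("run training script", 0.8),
]

kept = filter_steps(trajectory)
print([step.action for step in kept])
```

In this toy run the middle step falls below the threshold and is dropped, so it cannot teach the model a bad habit.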

On InnovatorBench, training GLM-4.5 with Apollo delivered 50%+ gains over the untrained baseline and 28% over a no-human variant.

Bottom line: smarter human-in-the-loop interaction can reliably train agents for long-horizon tasks like coding, deep research, and UI automation.

Paper: http://arxiv.org/abs/2510.27630v2



