Robots That Learn from Imagined Worlds

What if robots could learn new tasks just by watching AI-generated videos - and still obey physics?

PhysWorld is a new framework that turns text-and-image prompts into physically executable robot skills.

  • It first generates a task-conditioned video from a single scene image and a command.
  • Then it reconstructs the scene's physical world - objects and dynamics - from those frames.
  • Finally, it grounds the video motions into real robot actions via object-centric residual reinforcement learning, so the robot follows physics, not just pixels (see the sketch after this list).

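To make the residual-RL idea concrete, here is a minimal Python sketch of the general pattern: a base action retargeted from the generated video, plus a small learned correction. All names here (ResidualPolicy, step_with_residual, the action/observation shapes) are illustrative assumptions, not PhysWorld's actual API.

```python
import numpy as np

# Minimal sketch of the residual-RL pattern (hypothetical names, not
# PhysWorld's actual API): a base action retargeted from the generated
# video is refined by a small learned correction so the executed motion
# stays consistent with the reconstructed physics.

class ResidualPolicy:
    """Stand-in for a learned residual policy; here it samples a small,
    bounded correction instead of running a trained network."""

    def __init__(self, action_dim: int, scale: float = 0.05, seed: int = 0):
        self.action_dim = action_dim
        self.scale = scale  # keep corrections small relative to the base motion
        self.rng = np.random.default_rng(seed)

    def __call__(self, observation: np.ndarray) -> np.ndarray:
        # A trained policy would map the object-centric observation to a
        # correction; we sample a bounded perturbation for illustration.
        return self.scale * np.tanh(self.rng.standard_normal(self.action_dim))

def step_with_residual(base_action, observation, policy):
    """Executed action = video-derived base action + learned residual."""
    return base_action + policy(observation)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    policy = ResidualPolicy(action_dim=6)  # e.g., a 6-DoF end-effector delta
    for t in range(3):
        base = 0.1 * rng.standard_normal(6)  # from video retargeting (assumed)
        obs = rng.standard_normal(16)        # object-centric state (assumed)
        action = step_with_residual(base, obs, policy)
        print(f"t={t}: executed action {np.round(action, 3)}")
```

The design choice to learn only a residual keeps the policy's search space small: the video already supplies a plausible motion, and RL only has to correct for physical inconsistencies.
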
Why it matters: this synergy eliminates real-robot data collection, enables zero-shot manipulation across diverse tasks, and substantially improves success rates over prior "retarget the video to the robot" baselines.

Paper: http://arxiv.org/abs/2511.07416v1 • Project: https://pointscoder.github.io/PhysWorld_Web/

Register: https://www.AiFeta.com

#Robotics #AI #ComputerVision #ReinforcementLearning #GenerativeAI #VideoGeneration #EmbodiedAI