X-Diffusion: Robots Learn from Human Videos—Safely and Effectively

Robots learning from human videos—without the bad habits

Human demos are cheap and plentiful, but humans and robots have different bodies, so naively copying human motions often fails. X-Diffusion turns this mismatch into a strength.

The trick: add noise to actions. As noise increases, tiny execution details vanish while the high-level intent remains. A classifier learns to tell whether a noisy action looks human or robot. During training, human actions are only used after enough noise is added that the classifier can’t tell the difference. Robot actions supervise precise steps at low noise; human actions give broad guidance at higher noise.
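The selection rule above can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the classifier here is a hand-written stand-in whose confidence simply decays with the noise level, and the noise schedule, threshold, and helper names (`add_noise`, `min_usable_noise`, `training_pairs`) are all assumptions made for the sketch.

```python
import numpy as np

RNG = np.random.default_rng(0)

def add_noise(action, t):
    """Forward diffusion step: interpolate toward Gaussian noise as t -> 1."""
    return np.sqrt(1.0 - t) * action + np.sqrt(t) * RNG.standard_normal(action.shape)

def classifier_prob_human(noisy_action, t):
    """Stand-in for the learned human-vs-robot classifier.
    The real model is trained on noisy actions; here confidence decays
    linearly with t purely for illustration (certain at t=0, chance at t=1)."""
    return 0.5 + 0.5 * (1.0 - t)

def min_usable_noise(action, threshold=0.55, levels=np.linspace(0.0, 1.0, 21)):
    """Smallest noise level at which the classifier can no longer tell the
    action is human (probability near chance, i.e. below the threshold)."""
    for t in levels:
        noisy = add_noise(action, t)
        if classifier_prob_human(noisy, t) <= threshold:
            return t
    return 1.0

def training_pairs(robot_actions, human_actions):
    """Robot actions supervise every noise level; each human action
    supervises only levels at or above its indistinguishability point."""
    pairs = []
    for a in robot_actions:
        for t in np.linspace(0.0, 1.0, 5):
            pairs.append((a, t, "robot"))
    for a in human_actions:
        t_min = min_usable_noise(a)
        for t in np.linspace(t_min, 1.0, 3):
            pairs.append((a, t, "human"))
    return pairs
```

With this toy classifier, human actions only enter training at high noise levels, while robot actions cover the full range down to t = 0, matching the split described above.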

  • Safer learning: avoids physically impossible robot motions.
  • Better performance: +16% average success across five manipulation tasks vs. the best baseline.

In short, X-Diffusion maximizes what humans do best—show intent—while letting robots handle execution.

Project: https://portal-cornell.github.io/X-Diffusion/

Paper: http://arxiv.org/abs/2511.04671v1

Register: https://www.AiFeta.com

#Robotics #AI #DiffusionModels #RobotLearning #EmbodiedAI #MachineLearning #ImitationLearning #Research