Driving on Registers: DrivoR for lean, adaptive end-to-end driving

Meet DrivoR, a simple, efficient transformer for end-to-end autonomous driving.

It builds on pretrained Vision Transformers and adds camera-aware "register tokens" that compress multi-camera views into a compact scene summary, cutting downstream compute without sacrificing accuracy.
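To make the compression idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes each camera gets its own small set of register tokens, that registers and image patch tokens are mixed by attention, and that only the register outputs are passed downstream. A single unparameterized self-attention step stands in for the pretrained ViT, and every name and size (compress_with_registers, 4 cameras, 256 patches, 16 registers) is a hypothetical chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def compress_with_registers(patch_tokens, registers):
    """Mix registers with patches, then keep only the registers.

    patch_tokens: (cameras, patches, dim) image tokens per camera.
    registers:    (cameras, regs, dim) camera-specific register tokens
                  (random here; learned in a real model).
    Returns a scene summary of shape (cameras * regs, dim).
    """
    num_cams, _, dim = patch_tokens.shape
    num_regs = registers.shape[1]

    # Prepend the registers to each camera's patch sequence.
    tokens = np.concatenate([registers, patch_tokens], axis=1)

    # One self-attention step (no learned projections) as a stand-in
    # for the transformer layers of a pretrained ViT.
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(dim)
    mixed = softmax(scores, axis=-1) @ tokens

    # Discard the patch outputs; the registers carry the summary.
    scene = mixed[:, :num_regs, :]
    return scene.reshape(num_cams * num_regs, dim)


rng = np.random.default_rng(0)
num_cams, num_patches, num_regs, dim = 4, 256, 16, 32
patches = rng.normal(size=(num_cams, num_patches, dim))
regs = rng.normal(size=(num_cams, num_regs, dim))

scene_tokens = compress_with_registers(patches, regs)
print(num_cams * num_patches, "->", scene_tokens.shape[0])  # 1024 -> 64
```

With these illustrative sizes, the decoders would attend over 64 tokens instead of 1,024, which is where the downstream compute saving comes from.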

DrivoR then runs two lightweight decoders: one to propose driving paths, and one to score them by mimicking an oracle, with interpretable sub-scores you can tune at inference:

  • Safety
  • Comfort
  • Efficiency
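As a rough illustration of what tuning those sub-scores at inference could look like, the sketch below re-ranks candidate paths with user-chosen weights. It assumes the scorer outputs one value per sub-score for each proposed path and that the values are combined by a weighted sum; the actual combination rule used by DrivoR may differ, and ScoredPath, pick_path, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScoredPath:
    """A candidate path with hypothetical sub-scores in [0, 1]."""
    name: str
    safety: float
    comfort: float
    efficiency: float


def pick_path(candidates, weights):
    """Return the candidate with the highest weighted sum of sub-scores.

    The weighted sum is an assumption made for this sketch.
    """
    def total(path):
        return (weights["safety"] * path.safety
                + weights["comfort"] * path.comfort
                + weights["efficiency"] * path.efficiency)

    return max(candidates, key=total)


candidates = [
    ScoredPath("cautious", safety=0.95, comfort=0.90, efficiency=0.40),
    ScoredPath("assertive", safety=0.80, comfort=0.60, efficiency=0.95),
]

# Same candidates and sub-scores, two different driving styles.
careful = pick_path(candidates, {"safety": 2.0, "comfort": 1.0, "efficiency": 0.5})
hurried = pick_path(candidates, {"safety": 1.0, "comfort": 0.5, "efficiency": 2.0})
print(careful.name, hurried.name)  # cautious assertive
```

The point of the design is that behavior changes by adjusting weights only; nothing is retrained.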

Despite its minimal design, DrivoR matches or beats strong baselines on NAVSIM-v1, NAVSIM-v2, and the photorealistic, closed-loop HUGSIM benchmark.

Takeaway: a pure-transformer stack plus targeted token compression can deliver accurate, efficient, and behavior-adaptive end-to-end driving. Paper: https://arxiv.org/abs/2601.05083v1. Code and checkpoints will be released on the project page.

#AutonomousDriving #AI #ComputerVision #Transformers #Robotics #Safety #Efficiency