Teaching robots to ask for clarification in 3D

Kari Jaaskelainen

12 Jan 2026 — 1 min read

When robots should ask: Which one?

In safety-critical settings such as operating rooms, a vague command like "Pass me the vial" can be dangerous when several vials are within reach. This paper introduces a simple idea: teach AI to detect when an instruction is ambiguous in a 3D scene and to pause and ask for clarification before acting.

New task: Open-Vocabulary 3D Instruction Ambiguity Detection — decide if a command has exactly one clear target in a scene.
New dataset: Ambi3D with 700+ diverse scenes and ~22k instructions to stress-test models.
Key finding: Today’s leading 3D LLMs often fail to recognize when an instruction is ambiguous.
New method: AmbiVer, a two-stage system that gathers visual evidence from multiple views and uses it to judge clarity more reliably.
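
To make the task concrete, here is a minimal Python sketch of the "exactly one clear target" check. It is not AmbiVer or anything from the paper: it uses exact label matching on a hand-written toy scene, and the names SceneObject, find_candidates, and respond are hypothetical, made up for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """A labeled object in a toy 3D scene (hypothetical structure)."""
    label: str
    position: tuple[float, float, float]


def find_candidates(target_label: str, scene: list[SceneObject]) -> list[SceneObject]:
    """Return every object whose label matches the requested target."""
    wanted = target_label.strip().lower()
    return [obj for obj in scene if obj.label.lower() == wanted]


def respond(target_label: str, scene: list[SceneObject]) -> str:
    """Act only when exactly one object matches; otherwise ask."""
    candidates = find_candidates(target_label, scene)
    if len(candidates) == 1:
        return f"Passing the {target_label}."
    if not candidates:
        return f"I cannot find a {target_label}. Can you describe it?"
    return f"I see {len(candidates)} {target_label}s. Which one?"


# Toy scene: two vials and one scalpel on a tray (assumed data).
scene = [
    SceneObject("vial", (0.1, 0.4, 0.9)),
    SceneObject("vial", (0.6, 0.4, 0.9)),
    SceneObject("scalpel", (0.3, 0.2, 0.9)),
]

print(respond("vial", scene))
print(respond("scalpel", scene))
```

With two vials in the scene, the first call asks "Which one?" instead of acting, while the second call proceeds because only one scalpel matches. A real open-vocabulary system would have to ground free-form language in visual evidence rather than compare label strings.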

Why it matters: More cautious, trustworthy assistants — from hospitals and labs to warehouses and homes.

Paper: https://arxiv.org/abs/2601.05991v1

Project page and code: https://jiayuding031020.github.io/ambi3d/

Register: https://www.AiFeta.com

#AI #Robotics #Safety #ComputerVision #3D #LLM #VLM #HRI #EmbodiedAI
