In-Video Instructions: Visual Signals as Generative Control
What if you could direct a video generator by doodling right on the frames? The paper introduces In-Video Instruction: instead of writing long, vague text prompts, you draw visual cues (overlaid words, arrows, or motion paths) directly on the image. Each cue acts as a concrete instruction tied to a specific object.
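To make the idea of an in-frame cue concrete, here is a minimal sketch of how one arrow with a short command could be described and drawn as an overlay. It is an illustration only, not the paper's tooling: `ArrowCue`, `render_overlay`, the red styling, and the SVG output format are all assumptions chosen so the example runs with the Python standard library alone.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class ArrowCue:
    """Hypothetical cue: an arrow from an object toward where it should move."""
    start: tuple[int, int]  # pixel position of the object the cue is tied to
    end: tuple[int, int]    # pixel position the arrow points at
    label: str = ""         # optional short command drawn near the arrow's tail


def render_overlay(width: int, height: int, cues: list[ArrowCue]) -> str:
    """Return an SVG document that draws each cue over a width x height frame."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        # One arrowhead marker, reused by every arrow line.
        '<defs><marker id="head" markerWidth="10" markerHeight="8" '
        'refX="9" refY="4" orient="auto">'
        '<path d="M0,0 L10,4 L0,8 z" fill="red"/></marker></defs>',
    ]
    for cue in cues:
        (x1, y1), (x2, y2) = cue.start, cue.end
        parts.append(
            f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" '
            'stroke="red" stroke-width="4" marker-end="url(#head)"/>'
        )
        if cue.label:
            # Escape the label so characters like "<" cannot break the markup.
            parts.append(
                f'<text x="{x1}" y="{y1 - 8}" fill="red" font-size="20">'
                f'{escape(cue.label)}</text>'
            )
    parts.append("</svg>")
    return "\n".join(parts)


# Example: tell the object at (120, 200) to move toward (400, 200).
overlay = render_overlay(640, 360, [ArrowCue((120, 200), (400, 200), "run right")])
```

The sketch only produces the overlay markup. In practice the cue would still have to be rasterized onto the actual frame before that frame is handed to the video generator, and that step depends on whichever imaging tool you use.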