Medusa: Tiny image tweaks can mislead medical AI

A new study warns that the multimodal medical AI systems behind report writing and diagnosis can be steered off course by barely visible tweaks to images.

Researchers introduce Medusa, a black-box attack that nudges an input scan so the system retrieves the wrong supporting evidence and then generates misleading text. Medusa learns to make tampered images look, to the AI, like medically plausible but incorrect text cues, and it transfers across different models by training on an ensemble with a dual-loop strategy.
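
For intuition, here is a minimal, hypothetical sketch of that general idea (ensemble surrogates, embedding alignment, a small pixel budget). The encoders, loss, and loop structure below are placeholder assumptions, not the paper's actual Medusa implementation:

```python
# Illustrative sketch only: a generic ensemble-based, embedding-alignment
# perturbation in the spirit described above. Architectures, loss, and
# loop structure are assumptions, not the paper's Medusa method.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-ins for the surrogate image encoders in the ensemble (assumed).
ensemble = [torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 128))
            for _ in range(3)]
for enc in ensemble:
    enc.eval()

image = torch.rand(1, 3, 64, 64)        # clean scan (dummy data)
target_text_emb = torch.randn(1, 128)   # embedding of the misleading text cue (dummy)

eps = 4 / 255      # L_inf budget: keeps the tweak barely visible
alpha = 1 / 255    # step size
delta = torch.zeros_like(image, requires_grad=True)

# Outer loop refines the perturbation; inner loop aggregates over the ensemble.
for step in range(50):
    loss = 0.0
    for enc in ensemble:
        emb = enc(image + delta)
        # Pull the tampered image's embedding toward the target text cue.
        loss = loss - F.cosine_similarity(emb, target_text_emb).mean()
    loss.backward()
    with torch.no_grad():
        delta -= alpha * delta.grad.sign()   # signed gradient step
        delta.clamp_(-eps, eps)              # stay within the perturbation budget
        delta.clamp_(-image, 1 - image)      # keep pixel values in [0, 1]
    delta.grad.zero_()

adv_image = (image + delta).detach()
print("max pixel change:", delta.abs().max().item())
```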

  • Targets retrieval-augmented vision-language systems used in radiology and diagnosis.
  • Works across models without internal access (black box).
  • Achieved over 90% attack success on two real-world tasks and bypassed four common defenses.

Why this matters: in safety-critical care, a subtle image perturbation could silently sway evidence retrieval and the final report.

Suggested actions: establish robustness benchmarks, audit retrieval steps, add cross-modal consistency checks, and stress-test models with transferable attacks before deployment.
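
As one example of a cross-modal consistency check, a deployment could flag cases where the image and the retrieved evidence embed far apart. The encoders, embeddings, and threshold below are illustrative assumptions, not a prescription from the paper:

```python
# Illustrative sketch of a cross-modal consistency check (not from the paper):
# flag cases where the input image and the retrieved evidence disagree.
import torch
import torch.nn.functional as F

def consistency_score(image_emb: torch.Tensor, evidence_emb: torch.Tensor) -> float:
    """Cosine similarity between image embedding and retrieved-text embedding."""
    return F.cosine_similarity(image_emb, evidence_emb, dim=-1).item()

# Dummy embeddings standing in for a real shared image/text encoder's output.
image_emb = torch.randn(1, 128)
evidence_emb = torch.randn(1, 128)

THRESHOLD = 0.2  # assumed; tune on held-out clean data
if consistency_score(image_emb, evidence_emb) < THRESHOLD:
    print("Warning: retrieved evidence may not match the image; route to human review.")
```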

Paper: https://arxiv.org/abs/2511.19257v1 • Code: https://anonymous.4open.science/r/MMed-RAG-Attack-F05A

Register: https://www.AiFeta.com

#AI #Healthcare #MedicalAI #CyberSecurity #AdversarialAI #RAG #Radiology #PatientSafety #MLSafety