Are Smarter AI Models Safer? It Depends

A new safety report puts seven frontier models—GPT-5.2, Gemini 3 Pro, Qwen3‑VL, Doubao 1.8, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5—through a unified, cross‑modal test spanning language, vision‑language, and image generation.

  • Headline: Safety performance is highly uneven. GPT‑5.2 is the most balanced overall, but no model is safe across the board.
  • Benchmarks ≠ reality: All models drop sharply under adversarial prompts, in both text and vision‑language tasks.
  • Multilingual gaps: Alignment can falter outside English, exposing policy and compliance blind spots.
  • Images: Text‑to‑image systems show better alignment on regulated categories yet remain brittle to tricky or ambiguous requests.

Bottom line: Safety is multidimensional—shaped by modality, language, and test method. Standardized, cross‑modal evaluations are essential to gauge real‑world risk and guide responsible deployment.

Read the report: https://arxiv.org/abs/2601.10527v1

Register: https://www.AiFeta.com

#AI #AISafety #LLM #Multimodal #ResponsibleAI #Evaluation #AdversarialRobustness
