When wording steers the judge: AI evaluations can flip with small changes
Small changes in wording can sway how today’s text-based AI systems judge answers, a new study finds. This matters because such models are increasingly used to grade work, screen content and compare other AI systems. If the question is phrased differently, the verdict can shift. Why this is being