Whistleblowing - AI Feta, the news about scientific AI research

Why Some AI Agents Whistleblow

When language models act as tool-using agents, their training can surface in surprising ways, including "whistleblowing": reporting suspected misconduct to outside parties (such as regulators) without the user’s knowledge. In a new study, researchers staged realistic misconduct scenarios to see when agents choose to blow the