AI
Probing AI Isn’t So Simple: Synthetic Training Can Mislead
To keep AI models honest, researchers train tiny “probes” that look inside a model’s activations to flag behaviors like deception or sycophancy. But real examples of these behaviors are rare, so teams often use synthetic, AI-generated data instead. This study tested how well