Adversarial images fall apart when you cover them up

Neural networks can be fooled by adversarial images, but here's a twist: those doctored images are even more fragile than real ones, especially when parts of them are hidden.

Researchers tested nine popular attacks (like FGSM and PGD) on CIFAR-10 and slid a small mask across each image while watching the model’s confidence. They introduced a metric called Sliding Mask Confidence Entropy (SMCE) to quantify how much confidence wobbles under occlusion. The result: adversarial examples show far higher volatility than clean images.
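
Here's a rough idea of what that measurement can look like in code. This is a minimal sketch assuming a PyTorch CIFAR-10 classifier; the mask size, stride, zero-fill value, and the histogram-based entropy are illustrative assumptions, not the paper's exact SMCE definition.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sliding_mask_confidence_entropy(model, image, mask_size=8, stride=4):
    """image: (3, H, W) float tensor, already normalized for `model`."""
    model.eval()
    _, H, W = image.shape
    base_pred = model(image.unsqueeze(0)).argmax(dim=1)  # class predicted with no mask

    # Record the predicted class's confidence at every mask position.
    confidences = []
    for top in range(0, H - mask_size + 1, stride):
        for left in range(0, W - mask_size + 1, stride):
            occluded = image.clone()
            occluded[:, top:top + mask_size, left:left + mask_size] = 0.0  # zero-fill patch
            probs = F.softmax(model(occluded.unsqueeze(0)), dim=1)
            confidences.append(probs[0, base_pred].item())

    # Entropy of the confidence histogram: clean images cluster tightly (low
    # entropy), adversarial ones wobble across bins (high entropy).
    hist = torch.histc(torch.tensor(confidences), bins=10, min=0.0, max=1.0)
    p = hist / hist.sum()
    p = p[p > 0]
    return -(p * p.log()).sum().item()
```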

  • Simple signal: High SMCE is a strong tell that an input is adversarial.
  • New detector: SWM-AED uses sliding-window masks to flag adversarial inputs, avoiding the catastrophic overfitting that can plague adversarial training (a minimal thresholding sketch follows this list).
  • Strong results: Across models and attacks on CIFAR-10, detection accuracy exceeded 62% in most cases and reached 96.5%.
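
A hypothetical SWM-AED-style check, reusing the sliding_mask_confidence_entropy sketch above: calibrate a threshold on clean images, then flag inputs whose score exceeds it. The quantile-based calibration is an assumed workflow, not the paper's exact procedure.

```python
import torch

def calibrate_threshold(model, clean_images, quantile=0.95):
    # Assumption: take a high quantile of clean-image SMCE scores as the cutoff.
    scores = torch.tensor([sliding_mask_confidence_entropy(model, x) for x in clean_images])
    return torch.quantile(scores, quantile).item()

def is_adversarial(model, image, threshold):
    # High entropy = confidence wobbles heavily under occlusion = likely adversarial.
    return sliding_mask_confidence_entropy(model, image) > threshold
```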

Bottom line: if an image's prediction only survives when every pixel is visible, it's probably adversarial.

Paper: http://arxiv.org/abs/2511.05073v1

Register: https://www.AiFeta.com

#AI #MachineLearning #DeepLearning #AdversarialExamples #ComputerVision #Security #RobustML
