Bias scores shift with context: Contextual StereoSet

Kari Jaaskelainen

16 Jan 2026

Models that look fair in the lab can slip in the wild. Contextual StereoSet shows how much a measured bias score moves when only the framing of the prompt changes, with no adversarial prompting required.

Same stereotypes, new frames: Hold the stereotype content constant and vary only the time, place, or audience of the prompt.
Striking shifts: Anchoring the prompt to 1990 rather than 2030 raised the rate of stereotype choices in all tested models (p < 0.05). Gossip framing raised it in 5 of 6 models. Out-group observer framing shifted rates by up to 13 percentage points.
Across domains: Effects replicate in hiring, lending, and help-seeking vignettes.
Quick or deep: A 360-context diagnostic grid for deep dives, and a budgeted protocol covering 4,229 items for production screening.
CSF profiles: Context Sensitivity Fingerprints summarize how a model’s bias score disperses across contexts, with bootstrap CIs and FDR-corrected contrasts.
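
To make the fingerprint idea concrete, here is a minimal sketch in Python of the kind of statistics the list above mentions: a per-context stereotype rate, its dispersion across contexts, a percentile bootstrap CI, and a Benjamini-Hochberg FDR correction. This is not the paper's implementation. The function names, the choice of max-minus-min as the dispersion measure, and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean


def stereotype_rate(choices):
    """Fraction of items where the stereotypical option was chosen (1 = yes, 0 = no)."""
    return mean(choices)


def dispersion(by_context):
    """Spread of stereotype rates across contexts.

    Assumption: max minus min is used here as a simple dispersion measure;
    the paper may define its fingerprint differently.
    """
    rates = [stereotype_rate(c) for c in by_context.values()]
    return max(rates) - min(rates)


def bootstrap_ci(by_context, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the dispersion, resampling items within each context."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        resampled = {
            ctx: rng.choices(c, k=len(c)) for ctx, c in by_context.items()
        }
        stats.append(dispersion(resampled))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True if rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most (k / m) * q.
    max_k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            max_k = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_k:
            rejected[i] = True
    return rejected


if __name__ == "__main__":
    # Toy, made-up choices for two hypothetical context frames (not real results).
    toy = {"year_1990": [1, 1, 0, 1], "year_2030": [1, 0, 0, 0]}
    print("dispersion:", dispersion(toy))
    print("95% bootstrap CI:", bootstrap_ci(toy))
    # Made-up p-values for four hypothetical context contrasts.
    print("rejected:", benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))
```

With real data, each context's list would hold one 0/1 outcome per benchmark item, and the p-values would come from whatever per-contrast test the evaluation uses. Resampling items within each context keeps the context sizes fixed, which is one reasonable choice among several.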

The takeaway: stop asking "Is this model biased?" and start asking "Under what conditions does bias appear?" Contextual StereoSet is a robustness stress test, not a claim about ground-truth bias rates. Code, benchmark, and results: https://arxiv.org/abs/2601.10460v1

Paper: https://arxiv.org/abs/2601.10460v1


#AI #MachineLearning #LLM #NLP #AIEthics #ResponsibleAI #Bias #Evaluation
