DIF-V: A Diverse, Inclusive Synthetic Face Dataset for Fairer Verification

Kari Jaaskelainen

24 Nov 2025

Face verification powers everything from phone unlocks to banking logins, but the data used to train these systems often skews toward certain races and genders. This paper presents a method for generating high-quality, ID-photo-style synthetic faces that better reflect real-world diversity, and introduces a new benchmark: Diverse and Inclusive Faces for Verification (DIF-V).

DIF‑V includes 27,780 images across 926 identities for fairer evaluation.
Popular verification models still favor some genders and races.
Applying identity style changes reduces verification accuracy.
Generation rules enforce ID‑photo standards (lighting, pose, background).
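
To make the bias finding concrete, here is a minimal sketch of how a per-group accuracy gap could be measured for a verification model. It is not the paper's evaluation protocol: the function names, the 0.5 threshold, the group labels, and the toy similarity scores are all assumptions chosen for illustration.

```python
from collections import defaultdict


def per_group_accuracy(pairs, threshold=0.5):
    """Return verification accuracy per demographic group.

    pairs: iterable of (group, similarity, is_same_identity) tuples.
    A pair is predicted "same identity" when similarity >= threshold.
    The threshold value is a hypothetical choice, not one from the paper.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, similarity, is_same in pairs:
        predicted_same = similarity >= threshold
        correct[group] += int(predicted_same == is_same)
        total[group] += 1
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(accuracy_by_group):
    """Difference between the best-served and worst-served group."""
    return max(accuracy_by_group.values()) - min(accuracy_by_group.values())


# Toy scores (invented for illustration, not results from DIF-V).
toy_pairs = [
    ("group_a", 0.91, True),
    ("group_a", 0.12, False),
    ("group_a", 0.75, True),
    ("group_a", 0.30, False),
    ("group_b", 0.42, True),   # genuine pair scored below the threshold
    ("group_b", 0.20, False),
    ("group_b", 0.66, True),
    ("group_b", 0.58, False),  # impostor pair scored above the threshold
]

accuracy = per_group_accuracy(toy_pairs)  # {'group_a': 1.0, 'group_b': 0.5}
gap = accuracy_gap(accuracy)              # 0.5
```

A gap near zero would suggest the model treats the groups similarly at that threshold. A real audit would also report false-match and false-non-match rates separately, since a single accuracy figure can hide which kind of error a group experiences.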

The aim: more inclusive, transparent, and reliable face verification, with fewer privacy risks than using real faces.

Paper: https://arxiv.org/abs/2511.17393v1


