FITRep: Transparent AI to De-duplicate Lookalike Items

Kari Jaaskelainen

27 Nov 2025

Duplicate and lookalike items clutter online feeds and ads, hurting user experience.

FITRep is an attention-guided, white-box approach to representing items for fine-grained deduplication. Inspired by Feature Integration Theory, it teaches multimodal LLMs (MLLMs) to separate an item's primary features from its auxiliary ones, so that the item's structure does not collapse into one vague embedding.

FITRep has three components:

CHIE: extracts a hierarchy of semantic concepts from an item's text and visuals.
SPDR: an adaptive, UMAP-based dimensionality reduction that compresses representations while preserving their structure.
FBC: FAISS-powered clustering that assigns each item a stable, unique cluster ID.
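
To make the last step concrete, here is a minimal sketch of what "assigning each item a stable cluster ID" can look like. It is not the paper's method and uses neither UMAP nor FAISS: it swaps in a greedy cosine-threshold grouping in pure Python so the idea runs anywhere. The function names, the 0.95 threshold, and the toy embeddings are all hypothetical, chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_cluster_ids(items, threshold=0.95):
    """Map each item ID to a cluster ID (hypothetical stand-in for the FBC step).

    Items are visited in sorted ID order, so the same input always yields
    the same IDs. An item joins the first cluster whose representative is
    at least `threshold` similar; otherwise it starts a new cluster and
    becomes that cluster's representative.
    """
    representatives = []  # list of (cluster_id, vector)
    cluster_of = {}
    for item_id in sorted(items):
        vector = items[item_id]
        for cluster_id, rep in representatives:
            if cosine(vector, rep) >= threshold:
                cluster_of[item_id] = cluster_id
                break
        else:
            cluster_id = len(representatives)
            representatives.append((cluster_id, vector))
            cluster_of[item_id] = cluster_id
    return cluster_of


# Toy embeddings (made up): two near-identical listings and one distinct item.
items = {
    "burger_a": [1.0, 0.0, 0.1],
    "burger_a_copy": [0.98, 0.02, 0.1],
    "sushi": [0.0, 1.0, 0.0],
}
ids = assign_cluster_ids(items)
print(ids)  # {'burger_a': 0, 'burger_a_copy': 0, 'sushi': 1}
```

Items that share a cluster ID are treated as lookalikes, so a feed can show one per cluster. A production system would replace the inner loop with an approximate nearest-neighbor index, which is the role the post attributes to FAISS.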

Deployed in Meituan’s ad system, FITRep lifted CTR by 3.60% and CPM by 4.25% in online A/B tests: cleaner feeds, better matching, and real revenue impact.

Bottom line: fewer near-duplicates, clearer item meaning, and faster, scalable clustering, all without the black-box guesswork.

Paper: https://arxiv.org/abs/2511.21389v1

Register: https://www.AiFeta.com

#MLLM #RecommenderSystems #InformationRetrieval #Deduplication #AdTech #UMAP #FAISS #CTR #CPM #AI #Meituan
