FITRep: Transparent AI to De-duplicate Lookalike Items

Duplicate and lookalike items clutter online feeds and ads, hurting user experience.

FITRep is an attention-guided, white-box method for representing items for fine-grained deduplication. Inspired by Feature Integration Theory, it teaches multimodal LLMs (MLLMs) to separate primary features from auxiliary ones, so that an item's structure does not collapse into a single vague embedding.

  • CHIE: extracts a hierarchy of semantic concepts from text + visuals.
  • SPDR: an adaptive, UMAP-based reduction that preserves structure while compressing.
  • FBC: FAISS-powered clustering that assigns each item a stable, unique cluster ID.
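
To make the last step concrete, here is a minimal sketch of how near-duplicate embeddings could be grouped under shared cluster IDs. It is not the paper's FBC implementation: it swaps FAISS for a brute-force cosine-similarity scan in plain Python, and the function names, the toy vectors, and the 0.95 threshold are all assumptions chosen for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("zero vector cannot be normalized")
    return [x / norm for x in vec]


def assign_cluster_ids(embeddings, threshold=0.95):
    """Greedy single-pass clustering (hypothetical stand-in for FAISS search).

    Each item joins the most similar existing cluster whose representative
    has cosine similarity >= threshold; otherwise it starts a new cluster.
    """
    representatives = []  # one unit vector per cluster, indexed by cluster ID
    cluster_ids = []
    for vec in embeddings:
        unit = normalize(vec)
        best_id, best_sim = -1, threshold
        for cid, rep in enumerate(representatives):
            sim = sum(a * b for a, b in zip(unit, rep))
            if sim >= best_sim:
                best_id, best_sim = cid, sim
        if best_id == -1:
            # No cluster is close enough: this item founds a new one.
            representatives.append(unit)
            best_id = len(representatives) - 1
        cluster_ids.append(best_id)
    return cluster_ids


# Toy embeddings (assumed data): two lookalike pairs.
items = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.98, 0.1],
]
ids = assign_cluster_ids(items)
print(ids)
```

Items that share an ID are treated as lookalikes, so a feed can keep one item per ID. At production scale the inner loop would be replaced by an approximate nearest-neighbor index, which is the role FAISS plays in the system described above.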

Deployed in Meituan’s ad system, FITRep lifted CTR by 3.60% and CPM by 4.25% in online A/B tests: cleaner feeds, better matching, and measurable revenue impact.

Bottom line: fewer near-duplicates, clearer item meaning, and faster, scalable clustering, all without black-box guesswork.

Paper: https://arxiv.org/abs/2511.21389v1

Register: https://www.AiFeta.com

#MLLM #RecommenderSystems #InformationRetrieval #Deduplication #AdTech #UMAP #FAISS #CTR #CPM #AI #Meituan