FITRep: Transparent AI to De-duplicate Lookalike Items
Duplicate and lookalike items clutter online feeds and ads, hurting user experience.
FITRep is an attention-guided, white-box way to represent items for fine-grained deduplication. Inspired by Feature Integration Theory, it teaches Multimodal LLMs to separate an item's primary features from its auxiliary ones, so that fine-grained structure is not collapsed into a single undifferentiated embedding. It has three components:
- CHIE: extracts a hierarchy of semantic concepts from text + visuals.
- SPDR: an adaptive, UMAP-based reduction that preserves structure while compressing.
- FBC: FAISS-powered clustering that assigns each item a stable, unique cluster ID.
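
To make the last step concrete, here is a minimal sketch of how near-duplicate items could be grouped under one stable cluster ID. It is not the paper's implementation: a brute-force cosine comparison in plain Python stands in for the FAISS similarity search, and the function name assign_cluster_ids, the 0.95 threshold, and the toy vectors are all illustrative assumptions.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_cluster_ids(embeddings, threshold=0.95):
    """Map each item ID to a cluster ID (hypothetical helper).

    Items whose cosine similarity reaches `threshold` are merged with
    union-find. The cluster ID is the smallest item ID in the group,
    so it does not depend on input order.
    """
    parent = {item: item for item in embeddings}

    def find(item):
        # Walk to the root, shortening the path along the way.
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    # Brute-force all pairs; a real system would use an ANN index instead.
    for a, b in combinations(sorted(embeddings), 2):
        if cosine(embeddings[a], embeddings[b]) >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Keep the smaller ID as the root so IDs stay stable.
                parent[max(root_a, root_b)] = min(root_a, root_b)

    return {item: find(item) for item in embeddings}


# Toy vectors (made up for illustration, not real item embeddings).
items = {
    "burger_a": [0.90, 0.10, 0.00],
    "burger_b": [0.88, 0.12, 0.01],
    "salad": [0.00, 0.20, 0.95],
}
cluster_ids = assign_cluster_ids(items)
```

In this toy run the two burger listings share one cluster ID while the salad keeps its own. The all-pairs loop is quadratic, which is the cost a nearest-neighbor index such as FAISS is meant to avoid at production scale.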
Deployed in Meituan’s ad system, FITRep raised CTR by 3.60% and CPM by 4.25% in online A/B tests: cleaner feeds, better matching, real revenue impact.
Bottom line: fewer near-duplicates, clearer item meaning, and faster, scalable clustering—without the black-box guesswork.
Paper: https://arxiv.org/abs/2511.21389v1
Register: https://www.AiFeta.com
#MLLM #RecommenderSystems #InformationRetrieval #Deduplication #AdTech #UMAP #FAISS #CTR #CPM #AI #Meituan