Meet FITRep: Attention-Guided, Transparent Item Deduplication
Ever scrolled past pages of near-identical listings? Those duplicates hurt user experience and waste ad spend.
FITRep is an attention-guided, white-box framework that represents each item from its images and text, so platforms can tell which attributes are truly important (primary) and which are extra (auxiliary), then cluster near-duplicates with confidence. It combines three components:
- CHIE: uses Multimodal LLMs to extract hierarchical concepts from each item.
- SPDR: an adaptive, structure-preserving UMAP-based compression that keeps key relationships.
- FBC: FAISS-powered clustering that assigns every item a stable cluster ID.
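
To make the last step concrete, here is a minimal sketch of how near-duplicate items could each receive a cluster ID once they have been turned into vectors. It is not the paper's implementation: it swaps FAISS for a brute-force cosine-similarity search in plain Python, and the function name `assign_cluster_ids`, the `threshold` value, and the toy vectors are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_cluster_ids(vectors, threshold=0.95):
    """Greedy near-duplicate grouping (hypothetical stand-in for FAISS search).

    Each vector joins the most similar existing cluster representative if the
    similarity is at least `threshold`; otherwise it starts a new cluster.
    """
    representatives = []  # first vector seen for each cluster
    cluster_ids = []
    for vec in vectors:
        best_id, best_sim = -1, threshold
        for cid, rep in enumerate(representatives):
            sim = cosine(vec, rep)
            if sim >= best_sim:
                best_id, best_sim = cid, sim
        if best_id == -1:
            # No representative is close enough: open a new cluster.
            representatives.append(vec)
            best_id = len(representatives) - 1
        cluster_ids.append(best_id)
    return cluster_ids


# Toy item vectors (made up for this example, not real embeddings).
items = [
    [1.0, 0.0, 0.0],    # listing A
    [0.99, 0.05, 0.0],  # near-duplicate of A
    [0.0, 1.0, 0.0],    # listing B
    [0.0, 0.98, 0.1],   # near-duplicate of B
    [0.0, 0.0, 1.0],    # listing C
]
cluster_ids = assign_cluster_ids(items)
print(cluster_ids)  # [0, 0, 1, 1, 2]
```

Items that share a cluster ID can then be treated as one listing downstream. At catalog scale the inner loop is the part a library like FAISS would replace with an indexed nearest-neighbor search.
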
In online A/B tests on Meituan’s ad system, FITRep increased CTR by 3.60% and CPM by 4.25%.
Result: cleaner catalogs, fewer duplicates, more relevant recommendations, and a measurable revenue lift.
Paper: https://arxiv.org/abs/2511.21389v1
Authors: Guoxiao Zhang, Ao Li, Tan Qu, Qianlong Xie, Xingxing Wang
MLLM RecommenderSystems InformationRetrieval AdsTech ComputerVision NLP UMAP FAISS Deduplication Ecommerce