Table of Contents
- Problem Definition
- Previous Methods & new Methods & mainly contribution
- Application in RecSys
- Ref
- TODO & Questions & Further Reading
DSSM was initially invented for query-document semantic matching in web search (an NLP task); in recent years it has been widely used in the candidate generation stage of recommender systems.
Problem Definition
- Main objective:
- web document ranking task
- Main metrics (only offline evaluation available)
- NDCG@1, NDCG@3, NDCG@10
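For reference, NDCG@k can be sketched as below. This is a minimal sketch, assuming the common exponential-gain form (2^rel - 1) with a log2 position discount; the function names and the sample relevance labels are hypothetical and not taken from the paper.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions (exponential gain)."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical graded labels, listed in the order the model ranked the documents.
ranked_labels = [3, 2, 3, 0, 1]
scores = {k: ndcg_at_k(ranked_labels, k) for k in (1, 3)}
```

A ranking that is already sorted by relevance scores 1.0 at every cutoff; any swap that pushes a more relevant document down lowers the score.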
Previous Methods & new Methods & mainly contribution
- Main contributions
- learning from clickthrough data
- DNN supervised learning
- word hashing: letter n-gram based word hashing, which shrinks the input dimensionality so the model scales to large vocabularies
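Word hashing can be illustrated with a few lines of code. This is a minimal sketch, assuming letter trigrams with `#` as the word-boundary mark; the helper names `letter_ngrams` and `hash_text` are hypothetical.

```python
from collections import Counter

def letter_ngrams(word, n=3):
    """Split a word into letter n-grams after adding '#' boundary marks."""
    marked = f"#{word.lower()}#"
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]

def hash_text(text, n=3):
    """Bag of letter n-grams (with counts) for a query or a document title."""
    counts = Counter()
    for word in text.split():
        counts.update(letter_ngrams(word, n))
    return counts

print(letter_ngrams("good"))  # ['#go', 'goo', 'ood', 'od#']
```

The resulting bag of n-grams replaces the one-hot word vector as network input, so unseen or misspelled words still share most of their n-grams with known words.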
Application in RecSys
- Pros:
- fast: item embeddings are computed offline; only the user embedding is computed in real time
- Cons:
- no interaction between user and item features before the final similarity score
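The serving pattern behind these pros and cons can be sketched as follows. This is a toy sketch under stated assumptions: `item_tower` and `user_tower` are hypothetical stand-ins for trained networks (here they only L2-normalize the input), the catalog is made up, and a real system would use an approximate nearest-neighbor index instead of a full scan.

```python
import math

def normalize(vec):
    """L2-normalize a vector so that a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def item_tower(item_features):
    # Hypothetical stand-in for a trained item network.
    return normalize(item_features)

def user_tower(user_features):
    # Hypothetical stand-in for a trained user network.
    return normalize(user_features)

# Offline step: embed the whole catalog once and store the result.
catalog = {"item_a": [1.0, 0.0], "item_b": [0.6, 0.8], "item_c": [0.0, 1.0]}
item_index = {item_id: item_tower(feats) for item_id, feats in catalog.items()}

def retrieve(user_features, k=2):
    """Online step: embed the user, then score every stored item embedding."""
    user_vec = user_tower(user_features)
    scored = [(sum(u * v for u, v in zip(user_vec, vec)), item_id)
              for item_id, vec in item_index.items()]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]

top_items = retrieve([0.9, 0.1])
```

Note that user and item features only meet in the final dot product, which is exactly the limitation listed under Cons.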
Ref
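- Why do recommender systems now favor two-tower models? What are the pros and cons of two-tower models compared with single-tower ones? - answer by 数据拾光者 - Zhihu
- Recommender Systems (18): Learning from big-tech practice: two-tower models - article by 缄默笔记 - Zhihu
- A close reading of Learning Deep Structured Semantic Models for Web Search using Clickthrough Data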
TODO & Questions & Further Reading
- Latent Semantic Model: intends to map a query to its relevant documents at the semantic level, where keyword-based matching often fails
- compute the cosine similarity between query and document vectors, then apply softmax over the candidates? -> yes
- word hashing: letter n-gram
- to read
- Sampling-Bias-Corrected Neural Modeling for Large Corpus Item Recommendations: from Google, 2019; tries to remove the influence of item popularity on negative sampling
- in-batch softmax: the samples within a batch serve as each other's negatives when building the softmax loss
- Embedding-based Retrieval in Facebook Search: from Facebook, 2020
- deepmatch DSSM example
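The two loss-related notes above (cosine similarity followed by softmax, and in-batch softmax) can be combined into one small sketch. Assumptions: pure Python, non-zero toy vectors, a smoothing factor `gamma` whose value here is arbitrary, and hypothetical function names; the sampling-bias correction from the Google paper is not included.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def in_batch_softmax_loss(query_vecs, doc_vecs, gamma=10.0):
    """Mean negative log-likelihood over the batch.

    doc_vecs[i] is the clicked (positive) document for query_vecs[i];
    every other document in the batch acts as a negative.
    """
    total = 0.0
    for i, q in enumerate(query_vecs):
        logits = [gamma * cosine(q, d) for d in doc_vecs]
        # Log-sum-exp with the max subtracted for numerical stability.
        max_logit = max(logits)
        log_norm = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_norm - logits[i]
    return total / len(query_vecs)

queries = [[1.0, 0.0], [0.0, 1.0]]
docs = [[0.9, 0.1], [0.2, 0.8]]
aligned_loss = in_batch_softmax_loss(queries, docs)
shuffled_loss = in_batch_softmax_loss(queries, docs[::-1])
```

With the positives aligned to their queries the loss is small; reversing the document order pairs each query with the wrong positive and the loss grows.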