Table of Contents
Problem Definition
- Key metrics
- Offline: AUC
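
For reference, a minimal sketch of how offline AUC can be computed from labels and predicted scores. The function name `auc` and the pairwise O(n²) formulation are illustrative choices for these notes, not something taken from the paper; a production evaluation would typically use a rank-based implementation.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

With labels `[1, 1, 0, 0]` and scores `[0.9, 0.4, 0.6, 0.1]`, three of the four positive/negative pairs are ordered correctly, so the AUC is 0.75.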
Previous Methods & New Methods
- 8 ensemble methods are evaluated
- Boosting neural networks with gradient boosting decision trees turns out to be the best.
- How does it work?
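
One plausible reading of "boosting a neural network with GBDT", sketched under assumptions: the NN is trained first, its output logits are used as the starting score, and trees are then fit to the residual of the log loss. The sketch below uses depth-1 regression stumps in place of full decision trees and takes `base_logits` as a stand-in for the NN output. All names here (`fit_stump`, `boost`, `base_logits`) are hypothetical, and the details may differ from the paper's actual setup.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(y, logits):
    """Mean binary cross-entropy for labels y and raw scores (logits)."""
    eps = 1e-12
    total = 0.0
    for t, f in zip(y, logits):
        p = min(max(sigmoid(f), eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y)


def fit_stump(X, residuals):
    """Least-squares regression stump: returns (feature, threshold, left, right)."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            if not left or not right:
                continue
            lv = sum(left) / len(left)
            rv = sum(right) / len(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lv, rv)
    return None if best is None else best[1:]


def boost(X, y, base_logits, n_trees=20, lr=0.5):
    """Start from the base model's logits and add stumps fit to the log-loss residual."""
    logits = list(base_logits)  # assumption: these come from the trained NN
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of log loss w.r.t. the logit is (label - predicted prob).
        residuals = [t - sigmoid(f) for t, f in zip(y, logits)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, thr, lv, rv = stump
        stumps.append(stump)
        logits = [f + lr * (lv if row[j] <= thr else rv) for row, f in zip(X, logits)]
    return stumps, logits
```

The point of the sketch is only the ordering: the trees do not start from a constant score but from the NN's prediction, so they only need to correct what the NN gets wrong. Real GBDT libraries also use deeper trees and second-order leaf values, which are omitted here.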
Impacts
- With a larger training set, the approach shows close to a 0.9% AUC improvement in offline testing and significant click yield gains in online traffic.
TODO & Questions
- How to solve position bias
- Logistic regression model
- The inverse position bias performs better than the plain one.
- Note that we use the inverse position bias, i.e., given pb = σ(x), we use x instead of pb.
- We need the inverse form because the final sigmoid converts it back to the position CTR, which is the empirical CTR. (???)
- COEC = Clicks Over Expected Clicks (check the reference paper)
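- Feature processing
- All statistical features, including position features, need to be normalized [the common practice today is to bucketize the statistic values and then learn a low-dimensional embedding for each bucket; in production this gives stable results and also makes it possible to model feature crosses that include the statistical features]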
- Position features
- Position bias: its value is roughly the expected CTR of all samples collected from a traffic bucket with randomized ad order
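
The position-bias notes above can be made concrete with a small sketch. It covers three things: recovering x from pb = σ(x) with the logit function, computing COEC as clicks divided by the clicks expected from per-position reference CTRs, and mapping a statistic to a bucket index that could then index an embedding table. The function names, the dictionary layout, and the bucket boundaries are assumptions made for illustration, not taken from the paper.

```python
import bisect
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def inverse_position_bias(pb):
    """Given pb = sigmoid(x), return x (the logit of the position CTR)."""
    if not 0.0 < pb < 1.0:
        raise ValueError("position bias must be strictly between 0 and 1")
    return math.log(pb / (1.0 - pb))


def coec(clicks, impressions_by_position, position_ctr):
    """Clicks Over Expected Clicks.

    Expected clicks = sum over positions of impressions * reference CTR at
    that position, so a value above 1 means the ad beats the position average.
    """
    expected = sum(
        impressions * position_ctr[pos]
        for pos, impressions in impressions_by_position.items()
    )
    if expected <= 0.0:
        raise ValueError("expected clicks must be positive")
    return clicks / expected


def bucket_index(value, boundaries):
    """Index of the bucket containing value, for sorted bucket boundaries.

    The index would select a row of a learned embedding table (not shown).
    """
    return bisect.bisect_right(boundaries, value)
```

For example, a position CTR of 0.08 maps to x = log(0.08 / 0.92), and feeding x through the final sigmoid returns 0.08. An ad with 30 clicks, 100 impressions at a position with reference CTR 0.1 and 200 impressions at a position with reference CTR 0.05 has 20 expected clicks, so its COEC is 1.5.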
Ref
- Model Ensemble for Click Prediction in Bing Search Ads (paper walkthrough)
- Related papers:
- the concept of COEC: Personalized click prediction in sponsored search, 2010, Yahoo