2017-Model Ensemble for Click Prediction in Bing Search Ads

Posted by Xiaoye's Blog on March 9, 2023

Table of Contents

  1. Problem Definition
  2. Previous Methods & New Methods
  3. Impacts
  4. TODO & Questions
  5. Ref

Problem Definition

  1. key metrics
    1. Offline: AUC
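
As a reminder of what the offline metric measures, here is a minimal sketch of AUC computed by pairwise comparison. The function name `auc` and the toy data are illustrative assumptions, not taken from the paper; production systems would use a sorted or library-based implementation rather than this quadratic loop.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # clicked ad scored above non-clicked ad
            elif p == n:
                wins += 0.5   # tie counts as half a correct pair
    return wins / (len(positives) * len(negatives))


# Toy data (made up): 1 = click, 0 = no click.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

On this toy data three of the four positive/negative pairs are ordered correctly, so the AUC is 0.75.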

Previous Methods & New Methods

  1. Eight ensemble methods are evaluated.
  2. Boosting neural networks with gradient boosting decision trees turns out to be the best.
    1. Open question: how does this boosting work?
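
One common way to read "boosting a neural network with GBDT" is that the trees start from the network's output score and each new tree fits the remaining residual of the log loss. The sketch below illustrates that idea only; it is not the paper's implementation. The network is replaced by precomputed logits (`nn_logits`), there is a single feature, and the trees are depth-1 stumps. All names and the toy data are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(labels, scores):
    """Mean binary log loss, where scores are logits."""
    eps = 1e-12
    total = 0.0
    for y, s in zip(labels, scores):
        p = min(max(sigmoid(s), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def fit_stump(xs, residuals):
    """Least-squares regression stump on one feature: (threshold, left_value, right_value)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # drop the max so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def boost_on_nn(xs, labels, nn_logits, n_trees=20, learning_rate=0.5):
    """Start from the NN logits and add stumps fitted to the log-loss residuals."""
    scores = list(nn_logits)  # initialization: the neural network's own score
    trees = []
    for _ in range(n_trees):
        # Negative gradient of the log loss with respect to the logit is (y - p).
        residuals = [y - sigmoid(s) for y, s in zip(labels, scores)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        trees.append((threshold, left_value, right_value))
        scores = [
            s + learning_rate * (left_value if x <= threshold else right_value)
            for x, s in zip(xs, scores)
        ]
    return trees, scores


# Toy data (made up): one feature, click labels, and an uninformative "NN" that outputs logit 0.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
nn_logits = [0.0] * len(xs)

loss_before = log_loss(labels, nn_logits)
trees, boosted_scores = boost_on_nn(xs, labels, nn_logits)
loss_after = log_loss(labels, boosted_scores)
```

The trees only need to correct what the network gets wrong, so on this toy data `loss_after` ends up below `loss_before`.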

Impacts

  1. With larger training data, the best ensemble shows a nearly 0.9% AUC improvement in offline testing and significant click yield gains in online traffic.

TODO & Questions

  1. How to solve position bias
    1. Logistic regression model
  2. The inverse position bias performs better than the plain position bias.
    1. Note that we use the inverse position bias, i.e., given pb = σ(x), we use x instead of pb.
    2. We need the inverse form because the final sigmoid converts it back to the position CTR, which is the empirical CTR. (To be verified.)
  3. COEC -> Click Over Expected Clicks -> check ref paper
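  4. Feature processing
    1. All statistical features, including the position features, need to be normalized. [The common practice today is to bucketize the statistic values and then learn a low-dimensional embedding for each bucket. In production this gives stable results, and it can also model feature crosses that involve the statistical features.]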
  5. Position features
    1. The position bias value is roughly the expected CTR of all samples collected from a traffic bucket with randomized ad order.
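
The inverse position bias and COEC notes above can be made concrete with a small sketch. It assumes per-position CTRs measured on randomized traffic (the values in `position_ctr` are made up) and uses the standard definitions: the inverse of the sigmoid is the logit, and COEC is observed clicks divided by the clicks expected from position alone. The function names are illustrative assumptions, not the paper's code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Inverse of the sigmoid: returns x such that sigmoid(x) == p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def coec(clicks, impressions_by_position, position_ctr):
    """Click Over Expected Clicks: observed clicks / clicks expected from position alone."""
    expected = sum(
        count * position_ctr[position]
        for position, count in impressions_by_position.items()
    )
    return clicks / expected


# Hypothetical position CTRs (pb) from a randomized-order traffic bucket.
position_ctr = {1: 0.10, 2: 0.05, 3: 0.02}

# Inverse position bias: feed x = logit(pb) into the model instead of pb itself.
inverse_position_bias = {pos: logit(pb) for pos, pb in position_ctr.items()}

# With no other signal (remaining logit contribution of 0), the final sigmoid
# maps x straight back to the empirical position CTR.
recovered_ctr = sigmoid(0.0 + inverse_position_bias[1])

# An ad with 12 clicks from 50 impressions at position 1 and 100 at position 2.
ad_coec = coec(clicks=12, impressions_by_position={1: 50, 2: 100}, position_ctr=position_ctr)
```

Here `recovered_ctr` comes back to about 0.10, which is why x rather than pb is the natural input to a model that ends in a sigmoid. The example ad expects about 10 clicks from its positions and received 12, so its COEC is about 1.2, meaning it performs better than position alone would predict.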

Ref

  1. Model Ensemble for Click Prediction in Bing Search Ads (paper walkthrough)
  2. Related papers:
    1. the concept of COEC: Personalized click prediction in sponsored search, 2010, Yahoo