Potential Applications
I recently worked on a project to predict the customer lifetime value (CLTV) of hundreds of millions of users at an e-commerce company. This post summarizes that project and gives a short review of CLTV prediction.
In e-commerce companies, customer lifetime value is an important metric. Based on it, companies can allocate resources to high-value customers to achieve a high return on investment (ROI). Based on the predicted customer lifetime value, companies can also launch marketing campaigns targeted at different user segments. Overall, customer lifetime value can benefit the following services:
- User acquisition
- User service
- In-app purchase
- Pricing and promotion
Challenges & Potential Solutions
- Problem definition
  - Revenue-related definitions
    - profit: influenced by margins
    - gross profit
    - order frequency
  - Engagement-related definitions
    - login frequency
    - community contribution (comments, etc.)
- LTV values vary widely across users
  - Many users have an LTV of 0
    - Two-stage model
      - classification: whether the LTV will be zero
      - regression: the actual LTV value
    - Clustering: develop a separate model for each cluster
      - Different groups could also have their LTV measured along different dimensions, e.g. zero-order users can be measured with engagement-related LTV
  - Extreme values exist
    - Predict percentiles instead of raw values, then map the percentiles back to raw values
    - Transformation (log)
    - Remove the extreme values
- Trigger events influence users' behavior
  - seasonality
  - coupons to attract new users
  - a purchase with a bad experience
  - customer service interactions such as calls and refunds
- Scaling problems caused by the size of the user base
  - Update predictions only partially
  - Use a simple model for the majority and a complex model for users who go through trigger events
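The two-stage idea listed above (first decide zero vs. non-zero, then estimate the amount) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the model used in the project: per-segment frequencies and log-means stand in for a real classifier and regressor, and the `segment` key, function names, and sample rows are all hypothetical.

```python
from collections import defaultdict
from math import exp, log


def fit_two_stage(rows):
    """Fit a toy two-stage LTV model from (segment, ltv) pairs.

    Stage 1 estimates P(LTV > 0) per segment (stand-in for a classifier).
    Stage 2 estimates the mean of log(LTV) among buyers (stand-in for a
    regressor trained on a log-transformed target).
    """
    counts = defaultdict(int)
    buyer_logs = defaultdict(list)
    for segment, ltv in rows:
        counts[segment] += 1
        if ltv > 0:
            buyer_logs[segment].append(log(ltv))
    model = {}
    for segment, n in counts.items():
        logs = buyer_logs[segment]
        p_nonzero = len(logs) / n
        mean_log = sum(logs) / len(logs) if logs else 0.0
        model[segment] = (p_nonzero, mean_log)
    return model


def predict_two_stage(model, segment):
    """Combine both stages: P(LTV > 0) * typical LTV of buyers."""
    p_nonzero, mean_log = model[segment]
    # exp(mean of logs) is the geometric mean, which damps extreme values.
    return p_nonzero * exp(mean_log)


# Hypothetical training data: (segment, observed LTV).
rows = [("new", 0.0), ("new", 0.0), ("new", 0.0), ("new", 20.0),
        ("loyal", 100.0), ("loyal", 400.0), ("loyal", 0.0), ("loyal", 0.0)]
model = fit_two_stage(rows)
print(predict_two_stage(model, "new"), predict_two_stage(model, "loyal"))
```

Multiplying the two stages keeps zero-heavy segments from being dominated by a few buyers, while the log transform in the second stage limits the influence of extreme values.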
Features
Feature Dimensions
- User demographic features: gender, age, occupation, etc.
- User behavior features
  - reviews, purchases, orders, products in the cart
  - interaction with sellers (via chat, video streaming, etc.)
- Platform intervention
  - coupon interventions
  - average margins of our products
- Features related to previous CLTV
  - LTV in the previous week, month, half-year, and year
  - standard deviation of monthly LTV in the previous year, etc.
Potentially Important Features
- Purchase value in the past quarter
- Purchase value in the past year
- Lifetime purchase value
- Number of days since most recent purchase
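The four features above can be derived from a plain order log. The sketch below assumes a hypothetical per-user list of `(order_date, amount)` tuples and approximates a quarter and a year with 90-day and 365-day windows; the function and field names are made up for illustration.

```python
from datetime import date, timedelta


def purchase_features(orders, as_of):
    """Compute purchase-value and recency features for one user.

    `orders` is a list of (order_date, amount) tuples. Orders dated after
    `as_of` are ignored so that no future information leaks in.
    """
    quarter_start = as_of - timedelta(days=90)
    year_start = as_of - timedelta(days=365)
    past = [(d, a) for d, a in orders if d <= as_of]
    last_purchase = max((d for d, _ in past), default=None)
    return {
        "value_past_quarter": sum(a for d, a in past if d > quarter_start),
        "value_past_year": sum(a for d, a in past if d > year_start),
        "value_lifetime": sum(a for _, a in past),
        "days_since_last_purchase":
            (as_of - last_purchase).days if last_purchase else None,
    }


# Hypothetical order history for a single user.
orders = [(date(2019, 6, 1), 10.0), (date(2020, 1, 10), 30.0),
          (date(2020, 11, 20), 50.0), (date(2020, 12, 25), 20.0)]
features = purchase_features(orders, as_of=date(2020, 12, 31))
print(features)
```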
Paper Reading
An Engagement-Based Customer Lifetime Value System for E-commerce
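Background and Assumptions
Published: 2016
Groupon is a group-buying site similar to Meituan. Its system currently predicts user value every day, over the following horizons:
- short: one quarter
- medium: three quarters
- long: one year
The paper assumes that the factors driving value differ for users at different lifecycle stages, which is later reflected in the per-segment feature importances. The prediction target is purchase value: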
We use a proxy measure for customer value that we will call “purchase value” that is based on purchasing behavior, focusing on this behavior instead of profit because the latter is subject to margins that can fluctuate and that are not related to user intent.
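The actual profit on an item depends on the item's margin at the time and has little to do with user intent. Is GTV the quantity most closely related to user intent?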
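Features
More than 40 features are used in total, mainly covering the following areas:
- engagement: interactions with emails and searches on the platform
- user experience:
  - the number of items available near where the user lives, etc.
  - customer service: negative reviews given by the user, number of returns, delivery time, etc.
- user behavior: historical value, orders, coupons, etc.
- other, such as demographics: age, gender, characteristics of the city of residence, characteristics at sign-up, etc.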
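Method
- User segmentation: users are split into 5 groups based on purchase history (following the assumption that the drivers of value differ across lifecycle stages)
  - Unactivated
  - One-time buyer
  - Sporadic buyer
  - Power users
- Per-segment LTV prediction (classification + regression): LTV is predicted separately for each segment with Random Forest models, in two stages
  - Predict whether the user will place an order
  - Building on the previous step, predict the LTV of the users who are expected to order
- Seasonality & decay models: these two models come from practical business and engineering considerations and act as patches:
  - The models are trained on a quarterly cycle, while purchase behavior changes from quarter to quarter, so a between-quarter correction factor is added based on historical data
  - Because the user base is huge, re-scoring all users every day is expensive. It was also observed that LTV changes after a “trigger event” follow a fairly regular pattern, so the following strategy is used:
    - Users with a “trigger event” are re-scored by the model
    - For users without a “trigger event”, the previous value is multiplied by an empirically derived decay factor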
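The partial-update strategy above (re-score only users with a trigger event, decay everyone else) might look like the following sketch. It is not the paper's implementation: the `rescore` callable, the user IDs, and the decay value of 0.9 are all assumptions made for illustration.

```python
def update_scores(previous, triggered, rescore, decay):
    """Refresh LTV scores without re-scoring every user.

    previous:  dict of user_id -> last predicted LTV
    triggered: set of user_ids that had a trigger event since the last run
    rescore:   callable user_id -> fresh model prediction (hypothetical)
    decay:     empirical decay factor applied per update (assumed value)
    """
    updated = {}
    for user_id, old_value in previous.items():
        if user_id in triggered:
            # Expensive path: run the full model for this user.
            updated[user_id] = rescore(user_id)
        else:
            # Cheap path: shrink the old prediction by the decay factor.
            updated[user_id] = old_value * decay
    return updated


previous = {"u1": 100.0, "u2": 50.0, "u3": 10.0}
fresh_predictions = {"u2": 80.0}  # stand-in for a real model call
scores = update_scores(previous, triggered={"u2"},
                       rescore=fresh_predictions.__getitem__, decay=0.9)
print(scores)
```

With this split, the daily cost scales with the number of users who had a trigger event rather than with the whole user base.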
Results
Overall evaluation metrics:
- Pearson and Spearman correlation coefficients
- RMSE and bias in the averages
- a comparison of actual versus predicted distributions
- revenue of the top 10% vs. the bottom 90%
- Predicted inclinded revenue VS real case
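Most of the metrics listed above are easy to compute by hand. The sketch below is a plain-Python illustration with made-up numbers; in particular, `top_share` reflects my reading of the "top 10% vs. bottom 90%" metric (the share of actual revenue captured by the users ranked highest by the prediction), which is an assumption rather than the paper's exact definition.

```python
from math import sqrt


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(actual, predicted):
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def bias(actual, predicted):
    """Average prediction minus average actual value."""
    return sum(predicted) / len(predicted) - sum(actual) / len(actual)


def top_share(actual, predicted, fraction=0.1):
    """Share of actual revenue from the users ranked highest by prediction."""
    n_top = max(1, int(len(actual) * fraction))
    by_pred = sorted(range(len(actual)), key=lambda i: predicted[i], reverse=True)
    return sum(actual[i] for i in by_pred[:n_top]) / sum(actual)


# Made-up actual and predicted values for ten users.
actual = [0, 0, 5, 10, 0, 20, 0, 40, 0, 125]
predicted = [1, 0, 4, 12, 2, 15, 0, 50, 3, 90]
print(spearman(actual, predicted), rmse(actual, predicted),
      bias(actual, predicted), top_share(actual, predicted))
```

Spearman is the more forgiving of the two correlations here because it only looks at ordering, which matters when the goal is to rank users rather than to predict exact amounts.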
The overall results are as follows:
Some conclusions:
- The longer the prediction horizon, the more accurate the prediction
Customer Lifetime Value Prediction Using Embeddings
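Background and Assumptions
ASOS is a UK-headquartered global online retailer of fashion and beauty products.
Published: 2017
The paper covers two parts:
- Churn model: a two-step prediction that first predicts a probability with an RF and then corrects that probability with LR
- CLTV model: mainly uses a Random Forest model
Churn and the CLTV prediction horizon are defined as follows: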
We define a customer as churned if they have not placed an order in the past year. We define CLTV as the sales, net of returns, of a customer over a one year period.
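The role of model calibration is emphasized for both models.
The paper also proposes generating user embeddings, which improved the churn model. They had not yet been tried on the CLTV model, and because the resource cost was too high they were not put into production.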
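Features
132 features are extracted, mainly along the following four dimensions:
- customer demographics
- purchase history
- returns history
- web and app session logs (the largest group of features)
The resulting feature importances are as follows:
The web/app session logs rank second in importance, behind only purchase history.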
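Method
Prediction steps:
Because LTV values span very different orders of magnitude, prediction is done in two stages:
- An RF model predicts the percentile of the user's LTV within the overall distribution (range 0-1)
- An RF model maps the predicted percentile to a concrete LTV value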
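To make the percentile idea above concrete, here is a small sketch of converting LTV values to percentiles and back. Note the simplification: the mapping back to a value uses an empirical quantile lookup on the training values instead of a second RF model, and the training values and function names are made up for illustration.

```python
from bisect import bisect_right
from math import ceil


def to_percentile(sorted_values, value):
    """Fraction of training values that are <= value (range 0-1)."""
    return bisect_right(sorted_values, value) / len(sorted_values)


def from_percentile(sorted_values, percentile):
    """Map a percentile in [0, 1] back to a value from the training set."""
    n = len(sorted_values)
    # round() guards against float noise such as 6.000000000000001.
    index = ceil(round(percentile * n, 9)) - 1
    return sorted_values[min(max(index, 0), n - 1)]


# Hypothetical training LTVs: many zeros and one extreme value.
train_ltv = sorted([0, 0, 0, 5, 12, 30, 45, 80, 300, 5000])

# Regression target on a bounded 0-1 scale instead of raw LTV.
targets = [to_percentile(train_ltv, v) for v in train_ltv]

# A predicted percentile is converted back to a monetary value.
print(from_percentile(train_ltv, 0.95))
```

Because the target lives on a 0-1 scale, a single extreme customer such as the 5000 above no longer dominates the regression error.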
Results
CLTV model: Spearman rank-order correlation coefficient
- 0.56 (for all customers)
- 0.46 (excluding customers with a CLTV of 0)
Customer Lifetime Value Prediction in Non-Contractual Freemium Settings
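Background and Assumptions
This paper focuses on LTV prediction for game apps. The main methods are:
- Predict with an MLP, a decision tree, and LR; the MLP performs better
- Oversample high-value users (who are relatively rare) with SMOTE, which works well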
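For readers unfamiliar with SMOTE, the core step is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below is a bare-bones version of that step only, with hypothetical feature vectors; it is not the setup used in the paper, and a production pipeline would more likely use an existing library implementation.

```python
import random
from math import dist


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical (sessions, spend) features of rare high-value users.
high_value = [(30.0, 500.0), (28.0, 450.0), (35.0, 700.0), (40.0, 650.0)]
new_points = smote(high_value, n_new=6)
print(new_points)
```

The synthetic points are added only to the training set, so the evaluation data keeps the original class balance.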
Features
Ref
A Deep Probabilistic Model for Customer Lifetime Value Prediction
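Background & Contributions
Scenario: LTV prediction for new users
Main contributions
- Drops the two-stage prediction structure and solves the problem with a single model
- Uses a new loss function to address the following issues
  - many zero values
  - skewed data distribution (the 80/20 rule)
- The overall model is worth borrowing from
- The model calibration is worth borrowing from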
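As I understand it, the loss proposed in this paper is a zero-inflated lognormal (ZILN) likelihood: a cross-entropy term for whether the value is positive, plus a lognormal term for positive values. The sketch below writes out the per-sample negative log-likelihood and the resulting mean prediction in plain Python. The function and parameter names are mine, and in the paper the three parameters would come from a neural network's output layer rather than being passed in by hand.

```python
from math import exp, log, pi, sqrt


def ziln_nll(x, p, mu, sigma):
    """Negative log-likelihood of one label x under a zero-inflated lognormal.

    p:     predicted probability that x > 0 (0 < p < 1 in practice)
    mu:    predicted mean of log(x) for positive x
    sigma: predicted standard deviation of log(x), must be > 0
    """
    if x <= 0:
        # Only the classification term applies to zero labels.
        return -log(1.0 - p)
    lognormal_nll = (log(x * sigma * sqrt(2.0 * pi))
                     + (log(x) - mu) ** 2 / (2.0 * sigma ** 2))
    return -log(p) + lognormal_nll


def ziln_mean(p, mu, sigma):
    """Expected value: P(x > 0) times the mean of the lognormal part."""
    return p * exp(mu + sigma ** 2 / 2.0)


print(ziln_nll(0.0, p=0.25, mu=3.0, sigma=1.0))
print(ziln_nll(20.0, p=0.25, mu=3.0, sigma=1.0))
print(ziln_mean(p=0.25, mu=3.0, sigma=1.0))
```

One output head handles the many zeros and the other two handle the heavy right tail, which is how a single model can replace the two-stage structure.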
Ref
- Jamal, Zainab, and Alex Zhang. “2008 DMEF customer lifetime value modeling (Task 2).” Journal of Interactive Marketing 23.3 (2009): 279-283.
- Vanderveld, Ali, et al. “An engagement-based customer lifetime value system for e-commerce.” Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. 2016.
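- Chamberlain, Benjamin Paul, et al. “Customer lifetime value prediction using embeddings.” Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining. 2017.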
- Kooti, Farshad, et al. “iPhone’s Digital Marketplace: Characterizing the Big Spenders.” Proceedings of the tenth ACM international conference on web search and data mining. 2017.
- Sifa, Rafet, et al. “Customer lifetime value prediction in non-contractual freemium settings: Chasing high-value users using deep neural networks and SMOTE.” Proceedings of the 51st Hawaii International Conference on System Sciences. 2018.
- Wang, Xiaojing, Tianqi Liu, and Jingang Miao. “A deep probabilistic model for customer lifetime value prediction.” arXiv preprint arXiv:1912.07753 (2019).
- Jasek, Pavel, et al. “Predictive performance of customer lifetime value models in e-commerce and the use of non-financial data.” Prague Economic Papers 28.6 (2019): 648-669.
- Malthouse, Edward C. “The results from the lifetime value and customer equity modeling competition.” Journal of Interactive Marketing 23.3 (2009): 272-275.