Predict ‘propensity to buy’ using big data analytics
The Big Picture
A leading FMCG company wanted to develop purchase propensity models for customers who buy its products both online and offline, so that it could better plan its manufacturing and inventory processes. It wanted to improve accuracy, get recommendations for improving its current approach, and develop a scalable solution that could process terabytes of data (100 million data points).
To get there, the company needed to:
- Process large data sets using a big data stack and perform exploratory data analysis (EDA)
- Perform feature engineering on large data sets
- Build an automated and scalable machine-learning pipeline
The solution leveraged structured data from customer transactions. Data preparation included a monthly roll-up by customer ID, an EDA to choose the prediction window for the target variable, and feature engineering on the observation window. The solution also leveraged semi-structured behavioral data on customers, which was prepared by removing redundancy at the segment level.
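The monthly roll-up and the observation and prediction windows described above can be illustrated with a minimal sketch. It is not the engagement's actual pipeline: the transaction layout, the feature names, and the chosen windows are all assumptions made for illustration, and a real implementation would run on the big data stack rather than in plain Python.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2023, 1, 5), 20.0),
    ("c1", date(2023, 1, 20), 15.0),
    ("c1", date(2023, 4, 2), 30.0),
    ("c2", date(2023, 2, 11), 50.0),
    ("c2", date(2023, 3, 9), 10.0),
]

def monthly_rollup(rows):
    """Aggregate transactions to one record per (customer, month)."""
    rollup = defaultdict(lambda: {"orders": 0, "spend": 0.0})
    for customer_id, day, amount in rows:
        key = (customer_id, (day.year, day.month))
        rollup[key]["orders"] += 1
        rollup[key]["spend"] += amount
    return dict(rollup)

def build_examples(rollup, observation_months, prediction_months):
    """Build features from the observation window and a label from the prediction window."""
    customers = sorted({cid for cid, _ in rollup})
    examples = {}
    for cid in customers:
        obs = [rollup[(cid, m)] for m in observation_months if (cid, m) in rollup]
        # Label is 1 if the customer bought anything in the prediction window.
        bought = any((cid, m) in rollup for m in prediction_months)
        examples[cid] = {
            "active_months": len(obs),
            "total_orders": sum(r["orders"] for r in obs),
            "total_spend": sum(r["spend"] for r in obs),
            "label": int(bought),
        }
    return examples

rollup = monthly_rollup(transactions)
examples = build_examples(
    rollup,
    observation_months=[(2023, 1), (2023, 2), (2023, 3)],  # assumed window
    prediction_months=[(2023, 4)],                          # assumed window
)
```

Keeping the two windows disjoint means the features never include information from the period being predicted, which is the reason for separating them in the first place.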
Classification models were developed, including Lasso, Random Forest, and GBM, and an ensemble of models was implemented. The models were compared to identify the best one, and the right cut-off was identified for predicting that a customer is likely to make a purchase. The solution was scaled on a big data platform and delivered as an automated script that produces a purchase propensity score for each customer.
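Choosing a cut-off means turning each propensity score into a yes/no prediction and picking the threshold that performs best on held-out data. The sketch below shows one way to do that. The source does not state which criterion was used, so maximizing the F1 score is an assumption here, and the labels and scores are made-up illustrative values.

```python
def f1_at_cutoff(labels, scores, cutoff):
    """F1 score when every score >= cutoff is predicted as a purchase."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < cutoff and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_cutoff(labels, scores):
    """Try each observed score as a cut-off and keep the one with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: f1_at_cutoff(labels, scores, c))

# Hypothetical held-out labels (1 = purchased) and model propensity scores.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.30, 0.35, 0.40, 0.80, 0.90]

cutoff = best_cutoff(labels, scores)
best_f1 = f1_at_cutoff(labels, scores, cutoff)
```

If missing a likely buyer costs more than a wasted prediction (or the reverse), a different criterion, such as a minimum recall or a cost-weighted score, may be a better fit than F1.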
As a result of the engagement, the client:
- Identified the average turn-around-time for a customer to buy a product in a particular segment.
- Extracted insights from terabytes of data with an execution time of just two hours.
- Experimented with Logistic Regression and Random Forest algorithms to arrive at higher accuracy.
- Enabled an automated, scalable solution to procure products for a particular segment.