Customer Segmentation
- Problem: we don't know whether we have different types of customers or how to approach each of them
- Goals:
  - We want to understand our customers better
  - We want to have clear criteria to segment our customers
- Why? To take specific actions that improve the customer experience
Technique to solve the business problem
We need a formal definition
Customer segmentation is the practice of dividing a customer base into groups of individuals that are similar in specific ways relevant to marketing, such as age, gender, interests and spending habits.
The most common forms of customer segmentation are:
- Geographic segmentation: based on customer location; often considered the first step in international marketing, followed by demographic and psychographic segmentation.
- Demographic segmentation: based on variables such as age, sex, generation, religion, occupation and education level.
- Firmographic segmentation: based on features such as company size (in terms of either revenue or number of employees), industry sector or location (country and/or region).
- Behavioral segmentation: based on customers' knowledge of, attitude towards, usage of and response to a product, as well as their loyalty status and readiness stage.
- Psychographic segmentation: based on the study of activities, interests, and opinions (AIOs) of customers.
- Occasional segmentation: based on the analysis of occasions (such as being thirsty).
- Segmentation by benefits: based on measures such as RFM (recency, frequency, monetary value) and CLV (customer lifetime value).
- Cultural segmentation: based on cultural origin.
- Multi-variable segmentation: based on the combination of several techniques.
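As an illustration of the RFM measure mentioned in the list above, the following is a minimal sketch in base R. The transaction table, its column names, the snapshot date and the median-split scoring are all assumptions made for this example; real projects often score each dimension in quintiles instead.

```r
# Toy transaction log; the data and column names are hypothetical
transactions <- data.frame(
  customer_id = c("A", "A", "B", "C", "C", "C", "D"),
  date = as.Date(c("2023-01-10", "2023-03-05", "2022-11-20",
                   "2023-02-01", "2023-02-15", "2023-03-20", "2022-08-30")),
  amount = c(120, 80, 40, 200, 150, 300, 25)
)
snapshot <- as.Date("2023-04-01")  # assumed reference date for recency

# Days between each purchase and the snapshot date
transactions$days_ago <- as.numeric(snapshot - transactions$date)

# One value per customer: days since last purchase, number of purchases, total spend
recency   <- tapply(transactions$days_ago, transactions$customer_id, min)
frequency <- tapply(transactions$amount,   transactions$customer_id, length)
monetary  <- tapply(transactions$amount,   transactions$customer_id, sum)

rfm <- data.frame(
  customer_id = names(recency),
  recency     = as.numeric(recency),
  frequency   = as.numeric(frequency),
  monetary    = as.numeric(monetary)
)

# Score each dimension 1 (worse) or 2 (better) with a median split
rfm$r_score <- ifelse(rfm$recency   <= median(rfm$recency),   2, 1)  # recent is better
rfm$f_score <- ifelse(rfm$frequency >= median(rfm$frequency), 2, 1)
rfm$m_score <- ifelse(rfm$monetary  >= median(rfm$monetary),  2, 1)

# Simple rule: customers scoring well on at least two dimensions and not badly on the third
rfm$segment <- ifelse(rfm$r_score + rfm$f_score + rfm$m_score >= 5,
                      "high value", "standard")
print(rfm)
```

The resulting `segment` column is a single discrete variable, so it can be used directly for reporting or as one input to a multi-variable segmentation.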
Main Concepts
Customer Segmentation Techniques
- Single discrete variable (CLV, RFM, churn)
- Clustering: K-means, Hierarchical
- Latent Class Analysis (LCA)
- Finite mixture modelling (e.g. Gaussian mixture modelling)
- Self-organizing maps
- Topological Data Analysis
- PCA
- Spectral Embedding
- Locally-linear embedding (LLE)
- Hessian LLE
- Local Tangent Space Alignment (LTSA)
- Random forests, Decision Trees
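To make two of the clustering techniques listed above concrete, here is a small sketch that applies hierarchical clustering and k-means to the same data and compares the resulting segments. The customer data is synthetic, and the feature names, group sizes and Ward linkage are assumptions chosen for illustration only.

```r
set.seed(42)

# Synthetic customers with two spending profiles (hypothetical features)
customers <- rbind(
  data.frame(annual_spend = rnorm(20, mean = 200, sd = 20),
             visits       = rnorm(20, mean = 5,   sd = 1)),
  data.frame(annual_spend = rnorm(20, mean = 900, sd = 50),
             visits       = rnorm(20, mean = 25,  sd = 3))
)

# Standardise so that both features contribute comparably to the distances
scaled <- scale(customers)

# Hierarchical clustering: Ward linkage on Euclidean distances, cut into 2 groups
tree <- hclust(dist(scaled), method = "ward.D2")
hc_segments <- cutree(tree, k = 2)

# K-means with the same number of groups; nstart reruns from several random starts
km_segments <- kmeans(scaled, centers = 2, nstart = 10)$cluster

# Cross-tabulate the two labelings; the cluster numbers themselves are arbitrary
agreement <- table(hierarchical = hc_segments, kmeans = km_segments)
print(agreement)
```

When the groups are well separated, as in this toy data, both methods should recover the same partition up to relabeling. On real data they can disagree, which is one reason to test and compare several techniques.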
Implementation Process
- [BU] Determine business needs
- [DU] Sourcing, Cleaning & Exploration
- [DP] Feature Creation (Extract additional information to enrich the set)
- [DP] Feature Selection (Reduce to a smaller dataset to speed up computation)
- [M] Select Customer Segmentation Technique (test and compare some of them)
- [M] Apply the selected Customer Segmentation Technique
- [E] Analyze results and adjust parameters
- [D] Present and explain the results
Benefits
This technique provides the following benefits:
- Customer profiling
- Targeted marketing actions
- Targeted operations
Use cases
This technique applies to several use cases:
- Reporting
- Commercial actions: Retention offers, Product promotions, Loyalty rewards
- Operations: Optimise stock levels, store layout
- Pricing: price elasticity
- Strategy: M&A, new products,...
How to implement this algorithm using R
K-means
Given a set of observations (x1, x2, …, xn), where each observation is a d-dimensional real vector, k-means clustering aims to partition the n observations into k (≤ n) sets S = {S1, S2, …, Sk} so as to minimize the within-cluster sum of squares (WCSS), that is, the sum of squared Euclidean distances between each point and the centre (mean) of its cluster. In other words, its objective is to find:
$$ \underset{\mathbf{S}} {\operatorname{arg\,min}} \sum_{i=1}^{k} \sum_{\mathbf x \in S_i} \left\| \mathbf x - \boldsymbol\mu_i \right\|^2
$$
where $$\boldsymbol\mu_i$$ is the mean of the points in $$S_i$$.
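The objective can be checked numerically. The sketch below, on synthetic data invented for this example, computes the WCSS by hand from the cluster assignments and compares it with the `tot.withinss` value returned by `kmeans()`.

```r
set.seed(1)

# Small synthetic data set: 30 points in 2 dimensions around three centres
x <- rbind(
  matrix(rnorm(20, mean = 0),  ncol = 2),
  matrix(rnorm(20, mean = 5),  ncol = 2),
  matrix(rnorm(20, mean = 10), ncol = 2)
)

fit <- kmeans(x, centers = 3, nstart = 10)

# WCSS by hand: for each cluster S_i, sum the squared distances to its mean mu_i
wcss_by_hand <- sum(sapply(1:3, function(i) {
  points <- x[fit$cluster == i, , drop = FALSE]
  mu <- colMeans(points)
  sum(sweep(points, 2, mu)^2)  # subtract mu from every row, square, and sum
}))

# Compare with the value reported by kmeans()
print(c(by_hand = wcss_by_hand, kmeans = fit$tot.withinss))
print(all.equal(wcss_by_hand, fit$tot.withinss))
```

Note that k-means only finds a local minimum of this objective, which is why the sketch uses `nstart = 10` to try several random initialisations and keep the best one.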
Case
We use the Wholesale customers Data Set: Abreu, N. (2011). Analise do perfil do cliente Recheio e desenvolvimento de um sistema promocional. Mestrado em Marketing, ISCTE-IUL, Lisbon.
This dataset has the following attributes:
- FRESH: annual spending in monetary units (m.u.) on fresh products (Continuous);
- MILK: annual spending (m.u.) on milk products (Continuous);
- GROCERY: annual spending (m.u.) on grocery products (Continuous);
- FROZEN: annual spending (m.u.) on frozen products (Continuous);
- DETERGENTS_PAPER: annual spending (m.u.) on detergents and paper products (Continuous);
- DELICATESSEN: annual spending (m.u.) on delicatessen products (Continuous);
- CHANNEL: customer channel, either Horeca (Hotel/Restaurant/Café) or Retail (Nominal);
- REGION: customer region, one of Lisbon, Oporto or Other (Nominal).
# Install packages (only needed once)
install.packages("NbClust")
# Load packages
library(NbClust)
# Load data
data <- read.csv("data/chapter7.csv", header = TRUE, sep = ",")
# Review data structure
str(data)
# Review data
summary(data)
# Scale data so that every variable contributes comparably to the distances
testdata <- scale(data)
# Fix the random seed so that the k-means results are reproducible (the value is arbitrary)
set.seed(123)
# Determine number of clusters. Option 1: visual rule (elbow plot)
# For one cluster, the within-cluster sum of squares equals the total sum of squares
wss <- (nrow(testdata) - 1) * sum(apply(testdata, 2, var))
for (i in 2:15) wss[i] <- sum(kmeans(testdata, centers = i)$withinss)
plot(1:15, wss, type = "b", xlab = "Number of Clusters",
     ylab = "Within groups sum of squares")
# Determine number of clusters. Option 2: number proposed most often across the NbClust indices
# Note: NbClust runs here on the unscaled data; pass testdata instead to match the k-means input
res <- NbClust(data, diss = NULL, distance = "euclidean", min.nc = 2, max.nc = 12,
               method = "kmeans", index = "all")
# More information
res$All.index
res$Best.nc
res$All.CriticalValues
res$Best.partition
# K-Means Cluster Analysis (based on the number proposed by NbClust)
fit <- kmeans(testdata, 3)
# Calculate average for each cluster
aggregate(data, by = list(fit$cluster), FUN = mean)
# Add segmentation to dataset
data <- data.frame(data, cluster = fit$cluster)
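Once the segments are assigned, a common next step is to check their sizes and inspect how well they separate. The sketch below illustrates this on made-up data, since the case file is not reproduced here: the `spend` data frame is a synthetic stand-in with invented values, and the choice of two clusters is an assumption for the example. It uses PCA, one of the techniques listed earlier, only as a way to draw the segments in two dimensions.

```r
set.seed(7)

# Synthetic stand-in for the spending columns (values are made up)
spend <- data.frame(
  Fresh   = c(rlnorm(30, 9, 0.3), rlnorm(30, 8,   0.3)),
  Milk    = c(rlnorm(30, 7, 0.3), rlnorm(30, 9,   0.3)),
  Grocery = c(rlnorm(30, 7, 0.3), rlnorm(30, 9.5, 0.3))
)

scaled <- scale(spend)
fit <- kmeans(scaled, centers = 2, nstart = 25)

# Segment sizes: very small segments are often hard to act on
sizes <- table(fit$cluster)
print(sizes)

# Segment profiles in the original units, easier to explain than scaled values
profiles <- aggregate(spend, by = list(segment = fit$cluster), FUN = mean)
print(profiles)

# Project onto the first two principal components to inspect separation visually
pca <- prcomp(scaled)
plot(pca$x[, 1:2], col = fit$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2", main = "Segments in PCA space")
```

Overlapping point clouds in the plot suggest that the segments are not clearly distinct, in which case it may be worth revisiting the number of clusters or the selected features.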