Use K-means and let AI advise you how many segments there (really) are.
Market and customer segmentation are some of the most important tasks in any company. How you segment will influence marketing and sales decisions, and potentially the survival of the company.
Surprisingly, despite advances in machine learning, few marketers are using the technology to augment their all-important market and customer segmentation efforts.
In this article, I will show you how to augment your segmentation analysis with a simple, yet powerful machine learning technique called K-means. Learning this will give you an edge over your competitors (and colleagues).
So what’s K-means?
K-means is a popular clustering algorithm for unsupervised machine learning. It groups similar data points into a predefined number of groups.
Let me explain each term for you:
- Clustering: a machine learning technique for identifying and grouping similar data points (e.g. customers) together.
- Unsupervised machine learning: you don’t need to give the algorithm labeled examples showing how customers should be grouped. It will scan through all information associated with each customer and learn the best way to group them together.
- A predefined number of groups: you need to tell K-means how many groups to form. Apart from the data itself, this is the only input needed from you.
Here is an analogy to the above concepts: Imagine you have some toys and without providing further instruction, you ask your kid to separate the toys into three groups. Your kid will play around and eventually find his own best way to form three groups of similar toys.
OK…so how does K-means work?
Let’s assume you think there are 3 potential segments of customers.
K-means will place 3 points (i.e. centroids) at random locations and assign each data point to the nearest centroid. Each data point represents one customer, and customers closest to the same centroid will be in the same group.
Each centroid is then moved automatically to the average position of the customers currently allocated to it, and the allocation is repeated until the groups stop changing. Doing so, K-means learns on its own which customers share similar characteristics.
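If you are curious what that loop looks like in practice, here is a minimal sketch in plain Python. It is an illustration, not a production implementation. The function name `kmeans`, its `iterations` and `seed` parameters, and the sample customers are all made up for this example. It also assumes the common textbook variant of K-means, which starts from randomly picked customers as centroids and moves each centroid to the mean of its group.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Minimal K-means sketch on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Start from k customers picked at random as the initial centroids.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(iterations):
        # Assignment step: each customer joins the nearest centroid
        # (squared Euclidean distance).
        new_assignments = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignments == assignments:
            break  # the groups stopped changing
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a group ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids

# Hypothetical customers: (orders per year, average order value)
customers = [
    (2, 20), (3, 25), (2, 22),
    (10, 200), (11, 210), (12, 190),
    (30, 60), (31, 65), (29, 55),
]

assignments, centroids = kmeans(customers, k=3)
for customer, group in zip(customers, assignments):
    print(customer, "-> segment", group)
```

Because the starting centroids are random, a different `seed` can produce a different grouping. That is one reason ready-made libraries typically run the algorithm several times and keep the best result.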
What? That looks simple. I could do the grouping visually myself!
The 2-dimensional representation of customers above is a simplified form of visualizing the data.
Each piece of information associated with a customer represents one dimension of data. For instance, if you are just plotting the items and quantity purchased, then that’s 2 dimensions. Once you consider additional information for each customer, such as country of residence and total spending, the complexity jumps to 4 dimensions!
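To make the idea of dimensions concrete, here is a small sketch that treats each customer as a list of four numbers and measures how far apart two customers are. Everything in it is an assumption for illustration: the sample values, the `standardize` helper, and the choice to represent country as a simple 0/1 "lives abroad" flag (a real country column would need some numeric encoding before it can be used this way). The rescaling step is there because a large-valued column such as total spending would otherwise dominate the distance.

```python
import math
from statistics import mean, pstdev

# Hypothetical customers:
# (distinct items, quantity, total spending, lives abroad 0/1)
customers = [
    (3, 10, 120.0, 0),
    (4, 12, 150.0, 0),
    (20, 45, 2300.0, 1),
]

def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    sigmas = [pstdev(col) or 1.0 for col in columns]
    return [
        tuple((v - m) / s for v, m, s in zip(row, mus, sigmas))
        for row in rows
    ]

scaled = standardize(customers)

# math.dist handles any number of dimensions, not just 2 or 3.
d_01 = math.dist(scaled[0], scaled[1])
d_02 = math.dist(scaled[0], scaled[2])
print("customer 0 vs 1:", round(d_01, 3))
print("customer 0 vs 2:", round(d_02, 3))
```

The first two customers come out much closer to each other than to the third, which is the kind of similarity K-means relies on when it forms groups.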
It is hard for us to imagine grouping items together beyond 3-dimensional space, but not so for machine learning. This makes machine learning much more powerful than traditional methods in finding meaningful segments.
Machine learning can make sense of multiple dimensions beyond our imagination, find similar characteristics of customers based on their information, and group similar customers together.
That’s the beauty of it!