Why is this important?
Customer Segmentation (CS), i.e. the subdivision of the entire customer base into discrete groups that share similar characteristics, is a fundamental marketing practice that dates back many decades. It is a core element of the Customer Relationship Management (CRM) process, which supports the development of a business strategy for building long-term, profitable relationships with customers. The benefits of CS include, among others, the identification of profitable (and unprofitable) customers, customer retention and satisfaction, product improvement to meet customer needs, avoidance of non-profitable markets, outperforming the competition, and quick adaptation to social or demographic changes.
Customer segmentation in the past
In the past, customer segments were primarily extracted based on intuition, practical experience or market wisdom. However, as the amount of digital data started to increase, the use of computational algorithms to perform CS became more and more imperative; information on many thousands of customers could not be properly analyzed by the human brain. At that stage, cluster analysis, i.e. the computational method for identifying homogeneous groups of objects in the data (called clusters), started to gain traction. Indeed, it was successfully applied to a variety of CS cases that usually involved a rather small number of a priori defined market and/or customer characteristics. Cluster analysis allowed customer segments to be formed automatically, based more on data and less on subjective judgment. However, these initial CS solutions considered only a moderate amount of customer information, and didn’t tackle challenges similar to the ones we face today.
What are the challenges today?
Today, CS is a vital requirement for every company that wants to cope with continuous market changes and ever stronger competition. Population characteristics change rapidly, especially in metropolitan areas. Customer preferences are easily shifted by marketing campaigns and media. As a result, companies need to adapt quickly to the fast-changing size and type of customer segments. Furthermore, customer retention requires continuous monitoring of customers’ needs, and upselling and cross-selling opportunities can easily be lost if a company does not keep track of the often sudden lifestyle changes of its customers.
The data is out there!
So, how can we adapt to the continuously changing customer landscape? On the bright side, the information is all there! For example, publicly available demographic data capture annual changes in population characteristics (even within small geographical blocks). Information on sales is recorded in great detail and stored in databases. Personal preferences can be assessed via customer cards or credit cards. Social media data reflect general trends and product perception. Data from websites (e.g., click-stream data) carry a wealth of information about current products, while personal web pages can provide information about a customer’s lifestyle changes.
Is it difficult to analyze all this information?
Yes, it is. All the aforementioned data sources give each customer many ‘dimensions’. And this is the greatest challenge for cluster analysis: dealing with high dimensionality in terms of customers’ features. Although traditional cluster analysis algorithms can easily deal with information on a large number of customers, they can do so only on the premise that the number of features per customer is moderate. When the dimension of the so-called ‘feature space’ is high, they become easily subject to bias. For the very same dataset, the clustering result can be substantially different depending on the choice of (i) the clustering method, (ii) the predefined number of clusters, and (iii) the employed similarity measure. A biased customer segmentation analysis may have a negative financial impact on the company, as it can lead to a wrong marketing strategy and, therefore, loss of customers.
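To make point (iii) concrete, here is a minimal sketch in plain Python of how the similarity measure alone can change the segment a customer lands in. The segment names, the two spend features and all the numbers are hypothetical, invented purely for illustration; a real analysis would involve many more features and a full clustering algorithm.

```python
from math import sqrt


def euclidean_distance(a, b):
    """Straight-line distance: sensitive to how much a customer spends."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity: sensitive only to the mix of spending."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_segment(customer, centroids, distance):
    """Return the name of the segment whose centroid is closest."""
    return min(centroids, key=lambda name: distance(customer, centroids[name]))


# Hypothetical segment centroids: [grocery spend, electronics spend]
centroids = {
    "balanced_basket": [1.0, 1.0],
    "grocery_heavy": [10.0, 0.0],
}

# A hypothetical customer who spends a lot, split evenly between categories
customer = [8.0, 8.0]

by_euclidean = nearest_segment(customer, centroids, euclidean_distance)
by_cosine = nearest_segment(customer, centroids, cosine_distance)

print("Euclidean distance assigns:", by_euclidean)
print("Cosine distance assigns:   ", by_cosine)
```

With Euclidean distance the customer is grouped with the big spenders, while with cosine distance the same customer is grouped with those who share the same spending mix. Neither answer is wrong in itself, which is exactly why an unexamined choice of measure can bias the segmentation.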
Let your data speak!
Removing subjectivity from the data analysis process is of paramount importance. State-of-the-art clustering strategies make it possible to perform robust clustering and uncover reliable customer segments. For example, recent methodologies that employ information theory can optimally select the cluster model and the number of clusters for our data, so that our customer segments are small enough to be informative and large enough to be stable under fluctuations (a.k.a. noise) in our dataset. Moreover, modern feature selection techniques can help us identify correlations between the customers’ features and uncover the effective dimensionality of the multi-source customer dataset. In this way, we can perform high-quality customer segmentation that fully accounts for today’s wealth of information, and hence we can reliably help a company become more competitive, succeed in market expansion and increase its profitability.
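As a rough illustration of the feature-correlation idea, the sketch below drops any feature that is almost perfectly correlated with one already kept. This is a deliberately simple stand-in, not one of the modern techniques referred to above: the Pearson correlation filter, the 0.95 threshold, the feature names and the data are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def select_features(columns, threshold=0.95):
    """Keep a feature only if it is not highly correlated with a kept one."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical data for six customers, one list per feature
customers = {
    "annual_spend": [120, 340, 560, 800, 150, 910],
    "monthly_spend": [10, 28, 47, 67, 12, 76],  # roughly annual_spend / 12
    "store_visits": [5, 2, 9, 4, 8, 3],
}

kept = select_features(customers)
print("Features kept:", kept)
print("Effective number of features:", len(kept), "out of", len(customers))
```

Here three recorded features carry only two independent pieces of information, because monthly spend is just annual spend on a different scale. In a real multi-source dataset the redundancy is rarely this obvious, which is where more sophisticated techniques come in.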