I delayed posting this because I've been toying with whether I should write on segmentation at all. So much has been written on this subject that I am a little hesitant about revisiting this space.
What got me to finally pen this was the title of a paper at an upcoming conference: 'How statistics get in the way of actionable segmentation'. I don't know what the presenters have to say (must source a copy after the conference) but the title made me laugh. The two phrases that stuck in my head were 'statistics' and 'actionable segmentation', and whether the twain will ever meet.
I have undertaken innumerable segmentation projects in my career, some simple, others complex, and yet others that went nowhere. All of them in the end have the same things in common:
- Too many bases variables
- Overreliance on cluster analysis as the primary tool for segmentation
- Use of subjective judgement to evaluate results of the cluster solution
- Lack of reliability and validity tests on the solution
- Recreation of the scientific solution into a more 'creative and arty' one
What the above means for managers who implement segmentation solutions is that they could be formulating strategy around, and targeting, segments that are unstable and unreal. There exists a body of research that calls for a deeper look at the statistics and data that go into cluster analysis and segmentation (I will be happy to provide the references).
The real issue continues to be an inability of both analysts and practitioners to put together a common road map for segmentation that takes into account statistical robustness of the technique along with creation of actionable segments that can be targeted through focused marketing programs. In my experience, the science of segmentation gets lost in the art.
Here are the five key things that analysts and practitioners must do to create better, more robust and more scientific segments:
- Choose bases variables for segmentation that tie in with the end goal of the segmentation, and keep their number to no more than 8-10. Build a set of good profiling variables that tie into the bases variables (there is no restriction on their number).
- Explore other tools for segmentation (sometimes simple business rules work just as well). Latent class analysis offers excellent alternatives for both survey and CRM data and is still a highly underused technique. If possible, try two techniques and compare and contrast the results.
- Use a variety of statistical parameters to evaluate a solution instead of relying on one or two, or on subjectivity. Decide which metrics you want to look at before the study. For example, dendrograms, change rate, pseudo R-squared and Hotelling's T-squared can be some of the metrics for evaluating the number of clusters in a cluster solution. The BIC, p-value, parsimony (number of parameters) and the bootstrap p-value can be the parameters used to pin down the number of segments in a latent class segmentation. Reliance on 'many' statistics vs. 'few' should be the mantra.
- Test reliability through holdout samples, and validity by looking at the profiling variables and how well they differentiate the solution. The holdout sample results must match those of the developmental sample in terms of the number of segments and their profiles. Most of the chosen profiling variables must adequately differentiate the final segment solution. If there is an issue with reliability or validity, the solution may have a problem. Going back and reworking it is the best way out.
- Don't use the art of segmentation to sidestep the science for a solution; use it, if you will, to add to the science.
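
To make the reliability point a little more concrete, here is a minimal Python sketch of a holdout check. Everything in it is an assumption made for illustration: the data is synthetic (three well-separated groups on two made-up bases variables), the helper names (`kmeans`, `pseudo_f`, `choose_k`) are hypothetical, a bare-bones k-means stands in for whatever clustering tool you actually use, and the only statistic shown is the pseudo-F (Calinski-Harabasz) ratio. In a real study you would look at several statistics, as argued above, not just one.

```python
import random

def dist2(p, q):
    # Squared Euclidean distance between two points.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    # Coordinate-wise mean of a list of points.
    return tuple(sum(c) / len(points) for c in zip(*points))

def nearest(p, centroids):
    # Index of the centroid closest to point p.
    return min(range(len(centroids)), key=lambda j: dist2(p, centroids[j]))

def kmeans(points, k, iters=50):
    # Deterministic farthest-first initialisation, then Lloyd iterations.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        new = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            new.append(mean(members) if members else centroids[j])
        if new == centroids:
            break
        centroids = new
    return centroids, [nearest(p, centroids) for p in points]

def pseudo_f(points, centroids, labels):
    # Between-cluster variance over within-cluster variance (higher is better).
    n, k = len(points), len(centroids)
    grand = mean(points)
    wss = sum(dist2(p, centroids[lab]) for p, lab in zip(points, labels))
    tss = sum(dist2(p, grand) for p in points)
    return ((tss - wss) / (k - 1)) / (wss / (n - k))

def choose_k(points, candidates=range(2, 6)):
    # Pick the number of segments with the highest pseudo-F.
    scores = {k: pseudo_f(points, *kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get)

def shares(labels, k):
    # Proportion of respondents falling into each segment.
    return [labels.count(j) / len(labels) for j in range(k)]

# Synthetic data (assumption): three groups on two bases variables.
rng = random.Random(0)
centers = [(0, 0), (10, 0), (0, 10)]

def sample(n):
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in centers for _ in range(n)]

dev, holdout = sample(40), sample(20)

# Reliability check 1: both samples should suggest the same number of segments.
dev_k, holdout_k = choose_k(dev), choose_k(holdout)

# Reliability check 2: score the holdout with the developmental centroids
# and compare segment sizes between the two samples.
centroids, dev_labels = kmeans(dev, dev_k)
holdout_labels = [nearest(p, centroids) for p in holdout]
dev_shares = shares(dev_labels, dev_k)
holdout_shares = shares(holdout_labels, dev_k)
max_gap = max(abs(a - b) for a, b in zip(dev_shares, holdout_shares))

print("segments (dev, holdout):", dev_k, holdout_k)
print("largest gap in segment share:", round(max_gap, 3))
```

If the two samples disagree on the number of segments, or the segment shares drift apart, that is the signal to go back and rework the solution. The same comparison can be repeated on the profiling variables, segment by segment, to check that the profiles hold up as well.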