My interest in Kohonen’s Self-Organizing Maps (SOM) began in 1998, out of discomfort with the so-called cross charts (four-quadrant charts) I encountered while working at a consulting firm. Several charts of this kind are used routinely in strategy consulting, but to me they seemed like nothing more than “overly simplified abstract theories.” They are easy to understand, yet it troubled my conscience that they were presented as some special technique and given an air of authority they had not earned.
Strategy consulting consists mostly of qualitative analysis. When relevant quantitative data can be obtained, a map can be built with principal component analysis, and the strategic domains that consultants advocate can then be discovered from the data rather than simply asserted. In reality, however, many of those who loudly proclaim “strategy, strategy” never reach that level of analysis.
Statistical marketing and the sensory evaluation of food and beverages do involve quantitative analysis. But ask yourself, “Is this really scientific?”, and your confidence quickly fades. For example, is running a correspondence analysis on survey results scientific? Not necessarily, because the results of a survey can differ completely depending on the questionnaire used. If the questions lack a crucial perspective, no analysis will ever arrive at the answer. In survey research there is a persistent, frustrating sense that this problem is left unresolved while the work moves on.
Statistics is famously called the “grammar of science,” a phrase attributed to Karl Pearson, but in many fields it has degenerated into a “tool intended to appear scientific.” Principal component analysis itself is a very powerful technique when used within a chain of analyses to remove noise from data or to eliminate correlations between attributes. In many fields, however, the usual end product is a map drawn on the first and second principal components. There is nothing wrong with using such a map to picture the overall shape of the data space, but treating it as the true picture is a serious pitfall. The map is merely a projection: it shows how the data look from one particular angle, and whatever variance lies along the remaining components is lost.
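How much a two-component map leaves out can be checked directly. The following is a minimal sketch, assuming synthetic data with ten equally informative attributes (the data, the function name `explained_variance_ratio`, and the sizes are illustrative assumptions, not anything from an actual study). It computes the share of total variance carried by each principal component and sums the first two.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)                   # center each attribute
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    var = s ** 2                              # proportional to component variances
    return var / var.sum()

# Assumption: synthetic data, 500 samples with 10 independent attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))

ratio = explained_variance_ratio(X)
retained_2d = ratio[:2].sum()                 # what a PC1-PC2 map can show
print(f"variance shown by a PC1-PC2 map: {retained_2d:.1%}")
print(f"variance left out of the map:    {1 - retained_2d:.1%}")
```

With real data the retained share depends on how strongly the attributes are correlated, so it is worth printing this figure next to any PC1-PC2 map before reading structure into it.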
When it comes to sensory evaluation analysis, I feel a sense of despair. First of all, the analysis cannot begin without quantifying subjective experiences such as taste and smell. There are methods for adjusting the scales, but the fundamental problem in this field is that objectifying subjectivity remains an “impossible challenge.” And since the final result is still a principal component analysis map, the question arises: where is the truth in this series of analyses? I do not think it is an exaggeration to say that statistics is being used to dress the work up as science, and that there is a dark side to this. In particular, as I have pointed out in the past, the preference mapping method has almost no validity from a statistical standpoint.
SOM offers a ray of hope for these problems because it summarizes a multidimensional space not through mere projection but through topological ordering. The preference mapping problem can be solved by combining, through matrix calculations, the matrix of expert evaluations of product characteristics with the matrix of consumer preferences for products. The result is a description of how each individual consumer responds to product characteristics, which can then be analyzed using SOM.
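The text does not spell out the matrix calculation, so the following is only one plausible reading, offered as a minimal sketch. It assumes a consumers-by-products preference matrix `P` and a products-by-characteristics expert matrix `E`, and takes their product (after centering and scaling) as the consumer-by-characteristic response matrix. The function names, the scaling choices, the synthetic data, and the small batch SOM that follows are all illustrative assumptions.

```python
import numpy as np

def consumer_response(P, E):
    """Combine preferences (consumers x products) with expert ratings
    (products x characteristics) into consumers x characteristics."""
    Pc = P - P.mean(axis=1, keepdims=True)       # each consumer's relative liking
    Ez = (E - E.mean(axis=0)) / E.std(axis=0)    # standardize each characteristic
    return Pc @ Ez / P.shape[1]

def train_batch_som(X, rows=4, cols=4, epochs=20, seed=0):
    """Tiny batch SOM on a rows x cols grid; returns the unit weight vectors."""
    rng = np.random.default_rng(seed)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    grid_d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    W = X[rng.choice(len(X), rows * cols, replace=False)].astype(float)
    for epoch in range(epochs):
        # neighborhood radius shrinks over the epochs
        sigma = max(rows, cols) / 2 * (1 - epoch / epochs) + 0.5
        bmu = best_matching_units(X, W)
        H = np.exp(-grid_d2[:, bmu] / (2 * sigma ** 2))   # units x samples
        W = (H @ X) / H.sum(axis=1, keepdims=True)        # neighborhood-weighted means
    return W

def best_matching_units(X, W):
    """Index of the closest SOM unit for every row of X."""
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Assumption: synthetic data, 200 consumers, 12 products, 6 characteristics.
rng = np.random.default_rng(1)
E = rng.normal(size=(12, 6))                              # expert ratings
P = rng.integers(1, 10, size=(200, 12)).astype(float)     # 1-9 liking scores

C = consumer_response(P, E)     # one response profile per consumer
W = train_batch_som(C)          # 16 map units in characteristic space
bmu = best_matching_units(C, W) # map position of each consumer
```

Each row of `C` says which characteristics tend to be high in the products that a given consumer rates above their own average, so the map groups consumers by what they respond to rather than by which products they chose.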
SOM can be interpreted as a practical method for nonlinear multivariate analysis. It’s almost pitiful to think that the vast majority of consultants in the world don’t understand this and are still using cross charts. It reminds me of when I take my dog into a park and it tries to go around a pole, but its leash gets caught and it can’t get through. There’s no other way to describe it than that there’s a crucial cognitive gap.
SOM (specifically, the statistically compatible batch SOM) works very well for analyzing quantitative data of up to about 20 dimensions. Today, advances in Large Language Models (LLMs) have made it possible to vectorize qualitative information (unstructured text). The result is extremely high-dimensional data, for example vectors of 1,536 dimensions. SOM can model such data, but it has a slight weakness: its topology is fixed from the beginning to the end of training. For extremely high-dimensional data I therefore recommend training with Growing Neural Gas (GNG), which can be seen as an extension of SOM that adds units and connections as training proceeds. GNG alone is sufficient, but the result is better still if you then compute a Minimum Spanning Tree (MST) to bring out the main topology.
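The GNG-then-MST idea can be sketched in a few dozen lines. What follows is a minimal illustration, assuming the standard GNG update rules, synthetic 64-dimensional data standing in for text embeddings, and an MST computed with Prim's algorithm over the Euclidean distances between all GNG units. The text does not say whether the MST should be restricted to GNG's own edges, and every parameter value here is an assumption chosen for the demonstration.

```python
import numpy as np

def train_gng(X, max_nodes=20, steps=3000, eps_b=0.05, eps_n=0.006,
              age_max=50, insert_every=100, alpha=0.5, decay=0.995, seed=0):
    """Growing Neural Gas; returns (unit positions, {(i, j): age} edges)."""
    rng = np.random.default_rng(seed)
    W = [X[i].astype(float) for i in rng.choice(len(X), 2, replace=False)]
    err = [0.0, 0.0]
    edges = {}                                    # keys are (i, j) with i < j
    for t in range(1, steps + 1):
        x = X[rng.integers(len(X))]
        d = np.linalg.norm(np.asarray(W) - x, axis=1)
        s1, s2 = (int(i) for i in np.argsort(d)[:2])
        err[s1] += float(d[s1] ** 2)
        W[s1] += eps_b * (x - W[s1])              # move the winner toward x
        for e in list(edges):
            if s1 in e:
                edges[e] += 1                     # age the winner's edges
                n = e[0] if e[1] == s1 else e[1]
                W[n] += eps_n * (x - W[n])        # drag its neighbors slightly
        edges[(min(s1, s2), max(s1, s2))] = 0     # refresh or create winner edge
        edges = {e: a for e, a in edges.items() if a <= age_max}
        alive = sorted({i for e in edges for i in e})
        if len(alive) < len(W):                   # drop units left without edges
            remap = {old: new for new, old in enumerate(alive)}
            W = [W[i] for i in alive]
            err = [err[i] for i in alive]
            edges = {(remap[i], remap[j]): a for (i, j), a in edges.items()}
        if t % insert_every == 0 and len(W) < max_nodes:
            q = int(np.argmax(err))               # unit with the largest error
            nbrs = [j if i == q else i for (i, j) in edges if q in (i, j)]
            f = max(nbrs, key=lambda n: err[n])   # its worst neighbor
            r = len(W)
            W.append(0.5 * (W[q] + W[f]))         # new unit halfway between
            del edges[(min(q, f), max(q, f))]
            edges[(q, r)] = 0
            edges[(f, r)] = 0
            err[q] *= alpha
            err[f] *= alpha
            err.append(err[q])
        err = [e * decay for e in err]
    return np.asarray(W), edges

def minimum_spanning_tree(W):
    """Prim's algorithm on pairwise Euclidean distances; returns n-1 edges."""
    n = len(W)
    dist = np.linalg.norm(W[:, None, :] - W[None, :, :], axis=2)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()                         # cheapest link into the tree
    parent = np.zeros(n, dtype=int)
    tree = []
    for _ in range(n - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        tree.append((int(parent[j]), j))
        in_tree[j] = True
        closer = dist[j] < best
        best = np.where(closer, dist[j], best)
        parent = np.where(closer, j, parent)
    return tree

# Assumption: three synthetic clusters in 64 dimensions stand in for embeddings.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 64))
X = np.vstack([c + rng.normal(size=(100, 64)) for c in centers])

nodes, gng_edges = train_gng(X)
tree = minimum_spanning_tree(nodes)
print(len(nodes), "GNG units,", len(gng_edges), "GNG edges,", len(tree), "MST edges")
```

The MST keeps only the shortest links needed to connect every unit, which is what makes the backbone of the structure easier to read than the full GNG graph.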
The emergence of LLMs, which makes it possible to create maps from unstructured text, is a revolutionary development. The conceptual structure model built with GNG+MST has the potential to replace the old cross charts as a tool for strategic information analysis. I highly recommend that conscientious consultants and researchers try it out.