# Does it make sense to cluster SOM neurons instead of the original data?

by Tendero   Last Updated August 30, 2018 01:19 AM

Suppose we have a dataset of $M$ points in an $N$-dimensional space, and we train a 2D self-organising map (SOM) on it. If the SOM has dimensions $d_1\times d_2$, there will be $P=d_1d_2$ neurons, each with an $N$-dimensional weight vector, that represent the original dataset. In general, $P \ll M$.

Would it make sense to cluster the $P$ neurons of the SOM instead of the $M$ original datapoints? The procedure I imagine would be as follows:

1. Perform some clustering on the $P$ neuron weights of the SOM. Suppose that we are left with $k$ clusters.
2. Compute the centroid of each cluster by averaging the weights of all the neurons assigned to it.
3. Now we have $k$ centroids. Assign each original datapoint to the cluster whose centroid is nearest.
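
To make the three steps concrete, here is a minimal NumPy sketch of what I have in mind. Everything in it is an assumption for illustration: the function names (`train_som`, `kmeans`, `nearest`, `cluster_via_som`) are hypothetical, the SOM trainer is a bare-bones online version rather than any library's implementation, and k-means stands in for the unspecified "some clustering" of step 1.

```python
import numpy as np

def nearest(points, centroids):
    """Index of the closest centroid (squared Euclidean) for each point."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def train_som(data, d1, d2, n_iter=2000, lr0=0.5, seed=0):
    """Bare-bones online SOM; returns weights of shape (d1*d2, N)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of each neuron, shape (P, 2).
    grid = np.array([(i, j) for i in range(d1) for j in range(d2)], dtype=float)
    # Initialise the neurons from randomly chosen data points.
    weights = data[rng.integers(0, len(data), size=d1 * d2)].astype(float)
    sigma0 = max(d1, d2) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5    # shrinking neighbourhood width
        x = data[rng.integers(len(data))]
        bmu = ((weights - x) ** 2).sum(axis=1).argmin()  # best matching unit
        grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))        # Gaussian neighbourhood
        weights += lr * h[:, None] * (x - weights)
    return weights

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one label per point."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest(points, centroids)
        for c in range(k):
            members = points[labels == c]
            if len(members):                   # leave empty clusters unchanged
                centroids[c] = members.mean(axis=0)
    return nearest(points, centroids)

def cluster_via_som(data, weights, k):
    # Step 1: cluster the P neuron weights.
    neuron_labels = kmeans(weights, k)
    # Step 2: centroid of each non-empty cluster = mean of its neurons.
    present = np.unique(neuron_labels)
    centroids = np.stack([weights[neuron_labels == c].mean(axis=0) for c in present])
    # Step 3: assign every original datapoint to the nearest centroid.
    return nearest(data, centroids), centroids

# Toy data: M = 600 points in N = 3 dimensions, drawn around three centres.
rng = np.random.default_rng(42)
centres = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 0.0], [0.0, 5.0, 5.0]])
data = np.concatenate([c + 0.5 * rng.standard_normal((200, 3)) for c in centres])

weights = train_som(data, d1=6, d2=6)          # P = 36 neurons
data_labels, centroids = cluster_via_som(data, weights, k=3)
print(weights.shape, centroids.shape, np.bincount(data_labels))
```

The point of the sketch is only that the clustering in step 1 runs on $P$ vectors instead of $M$, while step 3 is a single nearest-centroid pass over the original data.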

Does this approach make sense? In which cases would it be sensible to follow this procedure? Is this done in real life?
