Text clustering using the neural network method called Kohonen networks (SOM)

by new_comer   Last Updated August 09, 2017 14:19 PM

This is my first question, and I do not know whether it is a duplicate. I found many questions about clustering, but every answer gave an example of numerical clustering; even on Google I could only find clustering of numerical (continuous) variables. What I want to know is how to do text clustering with the SOM neural network method, which I think is better than k-means clustering. Can anyone suggest where to start? A small text example would be helpful.

Example: I have a CSV file that has both text and continuous columns. It looks like this:

Id          Company Name
2845571     JIO CORPORATION
9257073     DEUTSCHE TELEKOM AG
1269276     JIO CORPORATION INC
1025492     ibm bank
2845571     ibm hospitals
41552367    AT & T
9657073     DEUTSCHE TELEKOM
1269286     DEUTSCHE BANK

In this example, ibm hospitals and ibm bank should be in different clusters, and likewise deutsche bank and deutsche telekom. But jio corporation and jio corporation inc should be in a single cluster.
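To make the requirement concrete, here is a minimal sketch of one way to turn company names into something a clustering method can compare. It assumes character trigram counts and cosine similarity, which is only one possible text representation; the helper names `char_ngrams` and `cosine` are made up for illustration, and only the Python standard library is used.

```python
from collections import Counter
from math import sqrt


def char_ngrams(name, n=3):
    """Count overlapping character n-grams of a lower-cased, space-padded name."""
    padded = f" {' '.join(name.lower().split())} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count dictionaries."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Pair that should end up in the same cluster.
same = cosine(char_ngrams("JIO CORPORATION"), char_ngrams("JIO CORPORATION INC"))
# Pair that should end up in different clusters.
different = cosine(char_ngrams("ibm bank"), char_ngrams("ibm hospitals"))

print(f"jio corporation vs jio corporation inc: {same:.2f}")
print(f"ibm bank vs ibm hospitals: {different:.2f}")
```

With this representation the first pair scores about 0.89 and the second about 0.29, so the two cases are at least separable. Pairs that share a long common word, such as deutsche bank and deutsche telekom, will score higher than the ibm pair, which is the hard part of the problem.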

I am only asking for suggestions, not a full solution; any suggestion will be appreciated. I tried this in Python using the nltk and scikit-learn libraries and got about a 60% result. I heard somewhere that neural network methods for clustering are more efficient.
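For reference, the kind of thing being asked about can be sketched as follows: a very small self-organizing map trained on character trigram vectors of the names above. This is a sketch under stated assumptions, not a tuned implementation. The 3x3 grid, the learning rate, the neighbourhood width, the number of epochs, and the function names `train_som` and `best_matching_unit` are all arbitrary choices made for illustration, and it uses only the standard library rather than nltk or scikit-learn.

```python
import math
import random
from collections import Counter, defaultdict

names = ["JIO CORPORATION", "DEUTSCHE TELEKOM AG", "JIO CORPORATION INC", "ibm bank",
         "ibm hospitals", "AT & T", "DEUTSCHE TELEKOM", "DEUTSCHE BANK"]


def char_ngrams(name, n=3):
    """Count overlapping character n-grams of a lower-cased, space-padded name."""
    padded = f" {' '.join(name.lower().split())} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


# Turn each name into a dense, L2-normalised vector over a shared trigram vocabulary.
grams = [char_ngrams(name) for name in names]
vocab = sorted(set().union(*grams))


def to_vector(counts):
    vec = [float(counts[g]) for g in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


data = [to_vector(c) for c in grams]


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda k: sum((w - xi) ** 2 for w, xi in zip(weights[k], x)))


def train_som(data, rows=3, cols=3, epochs=100, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols SOM with linearly decaying learning rate and radius."""
    rng = random.Random(seed)
    units = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumption: initialise each unit from a randomly chosen training vector.
    weights = [list(rng.choice(data)) for _ in units]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.1)
        for x in rng.sample(data, len(data)):  # visit samples in random order
            br, bc = units[best_matching_unit(weights, x)]
            for k, (r, c) in enumerate(units):
                # Gaussian neighbourhood on the grid around the winning unit.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                w = weights[k]
                for j, xj in enumerate(x):
                    w[j] += lr * h * (xj - w[j])
    return weights


weights = train_som(data)

# Names that map to the same unit are treated as one cluster.
clusters = defaultdict(list)
for name, vec in zip(names, data):
    clusters[best_matching_unit(weights, vec)].append(name)
for unit, members in sorted(clusters.items()):
    print(unit, members)
```

Which names share a unit depends on the grid size, the seed, and the text representation, so the grouping should be checked against the expected clusters rather than taken at face value.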

Thanks in advance.


