by BS BS
Last Updated January 11, 2018 13:19 PM

If I understood the algorithm properly it should go like this:

- Randomly initialize the SOM vectors $\xi_1, \dots, \xi_n$ in feature space.
- Randomly pick a feature vector $v(k)$ from the training set at step $k$.
- Find the BMU (best matching unit, i.e. the SOM vector closest to $v(k)$), $b(k)$.
- Update *all* $\xi_m$ according to the following rule:

$$ \xi_m(k + 1) = \xi_m(k) + \alpha(k)\,h(k, \mathrm{dist}(V_m, b(k)))\,[v(k) - \xi_m(k)] $$

where $\alpha(k)$ is the learning rate and $h(k, \mathrm{dist}(V_m, b(k))) = \exp\left(-\frac{\mathrm{dist}(V_m, b(k))}{\sigma^2(k)}\right)$.

- Decay the learning rate and neighborhood width: $\alpha(k+1) = \alpha(0)\exp(-k/\lambda)$ and $\sigma(k+1) = \sigma(0)\exp(-k/\lambda)$.
- Repeat from step 2 until the desired number of iterations is reached.
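The loop above can be sketched in plain NumPy (a minimal sketch, not PyMVPA's implementation; the function name, grid size, and decay constants are my own choices). Note that each iteration touches only one training sample, so the per-step cost depends on the number of units and the feature dimension, not on the dataset size:

```python
import numpy as np

def train_som(data, grid=(10, 10), iters=1000, alpha0=0.1, sigma0=3.0, lam=500.0, seed=0):
    rng = np.random.default_rng(seed)
    h, w = grid
    # Step 1: randomly initialize the SOM vectors xi_1..xi_n in feature space
    weights = rng.random((h * w, data.shape[1]))
    # Grid coordinates V_m of each unit, used for the neighborhood distance
    coords = np.array([(i, j) for i in range(h) for j in range(w)], dtype=float)
    for k in range(iters):
        # Decayed learning rate alpha(k) and neighborhood width sigma(k)
        alpha = alpha0 * np.exp(-k / lam)
        sigma = sigma0 * np.exp(-k / lam)
        # Step 2: randomly pick one training vector v(k)
        v = data[rng.integers(len(data))]
        # Step 3: BMU = SOM vector closest to v(k) in feature space
        b = np.argmin(((weights - v) ** 2).sum(axis=1))
        # Step 4: update ALL units, weighted by grid distance to the BMU
        d2 = ((coords - coords[b]) ** 2).sum(axis=1)
        hk = np.exp(-d2 / sigma ** 2)
        weights += alpha * hk[:, None] * (v - weights)
    return weights.reshape(h, w, -1)
```

With this formulation the dataset only appears in the single-sample lookup, which is why I expect runtime to be flat in the number of samples.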

So, from my understanding, the execution time should depend mainly on the number of iterations, the map size, and the feature dimensionality, not on the number of training samples.

The problem is that when playing around with PyMVPA's `mvpa2.mappers.som.SimpleSOMMapper`, I find that the execution time increases with the size of my dataset.

```
import numpy as np
from mvpa2.mappers.som import SimpleSOMMapper

data = np.random.randint(low=0, high=255, size=(100, 3))
data = data / 255

som = SimpleSOMMapper((10, 10), 1000, learning_rate=0.01)
%time som.train(data)
```

This outputs a wall time of 5.8 s, while

```
data = np.random.randint(low=0, high=255, size=(1000, 3))
data = data / 255

som = SimpleSOMMapper((10, 10), 1000, learning_rate=0.01)
%time som.train(data)
```

outputs a wall time of 58 s. What am I missing? And, most importantly, how can I apply this algorithm to large datasets? Are there better Python libraries for this?
