by marcv81
Last Updated March 06, 2018 07:19 AM

When training a neural network, do I get the same results in both the following cases?

- Training on half the training data for 1 epoch, then on the other half for another epoch.
- Training on the entire training data for 1 epoch.

I am training a model on variable-length sequential data. The model is already well-tuned. I would like to divide the data into subsets of different sizes to optimize training speed (i.e., spend less time training on padding data). If the two cases above are equivalent, the solution is straightforward.
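To make the padding cost concrete, here is a minimal sketch in plain Python. It assumes each subset is padded to the length of its longest sequence. The helper names (`padded_cells`, `bucket_by_length`), the bucket boundaries, and the example lengths are all hypothetical and are not taken from my actual model or data.

```python
def padded_cells(lengths):
    """Total time steps processed when every sequence is padded to the longest one."""
    return max(lengths) * len(lengths) if lengths else 0


def bucket_by_length(lengths, boundaries):
    """Group sequence lengths into buckets.

    `boundaries` are inclusive upper bounds in ascending order; anything
    longer than the last boundary goes into a final overflow bucket.
    """
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for n in lengths:
        for i, bound in enumerate(boundaries):
            if n <= bound:
                buckets[i].append(n)
                break
        else:
            # No boundary was large enough: use the overflow bucket.
            buckets[-1].append(n)
    return buckets


# Hypothetical sequence lengths, for illustration only.
lengths = [3, 4, 5, 18, 20, 19, 50, 48]

# One subset: every sequence is padded to length 50.
single = padded_cells(lengths)

# Three subsets: each one is padded only to its own maximum length.
buckets = bucket_by_length(lengths, boundaries=[5, 20])
bucketed = sum(padded_cells(b) for b in buckets)

print(single, bucketed)  # 400 175
```

With these made-up lengths, the single subset processes 400 time steps and the three subsets together process 175, against 167 real (non-padding) steps. That saving is the motivation; whether training on such subsets one after another matches training on the whole set is exactly what I am asking.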

I can share information about the particular model, but I think it should be possible to give a theoretical answer that generalizes to any model.
