What are best practices for storing ensemble run outputs?

by Doug   Last Updated August 13, 2019 17:05 PM

I'm writing a Python program that will take in multiple inputs and run a model over all combinations of those inputs. Initially there will be just under 200 runs. I'd like to structure the intermediate and output files so that I can trace each output back to the inputs that generated it, without creating a bloated directory structure or extremely long filenames.
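To make the setup concrete, here is a minimal sketch of enumerating every combination and giving each one a short run ID, with a single CSV manifest recording which inputs belong to which ID. The function name `build_manifest`, the `run_NNNN` naming scheme, and the manifest layout are illustrative assumptions, not part of the actual program.

```python
import csv
import itertools


def build_manifest(inputs, manifest_path):
    """Write one CSV row per input combination, keyed by a short run ID.

    `inputs` maps each parameter name to the list of values to sweep
    (hypothetical structure, assumed for illustration).
    """
    names = sorted(inputs)  # fixed column order, independent of dict order
    combos = itertools.product(*(inputs[name] for name in names))

    rows = []
    for index, combo in enumerate(combos):
        run_id = f"run_{index:04d}"  # short name instead of encoding inputs
        rows.append({"run_id": run_id, **dict(zip(names, combo))})

    with open(manifest_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["run_id", *names])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

With this layout, filenames only need to carry the run ID; the manifest is the one place that maps an ID back to its inputs.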

Are there established designs for handling this kind of workflow and, I imagine, Monte Carlo-style runs in general? Would it make sense to create a folder for each run and save in it a configuration file recording the inputs used for that run's output?

The model writes its outputs in a binary format: large, compressed, multidimensional NumPy arrays.
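The folder-per-run idea could look roughly like the sketch below: each run gets a directory named after a short hash of its inputs, and the inputs themselves are saved there as `config.json`. The helper names `run_id_for` and `prepare_run_dir`, the 10-character hash length, and the file name `config.json` are all assumptions made for illustration; the model's binary output would simply be written into the returned directory.

```python
import hashlib
import json
from pathlib import Path


def run_id_for(params):
    """Return a short, deterministic ID derived from the parameter values."""
    # sort_keys makes the ID independent of dict insertion order
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:10]


def prepare_run_dir(root, params):
    """Create <root>/<run_id>/ and record the inputs in config.json."""
    run_dir = Path(root) / run_id_for(params)
    run_dir.mkdir(parents=True, exist_ok=True)
    with open(run_dir / "config.json", "w") as f:
        json.dump(params, f, indent=2, sort_keys=True)
    return run_dir
```

Because the ID is derived from the inputs, rerunning the same combination lands in the same folder, and directory names stay short no matter how many inputs there are. The trade-off is that the names are not human-readable, so the saved configuration file is what ties an output back to its inputs.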


