Way to restart experiments #45

Open
matthiasreisser opened this issue Apr 26, 2017 · 1 comment

Comments

@matthiasreisser
Contributor

Use case: an experiment terminates, and I now want to resume training from a saved checkpoint. The experiment directory should be the same as in the initial run, since ideally the model checkpoints (and, for example, TensorFlow summaries) are already in the previous experiment's folder.

The terminal printout would either be appended to the previous output.txt, or a new output2.txt would be created, etc.
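
To make the two logging options concrete, here is a minimal sketch. The helper names (`next_output_path`, `open_output`) and the `output2.txt`, `output3.txt`, ... numbering are assumptions for illustration only; they are not part of the library's API.

```python
import os


def next_output_path(experiment_dir, base="output", ext=".txt"):
    """Return output.txt if it is free, otherwise output2.txt, output3.txt, ...

    Hypothetical helper: the naming scheme is an assumption.
    """
    first = os.path.join(experiment_dir, base + ext)
    if not os.path.exists(first):
        return first
    n = 2
    while True:
        candidate = os.path.join(experiment_dir, f"{base}{n}{ext}")
        if not os.path.exists(candidate):
            return candidate
        n += 1


def open_output(experiment_dir, append=True):
    """Open the log for a resumed run.

    append=True  -> keep writing to the existing output.txt
    append=False -> start a fresh, numbered file next to the old one
    """
    if append:
        return open(os.path.join(experiment_dir, "output.txt"), "a")
    return open(next_output_path(experiment_dir), "w")
```

Appending keeps a single continuous log per experiment, while numbered files make it easy to see where each restart began.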

@petered
Contributor

petered commented Jul 26, 2017

Thought about this.....

One way would be to extend a PersistentExperiment base class with an __init__ and an abstract "run_step" method. Its "run" method would repeatedly call run_step while either periodically saving checkpoints or catching keyboard interrupts and saving from there.
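
A rough sketch of what such a base class could look like, assuming checkpoints are plain pickle files. Nothing here exists in the library yet: `resume_or_start`, `save`, `checkpoint_every`, `max_steps`, and the toy `CountingExperiment` subclass are all hypothetical names chosen for illustration.

```python
import os
import pickle
import tempfile
from abc import ABC, abstractmethod


class PersistentExperiment(ABC):
    """Hypothetical base class; subclasses implement run_step()."""

    def __init__(self, checkpoint_path, checkpoint_every=10):
        self.checkpoint_path = checkpoint_path
        self.checkpoint_every = checkpoint_every
        self.step = 0

    @abstractmethod
    def run_step(self):
        """Do one unit of work; return False to signal completion."""

    def save(self):
        # Write to a temp file first so an interrupted write cannot
        # clobber the last good checkpoint.
        tmp_path = self.checkpoint_path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(self.__dict__, f)
        os.replace(tmp_path, self.checkpoint_path)

    @classmethod
    def resume_or_start(cls, checkpoint_path, **kwargs):
        # Build a fresh experiment, then overwrite its state if a
        # checkpoint from an earlier run exists.
        experiment = cls(checkpoint_path, **kwargs)
        if os.path.exists(checkpoint_path):
            with open(checkpoint_path, "rb") as f:
                experiment.__dict__.update(pickle.load(f))
        return experiment

    def run(self, max_steps=None):
        steps_done = 0
        try:
            while max_steps is None or steps_done < max_steps:
                if self.run_step() is False:
                    break
                self.step += 1
                steps_done += 1
                if self.step % self.checkpoint_every == 0:
                    self.save()
        except KeyboardInterrupt:
            # Note: the interrupted step may have been half applied.
            print("Interrupted; saving checkpoint before exiting.")
        self.save()


class CountingExperiment(PersistentExperiment):
    """Toy subclass: accumulates the step index."""

    def __init__(self, checkpoint_path, checkpoint_every=10):
        super().__init__(checkpoint_path, checkpoint_every)
        self.total = 0

    def run_step(self):
        self.total += self.step
        return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "checkpoint.pkl")
        CountingExperiment.resume_or_start(path, checkpoint_every=2).run(max_steps=5)
        resumed = CountingExperiment.resume_or_start(path)  # picks up at step 5
        resumed.run(max_steps=3)
        print(resumed.step, resumed.total)  # prints: 8 28
```

The sketch pickles only the instance's attribute dictionary rather than the whole object, which keeps the checkpoint independent of where the subclass is defined. A keyboard interrupt that lands in the middle of run_step can still leave the state half updated, so a real implementation would need to decide whether to save at that point or fall back to the last periodic checkpoint.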

A small obstacle is that everything in your experiment needs to be picklable (so no lambda functions, etc.).
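
To illustrate the pickling constraint with the standard `pickle` module: a lambda stored in the experiment state cannot be serialized, while an equivalent `functools.partial` over an importable function can. The `is_picklable` helper is a hypothetical convenience for this example only.

```python
import operator
import pickle
from functools import partial


def is_picklable(obj):
    """Hypothetical helper: True if obj survives pickle.dumps."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name, so pickle cannot store a reference to it.
scale_lambda = lambda x: 2 * x

# partial over an importable function carries the same behaviour
# and can be pickled.
scale_partial = partial(operator.mul, 2)

print(is_picklable(scale_lambda))   # False
print(is_picklable(scale_partial))  # True

# The partial still works after a round trip through pickle.
restored = pickle.loads(pickle.dumps(scale_partial))
print(restored(3))  # 6
```

The same restriction applies to functions defined inside other functions, so callbacks held by an experiment would need to be module-level functions, partials, or instances of picklable classes.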
