Support computing custom scores and terminating/saving based on them in BaseTrainer #1202
Conversation
Thanks for the PR @anyongjin, it's a good contribution!
Overall, the trainer has to become more flexible, but that would be too much to ask for right now. I think we can merge this after some slight changes and then refactor the trainer soon, taking into consideration the support for custom scoring and custom conditions for terminating the training.
In essence, this update:

- Adds `evaluate_test_fn` to `BaseTrainer` (calculates the test batch performance score to determine whether it is the best model).

This PR introduces a new concept into tianshou training: a `best_score`. It is computed from the appropriate `Stats` instance and always added to `InfoStats`.

Breaking Changes:

- `InfoStats` has a new non-optional field `best_score`
Background

Currently, tianshou uses the maximum average return to select the best model, but this may not always meet user needs. For example, if one model's average return is only 5% lower while its standard deviation is 50% lower, the latter is generally considered more stable and therefore better.
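The motivating example above can be sketched as a custom scoring function. Note this is an illustrative sketch, not the PR's actual implementation: the function name `custom_score`, the `std_penalty` weight, and the sample data are all hypothetical, chosen only to show how a variance-penalized score can prefer the more stable model over the one with the higher raw mean.

```python
import statistics

def custom_score(returns: list[float], std_penalty: float = 0.5) -> float:
    """Hypothetical score: penalize return variance so that a slightly
    lower but much more stable average return can still win."""
    mean = statistics.fmean(returns)
    std = statistics.pstdev(returns)
    return mean - std_penalty * std

# Model A: mean 100 but high spread; Model B: mean 95 (5% lower) with far lower spread.
model_a = [80.0, 120.0, 60.0, 140.0]
model_b = [90.0, 100.0, 95.0, 95.0]

# Under the plain mean, A wins; under the variance-penalized score, B wins.
print(statistics.fmean(model_a) > statistics.fmean(model_b))  # mean prefers A
print(custom_score(model_b) > custom_score(model_a))          # custom score prefers B
```

A function with this shape is exactly what a pluggable `evaluate_test_fn` lets users supply instead of the hard-coded maximum average return.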