
Support computing custom scores and terminating/saving based on them in BaseTrainer #1202

Merged
4 commits merged into thu-ml:master on Aug 14, 2024

Conversation

@anyongjin (Contributor) commented Aug 13, 2024

This PR introduces a new concept into tianshou training: a best_score. It is computed from the appropriate Stats instance and always added to InfoStats.

Breaking Changes:

  • InfoStats has a new non-optional field best_score

Background

Currently, tianshou uses the maximum average return to select the best model. However, this does not always match user needs: for example, a model whose average return is only 5% lower but whose standard deviation is 50% lower is generally considered more stable and therefore better.
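To illustrate the kind of scoring this enables, here is a minimal sketch of a custom score function that penalizes the spread of test returns. It assumes the `compute_score_fn` hook introduced by this PR receives the test `CollectStats` and that `returns_stat` exposes `mean` and `std`; the 0.5 weight is an arbitrary example choice, not part of the PR.

```python
from tianshou.data import CollectStats


def score_fn(stat: CollectStats) -> float:
    # Example score: a high mean return is good, a large standard
    # deviation across test episodes is penalized (weight 0.5 is arbitrary).
    return stat.returns_stat.mean - 0.5 * stat.returns_stat.std
```

Passing such a function to the trainer as `compute_score_fn` would then make best-model selection and early stopping follow this score instead of the raw average return.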

@MischaPanch (Collaborator) left a comment

Thanks for the PR @anyongjin, it's a good contribution!

Overall, the trainer has to become more flexible, but that would be too much to ask for right now. I think we can merge this after some slight changes and then refactor the trainer soon, taking into consideration support for custom scoring and custom conditions for terminating the training.

tianshou/trainer/base.py: 4 review threads (outdated, resolved)
@anyongjin (Contributor, Author)

In essence, average reward and test score are two different things. The former is a fixed indicator of the test result; the latter is a score assigned to that result, and the scoring logic may differ between tasks and users (some take the standard deviation into account, some do not).
Currently, tianshou uses best_reward for both the average reward and the test score, which makes it difficult for users to implement custom scoring logic. So I suggest that best_reward be used only for the average reward, and that best_score be added for the test score, keeping the two concepts separate. If the field were called best_custom_score, people might think there is also a system default score field, so I think it is better not to add 'custom'.

Update:

  • Added explanation for InfoStats.best_score.
  • Fall back to a lambda when compute_score_fn is None, to avoid repeated if-else branches (see the sketch below).
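For reference, a minimal sketch of the fallback described in the second bullet; the attribute and variable names are assumptions and the merged code may differ.

```python
# Inside the trainer setup: if no custom scorer is supplied, default to
# the mean test return, so best_score reproduces the old behaviour of
# best_reward without branching at every use site.
if compute_score_fn is None:
    compute_score_fn = lambda stat: stat.returns_stat.mean
self.compute_score_fn = compute_score_fn
```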

@MischaPanch changed the title from "add evaluate_test_fn to BaseTrainer (Calculate the test batch performance score to determine whether it is the best model)" to "Support computing custom scores and terminating/saving based on them in BaseTrainer" on Aug 14, 2024
@MischaPanch merged commit a38e586 into thu-ml:master on Aug 14, 2024
4 checks passed
@anyongjin mentioned this pull request on Aug 14, 2024