Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Feat/refactor collector (thu-ml#1063)
Closes: thu-ml#1058 ### Api Extensions - Batch received two new methods: `to_dict` and `to_list_of_dicts`. thu-ml#1063 - `Collector`s can now be closed, and their reset is more granular. thu-ml#1063 - Trainers can control whether collectors should be reset prior to training. thu-ml#1063 - Convenience constructor for `CollectStats` called `with_autogenerated_stats`. thu-ml#1063 ### Internal Improvements - `Collector`s rely less on state, the few stateful things are stored explicitly instead of through a `.data` attribute. thu-ml#1063 - Introduced a first iteration of a naming convention for vars in `Collector`s. thu-ml#1063 - Generally improved readability of Collector code and associated tests (still quite some way to go). thu-ml#1063 - Improved typing for `exploration_noise` and within Collector. thu-ml#1063 ### Breaking Changes - Removed `.data` attribute from `Collector` and its child classes. thu-ml#1063 - Collectors no longer reset the environment on initialization. Instead, the user might have to call `reset` expicitly or pass `reset_before_collect=True` . thu-ml#1063 - VectorEnvs now return an array of info-dicts on reset instead of a list. thu-ml#1063 - Fixed `iter(Batch(...)` which now behaves the same way as `Batch(...).__iter__()`. Can be considered a bugfix. thu-ml#1063 --------- Co-authored-by: Michael Panchenko <[email protected]>
- Loading branch information