-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Actor and distribution function should not be specified independently #1194
Labels
enhancement
Feature that is not a new algorithm or an algorithm enhancement
refactoring
No change to functionality
Comments
opcode81
added
enhancement
Feature that is not a new algorithm or an algorithm enhancement
refactoring
No change to functionality
labels
Aug 7, 2024
opcode81
changed the title
Actor outputs and distribution functions should be specified independently
Actor outputs and distribution functions should *not* be specified independently
Aug 7, 2024
opcode81
changed the title
Actor outputs and distribution functions should *not* be specified independently
Actor and distribution function should not be specified independently
Aug 7, 2024
opcode81
added a commit
that referenced
this issue
Aug 7, 2024
distribution function (dist_fn) used in policies by creating the distribution function in the actor factory which knows which function is appropriate. Consequently, remove the policy parameter 'dist_fn' from the high-level API because it is determined automatically, eliminating the possibility of misspecification by the user. [breaking change: code must not specify the 'dist_fn' parameter, but persisted objects continue to work as expected] Implements #1194
Merged
8 tasks
opcode81
added a commit
that referenced
this issue
Aug 7, 2024
distribution function (dist_fn) used in policies by creating the distribution function in the actor factory which knows which function is appropriate. Consequently, remove the policy parameter 'dist_fn' from the high-level API because it is determined automatically, eliminating the possibility of misspecification by the user. [breaking change: code must not specify the 'dist_fn' parameter, but persisted objects continue to work as expected] Implements #1194
opcode81
added a commit
that referenced
this issue
Aug 7, 2024
distribution function (dist_fn) used in policies by creating the distribution function in the actor factory which knows which function is appropriate. Consequently, remove the policy parameter 'dist_fn' from the high-level API because it is determined automatically, eliminating the possibility of misspecification by the user. [breaking change: code must not specify the 'dist_fn' parameter, but persisted objects continue to work as expected] Implements #1194
opcode81
added a commit
that referenced
this issue
Aug 7, 2024
distribution function (dist_fn) used in policies by creating the distribution function in the actor factory which knows which function is appropriate. Consequently, remove the policy parameter 'dist_fn' from the high-level API because it is determined automatically, eliminating the possibility of misspecification by the user. [breaking change: code must not specify the 'dist_fn' parameter, but persisted objects continue to work as expected] Implements #1194
opcode81
added a commit
that referenced
this issue
Aug 7, 2024
distribution function (dist_fn) used in policies by creating the distribution function in the actor factory which knows which function is appropriate. Consequently, remove the policy parameter 'dist_fn' from the high-level API because it is determined automatically, eliminating the possibility of misspecification by the user. [breaking change: code must not specify the 'dist_fn' parameter, but persisted objects continue to work as expected] Implements #1194
MischaPanch
added a commit
that referenced
this issue
Aug 8, 2024
…tribution function (#1195) Establish a strong link between the actor and the distribution function (dist_fn) used in policies by creating the distribution function in the actor factory which knows which function is appropriate. Consequently, remove the policy parameter 'dist_fn' from the high-level API because it is determined automatically, eliminating the possibility of misspecification by the user. [breaking change: code must not specify the 'dist_fn' parameter, but persisted objects continue to work as expected] Implements #1194 - [X] I have added the correct label(s) to this Pull Request or linked the relevant issue(s) - [X] I have provided a description of the changes in this Pull Request - [X] I have added documentation for my changes and have listed relevant changes in CHANGELOG.md - [X] If applicable, I have added tests to cover my changes. - [X] I have reformatted the code using `poe format` - [X] I have checked style and types with `poe lint` and `poe type-check` - [ ] (Optional) I ran tests locally with `poe test` (or a subset of them with `poe test-reduced`) ,and they pass - [ ] (Optional) I have tested that documentation builds correctly with `poe doc-build`
Closed by #1195 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
enhancement
Feature that is not a new algorithm or an algorithm enhancement
refactoring
No change to functionality
Currently, the distribution function (e.g. for PPO) and the actor are specified independently, yet there is obviously a very strong connection: The nature of the actor's output determines the distribution function to apply.
This can cause errors that are hard to debug, especially when the input to the distribution function can be interpreted but its semantics are skewed.
Therefore, I propose to create a stronger link, which we can already implement at least in the high-level API:
The ActorFactory, which knows the nature of the actor's output, shall also create the matching distribution function, eliminating the issue. In turn, the
dist_fn
parameter can be entirely removed from the high-level API (breaking change).The text was updated successfully, but these errors were encountered: