This repository has been archived by the owner on Oct 26, 2023. It is now read-only.

Nonstationary bandit #6

Open
wants to merge 6 commits into
base: master

Conversation

MouseAndKeyboard

Adds a new bandit whose true action values q*(a) move over time.
Every q*(a) changes after each step.
This setup is described in Reinforcement Learning: An Introduction (Sutton and Barto),
Section 2.5, "Tracking a Nonstationary Problem".
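
For readers who want a concrete picture of the behaviour described above, here is a minimal sketch, not the code in this PR. It assumes each arm's q*(a) moves by a fixed per-arm amount after every step and that rewards are Gaussian around the current q*(a). The names `ConstantShiftBandit`, `q_start`, `shifts`, and `reward_std` are hypothetical and chosen only for illustration.

```python
import random


class ConstantShiftBandit:
    """Hypothetical k-armed bandit whose q*(a) values drift after every step."""

    def __init__(self, q_start, shifts, reward_std=1.0, seed=None):
        if len(q_start) != len(shifts):
            raise ValueError("q_start and shifts must have one entry per arm")
        self.q_star = list(q_start)    # current true action values q*(a)
        self.shifts = list(shifts)     # assumed constant per-step change for each arm
        self.reward_std = reward_std   # assumed Gaussian reward noise
        self.rng = random.Random(seed)

    def step(self, action):
        # Sample the reward from the current q*(a) before the values move.
        reward = self.rng.gauss(self.q_star[action], self.reward_std)
        # Shift every arm, not only the one that was pulled.
        self.q_star = [q + d for q, d in zip(self.q_star, self.shifts)]
        return reward
```

With `shifts=[0.1, -0.1]`, for example, the first arm improves and the second degrades by 0.1 per step, so an agent that stops exploring will eventually be pulling the wrong arm.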

Potential future addition (a further new bandit):
A non-stationary bandit where the shift in each q*(a) value is drawn from a normal distribution at every step, rather than being a constant amount.
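
The proposed variant could look roughly like the sketch below, which applies an independent zero-mean Gaussian increment to each q*(a) on every step (a random walk). The function name `random_walk_shift` and the default `shift_std=0.01` are assumptions for illustration, not part of this PR.

```python
import random


def random_walk_shift(q_star, shift_std=0.01, rng=random):
    """Return new q*(a) values after one normally distributed shift per arm.

    shift_std is an assumed default; rng may be the random module or a
    random.Random instance for reproducible runs.
    """
    return [q + rng.gauss(0.0, shift_std) for q in q_star]
```

Calling this once per step in place of a constant shift would make the values drift without a fixed direction, so no arm is guaranteed to stay the best.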
