Nonstationary bandit #6

MouseAndKeyboard · 2020-01-15T03:55:22Z

Adds a new bandit whose true action values q*(a) are nonstationary: they change after every step.
This setting is described in Reinforcement Learning: An Introduction (Sutton and Barto), Section 2.5, "Tracking a Nonstationary Problem".
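
For readers who have not opened the diff, here is a minimal sketch of the idea, not the code in this PR. It assumes each arm has its own fixed per-step drift and that rewards are Gaussian around the current q*(a); the class name `ConstantDriftBandit` and the parameters `initial_values`, `drifts`, and `reward_std` are hypothetical and chosen only for illustration.

```python
import random


class ConstantDriftBandit:
    """Illustrative k-armed bandit whose true values q*(a) move by a fixed amount each step.

    Hypothetical sketch: names and parameters are assumptions, not this PR's API.
    """

    def __init__(self, initial_values, drifts, reward_std=1.0, seed=None):
        if len(initial_values) != len(drifts):
            raise ValueError("initial_values and drifts must have the same length")
        self.q_star = list(initial_values)  # current true action values
        self.drifts = list(drifts)          # assumed constant per-step change for each arm
        self.reward_std = reward_std
        self.rng = random.Random(seed)

    def step(self, action):
        # Sample the reward from the value the arm has right now.
        reward = self.rng.gauss(self.q_star[action], self.reward_std)
        # Then shift every arm's true value by its constant drift.
        self.q_star = [q + d for q, d in zip(self.q_star, self.drifts)]
        return reward
```

With `reward_std=0.0` the reward equals the current q*(a), which makes the drift easy to observe: pulling the same arm repeatedly returns values that move by that arm's drift each step.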

Potential future addition:
A nonstationary bandit where each shift in q*(a) is drawn from a normal distribution, rather than being a constant amount each step.
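
A rough sketch of what that variant could look like, under stated assumptions: all arms start at the same value and every arm takes an independent zero-mean Gaussian random walk after each step, similar to the setup in Exercise 2.5 of the book (which uses a standard deviation of 0.01). The class name `RandomWalkBandit` and its parameters are hypothetical and not part of this PR.

```python
import random


class RandomWalkBandit:
    """Illustrative k-armed bandit whose q*(a) values take Gaussian random-walk steps.

    Hypothetical sketch of the proposed future bandit, not an existing environment.
    """

    def __init__(self, k=10, walk_std=0.01, reward_std=1.0, seed=None):
        self.q_star = [0.0] * k      # assumed: all arms start with equal true values
        self.walk_std = walk_std     # std of the per-step shift in q*(a)
        self.reward_std = reward_std
        self.rng = random.Random(seed)

    def step(self, action):
        # Reward is drawn around the arm's current true value.
        reward = self.rng.gauss(self.q_star[action], self.reward_std)
        # Every arm then moves by an independent normally distributed increment.
        self.q_star = [q + self.rng.gauss(0.0, self.walk_std) for q in self.q_star]
        return reward
```

Setting `walk_std=0.0` reduces this to a stationary bandit, which is a convenient sanity check when comparing agents.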

MouseAndKeyboard added 6 commits January 15, 2020 10:22

added constant-valued non-stationary bandit to gym register

afbac87

added non-stationary bandit implementation

518adc2

register nonstationary bandit

8cc8ea1

added import for non-stationary bandit

13efda1

updated wrapper on nonstationary bandit to work correct

39b12e5

updated readme to include the non-stationary bandit

3daa11e
