About the average reward #3

Tendimension · 2016-12-02T02:47:33Z

I have run 20 million frames (time steps) in the Breakout environment, but the average reward has not changed. In Asynchronous Methods for Deep Reinforcement Learning, the average reward has changed after about 17 million steps. I do not know where the problem is.

kkjh0723 · 2016-12-03T03:44:30Z

@Tendimension Did you find the reason? I have the same problem. The avg. reward stays at 2.0 and the std. at 0.0 up to 20 million frames. Does the reward go up after some period?
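
A std. of 0.0 means every evaluation episode returned exactly the same score, which might point to the agent repeating one fixed behaviour. A minimal sketch of that check, assuming the per-episode rewards are collected in a plain list (the names `summarize_rewards` and `looks_stuck` and the sample values are hypothetical, not taken from this repository):

```python
import statistics


def summarize_rewards(episode_rewards):
    """Return (mean, population std) of a list of per-episode rewards."""
    mean = statistics.fmean(episode_rewards)
    std = statistics.pstdev(episode_rewards)
    return mean, std


def looks_stuck(episode_rewards, tol=1e-8):
    """True when all episodes scored (almost) the same, i.e. std is ~0."""
    _, std = summarize_rewards(episode_rewards)
    return std <= tol


# Hypothetical evaluation run: ten episodes that all scored 2.0.
rewards = [2.0] * 10
print(summarize_rewards(rewards))
print(looks_stuck(rewards))
```

If `looks_stuck` stays true across many evaluations, it may be worth logging the chosen actions to see whether the policy has collapsed to a single one.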

Tendimension · 2016-12-03T04:10:38Z

@kkjh0723 I do not know what the reason is.

yao62995 · 2016-12-07T02:56:17Z

@Tendimension @kkjh0723 I also found this bug. I will check it soon.

Tendimension · 2016-12-07T08:16:28Z

@yao62995 Thanks a million!

kkjh0723 · 2017-01-15T14:06:57Z

@yao62995 Do you have any updates on this problem?

andyxzq · 2018-02-12T10:07:43Z

I see the same issue. The average reward is still 0.0 after 1 million steps.
