Assignment 1, vi_and_pi.py, line 106 #2

zhenshan-bing · 2019-12-05T21:21:18Z

Is the reward term missing on this line? I think it should be "new_value_function[state] += (reward + gamma * probability * value_function[nextstate])" instead of "new_value_function[state] += (gamma * probability * value_function[nextstate])".
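For reference, here is a minimal sketch of one policy-evaluation sweep with the reward included. It assumes a Gym-style transition table where `P[state][action]` is a list of `(probability, nextstate, reward, terminal)` tuples; that layout, the function name `evaluate_policy_sweep`, and the toy MDP are assumptions for illustration, not taken from the assignment file. Note that the standard Bellman expectation backup weights the reward by the transition probability as well, i.e. `probability * (reward + gamma * value_function[nextstate])`. That matches the form quoted above only when each transition is deterministic.

```python
import numpy as np


def evaluate_policy_sweep(P, policy, value_function, gamma=0.9):
    """One synchronous Bellman expectation backup over all states.

    Assumes P[state][action] is a list of
    (probability, nextstate, reward, terminal) tuples (hypothetical layout).
    """
    new_value_function = np.zeros_like(value_function, dtype=float)
    for state in range(len(value_function)):
        action = policy[state]
        for probability, nextstate, reward, terminal in P[state][action]:
            # The reward sits inside the expectation, so it is weighted
            # by the transition probability together with the bootstrap term.
            new_value_function[state] += probability * (
                reward + gamma * value_function[nextstate]
            )
    return new_value_function


# Toy two-state MDP (made up for this example): from state 0 the single
# action stays put with reward 0 or moves to state 1 with reward 1.
P = {
    0: {0: [(0.5, 0, 0.0, False), (0.5, 1, 1.0, True)]},
    1: {0: [(1.0, 1, 0.0, True)]},
}
policy = np.array([0, 0])
V0 = np.zeros(2)
V1 = evaluate_policy_sweep(P, policy, V0, gamma=0.9)
V2 = evaluate_policy_sweep(P, policy, V1, gamma=0.9)
```

Without the reward term, every sweep starting from an all-zero value function would return zeros again, which is probably the symptom behind this issue.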

SonOfAntonv0 · 2020-04-14T06:13:13Z

I think that needs to be addressed.
