
performance is not improved during training #1

Open
KerryWu16 opened this issue May 6, 2019 · 0 comments

Comments

@KerryWu16

Thank you very much for sharing the code.
I have run both the ddpg_stochastic and ddpg files. My problem is that during training the performance does not improve and always ends up with a reward of approximately -3 or -4.
I read through your paper and checked the code; it looks correct, and I did not find any problem.
So I guessed the cause might be the environment, and printed out the sensor output, the local target, and the states; all of them seem fine (see the checks below).
Then I rewrote the ddpg part and adopted the random switch method, but things still did not get better, and I really do not know what is wrong.
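For reference, the sanity checks I ran on the environment outputs look roughly like this (the function name and the dummy inputs are placeholders for the actual variables in the code, not the repo's API):

```python
import numpy as np

def check_state(obs, local_target):
    """Sanity-check what the agent actually sees.

    `obs` and `local_target` are placeholders for the sensor output and
    local target arrays printed from the environment.
    """
    state = np.concatenate([np.ravel(obs), np.ravel(local_target)])
    assert not np.any(np.isnan(state)), "NaN in state"
    assert not np.any(np.isinf(state)), "Inf in state"
    print("state min/max:", state.min(), state.max())
    print("local target:", np.ravel(local_target))
    return state

# Example with dummy data in place of the real environment outputs:
check_state(np.random.uniform(-1, 1, size=8), np.array([0.5, -0.2]))
```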

May I know if this version of the code works fine on your side? If so, what does the learning curve look like (e.g., after roughly how many iterations does it reach -2, -1, 0, 1, 2, ...)? And do you have any idea what mistake I may have made? Thank you very much for your help.
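For context, this is roughly how I measure the reward numbers I am quoting: a simple moving average over per-episode returns (the data below is dummy, just to make the snippet self-contained):

```python
import numpy as np

def moving_average(rewards, window=100):
    """Smooth per-episode rewards so the trend is visible."""
    rewards = np.asarray(rewards, dtype=np.float64)
    if len(rewards) < window:
        return rewards
    return np.convolve(rewards, np.ones(window) / window, mode="valid")

# `episode_rewards` would be appended once per episode in the training loop;
# dummy data here stands in for my actual logs, which hover around -4.
episode_rewards = -4.0 + 0.5 * np.random.randn(500)
smoothed = moving_average(episode_rewards)
print("smoothed reward (last):", smoothed[-1])
```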
