
performance is not improved during training #1

Open
KerryWu16 opened this issue May 6, 2019 · 0 comments

Comments

@KerryWu16

Thank you very much for sharing the code.
I have run both the ddpg_stochastic and ddpg files. My problem is that during training the performance does not improve and always ends up with a reward of approximately -3 or -4.
I read through your paper and checked the code; it looks correct, and I did not find any problem.
So I guessed the cause might be the environment, and printed out the sensor output, the local target, and the states; all of them seem fine (see the checks below).
Then I rewrote the ddpg part and adopted the random switch method, but things still did not get better, and I really do not know what is wrong.
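For reference, the sanity checks I ran on the environment outputs look roughly like this (the function name and the dummy inputs are placeholders for the actual variables in the code, not the repo's API):

```python
import numpy as np

def check_state(obs, local_target):
    """Sanity-check what the agent actually sees.

    `obs` and `local_target` are placeholders for the sensor output and
    local target arrays printed from the environment.
    """
    state = np.concatenate([np.ravel(obs), np.ravel(local_target)])
    assert not np.any(np.isnan(state)), "NaN in state"
    assert not np.any(np.isinf(state)), "Inf in state"
    print("state min/max:", state.min(), state.max())
    print("local target:", np.ravel(local_target))
    return state

# Example with dummy data in place of the real environment outputs:
check_state(np.random.uniform(-1, 1, size=8), np.array([0.5, -0.2]))
```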

May I know if this version of the code works fine on your side? If so, what does the learning curve look like (e.g., after roughly how many iterations does it reach -2, -1, 0, 1, 2, ...)? And do you have any idea what mistake I may have made? Thank you very much for your help.
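For context, this is roughly how I measure the reward numbers I am quoting: a simple moving average over per-episode returns (the data below is dummy, just to make the snippet self-contained):

```python
import numpy as np

def moving_average(rewards, window=100):
    """Smooth per-episode rewards so the trend is visible."""
    rewards = np.asarray(rewards, dtype=np.float64)
    if len(rewards) < window:
        return rewards
    return np.convolve(rewards, np.ones(window) / window, mode="valid")

# `episode_rewards` would be appended once per episode in the training loop;
# dummy data here stands in for my actual logs, which hover around -4.
episode_rewards = -4.0 + 0.5 * np.random.randn(500)
smoothed = moving_average(episode_rewards)
print("smoothed reward (last):", smoothed[-1])
```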
