Adding a sample_action method for ActorCritic #4

lemikhovalex · 2021-02-14T08:02:48Z

Hello! I've been learning how to code RL from your repo. I've replaced the duplicated lines of code in
def train
def update_policy

with a call to the agent's method self.sample_action(). The agent now seems to need about twice as many episodes to solve the Cart-Pole problem, and it happens every time. I have no idea what is going on in torch, and I haven't found anything on the Internet.
Can you please help me?

https://github.com/lemikhovalex/pytorch-rl
5_tr - Proximal Policy Optimization (PPO) [CartPole]-Copy1.ipynb
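
To make the refactoring described above concrete, here is a minimal sketch of what such a method could look like. The class layout, layer sizes, and the `sample_action` signature (including the optional `action` argument) are assumptions for illustration only; this is not the code from the notebook.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical


class ActorCritic(nn.Module):
    """Hypothetical actor-critic agent; all sizes here are illustrative."""

    def __init__(self, state_dim=4, hidden_dim=64, n_actions=2):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_actions),
        )
        self.critic = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, state):
        # Returns action logits and a state-value estimate.
        return self.actor(state), self.critic(state)

    def sample_action(self, state, action=None):
        """Assumed signature: sample an action if none is given,
        otherwise evaluate the log-probability of the given one."""
        logits, value = self(state)
        dist = Categorical(logits=logits)
        if action is None:
            action = dist.sample()
        return action, dist.log_prob(action), value.squeeze(-1)


agent = ActorCritic()
states = torch.zeros(3, 4)  # dummy batch of CartPole-sized states

# Rollout (as in train): no gradients are needed while collecting data.
with torch.no_grad():
    actions, old_log_probs, values = agent.sample_action(states)

# Update (as in update_policy): re-evaluate the stored actions with gradients.
_, new_log_probs, new_values = agent.sample_action(states, action=actions)
```

With a shared helper like this, one thing that may be worth checking is whether the update step evaluates the actions stored during the rollout or draws fresh ones. Resampling there would change the PPO probability ratio. This is only a guess at where such a refactoring could alter behaviour, not a diagnosis of the notebook.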
