Training losses of UrbanDriver easily diverge. Are there any training tricks for stabilizing the training process? #383
Comments
Hi @shubaozhang, none of the authors of this work are at Level-5 anymore; you can try reaching out to them directly. With that said, my personal take is that this doesn't seem to be an offline RL method, for several reasons (e.g. you are not optimizing for expected reward but for an imitation loss, the loss minimization and the simulation are differentiable, there is no exploration, etc.), so it is quite different from an RL setting, or the setting in which the policy gradient theorem is derived.
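A rough way to write down the distinction (my own paraphrase, not the authors' formulation): what is trained here is a supervised imitation loss against the logged expert trajectory, whereas policy-gradient RL maximizes expected return over trajectories sampled from the policy itself.

$$\min_\theta \; \mathbb{E}_{s \sim \mathcal{D}}\left[\ell\left(\pi_\theta(s),\; a^{\mathrm{expert}}(s)\right)\right] \quad \text{(imitation loss, as optimized here)}$$

$$\max_\theta \; \mathbb{E}_{\tau \sim \pi_\theta}\left[\sum_t \gamma^t\, r(s_t, a_t)\right] \quad \text{(expected return, as in policy-gradient RL)}$$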
Thanks for your reply.
Hi @shubaozhang, were you ever able to figure out the issues with training? I'm also facing difficulties in getting my trained UrbanDriver model (specifically the open-loop model with history) to match the performance of the pretrained models provided.
The parameters history_num_frames_ego, future_num_frames, etc. affect the training a lot. I use the following parameters, and the training loss converges.
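For readers following along, these parameters live under model_params in the urban_driver example config and can be adjusted on the loaded config dict. The values below are placeholders of my own, not the settings from the comment above; tune them against your own training curve.

```python
from l5kit.configs import load_config_data

# Load the urban_driver example config (path assumed relative to the example folder).
cfg = load_config_data("./config.yaml")

# Placeholder values (not the commenter's actual settings): these are the knobs
# called out above as having a large effect on whether the loss converges.
cfg["model_params"]["history_num_frames_ego"] = 1
cfg["model_params"]["history_num_frames_agents"] = 3
cfg["model_params"]["future_num_frames"] = 12

print(cfg["model_params"])
```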
Hi @shubaozhang, thanks for these configurations! Have you tried evaluating your trained model in closed_loop_test.ipynb and visualizing the scenes? My trained Urban Driver model (with the configs above as well as the default configs), after 150k iterations on train_full.zarr, still seems to converge to a degenerate solution (such as just driving straight ahead regardless of the map), whereas the pre-trained BPTT.pt provided does not have this issue. Were there any additional changes you had to make (such as to closed_loop_model.py or open_loop_model.py)?

Also, by any chance, have you tried loading the state dict of the pretrained BPTT.pt model (as opposed to directly using the JIT model)? It seems the provided configs do not work when trying to load the state dict. I had to change d_local from 256 to 128 in open_loop_model.py to get the pretrained state dict to load into my model, and there seem to be other mismatches (the shape of result['positions'] differs between the BPTT.pt JIT model and my model where I load in the state dict of BPTT.pt).
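In case it helps others hitting the same mismatch, here is a minimal sketch (my own, not from the maintainers) of the two loading paths discussed above. It assumes BPTT.pt is available locally; the final line is commented out because my_model stands in for a VectorizedModel you construct yourself, and its hyper-parameters (e.g. the local embedding size) must match the checkpoint exactly for load_state_dict to succeed.

```python
import torch

# Path A: use the released TorchScript model directly, as the example notebooks do.
jit_model = torch.jit.load("BPTT.pt", map_location="cpu").eval()

# Path B: pull the weights out of the TorchScript archive to load into a model
# built from your own configs. Inspect the shapes first: any mismatch with your
# architecture (e.g. a different local embedding size) will make
# load_state_dict fail with size-mismatch errors like the one described above.
state_dict = jit_model.state_dict()
for name, tensor in list(state_dict.items())[:10]:
    print(name, tuple(tensor.shape))

# my_model.load_state_dict(state_dict)  # my_model: your own VectorizedModel instance (hypothetical here)
```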
Any training tricks for stabilizing the training process of UrbanDriver?