Go2 support #110
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
So I've worked on the full collision mesh and examples, and I have successfully trained Joystick, Handstand, Footstand, and Getup. The policies need some reward tuning, but training works. Let me know if I need to do anything else.
Note: the actuator order in the MJX model for Go2 does not follow the Unitree leg ordering FR/FL/RR/RL; in the MJX model it is FL/FR/RL/RR. Just a note, since simply forwarding the actions in the default order to LowCmd mixes up the joints. Should this be fixed in the MJX model, or is it an implementation detail left to the driver?
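For anyone wiring this into a driver, here is a minimal sketch of that remap. The per-leg hip/thigh/calf sub-order and all names below are illustrative assumptions, not part of this PR; it only shows one way to map the MJX order FL/FR/RL/RR onto the LowCmd order FR/FL/RR/RL with a fixed index permutation:

```python
import numpy as np

# Assumed MJX actuator order (per the note above): FL, FR, RL, RR,
# with hip/thigh/calf within each leg -> 12 indices, 0..11.
# Unitree LowCmd expects legs in the order FR, FL, RR, RL.
# LOWCMD_FROM_MJX[i] is the MJX index that feeds LowCmd slot i.
LOWCMD_FROM_MJX = np.array([
    3, 4, 5,     # FR  <- MJX FR (indices 3-5)
    0, 1, 2,     # FL  <- MJX FL (indices 0-2)
    9, 10, 11,   # RR  <- MJX RR (indices 9-11)
    6, 7, 8,     # RL  <- MJX RL (indices 6-8)
])

def mjx_action_to_lowcmd(action: np.ndarray) -> np.ndarray:
    """Reorder a 12-dim MJX action vector into LowCmd joint order."""
    return np.asarray(action)[LOWCMD_FROM_MJX]
```

One way to double-check the model-side order before trusting any hard-coded permutation is to load the XML with the mujoco Python bindings and print `model.actuator(i).name` for each actuator.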
Hello! Have you successfully trained Go2Getup and transferred it (sim2real) to a real robot? I found that a single training run like this does not work: python train_jax_ppo.py --env_name=Go2Getup
Yes, I trained the Joystick policy and transferred it to a real Go2 successfully. The Getup and Handstand tasks are copied straight from Go1, but from quick tests they did produce successful policies in sim; I didn't transfer those to the real Go2.
For Getup there is an open issue about the same thing for Go1, so you might have a look at #65.
That's the key point! It's mentioned in #65 that 50M timesteps is not enough for training Go1Getup. But should I train for 750M timesteps in one run, or train 50M timesteps at a time and repeatedly reload checkpoints? More importantly, the paper mentions:
How should these two tricks be added to the training process?
That's a bit off-topic for this PR; I'd suggest asking directly in that issue, since it's pretty much the same problem and the Go1/Go2 architectures are very similar.
Hi @aatb-ch thanks for the PR! I'll try to get to this after the CoRL supplemental deadline (probs end of week). |
Thank you! I have reproduced it after training for 750M timesteps.
@kevinzakka Super, yeah no stress; just let me know once you have time whether I need to change anything.
This PR adds Unitree Go2 support, based on the existing Go1 support. I used the Menagerie Go2 MJX model and adjusted it accordingly to add the correct sensors, collisions, etc.
TODO: adjust the full-collision MJX model. I'm not 100% sure, but some things seem to be missing, so I have to go through the Go1 mesh and compare, then test Getup/Handstand before adding those tasks.
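As a rough aid for that comparison, here is a minimal sketch that lists which geoms actually participate in collisions in each model; the file paths are placeholders and should be pointed at wherever the Go1/Go2 MJX XMLs live in your checkout:

```python
import mujoco

def collision_geoms(xml_path: str) -> set[str]:
    """Return names of geoms that can take part in contacts."""
    model = mujoco.MjModel.from_xml_path(xml_path)
    return {
        model.geom(i).name
        for i in range(model.ngeom)
        if model.geom_contype[i] != 0 or model.geom_conaffinity[i] != 0
    }

# Hypothetical paths; adjust to the actual Go1/Go2 MJX XML locations.
go1 = collision_geoms("go1_mjx_fullcollisions.xml")
go2 = collision_geoms("go2_mjx_fullcollisions.xml")
print("collision geoms only in Go1:", sorted(go1 - go2))
print("collision geoms only in Go2:", sorted(go2 - go1))
```

Geoms that show up in only one of the two sets are the first places to look for missing or extra collision geometry.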