Poor success rate in complex scenarios #545
Comments
Have you tried picking only one piece out of the 6 per episode?
@astroyat My original thought was that resetting to the start position would be too time-consuming, so I haven't tried this method. Do you have experience with this? In your experiment, did one piece per episode help improve the success rate? I think the essential problem is whether this algorithm can learn planning in a long-horizon picking scenario.
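If resetting is the main cost, one pattern worth trying is to keep recording the long multi-pick episodes and split them offline into single-pick episodes at the gripper-release events. The sketch below is hypothetical; the data layout, function names, and thresholds are assumptions for illustration, not anything from this thread's codebase.

```python
import numpy as np

def split_by_grasp(gripper_pos, close_thresh=0.2, min_len=30):
    """Split one long multi-pick episode into per-pick segments.

    Assumed data layout: `gripper_pos` is a 1-D array of normalized
    gripper openings per timestep (0 = closed, 1 = open). A segment
    ends right after each close->open transition (piece released).
    The thresholds are illustrative values, not tuned ones.
    """
    closed = gripper_pos < close_thresh
    # indices where the gripper re-opens after having been closed
    release = np.where(closed[:-1] & ~closed[1:])[0] + 1
    bounds = [0, *release.tolist(), len(gripper_pos)]
    # drop segments too short to be a real pick
    return [(a, b) for a, b in zip(bounds[:-1], bounds[1:]) if b - a >= min_len]

# e.g. segments = split_by_grasp(episode["gripper_pos"]); each (start, end)
# slice can then be saved as its own single-pick training episode.
```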
I found the success rate is better when the joint-position curves are consistent across all episodes, even with random locations.
I see, thank you so much for the reply. I recorded picking the six Legos one piece per episode; the training loss converged faster and ended lower than in the long-horizon task. Disappointingly, though, the behavior is similar to before: when the Lego grasping order matches what was learned from the dataset it performs well, but when the state makes the order hard to define, ACT starts to collapse. Let me give an example. I recorded data in left-to-right, top-to-bottom order, following a dummy block rule (subjectively determined) indicated by the black dotted line; the numbers show the order in which I grasp the Legos. Upper blocks are prioritized over lower ones, and within a block, left pieces are prioritized over right ones. When a Lego piece sits at the junction of two blocks, the robot arm gets stuck somewhere in the middle (e.g. between the blue and the green piece, as below). Without intervening to relocate the state to somewhere the model has seen, it never self-corrects. I think one reason is that even humans cannot maintain consistent, coherent actions in those situations. Does this mean that unordered grasping sequences are very difficult to learn in the current algorithm framework? Any ideas or comments?
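The stuck-in-the-middle behavior is consistent with mode averaging: if a junction piece was grasped before its neighbor in some episodes and after it in others, visually similar states carry two different next-waypoint targets, and a policy trained with a regression loss (ACT's reconstruction term is L1) can land between them. A minimal numeric sketch, with hypothetical positions:

```python
import numpy as np

# Two demonstrated "modes" for the same ambiguous junction state:
# half the episodes reach for the blue piece first, half for the green.
# All positions are hypothetical (x, y) workspace coordinates.
blue_first = np.array([0.30, 0.10])
green_first = np.array([0.45, 0.12])
targets = np.array([blue_first] * 25 + [green_first] * 25)

# The minimizer of an L2 loss over conflicting targets is their mean,
# which lies between the two pieces -- roughly where the arm stalls.
print(targets.mean(axis=0))  # [0.375 0.11 ]

# An L1 loss picks the coordinate-wise median instead, but with an even
# 25/25 split that is still ambiguous; only a consistent tie-break rule
# in the demonstrations (one piece always first) removes the conflict.
```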
Hi, I used a Moss robot to play with and train an ACT policy. With one Lego piece, it can finish the grasping task at a high success rate after recording 50+ episodes with different pose and location variants, but generalization to multiple pieces at random locations is not promising.
When I started to add complexity (for example, 6 pieces with different colors, like the picture below), I placed the Lego pieces somewhat randomly and recorded each episode continuously until all the pieces were grasped (rather than 1 piece per episode); furthermore, the episodes were recorded in a fixed grasping order.
Here is what I found:
The trained policy cannot work if the gripping sequence is randomized; in other words, it has to keep a fixed spatial order, e.g. from upper left to lower right (see the sketch after this list).
The trained policy cannot work if the [location, color, pose] combination was not seen in the training dataset, especially location combinations.
At first I suspected that the fixed iPhone and Mac cameras could not provide enough depth perception, so I bought a wide-angle USB camera and mounted it on the gripper; the success rate did not improve.
Enlarging the dataset to 120+ episodes did not bring an obvious change.
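A cheap diagnostic for the fixed-order finding above (a hypothetical sketch; the per-episode data layout and row height are assumed for illustration) is to check whether every recorded episode actually follows the same row-then-left-to-right grasp convention, since episodes that violate it hand the policy conflicting targets for similar states:

```python
from collections import Counter

def follows_convention(grasp_xy, row_height=0.05):
    """True if a grasp sequence is sorted by (row bucket, then x).

    grasp_xy: list of (x, y) piece positions in the order they were
    grasped in one episode. `row_height` is an assumed bucket size.
    """
    keys = [(int(y // row_height), x) for x, y in grasp_xy]
    return keys == sorted(keys)

# Two synthetic episodes: the first follows the row-then-x convention;
# in the second, two pieces near a row boundary were grasped in the
# opposite order -- exactly the junction ambiguity described above.
episodes = [
    [(0.10, 0.02), (0.20, 0.03), (0.12, 0.07)],
    [(0.12, 0.051), (0.10, 0.049), (0.20, 0.06)],
]
print(Counter(follows_convention(ep) for ep in episodes))
# Counter({True: 1, False: 1}); a large False count flags inconsistent data.
```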
I was wondering how to improve this task: is the method I used to record data wrong, or is the generalization of ACT limited?
Looking forward to hearing your answers or experiences.