As described above, when I first retrieved a tensordict by index (try 1 & 2), the "update" entry was not replaced.
However, when I first retrieved the tensor with the key "update" (try 3), the replacement succeeded.
(By the way, I also noticed that assigning a whole tensordict back worked as expected, as shown below.)
# try 4
batch.set_("update", new_update)
buffer[index] = batch # This replaces like above!
Although there are currently some workarounds (try 3 and 4), this behavior is confusing, and some users may try 1 and 2 without realizing the problem, as I did.
A possible fix is to change the try 1 and 2 code paths so that both also update the data.
Hey!
This is unfortunately kind of expected. It's a PyTorch thing, not really an RL / tensordict artifact. Think of tensordict as a regular dict:
when you do td[index] and index is a tensor, we return a new tensor with a copy of the data (i.e. the data does not share memory with the original). If the index is a slice or an int, the second example will update (try with index = slice(len(index[0]))). So as soon as you do buffer[index] we're screwed.
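For anyone landing here, the copy-versus-view rule can be seen on a plain tensor. This is a minimal sketch with made-up values, not code from the issue:

```python
import torch

data = torch.zeros(5)

# Basic indexing (int or slice) returns a view that shares memory with `data`.
view = data[0:2]
view.fill_(1.0)
after_view_write = data.clone()

# Advanced indexing (tensor index) returns a copy, so writing into it
# leaves `data` untouched.
index = torch.tensor([3, 4])
copied = data[index]
copied.fill_(7.0)
after_copy_write = data.clone()

# Assigning through the index on the original tensor does write in place.
data[index] = 9.0
```

The first write shows up in `data`, the second one does not, and the third one does because it goes through `__setitem__` on the original tensor.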
There is however a set_at_ method in tensordict to handle this:
buffer[:].set_at_("update", ~new_update, index)
that will do the trick.
That being said, I do agree it's confusing and we should ideally find ways to prevent the confusion. I'm open to suggestions.
What would be the most helpful? Better doc? Find a way to raise an error/warning if people do (1) or (2) (not sure it's achievable)?
Thank you for your kind explanation! Now I understand PyTorch's indexing rule.
I tested int/slice for try-1/2 and confirmed that only try-2 worked, and then I wondered why int/slice for try-1 did not work. If int/slice indexing returns a reference to that part of the data, shouldn't try-1 work as well?
Ah yeah, an update method would indeed be a killer feature!
Didn't really think about that but it makes total sense.
Perhaps set_at_, set_, update_ to keep the tensordict nomenclature?
I tried to replace/update some parts of the data stored in ReplayBuffer with new tensors.
Although the functionality was added (#2209), I found an unexpected behavior, as follows:
This prints something like:
(Finally, my system info is like below)