Is this a duplicate?
Is this for new documentation, or an update to existing docs?
New
Describe the incorrect/future/missing documentation
As far as I know, the high-level APIs for decoupled look-back have no alignment requirements, but this does not seem to be the case if you use cub::ScanTileState directly. I had to use compute-sanitizer to figure out why my code crashed, and it seems that d_temp_storage needs to be 8-byte aligned in this case.
I am making this a documentation request, not a bug report, as this behavior seems correct to me (if I'm working at this low level, I don't want CUB padding or aligning pointers behind my back). I just suggest that this be documented.
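To illustrate what a caller might do on their side, here is a minimal host-side sketch of checking and rounding up a pointer before handing it to the tile state as d_temp_storage. The helper names (`is_aligned`, `align_up`, `carve_aligned`) and the constant `kTileStateAlignment` are hypothetical and not part of CUB. The 8-byte figure is only what I observed in my case, not a documented guarantee, and it may well depend on the value type. The helpers only do integer arithmetic on the address, so they can be applied to device pointers without dereferencing them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Alignment observed in this report for the ScanTileState temp storage.
// Assumption: not a documented CUB guarantee; adjust if your case differs.
constexpr std::size_t kTileStateAlignment = 8;

// Returns true if the address held by `ptr` is a multiple of `alignment` bytes.
inline bool is_aligned(const void* ptr, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Rounds a byte offset up to the next multiple of `alignment`.
// `alignment` must be a non-zero power of two.
inline std::size_t align_up(std::size_t offset, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: returns the first suitably aligned address at or after
// `base + offset`. The pointer is never dereferenced, only its value is used.
inline void* carve_aligned(void* base, std::size_t offset, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(base) + offset;
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    return reinterpret_cast<void*>((addr + mask) & ~mask);
}
```

A pointer returned directly by cudaMalloc is already aligned far more strictly than this, so the problem should only show up when the temp storage is carved out of a larger buffer at an arbitrary offset. In that situation the allocation has to be over-sized by up to `kTileStateAlignment - 1` bytes so the rounded-up pointer still leaves enough room.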
If this is a correction, please provide a link to the incorrect documentation. If this is a new documentation request, please link to where you have looked.
I was just copying the example code here: https://github.com/NVIDIA/cccl/blob/main/cub/examples/device/example_device_decoupled_look_back.cu
I actually didn't see any documentation for ScanTileState in the HTML docs. I just looked around in the header files and didn't find anything about the alignment issue. In fact, I was not really sure this is part of the public API, except that Jake Hemstad told me it was (NVIDIA-internal link): https://nvidia.slack.com/archives/C07DK328L/p1697046387099159?thread_ts=1696996019.051079&cid=C07DK328L
This seems to be distinct from this issue: #910 (I'm specifically not expecting cub to align pointers for me)