"How to resolve CUDA error: device-side assert triggered" #24

Open
shuaigeGoku opened this issue Sep 4, 2023 · 1 comment

Comments

@shuaigeGoku

Issue Description:
I encountered the following error while attempting GAE training, and I'm unsure how to resolve it. I've tried multiple approaches, but none have been successful. Please help me find a solution.
Error Message:
C:/w/1/s/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: block: [10,0,0], thread: [32,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Traceback (most recent call last):
File "train.py", line 181, in <module>
batch_pc, compute_graph=True)
File "C:\Users\YourUsername\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "D:\Point2Skeleton\Point2Skeleton\src\SkelPointNet.py", line 238, in forward
A, valid_Mask, known_Mask = self.init_graph(input_pc[..., 0:3], skel_xyz)
File "D:\Point2Skeleton\Point2Skeleton\src\SkelPointNet.py", line 188, in init_graph
A[torch.arange(bn)[:, None], knn_sp2sk[:, :, 1], knn_sp2sk[:, :, 0]] = 1
RuntimeError: CUDA error: device-side assert triggered

Environment Information:
Operating System: Windows 11
Python Version: Python 3.7.8
PyTorch Version: 1.1.0
CUDA Version: 10.0.13
Issue Background:
I'm attempting to perform GAE training.

@clinplayer
Owner

Hey, I can't reproduce this issue. I think the error is caused by indexing a CUDA tensor with an index that falls outside the size of one of its dimensions. Specifically, the message comes from a device-side assertion in PyTorch's CUDA indexing kernel (IndexKernel.cu in the traceback), which checks that every index used in tensor indexing lies within the tensor's size along that dimension.
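
For reference, the condition in the failed assertion (`index >= -sizes[i] && index < sizes[i]`) can be restated in plain Python. This is only an illustrative sketch; `index_in_bounds` is a hypothetical helper name, not part of PyTorch or this repository.

```python
def index_in_bounds(index: int, size: int) -> bool:
    """Restate the asserted condition: -size <= index < size.

    Negative indices count from the end, so -size is the lowest valid value
    and size - 1 is the highest.
    """
    return -size <= index < size


# For a dimension of size 4, the valid indices are -4 .. 3.
for candidate in (-5, -4, 0, 3, 4):
    print(candidate, index_in_bounds(candidate, 4))
```

Any index for which this returns False triggers the assertion on the GPU.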

Could you please check whether you changed the code in a way that puts the indices out of bounds for the adjacency matrix A? I would recommend printing the shape of A, along with the shape and the min/max values of the index tensor used for each dimension, to find the cause of the error.
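
A minimal sketch of that check, assuming the indexing pattern from line 188 of the traceback (`A[torch.arange(bn)[:, None], knn_sp2sk[:, :, 1], knn_sp2sk[:, :, 0]] = 1`). The helper `report_out_of_bounds` is hypothetical, and the toy shapes and values below are invented for illustration; they are not the real tensors from `init_graph`.

```python
import torch


def report_out_of_bounds(target, indices):
    """Return (dim, min, max, size) for each index tensor that does not fit `target`.

    `indices` holds one integer index tensor per leading dimension of `target`,
    in the same order they appear inside the square brackets.
    """
    problems = []
    for dim, idx in enumerate(indices):
        size = target.shape[dim]
        lo, hi = int(idx.min()), int(idx.max())
        # Same rule as the device-side assertion: -size <= index < size.
        if lo < -size or hi >= size:
            problems.append((dim, lo, hi, size))
    return problems


# Toy stand-ins (assumed shapes): A is (bn, skel_num, skel_num).
bn, skel_num = 2, 4
A = torch.zeros(bn, skel_num, skel_num)
knn_sp2sk = torch.tensor([[[0, 1], [2, 3]],
                          [[1, 4], [3, 0]]])  # the 4 is deliberately out of range

problems = report_out_of_bounds(
    A,
    [torch.arange(bn)[:, None], knn_sp2sk[:, :, 1], knn_sp2sk[:, :, 0]],
)
for dim, lo, hi, size in problems:
    print(f"dim {dim}: index range [{lo}, {hi}] does not fit size {size}")
```

Calling such a check just before the failing assignment should show which dimension is at fault. Moving the tensors to the CPU with `.cpu()` before indexing, or setting the `CUDA_LAUNCH_BLOCKING=1` environment variable, may also give a more precise error location than the asynchronous CUDA assert.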
