Input type (torch.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor #84

Open
michelle-chou25 opened this issue Oct 20, 2022 · 15 comments
Labels
bug Something isn't working

Comments

@michelle-chou25

Dear Yuan,

I hit this issue when running demo.py. It occurred at line 29 of ast_models.py,
self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size, stride=patch_size)
with the following error message:
Input type (torch.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor.
Would you mind taking a look at it?
I use:
timm=0.4.5
torch = 1.10.1+cu102
torchaudio = 0.10.1+cu102
torchvision = 0.11.2+cu102

Thank you
Best Regards,
Nanjun

@YuanGongND
Owner

YuanGongND commented Oct 20, 2022

Hi Nanjun,

This typically means your input and model are not on the same device (i.e., one is on the CPU and the other is on the GPU), which can be solved by

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
input = input.to(device)

May I ask which demo script you are running? We have a colab demo at https://colab.research.google.com/github/YuanGongND/ast/blob/master/colab/AST_Inference_Demo.ipynb, which should be bug-free.

-Yuan
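
The error message in this issue names both a device difference (CPU vs. CUDA) and a dtype difference (float vs. half), so it can help to align both at once. The following is only a minimal sketch, not code from this repository: `match_model` is a hypothetical helper, and the small `nn.Conv2d` merely stands in for the patch-embedding layer.

```python
import torch
import torch.nn as nn


def match_model(x: torch.Tensor, model: nn.Module) -> torch.Tensor:
    """Cast x to the device and dtype of the model's first parameter.

    Hypothetical helper for illustration; it assumes all parameters of
    the model share one device and one dtype.
    """
    ref = next(model.parameters())
    return x.to(device=ref.device, dtype=ref.dtype)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the patch-embedding convolution (not the real AST model).
proj = nn.Conv2d(1, 8, kernel_size=16, stride=16).to(device)

x = torch.rand(2, 1, 128, 64)  # created on the CPU as float32
x = match_model(x, proj)       # now on the same device/dtype as proj
out = proj(x)
```

Reading the target device and dtype from the model itself avoids hard-coding "cuda" or calling .half() by hand at every call site.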

@YuanGongND YuanGongND added the bug Something isn't working label Oct 20, 2022
@michelle-chou25
Author

Dear Yuan,

The file I ran was src/demo.py. I also ran the Jupyter notebook demo and didn't have this issue.
I debugged the code: in self.proj(x), x.mlp_head.weight is on cuda, but this error occurs when self.proj(x) is executed.

Best Regards,
Nanjun
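
When checking tensors in a debugger like this, it may be quicker to summarize where every parameter and buffer of the model lives rather than inspecting one weight. A rough sketch follows; `placement_summary` is a hypothetical name, and the tiny `nn.Sequential` is only a placeholder for the real model.

```python
from collections import Counter

import torch
import torch.nn as nn


def placement_summary(model: nn.Module) -> dict:
    """Count parameters and buffers per (device, dtype) pair."""
    counts = Counter()
    tensors = list(model.named_parameters()) + list(model.named_buffers())
    for _name, t in tensors:
        counts[(str(t.device), str(t.dtype))] += 1
    return dict(counts)


# Placeholder model: two layers, each with a weight and a bias.
model = nn.Sequential(nn.Linear(4, 4), nn.LayerNorm(4))
uniform = placement_summary(model)

# Casting only one submodule produces a mixed-dtype model.
model[1].double()
mixed = placement_summary(model)
```

More than one key in the returned dictionary means the model is split across devices or dtypes, which is one way the error in this issue can arise even after moving the input.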

@michelle-chou25
Author

And this error is still there after I set both the model and input to cuda.
I'll check it again by changing cudatoolkit to another version.

@YuanGongND
Owner

What if you run the Jupyter script with your environment instead of the Google Colab one? If there is no error, then it's not your environment's problem.

@YuanGongND
Owner

I think setting the pretrain flags could also help:

ast_mdl = ASTModel(label_dim=label_dim, input_tdim=input_tdim, imagenet_pretrain=False, audioset_pretrain=False)

@michelle-chou25
Author

I failed to run the Jupyter script on my local machine. It said it can't find the path '/content/ast/'; it seems my IDE failed to connect to Colab.

@YuanGongND
Owner

Yes, you need to change the file path, and maybe something else, to run it on a local machine.

@michelle-chou25
Author

Thank you, it solves the issue. May I know why?
I also tried changing x to x.half(), and a different error message occurred, as follows:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper__thnn_conv2d_forward)

@YuanGongND
Owner

I think this again means your input and model are not on the same device. Which specific method solved your issue?

@michelle-chou25
Author

Disabling both imagenet_pretrain and audioset_pretrain solved it.

@YuanGongND
Owner

The reason is that it avoids the pretrained weights being loaded to the CPU. No one has reported this input/model device issue before; maybe not many people have actually run this demo. But since you have a GPU, you could try running the ESC-50 recipe and see if the error is still there. I don't think the cuda/torch version is the problem.
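
To illustrate the point about where loaded weights end up, here is a minimal sketch of loading a checkpoint onto a chosen device. It is not the loading code used in ast_models.py: the `nn.Linear` layers and the in-memory buffer are stand-ins for the real model and checkpoint file.

```python
import io

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in "pretrained" weights, saved to an in-memory buffer
# instead of a checkpoint file on disk.
src = nn.Linear(4, 2)
buf = io.BytesIO()
torch.save(src.state_dict(), buf)
buf.seek(0)

# map_location controls which device the stored tensors are restored to.
state = torch.load(buf, map_location=device)

model = nn.Linear(4, 2)
model.load_state_dict(state)
# Move the whole model after loading, so no submodule is left behind.
model = model.to(device)
```

Calling `.to(device)` once after `load_state_dict` is a simple way to make sure every parameter ends up on the same device as the input, regardless of where the checkpoint was saved.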

@michelle-chou25
Author

I tried it on another machine, and it was not reproduced.

@michelle-chou25
Author

But I changed line 132 in ast_models.py to
self.mlp_head = nn.Sequential(nn.LayerNorm(self.original_embedding_dim), nn.Linear(self.original_embedding_dim, label_dim)).to("cuda")
and line 18 in demo.py to
test_input = torch.rand([10, input_tdim, 128]).to("cuda").half()

@michelle-chou25
Author

On the previous machine, the error can still be reproduced with this workaround applied.

@YuanGongND
Owner

I see, it is a bit weird to me. Thanks for reporting this.

I actually don't think .half() is needed, even though the model is trained with half precision; it should work for all float tensor input. You can do a quick test in the Google Colab environment to see if that is true.
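
For readers who want to see how a dtype mismatch behaves in isolation, here is a small sketch. It is an assumption-laden stand-in rather than the AST model: a float64 `nn.Linear` plays the role of the half-precision weights so that the example also runs on a CPU-only machine.

```python
import torch
import torch.nn as nn

# Stand-in layer whose weights use a different dtype than the input.
# float64 is used here instead of half precision so it runs on CPU.
layer = nn.Linear(4, 2).double()
x = torch.rand(3, 4)  # float32 by default

try:
    layer(x)
    mismatch_raised = False
except RuntimeError:
    # PyTorch refuses to mix input and weight dtypes in this layer.
    mismatch_raised = True

# Cast the input to the dtype of the layer's parameters and retry.
param_dtype = next(layer.parameters()).dtype
y = layer(x.to(param_dtype))
```

Casting the input to the parameter dtype, rather than to a fixed dtype such as half, keeps the call valid whether or not the weights are actually stored in half precision.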

Let's see if anyone else has the same issue.


2 participants