Load weight error #16

Open
zwhus opened this issue Aug 22, 2023 · 6 comments

@zwhus

zwhus commented Aug 22, 2023

Hi, thanks for your excellent work.
I ran into an error when I tried to load the GPT4RoI weights for stage 2 training:
“Error(s) in loading state_dict for SPILlavaMPTForCausalLM:
size mismatch for lm_head.weight: copying a param with shape torch.Size([32006, 4096]) from checkpoint, the shape in current model is torch.Size([32005, 4096]).”
How can I solve this problem?
Looking forward to your reply!
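
One way to narrow down an error like this is to list every parameter whose shape differs between the checkpoint and the freshly built model. The sketch below is not part of GPT4RoI: `find_shape_mismatches` is a hypothetical helper that works on plain name-to-shape dictionaries, which could be built from a PyTorch state dict with `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ.

    Both arguments map parameter names to shape tuples. Names present in
    only one of the two dictionaries are ignored here.
    """
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes taken from the error message in this post.
checkpoint = {"lm_head.weight": (32006, 4096)}
model = {"lm_head.weight": (32005, 4096)}
print(find_shape_mismatches(checkpoint, model))
```

Since the rows of `lm_head.weight` correspond to vocabulary entries, a difference of exactly one row suggests that the checkpoint was saved with one more token than the tokenizer used to build the current model. That is an inference from the shapes, not something the error message states.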

@jshilong
Owner

Hi, to better understand your situation, I need more information: how you are loading the model, which script you are using, and whether you are loading the weights from the first stage or the final weights we provide.

@zwhus
Author

zwhus commented Aug 29, 2023

Thank you for your reply!
First, I followed the weight download tutorial to get the GPT4RoI-7B weights.
Next, I wanted to continue training GPT4RoI from these weights, so I followed the stage 2 instructions and ran:

bash train_stage2.sh exp/stage2 GPT4ROi-7B

Here, GPT4RoI-7B is the final weight, and the stage 2 script is unchanged.
Finally, this error occurred:
“Error(s) in loading state_dict for SPILlavaMPTForCausalLM:
size mismatch for lm_head.weight: copying a param with shape torch.Size([32006, 4096]) from checkpoint, the shape in current model is torch.Size([32005, 4096]).”
How can I solve this problem?

@jshilong
Owner

Please change to the following versions:
pip install tokenizers==0.13.3
and
pip install transformers@git+https://github.com/huggingface/transformers.git@cae78c46

I tried the same operation and there was no error.

@jshilong
Owner

Please change these two package versions:
pip install tokenizers==0.13.3
and
pip install transformers@git+https://github.com/huggingface/transformers.git@cae78c46

I tried the same operation and there is no error.
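
To confirm which versions are actually active in the training environment after reinstalling, a small check like the following may help. It is only a sketch using the standard library; `installed_versions` is a hypothetical helper name, not something from the GPT4RoI repository.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version string, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["tokenizers", "transformers"]).items():
        print(f"{name}: {ver}")
```

Note that a package installed from a git commit typically reports a development version string, not the commit hash, so this check is more conclusive for the `tokenizers` pin than for the `transformers` commit.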

@jshilong
Owner

Could you share the full error message? I would like to determine whether this error occurs during model initialization or while resuming from GPT4ROi-7B.

@jshilong
Owner

This may be caused by improper weight merging. For troubleshooting, you can try resuming from https://huggingface.co/shilongz/debug to check whether your merged weights are the problem.
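
Before launching a full training run, a quick comparison of the recorded vocabulary sizes might already show whether the locally merged weights differ from the reference checkpoint. The sketch below assumes both checkpoints are local directories in Hugging Face format, each with a `config.json` containing a `vocab_size` field; that layout, the helper names, and the directory arguments are assumptions for illustration, not details confirmed in this thread.

```python
import json
from pathlib import Path


def read_vocab_size(checkpoint_dir):
    """Return the vocab_size recorded in <checkpoint_dir>/config.json, or None if missing."""
    config_path = Path(checkpoint_dir) / "config.json"
    if not config_path.is_file():
        return None
    with config_path.open(encoding="utf-8") as f:
        return json.load(f).get("vocab_size")


def same_vocab_size(merged_dir, reference_dir):
    """True only when both checkpoints record a vocab_size and the values match."""
    merged = read_vocab_size(merged_dir)
    reference = read_vocab_size(reference_dir)
    return merged is not None and merged == reference
```

For example, `same_vocab_size("GPT4RoI-7B", "debug")` would compare a locally merged checkpoint against a downloaded copy of the reference weights, with both paths being placeholders. A mismatch here would point toward the merge step, while a match would leave the tokenizer and package versions as the more likely cause.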
