mismatch shape issues #489

imrankh46 · 2025-01-18T05:53:34Z

Hello, I am trying to merge two models with the same architecture. The merge itself works, but I get the following error when loading the merged model:

RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
	size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([96640, 4096]) from checkpoint, the shape in current model is torch.Size([128264, 4096]).
	size mismatch for lm_head.weight: copying a param with shape torch.Size([96640, 4096]) from checkpoint, the shape in current model is torch.Size([128264, 4096]).
	You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.

After adding ignore_mismatched_sizes=True to from_pretrained, the model loads, but its output is garbage text.
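
A note on reading the traceback, as a rough sketch rather than a confirmed diagnosis: both mismatched tensors differ only in their first dimension (96640 rows in the checkpoint versus 128264 in the model being built), which is the vocabulary dimension of the embedding and output layers. With ignore_mismatched_sizes=True, transformers does not load tensors whose shapes disagree and leaves them newly initialized, which would be consistent with garbage output. The helper below is hypothetical (the names `parse_size_mismatches` and `ERROR_TEXT` are not part of any library); it uses only the Python standard library to pull the shapes out of the error text quoted above.

```python
import re

# Pattern for the "size mismatch" lines that appear in the error above.
_MISMATCH = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]+)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]+)\]\)"
)


def _shape(text):
    """Turn '96640, 4096' into (96640, 4096)."""
    return tuple(int(part) for part in text.split(","))


def parse_size_mismatches(error_text):
    """Return (parameter name, checkpoint shape, model shape) per mismatch line."""
    return [
        (m["name"], _shape(m["ckpt"]), _shape(m["model"]))
        for m in _MISMATCH.finditer(error_text)
    ]


# Error text copied from this issue.
ERROR_TEXT = """
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
    size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([96640, 4096]) from checkpoint, the shape in current model is torch.Size([128264, 4096]).
    size mismatch for lm_head.weight: copying a param with shape torch.Size([96640, 4096]) from checkpoint, the shape in current model is torch.Size([128264, 4096]).
"""

mismatches = parse_size_mismatches(ERROR_TEXT)
for name, ckpt_shape, model_shape in mismatches:
    # Only the first dimension (vocabulary rows) differs; the hidden size matches.
    row_gap = model_shape[0] - ckpt_shape[0]
    print(f"{name}: checkpoint {ckpt_shape}, model {model_shape}, row gap {row_gap}")
```

If the gap is confined to the vocabulary dimension, as it is here, the thing to check is whether the config.json and tokenizer saved with the merged model report the same vocabulary size as the merged weights. This is an assumption about the cause; the issue itself does not say which vocabulary sizes the two source models used.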
