Skip to content

Fixed the model conversion bug caused by minicpm's GQA structure。After testing minicpm's GQA, the converted model generates all <h>. This is because the number of k and v matrices of Gqa should be the same as kv_head, not the same as head/kv_head. #8249

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

LDLINGLINGLING
Copy link

@github-actions github-actions bot added the python python script changes label Jul 2, 2024
@LDLINGLINGLING LDLINGLINGLING changed the title 修改了由于minicpm的GQA结构带来的模型转换bug Fixed the model conversion bug caused by minicpm's GQA structure Jul 2, 2024
@LDLINGLINGLING LDLINGLINGLING changed the title Fixed the model conversion bug caused by minicpm's GQA structure Fixed the model conversion bug caused by minicpm's GQA structure。After testing minicpm's GQA, the converted model generates all <h>. This is because the number of k and v matrices of Gqa should be the same as kv_head, not the same as head/kv_head. Jul 2, 2024
@ngxson
Copy link
Collaborator

ngxson commented Jul 2, 2024

Does this only affect minicpm 2, or both version 1 and 2?

@LDLINGLINGLING
Copy link
Author

only minicpm 2,Because only minicpm2 started using gqa

@ngxson
Copy link
Collaborator

ngxson commented Jul 2, 2024

Changing this will break minicpm 1, we should add a check somewhere to detect if the input model is minicpm 1 or 2

@LDLINGLINGLING
Copy link
Author

Is the 1 you mentioned cpm-bee? This model has never been supported. Which model did you crash? Can I check it out?

@mofosyne mofosyne added the Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix label Jul 3, 2024
@compilade
Copy link
Collaborator

Seems to be the same as #7967, therefore my comment from there applies here too.

This makes MiniCPMModel._reverse_hf_permute exactly equivalent to LlamaModel.permute. Should LlamaModel.permute (which is also a static method) be used instead in MiniCPMModel?

@ngxson
Copy link
Collaborator

ngxson commented Jul 4, 2024

@LDLINGLINGLING I saw that support for minicpm is added a long time ago, so I supposed it was version 1.

But to be honest, there have been quite a lot of versions of minicpm so I'm not even sure what we're talking about. Could you specify which version you've tested?

image

A perplexity test would be appreciated.

@LDLINGLINGLING
Copy link
Author

Hello, I just found that the code I uploaded before is wrong. The new commit has tested all minicpm, and the results are normal. The code in the original llama.cpp cannot effectively convert the minicpm of the 1b parameter. , it is worth noting that in your figure, the model with the letter v is multi-modal, which is not supported by llama.cpp.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python python script changes Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants