Is pure CPU inference supported? #90

Open
sgwzy22 opened this issue Nov 13, 2024 · 2 comments

Comments

@sgwzy22

sgwzy22 commented Nov 13, 2024

No description provided.

@sixsixcoder

sixsixcoder commented Nov 14, 2024

When starting the model service, just set --device to cpu:

python model_server.py --host localhost --model-path THUDM/glm-4-voice-9b --port 10000 --dtype bfloat16 --device cpu

However, the speech tokenizer does not currently support CPU.
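
For anyone scripting the launch, here is a minimal sketch of building that same command in Python. The helper name build_server_command is hypothetical and is not part of the repository. It only uses the flags shown in the command above. The dtype is left as a parameter so another value can be tried; whether model_server.py accepts anything other than bfloat16 is an assumption this thread does not confirm.

```python
import shlex


def build_server_command(device="cpu", dtype="bfloat16",
                         host="localhost", port=10000,
                         model_path="THUDM/glm-4-voice-9b"):
    """Return the argument list for launching model_server.py.

    Hypothetical helper: the defaults mirror the command quoted in this
    thread. Passing a different dtype assumes the server accepts it.
    """
    return [
        "python", "model_server.py",
        "--host", host,
        "--model-path", model_path,
        "--port", str(port),
        "--dtype", dtype,
        "--device", device,
    ]


if __name__ == "__main__":
    # Print the command as a single shell-safe string for inspection.
    print(shlex.join(build_server_command()))
```

The list can be passed to subprocess.run as-is. Note that this only covers the model server; per the comment above, the speech tokenizer does not currently support CPU.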

@sgwzy22
Author

sgwzy22 commented Nov 18, 2024

When starting the model service, just set --device to cpu:

python model_server.py --host localhost --model-path THUDM/glm-4-voice-9b --port 10000 --dtype bfloat16 --device cpu

However, the speech tokenizer does not currently support CPU.

Can bf16 be used on CPU, or does --dtype also need to be changed to float32?
