Multiple safetensor files, am I missing something? #175
Unanswered · mousepixels asked this question in Q&A
The new Llama 2 models ship as multiple safetensors files. I get this error when running anything:

`!! Multiple files matching ../llama/transformed_70b/*.safetensors`

I'm assuming I'm missing something, so I didn't raise this as an issue. Thanks for any help!

Replies: 2 comments 2 replies

- ExLlama expects a single .safetensors file and doesn't currently support sharding. I'm not aware of anyone releasing sharded GPTQ models, but if you have a link to where you found those files I could probably take a look.

- I'm not aware of any particular guides, but generally you want to look for GPTQ conversions. ExLlama only works on quantized weights, and those links seem to be to the original FP16 weights. Here is a Llama2-7b conversion, and there are many other GPTQ models on HF that should work fine.
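Since ExLlama wants a single .safetensors file, one workaround (separate from the real fix, which per the replies above is to use a GPTQ conversion rather than the FP16 shards) is to merge the shards' tensors into one state dict and re-save it. Below is a minimal sketch, not ExLlama's own code; it assumes shards hold disjoint tensor names, as Hugging Face sharded checkpoints do, and the path in the usage comment is just the one from the error message:

```python
# Minimal sketch (assumption, not ExLlama code): collapse per-shard
# {tensor_name: tensor} dicts into the single state dict needed for a
# one-file .safetensors export.

def merge_shards(shards):
    """Merge per-shard {tensor_name: tensor} dicts, refusing duplicate names."""
    merged = {}
    for shard in shards:
        for name, tensor in shard.items():
            if name in merged:
                raise ValueError(f"duplicate tensor {name!r} across shards")
            merged[name] = tensor
    return merged

# With the `safetensors` package installed, usage would look roughly like
# (untested sketch; path taken from the error message above):
#
#   from pathlib import Path
#   from safetensors.torch import load_file, save_file
#   files = sorted(Path("../llama/transformed_70b").glob("*.safetensors"))
#   save_file(merge_shards(load_file(f) for f in files),
#             "../llama/transformed_70b/model.safetensors")
```

Note that even a merged FP16 file won't load in ExLlama, since it only works on quantized (GPTQ) weights; the merge is only useful if you have sharded files of an otherwise-compatible model.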