-
While developing an application that uses llama.cpp, I ran into the issue of having to test model loading. Naturally, this requires an actual model to load, and for the time being I'm using TheBloke's TinyLlama Q2 GGUF model. That model was the smallest I could find, at around 482 MB. I seem to remember seeing a minimal GGUF model used during the testing of llama.cpp, but I can't for the life of me figure out whether I'm just imagining it. Does anyone know of a barebones GGUF model that will read as a proper GGUF, but is small enough to use during testing?
-
You can generate small GGUF models like this:

```sh
make -j gguf

# write
./gguf dummy.gguf w

# read
./gguf dummy.gguf r
```

The code is in the `examples/gguf` folder.
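If you'd rather generate a test file from Python, llama.cpp also ships a `gguf` package (in `gguf-py/`, installable via `pip install gguf`). Here's a minimal sketch of writing a tiny-but-valid GGUF file with its `GGUFWriter`, loosely based on the package's example writer — the key names and tensor contents below are arbitrary test values, not anything a real model needs:

```python
# Sketch: write a minimal GGUF file using llama.cpp's gguf-py package.
# Key names and tensor data are arbitrary test values.
import numpy as np
from gguf import GGUFWriter

writer = GGUFWriter("dummy.gguf", "llama")  # arch string is just metadata here

# a couple of key/value metadata entries
writer.add_uint32("answer", 42)
writer.add_float32("answer_in_float", 42.0)

# one small tensor so readers have something to parse
writer.add_tensor("tensor1", np.ones(32, dtype=np.float32))

# header, KV data, and tensor data are written in this order
writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```

The resulting file is only a few hundred bytes, which makes it handy to check into a test fixtures directory.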
-
That was quick, thank you!
-
I use this one:
It's < 100 MB, and llama-run can download it from the Ollama repository.
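In case it helps, a hedged usage sketch of pulling a small model from the Ollama registry with llama-run — the model name below is just an illustration, not necessarily the one linked above:

```sh
# Illustrative only: llama-run can fetch models via an ollama:// source.
# smollm:135m is a stand-in for whichever small model you prefer.
llama-run ollama://smollm:135m "Hello"
```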