
Update to latest llama.cpp #1706

Merged
merged 3 commits into main from newllama on Dec 1, 2023
Conversation

cebtenzzre
Member

NOTE: Requires this llama.cpp branch, which is where most of the changes are: https://github.com/nomic-ai/llama.cpp/tree/update-llamacpp-base
We will have to reset our llama.cpp fork's master branch to that commit before we can properly merge this.

What I did:

  • Split the Vulkan changes into their own branch: https://github.com/nomic-ai/llama.cpp/tree/vulkan
  • Incrementally merged the latest llama.cpp (up to ggerganov/llama.cpp@6b0a7420) into that branch, making commits as needed to maintain parity between Metal and Vulkan. This is the current state of the vulkan branch.
  • Merged the vulkan branch into the GPT4All-specific branch, and made several changes on top of that. This is update-llamacpp-base.
  • Updated the llama.cpp dependency in GPT4All and made the necessary changes for it to be used. That is this PR.

It seems to run fine with app.py on my RX 7800 XT. 0cc4m tested a slightly older version of the changes to llama.cpp and also had success (after some initial failures that we couldn't reproduce).

We weren't setting n_threads_batch, and setThreadCount was a no-op because we're using llama_decode, which doesn't take an n_threads argument.
I filed an upstream PR to discuss this:
ggerganov/llama.cpp#4274

Also, make sure to free the batch when we're done with it.
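To illustrate the thread-count and batch-lifetime points above, here is a minimal sketch against the batched llama.cpp API (not the actual GPT4All backend code; model, tokens, and decode_example are placeholder names). Since llama_decode takes no n_threads argument, the thread counts have to be supplied through the context, either in llama_context_params or later via llama_set_n_threads, and the llama_batch has to be released with llama_batch_free once decoding is finished.

```cpp
// Minimal sketch only -- not the GPT4All backend code. Shows where the
// thread counts live with the batched llama.cpp API and that the batch
// must be freed after use.
#include "llama.h"

#include <cstdint>
#include <vector>

static void decode_example(llama_model * model,
                           const std::vector<llama_token> & tokens,
                           uint32_t n_threads) {
    llama_context_params cparams = llama_context_default_params();
    // llama_decode has no n_threads argument; the thread counts are part of
    // the context, with a separate value for batch (prompt) processing.
    cparams.n_threads       = n_threads;
    cparams.n_threads_batch = n_threads;

    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // A setThreadCount-style setter also has to go through the context
    // (e.g. llama_set_n_threads); otherwise it is a no-op.
    llama_set_n_threads(ctx, n_threads, n_threads);

    llama_batch batch = llama_batch_init((int32_t) tokens.size(), 0, 1);
    for (size_t i = 0; i < tokens.size(); ++i) {
        batch.token   [batch.n_tokens]    = tokens[i];
        batch.pos     [batch.n_tokens]    = (llama_pos) i;
        batch.n_seq_id[batch.n_tokens]    = 1;
        batch.seq_id  [batch.n_tokens][0] = 0;
        batch.logits  [batch.n_tokens]    = i == tokens.size() - 1;
        batch.n_tokens++;
    }

    llama_decode(ctx, batch);

    // Free the batch when we're done with it.
    llama_batch_free(batch);
    llama_free(ctx);
}
```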
cebtenzzre added the backend (gpt4all-backend issues) label on Dec 1, 2023
cebtenzzre requested a review from manyoso on December 1, 2023 18:51
cebtenzzre merged commit 9e28dfa into main on Dec 1, 2023
4 of 5 checks passed
cebtenzzre deleted the newllama branch on December 1, 2023 21:51