
Update to latest llama.cpp #1706

Merged
merged 3 commits into main from newllama on Dec 1, 2023
Conversation

cebtenzzre
Member

NOTE: Requires this llama.cpp branch, which is where most of the changes are: https://github.com/nomic-ai/llama.cpp/tree/update-llamacpp-base
We will have to reset our llama.cpp fork's master branch to that commit before we can properly merge this.

What I did:

  • Split the Vulkan changes into their own branch: https://github.com/nomic-ai/llama.cpp/tree/vulkan
  • Incrementally merged the latest llama.cpp (up to ggerganov/llama.cpp@6b0a7420) into that branch, making commits as needed to maintain parity between Metal and Vulkan. This is the current state of the vulkan branch.
  • Merged the vulkan branch into the GPT4All-specific branch, and made several changes on top of that. This is update-llamacpp-base.
  • Updated the llama.cpp dependency in GPT4All and made the necessary changes for it to be used. That is this PR.

It seems to run fine with app.py on my RX 7800 XT. 0cc4m tested a slightly older version of the changes to llama.cpp and also had success (after some initial failures that we couldn't reproduce).

We weren't setting n_threads_batch, and setThreadCount was a no-op because we're using llama_decode, which doesn't take an n_threads argument.
I filed an upstream PR to discuss this:
ggerganov/llama.cpp#4274

Also, make sure to free the batch when we're done with it.
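To illustrate the thread-count and batch-lifetime points above, here is a minimal sketch against the batched llama.cpp API (not the actual GPT4All backend code; model, tokens, and decode_example are placeholder names). Since llama_decode takes no n_threads argument, the thread counts have to be supplied through the context, either in llama_context_params or later via llama_set_n_threads, and the llama_batch has to be released with llama_batch_free once decoding is finished.

```cpp
// Minimal sketch only -- not the GPT4All backend code. Shows where the
// thread counts live with the batched llama.cpp API and that the batch
// must be freed after use.
#include "llama.h"

#include <cstdint>
#include <vector>

static void decode_example(llama_model * model,
                           const std::vector<llama_token> & tokens,
                           uint32_t n_threads) {
    llama_context_params cparams = llama_context_default_params();
    // llama_decode has no n_threads argument; the thread counts are part of
    // the context, with a separate value for batch (prompt) processing.
    cparams.n_threads       = n_threads;
    cparams.n_threads_batch = n_threads;

    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // A setThreadCount-style setter also has to go through the context
    // (e.g. llama_set_n_threads); otherwise it is a no-op.
    llama_set_n_threads(ctx, n_threads, n_threads);

    llama_batch batch = llama_batch_init((int32_t) tokens.size(), 0, 1);
    for (size_t i = 0; i < tokens.size(); ++i) {
        batch.token   [batch.n_tokens]    = tokens[i];
        batch.pos     [batch.n_tokens]    = (llama_pos) i;
        batch.n_seq_id[batch.n_tokens]    = 1;
        batch.seq_id  [batch.n_tokens][0] = 0;
        batch.logits  [batch.n_tokens]    = i == tokens.size() - 1;
        batch.n_tokens++;
    }

    llama_decode(ctx, batch);

    // Free the batch when we're done with it.
    llama_batch_free(batch);
    llama_free(ctx);
}
```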
cebtenzzre added the backend (gpt4all-backend issues) label on Dec 1, 2023
cebtenzzre requested a review from manyoso on December 1, 2023 18:51
cebtenzzre merged commit 9e28dfa into main on Dec 1, 2023
4 of 5 checks passed
cebtenzzre deleted the newllama branch on December 1, 2023 21:51