Ollama interfaces for Neovim: get up and running with large language models locally in Neovim.
*(Demo video: nvim-llama.mp4)*
🏗️ 👷 Warning! Under active development!! 👷 🚧
Docker is required to use `nvim-llama`.
And that's it! All models and clients run from within Docker to provide chat interfaces and functionality. This platform-agnostic approach works on macOS, Linux, and Windows.
Use your favorite package manager to install the plugin:
Packer:

```lua
use 'jpmcb/nvim-llama'
```

lazy.nvim:

```lua
{
    'jpmcb/nvim-llama'
}
```

vim-plug:

```vim
Plug 'jpmcb/nvim-llama'
```
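If you use lazy.nvim, the install spec and the `setup` call (covered below) can also live together in a single plugin spec. A minimal sketch:

```lua
-- Sketch of a combined lazy.nvim spec: install the plugin and run its
-- setup with the default configuration. Adjust if you call setup elsewhere.
{
    'jpmcb/nvim-llama',
    config = function()
        require('nvim-llama').setup {}
    end,
}
```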
In your `init.vim`, set up the plugin:
```lua
require('nvim-llama').setup {}
```
You can provide the following optional configuration table to the `setup` function:
```lua
local defaults = {
    -- See plugin debugging logs
    debug = false,

    -- The model for Ollama to use. This model will be automatically downloaded.
    model = "llama2",
}
```
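For example, assuming `setup` merges whatever you pass in with these defaults, you could enable debug logging and switch models like so:

```lua
-- Sketch of a non-default configuration; debug and model are the options
-- listed above, and valid model names come from the table below.
require('nvim-llama').setup {
    debug = true,       -- print plugin debugging logs
    model = "mistral",  -- download and chat with Mistral 7B instead of Llama 2
}
```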
Ollama supports an incredible number of open-source models, available at ollama.ai/library. Check out their docs to learn more: https://github.com/jmorganca/ollama
When you change the `model` setting, the specified model is downloaded automatically:
| Model | Parameters | Size | Model setting |
| --- | --- | --- | --- |
| Neural Chat | 7B | 4.1GB | `model = "neural-chat"` |
| Starling | 7B | 4.1GB | `model = "starling-lm"` |
| Mistral | 7B | 4.1GB | `model = "mistral"` |
| Llama 2 | 7B | 3.8GB | `model = "llama2"` |
| Code Llama | 7B | 3.8GB | `model = "codellama"` |
| Llama 2 Uncensored | 7B | 3.8GB | `model = "llama2-uncensored"` |
| Llama 2 13B | 13B | 7.3GB | `model = "llama2:13b"` |
| Llama 2 70B | 70B | 39GB | `model = "llama2:70b"` |
| Orca Mini | 3B | 1.9GB | `model = "orca-mini"` |
| Vicuna | 7B | 3.8GB | `model = "vicuna"` |
Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models. The 70B parameter models require upwards of 64 GB of RAM (if not more).
The `:Llama` command opens a `Terminal` window where you can start chatting with your LLM.
To exit `Terminal` mode, which by default locks the focus to the terminal buffer, use the bindings `Ctrl-\ Ctrl-n`.
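If that chord feels awkward, you could add optional mappings of your own. These are not provided by the plugin; they are only a sketch using standard Neovim APIs:

```lua
-- Optional convenience mappings (hypothetical, not part of nvim-llama):
-- open the chat window with <leader>lc, and let <Esc> leave Terminal mode
-- in any terminal buffer instead of the default Ctrl-\ Ctrl-n chord.
vim.keymap.set('n', '<leader>lc', '<cmd>Llama<CR>', { desc = 'Open nvim-llama chat' })
vim.keymap.set('t', '<Esc>', [[<C-\><C-n>]], { desc = 'Exit Terminal mode' })
```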