-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathvimqq.txt
333 lines (248 loc) · 12.9 KB
/
vimqq.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
*vimqq.txt* For Vim 8.0+, Neovim ?? Last change: 2024-09-16
VIMQQ ~
AI plugin for Vim/NeoVim with focus on local evaluation, flexible context
and aggressive cache warmup to hide latency.
Version: 0.0.7
Author: Oleksandr Kuvshynov
1. Introduction ........................................ |vimqq-intro|
2. Installation ........................................ |vimqq-install|
2.1. Groq ............................................ |vimqq-groq|
2.2. Local ........................................... |vimqq-local|
2.3. Claude .......................................... |vimqq-claude|
2.4. Mistral ......................................... |vimqq-mistral|
3. Usage ............................................... |vimqq-usage|
4. Commands ............................................ |vimqq-commands|
4.1. Message sending ................................. |vimqq-commands-msg|
4.2. Forking ......................................... |vimqq-commands-fork|
4.3. UI Controls ..................................... |vimqq-commands-ui|
5. Mappings ............................................ |vimqq-mappings|
6. Configuration ....................................... |vimqq-config|
7. Changelog ........................................... |vimqq-changelog|
==============================================================================
1. Introduction *vimqq-intro*
Features:
- Support for both remote models through paid APIs (Groq, Claude)
and local models via llama.cpp server;
- automated KV cache warmup for local model evaluation;
- dynamic warmup on typing - in case of long questions, it is a good idea
to prefill cache for the question itself.
- mixing different models in the same chat sessions. It is possible to send
original message to one model and use a different model for the follow-up
questions.
- ability to fork the existing chat, start a new branch of the chat and
reuse server-side cache for the shared context.
- Support both Vim and NeoVim;
What vimqq is not doing:
- providing any form of autocomplete.
- generating code in place, typing it in editor directly, all communication
is done in the chat buffer.
==============================================================================
2. Installation *vimqq-install*
Using |packages|:
Copy over the plugin itself:
>
git clone https://github.com/okuvshynov/vimqq.git ~/.vim/pack/plugins/start/vimqq
The command above makes vimqq automatically loaded at vim start
Update helptags in vim:
>
:helptags ~/.vim/pack/plugins/start/vimqq/doc
Using vim-plug:
>
Plug 'okuvshynov/vimqq'
vimqq will not work in 'compatible' mode. 'nocompatible' needs to be set.
==============================================================================
2.1 Installation: Groq *vimqq-groq*
This might be the easiest option to try it out at the moment for free.
- Register and get API key at https://console.groq.com/keys
- Save that key to `$GROQ_API_KEY` environment variable.
- Add the configuration to your vimrc:
>
let g:vqq_groq_models = [
\ {'bot_name': 'groq', 'model': 'llama-3.1-70b-versatile'}
\]
==============================================================================
2.2 Installation: Local *vimqq-local*
To use local models, get and build llama.cpp server
>
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make -j 16
Download/prepare the models and start llama server:
>
./llama.cpp/llama-server
--model path/to/model.gguf
--chat-template llama3
--host 0.0.0.0
--port 8080
Add a bot endpoint configuration to vimrc file, for example
>
let g:vqq_llama_servers = [
\ {'bot_name': 'llama', 'addr': 'http://localhost:8080'},
\]
It is possible to have multiple bots with different names.
==============================================================================
2.3 Installation: Claude *vimqq-claude*
To use claude models, register and get API key. By default vimqq will look for
API key in environment variable `$ANTHROPIC_API_KEY`. It is possible to
override the default with
>
let g:vqq_claude_api_key = ...
Bot definition looks similar:
>
let g:vqq_claude_models = [
\ {'bot_name': 'sonnet', 'model': 'claude-3-5-sonnet-20240620'}
\]
==============================================================================
2.4 Installation: Mistral *vimqq-mistral*
To use mistral models using the API, register and get API key. By default
vimqq will look for API key in environment variable `$MISTRAL_API_KEY`.
It is possible to override the default with
>
let g:vqq_mistral_api_key = ...
Bot definition looks similar:
>
let g:vqq_mistral_models = [
\ {'bot_name': 'mistral', 'model': 'mistral-large-latest'}
\]
==============================================================================
3. Usage *vimqq-usage*
Assuming we have a single bot named "groq", we can start by asking a question:
>
:Q @groq What are basics of unix philosophy?
We can omit the tag as well, and the first configured bot will be used:
>
:Q What are basics of unix philosophy?
After running this command new window should open in vertical split showing
the message sent and response. If it is local llama bot, we should see
the reply being appended token by token.
Pressing "q" in the window would change the window view to the list of
past messages. Up/Down or j/k can be used to navigate the list and <cr>
can be used to select individual chat.
Chat title is generated by the same model after the first response.
==============================================================================
4. Commands *vimqq-commands*
------------------------------------------------------------------------------
4.1. Message sending *vimqq-commands-msg*
First group of commands are used to send messages.
- `:QQ` sends/warms up a message with range provided
- `:Q` sends/warms up a message with no range
- `:QQN` sends/warms up a message with range provided in new chat
- `:QN` sends/warms up a message with no range in new chat
The format for calling both commands is similar, except for the range part
>
:Q [@bot_name] message
------------------------------------------------------------------------------
4.2. Forking *vimqq-commands-fork*
`QF` command forks the current chat reusing the context from the first message.
It is useful in cases of long context, but when you want to start a new
discussion thread. For example,
>
:QF Suggest a simple task for junior engineer working on the project
It will:
- take current chat's first message, keep the context and bot
- modify the question with 'Suggest a ...'
- create new chat
- append amended message to new chat
- send new chat to the original bot
This way we can reuse the context, which might be long (entire file/project)
Note: this command might get merged to the Q/QQ commands. Similar to `n`
meaning 'ask in new chat' we might add an option to 'ask in fork'.
------------------------------------------------------------------------------
4.3. UI Control *vimqq-commands-ui*
- `:QQList` will show the list of past chat threads. User can navigate the
list and select chat session by pressing <CR>. This action will make chat
session current.
- `:QQOpenChat chat_id` opens chat with id=`chat_id`. This action will make
chat session current.
- `:QQToggle` shows/hides vimqq window.
==============================================================================
5. Mappings *vimqq-mappings*
In chat list view:
- 'd' will show a dialog confirmation and delete chat if user answers Yes.
- <cr> will select chat session under cursor, open it and make current.
==============================================================================
6. Configuration *vimqq-config*
Configuration is done using global variables with prefix `g:vqq`.
- `g:vqq_llama_servers` - list of llama.cpp bots. Default is empty.
Each bot configuration is a dictionary with the following attributes:
- `bot_name`. string identification for this bot.
Must be unique and consist of [a-zA-Z0-9_] symbols. Required
- `addr`. Path to endpoint, in the http://host:port format. Required.
- `healthcheck_ms`. How often to do healthcheck query.
Optional, default is 10000ms (=10s)
- `title_tokens`. When generating title for chat, up to this many
tokens can be produced. Optional, default is 16.
- `max_tokens`. When producing response, up to this many tokens can
be produced. Optional, default is 1024.
- `system_prompt`. System prompt to include in the query and steer
bot behavior. Optional, default is "You are a helpful assistant".
- `g:vqq_claude_models` - list of claude bots. Default is empty.
Each bot configuration is a dictionary with the following attributes:
- `model`. Model to use, must be one the models supported by Claude.
Check https://docs.anthropic.com/en/docs/about-claude/models for
the model list. Required.
- `bot_name`. string identification for this bot.
Must be unique and consist of [a-zA-Z0-9_] symbols. Required
- `title_tokens`. When generating title for chat, up to this many
tokens can be produced. Optional, default is 16.
- `max_tokens`. When producing response, up to this many tokens can
be produced. Optional, default is 1024.
- `g:vqq_default_bot`. bot_name of the bot used by default, if user
omits the tag. Optional, default is first bot.
- `g:vqq_warmup_on_chat_open`. List of bot names for which to issue
a warmup query automatically, every time we opened chat history
for some chat session. Useful for resuming old sessions. Default
is empty list.
- `g:vqq_context_template`. Template to use to construct the message
if some code needs to be included as a context. Replaces placeholders
{vvq_ctx} and {vvq_msg} with context (selected code) and message.
Optional, default is "Here's a code snippet: \n\n{vqq_ctx}\n\n{vqq_msg}"
- `g:vqq_width`. Default chat window width in characters.
Optional, defutlt is 80.
- `g:vqq_time_format`. Time format to use in chat sessions list.
Optional, default is "%b %d %H:%M ", for example "July 23, 18:05"
- `g:vqq_chats_file`. Path to json file to store message history.
Optional, default is expand('~/.vim/vqq_chats.json')
- `g:vqq_context_filetypes`. When extracring project-level context,
this list of patterns will be used to include files.
Default: "*.vim,*.txt,*.md,*.cpp,*.c,*.h,*.hpp,*.py,*.rs"
- `g:vqq_autowarm_cmd_ms`. When doing warmup queries vimqq can keep
sending warmup queries as user types the message. It is useful in
the cases of longer messages with detailed instructions. This logic
is based on two events: previous warmup finished and message in the
command-line has changed. If both of them are true, new warmup will
be sent. To the best of my knowlegde, vim doesn't provide a way to
listen to the cmdline update events, but we can check its value
periodically. This parameter defines this period. Default - 1000ms.
- `g:vqq_log_file`. File location where to append logs.
Default is `~/.vim/vimqq.log`
- `g:vqq_log_level`. Controls how much to log. Can be one of
'DEBUG', 'INFO', 'ERROR', 'NONE'. All messages logged with
configured level or higher will be appended to the file.
==============================================================================
6. Example basic config *vimqq-config-example*
>
let g:vqq_llama_servers = [
\{
\ 'bot_name': 'local',
\ 'addr': 'http://localhost:8080',
\ 'do_autowarm': v:true
\}
\]
" chat [l]ist
nnoremap <leader>ll :<C-u>QQList<cr>
Autowarm will allow plugin to start sending warmup queries to local server
as user types, hiding the latency.
==============================================================================
7. Changelog *vimqq-changelog*
* version 0.0.6 (2024-09-16)
- Support for version control context based on git blame
- Bug fixes
* version 0.0.7 (2024-09-24)
- Support for Groq API
- Automated tests with mock llama server
- Bug fixes
* version 0.0.8 (2024-12-19)
- Autowarmup improvements
- Bug fixes
vim:tw=78:ts=8:ft=help:noet:nospell