Commit a0c0fb6

kv-cache : simplify the non-FA branch
ggml-ci
1 parent 0197ac9 commit a0c0fb6

1 file changed, +1 -2 lines changed

src/llama-kv-cache-unified.cpp

Lines changed: 1 addition & 2 deletions
@@ -792,8 +792,7 @@ ggml_tensor * llama_kv_cache_unified::cpy_v(ggml_context * ctx, ggml_tensor * v_
 // TODO: this seems not very optimal - can we do something better?
 v_view = ggml_reshape_3d(ctx, v, 1, v->ne[1], v->ne[0]);

-v_cur = ggml_cont(ctx, v_cur);
-v_cur = ggml_reshape_3d(ctx, v_cur, 1, n_tokens, hparams.n_embd_v_gqa(il));
+v_cur = ggml_cont_3d(ctx, v_cur, 1, v_cur->ne[0], v_cur->ne[1]);

 kv_idxs = ggml_repeat_4d(ctx, kv_idxs, v_cur->ne[1], v_cur->ne[2], 1, 1);
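
The change folds a "make contiguous, then reshape" pair into a single call. The sketch below is a toy model of that idea, not ggml code: `View`, `Dense`, `cont`, `reshape_3d`, `cont_3d` and `transposed_view` are hypothetical stand-ins written for illustration, and strides are counted in elements rather than bytes. It also assumes that `v_cur` is a 2D transposed view with `ne[0] == n_tokens` and `ne[1] == n_embd_v_gqa`, which the diff implies but does not show.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a strided tensor view (not ggml's real struct).
// ne[i] is the element count of dimension i (ne[0] varies fastest);
// nb[i] is the stride of dimension i, in elements rather than bytes.
struct View {
    const float *          data;
    std::array<int64_t, 3> ne;
    std::array<int64_t, 3> nb;
};

// Densely packed tensor that owns its storage.
struct Dense {
    std::vector<float>     data;
    std::array<int64_t, 3> ne;
};

// Walk the view in logical order and append each element to dst.
static void copy_packed(const View & v, std::vector<float> & dst) {
    for (int64_t i2 = 0; i2 < v.ne[2]; ++i2)
        for (int64_t i1 = 0; i1 < v.ne[1]; ++i1)
            for (int64_t i0 = 0; i0 < v.ne[0]; ++i0)
                dst.push_back(v.data[i0*v.nb[0] + i1*v.nb[1] + i2*v.nb[2]]);
}

// Step 1 of the old code path: pack the view, keeping its shape.
Dense cont(const View & v) {
    Dense out{{}, v.ne};
    copy_packed(v, out.data);
    return out;
}

// Step 2 of the old code path: relabel the shape of packed data.
Dense reshape_3d(Dense t, int64_t ne0, int64_t ne1, int64_t ne2) {
    assert(ne0*ne1*ne2 == t.ne[0]*t.ne[1]*t.ne[2]);
    t.ne = {ne0, ne1, ne2};
    return t;
}

// New code path: pack the view and assign the target shape in one call.
Dense cont_3d(const View & v, int64_t ne0, int64_t ne1, int64_t ne2) {
    assert(ne0*ne1*ne2 == v.ne[0]*v.ne[1]*v.ne[2]);
    Dense out{{}, {ne0, ne1, ne2}};
    copy_packed(v, out.data);
    return out;
}

// Build a transposed view over a packed [n_embd, n_tokens] buffer,
// so that the view has ne[0] == n_tokens and ne[1] == n_embd.
View transposed_view(const std::vector<float> & src, int64_t n_embd, int64_t n_tokens) {
    return View{src.data(), {n_tokens, n_embd, 1}, {n_embd, 1, n_embd*n_tokens}};
}
```

In this model, `reshape_3d(cont(v), 1, v.ne[0], v.ne[1])` and `cont_3d(v, 1, v.ne[0], v.ne[1])` yield the same packed data and the same shape for any view `v`. Taking the target dimensions from the view itself, as the new line does with `v_cur->ne[0]` and `v_cur->ne[1]`, also avoids restating `n_tokens` and `hparams.n_embd_v_gqa(il)` separately.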
