llama_decode + llama_get_logits_ith() JNI Android integration, GGML_ABORT("fatal error"); #14245
-
My question is: is there a function for populating the `output_ids` array based on `ubatch.logits`?

I'm working with the latest llama.cpp and using the `llama_decode()` + `llama_get_logits_ith()` workflow in a JNI integration, and I'm getting a fatal error from `float * llama_context::get_logits_ith(int32_t i)`. I call `llama_decode()` with a batch containing 1 token, at position `n_cur`, with `logits[0] = true`. Then I call `llama_get_logits_ith(n_cur - 1)`. The first decode works (prompt prefill), and the first generated token works too. On the second iteration, calling `llama_get_logits_ith(n_cur - 1)` aborts because `output_ids[n_cur - 1] == -1`, which the assertion takes to mean that `batch.logits` wasn't set during decode. However, I did set `logits[0] = true` in every decode batch. Why is this still failing?

This is the function:

```cpp
extern "C"
}
```

Gemini told me this: Can you help me reason through what causes `output_ids[i] == -1` even when `logits = true`?
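For reference, here is a minimal sketch of the kind of single-token decode step described above. It is not the JNI wrapper from the post (whose body isn't reproduced here), just an illustration that assumes the batch is filled by hand via `llama_batch_init()`; the comments mark where the index mismatch occurs.

```cpp
// Minimal sketch, not the poster's JNI wrapper: one-token-per-step decode
// using the plain llama.cpp C API (llama.h). Error handling omitted.
#include "llama.h"

static float * decode_one_token(llama_context * ctx, llama_token tok, llama_pos n_cur) {
    llama_batch batch = llama_batch_init(/*n_tokens*/ 1, /*embd*/ 0, /*n_seq_max*/ 1);

    batch.n_tokens     = 1;
    batch.token[0]     = tok;    // the newly sampled token
    batch.pos[0]       = n_cur;  // its absolute position in the sequence
    batch.n_seq_id[0]  = 1;
    batch.seq_id[0][0] = 0;
    batch.logits[0]    = true;   // request logits for batch index 0

    llama_decode(ctx, batch);

    // llama_get_logits_ith() expects an index into *this batch*, not the
    // sequence position. Only batch index 0 requested logits here, so
    // llama_get_logits_ith(ctx, n_cur - 1) finds output_ids[n_cur - 1] == -1
    // and hits GGML_ABORT("fatal error").
    float * logits = llama_get_logits_ith(ctx, 0); // or -1 for "last output"

    llama_batch_free(batch);
    return logits; // points into the context's buffer, still valid after batch_free
}
```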
-
Try to replace this line:

```cpp
float* logits = llama_get_logits_ith(wrapper->ctx, n_cur - 1);
```

with

```cpp
float* logits = llama_get_logits_ith(wrapper->ctx, -1);
```
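To spell out why this works (my reading of `llama_context::get_logits_ith`, offered as an explanation of the suggestion rather than an authoritative statement): the index passed to `llama_get_logits_ith()` is a position within the most recently decoded batch, and `output_ids[]` maps those batch positions to rows of the internal logits buffer. It is filled by `llama_decode()` from `batch.logits`; there is no public function for populating it yourself. During prefill the batch holds the whole prompt, so `n_cur - 1` happens to coincide with the last batch index and works; in the later one-token batches only index 0 requested logits, so `output_ids[n_cur - 1]` is `-1` and the abort fires. Passing `-1` always selects the last output row, whatever the batch size. The same convention applies if the loop samples through the `llama_sampler` API, as in this small sketch (the sampler argument is an assumption, not taken from the post):

```cpp
// Assumed setup, not from the original post: sampling after a one-token decode.
#include "llama.h"

static llama_token sample_after_decode(llama_context * ctx, llama_sampler * smpl) {
    // idx follows the same rule as llama_get_logits_ith(): it is a batch
    // index, and -1 means "the last token that requested logits".
    return llama_sampler_sample(smpl, ctx, /*idx*/ -1);
}
```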