I want to clarify the semantics of the `LogitsTransform` function declared in flashinfer/include/flashinfer/attention/variants.cuh (line 69 in 738460f).

The `logits` parameter is templated and presumably can support `__half`. However, the computation of the value of `logits` in flashinfer/include/flashinfer/attention/variants.cuh (line 75 in 738460f) causes a compilation failure, because `operator *` cannot be resolved when the types are `__half` and `float`. The assignment will also likely break because of the implicit `float` to `__half` conversion.

The `LogitsTransform` template is invoked inside the `prefill` kernel in flashinfer/include/flashinfer/attention/prefill.cuh (line 690 in 738460f).

I discovered this issue while working on a fix for #806 and compiling kernels that were generated with the `use_fp16_qk_reductions=true` flag passed to `aot_build_utils.generate`.

I can apply a fix using a `constexpr` cast from fp16 to fp32 and vice versa, either at the call site or inside `LogitsTransform`. But before I make any mechanical changes, I wanted to clarify the intent of the implementation.
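
To make the proposed cast concrete, here is a minimal host-side sketch of one way it could look. It is not the FlashInfer code: `FakeHalf`, `to_fp32`, `from_fp32`, and `scale_logits` are hypothetical names invented for illustration, and `FakeHalf` only stands in for `__half` so that the snippet compiles without the CUDA toolkit. The idea is to widen to `float` before the multiplication and narrow back afterwards, with `if constexpr` selecting the conversion at compile time.

```cpp
#include <type_traits>

// Hypothetical stand-in for CUDA's __half so the sketch builds on the host.
// It converts to and from float only explicitly and defines no arithmetic,
// so a mixed `FakeHalf * float` expression does not compile.
struct FakeHalf {
  float value;  // a real half holds 16 bits; a float keeps the sketch simple
  explicit FakeHalf(float f) : value(f) {}
  explicit operator float() const { return value; }
};

// Hypothetical helper: widen T to float before doing arithmetic.
template <typename T>
float to_fp32(T x) {
  if constexpr (std::is_same_v<T, float>) {
    return x;  // already fp32, nothing to convert
  } else {
    // Assumption: with the real __half this would be __half2float(x).
    return static_cast<float>(x);
  }
}

// Hypothetical helper: narrow a float result back to the storage type T.
template <typename T>
T from_fp32(float x) {
  if constexpr (std::is_same_v<T, float>) {
    return x;
  } else {
    // Assumption: with the real __half this would be __float2half(x).
    return static_cast<T>(x);
  }
}

// Hypothetical transform: multiply the logits by a float scale in fp32,
// whatever type the logits are stored in.
template <typename T>
T scale_logits(T logits, float scale) {
  const float widened = to_fp32(logits);
  return from_fp32<T>(widened * scale);
}
```

Calling `scale_logits(FakeHalf(2.0f), 0.5f)` and `scale_logits(2.0f, 0.5f)` both compile, whereas `FakeHalf(2.0f) * 0.5f` does not. Whether the conversion belongs inside `LogitsTransform` or at the call site in the prefill kernel is exactly the open question above. Doing the arithmetic in fp32 also changes where rounding happens, which may or may not match the intent of `use_fp16_qk_reductions`.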