Fix compilation with FP16_QK_REDUCTION enabled. #962

Open
wants to merge 2 commits into
base: main

diptorupd
Contributor

As described in #806 and #936, setting the CMake build flag FLASHINFER_GEN_USE_FP16_QK_REDUCTIONS to "true" causes a build failure because cuda_fp16.h does not support a constexpr cast from the __half type to float. This is not only a CMake/C++ configuration issue: the same failure is triggered in the flashinfer JIT compilation path, as reported in #915.

The PR fixes #806 and #936 by adding a modified version of the FP16 header from the FP16 library that provides constexpr versions of the conversion functions. To make the conversion functions constexpr, I use std::bit_cast, which is why the required C++ standard is bumped to C++20.
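To illustrate the idea, here is a minimal sketch of a constexpr half-to-float conversion built on a bit cast. It is not the header added by this PR: the names `bits_as` and `half_bits_to_float` are hypothetical, and the input is taken as a raw `std::uint16_t` bit pattern rather than a `__half`. The sketch uses `std::bit_cast` when the standard library provides it (C++20) and otherwise assumes the `__builtin_bit_cast` compiler builtin available in recent GCC and Clang, so it can also be tried under an older standard.

```cpp
#include <cassert>
#include <cstdint>
#if __has_include(<bit>)
#include <bit>
#endif

// Hypothetical helper: reinterpret the bits of `v` as type `To` at compile time.
#if defined(__cpp_lib_bit_cast)
template <class To, class From>
constexpr To bits_as(const From& v) { return std::bit_cast<To>(v); }
#else
// Assumption: compiler builtin (GCC 11+, Clang 9+) when C++20 is not enabled.
template <class To, class From>
constexpr To bits_as(const From& v) { return __builtin_bit_cast(To, v); }
#endif

// Hypothetical name: convert an IEEE 754 binary16 bit pattern to float.
constexpr float half_bits_to_float(std::uint16_t h) {
  const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
  const std::uint32_t exp = (h >> 10) & 0x1Fu;  // 5-bit biased exponent
  std::uint32_t mant = h & 0x3FFu;              // 10-bit mantissa
  std::uint32_t bits = sign;                    // default: signed zero

  if (exp == 0x1Fu) {
    // Infinity or NaN: all-ones exponent, mantissa payload preserved.
    bits = sign | 0x7F800000u | (mant << 13);
  } else if (exp != 0) {
    // Normal number: rebias the exponent from 15 to 127 (add 112).
    bits = sign | ((exp + 112u) << 23) | (mant << 13);
  } else if (mant != 0) {
    // Subnormal half: shift until the implicit leading 1 appears.
    std::uint32_t e = 113;
    while ((mant & 0x400u) == 0) {
      mant <<= 1;
      --e;
    }
    bits = sign | (e << 23) | ((mant & 0x3FFu) << 13);
  }
  return bits_as<float>(bits);
}

// Compile-time checks: these only compile if the conversion is constexpr.
static_assert(half_bits_to_float(0x3C00) == 1.0f, "0x3C00 encodes 1.0");
static_assert(half_bits_to_float(0xC000) == -2.0f, "0xC000 encodes -2.0");
static_assert(half_bits_to_float(0x7BFF) == 65504.0f, "largest finite half");

// The same function also works at run time.
inline void runtime_self_check() {
  assert(half_bits_to_float(0x0000) == 0.0f);
}
```

Because the result is a constant expression, it can be used in `static_assert` or to initialize `constexpr` variables, which is the property the stock `__half` to float cast lacks.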

With these changes, I am able to build the C++ API with FLASHINFER_GEN_USE_FP16_QK_REDUCTIONS set to either ON or OFF.

Fixes #936
Fixes #806
