[REQUEST] Support for RTX 5090 #285
Comments
TabbyAPI can technically support any torch version, but each wheel-based dependency then needs to be built against that version. In addition, CUDA 12.8 + Windows is not yet supported by PyTorch, and the upstream issues tracking that support are still open. Since torch 2.7 is currently nightly-only, Tabby won't support it out of the box due to the unstable nature of nightly builds. However, you can build the wheel dependencies (exllamav2, flash-attn, etc.) against your own configuration, install them in a new venv, and tell Tabby to skip its own wheel installs in that venv.
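A rough sketch of that flow, assuming TabbyAPI's start script exposes a `--nowheel`-style flag to skip its automatic wheel installation (check `start.py --help` in your checkout for the exact option name):

```sh
# Fresh venv so custom-built wheels don't clash with Tabby's pinned ones
python -m venv venv
source venv/bin/activate

# Nightly torch built against CUDA 12.8 (Blackwell support)
pip install --pre torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/nightly/cu128

# Start Tabby without letting it reinstall its pinned wheels
# (--nowheel is an assumption; verify the flag name in your start.py)
python start.py --nowheel
```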
Once Torch 2.7 is stable, the wheels can be updated in Tabby's pyproject and the CUDA version can be bumped to 12.8. I'll keep this issue open for tracking.
Thanks, I'm happy to try it out and report back.
Assuming you have a Blackwell-compatible torch already installed in your tabby venv: for exllamav2, you should be able to build and install it from source against that torch. For flash-attn, activate your tabby venv and then follow the instructions to install from source here: https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#installation-and-features. This package is pretty heavy to compile from source, though, and will take a while to run.
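For reference, a hedged sketch of that source install based on the linked README (not guaranteed to be the exact steps needed for Blackwell); the `MAX_JOBS` cap keeps the build from exhausting RAM on machines with many cores:

```sh
# Inside the tabby venv, with the cu128 nightly torch already installed
pip install ninja packaging

# Build flash-attn against the installed torch; MAX_JOBS caps parallel
# compile jobs so the build doesn't exhaust RAM
MAX_JOBS=4 pip install flash-attn --no-build-isolation
```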
Thank you. The server is running; I didn't try to do anything with it yet, but here are a few commands I used to get exllamav2 to work with tabby:
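As a hedged illustration (not necessarily the same commands, and the repo location may differ), building exllamav2 from source against the nightly torch generally looks something like:

```sh
# Inside the tabby venv with the cu128 nightly torch active
git clone https://github.com/turboderp-org/exllamav2
cd exllamav2

# Build the CUDA extension against the already-installed torch
# rather than an isolated build environment
pip install --no-build-isolation .
```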
I had to do the install steps above as well, and flash attention took like 2-3 hours to compile. It heated the room pretty well. I'll update the ticket with the next findings.
I got it working. Just to double check that I compiled and installed flash attention 2 correctly: is there any way to test whether tabby is using it correctly for a model?
Check in Tabby's logs to see if you're falling back to "compatibility mode". If those messages don't show up, you're using FA2.
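Independent of Tabby's logs, a quick sanity check that the flash-attn package itself imports cleanly against the installed torch (a sketch; it only verifies the install, not that Tabby selected it):

```sh
# Should print the flash-attn version without an ImportError or
# undefined-symbol error against the installed torch
python -c "import flash_attn; print(flash_attn.__version__)"
```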
Problem
Only CUDA 12.8 supports the RTX 5090.
When trying a vanilla tabby setup with CUDA 12.x, these blockers pop up:
I tried reinstalling PyTorch after all the setup:

```sh
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
```

But then tabby is pretty upset about it:
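As a side note, a quick way to confirm whether a given torch wheel actually includes Blackwell (sm_120) support is to inspect the compiled architecture list:

```sh
# Lists the CUDA architectures the wheel was compiled for;
# Blackwell needs sm_120 in that list
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.get_arch_list())"
```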
Solution
Support the RTX 5090 so that we can use it with Tabby.
Alternatives
No response
Explanation
We cannot use Tabby with the RTX 5090 otherwise.
Examples
No response
Additional context
No response