Progress on the rewrite for older cards (Like the P40) #279

TimyIsCool · 2023-09-08T16:32:23Z

I was wondering what the current progress is on the rewrite, and whether this issue could be turned into some sort of tracker for it. Optimizations for the P40 seem to be something many people would like.

Ph0rk0z · 2023-09-10T11:14:58Z

I think V2 is in the works. I'm not sure whether it will support the P40, but then again, you have llama.cpp, which is all FP32, and I can run Q5_K_M and Q6 quants on it. If you apply the peer access patch, it even does direct transfers on Linux. With NVLink it's faster than exllama. There are some downsides in how it processes prompts and in memory efficiency, but other than that, you can use it today.
