
Whisper defaults to CPU instead of utilizing Nvidia GPU on Windows 11 #4

Open
selfAndrewKB opened this issue Feb 24, 2024 · 7 comments

@selfAndrewKB

selfAndrewKB commented Feb 24, 2024

A warning on first running the Whisper model clued me in that it was not using hardware acceleration:

UserWarning: FP16 is not supported on CPU; using FP32 instead

All I had to do to enable CUDA support was first uninstall Torch:
python -m pip uninstall torch

And reinstall with this command:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Confirm that CUDA is available in Python by running:
import torch
torch.cuda.is_available()
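
A fuller check (my own sketch, not part of the original steps) also prints the installed build string and the detected GPU, which makes it obvious whether the CUDA wheel is actually the one in use:

import torch

print(torch.__version__)                   # a CUDA build reports a suffix such as "+cu121"
print(torch.cuda.is_available())           # should print True
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # e.g. "NVIDIA GeForce RTX 3070"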

monkeyplug/whisper should now correctly use your GPU and significantly speed up operations. A YouTube video with a runtime of 10:42 took 13 minutes and 42 seconds to process on my CPU with the medium.en model. After enabling CUDA support, the same video took 3 minutes and 13 seconds on an RTX 3070, with noticeably better accuracy than the default base.en model.
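
If you want to take device selection out of the equation entirely when testing, the Whisper Python API also accepts a device argument when loading the model. This is just a sketch for experimenting outside of monkeyplug (audio.mp3 is a placeholder file name):

import torch
import whisper  # pip install openai-whisper

# Use the GPU whenever PyTorch can see one; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("medium.en", device=device)

result = model.transcribe("audio.mp3")  # placeholder path to your extracted audio
print(result["text"])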

I caught several warning messages that were raised during a job (might be related to generating timestamps?), but they don't seem to affect the operation at all:

C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:42: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation...
warnings.warn(

C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:146: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower DTW implementation...
warnings.warn(

I noticed that #3 might be in the works, which could help with this, but I thought it would be helpful to share my findings in the meantime.

PS: Whisper really is on another tier of accuracy, and it's much appreciated.

@mmguero
Owner

mmguero commented Feb 25, 2024

Interesting. On my Linux machine it was using the GPU right out of the gate with just pip install openai-whisper, without any other steps on my end (double-checked with nvidia-smi during processing).
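
For reference, a common way to watch GPU use while a job runs (a general nvidia-smi invocation, not something specific to monkeyplug) is to poll utilization and memory every second:

nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1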

@selfAndrewKB
Author

Oh, and if it helps, this is a fresh install of Windows 11, and I actually used that very same command to install Whisper after installing Python 3.12. Strange indeed.

@selfAndrewKB selfAndrewKB changed the title Whisper defaults to CPU instead of utilizing Nvidia GPU Whisper defaults to CPU instead of utilizing Nvidia GPU on Windows 11 Feb 25, 2024
@bradyj04

Are you still having this issue? I tried your steps and mine persisted.

@mmguero
Owner

mmguero commented Apr 25, 2024

Right now I don't have access to a Windows machine with a GPU, so I don't have any way to confirm or look into this.

@selfAndrewKB
Author

Are you still having this issue? I tried your steps and mine persisted.

Sorry to hear. It's been working just fine ever since. Could you provide more info about your setup? Operating system, whether you tried torch.cuda.is_available(), what it returns, any error messages you might've seen, etc.

@bradyj04

Windows 11, getting the exact same error messages as in your original post. I'm currently just using a separate Whisper program instead, so it's no big deal, and yes, torch.cuda.is_available() returns True.

@therealmichaelberna

therealmichaelberna commented Nov 15, 2024

@selfAndrewKB @bradyj04

For Windows, I had to install an NVIDIA Triton compiler build for Windows from here: https://huggingface.co/madbuda/triton-windows-builds
Command:
pip install https://huggingface.co/madbuda/triton-windows-builds/resolve/main/triton-3.0.0-cp312-cp312-win_amd64.whl
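
A quick sanity check (my suggestion, not part of the original steps) that the wheel installed and imports cleanly:

python -c "import triton; print(triton.__version__)"

This should print 3.0.0 for the wheel above.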

If you have CUDA 12.6 or higher, this bugfix also needs to be applied:

https://github.com/triton-lang/triton/pull/4588/files (see the Files changed tab and note the added and removed lines)

For me, the file I had to edit was located at:
C:\Users\User\.conda\envs\monkeyplug_312\Lib\site-packages\triton\backends\nvidia\compiler.py

After this and the PyTorch CUDA reinstall described above, it worked on Windows.

Thanks for creating this and I hope this info can help someone.
