Kernel crash with Tesla P40 GPU on CUDA 12.1, but works fine on Google Colab with CUDA 12.0 #378
Comments
faster-whisper requires CUDA 11. We don't expect it to work with CUDA 12. Since you have some CUDA 11 packages installed with pip, you could make use of them using this technique: #153 (comment)
Thanks for the comment. From the issue you sent, when I perform
Moreover, I am still a bit confused about why the package does work on Google Colab with CUDA 12.0 and the same driver. The only thing I can pinpoint for now is the cuBLAS issue, as the library is probably present on the Colab machine. Doing
on the Colab returns output, while the
I know this question kinda derailed from the
Google Colab uses CUDA 11.8. The "CUDA Version: 12.0" that you see in nvidia-smi corresponds to the CUDA version associated with the GPU driver, but it does not mean that this CUDA version is installed on the system.
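One way to see which CUDA runtime libraries a process can actually load, independent of what nvidia-smi reports, is to ask the dynamic loader directly. The following is a minimal sketch for Linux, assuming the usual shared-object names `libcublas.so.11` and `libcublas.so.12`; the helper names `probe_shared_library` and `probe_cublas` are made up for illustration and are not part of faster-whisper.

```python
import ctypes


def probe_shared_library(name):
    """Return True if the dynamic loader can open `name`, else False."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the library is missing or one of its dependencies fails to load.
        return False


def probe_cublas(majors=(11, 12)):
    """Map each CUDA major version to whether its cuBLAS library is loadable.

    Assumes Linux-style names (libcublas.so.<major>).
    """
    return {major: probe_shared_library(f"libcublas.so.{major}") for major in majors}


if __name__ == "__main__":
    for major, found in probe_cublas().items():
        status = "loadable" if found else "not found"
        print(f"cuBLAS for CUDA {major}: {status}")
```

If the CUDA 11 entry is not loadable on the machine where the kernel dies, a missing cuBLAS library is a plausible cause, regardless of the "CUDA Version" shown in the nvidia-smi header.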
The CUDA libraries installed with pip and the CUDA libraries installed on the system are two different things. If your current Python environment contains the CUDA libraries as listed above, then the technique shown in #153 (comment) should work. If for some reason you can't use the libraries installed with pip, then you should install CUDA 11 and cuDNN for CUDA 11 on the system, either by following the installation instructions from NVIDIA or by using a Docker image.
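As a rough sketch of how the pip-installed libraries can be made visible to the loader: the snippet below assumes the technique amounts to adding the library directories of the `nvidia-cublas-cu11` and `nvidia-cudnn-cu11` wheels to `LD_LIBRARY_PATH`, and that those wheels expose the importable packages `nvidia.cublas.lib` and `nvidia.cudnn.lib`. The function names are hypothetical, and the snippet returns an empty result when the wheels are not installed.

```python
import importlib.util
import os


def pip_cuda_library_dirs(packages=("nvidia.cublas.lib", "nvidia.cudnn.lib")):
    """Return the directories of the given pip-installed CUDA library packages.

    Packages that are not installed are skipped silently.
    """
    dirs = []
    for name in packages:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # A parent package (e.g. "nvidia") is missing or malformed.
            spec = None
        if spec is None or not spec.submodule_search_locations:
            continue
        dirs.extend(spec.submodule_search_locations)
    return dirs


def ld_library_path(dirs, existing=""):
    """Join `dirs` (and an optional existing value) into a loader search path."""
    parts = list(dirs) + ([existing] if existing else [])
    return os.pathsep.join(parts)


if __name__ == "__main__":
    # Print a value suitable for: export LD_LIBRARY_PATH=<output>
    print(ld_library_path(pip_cuda_library_dirs(), os.environ.get("LD_LIBRARY_PATH", "")))
```

On Linux the loader reads `LD_LIBRARY_PATH` when the process starts, so the variable generally has to be exported before launching Python or the Jupyter kernel rather than set from inside the running session.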
File "D:\Python310\lib\site-packages\faster_whisper\transcribe.py", line 573, in encode
It looks like it is also not well supported on Windows with CUDA 12.
Can it be updated to be compatible with CUDA 12?
Updating to CUDA 12 is not planned in the very short term. See an explanation here: #47 (comment).
I'm experiencing a kernel crash when running the faster-whisper model on a Tesla P40 GPU in my offline environment, while the same package/model works perfectly fine on Google Colab equipped with a Tesla T4 GPU.
Environment Details:
Offline Environment:
GPU: Tesla P40
NVIDIA-SMI: 525.105.17
Driver Version: 525.105.17
CUDA Version: 12.1
Google Colab:
GPU: Tesla T4
NVIDIA-SMI: 525.105.17
Driver Version: 525.105.17
CUDA Version: 12.0
Observations:
Multiple GitHub issues here suggested the package is optimized for CUDA 11 and not CUDA 12. However, since it works in Google Colab with CUDA 12.0, I'm curious why my offline setup with CUDA 12.1 crashes. I also don't really see anything in the logs of my Jupyter process; the kernel just dies. Since I am trying to transcribe a small audio file, and judging by the output of nvidia-smi, I can't imagine this is an OOM error.
The model works perfectly on my offline machine when running on a CPU.
Standard whisper models and other HuggingFace models operate smoothly on my GPU.
I receive the following warning when executing the model:
When I explicitly specify float16, the model crashes citing the aforementioned reason.
Dependencies
I see some interesting things in my pip freeze within my env:
These packages seem to refer to CUDA 11 stuff... I don't know if that could be an issue.
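For reference, the NVIDIA wheels present in a Python environment can also be listed programmatically. This is a small sketch, assuming the wheels follow the `nvidia-<library>-cu<major>` naming used by packages such as `nvidia-cublas-cu11`; the helper names are illustrative only.

```python
import re
from importlib import metadata


def cuda_suffix(name):
    """Return the CUDA major version encoded in a '-cuNN' suffix, or None."""
    match = re.search(r"-cu(\d+)$", name.lower())
    return int(match.group(1)) if match else None


def nvidia_pip_packages():
    """Map each installed 'nvidia-*' distribution to (version, CUDA major or None)."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name and name.lower().startswith("nvidia-"):
            found[name] = (dist.version, cuda_suffix(name))
    return found


if __name__ == "__main__":
    for name, (version, major) in sorted(nvidia_pip_packages().items()):
        print(f"{name}=={version} (CUDA major: {major})")
```

Packages with a `-cu11` suffix ship CUDA 11 builds of the libraries, whatever CUDA version the driver reports.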
I'd appreciate insights into why the kernel crashes in my offline setup, even though another environment (Google Colab) that also reports CUDA 12 doesn't experience this issue. It seems like there might be nuances with CUDA 12.1 or maybe some environment configuration of my machine.
Thanks in advance!