[WIP] Add initial GPU support #4
Conversation
Closes #3
Just wanted to leave my two cents here (I did not try Piper):
Piper does not work because of this: rhasspy/rhasspy3#49
Whisper is still targeting 20.04; is there a reason for that?
This may need to be its own image, since the majority of users would not want the CUDA version.
Could this be split into two tickets, one for whisper and one for piper? The whisper portion is in reality the more useful of the two and benefits more from this feature, especially if piper is experiencing issues.
@wdunn001 The documentation at https://github.com/guillaumekln/faster-whisper/ says it requires cuDNN 8 for CUDA 11, and for those versions of CUDA and cuDNN the highest Ubuntu version available is 20.04. I had to look for it, because it was not working with the image I set for the other containers, sadly.
Sorry, editing because I misunderstood your comment. But I guess for better maintainability the solution we add for one should be the same as for the others, which is why I think it is better to have the conversation in a single issue and PR.
And I'll try to add porcupine1 too.
Awesome! I am happy to help if you need anything. Would we want to add the Docker arguments for the CUDA image to the documentation here?
I added the changes. And yes, of course we should document this. Also, I was thinking: should we add a docker-compose.yml file?
But right now the README.md only documents pulling the images, not building them, so that will depend on the tags the maintainer might want to use. Should we add building instructions to the README.md?
I think so, for sure; we can create a contributors section. I'll work on it. I will be building it for the first time this weekend, so I'll try to document the process.
I will give you the docker-compose files and a starting point.
I just added it, tell me how it works for you; you can create your own docker-compose.x.yml file for your use case. I have not added porcupine1 to the docker compose because it uses the same port as openwakeword, so for that particular case it could be added in the custom extend file, as sketched below.
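For that case, a minimal extend file might look like this (a sketch; the file name, host port 10500, and command are assumptions, not part of this PR):

```yaml
# docker-compose.porcupine1.yml -- hypothetical extend file
# Publishes porcupine1 on host port 10500 to avoid clashing with openwakeword
services:
  porcupine1:
    image: rhasspy/wyoming-porcupine1
    ports:
      - "10500:10400"
    command: ["--uri", "tcp://0.0.0.0:10400"]
```

It would then run with `docker compose -f docker-compose.yml -f docker-compose.porcupine1.yml up -d`.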
OK, so I am getting an error deploying this via compose or run:

```
usage: main.py [-h] --model {tiny,tiny-int8,base,base-int8,small,small-int8,medium,medium-int8} --uri URI --data-dir DATA_DIR [--download-dir DOWNLOAD_DIR] [--device DEVICE] [--language LANGUAGE] [--compute-type COMPUTE_TYPE] [--beam-size BEAM_SIZE] [--debug]
```

It needs additional params in contrast with the other build. These appear to be supplied by the run.sh file, and I see it's called in the Dockerfile. I added commands to the GPU compose file identical to those in the NOGPU version, and they work fine; I made a PR. It's only the ones in run.sh that seem to not work. I am on Ubuntu 22.04 with the latest docker, if that matters.
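For reference, a sketch of such a command override in the GPU compose file (argument names taken from the usage message above; the values are examples, not necessarily the ones from the PR):

```yaml
services:
  whisper:
    # Required arguments per the usage message: --model, --uri, --data-dir
    command: ["--model", "base-int8", "--uri", "tcp://0.0.0.0:10300", "--data-dir", "/data", "--download-dir", "/data"]
```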
This is weird; according to the documentation, the only things not extended should be
I needed to add
New to contributing, happy to hear thoughts. |
I rebased with the last changes from master and fixed the typos in the README file. I don't think we need to create another branch; in the meanwhile you can just have an extend file where you use GPU options for whisper and openwakeword, and non-GPU for piper. And regarding /var/data, I am generally against storing user data in a system folder. Also, passing the whole folder to the docker container might load a lot of data that is not needed from other applications.
@edurenye agreed, using CPU for piper seems to be more than sufficient. I am still experiencing issues with openwakeword, but it may just be my environment. I'll pull down the changes here and try again. I'll push any fixes I find to the PR on your branch.
piper/GPU.Dockerfile
@@ -0,0 +1,35 @@
FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04
Perhaps we remove this file in the interim to get rid of dead code?
I do not see it as dead code; when this issue gets fixed, it should just work right away.
ok sounds good
porcupine1/GPU.Dockerfile
@@ -0,0 +1,32 @@
FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04
Remove it to get rid of dead code?
I do not see it as dead code either; the people that want to use it can just extend the docker compose, or use it directly with docker run as documented here: https://github.com/rhasspy/wyoming-porcupine1/blob/master/README.md, but adding the CUDA options, for example as sketched below.
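(A sketch; the image tag is hypothetical and assumes an image built from porcupine1/GPU.Dockerfile:)

```sh
# Build the GPU variant locally, then run it with GPU access
docker build -f porcupine1/GPU.Dockerfile -t wyoming-porcupine1-gpu porcupine1/
docker run -it -p 10400:10400 --gpus all \
  wyoming-porcupine1-gpu --uri 'tcp://0.0.0.0:10400'
```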
sounds good
.gitignore
@@ -0,0 +1,12 @@
# OpenWakeWord
Perhaps we reference managed volumes instead to prevent this? i.e.:

```yaml
volumes:
  openwakeword-data:
  whisper-data:
  piper-data:
```

This is what I did in my version.
We could also add a -gpu suffix for volumes connected to GPU-enabled instances in the GPU compose file, so that we can keep data separate between instance types, as shown below.
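e.g. (hypothetical names):

```yaml
volumes:
  openwakeword-data-gpu:
  whisper-data-gpu:
  piper-data-gpu:
```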
Do you mean non-bind mounts? But then adding custom models (thinking mainly about OpenWakeWord here) is hard; with bind mounts you can just move the model to that directory. Also, I don't think there will be a case where you want to move from GPU to non-GPU and keep the same models, but probably I am wrong there.
I think I agree with you here; probably the best way is to not bind them by default, and then you can bind them by extending the docker compose and pointing wherever you have the custom model, for example as in the sketch below.
Or maybe we could look at passing it as a parameter; I haven't looked into it. I'm still fighting to generate the custom model, actually.
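Something like this in a user's extend file, perhaps (a sketch; the host path and the --custom-model-dir flag are assumptions, not verified against wyoming-openwakeword):

```yaml
services:
  openwakeword:
    volumes:
      # Hypothetical bind mount for user-provided wake word models
      - ./my-models:/custom
    command: ["--uri", "tcp://0.0.0.0:10400", "--custom-model-dir", "/custom"]
```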
```
docker compose down
```
### Run with GPU |
Should we reference documentation on how to set up Docker for GPU? (I can of course add it in a separate PR.)
Yes, good idea!
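For reference, NVIDIA's Container Toolkit installation guide covers this; a minimal sketch for Debian/Ubuntu, assuming the NVIDIA apt repository is already configured:

```sh
# Install the toolkit, wire it into docker, and restart the daemon
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Sanity check: the GPU should be visible from inside a container
docker run --rm --gpus all nvidia/cuda:12.1.1-base-ubuntu22.04 nvidia-smi
```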
Good finding! It was not documented, but that parameter exists in https://github.com/rhasspy/wyoming-faster-whisper/blob/master/wyoming_faster_whisper/__main__.py
Can you resolve the conflicts? I would love to see the improvements from using the GPU directly :)
Doesn't work with piper, since wyoming-piper doesn't declare the `--cuda` argument.
@edurenye when I try to do
for me. When I comment out
where did you add that parameter?
did you add it in your new commits? I can't see it.
@spitfire There are problems with piper that I'm trying to sort out right now, but
a quick
Whisper fails for me with the image you've specified:
it works with:
With the updates I made, Whisper should work now, but piper is still getting this error: rhasspy/wyoming#9
At least piper is really trying to use CUDA now.
Also I tested OpenWakeWord and I'm getting the following error:
Seems like https://github.com/rhasspy/wyoming-openwakeword is quite broken, judging from the number of issues I see in the repository. So I just went back to using
Same here. Didn't mention it, since it was quite obviously broken and I use wake word locally on my satellites.
I don't even know what half of these tools do, TBH.
Can you guys try out https://github.com/AnkushMalaker/wyoming-openwakeword/tree/master? It's PR #39 on https://github.com/rhasspy/wyoming-openwakeword
Can this be merged and a container release published? I want to use my GPU with the Whisper container, because it's otherwise unbearably slow.
You can already run it from the repo https://github.com/edurenye/wyoming-addons-gpu/tree/gpu using docker-compose, just like I am doing.
Thanks for the prompt response. I'm not using Docker, I'm using Podman. How do I run this directly as a
you can run it using
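(The original snippet was elided here; a plausible sketch with Podman, assuming the gpu branch contains a whisper/GPU.Dockerfile and that GPU access is already configured:)

```sh
git clone -b gpu https://github.com/edurenye/wyoming-addons-gpu.git
cd wyoming-addons-gpu
# Build and run the whisper GPU image directly, without docker-compose
podman build -f whisper/GPU.Dockerfile -t wyoming-whisper-gpu whisper/
podman run --rm -p 10300:10300 --device nvidia.com/gpu=all \
  wyoming-whisper-gpu --model base-int8 --uri tcp://0.0.0.0:10300 --data-dir /data
```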
Correct, I can do that. That is exactly what I don't want to do: clone some repo on my production computer and run a build from it. I was hoping there would be a container image I could test, but if there isn't, then I guess I'll have to wait until all of this is merged.
It's exactly as safe as running a pre-built container. You can review everything before you run it, and nothing is installed on the host system. Just like with pre-built images, docker downloads a base image and overlays or puts stuff inside it.
I found a way to run the build process on my build server using Podman. Thanks for your kind help.
Didn't work.
I'm using the latest NVIDIA driver DKMS on Fedora. Not sure what could be going on here.
In container:
Did you install the NVIDIA Container Toolkit for Docker?
Nope. Now I have installed it, and it works. Thanks for the tip.
For the curious, how I solved my problem:
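(The original commands were lost here; a plausible reconstruction based on NVIDIA's CDI workflow for Podman on Fedora:)

```sh
# Install the NVIDIA Container Toolkit and generate a CDI spec for the GPU
sudo dnf install -y nvidia-container-toolkit
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```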
Then, to run the container:
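(Also a reconstruction; the image name wyoming-whisper-gpu is hypothetical:)

```sh
# Expose the GPU to the container via CDI
podman run --rm -p 10300:10300 --device nvidia.com/gpu=all \
  wyoming-whisper-gpu --model base-int8 --uri tcp://0.0.0.0:10300 --data-dir /data
```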
WOW. The difference is overwhelming and totally worth doing: from 7 seconds for a sentence with 15 words to almost no time at all. It makes the Voice Assistant Preview Edition faster than Google Home or Amazon Alexa. My self-built Whisper container uses the Ubuntu 24.04 base. I recommend basing the container on that version instead of the ancient 22.04 it currently uses.
This is a work in progress.
I think for whisper it is working, but I'm not sure how to check it.
And for piper it is giving me the error `unrecognized arguments: --cuda`, but I got the instructions from here: https://github.com/rhasspy/piper. At the end it says that it should work by just installing `onnxruntime-gpu` and running piper with the `--cuda` argument. What am I missing?
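For context, the piper README instructions reduce to roughly this (a sketch; the model name is only an example):

```sh
# Per the piper README: GPU support needs onnxruntime-gpu plus the --cuda flag
pip install onnxruntime-gpu
echo 'Hello world!' | piper --model en_US-lessac-medium --output_file hello.wav --cuda
```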
I guess this will conflict with those that just want to use the CPU; how can we handle that? Making different images? E.g. piper and piper-gpu, as sketched below.
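One option, sketched here with hypothetical tag names, would be to build two tags from the two Dockerfiles:

```sh
# Hypothetical tags -- the actual naming is up to the maintainer
docker build -f piper/Dockerfile     -t rhasspy/wyoming-piper:latest     piper/
docker build -f piper/GPU.Dockerfile -t rhasspy/wyoming-piper:latest-gpu piper/
```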