Issue building DockerFile #315
Comments
@hobodrifterdavid Thanks for bringing up the issue. The documentation is a bit outdated. Can you please try the latest main branch and this Docker command instead:

docker run --name wordcab-transcribe --gpus all --shm-size 1g --restart unless-stopped -p 5001:5001 -e WORDCAB_TRANSCRIBE_API_KEY="x" -e WHISPER_MODEL="medium" -e WHISPER_ENGINE="faster-whisper-batch" -e ALIGN_MODEL="tiny" -e DIARIZATION_BACKED="longform-diarizer" -e COMPUTE_TYPE="float16" -e DEBUG="True" -e USERNAME="admin" -e PASSWORD="password" -e OPENSSL_KEY="0123456789abcdefghijklmnopqrstuvwyz" -e WINDOW_LENGTHS="2.0,1.5,1.0,0.75,0.5" -e SHIFT_LENGTHS="1.0,0.75,0.625,0.5,0.25" -e TENSORRT_LLM_VERSION="0.9.0.dev2024032600" wordcab-transcribe

The environment variables are from the .env file; feel free to customize.
On the second machine, I'm able to build if I add ipython to requirements.txt. The 'docker run' command in the readme does start the container successfully, and I'm able to process a request, but it errors out if I try to use the VAD. It seems okay with the updated command you sent. On the first machine, I still get the illegal memory access, but I will wipe the machine and try again.

I have a few questions. :)

Is there a preferred backend for processing a long file over multiple GPUs?

In your docs, TensorRT-LLM doesn't allow passing a prompt. The prompt is useful for nudging the model towards outputting zh-CN or zh-TW, as there is only a single supported Chinese language code for Whisper. Although, I guess machine translation as a post-processing step might be a reasonable way to handle this.

Faster-Whisper has a length_penalty parameter that, as I understand it, increases the probability of the 'end of segment' token the longer the segment gets. I think it's useful for pushing the output towards shorter segments/subs. Could it be exposed in the API? The current output often gives segments that are too long to show as subtitles.

By the way, I noticed today that stable-ts has a set of functions for splitting and merging subs, although a proper sentence segmenter would additionally be helpful.
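To make the subtitle-length concern concrete, here is a minimal sketch of splitting an over-long segment into shorter cues after transcription. It assumes word-level timestamps are available as `(text, start, end)` tuples; `split_into_cues` and `max_chars` are hypothetical names used only for illustration and are not part of wordcab-transcribe, Faster-Whisper, or stable-ts. It joins words with spaces, so for Chinese the join would need to drop the separator.

```python
from typing import List, Tuple

Word = Tuple[str, float, float]  # (text, start_seconds, end_seconds)
Cue = Tuple[float, float, str]   # (start_seconds, end_seconds, text)


def split_into_cues(words: List[Word], max_chars: int = 42) -> List[Cue]:
    """Greedily pack consecutive words into cues of at most max_chars characters.

    Hypothetical helper: a single word longer than max_chars still gets
    its own cue, so that cue may exceed the limit.
    """
    cues: List[Cue] = []
    current: List[Word] = []

    def flush() -> None:
        # Emit the words collected so far as one cue spanning their timestamps.
        text = " ".join(w[0] for w in current)
        cues.append((current[0][1], current[-1][2], text))

    for word in words:
        candidate = " ".join(w[0] for w in current + [word])
        if current and len(candidate) > max_chars:
            flush()
            current = [word]
        else:
            current.append(word)

    if current:
        flush()
    return cues


if __name__ == "__main__":
    demo = [("hello", 0.0, 0.5), ("world", 0.5, 1.0), ("again", 1.0, 1.5)]
    for start, end, text in split_into_cues(demo, max_chars=11):
        print(f"{start:.2f} --> {end:.2f}  {text}")
```

This only caps cue length by character count; it does not look at punctuation or sentence boundaries, which is where a sentence segmenter would come in.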
@hobodrifterdavid I noticed the missing IPython as well. Check out the latest main branch that I just pushed; it should resolve a few issues. I kind of now prefer the Whisper engine I just added, Use the edited FastAPI docs are a bit weird for list input, so if you want to add vocab you'll need to use
I wiped the first machine, and it runs fine now. I didn't see the length_penalty param in the docs yet. Is the Silero VAD used? Do you know how it compares to other VADs (NeMo etc.) in different languages?
I think you might not have pushed the length_penalty. 👀🙂
Hello. This project looks very interesting. I hit some issues building the Dockerfile as described in the readme:
On the first machine (Ubuntu Server 22 LTS, 4x 3090), the build process completed, but I got an 'illegal memory access' error on startup, I think from a CUDA library. This machine previously had a modified nvidia driver for P2P access, so it's possible it's not your issue. (tinygrad/open-gpu-kernel-modules#4)
On the second machine (Ubuntu Server 22 LTS, 1x 3090), I initially had an error about the specific version of openssl not being available or compatible. I removed the version number specified in the Dockerfile, and the build continued. But the latest error is "ModuleNotFoundError: No module named 'IPython'".
Just a heads up, ideally I'd be able to help you debug.