Greater error when converted via ctranslate2 #1168
Comments
I added some sample transcriptions. My fine-tuned model on
|
In the CT2 conversion, remove the quantization and try again. |
I have the same issue. It remained the same after removing the quantization. |
After comparing the encoder_output, I found a large difference between whisper and faster-whisper inference; the outputs are totally different. I don't know why. Could you help with this? @MahmoudAshraf97 @hforghani |
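One way to put a number on such a mismatch is to dump the encoder output from both implementations for the same audio and compare the values directly. The following is only a minimal sketch under assumptions: `compare_outputs` is a hypothetical helper (not part of whisper or faster-whisper), and it assumes both outputs have already been flattened into plain lists of floats with the same shape and ordering.

```python
import math
from typing import Sequence


def compare_outputs(a: Sequence[float], b: Sequence[float]) -> dict:
    """Compare two flattened encoder outputs of equal length.

    Hypothetical helper: `a` and `b` are assumed to be the same tensor
    (same audio, same shape) dumped from two implementations.
    """
    if len(a) != len(b):
        raise ValueError(f"size mismatch: {len(a)} vs {len(b)} values")
    if not a:
        raise ValueError("empty inputs")

    # Element-wise deviations between the two dumps.
    diffs = [abs(x - y) for x, y in zip(a, b)]
    max_abs_diff = max(diffs)
    mean_abs_diff = sum(diffs) / len(diffs)

    # Cosine similarity: 1.0 means both vectors point in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cosine = dot / (norm_a * norm_b) if norm_a and norm_b else float("nan")

    return {
        "max_abs_diff": max_abs_diff,
        "mean_abs_diff": mean_abs_diff,
        "cosine": cosine,
    }
```

A cosine close to 1 with small absolute differences would point towards ordinary precision noise, while a low cosine would suggest that the inputs or weights themselves differ (for example the audio features or the converted checkpoint). That interpretation is a rule of thumb, not a diagnosis.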
I have removed the quantization, but there is still too much hallucination.
|
Compare with openai-whisper. |
The third row of the table in this comment is related to |
In that table openai-whisper is similar to faster-whisper. |
All SpeechBrain settings:
My inference uses:
res = asr_model.transcribe_file(voice_file, task="transcribe", use_torchaudio_streaming=True)
Code of the method
All faster-whisper settings:
Update: |
Better to ask SpeechBrain; they will know more about it. The aim here is more or less to replicate the OpenAI implementation, not some SpeechBrain function. |
@nullscc
from faster_whisper import WhisperModel, BatchedInferencePipeline
model = WhisperModel("tmp_whisper_ft_ctranslate2", device="cuda")
batched_model = BatchedInferencePipeline(model=model)
segments, info = batched_model.transcribe(voice_file, language="your_lang")
Some config may differ in |
@hforghani No, I tried both WhisperModel and BatchedInferencePipeline and still get worse results. @MahmoudAshraf97 In my situation, I fine-tuned the model and always decode using openai-whisper. Before fine-tuning, I got nearly the same result when decoding with the pretrained openai-whisper model. But after fine-tuning, I get worse results when decoding with the CTranslate2-converted model than when running inference with the openai-whisper code. I have no idea why so far. |
@nullscc Did you find the problem? I am experiencing the exact same issue when I compare the performance of a fine-tuned Whisper model between faster-whisper and the original whisper. Looking at the logits, they differ between the two implementations. I think this is some numerical issue; I will investigate further and open an issue in the coming days. I am just wondering whether you found a solution, since it sounds like a similar problem. |
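To tell harmless numerical noise apart from a difference that actually changes the transcript, it can help to check at which decoding step the most likely token first differs. The sketch below is an assumption-laden illustration: `first_divergence` is a hypothetical helper, it assumes greedy decoding, and it assumes both implementations were fed the same token prefix so that the logits are comparable step by step.

```python
from typing import Optional, Sequence


def first_divergence(
    logits_a: Sequence[Sequence[float]],
    logits_b: Sequence[Sequence[float]],
) -> Optional[int]:
    """Return the first step whose argmax token differs, or None if none does.

    Each argument is a list of per-step logit rows (one float per vocabulary
    entry). Only the steps present in both lists are compared.
    """
    for step, (row_a, row_b) in enumerate(zip(logits_a, logits_b)):
        # Index of the highest logit = token chosen by greedy decoding.
        top_a = max(range(len(row_a)), key=row_a.__getitem__)
        top_b = max(range(len(row_b)), key=row_b.__getitem__)
        if top_a != top_b:
            return step
    return None
```

If the logits differ slightly but this returns `None`, the mismatch would not affect greedy output; an early divergence step would be a reason to look at the encoder output or the converted weights first.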
I fine-tuned a Whisper large-v3 model via the speechbrain framework. I want to convert it to a faster-whisper model and run inference on it via faster-whisper==1.0.3. For this purpose I first saved the model and weights:
Then I converted the model via ctranslate2==4.5.0 to faster-whisper format following this instruction in fp16 quantization:
After that I ran inference on it:
I ran this inference on a dataset containing 400 samples and averaged the WER and CER, but I got higher error rates than with speechbrain:
Why does the converted model in faster-whisper format obtain far greater error rates than speechbrain? You may think it is due to the fp16 quantization, but the base model Whisper-large-v3 with the same quantization on faster-whisper achieves almost equal error rates in comparison with openai-whisper.
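For anyone reproducing the comparison, the way WER and CER are averaged matters: the mean of per-sample rates and the corpus-level rate (total errors over total reference length) can differ noticeably on short utterances. The following is a minimal, dependency-free sketch, not the evaluation code used in this issue; `edit_distance` and `error_rates` are hypothetical helper names, and no text normalization is applied.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(pairs: List[Tuple[str, str]]) -> dict:
    """Compute WER/CER over (reference, hypothesis) pairs in two ways:
    the mean of per-sample rates, and the corpus-level rate."""
    wers, cers = [], []
    word_errs = word_total = char_errs = char_total = 0
    for ref, hyp in pairs:
        ref_words = ref.split()
        if not ref_words:
            continue  # a rate is undefined for an empty reference
        w = edit_distance(ref_words, hyp.split())
        c = edit_distance(ref, hyp)
        wers.append(w / len(ref_words))
        cers.append(c / len(ref))
        word_errs += w
        word_total += len(ref_words)
        char_errs += c
        char_total += len(ref)
    if not wers:
        raise ValueError("no sample with a non-empty reference")
    return {
        "mean_wer": sum(wers) / len(wers),
        "mean_cer": sum(cers) / len(cers),
        "corpus_wer": word_errs / word_total,
        "corpus_cer": char_errs / char_total,
    }
```

When comparing two inference stacks, the same references, the same text normalization (casing, punctuation) and the same averaging scheme should be used for both, otherwise the error rates are not directly comparable.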