This problem mainly affects transcription of non-English audio. I have tested Japanese, Spanish, and Korean audio. If the language specified when generating subtitles matches the language of the audio (for example, Japanese audio with --language Japanese), the subtitles contain some word-selection errors for polysyllabic words. However, if you force the subtitles to be generated in English (for example, Japanese audio with --language English), the results are more accurate than with the audio's original language. The results from forcing --language English on non-English audio are also very different from those of --task translate, and more accurate. However, the timestamps may be misplaced or too short in duration, and there may also be repeated hallucinations. It would be even better if the timestamps were normal when using --language English to force transcription of non-English audio. Otherwise, the only option is to manually correct the timestamp of each subtitle line.
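To make the comparison reproducible, here is a minimal sketch that only builds the three command lines described above, without running anything. The flags --language and --task translate are the ones named in this post; the executable name `whisper`, the helper `build_commands`, and the file name `sample_ja.wav` are placeholders assumed for illustration, so substitute whatever tool and audio file you actually use.

```python
import shlex

# Assumption: the executable is called "whisper"; replace it with your own tool.
WHISPER_CMD = "whisper"


def build_commands(audio_path, audio_language):
    """Return the three invocations compared in the post, keyed by a label."""
    base = [WHISPER_CMD, audio_path]
    return {
        # Language flag matches the spoken language of the audio.
        "native": base + ["--language", audio_language],
        # Language flag forced to English on non-English audio.
        "forced_english": base + ["--language", "English"],
        # Explicit translation task, for comparison with the forced run.
        "translate": base + ["--language", audio_language, "--task", "translate"],
    }


if __name__ == "__main__":
    # Hypothetical file name, used only to show the resulting command lines.
    for label, argv in build_commands("sample_ja.wav", "Japanese").items():
        print(f"{label}: {shlex.join(argv)}")
```

Running each printed command on the same file and comparing the subtitle outputs side by side should make the timestamp differences easier to report.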
Replies: 1 comment 1 reply
Inaccurate timestamps on translations are a known issue. I think I even added a warning about it.
I have no idea about your observations with --language English vs the proper language and --task=translate. Better to ask there: https://github.com/openai/whisper/discussions