Transcriptions have no spaces - wav2vec2-xls-r-1b-spanish #51
Comments
This problem seems to be fixed by using the automatic-speech-recognition pipeline, both with and without chunking. I'm not really sure what is happening. code: transcriptions = pipe(str(correct_path)[1:]) I also tested chunking in the pipeline. My first thought was that the length of the audios was the problem, but after testing different chunking parameters and then no chunking at all, it worked perfectly. The only thing I would note is that chunking significantly increases the processing time: I saw times from twice as long up to seven times as long. In terms of transcription accuracy, the slowest setting (10 s chunks) seemed to work best, but it is not worth the computation time, since 30 s chunks, which only doubled the processing time, were almost as good.
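For anyone who wants to repeat the comparison, here is a minimal sketch of how the pipeline could be called and timed with different chunk lengths. It is not the exact code used above. The Hub ID `jonatasgrosman/wav2vec2-xls-r-1b-spanish` is an assumption (the thread only gives the short model name), and `make_transcriber` and `time_chunk_settings` are hypothetical helper names. Passing a file path to the pipeline also assumes ffmpeg is installed.

```python
import time

# Assumed Hub ID; the thread only gives the short name "wav2vec2-xls-r-1b-spanish".
MODEL_ID = "jonatasgrosman/wav2vec2-xls-r-1b-spanish"


def make_transcriber(model_id=MODEL_ID):
    """Return transcribe(audio_path, chunk_length_s) backed by a transformers ASR pipeline."""
    # Imported lazily because loading transformers and the model is expensive.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)

    def transcribe(audio_path, chunk_length_s=None):
        # Only pass chunk_length_s when chunking is requested.
        kwargs = {} if chunk_length_s is None else {"chunk_length_s": chunk_length_s}
        return asr(audio_path, **kwargs)["text"]

    return transcribe


def time_chunk_settings(transcribe, audio_path, chunk_lengths=(None, 30, 10)):
    """Transcribe one file per chunk setting and record wall-clock time and text."""
    results = {}
    for chunk in chunk_lengths:
        start = time.perf_counter()
        text = transcribe(audio_path, chunk)
        results[chunk] = {"seconds": time.perf_counter() - start, "text": text}
    return results
```

Usage would look like `time_chunk_settings(make_transcriber(), "audio.wav")`, where `"audio.wav"` is a placeholder path. The key `None` in the result means no chunking. Comparing the `"text"` entries side by side also makes it easy to check whether the spaces are present for each setting.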
Same issue using the jonatasgrosman/wav2vec2-large-xlsr-53-german model.
You should try to add a language model. See here:
@santideleon Hi, I have the same issue using the wbbbbb/wav2vec2-large-chinese-zh-cn model. Have you solved this problem?
I am working on speech-to-text for Spanish audio recordings of about 135 seconds or less, recorded with lapel microphones or VR goggles. I am using wav2vec2-xls-r-1b-spanish with the provided language model files lm.binary and unigrams.txt. They are the ones downloaded from jonatasgrosman/wav2vec2-large-xlsr-53-spanish, but based on their size they seem to be exactly the same as the ones for 1b. I originally started with the large version, but I switched to 1b for better performance.
My plan is to analyze the text with the pysentimiento pre-trained Spanish sentiment and emotion analyzer. The problem I have is that the text does not have spaces separating the words.
Is there a quick fix for this or any suggestions?
Example:
alesundíamanormalparamímelevantosobrelasochodelamañana desayunasepredesayunoalomismodeayunosquirconceriales yfrutameduchomeevistoacosasenchilavoycaminandosube lacuestahastaelaparadadelautobustyietesperoquevenga autobusesestallevaalaparadadesanlorenzocojoelmetro
code: