How about an EVNT chunk in the wave files? #1473
FrontierDK started this conversation in Ideas
Hi all :)
With the good ol' SAPI interface, events could be embedded in the .WAV files themselves, which allowed accurate timing when, for example, highlighting words on a web site. This was done with the EVNT chunk in .WAV files. How about adding it to the TTS part? Some services offer something similar through SSML tags (see the links below).
Here is a demo .WAV file which contains events.
Events.zip
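For anyone who wants to check whether a given file carries such a chunk, here is a rough sketch in Python. It only relies on the generic RIFF layout (a 4-byte chunk ID, a 4-byte little-endian size, then the payload, padded to an even length) and returns the raw bytes of the chunk. The function names are my own, and the code makes no assumption about how SAPI lays out the events inside the payload.

```python
import io
import struct


def iter_riff_chunks(stream):
    """Yield (chunk_id, payload) for each top-level chunk of a RIFF/WAVE stream."""
    header = stream.read(12)
    if len(header) < 12:
        raise ValueError("file too short to be a RIFF/WAVE file")
    riff, _total_size, form = struct.unpack("<4sI4s", header)
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        head = stream.read(8)
        if len(head) < 8:
            return  # end of file (or trailing garbage)
        chunk_id, size = struct.unpack("<4sI", head)
        payload = stream.read(size)
        if size % 2:
            stream.read(1)  # RIFF chunks are padded to an even length
        yield chunk_id, payload


def find_chunk(stream, wanted=b"EVNT"):
    """Return the raw payload of the first chunk with the given ID, or None."""
    for chunk_id, payload in iter_riff_chunks(stream):
        if chunk_id == wanted:
            return payload
    return None
```

Usage would be along the lines of `find_chunk(open("demo.wav", "rb"))`, where `demo.wav` is a placeholder for the file extracted from the zip. Decoding the payload into individual events is a separate step that depends on the SAPI event format.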
Besides text highlighting, it can also be used for other functions, such as lip syncing:
https://www.youtube.com/watch?v=ui9XT47uwxs
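To illustrate the highlighting use case: once the events are available as (offset in seconds, word) pairs, finding the word to highlight at a given playback position is a simple lookup. This is only a sketch; the `word_at` helper and the sample values in the comment are made up for illustration and are not taken from the attached file.

```python
from bisect import bisect_right


def word_at(events, position):
    """Return the word being spoken at `position` seconds.

    `events` is a list of (offset_seconds, word) pairs sorted by offset.
    Returns None if playback has not reached the first word yet.
    """
    offsets = [offset for offset, _word in events]
    index = bisect_right(offsets, position) - 1
    return events[index][1] if index >= 0 else None


# Hypothetical sample data, not the real events from the demo file:
# events = [(0.0, "Hello"), (0.4, "world")]
# word_at(events, 0.1) picks "Hello"; word_at(events, 0.4) picks "world".
```

A player would call this on every time update of the audio element and move the highlight when the returned word changes.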
More info
https://documentation.help/SAPI-5/WP_SimpleTTS.htm
https://groups.google.com/g/microsoft.public.speech_tech.sdk/c/VfotWbZ7oDQ?pli=1
https://groups.google.com/g/microsoft.public.speech_tech.sdk/c/R6vbasYoHNQ/m/1EpOaKUslloJ
https://github.com/JakobOvrum/speech4d/blob/master/source/speech/windows/sapi.d
https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-speech-synthesis-viseme?pivots=programming-language-csharp
https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html
Here are the SAPI events for the attached file (in seconds)