Replies: 4 comments 2 replies
-
I couldn't come to a conclusion about the effects of PR #356: on every run, the output is different in those problematic audio areas.
-
https://github.com/archive-r/faster-whisper/tree/compare-results This is a branch that adds fallback result logging to
@Purfview, would you mind testing this when you have a chance? Edit: I made some modifications to the proposed logic.
-
Hi @Purfview, thanks for sharing your test results.
-
After a week of testing, I saw slight improvements in English, Korean, and Japanese. I haven't tested it on other languages.
-
e786e26

In the above commit, we changed the logic for selecting decoding results to the following:

1. Find the result with the highest `avg_logprob` among those that satisfy `compression_ratio_threshold`.
2. If no such result exists, find the result with the highest `avg_logprob` among "all" results.

After applying this, there was a noticeable improvement.
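
The two-step selection described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `Candidate` and `select_result` are hypothetical names, and it assumes each decoding attempt exposes its `avg_logprob` and `compression_ratio`, with "satisfies the threshold" meaning the ratio is at or below it.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Candidate:
    """Hypothetical container for one decoding attempt."""
    text: str
    avg_logprob: float
    compression_ratio: float


def select_result(
    candidates: Sequence[Candidate],
    compression_ratio_threshold: float = 2.4,
) -> Candidate:
    """Pick the best candidate using the two-step rule."""
    if not candidates:
        raise ValueError("no decoding results to choose from")

    # Step 1: keep only results whose compression ratio is within the threshold.
    within = [
        c for c in candidates
        if c.compression_ratio <= compression_ratio_threshold
    ]

    # Step 2: if none qualify, fall back to considering all results.
    pool = within or list(candidates)

    # In both cases, the highest avg_logprob wins.
    return max(pool, key=lambda c: c.avg_logprob)


results = [
    Candidate("repetitive output", avg_logprob=-0.2, compression_ratio=5.0),
    Candidate("clean output", avg_logprob=-0.6, compression_ratio=1.8),
]
print(select_result(results).text)  # clean output
```

Note that the first candidate has the better `avg_logprob`, but it is skipped because its compression ratio exceeds the threshold.
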
So we can assume that @guillaumekln's idea is valid: when choosing a `decode_result`, it is better to consider both `avg_logprob` and `compression_ratio`, rather than just choosing the result with the highest `avg_logprob` value.

If so, instead of jumping directly from step 1 above to step 2, we could also consider the following modified logic.
1. Find the `decode_result` with the highest `avg_logprob` among those that satisfy `compression_ratio_threshold`.
2. If no such result exists:
   - Instead of finding the highest `avg_logprob` value from all results,
   - find the highest `avg_logprob` value while gradually increasing `compression_ratio_threshold`.

A simple code example is shown below.
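
A minimal sketch of the relaxed-threshold selection, written under stated assumptions rather than taken from the branch: `Candidate` and `select_result_adaptive` are hypothetical names, and the `step` and `max_threshold` values are illustrative choices that the discussion does not specify.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Candidate:
    """Hypothetical container for one decoding attempt."""
    text: str
    avg_logprob: float
    compression_ratio: float


def select_result_adaptive(
    candidates: Sequence[Candidate],
    compression_ratio_threshold: float = 2.4,
    step: float = 0.5,            # assumed increment, not from the source
    max_threshold: float = 10.0,  # assumed upper bound, not from the source
) -> Candidate:
    """Relax the compression ratio threshold until some result qualifies."""
    if not candidates:
        raise ValueError("no decoding results to choose from")

    threshold = compression_ratio_threshold
    while threshold <= max_threshold:
        # Results that satisfy the current (possibly relaxed) threshold.
        pool = [c for c in candidates if c.compression_ratio <= threshold]
        if pool:
            return max(pool, key=lambda c: c.avg_logprob)
        threshold += step

    # Nothing qualified even at the loosest bound: consider all results.
    return max(candidates, key=lambda c: c.avg_logprob)


results = [
    Candidate("looping output", avg_logprob=-0.1, compression_ratio=12.0),
    Candidate("mostly fine", avg_logprob=-0.5, compression_ratio=3.0),
    Candidate("somewhat repetitive", avg_logprob=-0.3, compression_ratio=4.8),
]
print(select_result_adaptive(results).text)  # mostly fine
```

Here a plain "highest `avg_logprob` among all results" fallback would return the looping output, while the gradual relaxation stops at the first threshold that admits any result and so prefers the candidate closest to the original bound.
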
In my tests on a variety of sources, the higher the `compression_ratio` was above the default threshold of 2.4, the more likely it was to be a badly transcribed sentence. When the `compression_ratio` was in double digits, it was very unlikely to be a correctly transcribed sentence.

By applying this alternative, we can adaptively choose the final result by increasing the bounds of the compression ratio threshold.
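
To illustrate why a high compression ratio tends to indicate a bad transcription, here is a small sketch. It assumes the ratio is the UTF-8 byte length of the text divided by its zlib-compressed length, which is how Whisper-style pipelines usually compute it. The sample strings are made up for illustration.

```python
import zlib


def get_compression_ratio(text: str) -> float:
    """UTF-8 byte length divided by zlib-compressed byte length."""
    text_bytes = text.encode("utf-8")
    return len(text_bytes) / len(zlib.compress(text_bytes))


# An ordinary sentence barely compresses, so its ratio stays low.
normal = "The quick brown fox jumps over the lazy dog near the river bank."

# A looping hallucination compresses extremely well, so its ratio is high.
looped = "thank you " * 50

print(f"normal: {get_compression_ratio(normal):.2f}")
print(f"looped: {get_compression_ratio(looped):.2f}")
```

The looped text lands well above the default threshold of 2.4, while the ordinary sentence stays below it.
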
What do you think, @guillaumekln? If you think this is helpful, I'll submit a PR.