When testing the Tesseract engine directly on the original image, without any image pre-processing, I noticed that I often got better results than with the pre-processed image in Subtitle Edit.
So I would appreciate a checkbox in the image pre-processing tab that makes OCR use the original image, with no image pre-processing applied.
I'm currently using Tesseract 5.5, but I noticed the same with version 5.3.
Here is a zip-file with an example: OCR test.zip
It contains two pictures and three screenshots of the results: one from running Tesseract on the command line and two from running Tesseract through Subtitle Edit. On the full picture, Tesseract falls apart completely on the pre-processed image, whereas it OCRs the original image from the command line pretty well.
I am aware that I might get a better result with a different binary image threshold or by inverting the colors, but for these two pictures I cannot find values that come even remotely close to the accuracy of just using the original image. And with the original image you don't even need to fine-tune the binary image threshold.
Therefore I would really appreciate this added functionality.
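For anyone who wants to reproduce the command-line side of the comparison, here is a minimal sketch in Python. It assumes the `tesseract` binary is on the PATH and that the English language data is installed. The file names `original.png` and `preprocessed.png` and the helper names are placeholders I made up, not the names used in the zip file or anything from Subtitle Edit.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Return the argument list that OCRs one image and prints the text to stdout."""
    # Using "stdout" as the output base makes tesseract print the recognized text
    # instead of writing it to a .txt file.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run tesseract on the image exactly as it is on disk.

    Returns the recognized text, or None if tesseract is not installed.
    """
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file names: the untouched subtitle image and an exported
    # pre-processed (binarized) version of the same image.
    for path in ("original.png", "preprocessed.png"):
        print(path, "->", ocr_image(path))
```

Running both files through the same command makes it easy to see whether the difference comes from the pre-processing step alone.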