Project Pipeline

Pattern Recognition project - Cairo Uni

Project Pipeline

Take input image along with its text file to train the model
Segment image to lines
Segment lines to words
Segment words to characters
Compare segmented characters with its text files
Take the valid data as training dataset
Feature extraction of the training dataset
Building the neural network model
Classification of the dataset through the model
Cross validation of the data (80% training – 20% validation)
Input any test image to be segmented and write its characters into text file

Pre-processing module

Input image is converted to a binary image then dilated to get lines and words per image. It was also found that this corrects the skew of the input images. Then by now, we have words to be segmented to characters and continue the flow of the pipeline.

Character Segmentation module

For every word image, a baseline index and maximum transition index are calculated. Baseline is the line with the most pixel density along with the whole line where most letters are on a horizontal segment with a constant width. Maximum transition is the horizontal line with maximum number of vertical transitions above the baseline. This is done by searching for the horizontal line with the maximum number of pixel value change (transition from 0 to 1 or from 1 to 0).
All available cut points are then determined but there may exist invalid ones among them.
Valid cut points are determined to be able to segment the word to characters.
Papers used: An efficient, font independent word and character segmentation algorithm for printed Arabic text (KING SAUDI UNIVERSITY)

Feature extraction/ selection module

Unsuccessful trials:

Image flattening
Vertical and horizontal projection of histogram
SIFT Descriptors

Successful (used):

Hue moments
Ratio between white and black pixels in the 4 quarters of the letter image
Center of mass
Connected components Papers used:
Using SIFT Descriptors for OCR of Printed Arabic (TEL AVIV UNIVERSITY)
Printed Arabic Optical Character Recognition Using Support Vector Machine (International conference on Mathematics and Information Technology, Adrar, Algeria)

Model selection and training modules

A fully connected neural network model is used with two hidden layers. The input layer receives feature vector of size 20. First hidden layer is 40 neurons with “Relu” activation function and the second one consists of 50 neurons along with “Relu” function also. The output layer has 29 neurons for the 29 class then determine the letter’s class using “Soft max” activation function. And for the training modules, SGD (stochastical gradient descent) is used as a training optimizer. Keras is used along with tensor flow as a backend.

Enhancements and future work (methods planned to try)

Improve character segmentation accuracy
Add more features as hole detection and determining location of dots in letters
Tune the hyper parameters of the neural network as number of hidden layers and number of neurons in each layer
Classify on handwritten Arabic OCR

Accuracy and Runtime

Training accuracy: 96%
Evaluation: 95.5%
Avgerage runtime: 5 seconds/image

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
.gitignore		.gitignore
KerasNN.py		KerasNN.py
LICENSE		LICENSE
README.md		README.md
Segmentation.py		Segmentation.py
TrainedModel.h5		TrainedModel.h5
commonfunctions.py		commonfunctions.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Project Pipeline

Pre-processing module

Character Segmentation module

Feature extraction/ selection module

Model selection and training modules

Enhancements and future work (methods planned to try)

Accuracy and Runtime

About

Releases

Packages

Languages

License

ahmedosama9777/OCR-for-Arabic-Scripts

Folders and files

Latest commit

History

Repository files navigation

Project Pipeline

Pre-processing module

Character Segmentation module

Feature extraction/ selection module

Model selection and training modules

Enhancements and future work (methods planned to try)

Accuracy and Runtime

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages