The goal of this project is to classify images of cats and dogs using a Random Forest classifier from the scikit-learn library. It explores how machine learning models can be applied to image classification tasks.
This dataset consists of images categorized into two classes: Cats and Dogs. It is suitable for binary classification in computer vision tasks.
The dataset is sourced from Kaggle:
To run this project, install the required dependencies using the following command:
pip install numpy==1.22.3 pandas matplotlib scikit-image scikit-learn joblib opencv-python "imageio[pyav]"
- Import Necessary Libraries: Import all required libraries and suppress warnings.
- Load, Resize & Store Images: Resize images from the 'PetImages' folder, categorizing them into 'Dog' and 'Cat', and store them in a serialized format (.pkl).
- Read and Preprocess Data: Load the resized images, normalize them, and split them into training and testing sets.
- Feature Extraction using HOG: Extract features using Histogram of Oriented Gradients (HOG) from the images.
- Hyperparameter Tuning: Tune the Random Forest Classifier's hyperparameters using RandomizedSearchCV to optimize model performance.
- Fit and Predict: Train the final model with the optimized parameters and predict on the test set.
- Custom Prediction Function: Define a function to predict labels for custom images using the trained model.
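
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it uses synthetic striped images in place of the PetImages data, and the image size, HOG settings, parameter ranges, label order (0 = Cat, 1 = Dog), and helper names (`make_synthetic_images`, `extract_hog`, `tune_and_fit`, `predict_image`) are all assumptions made for the example.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

IMG_SIZE = (64, 64)  # assumed target size; the notebook may use a different one


def make_synthetic_images(n_per_class=40, seed=0):
    """Stand-in for the real data: horizontal (0) vs. vertical (1) stripes plus noise."""
    rng = np.random.default_rng(seed)
    stripes = (np.arange(IMG_SIZE[0]) // 4 % 2).astype(float)
    images, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            base = np.tile(stripes[:, None], (1, IMG_SIZE[1]))  # horizontal stripes
            if label == 1:
                base = base.T  # vertical stripes
            noisy = base + rng.normal(0, 0.2, IMG_SIZE)
            images.append(np.clip(noisy, 0, 1))  # keep pixel values in [0, 1]
            labels.append(label)
    return np.array(images), np.array(labels)


def extract_hog(images):
    """Compute one HOG feature vector per grayscale image."""
    return np.array([
        hog(img, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
        for img in images
    ])


def tune_and_fit(X_train, y_train, seed=0):
    """Randomized search over a small, assumed parameter space."""
    param_distributions = {
        "n_estimators": [25, 50, 100],
        "max_depth": [None, 5, 10],
        "min_samples_split": [2, 5, 10],
    }
    search = RandomizedSearchCV(
        RandomForestClassifier(random_state=seed),
        param_distributions=param_distributions,
        n_iter=5,
        cv=3,
        random_state=seed,
    )
    search.fit(X_train, y_train)
    # With the default refit=True, best_estimator_ is retrained on all of X_train.
    return search.best_estimator_


def predict_image(model, image, class_names=("Cat", "Dog")):
    """Resize one grayscale image array, extract HOG features, and return a label."""
    image = resize(image, IMG_SIZE, anti_aliasing=True)
    features = extract_hog([image])
    return class_names[int(model.predict(features)[0])]


images, labels = make_synthetic_images()
X = extract_hog(images)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, stratify=labels, random_state=0
)
model = tune_and_fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Test accuracy on synthetic data: {accuracy:.2f}")
print("Prediction for first image:", predict_image(model, images[0]))
```

With real data, `make_synthetic_images` would be replaced by the grayscale images loaded from the serialized .pkl file; the rest of the flow stays the same. Accuracy on the synthetic stripes says nothing about performance on actual cat and dog photos.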
- README.md: This file, providing an overview of the project.
- main.ipynb: Jupyter notebook containing the entire code with explanations and steps.
- PetImages/: Folder containing original images of cats and dogs.