The aim was to design, implement and evaluate the optimal model to predict texture and colour of objects in images based on visual attributes. The training and test sets were provided by Kasim Terzic and Lei Fang.
The dataset was based on a subset of the GQA dataset for learning attributes and relations. The GQA dataset consists of images where objects are annotated in terms of bounding boxes and relevant attributes and relations. Attributes extracted from bounding boxes included colour histograms (CIELAB), histograms of oriented gradients (HOG) as well as a set of complex cell responses based on oriented Gabor filters generated by the BIMP model.
The decision was made not to train a single model for multi-output classification but to train two multiclass classifiers, each optimised for one target, colour or texture. The experimental approach revolved around a partly customised pipeline to which parameters were passed for extensive grid searches per classifier and target.
A self-contained script was written for each classifier (baseline_lr.py, dt.py, etc.). Within these scripts, hyper-parameter tuning was performed using grid search to find the classifier's optimal parameters for each target. The parameters can be found in multi-line comments at the end of each script.
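The grid search described above can be sketched as follows. This is a minimal illustration rather than the contents of any of the actual scripts: the synthetic data, the scaling step, the decision tree and the parameter grid are all assumptions chosen to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the extracted features and one target (assumption).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=8, n_classes=3, random_state=0
)

# Hypothetical pipeline: the preprocessing steps of the real pipeline are not
# described in this README.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", DecisionTreeClassifier(random_state=0)),
])

# Illustrative grid; the "clf__" prefix routes each value to the classifier step.
param_grid = {
    "clf__max_depth": [3, 5, None],
    "clf__min_samples_leaf": [1, 5],
}

search = GridSearchCV(pipeline, param_grid, scoring="balanced_accuracy", cv=3)
search.fit(X, y)

best_params = search.best_params_
print(best_params, round(search.best_score_, 3))
```

Running one such search per target yields one parameter dictionary for each classifier-target combination.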
The optimal parameters for each combination of classifier and target were saved to disk as a dictionary using pickle and can be found in the optimal-params directory. After the optimal parameters had been found, all classifier-target combinations were evaluated using cross-validation on the training set in eval_train_col.py and eval_train_tex.py.
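Saving and reloading a parameter dictionary with pickle might look like the sketch below. The dictionary contents and the file name dt_colour.pkl are hypothetical, and a temporary directory stands in for the optimal-params directory so that the example does not touch the repository.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical result of a grid search for one classifier-target combination.
best_params = {"clf__max_depth": 5, "clf__min_samples_leaf": 1}

# A temporary directory stands in for the optimal-params directory.
params_dir = Path(tempfile.mkdtemp()) / "optimal-params"
params_dir.mkdir(parents=True, exist_ok=True)
params_path = params_dir / "dt_colour.pkl"  # hypothetical file name

# Write the dictionary to disk.
with params_path.open("wb") as f:
    pickle.dump(best_params, f)

# Read it back, as an evaluation script would before refitting a model.
with params_path.open("rb") as f:
    loaded_params = pickle.load(f)

print(loaded_params)
```

If the keys follow the pipeline naming convention, a loaded dictionary can be applied to a scikit-learn pipeline with `pipeline.set_params(**loaded_params)`.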
Finally, the classifier-target combinations were evaluated on the validation set in eval_valid_col.py and eval_valid_tex.py.
The following classification algorithms were implemented using sklearn:
- Logistic regression (baseline)
- Decision tree
- K-nearest neighbours
- Multi-layer perceptron
- Random forest
- Stochastic gradient descent
- Support vector machine
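As a rough guide, the algorithms listed above plausibly correspond to the scikit-learn estimators in the sketch below, which compares them by cross-validation. The choice of classes, the short dictionary keys, the default hyper-parameters and the synthetic data are assumptions for illustration; the actual scripts may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumed mapping from the listed algorithms to scikit-learn classes.
classifiers = {
    "lr": LogisticRegression(max_iter=1000),
    "dt": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(),
    "mlp": MLPClassifier(max_iter=500, random_state=0),
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "sgd": SGDClassifier(random_state=0),
    "svc": SVC(),
}

# Synthetic stand-in for the training set (assumption).
X, y = make_classification(
    n_samples=150, n_features=10, n_informative=5, n_classes=3, random_state=0
)

# Mean balanced accuracy over 3 folds for each classifier.
cv_scores = {
    name: cross_val_score(clf, X, y, cv=3, scoring="balanced_accuracy").mean()
    for name, clf in classifiers.items()
}

for name, score in sorted(cv_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```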
Based on the validation set evaluation, the following models and parameters were selected:
Model selected for colour:
- Random forest classifier
- Validation set balanced accuracy: 0.183
Model selected for texture:
- Support vector classifier
- Validation set balanced accuracy: 0.254
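To help interpret these figures: balanced accuracy is the mean of the per-class recall, so every class counts equally regardless of how often it occurs. The sketch below computes it by hand on made-up labels; the data are purely illustrative, and the evaluation scripts presumably rely on scikit-learn's own implementation rather than this function.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / n for c, n in support.items()) / len(support)


# Made-up, imbalanced labels: a model that always predicts the majority class.
y_true = ["red", "red", "red", "red", "blue", "green"]
y_pred = ["red", "red", "red", "red", "red", "red"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)

print(f"accuracy: {plain_accuracy:.3f}, balanced accuracy: {score:.3f}")
```

In this toy case plain accuracy rewards always predicting the majority class, whereas balanced accuracy falls to the chance level of one divided by the number of classes.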
The predicted output files colour_test.csv and texture_test.csv for the two classification tasks can be found in the output directory.
Navigate into the visual directory:
cd visual
Set up a virtual environment within the directory:
python3 -m venv my_env
Activate the virtual environment:
source my_env/bin/activate
Install the requirements to your virtual environment via pip:
pip install -r requirements.txt
To run a script, your command should take the following form:
python3 <file_name>