Learning Visual Attributes

The aim was to design, implement and evaluate models that predict the texture and colour of objects in images based on visual attributes, and to select the best-performing model for each target. The training and test sets were provided by Kasim Terzic and Lei Fang.

The dataset was based on a subset of the GQA dataset for learning attributes and relations. The GQA dataset consists of images in which objects are annotated with bounding boxes and relevant attributes and relations. Visual attributes extracted from the bounding boxes included colour histograms (CIELAB), histograms of oriented gradients (HOG), and a set of complex cell responses based on oriented Gabor filters generated by the BIMP model.

Rather than training a single model for multi-output classification, two multiclass classifiers were trained, each optimised for one target: colour or texture. The experimental approach revolved around a partly customised pipeline; the parameters passed to it were included in extensive grid searches per classifier and target.
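The two-classifier design can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: the feature matrix and labels are random stand-ins, and the names `models`, `targets`, `y_colour` and `y_texture` are hypothetical. The estimator types mirror the models selected later in this README, with default rather than tuned parameters.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Random stand-in data (assumption): one shared feature matrix, two label vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 10))
y_colour = rng.integers(0, 4, size=120)   # 4 hypothetical colour classes
y_texture = rng.integers(0, 3, size=120)  # 3 hypothetical texture classes

# One independent multiclass classifier per target, instead of a single
# multi-output model. Each can be tuned separately.
models = {
    "colour": RandomForestClassifier(n_estimators=20, random_state=0),
    "texture": SVC(),
}
targets = {"colour": y_colour, "texture": y_texture}

for name, model in models.items():
    model.fit(X, targets[name])

# Each model predicts only its own target from the same features.
predictions = {name: model.predict(X[:5]) for name, model in models.items()}
print(predictions)
```

Keeping the two models separate means the classifier type and its hyper-parameters can differ per target, at the cost of ignoring any correlation between colour and texture labels.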

A self-contained script was written for each classifier (baseline_lr.py, dt.py, etc.). Within each script, hyper-parameters were tuned using grid search to find the classifier's optimal parameters for each target. The parameters can be found in multi-line comments at the end of each script.
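A grid search over a pipeline might look like the sketch below. It is an illustration only: the data is synthetic, the pipeline steps and the parameter grid are hypothetical, and the use of balanced accuracy as the search metric is an assumption (the README reports balanced accuracy only for the validation results).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the feature matrix and one target (assumption).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=8,
    n_classes=3, random_state=0,
)

# Hypothetical pipeline: feature scaling followed by the classifier under test.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", DecisionTreeClassifier(random_state=0)),
])

# Illustrative grid; the "clf__" prefix routes each parameter to the "clf" step.
param_grid = {
    "clf__max_depth": [3, 5, None],
    "clf__min_samples_leaf": [1, 5],
}

search = GridSearchCV(pipeline, param_grid, scoring="balanced_accuracy", cv=3)
search.fit(X, y)

# Dictionary of the best parameter combination found on this data.
best_params = search.best_params_
print(best_params, round(search.best_score_, 3))
```

Running the same search once per target, with that target's labels as `y`, yields one parameter dictionary per classifier-target combination.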

The optimal parameters for each combination of classifier and target, stored as a dictionary, were saved to disk using pickle and can be found in the optimal-params directory. After the optimal parameters had been found, all classifier-target combinations were evaluated using cross-validation on the training set, performed in eval_train_col.py and eval_train_tex.py.
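The save, reload and cross-validate cycle described above could be sketched as follows. The file name `rf_colour.pkl`, the parameter values, the fold count and the scoring metric are all assumptions for illustration; the sketch writes to a temporary directory rather than optimal-params so that nothing in the repository is overwritten.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Illustrative parameter dictionary, standing in for a grid search result.
optimal_params = {"n_estimators": 50, "max_depth": 5}

with tempfile.TemporaryDirectory() as tmp:
    params_path = Path(tmp) / "rf_colour.pkl"  # hypothetical file name
    # Save the dictionary to disk ...
    with open(params_path, "wb") as f:
        pickle.dump(optimal_params, f)
    # ... and load it back, as an evaluation script would.
    with open(params_path, "rb") as f:
        loaded_params = pickle.load(f)

# Synthetic stand-in for the training set (assumption).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=8,
    n_classes=3, random_state=0,
)

# Rebuild the classifier from the stored parameters and cross-validate it.
clf = RandomForestClassifier(random_state=0, **loaded_params)
scores = cross_val_score(clf, X, y, cv=5, scoring="balanced_accuracy")
print(round(scores.mean(), 3))
```

Note that pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.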

Finally, classifier-target combinations were evaluated on the validation set, performed in eval_valid_col.py and eval_valid_tex.py.

The following classification algorithms were implemented using sklearn:

Logistic regression (baseline)
Decision tree
K-nearest neighbours
Multi-layer perceptron
Random forest
Stochastic gradient descent
Support vector classifier

Based on the validation set evaluation, the following models and parameters were selected:

Model selected for colour:

Random forest classifier
Validation set balanced accuracy: 0.183

Model selected for texture:

Support vector classifier
Validation set balanced accuracy: 0.254
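For context on reading these scores: balanced accuracy is the mean of the per-class recall, so a classifier that always predicts the most frequent class scores 1/K on K classes regardless of class imbalance. The following pure-Python sketch illustrates the definition with made-up labels; the function is a hypothetical helper, not taken from the repository (sklearn offers `balanced_accuracy_score` for the same metric).

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Made-up, imbalanced labels: a model that always predicts "red".
y_true = ["red", "red", "red", "red", "blue", "green"]
y_pred = ["red", "red", "red", "red", "red", "red"]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)
print(round(plain, 3), round(score, 3))
```

Here plain accuracy is 4/6, while balanced accuracy is (1 + 0 + 0) / 3, which shows why the metric suits targets with many unevenly represented classes.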

The predicted output files colour_test.csv and texture_test.csv for the two classification tasks can be found in the output directory.

Compiling and Running Instructions

Navigate into the visual directory:

cd visual

Set up a virtual environment within the directory:

python3 -m venv my_env

Activate the virtual environment:

source my_env/bin/activate

Install the requirements to your virtual environment via pip:

pip install -r requirements.txt

To run a script, your command should take the following form:

python3 <file_name>

