Datasheet

Motivation

This dataset appears to have been created as a learning tool, though there is no clear indication of the author's original intent.

Composition

  • Each instance represents a labelled image of a monkey belonging to one of ten species.
  • There are approximately 1400 images in the dataset. (Details of the breakdown are available in the monkey_labels.txt file included with the dataset.)
  • There doesn't appear to be any missing data, and all of the classes contain roughly the same number of images.
  • There is no data included that could be considered confidential
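
The class-balance claim above can be checked directly against a local copy of the images. The following is a minimal sketch, assuming the images are stored in one subfolder per class; that layout, the function names, and the set of file extensions are assumptions for illustration and are not taken from the dataset documentation.

```python
from collections import Counter
from pathlib import Path

# Assumed image extensions; adjust to match the files actually present.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Return a Counter mapping each class folder name to its image count.

    Assumes a hypothetical layout of one subfolder per class under `root`.
    """
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files such as a labels text file
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def imbalance_ratio(counts):
    """Largest class size divided by the smallest; 1.0 means perfectly balanced."""
    sizes = list(counts.values())
    if not sizes or min(sizes) == 0:
        return float("inf")  # no classes, or at least one empty class
    return max(sizes) / min(sizes)
```

Calling `count_images_per_class` on the image folder and passing the result to `imbalance_ratio` gives a single number to compare against the "roughly the same" description; a value close to 1.0 would support it.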

Collection process

  • The data was scraped from a Google image search for each of the listed species of monkey.
  • Because the images come from Google search results, the dataset could in principle be extended with further results. The images were not randomly sampled from a larger set.
  • This data was gathered in a single pass in 2017, and was updated once in 2018.

Preprocessing/cleaning/labelling

  • No pre-processing or additional labelling was carried out on this dataset.

Uses

  • This dataset is only useful for image classification tasks. It may be useful for both broad and fine-grained classification, depending on the granularity required.
  • There is no sensitive data associated with the dataset, so there are no immediately obvious reasons to restrict its use.
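
To illustrate the difference between fine-grained and broad classification mentioned above, the sketch below collapses per-species labels into coarser groups before training. The label and group names are hypothetical placeholders, not the dataset's real classes, and the grouping itself is an assumption that would have to be chosen for the task at hand.

```python
def coarsen_labels(labels, fine_to_coarse):
    """Map each fine-grained label to its coarse group.

    Raises KeyError if a label has no entry in the mapping, so that
    unmapped classes are not silently dropped.
    """
    return [fine_to_coarse[label] for label in labels]


# Hypothetical placeholder names, not the dataset's real classes.
FINE_TO_COARSE = {
    "species_a": "group_1",
    "species_b": "group_1",
    "species_c": "group_2",
}

fine = ["species_a", "species_c", "species_b"]
print(coarsen_labels(fine, FINE_TO_COARSE))  # ['group_1', 'group_2', 'group_1']
```

Using the original labels unchanged gives the fine-grained task; applying a mapping like this one gives a broader task with fewer classes.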

Distribution

  • This dataset was obtained from Kaggle and is released into the public domain.

Maintenance

  • There is no regular maintenance of this dataset.