This repository contains a Google Colab notebook in which I explore multi-class video classification on the UCF11 video dataset. The dataset is available on Kaggle.
In this project, I built a mini 3D convolutional neural network inspired by PyTorch's 18-layer 3D ResNet model and trained it using two different preprocessing methods:
- Custom Dimensions Transformation: All videos were resized to 100x100 pixels. This approach reached 32% test accuracy after 10 epochs, with training taking approximately 1 hour.
- Inward Cropping Transformation: Videos were cropped inward by 10% of their original length to focus on the most crucial parts. This method reached 54% test accuracy after 10 epochs, with training taking about 1.4 hours.
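
The two transformations above can be sketched on a single frame in plain Python. This is a minimal illustration, not the notebook's actual code: the helper names (`inward_crop_box`, `crop_frame`, `resize_nearest`) are hypothetical, nearest-neighbour sampling is an assumed resize method, and the 10% crop is assumed to be spatial and to trim 10% of each dimension from every edge, which the description above does not specify.

```python
def inward_crop_box(height, width, fraction=0.10):
    """Return (top, left, new_height, new_width) for a centered inward crop.

    Assumption: `fraction` of each dimension is trimmed from every edge.
    """
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    top = int(height * fraction)
    left = int(width * fraction)
    return top, left, height - 2 * top, width - 2 * left


def crop_frame(frame, box):
    """Crop one frame (a list of pixel rows) to the given box."""
    top, left, new_h, new_w = box
    return [row[left:left + new_w] for row in frame[top:top + new_h]]


def resize_nearest(frame, out_h=100, out_w=100):
    """Resize one frame to out_h x out_w with nearest-neighbour sampling."""
    in_h, in_w = len(frame), len(frame[0])
    return [
        [frame[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Hypothetical 20x30 frame whose "pixels" record their own coordinates.
frame = [[(y, x) for x in range(30)] for y in range(20)]
cropped = crop_frame(frame, inward_crop_box(20, 30))  # inward cropping
resized = resize_nearest(frame, 100, 100)             # custom dimensions
```

In a real pipeline the same operation would be applied to every frame of a video, typically with tensor operations rather than nested lists.
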
Future steps include experimenting with other transformations, tuning hyperparameters, and training over more epochs.