This cutting-edge deep learning class has the following features:
- Teaches approaches to solve unsolved problems.
- Operates on the cutting edge of research.
- Some techniques are extremely new.
The lecture first tackles detecting a single object in an image.
This process is divided into a few steps:
Download the Pascal VOC 2007 dataset from the website. Then, to load the dataset, we execute the following code:
from pathlib import Path
import json
PATH = Path('<path-to-pascal-data>') # Path (from pathlib) is a handy class for handling filesystem paths
list(PATH.iterdir()) # lists all entries in the directory
# open and parse a JSON annotation file
trn_j = json.load((PATH/'one.json').open())
trn_j.keys()
Hint: You can store dictionary keys in constants to get auto-completion, as in the example below:
FILE_NAME, ID, IMG_ID, CAT_ID, BBOX = 'file_name','id','image_id','category_id','bbox'
Use collections.defaultdict in Python when you want a dictionary that creates a default value for missing keys.
trn_anno = collections.defaultdict(lambda:[])
# fill the dict: the key is the image_id and the value holds the bounding box and category id of each object.
# fastai always refers to rows before columns, so y comes before x.
# the converted bbox holds the top-left corner followed by the bottom-right corner
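The annotation-loading step described in the comments above can be sketched as follows. This is a minimal sketch, not the course's exact code. The small `trn_j` dictionary is made-up sample data standing in for the parsed JSON file, the sketch assumes each box is stored as `[x, y, width, height]` under an `annotations` key, and `hw_bb` is a hypothetical helper name. The `- 1` treats corner coordinates as inclusive pixel indices, which is also an assumption.

```python
import collections

# Made-up sample standing in for the parsed annotation JSON (assumed layout).
trn_j = {
    'annotations': [
        {'image_id': 12, 'category_id': 7, 'bbox': [155, 96, 196, 174]},
        {'image_id': 17, 'category_id': 15, 'bbox': [184, 61, 95, 138]},
        {'image_id': 17, 'category_id': 13, 'bbox': [89, 77, 314, 259]},
    ]
}

IMG_ID, CAT_ID, BBOX = 'image_id', 'category_id', 'bbox'

def hw_bb(bb):
    """Convert [x, y, width, height] to [top, left, bottom, right] (rows first)."""
    x, y, w, h = bb
    return [y, x, y + h - 1, x + w - 1]

# Each new image_id starts with an empty list, so we can append right away.
trn_anno = collections.defaultdict(list)
for o in trn_j['annotations']:
    trn_anno[o[IMG_ID]].append((hw_bb(o[BBOX]), o[CAT_ID]))

print(dict(trn_anno))
```

Because `defaultdict` creates the empty list on first access, the loop needs no "is this key already present" check.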
Hint: Create a function to translate bbox coordinates back into x, y, width, height format.
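One way to write such a function is sketched below. The name `bb_hw` is illustrative, and the sketch assumes the box is stored as `[top, left, bottom, right]` with inclusive pixel coordinates, hence the `+ 1`.

```python
def bb_hw(bb):
    """Convert [top, left, bottom, right] back to [x, y, width, height]."""
    top, left, bottom, right = bb
    return [left, top, right - left + 1, bottom - top + 1]

corners = [96, 155, 269, 350]   # made-up box in rows-first corner format
xywh = bb_hw(corners)
print(xywh)
```

The x, y, width, height form is convenient for plotting, since drawing a rectangle typically takes a corner plus a width and a height.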
fast.ai uses OpenCV because:
- OpenCV is 5-10x faster than Pillow or scikit-image.
- Pillow is not very thread safe.
- Because of the GIL (Global Interpreter Lock), Python holds one big lock that prevents two threads from running Python code at the same time. This is a disadvantage of Python.
- OpenCV doesn't use Python to launch threads. OpenCV runs its threads in C, so they are not held up by the GIL. Hence it is very fast.
Disadvantages:
- OpenCV's API is awkward to use.
So use OpenCV, as it is fast.
Matplotlib's original API was awkward to use, so its maintainers added a much more useful object-oriented API. There are few examples online that show how to use this much better API.
The object-oriented API starts from plt.subplots:
fig,ax = plt.subplots(figsize=figsize)
ax.imshow(im) # displays the image
ax.get_xaxis().set_visible(False) # hide the x axis
ax.get_yaxis().set_visible(False) # hide the y axis
It returns a figure and an axes object. Then use the axes object to set various things on the plot.
We can draw outlines on boxes via matplotlib.
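A minimal sketch of drawing an outlined box and label is shown below, using Matplotlib's `patches` and `patheffects` modules. The helper names, the colors, the placeholder image, and the `'car'` label are all illustrative assumptions, and the `Agg` backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib import patches, patheffects

def draw_outline(artist, lw):
    """Add a black stroke of width lw around a Matplotlib artist."""
    artist.set_path_effects([
        patheffects.Stroke(linewidth=lw, foreground='black'),
        patheffects.Normal(),
    ])

def draw_rect(ax, b):
    """Draw an unfilled white rectangle for b = [x, y, width, height]."""
    patch = ax.add_patch(
        patches.Rectangle(b[:2], b[2], b[3], fill=False, edgecolor='white', lw=2))
    draw_outline(patch, 4)
    return patch

def draw_text(ax, xy, txt, sz=14):
    """Write white bold text at xy with a thin black outline."""
    text = ax.text(*xy, txt, verticalalignment='top',
                   color='white', fontsize=sz, weight='bold')
    draw_outline(text, 1)
    return text

fig, ax = plt.subplots(figsize=(4, 4))
# Placeholder 2x2 "image" stretched over a 400x400 area, origin at the top left.
ax.imshow([[0.2, 0.8], [0.6, 0.4]], extent=(0, 400, 400, 0))
rect = draw_rect(ax, [155, 96, 196, 174])
label = draw_text(ax, (155, 96), 'car')
```

The black stroke behind the white line keeps the box and text readable on both light and dark parts of the image.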
Load the dataset, that is, load each image and its corresponding bounding boxes. First we look only at generating the bounding box for the largest object in an image.
We filter the annotations so that we keep only the largest bounding box for each image.
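The filtering step can be sketched as below. It assumes each image maps to a list of `(bbox, category_id)` pairs with boxes stored as `[top, left, bottom, right]`. The sample data and the names `bbox_area`, `get_largest`, and `trn_lrg_anno` are illustrative.

```python
def bbox_area(bb):
    """Area of a box given as [top, left, bottom, right]."""
    top, left, bottom, right = bb
    return (bottom - top) * (right - left)

def get_largest(annos):
    """Return the (bbox, category_id) pair whose bbox has the largest area."""
    return max(annos, key=lambda a: bbox_area(a[0]))

# Made-up annotations: image 17 contains two objects.
trn_anno = {
    12: [([96, 155, 269, 350], 7)],
    17: [([61, 184, 198, 278], 15), ([77, 89, 335, 402], 13)],
}

# Keep one annotation per image: the one with the biggest box.
trn_lrg_anno = {img_id: get_largest(annos) for img_id, annos in trn_anno.items()}
print(trn_lrg_anno)
```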
Now that we have our dataset, we load it into fast.ai by saving our data to a CSV file and then using the built-in fastai data loader to process it.
The easiest way to save to a CSV file is with a pandas DataFrame.
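A minimal sketch of writing such a CSV with pandas follows. The column names `fn` and `cat`, the file names, and the class labels are made-up examples, and an in-memory buffer stands in for a real file path.

```python
import io
import pandas as pd

# Hypothetical file names and class labels for the largest object in each image.
df = pd.DataFrame(
    {'fn': ['000012.jpg', '000017.jpg'], 'cat': ['car', 'horse']},
    columns=['fn', 'cat'],  # fix the column order explicitly
)

buf = io.StringIO()          # in-memory stand-in for a real CSV path
df.to_csv(buf, index=False)  # index=False leaves out the row-number column
csv_text = buf.getvalue()
print(csv_text)
```

Passing a path instead of the buffer to `to_csv` writes the same content to disk.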
Then we train a simple ResNet classifier to predict the class of the largest object.
Hint: Two handy debugging tools:
- pdb.set_trace() sets a breakpoint at that line of your code.
- %debug pops open the debugger at the last exception.
To predict the bounding box around the object, we construct a regression model that predicts the 4 coordinates. MSE loss is the usual choice for regression, but here we use L1 loss (mean absolute error) instead, which is much better for unbounded values. MSE squares each error and so penalises large errors too heavily; L1 grows only linearly with the error, hence L1 is better here.
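The difference between the two losses can be seen in a small worked example. This is a plain-Python sketch with made-up coordinates, not the loss code used in training.

```python
def mse_loss(pred, target):
    """Mean squared error over the coordinates."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def l1_loss(pred, target):
    """Mean absolute error (L1 loss) over the coordinates."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

target = [96.0, 155.0, 269.0, 350.0]
close = [98.0, 153.0, 271.0, 348.0]     # every coordinate off by 2
outlier = [96.0, 155.0, 269.0, 390.0]   # one coordinate off by 40

print(mse_loss(close, target), l1_loss(close, target))      # 4.0 2.0
print(mse_loss(outlier, target), l1_loss(outlier, target))  # 400.0 10.0
```

Going from the first prediction to the second, the MSE grows 100-fold while the L1 loss grows only 5-fold, so a single badly wrong coordinate dominates the MSE far more than it dominates the L1 loss.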
We use lr_find to find the learning rate.
In short, we create a single-object classifier and detector, trained with the latest techniques using a simple classification head and a simple regression head. Both are trained separately.