Dogs vs Cats Image Classification with Keras

jupyter_notebook data_science neural_networks tensorflow keras classification featured

Differentiating Dog vs Cats Using Keras

Image Classification of dogs and cats using Convolutional Neural Networks¶

In this post, I'll be showing how to train a convolutional neural network (CNN) to differentiate between dog and cat pictures. The data was downloaded from the 'Dogs vs Cats' dataset found in Kaggle. You can see it here: https://www.kaggle.com/c/dogs-vs-cats/data?select=train.zip. I'll go through the process of downloading the data, organizing it, preprocessing it, building a CNN model, assessing its performance and optimizing it. Finally, I'll use transfer learning with pretrained models to compare the performance of the built model. The models were built using Keras.

Modules used in this project¶

The modules below were used in this project:

In [6]:

# General utilities
import matplotlib.pyplot as plt
import random
import os, os.path , sys
from os import makedirs, listdir
from PIL import Image
from shutil import copyfile
import numpy as np
import pandas as pd

#Use tensorflow with GPU
import tensorflow as tf

#Keras modules
from keras.utils import to_categorical
from keras.models import Sequential, Model #Build a sequential/pretrained model
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout #Layers that make up models
from keras.optimizers import SGD #Optimizer
from keras.preprocessing.image import ImageDataGenerator #For preprocessing images
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.applications.resnet import ResNet50
from keras.applications.resnet import preprocess_input

Step 1 - Exploring and Organizing the data¶

I always like to get an initial idea of the data I'm working with before anything else. In this case, I also get to see pictures of cute cats and dogs so ...

Let's start by exploring the data a bit. I'll figure out how many pictures I have at my disposal in the test and training sets.

In [3]:

train_folder = 'train/'
test_folder = 'test/'
print('Number of pictures in train folder is:')
print(len([f for f in os.listdir(train_folder) if os.path.isfile(os.path.join(train_folder, f))]))
n_train = len([f for f in os.listdir(train_folder) if os.path.isfile(os.path.join(train_folder, f))])

print('Number of pictures in test folder is:')
print(len([f for f in os.listdir(test_folder) if os.path.isfile(os.path.join(test_folder, f))]))
n_test = len([f for f in os.listdir(test_folder) if os.path.isfile(os.path.join(test_folder, f))])

Number of pictures in train folder is:
25000
Number of pictures in test folder is:
12500

Now I'm going to plot some cat and dog images from the training set and some random images from the test set below.

In [24]:

# plot 6 random dog images from train set
fig, axs = plt.subplots(1,6,figsize =(30,30)) 
for i in range(6): #Let me plot 10 random images from the dataset
    img_sel = random.randint(0, (n_train/2)-1)
    img_name = train_folder + 'dog.' + str(img_sel) + '.jpg'
    img = Image.open(img_name)
    axs[i].imshow(img)
    axs[i].set_title(img_name, fontsize = 20)
plt.show()

# plot 6 random cat images from train set
fig, axs = plt.subplots(1,6,figsize =(30,30)) 
for i in range(6): 
    img_sel = random.randint(0, (n_train/2)-1)
    img_name = train_folder + 'cat.' + str(img_sel) + '.jpg'
    img = Image.open(img_name)
    axs[i].imshow(img)
    axs[i].set_title(img_name, fontsize = 20)
plt.show()

# plot 6 random images from test set
fig, axs = plt.subplots(1,6,figsize =(30,30)) 
for i in range(6): 
    img_sel = random.randint(1, n_test)
    img_name = test_folder + str(img_sel) + '.jpg'
    img = Image.open(img_name)
    axs[i].imshow(img)
    axs[i].set_title(img_name, fontsize = 20)
plt.show()

LOOK AT THOSE CUTIES!!! Now I'll create subdirectories for cats and dogs for both train and test directories

In [39]:

#create directories
subdirs = [train_folder, test_folder]
for subdir in subdirs:
    # create label subdirectories
    labeldirs = ['dogs/', 'cats/']
    for labldir in labeldirs:
        newdir = subdir + labldir
        os.makedirs(newdir, exist_ok=True)

Splitting the data into testing and training sets¶

Now that I have my directories set up, I'm ready to start splitting my data. I'll use an 80:20 split to start things off and see how well that works.

In [50]:

# seed random number generator for reproducibility
random.seed(1)

# Ratio of pictures to use for validation 80:20
val_ratio = 0.2

# Find files in train folder w.r.t. cwd
files = [f for f in os.listdir(train_folder) if os.path.isfile(os.path.join(train_folder, f))]

# copy training dataset images into subdirectories
for file in files:
    #print(file)
    src = train_folder + file
    dst_dir = 'train/'
    
    if random.random() < val_ratio:
        dst_dir = 'test/'
        
    if file.startswith('cat'):
        dst = dst_dir + 'cats/'  + file
        copyfile(src, dst)
    elif file.startswith('dog'):
        dst = dst_dir + 'dogs/'  + file
        copyfile(src, dst)

Let's see how many cats and dogs we have in each of our training sets

In [52]:

print('This is the number of cat pics in the train folder')
print(len([f for f in os.listdir(train_folder + 'cats/') 
           if os.path.isfile(os.path.join(train_folder+ 'cats/', f))]))
print('\n')

print('This is the number of dog pics in the train folder')
print(len([f for f in os.listdir(train_folder + 'dogs/') 
           if os.path.isfile(os.path.join(train_folder+ 'dogs/', f))]))
print('\n')

print('This is the number of cat pics in the test folder')
print(len([f for f in os.listdir(test_folder + 'cats/') 
           if os.path.isfile(os.path.join(test_folder+ 'cats/', f))]))
print('\n')

print('This is the number of cat pics in the test folder')
print(len([f for f in os.listdir(test_folder + 'dogs/') 
           if os.path.isfile(os.path.join(test_folder+ 'dogs/', f))]))

This is the number of cat pics in the train folder
9945


This is the number of dog pics in the train folder
9965


This is the number of cat pics in the test folder
2555


This is the number of cat pics in the test folder
2535

So we have a total of 12,500 pictures for cats and dogs each in the test and training folders. The ratio of test cats to total cats is 2,555/12,500 = 0.2 so we have an 80:20 split. The situation is similar for dogs. Let's continue preprocessing our data.

Scaling, reshaping and augmenting the data¶

The images that we have in our dataset have different sizes and so we'll have to ensure that they are all the same size prior to modeling. We'll also have to normalize the pixels. Lastly, we'll also do data augmentation to expand the size of the training dataset.

In the code below, I'm normalizing the pixel intensities to between 0-1, ensuring that there are at most two classes being detected (cats vs dogs), and forcing each image to be 200 x 200 pixels.

In [58]:

# create data generators
datagen = ImageDataGenerator(rescale=1.0/255.0)

# prepare iterators
train_it = datagen.flow_from_directory(train_folder,
                                       class_mode='binary',
                                       batch_size=64,
                                       target_size=(200, 200))

test_it = datagen.flow_from_directory(test_folder,
                                      class_mode='binary',
                                      batch_size=64,
                                      target_size=(200, 200))

Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.

Single layer model¶

I'll start with a model composed of a single convolution layer prior to flattening as described below.

Building the Model¶

Now that I have the data normalized and reshaped, I'll start building the model. The first model I'll build will be structured as follows:

1x Convolution + MaxPooling Layer

1x FC Layer

1x Densse Layer

1 output layer with sigmoid activation

In [69]:

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(1, activation='sigmoid'))

Now I'll set the optimizer to 'SGD' and compile the model

In [70]:

# compile model
opt = SGD(lr = 0.001, momentum = 0.9)
model.compile(optimizer = opt, loss='binary_crossentropy', metrics=['accuracy'])
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_3 (Conv2D)            (None, 200, 200, 32)      896       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 100, 100, 32)      0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 320000)            0         
_________________________________________________________________
dense_5 (Dense)              (None, 128)               40960128  
_________________________________________________________________
dense_6 (Dense)              (None, 1)                 129       
=================================================================
Total params: 40,961,153
Trainable params: 40,961,153
Non-trainable params: 0
_________________________________________________________________

Fitting the Model¶

And now I'll fit and assess the model. I'll train it for 20 epochs to start off.

In [71]:

# fit model
history = model.fit_generator(train_it, 
                              steps_per_epoch = len(train_it),
                              validation_data = test_it, 
                              validation_steps = len(test_it), 
                              epochs = 20, 
                              verbose = 1)
# evaluate model
_, acc = model.evaluate_generator(test_it, steps = len(test_it), verbose = 1)
print('> %.3f' % (acc * 100.0))

Epoch 1/20
 - 160s - loss: 0.6853 - accuracy: 0.5608 - val_loss: 0.7295 - val_accuracy: 0.5976
Epoch 2/20
 - 159s - loss: 0.6463 - accuracy: 0.6150 - val_loss: 0.7255 - val_accuracy: 0.6043
Epoch 3/20
 - 159s - loss: 0.6207 - accuracy: 0.6448 - val_loss: 0.6261 - val_accuracy: 0.6326
Epoch 4/20
 - 158s - loss: 0.6062 - accuracy: 0.6589 - val_loss: 0.6031 - val_accuracy: 0.6503
Epoch 5/20
 - 159s - loss: 0.5985 - accuracy: 0.6684 - val_loss: 0.4948 - val_accuracy: 0.6607
Epoch 6/20
 - 159s - loss: 0.5770 - accuracy: 0.6902 - val_loss: 0.7151 - val_accuracy: 0.6662
Epoch 7/20
 - 160s - loss: 0.5607 - accuracy: 0.7076 - val_loss: 0.5129 - val_accuracy: 0.6912
Epoch 8/20
 - 162s - loss: 0.5360 - accuracy: 0.7304 - val_loss: 0.5882 - val_accuracy: 0.7145
Epoch 9/20
 - 159s - loss: 0.5077 - accuracy: 0.7521 - val_loss: 0.5406 - val_accuracy: 0.7159
Epoch 10/20
 - 159s - loss: 0.4810 - accuracy: 0.7686 - val_loss: 0.6013 - val_accuracy: 0.7163
Epoch 11/20
 - 158s - loss: 0.4562 - accuracy: 0.7868 - val_loss: 0.4718 - val_accuracy: 0.7334
Epoch 12/20
 - 157s - loss: 0.4233 - accuracy: 0.8094 - val_loss: 0.3947 - val_accuracy: 0.7334
Epoch 13/20
 - 158s - loss: 0.3818 - accuracy: 0.8315 - val_loss: 0.6220 - val_accuracy: 0.7344
Epoch 14/20
 - 158s - loss: 0.3471 - accuracy: 0.8546 - val_loss: 0.5257 - val_accuracy: 0.7354
Epoch 15/20
 - 157s - loss: 0.3066 - accuracy: 0.8753 - val_loss: 0.4527 - val_accuracy: 0.7424
Epoch 16/20
 - 157s - loss: 0.2859 - accuracy: 0.8857 - val_loss: 0.4680 - val_accuracy: 0.7305
Epoch 17/20
 - 157s - loss: 0.2452 - accuracy: 0.9093 - val_loss: 0.5377 - val_accuracy: 0.7491
Epoch 18/20
 - 158s - loss: 0.2079 - accuracy: 0.9276 - val_loss: 0.6251 - val_accuracy: 0.7395
Epoch 19/20
 - 158s - loss: 0.1792 - accuracy: 0.9415 - val_loss: 0.5093 - val_accuracy: 0.7460
Epoch 20/20
 - 157s - loss: 0.1587 - accuracy: 0.9506 - val_loss: 0.8236 - val_accuracy: 0.7462
> 74.617

Hmmm, this initial model has an accuracy of 74.6%. Not great but not absolutely terrible. This process took about 53 minutes to complete.

Visualize learning curve¶

A learning curve is a plot of model learning performance over time. A learning curve allows us to determine things like whether the model is underfitting/overfitting/goodfit and whether the test/training sets are good representations of one another.

The training curve tells us how well the model is learning.

The testing curve tells us how well the model is generalizing.

For each of these datasets we'll generate learning curves for optimization and performance.

The optimization learning curve is based on the metric that is being used to optimize the model parameters (i.e., the loss function).

The performance learning curve is based on the model evaluation metrics (i.e., accuracy)

Underfitting happens when a model cannot learn from a training set. This can usually be identified by plotting the loss function over the course of the learning process. If the training loss is flat or continues to decrease by the end of the training process then underfitting is likely happening.

Overfitting happens when a model learned too much from the training set. This is a problem because it makes it limits the generality of the model and thus makes it too specialized to be useful in scenarios other than the one it is working on. In other words an overfitted model only works for a specific dataset and is generally incapable of having the same kind of success with new datasets. Over fitting generally happens when the model is trained for too long or if the model has a capacity that is too large for the dataset that it is fitting (i.e., too many parameters). Overfitting can be identified from the learning curves. Signs of overfitting are a continued loss over time (i.e., no plateau) and decrease followed by increase in the loss of the test set.

Good fits occur when the loss between the test and training sets are comparable with one another. They can be identified by a decreasing loss that eventually plateaus and by a small gap between the loss of the test and training sets.

Let's look at the entropy loss and classification accuracy of our model to get some learning diagnostics

In [107]:

fig, axs = plt.subplots(1,2,figsize =(30,10)) 

# plot loss
axs[0].set_title('Cross Entropy Loss', fontsize = 40)
axs[0].plot(history.history['loss'], color='#27308a', label='train',linestyle='dotted',linewidth=4)
axs[0].plot(history.history['val_loss'], color='#86148f', label='test',linestyle='dotted',linewidth=4)
axs[0].set_ylabel('Cross Entropy Loss', fontsize = 30) # Y label
axs[0].set_xlabel('Epoch', fontsize = 30) # X label
axs[0].tick_params(axis='x', labelsize=30)
axs[0].tick_params(axis='y', labelsize=30)
axs[0].legend(fontsize = 40)

# plot accuracy
axs[1].set_title('Classification Accuracy', fontsize = 40)
axs[1].plot(history.history['accuracy'], color='#27308a', label='train',linestyle='dotted',linewidth=4)
axs[1].plot(history.history['val_accuracy'], color='#86148f', label='test',linestyle='dotted',linewidth=4)
axs[1].set_ylabel('Classification Accuracy', fontsize = 30) # Y label
axs[1].set_xlabel('Epoch', fontsize = 30) # X label
axs[1].tick_params(axis='x', labelsize=30)
axs[1].tick_params(axis='y', labelsize=30)
axs[1].legend(fontsize = 40)
plt.show()

From these curves we can see that our model (test set) is overfitting the training set after approximately 12 epochs as evidenced by the divergence between the two curves at that stage. This is fine however since I started with a simple, single convolution/pooling layer. I'll start with a more complex build/architecture next. Overfitting can be minimized by addition of dropout layers and data augmentation.

Consolidating Routines¶

Now that I have a full model, I'll start gathering all the routines I have from before and place it into functions that will allow me to easily reference later if need be.

In [10]:

#Instantiate DataGenerator and prepare iterators
def iter_generator():
    # create data generators
    datagen = ImageDataGenerator(rescale=1.0/255.0)

    # prepare iterators
    train_it = datagen.flow_from_directory(train_folder,
                                           class_mode='binary',
                                           batch_size=64,
                                           target_size=(200, 200))

    test_it = datagen.flow_from_directory(test_folder,
                                          class_mode='binary',
                                          batch_size=64,
                                          target_size=(200, 200))
    
    return train_it, test_it
    
# Define Model
def define_SL_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model    

#Model fitting
def fit_cnn_model(model,train_it,test_it,epochs = 20):
    # fit model
    history = model.fit(train_it, 
                              steps_per_epoch = len(train_it),
                              validation_data = test_it, 
                              validation_steps = len(test_it), 
                              epochs = epochs, 
                              verbose = 1)
    # evaluate model
    _, acc = model.evaluate_generator(test_it, steps = len(test_it), verbose = 1)
    print('> %.3f' % (acc * 100.0))
    return history

#Visualize learning results
def cnn_diagnostics(history):
    fig, axs = plt.subplots(1,2,figsize =(30,10)) 

    # plot loss
    axs[0].set_title('Cross Entropy Loss', fontsize = 40)
    axs[0].plot(history.history['loss'], color='#27308a', label='train',linestyle='dotted',linewidth=4)
    axs[0].plot(history.history['val_loss'], color='#86148f', label='test',linestyle='dotted',linewidth=4)
    axs[0].set_ylabel('Cross Entropy Loss', fontsize = 30) # Y label
    axs[0].set_xlabel('Epoch', fontsize = 30) # X label
    axs[0].tick_params(axis='x', labelsize=30)
    axs[0].tick_params(axis='y', labelsize=30)
    axs[0].legend(fontsize = 40)

    # plot accuracy
    axs[1].set_title('Classification Accuracy', fontsize = 40)
    axs[1].plot(history.history['accuracy'], color='#27308a', label='train',linestyle='dotted',linewidth=4)
    axs[1].plot(history.history['val_accuracy'], color='#86148f', label='test',linestyle='dotted',linewidth=4)
    axs[1].set_ylabel('Classification Accuracy', fontsize = 30) # Y label
    axs[1].set_xlabel('Epoch', fontsize = 30) # X label
    axs[1].tick_params(axis='x', labelsize=30)
    axs[1].tick_params(axis='y', labelsize=30)
    axs[1].legend(fontsize = 40)
    plt.show()

Double layer model¶

Let me add another convolution/pooling layer to the model and see how that performs. Since I already have generator,model making, fitting, and plotting routines made I can simply call them. I'll modify the model routine from earlier to include the double layer model

In [110]:

# Define Model
def define_DL_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    opt = SGD(lr = 0.001, momentum = 0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model    

In [116]:

dl_model = define_DL_model()
train_it, test_it = iter_generator()
dl_history = fit_cnn_model(dl_model,train_it,test_it)

Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/20
312/312 [==============================] - 274s 879ms/step - loss: 0.6773 - accuracy: 0.5717 - val_loss: 0.7064 - val_accuracy: 0.5560
Epoch 2/20
312/312 [==============================] - 270s 864ms/step - loss: 0.6371 - accuracy: 0.6278 - val_loss: 0.6867 - val_accuracy: 0.6473
Epoch 3/20
312/312 [==============================] - 269s 863ms/step - loss: 0.6162 - accuracy: 0.6523 - val_loss: 0.5552 - val_accuracy: 0.6603
Epoch 4/20
312/312 [==============================] - 271s 869ms/step - loss: 0.5815 - accuracy: 0.6913 - val_loss: 0.6806 - val_accuracy: 0.6884
Epoch 5/20
312/312 [==============================] - 270s 866ms/step - loss: 0.5503 - accuracy: 0.7146 - val_loss: 0.4781 - val_accuracy: 0.7189
Epoch 6/20
312/312 [==============================] - 270s 865ms/step - loss: 0.5223 - accuracy: 0.7399 - val_loss: 0.5847 - val_accuracy: 0.7385
Epoch 7/20
312/312 [==============================] - 268s 859ms/step - loss: 0.4860 - accuracy: 0.7697 - val_loss: 0.5146 - val_accuracy: 0.7495
Epoch 8/20
312/312 [==============================] - 269s 861ms/step - loss: 0.4616 - accuracy: 0.7845 - val_loss: 0.5140 - val_accuracy: 0.7291
Epoch 9/20
312/312 [==============================] - 267s 857ms/step - loss: 0.4288 - accuracy: 0.8018 - val_loss: 0.3683 - val_accuracy: 0.7413
Epoch 10/20
312/312 [==============================] - 269s 861ms/step - loss: 0.3985 - accuracy: 0.8235 - val_loss: 0.4651 - val_accuracy: 0.7658
Epoch 11/20
312/312 [==============================] - 267s 856ms/step - loss: 0.3664 - accuracy: 0.8401 - val_loss: 0.5742 - val_accuracy: 0.7745
Epoch 12/20
312/312 [==============================] - 267s 856ms/step - loss: 0.3329 - accuracy: 0.8608 - val_loss: 0.5326 - val_accuracy: 0.7809
Epoch 13/20
312/312 [==============================] - 270s 865ms/step - loss: 0.3041 - accuracy: 0.8720 - val_loss: 0.6844 - val_accuracy: 0.7750
Epoch 14/20
312/312 [==============================] - 272s 873ms/step - loss: 0.2646 - accuracy: 0.8917 - val_loss: 0.4253 - val_accuracy: 0.7829
Epoch 15/20
312/312 [==============================] - 268s 860ms/step - loss: 0.2315 - accuracy: 0.9120 - val_loss: 0.2716 - val_accuracy: 0.7676
Epoch 16/20
312/312 [==============================] - 268s 857ms/step - loss: 0.2134 - accuracy: 0.9178 - val_loss: 0.2441 - val_accuracy: 0.7756
Epoch 17/20
312/312 [==============================] - 268s 859ms/step - loss: 0.1821 - accuracy: 0.9339 - val_loss: 0.4488 - val_accuracy: 0.7806
Epoch 18/20
312/312 [==============================] - 270s 865ms/step - loss: 0.1554 - accuracy: 0.9470 - val_loss: 0.7510 - val_accuracy: 0.7648
Epoch 19/20
312/312 [==============================] - 268s 859ms/step - loss: 0.1377 - accuracy: 0.9537 - val_loss: 0.6019 - val_accuracy: 0.7613
Epoch 20/20
312/312 [==============================] - 269s 863ms/step - loss: 0.1016 - accuracy: 0.9729 - val_loss: 0.1961 - val_accuracy: 0.7694
80/80 [==============================] - 16s 201ms/step
> 76.935

This model is marginally better with 76.9% accuracy, but it is an improvement nonetheless. We can also see that the model is overfitting based on the learning curves below, and this is happening at an earlier epoch (around 8) than before. This process took about 90 minutes to complete.

In [117]:

cnn_diagnostics(dl_history)

Triple Layer Model¶

I'll try adding one more layer just to confirm that this trend continues before trying to improve the model further in a different way.

In [120]:

# Define Model
def define_TL_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    opt = SGD(lr = 0.001, momentum = 0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model    

In [121]:

tl_model = define_TL_model()
train_it, test_it = iter_generator()
tl_history = fit_cnn_model(tl_model,train_it,test_it)

Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/20
312/312 [==============================] - 346s 1s/step - loss: 0.6556 - accuracy: 0.6092 - val_loss: 0.6442 - val_accuracy: 0.6611
Epoch 2/20
312/312 [==============================] - 341s 1s/step - loss: 0.5933 - accuracy: 0.6772 - val_loss: 0.5903 - val_accuracy: 0.7063
Epoch 3/20
312/312 [==============================] - 337s 1s/step - loss: 0.5576 - accuracy: 0.7103 - val_loss: 0.5116 - val_accuracy: 0.6809
Epoch 4/20
312/312 [==============================] - 340s 1s/step - loss: 0.5315 - accuracy: 0.7327 - val_loss: 0.7766 - val_accuracy: 0.7358
Epoch 5/20
312/312 [==============================] - 337s 1s/step - loss: 0.5007 - accuracy: 0.7581 - val_loss: 0.5519 - val_accuracy: 0.7240
Epoch 6/20
312/312 [==============================] - 337s 1s/step - loss: 0.4749 - accuracy: 0.7725 - val_loss: 0.5498 - val_accuracy: 0.7487
Epoch 7/20
312/312 [==============================] - 338s 1s/step - loss: 0.4463 - accuracy: 0.7944 - val_loss: 0.3613 - val_accuracy: 0.7817
Epoch 8/20
312/312 [==============================] - 338s 1s/step - loss: 0.4224 - accuracy: 0.8075 - val_loss: 0.7579 - val_accuracy: 0.7760
Epoch 9/20
312/312 [==============================] - 338s 1s/step - loss: 0.3949 - accuracy: 0.8230 - val_loss: 0.5986 - val_accuracy: 0.7782
Epoch 10/20
312/312 [==============================] - 337s 1s/step - loss: 0.3713 - accuracy: 0.8345 - val_loss: 0.3801 - val_accuracy: 0.7925
Epoch 11/20
312/312 [==============================] - 336s 1s/step - loss: 0.3420 - accuracy: 0.8516 - val_loss: 0.3372 - val_accuracy: 0.7963
Epoch 12/20
312/312 [==============================] - 334s 1s/step - loss: 0.3129 - accuracy: 0.8656 - val_loss: 0.3451 - val_accuracy: 0.7770
Epoch 13/20
312/312 [==============================] - 334s 1s/step - loss: 0.2982 - accuracy: 0.8731 - val_loss: 0.3409 - val_accuracy: 0.8020
Epoch 14/20
312/312 [==============================] - 336s 1s/step - loss: 0.2571 - accuracy: 0.8962 - val_loss: 0.4073 - val_accuracy: 0.8071
Epoch 15/20
312/312 [==============================] - 337s 1s/step - loss: 0.2307 - accuracy: 0.9104 - val_loss: 0.4753 - val_accuracy: 0.7984
Epoch 16/20
312/312 [==============================] - 335s 1s/step - loss: 0.2002 - accuracy: 0.9251 - val_loss: 0.4671 - val_accuracy: 0.8033
Epoch 17/20
312/312 [==============================] - 332s 1s/step - loss: 0.1827 - accuracy: 0.9308 - val_loss: 0.6437 - val_accuracy: 0.8124
Epoch 18/20
312/312 [==============================] - 334s 1s/step - loss: 0.1403 - accuracy: 0.9543 - val_loss: 0.4069 - val_accuracy: 0.8059
Epoch 19/20
312/312 [==============================] - 336s 1s/step - loss: 0.1137 - accuracy: 0.9655 - val_loss: 0.2892 - val_accuracy: 0.7980
Epoch 20/20
312/312 [==============================] - 338s 1s/step - loss: 0.0998 - accuracy: 0.9705 - val_loss: 0.6823 - val_accuracy: 0.7582
80/80 [==============================] - 19s 239ms/step
> 75.815

The final accuracy of this model is 75.8% which is slightly worse than the two layer model. However, the accuracy did reach 80% in certain epochs. We can also see that the model is overfitting based on the learning curves below, and this is happening at an earlier epoch (around 6) than before. At this point, it is clear that adding more convolution/pooling layers is unlikely to improve the model further so I'll start including other strategies like dropout layers and implementing data augmentation strategies. This process took almost two hours to complete.

In [123]:

cnn_diagnostics(tl_history)

Adding dropout layers¶

I'll try addressing the overfitting issue by incorporating a dropout layer. This is known as dropout regularization. I'll start by incorporating a dropout of 20% after each of the initial 3 layers and a subsequent dropout of 50% after the FC layer. This may not necessary improve the accuracy of the model but it should help with overfitting which will be valuable in the long run.

In [9]:

# Define Model
def define_TLD_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Dropout(0.2))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Dropout(0.2))
    model.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    opt = SGD(learning_rate = 0.001, momentum = 0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model    

In [131]:

tld_model = define_TLD_model()
train_it, test_it = iter_generator()
tld_history = fit_cnn_model(tld_model,train_it,test_it, epochs = 50)

Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/50
312/312 [==============================] - 393s 1s/step - loss: 0.6976 - accuracy: 0.5245 - val_loss: 0.6855 - val_accuracy: 0.5717
Epoch 2/50
312/312 [==============================] - 394s 1s/step - loss: 0.6824 - accuracy: 0.5531 - val_loss: 0.6780 - val_accuracy: 0.6189
Epoch 3/50
312/312 [==============================] - 394s 1s/step - loss: 0.6762 - accuracy: 0.5735 - val_loss: 0.6870 - val_accuracy: 0.5980
Epoch 4/50
312/312 [==============================] - 393s 1s/step - loss: 0.6668 - accuracy: 0.5900 - val_loss: 0.6787 - val_accuracy: 0.6265
Epoch 5/50
312/312 [==============================] - 395s 1s/step - loss: 0.6547 - accuracy: 0.6077 - val_loss: 0.6317 - val_accuracy: 0.6373
Epoch 6/50
312/312 [==============================] - 395s 1s/step - loss: 0.6484 - accuracy: 0.6114 - val_loss: 0.6318 - val_accuracy: 0.6375
Epoch 7/50
312/312 [==============================] - 394s 1s/step - loss: 0.6381 - accuracy: 0.6259 - val_loss: 0.7235 - val_accuracy: 0.6481
Epoch 8/50
312/312 [==============================] - 394s 1s/step - loss: 0.6304 - accuracy: 0.6356 - val_loss: 0.6590 - val_accuracy: 0.6548
Epoch 9/50
312/312 [==============================] - 394s 1s/step - loss: 0.6195 - accuracy: 0.6508 - val_loss: 0.5910 - val_accuracy: 0.6646
Epoch 10/50
312/312 [==============================] - 393s 1s/step - loss: 0.6047 - accuracy: 0.6649 - val_loss: 0.6096 - val_accuracy: 0.6572
Epoch 11/50
312/312 [==============================] - 394s 1s/step - loss: 0.5948 - accuracy: 0.6769 - val_loss: 0.5737 - val_accuracy: 0.6817
Epoch 12/50
312/312 [==============================] - 402s 1s/step - loss: 0.5867 - accuracy: 0.6826 - val_loss: 0.5909 - val_accuracy: 0.6994
Epoch 13/50
312/312 [==============================] - 399s 1s/step - loss: 0.5791 - accuracy: 0.6926 - val_loss: 0.5816 - val_accuracy: 0.7167
Epoch 14/50
312/312 [==============================] - 397s 1s/step - loss: 0.5625 - accuracy: 0.7061 - val_loss: 0.5693 - val_accuracy: 0.7220
Epoch 15/50
312/312 [==============================] - 399s 1s/step - loss: 0.5535 - accuracy: 0.7095 - val_loss: 0.5309 - val_accuracy: 0.7244
Epoch 16/50
312/312 [==============================] - 398s 1s/step - loss: 0.5437 - accuracy: 0.7226 - val_loss: 0.4805 - val_accuracy: 0.7346
Epoch 17/50
312/312 [==============================] - 397s 1s/step - loss: 0.5272 - accuracy: 0.7356 - val_loss: 0.6038 - val_accuracy: 0.7385
Epoch 18/50
312/312 [==============================] - 399s 1s/step - loss: 0.5194 - accuracy: 0.7418 - val_loss: 0.5695 - val_accuracy: 0.7454
Epoch 19/50
312/312 [==============================] - 406s 1s/step - loss: 0.5124 - accuracy: 0.7476 - val_loss: 0.5086 - val_accuracy: 0.7379
Epoch 20/50
312/312 [==============================] - 398s 1s/step - loss: 0.5034 - accuracy: 0.7489 - val_loss: 0.4349 - val_accuracy: 0.7587
Epoch 21/50
312/312 [==============================] - 397s 1s/step - loss: 0.4914 - accuracy: 0.7637 - val_loss: 0.4599 - val_accuracy: 0.7648
Epoch 22/50
312/312 [==============================] - 395s 1s/step - loss: 0.4841 - accuracy: 0.7655 - val_loss: 0.5894 - val_accuracy: 0.7737
Epoch 23/50
312/312 [==============================] - 395s 1s/step - loss: 0.4708 - accuracy: 0.7771 - val_loss: 0.5886 - val_accuracy: 0.7619
Epoch 24/50
312/312 [==============================] - 408s 1s/step - loss: 0.4632 - accuracy: 0.7772 - val_loss: 0.4034 - val_accuracy: 0.7806
Epoch 25/50
312/312 [==============================] - 434s 1s/step - loss: 0.4551 - accuracy: 0.7857 - val_loss: 0.4416 - val_accuracy: 0.7770
Epoch 26/50
312/312 [==============================] - 449s 1s/step - loss: 0.4435 - accuracy: 0.7925 - val_loss: 0.4382 - val_accuracy: 0.7847
Epoch 27/50
312/312 [==============================] - 434s 1s/step - loss: 0.4345 - accuracy: 0.7979 - val_loss: 0.5633 - val_accuracy: 0.7855
Epoch 28/50
312/312 [==============================] - 407s 1s/step - loss: 0.4256 - accuracy: 0.8031 - val_loss: 0.4061 - val_accuracy: 0.7752
Epoch 29/50
312/312 [==============================] - 407s 1s/step - loss: 0.4206 - accuracy: 0.8076 - val_loss: 0.4251 - val_accuracy: 0.7772
Epoch 30/50
312/312 [==============================] - 401s 1s/step - loss: 0.4060 - accuracy: 0.8174 - val_loss: 0.4502 - val_accuracy: 0.7861
Epoch 31/50
312/312 [==============================] - 409s 1s/step - loss: 0.3949 - accuracy: 0.8212 - val_loss: 0.4621 - val_accuracy: 0.7874
Epoch 32/50
312/312 [==============================] - 417s 1s/step - loss: 0.3929 - accuracy: 0.8204 - val_loss: 0.5444 - val_accuracy: 0.7949
Epoch 33/50
312/312 [==============================] - 420s 1s/step - loss: 0.3759 - accuracy: 0.8298 - val_loss: 0.4273 - val_accuracy: 0.7937
Epoch 34/50
312/312 [==============================] - 419s 1s/step - loss: 0.3717 - accuracy: 0.8341 - val_loss: 0.4550 - val_accuracy: 0.7992
Epoch 35/50
312/312 [==============================] - 400s 1s/step - loss: 0.3595 - accuracy: 0.8401 - val_loss: 0.4704 - val_accuracy: 0.8014
Epoch 36/50
312/312 [==============================] - 398s 1s/step - loss: 0.3499 - accuracy: 0.8459 - val_loss: 0.4227 - val_accuracy: 0.7965
Epoch 37/50
312/312 [==============================] - 397s 1s/step - loss: 0.3500 - accuracy: 0.8480 - val_loss: 0.4315 - val_accuracy: 0.7931
Epoch 38/50
312/312 [==============================] - 398s 1s/step - loss: 0.3346 - accuracy: 0.8512 - val_loss: 0.4268 - val_accuracy: 0.8031
Epoch 39/50
312/312 [==============================] - 399s 1s/step - loss: 0.3292 - accuracy: 0.8570 - val_loss: 0.4325 - val_accuracy: 0.8075
Epoch 40/50
312/312 [==============================] - 400s 1s/step - loss: 0.3167 - accuracy: 0.8624 - val_loss: 0.3254 - val_accuracy: 0.8157
Epoch 41/50
312/312 [==============================] - 399s 1s/step - loss: 0.3012 - accuracy: 0.8721 - val_loss: 0.3822 - val_accuracy: 0.8143
Epoch 42/50
312/312 [==============================] - 412s 1s/step - loss: 0.2936 - accuracy: 0.8755 - val_loss: 0.3717 - val_accuracy: 0.8183
Epoch 43/50
312/312 [==============================] - 401s 1s/step - loss: 0.2872 - accuracy: 0.8755 - val_loss: 0.4908 - val_accuracy: 0.8165
Epoch 44/50
312/312 [==============================] - 399s 1s/step - loss: 0.2780 - accuracy: 0.8815 - val_loss: 0.4762 - val_accuracy: 0.8104
Epoch 45/50
312/312 [==============================] - 397s 1s/step - loss: 0.2659 - accuracy: 0.8866 - val_loss: 0.4051 - val_accuracy: 0.8143
Epoch 46/50
312/312 [==============================] - 393s 1s/step - loss: 0.2696 - accuracy: 0.8856 - val_loss: 0.4317 - val_accuracy: 0.8169
Epoch 47/50
312/312 [==============================] - 390s 1s/step - loss: 0.2565 - accuracy: 0.8931 - val_loss: 0.4638 - val_accuracy: 0.8147
Epoch 48/50
312/312 [==============================] - 391s 1s/step - loss: 0.2440 - accuracy: 0.9004 - val_loss: 0.7397 - val_accuracy: 0.8116
Epoch 49/50
312/312 [==============================] - 390s 1s/step - loss: 0.2323 - accuracy: 0.9037 - val_loss: 0.2923 - val_accuracy: 0.8204
Epoch 50/50
312/312 [==============================] - 390s 1s/step - loss: 0.2291 - accuracy: 0.9049 - val_loss: 0.5022 - val_accuracy: 0.8083
80/80 [==============================] - 18s 229ms/step
> 80.825

Nice! The accuracy of the model is 80.8% and we have minimized overfitting quite a bit as evidenced by the smaller gap in the classification accuracy between the training and test sets.

In [132]:

cnn_diagnostics(tld_history)

Small Detour for GPU setup¶

The calculations and the epochs required for training are starting to get a bit too long (the last model took almost 6 hours). I've been running tensorflow using the CPU version so far. However, massive gains in computational speed can be obtained by setting up keras to work with a GPU so I'll switch my tensorflow environment to use my GPU instead.

My home setup has a AMD Ryzen 7 3700X 8-core CPU and an NVIDIA GeForce RTX 2060 GPU. I've noticed in my two most recent models that the CPU utilization was consistently 100% whch is slowing all my other tasks. Hence, the need to switch to GPU.

The instructions on setting up tensorflow with GPU can be found here: https://www.tensorflow.org/install/pip#linux

In [2]:

print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

Num GPUs Available:  1

As you can see below tensorflow is recognizing my GPU. I'll run a dummy model using the dropout model I used before really quick to ensure that the GPU is being used now.

In [11]:

test_model = define_TLD_model()
train_it, test_it = iter_generator()
test_history = fit_cnn_model(test_model,train_it,test_it, epochs = 1)

Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
312/312 [==============================] - 49s 155ms/step - loss: 0.7191 - accuracy: 0.5220 - val_loss: 0.6888 - val_accuracy: 0.5316

C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.

80/80 [==============================] - 10s 122ms/step - loss: 0.6888 - accuracy: 0.5316
> 53.163

Perfect! That's a massive reduction in computational time. An epoch with the CPU was taking approximately 400s to complete. With the GPU its taking under 60s. That's over 5 times as fast! Cool, now I can start to feasibly explore more complex models within a reasonable time frame :]

Data Augmentation¶

Through data augmentation I'll be applying various modifications to the training/validation set that will not only make it larger but should also result in a more general and flexible model.

Some of the modifications we can appply to the images in our dataset are rotations, translations, zooms, mirror flips and noise.

This can all be readily done through the ImageDataGenerator as shown below where I'll modify one of my base functions iter_generator to incorporate these new parameters:

In [73]:

#Instantiate DataGenerator and prepare iterators
def iter_generator(wsr = 0,hsr = 0, hf = False, pxx = 200, pxy = 200, pretrain = None):
    # create data generators
    if pretrain == 'VGG16':
        datagen = ImageDataGenerator(featurewise_center=True)
        datagen.mean = [123.68, 116.779, 103.939]
    elif pretrain == 'ResNet50':
        datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
    else:
        datagen = ImageDataGenerator(rescale=1.0/255.0,
                                     width_shift_range = wsr,
                                     height_shift_range = hsr,
                                     horizontal_flip = hf
                                    )

    # prepare iterators
    train_it = datagen.flow_from_directory(train_folder,
                                           class_mode='binary',
                                           batch_size=64,
                                           target_size=(pxx, pxy))

    test_it = datagen.flow_from_directory(test_folder,
                                          class_mode='binary',
                                          batch_size=64,
                                          target_size=(pxx, pxy))
    
    return train_it, test_it

I'll call this model TLDA since it uses the same architecture as the TLD but the training set has now been augmented.

In [51]:

tlda_model = define_TLD_model()
train_it, test_it = iter_generator(wsr = 0.1, hsr = 0.1, hf = True)
tlda_history = fit_cnn_model(tlda_model,train_it,test_it, epochs = 50)

Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/50
312/312 [==============================] - 186s 594ms/step - loss: 0.7318 - accuracy: 0.5142 - val_loss: 0.6913 - val_accuracy: 0.5589
Epoch 2/50
312/312 [==============================] - 186s 595ms/step - loss: 0.6865 - accuracy: 0.5485 - val_loss: 0.6891 - val_accuracy: 0.5473
Epoch 3/50
312/312 [==============================] - 185s 593ms/step - loss: 0.6839 - accuracy: 0.5531 - val_loss: 0.6899 - val_accuracy: 0.5456
Epoch 4/50
312/312 [==============================] - 185s 594ms/step - loss: 0.6866 - accuracy: 0.5470 - val_loss: 0.6860 - val_accuracy: 0.5780
Epoch 5/50
312/312 [==============================] - 184s 590ms/step - loss: 0.6793 - accuracy: 0.5713 - val_loss: 0.6808 - val_accuracy: 0.6104
Epoch 6/50
312/312 [==============================] - 185s 591ms/step - loss: 0.6765 - accuracy: 0.5777 - val_loss: 0.6778 - val_accuracy: 0.6088
Epoch 7/50
312/312 [==============================] - 184s 590ms/step - loss: 0.6732 - accuracy: 0.5850 - val_loss: 0.6754 - val_accuracy: 0.5692
Epoch 8/50
312/312 [==============================] - 185s 592ms/step - loss: 0.6688 - accuracy: 0.5979 - val_loss: 0.6805 - val_accuracy: 0.5570
Epoch 9/50
312/312 [==============================] - 184s 590ms/step - loss: 0.6652 - accuracy: 0.6007 - val_loss: 0.6800 - val_accuracy: 0.5680
Epoch 10/50
312/312 [==============================] - 184s 588ms/step - loss: 0.6628 - accuracy: 0.6039 - val_loss: 0.6967 - val_accuracy: 0.5448
Epoch 11/50
312/312 [==============================] - 185s 593ms/step - loss: 0.6610 - accuracy: 0.6043 - val_loss: 0.6645 - val_accuracy: 0.6138
Epoch 12/50
312/312 [==============================] - 185s 592ms/step - loss: 0.6533 - accuracy: 0.6133 - val_loss: 0.6729 - val_accuracy: 0.5790
Epoch 13/50
312/312 [==============================] - 185s 592ms/step - loss: 0.6513 - accuracy: 0.6158 - val_loss: 0.6547 - val_accuracy: 0.6218
Epoch 14/50
312/312 [==============================] - 185s 591ms/step - loss: 0.6429 - accuracy: 0.6282 - val_loss: 0.6479 - val_accuracy: 0.6238
Epoch 15/50
312/312 [==============================] - 187s 598ms/step - loss: 0.6367 - accuracy: 0.6389 - val_loss: 0.6533 - val_accuracy: 0.6167
Epoch 16/50
312/312 [==============================] - 185s 594ms/step - loss: 0.6260 - accuracy: 0.6494 - val_loss: 0.6395 - val_accuracy: 0.6342
Epoch 17/50
312/312 [==============================] - 186s 597ms/step - loss: 0.6173 - accuracy: 0.6558 - val_loss: 0.6407 - val_accuracy: 0.6369
Epoch 18/50
312/312 [==============================] - 186s 596ms/step - loss: 0.6116 - accuracy: 0.6619 - val_loss: 0.6185 - val_accuracy: 0.6723
Epoch 19/50
312/312 [==============================] - 186s 594ms/step - loss: 0.6037 - accuracy: 0.6735 - val_loss: 0.6064 - val_accuracy: 0.6839
Epoch 20/50
312/312 [==============================] - 186s 595ms/step - loss: 0.5995 - accuracy: 0.6786 - val_loss: 0.6014 - val_accuracy: 0.6697
Epoch 21/50
312/312 [==============================] - 187s 600ms/step - loss: 0.5912 - accuracy: 0.6791 - val_loss: 0.5997 - val_accuracy: 0.6780
Epoch 22/50
312/312 [==============================] - 186s 597ms/step - loss: 0.5858 - accuracy: 0.6917 - val_loss: 0.5890 - val_accuracy: 0.6866
Epoch 23/50
312/312 [==============================] - 184s 590ms/step - loss: 0.5763 - accuracy: 0.6970 - val_loss: 0.5840 - val_accuracy: 0.6813
Epoch 24/50
312/312 [==============================] - 184s 590ms/step - loss: 0.5751 - accuracy: 0.6968 - val_loss: 0.5693 - val_accuracy: 0.7055
Epoch 25/50
312/312 [==============================] - 184s 591ms/step - loss: 0.5680 - accuracy: 0.7010 - val_loss: 0.5525 - val_accuracy: 0.7267
Epoch 26/50
312/312 [==============================] - 184s 591ms/step - loss: 0.5615 - accuracy: 0.7106 - val_loss: 0.5496 - val_accuracy: 0.7165
Epoch 27/50
312/312 [==============================] - 185s 592ms/step - loss: 0.5568 - accuracy: 0.7144 - val_loss: 0.5560 - val_accuracy: 0.7104
Epoch 28/50
312/312 [==============================] - 184s 590ms/step - loss: 0.5456 - accuracy: 0.7215 - val_loss: 0.5270 - val_accuracy: 0.7371
Epoch 29/50
312/312 [==============================] - 185s 594ms/step - loss: 0.5453 - accuracy: 0.7165 - val_loss: 0.5348 - val_accuracy: 0.7261
Epoch 30/50
312/312 [==============================] - 185s 593ms/step - loss: 0.5345 - accuracy: 0.7286 - val_loss: 0.5170 - val_accuracy: 0.7434
Epoch 31/50
312/312 [==============================] - 184s 590ms/step - loss: 0.5294 - accuracy: 0.7342 - val_loss: 0.5079 - val_accuracy: 0.7491
Epoch 32/50
312/312 [==============================] - 186s 596ms/step - loss: 0.5225 - accuracy: 0.7386 - val_loss: 0.5091 - val_accuracy: 0.7477
Epoch 33/50
312/312 [==============================] - 187s 601ms/step - loss: 0.5159 - accuracy: 0.7446 - val_loss: 0.5050 - val_accuracy: 0.7583
Epoch 34/50
312/312 [==============================] - 191s 611ms/step - loss: 0.5109 - accuracy: 0.7471 - val_loss: 0.4988 - val_accuracy: 0.7664
Epoch 35/50
312/312 [==============================] - 187s 600ms/step - loss: 0.5038 - accuracy: 0.7552 - val_loss: 0.5040 - val_accuracy: 0.7430
Epoch 36/50
312/312 [==============================] - 186s 595ms/step - loss: 0.5003 - accuracy: 0.7584 - val_loss: 0.4913 - val_accuracy: 0.7723
Epoch 37/50
312/312 [==============================] - 185s 595ms/step - loss: 0.4954 - accuracy: 0.7616 - val_loss: 0.4845 - val_accuracy: 0.7699
Epoch 38/50
312/312 [==============================] - 187s 598ms/step - loss: 0.4889 - accuracy: 0.7661 - val_loss: 0.4761 - val_accuracy: 0.7807
Epoch 39/50
312/312 [==============================] - 184s 588ms/step - loss: 0.4849 - accuracy: 0.7711 - val_loss: 0.4795 - val_accuracy: 0.7709
Epoch 40/50
312/312 [==============================] - 185s 593ms/step - loss: 0.4815 - accuracy: 0.7723 - val_loss: 0.4653 - val_accuracy: 0.7819
Epoch 41/50
312/312 [==============================] - 184s 590ms/step - loss: 0.4737 - accuracy: 0.7777 - val_loss: 0.4607 - val_accuracy: 0.7796
Epoch 42/50
312/312 [==============================] - 185s 592ms/step - loss: 0.4707 - accuracy: 0.7770 - val_loss: 0.4639 - val_accuracy: 0.7825
Epoch 43/50
312/312 [==============================] - 186s 595ms/step - loss: 0.4657 - accuracy: 0.7806 - val_loss: 0.4573 - val_accuracy: 0.7910
Epoch 44/50
312/312 [==============================] - 197s 633ms/step - loss: 0.4665 - accuracy: 0.7785 - val_loss: 0.4548 - val_accuracy: 0.7857
Epoch 45/50
312/312 [==============================] - 186s 598ms/step - loss: 0.4590 - accuracy: 0.7869 - val_loss: 0.4464 - val_accuracy: 0.7914
Epoch 46/50
312/312 [==============================] - 184s 591ms/step - loss: 0.4562 - accuracy: 0.7912 - val_loss: 0.4473 - val_accuracy: 0.7957
Epoch 47/50
312/312 [==============================] - 185s 591ms/step - loss: 0.4504 - accuracy: 0.7887 - val_loss: 0.4352 - val_accuracy: 0.7949
Epoch 48/50
312/312 [==============================] - 186s 596ms/step - loss: 0.4459 - accuracy: 0.7934 - val_loss: 0.4394 - val_accuracy: 0.7963
Epoch 49/50
312/312 [==============================] - 188s 603ms/step - loss: 0.4447 - accuracy: 0.7949 - val_loss: 0.4270 - val_accuracy: 0.8033
Epoch 50/50
312/312 [==============================] - 184s 588ms/step - loss: 0.4357 - accuracy: 0.7983 - val_loss: 0.4320 - val_accuracy: 0.7972

C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.

80/80 [==============================] - 38s 467ms/step - loss: 0.4359 - accuracy: 0.7943
> 79.430

In [ ]:

In [17]:

cnn_diagnostics(tlda_history)

Now we are getting somewhere! We have dealt with overfitting quite well and the accuracy of our model has improved by about 10% from our initial model.

At this point I can try using a variety of different model architectures, data augmentations, and learning conditions. This is something that I may explore but I think my time would be better spent using a pretrained model for the purposes of transfer learning.

VGG16 model¶

I built the VGG16 model from scratch in a previous post just for funsies. However, Keras already has this model (and serveral others) pretrained and ready to use into this and any other application. You can find these pretrained models here: https://keras.io/api/applications/

There are a few things that need to be adjusted from our previous approach to have the data fed in a format suitable for VGG16. You can find the details for VGG16 here: https://keras.io/api/applications/vgg/#vgg16-function

Here are the parameters that the pretrained VGG16 model takes in:

include_top
weights
input_tensor
input_shape
pooling
classes
classifier_activation

I'll be adjusting the following:

1) Set include_top = False since I need to tailor (i.e., fine tune) the FC layers to be more specific to the classification problem at hand.

2) Change the input_shape from the default (3,224,224) to (224,224,3) since I won't be using the default 3 FC layers of the original VGG16.

3) Change the shape of our images to have sizes of 224 x 224 since the images that this model was trained on had those dimensions.

4) Since VGG16 is a pretrained model, it is unlikely that a large number of epochs will be required for training so I'll change that to 10

5) I'll make it so that the FC layers are not trainable by 'freezing' them since I'm going to be fine tuning model parameters

6) The images need to be centered since that's how the VGG16 model prepared their dataset. This will be done by setting the mean to [123.68, 116.779,103.939]

In [59]:

# define cnn model
def define_VGG16_model():
    # load model
    model = VGG16(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    flat1  = Flatten()(model.layers[-1].output)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(flat1)
    output = Dense(1, activation='sigmoid')(class1)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model

Let's run it now!

In [41]:

vgg16_model = define_VGG16_model()
train_it, test_it = iter_generator(pxx = 224, pxy = 224, pretrain= 'VGG16')
vgg16_history = fit_cnn_model(vgg16_model,train_it,test_it, epochs = 10)

Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/10
312/312 [==============================] - 98s 299ms/step - loss: 0.3993 - accuracy: 0.9606 - val_loss: 0.0637 - val_accuracy: 0.9758
Epoch 2/10
312/312 [==============================] - 90s 288ms/step - loss: 0.0311 - accuracy: 0.9891 - val_loss: 0.0651 - val_accuracy: 0.9764
Epoch 3/10
312/312 [==============================] - 90s 288ms/step - loss: 0.0115 - accuracy: 0.9965 - val_loss: 0.0727 - val_accuracy: 0.9790
Epoch 4/10
312/312 [==============================] - 90s 288ms/step - loss: 0.0032 - accuracy: 0.9993 - val_loss: 0.0907 - val_accuracy: 0.9786
Epoch 5/10
312/312 [==============================] - 90s 288ms/step - loss: 0.0018 - accuracy: 0.9995 - val_loss: 0.0933 - val_accuracy: 0.9790
Epoch 6/10
312/312 [==============================] - 90s 288ms/step - loss: 0.0011 - accuracy: 0.9997 - val_loss: 0.1048 - val_accuracy: 0.9796
Epoch 7/10
312/312 [==============================] - 90s 288ms/step - loss: 8.1669e-04 - accuracy: 0.9997 - val_loss: 0.1044 - val_accuracy: 0.9788
Epoch 8/10
312/312 [==============================] - 88s 282ms/step - loss: 8.4617e-04 - accuracy: 0.9998 - val_loss: 0.1079 - val_accuracy: 0.9804
Epoch 9/10
312/312 [==============================] - 89s 285ms/step - loss: 4.9252e-04 - accuracy: 0.9999 - val_loss: 0.1100 - val_accuracy: 0.9800
Epoch 10/10
312/312 [==============================] - 90s 288ms/step - loss: 4.3476e-04 - accuracy: 0.9999 - val_loss: 0.1128 - val_accuracy: 0.9802

C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.

80/80 [==============================] - 18s 228ms/step - loss: 0.1128 - accuracy: 0.9802
> 98.016

The VGG16 resultsed in a 98.02% classification accuracy. Much much better than all the other models from before. However there's significant overfitting taking place. I'll try to incorporate a dropout layer to see if that helps.

In [57]:

cnn_diagnostics(vgg16_history)

In [68]:

# define cnn model
def define_VGG16D_model():
    # load model
    model = VGG16(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    drop0  = Dropout(0.5)(model.layers[-1].output)
    flat1  = Flatten()(drop0)
    drop1  = Dropout(0.2)(flat1)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(drop1)
    drop2  = Dropout(0.5)(class1)
    output = Dense(1, activation='sigmoid')(drop2)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model

In [65]:

vgg16d_model = define_VGG16D_model()
train_it, test_it = iter_generator(pxx = 224, pxy = 224, pretrain= 'VGG16')
vgg16_history = fit_cnn_model(vgg16d_model,train_it,test_it, epochs = 10)

C:\Users\vmurc\anaconda3\lib\site-packages\keras\optimizers\optimizer_v2\gradient_descent.py:108: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.
  super(SGD, self).__init__(name, **kwargs)

Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/10
312/312 [==============================] - 90s 286ms/step - loss: 0.2224 - accuracy: 0.9630 - val_loss: 0.0662 - val_accuracy: 0.9772
Epoch 2/10
312/312 [==============================] - 90s 288ms/step - loss: 0.0390 - accuracy: 0.9846 - val_loss: 0.0649 - val_accuracy: 0.9768
Epoch 3/10
312/312 [==============================] - 91s 292ms/step - loss: 0.0190 - accuracy: 0.9932 - val_loss: 0.0728 - val_accuracy: 0.9794
Epoch 4/10
312/312 [==============================] - 90s 289ms/step - loss: 0.0102 - accuracy: 0.9966 - val_loss: 0.0835 - val_accuracy: 0.9790
Epoch 5/10
312/312 [==============================] - 91s 290ms/step - loss: 0.0069 - accuracy: 0.9977 - val_loss: 0.0904 - val_accuracy: 0.9800
Epoch 6/10
312/312 [==============================] - 92s 294ms/step - loss: 0.0050 - accuracy: 0.9982 - val_loss: 0.0987 - val_accuracy: 0.9804
Epoch 7/10
312/312 [==============================] - 91s 291ms/step - loss: 0.0035 - accuracy: 0.9986 - val_loss: 0.1077 - val_accuracy: 0.9794
Epoch 8/10
312/312 [==============================] - 90s 287ms/step - loss: 0.0047 - accuracy: 0.9987 - val_loss: 0.1001 - val_accuracy: 0.9794
Epoch 9/10
312/312 [==============================] - 88s 282ms/step - loss: 0.0039 - accuracy: 0.9989 - val_loss: 0.0998 - val_accuracy: 0.9804
Epoch 10/10
312/312 [==============================] - 89s 284ms/step - loss: 0.0029 - accuracy: 0.9994 - val_loss: 0.1075 - val_accuracy: 0.9806

C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.

80/80 [==============================] - 18s 228ms/step - loss: 0.1075 - accuracy: 0.9806
> 98.055

In [66]:

cnn_diagnostics(vgg16_history)

In [69]:

vgg16d3_model = define_VGG16D_model()
train_it, test_it = iter_generator(pxx = 224, pxy = 224, pretrain= 'VGG16')
vgg16d3_history = fit_cnn_model(vgg16d3_model,train_it,test_it, epochs = 10)

Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/10
312/312 [==============================] - 92s 292ms/step - loss: 0.4423 - accuracy: 0.9364 - val_loss: 0.0807 - val_accuracy: 0.9701
Epoch 2/10
312/312 [==============================] - 89s 283ms/step - loss: 0.1194 - accuracy: 0.9556 - val_loss: 0.0727 - val_accuracy: 0.9756
Epoch 3/10
312/312 [==============================] - 89s 286ms/step - loss: 0.1000 - accuracy: 0.9646 - val_loss: 0.0617 - val_accuracy: 0.9754
Epoch 4/10
312/312 [==============================] - 89s 286ms/step - loss: 0.0949 - accuracy: 0.9670 - val_loss: 0.0626 - val_accuracy: 0.9770
Epoch 5/10
312/312 [==============================] - 88s 283ms/step - loss: 0.0874 - accuracy: 0.9672 - val_loss: 0.0624 - val_accuracy: 0.9790
Epoch 6/10
312/312 [==============================] - 87s 279ms/step - loss: 0.0782 - accuracy: 0.9718 - val_loss: 0.0584 - val_accuracy: 0.9798
Epoch 7/10
312/312 [==============================] - 87s 280ms/step - loss: 0.0733 - accuracy: 0.9733 - val_loss: 0.0554 - val_accuracy: 0.9802
Epoch 8/10
312/312 [==============================] - 88s 283ms/step - loss: 0.0638 - accuracy: 0.9754 - val_loss: 0.0566 - val_accuracy: 0.9798
Epoch 9/10
312/312 [==============================] - 87s 279ms/step - loss: 0.0624 - accuracy: 0.9762 - val_loss: 0.0607 - val_accuracy: 0.9788
Epoch 10/10
312/312 [==============================] - 88s 282ms/step - loss: 0.0595 - accuracy: 0.9768 - val_loss: 0.0559 - val_accuracy: 0.9798

C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.

80/80 [==============================] - 18s 222ms/step - loss: 0.0559 - accuracy: 0.9798
> 97.976

In [70]:

cnn_diagnostics(vgg16d3_history)

Nice! The addition of the dropout layers is helping with the overfitting issue while preserving a pretty high classification accuracy of 97.98%.

ResNet50 Model¶

In [72]:

# define cnn model
def define_ResNet_model():
    # load model
    model = ResNet50(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    flat1  = Flatten()(model.layers[-1].output)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(flat1)
    output = Dense(1, activation='sigmoid')(class1)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model

In [74]:

resnet_model = define_ResNet_model()
train_it, test_it = iter_generator(pxx = 224, pxy = 224, pretrain= 'ResNet50')
resnet_history = fit_cnn_model(resnet_model, train_it,test_it, epochs = 10)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94765736/94765736 [==============================] - 2s 0us/step
Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/10
312/312 [==============================] - 69s 212ms/step - loss: 0.0589 - accuracy: 0.9786 - val_loss: 0.0328 - val_accuracy: 0.9890
Epoch 2/10
312/312 [==============================] - 64s 205ms/step - loss: 0.0087 - accuracy: 0.9975 - val_loss: 0.0318 - val_accuracy: 0.9888
Epoch 3/10
312/312 [==============================] - 64s 205ms/step - loss: 0.0019 - accuracy: 0.9997 - val_loss: 0.0378 - val_accuracy: 0.9884
Epoch 4/10
312/312 [==============================] - 64s 206ms/step - loss: 8.0528e-04 - accuracy: 0.9998 - val_loss: 0.0389 - val_accuracy: 0.9892
Epoch 5/10
312/312 [==============================] - 64s 204ms/step - loss: 6.1654e-04 - accuracy: 0.9999 - val_loss: 0.0402 - val_accuracy: 0.9890
Epoch 6/10
312/312 [==============================] - 64s 204ms/step - loss: 4.4182e-04 - accuracy: 0.9999 - val_loss: 0.0414 - val_accuracy: 0.9892
Epoch 7/10
312/312 [==============================] - 64s 205ms/step - loss: 2.7643e-04 - accuracy: 1.0000 - val_loss: 0.0414 - val_accuracy: 0.9896
Epoch 8/10
312/312 [==============================] - 64s 205ms/step - loss: 2.2176e-04 - accuracy: 1.0000 - val_loss: 0.0423 - val_accuracy: 0.9896
Epoch 9/10
312/312 [==============================] - 65s 208ms/step - loss: 1.9270e-04 - accuracy: 1.0000 - val_loss: 0.0427 - val_accuracy: 0.9900
Epoch 10/10
312/312 [==============================] - 65s 209ms/step - loss: 1.7137e-04 - accuracy: 1.0000 - val_loss: 0.0431 - val_accuracy: 0.9900

C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.

80/80 [==============================] - 13s 166ms/step - loss: 0.0431 - accuracy: 0.9900
> 98.998

In [75]:

cnn_diagnostics(resnet_history)

In [78]:

# define cnn model
def define_ResNetD_model():
    # load model
    model = ResNet50(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    drop0  = Dropout(0.5)(model.layers[-1].output)
    flat1  = Flatten()(drop0)
    drop1  = Dropout(0.2)(flat1)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(drop1)
    drop2  = Dropout(0.5)(class1)
    output = Dense(1, activation='sigmoid')(drop2)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model

In [79]:

resnetd_model = define_ResNetD_model()
train_it, test_it = iter_generator(pxx = 224, pxy = 224, pretrain= 'ResNet50')
resnetd_history = fit_cnn_model(resnetd_model, train_it,test_it, epochs = 10)

Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/10
312/312 [==============================] - 69s 214ms/step - loss: 0.0969 - accuracy: 0.9704 - val_loss: 0.0376 - val_accuracy: 0.9857
Epoch 2/10
312/312 [==============================] - 66s 211ms/step - loss: 0.0364 - accuracy: 0.9858 - val_loss: 0.0322 - val_accuracy: 0.9886
Epoch 3/10
312/312 [==============================] - 66s 210ms/step - loss: 0.0290 - accuracy: 0.9899 - val_loss: 0.0339 - val_accuracy: 0.9882
Epoch 4/10
312/312 [==============================] - 66s 211ms/step - loss: 0.0232 - accuracy: 0.9914 - val_loss: 0.0354 - val_accuracy: 0.9876
Epoch 5/10
312/312 [==============================] - 66s 211ms/step - loss: 0.0154 - accuracy: 0.9943 - val_loss: 0.0372 - val_accuracy: 0.9892
Epoch 6/10
312/312 [==============================] - 66s 210ms/step - loss: 0.0148 - accuracy: 0.9947 - val_loss: 0.0386 - val_accuracy: 0.9870
Epoch 7/10
312/312 [==============================] - 65s 209ms/step - loss: 0.0132 - accuracy: 0.9951 - val_loss: 0.0391 - val_accuracy: 0.9874
Epoch 8/10
312/312 [==============================] - 64s 205ms/step - loss: 0.0109 - accuracy: 0.9958 - val_loss: 0.0358 - val_accuracy: 0.9896
Epoch 9/10
312/312 [==============================] - 64s 205ms/step - loss: 0.0080 - accuracy: 0.9970 - val_loss: 0.0363 - val_accuracy: 0.9896
Epoch 10/10
312/312 [==============================] - 64s 206ms/step - loss: 0.0082 - accuracy: 0.9970 - val_loss: 0.0392 - val_accuracy: 0.9886

C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.

80/80 [==============================] - 13s 163ms/step - loss: 0.0392 - accuracy: 0.9886
> 98.861

In [80]:

cnn_diagnostics(resnetd_history)

Model Performance Comparison¶

The 9 models tried so far are shown below. Transfer learning through VGG16 resulted in the highest classification accuracy. The incremental layer models

In [81]:

acc_scores = [74.617 , 76.935, 75.815 ,79.430, 81.218, 98.016, 97.976 , 98.998 ,98.861 ]
model_names = ['Single Layer','Double Layer','Triple Layer',
               'Triple Layer + Drop',
               'Triple Layer + Drop + Augment',
               'VGG16',
               'VGG16 + Drop',
               'ResNet50',
               'ResNet50 + Drop']
dft = pd.DataFrame(acc_scores,index = model_names,columns =['Accuracy Scores'])
dft

Out[81]:

	Accuracy Scores
Single Layer	74.617
Double Layer	76.935
Triple Layer	75.815
Triple Layer + Drop	79.430
Triple Layer + Drop + Augment	81.218
VGG16	98.016
VGG16 + Drop	97.976
ResNet50	98.998
ResNet50 + Drop	98.861

Based on the accuracy and the extent of overfitting, I'll use the "ResNet50 + Drop" model to train on the full data set and make predictions.

Final Model Preparations¶

I'll start testing the model now and see how well it performs using all the images in the training set. This will then allow me to make predictions on new images that were not originally included in the dataset. I'll start by organizing the images into the final dataset. No validation data required in this step.

In [26]:

#create directories
dataset_home = 'final_dogs_vs_cats/'
# create label subdirectories
labeldirs = ['dogs/', 'cats/']
for labldir in labeldirs:
    newdir = dataset_home + labldir
    makedirs(newdir, exist_ok=True)
    print(newdir)
# copy training dataset images into subdirectories
flist = [name for name in os.listdir(".") if os.path.isdir(name)]
# copy training dataset images into subdirectories
src_directory = os.getcwd() + '\\' + flist[-1]
for file in listdir(src_directory):
    src = src_directory + '\\' + file
    if file.startswith('cat'):
        dst = dataset_home + 'cats\\'  + file
        copyfile(src, dst)
    elif file.startswith('dog'):
        dst = dataset_home + 'dogs\\'  + file
        copyfile(src, dst)

final_dogs_vs_cats/dogs/
final_dogs_vs_cats/cats/

Now I'll gather my previous functions here one last time and update them to represent the final needs of the model evaluation. I'll also save my model into an h5 file for future use.

In [34]:

#Instantiate DataGenerator and prepare iterators
def iter_generator(wd , wsr = 0, hsr = 0, hf = False, pxx = 200, pxy = 200, model = None):
    # create data generators
    if model == 'VGG16':
        datagen = ImageDataGenerator(featurewise_center=True)
        datagen.mean = [123.68, 116.779, 103.939]
    elif model == 'ResNet50':
        datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
    else:
        datagen = ImageDataGenerator(rescale=1.0/255.0,
                                     width_shift_range = wsr,
                                     height_shift_range = hsr,
                                     horizontal_flip = hf
                                    )

    # prepare iterators
    train_it = datagen.flow_from_directory(wd,
                                           class_mode='binary',
                                           batch_size=64,
                                           target_size=(pxx, pxy))
    
    return train_it

# define cnn model
def define_ResNetD_model():
    # load model
    model = ResNet50(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    drop0  = Dropout(0.5)(model.layers[-1].output)
    flat1  = Flatten()(drop0)
    drop1  = Dropout(0.2)(flat1)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(drop1)
    drop2  = Dropout(0.5)(class1)
    output = Dense(1, activation='sigmoid')(drop2)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model

#Model fitting
def fit_final_model(model,train_it,epochs = 20):
    # fit model
    history = model.fit(train_it, 
                        steps_per_epoch = len(train_it),
                        epochs = epochs, 
                        verbose = 1)
    
    model.save('final_model.h5')
    return history

#Visualize learning results
def cnn_diagnostics(history):
    fig, axs = plt.subplots(1,2,figsize =(30,10)) 

    # plot loss
    axs[0].set_title('Cross Entropy Loss', fontsize = 40)
    axs[0].plot(history.history['loss'], color='#27308a', label='train',linestyle='dotted',linewidth=4)
    axs[0].plot(history.history['val_loss'], color='#86148f', label='test',linestyle='dotted',linewidth=4)
    axs[0].set_ylabel('Cross Entropy Loss', fontsize = 30) # Y label
    axs[0].set_xlabel('Epoch', fontsize = 30) # X label
    axs[0].tick_params(axis='x', labelsize=30)
    axs[0].tick_params(axis='y', labelsize=30)
    axs[0].legend(fontsize = 40)

    # plot accuracy
    axs[1].set_title('Classification Accuracy', fontsize = 40)
    axs[1].plot(history.history['accuracy'], color='#27308a', label='train',linestyle='dotted',linewidth=4)
    axs[1].plot(history.history['val_accuracy'], color='#86148f', label='test',linestyle='dotted',linewidth=4)
    axs[1].set_ylabel('Classification Accuracy', fontsize = 30) # Y label
    axs[1].set_xlabel('Epoch', fontsize = 30) # X label
    axs[1].tick_params(axis='x', labelsize=30)
    axs[1].tick_params(axis='y', labelsize=30)
    axs[1].legend(fontsize = 40)
    plt.show()

Okay, let's run it now!

In [35]:

dataset_home = 'final_dogs_vs_cats/'
resnetd_model = define_ResNetD_model()
train_it = iter_generator(dataset_home, pxx = 224, pxy = 224, model = 'ResNet50')
resnetd_history = fit_final_model(resnetd_model, train_it, epochs = 20)

Found 25000 images belonging to 2 classes.
Epoch 1/20
391/391 [==============================] - 135s 328ms/step - loss: 0.0866 - accuracy: 0.9735
Epoch 2/20
391/391 [==============================] - 67s 170ms/step - loss: 0.0380 - accuracy: 0.9865
Epoch 3/20
391/391 [==============================] - 67s 170ms/step - loss: 0.0268 - accuracy: 0.9902
Epoch 4/20
391/391 [==============================] - 67s 170ms/step - loss: 0.0226 - accuracy: 0.9916
Epoch 5/20
391/391 [==============================] - 66s 168ms/step - loss: 0.0163 - accuracy: 0.9938
Epoch 6/20
391/391 [==============================] - 65s 166ms/step - loss: 0.0138 - accuracy: 0.9947
Epoch 7/20
391/391 [==============================] - 67s 170ms/step - loss: 0.0116 - accuracy: 0.9955
Epoch 8/20
391/391 [==============================] - 66s 169ms/step - loss: 0.0111 - accuracy: 0.9955
Epoch 9/20
391/391 [==============================] - 65s 166ms/step - loss: 0.0087 - accuracy: 0.9970
Epoch 10/20
391/391 [==============================] - 66s 168ms/step - loss: 0.0078 - accuracy: 0.9971
Epoch 11/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0073 - accuracy: 0.9972
Epoch 12/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0065 - accuracy: 0.9978
Epoch 13/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0049 - accuracy: 0.9982
Epoch 14/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0054 - accuracy: 0.9979
Epoch 15/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0062 - accuracy: 0.9978
Epoch 16/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0045 - accuracy: 0.9984
Epoch 17/20
391/391 [==============================] - 65s 167ms/step - loss: 0.0057 - accuracy: 0.9979
Epoch 18/20
391/391 [==============================] - 67s 170ms/step - loss: 0.0041 - accuracy: 0.9987
Epoch 19/20
391/391 [==============================] - 66s 170ms/step - loss: 0.0050 - accuracy: 0.9984
Epoch 20/20
391/391 [==============================] - 66s 170ms/step - loss: 0.0042 - accuracy: 0.9985

Use Model To Predict Whether Image is of dog or cat¶

Let's see how well the model responds to images outside of our dataset! I'll use pictures of my cat Catterina and my dog Maia :)

First, I'll make a function to prepare images for loading into the model (i.e., make sure they have the right dimensions, size, and centering).

In [55]:

from tensorflow.keras.utils import load_img
from tensorflow.keras.utils import img_to_array
from keras.models import load_model

def load_image_for_prediction(filename):
    # load the image
    img = load_img(filename, target_size=(224, 224))
    # convert to array
    img = img_to_array(img)
    # reshape into a single sample with 3 channels
    img = img.reshape(1, 224, 224, 3)
    # center pixel data
    img = img.astype('float32')
    img = img - [123.68, 116.779, 103.939]
    return img

Then, I can feed this image to the model and have it predict whether the loaded image is of a cat or a dog. The model will return 0 if the image is of a cat or 1 if its a dog.

In [66]:

# load an image and predict the class
def dog_cat_predict(img_file, model_file):
    # load the image
    img = load_image_for_prediction(img_file)
    # load model
    model = load_model(model_file)
    # predict the class
    result = model.predict(img)
    rounded = [np.round(x) for x in result]
    if int(rounded[0]) == 0:
        print('This is a cat!')
    elif int(rounded[0]) == 1:
        print('This is a dog!')
    else:
        print('This is neither a cat or a dog?')

Let's give it a go! Here's the first image I'll be giving the model.

In [67]:

dog_cat_predict('baby1.jpg', 'final_model.h5')

1/1 [==============================] - 1s 616ms/step
This is a cat!

Super cool! It recognized Catterina as a cat correctly! Now let's try my dog Maia. Here's the picture I'll be giving the model

In [68]:

dog_cat_predict('maia1.jpg', 'final_model.h5')

1/1 [==============================] - 1s 635ms/step
This is a dog!

Sweet! It recognized Maia as a dog correctly as well!!

Conclusions¶

9 different convolutional neural networks were explored in order to differentiate between images of cats and dogs. A model was trained,incrementally built, and optimized to reduce overfitting until an 81% classification accuracy was obtained. Transfer learning was also used via modifications to the VGG16 and ResNet50 models. The utilized model had a classification accuracy of 97.98% while minimizing overfitting through the incorporation of dropout layers. The model is capable of correctly classifying images of dogs and cats that weren't in the original dataset, and is thus flexible in learning and capable of processing new information correctly.