Dogs vs Cats Image Classification with Keras
Image Classification of dogs and cats using Convolutional Neural Networks¶
In this post, I'll show how to train a convolutional neural network (CNN) to differentiate between dog and cat pictures. The data comes from the 'Dogs vs Cats' dataset on Kaggle: https://www.kaggle.com/c/dogs-vs-cats/data?select=train.zip. I'll go through downloading the data, organizing it, preprocessing it, building a CNN model, assessing its performance, and optimizing it. Finally, I'll use transfer learning with pretrained models and compare their performance against the model built here. The models were built using Keras.
Modules used in this project¶
The modules below were used in this project:
# General utilities
import matplotlib.pyplot as plt
import random
import os, os.path , sys
from os import makedirs, listdir
from PIL import Image
from shutil import copyfile
import numpy as np
import pandas as pd
#Use tensorflow with GPU
import tensorflow as tf
#Keras modules
from keras.utils import to_categorical
from keras.models import Sequential, Model #Build a sequential/pretrained model
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout #Layers that make up models
from keras.optimizers import SGD #Optimizer
from keras.preprocessing.image import ImageDataGenerator #For preprocessing images
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.applications.resnet import ResNet50
# Note: this import rebinds preprocess_input, shadowing the VGG16 version imported above
from keras.applications.resnet import preprocess_input
Step 1 - Exploring and Organizing the data¶
I always like to get an initial idea of the data I'm working with before anything else. In this case, I also get to see pictures of cute cats and dogs so ...
Let's start by exploring the data a bit. I'll figure out how many pictures I have at my disposal in the test and training sets.
train_folder = 'train/'
test_folder = 'test/'
# Count only regular files (not subdirectories) in each folder
n_train = len([f for f in os.listdir(train_folder) if os.path.isfile(os.path.join(train_folder, f))])
n_test = len([f for f in os.listdir(test_folder) if os.path.isfile(os.path.join(test_folder, f))])
print('Number of pictures in train folder is:')
print(n_train)
print('Number of pictures in test folder is:')
print(n_test)
Number of pictures in train folder is:
25000
Number of pictures in test folder is:
12500
Now I'm going to plot some cat and dog images from the training set and some random images from the test set below.
# plot 6 random dog images from train set
fig, axs = plt.subplots(1,6,figsize =(30,30))
for i in range(6):
    # integer division keeps both randint bounds as ints
    img_sel = random.randint(0, n_train // 2 - 1)
    img_name = train_folder + 'dog.' + str(img_sel) + '.jpg'
    img = Image.open(img_name)
    axs[i].imshow(img)
    axs[i].set_title(img_name, fontsize = 20)
plt.show()
# plot 6 random cat images from train set
fig, axs = plt.subplots(1,6,figsize =(30,30))
for i in range(6):
    # integer division keeps both randint bounds as ints
    img_sel = random.randint(0, n_train // 2 - 1)
    img_name = train_folder + 'cat.' + str(img_sel) + '.jpg'
    img = Image.open(img_name)
    axs[i].imshow(img)
    axs[i].set_title(img_name, fontsize = 20)
plt.show()
# plot 6 random images from test set
fig, axs = plt.subplots(1,6,figsize =(30,30))
for i in range(6):
    img_sel = random.randint(1, n_test)
    img_name = test_folder + str(img_sel) + '.jpg'
    img = Image.open(img_name)
    axs[i].imshow(img)
    axs[i].set_title(img_name, fontsize = 20)
plt.show()
LOOK AT THOSE CUTIES!!! Now I'll create subdirectories for cats and dogs for both train and test directories
#create directories
subdirs = [train_folder, test_folder]
for subdir in subdirs:
    # create label subdirectories
    labeldirs = ['dogs/', 'cats/']
    for labldir in labeldirs:
        newdir = subdir + labldir
        os.makedirs(newdir, exist_ok=True)
Splitting the data into testing and training sets¶
Now that I have my directories set up, I'm ready to start splitting my data. I'll use an 80:20 split to start things off and see how well that works.
# seed random number generator for reproducibility
random.seed(1)
# Ratio of pictures to use for validation 80:20
val_ratio = 0.2
# Find files in train folder w.r.t. cwd
files = [f for f in os.listdir(train_folder) if os.path.isfile(os.path.join(train_folder, f))]
# copy training dataset images into subdirectories
for file in files:
    #print(file)
    src = train_folder + file
    dst_dir = 'train/'
    if random.random() < val_ratio:
        dst_dir = 'test/'
    if file.startswith('cat'):
        dst = dst_dir + 'cats/' + file
        copyfile(src, dst)
    elif file.startswith('dog'):
        dst = dst_dir + 'dogs/' + file
        copyfile(src, dst)
Let's see how many cats and dogs we have in each of our training and test sets.
print('This is the number of cat pics in the train folder')
print(len([f for f in os.listdir(train_folder + 'cats/')
           if os.path.isfile(os.path.join(train_folder + 'cats/', f))]))
print('\n')
print('This is the number of dog pics in the train folder')
print(len([f for f in os.listdir(train_folder + 'dogs/')
           if os.path.isfile(os.path.join(train_folder + 'dogs/', f))]))
print('\n')
print('This is the number of cat pics in the test folder')
print(len([f for f in os.listdir(test_folder + 'cats/')
           if os.path.isfile(os.path.join(test_folder + 'cats/', f))]))
print('\n')
print('This is the number of dog pics in the test folder')
print(len([f for f in os.listdir(test_folder + 'dogs/')
           if os.path.isfile(os.path.join(test_folder + 'dogs/', f))]))
This is the number of cat pics in the train folder
9945

This is the number of dog pics in the train folder
9965

This is the number of cat pics in the test folder
2555

This is the number of dog pics in the test folder
2535
So we have a total of 12,500 pictures each for cats and dogs across the training and test folders. The ratio of test cats to total cats is 2,555/12,500 ≈ 0.20, so we have roughly an 80:20 split. The situation is similar for dogs (2,535/12,500 ≈ 0.20). Let's continue preprocessing our data.
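Because each file is sent to the test folder independently with probability 0.2, the split is only approximately 80:20. The sketch below recomputes the observed fractions from the counts printed above and simulates the same per-file draw. The helper name `simulate_split` is hypothetical and exists only for this illustration, and the simulated counts are not claimed to reproduce the exact numbers from the run above.

```python
import random

def simulate_split(n_files, val_ratio, seed):
    """Send each file to the test set with probability val_ratio, like the copy loop."""
    rng = random.Random(seed)
    n_test = sum(1 for _ in range(n_files) if rng.random() < val_ratio)
    return n_files - n_test, n_test

# Fractions from the counts reported above
cat_test_fraction = 2555 / 12500
dog_test_fraction = 2535 / 12500
print(f"cats: {cat_test_fraction:.4f}, dogs: {dog_test_fraction:.4f}")

# A simulated split over 25,000 files lands near, but not exactly on, 20%
n_train_sim, n_test_sim = simulate_split(25000, 0.2, seed=1)
print(n_train_sim, n_test_sim, n_test_sim / 25000)
```

If an exact 80:20 split per class were needed, one option would be to shuffle the file list and slice it instead of drawing per file.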
Scaling, reshaping and augmenting the data¶
The images that we have in our dataset have different sizes and so we'll have to ensure that they are all the same size prior to modeling. We'll also have to normalize the pixels. Lastly, we'll also do data augmentation to expand the size of the training dataset.
In the code below, I'm normalizing the pixel intensities to the range 0-1, setting class_mode='binary' so that each image gets a single 0/1 label (cats vs dogs), and resizing each image to 200 x 200 pixels.
# create data generators
datagen = ImageDataGenerator(rescale=1.0/255.0)
# prepare iterators
train_it = datagen.flow_from_directory(train_folder,
                                       class_mode='binary',
                                       batch_size=64,
                                       target_size=(200, 200))
test_it = datagen.flow_from_directory(test_folder,
                                      class_mode='binary',
                                      batch_size=64,
                                      target_size=(200, 200))
Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
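To make the label encoding concrete: Keras documents that flow_from_directory assigns label indices to the class subdirectories in alphanumeric order, and the actual mapping can be read from `train_it.class_indices`. The sketch below mimics that ordering and the `rescale` factor in plain Python; `infer_class_indices` and `rescale` are hypothetical helpers for illustration, not Keras functions.

```python
def infer_class_indices(subdir_names):
    """Map class folder names to integer labels in alphanumeric order."""
    return {name: idx for idx, name in enumerate(sorted(subdir_names))}

def rescale(pixel_value, factor=1.0 / 255.0):
    """Scale an 8-bit pixel intensity (0-255) into the range 0-1."""
    return pixel_value * factor

class_indices = infer_class_indices(['dogs', 'cats'])
print(class_indices)               # {'cats': 0, 'dogs': 1}
print(rescale(0), rescale(255))    # darkest and brightest pixel after scaling
```

Under this mapping, a sigmoid output close to 1 would correspond to 'dog' and an output close to 0 to 'cat'.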
Single layer model¶
I'll start with a model composed of a single convolution layer prior to flattening as described below.
Building the Model¶
Now that I have the data normalized and reshaped, I'll start building the model. The first model I'll build will be structured as follows:
1x Convolution + MaxPooling layer
1x Flatten layer
1x Dense (fully connected) layer
1x output layer with sigmoid activation
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(1, activation='sigmoid'))
Now I'll set the optimizer to 'SGD' and compile the model
# compile model
opt = SGD(learning_rate = 0.001, momentum = 0.9)
model.compile(optimizer = opt, loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_3 (Conv2D)            (None, 200, 200, 32)      896
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 100, 100, 32)      0
_________________________________________________________________
flatten_3 (Flatten)          (None, 320000)            0
_________________________________________________________________
dense_5 (Dense)              (None, 128)               40960128
_________________________________________________________________
dense_6 (Dense)              (None, 1)                 129
=================================================================
Total params: 40,961,153
Trainable params: 40,961,153
Non-trainable params: 0
_________________________________________________________________
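The parameter counts in the summary can be checked by hand. The sketch below recomputes them from the layer definitions; the helper names `conv2d_params` and `dense_params` are hypothetical and only used for this arithmetic.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(n_inputs, n_units):
    """One weight per input-unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units

conv = conv2d_params(3, 3, 3, 32)      # 3x3 kernels over 3 colour channels, 32 filters
flat = 100 * 100 * 32                  # pooled feature map flattened to a vector
hidden = dense_params(flat, 128)       # Dense(128)
out = dense_params(128, 1)             # sigmoid output unit
total = conv + hidden + out
print(conv, flat, hidden, out, total)  # 896 320000 40960128 129 40961153
```

Nearly all of the parameters sit in the first Dense layer, because the flattened feature map has 320,000 entries.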
Fitting the Model¶
And now I'll fit and assess the model. I'll train it for 20 epochs to start off.
# fit model (model.fit accepts generators directly; fit_generator is deprecated)
history = model.fit(train_it,
                    steps_per_epoch = len(train_it),
                    validation_data = test_it,
                    validation_steps = len(test_it),
                    epochs = 20,
                    verbose = 1)
# evaluate model
_, acc = model.evaluate(test_it, steps = len(test_it), verbose = 1)
print('> %.3f' % (acc * 100.0))
Epoch 1/20 - 160s - loss: 0.6853 - accuracy: 0.5608 - val_loss: 0.7295 - val_accuracy: 0.5976
Epoch 2/20 - 159s - loss: 0.6463 - accuracy: 0.6150 - val_loss: 0.7255 - val_accuracy: 0.6043
Epoch 3/20 - 159s - loss: 0.6207 - accuracy: 0.6448 - val_loss: 0.6261 - val_accuracy: 0.6326
Epoch 4/20 - 158s - loss: 0.6062 - accuracy: 0.6589 - val_loss: 0.6031 - val_accuracy: 0.6503
Epoch 5/20 - 159s - loss: 0.5985 - accuracy: 0.6684 - val_loss: 0.4948 - val_accuracy: 0.6607
Epoch 6/20 - 159s - loss: 0.5770 - accuracy: 0.6902 - val_loss: 0.7151 - val_accuracy: 0.6662
Epoch 7/20 - 160s - loss: 0.5607 - accuracy: 0.7076 - val_loss: 0.5129 - val_accuracy: 0.6912
Epoch 8/20 - 162s - loss: 0.5360 - accuracy: 0.7304 - val_loss: 0.5882 - val_accuracy: 0.7145
Epoch 9/20 - 159s - loss: 0.5077 - accuracy: 0.7521 - val_loss: 0.5406 - val_accuracy: 0.7159
Epoch 10/20 - 159s - loss: 0.4810 - accuracy: 0.7686 - val_loss: 0.6013 - val_accuracy: 0.7163
Epoch 11/20 - 158s - loss: 0.4562 - accuracy: 0.7868 - val_loss: 0.4718 - val_accuracy: 0.7334
Epoch 12/20 - 157s - loss: 0.4233 - accuracy: 0.8094 - val_loss: 0.3947 - val_accuracy: 0.7334
Epoch 13/20 - 158s - loss: 0.3818 - accuracy: 0.8315 - val_loss: 0.6220 - val_accuracy: 0.7344
Epoch 14/20 - 158s - loss: 0.3471 - accuracy: 0.8546 - val_loss: 0.5257 - val_accuracy: 0.7354
Epoch 15/20 - 157s - loss: 0.3066 - accuracy: 0.8753 - val_loss: 0.4527 - val_accuracy: 0.7424
Epoch 16/20 - 157s - loss: 0.2859 - accuracy: 0.8857 - val_loss: 0.4680 - val_accuracy: 0.7305
Epoch 17/20 - 157s - loss: 0.2452 - accuracy: 0.9093 - val_loss: 0.5377 - val_accuracy: 0.7491
Epoch 18/20 - 158s - loss: 0.2079 - accuracy: 0.9276 - val_loss: 0.6251 - val_accuracy: 0.7395
Epoch 19/20 - 158s - loss: 0.1792 - accuracy: 0.9415 - val_loss: 0.5093 - val_accuracy: 0.7460
Epoch 20/20 - 157s - loss: 0.1587 - accuracy: 0.9506 - val_loss: 0.8236 - val_accuracy: 0.7462
> 74.617
Hmmm, this initial model has an accuracy of 74.6%. Not great but not absolutely terrible. This process took about 53 minutes to complete.
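As a quick sanity check on the numbers above, the sketch below works out how many batches one epoch takes at a batch size of 64 and roughly how long 20 epochs run at about 159 seconds each. The helper name `steps_per_epoch` is hypothetical; it mirrors what `len(train_it)` should return, assuming the last partial batch counts as a step.

```python
import math

def steps_per_epoch(n_images, batch_size):
    """Batches needed to see every image once (the last batch may be partial)."""
    return math.ceil(n_images / batch_size)

train_steps = steps_per_epoch(19910, 64)   # training images found by the generator
test_steps = steps_per_epoch(5090, 64)     # test images found by the generator
approx_minutes = 20 * 159 / 60             # 20 epochs at roughly 159 s each
print(train_steps, test_steps, round(approx_minutes))  # 312 80 53
```

These values match the 312/312 and 80/80 progress counters that appear in the later training logs.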
Visualize learning curve¶
A learning curve is a plot of model learning performance over time. It lets us determine whether the model is underfitting, overfitting, or fitting well, and whether the test and training sets are good representations of one another.
The training curve tells us how well the model is learning.
The testing curve tells us how well the model is generalizing.
For each of these datasets we'll generate learning curves for optimization and performance.
The optimization learning curve is based on the metric that is being used to optimize the model parameters (i.e., the loss function).
The performance learning curve is based on the model evaluation metrics (i.e., accuracy).
Underfitting happens when a model cannot learn from a training set. This can usually be identified by plotting the loss function over the course of the learning process. If the training loss stays flat, or is still decreasing at the end of training, then underfitting is likely happening.
Overfitting happens when a model learns too much from the training set. This is a problem because it limits the generality of the model and makes it too specialized to be useful on data other than the data it was trained on. In other words, an overfitted model only works for a specific dataset and generally cannot repeat that success on new datasets. Overfitting generally happens when the model is trained for too long or when the model's capacity is too large for the dataset it is fitting (i.e., too many parameters). Overfitting can be identified from the learning curves: the training loss keeps decreasing over time (i.e., no plateau) while the test loss decreases and then starts increasing.
Good fits occur when the losses on the test and training sets are comparable with one another. They can be identified by a decreasing loss that eventually plateaus and by a small gap between the loss of the test and training sets.
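These definitions can be turned into a rough numeric check. The sketch below computes the gap between training and test accuracy at a few epochs taken from the single layer training log above and flags the first sampled epoch where the gap exceeds a threshold. The function names and the 0.1 threshold are assumptions made for illustration, not a standard criterion.

```python
# Accuracies sampled from the single layer training log (epochs 1, 5, 10, 15, 20)
epochs    = [1, 5, 10, 15, 20]
train_acc = [0.5608, 0.6684, 0.7686, 0.8753, 0.9506]
val_acc   = [0.5976, 0.6607, 0.7163, 0.7424, 0.7462]

def generalization_gap(train, val):
    """Training accuracy minus test accuracy at each sampled epoch."""
    return [t - v for t, v in zip(train, val)]

def first_epoch_over(epochs, gaps, threshold):
    """First sampled epoch whose gap exceeds the threshold, or None."""
    for epoch, gap in zip(epochs, gaps):
        if gap > threshold:
            return epoch
    return None

gaps = generalization_gap(train_acc, val_acc)
flagged = first_epoch_over(epochs, gaps, threshold=0.1)  # threshold is arbitrary
print([round(g, 4) for g in gaps], flagged)
```

With only five sampled epochs, this merely brackets the onset of overfitting between epochs 10 and 15; reading the full curves gives a finer estimate.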
Let's look at the cross-entropy loss and classification accuracy of our model to get some learning diagnostics.
fig, axs = plt.subplots(1,2,figsize =(30,10))
# plot loss
axs[0].set_title('Cross Entropy Loss', fontsize = 40)
axs[0].plot(history.history['loss'], color='#27308a', label='train',linestyle='dotted',linewidth=4)
axs[0].plot(history.history['val_loss'], color='#86148f', label='test',linestyle='dotted',linewidth=4)
axs[0].set_ylabel('Cross Entropy Loss', fontsize = 30) # Y label
axs[0].set_xlabel('Epoch', fontsize = 30) # X label
axs[0].tick_params(axis='x', labelsize=30)
axs[0].tick_params(axis='y', labelsize=30)
axs[0].legend(fontsize = 40)
# plot accuracy
axs[1].set_title('Classification Accuracy', fontsize = 40)
axs[1].plot(history.history['accuracy'], color='#27308a', label='train',linestyle='dotted',linewidth=4)
axs[1].plot(history.history['val_accuracy'], color='#86148f', label='test',linestyle='dotted',linewidth=4)
axs[1].set_ylabel('Classification Accuracy', fontsize = 30) # Y label
axs[1].set_xlabel('Epoch', fontsize = 30) # X label
axs[1].tick_params(axis='x', labelsize=30)
axs[1].tick_params(axis='y', labelsize=30)
axs[1].legend(fontsize = 40)
plt.show()
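For reference when reading the loss panel, here is a minimal plain-Python sketch of binary cross-entropy, the loss passed to `model.compile`. The function name and the clipping constant `eps` are assumptions of this sketch, not a copy of the Keras implementation.

```python
import math

def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# A model that always outputs 0.5 scores ln(2), whatever the labels are
chance = binary_cross_entropy([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5])
# A single confident but wrong prediction is penalized heavily
confident_wrong = binary_cross_entropy([1], [0.01])
print(round(chance, 4), round(confident_wrong, 2))
```

The first-epoch training loss of 0.6853 sits close to ln(2) ≈ 0.693, i.e. close to chance. The heavy penalty on confident mistakes is also one reason the test loss can jump around while the test accuracy barely moves.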
From these curves we can see that our model is overfitting the training set after approximately 12 epochs, as evidenced by the divergence between the training and test curves at that stage. This is fine, however, since I started with a simple, single convolution/pooling layer. I'll move on to a more complex architecture next. Overfitting can be reduced by adding dropout layers and data augmentation.
Consolidating Routines¶
Now that I have a full model, I'll gather all the routines from before and place them into functions that I can easily reuse later.
#Instantiate DataGenerator and prepare iterators
def iter_generator():
    # create data generators
    datagen = ImageDataGenerator(rescale=1.0/255.0)
    # prepare iterators
    train_it = datagen.flow_from_directory(train_folder,
                                           class_mode='binary',
                                           batch_size=64,
                                           target_size=(200, 200))
    test_it = datagen.flow_from_directory(test_folder,
                                          class_mode='binary',
                                          batch_size=64,
                                          target_size=(200, 200))
    return train_it, test_it
# Define Model
def define_SL_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    opt = SGD(learning_rate=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model
#Model fitting
def fit_cnn_model(model,train_it,test_it,epochs = 20):
    # fit model
    history = model.fit(train_it,
                        steps_per_epoch = len(train_it),
                        validation_data = test_it,
                        validation_steps = len(test_it),
                        epochs = epochs,
                        verbose = 1)
    # evaluate model (model.evaluate accepts generators; evaluate_generator is deprecated)
    _, acc = model.evaluate(test_it, steps = len(test_it), verbose = 1)
    print('> %.3f' % (acc * 100.0))
    return history
#Visualize learning results
def cnn_diagnostics(history):
    fig, axs = plt.subplots(1,2,figsize =(30,10))
    # plot loss
    axs[0].set_title('Cross Entropy Loss', fontsize = 40)
    axs[0].plot(history.history['loss'], color='#27308a', label='train',linestyle='dotted',linewidth=4)
    axs[0].plot(history.history['val_loss'], color='#86148f', label='test',linestyle='dotted',linewidth=4)
    axs[0].set_ylabel('Cross Entropy Loss', fontsize = 30) # Y label
    axs[0].set_xlabel('Epoch', fontsize = 30) # X label
    axs[0].tick_params(axis='x', labelsize=30)
    axs[0].tick_params(axis='y', labelsize=30)
    axs[0].legend(fontsize = 40)
    # plot accuracy
    axs[1].set_title('Classification Accuracy', fontsize = 40)
    axs[1].plot(history.history['accuracy'], color='#27308a', label='train',linestyle='dotted',linewidth=4)
    axs[1].plot(history.history['val_accuracy'], color='#86148f', label='test',linestyle='dotted',linewidth=4)
    axs[1].set_ylabel('Classification Accuracy', fontsize = 30) # Y label
    axs[1].set_xlabel('Epoch', fontsize = 30) # X label
    axs[1].tick_params(axis='x', labelsize=30)
    axs[1].tick_params(axis='y', labelsize=30)
    axs[1].legend(fontsize = 40)
    plt.show()
Double layer model¶
Let me add another convolution/pooling layer to the model and see how that performs. Since I already have routines for creating the generators, building the model, fitting, and plotting, I can simply call them. I'll modify the model routine from earlier to define the double layer model.
# Define Model
def define_DL_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    # input_shape is only needed on the first layer
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    opt = SGD(learning_rate = 0.001, momentum = 0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model
dl_model = define_DL_model()
train_it, test_it = iter_generator()
dl_history = fit_cnn_model(dl_model,train_it,test_it)
Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/20
312/312 [==============================] - 274s 879ms/step - loss: 0.6773 - accuracy: 0.5717 - val_loss: 0.7064 - val_accuracy: 0.5560
Epoch 2/20
312/312 [==============================] - 270s 864ms/step - loss: 0.6371 - accuracy: 0.6278 - val_loss: 0.6867 - val_accuracy: 0.6473
Epoch 3/20
312/312 [==============================] - 269s 863ms/step - loss: 0.6162 - accuracy: 0.6523 - val_loss: 0.5552 - val_accuracy: 0.6603
Epoch 4/20
312/312 [==============================] - 271s 869ms/step - loss: 0.5815 - accuracy: 0.6913 - val_loss: 0.6806 - val_accuracy: 0.6884
Epoch 5/20
312/312 [==============================] - 270s 866ms/step - loss: 0.5503 - accuracy: 0.7146 - val_loss: 0.4781 - val_accuracy: 0.7189
Epoch 6/20
312/312 [==============================] - 270s 865ms/step - loss: 0.5223 - accuracy: 0.7399 - val_loss: 0.5847 - val_accuracy: 0.7385
Epoch 7/20
312/312 [==============================] - 268s 859ms/step - loss: 0.4860 - accuracy: 0.7697 - val_loss: 0.5146 - val_accuracy: 0.7495
Epoch 8/20
312/312 [==============================] - 269s 861ms/step - loss: 0.4616 - accuracy: 0.7845 - val_loss: 0.5140 - val_accuracy: 0.7291
Epoch 9/20
312/312 [==============================] - 267s 857ms/step - loss: 0.4288 - accuracy: 0.8018 - val_loss: 0.3683 - val_accuracy: 0.7413
Epoch 10/20
312/312 [==============================] - 269s 861ms/step - loss: 0.3985 - accuracy: 0.8235 - val_loss: 0.4651 - val_accuracy: 0.7658
Epoch 11/20
312/312 [==============================] - 267s 856ms/step - loss: 0.3664 - accuracy: 0.8401 - val_loss: 0.5742 - val_accuracy: 0.7745
Epoch 12/20
312/312 [==============================] - 267s 856ms/step - loss: 0.3329 - accuracy: 0.8608 - val_loss: 0.5326 - val_accuracy: 0.7809
Epoch 13/20
312/312 [==============================] - 270s 865ms/step - loss: 0.3041 - accuracy: 0.8720 - val_loss: 0.6844 - val_accuracy: 0.7750
Epoch 14/20
312/312 [==============================] - 272s 873ms/step - loss: 0.2646 - accuracy: 0.8917 - val_loss: 0.4253 - val_accuracy: 0.7829
Epoch 15/20
312/312 [==============================] - 268s 860ms/step - loss: 0.2315 - accuracy: 0.9120 - val_loss: 0.2716 - val_accuracy: 0.7676
Epoch 16/20
312/312 [==============================] - 268s 857ms/step - loss: 0.2134 - accuracy: 0.9178 - val_loss: 0.2441 - val_accuracy: 0.7756
Epoch 17/20
312/312 [==============================] - 268s 859ms/step - loss: 0.1821 - accuracy: 0.9339 - val_loss: 0.4488 - val_accuracy: 0.7806
Epoch 18/20
312/312 [==============================] - 270s 865ms/step - loss: 0.1554 - accuracy: 0.9470 - val_loss: 0.7510 - val_accuracy: 0.7648
Epoch 19/20
312/312 [==============================] - 268s 859ms/step - loss: 0.1377 - accuracy: 0.9537 - val_loss: 0.6019 - val_accuracy: 0.7613
Epoch 20/20
312/312 [==============================] - 269s 863ms/step - loss: 0.1016 - accuracy: 0.9729 - val_loss: 0.1961 - val_accuracy: 0.7694
80/80 [==============================] - 16s 201ms/step
> 76.935
This model is only marginally better, with 76.9% accuracy, but it is an improvement nonetheless. We can also see from the learning curves below that the model is overfitting, and this starts at an earlier epoch (around 8) than before. This process took about 90 minutes to complete.
cnn_diagnostics(dl_history)
Triple Layer Model¶
I'll try adding one more layer just to confirm that this trend continues before trying to improve the model further in a different way.
# Define Model
def define_TL_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    # input_shape is only needed on the first layer
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    opt = SGD(learning_rate = 0.001, momentum = 0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model
tl_model = define_TL_model()
train_it, test_it = iter_generator()
tl_history = fit_cnn_model(tl_model,train_it,test_it)
Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/20
312/312 [==============================] - 346s 1s/step - loss: 0.6556 - accuracy: 0.6092 - val_loss: 0.6442 - val_accuracy: 0.6611
Epoch 2/20
312/312 [==============================] - 341s 1s/step - loss: 0.5933 - accuracy: 0.6772 - val_loss: 0.5903 - val_accuracy: 0.7063
Epoch 3/20
312/312 [==============================] - 337s 1s/step - loss: 0.5576 - accuracy: 0.7103 - val_loss: 0.5116 - val_accuracy: 0.6809
Epoch 4/20
312/312 [==============================] - 340s 1s/step - loss: 0.5315 - accuracy: 0.7327 - val_loss: 0.7766 - val_accuracy: 0.7358
Epoch 5/20
312/312 [==============================] - 337s 1s/step - loss: 0.5007 - accuracy: 0.7581 - val_loss: 0.5519 - val_accuracy: 0.7240
Epoch 6/20
312/312 [==============================] - 337s 1s/step - loss: 0.4749 - accuracy: 0.7725 - val_loss: 0.5498 - val_accuracy: 0.7487
Epoch 7/20
312/312 [==============================] - 338s 1s/step - loss: 0.4463 - accuracy: 0.7944 - val_loss: 0.3613 - val_accuracy: 0.7817
Epoch 8/20
312/312 [==============================] - 338s 1s/step - loss: 0.4224 - accuracy: 0.8075 - val_loss: 0.7579 - val_accuracy: 0.7760
Epoch 9/20
312/312 [==============================] - 338s 1s/step - loss: 0.3949 - accuracy: 0.8230 - val_loss: 0.5986 - val_accuracy: 0.7782
Epoch 10/20
312/312 [==============================] - 337s 1s/step - loss: 0.3713 - accuracy: 0.8345 - val_loss: 0.3801 - val_accuracy: 0.7925
Epoch 11/20
312/312 [==============================] - 336s 1s/step - loss: 0.3420 - accuracy: 0.8516 - val_loss: 0.3372 - val_accuracy: 0.7963
Epoch 12/20
312/312 [==============================] - 334s 1s/step - loss: 0.3129 - accuracy: 0.8656 - val_loss: 0.3451 - val_accuracy: 0.7770
Epoch 13/20
312/312 [==============================] - 334s 1s/step - loss: 0.2982 - accuracy: 0.8731 - val_loss: 0.3409 - val_accuracy: 0.8020
Epoch 14/20
312/312 [==============================] - 336s 1s/step - loss: 0.2571 - accuracy: 0.8962 - val_loss: 0.4073 - val_accuracy: 0.8071
Epoch 15/20
312/312 [==============================] - 337s 1s/step - loss: 0.2307 - accuracy: 0.9104 - val_loss: 0.4753 - val_accuracy: 0.7984
Epoch 16/20
312/312 [==============================] - 335s 1s/step - loss: 0.2002 - accuracy: 0.9251 - val_loss: 0.4671 - val_accuracy: 0.8033
Epoch 17/20
312/312 [==============================] - 332s 1s/step - loss: 0.1827 - accuracy: 0.9308 - val_loss: 0.6437 - val_accuracy: 0.8124
Epoch 18/20
312/312 [==============================] - 334s 1s/step - loss: 0.1403 - accuracy: 0.9543 - val_loss: 0.4069 - val_accuracy: 0.8059
Epoch 19/20
312/312 [==============================] - 336s 1s/step - loss: 0.1137 - accuracy: 0.9655 - val_loss: 0.2892 - val_accuracy: 0.7980
Epoch 20/20
312/312 [==============================] - 338s 1s/step - loss: 0.0998 - accuracy: 0.9705 - val_loss: 0.6823 - val_accuracy: 0.7582
80/80 [==============================] - 19s 239ms/step
> 75.815
The final accuracy of this model is 75.8%, which is slightly worse than the double layer model. However, the test accuracy did exceed 80% in some epochs (peaking at 81.2% in epoch 17). We can also see from the learning curves below that the model is overfitting, and this starts at an earlier epoch (around 6) than before. At this point, it is clear that adding more convolution/pooling layers is unlikely to improve the model further, so I'll start including other strategies such as dropout layers and data augmentation. This process took almost two hours to complete.
cnn_diagnostics(tl_history)
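To compare the three architectures side by side, the sketch below traces the feature map size through each convolution/pooling block and counts parameters. The totals for the double and triple layer models are derived from the layer definitions above, not taken from a printed model.summary(), and the helper `cnn_param_count` is hypothetical.

```python
def cnn_param_count(filters, input_size=200, in_channels=3, dense_units=128):
    """Count parameters for 3x3 'same' conv + 2x2 max-pool blocks,
    followed by Flatten, Dense(dense_units) and a single sigmoid unit."""
    size, channels, total = input_size, in_channels, 0
    for f in filters:
        total += 3 * 3 * channels * f + f      # conv weights + biases
        size //= 2                             # 2x2 pooling halves height and width
        channels = f
    flat = size * size * channels              # length of the flattened vector
    total += flat * dense_units + dense_units  # hidden Dense layer
    total += dense_units + 1                   # sigmoid output unit
    return flat, total

for name, filters in [("single", [32]), ("double", [32, 64]), ("triple", [32, 64, 128])]:
    flat, total = cnn_param_count(filters)
    print(f"{name}: flatten size = {flat:,}, total params = {total:,}")
```

Each extra block halves the flattened vector (320,000, then 160,000, then 80,000), so the deeper models actually have fewer parameters overall even though they have more layers.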
Adding dropout layers¶
I'll try addressing the overfitting issue by incorporating dropout layers. This is known as dropout regularization. I'll start with a dropout of 20% after each of the three convolution/pooling blocks and a dropout of 50% after the fully connected (Dense) layer. This may not necessarily improve the accuracy of the model, but it should help with overfitting, which will be valuable in the long run.
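As a rough picture of what a dropout layer does during training, here is a plain-Python sketch of inverted dropout: each activation is zeroed with probability `rate`, and the survivors are scaled by 1/(1 - rate) so the expected sum is unchanged. This matches the behaviour described in the Keras Dropout documentation, but the function below is a hypothetical illustration, not the Keras implementation. At inference time, dropout is a no-op.

```python
import random

def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`,
    scale the surviving values by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]

rng = random.Random(0)             # seeded only to make the sketch repeatable
activations = [1.0] * 10000
dropped = dropout(activations, rate=0.2, rng=rng)

zero_fraction = dropped.count(0.0) / len(dropped)   # close to 0.2
mean_after = sum(dropped) / len(dropped)            # close to the original mean of 1.0
print(round(zero_fraction, 3), round(mean_after, 3))
```

Because a different random subset of units is silenced on every batch, the network cannot rely on any single activation, which is what limits overfitting.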
# Define Model
def define_TLD_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(200, 200, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Dropout(0.2))
    # input_shape is only needed on the first layer
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Dropout(0.2))
    model.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    opt = SGD(learning_rate = 0.001, momentum = 0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model
tld_model = define_TLD_model()
train_it, test_it = iter_generator()
tld_history = fit_cnn_model(tld_model,train_it,test_it, epochs = 50)
Found 19910 images belonging to 2 classes.
Found 5090 images belonging to 2 classes.
Epoch 1/50
312/312 [==============================] - 393s 1s/step - loss: 0.6976 - accuracy: 0.5245 - val_loss: 0.6855 - val_accuracy: 0.5717
Epoch 2/50
312/312 [==============================] - 394s 1s/step - loss: 0.6824 - accuracy: 0.5531 - val_loss: 0.6780 - val_accuracy: 0.6189
Epoch 3/50
312/312 [==============================] - 394s 1s/step - loss: 0.6762 - accuracy: 0.5735 - val_loss: 0.6870 - val_accuracy: 0.5980
Epoch 4/50
312/312 [==============================] - 393s 1s/step - loss: 0.6668 - accuracy: 0.5900 - val_loss: 0.6787 - val_accuracy: 0.6265
Epoch 5/50
312/312 [==============================] - 395s 1s/step - loss: 0.6547 - accuracy: 0.6077 - val_loss: 0.6317 - val_accuracy: 0.6373
Epoch 6/50
312/312 [==============================] - 395s 1s/step - loss: 0.6484 - accuracy: 0.6114 - val_loss: 0.6318 - val_accuracy: 0.6375
Epoch 7/50
312/312 [==============================] - 394s 1s/step - loss: 0.6381 - accuracy: 0.6259 - val_loss: 0.7235 - val_accuracy: 0.6481
Epoch 8/50
312/312 [==============================] - 394s 1s/step - loss: 0.6304 - accuracy: 0.6356 - val_loss: 0.6590 - val_accuracy: 0.6548
Epoch 9/50
312/312 [==============================] - 394s 1s/step - loss: 0.6195 - accuracy: 0.6508 - val_loss: 0.5910 - val_accuracy: 0.6646
Epoch 10/50
312/312 [==============================] - 393s 1s/step - loss: 0.6047 - accuracy: 0.6649 - val_loss: 0.6096 - val_accuracy: 0.6572
Epoch 11/50
312/312 [==============================] - 394s 1s/step - loss: 0.5948 - accuracy: 0.6769 - val_loss: 0.5737 - val_accuracy: 0.6817
Epoch 12/50
312/312 [==============================] - 402s 1s/step - loss: 0.5867 - accuracy: 0.6826 - val_loss: 0.5909 - val_accuracy: 0.6994
Epoch 13/50
312/312 [==============================] - 399s 1s/step - loss: 0.5791 - accuracy: 0.6926 - val_loss: 0.5816 - val_accuracy: 0.7167
Epoch 14/50
312/312 [==============================] - 397s 1s/step - loss: 0.5625 - accuracy: 0.7061 - val_loss: 0.5693 - val_accuracy: 0.7220
Epoch 15/50
312/312 [==============================] - 399s 1s/step - loss: 0.5535 - accuracy: 0.7095 - val_loss: 0.5309 - val_accuracy: 0.7244
Epoch 16/50
312/312 [==============================] - 398s 1s/step - loss: 0.5437 - accuracy: 0.7226 - val_loss: 0.4805 - val_accuracy: 0.7346
Epoch 17/50
312/312 [==============================] - 397s 1s/step - loss: 0.5272 - accuracy: 0.7356 - val_loss: 0.6038 - val_accuracy: 0.7385
Epoch 18/50
312/312 [==============================] - 399s 1s/step - loss: 0.5194 - accuracy: 0.7418 - val_loss: 0.5695 - val_accuracy: 0.7454
Epoch 19/50
312/312 [==============================] - 406s 1s/step - loss: 0.5124 - accuracy: 0.7476 - val_loss: 0.5086 - val_accuracy: 0.7379
Epoch 20/50
312/312 [==============================] - 398s 1s/step - loss: 0.5034 - accuracy: 0.7489 - val_loss: 0.4349 - val_accuracy: 0.7587
Epoch 21/50
312/312 [==============================] - 397s 1s/step - loss: 0.4914 - accuracy: 0.7637 - val_loss: 0.4599 - val_accuracy: 0.7648
Epoch 22/50
312/312 [==============================] - 395s 1s/step - loss: 0.4841 - accuracy: 0.7655 - val_loss: 0.5894 - val_accuracy: 0.7737
Epoch 23/50
312/312 [==============================] - 395s 1s/step - loss: 0.4708 - accuracy: 0.7771 - val_loss: 0.5886 - val_accuracy: 0.7619
Epoch 24/50
312/312 [==============================] - 408s 1s/step - loss: 0.4632 - accuracy: 0.7772 - val_loss: 0.4034 - val_accuracy: 0.7806
Epoch 25/50
312/312 [==============================] - 434s 1s/step - loss: 0.4551 - accuracy: 0.7857 - val_loss: 0.4416 - val_accuracy: 0.7770
Epoch 26/50
312/312 [==============================] - 449s 1s/step - loss: 0.4435 - accuracy: 0.7925 - val_loss: 0.4382 - val_accuracy: 0.7847
Epoch 27/50
312/312 [==============================] - 434s 1s/step - loss: 0.4345 - accuracy: 0.7979 - val_loss: 0.5633 - val_accuracy: 0.7855
Epoch 28/50
312/312 [==============================] - 407s 1s/step - loss: 0.4256 - accuracy: 0.8031 - val_loss: 0.4061 - val_accuracy: 0.7752
Epoch 29/50
312/312 [==============================] - 407s 1s/step - loss: 0.4206 - accuracy: 0.8076 - val_loss: 0.4251 - val_accuracy: 0.7772
Epoch 30/50
312/312 [==============================] - 401s 1s/step - loss: 0.4060 - accuracy: 0.8174 - val_loss: 0.4502 - val_accuracy: 0.7861
Epoch 31/50
312/312 [==============================] - 409s 1s/step - loss: 0.3949 - accuracy: 0.8212 - val_loss: 0.4621 - val_accuracy: 0.7874
Epoch 32/50
312/312 [==============================] - 417s 1s/step - loss: 0.3929 - accuracy: 0.8204 - val_loss: 0.5444 - val_accuracy: 0.7949
Epoch 33/50
312/312 [==============================] - 420s 1s/step - loss: 0.3759 - accuracy: 0.8298 - val_loss: 0.4273 - val_accuracy: 0.7937
Epoch 34/50
312/312 [==============================] - 419s 1s/step - loss: 0.3717 - accuracy: 0.8341 - val_loss: 0.4550 - val_accuracy: 0.7992
Epoch 35/50
312/312 [==============================] - 400s 1s/step - loss: 0.3595 - accuracy: 0.8401 - val_loss: 0.4704 - val_accuracy: 0.8014
Epoch 36/50
312/312 [==============================] - 398s 1s/step - loss: 0.3499 - accuracy: 0.8459 - val_loss: 0.4227 - val_accuracy: 0.7965
Epoch 37/50
312/312 [==============================] - 397s 1s/step - loss: 0.3500 - accuracy: 0.8480 - val_loss: 0.4315 - val_accuracy: 0.7931
Epoch 38/50
312/312 [==============================] - 398s 1s/step - loss: 0.3346 - accuracy: 0.8512 - val_loss: 0.4268 - val_accuracy: 0.8031
Epoch 39/50
312/312 [==============================] - 399s 1s/step - loss: 0.3292 - accuracy: 0.8570 - val_loss: 0.4325 - val_accuracy: 0.8075
Epoch 40/50
312/312 [==============================] - 400s 1s/step - loss: 0.3167 - accuracy: 0.8624 - val_loss: 0.3254 - val_accuracy: 0.8157
Epoch 41/50
312/312 [==============================] - 399s 1s/step - loss: 0.3012 - accuracy: 0.8721 - val_loss: 0.3822 - val_accuracy: 0.8143
Epoch 42/50
312/312 [==============================] - 412s 1s/step - loss: 0.2936 - accuracy: 0.8755 - val_loss: 0.3717 - val_accuracy: 0.8183
Epoch 43/50
312/312 [==============================] - 401s 1s/step - loss: 0.2872 - accuracy: 0.8755 - val_loss: 0.4908 - val_accuracy: 0.8165
Epoch 44/50
312/312 [==============================] - 399s 1s/step - loss: 0.2780 - accuracy: 0.8815 - val_loss: 0.4762 - val_accuracy: 0.8104
Epoch 45/50
312/312 [==============================] - 397s 1s/step - loss: 0.2659 - accuracy: 0.8866 - val_loss: 0.4051 - val_accuracy: 0.8143
Epoch 46/50
312/312 [==============================] - 393s 1s/step - loss: 0.2696 - accuracy: 0.8856 - val_loss: 0.4317 - val_accuracy: 0.8169
Epoch 47/50
312/312 [==============================] - 390s 1s/step - loss: 0.2565 - accuracy: 0.8931 - val_loss: 0.4638 - val_accuracy: 0.8147
Epoch 48/50
312/312 [==============================] - 391s 1s/step - loss: 0.2440 - accuracy: 0.9004 - val_loss: 0.7397 - val_accuracy: 0.8116
Epoch 49/50
312/312 [==============================] - 390s 1s/step - loss: 0.2323 - accuracy: 0.9037 - val_loss: 0.2923 - val_accuracy: 0.8204
Epoch 50/50
312/312 [==============================] - 390s 1s/step - loss: 0.2291 - accuracy: 0.9049 - val_loss: 0.5022 - val_accuracy: 0.8083
80/80 [==============================] - 18s 229ms/step
> 80.825
Nice! The accuracy of the model is 80.8% and we have minimized overfitting quite a bit as evidenced by the smaller gap in the classification accuracy between the training and test sets.
cnn_diagnostics(tld_history)
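To put a number on the gap between training and validation accuracy, the training log itself can be parsed. This is only a minimal sketch: it assumes the log lines keep the `loss: ... - accuracy: ... - val_loss: ... - val_accuracy: ...` layout that Keras printed above, and `parse_epoch_metrics` and `overfit_gap` are hypothetical helper names of my own, not Keras functions.

```python
import re

# Matches "name: value" pairs such as "val_accuracy: 0.8083" or "loss: 8.1669e-04"
METRIC_PATTERN = re.compile(r"(\w+): ([0-9.]+(?:e[-+]?\d+)?)")

def parse_epoch_metrics(log_line):
    """Return a dict of metric name -> float for one Keras progress line."""
    return {name: float(value) for name, value in METRIC_PATTERN.findall(log_line)}

def overfit_gap(metrics):
    """Training accuracy minus validation accuracy (larger means more overfitting)."""
    return metrics["accuracy"] - metrics["val_accuracy"]

# Final epoch of the run above, copied from the training log
last_epoch = ("312/312 [==============================] - 390s 1s/step - "
              "loss: 0.2291 - accuracy: 0.9049 - val_loss: 0.5022 - val_accuracy: 0.8083")
metrics = parse_epoch_metrics(last_epoch)
print(f"gap = {overfit_gap(metrics):.4f}")
```

The same two helpers can be pointed at any other epoch line in the logs to track how the gap evolves during training.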
Small Detour for GPU setup¶
The calculations and the epochs required for training are starting to take too long (the last model took almost 6 hours). I've been running TensorFlow on the CPU so far. However, massive gains in computational speed can be obtained by setting up Keras to work with a GPU, so I'll switch my TensorFlow environment to use my GPU instead.
My home setup has an AMD Ryzen 7 3700X 8-core CPU and an NVIDIA GeForce RTX 2060 GPU. I've noticed in my two most recent models that the CPU utilization was consistently at 100%, which slows down all my other tasks. Hence the need to switch to the GPU.
The instructions on setting up tensorflow with GPU can be found here: https://www.tensorflow.org/install/pip#linux
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
Num GPUs Available: 1
As you can see above, TensorFlow is recognizing my GPU. I'll quickly run a dummy model, using the dropout model from before, to make sure that the GPU is actually being used now.
test_model = define_TLD_model()
train_it, test_it = iter_generator()
test_history = fit_cnn_model(test_model,train_it,test_it, epochs = 1)
Found 19910 images belonging to 2 classes. Found 5090 images belonging to 2 classes. 312/312 [==============================] - 49s 155ms/step - loss: 0.7191 - accuracy: 0.5220 - val_loss: 0.6888 - val_accuracy: 0.5316
C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.
80/80 [==============================] - 10s 122ms/step - loss: 0.6888 - accuracy: 0.5316 > 53.163
Perfect! That's a massive reduction in computational time. An epoch with the CPU was taking approximately 400s to complete. With the GPU it's taking under 60s (49s in the test run above). That's roughly 8 times as fast! Cool, now I can start to feasibly explore more complex models within a reasonable time frame :]
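The numbers in the logs can be cross-checked with a bit of arithmetic. The sketch below assumes the batch size of 64 and takes the image counts and timings straight from the output above; the function names are just illustrative.

```python
import math

def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (the last batch may be partial)."""
    return math.ceil(num_images / batch_size)

def epoch_seconds(steps, seconds_per_step):
    """Rough epoch duration from the per-step time reported by Keras."""
    return steps * seconds_per_step

train_steps = steps_per_epoch(19910, 64)  # training images reported by the generator
test_steps = steps_per_epoch(5090, 64)    # validation images reported by the generator

cpu_epoch = 393.0  # seconds, first CPU epoch in the log
gpu_epoch = 49.0   # seconds, GPU test epoch in the log
speedup = cpu_epoch / gpu_epoch

print(train_steps, test_steps)
print(f"estimated GPU epoch: {epoch_seconds(train_steps, 0.155):.1f}s")
print(f"speedup: {speedup:.1f}x")
print(f"50 epochs: {50 * cpu_epoch / 3600:.1f} h on CPU vs {50 * gpu_epoch / 3600:.1f} h on GPU")
```

The step counts line up with the `312/312` and `80/80` progress bars in the training output.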
Data Augmentation¶
Through data augmentation I'll be applying various modifications to the training/validation set that will not only make it larger but should also result in a more general and flexible model.
Some of the modifications we can apply to the images in our dataset are rotations, translations, zooms, mirror flips and noise.
This can all be readily done through the ImageDataGenerator, as shown below, where I'll modify one of my base functions, iter_generator, to incorporate these new parameters:
#Instantiate DataGenerator and prepare iterators
def iter_generator(wsr = 0,hsr = 0, hf = False, pxx = 200, pxy = 200, pretrain = None):
    # create data generators
    if pretrain == 'VGG16':
        # center the images using the per-channel mean set below
        datagen = ImageDataGenerator(featurewise_center=True)
        datagen.mean = [123.68, 116.779, 103.939]
    elif pretrain == 'ResNet50':
        # preprocess_input resolves to the ResNet version here since it is imported last
        datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
    else:
        # rescale pixels to [0, 1] and apply the requested augmentations
        datagen = ImageDataGenerator(rescale=1.0/255.0,
                                     width_shift_range = wsr,
                                     height_shift_range = hsr,
                                     horizontal_flip = hf
                                     )
    # prepare iterators (the same generator feeds both the training and test folders)
    train_it = datagen.flow_from_directory(train_folder,
                                           class_mode='binary',
                                           batch_size=64,
                                           target_size=(pxx, pxy))
    test_it = datagen.flow_from_directory(test_folder,
                                          class_mode='binary',
                                          batch_size=64,
                                          target_size=(pxx, pxy))
    return train_it, test_it
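To make the flip and shift parameters a bit more concrete, here is a toy illustration on a tiny nested-list "image". It is a simplified sketch, not what ImageDataGenerator does internally: Keras draws a random shift within the given range and fills the exposed pixels according to its fill mode, whereas here the shift is fixed and the gap is filled with a constant. The function names are my own.

```python
def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def shift_width(image, fraction, fill=0):
    """Shift every row right by fraction * width pixels (left if negative)."""
    width = len(image[0])
    shift = int(round(fraction * width))
    shift = max(-width, min(width, shift))  # never shift further than the image width
    shifted = []
    for row in image:
        if shift >= 0:
            new_row = [fill] * shift + row[:width - shift]
        else:
            new_row = row[-shift:] + [fill] * (-shift)
        shifted.append(new_row)
    return shifted

# A tiny 2 x 4 single-channel "image"
image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]

print(horizontal_flip(image))
print(shift_width(image, 0.25))
print(shift_width(image, -0.25))
```

With wsr = 0.1 on a 200-pixel-wide image, the equivalent shift would be at most 20 pixels in either direction.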
I'll call this model TLDA since it uses the same architecture as the TLD but the training set has now been augmented.
tlda_model = define_TLD_model()
train_it, test_it = iter_generator(wsr = 0.1, hsr = 0.1, hf = True)
tlda_history = fit_cnn_model(tlda_model,train_it,test_it, epochs = 50)
Found 19910 images belonging to 2 classes. Found 5090 images belonging to 2 classes. Epoch 1/50 312/312 [==============================] - 186s 594ms/step - loss: 0.7318 - accuracy: 0.5142 - val_loss: 0.6913 - val_accuracy: 0.5589 Epoch 2/50 312/312 [==============================] - 186s 595ms/step - loss: 0.6865 - accuracy: 0.5485 - val_loss: 0.6891 - val_accuracy: 0.5473 Epoch 3/50 312/312 [==============================] - 185s 593ms/step - loss: 0.6839 - accuracy: 0.5531 - val_loss: 0.6899 - val_accuracy: 0.5456 Epoch 4/50 312/312 [==============================] - 185s 594ms/step - loss: 0.6866 - accuracy: 0.5470 - val_loss: 0.6860 - val_accuracy: 0.5780 Epoch 5/50 312/312 [==============================] - 184s 590ms/step - loss: 0.6793 - accuracy: 0.5713 - val_loss: 0.6808 - val_accuracy: 0.6104 Epoch 6/50 312/312 [==============================] - 185s 591ms/step - loss: 0.6765 - accuracy: 0.5777 - val_loss: 0.6778 - val_accuracy: 0.6088 Epoch 7/50 312/312 [==============================] - 184s 590ms/step - loss: 0.6732 - accuracy: 0.5850 - val_loss: 0.6754 - val_accuracy: 0.5692 Epoch 8/50 312/312 [==============================] - 185s 592ms/step - loss: 0.6688 - accuracy: 0.5979 - val_loss: 0.6805 - val_accuracy: 0.5570 Epoch 9/50 312/312 [==============================] - 184s 590ms/step - loss: 0.6652 - accuracy: 0.6007 - val_loss: 0.6800 - val_accuracy: 0.5680 Epoch 10/50 312/312 [==============================] - 184s 588ms/step - loss: 0.6628 - accuracy: 0.6039 - val_loss: 0.6967 - val_accuracy: 0.5448 Epoch 11/50 312/312 [==============================] - 185s 593ms/step - loss: 0.6610 - accuracy: 0.6043 - val_loss: 0.6645 - val_accuracy: 0.6138 Epoch 12/50 312/312 [==============================] - 185s 592ms/step - loss: 0.6533 - accuracy: 0.6133 - val_loss: 0.6729 - val_accuracy: 0.5790 Epoch 13/50 312/312 [==============================] - 185s 592ms/step - loss: 0.6513 - accuracy: 0.6158 - val_loss: 0.6547 - val_accuracy: 0.6218 Epoch 14/50 
312/312 [==============================] - 185s 591ms/step - loss: 0.6429 - accuracy: 0.6282 - val_loss: 0.6479 - val_accuracy: 0.6238 Epoch 15/50 312/312 [==============================] - 187s 598ms/step - loss: 0.6367 - accuracy: 0.6389 - val_loss: 0.6533 - val_accuracy: 0.6167 Epoch 16/50 312/312 [==============================] - 185s 594ms/step - loss: 0.6260 - accuracy: 0.6494 - val_loss: 0.6395 - val_accuracy: 0.6342 Epoch 17/50 312/312 [==============================] - 186s 597ms/step - loss: 0.6173 - accuracy: 0.6558 - val_loss: 0.6407 - val_accuracy: 0.6369 Epoch 18/50 312/312 [==============================] - 186s 596ms/step - loss: 0.6116 - accuracy: 0.6619 - val_loss: 0.6185 - val_accuracy: 0.6723 Epoch 19/50 312/312 [==============================] - 186s 594ms/step - loss: 0.6037 - accuracy: 0.6735 - val_loss: 0.6064 - val_accuracy: 0.6839 Epoch 20/50 312/312 [==============================] - 186s 595ms/step - loss: 0.5995 - accuracy: 0.6786 - val_loss: 0.6014 - val_accuracy: 0.6697 Epoch 21/50 312/312 [==============================] - 187s 600ms/step - loss: 0.5912 - accuracy: 0.6791 - val_loss: 0.5997 - val_accuracy: 0.6780 Epoch 22/50 312/312 [==============================] - 186s 597ms/step - loss: 0.5858 - accuracy: 0.6917 - val_loss: 0.5890 - val_accuracy: 0.6866 Epoch 23/50 312/312 [==============================] - 184s 590ms/step - loss: 0.5763 - accuracy: 0.6970 - val_loss: 0.5840 - val_accuracy: 0.6813 Epoch 24/50 312/312 [==============================] - 184s 590ms/step - loss: 0.5751 - accuracy: 0.6968 - val_loss: 0.5693 - val_accuracy: 0.7055 Epoch 25/50 312/312 [==============================] - 184s 591ms/step - loss: 0.5680 - accuracy: 0.7010 - val_loss: 0.5525 - val_accuracy: 0.7267 Epoch 26/50 312/312 [==============================] - 184s 591ms/step - loss: 0.5615 - accuracy: 0.7106 - val_loss: 0.5496 - val_accuracy: 0.7165 Epoch 27/50 312/312 [==============================] - 185s 592ms/step - loss: 0.5568 - accuracy: 
0.7144 - val_loss: 0.5560 - val_accuracy: 0.7104 Epoch 28/50 312/312 [==============================] - 184s 590ms/step - loss: 0.5456 - accuracy: 0.7215 - val_loss: 0.5270 - val_accuracy: 0.7371 Epoch 29/50 312/312 [==============================] - 185s 594ms/step - loss: 0.5453 - accuracy: 0.7165 - val_loss: 0.5348 - val_accuracy: 0.7261 Epoch 30/50 312/312 [==============================] - 185s 593ms/step - loss: 0.5345 - accuracy: 0.7286 - val_loss: 0.5170 - val_accuracy: 0.7434 Epoch 31/50 312/312 [==============================] - 184s 590ms/step - loss: 0.5294 - accuracy: 0.7342 - val_loss: 0.5079 - val_accuracy: 0.7491 Epoch 32/50 312/312 [==============================] - 186s 596ms/step - loss: 0.5225 - accuracy: 0.7386 - val_loss: 0.5091 - val_accuracy: 0.7477 Epoch 33/50 312/312 [==============================] - 187s 601ms/step - loss: 0.5159 - accuracy: 0.7446 - val_loss: 0.5050 - val_accuracy: 0.7583 Epoch 34/50 312/312 [==============================] - 191s 611ms/step - loss: 0.5109 - accuracy: 0.7471 - val_loss: 0.4988 - val_accuracy: 0.7664 Epoch 35/50 312/312 [==============================] - 187s 600ms/step - loss: 0.5038 - accuracy: 0.7552 - val_loss: 0.5040 - val_accuracy: 0.7430 Epoch 36/50 312/312 [==============================] - 186s 595ms/step - loss: 0.5003 - accuracy: 0.7584 - val_loss: 0.4913 - val_accuracy: 0.7723 Epoch 37/50 312/312 [==============================] - 185s 595ms/step - loss: 0.4954 - accuracy: 0.7616 - val_loss: 0.4845 - val_accuracy: 0.7699 Epoch 38/50 312/312 [==============================] - 187s 598ms/step - loss: 0.4889 - accuracy: 0.7661 - val_loss: 0.4761 - val_accuracy: 0.7807 Epoch 39/50 312/312 [==============================] - 184s 588ms/step - loss: 0.4849 - accuracy: 0.7711 - val_loss: 0.4795 - val_accuracy: 0.7709 Epoch 40/50 312/312 [==============================] - 185s 593ms/step - loss: 0.4815 - accuracy: 0.7723 - val_loss: 0.4653 - val_accuracy: 0.7819 Epoch 41/50 312/312 
[==============================] - 184s 590ms/step - loss: 0.4737 - accuracy: 0.7777 - val_loss: 0.4607 - val_accuracy: 0.7796 Epoch 42/50 312/312 [==============================] - 185s 592ms/step - loss: 0.4707 - accuracy: 0.7770 - val_loss: 0.4639 - val_accuracy: 0.7825 Epoch 43/50 312/312 [==============================] - 186s 595ms/step - loss: 0.4657 - accuracy: 0.7806 - val_loss: 0.4573 - val_accuracy: 0.7910 Epoch 44/50 312/312 [==============================] - 197s 633ms/step - loss: 0.4665 - accuracy: 0.7785 - val_loss: 0.4548 - val_accuracy: 0.7857 Epoch 45/50 312/312 [==============================] - 186s 598ms/step - loss: 0.4590 - accuracy: 0.7869 - val_loss: 0.4464 - val_accuracy: 0.7914 Epoch 46/50 312/312 [==============================] - 184s 591ms/step - loss: 0.4562 - accuracy: 0.7912 - val_loss: 0.4473 - val_accuracy: 0.7957 Epoch 47/50 312/312 [==============================] - 185s 591ms/step - loss: 0.4504 - accuracy: 0.7887 - val_loss: 0.4352 - val_accuracy: 0.7949 Epoch 48/50 312/312 [==============================] - 186s 596ms/step - loss: 0.4459 - accuracy: 0.7934 - val_loss: 0.4394 - val_accuracy: 0.7963 Epoch 49/50 312/312 [==============================] - 188s 603ms/step - loss: 0.4447 - accuracy: 0.7949 - val_loss: 0.4270 - val_accuracy: 0.8033 Epoch 50/50 312/312 [==============================] - 184s 588ms/step - loss: 0.4357 - accuracy: 0.7983 - val_loss: 0.4320 - val_accuracy: 0.7972
C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.
80/80 [==============================] - 38s 467ms/step - loss: 0.4359 - accuracy: 0.7943 > 79.430
cnn_diagnostics(tlda_history)
Now we are getting somewhere! We have dealt with overfitting quite well and the accuracy of our model has improved by about 10% from our initial model.
At this point I can try using a variety of different model architectures, data augmentations, and learning conditions. This is something that I may explore but I think my time would be better spent using a pretrained model for the purposes of transfer learning.
VGG16 model¶
I built the VGG16 model from scratch in a previous post just for funsies. However, Keras already has this model (and several others) pretrained and ready to use in this and any other application. You can find these pretrained models here: https://keras.io/api/applications/
There are a few things that need to be adjusted from our previous approach to have the data fed in a format suitable for VGG16. You can find the details for VGG16 here: https://keras.io/api/applications/vgg/#vgg16-function
Here are the parameters that the pretrained VGG16 model takes in:
- include_top
- weights
- input_tensor
- input_shape
- pooling
- classes
- classifier_activation
I'll be adjusting the following:
1) Set include_top = False since I need to replace the FC layers with new ones tailored to the classification problem at hand.
2) Set input_shape to (224,224,3), i.e. channels last (with a channels-first data format the equivalent shape would be (3,224,224)). Specifying input_shape is only possible because I won't be using the default 3 FC layers of the original VGG16.
3) Change the shape of our images to have sizes of 224 x 224 since the images that this model was trained on had those dimensions.
4) Since VGG16 is a pretrained model, it is unlikely that a large number of epochs will be required for training, so I'll change that to 10.
5) I'll 'freeze' the pretrained convolutional layers so that they are not trainable; only the new FC layers added on top will be trained.
6) The images need to be centered since that's how the VGG16 authors prepared their dataset. This will be done by setting the per-channel mean to [123.68, 116.779, 103.939].
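The centering in step 6 is just a per-channel subtraction. Below is a small sketch on a nested-list image; I'm assuming the three means are in R, G, B order, and `center_pixel` and `center_image` are illustrative names rather than anything provided by Keras.

```python
VGG16_CHANNEL_MEANS = (123.68, 116.779, 103.939)  # assumed R, G, B order

def center_pixel(rgb, means=VGG16_CHANNEL_MEANS):
    """Subtract the per-channel mean from one (R, G, B) pixel."""
    return tuple(value - mean for value, mean in zip(rgb, means))

def center_image(image, means=VGG16_CHANNEL_MEANS):
    """Apply center_pixel to every pixel of a nested-list image (rows of RGB tuples)."""
    return [[center_pixel(pixel, means) for pixel in row] for row in image]

# A made-up 2 x 2 RGB image
image = [[(255, 255, 255), (0, 0, 0)],
         [(124, 117, 104), (200, 50, 25)]]

for row in center_image(image):
    print([tuple(round(channel, 3) for channel in pixel) for pixel in row])
```

A pixel that sits exactly at the mean colour maps to (0, 0, 0), which is the point of the centering.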
# define cnn model
def define_VGG16_model():
    # load model
    model = VGG16(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    flat1 = Flatten()(model.layers[-1].output)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(flat1)
    output = Dense(1, activation='sigmoid')(class1)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model (newer Keras versions expect learning_rate instead of lr)
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model
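Since only the new classifier layers are trained, it's worth a back-of-the-envelope count of how many weights they add. The sketch below assumes the standard VGG16 layout (five 2x2 max-pooling stages and 512 channels in the last convolutional block); the helper names are made up for illustration.

```python
def feature_map_shape(input_size, downsample_factor, channels):
    """Spatial size and depth of the last convolutional feature map."""
    side = input_size // downsample_factor
    return (side, side, channels)

def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

def head_params(feature_shape, hidden_units=128):
    """Trainable parameters of Flatten -> Dense(hidden_units) -> Dense(1)."""
    flat = feature_shape[0] * feature_shape[1] * feature_shape[2]
    return dense_params(flat, hidden_units) + dense_params(hidden_units, 1)

# Five 2x2 max-pools halve 224 five times: 224 / 32 = 7
vgg_features = feature_map_shape(224, 32, 512)
print(vgg_features, head_params(vgg_features))
```

Calling `vgg16_model.summary()` should show the real figures for the model as built.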
Let's run it now!
vgg16_model = define_VGG16_model()
train_it, test_it = iter_generator(pxx = 224, pxy = 224, pretrain= 'VGG16')
vgg16_history = fit_cnn_model(vgg16_model,train_it,test_it, epochs = 10)
Found 19910 images belonging to 2 classes. Found 5090 images belonging to 2 classes. Epoch 1/10 312/312 [==============================] - 98s 299ms/step - loss: 0.3993 - accuracy: 0.9606 - val_loss: 0.0637 - val_accuracy: 0.9758 Epoch 2/10 312/312 [==============================] - 90s 288ms/step - loss: 0.0311 - accuracy: 0.9891 - val_loss: 0.0651 - val_accuracy: 0.9764 Epoch 3/10 312/312 [==============================] - 90s 288ms/step - loss: 0.0115 - accuracy: 0.9965 - val_loss: 0.0727 - val_accuracy: 0.9790 Epoch 4/10 312/312 [==============================] - 90s 288ms/step - loss: 0.0032 - accuracy: 0.9993 - val_loss: 0.0907 - val_accuracy: 0.9786 Epoch 5/10 312/312 [==============================] - 90s 288ms/step - loss: 0.0018 - accuracy: 0.9995 - val_loss: 0.0933 - val_accuracy: 0.9790 Epoch 6/10 312/312 [==============================] - 90s 288ms/step - loss: 0.0011 - accuracy: 0.9997 - val_loss: 0.1048 - val_accuracy: 0.9796 Epoch 7/10 312/312 [==============================] - 90s 288ms/step - loss: 8.1669e-04 - accuracy: 0.9997 - val_loss: 0.1044 - val_accuracy: 0.9788 Epoch 8/10 312/312 [==============================] - 88s 282ms/step - loss: 8.4617e-04 - accuracy: 0.9998 - val_loss: 0.1079 - val_accuracy: 0.9804 Epoch 9/10 312/312 [==============================] - 89s 285ms/step - loss: 4.9252e-04 - accuracy: 0.9999 - val_loss: 0.1100 - val_accuracy: 0.9800 Epoch 10/10 312/312 [==============================] - 90s 288ms/step - loss: 4.3476e-04 - accuracy: 0.9999 - val_loss: 0.1128 - val_accuracy: 0.9802
C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.
80/80 [==============================] - 18s 228ms/step - loss: 0.1128 - accuracy: 0.9802 > 98.016
The VGG16 model resulted in a 98.02% classification accuracy, much better than all the previous models. However, there's significant overfitting taking place: the validation loss climbs from 0.064 to 0.113 while the training loss keeps falling. I'll try to incorporate dropout layers to see if that helps.
cnn_diagnostics(vgg16_history)
# define cnn model
def define_VGG16D_model():
    # load model
    model = VGG16(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    drop0 = Dropout(0.5)(model.layers[-1].output)
    flat1 = Flatten()(drop0)
    drop1 = Dropout(0.2)(flat1)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(drop1)
    drop2 = Dropout(0.5)(class1)
    output = Dense(1, activation='sigmoid')(drop2)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model
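For anyone wondering what the Dropout layers above actually do to the activations, here is a plain-Python sketch of inverted dropout, which is how I understand the Keras layer to behave: during training each value is zeroed with probability `rate` and the survivors are scaled by 1 / (1 - rate), while at inference the values pass through unchanged. The function is an illustration only, not the Keras implementation.

```python
import random

def dropout(values, rate, training, rng):
    """Inverted dropout on a list of activations."""
    if not training or rate == 0:
        return list(values)  # inference: pass-through
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else value * keep_scale for value in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 10

print(dropout(activations, 0.5, training=True, rng=rng))
print(dropout(activations, 0.5, training=False, rng=rng))
```

With a rate of 0.5 the surviving activations are doubled, so the expected sum stays the same as without dropout.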
vgg16d_model = define_VGG16D_model()
train_it, test_it = iter_generator(pxx = 224, pxy = 224, pretrain= 'VGG16')
vgg16_history = fit_cnn_model(vgg16d_model,train_it,test_it, epochs = 10)
C:\Users\vmurc\anaconda3\lib\site-packages\keras\optimizers\optimizer_v2\gradient_descent.py:108: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead. super(SGD, self).__init__(name, **kwargs)
Found 19910 images belonging to 2 classes. Found 5090 images belonging to 2 classes. Epoch 1/10 312/312 [==============================] - 90s 286ms/step - loss: 0.2224 - accuracy: 0.9630 - val_loss: 0.0662 - val_accuracy: 0.9772 Epoch 2/10 312/312 [==============================] - 90s 288ms/step - loss: 0.0390 - accuracy: 0.9846 - val_loss: 0.0649 - val_accuracy: 0.9768 Epoch 3/10 312/312 [==============================] - 91s 292ms/step - loss: 0.0190 - accuracy: 0.9932 - val_loss: 0.0728 - val_accuracy: 0.9794 Epoch 4/10 312/312 [==============================] - 90s 289ms/step - loss: 0.0102 - accuracy: 0.9966 - val_loss: 0.0835 - val_accuracy: 0.9790 Epoch 5/10 312/312 [==============================] - 91s 290ms/step - loss: 0.0069 - accuracy: 0.9977 - val_loss: 0.0904 - val_accuracy: 0.9800 Epoch 6/10 312/312 [==============================] - 92s 294ms/step - loss: 0.0050 - accuracy: 0.9982 - val_loss: 0.0987 - val_accuracy: 0.9804 Epoch 7/10 312/312 [==============================] - 91s 291ms/step - loss: 0.0035 - accuracy: 0.9986 - val_loss: 0.1077 - val_accuracy: 0.9794 Epoch 8/10 312/312 [==============================] - 90s 287ms/step - loss: 0.0047 - accuracy: 0.9987 - val_loss: 0.1001 - val_accuracy: 0.9794 Epoch 9/10 312/312 [==============================] - 88s 282ms/step - loss: 0.0039 - accuracy: 0.9989 - val_loss: 0.0998 - val_accuracy: 0.9804 Epoch 10/10 312/312 [==============================] - 89s 284ms/step - loss: 0.0029 - accuracy: 0.9994 - val_loss: 0.1075 - val_accuracy: 0.9806
C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.
80/80 [==============================] - 18s 228ms/step - loss: 0.1075 - accuracy: 0.9806 > 98.055
cnn_diagnostics(vgg16_history)
vgg16d3_model = define_VGG16D_model()
train_it, test_it = iter_generator(pxx = 224, pxy = 224, pretrain= 'VGG16')
vgg16d3_history = fit_cnn_model(vgg16d3_model,train_it,test_it, epochs = 10)
Found 19910 images belonging to 2 classes. Found 5090 images belonging to 2 classes. Epoch 1/10 312/312 [==============================] - 92s 292ms/step - loss: 0.4423 - accuracy: 0.9364 - val_loss: 0.0807 - val_accuracy: 0.9701 Epoch 2/10 312/312 [==============================] - 89s 283ms/step - loss: 0.1194 - accuracy: 0.9556 - val_loss: 0.0727 - val_accuracy: 0.9756 Epoch 3/10 312/312 [==============================] - 89s 286ms/step - loss: 0.1000 - accuracy: 0.9646 - val_loss: 0.0617 - val_accuracy: 0.9754 Epoch 4/10 312/312 [==============================] - 89s 286ms/step - loss: 0.0949 - accuracy: 0.9670 - val_loss: 0.0626 - val_accuracy: 0.9770 Epoch 5/10 312/312 [==============================] - 88s 283ms/step - loss: 0.0874 - accuracy: 0.9672 - val_loss: 0.0624 - val_accuracy: 0.9790 Epoch 6/10 312/312 [==============================] - 87s 279ms/step - loss: 0.0782 - accuracy: 0.9718 - val_loss: 0.0584 - val_accuracy: 0.9798 Epoch 7/10 312/312 [==============================] - 87s 280ms/step - loss: 0.0733 - accuracy: 0.9733 - val_loss: 0.0554 - val_accuracy: 0.9802 Epoch 8/10 312/312 [==============================] - 88s 283ms/step - loss: 0.0638 - accuracy: 0.9754 - val_loss: 0.0566 - val_accuracy: 0.9798 Epoch 9/10 312/312 [==============================] - 87s 279ms/step - loss: 0.0624 - accuracy: 0.9762 - val_loss: 0.0607 - val_accuracy: 0.9788 Epoch 10/10 312/312 [==============================] - 88s 282ms/step - loss: 0.0595 - accuracy: 0.9768 - val_loss: 0.0559 - val_accuracy: 0.9798
C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.
80/80 [==============================] - 18s 222ms/step - loss: 0.0559 - accuracy: 0.9798 > 97.976
cnn_diagnostics(vgg16d3_history)
Nice! The addition of the dropout layers is helping with the overfitting issue while preserving a pretty high classification accuracy of 97.98%.
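One quick way to see the difference is to look at where the validation loss bottoms out in each run. The lists below are copied from the two training logs above (the plain VGG16 run and the vgg16d3 run); `best_epoch` is just a small helper I'm defining for this sketch.

```python
def best_epoch(val_losses):
    """1-based index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1

# val_loss per epoch, copied from the training logs
vgg16_val_loss = [0.0637, 0.0651, 0.0727, 0.0907, 0.0933,
                  0.1048, 0.1044, 0.1079, 0.1100, 0.1128]
vgg16_dropout_val_loss = [0.0807, 0.0727, 0.0617, 0.0626, 0.0624,
                          0.0584, 0.0554, 0.0566, 0.0607, 0.0559]

for name, losses in [("VGG16", vgg16_val_loss),
                     ("VGG16 + Drop", vgg16_dropout_val_loss)]:
    ratio = losses[-1] / min(losses)
    print(f"{name}: best epoch {best_epoch(losses)}, final/best val_loss = {ratio:.2f}")
```

Without dropout the validation loss is at its lowest after the very first epoch and drifts upward from there, whereas with dropout it keeps improving until late in training and finishes close to its minimum.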
ResNet50 Model¶
# define cnn model
def define_ResNet_model():
    # load model
    model = ResNet50(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    flat1 = Flatten()(model.layers[-1].output)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(flat1)
    output = Dense(1, activation='sigmoid')(class1)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model
resnet_model = define_ResNet_model()
train_it, test_it = iter_generator(pxx = 224, pxy = 224, pretrain= 'ResNet50')
resnet_history = fit_cnn_model(resnet_model, train_it,test_it, epochs = 10)
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5 94765736/94765736 [==============================] - 2s 0us/step Found 19910 images belonging to 2 classes. Found 5090 images belonging to 2 classes. Epoch 1/10 312/312 [==============================] - 69s 212ms/step - loss: 0.0589 - accuracy: 0.9786 - val_loss: 0.0328 - val_accuracy: 0.9890 Epoch 2/10 312/312 [==============================] - 64s 205ms/step - loss: 0.0087 - accuracy: 0.9975 - val_loss: 0.0318 - val_accuracy: 0.9888 Epoch 3/10 312/312 [==============================] - 64s 205ms/step - loss: 0.0019 - accuracy: 0.9997 - val_loss: 0.0378 - val_accuracy: 0.9884 Epoch 4/10 312/312 [==============================] - 64s 206ms/step - loss: 8.0528e-04 - accuracy: 0.9998 - val_loss: 0.0389 - val_accuracy: 0.9892 Epoch 5/10 312/312 [==============================] - 64s 204ms/step - loss: 6.1654e-04 - accuracy: 0.9999 - val_loss: 0.0402 - val_accuracy: 0.9890 Epoch 6/10 312/312 [==============================] - 64s 204ms/step - loss: 4.4182e-04 - accuracy: 0.9999 - val_loss: 0.0414 - val_accuracy: 0.9892 Epoch 7/10 312/312 [==============================] - 64s 205ms/step - loss: 2.7643e-04 - accuracy: 1.0000 - val_loss: 0.0414 - val_accuracy: 0.9896 Epoch 8/10 312/312 [==============================] - 64s 205ms/step - loss: 2.2176e-04 - accuracy: 1.0000 - val_loss: 0.0423 - val_accuracy: 0.9896 Epoch 9/10 312/312 [==============================] - 65s 208ms/step - loss: 1.9270e-04 - accuracy: 1.0000 - val_loss: 0.0427 - val_accuracy: 0.9900 Epoch 10/10 312/312 [==============================] - 65s 209ms/step - loss: 1.7137e-04 - accuracy: 1.0000 - val_loss: 0.0431 - val_accuracy: 0.9900
C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.
80/80 [==============================] - 13s 166ms/step - loss: 0.0431 - accuracy: 0.9900 > 98.998
cnn_diagnostics(resnet_history)
# define cnn model
def define_ResNetD_model():
    # load model
    model = ResNet50(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    drop0 = Dropout(0.5)(model.layers[-1].output)
    flat1 = Flatten()(drop0)
    drop1 = Dropout(0.2)(flat1)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(drop1)
    drop2 = Dropout(0.5)(class1)
    output = Dense(1, activation='sigmoid')(drop2)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model
resnetd_model = define_ResNetD_model()
train_it, test_it = iter_generator(pxx = 224, pxy = 224, pretrain= 'ResNet50')
resnetd_history = fit_cnn_model(resnetd_model, train_it,test_it, epochs = 10)
Found 19910 images belonging to 2 classes. Found 5090 images belonging to 2 classes. Epoch 1/10 312/312 [==============================] - 69s 214ms/step - loss: 0.0969 - accuracy: 0.9704 - val_loss: 0.0376 - val_accuracy: 0.9857 Epoch 2/10 312/312 [==============================] - 66s 211ms/step - loss: 0.0364 - accuracy: 0.9858 - val_loss: 0.0322 - val_accuracy: 0.9886 Epoch 3/10 312/312 [==============================] - 66s 210ms/step - loss: 0.0290 - accuracy: 0.9899 - val_loss: 0.0339 - val_accuracy: 0.9882 Epoch 4/10 312/312 [==============================] - 66s 211ms/step - loss: 0.0232 - accuracy: 0.9914 - val_loss: 0.0354 - val_accuracy: 0.9876 Epoch 5/10 312/312 [==============================] - 66s 211ms/step - loss: 0.0154 - accuracy: 0.9943 - val_loss: 0.0372 - val_accuracy: 0.9892 Epoch 6/10 312/312 [==============================] - 66s 210ms/step - loss: 0.0148 - accuracy: 0.9947 - val_loss: 0.0386 - val_accuracy: 0.9870 Epoch 7/10 312/312 [==============================] - 65s 209ms/step - loss: 0.0132 - accuracy: 0.9951 - val_loss: 0.0391 - val_accuracy: 0.9874 Epoch 8/10 312/312 [==============================] - 64s 205ms/step - loss: 0.0109 - accuracy: 0.9958 - val_loss: 0.0358 - val_accuracy: 0.9896 Epoch 9/10 312/312 [==============================] - 64s 205ms/step - loss: 0.0080 - accuracy: 0.9970 - val_loss: 0.0363 - val_accuracy: 0.9896 Epoch 10/10 312/312 [==============================] - 64s 206ms/step - loss: 0.0082 - accuracy: 0.9970 - val_loss: 0.0392 - val_accuracy: 0.9886
C:\Users\vmurc\anaconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: `Model.evaluate_generator` is deprecated and will be removed in a future version. Please use `Model.evaluate`, which supports generators.
80/80 [==============================] - 13s 163ms/step - loss: 0.0392 - accuracy: 0.9886 > 98.861
cnn_diagnostics(resnetd_history)
Model Performance Comparison¶
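The 9 models tried so far are shown below. Transfer learning through ResNet50 resulted in the highest classification accuracy (98.998%), with VGG16 close behind at about 98%. The incremental layer models built from scratch landed between roughly 75% and 81%.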
acc_scores = [74.617, 76.935, 75.815, 79.430, 81.218, 98.016, 97.976, 98.998, 98.861]
model_names = ['Single Layer', 'Double Layer', 'Triple Layer',
               'Triple Layer + Drop',
               'Triple Layer + Drop + Augment',
               'VGG16',
               'VGG16 + Drop',
               'ResNet50',
               'ResNet50 + Drop']
dft = pd.DataFrame(acc_scores, index=model_names, columns=['Accuracy Scores'])
dft
Model | Accuracy Scores (%) |
---|---|
Single Layer | 74.617 |
Double Layer | 76.935 |
Triple Layer | 75.815 |
Triple Layer + Drop | 79.430 |
Triple Layer + Drop + Augment | 81.218 |
VGG16 | 98.016 |
VGG16 + Drop | 97.976 |
ResNet50 | 98.998 |
ResNet50 + Drop | 98.861 |
Based on the accuracy and the extent of overfitting, I'll use the "ResNet50 + Drop" model to train on the full data set and make predictions.
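To see how close the top models really are, the scores can be ranked and converted back into image counts. This sketch reuses the scores from the table and assumes they were all measured on the 5090 validation images reported by the generator.

```python
acc_scores = [74.617, 76.935, 75.815, 79.430, 81.218, 98.016, 97.976, 98.998, 98.861]
model_names = ['Single Layer', 'Double Layer', 'Triple Layer',
               'Triple Layer + Drop', 'Triple Layer + Drop + Augment',
               'VGG16', 'VGG16 + Drop', 'ResNet50', 'ResNet50 + Drop']
N_VALIDATION = 5090  # validation images, assumed from the generator output

# Sort from best to worst accuracy
ranked = sorted(zip(model_names, acc_scores), key=lambda pair: pair[1], reverse=True)

for name, acc in ranked:
    correct = round(acc / 100 * N_VALIDATION)
    print(f"{name:<30} accuracy {acc:6.3f}%  error {100 - acc:6.3f}%  "
          f"misclassified {N_VALIDATION - correct}")
```

The gap between "ResNet50" and "ResNet50 + Drop" works out to only a handful of images, which is why the extent of overfitting is a reasonable tie-breaker between them.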
Final Model Preparations¶
I'll now train the final model using all the images in the training set. This will then allow me to make predictions on new images that were not originally included in the dataset. I'll start by organizing the images into the final dataset. No validation data is required in this step.
# create directories
dataset_home = 'final_dogs_vs_cats/'
# create label subdirectories
labeldirs = ['dogs/', 'cats/']
for labldir in labeldirs:
    newdir = dataset_home + labldir
    makedirs(newdir, exist_ok=True)
    print(newdir)
# list the directories in the working directory
flist = [name for name in os.listdir(".") if os.path.isdir(name)]
# copy training dataset images into subdirectories
# (assumes the last directory listed is the one holding the training images)
src_directory = os.path.join(os.getcwd(), flist[-1])
for file in listdir(src_directory):
    # os.path.join keeps the paths portable across operating systems
    src = os.path.join(src_directory, file)
    if file.startswith('cat'):
        dst = os.path.join(dataset_home, 'cats', file)
        copyfile(src, dst)
    elif file.startswith('dog'):
        dst = os.path.join(dataset_home, 'dogs', file)
        copyfile(src, dst)
final_dogs_vs_cats/dogs/
final_dogs_vs_cats/cats/
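The copy step above boils down to routing each file by its filename prefix. Here is a portable sketch of that idea using `pathlib`; the function names `label_for` and `organize_images` are hypothetical, and it assumes the Kaggle naming scheme in which training files start with `cat` or `dog`.

```python
from pathlib import Path
from shutil import copyfile

def label_for(filename):
    """Map a filename to its class folder, or None if it matches neither prefix."""
    if filename.startswith('cat'):
        return 'cats'
    if filename.startswith('dog'):
        return 'dogs'
    return None

def organize_images(src_dir, dst_dir):
    """Copy every cat/dog file in src_dir into dst_dir/cats or dst_dir/dogs."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    counts = {'cats': 0, 'dogs': 0}
    for path in sorted(src_dir.iterdir()):
        label = label_for(path.name)
        if label is None or not path.is_file():
            continue  # skip anything that is not a labeled image file
        target = dst_dir / label
        target.mkdir(parents=True, exist_ok=True)
        copyfile(path, target / path.name)
        counts[label] += 1
    return counts
```

Calling `organize_images(src_directory, dataset_home)` would return how many files landed in each class folder, which makes it easy to confirm that nothing was skipped.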
Now I'll gather my previous functions here one last time and update them to represent the final needs of the model evaluation. I'll also save my model into an h5 file for future use.
#Instantiate DataGenerator and prepare iterators
def iter_generator(wd, wsr=0, hsr=0, hf=False, pxx=200, pxy=200, model=None):
    # create data generators
    if model == 'VGG16':
        datagen = ImageDataGenerator(featurewise_center=True)
        datagen.mean = [123.68, 116.779, 103.939]
    elif model == 'ResNet50':
        datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
    else:
        datagen = ImageDataGenerator(rescale=1.0/255.0,
                                     width_shift_range=wsr,
                                     height_shift_range=hsr,
                                     horizontal_flip=hf)
    # prepare iterators
    train_it = datagen.flow_from_directory(wd,
                                           class_mode='binary',
                                           batch_size=64,
                                           target_size=(pxx, pxy))
    return train_it
# define cnn model
def define_ResNetD_model():
    # load model
    model = ResNet50(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    drop0 = Dropout(0.5)(model.layers[-1].output)
    flat1 = Flatten()(drop0)
    drop1 = Dropout(0.2)(flat1)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(drop1)
    drop2 = Dropout(0.5)(class1)
    output = Dense(1, activation='sigmoid')(drop2)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model (learning_rate replaces the deprecated lr argument)
    opt = SGD(learning_rate=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model
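To get a feel for how much of this model is actually trained, the size of the new classifier head can be worked out by hand. The sketch below assumes that ResNet50 without its top produces a 7 × 7 × 2048 feature map for a 224 × 224 input; the helper name `dense_params` is hypothetical. Dropout and Flatten layers add no weights.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

# Assumed output shape of the frozen ResNet50 base for 224x224 inputs
flattened = 7 * 7 * 2048

hidden = dense_params(flattened, 128)   # Dense(128, relu)
output = dense_params(128, 1)           # Dense(1, sigmoid)
trainable = hidden + output

print(f'flattened features: {flattened}')
print(f'trainable parameters in the new head: {trainable:,}')
```

Because every ResNet50 layer is frozen, this figure should line up with the trainable parameter count that `model.summary()` reports.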
#Model fitting
def fit_final_model(model, train_it, epochs=20):
    # fit model
    history = model.fit(train_it,
                        steps_per_epoch=len(train_it),
                        epochs=epochs,
                        verbose=1)
    # save the trained model for later predictions
    model.save('final_model.h5')
    return history
#Visualize learning results
def cnn_diagnostics(history):
    fig, axs = plt.subplots(1, 2, figsize=(30, 10))
    # validation curves only exist when the model was fit with validation data
    has_val = 'val_loss' in history.history
    # plot loss
    axs[0].set_title('Cross Entropy Loss', fontsize=40)
    axs[0].plot(history.history['loss'], color='#27308a', label='train', linestyle='dotted', linewidth=4)
    if has_val:
        axs[0].plot(history.history['val_loss'], color='#86148f', label='test', linestyle='dotted', linewidth=4)
    axs[0].set_ylabel('Cross Entropy Loss', fontsize=30)  # Y label
    axs[0].set_xlabel('Epoch', fontsize=30)  # X label
    axs[0].tick_params(axis='x', labelsize=30)
    axs[0].tick_params(axis='y', labelsize=30)
    axs[0].legend(fontsize=40)
    # plot accuracy
    axs[1].set_title('Classification Accuracy', fontsize=40)
    axs[1].plot(history.history['accuracy'], color='#27308a', label='train', linestyle='dotted', linewidth=4)
    if has_val:
        axs[1].plot(history.history['val_accuracy'], color='#86148f', label='test', linestyle='dotted', linewidth=4)
    axs[1].set_ylabel('Classification Accuracy', fontsize=30)  # Y label
    axs[1].set_xlabel('Epoch', fontsize=30)  # X label
    axs[1].tick_params(axis='x', labelsize=30)
    axs[1].tick_params(axis='y', labelsize=30)
    axs[1].legend(fontsize=40)
    plt.show()
Okay, let's run it now!
dataset_home = 'final_dogs_vs_cats/'
resnetd_model = define_ResNetD_model()
train_it = iter_generator(dataset_home, pxx = 224, pxy = 224, model = 'ResNet50')
resnetd_history = fit_final_model(resnetd_model, train_it, epochs = 20)
Found 25000 images belonging to 2 classes.
Epoch 1/20
391/391 [==============================] - 135s 328ms/step - loss: 0.0866 - accuracy: 0.9735
Epoch 2/20
391/391 [==============================] - 67s 170ms/step - loss: 0.0380 - accuracy: 0.9865
Epoch 3/20
391/391 [==============================] - 67s 170ms/step - loss: 0.0268 - accuracy: 0.9902
Epoch 4/20
391/391 [==============================] - 67s 170ms/step - loss: 0.0226 - accuracy: 0.9916
Epoch 5/20
391/391 [==============================] - 66s 168ms/step - loss: 0.0163 - accuracy: 0.9938
Epoch 6/20
391/391 [==============================] - 65s 166ms/step - loss: 0.0138 - accuracy: 0.9947
Epoch 7/20
391/391 [==============================] - 67s 170ms/step - loss: 0.0116 - accuracy: 0.9955
Epoch 8/20
391/391 [==============================] - 66s 169ms/step - loss: 0.0111 - accuracy: 0.9955
Epoch 9/20
391/391 [==============================] - 65s 166ms/step - loss: 0.0087 - accuracy: 0.9970
Epoch 10/20
391/391 [==============================] - 66s 168ms/step - loss: 0.0078 - accuracy: 0.9971
Epoch 11/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0073 - accuracy: 0.9972
Epoch 12/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0065 - accuracy: 0.9978
Epoch 13/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0049 - accuracy: 0.9982
Epoch 14/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0054 - accuracy: 0.9979
Epoch 15/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0062 - accuracy: 0.9978
Epoch 16/20
391/391 [==============================] - 64s 164ms/step - loss: 0.0045 - accuracy: 0.9984
Epoch 17/20
391/391 [==============================] - 65s 167ms/step - loss: 0.0057 - accuracy: 0.9979
Epoch 18/20
391/391 [==============================] - 67s 170ms/step - loss: 0.0041 - accuracy: 0.9987
Epoch 19/20
391/391 [==============================] - 66s 170ms/step - loss: 0.0050 - accuracy: 0.9984
Epoch 20/20
391/391 [==============================] - 66s 170ms/step - loss: 0.0042 - accuracy: 0.9985
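The `391/391` in the log is the number of batches per epoch, which is what `len(train_it)` returns for a batch size of 64. A small sketch of that arithmetic, with a hypothetical helper name `steps_per_epoch`:

```python
import math

def steps_per_epoch(n_images, batch_size):
    """Batches needed to see every image once (the last batch may be partial)."""
    return math.ceil(n_images / batch_size)

print(steps_per_epoch(25000, 64))   # full training set
print(steps_per_epoch(19910, 64))   # earlier training split
print(steps_per_epoch(5090, 64))    # earlier validation split
```

These evaluate to 391, 312, and 80, matching the step counts shown in the training and evaluation logs.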
Use the Model to Predict Whether an Image Is of a Dog or a Cat¶
Let's see how well the model responds to images outside of our dataset! I'll use pictures of my cat Catterina and my dog Maia :)
First, I'll make a function to prepare images for loading into the model (i.e., make sure they have the right dimensions, size, and centering).
from tensorflow.keras.utils import load_img
from tensorflow.keras.utils import img_to_array
from keras.models import load_model
from keras.applications.resnet import preprocess_input
def load_image_for_prediction(filename):
    # load the image
    img = load_img(filename, target_size=(224, 224))
    # convert to array
    img = img_to_array(img)
    # reshape into a single sample with 3 channels
    img = img.reshape(1, 224, 224, 3)
    # apply the same ResNet50 preprocessing used by the training generator
    # (RGB -> BGR, then ImageNet mean subtraction per channel)
    img = img.astype('float32')
    img = preprocess_input(img)
    return img
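For reference, the ResNet50 `preprocess_input` used by the training generator works in what Keras calls "caffe" mode: it reorders channels from RGB to BGR and subtracts the ImageNet channel means, without rescaling. Below is a plain-Python sketch of that transformation for a single pixel; `caffe_style_pixel` is a hypothetical name, and real code would apply `preprocess_input` to the whole array instead.

```python
# ImageNet channel means in BGR order, as used by caffe-style preprocessing
IMAGENET_MEAN_BGR = (103.939, 116.779, 123.68)

def caffe_style_pixel(rgb):
    """Convert one (R, G, B) pixel to mean-centered (B, G, R)."""
    r, g, b = rgb
    bgr = (b, g, r)
    return tuple(value - mean for value, mean in zip(bgr, IMAGENET_MEAN_BGR))

# A pixel equal to the ImageNet mean color maps to zero in every channel
print(caffe_style_pixel((123.68, 116.779, 103.939)))
# A pure red pixel ends up with its large value in the last (R) position
print(caffe_style_pixel((255.0, 0.0, 0.0)))
```

The point to watch is that the preprocessing applied at prediction time matches what the training generator applied, including the channel order.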
Then, I can feed this image to the model and have it predict whether the loaded image is of a cat or a dog. The model outputs a sigmoid probability which, once rounded, is 0 if the image is of a cat or 1 if it's a dog.
# load an image and predict the class
def dog_cat_predict(img_file, model_file):
    # load and preprocess the image
    img = load_image_for_prediction(img_file)
    # load the saved model
    model = load_model(model_file)
    # predict the probability of the 'dogs' class (class index 1)
    result = model.predict(img)
    # result has shape (1, 1); round the single probability to 0 or 1
    label = int(np.round(result[0][0]))
    if label == 0:
        print('This is a cat!')
    else:
        print('This is a dog!')
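The mapping from the sigmoid output to a label can also be written as a small standalone function. This sketch assumes the class indices assigned by `flow_from_directory`, which orders the folder names alphabetically (`cats` → 0, `dogs` → 1); `label_from_probability` and the 0.5 threshold are illustrative choices rather than part of the original code.

```python
def label_from_probability(prob, threshold=0.5):
    """Turn the model's sigmoid output (probability of 'dog') into a label."""
    return 'dog' if prob >= threshold else 'cat'

for p in (0.03, 0.49, 0.97):
    print(f'{p:.2f} -> {label_from_probability(p)}')
```

Keeping the raw probability around is handy for spotting borderline images, where the output sits close to 0.5.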
Let's give it a go! Here's the first image I'll be giving the model.
dog_cat_predict('baby1.jpg', 'final_model.h5')
1/1 [==============================] - 1s 616ms/step
This is a cat!
Super cool! It correctly recognized Catterina as a cat! Now let's try my dog Maia. Here's the picture I'll be giving the model.
dog_cat_predict('maia1.jpg', 'final_model.h5')
1/1 [==============================] - 1s 635ms/step
This is a dog!
Sweet! It recognized Maia as a dog correctly as well!!
Conclusions¶
Nine different convolutional neural networks were explored in order to differentiate between images of cats and dogs. A model was incrementally built, trained, and optimized to reduce overfitting until an 81% classification accuracy was obtained. Transfer learning was also used via modifications to the VGG16 and ResNet50 models. The model selected, ResNet50 with dropout layers, had a classification accuracy of 98.86% while keeping overfitting in check. In a quick test, it also correctly classified a cat image and a dog image that weren't in the original dataset, which suggests that it generalizes to new pictures.