Dog App

Convolutional Neural Networks

Note: The rendered HTML version of this file is on GitHub Pages and the original file is on GitHub.

Project: Write an Algorithm for a Dog Identification App


In this notebook, some template code has already been provided for you, and you will need to implement additional functionality to successfully complete this project. You will not need to modify the included code beyond what is requested. Sections that begin with '(IMPLEMENTATION)' in the header indicate that the following block of code will require additional functionality which you must provide. Instructions will be provided for each section, and the specifics of the implementation are marked in the code block with a 'TODO' statement. Please be sure to read the instructions carefully!

Note: Once you have completed all of the code implementations, you need to finalize your work by exporting the Jupyter Notebook as an HTML document. Before exporting the notebook to html, all of the code cells need to have been run so that reviewers can see the final implementation and output. You can then export the notebook by using the menu above and navigating to File -> Download as -> HTML (.html). Include the finished document along with this notebook as your submission.

In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a 'Question X' header. Carefully read each question and provide thorough answers in the following text boxes that begin with 'Answer:'. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.

Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. Markdown cells can be edited by double-clicking the cell to enter edit mode.

The rubric contains optional "Stand Out Suggestions" for enhancing the project beyond the minimum requirements. If you decide to pursue the "Stand Out Suggestions", you should include the code in this Jupyter notebook.


Why We're Here

In this notebook, you will take the first steps towards developing an algorithm that could be used as part of a mobile or web app. At the end of this project, your code will accept any user-supplied image as input. If a dog is detected in the image, it will provide an estimate of the dog's breed. If a human is detected, it will provide an estimate of the dog breed that the person most resembles. The image below displays potential sample output of your finished project (... but we expect that each student's algorithm will behave differently!).

Sample Dog Output

In this real-world setting, you will need to piece together a series of models to perform different tasks; for instance, the algorithm that detects humans in an image will be different from the CNN that infers dog breed. There are many points of possible failure, and no perfect algorithm exists. Your imperfect solution will nonetheless create a fun user experience!

The Road Ahead

We break the notebook into separate steps. Feel free to use the links below to navigate the notebook.

  • Step 0: Import Datasets
  • Step 1: Detect Humans
  • Step 2: Detect Dogs
  • Step 3: Create a CNN to Classify Dog Breeds (from Scratch)
  • Step 4: Create a CNN to Classify Dog Breeds (using Transfer Learning)
  • Step 5: Write your Algorithm
  • Step 6: Test Your Algorithm

Step 0: Import Datasets

Make sure that you've downloaded the required human and dog datasets:

  • Download the dog dataset. Unzip the folder and place it in this project's home directory, at the location /dogImages.

  • Download the human dataset. Unzip the folder and place it in the home directory, at location /lfw.

Note: If you are using a Windows machine, you are encouraged to use 7zip to extract the folder.

In the code cell below, we save the file paths for both the human (LFW) dataset and dog dataset in the numpy arrays human_files and dog_files.

The original notebook had the imports and plotting set-up scattered throughout, and with so many different parts to work on it was difficult to hunt them all down whenever I restarted the notebook, so I've gathered them here while leaving the original imports in place (or nearly so).

Imports

In [1]:
# python
from datetime import datetime
from functools import partial
from pathlib import Path
import warnings

# from pypi
from PIL import Image, ImageFile
from tabulate import tabulate
from torchvision import datasets
import matplotlib
warnings.filterwarnings("ignore", category=matplotlib.cbook.mplDeprecation)
import cv2
import face_recognition
import matplotlib.image as matplotlib_image
import matplotlib.patches as patches
import matplotlib.pyplot as plt
import numpy as np
import seaborn
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optimizer
import torchvision.models as models
import torchvision.transforms as transforms

I tend to use the full names, but the included code follows the common practice (just not mine) of shortening numpy and pyplot, so I'm going to alias them to cut down on the NameErrors.

In [2]:
pyplot = plt
numpy = np

Set Up the Plotting

In [3]:
get_ipython().run_line_magic('matplotlib', 'inline')
get_ipython().run_line_magic('config', "InlineBackend.figure_format = 'retina'")
seaborn.set(style="whitegrid",
            rc={"axes.grid": False,
                "font.family": ["sans-serif"],
                "font.sans-serif": ["Open Sans", "Latin Modern Sans", "Lato"],
                "figure.figsize": (8, 6)},
            font_scale=1)

Constants

In [4]:
INCEPTION_IMAGE_SIZE = 299
SCRATCH_IMAGE_SIZE = INCEPTION_IMAGE_SIZE
VGG_IMAGE_SIZE = 224

MEANS = [0.485, 0.456, 0.406]
DEVIATIONS = [0.229, 0.224, 0.225]
DOG_LOWER, DOG_UPPER = 150, 269

Load filenames for human and dog images.

In [5]:
ROOT_PATH = Path("~/data/datasets/dog-breed-classification/").expanduser()
HUMAN_PATH = ROOT_PATH.joinpath("lfw")
DOG_PATH = ROOT_PATH.joinpath("dogImages")
MODEL_PATH = Path("~/models/dog-breed-classification").expanduser()

assert HUMAN_PATH.is_dir()
assert DOG_PATH.is_dir()
assert MODEL_PATH.is_dir()

MODELS is a list that keeps track of anything that has been moved to the GPU so I can off-load it if needed.

In [6]:
MODELS = []
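
If GPU memory gets tight, off-loading is just a matter of moving everything in the list back to the CPU. A minimal sketch of what that might look like (deallocate is my own helper name, not something used elsewhere in the project):

def deallocate(models: list=MODELS) -> None:
    """Moves the tracked models back to the CPU and frees the cached GPU memory

    Args:
     models: models that were previously moved to the GPU
    """
    for model in models:
        model.cpu()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return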

Check CUDA

In [7]:
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
print("Using {}".format(device))
Using cuda

Handle Truncated Images

In [8]:
ImageFile.LOAD_TRUNCATED_IMAGES = True
In [9]:
human_files = np.array(list(HUMAN_PATH.glob("*/*")))
dog_files = np.array(list(DOG_PATH.glob("*/*/*")))

assert len(human_files) > 0
assert len(dog_files) > 0

# print number of images in each dataset
print('There are {:,} total human images.'.format(len(human_files)))
print('There are {:,} total dog images.'.format(len(dog_files)))
There are 13,233 total human images.
There are 8,351 total dog images.

Step 1: Detect Humans

In this section, we use OpenCV's implementation of Haar feature-based cascade classifiers to detect human faces in images.

OpenCV provides many pre-trained face detectors, stored as XML files on github. We have downloaded one of these detectors and stored it in the haarcascades directory. In the next code cell, we demonstrate how to use this detector to find human faces in a sample image.

In [10]:
import cv2
import warnings
import matplotlib
warnings.filterwarnings("ignore", category=matplotlib.cbook.mplDeprecation)
import matplotlib.pyplot as plt

# extract pre-trained face detector
haar_path = ROOT_PATH.joinpath('haarcascades/haarcascade_frontalface_alt.xml')
assert haar_path.is_file()
face_cascade = cv2.CascadeClassifier(str(haar_path))

# load color (BGR) image
img = cv2.imread(str(human_files[0]))
# convert BGR image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# find faces in image
faces = face_cascade.detectMultiScale(gray)

# print number of faces detected in the image
print('Number of faces detected:', len(faces))

# get bounding box for each detected face
for (x,y,w,h) in faces:
    # add bounding box to color image
    cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
    
# convert BGR image to RGB for plotting
cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# display the image, along with bounding box
plt.imshow(cv_rgb)
plt.show()
Number of faces detected: 1

Before using any of the face detectors, it is standard procedure to convert the images to grayscale. The detectMultiScale function executes the classifier stored in face_cascade and takes the grayscale image as a parameter.

In the above code, faces is a numpy array of detected faces, where each row corresponds to a detected face. Each detected face is a 1D array with four entries that specifies the bounding box of the detected face. The first two entries in the array (extracted in the above code as x and y) specify the horizontal and vertical positions of the top left corner of the bounding box. The last two entries in the array (extracted here as w and h) specify the width and height of the box.

Write a Human Face Detector

We can use this procedure to write a function that returns True if a human face is detected in an image and False otherwise. This function, aptly named face_detector, takes a string-valued file path to an image as input and appears in the code block below.

In [11]:
def face_detector(img_path):
    """"returns True if face is detected in image stored at img_path"""
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray)
    return len(faces) > 0

(IMPLEMENTATION) Assess the Human Face Detector

Question 1: Use the code cell below to test the performance of the face_detector function.

  • What percentage of the first 100 images in human_files have a detected human face?
  • What percentage of the first 100 images in dog_files have a detected human face?

Ideally, we would like 100% of human images with a detected face and 0% of dog images with a detected face. You will see that our algorithm falls short of this goal, but still gives acceptable performance. We extract the file paths for the first 100 images from each of the datasets and store them in the numpy arrays human_files_short and dog_files_short.

Answer: See output below.

In [12]:
from tqdm import tqdm

human_files_short = human_files[:100]
dog_files_short = dog_files[:100]

#-#-# Do NOT modify the code above this line. #-#-#
In [13]:
set([" ".join(filename.name.split("_")[:-1]) for filename in dog_files_short])
Out[13]:
{'Afghan hound',
 'American foxhound',
 'Basset hound',
 'Belgian tervuren',
 'Bichon frise',
 'Bluetick coonhound',
 'Border terrier',
 'Boxer',
 'English cocker spaniel',
 'Greyhound',
 'Lowchen',
 'Newfoundland',
 'Norwich terrier',
 'Papillon',
 'Smooth fox terrier',
 'Tibetan mastiff'}

I'm going to re-do this with dlib, so I'll make a function that answers the percentage questions and adds an F1 score to make the two detectors a little easier to compare.

In [14]:
def species_scorer(predictor: callable,
                   true_species: list,
                   false_species: list,
                   labels: list) -> list:
    """Emit a score-table for the predictor

    Args:
     predictor: callable that returns True if it detects the expected species
     true_species: list of images that should be matched by predictor
     false_species: list of images that shouldn't be matched by predictor
     labels: column labels for the table

    Returns:
     detections for the false-species images (True entries are false positives)
    """
    misses = [predictor(str(image)) for image in false_species]
    false_positives = sum(misses)
    true_positives = sum([predictor(str(image)) for image in true_species])
    false_negatives = len(true_species) - true_positives
    others = len(false_species)
    expected = len(true_species)
    values = ("{:.2f}%".format(100 * true_positives/expected),
            "{:.2f}%".format(100 * false_positives/others),
              "{:.2f}".format((2 * true_positives)/(2 * true_positives
                                                    + false_positives
                                                    + false_negatives)))
    table = zip(labels, values)
    print(tabulate(table, tablefmt="github", headers=["Metric", "Value"]))
    return misses
In [16]:
face_scorer = partial(species_scorer,
                      true_species=human_files_short,
                      false_species=dog_files_short,
                      labels=("First 100 images in `human_files` detected with a face",
                              "First 100 images in `dog_files` detected with a face",
                              "F1"))
In [17]:
open_cv_false_positives = face_scorer(face_detector)
Metric                                                  Value
------------------------------------------------------  -------
First 100 images in `human_files` detected with a face  98.00%
First 100 images in `dog_files` detected with a face    9.00%
F1                                                      0.95

We suggest the face detector from OpenCV as a potential way to detect human images in your algorithm, but you are free to explore other approaches, especially approaches that make use of deep learning :). Please use the code cell below to design and test your own face detection algorithm. If you decide to pursue this optional task, report performance on human_files_short and dog_files_short.

DLIB with face_recognition

This face detector uses face_recognition, a python interface to dlib's facial recognition code.

Testing It with An Image

I created the detect_faces and add_bounding_boxes functions so that I can re-use detect_faces later for the dlib version of the face_detector function.

In [18]:
def detect_faces(image_path: str) -> numpy.ndarray:
    """Finds the locations of faces
    
    Args:
     image_path: path to the image
        
    Returns:
     array of bounding box coordinates for the face(s)
    """
    image = face_recognition.load_image_file(str(image_path))
    return face_recognition.face_locations(image)
In [19]:
def add_bounding_boxes(image_path: str,
                       axe: matplotlib.axes.Axes) -> None:
    """Adds patches to the current matplotlib figure
    
    Args:
     image_path: path to the image file
     axe: axes to add the rectangle to
    """
    for (top, right, bottom, left) in detect_faces(image_path):
        width = right - left
        height = bottom - top
        # (left, top) is the upper-left corner of the box in image coordinates
        rectangle = matplotlib.patches.Rectangle((left, top), width, height,
                                                 fill=False)
        axe.add_patch(rectangle)
    return    
In [20]:
figure, axe = pyplot.subplots()
human = human_files[0]
name = " ".join(human.name.split("_")[:-1])
image = matplotlib.image.imread(human)
figure.suptitle("dlib Face Recognition Bounding-Box ({})".format(name),
                weight='bold')
add_bounding_boxes(human, axe)
axe.tick_params(axis="both",
                which="both",
                bottom=False,
                top=False)
axe.get_xaxis().set_ticks([])
axe.get_yaxis().set_ticks([])
        
plot = axe.imshow(image)

Test the performance

In [21]:
def has_face(image_path: str) -> bool:
    """Checks if there is at least one face in the image

    Args:
     image_path: path to the image file

    Returns:
     True if there's at least one face in the image
    """
    return len(detect_faces(image_path)) > 0
In [22]:
dlib_false_positives = face_scorer(has_face)
Metric                                                  Value
------------------------------------------------------  -------
First 100 images in `human_files` detected with a face  100.00%
First 100 images in `dog_files` detected with a face    11.00%
F1                                                      0.95

The dlib version did slightly better at recognizing the humans as humans, but it also had more false positives, so overall it did about the same. Although I didn't include the timings here, the dlib version is roughly four times slower than the OpenCV version, so the OpenCV version might be better in a real-time environment; on the other hand, the dlib version is much simpler to use, so it might be better if speed isn't a factor or recall matters more than precision.
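
The speed comparison is anecdotal (I didn't keep the measurements), but a rough sketch of how it could be checked, using the short file lists so the numbers would only be approximate, might look like this:

def time_detector(detector: callable, images) -> float:
    """Returns the seconds the detector needs to process all the images"""
    started = datetime.now()
    for image in images:
        detector(str(image))
    return (datetime.now() - started).total_seconds()

opencv_seconds = time_detector(face_detector, human_files_short)
dlib_seconds = time_detector(has_face, human_files_short)
print("OpenCV: {:.1f} s, dlib: {:.1f} s".format(opencv_seconds, dlib_seconds))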


Step 2: Detect Dogs

In this section, we use a pre-trained model to detect dogs in images.

Obtain Pre-trained VGG-16 Model

The code cell below downloads the VGG-16 model, along with weights that have been trained on ImageNet, a very large, very popular dataset used for image classification and other vision tasks. ImageNet contains over 10 million URLs, each linking to an image containing an object from one of 1000 categories.

In [22]:
import torch
import torchvision.models as models
In [22]:
# define VGG16 model
VGG16 = models.vgg16(pretrained=True)
In [23]:
# move model to GPU if CUDA is available
if use_cuda:
    VGG16 = VGG16.cuda()
    MODELS.append(VGG16)

Given an image, this pre-trained VGG-16 model returns a prediction (derived from the 1000 possible categories in ImageNet) for the object that is contained in the image.

(IMPLEMENTATION) Making Predictions with a Pre-trained Model

In the next code cell, you will write a function that accepts a path to an image (such as 'dogImages/train/001.Affenpinscher/Affenpinscher_00001.jpg') as input and returns the index corresponding to the ImageNet class that is predicted by the pre-trained VGG-16 model. The output should always be an integer between 0 and 999, inclusive.

Before writing the function, make sure that you take the time to learn how to appropriately pre-process tensors for pre-trained models in the PyTorch documentation.

Transforms

The VGG model expects a 224x224 image (Very Deep Convolutional Networks for Large-Scale Image Recognition) and, according to the pytorch documentation, all the pre-trained models expect inputs normalized with means [0.485, 0.456, 0.406] and standard deviations [0.229, 0.224, 0.225], so the images need to be transformed accordingly. The MEANS and DEVIATIONS lists are defined in the constants section at the top of the document along with VGG_IMAGE_SIZE.

In [24]:
vgg_transform = transforms.Compose([transforms.Resize(255),
                                    transforms.CenterCrop(VGG_IMAGE_SIZE),
                                    transforms.ToTensor(),
                                    transforms.Normalize(MEANS,
                                                         DEVIATIONS)])

Since I'm going to use the Inception-v3 network later on I'm going to create a generic function first and then use it to build separate predictor functions.

In [25]:
def model_predict(image_path: str, model: nn.Module,
                  transform: transforms.Compose) -> int:
    """Predicts the class of item in image

    Args:
     image_path: path to the image to check
     model: model to make the prediction
     transform: callable to convert the image to a tensor

    Returns:
     index corresponding to the model's prediction
    """
    image = Image.open(str(image_path))
    image = transform(image).unsqueeze(0).to(device)
    output = model(image)
    probabilities = torch.exp(output)
    _, top_class = probabilities.topk(1, dim=1)
    return top_class.item()    
In [26]:
VGG16_predict = partial(model_predict, model=VGG16, transform=vgg_transform)

(IMPLEMENTATION) Write a Dog Detector

While looking at the dictionary, you will notice that the categories corresponding to dogs appear in an uninterrupted sequence and correspond to dictionary keys 151-268, inclusive, to include all categories from 'Chihuahua' to 'Mexican hairless'. Thus, in order to check to see if an image is predicted to contain a dog by the pre-trained VGG-16 model, we need only check if the pre-trained model predicts an index between 151 and 268 (inclusive).

Use these ideas to complete the dog_detector function below, which returns True if a dog is detected in an image (and False if not).

In [27]:
def dog_detector(img_path: str, predictor: callable=VGG16_predict) -> bool:
    """Predicts if the image is a dog

    Args:
     img_path: path to image file
     predictor: callable that maps the image to an ID
    
    Returns:
     is-dog: True if the image contains a dog
    """
    return DOG_LOWER < predictor(img_path) < DOG_UPPER

(IMPLEMENTATION) Assess the Dog Detector

Question 2: Use the code cell below to test the performance of your dog_detector function.

  • What percentage of the images in human_files_short have a detected dog?
  • What percentage of the images in dog_files_short have a detected dog?
In [28]:
dog_scorer = partial(species_scorer,
                     true_species=dog_files_short,
                     false_species=human_files_short,
                     labels=("Images in `dog_files_short` with a detected dog",
                             "Images in `human_files_short` with a detected dog", "F1"))
In [30]:
false_dogs = dog_scorer(dog_detector)
Metric                                            Value
------------------------------------------------  -------
Images in `dog_files_short` with a detected dog   92.00%
Images in `human_files_short with a detected dog  1.00%
F1                                                0.95

The VGG model detected 92% of the dogs and misclassified 1% of the humans as dogs.

We suggest VGG-16 as a potential network to detect dog images in your algorithm, but you are free to explore other pre-trained networks (such as Inception-v3, ResNet-50, etc). Please use the code cell below to test other pre-trained PyTorch models. If you decide to pursue this optional task, report performance on human_files_short and dog_files_short.

Inception Dog Detector

In [29]:
inception = models.inception_v3(pretrained=True)
inception.to(device)
MODELS.append(inception)
inception.eval()
pass # this is to prevent the output from dumping into the notebook

I couldn't find any place where PyTorch documents it, but if you look at the source code there's a comment in the forward method indicating that the image needs to be 299x299x3, so the images need to be transformed to a different size than the VGG images. INCEPTION_IMAGE_SIZE is set to 299 at the top of this document since it is shared with code that comes in a later section.

In [36]:
inception_transforms = transforms.Compose([transforms.Resize(INCEPTION_IMAGE_SIZE),
                                           transforms.CenterCrop(INCEPTION_IMAGE_SIZE),
                                           transforms.ToTensor(),
                                           transforms.Normalize(MEANS,
                                                                DEVIATIONS)])
In [37]:
inception_predicts = partial(model_predict, model=inception, transform=inception_transforms)
In [38]:
inception_dog_detector = partial(dog_detector, predictor=inception_predicts)
In [39]:
dlib_false_dogs = dog_scorer(inception_dog_detector)
Metric                                            Value
------------------------------------------------  -------
Images in `dog_files_short` with a detected dog   100.00%
Images in `human_files_short with a detected dog  0.00%
F1                                                1.00

The Inception model does better than the VGG model here, detecting all of the dogs without any false positives.


Step 3: Create a CNN to Classify Dog Breeds (from Scratch)

Now that we have functions for detecting humans and dogs in images, we need a way to predict breed from images. In this step, you will create a CNN that classifies dog breeds. You must create your CNN from scratch (so, you can't use transfer learning yet!), and you must attain a test accuracy of at least 10%. In Step 4 of this notebook, you will have the opportunity to use transfer learning to create a CNN that attains greatly improved accuracy.

We mention that the task of assigning breed to dogs from images is considered exceptionally challenging. To see why, consider that even a human would have trouble distinguishing between a Brittany and a Welsh Springer Spaniel.

Brittany Welsh Springer Spaniel

It is not difficult to find other dog breed pairs with minimal inter-class variation (for instance, Curly-Coated Retrievers and American Water Spaniels).

Curly-Coated Retriever American Water Spaniel

Likewise, recall that labradors come in yellow, chocolate, and black. Your vision-based algorithm will have to conquer this high intra-class variation to determine how to classify all of these different shades as the same breed.

Yellow Labrador Chocolate Labrador

We also mention that random chance presents an exceptionally low bar: setting aside the fact that the classes are slightly imbalanced, a random guess will provide a correct answer roughly 1 in 133 times, which corresponds to an accuracy of less than 1%.

Remember that the practice is far ahead of the theory in deep learning. Experiment with many different architectures, and trust your intuition. And, of course, have fun!

(IMPLEMENTATION) Specify Data Loaders for the Dog Dataset

Use the code cell below to write three separate data loaders for the training, validation, and test datasets of dog images (located at dogImages/train, dogImages/valid, and dogImages/test, respectively). You may find this documentation on custom datasets to be a useful resource. If you are interested in augmenting your training and/or validation data, check out the wide variety of transforms!

The SCRATCH_IMAGE_SIZE, MEANS, and DEVIATIONS variables are defined in the constants section at the top of the notebook.

In [30]:
train_transform = transforms.Compose([
    transforms.RandomRotation(30),
    transforms.RandomResizedCrop(SCRATCH_IMAGE_SIZE),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(MEANS,
                         DEVIATIONS)])

test_transform = transforms.Compose([transforms.Resize(350),
                                     transforms.CenterCrop(SCRATCH_IMAGE_SIZE),
                                     transforms.ToTensor(),
                                     transforms.Normalize(MEANS,
                                                          DEVIATIONS)])
In [28]:
dog_training_path = DOG_PATH.joinpath("train")
dog_validation_path = DOG_PATH.joinpath("valid")
dog_testing_path = DOG_PATH.joinpath("test")
In [31]:
training = datasets.ImageFolder(root=str(dog_training_path),
                                transform=train_transform)
validation = datasets.ImageFolder(root=str(dog_validation_path),
                                  transform=test_transform)
testing = datasets.ImageFolder(root=str(dog_testing_path),
                               transform=test_transform)
In [43]:
BATCH_SIZE = 32
WORKERS = 0

train_batches = torch.utils.data.DataLoader(training, batch_size=BATCH_SIZE,
                                            shuffle=True, num_workers=WORKERS)
validation_batches = torch.utils.data.DataLoader(
    validation, batch_size=BATCH_SIZE, shuffle=True, num_workers=WORKERS)
test_batches = torch.utils.data.DataLoader(
    testing, batch_size=BATCH_SIZE, shuffle=True, num_workers=WORKERS)

loaders_scratch = dict(train=train_batches,
                       validation=validation_batches,
                       test=test_batches)

Question 3: Describe your chosen procedure for preprocessing the data.

  • How does your code resize the images (by cropping, stretching, etc)? What size did you pick for the input tensor, and why?
  • Did you decide to augment the dataset? If so, how (through translations, flips, rotations, etc)? If not, why not?

Answer:

  • The training images are resized by cropping them, while the testing images are resized by scaling then cropping them. The size I chose for the images was 299 pixels so that I can reuse them with an Inception V3 network in the next section.

  • The training was augmented using rotation, cropping, and horizontal flipping.

(IMPLEMENTATION) Model Architecture

Create a CNN to classify dog breed. Use the template in the code cell below.

In [33]:
BREEDS = len(training.classes)
print("There are {} breeds.".format(BREEDS))
There are 133 breeds.
In [14]:
LAYER_ONE_IN = 3
LAYER_ONE_OUT = 16
LAYER_TWO_OUT = LAYER_ONE_OUT * 2
LAYER_THREE_OUT = LAYER_TWO_OUT * 2
FLATTEN_TO = (SCRATCH_IMAGE_SIZE//8)**2 * LAYER_THREE_OUT
FULLY_CONNECTED_OUT = int(str(FLATTEN_TO)[:3])//100 * 100
KERNEL = 3
PADDING = 1
In [15]:
import torch.nn as nn
import torch.nn.functional as F
In [16]:
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(LAYER_ONE_IN, LAYER_ONE_OUT,
                               KERNEL, padding=PADDING)
        self.conv2 = nn.Conv2d(LAYER_ONE_OUT, LAYER_TWO_OUT,
                               KERNEL, padding=PADDING)
        self.conv3 = nn.Conv2d(LAYER_TWO_OUT, LAYER_THREE_OUT,
                               KERNEL, padding=PADDING)
        # max pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # linear layer
        self.fc1 = nn.Linear(FLATTEN_TO, FULLY_CONNECTED_OUT)
        self.fc2 = nn.Linear(FULLY_CONNECTED_OUT, BREEDS)
        # dropout layer
        self.dropout = nn.Dropout(0.25)
        return
    
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))

        x = x.view(-1, FLATTEN_TO)
        x = self.dropout(x)

        x = self.dropout(F.relu(self.fc1(x)))
        return self.fc2(x)
#-#-# You do NOT have to modify the code below this line. #-#-#

# instantiate the CNN
model_scratch = Net()

# move tensors to GPU if CUDA is available
if use_cuda:
    model_scratch.cuda()
    MODELS.append(model_scratch)

Question 4: Outline the steps you took to get to your final CNN architecture and your reasoning at each step.

Answer:

It was largely trial and error, copying what we did in the CIFAR problem. I chose (somewhat arbitrarily) three convolutional layers, since two layers didn't seem to do very well. Each convolutional layer doubles the depth while halving the height and width (using MaxPool).

I then flattened the output to transition from the convolutional layers to the fully-connected layers. I added a fully-connected layer whose output size (FULLY_CONNECTED_OUT) takes the first three digits of the flattened size and rounds them down to the nearest hundred. There wasn't any magic to the number; I just wanted a transition from the large flattened layer to the final output layer, and since I was running out of memory when experimenting with larger values, and since this isn't the intended final model, I tried to keep it modest.

To reduce the likelihood of overfitting I applied dropout to the activation layers (except the final one). Finally, at each of the layers (except the final output layer) I applied ReLU activation to make the model non-linear.
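
As a sanity check on the sizing, here is a small sketch of the arithmetic behind FLATTEN_TO and FULLY_CONNECTED_OUT (an illustration only, not a cell from the original run):

# each of the three MaxPool2d(2, 2) layers halves the height and width
# (integer division), so the 299-pixel input shrinks to 299//8 = 37 pixels
size = SCRATCH_IMAGE_SIZE
for _ in range(3):
    size //= 2
assert size == SCRATCH_IMAGE_SIZE // 8

# the last convolutional layer has LAYER_THREE_OUT (64) channels, so the
# flattened feature vector has size**2 * LAYER_THREE_OUT entries
assert size**2 * LAYER_THREE_OUT == FLATTEN_TO

# FULLY_CONNECTED_OUT keeps the first three digits of that count, rounded
# down to the nearest hundred, as a modest hidden-layer size
print(FLATTEN_TO, FULLY_CONNECTED_OUT)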

(IMPLEMENTATION) Specify Loss Function and Optimizer

Use the next code cell to specify a loss function and optimizer. Save the chosen loss function as criterion_scratch, and the optimizer as optimizer_scratch below.

In [17]:
import torch.optim as optimizer

criterion_scratch = nn.CrossEntropyLoss()
optimizer_scratch = optimizer.SGD(model_scratch.parameters(),
                                  lr=0.001,
                                  momentum=0.9)

(IMPLEMENTATION) Train and Validate the Model

Train and validate your model in the code cell below. Save the final model parameters at filepath 'model_scratch.pt'.

In [18]:
def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path,
          print_function: callable=print,
          is_inception: bool=False):
    """Trains the model

    Args:
     n_epochs: the number of times to repeat training
     loaders: dict of data batch-loaders
     model: the model to train
     optimizer: the gradient descent object
     criterion: The object to calculate the loss
     use_cuda: boolean to decide whether to move the data to the GPU
     save_path: path to file to save best model to
     print_function: something to pass output to
     is_inception: if True, expect a tuple of tensors as the model output
    """
    # initialize tracker for minimum validation loss
    valid_loss_min = np.Inf
    
    # check the keys are right so you don't waste an entire epoch to find out
    training_batches = loaders["train"]
    validation_batches = loaders["validation"]
    started = datetime.now()
    print_function("Training Started: {}".format(started))
    for epoch in range(1, n_epochs+1):
        # initialize variables to monitor training and validation loss
        epoch_started = datetime.now()
        train_loss = 0.0
        valid_loss = 0.0
        
        ###################
        # train the model #
        ###################
        model.train()
        for data, target in training_batches:
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            optimizer.zero_grad()
            if is_inception:
                output, _ = model(data)
            else:
                output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            train_loss += loss.item() * data.size(0)
        train_loss /= len(training_batches.dataset)

        ######################    
        # validate the model #
        ######################
        model.eval()
        for data, target in validation_batches:
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            output = model(data)
            loss = criterion(output, target)
            valid_loss += loss.item() * data.size(0)
        valid_loss /= len(validation_batches.dataset)
        print_function('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}\tElapsed: {}'.format(
            epoch,                     
            train_loss,
            valid_loss,
            datetime.now() - epoch_started,
            ))
        
        if valid_loss < valid_loss_min:
            print_function(
                ("Validation loss decreased ({:.6f} --> {:.6f}). "
                 "Saving model ...").format(
                     valid_loss_min,
                     valid_loss))
            torch.save(model.state_dict(), save_path)
            valid_loss_min = valid_loss
    ended = datetime.now()
    print_function("Training Ended: {}".format(ended))
    print_function("Total Training Time: {}".format(ended - started))            
    return model

Tee

I found out the hard way that Jupyter loses the ability to re-connect to a running cell if you close and re-open the tab, so if you close it you lose all the output. The Tee class below makes sure the output also gets saved to a file.

In [64]:
class Tee:
    """Save the input to a file and print it

    Args:
     log_name: name to give the log    
     directory_name: path to the directory for the log file
    """
    def __init__(self, log_name: str, 
                 directory_name: str="../../../logs/dog-breed-classifier") -> None:
        self.directory_name = directory_name
        self.log_name = log_name
        self._path = None
        self._log = None
        return

    @property
    def path(self) -> Path:
        """path to the log-file"""
        if self._path is None:
            self._path = Path(self.directory_name).expanduser()
            assert self._path.is_dir()
            self._path = self._path.joinpath(self.log_name)
        return self._path

    @property
    def log(self):
        """File object to write log to"""
        if self._log is None:
            self._log = self.path.open("w", buffering=1)
        return self._log

    def __call__(self, line: str) -> None:
        """Writes to the file and stdout

        Args:
         line: text to emit
        """
        self.log.write("{}\n".format(line))
        print(line)
        return

Train the Model

In [20]:
scratch_path = MODEL_PATH.joinpath("model_scratch.pt")
scratch_log = Tee(log_name="scratch_train.log")
In [21]:
EPOCHS = 100
In [22]:
model_scratch = train(EPOCHS, loaders_scratch, model_scratch, optimizer_scratch, 
                      criterion_scratch, use_cuda, scratch_path, print_function=scratch_log)
Training Started: 2019-01-07 00:17:48.769216
Epoch: 1        Training Loss: 4.877051         Validation Loss: 4.841412       Elapsed: 0:03:13.834452
Validation loss decreased (inf --> 4.841412). Saving model ...
Epoch: 2        Training Loss: 4.820985         Validation Loss: 4.747336       Elapsed: 0:03:01.535938
Validation loss decreased (4.841412 --> 4.747336). Saving model ...
Epoch: 3        Training Loss: 4.767189         Validation Loss: 4.684055       Elapsed: 0:03:01.574621
Validation loss decreased (4.747336 --> 4.684055). Saving model ...
Epoch: 4        Training Loss: 4.728553         Validation Loss: 4.607475       Elapsed: 0:03:02.878120
Validation loss decreased (4.684055 --> 4.607475). Saving model ...
Epoch: 5        Training Loss: 4.643230         Validation Loss: 4.515298       Elapsed: 0:03:01.719175
Validation loss decreased (4.607475 --> 4.515298). Saving model ...
Epoch: 6        Training Loss: 4.601643         Validation Loss: 4.451782       Elapsed: 0:03:02.711892
Validation loss decreased (4.515298 --> 4.451782). Saving model ...
Epoch: 7        Training Loss: 4.563049         Validation Loss: 4.390049       Elapsed: 0:03:02.421659
Validation loss decreased (4.451782 --> 4.390049). Saving model ...
Epoch: 8        Training Loss: 4.525313         Validation Loss: 4.401180       Elapsed: 0:03:00.623633
Epoch: 9        Training Loss: 4.494441         Validation Loss: 4.316231       Elapsed: 0:03:03.307759
Validation loss decreased (4.390049 --> 4.316231). Saving model ...
Epoch: 10       Training Loss: 4.462459         Validation Loss: 4.309952       Elapsed: 0:03:01.247355
Validation loss decreased (4.316231 --> 4.309952). Saving model ...
Epoch: 11       Training Loss: 4.440028         Validation Loss: 4.282603       Elapsed: 0:03:01.817202
Validation loss decreased (4.309952 --> 4.282603). Saving model ...
Epoch: 12       Training Loss: 4.408276         Validation Loss: 4.256291       Elapsed: 0:03:02.940067
Validation loss decreased (4.282603 --> 4.256291). Saving model ...
Epoch: 13       Training Loss: 4.382314         Validation Loss: 4.230955       Elapsed: 0:03:01.484585
Validation loss decreased (4.256291 --> 4.230955). Saving model ...
Epoch: 14       Training Loss: 4.339535         Validation Loss: 4.178119       Elapsed: 0:03:01.819115
Validation loss decreased (4.230955 --> 4.178119). Saving model ...
Epoch: 15       Training Loss: 4.314611         Validation Loss: 4.172305       Elapsed: 0:03:01.862936
Validation loss decreased (4.178119 --> 4.172305). Saving model ...
Epoch: 16       Training Loss: 4.294925         Validation Loss: 4.179273       Elapsed: 0:03:02.859107
Epoch: 17       Training Loss: 4.269919         Validation Loss: 4.121323       Elapsed: 0:03:02.187248
Validation loss decreased (4.172305 --> 4.121323). Saving model ...
Epoch: 18       Training Loss: 4.229653         Validation Loss: 4.078084       Elapsed: 0:03:02.005417
Validation loss decreased (4.121323 --> 4.078084). Saving model ...
Epoch: 19       Training Loss: 4.211623         Validation Loss: 4.075537       Elapsed: 0:03:02.023912
Validation loss decreased (4.078084 --> 4.075537). Saving model ...
Epoch: 20       Training Loss: 4.176366         Validation Loss: 4.071403       Elapsed: 0:03:02.443931
Validation loss decreased (4.075537 --> 4.071403). Saving model ...
Epoch: 21       Training Loss: 4.162033         Validation Loss: 4.060058       Elapsed: 0:03:01.880442
Validation loss decreased (4.071403 --> 4.060058). Saving model ...
Epoch: 22       Training Loss: 4.152350         Validation Loss: 4.017785       Elapsed: 0:03:02.961102
Validation loss decreased (4.060058 --> 4.017785). Saving model ...
Epoch: 23       Training Loss: 4.126623         Validation Loss: 4.061260       Elapsed: 0:03:02.727963
Epoch: 24       Training Loss: 4.099212         Validation Loss: 3.992973       Elapsed: 0:03:01.699973
Validation loss decreased (4.017785 --> 3.992973). Saving model ...
Epoch: 25       Training Loss: 4.075190         Validation Loss: 3.998641       Elapsed: 0:03:01.713804
Epoch: 26       Training Loss: 4.046143         Validation Loss: 3.997265       Elapsed: 0:03:02.571748
Epoch: 27       Training Loss: 4.043575         Validation Loss: 3.949613       Elapsed: 0:03:01.425152
Validation loss decreased (3.992973 --> 3.949613). Saving model ...
Epoch: 28       Training Loss: 4.015487         Validation Loss: 3.961522       Elapsed: 0:03:02.782270
Epoch: 29       Training Loss: 3.998070         Validation Loss: 3.948969       Elapsed: 0:03:02.048881
Validation loss decreased (3.949613 --> 3.948969). Saving model ...
Epoch: 30       Training Loss: 3.991606         Validation Loss: 3.938675       Elapsed: 0:03:02.713836
Validation loss decreased (3.948969 --> 3.938675). Saving model ...
Epoch: 31       Training Loss: 3.963830         Validation Loss: 3.918792       Elapsed: 0:03:01.697762
Validation loss decreased (3.938675 --> 3.918792). Saving model ...
Epoch: 32       Training Loss: 3.930790         Validation Loss: 3.897582       Elapsed: 0:03:01.460303
Validation loss decreased (3.918792 --> 3.897582). Saving model ...
Epoch: 33       Training Loss: 3.896765         Validation Loss: 3.963304       Elapsed: 0:03:02.224769
Epoch: 34       Training Loss: 3.879835         Validation Loss: 3.893857       Elapsed: 0:03:02.983978
Validation loss decreased (3.897582 --> 3.893857). Saving model ...
Epoch: 35       Training Loss: 3.888119         Validation Loss: 3.900615       Elapsed: 0:03:02.187086
Epoch: 36       Training Loss: 3.839318         Validation Loss: 3.884181       Elapsed: 0:03:02.805424
Validation loss decreased (3.893857 --> 3.884181). Saving model ...
Epoch: 37       Training Loss: 3.814765         Validation Loss: 3.863985       Elapsed: 0:03:03.838610
Validation loss decreased (3.884181 --> 3.863985). Saving model ...
Epoch: 38       Training Loss: 3.801056         Validation Loss: 3.873780       Elapsed: 0:03:03.033119
Epoch: 39       Training Loss: 3.797330         Validation Loss: 3.827120       Elapsed: 0:03:02.329334
Validation loss decreased (3.863985 --> 3.827120). Saving model ...
Epoch: 40       Training Loss: 3.776431         Validation Loss: 3.852023       Elapsed: 0:03:03.616306
Epoch: 41       Training Loss: 3.747829         Validation Loss: 3.814612       Elapsed: 0:03:03.231390
Validation loss decreased (3.827120 --> 3.814612). Saving model ...
Epoch: 42       Training Loss: 3.713182         Validation Loss: 3.811580       Elapsed: 0:03:00.355972
Validation loss decreased (3.814612 --> 3.811580). Saving model ...
Epoch: 43       Training Loss: 3.705967         Validation Loss: 3.811339       Elapsed: 0:03:11.512757
Validation loss decreased (3.811580 --> 3.811339). Saving model ...
Epoch: 44       Training Loss: 3.677942         Validation Loss: 3.763790       Elapsed: 0:03:06.798942
Validation loss decreased (3.811339 --> 3.763790). Saving model ...
Epoch: 45       Training Loss: 3.670521         Validation Loss: 3.804585       Elapsed: 0:03:09.111308
Epoch: 46       Training Loss: 3.616001         Validation Loss: 3.791811       Elapsed: 0:03:07.913439
Epoch: 47       Training Loss: 3.605779         Validation Loss: 3.818132       Elapsed: 0:03:08.180969
Epoch: 48       Training Loss: 3.578845         Validation Loss: 3.802942       Elapsed: 0:03:07.502958
Epoch: 49       Training Loss: 3.569269         Validation Loss: 3.763015       Elapsed: 0:03:08.838610
Validation loss decreased (3.763790 --> 3.763015). Saving model ...
Epoch: 50       Training Loss: 3.551981         Validation Loss: 3.727734       Elapsed: 0:03:07.301504
Validation loss decreased (3.763015 --> 3.727734). Saving model ...
Epoch: 51       Training Loss: 3.539640         Validation Loss: 3.763292       Elapsed: 0:03:08.697944
Epoch: 52       Training Loss: 3.514974         Validation Loss: 3.789170       Elapsed: 0:03:07.824023
Epoch: 53       Training Loss: 3.478333         Validation Loss: 3.730328       Elapsed: 0:03:08.594196
Epoch: 54       Training Loss: 3.474018         Validation Loss: 3.710677       Elapsed: 0:03:08.306823
Validation loss decreased (3.727734 --> 3.710677). Saving model ...
Epoch: 55       Training Loss: 3.455741         Validation Loss: 3.666004       Elapsed: 0:03:07.551808
Validation loss decreased (3.710677 --> 3.666004). Saving model ...
Epoch: 56       Training Loss: 3.385648         Validation Loss: 3.755735       Elapsed: 0:03:07.685431
Epoch: 57       Training Loss: 3.391713         Validation Loss: 3.739904       Elapsed: 0:03:09.560812
Epoch: 58       Training Loss: 3.385832         Validation Loss: 3.679237       Elapsed: 0:03:07.951572
Epoch: 59       Training Loss: 3.345478         Validation Loss: 3.698172       Elapsed: 0:03:07.605253
Epoch: 61       Training Loss: 3.329898         Validation Loss: 3.687313       Elapsed: 0:03:06.961018
Epoch: 62       Training Loss: 3.332215         Validation Loss: 3.722676       Elapsed: 0:03:08.430620
Epoch: 63       Training Loss: 3.290568         Validation Loss: 3.698964       Elapsed: 0:03:08.096713
Epoch: 64       Training Loss: 3.308631         Validation Loss: 3.693485       Elapsed: 0:03:06.612021
Epoch: 65       Training Loss: 3.242924         Validation Loss: 3.676528       Elapsed: 0:03:02.644056
Epoch: 66       Training Loss: 3.210221         Validation Loss: 3.672967       Elapsed: 0:03:02.000280
Epoch: 67       Training Loss: 3.248309         Validation Loss: 3.700498       Elapsed: 0:03:02.847392
Epoch: 68       Training Loss: 3.186689         Validation Loss: 3.672294       Elapsed: 0:03:04.354137
Epoch: 69       Training Loss: 3.148231         Validation Loss: 3.709312       Elapsed: 0:03:05.193586
Epoch: 70       Training Loss: 3.167838         Validation Loss: 3.735657       Elapsed: 0:03:04.797756
Epoch: 71       Training Loss: 3.154821         Validation Loss: 3.683042       Elapsed: 0:03:07.263391
Epoch: 72       Training Loss: 3.151534         Validation Loss: 3.803930       Elapsed: 0:03:02.779610
Epoch: 73       Training Loss: 3.157296         Validation Loss: 3.690141       Elapsed: 0:03:05.410248
Epoch: 74       Training Loss: 3.101250         Validation Loss: 3.771072       Elapsed: 0:03:03.327209
Epoch: 75       Training Loss: 3.052344         Validation Loss: 3.676567       Elapsed: 0:03:01.068909
Epoch: 76       Training Loss: 3.043009         Validation Loss: 3.728986       Elapsed: 0:03:01.663287
Epoch: 77       Training Loss: 3.035244         Validation Loss: 3.787941       Elapsed: 0:03:02.757887
Epoch: 78       Training Loss: 3.024287         Validation Loss: 3.795896       Elapsed: 0:03:01.845504
Epoch: 79       Training Loss: 2.992325         Validation Loss: 3.716417       Elapsed: 0:03:02.454654
Epoch: 80       Training Loss: 2.985272         Validation Loss: 3.665017       Elapsed: 0:03:01.616717
Validation loss decreased (3.666004 --> 3.665017). Saving model ...
Epoch: 81       Training Loss: 2.972644         Validation Loss: 3.750383       Elapsed: 0:03:02.581951
Epoch: 82       Training Loss: 2.948319         Validation Loss: 3.790278       Elapsed: 0:03:02.529694
Epoch: 83       Training Loss: 2.955792         Validation Loss: 3.807737       Elapsed: 0:03:02.909021
Epoch: 84       Training Loss: 2.953483         Validation Loss: 3.884490       Elapsed: 0:03:00.926423
Epoch: 85       Training Loss: 2.907973         Validation Loss: 3.876141       Elapsed: 0:03:01.702236
Epoch: 86       Training Loss: 2.886144         Validation Loss: 3.806277       Elapsed: 0:03:02.415406
Epoch: 87       Training Loss: 2.895160         Validation Loss: 3.768452       Elapsed: 0:03:02.365341
Epoch: 88       Training Loss: 2.878172         Validation Loss: 3.794703       Elapsed: 0:03:01.910776
Epoch: 89       Training Loss: 2.850065         Validation Loss: 3.784806       Elapsed: 0:03:01.821389
Epoch: 90       Training Loss: 2.808656         Validation Loss: 3.834159       Elapsed: 0:03:02.931420
Epoch: 91       Training Loss: 2.807267         Validation Loss: 3.879032       Elapsed: 0:03:01.804976
Epoch: 92       Training Loss: 2.773044         Validation Loss: 3.779162       Elapsed: 0:03:03.069339
Epoch: 93       Training Loss: 2.787731         Validation Loss: 3.912086       Elapsed: 0:03:01.484451
Epoch: 94       Training Loss: 2.741030         Validation Loss: 3.782457       Elapsed: 0:03:01.528688
Epoch: 95       Training Loss: 2.777800         Validation Loss: 3.873816       Elapsed: 0:03:02.658232
Epoch: 96       Training Loss: 2.748137         Validation Loss: 3.923467       Elapsed: 0:03:01.510292
Epoch: 97       Training Loss: 2.725654         Validation Loss: 3.989069       Elapsed: 0:03:02.315783
Epoch: 98       Training Loss: 2.723776         Validation Loss: 3.946343       Elapsed: 0:03:01.279152
Epoch: 99       Training Loss: 2.662464         Validation Loss: 3.885177       Elapsed: 0:03:02.807385
Epoch: 100      Training Loss: 2.714636         Validation Loss: 3.916170       Elapsed: 0:03:01.294095
Training Ended: 2019-01-07 05:24:48.263423
Total Training Time: 5:06:59.494207

Load the model that got the best validation loss.

In [23]:
model_scratch.load_state_dict(torch.load(scratch_path))

(IMPLEMENTATION) Test the Model

Try out your model on the test dataset of dog images. Use the code cell below to calculate and print the test loss and accuracy. Ensure that your test accuracy is greater than 10%.

In [45]:
def test(loaders, model, criterion, use_cuda, print_function=print):

    # monitor test loss and accuracy
    test_loss = 0.
    correct = 0.
    total = 0.

    model.eval()
    for batch_idx, (data, target) in enumerate(loaders['test']):
        # move to GPU
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the loss
        loss = criterion(output, target)
        # update average test loss 
        test_loss = test_loss + ((1 / (batch_idx + 1)) * (loss.data - test_loss))
        # convert output probabilities to predicted class
        pred = output.data.max(1, keepdim=True)[1]
        # compare predictions to true label
        correct += np.sum(np.squeeze(pred.eq(target.data.view_as(pred))).cpu().numpy())
        total += data.size(0)
            
    print_function('Test Loss: {:.6f}\n'.format(test_loss))

    print_function('\nTest Accuracy: %2d%% (%2d/%2d)' % (
        100. * correct / total, correct, total))
In [25]:
scratch_test_log = Tee("scratch_test.log")
In [ ]:
# call test function    
test(loaders_scratch, model_scratch, criterion_scratch, use_cuda, print_function=scratch_test_log)
Test Loss: 3.611238


Test Accuracy: 17% (149/836)

Step 4: Create a CNN to Classify Dog Breeds (using Transfer Learning)

You will now use transfer learning to create a CNN that can identify dog breed from images. Your CNN must attain at least 60% accuracy on the test set.

(IMPLEMENTATION) Specify Data Loaders for the Dog Dataset

Use the code cell below to write three separate data loaders for the training, validation, and test datasets of dog images (located at dogImages/train, dogImages/valid, and dogImages/test, respectively).

If you like, you are welcome to use the same data loaders from the previous step, when you created a CNN from scratch.

In [47]:
loaders_transfer = loaders_scratch

(IMPLEMENTATION) Model Architecture

Use transfer learning to create a CNN to classify dog breed. Use the code cell below, and save your initialized model as the variable model_transfer.

The Transfer Model
In [34]:
model_transfer = models.inception_v3(pretrained=True)
for parameter in model_transfer.parameters():
    parameter.requires_grad = False
classifier_inputs = model_transfer.fc.in_features
model_transfer.fc = nn.Linear(in_features=classifier_inputs,
                              out_features=BREEDS,
                              bias=True)
model_transfer.to(device)
MODELS.append(model_transfer)

Question 5: Outline the steps you took to get to your final CNN architecture and your reasoning at each step. Describe why you think the architecture is suitable for the current problem.

Answer:

I looked at the source code and the string representation of the model and saw that the classification was being done by a single fully-connected (Linear) layer with 2,048 inputs and 1,000 outputs. Since we only need 133 outputs, I replaced that final layer (model.fc) with one that has the same number of inputs but only 133 outputs.

I chose the Inception V3 network because, like the VGG 16 model, it was trained on the ImageNet data-set and works to detect features in images but, as noted in Rethinking the Inception Architecture for Computer Vision, the Inception model requires fewer computational resources than the VGG model does, which I thought was an attractive feature. The Inception model does introduce a problem in that it uses an auxiliary classifier during training so the training function has to be modified to handle this (the output returns a tuple of tensors), but this seemed minor.
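
To illustrate the auxiliary-classifier point (a sketch only, assuming the transfer loaders and model from the cells above are in memory): in training mode Inception-v3 returns two outputs, which is why the train function unpacks a tuple when is_inception is True, while in evaluation mode it returns a single tensor.

images, _ = next(iter(loaders_transfer["train"]))
images = images.to(device)

with torch.no_grad():
    model_transfer.train()
    training_output = model_transfer(images)    # (logits, aux_logits)
    model_transfer.eval()
    evaluation_output = model_transfer(images)  # a single tensor of logits

print(type(training_output), type(evaluation_output))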

(IMPLEMENTATION) Specify Loss Function and Optimizer

Use the next code cell to specify a loss function and optimizer. Save the chosen loss function as criterion_transfer, and the optimizer as optimizer_transfer below.

In [ ]:
criterion_transfer = nn.CrossEntropyLoss()
optimizer_transfer = optimizer.SGD(
    model_transfer.parameters(),
    lr=0.001,
    momentum=0.9)

(IMPLEMENTATION) Train and Validate the Model

Train and validate your model in the code cell below. Save the final model parameters at filepath 'model_transfer.pt'.

In [24]:
transfer_model_path = MODEL_PATH.joinpath("model_transfer.pt")
In [65]:
transfer_log = Tee(log_name="transfer_train.log")
In [ ]:
EPOCHS = 100
In [ ]:
# train the model
model_transfer = train(EPOCHS,
                       loaders=loaders_transfer,
                       model=model_transfer,
                       optimizer=optimizer_transfer,
                       criterion=criterion_transfer,
                       use_cuda=use_cuda,
                       save_path=transfer_model_path,
                       print_function=transfer_log,
                       is_inception=True)
Training Started: 2019-01-07 05:25:10.303990
Epoch: 1        Training Loss: 4.699307         Validation Loss: 4.270935       Elapsed: 0:03:18.031065
Validation loss decreased (inf --> 4.270935). Saving model ...
Epoch: 2        Training Loss: 4.181660         Validation Loss: 3.670290       Elapsed: 0:03:17.966246
Validation loss decreased (4.270935 --> 3.670290). Saving model ...
Epoch: 3        Training Loss: 3.735970         Validation Loss: 3.142542       Elapsed: 0:03:17.943660
Validation loss decreased (3.670290 --> 3.142542). Saving model ...
Epoch: 4        Training Loss: 3.343428         Validation Loss: 2.698115       Elapsed: 0:03:18.696943
Validation loss decreased (3.142542 --> 2.698115). Saving model ...
Epoch: 5        Training Loss: 2.995878         Validation Loss: 2.334530       Elapsed: 0:03:19.205373
Validation loss decreased (2.698115 --> 2.334530). Saving model ...
Epoch: 6        Training Loss: 2.723056         Validation Loss: 2.033339       Elapsed: 0:03:19.099028
Validation loss decreased (2.334530 --> 2.033339). Saving model ...
Epoch: 7        Training Loss: 2.518057         Validation Loss: 1.812573       Elapsed: 0:03:17.994237
Validation loss decreased (2.033339 --> 1.812573). Saving model ...
Epoch: 8        Training Loss: 2.310053         Validation Loss: 1.609529       Elapsed: 0:03:16.717152
Validation loss decreased (1.812573 --> 1.609529). Saving model ...
Epoch: 9        Training Loss: 2.166829         Validation Loss: 1.439860       Elapsed: 0:03:17.935079
Validation loss decreased (1.609529 --> 1.439860). Saving model ...
Epoch: 10       Training Loss: 2.057079         Validation Loss: 1.292030       Elapsed: 0:03:17.791206
Validation loss decreased (1.439860 --> 1.292030). Saving model ...
Epoch: 11       Training Loss: 1.958263         Validation Loss: 1.243316       Elapsed: 0:03:18.748263
Validation loss decreased (1.292030 --> 1.243316). Saving model ...
Epoch: 12       Training Loss: 1.859445         Validation Loss: 1.130529       Elapsed: 0:03:17.303672
Validation loss decreased (1.243316 --> 1.130529). Saving model ...
Epoch: 13       Training Loss: 1.799369         Validation Loss: 1.067557       Elapsed: 0:03:18.150230
Validation loss decreased (1.130529 --> 1.067557). Saving model ...
Epoch: 14       Training Loss: 1.723310         Validation Loss: 1.018531       Elapsed: 0:03:18.394798
Validation loss decreased (1.067557 --> 1.018531). Saving model ...
Epoch: 15       Training Loss: 1.688872         Validation Loss: 0.965496       Elapsed: 0:03:17.432118
Validation loss decreased (1.018531 --> 0.965496). Saving model ...
Epoch: 16       Training Loss: 1.639950         Validation Loss: 0.907270       Elapsed: 0:03:17.425620
Validation loss decreased (0.965496 --> 0.907270). Saving model ...
Epoch: 17       Training Loss: 1.576800         Validation Loss: 0.875295       Elapsed: 0:03:17.972938
Validation loss decreased (0.907270 --> 0.875295). Saving model ...
Epoch: 18       Training Loss: 1.547050         Validation Loss: 0.824278       Elapsed: 0:03:18.100030
Validation loss decreased (0.875295 --> 0.824278). Saving model ...
Epoch: 19       Training Loss: 1.539646         Validation Loss: 0.808194       Elapsed: 0:03:19.895761
Validation loss decreased (0.824278 --> 0.808194). Saving model ...
Epoch: 20       Training Loss: 1.500094         Validation Loss: 0.777300       Elapsed: 0:03:18.248607
Validation loss decreased (0.808194 --> 0.777300). Saving model ...
Epoch: 21       Training Loss: 1.478536         Validation Loss: 0.762025       Elapsed: 0:03:18.096901
Validation loss decreased (0.777300 --> 0.762025). Saving model ...
Epoch: 22       Training Loss: 1.449271         Validation Loss: 0.745259       Elapsed: 0:03:17.565620
Validation loss decreased (0.762025 --> 0.745259). Saving model ...
Epoch: 23       Training Loss: 1.426696         Validation Loss: 0.721501       Elapsed: 0:03:17.674511
Validation loss decreased (0.745259 --> 0.721501). Saving model ...
Epoch: 24       Training Loss: 1.384365         Validation Loss: 0.706536       Elapsed: 0:03:18.663604
Validation loss decreased (0.721501 --> 0.706536). Saving model ...
Epoch: 25       Training Loss: 1.352370         Validation Loss: 0.684035       Elapsed: 0:03:18.739320
Validation loss decreased (0.706536 --> 0.684035). Saving model ...
Epoch: 26       Training Loss: 1.382330         Validation Loss: 0.680882       Elapsed: 0:03:18.504176
Validation loss decreased (0.684035 --> 0.680882). Saving model ...
Epoch: 27       Training Loss: 1.352410         Validation Loss: 0.662414       Elapsed: 0:03:18.004690
Validation loss decreased (0.680882 --> 0.662414). Saving model ...
Epoch: 28       Training Loss: 1.323105         Validation Loss: 0.652469       Elapsed: 0:03:17.707236
Validation loss decreased (0.662414 --> 0.652469). Saving model ...
Epoch: 29       Training Loss: 1.321770         Validation Loss: 0.634052       Elapsed: 0:03:20.164878
Validation loss decreased (0.652469 --> 0.634052). Saving model ...
Epoch: 30       Training Loss: 1.309750         Validation Loss: 0.638077       Elapsed: 0:03:21.737296
Epoch: 31       Training Loss: 1.307307         Validation Loss: 0.615018       Elapsed: 0:03:18.198152
Validation loss decreased (0.634052 --> 0.615018). Saving model ...
Epoch: 32       Training Loss: 1.259097         Validation Loss: 0.618697       Elapsed: 0:03:19.649852
Epoch: 33       Training Loss: 1.276199         Validation Loss: 0.603413       Elapsed: 0:03:16.942841
Validation loss decreased (0.615018 --> 0.603413). Saving model ...
Epoch: 34       Training Loss: 1.258176         Validation Loss: 0.589237       Elapsed: 0:03:18.103221
Validation loss decreased (0.603413 --> 0.589237). Saving model ...
Epoch: 35       Training Loss: 1.254458         Validation Loss: 0.576390       Elapsed: 0:03:18.758651
Validation loss decreased (0.589237 --> 0.576390). Saving model ...
Epoch: 36       Training Loss: 1.246464         Validation Loss: 0.571317       Elapsed: 0:03:17.794329
Validation loss decreased (0.576390 --> 0.571317). Saving model ...
Epoch: 37       Training Loss: 1.227437         Validation Loss: 0.567114       Elapsed: 0:03:17.484424
Validation loss decreased (0.571317 --> 0.567114). Saving model ...
Epoch: 38       Training Loss: 1.228403         Validation Loss: 0.557364       Elapsed: 0:03:17.744637
Validation loss decreased (0.567114 --> 0.557364). Saving model ...
Epoch: 39       Training Loss: 1.213402         Validation Loss: 0.558201       Elapsed: 0:03:17.285552
Epoch: 40       Training Loss: 1.206945         Validation Loss: 0.557859       Elapsed: 0:03:18.132396
Epoch: 41       Training Loss: 1.193073         Validation Loss: 0.536087       Elapsed: 0:03:17.725738
Validation loss decreased (0.557364 --> 0.536087). Saving model ...
Epoch: 42       Training Loss: 1.194688         Validation Loss: 0.536722       Elapsed: 0:03:17.683174
Epoch: 43       Training Loss: 1.179069         Validation Loss: 0.533558       Elapsed: 0:03:18.412587
Validation loss decreased (0.536087 --> 0.533558). Saving model ...

The connection to the server died during training (thank you, CenturyLink), so I'll read the log file instead.

In [28]:
with transfer_log.path.open() as reader:
    for line in reader:
        print(line.rstrip())
Training Started: 2019-01-07 05:25:10.303990
Epoch: 1        Training Loss: 4.699307         Validation Loss: 4.270935       Elapsed: 0:03:18.031065
Validation loss decreased (inf --> 4.270935). Saving model ...
Epoch: 2        Training Loss: 4.181660         Validation Loss: 3.670290       Elapsed: 0:03:17.966246
Validation loss decreased (4.270935 --> 3.670290). Saving model ...
Epoch: 3        Training Loss: 3.735970         Validation Loss: 3.142542       Elapsed: 0:03:17.943660
Validation loss decreased (3.670290 --> 3.142542). Saving model ...
Epoch: 4        Training Loss: 3.343428         Validation Loss: 2.698115       Elapsed: 0:03:18.696943
Validation loss decreased (3.142542 --> 2.698115). Saving model ...
Epoch: 5        Training Loss: 2.995878         Validation Loss: 2.334530       Elapsed: 0:03:19.205373
Validation loss decreased (2.698115 --> 2.334530). Saving model ...
Epoch: 6        Training Loss: 2.723056         Validation Loss: 2.033339       Elapsed: 0:03:19.099028
Validation loss decreased (2.334530 --> 2.033339). Saving model ...
Epoch: 7        Training Loss: 2.518057         Validation Loss: 1.812573       Elapsed: 0:03:17.994237
Validation loss decreased (2.033339 --> 1.812573). Saving model ...
Epoch: 8        Training Loss: 2.310053         Validation Loss: 1.609529       Elapsed: 0:03:16.717152
Validation loss decreased (1.812573 --> 1.609529). Saving model ...
Epoch: 9        Training Loss: 2.166829         Validation Loss: 1.439860       Elapsed: 0:03:17.935079
Validation loss decreased (1.609529 --> 1.439860). Saving model ...
Epoch: 10       Training Loss: 2.057079         Validation Loss: 1.292030       Elapsed: 0:03:17.791206
Validation loss decreased (1.439860 --> 1.292030). Saving model ...
Epoch: 11       Training Loss: 1.958263         Validation Loss: 1.243316       Elapsed: 0:03:18.748263
Validation loss decreased (1.292030 --> 1.243316). Saving model ...
Epoch: 12       Training Loss: 1.859445         Validation Loss: 1.130529       Elapsed: 0:03:17.303672
Validation loss decreased (1.243316 --> 1.130529). Saving model ...
Epoch: 13       Training Loss: 1.799369         Validation Loss: 1.067557       Elapsed: 0:03:18.150230
Validation loss decreased (1.130529 --> 1.067557). Saving model ...
Epoch: 14       Training Loss: 1.723310         Validation Loss: 1.018531       Elapsed: 0:03:18.394798
Validation loss decreased (1.067557 --> 1.018531). Saving model ...
Epoch: 15       Training Loss: 1.688872         Validation Loss: 0.965496       Elapsed: 0:03:17.432118
Validation loss decreased (1.018531 --> 0.965496). Saving model ...
Epoch: 16       Training Loss: 1.639950         Validation Loss: 0.907270       Elapsed: 0:03:17.425620
Validation loss decreased (0.965496 --> 0.907270). Saving model ...
Epoch: 17       Training Loss: 1.576800         Validation Loss: 0.875295       Elapsed: 0:03:17.972938
Validation loss decreased (0.907270 --> 0.875295). Saving model ...
Epoch: 18       Training Loss: 1.547050         Validation Loss: 0.824278       Elapsed: 0:03:18.100030
Validation loss decreased (0.875295 --> 0.824278). Saving model ...
Epoch: 19       Training Loss: 1.539646         Validation Loss: 0.808194       Elapsed: 0:03:19.895761
Validation loss decreased (0.824278 --> 0.808194). Saving model ...
Epoch: 20       Training Loss: 1.500094         Validation Loss: 0.777300       Elapsed: 0:03:18.248607
Validation loss decreased (0.808194 --> 0.777300). Saving model ...
Epoch: 21       Training Loss: 1.478536         Validation Loss: 0.762025       Elapsed: 0:03:18.096901
Validation loss decreased (0.777300 --> 0.762025). Saving model ...
Epoch: 22       Training Loss: 1.449271         Validation Loss: 0.745259       Elapsed: 0:03:17.565620
Validation loss decreased (0.762025 --> 0.745259). Saving model ...
Epoch: 23       Training Loss: 1.426696         Validation Loss: 0.721501       Elapsed: 0:03:17.674511
Validation loss decreased (0.745259 --> 0.721501). Saving model ...
Epoch: 24       Training Loss: 1.384365         Validation Loss: 0.706536       Elapsed: 0:03:18.663604
Validation loss decreased (0.721501 --> 0.706536). Saving model ...
Epoch: 25       Training Loss: 1.352370         Validation Loss: 0.684035       Elapsed: 0:03:18.739320
Validation loss decreased (0.706536 --> 0.684035). Saving model ...
Epoch: 26       Training Loss: 1.382330         Validation Loss: 0.680882       Elapsed: 0:03:18.504176
Validation loss decreased (0.684035 --> 0.680882). Saving model ...
Epoch: 27       Training Loss: 1.352410         Validation Loss: 0.662414       Elapsed: 0:03:18.004690
Validation loss decreased (0.680882 --> 0.662414). Saving model ...
Epoch: 28       Training Loss: 1.323105         Validation Loss: 0.652469       Elapsed: 0:03:17.707236
Validation loss decreased (0.662414 --> 0.652469). Saving model ...
Epoch: 29       Training Loss: 1.321770         Validation Loss: 0.634052       Elapsed: 0:03:20.164878
Validation loss decreased (0.652469 --> 0.634052). Saving model ...
Epoch: 30       Training Loss: 1.309750         Validation Loss: 0.638077       Elapsed: 0:03:21.737296
Epoch: 31       Training Loss: 1.307307         Validation Loss: 0.615018       Elapsed: 0:03:18.198152
Validation loss decreased (0.634052 --> 0.615018). Saving model ...
Epoch: 32       Training Loss: 1.259097         Validation Loss: 0.618697       Elapsed: 0:03:19.649852
Epoch: 33       Training Loss: 1.276199         Validation Loss: 0.603413       Elapsed: 0:03:16.942841
Validation loss decreased (0.615018 --> 0.603413). Saving model ...
Epoch: 34       Training Loss: 1.258176         Validation Loss: 0.589237       Elapsed: 0:03:18.103221
Validation loss decreased (0.603413 --> 0.589237). Saving model ...
Epoch: 35       Training Loss: 1.254458         Validation Loss: 0.576390       Elapsed: 0:03:18.758651
Validation loss decreased (0.589237 --> 0.576390). Saving model ...
Epoch: 36       Training Loss: 1.246464         Validation Loss: 0.571317       Elapsed: 0:03:17.794329
Validation loss decreased (0.576390 --> 0.571317). Saving model ...
Epoch: 37       Training Loss: 1.227437         Validation Loss: 0.567114       Elapsed: 0:03:17.484424
Validation loss decreased (0.571317 --> 0.567114). Saving model ...
Epoch: 38       Training Loss: 1.228403         Validation Loss: 0.557364       Elapsed: 0:03:17.744637
Validation loss decreased (0.567114 --> 0.557364). Saving model ...
Epoch: 39       Training Loss: 1.213402         Validation Loss: 0.558201       Elapsed: 0:03:17.285552
Epoch: 40       Training Loss: 1.206945         Validation Loss: 0.557859       Elapsed: 0:03:18.132396
Epoch: 41       Training Loss: 1.193073         Validation Loss: 0.536087       Elapsed: 0:03:17.725738
Validation loss decreased (0.557364 --> 0.536087). Saving model ...
Epoch: 42       Training Loss: 1.194688         Validation Loss: 0.536722       Elapsed: 0:03:17.683174
Epoch: 43       Training Loss: 1.179069         Validation Loss: 0.533558       Elapsed: 0:03:18.412587
Validation loss decreased (0.536087 --> 0.533558). Saving model ...
Epoch: 44       Training Loss: 1.173093         Validation Loss: 0.521101       Elapsed: 0:03:17.631464
Validation loss decreased (0.533558 --> 0.521101). Saving model ...
Epoch: 45       Training Loss: 1.153653         Validation Loss: 0.527879       Elapsed: 0:03:17.595422
Epoch: 46       Training Loss: 1.158538         Validation Loss: 0.535613       Elapsed: 0:03:18.427818
Epoch: 47       Training Loss: 1.174377         Validation Loss: 0.528422       Elapsed: 0:03:17.892116
Epoch: 48       Training Loss: 1.164288         Validation Loss: 0.507026       Elapsed: 0:03:17.780444
Validation loss decreased (0.521101 --> 0.507026). Saving model ...
Epoch: 49       Training Loss: 1.161782         Validation Loss: 0.503888       Elapsed: 0:03:17.422116
Validation loss decreased (0.507026 --> 0.503888). Saving model ...
Epoch: 50       Training Loss: 1.163059         Validation Loss: 0.500597       Elapsed: 0:03:17.825155
Validation loss decreased (0.503888 --> 0.500597). Saving model ...
Epoch: 51       Training Loss: 1.154003         Validation Loss: 0.509676       Elapsed: 0:03:17.683708
Epoch: 52       Training Loss: 1.122364         Validation Loss: 0.500437       Elapsed: 0:03:16.342809
Validation loss decreased (0.500597 --> 0.500437). Saving model ...
Epoch: 53       Training Loss: 1.118776         Validation Loss: 0.502778       Elapsed: 0:03:17.775326
Epoch: 54       Training Loss: 1.137227         Validation Loss: 0.489028       Elapsed: 0:03:16.730713
Validation loss decreased (0.500437 --> 0.489028). Saving model ...
Epoch: 55       Training Loss: 1.112989         Validation Loss: 0.490746       Elapsed: 0:03:17.194025
Epoch: 56       Training Loss: 1.112278         Validation Loss: 0.491313       Elapsed: 0:03:18.037435
Epoch: 57       Training Loss: 1.105172         Validation Loss: 0.488087       Elapsed: 0:03:17.750197
Validation loss decreased (0.489028 --> 0.488087). Saving model ...
Epoch: 58       Training Loss: 1.106263         Validation Loss: 0.477318       Elapsed: 0:03:17.918800
Validation loss decreased (0.488087 --> 0.477318). Saving model ...
Epoch: 59       Training Loss: 1.110798         Validation Loss: 0.484890       Elapsed: 0:03:17.959631
Epoch: 60       Training Loss: 1.102846         Validation Loss: 0.475269       Elapsed: 0:03:17.318802
Validation loss decreased (0.477318 --> 0.475269). Saving model ...
Epoch: 61       Training Loss: 1.107576         Validation Loss: 0.470764       Elapsed: 0:03:17.191263
Validation loss decreased (0.475269 --> 0.470764). Saving model ...
Epoch: 62       Training Loss: 1.079003         Validation Loss: 0.469544       Elapsed: 0:03:17.907726
Validation loss decreased (0.470764 --> 0.469544). Saving model ...
Epoch: 63       Training Loss: 1.085582         Validation Loss: 0.473371       Elapsed: 0:03:17.590775
Epoch: 64       Training Loss: 1.097795         Validation Loss: 0.466651       Elapsed: 0:03:16.782743
Validation loss decreased (0.469544 --> 0.466651). Saving model ...
Epoch: 65       Training Loss: 1.087516         Validation Loss: 0.466158       Elapsed: 0:03:18.581609
Validation loss decreased (0.466651 --> 0.466158). Saving model ...
Epoch: 66       Training Loss: 1.041934         Validation Loss: 0.469748       Elapsed: 0:03:17.901108
Epoch: 67       Training Loss: 1.075575         Validation Loss: 0.454066       Elapsed: 0:03:17.029518
Validation loss decreased (0.466158 --> 0.454066). Saving model ...
Epoch: 68       Training Loss: 1.074739         Validation Loss: 0.474331       Elapsed: 0:03:18.015337
Epoch: 69       Training Loss: 1.052330         Validation Loss: 0.461796       Elapsed: 0:03:17.474546
Epoch: 70       Training Loss: 1.074078         Validation Loss: 0.457424       Elapsed: 0:03:16.963451
Epoch: 71       Training Loss: 1.032617         Validation Loss: 0.449744       Elapsed: 0:03:17.340017
Validation loss decreased (0.454066 --> 0.449744). Saving model ...
Epoch: 72       Training Loss: 1.054414         Validation Loss: 0.454565       Elapsed: 0:03:17.676010
Epoch: 73       Training Loss: 1.044849         Validation Loss: 0.453206       Elapsed: 0:03:17.600106
Epoch: 74       Training Loss: 1.035498         Validation Loss: 0.458112       Elapsed: 0:03:17.464877
Epoch: 75       Training Loss: 1.047880         Validation Loss: 0.459989       Elapsed: 0:03:17.049121
Epoch: 76       Training Loss: 1.034578         Validation Loss: 0.446105       Elapsed: 0:03:18.764851
Validation loss decreased (0.449744 --> 0.446105). Saving model ...
Epoch: 77       Training Loss: 1.032169         Validation Loss: 0.439367       Elapsed: 0:03:18.741754
Validation loss decreased (0.446105 --> 0.439367). Saving model ...
Epoch: 78       Training Loss: 1.048666         Validation Loss: 0.448395       Elapsed: 0:03:17.824941
Epoch: 79       Training Loss: 1.040212         Validation Loss: 0.440193       Elapsed: 0:03:18.251639
Epoch: 80       Training Loss: 1.032011         Validation Loss: 0.441098       Elapsed: 0:03:17.759952
Epoch: 81       Training Loss: 1.038431         Validation Loss: 0.434215       Elapsed: 0:03:16.541620
Validation loss decreased (0.439367 --> 0.434215). Saving model ...
Epoch: 82       Training Loss: 1.039337         Validation Loss: 0.442144       Elapsed: 0:03:17.911105
Epoch: 83       Training Loss: 1.032783         Validation Loss: 0.438590       Elapsed: 0:03:17.591553
Epoch: 84       Training Loss: 1.034323         Validation Loss: 0.441891       Elapsed: 0:03:17.387050
Epoch: 85       Training Loss: 1.055545         Validation Loss: 0.434267       Elapsed: 0:03:17.262275
Epoch: 86       Training Loss: 0.996985         Validation Loss: 0.432956       Elapsed: 0:03:17.287156
Validation loss decreased (0.434215 --> 0.432956). Saving model ...
Epoch: 87       Training Loss: 1.025106         Validation Loss: 0.433783       Elapsed: 0:03:17.746683
Epoch: 88       Training Loss: 1.003464         Validation Loss: 0.436888       Elapsed: 0:03:17.344770
Epoch: 89       Training Loss: 1.021132         Validation Loss: 0.432445       Elapsed: 0:03:18.347353
Validation loss decreased (0.432956 --> 0.432445). Saving model ...
Epoch: 90       Training Loss: 1.025346         Validation Loss: 0.428862       Elapsed: 0:03:18.518516
Validation loss decreased (0.432445 --> 0.428862). Saving model ...
Epoch: 91       Training Loss: 1.039084         Validation Loss: 0.418361       Elapsed: 0:03:18.556944
Validation loss decreased (0.428862 --> 0.418361). Saving model ...
Epoch: 92       Training Loss: 1.009550         Validation Loss: 0.424567       Elapsed: 0:03:17.763665
Epoch: 93       Training Loss: 1.002043         Validation Loss: 0.430174       Elapsed: 0:03:17.460125
Epoch: 94       Training Loss: 0.995485         Validation Loss: 0.417896       Elapsed: 0:03:18.836221
Validation loss decreased (0.418361 --> 0.417896). Saving model ...
Epoch: 95       Training Loss: 0.969755         Validation Loss: 0.419555       Elapsed: 0:03:11.488185
Epoch: 96       Training Loss: 0.987362         Validation Loss: 0.421185       Elapsed: 0:03:10.406026
Epoch: 97       Training Loss: 0.980267         Validation Loss: 0.417785       Elapsed: 0:03:10.542342
Validation loss decreased (0.417896 --> 0.417785). Saving model ...
Epoch: 98       Training Loss: 0.973978         Validation Loss: 0.416819       Elapsed: 0:03:12.167687
Validation loss decreased (0.417785 --> 0.416819). Saving model ...
Epoch: 99       Training Loss: 0.994163         Validation Loss: 0.418498       Elapsed: 0:03:17.225706
Epoch: 100      Training Loss: 0.998819         Validation Loss: 0.423518       Elapsed: 0:03:18.415953
Training Ended: 2019-01-07 10:55:04.465024
Total Training Time: 5:29:54.161034
In [25]:
# load the model that got the best validation accuracy (uncomment the line below)
model_transfer.load_state_dict(torch.load(transfer_model_path))
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-25-bac3efba0fcd> in <module>
      1 # load the model that got the best validation accuracy (uncomment the line below)
----> 2 model_transfer.load_state_dict(torch.load(transfer_model_path))

~/.virtualenvs/neural_networks/lib/python3.6/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
    717         if len(error_msgs) > 0:
    718             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 719                                self.__class__.__name__, "\n\t".join(error_msgs)))
    720 
    721     def parameters(self):

RuntimeError: Error(s) in loading state_dict for Inception3:
        size mismatch for fc.weight: copying a param of torch.Size([1000, 2048]) from checkpoint, where the shape is torch.Size([133, 2048]) in current model.
        size mismatch for fc.bias: copying a param of torch.Size([1000]) from checkpoint, where the shape is torch.Size([133]) in current model.

(IMPLEMENTATION) Test the Model

Try out your model on the test dataset of dog images. Use the code cell below to calculate and print the test loss and accuracy. Ensure that your test accuracy is greater than 60%.

In [46]:
transfer_test_log = Tee("transfer_test.log")
In [51]:
test(loaders_transfer, model_transfer, criterion_transfer, use_cuda, print_function=transfer_test_log)
Test Loss: 0.425383


Test Accuracy: 87% (734/836)

(IMPLEMENTATION) Predict Dog Breed with the Model

Write a function that takes an image path as input and returns the dog breed (Affenpinscher, Afghan hound, etc) that is predicted by your model.

In [32]:
class_names = [item[4:].replace("_", " ") for item in training.classes]

def predict_breed_transfer(img_path: str) -> str:
    """Predicts the dog-breed of what's in the image

    Args:
     img_path: path to the image to search

    Returns:
     the name of the dog-breed
    """
    # load the image
    image = Image.open(img_path)

    # convert the image to a tensor
    tensor = test_transform(image)

    # add a batch number
    tensor = tensor.unsqueeze_(0)

    # put on the GPU or CPU
    tensor = tensor.to(device)

    # make it a variable
    x = torch.autograd.Variable(tensor)

    # make the prediction
    output = model_transfer(x)
    return class_names[output.data.cpu().numpy().argmax()]

Step 5: Write your Algorithm

Write an algorithm that accepts a file path to an image and first determines whether the image contains a human, dog, or neither. Then,

  • if a dog is detected in the image, return the predicted breed.
  • if a human is detected in the image, return the resembling dog breed.
  • if neither is detected in the image, provide output that indicates an error.

You are welcome to write your own functions for detecting humans and dogs in images, but feel free to use the face_detector and human_detector functions developed above. You are required to use your CNN from Step 4 to predict dog breed.

Some sample output for our algorithm is provided below, but feel free to design your own user experience!

Sample Human Output

(IMPLEMENTATION) Write your Algorithm

Re-Done Code

I originally wrote my implementation using classes because I kept getting errors caused by Jupyter letting you run cells out of order, so I wanted everything defined as a group (and because I find it easier to work this way once there is this much code). I broke the parts up to answer the questions above, but I'm including the classes in this section so that my final solution works end-to-end. Everything up to the Dog Breed Classifier section was already implemented above using functions and global variables instead of class methods; only the Dog Breed Classifier section and below contains new implementations.

In [53]:
class Transformer:
    """Builds the image transformers

    Args:
     means: list of means for each channel
     deviations: list of standard deviations for each channel
     image_size: size to crop the image to
    """
    def __init__(self,
                 means: list=MEANS,
                 deviations: list=DEVIATIONS,
                 image_size: int=INCEPTION_IMAGE_SIZE) -> None:
        self.means = means
        self.deviations = deviations
        self.image_size = image_size
        self._training = None
        self._testing = None
        return

    @property
    def training(self) -> transforms.Compose:
        """The image transformers for the training"""
        if self._training is None:
            self._training = transforms.Compose([
                transforms.RandomRotation(30),
                transforms.RandomResizedCrop(self.image_size),
                transforms.RandomHorizontalFlip(),
                transforms.ToTensor(),
                transforms.Normalize(self.means,
                                     self.deviations)])
        return self._training

    @property
    def testing(self) -> transforms.Compose:
        """Image transforms for the testing"""
        if self._testing is None:
            self._testing = transforms.Compose(
                [transforms.Resize(self.image_size),
                 transforms.CenterCrop(self.image_size),
                 transforms.ToTensor(),
                 transforms.Normalize(self.means,
                                      self.deviations)])
        return self._testing
In [54]:
class DogDetector:
    """Detects dogs

    Args:
     model_definition: definition for the model
     device: where to run the model (CPU or CUDA)
     image_size: what to resize the file to (depends on the model-definition)
     means: mean for each channel
     deviations: standard deviation for each channel
     dog_lower_bound: index below where dogs start
     dog_upper_bound: index above where dogs end
    """
    def __init__(self,
                 model_definition: nn.Module=models.inception_v3,
                 image_size: int=INCEPTION_IMAGE_SIZE,
                 means: list=MEANS,
                 deviations: list=DEVIATIONS,
                 dog_lower_bound: int=DOG_LOWER,
                 dog_upper_bound: int=DOG_UPPER,
                 device: torch.device=None) -> None:
        self.model_definition = model_definition
        self.image_size = image_size
        self.means = means
        self.deviations = deviations
        self.dog_lower_bound = dog_lower_bound
        self.dog_upper_bound = dog_upper_bound
        self._device = device
        self._model = None
        self._transformer = None
        return

    @property
    def device(self) -> torch.device:
        """The device to add the model to"""
        if self._device is None:
            self._device = torch.device("cuda"
                                        if torch.cuda.is_available()
                                        else "cpu")
        return self._device

    @property
    def model(self) -> nn.Module:
        """Build the model"""
        if self._model is None:
            self._model = self.model_definition(pretrained=True)
            self._model.to(self.device)
            self._model.eval()
        return self._model

    @property
    def transformer(self) -> Transformer:
        """The transformer for the image data"""
        if self._transformer is None:
            self._transformer = Transformer()
        return self._transformer

    def __call__(self, image_path: str) -> bool:
        """Checks if there is a dog in the image"""
        image = Image.open(str(image_path))
        image = self.transformer.testing(image).unsqueeze(0).to(self.device)
        output = self.model(image)
        probabilities = torch.exp(output)
        _, top_class = probabilities.topk(1, dim=1)
        return self.dog_lower_bound < top_class.item() < self.dog_upper_bound
In [55]:
class SpeciesDetector:
    """Detect dogs and humans

    Args:
     device: where to put the dog-detecting model
    """
    def __init__(self, device: torch.device=None) -> None:
        self.device = device
        self._dog_detector = None
        return

    @property
    def dog_detector(self) -> DogDetector:
        """Neural Network dog-detector"""
        if self._dog_detector is None:
            self._dog_detector = DogDetector(device=self.device)
        return self._dog_detector

    def is_human(self, image_path: str) -> bool:
        """Checks if the image is a human
        
        Args:
         image_path: path to the image

        Returns:
         True if there is a human face in the image
        """
        image = face_recognition.load_image_file(str(image_path))
        faces = face_recognition.face_locations(image)
        return len(faces) > 0

    def is_dog(self, image_path: str) -> bool:        
        """Checks if there is a dog in the image"""
        return self.dog_detector(image_path)
In [56]:
class DogPaths:
    """holds the paths to the dog images"""
    def __init__(self) -> None:
        self._main = None
        self._training = None
        self._testing = None
        self._validation = None
        return

    @property
    def main(self) -> Path:
        """The path to the main folder"""
        if self._main is None:
            self._main = DOG_PATH
        return self._main

    @property
    def training(self) -> Path:
        """Path to the training images"""
        if self._training is None:
            self._training = DOG_PATH.joinpath("train")
        return self._training

    @property
    def validation(self) -> Path:
        """Path to the validation images"""
        if self._validation is None:
            self._validation = DOG_PATH.joinpath("valid")
        return self._validation

    @property
    def testing(self) -> Path:
        """Path to the testing images"""
        if self._testing is None:
            self._testing = DOG_PATH.joinpath("test")
        return self._testing
In [57]:
class Inception:
    """Sets up the model, criterion, and optimizer for the transfer learning

    Args:
     classes: number of outputs for the final layer
     device: processor to use
     model_path: path to a saved model
     learning_rate: learning rate for the optimizer
     momentum: momentum for the optimizer
    """
    def __init__(self, classes: int,
                 device: torch.device=None,
                 model_path: str=None,
                 learning_rate: float=0.001, momentum: float=0.9) -> None:
        self.classes = classes
        self.model_path = model_path
        self.learning_rate = learning_rate
        self.momentum = momentum
        self._device = device
        self._model = None
        self._classifier_inputs = None
        self._criterion = None
        self._optimizer = None
        return

    @property
    def device(self) -> torch.device:
        """Processor to use (cpu or cuda)"""
        if self._device is None:
            self._device = torch.device(
                "cuda" if torch.cuda.is_available() else "cpu")
        return self._device

    @property
    def model(self) -> models.inception_v3:
        """The inception model"""
        if self._model is None:
            self._model = models.inception_v3(pretrained=True)
            for parameter in self._model.parameters():
                parameter.requires_grad = False
            classifier_inputs = self._model.fc.in_features
            self._model.fc = nn.Linear(in_features=classifier_inputs,
                                       out_features=self.classes,
                                       bias=True)
            self._model.to(self.device)
            if self.model_path:
                self._model.load_state_dict(torch.load(self.model_path))
        return self._model

    @property
    def criterion(self) -> nn.CrossEntropyLoss:
        """The loss callable"""
        if self._criterion is None:
            self._criterion = nn.CrossEntropyLoss()
        return self._criterion

    @property
    def optimizer(self) -> optimizer.SGD:
        """The Gradient Descent object"""
        if self._optimizer is None:
            self._optimizer = optimizer.SGD(
                self.model.parameters(),
                lr=self.learning_rate,
                momentum=self.momentum)
        return self._optimizer
In [58]:
class DataSets:
    """Builds the data-sets

    Args:
     paths: object with the paths to the data-sets
    """
    def __init__(self, paths: DogPaths=None, transformer: Transformer=None) -> None:
        self._paths = paths
        self._transformer = transformer
        self._training = None
        self._validation = None
        self._testing = None
        return

    @property
    def paths(self) -> DogPaths:
        """Object with the paths to the image files"""
        if self._paths is None:
            self._paths = DogPaths()
        return self._paths

    @property
    def transformer(self) -> Transformer:
        """Object with the image transforms"""
        if self._transformer is None:
            self._transformer = Transformer()
        return self._transformer

    @property
    def training(self) -> datasets.ImageFolder:
        """The training data set"""
        if self._training is None:
            self._training = datasets.ImageFolder(
                root=self.paths.training,
                transform=self.transformer.training)
        return self._training

    @property
    def validation(self) -> datasets.ImageFolder:
        """The validation dataset"""
        if self._validation is None:
            self._validation = datasets.ImageFolder(
                root=self.paths.validation,
                transform=self.transformer.testing)
        return self._validation

    @property
    def testing(self) -> datasets.ImageFolder:
        """The test set"""
        if self._testing is None:
            self._testing = datasets.ImageFolder(
                root=self.paths.testing,
                transform=self.transformer.testing)
        return self._testing
In [59]:
class DogPredictor:
    """Makes dog-breed predictions
    
    Args:
     model_path: path to the model's state-dict
     device: processor to run the model on
     data_sets: a DataSets object
     inception: an Inception object
    """
    def __init__(self, model_path: str=None,
                 device: torch.device=None,
                 data_sets: DataSets=None,
                 inception: Inception=None) -> None:
        self.model_path = model_path
        self.device = device
        self._data_sets = data_sets
        self._inception = inception
        self._breeds = None
        return

    @property
    def data_sets(self) -> DataSets:
        if self._data_sets is None:
            self._data_sets = DataSets()
        return self._data_sets

    @property
    def inception(self) -> Inception:
        """An Inception object"""
        if self._inception is None:
            self._inception = Inception(
                classes=len(self.data_sets.training.classes),
                model_path=self.model_path,
                device=self.device)
            self._inception.model.eval()
        return self._inception

    @property
    def breeds(self) -> list:
        """A list of dog-breeds"""
        if self._breeds is None:
            self._breeds = [name[4:].replace("_", " ")
                            for name in self.data_sets.training.classes]
        return self._breeds

    def predict_index(self, image_path:str) -> int:
        """Predicts the index of the breed of the dog in the image

        Args:
         image_path: path to the image
        Returns:
         index in the breeds list for the image
        """
        model = self.inception.model        
        image = Image.open(image_path)
        tensor = self.data_sets.transformer.testing(image)
        # add a batch number
        tensor = tensor.unsqueeze_(0)
        tensor = tensor.to(self.inception.device)
        x = torch.autograd.Variable(tensor)
        output = model(x)
        return output.data.cpu().numpy().argmax()

    def __call__(self, image_path) -> str:
        """Predicts the breed of the dog in the image

        Args:
         image_path: path to the image
        Returns:
         name of the breed
        """
        return self.breeds[self.predict_index(image_path)]
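
For reference, a minimal usage sketch of the predictor on its own (the image path here is a made-up placeholder, not a file from the data-sets):

# hypothetical usage of the DogPredictor by itself
# ("some_dog.jpg" is a placeholder path, not part of this project)
predictor = DogPredictor(model_path=transfer_model_path)
print(predictor("some_dog.jpg"))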

The Dog Breed Classifier

This implements the dog-breed classifier using the classes immediately above.

In [60]:
class DogBreedClassifier:
    """Tries To predict the dog-breed for an image

    Args:
     model_path: path to the inception-model
    """
    def __init__(self, model_path: str) -> None:
        self.model_path = model_path
        self._breed_predictor = None
        self._species_detector = None
        return

    @property
    def breed_predictor(self) -> DogPredictor:
        """Predictor of dog-breeds"""
        if self._breed_predictor is None:
            self._breed_predictor = DogPredictor(model_path=self.model_path)
        return self._breed_predictor

    @property
    def species_detector(self) -> SpeciesDetector:
        """Detector of humans and dogs"""
        if self._species_detector is None:
            self._species_detector = SpeciesDetector(
                device=self.breed_predictor.inception.device)
        return self._species_detector

    def render(self, image_path: str, species: str, breed: str) -> None:
        """Renders the image

        Args:
         image_path: path to the image to render
         species: identified species
         breed: identified breed
        """
        name = " ".join(image_path.name.split(".")[0].split("_")).title()
        figure, axe = pyplot.subplots()
        figure.suptitle("{} ({})".format(species, name), weight="bold")
        axe.set_xlabel("Looks like a {}.".format(breed))
        image = Image.open(image_path)
        axe.tick_params(axis="both",
                        which="both",
                        bottom=False,
                        top=False)
        axe.get_xaxis().set_ticks([])
        axe.get_yaxis().set_ticks([])
        axe_image = axe.imshow(image)
        return

    def __call__(self, image_path:str) -> None:
        """detects the dog-breed and displays the image

        Args:
         image_path: path to the image
        """
        image_path = Path(image_path)
        is_dog = self.species_detector.is_dog(image_path)
        is_human = self.species_detector.is_human(image_path)

        if not is_dog and not is_human:
            species = "Error: Neither Human nor Dog"
            breed = "?"
        else:
            breed = self.breed_predictor(image_path)

        if is_dog and is_human:
            species = "Human-Dog Hybrid"
        elif is_dog:
            species = "Dog"
        elif is_human:
            species = "Human"
        self.render(image_path, species, breed)
        return

The next cell moves the existing models to the CPU to free up memory on the GPU, since the class-based version builds its own instances anyway.

In [67]:
for model in MODELS:
    model.cpu()
classifier = DogBreedClassifier(model_path=transfer_model_path)
In [68]:
def run_app(img_path):
    """Runs the dog breed classifier

    Args:
     img_path: path to the image to classify
    """
    classifier(img_path)
    return

Step 6: Test Your Algorithm

In this section, you will take your new algorithm for a spin! What kind of dog does the algorithm think that you look like? If you have a dog, does it predict your dog's breed accurately? If you have a cat, does it mistakenly think that your cat is a dog?

(IMPLEMENTATION) Test Your Algorithm on Sample Images!

Test your algorithm on at least six images on your computer. Feel free to use any images you like. Use at least two human and two dog images.

First, I'll create a function to find species detections that were wrong.

In [12]:
def first_prediction(source: list, start: int=0, count: int=1) -> list:
    """Gets the indices of the first True predictions

    Args:
     source: list of True/False predictions
     start: index to start the search from
     count: number of indices to find

    Returns:
     indices of the first True predictions found
    """
    indices = []
    found = 0
    for index, prediction in enumerate(source[start:]):
        if prediction:
            print("{}: {}".format(start + index, prediction))
            indices.append(start + index)
            found += 1
            if found == count:
                break
    return indices
In [37]:
human_dog = first_prediction(dlib_false_positives)
0: True
In [38]:
hot_dog = "hot_dog.jpg"
rabbit = "rabbit.jpg"
test_images = [dog_files_short[human_dog[0]], hot_dog, rabbit]
In [39]:
dogs = numpy.random.choice(dog_files, 3)
humans = numpy.random.choice(human_files, 3)
In [71]:
images = numpy.hstack((dogs, humans, test_images))
for image in images:
    run_app(image)

Question 6: Is the output better than you expected :) ? Or worse :( ? Provide at least three possible points of improvement for your algorithm.

Answer: The outcome was better than I expected, but here are three possible points of improvement:

  1. Try other models, in particular the ResNet model, which is state-of-the-art for ImageNet.
  2. Tune the transfer model more - the validation loss was still improving at epoch 98, so it might do better with more training (I stopped because of how long it took to train).
  3. Try alternatives to Stochastic Gradient Descent - in particular Adam - to improve the training (see the sketch below).
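
As a sketch of the third point (assuming the optimizer alias for torch.optim and the model_transfer instance used above; the learning rate is an assumption, not a tuned value), the change would only touch the optimizer cell:

# hypothetical: swap Stochastic Gradient Descent for Adam
optimizer_transfer = optimizer.Adam(model_transfer.parameters(), lr=0.001)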