IMDB Reviews Tensorflow Dataset

Beginning

We're going to use the IMDB Reviews Dataset (used in this tutorial) - a set of 50,000 movie reviews taken from the Internet Movie Database that have been classified as either positive or negative. It looks like the original source is a page on Stanford University's website titled Large Movie Review Dataset. The dataset is widely available (from the Stanford page and Kaggle, for instance), but this will also serve as practice for using TensorFlow Datasets.

Imports

Python

from functools import partial

PyPi

import hvplot.pandas
import pandas
import tensorflow
import tensorflow_datasets

Graeae

from graeae import EmbedHoloviews, Timer

Set Up

Plotting

SLUG = "imdb-reviews-tensorflow-dataset"
Embed = partial(EmbedHoloviews, folder_path=f"../../files/posts/keras/{SLUG}")

Timer

TIMER = Timer()

Middle

Get the Dataset

Load It

The load function takes quite a few parameters; in this case we're just passing in three - the name of the dataset; with_info, which tells it to return both a Dataset and a DatasetInfo object; and as_supervised, which tells the builder to return the Dataset as a series of (input, label) tuples.

dataset, info = tensorflow_datasets.load('imdb_reviews/subwords8k',
                                         with_info=True,
                                         as_supervised=True)

Split It

The dataset is a dict with three keys:

print(dataset.keys())
dict_keys(['test', 'train', 'unsupervised'])

As you might guess, we don't use the unsupervised key.

train_dataset, test_dataset = dataset['train'], dataset['test']
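
If you want to confirm the split sizes without iterating over them, the DatasetInfo object carries the metadata - a quick check (the train and test splits should each hold 25,000 reviews):

print(info.splits["train"].num_examples)
print(info.splits["test"].num_examples)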

The Tokenizer

One of the advantages of using the tensorflow dataset version of this is that it comes with a pre-built tokenizer inside the DatasetInfo object.

print(info.features)
FeaturesDict({
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
    'text': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8185>),
})
tokenizer = info.features['text'].encoder
print(tokenizer)
<SubwordTextEncoder vocab_size=8185>

The tokenizer is a SubwordTextEncoder with a vocabulary size of 8,185.
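
As a quick sanity check, the encoder can round-trip an arbitrary string (the sample sentence here is just something I made up) - encode returns a list of sub-word indices and decode reassembles them:

sample = "This movie was an unexpected delight."
encoded = tokenizer.encode(sample)
print(encoded)
print(tokenizer.decode(encoded))
assert tokenizer.decode(encoded) == sample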

Set Up Data

We're going to shuffle the training data and then add padding to both sets so they're all the same size.

BUFFER_SIZE = 20000
BATCH_SIZE = 64
train_dataset = train_dataset.shuffle(BUFFER_SIZE)
train_dataset = train_dataset.padded_batch(BATCH_SIZE, train_dataset.output_shapes)
test_dataset = test_dataset.padded_batch(BATCH_SIZE, test_dataset.output_shapes)
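
Since padded_batch pads each batch to the length of its longest member, the sequence dimension varies from batch to batch. A quick way to see this (assuming eager execution) is to pull one batch and look at its shape:

for reviews, labels in train_dataset.take(1):
    print(reviews.shape)  # (64, <length of the longest review in this batch>)
    print(labels.shape)   # (64,)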

The Model

model = tensorflow.keras.Sequential([
    tensorflow.keras.layers.Embedding(tokenizer.vocab_size, 64),
    tensorflow.keras.layers.Bidirectional(tensorflow.keras.layers.LSTM(64)),
    tensorflow.keras.layers.Dense(64, activation='relu'),
    tensorflow.keras.layers.Dense(1, activation='sigmoid')
])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, None, 64)          523840    
_________________________________________________________________
bidirectional (Bidirectional (None, 128)               66048     
_________________________________________________________________
dense (Dense)                (None, 64)                8256      
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 65        
=================================================================
Total params: 598,209
Trainable params: 598,209
Non-trainable params: 0
_________________________________________________________________

Compile It

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

Train It

EPOCHS = 10
SILENT = 0
ONCE_PER_EPOCH = 2
with TIMER:
    history = model.fit(train_dataset,
                        epochs=EPOCHS,
                        validation_data=test_dataset,
                        verbose=ONCE_PER_EPOCH)
2019-09-21 15:52:50,469 graeae.timers.timer start: Started: 2019-09-21 15:52:50.469787
I0921 15:52:50.469841 140086305412928 timer.py:70] Started: 2019-09-21 15:52:50.469787
Epoch 1/10
391/391 - 80s - loss: 0.3991 - accuracy: 0.8377 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 2/10
391/391 - 80s - loss: 0.3689 - accuracy: 0.8571 - val_loss: 0.4595 - val_accuracy: 0.8021
Epoch 3/10
391/391 - 80s - loss: 0.3664 - accuracy: 0.8444 - val_loss: 0.5262 - val_accuracy: 0.7228
Epoch 4/10
391/391 - 80s - loss: 0.5611 - accuracy: 0.7133 - val_loss: 0.6832 - val_accuracy: 0.6762
Epoch 5/10
391/391 - 80s - loss: 0.6151 - accuracy: 0.6597 - val_loss: 0.5164 - val_accuracy: 0.7844
Epoch 6/10
391/391 - 80s - loss: 0.3842 - accuracy: 0.8340 - val_loss: 0.4970 - val_accuracy: 0.7996
Epoch 7/10
391/391 - 80s - loss: 0.2449 - accuracy: 0.9058 - val_loss: 0.3639 - val_accuracy: 0.8463
Epoch 8/10
391/391 - 80s - loss: 0.1896 - accuracy: 0.9306 - val_loss: 0.3698 - val_accuracy: 0.8614
Epoch 9/10
391/391 - 80s - loss: 0.1555 - accuracy: 0.9456 - val_loss: 0.3896 - val_accuracy: 0.8535
Epoch 10/10
391/391 - 80s - loss: 0.1195 - accuracy: 0.9606 - val_loss: 0.4878 - val_accuracy: 0.8428
2019-09-21 16:06:09,935 graeae.timers.timer end: Ended: 2019-09-21 16:06:09.935707
I0921 16:06:09.935745 140086305412928 timer.py:77] Ended: 2019-09-21 16:06:09.935707
2019-09-21 16:06:09,938 graeae.timers.timer end: Elapsed: 0:13:19.465920
I0921 16:06:09.938812 140086305412928 timer.py:78] Elapsed: 0:13:19.465920

Plot the Performance

  • Note: This only works if your kernel is on the local machine; running it remotely gives an error, as it tries to save the plot on the remote machine.
data = pandas.DataFrame(history.history)
data = data.rename(columns={"loss": "Training Loss",
                            "accuracy": "Training Accuracy",
                            "val_loss": "Validation Loss",
                            "val_accuracy": "Validation Accuracy"})
plot = data.hvplot().opts(title="LSTM IMDB Performance", width=1000, height=800)
Embed(plot=plot, file_name="model_performance")()

Figure Missing

It looks like I over-trained it - the validation loss bottoms out around epoch 7 and climbs afterwards while the training loss keeps falling. (Also note that I used this notebook to troubleshoot, so there was actually one extra epoch that isn't shown.)
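
One way to guard against the over-training would be an EarlyStopping callback - a sketch of what I might do, not what was run here; the patience and monitored metric are choices you'd tune:

early_stopping = tensorflow.keras.callbacks.EarlyStopping(
    monitor="val_loss",         # stop when the validation loss stops improving
    patience=2,                 # allow two bad epochs before giving up
    restore_best_weights=True)  # roll back to the best epoch's weights
history = model.fit(train_dataset,
                    epochs=EPOCHS,
                    validation_data=test_dataset,
                    callbacks=[early_stopping],
                    verbose=ONCE_PER_EPOCH)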

End

Citation

This is the paper where the dataset was originally used.

  • Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). Learning Word Vectors for Sentiment Analysis. The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011).

BBC News Classification

Beginning

Imports

Python

from functools import partial
from pathlib import Path
import csv
import random

PyPi

from sklearn.model_selection import train_test_split
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
import hvplot.pandas
import numpy
import pandas
import spacy
import tensorflow

Graeae

from graeae import EmbedHoloviews, SubPathLoader, Timer

Setup

The Timer

TIMER = Timer()

The Environment

ENVIRONMENT = SubPathLoader('DATASETS')

Spacy

spacy.prefer_gpu()
nlp = spacy.load("en_core_web_lg")

Plotting

SLUG = "bbc-news-classification"
Embed = partial(EmbedHoloviews, folder_path=f"../../files/posts/keras/{SLUG}")

Middle

Load the Datasets

path = Path(ENVIRONMENT["BBC_NEWS"]).expanduser()

texts = []
labels = []
with TIMER:
    with path.open() as csvfile:
        lines = csv.DictReader(csvfile)
        for line in lines:
            labels.append(line["category"])
            texts.append(nlp(line["text"]))
WARNING: Logging before flag parsing goes to stderr.
I0908 13:32:14.804769 139839933974336 environment.py:35] Environment Path: /home/athena/.env
I0908 13:32:14.806000 139839933974336 environment.py:90] Environment Path: /home/athena/.config/datasets/env
2019-09-08 13:32:14,806 graeae.timers.timer start: Started: 2019-09-08 13:32:14.806861
I0908 13:32:14.806965 139839933974336 timer.py:70] Started: 2019-09-08 13:32:14.806861
2019-09-08 13:33:37,430 graeae.timers.timer end: Ended: 2019-09-08 13:33:37.430228
I0908 13:33:37.430259 139839933974336 timer.py:77] Ended: 2019-09-08 13:33:37.430228
2019-09-08 13:33:37,431 graeae.timers.timer end: Elapsed: 0:01:22.623367
I0908 13:33:37.431128 139839933974336 timer.py:78] Elapsed: 0:01:22.623367
print(texts[random.randrange(len(texts))])
candidate resigns over bnp link a prospective candidate for the uk independence party (ukip) has resigned after admitting a  brief attachment  to the british national party(bnp).  nicholas betts-green  who had been selected to fight the suffolk coastal seat  quit after reports in a newspaper that he attended a bnp meeting. the former teacher confirmed he had attended the meeting but said that was the only contact he had with the group. mr betts-green resigned after being questioned by the party s leadership. a ukip spokesman said mr betts-green s resignation followed disclosures in the east anglian daily times last month about his attendance at a bnp meeting.  he did once attend a bnp meeting. he did not like what he saw and heard and will take no further part of it   the spokesman added. a meeting of suffolk coastal ukip members is due to be held next week to discuss a replacement. mr betts-green  of woodbridge  suffolk  has also resigned as ukip s branch chairman.

So, it looks like the text has been lower-cased but there's still punctuation and extra white-space.

print(f"Rows: {len(labels):,}")
print(f"Unique Labels: {len(set(labels)):,}")
Rows: 2,225
Unique Labels: 5

Since there are only five, maybe we should plot it.

labels_frame = pandas.DataFrame({"label": labels})
counts = labels_frame.label.value_counts().reset_index().rename(
    columns={"index": "Category", "label": "Articles"})
plot = counts.hvplot.bar("Category", "Articles").opts(
    title="Count of BBC News Articles by Category",
    height=800, width=1000)
Embed(plot=plot, file_name="bbc_category_counts")()

Figure Missing

It looks like the categories are somewhat unevenly distributed. Now to normalize the tokens.

with TIMER:
    cleaned = [[token.lemma_ for token in text if not any((token.is_stop, token.is_space, token.is_punct))]
               for text in texts]
2019-09-08 13:33:40,257 graeae.timers.timer start: Started: 2019-09-08 13:33:40.257908
I0908 13:33:40.257930 139839933974336 timer.py:70] Started: 2019-09-08 13:33:40.257908
2019-09-08 13:33:40,810 graeae.timers.timer end: Ended: 2019-09-08 13:33:40.810135
I0908 13:33:40.810176 139839933974336 timer.py:77] Ended: 2019-09-08 13:33:40.810135
2019-09-08 13:33:40,811 graeae.timers.timer end: Elapsed: 0:00:00.552227
I0908 13:33:40.811067 139839933974336 timer.py:78] Elapsed: 0:00:00.552227

The Tokenizers

Even though I've already tokenized the texts with spacy, we eventually need to encode them as integer sequences, so I'll use the tensorflow keras Tokenizer.

Note: The labels tokenizer doesn't get the out-of-vocabulary token, only the text-tokenizer does.

tokenizer = Tokenizer(num_words=1000, oov_token="<OOV>")
labels_tokenizer = Tokenizer()
labels_tokenizer.fit_on_texts(labels)

The num_words is the total number of words that will be kept in the word-index - I don't know why a thousand; I just found that in the "answer" notebook. The oov_token is what's used when a word is encountered that's outside of the words we're building into our word-index (Out Of Vocabulary). The next step is to create the word-index by fitting the tokenizer to the text.

with TIMER:
    tokenizer.fit_on_texts(cleaned)
2019-09-08 14:59:30,671 graeae.timers.timer start: Started: 2019-09-08 14:59:30.671536
I0908 14:59:30.671563 139839933974336 timer.py:70] Started: 2019-09-08 14:59:30.671536
2019-09-08 14:59:30,862 graeae.timers.timer end: Ended: 2019-09-08 14:59:30.862483
I0908 14:59:30.862523 139839933974336 timer.py:77] Ended: 2019-09-08 14:59:30.862483
2019-09-08 14:59:30,863 graeae.timers.timer end: Elapsed: 0:00:00.190947
I0908 14:59:30.863504 139839933974336 timer.py:78] Elapsed: 0:00:00.190947

The tokenizer now has a dictionary named word_index that holds the word:index pairs for all the tokens found (it only applies the num_words limit when you call the tokenizer's methods, according to Stack Overflow).

print(f"{len(tokenizer.word_index):,}")
24,339
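
To see the num_words limit in action, you can encode a made-up document - any token ranked below the top 1,000, and any token never seen at all, should come back as the <OOV> index, which is 1 (I'm assuming "government" makes the top-1,000 cut in a news corpus; "zzyzx" certainly doesn't):

print(tokenizer.texts_to_sequences([["government", "zzyzx"]]))
# the second number should be 1, the <OOV> index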

Making the Sequences

I've trained the Tokenizer so that it has a word-index, but now we have to convert our texts to sequences of integers and pad them so they're all the same length.

MAX_LENGTH = 120
sequences = tokenizer.texts_to_sequences(cleaned)
padded = pad_sequences(sequences, padding="post", maxlen=MAX_LENGTH)
labels_sequenced = labels_tokenizer.texts_to_sequences(labels)

Make training and testing sets

TESTING = 0.2
x_train, x_test, y_train, y_test = train_test_split(
    padded, labels_sequenced,
    test_size=TESTING)
x_train, x_validation, y_train, y_validation = train_test_split(
    x_train, y_train, test_size=TESTING)

y_train = numpy.array(y_train)
y_test = numpy.array(y_test)
y_validation = numpy.array(y_validation)

print(f"Training: {x_train.shape}")
print(f"Validation: {x_validation.shape}")
print(f"Testing: {x_test.shape}")
Training: (1424, 120)
Validation: (356, 120)
Testing: (445, 120)

Note: I originally forgot to pass the TESTING variable with the keyword test_size and got an error complaining that a singleton array couldn't be used - don't forget the keywords when you pass in anything other than the data to train_test_split.

The Model

vocabulary_size = 1000
embedding_dimension = 16
max_length = 120

model = tensorflow.keras.Sequential([
    # learn a 16-dimensional vector for each of the top 1,000 words
    layers.Embedding(vocabulary_size, embedding_dimension,
                     input_length=max_length),
    # average the word vectors over the sequence
    layers.GlobalAveragePooling1D(),
    layers.Dense(24, activation="relu"),
    # six outputs because the labels tokenizer indexes from 1, so the
    # five categories occupy indices 1-5 (index 0 goes unused)
    layers.Dense(6, activation="softmax"),
])
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print(model.summary())
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 120, 16)           16000     
_________________________________________________________________
global_average_pooling1d_1 ( (None, 16)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 24)                408       
_________________________________________________________________
dense_3 (Dense)              (None, 6)                 150       
=================================================================
Total params: 16,558
Trainable params: 16,558
Non-trainable params: 0
_________________________________________________________________
None
model.fit(x_train, y_train, epochs=30,
          validation_data=(x_validation, y_validation), verbose=2)
Train on 1424 samples, validate on 356 samples
Epoch 1/30
1424/1424 - 0s - loss: 1.7623 - accuracy: 0.2879 - val_loss: 1.7257 - val_accuracy: 0.5000
Epoch 2/30
1424/1424 - 0s - loss: 1.6871 - accuracy: 0.5190 - val_loss: 1.6332 - val_accuracy: 0.5281
Epoch 3/30
1424/1424 - 0s - loss: 1.5814 - accuracy: 0.4782 - val_loss: 1.5118 - val_accuracy: 0.4944
Epoch 4/30
1424/1424 - 0s - loss: 1.4417 - accuracy: 0.4677 - val_loss: 1.3543 - val_accuracy: 0.5365
Epoch 5/30
1424/1424 - 0s - loss: 1.2706 - accuracy: 0.5934 - val_loss: 1.1850 - val_accuracy: 0.7022
Epoch 6/30
1424/1424 - 0s - loss: 1.1075 - accuracy: 0.6749 - val_loss: 1.0387 - val_accuracy: 0.8006
Epoch 7/30
1424/1424 - 0s - loss: 0.9606 - accuracy: 0.8483 - val_loss: 0.9081 - val_accuracy: 0.8567
Epoch 8/30
1424/1424 - 0s - loss: 0.8244 - accuracy: 0.8869 - val_loss: 0.7893 - val_accuracy: 0.8848
Epoch 9/30
1424/1424 - 0s - loss: 0.6963 - accuracy: 0.9164 - val_loss: 0.6747 - val_accuracy: 0.8961
Epoch 10/30
1424/1424 - 0s - loss: 0.5815 - accuracy: 0.9228 - val_loss: 0.5767 - val_accuracy: 0.9185
Epoch 11/30
1424/1424 - 0s - loss: 0.4831 - accuracy: 0.9375 - val_loss: 0.4890 - val_accuracy: 0.9270
Epoch 12/30
1424/1424 - 0s - loss: 0.3991 - accuracy: 0.9473 - val_loss: 0.4195 - val_accuracy: 0.9326
Epoch 13/30
1424/1424 - 0s - loss: 0.3321 - accuracy: 0.9508 - val_loss: 0.3669 - val_accuracy: 0.9438
Epoch 14/30
1424/1424 - 0s - loss: 0.2800 - accuracy: 0.9572 - val_loss: 0.3268 - val_accuracy: 0.9494
Epoch 15/30
1424/1424 - 0s - loss: 0.2385 - accuracy: 0.9656 - val_loss: 0.2936 - val_accuracy: 0.9438
Epoch 16/30
1424/1424 - 0s - loss: 0.2053 - accuracy: 0.9740 - val_loss: 0.2693 - val_accuracy: 0.9466
Epoch 17/30
1424/1424 - 0s - loss: 0.1775 - accuracy: 0.9761 - val_loss: 0.2501 - val_accuracy: 0.9466
Epoch 18/30
1424/1424 - 0s - loss: 0.1557 - accuracy: 0.9789 - val_loss: 0.2332 - val_accuracy: 0.9494
Epoch 19/30
1424/1424 - 0s - loss: 0.1362 - accuracy: 0.9831 - val_loss: 0.2189 - val_accuracy: 0.9522
Epoch 20/30
1424/1424 - 0s - loss: 0.1209 - accuracy: 0.9853 - val_loss: 0.2082 - val_accuracy: 0.9551
Epoch 21/30
1424/1424 - 0s - loss: 0.1070 - accuracy: 0.9860 - val_loss: 0.1979 - val_accuracy: 0.9579
Epoch 22/30
1424/1424 - 0s - loss: 0.0952 - accuracy: 0.9888 - val_loss: 0.1897 - val_accuracy: 0.9551
Epoch 23/30
1424/1424 - 0s - loss: 0.0854 - accuracy: 0.9902 - val_loss: 0.1815 - val_accuracy: 0.9579
Epoch 24/30
1424/1424 - 0s - loss: 0.0765 - accuracy: 0.9916 - val_loss: 0.1761 - val_accuracy: 0.9522
Epoch 25/30
1424/1424 - 0s - loss: 0.0689 - accuracy: 0.9930 - val_loss: 0.1729 - val_accuracy: 0.9579
Epoch 26/30
1424/1424 - 0s - loss: 0.0618 - accuracy: 0.9951 - val_loss: 0.1680 - val_accuracy: 0.9551
Epoch 27/30
1424/1424 - 0s - loss: 0.0559 - accuracy: 0.9958 - val_loss: 0.1633 - val_accuracy: 0.9551
Epoch 28/30
1424/1424 - 0s - loss: 0.0505 - accuracy: 0.9958 - val_loss: 0.1594 - val_accuracy: 0.9579
Epoch 29/30
1424/1424 - 0s - loss: 0.0457 - accuracy: 0.9965 - val_loss: 0.1559 - val_accuracy: 0.9522
Epoch 30/30
1424/1424 - 0s - loss: 0.0416 - accuracy: 0.9972 - val_loss: 0.1544 - val_accuracy: 0.9551

It gets good surprisingly fast - it might be overfitting toward the end.

loss, accuracy = model.evaluate(x_test, y_test, verbose=0)
print(f"Loss: {loss: .2f} Accuracy: {accuracy:.2f}")
Loss:  0.16 Accuracy: 0.95

It does pretty well, even on the test set.

Plotting the Performance

data = pandas.DataFrame(model.history.history)
plot = data.hvplot().opts(title="Training Performance", width=1000, height=800)
Embed(plot=plot, file_name="model_performance")()

Figure Missing

Unlike with the image classifications, the validation performance never quite matches the training performance (although it's quite good), probably because we aren't doing any kind of augmentation the way you tend to do with images.

End

Okay, so we seem to have a decent model, but is that really the end-game? No, we want to be able to predict what classification a new input should get.

index_to_label = {value:key for (key, value) in labels_tokenizer.word_index.items()}

def category(text: str) -> None:
    """Categorizes the text

    Args:
     text: text to categorize
    """
    text = tokenizer.texts_to_sequences([text])
    predictions = model.predict(pad_sequences(text, maxlen=MAX_LENGTH))
    print(f"Predicted Category: {index_to_label[predictions.argmax()]}")
    return
text = "crickets are nutritious and delicious but make for such a silly game"
category(text)
Predicted Category: sport
text = "i like butts that are big and round, something something like a xxx throw down, and so does the house of parliament"
category(text)
Predicted Category: sport

It kind of looks like it's biased toward sports.

text = "tv future hand viewer home theatre"
category(text)
Predicted Category: sport

Something isn't right here.
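
A likely culprit: category() feeds raw text to a tokenizer that was fit on lemmatized, stop-word-filtered tokens, and it also pads at the front (pre is the pad_sequences default) while the training data was padded at the back. A sketch that mirrors the training pre-processing (the clean function is mine, not part of the original exercise):

def clean(text: str) -> list:
    """Applies the same spacy-based cleanup used on the training texts."""
    return [token.lemma_ for token in nlp(text)
            if not any((token.is_stop, token.is_space, token.is_punct))]

sequences = tokenizer.texts_to_sequences([clean(text)])
padded_text = pad_sequences(sequences, padding="post", maxlen=MAX_LENGTH)
print(index_to_label[model.predict(padded_text).argmax()])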

Cleaning the BBC News Archive

Beginning

This is an initial look at cleaning up a text dataset from the BBC News archives. Although the exercise cites this as the source, the dataset provided doesn't look like the actual raw dataset, which is broken up into folders that classify the contents, with each news item in a separate file. Instead we're starting with a partially pre-processed CSV that has been lower-cased and that gives the classification as the first column.

Imports

Python

from pathlib import Path

PyPi

from nltk.corpus import stopwords
from sklearn.feature_extraction.text import CountVectorizer
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
import pandas

Graeae

from graeae import SubPathLoader, Timer

Set Up

The Environment

ENVIRONMENT = SubPathLoader("DATASETS")

The Timer

TIMER = Timer()

Middle

The DataSet

bbc_path = Path(ENVIRONMENT["BBC_NEWS"]).expanduser()
with TIMER:
    data = pandas.read_csv(bbc_path/"bbc-text.csv")
2019-08-25 18:51:38,411 graeae.timers.timer start: Started: 2019-08-25 18:51:38.411196
2019-08-25 18:51:38,658 graeae.timers.timer end: Ended: 2019-08-25 18:51:38.658181
2019-08-25 18:51:38,658 graeae.timers.timer end: Elapsed: 0:00:00.246985
print(data.shape)
(2225, 2)
print(data.sample().iloc[0])
category                                                sport
text        bell set for england debut bath prop duncan be...
Name: 2134, dtype: object

So we have two columns - category and text, text being the one we have to clean up.

print(data.text.dtype)
object

That's not such an informative answer, but I checked and each row of text is a single string.

The Tokenizer

The Keras Tokenizer tokenizes the text for us, as well as removing the punctuation, lower-casing the text, and some other things. We're also going to use an Out-of-Vocabulary token of "<OOV>" to identify words that are outside of the vocabulary when converting new texts to sequences.

tokenizer = Tokenizer(oov_token="<OOV>", num_words=100)
tokenizer.fit_on_texts(data.text)
word_index = tokenizer.word_index
print(len(word_index))
29727

The word-index is a dict that maps the words found in the documents to integer indices, ranked by frequency (so lower indices mean more common words).

Convert the Texts To Sequences

We're going to convert each of our texts to a sequence of numbers representing the words in them (an integer encoding, not one-hot). The pad_sequences function adds zeros to the end of sequences that are shorter than the longest one so that they are all the same size.

sequences = tokenizer.texts_to_sequences(data.text)
padded = pad_sequences(sequences, padding="post")
print(padded[0])
print(padded.shape)
[1 1 7 ... 0 0 0]
(2225, 4491)

Strangely, there doesn't appear to be a good way to use stopwords with the Keras Tokenizer. Maybe sklearn is more appropriate here.

vectorizer = CountVectorizer(stop_words=stopwords.words("english"),
                             lowercase=True, min_df=3,
                             max_df=0.9, max_features=5000)
vectors = vectorizer.fit_transform(data.text)

End

Sources

The Original Dataset

  • D. Greene and P. Cunningham. "Practical Solutions to the Problem of Diagonal Dominance in Kernel Document Clustering", Proc. ICML 2006. [PDF] [BibTeX].

Sign Language Exercise

Beginning

The data I'm using is the Sign Language MNIST set (hosted on Kaggle). It's a drop-in replacement for the MNIST dataset that contains images of hands showing letters in American Sign Language. It was created by taking 1,704 photos of hands showing letters of the alphabet, then using ImageMagick to alter the photos to create a training set with 27,455 images and a test set with 7,172 images.

Imports

Python

from functools import partial
from pathlib import Path

PyPi

from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing import image as keras_image
from tensorflow.keras.utils import to_categorical
import hvplot.pandas
import matplotlib.pyplot as pyplot
import matplotlib.image as matplotlib_image

import numpy
import pandas
import seaborn
import tensorflow

Graeae

from graeae import EmbedHoloviews, SubPathLoader, Timer

Set Up

Plotting

get_ipython().run_line_magic('matplotlib', 'inline')
get_ipython().run_line_magic('config', "InlineBackend.figure_format = 'retina'")
seaborn.set(style="whitegrid",
            rc={"axes.grid": False,
                "font.family": ["sans-serif"],
                "font.sans-serif": ["Open Sans", "Latin Modern Sans", "Lato"],
                "figure.figsize": (8, 6)},
            font_scale=1)
FIGURE_SIZE = (12, 10)

Embed = partial(
    EmbedHoloviews,
    folder_path="../../files/posts/keras/sign-language-exercise/")

Timer

TIMER = Timer()

The Environment

ENVIRONMENT = SubPathLoader("DATASETS")

Middle

The Datasets

root_path = Path(ENVIRONMENT["SIGN_LANGUAGE_MNIST"]).expanduser()
def get_data(test_or_train: str) -> tuple:
    """Gets the MNIST data

    The pixels are reshaped so that they are 28x28
    Also, an extra dimension is added to make the shape:
     (<rows>, 28, 28, 1)

    Also converts the labels to a categorical (so there are 25 columns)

    Args:
     test_or_train: which data set to load

    Returns: 
     images, labels: numpy arrays with the data
    """
    path = root_path/f"sign_mnist_{test_or_train}.csv"
    data = pandas.read_csv(path) 
    labels = data.label
    labels = to_categorical(labels)
    pixels = data[[column for column in data.columns if column.startswith("pixel")]]
    pixels = pixels.values.reshape((len(pixels), 28, 28, 1))
    print(labels.shape)
    print(pixels.shape)
    return pixels, labels
  • Training

    The data is a CSV with the first column being the labels and the rest of the columns holding the pixel values. To make it work with our networks we need to re-shape the data so that we have a shape of (<rows>, 28, 28). The 28 comes from the fact that there are 784 pixel columns (28 x 28 = 784).

    train_images, train_labels = get_data("train")
    assert train_images.shape == (27455, 28, 28, 1)
    assert train_labels.shape == (27455, 25)
    
    (27455, 25)
    (27455, 28, 28, 1)
    

    As you can see, there are a lot of columns in the original set. The first one is the "label" and the rest are the "pixel" columns.

    test_images, test_labels = get_data("test")
    assert test_images.shape == (7172, 28, 28, 1)
    assert test_labels.shape == (7172, 25)
    
    (7172, 25)
    (7172, 28, 28, 1)
    

    Note: The original exercise calls for doing this with the python csv module. But why?

Data Generators

training_data_generator = ImageDataGenerator(
    rescale = 1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

validation_data_generator = ImageDataGenerator(rescale = 1./255)

train_generator = training_data_generator.flow(
        train_images, train_labels,
)

validation_generator = validation_data_generator.flow(
        test_images, test_labels,
)
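
flow defaults to a batch size of 32, which is where the 858 steps per epoch in the training output below come from (27,455 / 32, rounded up). Since the iterators support len(), you can confirm the batch counts directly:

print(len(train_generator))       # should be 858
print(len(validation_generator))  # ceil(7172 / 32) = 225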

The Model

Part of the exercise requires that we only use two convolutional layers.

model = tensorflow.keras.models.Sequential([
    # Input Layer/convolution
    tensorflow.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(28, 28, 1)),
    tensorflow.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tensorflow.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tensorflow.keras.layers.MaxPooling2D(2,2),
    # Flatten
    tensorflow.keras.layers.Flatten(),
    tensorflow.keras.layers.Dropout(0.5),
    # Fully-connected and output layers
    tensorflow.keras.layers.Dense(512, activation='relu'),
    tensorflow.keras.layers.Dense(25, activation='softmax'),
])
model.summary()
Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_12 (Conv2D)           (None, 26, 26, 64)        640       
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 13, 13, 64)        0         
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 11, 11, 128)       73856     
_________________________________________________________________
max_pooling2d_13 (MaxPooling (None, 5, 5, 128)         0         
_________________________________________________________________
flatten_6 (Flatten)          (None, 3200)              0         
_________________________________________________________________
dropout_6 (Dropout)          (None, 3200)              0         
_________________________________________________________________
dense_12 (Dense)             (None, 512)               1638912   
_________________________________________________________________
dense_13 (Dense)             (None, 25)                12825     
=================================================================
Total params: 1,726,233
Trainable params: 1,726,233
Non-trainable params: 0
_________________________________________________________________

Train It

model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
MODELS = Path("~/models/sign-language-mnist/").expanduser()
assert MODELS.is_dir()
best_model = MODELS/"two-cnn-layers.hdf5"
checkpoint = tensorflow.keras.callbacks.ModelCheckpoint(
    str(best_model), monitor="val_accuracy", verbose=1, 
    save_best_only=True)

with TIMER:
    model.fit_generator(generator=train_generator,
                        epochs=25,
                        callbacks=[checkpoint],
                        validation_data = validation_generator,
                        verbose=2)
2019-08-25 16:25:13,710 graeae.timers.timer start: Started: 2019-08-25 16:25:13.710604
I0825 16:25:13.710640 140637170140992 timer.py:70] Started: 2019-08-25 16:25:13.710604
Epoch 1/25

Epoch 00001: val_accuracy improved from -inf to 0.45427, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 8s - loss: 2.6016 - accuracy: 0.2048 - val_loss: 1.5503 - val_accuracy: 0.4543
Epoch 2/25

Epoch 00002: val_accuracy improved from 0.45427 to 0.71403, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 1.8267 - accuracy: 0.4160 - val_loss: 0.8762 - val_accuracy: 0.7140
Epoch 3/25

Epoch 00003: val_accuracy improved from 0.71403 to 0.74888, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 1.4297 - accuracy: 0.5323 - val_loss: 0.7413 - val_accuracy: 0.7489
Epoch 4/25

Epoch 00004: val_accuracy improved from 0.74888 to 0.76157, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 1.1984 - accuracy: 0.6100 - val_loss: 0.6402 - val_accuracy: 0.7616
Epoch 5/25

Epoch 00005: val_accuracy improved from 0.76157 to 0.84816, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 1.0498 - accuracy: 0.6570 - val_loss: 0.4581 - val_accuracy: 0.8482
Epoch 6/25

Epoch 00006: val_accuracy improved from 0.84816 to 0.85778, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 0.9340 - accuracy: 0.6944 - val_loss: 0.4195 - val_accuracy: 0.8578
Epoch 7/25

Epoch 00007: val_accuracy improved from 0.85778 to 0.90240, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 0.8522 - accuracy: 0.7189 - val_loss: 0.3270 - val_accuracy: 0.9024
Epoch 8/25

Epoch 00008: val_accuracy did not improve from 0.90240
858/858 - 7s - loss: 0.7963 - accuracy: 0.7410 - val_loss: 0.3144 - val_accuracy: 0.8887
Epoch 9/25

Epoch 00009: val_accuracy did not improve from 0.90240
858/858 - 7s - loss: 0.7388 - accuracy: 0.7560 - val_loss: 0.3184 - val_accuracy: 0.8984
Epoch 10/25

Epoch 00010: val_accuracy improved from 0.90240 to 0.92777, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 0.7127 - accuracy: 0.7692 - val_loss: 0.2045 - val_accuracy: 0.9278
Epoch 11/25

Epoch 00011: val_accuracy improved from 0.92777 to 0.93572, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 9s - loss: 0.6798 - accuracy: 0.7792 - val_loss: 0.1813 - val_accuracy: 0.9357
Epoch 12/25

Epoch 00012: val_accuracy improved from 0.93572 to 0.94046, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 0.6506 - accuracy: 0.7875 - val_loss: 0.1857 - val_accuracy: 0.9405
Epoch 13/25

Epoch 00013: val_accuracy improved from 0.94046 to 0.94074, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 0.6365 - accuracy: 0.7941 - val_loss: 0.1691 - val_accuracy: 0.9407
Epoch 14/25

Epoch 00014: val_accuracy improved from 0.94074 to 0.95706, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 0.6127 - accuracy: 0.8028 - val_loss: 0.1426 - val_accuracy: 0.9571
Epoch 15/25

Epoch 00015: val_accuracy did not improve from 0.95706
858/858 - 7s - loss: 0.6009 - accuracy: 0.8076 - val_loss: 0.1925 - val_accuracy: 0.9265
Epoch 16/25

Epoch 00016: val_accuracy improved from 0.95706 to 0.96207, saving model to /home/athena/models/sign-language-mnist/two-cnn-layers.hdf5
858/858 - 7s - loss: 0.5883 - accuracy: 0.8121 - val_loss: 0.1393 - val_accuracy: 0.9621
Epoch 17/25

Epoch 00017: val_accuracy did not improve from 0.96207
858/858 - 7s - loss: 0.5785 - accuracy: 0.8127 - val_loss: 0.2188 - val_accuracy: 0.9250
Epoch 18/25

Epoch 00018: val_accuracy did not improve from 0.96207
858/858 - 7s - loss: 0.5728 - accuracy: 0.8158 - val_loss: 0.2003 - val_accuracy: 0.9350
Epoch 19/25

Epoch 00019: val_accuracy did not improve from 0.96207
858/858 - 7s - loss: 0.5633 - accuracy: 0.8225 - val_loss: 0.1452 - val_accuracy: 0.9578
Epoch 20/25

Epoch 00020: val_accuracy did not improve from 0.96207
858/858 - 7s - loss: 0.5536 - accuracy: 0.8223 - val_loss: 0.1341 - val_accuracy: 0.9605
Epoch 21/25

Epoch 00021: val_accuracy did not improve from 0.96207
858/858 - 8s - loss: 0.5477 - accuracy: 0.8252 - val_loss: 0.1500 - val_accuracy: 0.9442
Epoch 22/25

Epoch 00022: val_accuracy did not improve from 0.96207
858/858 - 7s - loss: 0.5367 - accuracy: 0.8291 - val_loss: 0.1435 - val_accuracy: 0.9568
Epoch 23/25

Epoch 00023: val_accuracy did not improve from 0.96207
858/858 - 7s - loss: 0.5425 - accuracy: 0.8336 - val_loss: 0.1598 - val_accuracy: 0.9615
Epoch 24/25

Epoch 00024: val_accuracy did not improve from 0.96207
858/858 - 8s - loss: 0.5243 - accuracy: 0.8330 - val_loss: 0.1749 - val_accuracy: 0.9483
Epoch 25/25

Epoch 00025: val_accuracy did not improve from 0.96207
858/858 - 7s - loss: 0.5163 - accuracy: 0.8379 - val_loss: 0.1353 - val_accuracy: 0.9587
2019-08-25 16:28:20,707 graeae.timers.timer end: Ended: 2019-08-25 16:28:20.707567
I0825 16:28:20.707660 140637170140992 timer.py:77] Ended: 2019-08-25 16:28:20.707567
2019-08-25 16:28:20,712 graeae.timers.timer end: Elapsed: 0:03:06.996963
I0825 16:28:20.712478 140637170140992 timer.py:78] Elapsed: 0:03:06.996963
predictor = load_model(best_model)
data = pandas.DataFrame(model.history.history)
plot = data.hvplot().opts(title="Sign Language MNIST Training and Validation",
                          fontsize={"title": 16},
                          width=1000, height=800)
Embed(plot=plot, file_name="training")()

Figure Missing

I'm not sure why these small networks do so well, but this one seems to be doing fairly well.

loss, accuracy = predictor.evaluate(test_images, test_labels, verbose=0)
print(f"Loss: {loss:.2f}, Accuracy: {accuracy:.2f}")
Loss: 4.36, Accuracy: 0.72

So, actually, the performance drops quite a bit outside of the training, even though I'm using the same data-set.
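
One plausible explanation: the generators rescaled the pixel values by 1/255 during training and validation, but evaluate() above was handed the raw 0-255 pixel values. A hedged check - rescaling the test images the same way the generator did should give numbers much closer to the validation metrics:

loss, accuracy = predictor.evaluate(test_images / 255.0, test_labels, verbose=0)
print(f"Loss: {loss:.2f}, Accuracy: {accuracy:.2f}")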

End

Source

Rock-Paper-Scissors

Beginning

Imports

Python

from functools import partial
from pathlib import Path
import random

PyPi

from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing import image as keras_image
import holoviews
import hvplot.pandas
import matplotlib.pyplot as pyplot
import matplotlib.image as matplotlib_image
import numpy
import pandas
import seaborn
import tensorflow

Graeae

from graeae import EmbedHoloviews, SubPathLoader, Timer, ZipDownloader

Set Up

Plotting

get_ipython().run_line_magic('matplotlib', 'inline')
get_ipython().run_line_magic('config', "InlineBackend.figure_format = 'retina'")
seaborn.set(style="whitegrid",
            rc={"axes.grid": False,
                "font.family": ["sans-serif"],
                "font.sans-serif": ["Open Sans", "Latin Modern Sans", "Lato"],
                "figure.figsize": (8, 6)},
            font_scale=1)
FIGURE_SIZE = (12, 10)

Embed = partial(EmbedHoloviews,
                folder_path="../../files/posts/keras/rock-paper-scissors/")
holoviews.extension("bokeh")

The Timer

TIMER = Timer()

The Environment

ENVIRONMENT = SubPathLoader("DATASETS")

Middle

The Data

Downloading it

TRAINING_URL = "https://storage.googleapis.com/laurencemoroney-blog.appspot.com/rps.zip"
TEST_URL = "https://storage.googleapis.com/laurencemoroney-blog.appspot.com/rps-test-set.zip"
OUT_PATH = Path(ENVIRONMENT["ROCK_PAPER_SCISSORS"]).expanduser()
download_train = ZipDownloader(TRAINING_URL, OUT_PATH/"train")
download_test = ZipDownloader(TEST_URL, OUT_PATH/"test")
download_train()
download_test()
I0825 15:06:11.577721 139626236733248 environment.py:35] Environment Path: /home/athena/.env
I0825 15:06:11.578975 139626236733248 environment.py:90] Environment Path: /home/athena/.config/datasets/env
Files exist, not downloading
Files exist, not downloading

The data structure for the folders is fairly deep, so I'll make some shortcuts.

TRAINING = OUT_PATH/"train/rps"
rocks = TRAINING/"rock"
papers = TRAINING/"paper"
scissors = TRAINING/"scissors"
assert papers.is_dir()
assert rocks.is_dir()
assert scissors.is_dir()
rock_images = list(rocks.iterdir())
paper_images = list(papers.iterdir())
scissors_images = list(scissors.iterdir())
print(f"Rocks: {len(rock_images):,}")
print(f"Papers: {len(paper_images):,}")
print(f"Scissors: {len(scissors_images):,}")
Rocks: 840
Papers: 840
Scissors: 840

Some Examples

rock_sample = random.choice(rock_images)

image = matplotlib_image.imread(str(rock_sample))
pyplot.title("Rock")
pyplot.imshow(image)
pyplot.axis('Off')
pyplot.show()

rock.png

paper_sample = random.choice(paper_images)
image = matplotlib_image.imread(str(paper_sample))
pyplot.title("Paper")
pyplot.imshow(image)
pyplot.axis('Off')
pyplot.show()

paper.png

scissors_sample = random.choice(scissors_images)
image = matplotlib_image.imread(str(scissors_sample))
pyplot.title("Scissors")
pyplot.imshow(image)
pyplot.axis('Off')
pyplot.show()

scissors.png

Data Generators

Note: I was originally using keras_preprocessing.image.ImageDataGenerator and getting

AttributeError: 'DirectoryIterator' object has no attribute 'shape'

Make sure to use tensorflow.keras.preprocessing.image.ImageDataGenerator instead.

VALIDATION = OUT_PATH/"test/rps-test-set"
training_data_generator = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

validation_data_generator = ImageDataGenerator(rescale = 1./255)

train_generator = training_data_generator.flow_from_directory(
        TRAINING,
        target_size=(150,150),
        class_mode='categorical'
)

validation_generator = validation_data_generator.flow_from_directory(
        VALIDATION,
        target_size=(150,150),
        class_mode='categorical'
)
Found 2520 images belonging to 3 classes.
Found 372 images belonging to 3 classes.
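
flow_from_directory assigns class indices from the folder names in alphabetical order, which matters later when we map prediction indices back to names. A quick check (this should print {'paper': 0, 'rock': 1, 'scissors': 2}):

print(train_generator.class_indices)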

A Four-CNN Model

Definition

This is a hand-crafted, relatively shallow Convolutional Neural Network. The input shape matches our target_size arguments for the data-generators. There are four convolutional layers, each with a 3 x 3 filter and each followed by a max-pooling layer. The first two convolutional layers have 64 filters while the two following those have 128. The convolutional layers are followed by a layer to flatten the input and add dropout before reaching our fully-connected and output layers; the output layer uses softmax to predict the most likely category. Since we have three categories (rock, paper, or scissors) the final layer has three nodes.

model = tensorflow.keras.models.Sequential([
    # Input Layer/convolution
    tensorflow.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tensorflow.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tensorflow.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tensorflow.keras.layers.MaxPooling2D(2,2),
    # The third convolution
    tensorflow.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tensorflow.keras.layers.MaxPooling2D(2,2),
    # The fourth convolution
    tensorflow.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tensorflow.keras.layers.MaxPooling2D(2,2),
    # Flatten
    tensorflow.keras.layers.Flatten(),
    tensorflow.keras.layers.Dropout(0.5),
    # Fully-connected and output layers
    tensorflow.keras.layers.Dense(512, activation='relu'),
    tensorflow.keras.layers.Dense(3, activation='softmax')
])

Here's a summary of the layers.

model.summary()
Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_16 (Conv2D)           (None, 148, 148, 64)      1792      
_________________________________________________________________
max_pooling2d_16 (MaxPooling (None, 74, 74, 64)        0         
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 72, 72, 64)        36928     
_________________________________________________________________
max_pooling2d_17 (MaxPooling (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_18 (MaxPooling (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_19 (MaxPooling (None, 7, 7, 128)         0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 6272)              0         
_________________________________________________________________
dropout_4 (Dropout)          (None, 6272)              0         
_________________________________________________________________
dense_8 (Dense)              (None, 512)               3211776   
_________________________________________________________________
dense_9 (Dense)              (None, 3)                 1539      
=================================================================
Total params: 3,473,475
Trainable params: 3,473,475
Non-trainable params: 0
_________________________________________________________________

You can see that each convolutional layer's output is two pixels smaller in each dimension, because the filters stop when their edges reach the edge of the image (the center of a 3 x 3 filter can get no closer than one pixel from each edge). Additionally, our max-pooling layers cut the size of the convolutional layers' output in half, so as we progress through the network the inputs get smaller and smaller before reaching the fully-connected layers.
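
A minimal sketch of that arithmetic - a "valid" 3 x 3 convolution shrinks each side by two, and 2 x 2 max-pooling halves it (rounding down):

def after_conv(size: int, kernel: int = 3) -> int:
    """Output side length of a 'valid' convolution."""
    return size - (kernel - 1)

def after_pool(size: int, pool: int = 2) -> int:
    """Output side length after max-pooling."""
    return size // pool

size = 150
for _ in range(4):
    size = after_pool(after_conv(size))
print(size)  # 7, matching the last max-pooling output in the summary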

Compile and Fit

Now we need to compile and train the model.

Note: The metrics can change with your settings - make sure the monitor= parameter is pointing to a key in the history. If you see this in the output:

Can save best model only with val_acc available, skipping.

You might have the wrong name for your metric (it isn't val_acc).

model.compile(loss = 'categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
MODELS = Path("~/models/rock-paper-scissors/").expanduser()
best_model = MODELS/"four-layer-cnn.hdf5"
checkpoint = tensorflow.keras.callbacks.ModelCheckpoint(
    str(best_model), monitor="val_accuracy", verbose=1, 
    save_best_only=True)

with TIMER:
    model.fit_generator(generator=train_generator,
                        epochs=25,
                        callbacks=[checkpoint],
                        validation_data = validation_generator,
                        verbose=2)
2019-08-25 15:06:17,145 graeae.timers.timer start: Started: 2019-08-25 15:06:17.145536
I0825 15:06:17.145575 139626236733248 timer.py:70] Started: 2019-08-25 15:06:17.145536
Epoch 1/25

Epoch 00001: val_accuracy improved from -inf to 0.61559, saving model to /home/athena/models/rock-paper-scissors/four-layer-cnn.hdf5
79/79 - 15s - loss: 1.1174 - accuracy: 0.3996 - val_loss: 0.8997 - val_accuracy: 0.6156
Epoch 2/25

Epoch 00002: val_accuracy improved from 0.61559 to 0.93817, saving model to /home/athena/models/rock-paper-scissors/four-layer-cnn.hdf5
79/79 - 14s - loss: 0.8115 - accuracy: 0.6381 - val_loss: 0.2403 - val_accuracy: 0.9382
Epoch 3/25

Epoch 00003: val_accuracy improved from 0.93817 to 0.97043, saving model to /home/athena/models/rock-paper-scissors/four-layer-cnn.hdf5
79/79 - 14s - loss: 0.5604 - accuracy: 0.7750 - val_loss: 0.2333 - val_accuracy: 0.9704
Epoch 4/25

Epoch 00004: val_accuracy improved from 0.97043 to 0.98387, saving model to /home/athena/models/rock-paper-scissors/four-layer-cnn.hdf5
79/79 - 14s - loss: 0.3926 - accuracy: 0.8496 - val_loss: 0.0681 - val_accuracy: 0.9839
Epoch 5/25

Epoch 00005: val_accuracy improved from 0.98387 to 0.99194, saving model to /home/athena/models/rock-paper-scissors/four-layer-cnn.hdf5
79/79 - 14s - loss: 0.2746 - accuracy: 0.8925 - val_loss: 0.0395 - val_accuracy: 0.9919
Epoch 6/25

Epoch 00006: val_accuracy did not improve from 0.99194
79/79 - 14s - loss: 0.2018 - accuracy: 0.9246 - val_loss: 0.1427 - val_accuracy: 0.9328
Epoch 7/25

Epoch 00007: val_accuracy did not improve from 0.99194
79/79 - 14s - loss: 0.2052 - accuracy: 0.9238 - val_loss: 0.4212 - val_accuracy: 0.8253
Epoch 8/25

Epoch 00008: val_accuracy did not improve from 0.99194
79/79 - 14s - loss: 0.1649 - accuracy: 0.9460 - val_loss: 0.1079 - val_accuracy: 0.9597
Epoch 9/25

Epoch 00009: val_accuracy did not improve from 0.99194
79/79 - 14s - loss: 0.1678 - accuracy: 0.9452 - val_loss: 0.0782 - val_accuracy: 0.9597
Epoch 10/25

Epoch 00010: val_accuracy did not improve from 0.99194
79/79 - 14s - loss: 0.1388 - accuracy: 0.9508 - val_loss: 0.0425 - val_accuracy: 0.9731
Epoch 11/25

Epoch 00011: val_accuracy did not improve from 0.99194
79/79 - 14s - loss: 0.1207 - accuracy: 0.9611 - val_loss: 0.0758 - val_accuracy: 0.9570
Epoch 12/25

Epoch 00012: val_accuracy did not improve from 0.99194
79/79 - 14s - loss: 0.1195 - accuracy: 0.9639 - val_loss: 0.1392 - val_accuracy: 0.9489
Epoch 13/25

Epoch 00013: val_accuracy improved from 0.99194 to 1.00000, saving model to /home/athena/models/rock-paper-scissors/four-layer-cnn.hdf5
79/79 - 14s - loss: 0.1182 - accuracy: 0.9583 - val_loss: 0.0147 - val_accuracy: 1.0000
Epoch 14/25

Epoch 00014: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0959 - accuracy: 0.9722 - val_loss: 0.1264 - val_accuracy: 0.9543
Epoch 15/25

Epoch 00015: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.1225 - accuracy: 0.9643 - val_loss: 0.1124 - val_accuracy: 0.9677
Epoch 16/25

Epoch 00016: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0959 - accuracy: 0.9706 - val_loss: 0.0773 - val_accuracy: 0.9677
Epoch 17/25

Epoch 00017: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0817 - accuracy: 0.9687 - val_loss: 0.0120 - val_accuracy: 1.0000
Epoch 18/25

Epoch 00018: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.1308 - accuracy: 0.9627 - val_loss: 0.1058 - val_accuracy: 0.9758
Epoch 19/25

Epoch 00019: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0967 - accuracy: 0.9675 - val_loss: 0.0356 - val_accuracy: 0.9866
Epoch 20/25

Epoch 00020: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0785 - accuracy: 0.9726 - val_loss: 0.0474 - val_accuracy: 0.9704
Epoch 21/25

Epoch 00021: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0962 - accuracy: 0.9710 - val_loss: 0.0774 - val_accuracy: 0.9677
Epoch 22/25

Epoch 00022: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0802 - accuracy: 0.9754 - val_loss: 0.1592 - val_accuracy: 0.9516
Epoch 23/25

Epoch 00023: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0909 - accuracy: 0.9714 - val_loss: 0.1123 - val_accuracy: 0.9382
Epoch 24/25

Epoch 00024: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0573 - accuracy: 0.9782 - val_loss: 0.0609 - val_accuracy: 0.9785
Epoch 25/25

Epoch 00025: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0860 - accuracy: 0.9778 - val_loss: 0.1106 - val_accuracy: 0.9677
2019-08-25 15:12:11,360 graeae.timers.timer end: Ended: 2019-08-25 15:12:11.360361
I0825 15:12:11.360388 139626236733248 timer.py:77] Ended: 2019-08-25 15:12:11.360361
2019-08-25 15:12:11,361 graeae.timers.timer end: Elapsed: 0:05:54.214825
I0825 15:12:11.361701 139626236733248 timer.py:78] Elapsed: 0:05:54.214825

That did surprisingly well… is it really that easy a problem?

predictor = load_model(best_model)
data = pandas.DataFrame(model.history.history)
plot = data.hvplot().opts(title="Rock, Paper, Scissors Training and Validation", width=1000, height=800)
Embed(plot=plot, file_name="training")()

Figure Missing

Looking at the validation accuracy it appears that it starts to overfit at the end. Strangely, the validation loss, up until the overfitting, is lower than the training loss, and the validation accuracy is better almost throughout - perhaps this is because the image augmentation for the training set is too hard.

End

Some Test Images

base = Path("~/test_images").expanduser()
paper = base/"Rock-paper-scissors_(paper).png"

image_ = matplotlib_image.imread(str(paper))
pyplot.title("Paper Test Case")
pyplot.imshow(image_)
pyplot.axis('Off')
pyplot.show()

test_paper.png

classifications = dict(zip(range(3), ("Paper", "Rock", "Scissors")))
image_ = keras_image.load_img(str(paper), target_size=(150, 150))
x = keras_image.img_to_array(image_)
x = numpy.expand_dims(x, axis=0)
images = numpy.vstack([x])
classes = predictor.predict(images, batch_size=10)
print(classifications[classes.argmax()])
Paper
base = Path("~/test_images").expanduser()
rock = base/"Rock-paper-scissors_(rock).png"

image = matplotlib_image.imread(str(rock))
pyplot.title("Rock Test Case")
pyplot.imshow(image)
pyplot.axis('Off')
pyplot.show()

test_rock.png

base = Path("~/test_images").expanduser()
rock = base/"Rock-paper-scissors_(rock).png"
image_ = keras_image.load_img(str(rock), target_size=(150, 150))
x = keras_image.img_to_array(image_)
x = numpy.expand_dims(x, axis=0)
images = numpy.vstack([x])
classes = predictor.predict(images, batch_size=10)
print(classifications[classes.argmax()])
Rock
base = Path("~/test_images").expanduser()
scissors = base/"Rock-paper-scissors_(scissors).png"

image = matplotlib_image.imread(str(scissors))
pyplot.title("Scissors Test Case")
pyplot.imshow(image)
pyplot.axis('Off')
pyplot.show()

test_scissors.png

image_ = keras_image.load_img(str(scissors), target_size=(150, 150))
x = keras_image.img_to_array(image_)
x = numpy.expand_dims(x, axis=0)
images = numpy.vstack([x])
classes = predictor.predict(images, batch_size=10)
print(classifications[classes.argmax()])
Paper
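
The load-predict steps above repeat verbatim for each test image, so a small helper would tidy things up - a sketch (predict_image is my own name, not something from the exercise):

def predict_image(path: Path, model) -> str:
    """Loads an image and returns the predicted class name."""
    image = keras_image.load_img(str(path), target_size=(150, 150))
    x = numpy.expand_dims(keras_image.img_to_array(image), axis=0)
    return classifications[model.predict(x).argmax()]

print(predict_image(scissors, predictor))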

What if we re-train the model - will it get better?

with TIMER:
    model.fit_generator(generator=train_generator,
                        epochs=25,
                        callbacks=[checkpoint],
                        validation_data = validation_generator,
                        verbose=2)
2019-08-25 15:21:37,706 graeae.timers.timer start: Started: 2019-08-25 15:21:37.706175
I0825 15:21:37.706199 139626236733248 timer.py:70] Started: 2019-08-25 15:21:37.706175
Epoch 1/25

Epoch 00001: val_accuracy did not improve from 1.00000
79/79 - 15s - loss: 0.0792 - accuracy: 0.9798 - val_loss: 0.1101 - val_accuracy: 0.9543
Epoch 2/25

Epoch 00002: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0691 - accuracy: 0.9798 - val_loss: 0.1004 - val_accuracy: 0.9570
Epoch 3/25

Epoch 00003: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0850 - accuracy: 0.9762 - val_loss: 0.0098 - val_accuracy: 1.0000
Epoch 4/25

Epoch 00004: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0799 - accuracy: 0.9730 - val_loss: 0.1022 - val_accuracy: 0.9409
Epoch 5/25

Epoch 00005: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0767 - accuracy: 0.9758 - val_loss: 0.1134 - val_accuracy: 0.9328
Epoch 6/25

Epoch 00006: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0747 - accuracy: 0.9833 - val_loss: 0.0815 - val_accuracy: 0.9731
Epoch 7/25

Epoch 00007: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0680 - accuracy: 0.9817 - val_loss: 0.1476 - val_accuracy: 0.9059
Epoch 8/25

Epoch 00008: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0669 - accuracy: 0.9821 - val_loss: 0.0202 - val_accuracy: 0.9866
Epoch 9/25

Epoch 00009: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0809 - accuracy: 0.9774 - val_loss: 0.3860 - val_accuracy: 0.8844
Epoch 10/25

Epoch 00010: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0583 - accuracy: 0.9817 - val_loss: 0.0504 - val_accuracy: 0.9812
Epoch 11/25

Epoch 00011: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0691 - accuracy: 0.9806 - val_loss: 0.0979 - val_accuracy: 0.9624
Epoch 12/25

Epoch 00012: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0459 - accuracy: 0.9881 - val_loss: 0.1776 - val_accuracy: 0.9167
Epoch 13/25

Epoch 00013: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0648 - accuracy: 0.9821 - val_loss: 0.0770 - val_accuracy: 0.9435
Epoch 14/25

Epoch 00014: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0549 - accuracy: 0.9825 - val_loss: 0.0075 - val_accuracy: 1.0000
Epoch 15/25

Epoch 00015: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0575 - accuracy: 0.9829 - val_loss: 0.1787 - val_accuracy: 0.9167
Epoch 16/25

Epoch 00016: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0665 - accuracy: 0.9778 - val_loss: 0.0230 - val_accuracy: 0.9866
Epoch 17/25

Epoch 00017: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0557 - accuracy: 0.9825 - val_loss: 0.0431 - val_accuracy: 0.9785
Epoch 18/25

Epoch 00018: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0628 - accuracy: 0.9817 - val_loss: 0.2121 - val_accuracy: 0.8952
Epoch 19/25

Epoch 00019: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0580 - accuracy: 0.9841 - val_loss: 0.0705 - val_accuracy: 0.9651
Epoch 20/25

Epoch 00020: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0578 - accuracy: 0.9810 - val_loss: 0.3318 - val_accuracy: 0.8925
Epoch 21/25

Epoch 00021: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0500 - accuracy: 0.9821 - val_loss: 0.2106 - val_accuracy: 0.8925
Epoch 22/25

Epoch 00022: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0520 - accuracy: 0.9829 - val_loss: 0.1040 - val_accuracy: 0.9382
Epoch 23/25

Epoch 00023: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0693 - accuracy: 0.9853 - val_loss: 0.6132 - val_accuracy: 0.8575
Epoch 24/25

Epoch 00024: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0553 - accuracy: 0.9849 - val_loss: 0.3048 - val_accuracy: 0.8817
Epoch 25/25

Epoch 00025: val_accuracy did not improve from 1.00000
79/79 - 14s - loss: 0.0376 - accuracy: 0.9877 - val_loss: 0.0121 - val_accuracy: 1.0000
2019-08-25 15:27:32,278 graeae.timers.timer end: Ended: 2019-08-25 15:27:32.278250
I0825 15:27:32.278276 139626236733248 timer.py:77] Ended: 2019-08-25 15:27:32.278250
2019-08-25 15:27:32,279 graeae.timers.timer end: Elapsed: 0:05:54.572075
I0825 15:27:32.279404 139626236733248 timer.py:78] Elapsed: 0:05:54.572075

So, the validation accuracy went up to 100% - is it a super-classifier?

data = pandas.DataFrame(model.history.history)
plot = data.hvplot().opts(title="Rock, Paper, Scissors Re-Training and Re-Validation", width=1000, height=800)
Embed(plot=plot, file_name="re_training")()

Figure Missing
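
Since the figure isn't embedded here, the history dictionary that feeds the plot can confirm the peak directly - a quick check, using the val_accuracy key that TensorFlow 2.0 logs.

print(max(model.history.history["val_accuracy"]))

From the epoch logs above this comes out to 1.0.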

# reload the checkpointed weights with the best validation accuracy
predictor = load_model(best_model)
classifications = dict(zip(range(3), ("Paper", "Rock", "Scissors")))
image_ = keras_image.load_img(str(paper), target_size=(150, 150))
x = keras_image.img_to_array(image_)
x = numpy.expand_dims(x, axis=0)
images = numpy.vstack([x])
classes = predictor.predict(images, batch_size=10)
print(classifications[classes.argmax()])
Rock
classifications = dict(zip(range(3), ("Paper", "Rock", "Scissors")))
image_ = keras_image.load_img(str(rock), target_size=(150, 150))
x = keras_image.img_to_array(image_)
x = numpy.expand_dims(x, axis=0)
images = numpy.vstack([x])
classes = predictor.predict(images, batch_size=10)
print(classifications[classes.argmax()])
Rock
classifications = dict(zip(range(3), ("Paper", "Rock", "Scissors")))
image_ = keras_image.load_img(str(scissors), target_size=(150, 150))
x = keras_image.img_to_array(image_)
x = numpy.expand_dims(x, axis=0)
images = numpy.vstack([x])
classes = predictor.predict(images, batch_size=10)
print(classifications[classes.argmax()])
Paper
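
Those three blocks repeat the same steps, so a small helper could collapse them - just a sketch re-using the names already defined above.

def classify(path, classifier=predictor):
    """Print the predicted class for a single image file."""
    image_ = keras_image.load_img(str(path), target_size=(150, 150))
    x = numpy.expand_dims(keras_image.img_to_array(image_), axis=0)
    print(classifications[classifier.predict(x, batch_size=10).argmax()])

for image_path in (paper, rock, scissors):
    classify(image_path)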

I don't have a large test set, but just from these three it seems like the model got worse.
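
A fairer check would be to evaluate on a directory of held-out images rather than three hand-picked ones. Here's a minimal sketch, assuming a test folder laid out like the training one (the path is hypothetical - swap in wherever the held-out images actually live).

from pathlib import Path
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# hypothetical path - adjust to your own test set
TEST_PATH = Path("~/data/datasets/images/rock-paper-scissors-test/").expanduser()
test_generator = ImageDataGenerator(rescale=1/255).flow_from_directory(
    str(TEST_PATH), target_size=(150, 150), class_mode="categorical")
loss, accuracy = predictor.evaluate(test_generator)
print(f"Test accuracy: {accuracy:.3f}")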

Sources

Horse Or Human Using TensorFlow 2.0

Beginning

Imports

Python

from functools import partial
from pathlib import Path

PyPi

from holoviews.operation.datashader import datashade
from tensorflow.keras.models import load_model
from tensorflow.keras import layers
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator

import cv2
import holoviews
import hvplot.pandas
import matplotlib.pyplot as pyplot
import numpy
import pandas
import tensorflow

Graeae

from graeae import EmbedHoloviews, Timer, ZipDownloader

Setup

The Timer

TIMER = Timer()

Plotting

Embed = partial(
    EmbedHoloviews,
    folder_path="../../files/posts/keras/horse-or-human-using-tensorflow-20")
holoviews.extension("bokeh")

Storage

MODELS = Path("~/models/horses-vs-humans/").expanduser()
if not MODELS.is_dir():
    MODELS.mkdir()

Middle

The Data Set

OUTPUT = "~/data/datasets/images/horse-or-human/training/"
URL = ("https://storage.googleapis.com/"
       "laurencemoroney-blog.appspot.com/"
       "horse-or-human.zip")

download = ZipDownloader(url=URL, target=OUTPUT)
download()
Files exist, not downloading

output_path = download.target

The convention for training computer-vision models appears to be that you use the folder names to label the images inside them. In this case we have horses and humans.
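
This folder-per-class layout is exactly what Keras's ImageDataGenerator (imported above) exploits: flow_from_directory infers one class per sub-folder. A minimal sketch, assuming output_path points at the extracted training folder:

datagen = ImageDataGenerator(rescale=1/255)
generator = datagen.flow_from_directory(str(output_path),
                                        target_size=(300, 300),
                                        class_mode="binary")
print(generator.class_indices)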

Here's what some of the files themselves are named.

horses_path = output_path/"horses"
humans_path = output_path/"humans"

for path in (horses_path, humans_path):
    print(path.name)
    for index, image in enumerate(path.iterdir()):
        print(f"File: {image.name}")
        if index == 9:
            break
    print()
horses
File: horse48-5.png
File: horse45-8.png
File: horse13-5.png
File: horse34-4.png
File: horse46-5.png
File: horse02-3.png
File: horse06-3.png
File: horse32-1.png
File: horse25-3.png
File: horse04-3.png

humans
File: human01-07.png
File: human02-11.png
File: human13-07.png
File: human10-10.png
File: human15-06.png
File: human05-15.png
File: human06-18.png
File: human16-28.png
File: human02-24.png
File: human10-05.png

So, in this case you can tell what they are from the file-names as well. How many images are there?

horse_files = list(horses_path.iterdir())
human_files = list(humans_path.iterdir())
print(f"Horse Images: {len(horse_files)}")
print(f"Human Images: {len(human_files)}")
print(f"Image Shape: {pyplot.imread(str(horse_files[0])).shape}")
Horse Images: 500
Human Images: 527
Image Shape: (300, 300, 4)

This is a fairly small dataset, and it's odd that there are more humans than horses. Let's see what some of them look like. I'm assuming all the files have the same shape. In this case it looks like they are 300 x 300 with four channels (RGBA, i.e. RGB plus an alpha channel).

height = width = 300
count = 4
columns = 2
horse_plots = [datashade(holoviews.RGB.load_image(str(horse))).opts(
    height=height,
    width=width,
)
               for horse in horse_files[:count]]
human_plots = [datashade(holoviews.RGB.load_image(str(human))).opts(
    height=height,
    width=width,
)
               for human in human_files[:count]]

plot = holoviews.Layout(horse_plots + human_plots).cols(columns).opts(
    title="Horses and Humans")
Embed(plot=plot, file_name="horses_and_humans", 
      height_in_pixels=900)()

Figure Missing

As you can see, the people in the images aren't really humans (and, though it may be less obvious, they aren't real horses either): these are computer-generated images.
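
One wrinkle to note before building the model: matplotlib reported four channels, but the InceptionV3 base below expects three, so anything that reads the PNGs directly needs to drop the alpha channel. A minimal sketch:

image = pyplot.imread(str(horse_files[0]))
rgb = image[..., :3]  # keep RGB, drop the alpha channel
print(image.shape, "->", rgb.shape)

(ImageDataGenerator's default color_mode="rgb" does this conversion for you when it loads images from disk.)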

The Model

# load InceptionV3 without its classification head and freeze its weights
input_shape = (300, 300, 3)
base_model = InceptionV3(input_shape=input_shape, include_top=False)
base_model.trainable = False
base_model.summary()
Model: "inception_v3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 300, 300, 3) 0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 149, 149, 32) 864         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 149, 149, 32) 96          conv2d[0][0]                     
__________________________________________________________________________________________________
activation (Activation)         (None, 149, 149, 32) 0           batch_normalization[0][0]        
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 147, 147, 32) 9216        activation[0][0]                 
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 147, 147, 32) 96          conv2d_1[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 147, 147, 32) 0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 147, 147, 64) 18432       activation_1[0][0]               
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 147, 147, 64) 192         conv2d_2[0][0]                   
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 147, 147, 64) 0           batch_normalization_2[0][0]      
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 73, 73, 64)   0           activation_2[0][0]               
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 73, 73, 80)   5120        max_pooling2d[0][0]              
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 73, 73, 80)   240         conv2d_3[0][0]                   
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 73, 73, 80)   0           batch_normalization_3[0][0]      
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 71, 71, 192)  138240      activation_3[0][0]               
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 71, 71, 192)  576         conv2d_4[0][0]                   
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 71, 71, 192)  0           batch_normalization_4[0][0]      
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 35, 35, 192)  0           activation_4[0][0]               
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 35, 35, 64)   12288       max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 35, 35, 64)   192         conv2d_8[0][0]                   
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 35, 35, 64)   0           batch_normalization_8[0][0]      
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 35, 35, 48)   9216        max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 35, 35, 96)   55296       activation_8[0][0]               
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 35, 35, 48)   144         conv2d_6[0][0]                   
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 35, 35, 96)   288         conv2d_9[0][0]                   
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 35, 35, 48)   0           batch_normalization_6[0][0]      
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 35, 35, 96)   0           batch_normalization_9[0][0]      
__________________________________________________________________________________________________
average_pooling2d (AveragePooli (None, 35, 35, 192)  0           max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 35, 35, 64)   12288       max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 35, 35, 64)   76800       activation_6[0][0]               
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 35, 35, 96)   82944       activation_9[0][0]               
__________________________________________________________________________________________________
conv2d_11 (Conv2D)              (None, 35, 35, 32)   6144        average_pooling2d[0][0]          
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 35, 35, 64)   192         conv2d_5[0][0]                   
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 35, 35, 64)   192         conv2d_7[0][0]                   
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 35, 35, 96)   288         conv2d_10[0][0]                  
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 35, 35, 32)   96          conv2d_11[0][0]                  
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 35, 35, 64)   0           batch_normalization_5[0][0]      
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 35, 35, 64)   0           batch_normalization_7[0][0]      
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 35, 35, 96)   0           batch_normalization_10[0][0]     
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 35, 35, 32)   0           batch_normalization_11[0][0]     
__________________________________________________________________________________________________
mixed0 (Concatenate)            (None, 35, 35, 256)  0           activation_5[0][0]               
                                                                 activation_7[0][0]               
                                                                 activation_10[0][0]              
                                                                 activation_11[0][0]              
__________________________________________________________________________________________________
conv2d_15 (Conv2D)              (None, 35, 35, 64)   16384       mixed0[0][0]                     
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 35, 35, 64)   192         conv2d_15[0][0]                  
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 35, 35, 64)   0           batch_normalization_15[0][0]     
__________________________________________________________________________________________________
conv2d_13 (Conv2D)              (None, 35, 35, 48)   12288       mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_16 (Conv2D)              (None, 35, 35, 96)   55296       activation_15[0][0]              
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 35, 35, 48)   144         conv2d_13[0][0]                  
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 35, 35, 96)   288         conv2d_16[0][0]                  
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 35, 35, 48)   0           batch_normalization_13[0][0]     
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 35, 35, 96)   0           batch_normalization_16[0][0]     
__________________________________________________________________________________________________
average_pooling2d_1 (AveragePoo (None, 35, 35, 256)  0           mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_12 (Conv2D)              (None, 35, 35, 64)   16384       mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 35, 35, 64)   76800       activation_13[0][0]              
__________________________________________________________________________________________________
conv2d_17 (Conv2D)              (None, 35, 35, 96)   82944       activation_16[0][0]              
__________________________________________________________________________________________________
conv2d_18 (Conv2D)              (None, 35, 35, 64)   16384       average_pooling2d_1[0][0]        
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 35, 35, 64)   192         conv2d_12[0][0]                  
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 35, 35, 64)   192         conv2d_14[0][0]                  
__________________________________________________________________________________________________
batch_normalization_17 (BatchNo (None, 35, 35, 96)   288         conv2d_17[0][0]                  
__________________________________________________________________________________________________
batch_normalization_18 (BatchNo (None, 35, 35, 64)   192         conv2d_18[0][0]                  
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 35, 35, 64)   0           batch_normalization_12[0][0]     
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 35, 35, 64)   0           batch_normalization_14[0][0]     
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 35, 35, 96)   0           batch_normalization_17[0][0]     
__________________________________________________________________________________________________
activation_18 (Activation)      (None, 35, 35, 64)   0           batch_normalization_18[0][0]     
__________________________________________________________________________________________________
mixed1 (Concatenate)            (None, 35, 35, 288)  0           activation_12[0][0]              
                                                                 activation_14[0][0]              
                                                                 activation_17[0][0]              
                                                                 activation_18[0][0]              
__________________________________________________________________________________________________
conv2d_22 (Conv2D)              (None, 35, 35, 64)   18432       mixed1[0][0]                     
__________________________________________________________________________________________________
batch_normalization_22 (BatchNo (None, 35, 35, 64)   192         conv2d_22[0][0]                  
__________________________________________________________________________________________________
activation_22 (Activation)      (None, 35, 35, 64)   0           batch_normalization_22[0][0]     
__________________________________________________________________________________________________
conv2d_20 (Conv2D)              (None, 35, 35, 48)   13824       mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_23 (Conv2D)              (None, 35, 35, 96)   55296       activation_22[0][0]              
__________________________________________________________________________________________________
batch_normalization_20 (BatchNo (None, 35, 35, 48)   144         conv2d_20[0][0]                  
__________________________________________________________________________________________________
batch_normalization_23 (BatchNo (None, 35, 35, 96)   288         conv2d_23[0][0]                  
__________________________________________________________________________________________________
activation_20 (Activation)      (None, 35, 35, 48)   0           batch_normalization_20[0][0]     
__________________________________________________________________________________________________
activation_23 (Activation)      (None, 35, 35, 96)   0           batch_normalization_23[0][0]     
__________________________________________________________________________________________________
average_pooling2d_2 (AveragePoo (None, 35, 35, 288)  0           mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_19 (Conv2D)              (None, 35, 35, 64)   18432       mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_21 (Conv2D)              (None, 35, 35, 64)   76800       activation_20[0][0]              
__________________________________________________________________________________________________
conv2d_24 (Conv2D)              (None, 35, 35, 96)   82944       activation_23[0][0]              
__________________________________________________________________________________________________
conv2d_25 (Conv2D)              (None, 35, 35, 64)   18432       average_pooling2d_2[0][0]        
__________________________________________________________________________________________________
batch_normalization_19 (BatchNo (None, 35, 35, 64)   192         conv2d_19[0][0]                  
__________________________________________________________________________________________________
batch_normalization_21 (BatchNo (None, 35, 35, 64)   192         conv2d_21[0][0]                  
__________________________________________________________________________________________________
batch_normalization_24 (BatchNo (None, 35, 35, 96)   288         conv2d_24[0][0]                  
__________________________________________________________________________________________________
batch_normalization_25 (BatchNo (None, 35, 35, 64)   192         conv2d_25[0][0]                  
__________________________________________________________________________________________________
activation_19 (Activation)      (None, 35, 35, 64)   0           batch_normalization_19[0][0]     
__________________________________________________________________________________________________
activation_21 (Activation)      (None, 35, 35, 64)   0           batch_normalization_21[0][0]     
__________________________________________________________________________________________________
activation_24 (Activation)      (None, 35, 35, 96)   0           batch_normalization_24[0][0]     
__________________________________________________________________________________________________
activation_25 (Activation)      (None, 35, 35, 64)   0           batch_normalization_25[0][0]     
__________________________________________________________________________________________________
mixed2 (Concatenate)            (None, 35, 35, 288)  0           activation_19[0][0]              
                                                                 activation_21[0][0]              
                                                                 activation_24[0][0]              
                                                                 activation_25[0][0]              
__________________________________________________________________________________________________
conv2d_27 (Conv2D)              (None, 35, 35, 64)   18432       mixed2[0][0]                     
__________________________________________________________________________________________________
batch_normalization_27 (BatchNo (None, 35, 35, 64)   192         conv2d_27[0][0]                  
__________________________________________________________________________________________________
activation_27 (Activation)      (None, 35, 35, 64)   0           batch_normalization_27[0][0]     
__________________________________________________________________________________________________
conv2d_28 (Conv2D)              (None, 35, 35, 96)   55296       activation_27[0][0]              
__________________________________________________________________________________________________
batch_normalization_28 (BatchNo (None, 35, 35, 96)   288         conv2d_28[0][0]                  
__________________________________________________________________________________________________
activation_28 (Activation)      (None, 35, 35, 96)   0           batch_normalization_28[0][0]     
__________________________________________________________________________________________________
conv2d_26 (Conv2D)              (None, 17, 17, 384)  995328      mixed2[0][0]                     
__________________________________________________________________________________________________
conv2d_29 (Conv2D)              (None, 17, 17, 96)   82944       activation_28[0][0]              
__________________________________________________________________________________________________
batch_normalization_26 (BatchNo (None, 17, 17, 384)  1152        conv2d_26[0][0]                  
__________________________________________________________________________________________________
batch_normalization_29 (BatchNo (None, 17, 17, 96)   288         conv2d_29[0][0]                  
__________________________________________________________________________________________________
activation_26 (Activation)      (None, 17, 17, 384)  0           batch_normalization_26[0][0]     
__________________________________________________________________________________________________
activation_29 (Activation)      (None, 17, 17, 96)   0           batch_normalization_29[0][0]     
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 17, 17, 288)  0           mixed2[0][0]                     
__________________________________________________________________________________________________
mixed3 (Concatenate)            (None, 17, 17, 768)  0           activation_26[0][0]              
                                                                 activation_29[0][0]              
                                                                 max_pooling2d_2[0][0]            
__________________________________________________________________________________________________
conv2d_34 (Conv2D)              (None, 17, 17, 128)  98304       mixed3[0][0]                     
__________________________________________________________________________________________________
batch_normalization_34 (BatchNo (None, 17, 17, 128)  384         conv2d_34[0][0]                  
__________________________________________________________________________________________________
activation_34 (Activation)      (None, 17, 17, 128)  0           batch_normalization_34[0][0]     
__________________________________________________________________________________________________
conv2d_35 (Conv2D)              (None, 17, 17, 128)  114688      activation_34[0][0]              
__________________________________________________________________________________________________
batch_normalization_35 (BatchNo (None, 17, 17, 128)  384         conv2d_35[0][0]                  
__________________________________________________________________________________________________
activation_35 (Activation)      (None, 17, 17, 128)  0           batch_normalization_35[0][0]     
__________________________________________________________________________________________________
conv2d_31 (Conv2D)              (None, 17, 17, 128)  98304       mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_36 (Conv2D)              (None, 17, 17, 128)  114688      activation_35[0][0]              
__________________________________________________________________________________________________
batch_normalization_31 (BatchNo (None, 17, 17, 128)  384         conv2d_31[0][0]                  
__________________________________________________________________________________________________
batch_normalization_36 (BatchNo (None, 17, 17, 128)  384         conv2d_36[0][0]                  
__________________________________________________________________________________________________
activation_31 (Activation)      (None, 17, 17, 128)  0           batch_normalization_31[0][0]     
__________________________________________________________________________________________________
activation_36 (Activation)      (None, 17, 17, 128)  0           batch_normalization_36[0][0]     
__________________________________________________________________________________________________
conv2d_32 (Conv2D)              (None, 17, 17, 128)  114688      activation_31[0][0]              
__________________________________________________________________________________________________
conv2d_37 (Conv2D)              (None, 17, 17, 128)  114688      activation_36[0][0]              
__________________________________________________________________________________________________
batch_normalization_32 (BatchNo (None, 17, 17, 128)  384         conv2d_32[0][0]                  
__________________________________________________________________________________________________
batch_normalization_37 (BatchNo (None, 17, 17, 128)  384         conv2d_37[0][0]                  
__________________________________________________________________________________________________
activation_32 (Activation)      (None, 17, 17, 128)  0           batch_normalization_32[0][0]     
__________________________________________________________________________________________________
activation_37 (Activation)      (None, 17, 17, 128)  0           batch_normalization_37[0][0]     
__________________________________________________________________________________________________
average_pooling2d_3 (AveragePoo (None, 17, 17, 768)  0           mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_30 (Conv2D)              (None, 17, 17, 192)  147456      mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_33 (Conv2D)              (None, 17, 17, 192)  172032      activation_32[0][0]              
__________________________________________________________________________________________________
conv2d_38 (Conv2D)              (None, 17, 17, 192)  172032      activation_37[0][0]              
__________________________________________________________________________________________________
conv2d_39 (Conv2D)              (None, 17, 17, 192)  147456      average_pooling2d_3[0][0]        
__________________________________________________________________________________________________
batch_normalization_30 (BatchNo (None, 17, 17, 192)  576         conv2d_30[0][0]                  
__________________________________________________________________________________________________
batch_normalization_33 (BatchNo (None, 17, 17, 192)  576         conv2d_33[0][0]                  
__________________________________________________________________________________________________
batch_normalization_38 (BatchNo (None, 17, 17, 192)  576         conv2d_38[0][0]                  
__________________________________________________________________________________________________
batch_normalization_39 (BatchNo (None, 17, 17, 192)  576         conv2d_39[0][0]                  
__________________________________________________________________________________________________
activation_30 (Activation)      (None, 17, 17, 192)  0           batch_normalization_30[0][0]     
__________________________________________________________________________________________________
activation_33 (Activation)      (None, 17, 17, 192)  0           batch_normalization_33[0][0]     
__________________________________________________________________________________________________
activation_38 (Activation)      (None, 17, 17, 192)  0           batch_normalization_38[0][0]     
__________________________________________________________________________________________________
activation_39 (Activation)      (None, 17, 17, 192)  0           batch_normalization_39[0][0]     
__________________________________________________________________________________________________
mixed4 (Concatenate)            (None, 17, 17, 768)  0           activation_30[0][0]              
                                                                 activation_33[0][0]              
                                                                 activation_38[0][0]              
                                                                 activation_39[0][0]              
__________________________________________________________________________________________________
conv2d_44 (Conv2D)              (None, 17, 17, 160)  122880      mixed4[0][0]                     
__________________________________________________________________________________________________
batch_normalization_44 (BatchNo (None, 17, 17, 160)  480         conv2d_44[0][0]                  
__________________________________________________________________________________________________
activation_44 (Activation)      (None, 17, 17, 160)  0           batch_normalization_44[0][0]     
__________________________________________________________________________________________________
conv2d_45 (Conv2D)              (None, 17, 17, 160)  179200      activation_44[0][0]              
__________________________________________________________________________________________________
batch_normalization_45 (BatchNo (None, 17, 17, 160)  480         conv2d_45[0][0]                  
__________________________________________________________________________________________________
activation_45 (Activation)      (None, 17, 17, 160)  0           batch_normalization_45[0][0]     
__________________________________________________________________________________________________
conv2d_41 (Conv2D)              (None, 17, 17, 160)  122880      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_46 (Conv2D)              (None, 17, 17, 160)  179200      activation_45[0][0]              
__________________________________________________________________________________________________
batch_normalization_41 (BatchNo (None, 17, 17, 160)  480         conv2d_41[0][0]                  
__________________________________________________________________________________________________
batch_normalization_46 (BatchNo (None, 17, 17, 160)  480         conv2d_46[0][0]                  
__________________________________________________________________________________________________
activation_41 (Activation)      (None, 17, 17, 160)  0           batch_normalization_41[0][0]     
__________________________________________________________________________________________________
activation_46 (Activation)      (None, 17, 17, 160)  0           batch_normalization_46[0][0]     
__________________________________________________________________________________________________
conv2d_42 (Conv2D)              (None, 17, 17, 160)  179200      activation_41[0][0]              
__________________________________________________________________________________________________
conv2d_47 (Conv2D)              (None, 17, 17, 160)  179200      activation_46[0][0]              
__________________________________________________________________________________________________
batch_normalization_42 (BatchNo (None, 17, 17, 160)  480         conv2d_42[0][0]                  
__________________________________________________________________________________________________
batch_normalization_47 (BatchNo (None, 17, 17, 160)  480         conv2d_47[0][0]                  
__________________________________________________________________________________________________
activation_42 (Activation)      (None, 17, 17, 160)  0           batch_normalization_42[0][0]     
__________________________________________________________________________________________________
activation_47 (Activation)      (None, 17, 17, 160)  0           batch_normalization_47[0][0]     
__________________________________________________________________________________________________
average_pooling2d_4 (AveragePoo (None, 17, 17, 768)  0           mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_40 (Conv2D)              (None, 17, 17, 192)  147456      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_43 (Conv2D)              (None, 17, 17, 192)  215040      activation_42[0][0]              
__________________________________________________________________________________________________
conv2d_48 (Conv2D)              (None, 17, 17, 192)  215040      activation_47[0][0]              
__________________________________________________________________________________________________
conv2d_49 (Conv2D)              (None, 17, 17, 192)  147456      average_pooling2d_4[0][0]        
__________________________________________________________________________________________________
batch_normalization_40 (BatchNo (None, 17, 17, 192)  576         conv2d_40[0][0]                  
__________________________________________________________________________________________________
batch_normalization_43 (BatchNo (None, 17, 17, 192)  576         conv2d_43[0][0]                  
__________________________________________________________________________________________________
batch_normalization_48 (BatchNo (None, 17, 17, 192)  576         conv2d_48[0][0]                  
__________________________________________________________________________________________________
batch_normalization_49 (BatchNo (None, 17, 17, 192)  576         conv2d_49[0][0]                  
__________________________________________________________________________________________________
activation_40 (Activation)      (None, 17, 17, 192)  0           batch_normalization_40[0][0]     
__________________________________________________________________________________________________
activation_43 (Activation)      (None, 17, 17, 192)  0           batch_normalization_43[0][0]     
__________________________________________________________________________________________________
activation_48 (Activation)      (None, 17, 17, 192)  0           batch_normalization_48[0][0]     
__________________________________________________________________________________________________
activation_49 (Activation)      (None, 17, 17, 192)  0           batch_normalization_49[0][0]     
__________________________________________________________________________________________________
mixed5 (Concatenate)            (None, 17, 17, 768)  0           activation_40[0][0]              
                                                                 activation_43[0][0]              
                                                                 activation_48[0][0]              
                                                                 activation_49[0][0]              
__________________________________________________________________________________________________
conv2d_54 (Conv2D)              (None, 17, 17, 160)  122880      mixed5[0][0]                     
__________________________________________________________________________________________________
batch_normalization_54 (BatchNo (None, 17, 17, 160)  480         conv2d_54[0][0]                  
__________________________________________________________________________________________________
activation_54 (Activation)      (None, 17, 17, 160)  0           batch_normalization_54[0][0]     
__________________________________________________________________________________________________
conv2d_55 (Conv2D)              (None, 17, 17, 160)  179200      activation_54[0][0]              
__________________________________________________________________________________________________
batch_normalization_55 (BatchNo (None, 17, 17, 160)  480         conv2d_55[0][0]                  
__________________________________________________________________________________________________
activation_55 (Activation)      (None, 17, 17, 160)  0           batch_normalization_55[0][0]     
__________________________________________________________________________________________________
conv2d_51 (Conv2D)              (None, 17, 17, 160)  122880      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_56 (Conv2D)              (None, 17, 17, 160)  179200      activation_55[0][0]              
__________________________________________________________________________________________________
batch_normalization_51 (BatchNo (None, 17, 17, 160)  480         conv2d_51[0][0]                  
__________________________________________________________________________________________________
batch_normalization_56 (BatchNo (None, 17, 17, 160)  480         conv2d_56[0][0]                  
__________________________________________________________________________________________________
activation_51 (Activation)      (None, 17, 17, 160)  0           batch_normalization_51[0][0]     
__________________________________________________________________________________________________
activation_56 (Activation)      (None, 17, 17, 160)  0           batch_normalization_56[0][0]     
__________________________________________________________________________________________________
conv2d_52 (Conv2D)              (None, 17, 17, 160)  179200      activation_51[0][0]              
__________________________________________________________________________________________________
conv2d_57 (Conv2D)              (None, 17, 17, 160)  179200      activation_56[0][0]              
__________________________________________________________________________________________________
batch_normalization_52 (BatchNo (None, 17, 17, 160)  480         conv2d_52[0][0]                  
__________________________________________________________________________________________________
batch_normalization_57 (BatchNo (None, 17, 17, 160)  480         conv2d_57[0][0]                  
__________________________________________________________________________________________________
activation_52 (Activation)      (None, 17, 17, 160)  0           batch_normalization_52[0][0]     
__________________________________________________________________________________________________
activation_57 (Activation)      (None, 17, 17, 160)  0           batch_normalization_57[0][0]     
__________________________________________________________________________________________________
average_pooling2d_5 (AveragePoo (None, 17, 17, 768)  0           mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_50 (Conv2D)              (None, 17, 17, 192)  147456      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_53 (Conv2D)              (None, 17, 17, 192)  215040      activation_52[0][0]              
__________________________________________________________________________________________________
conv2d_58 (Conv2D)              (None, 17, 17, 192)  215040      activation_57[0][0]              
__________________________________________________________________________________________________
conv2d_59 (Conv2D)              (None, 17, 17, 192)  147456      average_pooling2d_5[0][0]        
__________________________________________________________________________________________________
batch_normalization_50 (BatchNo (None, 17, 17, 192)  576         conv2d_50[0][0]                  
__________________________________________________________________________________________________
batch_normalization_53 (BatchNo (None, 17, 17, 192)  576         conv2d_53[0][0]                  
__________________________________________________________________________________________________
batch_normalization_58 (BatchNo (None, 17, 17, 192)  576         conv2d_58[0][0]                  
__________________________________________________________________________________________________
batch_normalization_59 (BatchNo (None, 17, 17, 192)  576         conv2d_59[0][0]                  
__________________________________________________________________________________________________
activation_50 (Activation)      (None, 17, 17, 192)  0           batch_normalization_50[0][0]     
__________________________________________________________________________________________________
activation_53 (Activation)      (None, 17, 17, 192)  0           batch_normalization_53[0][0]     
__________________________________________________________________________________________________
activation_58 (Activation)      (None, 17, 17, 192)  0           batch_normalization_58[0][0]     
__________________________________________________________________________________________________
activation_59 (Activation)      (None, 17, 17, 192)  0           batch_normalization_59[0][0]     
__________________________________________________________________________________________________
mixed6 (Concatenate)            (None, 17, 17, 768)  0           activation_50[0][0]              
                                                                 activation_53[0][0]              
                                                                 activation_58[0][0]              
                                                                 activation_59[0][0]              
__________________________________________________________________________________________________
conv2d_64 (Conv2D)              (None, 17, 17, 192)  147456      mixed6[0][0]                     
__________________________________________________________________________________________________
batch_normalization_64 (BatchNo (None, 17, 17, 192)  576         conv2d_64[0][0]                  
__________________________________________________________________________________________________
activation_64 (Activation)      (None, 17, 17, 192)  0           batch_normalization_64[0][0]     
__________________________________________________________________________________________________
conv2d_65 (Conv2D)              (None, 17, 17, 192)  258048      activation_64[0][0]              
__________________________________________________________________________________________________
batch_normalization_65 (BatchNo (None, 17, 17, 192)  576         conv2d_65[0][0]                  
__________________________________________________________________________________________________
activation_65 (Activation)      (None, 17, 17, 192)  0           batch_normalization_65[0][0]     
__________________________________________________________________________________________________
conv2d_61 (Conv2D)              (None, 17, 17, 192)  147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_66 (Conv2D)              (None, 17, 17, 192)  258048      activation_65[0][0]              
__________________________________________________________________________________________________
batch_normalization_61 (BatchNo (None, 17, 17, 192)  576         conv2d_61[0][0]                  
__________________________________________________________________________________________________
batch_normalization_66 (BatchNo (None, 17, 17, 192)  576         conv2d_66[0][0]                  
__________________________________________________________________________________________________
activation_61 (Activation)      (None, 17, 17, 192)  0           batch_normalization_61[0][0]     
__________________________________________________________________________________________________
activation_66 (Activation)      (None, 17, 17, 192)  0           batch_normalization_66[0][0]     
__________________________________________________________________________________________________
conv2d_62 (Conv2D)              (None, 17, 17, 192)  258048      activation_61[0][0]              
__________________________________________________________________________________________________
conv2d_67 (Conv2D)              (None, 17, 17, 192)  258048      activation_66[0][0]              
__________________________________________________________________________________________________
batch_normalization_62 (BatchNo (None, 17, 17, 192)  576         conv2d_62[0][0]                  
__________________________________________________________________________________________________
batch_normalization_67 (BatchNo (None, 17, 17, 192)  576         conv2d_67[0][0]                  
__________________________________________________________________________________________________
activation_62 (Activation)      (None, 17, 17, 192)  0           batch_normalization_62[0][0]     
__________________________________________________________________________________________________
activation_67 (Activation)      (None, 17, 17, 192)  0           batch_normalization_67[0][0]     
__________________________________________________________________________________________________
average_pooling2d_6 (AveragePoo (None, 17, 17, 768)  0           mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_60 (Conv2D)              (None, 17, 17, 192)  147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_63 (Conv2D)              (None, 17, 17, 192)  258048      activation_62[0][0]              
__________________________________________________________________________________________________
conv2d_68 (Conv2D)              (None, 17, 17, 192)  258048      activation_67[0][0]              
__________________________________________________________________________________________________
conv2d_69 (Conv2D)              (None, 17, 17, 192)  147456      average_pooling2d_6[0][0]        
__________________________________________________________________________________________________
batch_normalization_60 (BatchNo (None, 17, 17, 192)  576         conv2d_60[0][0]                  
__________________________________________________________________________________________________
batch_normalization_63 (BatchNo (None, 17, 17, 192)  576         conv2d_63[0][0]                  
__________________________________________________________________________________________________
batch_normalization_68 (BatchNo (None, 17, 17, 192)  576         conv2d_68[0][0]                  
__________________________________________________________________________________________________
batch_normalization_69 (BatchNo (None, 17, 17, 192)  576         conv2d_69[0][0]                  
__________________________________________________________________________________________________
activation_60 (Activation)      (None, 17, 17, 192)  0           batch_normalization_60[0][0]     
__________________________________________________________________________________________________
activation_63 (Activation)      (None, 17, 17, 192)  0           batch_normalization_63[0][0]     
__________________________________________________________________________________________________
activation_68 (Activation)      (None, 17, 17, 192)  0           batch_normalization_68[0][0]     
__________________________________________________________________________________________________
activation_69 (Activation)      (None, 17, 17, 192)  0           batch_normalization_69[0][0]     
__________________________________________________________________________________________________
mixed7 (Concatenate)            (None, 17, 17, 768)  0           activation_60[0][0]              
                                                                 activation_63[0][0]              
                                                                 activation_68[0][0]              
                                                                 activation_69[0][0]              
__________________________________________________________________________________________________
conv2d_72 (Conv2D)              (None, 17, 17, 192)  147456      mixed7[0][0]                     
__________________________________________________________________________________________________
batch_normalization_72 (BatchNo (None, 17, 17, 192)  576         conv2d_72[0][0]                  
__________________________________________________________________________________________________
activation_72 (Activation)      (None, 17, 17, 192)  0           batch_normalization_72[0][0]     
__________________________________________________________________________________________________
conv2d_73 (Conv2D)              (None, 17, 17, 192)  258048      activation_72[0][0]              
__________________________________________________________________________________________________
batch_normalization_73 (BatchNo (None, 17, 17, 192)  576         conv2d_73[0][0]                  
__________________________________________________________________________________________________
activation_73 (Activation)      (None, 17, 17, 192)  0           batch_normalization_73[0][0]     
__________________________________________________________________________________________________
conv2d_70 (Conv2D)              (None, 17, 17, 192)  147456      mixed7[0][0]                     
__________________________________________________________________________________________________
conv2d_74 (Conv2D)              (None, 17, 17, 192)  258048      activation_73[0][0]              
__________________________________________________________________________________________________
batch_normalization_70 (BatchNo (None, 17, 17, 192)  576         conv2d_70[0][0]                  
__________________________________________________________________________________________________
batch_normalization_74 (BatchNo (None, 17, 17, 192)  576         conv2d_74[0][0]                  
__________________________________________________________________________________________________
activation_70 (Activation)      (None, 17, 17, 192)  0           batch_normalization_70[0][0]     
__________________________________________________________________________________________________
activation_74 (Activation)      (None, 17, 17, 192)  0           batch_normalization_74[0][0]     
__________________________________________________________________________________________________
conv2d_71 (Conv2D)              (None, 8, 8, 320)    552960      activation_70[0][0]              
__________________________________________________________________________________________________
conv2d_75 (Conv2D)              (None, 8, 8, 192)    331776      activation_74[0][0]              
__________________________________________________________________________________________________
batch_normalization_71 (BatchNo (None, 8, 8, 320)    960         conv2d_71[0][0]                  
__________________________________________________________________________________________________
batch_normalization_75 (BatchNo (None, 8, 8, 192)    576         conv2d_75[0][0]                  
__________________________________________________________________________________________________
activation_71 (Activation)      (None, 8, 8, 320)    0           batch_normalization_71[0][0]     
__________________________________________________________________________________________________
activation_75 (Activation)      (None, 8, 8, 192)    0           batch_normalization_75[0][0]     
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, 8, 8, 768)    0           mixed7[0][0]                     
__________________________________________________________________________________________________
mixed8 (Concatenate)            (None, 8, 8, 1280)   0           activation_71[0][0]              
                                                                 activation_75[0][0]              
                                                                 max_pooling2d_3[0][0]            
__________________________________________________________________________________________________
conv2d_80 (Conv2D)              (None, 8, 8, 448)    573440      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_80 (BatchNo (None, 8, 8, 448)    1344        conv2d_80[0][0]                  
__________________________________________________________________________________________________
activation_80 (Activation)      (None, 8, 8, 448)    0           batch_normalization_80[0][0]     
__________________________________________________________________________________________________
conv2d_77 (Conv2D)              (None, 8, 8, 384)    491520      mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_81 (Conv2D)              (None, 8, 8, 384)    1548288     activation_80[0][0]              
__________________________________________________________________________________________________
batch_normalization_77 (BatchNo (None, 8, 8, 384)    1152        conv2d_77[0][0]                  
__________________________________________________________________________________________________
batch_normalization_81 (BatchNo (None, 8, 8, 384)    1152        conv2d_81[0][0]                  
__________________________________________________________________________________________________
activation_77 (Activation)      (None, 8, 8, 384)    0           batch_normalization_77[0][0]     
__________________________________________________________________________________________________
activation_81 (Activation)      (None, 8, 8, 384)    0           batch_normalization_81[0][0]     
__________________________________________________________________________________________________
conv2d_78 (Conv2D)              (None, 8, 8, 384)    442368      activation_77[0][0]              
__________________________________________________________________________________________________
conv2d_79 (Conv2D)              (None, 8, 8, 384)    442368      activation_77[0][0]              
__________________________________________________________________________________________________
conv2d_82 (Conv2D)              (None, 8, 8, 384)    442368      activation_81[0][0]              
__________________________________________________________________________________________________
conv2d_83 (Conv2D)              (None, 8, 8, 384)    442368      activation_81[0][0]              
__________________________________________________________________________________________________
average_pooling2d_7 (AveragePoo (None, 8, 8, 1280)   0           mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_76 (Conv2D)              (None, 8, 8, 320)    409600      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_78 (BatchNo (None, 8, 8, 384)    1152        conv2d_78[0][0]                  
__________________________________________________________________________________________________
batch_normalization_79 (BatchNo (None, 8, 8, 384)    1152        conv2d_79[0][0]                  
__________________________________________________________________________________________________
batch_normalization_82 (BatchNo (None, 8, 8, 384)    1152        conv2d_82[0][0]                  
__________________________________________________________________________________________________
batch_normalization_83 (BatchNo (None, 8, 8, 384)    1152        conv2d_83[0][0]                  
__________________________________________________________________________________________________
conv2d_84 (Conv2D)              (None, 8, 8, 192)    245760      average_pooling2d_7[0][0]        
__________________________________________________________________________________________________
batch_normalization_76 (BatchNo (None, 8, 8, 320)    960         conv2d_76[0][0]                  
__________________________________________________________________________________________________
activation_78 (Activation)      (None, 8, 8, 384)    0           batch_normalization_78[0][0]     
__________________________________________________________________________________________________
activation_79 (Activation)      (None, 8, 8, 384)    0           batch_normalization_79[0][0]     
__________________________________________________________________________________________________
activation_82 (Activation)      (None, 8, 8, 384)    0           batch_normalization_82[0][0]     
__________________________________________________________________________________________________
activation_83 (Activation)      (None, 8, 8, 384)    0           batch_normalization_83[0][0]     
__________________________________________________________________________________________________
batch_normalization_84 (BatchNo (None, 8, 8, 192)    576         conv2d_84[0][0]                  
__________________________________________________________________________________________________
activation_76 (Activation)      (None, 8, 8, 320)    0           batch_normalization_76[0][0]     
__________________________________________________________________________________________________
mixed9_0 (Concatenate)          (None, 8, 8, 768)    0           activation_78[0][0]              
                                                                 activation_79[0][0]              
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 8, 8, 768)    0           activation_82[0][0]              
                                                                 activation_83[0][0]              
__________________________________________________________________________________________________
activation_84 (Activation)      (None, 8, 8, 192)    0           batch_normalization_84[0][0]     
__________________________________________________________________________________________________
mixed9 (Concatenate)            (None, 8, 8, 2048)   0           activation_76[0][0]              
                                                                 mixed9_0[0][0]                   
                                                                 concatenate[0][0]                
                                                                 activation_84[0][0]              
__________________________________________________________________________________________________
conv2d_89 (Conv2D)              (None, 8, 8, 448)    917504      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_89 (BatchNo (None, 8, 8, 448)    1344        conv2d_89[0][0]                  
__________________________________________________________________________________________________
activation_89 (Activation)      (None, 8, 8, 448)    0           batch_normalization_89[0][0]     
__________________________________________________________________________________________________
conv2d_86 (Conv2D)              (None, 8, 8, 384)    786432      mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_90 (Conv2D)              (None, 8, 8, 384)    1548288     activation_89[0][0]              
__________________________________________________________________________________________________
batch_normalization_86 (BatchNo (None, 8, 8, 384)    1152        conv2d_86[0][0]                  
__________________________________________________________________________________________________
batch_normalization_90 (BatchNo (None, 8, 8, 384)    1152        conv2d_90[0][0]                  
__________________________________________________________________________________________________
activation_86 (Activation)      (None, 8, 8, 384)    0           batch_normalization_86[0][0]     
__________________________________________________________________________________________________
activation_90 (Activation)      (None, 8, 8, 384)    0           batch_normalization_90[0][0]     
__________________________________________________________________________________________________
conv2d_87 (Conv2D)              (None, 8, 8, 384)    442368      activation_86[0][0]              
__________________________________________________________________________________________________
conv2d_88 (Conv2D)              (None, 8, 8, 384)    442368      activation_86[0][0]              
__________________________________________________________________________________________________
conv2d_91 (Conv2D)              (None, 8, 8, 384)    442368      activation_90[0][0]              
__________________________________________________________________________________________________
conv2d_92 (Conv2D)              (None, 8, 8, 384)    442368      activation_90[0][0]              
__________________________________________________________________________________________________
average_pooling2d_8 (AveragePoo (None, 8, 8, 2048)   0           mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_85 (Conv2D)              (None, 8, 8, 320)    655360      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_87 (BatchNo (None, 8, 8, 384)    1152        conv2d_87[0][0]                  
__________________________________________________________________________________________________
batch_normalization_88 (BatchNo (None, 8, 8, 384)    1152        conv2d_88[0][0]                  
__________________________________________________________________________________________________
batch_normalization_91 (BatchNo (None, 8, 8, 384)    1152        conv2d_91[0][0]                  
__________________________________________________________________________________________________
batch_normalization_92 (BatchNo (None, 8, 8, 384)    1152        conv2d_92[0][0]                  
__________________________________________________________________________________________________
conv2d_93 (Conv2D)              (None, 8, 8, 192)    393216      average_pooling2d_8[0][0]        
__________________________________________________________________________________________________
batch_normalization_85 (BatchNo (None, 8, 8, 320)    960         conv2d_85[0][0]                  
__________________________________________________________________________________________________
activation_87 (Activation)      (None, 8, 8, 384)    0           batch_normalization_87[0][0]     
__________________________________________________________________________________________________
activation_88 (Activation)      (None, 8, 8, 384)    0           batch_normalization_88[0][0]     
__________________________________________________________________________________________________
activation_91 (Activation)      (None, 8, 8, 384)    0           batch_normalization_91[0][0]     
__________________________________________________________________________________________________
activation_92 (Activation)      (None, 8, 8, 384)    0           batch_normalization_92[0][0]     
__________________________________________________________________________________________________
batch_normalization_93 (BatchNo (None, 8, 8, 192)    576         conv2d_93[0][0]                  
__________________________________________________________________________________________________
activation_85 (Activation)      (None, 8, 8, 320)    0           batch_normalization_85[0][0]     
__________________________________________________________________________________________________
mixed9_1 (Concatenate)          (None, 8, 8, 768)    0           activation_87[0][0]              
                                                                 activation_88[0][0]              
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 8, 8, 768)    0           activation_91[0][0]              
                                                                 activation_92[0][0]              
__________________________________________________________________________________________________
activation_93 (Activation)      (None, 8, 8, 192)    0           batch_normalization_93[0][0]     
__________________________________________________________________________________________________
mixed10 (Concatenate)           (None, 8, 8, 2048)   0           activation_85[0][0]              
                                                                 mixed9_1[0][0]                   
                                                                 concatenate_1[0][0]              
                                                                 activation_93[0][0]              
==================================================================================================
Total params: 21,802,784
Trainable params: 0
Non-trainable params: 21,802,784
__________________________________________________________________________________________________
None

Create the Output Layers

x = layers.GlobalAveragePooling2D()(base_model.output)
x = layers.Dense(1024, activation="relu")(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(1, activation="sigmoid")(x)

Now build the model, combining the pre-trained base with the new dense layers (which are the part we're actually going to train). Since we only have two classes, the output activation function is the sigmoid.

model = tensorflow.keras.Model(
    base_model.input,
    x,
)
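
As a sanity check, the frozen base should contribute no trainable weights while the new head should contribute some (a quick sketch, assuming the base_model and model objects above are in scope):

print(f"Base model trainable weights: {len(base_model.trainable_weights)}")
print(f"Combined model trainable weights: {len(model.trainable_weights)}")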

Compile the Model

model.compile(optimizer = RMSprop(lr=0.0001), 
              loss = 'binary_crossentropy', 
              metrics = ['acc'])

Train the Model

A Model Saver

best_model = MODELS/"inception_transfer.hdf5"
checkpoint = tensorflow.keras.callbacks.ModelCheckpoint(
    str(best_model), monitor="val_acc", verbose=1, 
    save_best_only=True)

A Good Enough Callback

class GoodEnough(tensorflow.keras.callbacks.Callback):
    """Stops the training once the accuracy gets (nearly) perfect"""
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if logs.get('acc', 0) > 0.999:
            print("\nReached 99.9% accuracy so cancelling training!")
            self.model.stop_training = True

A Data Generator

This bundles up the steps to build the data generator.

class Data:
    """creates the data generators

    Args:
     path: path to the images
     validation_split: fraction that goes to the validation set
     batch_size: size for the batches in the epochs
    """
    def __init__(self, path: str, validation_split: float=0.2,
                 batch_size: int=20) -> None:
        self.path = path
        self.validation_split = validation_split
        self.batch_size = batch_size
        self._data_generator = None
        self._testing_data_generator = None
        self._training_generator = None
        self._validation_generator = None
        return

    @property
    def data_generator(self) -> ImageDataGenerator:
        """The data generator for training and validation"""
        if self._data_generator is None:
            self._data_generator = ImageDataGenerator(
                rescale=1/255,
                rotation_range=40,
                width_shift_range=0.2,
                height_shift_range=0.2,
                horizontal_flip=True,
                shear_range=0.2,
                zoom_range=0.2,
                fill_mode="nearest",
                validation_split=self.validation_split)
        return self._data_generator

    @property
    def training_generator(self):
        """The training data generator"""
        if self._training_generator is None:
            self._training_generator = self.data_generator.flow_from_directory(
                self.path,
                batch_size=self.batch_size,
                class_mode="binary",
                target_size=(300, 300),
                subset="training",
            )
        return self._training_generator

    @property
    def validation_generator(self):
        """The validation data generator"""
        if self._validation_generator is None:
            self._validation_generator = self.data_generator.flow_from_directory(
                self.path,
                batch_size=self.batch_size,
                class_mode="binary",
                target_size=(300, 300),
                subset="validation",
            )
        return self._validation_generator

    def __str__(self) -> str:
        return (f"(Data) - Path: {self.path}, "
                f"Validation Split: {self.validation_split}, "
                f"Batch Size: {self.batch_size}")

A Model Builder

class Network:
    """The model to categorize the images

    Args:
     model: model to train
     path: path to the training data
     epochs: number of epochs to train
     batch_size: size of the batches for each epoch
     convolution_layers: layers of cnn/max-pooling
     callbacks: things to stop the training
     set_steps: whether to set the training steps-per-epoch
    """
    def __init__(self, model, path: str, epochs: int=15,
                 batch_size: int=128, convolution_layers: int=3,
                 set_steps: bool=True,
                 callbacks: list=None) -> None:
        self.model = model
        self.path = path
        self.epochs = epochs
        self.batch_size = batch_size
        self.convolution_layers = convolution_layers
        self.set_steps = set_steps
        self.callbacks = callbacks
        self._data = None
        self._model = None
        self.history = None
        return

    @property
    def data(self) -> Data:
        """The data generator builder"""
        if self._data is None:
            self._data = Data(self.path, batch_size=self.batch_size)
        return self._data

    def summary(self) -> None:
        """Prints the model summary"""
        print(self.model.summary())
        return

    def train(self) -> None:
        """Trains the model"""
        callbacks = self.callbacks if self.callbacks else []
        arguments = dict(
            generator=self.data.training_generator,
            validation_data=self.data.validation_generator,
            epochs=self.epochs,
            callbacks=callbacks,
            verbose=2,
        )
        if self.set_steps:
            arguments["steps_per_epoch"] = int(
                self.data.training_generator.samples/self.batch_size)
            arguments["validation_steps"] = int(
                self.data.validation_generator.samples/self.batch_size)

        self.history = self.model.fit_generator(**arguments)
        return

    def __str__(self) -> str:
        return (f"(Network) - \nPath: {self.path}\n Epochs: {self.epochs}\n "
                f"Batch Size: {self.batch_size}\n Callbacks: {self.callbacks}\n"
                f"Data: {self.data}")

Train It

good_enough = GoodEnough()
network = Network(model, Path(OUTPUT).expanduser(),
                  set_steps=True,
                  epochs=40,
                  callbacks=[checkpoint, good_enough],
                  batch_size=1)
with TIMER:
    network.train()
2019-08-03 19:28:04,102 graeae.timers.timer start: Started: 2019-08-03 19:28:04.102954
I0803 19:28:04.102986 139918777980736 timer.py:70] Started: 2019-08-03 19:28:04.102954
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/10

Epoch 00001: val_acc improved from -inf to 0.43660, saving model to /home/athena/models/dogs-vs-cats/inception_transfer.hdf5
20000/20000 - 615s - loss: 0.7032 - acc: 0.4977 - val_loss: 0.8069 - val_acc: 0.4366
Epoch 2/10

Epoch 00002: val_acc improved from 0.43660 to 0.43780, saving model to /home/athena/models/dogs-vs-cats/inception_transfer.hdf5
20000/20000 - 631s - loss: 0.6933 - acc: 0.5049 - val_loss: 0.7958 - val_acc: 0.4378
Epoch 3/10

Epoch 00003: val_acc did not improve from 0.43780
20000/20000 - 670s - loss: 0.6932 - acc: 0.4990 - val_loss: 0.8142 - val_acc: 0.4230
Epoch 4/10

Epoch 00004: val_acc improved from 0.43780 to 0.45020, saving model to /home/athena/models/dogs-vs-cats/inception_transfer.hdf5
20000/20000 - 666s - loss: 0.6932 - acc: 0.4990 - val_loss: 0.7856 - val_acc: 0.4502
Epoch 5/10

Epoch 00005: val_acc did not improve from 0.45020
20000/20000 - 636s - loss: 0.6932 - acc: 0.4983 - val_loss: 0.7982 - val_acc: 0.4312
Epoch 6/10

Epoch 00006: val_acc did not improve from 0.45020
20000/20000 - 618s - loss: 0.6932 - acc: 0.4999 - val_loss: 0.8018 - val_acc: 0.4326
Epoch 7/10

Epoch 00007: val_acc did not improve from 0.45020
20000/20000 - 614s - loss: 0.6932 - acc: 0.4999 - val_loss: 0.7870 - val_acc: 0.4484
Epoch 8/10

Epoch 00008: val_acc improved from 0.45020 to 0.45660, saving model to /home/athena/models/dogs-vs-cats/inception_transfer.hdf5
20000/20000 - 607s - loss: 0.6932 - acc: 0.4981 - val_loss: 0.7773 - val_acc: 0.4566
Epoch 9/10

Epoch 00009: val_acc did not improve from 0.45660
20000/20000 - 608s - loss: 0.6932 - acc: 0.4891 - val_loss: 0.7811 - val_acc: 0.4414
Epoch 10/10

Epoch 00010: val_acc did not improve from 0.45660
20000/20000 - 619s - loss: 0.6932 - acc: 0.5010 - val_loss: 0.7878 - val_acc: 0.4474
2019-08-03 21:12:49,142 graeae.timers.timer end: Ended: 2019-08-03 21:12:49.142478
I0803 21:12:49.142507 139918777980736 timer.py:77] Ended: 2019-08-03 21:12:49.142478
2019-08-03 21:12:49,143 graeae.timers.timer end: Elapsed: 1:44:45.039524
I0803 21:12:49.143225 139918777980736 timer.py:78] Elapsed: 1:44:45.039524

Raw

The raw output from the full 40-epoch training run:

Epoch 00002: val_acc improved from 0.88780 to 0.89268, saving model to /home/athena/models/horses-vs-humans/inception_transfer.hdf5
822/822 - 61s - loss: 0.7190 - acc: 0.5255 - val_loss: 0.5419 - val_acc: 0.8927
Epoch 3/40

Epoch 00003: val_acc improved from 0.89268 to 0.92195, saving model to /home/athena/models/horses-vs-humans/inception_transfer.hdf5
822/822 - 61s - loss: 0.7102 - acc: 0.5170 - val_loss: 0.5290 - val_acc: 0.9220
Epoch 4/40

Epoch 00004: val_acc did not improve from 0.92195
822/822 - 60s - loss: 0.7103 - acc: 0.5097 - val_loss: 0.5357 - val_acc: 0.8146
Epoch 5/40

Epoch 00005: val_acc did not improve from 0.92195
822/822 - 60s - loss: 0.7051 - acc: 0.5012 - val_loss: 0.5330 - val_acc: 0.6780
Epoch 6/40

Epoch 00006: val_acc did not improve from 0.92195
822/822 - 64s - loss: 0.7006 - acc: 0.5012 - val_loss: 0.5969 - val_acc: 0.5317
Epoch 7/40

Epoch 00007: val_acc did not improve from 0.92195
822/822 - 63s - loss: 0.7009 - acc: 0.5109 - val_loss: 0.5356 - val_acc: 0.9122
Epoch 8/40

Epoch 00008: val_acc did not improve from 0.92195
822/822 - 62s - loss: 0.7025 - acc: 0.4878 - val_loss: 0.5103 - val_acc: 0.9073
Epoch 9/40

Epoch 00009: val_acc did not improve from 0.92195
822/822 - 60s - loss: 0.6972 - acc: 0.5207 - val_loss: 0.5321 - val_acc: 0.7561
Epoch 10/40

Epoch 00010: val_acc did not improve from 0.92195
822/822 - 61s - loss: 0.6946 - acc: 0.5316 - val_loss: 0.5102 - val_acc: 0.9220
Epoch 11/40

Epoch 00011: val_acc did not improve from 0.92195
822/822 - 62s - loss: 0.6966 - acc: 0.5365 - val_loss: 0.5149 - val_acc: 0.8488
Epoch 12/40

Epoch 00012: val_acc did not improve from 0.92195
822/822 - 62s - loss: 0.6981 - acc: 0.5073 - val_loss: 0.5266 - val_acc: 0.8293
Epoch 13/40

Epoch 00013: val_acc did not improve from 0.92195
822/822 - 62s - loss: 0.6949 - acc: 0.5182 - val_loss: 0.5046 - val_acc: 0.8780
Epoch 14/40

Epoch 00014: val_acc improved from 0.92195 to 0.95122, saving model to /home/athena/models/horses-vs-humans/inception_transfer.hdf5
822/822 - 62s - loss: 0.6957 - acc: 0.5170 - val_loss: 0.4872 - val_acc: 0.9512
Epoch 15/40

Epoch 00015: val_acc did not improve from 0.95122
822/822 - 61s - loss: 0.6944 - acc: 0.5049 - val_loss: 0.4904 - val_acc: 0.9366
Epoch 16/40

Epoch 00016: val_acc did not improve from 0.95122
822/822 - 60s - loss: 0.6920 - acc: 0.5158 - val_loss: 0.5201 - val_acc: 0.7463
Epoch 17/40

Epoch 00017: val_acc did not improve from 0.95122
822/822 - 60s - loss: 0.6951 - acc: 0.4988 - val_loss: 0.4872 - val_acc: 0.8488
Epoch 18/40

Epoch 00018: val_acc improved from 0.95122 to 0.97073, saving model to /home/athena/models/horses-vs-humans/inception_transfer.hdf5
822/822 - 61s - loss: 0.6927 - acc: 0.5377 - val_loss: 0.4889 - val_acc: 0.9707
Epoch 19/40

Epoch 00019: val_acc did not improve from 0.97073
822/822 - 63s - loss: 0.6900 - acc: 0.5255 - val_loss: 0.4912 - val_acc: 0.7854
Epoch 20/40

Epoch 00020: val_acc did not improve from 0.97073
822/822 - 64s - loss: 0.6927 - acc: 0.5243 - val_loss: 0.4651 - val_acc: 0.8878
Epoch 21/40

Epoch 00021: val_acc did not improve from 0.97073
822/822 - 64s - loss: 0.6914 - acc: 0.5304 - val_loss: 0.4368 - val_acc: 0.9659
Epoch 22/40

Epoch 00022: val_acc improved from 0.97073 to 0.97561, saving model to /home/athena/models/horses-vs-humans/inception_transfer.hdf5
822/822 - 65s - loss: 0.6881 - acc: 0.5341 - val_loss: 0.4350 - val_acc: 0.9756
Epoch 23/40

Epoch 00023: val_acc did not improve from 0.97561
822/822 - 62s - loss: 0.6914 - acc: 0.5401 - val_loss: 0.4421 - val_acc: 0.8439
Epoch 24/40

Epoch 00024: val_acc improved from 0.97561 to 0.99024, saving model to /home/athena/models/horses-vs-humans/inception_transfer.hdf5
822/822 - 61s - loss: 0.6887 - acc: 0.5511 - val_loss: 0.3974 - val_acc: 0.9902
Epoch 25/40

Epoch 00025: val_acc did not improve from 0.99024
822/822 - 62s - loss: 0.6855 - acc: 0.5535 - val_loss: 0.3716 - val_acc: 0.9902
Epoch 26/40

Epoch 00026: val_acc did not improve from 0.99024
822/822 - 63s - loss: 0.6865 - acc: 0.5389 - val_loss: 0.3736 - val_acc: 0.9610
Epoch 27/40

Epoch 00027: val_acc did not improve from 0.99024
822/822 - 60s - loss: 0.6823 - acc: 0.5718 - val_loss: 0.3799 - val_acc: 0.9220
Epoch 28/40

Epoch 00028: val_acc did not improve from 0.99024
822/822 - 61s - loss: 0.6875 - acc: 0.5474 - val_loss: 0.3530 - val_acc: 0.9902
Epoch 29/40

Epoch 00029: val_acc did not improve from 0.99024
822/822 - 60s - loss: 0.6881 - acc: 0.5487 - val_loss: 0.3376 - val_acc: 0.9902
Epoch 30/40

Epoch 00030: val_acc did not improve from 0.99024
822/822 - 62s - loss: 0.6857 - acc: 0.5462 - val_loss: 0.3216 - val_acc: 0.9707
Epoch 31/40

Epoch 00031: val_acc did not improve from 0.99024
822/822 - 62s - loss: 0.6847 - acc: 0.5450 - val_loss: 0.3025 - val_acc: 0.9902
Epoch 32/40

Epoch 00032: val_acc improved from 0.99024 to 0.99512, saving model to /home/athena/models/horses-vs-humans/inception_transfer.hdf5
822/822 - 64s - loss: 0.6821 - acc: 0.5535 - val_loss: 0.2852 - val_acc: 0.9951
Epoch 33/40

Epoch 00033: val_acc did not improve from 0.99512
822/822 - 62s - loss: 0.6793 - acc: 0.5669 - val_loss: 0.2617 - val_acc: 0.9854
Epoch 34/40

Epoch 00034: val_acc did not improve from 0.99512
822/822 - 60s - loss: 0.6772 - acc: 0.5937 - val_loss: 0.2565 - val_acc: 0.9707
Epoch 35/40

Epoch 00035: val_acc did not improve from 0.99512
822/822 - 61s - loss: 0.6766 - acc: 0.5803 - val_loss: 0.2190 - val_acc: 0.9951
Epoch 36/40

Epoch 00036: val_acc did not improve from 0.99512
822/822 - 63s - loss: 0.6726 - acc: 0.5937 - val_loss: 0.2423 - val_acc: 0.9463
Epoch 37/40

Epoch 00037: val_acc did not improve from 0.99512
822/822 - 61s - loss: 0.6735 - acc: 0.5669 - val_loss: 0.2106 - val_acc: 0.9902
Epoch 38/40

Epoch 00038: val_acc improved from 0.99512 to 1.00000, saving model to /home/athena/models/horses-vs-humans/inception_transfer.hdf5
822/822 - 61s - loss: 0.6718 - acc: 0.5949 - val_loss: 0.1868 - val_acc: 1.0000
Epoch 39/40

Epoch 00039: val_acc did not improve from 1.00000
822/822 - 60s - loss: 0.6647 - acc: 0.6119 - val_loss: 0.2140 - val_acc: 0.9610
Epoch 40/40

Epoch 00040: val_acc did not improve from 1.00000
822/822 - 60s - loss: 0.6671 - acc: 0.5815 - val_loss: 0.1823 - val_acc: 0.9707
2019-08-18 14:37:14,814 graeae.timers.timer end: Ended: 2019-08-18 14:37:14.814322
I0818 14:37:14.814355 139914340390720 timer.py:77] Ended: 2019-08-18 14:37:14.814322
2019-08-18 14:37:14,815 graeae.timers.timer end: Elapsed: 0:41:03.671258
I0818 14:37:14.815070 139914340390720 timer.py:78] Elapsed: 0:41:03.671258

So we got 100% validation accuracy… that looks like overfitting. Also, why didn't the callback stop the training? On further inspection the training accuracy never reaches 100% while the validation accuracy does, which seems odd - and it also explains the callback: GoodEnough watches the training accuracy ('acc'), not 'val_acc', so it never fired.
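
A variant that watches the validation accuracy instead would have stopped this run early (a sketch, not the callback actually used here):

class GoodEnoughValidation(tensorflow.keras.callbacks.Callback):
    """Stops the training once the validation accuracy is nearly perfect"""
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if logs.get('val_acc', 0) > 0.999:
            print("\nReached 99.9% validation accuracy so cancelling training!")
            self.model.stop_training = True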

history_ = pandas.DataFrame.from_dict(model.history.history)
history = history_.rename(columns={"loss": "Training Loss",
                                   "acc": "Training Accuracy",
                                   "val_loss": "Validation Loss",
                                   "val_acc": "Validation Accuracy"})
plot = history.hvplot().opts(
    title="Loss and Accuracy of the Horses Vs Humans Model",
    height=800,
    width=1000,
)
Embed(plot=plot, file_name="model_history")()

[Figure Missing: Loss and Accuracy of the Horses Vs Humans Model (model_history)]

So this is a little weird - should the validation accuracy start out that high? Maybe, since the base network is pre-trained… And why does the validation loss improve faster than the training loss? One plausible factor is the Dropout layer: dropout is active during training but disabled during validation, so the validation metrics can look better than the training metrics.
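
One way to probe the dropout explanation is to re-evaluate the trained model on the training generator, since dropout is disabled at evaluation time (a sketch, assuming the network object above is still in scope):

training = network.data.training_generator
loss, accuracy = model.evaluate_generator(
    training, steps=training.samples//network.batch_size)
print(f"Training loss without dropout: {loss:.4f}, accuracy: {accuracy:.4f}")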

Testing

I don't remember downloading this, but there's a separate folder called "validation" which I'm assuming has a different set of image files.

model = load_model(str(best_model))
target_size = (300, 300)

def predict(model, filename):
    """Loads an image file and returns the model's prediction for it"""
    loaded = cv2.imread(str(filename))
    # note: cv2.imread returns BGR while the generators trained on RGB,
    # so strictly speaking the channels should be converted first
    # rescale to [0, 1] to match the generator's rescale=1/255
    x = cv2.resize(loaded, target_size)/255
    x = numpy.reshape(x, (1, 300, 300, 3))
    return model.predict(x)

path = Path("~/data/datasets/images/horse-or-human/validation/").expanduser()
correct = 0
# horses are class 0 (the folders are read alphabetically),
# so a sigmoid output at or below 0.5 counts as correct
for index, filename in enumerate((path/"horses").iterdir()):
    prediction = predict(model, filename)
    correct += 0 if prediction[0] > 0.5 else 1
print(f"Fraction of horses correctly classified: {correct/(index + 1):.2f}")
Fraction of horses correctly classified: 0.99
correct = 0
# humans are class 1, so a sigmoid output above 0.5 counts as correct
for index, filename in enumerate((path/"humans").iterdir()):
    prediction = predict(model, filename)
    correct += 1 if prediction[0] > 0.5 else 0
print(f"Fraction of humans correctly classified: {correct/(index + 1):.2f}")
Fraction of humans correctly classified: 1.00

The testing images are slightly different in that they are on a white background, rather than a simulated background.

Cats and Dogs with TensorFlow 2

Beginning

This applies transfer learning to the Cats vs Dogs dataset using TensorFlow 2.

Imports

Python

from functools import partial
from pathlib import Path

PyPi

import hvplot.pandas
import matplotlib.pyplot as pyplot
import pandas
import seaborn
import tensorflow
import tensorflow_datasets
keras = tensorflow.keras

My Stuff

from graeae import EmbedHoloviews

Setup

Plotting

Embed = partial(
    EmbedHoloviews,
    folder_path="../../files/posts/keras/cats-and-dogs-with-tensorflow-2")

Datasets

tensorflow_datasets.disable_progress_bar()

Middle

The Dataset

Previously I had downloaded the Cats vs Dogs dataset from Kaggle or Microsoft, but this time I'll download it using TensorFlow Datasets.

The cats_vs_dogs dataset doesn't define its own train-validation-test splits, so we have to create them by passing in weights where each number represents tens of percent (e.g. 8 means 80%, so (8, 1, 1) means (80%, 10%, 10%)).

SPLIT_WEIGHTS = (8, 1, 1)
splits = tensorflow_datasets.Split.TRAIN.subsplit(weighted=SPLIT_WEIGHTS)
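
As an aside, this subsplit API was later removed from tensorflow_datasets; with newer releases the equivalent of the load call below would use slicing strings instead (a sketch under that assumption, not what this post ran):

(raw_train, raw_validation, raw_test), metadata = tensorflow_datasets.load(
    "cats_vs_dogs",
    split=["train[:80%]", "train[80%:90%]", "train[90%:]"],
    with_info=True,
    as_supervised=True)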

(raw_train, raw_validation, raw_test), metadata = tensorflow_datasets.load(
    "cats_vs_dogs", 
    split=list(splits), 
    with_info=True, 
    as_supervised=True)
Downloading and preparing dataset cats_vs_dogs (786.68 MiB) to /home/athena/tensorflow_datasets/cats_vs_dogs/2.0.1...
/home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/urllib3/connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
WARNING: Logging before flag parsing goes to stderr.
W0804 14:02:41.729103 140458756540224 cats_vs_dogs.py:117] 1738 images were corrupted and were skipped
W0804 14:02:41.734799 140458756540224 deprecation.py:323] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow_datasets/core/file_format_adapter.py:209: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
Dataset cats_vs_dogs downloaded and prepared to /home/athena/tensorflow_datasets/cats_vs_dogs/2.0.1. Subsequent calls will reuse this data.
print(raw_train)
print(raw_validation)
print(raw_test)
<_OptionsDataset shapes: ((None, None, 3), ()), types: (tf.uint8, tf.int64)>
<_OptionsDataset shapes: ((None, None, 3), ()), types: (tf.uint8, tf.int64)>
<_OptionsDataset shapes: ((None, None, 3), ()), types: (tf.uint8, tf.int64)>
get_label_name = metadata.features["label"].int2str
for image, label in raw_train.take(2):
    pyplot.figure()
    pyplot.imshow(image)
    pyplot.title(get_label_name(label))

[Figure: Sample Images (sample_images.png)]

Format the Data

The raw images come in varying sizes with uint8 pixels, so we cast them to floats, rescale them from [0, 255] to [-1, 1], and resize everything to 160 x 160.

IMAGE_SIZE = 160
def format_example(image, label):
    """Casts, rescales, and resizes an image"""
    image = tensorflow.cast(image, tensorflow.float32)
    # rescale the pixels from [0, 255] to [-1, 1]
    image = (image/127.5) - 1
    image = tensorflow.image.resize(image, (IMAGE_SIZE, IMAGE_SIZE))
    return image, label
train = raw_train.map(format_example)
validation = raw_validation.map(format_example)
test = raw_test.map(format_example)
BATCH_SIZE = 32
SHUFFLE_BUFFER_SIZE = 1000

train_batches = train.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)
validation_batches = validation.batch(BATCH_SIZE)
test_batches = test.batch(BATCH_SIZE)
for image_batch, label_batch in train_batches.take(1):
    print(image_batch.shape)
(32, 160, 160, 3)
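
As an aside, tf.data pipelines generally benefit from prefetching batches while the model trains on the current one; an optional tweak (not part of the original run) would be:

train_batches = (train.shuffle(SHUFFLE_BUFFER_SIZE)
                 .batch(BATCH_SIZE)
                 .prefetch(tensorflow.data.experimental.AUTOTUNE))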

The Base Model

IMAGE_SHAPE = (IMAGE_SIZE, IMAGE_SIZE, 3)
base_model = tensorflow.keras.applications.MobileNetV2(
    input_shape=IMAGE_SHAPE,
    include_top=False,
    weights="imagenet"
)

MobileNetV2 gets created first, but the next cell immediately replaces it, so InceptionV3 is the base that actually gets used (as the summary below confirms).

base_model = tensorflow.keras.applications.InceptionV3(
    input_shape=IMAGE_SHAPE,
    include_top=False,
    weights="imagenet"
)
feature_batch = base_model(image_batch)
print(feature_batch.shape)
(32, 3, 3, 2048)
# freeze the base so the pre-trained ImageNet weights aren't updated
base_model.trainable = False
print(base_model.summary())
Model: "inception_v3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_3 (InputLayer)            [(None, 160, 160, 3) 0                                            
__________________________________________________________________________________________________
conv2d_94 (Conv2D)              (None, 79, 79, 32)   864         input_3[0][0]                    
__________________________________________________________________________________________________
batch_normalization_94 (BatchNo (None, 79, 79, 32)   96          conv2d_94[0][0]                  
__________________________________________________________________________________________________
activation_94 (Activation)      (None, 79, 79, 32)   0           batch_normalization_94[0][0]     
__________________________________________________________________________________________________
conv2d_95 (Conv2D)              (None, 77, 77, 32)   9216        activation_94[0][0]              
__________________________________________________________________________________________________
batch_normalization_95 (BatchNo (None, 77, 77, 32)   96          conv2d_95[0][0]                  
__________________________________________________________________________________________________
activation_95 (Activation)      (None, 77, 77, 32)   0           batch_normalization_95[0][0]     
__________________________________________________________________________________________________
conv2d_96 (Conv2D)              (None, 77, 77, 64)   18432       activation_95[0][0]              
__________________________________________________________________________________________________
batch_normalization_96 (BatchNo (None, 77, 77, 64)   192         conv2d_96[0][0]                  
__________________________________________________________________________________________________
activation_96 (Activation)      (None, 77, 77, 64)   0           batch_normalization_96[0][0]     
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, 38, 38, 64)   0           activation_96[0][0]              
__________________________________________________________________________________________________
conv2d_97 (Conv2D)              (None, 38, 38, 80)   5120        max_pooling2d_4[0][0]            
__________________________________________________________________________________________________
batch_normalization_97 (BatchNo (None, 38, 38, 80)   240         conv2d_97[0][0]                  
__________________________________________________________________________________________________
activation_97 (Activation)      (None, 38, 38, 80)   0           batch_normalization_97[0][0]     
__________________________________________________________________________________________________
conv2d_98 (Conv2D)              (None, 36, 36, 192)  138240      activation_97[0][0]              
__________________________________________________________________________________________________
batch_normalization_98 (BatchNo (None, 36, 36, 192)  576         conv2d_98[0][0]                  
__________________________________________________________________________________________________
activation_98 (Activation)      (None, 36, 36, 192)  0           batch_normalization_98[0][0]     
__________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D)  (None, 17, 17, 192)  0           activation_98[0][0]              
__________________________________________________________________________________________________
conv2d_102 (Conv2D)             (None, 17, 17, 64)   12288       max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
batch_normalization_102 (BatchN (None, 17, 17, 64)   192         conv2d_102[0][0]                 
__________________________________________________________________________________________________
activation_102 (Activation)     (None, 17, 17, 64)   0           batch_normalization_102[0][0]    
__________________________________________________________________________________________________
conv2d_100 (Conv2D)             (None, 17, 17, 48)   9216        max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
conv2d_103 (Conv2D)             (None, 17, 17, 96)   55296       activation_102[0][0]             
__________________________________________________________________________________________________
batch_normalization_100 (BatchN (None, 17, 17, 48)   144         conv2d_100[0][0]                 
__________________________________________________________________________________________________
batch_normalization_103 (BatchN (None, 17, 17, 96)   288         conv2d_103[0][0]                 
__________________________________________________________________________________________________
activation_100 (Activation)     (None, 17, 17, 48)   0           batch_normalization_100[0][0]    
__________________________________________________________________________________________________
activation_103 (Activation)     (None, 17, 17, 96)   0           batch_normalization_103[0][0]    
__________________________________________________________________________________________________
average_pooling2d_9 (AveragePoo (None, 17, 17, 192)  0           max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
conv2d_99 (Conv2D)              (None, 17, 17, 64)   12288       max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
conv2d_101 (Conv2D)             (None, 17, 17, 64)   76800       activation_100[0][0]             
__________________________________________________________________________________________________
conv2d_104 (Conv2D)             (None, 17, 17, 96)   82944       activation_103[0][0]             
__________________________________________________________________________________________________
conv2d_105 (Conv2D)             (None, 17, 17, 32)   6144        average_pooling2d_9[0][0]        
__________________________________________________________________________________________________
batch_normalization_99 (BatchNo (None, 17, 17, 64)   192         conv2d_99[0][0]                  
__________________________________________________________________________________________________
batch_normalization_101 (BatchN (None, 17, 17, 64)   192         conv2d_101[0][0]                 
__________________________________________________________________________________________________
batch_normalization_104 (BatchN (None, 17, 17, 96)   288         conv2d_104[0][0]                 
__________________________________________________________________________________________________
batch_normalization_105 (BatchN (None, 17, 17, 32)   96          conv2d_105[0][0]                 
__________________________________________________________________________________________________
activation_99 (Activation)      (None, 17, 17, 64)   0           batch_normalization_99[0][0]     
__________________________________________________________________________________________________
activation_101 (Activation)     (None, 17, 17, 64)   0           batch_normalization_101[0][0]    
__________________________________________________________________________________________________
activation_104 (Activation)     (None, 17, 17, 96)   0           batch_normalization_104[0][0]    
__________________________________________________________________________________________________
activation_105 (Activation)     (None, 17, 17, 32)   0           batch_normalization_105[0][0]    
__________________________________________________________________________________________________
mixed0 (Concatenate)            (None, 17, 17, 256)  0           activation_99[0][0]              
                                                                 activation_101[0][0]             
                                                                 activation_104[0][0]             
                                                                 activation_105[0][0]             
__________________________________________________________________________________________________
conv2d_109 (Conv2D)             (None, 17, 17, 64)   16384       mixed0[0][0]                     
__________________________________________________________________________________________________
batch_normalization_109 (BatchN (None, 17, 17, 64)   192         conv2d_109[0][0]                 
__________________________________________________________________________________________________
activation_109 (Activation)     (None, 17, 17, 64)   0           batch_normalization_109[0][0]    
__________________________________________________________________________________________________
conv2d_107 (Conv2D)             (None, 17, 17, 48)   12288       mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_110 (Conv2D)             (None, 17, 17, 96)   55296       activation_109[0][0]             
__________________________________________________________________________________________________
batch_normalization_107 (BatchN (None, 17, 17, 48)   144         conv2d_107[0][0]                 
__________________________________________________________________________________________________
batch_normalization_110 (BatchN (None, 17, 17, 96)   288         conv2d_110[0][0]                 
__________________________________________________________________________________________________
activation_107 (Activation)     (None, 17, 17, 48)   0           batch_normalization_107[0][0]    
__________________________________________________________________________________________________
activation_110 (Activation)     (None, 17, 17, 96)   0           batch_normalization_110[0][0]    
__________________________________________________________________________________________________
average_pooling2d_10 (AveragePo (None, 17, 17, 256)  0           mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_106 (Conv2D)             (None, 17, 17, 64)   16384       mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_108 (Conv2D)             (None, 17, 17, 64)   76800       activation_107[0][0]             
__________________________________________________________________________________________________
conv2d_111 (Conv2D)             (None, 17, 17, 96)   82944       activation_110[0][0]             
__________________________________________________________________________________________________
conv2d_112 (Conv2D)             (None, 17, 17, 64)   16384       average_pooling2d_10[0][0]       
__________________________________________________________________________________________________
batch_normalization_106 (BatchN (None, 17, 17, 64)   192         conv2d_106[0][0]                 
__________________________________________________________________________________________________
batch_normalization_108 (BatchN (None, 17, 17, 64)   192         conv2d_108[0][0]                 
__________________________________________________________________________________________________
batch_normalization_111 (BatchN (None, 17, 17, 96)   288         conv2d_111[0][0]                 
__________________________________________________________________________________________________
batch_normalization_112 (BatchN (None, 17, 17, 64)   192         conv2d_112[0][0]                 
__________________________________________________________________________________________________
activation_106 (Activation)     (None, 17, 17, 64)   0           batch_normalization_106[0][0]    
__________________________________________________________________________________________________
activation_108 (Activation)     (None, 17, 17, 64)   0           batch_normalization_108[0][0]    
__________________________________________________________________________________________________
activation_111 (Activation)     (None, 17, 17, 96)   0           batch_normalization_111[0][0]    
__________________________________________________________________________________________________
activation_112 (Activation)     (None, 17, 17, 64)   0           batch_normalization_112[0][0]    
__________________________________________________________________________________________________
mixed1 (Concatenate)            (None, 17, 17, 288)  0           activation_106[0][0]             
                                                                 activation_108[0][0]             
                                                                 activation_111[0][0]             
                                                                 activation_112[0][0]             
__________________________________________________________________________________________________
conv2d_116 (Conv2D)             (None, 17, 17, 64)   18432       mixed1[0][0]                     
__________________________________________________________________________________________________
batch_normalization_116 (BatchN (None, 17, 17, 64)   192         conv2d_116[0][0]                 
__________________________________________________________________________________________________
activation_116 (Activation)     (None, 17, 17, 64)   0           batch_normalization_116[0][0]    
__________________________________________________________________________________________________
conv2d_114 (Conv2D)             (None, 17, 17, 48)   13824       mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_117 (Conv2D)             (None, 17, 17, 96)   55296       activation_116[0][0]             
__________________________________________________________________________________________________
batch_normalization_114 (BatchN (None, 17, 17, 48)   144         conv2d_114[0][0]                 
__________________________________________________________________________________________________
batch_normalization_117 (BatchN (None, 17, 17, 96)   288         conv2d_117[0][0]                 
__________________________________________________________________________________________________
activation_114 (Activation)     (None, 17, 17, 48)   0           batch_normalization_114[0][0]    
__________________________________________________________________________________________________
activation_117 (Activation)     (None, 17, 17, 96)   0           batch_normalization_117[0][0]    
__________________________________________________________________________________________________
average_pooling2d_11 (AveragePo (None, 17, 17, 288)  0           mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_113 (Conv2D)             (None, 17, 17, 64)   18432       mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_115 (Conv2D)             (None, 17, 17, 64)   76800       activation_114[0][0]             
__________________________________________________________________________________________________
conv2d_118 (Conv2D)             (None, 17, 17, 96)   82944       activation_117[0][0]             
__________________________________________________________________________________________________
conv2d_119 (Conv2D)             (None, 17, 17, 64)   18432       average_pooling2d_11[0][0]       
__________________________________________________________________________________________________
batch_normalization_113 (BatchN (None, 17, 17, 64)   192         conv2d_113[0][0]                 
__________________________________________________________________________________________________
batch_normalization_115 (BatchN (None, 17, 17, 64)   192         conv2d_115[0][0]                 
__________________________________________________________________________________________________
batch_normalization_118 (BatchN (None, 17, 17, 96)   288         conv2d_118[0][0]                 
__________________________________________________________________________________________________
batch_normalization_119 (BatchN (None, 17, 17, 64)   192         conv2d_119[0][0]                 
__________________________________________________________________________________________________
activation_113 (Activation)     (None, 17, 17, 64)   0           batch_normalization_113[0][0]    
__________________________________________________________________________________________________
activation_115 (Activation)     (None, 17, 17, 64)   0           batch_normalization_115[0][0]    
__________________________________________________________________________________________________
activation_118 (Activation)     (None, 17, 17, 96)   0           batch_normalization_118[0][0]    
__________________________________________________________________________________________________
activation_119 (Activation)     (None, 17, 17, 64)   0           batch_normalization_119[0][0]    
__________________________________________________________________________________________________
mixed2 (Concatenate)            (None, 17, 17, 288)  0           activation_113[0][0]             
                                                                 activation_115[0][0]             
                                                                 activation_118[0][0]             
                                                                 activation_119[0][0]             
__________________________________________________________________________________________________
conv2d_121 (Conv2D)             (None, 17, 17, 64)   18432       mixed2[0][0]                     
__________________________________________________________________________________________________
batch_normalization_121 (BatchN (None, 17, 17, 64)   192         conv2d_121[0][0]                 
__________________________________________________________________________________________________
activation_121 (Activation)     (None, 17, 17, 64)   0           batch_normalization_121[0][0]    
__________________________________________________________________________________________________
conv2d_122 (Conv2D)             (None, 17, 17, 96)   55296       activation_121[0][0]             
__________________________________________________________________________________________________
batch_normalization_122 (BatchN (None, 17, 17, 96)   288         conv2d_122[0][0]                 
__________________________________________________________________________________________________
activation_122 (Activation)     (None, 17, 17, 96)   0           batch_normalization_122[0][0]    
__________________________________________________________________________________________________
conv2d_120 (Conv2D)             (None, 8, 8, 384)    995328      mixed2[0][0]                     
__________________________________________________________________________________________________
conv2d_123 (Conv2D)             (None, 8, 8, 96)     82944       activation_122[0][0]             
__________________________________________________________________________________________________
batch_normalization_120 (BatchN (None, 8, 8, 384)    1152        conv2d_120[0][0]                 
__________________________________________________________________________________________________
batch_normalization_123 (BatchN (None, 8, 8, 96)     288         conv2d_123[0][0]                 
__________________________________________________________________________________________________
activation_120 (Activation)     (None, 8, 8, 384)    0           batch_normalization_120[0][0]    
__________________________________________________________________________________________________
activation_123 (Activation)     (None, 8, 8, 96)     0           batch_normalization_123[0][0]    
__________________________________________________________________________________________________
max_pooling2d_6 (MaxPooling2D)  (None, 8, 8, 288)    0           mixed2[0][0]                     
__________________________________________________________________________________________________
mixed3 (Concatenate)            (None, 8, 8, 768)    0           activation_120[0][0]             
                                                                 activation_123[0][0]             
                                                                 max_pooling2d_6[0][0]            
__________________________________________________________________________________________________
conv2d_128 (Conv2D)             (None, 8, 8, 128)    98304       mixed3[0][0]                     
__________________________________________________________________________________________________
batch_normalization_128 (BatchN (None, 8, 8, 128)    384         conv2d_128[0][0]                 
__________________________________________________________________________________________________
activation_128 (Activation)     (None, 8, 8, 128)    0           batch_normalization_128[0][0]    
__________________________________________________________________________________________________
conv2d_129 (Conv2D)             (None, 8, 8, 128)    114688      activation_128[0][0]             
__________________________________________________________________________________________________
batch_normalization_129 (BatchN (None, 8, 8, 128)    384         conv2d_129[0][0]                 
__________________________________________________________________________________________________
activation_129 (Activation)     (None, 8, 8, 128)    0           batch_normalization_129[0][0]    
__________________________________________________________________________________________________
conv2d_125 (Conv2D)             (None, 8, 8, 128)    98304       mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_130 (Conv2D)             (None, 8, 8, 128)    114688      activation_129[0][0]             
__________________________________________________________________________________________________
batch_normalization_125 (BatchN (None, 8, 8, 128)    384         conv2d_125[0][0]                 
__________________________________________________________________________________________________
batch_normalization_130 (BatchN (None, 8, 8, 128)    384         conv2d_130[0][0]                 
__________________________________________________________________________________________________
activation_125 (Activation)     (None, 8, 8, 128)    0           batch_normalization_125[0][0]    
__________________________________________________________________________________________________
activation_130 (Activation)     (None, 8, 8, 128)    0           batch_normalization_130[0][0]    
__________________________________________________________________________________________________
conv2d_126 (Conv2D)             (None, 8, 8, 128)    114688      activation_125[0][0]             
__________________________________________________________________________________________________
conv2d_131 (Conv2D)             (None, 8, 8, 128)    114688      activation_130[0][0]             
__________________________________________________________________________________________________
batch_normalization_126 (BatchN (None, 8, 8, 128)    384         conv2d_126[0][0]                 
__________________________________________________________________________________________________
batch_normalization_131 (BatchN (None, 8, 8, 128)    384         conv2d_131[0][0]                 
__________________________________________________________________________________________________
activation_126 (Activation)     (None, 8, 8, 128)    0           batch_normalization_126[0][0]    
__________________________________________________________________________________________________
activation_131 (Activation)     (None, 8, 8, 128)    0           batch_normalization_131[0][0]    
__________________________________________________________________________________________________
average_pooling2d_12 (AveragePo (None, 8, 8, 768)    0           mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_124 (Conv2D)             (None, 8, 8, 192)    147456      mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_127 (Conv2D)             (None, 8, 8, 192)    172032      activation_126[0][0]             
__________________________________________________________________________________________________
conv2d_132 (Conv2D)             (None, 8, 8, 192)    172032      activation_131[0][0]             
__________________________________________________________________________________________________
conv2d_133 (Conv2D)             (None, 8, 8, 192)    147456      average_pooling2d_12[0][0]       
__________________________________________________________________________________________________
batch_normalization_124 (BatchN (None, 8, 8, 192)    576         conv2d_124[0][0]                 
__________________________________________________________________________________________________
batch_normalization_127 (BatchN (None, 8, 8, 192)    576         conv2d_127[0][0]                 
__________________________________________________________________________________________________
batch_normalization_132 (BatchN (None, 8, 8, 192)    576         conv2d_132[0][0]                 
__________________________________________________________________________________________________
batch_normalization_133 (BatchN (None, 8, 8, 192)    576         conv2d_133[0][0]                 
__________________________________________________________________________________________________
activation_124 (Activation)     (None, 8, 8, 192)    0           batch_normalization_124[0][0]    
__________________________________________________________________________________________________
activation_127 (Activation)     (None, 8, 8, 192)    0           batch_normalization_127[0][0]    
__________________________________________________________________________________________________
activation_132 (Activation)     (None, 8, 8, 192)    0           batch_normalization_132[0][0]    
__________________________________________________________________________________________________
activation_133 (Activation)     (None, 8, 8, 192)    0           batch_normalization_133[0][0]    
__________________________________________________________________________________________________
mixed4 (Concatenate)            (None, 8, 8, 768)    0           activation_124[0][0]             
                                                                 activation_127[0][0]             
                                                                 activation_132[0][0]             
                                                                 activation_133[0][0]             
__________________________________________________________________________________________________
conv2d_138 (Conv2D)             (None, 8, 8, 160)    122880      mixed4[0][0]                     
__________________________________________________________________________________________________
batch_normalization_138 (BatchN (None, 8, 8, 160)    480         conv2d_138[0][0]                 
__________________________________________________________________________________________________
activation_138 (Activation)     (None, 8, 8, 160)    0           batch_normalization_138[0][0]    
__________________________________________________________________________________________________
conv2d_139 (Conv2D)             (None, 8, 8, 160)    179200      activation_138[0][0]             
__________________________________________________________________________________________________
batch_normalization_139 (BatchN (None, 8, 8, 160)    480         conv2d_139[0][0]                 
__________________________________________________________________________________________________
activation_139 (Activation)     (None, 8, 8, 160)    0           batch_normalization_139[0][0]    
__________________________________________________________________________________________________
conv2d_135 (Conv2D)             (None, 8, 8, 160)    122880      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_140 (Conv2D)             (None, 8, 8, 160)    179200      activation_139[0][0]             
__________________________________________________________________________________________________
batch_normalization_135 (BatchN (None, 8, 8, 160)    480         conv2d_135[0][0]                 
__________________________________________________________________________________________________
batch_normalization_140 (BatchN (None, 8, 8, 160)    480         conv2d_140[0][0]                 
__________________________________________________________________________________________________
activation_135 (Activation)     (None, 8, 8, 160)    0           batch_normalization_135[0][0]    
__________________________________________________________________________________________________
activation_140 (Activation)     (None, 8, 8, 160)    0           batch_normalization_140[0][0]    
__________________________________________________________________________________________________
conv2d_136 (Conv2D)             (None, 8, 8, 160)    179200      activation_135[0][0]             
__________________________________________________________________________________________________
conv2d_141 (Conv2D)             (None, 8, 8, 160)    179200      activation_140[0][0]             
__________________________________________________________________________________________________
batch_normalization_136 (BatchN (None, 8, 8, 160)    480         conv2d_136[0][0]                 
__________________________________________________________________________________________________
batch_normalization_141 (BatchN (None, 8, 8, 160)    480         conv2d_141[0][0]                 
__________________________________________________________________________________________________
activation_136 (Activation)     (None, 8, 8, 160)    0           batch_normalization_136[0][0]    
__________________________________________________________________________________________________
activation_141 (Activation)     (None, 8, 8, 160)    0           batch_normalization_141[0][0]    
__________________________________________________________________________________________________
average_pooling2d_13 (AveragePo (None, 8, 8, 768)    0           mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_134 (Conv2D)             (None, 8, 8, 192)    147456      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_137 (Conv2D)             (None, 8, 8, 192)    215040      activation_136[0][0]             
__________________________________________________________________________________________________
conv2d_142 (Conv2D)             (None, 8, 8, 192)    215040      activation_141[0][0]             
__________________________________________________________________________________________________
conv2d_143 (Conv2D)             (None, 8, 8, 192)    147456      average_pooling2d_13[0][0]       
__________________________________________________________________________________________________
batch_normalization_134 (BatchN (None, 8, 8, 192)    576         conv2d_134[0][0]                 
__________________________________________________________________________________________________
batch_normalization_137 (BatchN (None, 8, 8, 192)    576         conv2d_137[0][0]                 
__________________________________________________________________________________________________
batch_normalization_142 (BatchN (None, 8, 8, 192)    576         conv2d_142[0][0]                 
__________________________________________________________________________________________________
batch_normalization_143 (BatchN (None, 8, 8, 192)    576         conv2d_143[0][0]                 
__________________________________________________________________________________________________
activation_134 (Activation)     (None, 8, 8, 192)    0           batch_normalization_134[0][0]    
__________________________________________________________________________________________________
activation_137 (Activation)     (None, 8, 8, 192)    0           batch_normalization_137[0][0]    
__________________________________________________________________________________________________
activation_142 (Activation)     (None, 8, 8, 192)    0           batch_normalization_142[0][0]    
__________________________________________________________________________________________________
activation_143 (Activation)     (None, 8, 8, 192)    0           batch_normalization_143[0][0]    
__________________________________________________________________________________________________
mixed5 (Concatenate)            (None, 8, 8, 768)    0           activation_134[0][0]             
                                                                 activation_137[0][0]             
                                                                 activation_142[0][0]             
                                                                 activation_143[0][0]             
__________________________________________________________________________________________________
conv2d_148 (Conv2D)             (None, 8, 8, 160)    122880      mixed5[0][0]                     
__________________________________________________________________________________________________
batch_normalization_148 (BatchN (None, 8, 8, 160)    480         conv2d_148[0][0]                 
__________________________________________________________________________________________________
activation_148 (Activation)     (None, 8, 8, 160)    0           batch_normalization_148[0][0]    
__________________________________________________________________________________________________
conv2d_149 (Conv2D)             (None, 8, 8, 160)    179200      activation_148[0][0]             
__________________________________________________________________________________________________
batch_normalization_149 (BatchN (None, 8, 8, 160)    480         conv2d_149[0][0]                 
__________________________________________________________________________________________________
activation_149 (Activation)     (None, 8, 8, 160)    0           batch_normalization_149[0][0]    
__________________________________________________________________________________________________
conv2d_145 (Conv2D)             (None, 8, 8, 160)    122880      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_150 (Conv2D)             (None, 8, 8, 160)    179200      activation_149[0][0]             
__________________________________________________________________________________________________
batch_normalization_145 (BatchN (None, 8, 8, 160)    480         conv2d_145[0][0]                 
__________________________________________________________________________________________________
batch_normalization_150 (BatchN (None, 8, 8, 160)    480         conv2d_150[0][0]                 
__________________________________________________________________________________________________
activation_145 (Activation)     (None, 8, 8, 160)    0           batch_normalization_145[0][0]    
__________________________________________________________________________________________________
activation_150 (Activation)     (None, 8, 8, 160)    0           batch_normalization_150[0][0]    
__________________________________________________________________________________________________
conv2d_146 (Conv2D)             (None, 8, 8, 160)    179200      activation_145[0][0]             
__________________________________________________________________________________________________
conv2d_151 (Conv2D)             (None, 8, 8, 160)    179200      activation_150[0][0]             
__________________________________________________________________________________________________
batch_normalization_146 (BatchN (None, 8, 8, 160)    480         conv2d_146[0][0]                 
__________________________________________________________________________________________________
batch_normalization_151 (BatchN (None, 8, 8, 160)    480         conv2d_151[0][0]                 
__________________________________________________________________________________________________
activation_146 (Activation)     (None, 8, 8, 160)    0           batch_normalization_146[0][0]    
__________________________________________________________________________________________________
activation_151 (Activation)     (None, 8, 8, 160)    0           batch_normalization_151[0][0]    
__________________________________________________________________________________________________
average_pooling2d_14 (AveragePo (None, 8, 8, 768)    0           mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_144 (Conv2D)             (None, 8, 8, 192)    147456      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_147 (Conv2D)             (None, 8, 8, 192)    215040      activation_146[0][0]             
__________________________________________________________________________________________________
conv2d_152 (Conv2D)             (None, 8, 8, 192)    215040      activation_151[0][0]             
__________________________________________________________________________________________________
conv2d_153 (Conv2D)             (None, 8, 8, 192)    147456      average_pooling2d_14[0][0]       
__________________________________________________________________________________________________
batch_normalization_144 (BatchN (None, 8, 8, 192)    576         conv2d_144[0][0]                 
__________________________________________________________________________________________________
batch_normalization_147 (BatchN (None, 8, 8, 192)    576         conv2d_147[0][0]                 
__________________________________________________________________________________________________
batch_normalization_152 (BatchN (None, 8, 8, 192)    576         conv2d_152[0][0]                 
__________________________________________________________________________________________________
batch_normalization_153 (BatchN (None, 8, 8, 192)    576         conv2d_153[0][0]                 
__________________________________________________________________________________________________
activation_144 (Activation)     (None, 8, 8, 192)    0           batch_normalization_144[0][0]    
__________________________________________________________________________________________________
activation_147 (Activation)     (None, 8, 8, 192)    0           batch_normalization_147[0][0]    
__________________________________________________________________________________________________
activation_152 (Activation)     (None, 8, 8, 192)    0           batch_normalization_152[0][0]    
__________________________________________________________________________________________________
activation_153 (Activation)     (None, 8, 8, 192)    0           batch_normalization_153[0][0]    
__________________________________________________________________________________________________
mixed6 (Concatenate)            (None, 8, 8, 768)    0           activation_144[0][0]             
                                                                 activation_147[0][0]             
                                                                 activation_152[0][0]             
                                                                 activation_153[0][0]             
__________________________________________________________________________________________________
conv2d_158 (Conv2D)             (None, 8, 8, 192)    147456      mixed6[0][0]                     
__________________________________________________________________________________________________
batch_normalization_158 (BatchN (None, 8, 8, 192)    576         conv2d_158[0][0]                 
__________________________________________________________________________________________________
activation_158 (Activation)     (None, 8, 8, 192)    0           batch_normalization_158[0][0]    
__________________________________________________________________________________________________
conv2d_159 (Conv2D)             (None, 8, 8, 192)    258048      activation_158[0][0]             
__________________________________________________________________________________________________
batch_normalization_159 (BatchN (None, 8, 8, 192)    576         conv2d_159[0][0]                 
__________________________________________________________________________________________________
activation_159 (Activation)     (None, 8, 8, 192)    0           batch_normalization_159[0][0]    
__________________________________________________________________________________________________
conv2d_155 (Conv2D)             (None, 8, 8, 192)    147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_160 (Conv2D)             (None, 8, 8, 192)    258048      activation_159[0][0]             
__________________________________________________________________________________________________
batch_normalization_155 (BatchN (None, 8, 8, 192)    576         conv2d_155[0][0]                 
__________________________________________________________________________________________________
batch_normalization_160 (BatchN (None, 8, 8, 192)    576         conv2d_160[0][0]                 
__________________________________________________________________________________________________
activation_155 (Activation)     (None, 8, 8, 192)    0           batch_normalization_155[0][0]    
__________________________________________________________________________________________________
activation_160 (Activation)     (None, 8, 8, 192)    0           batch_normalization_160[0][0]    
__________________________________________________________________________________________________
conv2d_156 (Conv2D)             (None, 8, 8, 192)    258048      activation_155[0][0]             
__________________________________________________________________________________________________
conv2d_161 (Conv2D)             (None, 8, 8, 192)    258048      activation_160[0][0]             
__________________________________________________________________________________________________
batch_normalization_156 (BatchN (None, 8, 8, 192)    576         conv2d_156[0][0]                 
__________________________________________________________________________________________________
batch_normalization_161 (BatchN (None, 8, 8, 192)    576         conv2d_161[0][0]                 
__________________________________________________________________________________________________
activation_156 (Activation)     (None, 8, 8, 192)    0           batch_normalization_156[0][0]    
__________________________________________________________________________________________________
activation_161 (Activation)     (None, 8, 8, 192)    0           batch_normalization_161[0][0]    
__________________________________________________________________________________________________
average_pooling2d_15 (AveragePo (None, 8, 8, 768)    0           mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_154 (Conv2D)             (None, 8, 8, 192)    147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_157 (Conv2D)             (None, 8, 8, 192)    258048      activation_156[0][0]             
__________________________________________________________________________________________________
conv2d_162 (Conv2D)             (None, 8, 8, 192)    258048      activation_161[0][0]             
__________________________________________________________________________________________________
conv2d_163 (Conv2D)             (None, 8, 8, 192)    147456      average_pooling2d_15[0][0]       
__________________________________________________________________________________________________
batch_normalization_154 (BatchN (None, 8, 8, 192)    576         conv2d_154[0][0]                 
__________________________________________________________________________________________________
batch_normalization_157 (BatchN (None, 8, 8, 192)    576         conv2d_157[0][0]                 
__________________________________________________________________________________________________
batch_normalization_162 (BatchN (None, 8, 8, 192)    576         conv2d_162[0][0]                 
__________________________________________________________________________________________________
batch_normalization_163 (BatchN (None, 8, 8, 192)    576         conv2d_163[0][0]                 
__________________________________________________________________________________________________
activation_154 (Activation)     (None, 8, 8, 192)    0           batch_normalization_154[0][0]    
__________________________________________________________________________________________________
activation_157 (Activation)     (None, 8, 8, 192)    0           batch_normalization_157[0][0]    
__________________________________________________________________________________________________
activation_162 (Activation)     (None, 8, 8, 192)    0           batch_normalization_162[0][0]    
__________________________________________________________________________________________________
activation_163 (Activation)     (None, 8, 8, 192)    0           batch_normalization_163[0][0]    
__________________________________________________________________________________________________
mixed7 (Concatenate)            (None, 8, 8, 768)    0           activation_154[0][0]             
                                                                 activation_157[0][0]             
                                                                 activation_162[0][0]             
                                                                 activation_163[0][0]             
__________________________________________________________________________________________________
conv2d_166 (Conv2D)             (None, 8, 8, 192)    147456      mixed7[0][0]                     
__________________________________________________________________________________________________
batch_normalization_166 (BatchN (None, 8, 8, 192)    576         conv2d_166[0][0]                 
__________________________________________________________________________________________________
activation_166 (Activation)     (None, 8, 8, 192)    0           batch_normalization_166[0][0]    
__________________________________________________________________________________________________
conv2d_167 (Conv2D)             (None, 8, 8, 192)    258048      activation_166[0][0]             
__________________________________________________________________________________________________
batch_normalization_167 (BatchN (None, 8, 8, 192)    576         conv2d_167[0][0]                 
__________________________________________________________________________________________________
activation_167 (Activation)     (None, 8, 8, 192)    0           batch_normalization_167[0][0]    
__________________________________________________________________________________________________
conv2d_164 (Conv2D)             (None, 8, 8, 192)    147456      mixed7[0][0]                     
__________________________________________________________________________________________________
conv2d_168 (Conv2D)             (None, 8, 8, 192)    258048      activation_167[0][0]             
__________________________________________________________________________________________________
batch_normalization_164 (BatchN (None, 8, 8, 192)    576         conv2d_164[0][0]                 
__________________________________________________________________________________________________
batch_normalization_168 (BatchN (None, 8, 8, 192)    576         conv2d_168[0][0]                 
__________________________________________________________________________________________________
activation_164 (Activation)     (None, 8, 8, 192)    0           batch_normalization_164[0][0]    
__________________________________________________________________________________________________
activation_168 (Activation)     (None, 8, 8, 192)    0           batch_normalization_168[0][0]    
__________________________________________________________________________________________________
conv2d_165 (Conv2D)             (None, 3, 3, 320)    552960      activation_164[0][0]             
__________________________________________________________________________________________________
conv2d_169 (Conv2D)             (None, 3, 3, 192)    331776      activation_168[0][0]             
__________________________________________________________________________________________________
batch_normalization_165 (BatchN (None, 3, 3, 320)    960         conv2d_165[0][0]                 
__________________________________________________________________________________________________
batch_normalization_169 (BatchN (None, 3, 3, 192)    576         conv2d_169[0][0]                 
__________________________________________________________________________________________________
activation_165 (Activation)     (None, 3, 3, 320)    0           batch_normalization_165[0][0]    
__________________________________________________________________________________________________
activation_169 (Activation)     (None, 3, 3, 192)    0           batch_normalization_169[0][0]    
__________________________________________________________________________________________________
max_pooling2d_7 (MaxPooling2D)  (None, 3, 3, 768)    0           mixed7[0][0]                     
__________________________________________________________________________________________________
mixed8 (Concatenate)            (None, 3, 3, 1280)   0           activation_165[0][0]             
                                                                 activation_169[0][0]             
                                                                 max_pooling2d_7[0][0]            
__________________________________________________________________________________________________
conv2d_174 (Conv2D)             (None, 3, 3, 448)    573440      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_174 (BatchN (None, 3, 3, 448)    1344        conv2d_174[0][0]                 
__________________________________________________________________________________________________
activation_174 (Activation)     (None, 3, 3, 448)    0           batch_normalization_174[0][0]    
__________________________________________________________________________________________________
conv2d_171 (Conv2D)             (None, 3, 3, 384)    491520      mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_175 (Conv2D)             (None, 3, 3, 384)    1548288     activation_174[0][0]             
__________________________________________________________________________________________________
batch_normalization_171 (BatchN (None, 3, 3, 384)    1152        conv2d_171[0][0]                 
__________________________________________________________________________________________________
batch_normalization_175 (BatchN (None, 3, 3, 384)    1152        conv2d_175[0][0]                 
__________________________________________________________________________________________________
activation_171 (Activation)     (None, 3, 3, 384)    0           batch_normalization_171[0][0]    
__________________________________________________________________________________________________
activation_175 (Activation)     (None, 3, 3, 384)    0           batch_normalization_175[0][0]    
__________________________________________________________________________________________________
conv2d_172 (Conv2D)             (None, 3, 3, 384)    442368      activation_171[0][0]             
__________________________________________________________________________________________________
conv2d_173 (Conv2D)             (None, 3, 3, 384)    442368      activation_171[0][0]             
__________________________________________________________________________________________________
conv2d_176 (Conv2D)             (None, 3, 3, 384)    442368      activation_175[0][0]             
__________________________________________________________________________________________________
conv2d_177 (Conv2D)             (None, 3, 3, 384)    442368      activation_175[0][0]             
__________________________________________________________________________________________________
average_pooling2d_16 (AveragePo (None, 3, 3, 1280)   0           mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_170 (Conv2D)             (None, 3, 3, 320)    409600      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_172 (BatchN (None, 3, 3, 384)    1152        conv2d_172[0][0]                 
__________________________________________________________________________________________________
batch_normalization_173 (BatchN (None, 3, 3, 384)    1152        conv2d_173[0][0]                 
__________________________________________________________________________________________________
batch_normalization_176 (BatchN (None, 3, 3, 384)    1152        conv2d_176[0][0]                 
__________________________________________________________________________________________________
batch_normalization_177 (BatchN (None, 3, 3, 384)    1152        conv2d_177[0][0]                 
__________________________________________________________________________________________________
conv2d_178 (Conv2D)             (None, 3, 3, 192)    245760      average_pooling2d_16[0][0]       
__________________________________________________________________________________________________
batch_normalization_170 (BatchN (None, 3, 3, 320)    960         conv2d_170[0][0]                 
__________________________________________________________________________________________________
activation_172 (Activation)     (None, 3, 3, 384)    0           batch_normalization_172[0][0]    
__________________________________________________________________________________________________
activation_173 (Activation)     (None, 3, 3, 384)    0           batch_normalization_173[0][0]    
__________________________________________________________________________________________________
activation_176 (Activation)     (None, 3, 3, 384)    0           batch_normalization_176[0][0]    
__________________________________________________________________________________________________
activation_177 (Activation)     (None, 3, 3, 384)    0           batch_normalization_177[0][0]    
__________________________________________________________________________________________________
batch_normalization_178 (BatchN (None, 3, 3, 192)    576         conv2d_178[0][0]                 
__________________________________________________________________________________________________
activation_170 (Activation)     (None, 3, 3, 320)    0           batch_normalization_170[0][0]    
__________________________________________________________________________________________________
mixed9_0 (Concatenate)          (None, 3, 3, 768)    0           activation_172[0][0]             
                                                                 activation_173[0][0]             
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 3, 3, 768)    0           activation_176[0][0]             
                                                                 activation_177[0][0]             
__________________________________________________________________________________________________
activation_178 (Activation)     (None, 3, 3, 192)    0           batch_normalization_178[0][0]    
__________________________________________________________________________________________________
mixed9 (Concatenate)            (None, 3, 3, 2048)   0           activation_170[0][0]             
                                                                 mixed9_0[0][0]                   
                                                                 concatenate_2[0][0]              
                                                                 activation_178[0][0]             
__________________________________________________________________________________________________
conv2d_183 (Conv2D)             (None, 3, 3, 448)    917504      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_183 (BatchN (None, 3, 3, 448)    1344        conv2d_183[0][0]                 
__________________________________________________________________________________________________
activation_183 (Activation)     (None, 3, 3, 448)    0           batch_normalization_183[0][0]    
__________________________________________________________________________________________________
conv2d_180 (Conv2D)             (None, 3, 3, 384)    786432      mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_184 (Conv2D)             (None, 3, 3, 384)    1548288     activation_183[0][0]             
__________________________________________________________________________________________________
batch_normalization_180 (BatchN (None, 3, 3, 384)    1152        conv2d_180[0][0]                 
__________________________________________________________________________________________________
batch_normalization_184 (BatchN (None, 3, 3, 384)    1152        conv2d_184[0][0]                 
__________________________________________________________________________________________________
activation_180 (Activation)     (None, 3, 3, 384)    0           batch_normalization_180[0][0]    
__________________________________________________________________________________________________
activation_184 (Activation)     (None, 3, 3, 384)    0           batch_normalization_184[0][0]    
__________________________________________________________________________________________________
conv2d_181 (Conv2D)             (None, 3, 3, 384)    442368      activation_180[0][0]             
__________________________________________________________________________________________________
conv2d_182 (Conv2D)             (None, 3, 3, 384)    442368      activation_180[0][0]             
__________________________________________________________________________________________________
conv2d_185 (Conv2D)             (None, 3, 3, 384)    442368      activation_184[0][0]             
__________________________________________________________________________________________________
conv2d_186 (Conv2D)             (None, 3, 3, 384)    442368      activation_184[0][0]             
__________________________________________________________________________________________________
average_pooling2d_17 (AveragePo (None, 3, 3, 2048)   0           mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_179 (Conv2D)             (None, 3, 3, 320)    655360      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_181 (BatchN (None, 3, 3, 384)    1152        conv2d_181[0][0]                 
__________________________________________________________________________________________________
batch_normalization_182 (BatchN (None, 3, 3, 384)    1152        conv2d_182[0][0]                 
__________________________________________________________________________________________________
batch_normalization_185 (BatchN (None, 3, 3, 384)    1152        conv2d_185[0][0]                 
__________________________________________________________________________________________________
batch_normalization_186 (BatchN (None, 3, 3, 384)    1152        conv2d_186[0][0]                 
__________________________________________________________________________________________________
conv2d_187 (Conv2D)             (None, 3, 3, 192)    393216      average_pooling2d_17[0][0]       
__________________________________________________________________________________________________
batch_normalization_179 (BatchN (None, 3, 3, 320)    960         conv2d_179[0][0]                 
__________________________________________________________________________________________________
activation_181 (Activation)     (None, 3, 3, 384)    0           batch_normalization_181[0][0]    
__________________________________________________________________________________________________
activation_182 (Activation)     (None, 3, 3, 384)    0           batch_normalization_182[0][0]    
__________________________________________________________________________________________________
activation_185 (Activation)     (None, 3, 3, 384)    0           batch_normalization_185[0][0]    
__________________________________________________________________________________________________
activation_186 (Activation)     (None, 3, 3, 384)    0           batch_normalization_186[0][0]    
__________________________________________________________________________________________________
batch_normalization_187 (BatchN (None, 3, 3, 192)    576         conv2d_187[0][0]                 
__________________________________________________________________________________________________
activation_179 (Activation)     (None, 3, 3, 320)    0           batch_normalization_179[0][0]    
__________________________________________________________________________________________________
mixed9_1 (Concatenate)          (None, 3, 3, 768)    0           activation_181[0][0]             
                                                                 activation_182[0][0]             
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 3, 3, 768)    0           activation_185[0][0]             
                                                                 activation_186[0][0]             
__________________________________________________________________________________________________
activation_187 (Activation)     (None, 3, 3, 192)    0           batch_normalization_187[0][0]    
__________________________________________________________________________________________________
mixed10 (Concatenate)           (None, 3, 3, 2048)   0           activation_179[0][0]             
                                                                 mixed9_1[0][0]                   
                                                                 concatenate_3[0][0]              
                                                                 activation_187[0][0]             
==================================================================================================
Total params: 21,802,784
Trainable params: 0
Non-trainable params: 21,802,784
__________________________________________________________________________________________________
None

Add The End Layers
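
These layers get applied to the features coming out of the frozen base model. The feature_batch below is assumed to have been created earlier in the post by pushing one batch of images through the base model; a minimal sketch of that step, with image_batch standing in for a single batch from the training pipeline:

image_batch, label_batch = next(iter(train_batches))
feature_batch = base_model(image_batch)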

global_average_layer = tensorflow.keras.layers.GlobalAveragePooling2D()
feature_batch_average = global_average_layer(feature_batch)
print(feature_batch_average.shape)
(32, 2048)
prediction_layer = tensorflow.keras.layers.Dense(1)
prediction_batch = prediction_layer(feature_batch_average)
print(prediction_batch.shape)
(32, 1)

The Model

model = tensorflow.keras.Sequential([
    base_model,
    global_average_layer,
    prediction_layer,
])

Compile It

base_learning_rate = 0.0001
model.compile(optimizer=tensorflow.keras.optimizers.RMSprop(lr=base_learning_rate),
              loss="binary_crossentropy",
              metrics=["accuracy"])
print(model.summary())
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
inception_v3 (Model)         (None, 3, 3, 2048)        21802784  
_________________________________________________________________
global_average_pooling2d_2 ( (None, 2048)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 2049      
=================================================================
Total params: 21,804,833
Trainable params: 2,049
Non-trainable params: 21,802,784
_________________________________________________________________
None
print(len(model.trainable_variables))
2

The two trainable variables are the weights and biases of the final Dense layer.
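
As a quick check you can list them (nothing here is specific to this model; it just prints each trainable variable):

for variable in model.trainable_variables:
    print(variable.name, variable.shape)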

Train the Model

# the dataset was split into train/validation/test using SPLIT_WEIGHTS
# (defined earlier in the post), expressed in tenths of the original train split
number_train, number_val, number_test = (
    metadata.splits["train"].num_examples * weight / 10
    for weight in SPLIT_WEIGHTS
)
epochs = 10
steps_per_epoch = round(number_train) // BATCH_SIZE
validation_steps = 20

First get a baseline by evaluating the untrained classifier on the validation set.

loss, accuracy = model.evaluate(validation_batches, steps=validation_steps)
print(f"Starting Loss: {loss:.2f}")
print(f"Starting Accuracy: {accuracy:.2f}")

20/20 [==============================] - 3s 154ms/step - loss: 5.4252 - accuracy: 0.6047
Starting Loss: 5.43
Starting Accuracy: 0.60
history = model.fit(train_batches, 
                    epochs=epochs, 
                    validation_data=validation_batches,
                    verbose=2,
)
Epoch 1/10
582/582 - 73s - loss: 1.9513 - accuracy: 0.6666 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 2/10
582/582 - 70s - loss: 1.2932 - accuracy: 0.7352 - val_loss: 0.9332 - val_accuracy: 0.9164
Epoch 3/10
582/582 - 70s - loss: 1.0500 - accuracy: 0.7730 - val_loss: 0.9082 - val_accuracy: 0.9241
Epoch 4/10
582/582 - 67s - loss: 0.9344 - accuracy: 0.7945 - val_loss: 0.7346 - val_accuracy: 0.9358
Epoch 5/10
582/582 - 70s - loss: 0.8509 - accuracy: 0.8076 - val_loss: 0.7172 - val_accuracy: 0.9375
Epoch 6/10
582/582 - 69s - loss: 0.7973 - accuracy: 0.8178 - val_loss: 0.6902 - val_accuracy: 0.9414
Epoch 7/10
582/582 - 70s - loss: 0.7563 - accuracy: 0.8268 - val_loss: 0.6368 - val_accuracy: 0.9457
Epoch 8/10
582/582 - 69s - loss: 0.7063 - accuracy: 0.8374 - val_loss: 0.6246 - val_accuracy: 0.9453
Epoch 9/10
582/582 - 69s - loss: 0.6833 - accuracy: 0.8444 - val_loss: 0.5738 - val_accuracy: 0.9474
Epoch 10/10
582/582 - 66s - loss: 0.6503 - accuracy: 0.8490 - val_loss: 0.5956 - val_accuracy: 0.9483
Save the best model as it trains with a ModelCheckpoint callback. Note the bug here: monitor="val_acc" is the metric name from older versions of Keras, but this version reports it as "val_accuracy", so, as the warnings in the output show, the callback never actually saves anything.

MODELS = Path("~/models/dogs-vs-cats/").expanduser()
checkpoint = tensorflow.keras.callbacks.ModelCheckpoint(
    str(MODELS/"mobilenet_transfer.hdf5"), monitor="val_acc", verbose=1,
    save_best_only=True)
model.fit(train_batches, 
          epochs=epochs, 
          validation_data=validation_batches, 
          verbose=2, 
          callbacks=[checkpoint])
Epoch 1/10
W0804 20:23:10.072212 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 64s - loss: 0.6035 - accuracy: 0.8511 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 2/10
W0804 20:24:19.531489 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 69s - loss: 0.5911 - accuracy: 0.8553 - val_loss: 0.5240 - val_accuracy: 0.9522
Epoch 3/10
W0804 20:25:24.658446 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 65s - loss: 0.5786 - accuracy: 0.8586 - val_loss: 0.5165 - val_accuracy: 0.9526
Epoch 4/10
W0804 20:26:31.952232 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 67s - loss: 0.5733 - accuracy: 0.8615 - val_loss: 0.5058 - val_accuracy: 0.9504
Epoch 5/10
W0804 20:27:38.677954 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 67s - loss: 0.5645 - accuracy: 0.8622 - val_loss: 0.5139 - val_accuracy: 0.9509
Epoch 6/10
W0804 20:28:34.189206 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 55s - loss: 0.5542 - accuracy: 0.8645 - val_loss: 0.5474 - val_accuracy: 0.9517
Epoch 7/10
W0804 20:29:28.300311 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 54s - loss: 0.5419 - accuracy: 0.8652 - val_loss: 0.5432 - val_accuracy: 0.9517
Epoch 8/10
W0804 20:30:23.016782 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 55s - loss: 0.5280 - accuracy: 0.8680 - val_loss: 0.5235 - val_accuracy: 0.9517
Epoch 9/10
W0804 20:31:20.244823 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 57s - loss: 0.5184 - accuracy: 0.8680 - val_loss: 0.5358 - val_accuracy: 0.9513
Epoch 10/10
W0804 20:32:22.545533 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 62s - loss: 0.5123 - accuracy: 0.8700 - val_loss: 0.5312 - val_accuracy: 0.9526
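
Since the callback never fires, here is a corrected version for reference, changing only the metric name to the one that actually appears in the logs (the run below still uses the original, as its warnings show):

checkpoint = tensorflow.keras.callbacks.ModelCheckpoint(
    str(MODELS/"mobilenet_transfer.hdf5"), monitor="val_accuracy", verbose=1,
    save_best_only=True)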
model.fit(train_batches, 
          epochs=epochs, 
          validation_data=validation_batches, 
          verbose=2, 
          callbacks=[checkpoint])
Epoch 1/10
W0804 21:21:53.728024 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 55s - loss: 0.5080 - accuracy: 0.8709 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 2/10
W0804 21:22:51.354294 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 58s - loss: 0.5037 - accuracy: 0.8726 - val_loss: 0.5471 - val_accuracy: 0.9522
Epoch 3/10
W0804 21:23:49.173404 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 58s - loss: 0.4991 - accuracy: 0.8744 - val_loss: 0.5406 - val_accuracy: 0.9530
Epoch 4/10
W0804 21:24:43.631767 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 54s - loss: 0.4928 - accuracy: 0.8748 - val_loss: 0.5449 - val_accuracy: 0.9522
Epoch 5/10
W0804 21:25:44.675879 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 61s - loss: 0.4900 - accuracy: 0.8764 - val_loss: 0.5528 - val_accuracy: 0.9517
Epoch 6/10
W0804 21:26:56.251078 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 72s - loss: 0.4834 - accuracy: 0.8774 - val_loss: 0.5698 - val_accuracy: 0.9496
Epoch 7/10
W0804 21:28:04.352257 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 68s - loss: 0.4778 - accuracy: 0.8792 - val_loss: 0.5532 - val_accuracy: 0.9491
Epoch 8/10
W0804 21:29:14.626469 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 70s - loss: 0.4751 - accuracy: 0.8803 - val_loss: 0.5537 - val_accuracy: 0.9500
Epoch 9/10
W0804 21:30:28.570089 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 74s - loss: 0.4721 - accuracy: 0.8796 - val_loss: 0.5479 - val_accuracy: 0.9496
Epoch 10/10
W0804 21:31:40.674842 140458756540224 callbacks.py:986] Can save best model only with val_acc available, skipping.
582/582 - 72s - loss: 0.4718 - accuracy: 0.8799 - val_loss: 0.5544 - val_accuracy: 0.9496
data = pandas.DataFrame(model.history.history)
print(data.val_accuracy.max())
0.9530172348022461

So it looks like we max out at around 95% validation accuracy, not much better than our five-layer model, but we reached it much faster. Note that model.history only holds the history of the most recent fit call, so this is the maximum over the last ten epochs.
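
If you want to see which epoch produced the best score, idxmax on the data frame gives its index (a small aside using only the frame built above):

best = data.val_accuracy.idxmax()
print(f"Best epoch: {best + 1}, validation accuracy: {data.val_accuracy[best]:.3f}")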

plot = data.hvplot().opts(
    title="Transfer Learning Performance",
    height=800,
    width=1000
)
Embed(plot=plot, file_name="performance_1")()

Figure Missing

End

Dogs Vs Cats With Transfer Learning

Beginning

Imports

Python

from pathlib import Path

PyPi

from tensorflow.keras import layers
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import tensorflow

My Stuff

from graeae import SubPathLoader, EmbedHoloviews, Timer

Set Up

The Timer

TIMER = Timer()

Tensorflow

Eager execution is the default in tensorflow 2.0, but this is a 1.x install, so it has to be switched on explicitly.

tensorflow.enable_eager_execution()

Paths

MODELS = Path("~/models/dogs-vs-cats/").expanduser()

The Datasets

environment = SubPathLoader("DATASETS")
base_path = Path(environment["DOGS_VS_CATS"]).expanduser()
for item in base_path.iterdir():
    print(item)
WARNING: Logging before flag parsing goes to stderr.
I0803 17:35:32.889567 139918777980736 environment.py:35] Environment Path: /home/athena/.env
I0803 17:35:32.890873 139918777980736 environment.py:90] Environment Path: /home/athena/.config/datasets/env
/home/athena/data/datasets/images/dogs-vs-cats/train
/home/athena/data/datasets/images/dogs-vs-cats/exercise
/home/athena/data/datasets/images/dogs-vs-cats/test1
training_path = base_path/"train"
testing_path = base_path/"test1"
for item in training_path.iterdir():
    print(item)
/home/athena/data/datasets/images/dogs-vs-cats/train/dogs
/home/athena/data/datasets/images/dogs-vs-cats/train/cats

Note: The download from Kaggle has all the training files in one folder, with each file name labeled as either cat or dog. I made the subfolders and moved the files to make it work better with the ImageDataGenerator (see the sketch below).
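
A minimal sketch of that reorganization, assuming the Kaggle naming convention of cat.0.jpg and dog.0.jpg for the file names (the paths match my setup; adjust as needed):

from pathlib import Path
import shutil

training_path = Path("~/data/datasets/images/dogs-vs-cats/train").expanduser()
# create the cats and dogs subfolders, then move each file based on its prefix
for label in ("cats", "dogs"):
    (training_path/label).mkdir(exist_ok=True)
for image in training_path.glob("*.jpg"):
    label = "cats" if image.name.startswith("cat") else "dogs"
    shutil.move(str(image), str(training_path/label/image.name))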

Middle

The Pre-built Model

The Broken Version

Note: This doesn't work; there's a bug (related to this one, I think). You need to pass a tags={"train"} argument to one of the methods called by the KerasLayer, but the KerasLayer isn't defined in a way that lets it be passed through. Oops…

This is going to use the Keras version of tensorflow hub and the tensorflow 1.x model (the URL determines which model we're using); there is a different one for tensorflow 2.0. First create a feature-extraction layer from the pre-built model (hub's KerasLayer is untrainable by default).

import tensorflow_hub

model_url = "https://tfhub.dev/google/imagenet/inception_v3/feature_vector/3"
input_shape = (150, 150, 3)
feature_extraction_layer = tensorflow_hub.KerasLayer(model_url,
                                                     input_shape=input_shape)

Take Two

This uses the InceptionV3 pre-trained model, which by default comes with weights trained on ImageNet. The example on Coursera uses weights downloaded from the web, but I'll try the defaults instead. Setting include_top=False drops the fully-connected classifier at the top of the network so that we can attach our own.

input_shape = (150, 150, 3)
base_model = InceptionV3(
    input_shape=input_shape,
    include_top=False,
)

Actually, looking at the URLs, it looks like the example from the course downloads the same weights, just from a different place.
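
For comparison, here's a sketch of that approach as an alternative to the cell above: instantiate the model without weights and load the downloaded file by hand (weights_path is a placeholder for wherever the .h5 file was saved):

# hypothetical: weights_path points at the locally downloaded notop weights file
base_model = InceptionV3(input_shape=input_shape,
                         include_top=False,
                         weights=None)
base_model.load_weights(weights_path)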

# freeze every layer so training only updates the new head
for layer in base_model.layers:
    layer.trainable = False
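
As a quick sanity check - a sketch, assuming the loop above has run - a fully frozen model should report zero trainable weights:

print(f"trainable weights: {len(base_model.trainable_weights)}")
print(f"non-trainable weights: {len(base_model.non_trainable_weights)}")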

Freeze the Model

base_model.trainable = False
print(base_model.summary())
Model: "inception_v3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_2 (InputLayer)            [(None, 150, 150, 3) 0                                            
__________________________________________________________________________________________________
conv2d_94 (Conv2D)              (None, 74, 74, 32)   864         input_2[0][0]                    
__________________________________________________________________________________________________
batch_normalization_94 (BatchNo (None, 74, 74, 32)   96          conv2d_94[0][0]                  
__________________________________________________________________________________________________
activation_94 (Activation)      (None, 74, 74, 32)   0           batch_normalization_94[0][0]     
__________________________________________________________________________________________________
conv2d_95 (Conv2D)              (None, 72, 72, 32)   9216        activation_94[0][0]              
__________________________________________________________________________________________________
batch_normalization_95 (BatchNo (None, 72, 72, 32)   96          conv2d_95[0][0]                  
__________________________________________________________________________________________________
activation_95 (Activation)      (None, 72, 72, 32)   0           batch_normalization_95[0][0]     
__________________________________________________________________________________________________
conv2d_96 (Conv2D)              (None, 72, 72, 64)   18432       activation_95[0][0]              
__________________________________________________________________________________________________
batch_normalization_96 (BatchNo (None, 72, 72, 64)   192         conv2d_96[0][0]                  
__________________________________________________________________________________________________
activation_96 (Activation)      (None, 72, 72, 64)   0           batch_normalization_96[0][0]     
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, 35, 35, 64)   0           activation_96[0][0]              
__________________________________________________________________________________________________
conv2d_97 (Conv2D)              (None, 35, 35, 80)   5120        max_pooling2d_4[0][0]            
__________________________________________________________________________________________________
batch_normalization_97 (BatchNo (None, 35, 35, 80)   240         conv2d_97[0][0]                  
__________________________________________________________________________________________________
activation_97 (Activation)      (None, 35, 35, 80)   0           batch_normalization_97[0][0]     
__________________________________________________________________________________________________
conv2d_98 (Conv2D)              (None, 33, 33, 192)  138240      activation_97[0][0]              
__________________________________________________________________________________________________
batch_normalization_98 (BatchNo (None, 33, 33, 192)  576         conv2d_98[0][0]                  
__________________________________________________________________________________________________
activation_98 (Activation)      (None, 33, 33, 192)  0           batch_normalization_98[0][0]     
__________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D)  (None, 16, 16, 192)  0           activation_98[0][0]              
__________________________________________________________________________________________________
conv2d_102 (Conv2D)             (None, 16, 16, 64)   12288       max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
batch_normalization_102 (BatchN (None, 16, 16, 64)   192         conv2d_102[0][0]                 
__________________________________________________________________________________________________
activation_102 (Activation)     (None, 16, 16, 64)   0           batch_normalization_102[0][0]    
__________________________________________________________________________________________________
conv2d_100 (Conv2D)             (None, 16, 16, 48)   9216        max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
conv2d_103 (Conv2D)             (None, 16, 16, 96)   55296       activation_102[0][0]             
__________________________________________________________________________________________________
batch_normalization_100 (BatchN (None, 16, 16, 48)   144         conv2d_100[0][0]                 
__________________________________________________________________________________________________
batch_normalization_103 (BatchN (None, 16, 16, 96)   288         conv2d_103[0][0]                 
__________________________________________________________________________________________________
activation_100 (Activation)     (None, 16, 16, 48)   0           batch_normalization_100[0][0]    
__________________________________________________________________________________________________
activation_103 (Activation)     (None, 16, 16, 96)   0           batch_normalization_103[0][0]    
__________________________________________________________________________________________________
average_pooling2d_9 (AveragePoo (None, 16, 16, 192)  0           max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
conv2d_99 (Conv2D)              (None, 16, 16, 64)   12288       max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
conv2d_101 (Conv2D)             (None, 16, 16, 64)   76800       activation_100[0][0]             
__________________________________________________________________________________________________
conv2d_104 (Conv2D)             (None, 16, 16, 96)   82944       activation_103[0][0]             
__________________________________________________________________________________________________
conv2d_105 (Conv2D)             (None, 16, 16, 32)   6144        average_pooling2d_9[0][0]        
__________________________________________________________________________________________________
batch_normalization_99 (BatchNo (None, 16, 16, 64)   192         conv2d_99[0][0]                  
__________________________________________________________________________________________________
batch_normalization_101 (BatchN (None, 16, 16, 64)   192         conv2d_101[0][0]                 
__________________________________________________________________________________________________
batch_normalization_104 (BatchN (None, 16, 16, 96)   288         conv2d_104[0][0]                 
__________________________________________________________________________________________________
batch_normalization_105 (BatchN (None, 16, 16, 32)   96          conv2d_105[0][0]                 
__________________________________________________________________________________________________
activation_99 (Activation)      (None, 16, 16, 64)   0           batch_normalization_99[0][0]     
__________________________________________________________________________________________________
activation_101 (Activation)     (None, 16, 16, 64)   0           batch_normalization_101[0][0]    
__________________________________________________________________________________________________
activation_104 (Activation)     (None, 16, 16, 96)   0           batch_normalization_104[0][0]    
__________________________________________________________________________________________________
activation_105 (Activation)     (None, 16, 16, 32)   0           batch_normalization_105[0][0]    
__________________________________________________________________________________________________
mixed0 (Concatenate)            (None, 16, 16, 256)  0           activation_99[0][0]              
                                                                 activation_101[0][0]             
                                                                 activation_104[0][0]             
                                                                 activation_105[0][0]             
__________________________________________________________________________________________________
conv2d_109 (Conv2D)             (None, 16, 16, 64)   16384       mixed0[0][0]                     
__________________________________________________________________________________________________
batch_normalization_109 (BatchN (None, 16, 16, 64)   192         conv2d_109[0][0]                 
__________________________________________________________________________________________________
activation_109 (Activation)     (None, 16, 16, 64)   0           batch_normalization_109[0][0]    
__________________________________________________________________________________________________
conv2d_107 (Conv2D)             (None, 16, 16, 48)   12288       mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_110 (Conv2D)             (None, 16, 16, 96)   55296       activation_109[0][0]             
__________________________________________________________________________________________________
batch_normalization_107 (BatchN (None, 16, 16, 48)   144         conv2d_107[0][0]                 
__________________________________________________________________________________________________
batch_normalization_110 (BatchN (None, 16, 16, 96)   288         conv2d_110[0][0]                 
__________________________________________________________________________________________________
activation_107 (Activation)     (None, 16, 16, 48)   0           batch_normalization_107[0][0]    
__________________________________________________________________________________________________
activation_110 (Activation)     (None, 16, 16, 96)   0           batch_normalization_110[0][0]    
__________________________________________________________________________________________________
average_pooling2d_10 (AveragePo (None, 16, 16, 256)  0           mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_106 (Conv2D)             (None, 16, 16, 64)   16384       mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_108 (Conv2D)             (None, 16, 16, 64)   76800       activation_107[0][0]             
__________________________________________________________________________________________________
conv2d_111 (Conv2D)             (None, 16, 16, 96)   82944       activation_110[0][0]             
__________________________________________________________________________________________________
conv2d_112 (Conv2D)             (None, 16, 16, 64)   16384       average_pooling2d_10[0][0]       
__________________________________________________________________________________________________
batch_normalization_106 (BatchN (None, 16, 16, 64)   192         conv2d_106[0][0]                 
__________________________________________________________________________________________________
batch_normalization_108 (BatchN (None, 16, 16, 64)   192         conv2d_108[0][0]                 
__________________________________________________________________________________________________
batch_normalization_111 (BatchN (None, 16, 16, 96)   288         conv2d_111[0][0]                 
__________________________________________________________________________________________________
batch_normalization_112 (BatchN (None, 16, 16, 64)   192         conv2d_112[0][0]                 
__________________________________________________________________________________________________
activation_106 (Activation)     (None, 16, 16, 64)   0           batch_normalization_106[0][0]    
__________________________________________________________________________________________________
activation_108 (Activation)     (None, 16, 16, 64)   0           batch_normalization_108[0][0]    
__________________________________________________________________________________________________
activation_111 (Activation)     (None, 16, 16, 96)   0           batch_normalization_111[0][0]    
__________________________________________________________________________________________________
activation_112 (Activation)     (None, 16, 16, 64)   0           batch_normalization_112[0][0]    
__________________________________________________________________________________________________
mixed1 (Concatenate)            (None, 16, 16, 288)  0           activation_106[0][0]             
                                                                 activation_108[0][0]             
                                                                 activation_111[0][0]             
                                                                 activation_112[0][0]             
__________________________________________________________________________________________________
conv2d_116 (Conv2D)             (None, 16, 16, 64)   18432       mixed1[0][0]                     
__________________________________________________________________________________________________
batch_normalization_116 (BatchN (None, 16, 16, 64)   192         conv2d_116[0][0]                 
__________________________________________________________________________________________________
activation_116 (Activation)     (None, 16, 16, 64)   0           batch_normalization_116[0][0]    
__________________________________________________________________________________________________
conv2d_114 (Conv2D)             (None, 16, 16, 48)   13824       mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_117 (Conv2D)             (None, 16, 16, 96)   55296       activation_116[0][0]             
__________________________________________________________________________________________________
batch_normalization_114 (BatchN (None, 16, 16, 48)   144         conv2d_114[0][0]                 
__________________________________________________________________________________________________
batch_normalization_117 (BatchN (None, 16, 16, 96)   288         conv2d_117[0][0]                 
__________________________________________________________________________________________________
activation_114 (Activation)     (None, 16, 16, 48)   0           batch_normalization_114[0][0]    
__________________________________________________________________________________________________
activation_117 (Activation)     (None, 16, 16, 96)   0           batch_normalization_117[0][0]    
__________________________________________________________________________________________________
average_pooling2d_11 (AveragePo (None, 16, 16, 288)  0           mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_113 (Conv2D)             (None, 16, 16, 64)   18432       mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_115 (Conv2D)             (None, 16, 16, 64)   76800       activation_114[0][0]             
__________________________________________________________________________________________________
conv2d_118 (Conv2D)             (None, 16, 16, 96)   82944       activation_117[0][0]             
__________________________________________________________________________________________________
conv2d_119 (Conv2D)             (None, 16, 16, 64)   18432       average_pooling2d_11[0][0]       
__________________________________________________________________________________________________
batch_normalization_113 (BatchN (None, 16, 16, 64)   192         conv2d_113[0][0]                 
__________________________________________________________________________________________________
batch_normalization_115 (BatchN (None, 16, 16, 64)   192         conv2d_115[0][0]                 
__________________________________________________________________________________________________
batch_normalization_118 (BatchN (None, 16, 16, 96)   288         conv2d_118[0][0]                 
__________________________________________________________________________________________________
batch_normalization_119 (BatchN (None, 16, 16, 64)   192         conv2d_119[0][0]                 
__________________________________________________________________________________________________
activation_113 (Activation)     (None, 16, 16, 64)   0           batch_normalization_113[0][0]    
__________________________________________________________________________________________________
activation_115 (Activation)     (None, 16, 16, 64)   0           batch_normalization_115[0][0]    
__________________________________________________________________________________________________
activation_118 (Activation)     (None, 16, 16, 96)   0           batch_normalization_118[0][0]    
__________________________________________________________________________________________________
activation_119 (Activation)     (None, 16, 16, 64)   0           batch_normalization_119[0][0]    
__________________________________________________________________________________________________
mixed2 (Concatenate)            (None, 16, 16, 288)  0           activation_113[0][0]             
                                                                 activation_115[0][0]             
                                                                 activation_118[0][0]             
                                                                 activation_119[0][0]             
__________________________________________________________________________________________________
conv2d_121 (Conv2D)             (None, 16, 16, 64)   18432       mixed2[0][0]                     
__________________________________________________________________________________________________
batch_normalization_121 (BatchN (None, 16, 16, 64)   192         conv2d_121[0][0]                 
__________________________________________________________________________________________________
activation_121 (Activation)     (None, 16, 16, 64)   0           batch_normalization_121[0][0]    
__________________________________________________________________________________________________
conv2d_122 (Conv2D)             (None, 16, 16, 96)   55296       activation_121[0][0]             
__________________________________________________________________________________________________
batch_normalization_122 (BatchN (None, 16, 16, 96)   288         conv2d_122[0][0]                 
__________________________________________________________________________________________________
activation_122 (Activation)     (None, 16, 16, 96)   0           batch_normalization_122[0][0]    
__________________________________________________________________________________________________
conv2d_120 (Conv2D)             (None, 7, 7, 384)    995328      mixed2[0][0]                     
__________________________________________________________________________________________________
conv2d_123 (Conv2D)             (None, 7, 7, 96)     82944       activation_122[0][0]             
__________________________________________________________________________________________________
batch_normalization_120 (BatchN (None, 7, 7, 384)    1152        conv2d_120[0][0]                 
__________________________________________________________________________________________________
batch_normalization_123 (BatchN (None, 7, 7, 96)     288         conv2d_123[0][0]                 
__________________________________________________________________________________________________
activation_120 (Activation)     (None, 7, 7, 384)    0           batch_normalization_120[0][0]    
__________________________________________________________________________________________________
activation_123 (Activation)     (None, 7, 7, 96)     0           batch_normalization_123[0][0]    
__________________________________________________________________________________________________
max_pooling2d_6 (MaxPooling2D)  (None, 7, 7, 288)    0           mixed2[0][0]                     
__________________________________________________________________________________________________
mixed3 (Concatenate)            (None, 7, 7, 768)    0           activation_120[0][0]             
                                                                 activation_123[0][0]             
                                                                 max_pooling2d_6[0][0]            
__________________________________________________________________________________________________
conv2d_128 (Conv2D)             (None, 7, 7, 128)    98304       mixed3[0][0]                     
__________________________________________________________________________________________________
batch_normalization_128 (BatchN (None, 7, 7, 128)    384         conv2d_128[0][0]                 
__________________________________________________________________________________________________
activation_128 (Activation)     (None, 7, 7, 128)    0           batch_normalization_128[0][0]    
__________________________________________________________________________________________________
conv2d_129 (Conv2D)             (None, 7, 7, 128)    114688      activation_128[0][0]             
__________________________________________________________________________________________________
batch_normalization_129 (BatchN (None, 7, 7, 128)    384         conv2d_129[0][0]                 
__________________________________________________________________________________________________
activation_129 (Activation)     (None, 7, 7, 128)    0           batch_normalization_129[0][0]    
__________________________________________________________________________________________________
conv2d_125 (Conv2D)             (None, 7, 7, 128)    98304       mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_130 (Conv2D)             (None, 7, 7, 128)    114688      activation_129[0][0]             
__________________________________________________________________________________________________
batch_normalization_125 (BatchN (None, 7, 7, 128)    384         conv2d_125[0][0]                 
__________________________________________________________________________________________________
batch_normalization_130 (BatchN (None, 7, 7, 128)    384         conv2d_130[0][0]                 
__________________________________________________________________________________________________
activation_125 (Activation)     (None, 7, 7, 128)    0           batch_normalization_125[0][0]    
__________________________________________________________________________________________________
activation_130 (Activation)     (None, 7, 7, 128)    0           batch_normalization_130[0][0]    
__________________________________________________________________________________________________
conv2d_126 (Conv2D)             (None, 7, 7, 128)    114688      activation_125[0][0]             
__________________________________________________________________________________________________
conv2d_131 (Conv2D)             (None, 7, 7, 128)    114688      activation_130[0][0]             
__________________________________________________________________________________________________
batch_normalization_126 (BatchN (None, 7, 7, 128)    384         conv2d_126[0][0]                 
__________________________________________________________________________________________________
batch_normalization_131 (BatchN (None, 7, 7, 128)    384         conv2d_131[0][0]                 
__________________________________________________________________________________________________
activation_126 (Activation)     (None, 7, 7, 128)    0           batch_normalization_126[0][0]    
__________________________________________________________________________________________________
activation_131 (Activation)     (None, 7, 7, 128)    0           batch_normalization_131[0][0]    
__________________________________________________________________________________________________
average_pooling2d_12 (AveragePo (None, 7, 7, 768)    0           mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_124 (Conv2D)             (None, 7, 7, 192)    147456      mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_127 (Conv2D)             (None, 7, 7, 192)    172032      activation_126[0][0]             
__________________________________________________________________________________________________
conv2d_132 (Conv2D)             (None, 7, 7, 192)    172032      activation_131[0][0]             
__________________________________________________________________________________________________
conv2d_133 (Conv2D)             (None, 7, 7, 192)    147456      average_pooling2d_12[0][0]       
__________________________________________________________________________________________________
batch_normalization_124 (BatchN (None, 7, 7, 192)    576         conv2d_124[0][0]                 
__________________________________________________________________________________________________
batch_normalization_127 (BatchN (None, 7, 7, 192)    576         conv2d_127[0][0]                 
__________________________________________________________________________________________________
batch_normalization_132 (BatchN (None, 7, 7, 192)    576         conv2d_132[0][0]                 
__________________________________________________________________________________________________
batch_normalization_133 (BatchN (None, 7, 7, 192)    576         conv2d_133[0][0]                 
__________________________________________________________________________________________________
activation_124 (Activation)     (None, 7, 7, 192)    0           batch_normalization_124[0][0]    
__________________________________________________________________________________________________
activation_127 (Activation)     (None, 7, 7, 192)    0           batch_normalization_127[0][0]    
__________________________________________________________________________________________________
activation_132 (Activation)     (None, 7, 7, 192)    0           batch_normalization_132[0][0]    
__________________________________________________________________________________________________
activation_133 (Activation)     (None, 7, 7, 192)    0           batch_normalization_133[0][0]    
__________________________________________________________________________________________________
mixed4 (Concatenate)            (None, 7, 7, 768)    0           activation_124[0][0]             
                                                                 activation_127[0][0]             
                                                                 activation_132[0][0]             
                                                                 activation_133[0][0]             
__________________________________________________________________________________________________
conv2d_138 (Conv2D)             (None, 7, 7, 160)    122880      mixed4[0][0]                     
__________________________________________________________________________________________________
batch_normalization_138 (BatchN (None, 7, 7, 160)    480         conv2d_138[0][0]                 
__________________________________________________________________________________________________
activation_138 (Activation)     (None, 7, 7, 160)    0           batch_normalization_138[0][0]    
__________________________________________________________________________________________________
conv2d_139 (Conv2D)             (None, 7, 7, 160)    179200      activation_138[0][0]             
__________________________________________________________________________________________________
batch_normalization_139 (BatchN (None, 7, 7, 160)    480         conv2d_139[0][0]                 
__________________________________________________________________________________________________
activation_139 (Activation)     (None, 7, 7, 160)    0           batch_normalization_139[0][0]    
__________________________________________________________________________________________________
conv2d_135 (Conv2D)             (None, 7, 7, 160)    122880      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_140 (Conv2D)             (None, 7, 7, 160)    179200      activation_139[0][0]             
__________________________________________________________________________________________________
batch_normalization_135 (BatchN (None, 7, 7, 160)    480         conv2d_135[0][0]                 
__________________________________________________________________________________________________
batch_normalization_140 (BatchN (None, 7, 7, 160)    480         conv2d_140[0][0]                 
__________________________________________________________________________________________________
activation_135 (Activation)     (None, 7, 7, 160)    0           batch_normalization_135[0][0]    
__________________________________________________________________________________________________
activation_140 (Activation)     (None, 7, 7, 160)    0           batch_normalization_140[0][0]    
__________________________________________________________________________________________________
conv2d_136 (Conv2D)             (None, 7, 7, 160)    179200      activation_135[0][0]             
__________________________________________________________________________________________________
conv2d_141 (Conv2D)             (None, 7, 7, 160)    179200      activation_140[0][0]             
__________________________________________________________________________________________________
batch_normalization_136 (BatchN (None, 7, 7, 160)    480         conv2d_136[0][0]                 
__________________________________________________________________________________________________
batch_normalization_141 (BatchN (None, 7, 7, 160)    480         conv2d_141[0][0]                 
__________________________________________________________________________________________________
activation_136 (Activation)     (None, 7, 7, 160)    0           batch_normalization_136[0][0]    
__________________________________________________________________________________________________
activation_141 (Activation)     (None, 7, 7, 160)    0           batch_normalization_141[0][0]    
__________________________________________________________________________________________________
average_pooling2d_13 (AveragePo (None, 7, 7, 768)    0           mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_134 (Conv2D)             (None, 7, 7, 192)    147456      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_137 (Conv2D)             (None, 7, 7, 192)    215040      activation_136[0][0]             
__________________________________________________________________________________________________
conv2d_142 (Conv2D)             (None, 7, 7, 192)    215040      activation_141[0][0]             
__________________________________________________________________________________________________
conv2d_143 (Conv2D)             (None, 7, 7, 192)    147456      average_pooling2d_13[0][0]       
__________________________________________________________________________________________________
batch_normalization_134 (BatchN (None, 7, 7, 192)    576         conv2d_134[0][0]                 
__________________________________________________________________________________________________
batch_normalization_137 (BatchN (None, 7, 7, 192)    576         conv2d_137[0][0]                 
__________________________________________________________________________________________________
batch_normalization_142 (BatchN (None, 7, 7, 192)    576         conv2d_142[0][0]                 
__________________________________________________________________________________________________
batch_normalization_143 (BatchN (None, 7, 7, 192)    576         conv2d_143[0][0]                 
__________________________________________________________________________________________________
activation_134 (Activation)     (None, 7, 7, 192)    0           batch_normalization_134[0][0]    
__________________________________________________________________________________________________
activation_137 (Activation)     (None, 7, 7, 192)    0           batch_normalization_137[0][0]    
__________________________________________________________________________________________________
activation_142 (Activation)     (None, 7, 7, 192)    0           batch_normalization_142[0][0]    
__________________________________________________________________________________________________
activation_143 (Activation)     (None, 7, 7, 192)    0           batch_normalization_143[0][0]    
__________________________________________________________________________________________________
mixed5 (Concatenate)            (None, 7, 7, 768)    0           activation_134[0][0]             
                                                                 activation_137[0][0]             
                                                                 activation_142[0][0]             
                                                                 activation_143[0][0]             
__________________________________________________________________________________________________
conv2d_148 (Conv2D)             (None, 7, 7, 160)    122880      mixed5[0][0]                     
__________________________________________________________________________________________________
batch_normalization_148 (BatchN (None, 7, 7, 160)    480         conv2d_148[0][0]                 
__________________________________________________________________________________________________
activation_148 (Activation)     (None, 7, 7, 160)    0           batch_normalization_148[0][0]    
__________________________________________________________________________________________________
conv2d_149 (Conv2D)             (None, 7, 7, 160)    179200      activation_148[0][0]             
__________________________________________________________________________________________________
batch_normalization_149 (BatchN (None, 7, 7, 160)    480         conv2d_149[0][0]                 
__________________________________________________________________________________________________
activation_149 (Activation)     (None, 7, 7, 160)    0           batch_normalization_149[0][0]    
__________________________________________________________________________________________________
conv2d_145 (Conv2D)             (None, 7, 7, 160)    122880      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_150 (Conv2D)             (None, 7, 7, 160)    179200      activation_149[0][0]             
__________________________________________________________________________________________________
batch_normalization_145 (BatchN (None, 7, 7, 160)    480         conv2d_145[0][0]                 
__________________________________________________________________________________________________
batch_normalization_150 (BatchN (None, 7, 7, 160)    480         conv2d_150[0][0]                 
__________________________________________________________________________________________________
activation_145 (Activation)     (None, 7, 7, 160)    0           batch_normalization_145[0][0]    
__________________________________________________________________________________________________
activation_150 (Activation)     (None, 7, 7, 160)    0           batch_normalization_150[0][0]    
__________________________________________________________________________________________________
conv2d_146 (Conv2D)             (None, 7, 7, 160)    179200      activation_145[0][0]             
__________________________________________________________________________________________________
conv2d_151 (Conv2D)             (None, 7, 7, 160)    179200      activation_150[0][0]             
__________________________________________________________________________________________________
batch_normalization_146 (BatchN (None, 7, 7, 160)    480         conv2d_146[0][0]                 
__________________________________________________________________________________________________
batch_normalization_151 (BatchN (None, 7, 7, 160)    480         conv2d_151[0][0]                 
__________________________________________________________________________________________________
activation_146 (Activation)     (None, 7, 7, 160)    0           batch_normalization_146[0][0]    
__________________________________________________________________________________________________
activation_151 (Activation)     (None, 7, 7, 160)    0           batch_normalization_151[0][0]    
__________________________________________________________________________________________________
average_pooling2d_14 (AveragePo (None, 7, 7, 768)    0           mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_144 (Conv2D)             (None, 7, 7, 192)    147456      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_147 (Conv2D)             (None, 7, 7, 192)    215040      activation_146[0][0]             
__________________________________________________________________________________________________
conv2d_152 (Conv2D)             (None, 7, 7, 192)    215040      activation_151[0][0]             
__________________________________________________________________________________________________
conv2d_153 (Conv2D)             (None, 7, 7, 192)    147456      average_pooling2d_14[0][0]       
__________________________________________________________________________________________________
batch_normalization_144 (BatchN (None, 7, 7, 192)    576         conv2d_144[0][0]                 
__________________________________________________________________________________________________
batch_normalization_147 (BatchN (None, 7, 7, 192)    576         conv2d_147[0][0]                 
__________________________________________________________________________________________________
batch_normalization_152 (BatchN (None, 7, 7, 192)    576         conv2d_152[0][0]                 
__________________________________________________________________________________________________
batch_normalization_153 (BatchN (None, 7, 7, 192)    576         conv2d_153[0][0]                 
__________________________________________________________________________________________________
activation_144 (Activation)     (None, 7, 7, 192)    0           batch_normalization_144[0][0]    
__________________________________________________________________________________________________
activation_147 (Activation)     (None, 7, 7, 192)    0           batch_normalization_147[0][0]    
__________________________________________________________________________________________________
activation_152 (Activation)     (None, 7, 7, 192)    0           batch_normalization_152[0][0]    
__________________________________________________________________________________________________
activation_153 (Activation)     (None, 7, 7, 192)    0           batch_normalization_153[0][0]    
__________________________________________________________________________________________________
mixed6 (Concatenate)            (None, 7, 7, 768)    0           activation_144[0][0]             
                                                                 activation_147[0][0]             
                                                                 activation_152[0][0]             
                                                                 activation_153[0][0]             
__________________________________________________________________________________________________
conv2d_158 (Conv2D)             (None, 7, 7, 192)    147456      mixed6[0][0]                     
__________________________________________________________________________________________________
batch_normalization_158 (BatchN (None, 7, 7, 192)    576         conv2d_158[0][0]                 
__________________________________________________________________________________________________
activation_158 (Activation)     (None, 7, 7, 192)    0           batch_normalization_158[0][0]    
__________________________________________________________________________________________________
conv2d_159 (Conv2D)             (None, 7, 7, 192)    258048      activation_158[0][0]             
__________________________________________________________________________________________________
batch_normalization_159 (BatchN (None, 7, 7, 192)    576         conv2d_159[0][0]                 
__________________________________________________________________________________________________
activation_159 (Activation)     (None, 7, 7, 192)    0           batch_normalization_159[0][0]    
__________________________________________________________________________________________________
conv2d_155 (Conv2D)             (None, 7, 7, 192)    147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_160 (Conv2D)             (None, 7, 7, 192)    258048      activation_159[0][0]             
__________________________________________________________________________________________________
batch_normalization_155 (BatchN (None, 7, 7, 192)    576         conv2d_155[0][0]                 
__________________________________________________________________________________________________
batch_normalization_160 (BatchN (None, 7, 7, 192)    576         conv2d_160[0][0]                 
__________________________________________________________________________________________________
activation_155 (Activation)     (None, 7, 7, 192)    0           batch_normalization_155[0][0]    
__________________________________________________________________________________________________
activation_160 (Activation)     (None, 7, 7, 192)    0           batch_normalization_160[0][0]    
__________________________________________________________________________________________________
conv2d_156 (Conv2D)             (None, 7, 7, 192)    258048      activation_155[0][0]             
__________________________________________________________________________________________________
conv2d_161 (Conv2D)             (None, 7, 7, 192)    258048      activation_160[0][0]             
__________________________________________________________________________________________________
batch_normalization_156 (BatchN (None, 7, 7, 192)    576         conv2d_156[0][0]                 
__________________________________________________________________________________________________
batch_normalization_161 (BatchN (None, 7, 7, 192)    576         conv2d_161[0][0]                 
__________________________________________________________________________________________________
activation_156 (Activation)     (None, 7, 7, 192)    0           batch_normalization_156[0][0]    
__________________________________________________________________________________________________
activation_161 (Activation)     (None, 7, 7, 192)    0           batch_normalization_161[0][0]    
__________________________________________________________________________________________________
average_pooling2d_15 (AveragePo (None, 7, 7, 768)    0           mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_154 (Conv2D)             (None, 7, 7, 192)    147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_157 (Conv2D)             (None, 7, 7, 192)    258048      activation_156[0][0]             
__________________________________________________________________________________________________
conv2d_162 (Conv2D)             (None, 7, 7, 192)    258048      activation_161[0][0]             
__________________________________________________________________________________________________
conv2d_163 (Conv2D)             (None, 7, 7, 192)    147456      average_pooling2d_15[0][0]       
__________________________________________________________________________________________________
batch_normalization_154 (BatchN (None, 7, 7, 192)    576         conv2d_154[0][0]                 
__________________________________________________________________________________________________
batch_normalization_157 (BatchN (None, 7, 7, 192)    576         conv2d_157[0][0]                 
__________________________________________________________________________________________________
batch_normalization_162 (BatchN (None, 7, 7, 192)    576         conv2d_162[0][0]                 
__________________________________________________________________________________________________
batch_normalization_163 (BatchN (None, 7, 7, 192)    576         conv2d_163[0][0]                 
__________________________________________________________________________________________________
activation_154 (Activation)     (None, 7, 7, 192)    0           batch_normalization_154[0][0]    
__________________________________________________________________________________________________
activation_157 (Activation)     (None, 7, 7, 192)    0           batch_normalization_157[0][0]    
__________________________________________________________________________________________________
activation_162 (Activation)     (None, 7, 7, 192)    0           batch_normalization_162[0][0]    
__________________________________________________________________________________________________
activation_163 (Activation)     (None, 7, 7, 192)    0           batch_normalization_163[0][0]    
__________________________________________________________________________________________________
mixed7 (Concatenate)            (None, 7, 7, 768)    0           activation_154[0][0]             
                                                                 activation_157[0][0]             
                                                                 activation_162[0][0]             
                                                                 activation_163[0][0]             
__________________________________________________________________________________________________
conv2d_166 (Conv2D)             (None, 7, 7, 192)    147456      mixed7[0][0]                     
__________________________________________________________________________________________________
batch_normalization_166 (BatchN (None, 7, 7, 192)    576         conv2d_166[0][0]                 
__________________________________________________________________________________________________
activation_166 (Activation)     (None, 7, 7, 192)    0           batch_normalization_166[0][0]    
__________________________________________________________________________________________________
conv2d_167 (Conv2D)             (None, 7, 7, 192)    258048      activation_166[0][0]             
__________________________________________________________________________________________________
batch_normalization_167 (BatchN (None, 7, 7, 192)    576         conv2d_167[0][0]                 
__________________________________________________________________________________________________
activation_167 (Activation)     (None, 7, 7, 192)    0           batch_normalization_167[0][0]    
__________________________________________________________________________________________________
conv2d_164 (Conv2D)             (None, 7, 7, 192)    147456      mixed7[0][0]                     
__________________________________________________________________________________________________
conv2d_168 (Conv2D)             (None, 7, 7, 192)    258048      activation_167[0][0]             
__________________________________________________________________________________________________
batch_normalization_164 (BatchN (None, 7, 7, 192)    576         conv2d_164[0][0]                 
__________________________________________________________________________________________________
batch_normalization_168 (BatchN (None, 7, 7, 192)    576         conv2d_168[0][0]                 
__________________________________________________________________________________________________
activation_164 (Activation)     (None, 7, 7, 192)    0           batch_normalization_164[0][0]    
__________________________________________________________________________________________________
activation_168 (Activation)     (None, 7, 7, 192)    0           batch_normalization_168[0][0]    
__________________________________________________________________________________________________
conv2d_165 (Conv2D)             (None, 3, 3, 320)    552960      activation_164[0][0]             
__________________________________________________________________________________________________
conv2d_169 (Conv2D)             (None, 3, 3, 192)    331776      activation_168[0][0]             
__________________________________________________________________________________________________
batch_normalization_165 (BatchN (None, 3, 3, 320)    960         conv2d_165[0][0]                 
__________________________________________________________________________________________________
batch_normalization_169 (BatchN (None, 3, 3, 192)    576         conv2d_169[0][0]                 
__________________________________________________________________________________________________
activation_165 (Activation)     (None, 3, 3, 320)    0           batch_normalization_165[0][0]    
__________________________________________________________________________________________________
activation_169 (Activation)     (None, 3, 3, 192)    0           batch_normalization_169[0][0]    
__________________________________________________________________________________________________
max_pooling2d_7 (MaxPooling2D)  (None, 3, 3, 768)    0           mixed7[0][0]                     
__________________________________________________________________________________________________
mixed8 (Concatenate)            (None, 3, 3, 1280)   0           activation_165[0][0]             
                                                                 activation_169[0][0]             
                                                                 max_pooling2d_7[0][0]            
__________________________________________________________________________________________________
conv2d_174 (Conv2D)             (None, 3, 3, 448)    573440      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_174 (BatchN (None, 3, 3, 448)    1344        conv2d_174[0][0]                 
__________________________________________________________________________________________________
activation_174 (Activation)     (None, 3, 3, 448)    0           batch_normalization_174[0][0]    
__________________________________________________________________________________________________
conv2d_171 (Conv2D)             (None, 3, 3, 384)    491520      mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_175 (Conv2D)             (None, 3, 3, 384)    1548288     activation_174[0][0]             
__________________________________________________________________________________________________
batch_normalization_171 (BatchN (None, 3, 3, 384)    1152        conv2d_171[0][0]                 
__________________________________________________________________________________________________
batch_normalization_175 (BatchN (None, 3, 3, 384)    1152        conv2d_175[0][0]                 
__________________________________________________________________________________________________
activation_171 (Activation)     (None, 3, 3, 384)    0           batch_normalization_171[0][0]    
__________________________________________________________________________________________________
activation_175 (Activation)     (None, 3, 3, 384)    0           batch_normalization_175[0][0]    
__________________________________________________________________________________________________
conv2d_172 (Conv2D)             (None, 3, 3, 384)    442368      activation_171[0][0]             
__________________________________________________________________________________________________
conv2d_173 (Conv2D)             (None, 3, 3, 384)    442368      activation_171[0][0]             
__________________________________________________________________________________________________
conv2d_176 (Conv2D)             (None, 3, 3, 384)    442368      activation_175[0][0]             
__________________________________________________________________________________________________
conv2d_177 (Conv2D)             (None, 3, 3, 384)    442368      activation_175[0][0]             
__________________________________________________________________________________________________
average_pooling2d_16 (AveragePo (None, 3, 3, 1280)   0           mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_170 (Conv2D)             (None, 3, 3, 320)    409600      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_172 (BatchN (None, 3, 3, 384)    1152        conv2d_172[0][0]                 
__________________________________________________________________________________________________
batch_normalization_173 (BatchN (None, 3, 3, 384)    1152        conv2d_173[0][0]                 
__________________________________________________________________________________________________
batch_normalization_176 (BatchN (None, 3, 3, 384)    1152        conv2d_176[0][0]                 
__________________________________________________________________________________________________
batch_normalization_177 (BatchN (None, 3, 3, 384)    1152        conv2d_177[0][0]                 
__________________________________________________________________________________________________
conv2d_178 (Conv2D)             (None, 3, 3, 192)    245760      average_pooling2d_16[0][0]       
__________________________________________________________________________________________________
batch_normalization_170 (BatchN (None, 3, 3, 320)    960         conv2d_170[0][0]                 
__________________________________________________________________________________________________
activation_172 (Activation)     (None, 3, 3, 384)    0           batch_normalization_172[0][0]    
__________________________________________________________________________________________________
activation_173 (Activation)     (None, 3, 3, 384)    0           batch_normalization_173[0][0]    
__________________________________________________________________________________________________
activation_176 (Activation)     (None, 3, 3, 384)    0           batch_normalization_176[0][0]    
__________________________________________________________________________________________________
activation_177 (Activation)     (None, 3, 3, 384)    0           batch_normalization_177[0][0]    
__________________________________________________________________________________________________
batch_normalization_178 (BatchN (None, 3, 3, 192)    576         conv2d_178[0][0]                 
__________________________________________________________________________________________________
activation_170 (Activation)     (None, 3, 3, 320)    0           batch_normalization_170[0][0]    
__________________________________________________________________________________________________
mixed9_0 (Concatenate)          (None, 3, 3, 768)    0           activation_172[0][0]             
                                                                 activation_173[0][0]             
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 3, 3, 768)    0           activation_176[0][0]             
                                                                 activation_177[0][0]             
__________________________________________________________________________________________________
activation_178 (Activation)     (None, 3, 3, 192)    0           batch_normalization_178[0][0]    
__________________________________________________________________________________________________
mixed9 (Concatenate)            (None, 3, 3, 2048)   0           activation_170[0][0]             
                                                                 mixed9_0[0][0]                   
                                                                 concatenate_2[0][0]              
                                                                 activation_178[0][0]             
__________________________________________________________________________________________________
conv2d_183 (Conv2D)             (None, 3, 3, 448)    917504      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_183 (BatchN (None, 3, 3, 448)    1344        conv2d_183[0][0]                 
__________________________________________________________________________________________________
activation_183 (Activation)     (None, 3, 3, 448)    0           batch_normalization_183[0][0]    
__________________________________________________________________________________________________
conv2d_180 (Conv2D)             (None, 3, 3, 384)    786432      mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_184 (Conv2D)             (None, 3, 3, 384)    1548288     activation_183[0][0]             
__________________________________________________________________________________________________
batch_normalization_180 (BatchN (None, 3, 3, 384)    1152        conv2d_180[0][0]                 
__________________________________________________________________________________________________
batch_normalization_184 (BatchN (None, 3, 3, 384)    1152        conv2d_184[0][0]                 
__________________________________________________________________________________________________
activation_180 (Activation)     (None, 3, 3, 384)    0           batch_normalization_180[0][0]    
__________________________________________________________________________________________________
activation_184 (Activation)     (None, 3, 3, 384)    0           batch_normalization_184[0][0]    
__________________________________________________________________________________________________
conv2d_181 (Conv2D)             (None, 3, 3, 384)    442368      activation_180[0][0]             
__________________________________________________________________________________________________
conv2d_182 (Conv2D)             (None, 3, 3, 384)    442368      activation_180[0][0]             
__________________________________________________________________________________________________
conv2d_185 (Conv2D)             (None, 3, 3, 384)    442368      activation_184[0][0]             
__________________________________________________________________________________________________
conv2d_186 (Conv2D)             (None, 3, 3, 384)    442368      activation_184[0][0]             
__________________________________________________________________________________________________
average_pooling2d_17 (AveragePo (None, 3, 3, 2048)   0           mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_179 (Conv2D)             (None, 3, 3, 320)    655360      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_181 (BatchN (None, 3, 3, 384)    1152        conv2d_181[0][0]                 
__________________________________________________________________________________________________
batch_normalization_182 (BatchN (None, 3, 3, 384)    1152        conv2d_182[0][0]                 
__________________________________________________________________________________________________
batch_normalization_185 (BatchN (None, 3, 3, 384)    1152        conv2d_185[0][0]                 
__________________________________________________________________________________________________
batch_normalization_186 (BatchN (None, 3, 3, 384)    1152        conv2d_186[0][0]                 
__________________________________________________________________________________________________
conv2d_187 (Conv2D)             (None, 3, 3, 192)    393216      average_pooling2d_17[0][0]       
__________________________________________________________________________________________________
batch_normalization_179 (BatchN (None, 3, 3, 320)    960         conv2d_179[0][0]                 
__________________________________________________________________________________________________
activation_181 (Activation)     (None, 3, 3, 384)    0           batch_normalization_181[0][0]    
__________________________________________________________________________________________________
activation_182 (Activation)     (None, 3, 3, 384)    0           batch_normalization_182[0][0]    
__________________________________________________________________________________________________
activation_185 (Activation)     (None, 3, 3, 384)    0           batch_normalization_185[0][0]    
__________________________________________________________________________________________________
activation_186 (Activation)     (None, 3, 3, 384)    0           batch_normalization_186[0][0]    
__________________________________________________________________________________________________
batch_normalization_187 (BatchN (None, 3, 3, 192)    576         conv2d_187[0][0]                 
__________________________________________________________________________________________________
activation_179 (Activation)     (None, 3, 3, 320)    0           batch_normalization_179[0][0]    
__________________________________________________________________________________________________
mixed9_1 (Concatenate)          (None, 3, 3, 768)    0           activation_181[0][0]             
                                                                 activation_182[0][0]             
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 3, 3, 768)    0           activation_185[0][0]             
                                                                 activation_186[0][0]             
__________________________________________________________________________________________________
activation_187 (Activation)     (None, 3, 3, 192)    0           batch_normalization_187[0][0]    
__________________________________________________________________________________________________
mixed10 (Concatenate)           (None, 3, 3, 2048)   0           activation_179[0][0]             
                                                                 mixed9_1[0][0]                   
                                                                 concatenate_3[0][0]              
                                                                 activation_187[0][0]             
==================================================================================================
Total params: 21,802,784
Trainable params: 0
Non-trainable params: 21,802,784
__________________________________________________________________________________________________
None

That is a big network - almost 22 million parameters, none of which are trainable since the convolutional base is frozen.
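
A quick sanity check that the base really is frozen - a minimal sketch, assuming base_model is the pre-trained network summarized above:

# Should print 0, matching the "Trainable params: 0" line in the summary.
print(len(base_model.trainable_weights))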

Create the Output Layers

# Pool the 3x3x2048 base-model feature maps down to a single 2048-vector
x = layers.GlobalAveragePooling2D()(base_model.output)
# A trainable classification head, with some dropout for regularization
x = layers.Dense(1024, activation="relu")(x)
x = layers.Dropout(0.2)(x)
# A single sigmoid unit, since this is binary classification
x = layers.Dense(1, activation="sigmoid")(x)
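
It can be worth confirming what those layers do to the shapes - a minimal sketch, assuming the layers above have just been applied:

# The base model ends in 3x3x2048 feature maps (the mixed10 layer above)
print(base_model.output.shape)
# After pooling and the dense head we're down to a single sigmoid output
print(x.shape)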

Now build the model, combining the frozen pre-trained layers with the new dense head that we're going to train. Since there are only two classes, the final activation function is a sigmoid, which outputs a single probability.

model = tensorflow.keras.Model(
    base_model.input,  # the pre-trained network's input
    x,                 # the new sigmoid output
)
print(model.summary())
Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_2 (InputLayer)            [(None, 150, 150, 3) 0                                            
__________________________________________________________________________________________________
conv2d_94 (Conv2D)              (None, 74, 74, 32)   864         input_2[0][0]                    
__________________________________________________________________________________________________
batch_normalization_94 (BatchNo (None, 74, 74, 32)   96          conv2d_94[0][0]                  
__________________________________________________________________________________________________
activation_94 (Activation)      (None, 74, 74, 32)   0           batch_normalization_94[0][0]     
__________________________________________________________________________________________________
conv2d_95 (Conv2D)              (None, 72, 72, 32)   9216        activation_94[0][0]              
__________________________________________________________________________________________________
batch_normalization_95 (BatchNo (None, 72, 72, 32)   96          conv2d_95[0][0]                  
__________________________________________________________________________________________________
activation_95 (Activation)      (None, 72, 72, 32)   0           batch_normalization_95[0][0]     
__________________________________________________________________________________________________
conv2d_96 (Conv2D)              (None, 72, 72, 64)   18432       activation_95[0][0]              
__________________________________________________________________________________________________
batch_normalization_96 (BatchNo (None, 72, 72, 64)   192         conv2d_96[0][0]                  
__________________________________________________________________________________________________
activation_96 (Activation)      (None, 72, 72, 64)   0           batch_normalization_96[0][0]     
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, 35, 35, 64)   0           activation_96[0][0]              
__________________________________________________________________________________________________
conv2d_97 (Conv2D)              (None, 35, 35, 80)   5120        max_pooling2d_4[0][0]            
__________________________________________________________________________________________________
batch_normalization_97 (BatchNo (None, 35, 35, 80)   240         conv2d_97[0][0]                  
__________________________________________________________________________________________________
activation_97 (Activation)      (None, 35, 35, 80)   0           batch_normalization_97[0][0]     
__________________________________________________________________________________________________
conv2d_98 (Conv2D)              (None, 33, 33, 192)  138240      activation_97[0][0]              
__________________________________________________________________________________________________
batch_normalization_98 (BatchNo (None, 33, 33, 192)  576         conv2d_98[0][0]                  
__________________________________________________________________________________________________
activation_98 (Activation)      (None, 33, 33, 192)  0           batch_normalization_98[0][0]     
__________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D)  (None, 16, 16, 192)  0           activation_98[0][0]              
__________________________________________________________________________________________________
conv2d_102 (Conv2D)             (None, 16, 16, 64)   12288       max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
batch_normalization_102 (BatchN (None, 16, 16, 64)   192         conv2d_102[0][0]                 
__________________________________________________________________________________________________
activation_102 (Activation)     (None, 16, 16, 64)   0           batch_normalization_102[0][0]    
__________________________________________________________________________________________________
conv2d_100 (Conv2D)             (None, 16, 16, 48)   9216        max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
conv2d_103 (Conv2D)             (None, 16, 16, 96)   55296       activation_102[0][0]             
__________________________________________________________________________________________________
batch_normalization_100 (BatchN (None, 16, 16, 48)   144         conv2d_100[0][0]                 
__________________________________________________________________________________________________
batch_normalization_103 (BatchN (None, 16, 16, 96)   288         conv2d_103[0][0]                 
__________________________________________________________________________________________________
activation_100 (Activation)     (None, 16, 16, 48)   0           batch_normalization_100[0][0]    
__________________________________________________________________________________________________
activation_103 (Activation)     (None, 16, 16, 96)   0           batch_normalization_103[0][0]    
__________________________________________________________________________________________________
average_pooling2d_9 (AveragePoo (None, 16, 16, 192)  0           max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
conv2d_99 (Conv2D)              (None, 16, 16, 64)   12288       max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
conv2d_101 (Conv2D)             (None, 16, 16, 64)   76800       activation_100[0][0]             
__________________________________________________________________________________________________
conv2d_104 (Conv2D)             (None, 16, 16, 96)   82944       activation_103[0][0]             
__________________________________________________________________________________________________
conv2d_105 (Conv2D)             (None, 16, 16, 32)   6144        average_pooling2d_9[0][0]        
__________________________________________________________________________________________________
batch_normalization_99 (BatchNo (None, 16, 16, 64)   192         conv2d_99[0][0]                  
__________________________________________________________________________________________________
batch_normalization_101 (BatchN (None, 16, 16, 64)   192         conv2d_101[0][0]                 
__________________________________________________________________________________________________
batch_normalization_104 (BatchN (None, 16, 16, 96)   288         conv2d_104[0][0]                 
__________________________________________________________________________________________________
batch_normalization_105 (BatchN (None, 16, 16, 32)   96          conv2d_105[0][0]                 
__________________________________________________________________________________________________
activation_99 (Activation)      (None, 16, 16, 64)   0           batch_normalization_99[0][0]     
__________________________________________________________________________________________________
activation_101 (Activation)     (None, 16, 16, 64)   0           batch_normalization_101[0][0]    
__________________________________________________________________________________________________
activation_104 (Activation)     (None, 16, 16, 96)   0           batch_normalization_104[0][0]    
__________________________________________________________________________________________________
activation_105 (Activation)     (None, 16, 16, 32)   0           batch_normalization_105[0][0]    
__________________________________________________________________________________________________
mixed0 (Concatenate)            (None, 16, 16, 256)  0           activation_99[0][0]              
                                                                 activation_101[0][0]             
                                                                 activation_104[0][0]             
                                                                 activation_105[0][0]             
__________________________________________________________________________________________________
conv2d_109 (Conv2D)             (None, 16, 16, 64)   16384       mixed0[0][0]                     
__________________________________________________________________________________________________
batch_normalization_109 (BatchN (None, 16, 16, 64)   192         conv2d_109[0][0]                 
__________________________________________________________________________________________________
activation_109 (Activation)     (None, 16, 16, 64)   0           batch_normalization_109[0][0]    
__________________________________________________________________________________________________
conv2d_107 (Conv2D)             (None, 16, 16, 48)   12288       mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_110 (Conv2D)             (None, 16, 16, 96)   55296       activation_109[0][0]             
__________________________________________________________________________________________________
batch_normalization_107 (BatchN (None, 16, 16, 48)   144         conv2d_107[0][0]                 
__________________________________________________________________________________________________
batch_normalization_110 (BatchN (None, 16, 16, 96)   288         conv2d_110[0][0]                 
__________________________________________________________________________________________________
activation_107 (Activation)     (None, 16, 16, 48)   0           batch_normalization_107[0][0]    
__________________________________________________________________________________________________
activation_110 (Activation)     (None, 16, 16, 96)   0           batch_normalization_110[0][0]    
__________________________________________________________________________________________________
average_pooling2d_10 (AveragePo (None, 16, 16, 256)  0           mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_106 (Conv2D)             (None, 16, 16, 64)   16384       mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_108 (Conv2D)             (None, 16, 16, 64)   76800       activation_107[0][0]             
__________________________________________________________________________________________________
conv2d_111 (Conv2D)             (None, 16, 16, 96)   82944       activation_110[0][0]             
__________________________________________________________________________________________________
conv2d_112 (Conv2D)             (None, 16, 16, 64)   16384       average_pooling2d_10[0][0]       
__________________________________________________________________________________________________
batch_normalization_106 (BatchN (None, 16, 16, 64)   192         conv2d_106[0][0]                 
__________________________________________________________________________________________________
batch_normalization_108 (BatchN (None, 16, 16, 64)   192         conv2d_108[0][0]                 
__________________________________________________________________________________________________
batch_normalization_111 (BatchN (None, 16, 16, 96)   288         conv2d_111[0][0]                 
__________________________________________________________________________________________________
batch_normalization_112 (BatchN (None, 16, 16, 64)   192         conv2d_112[0][0]                 
__________________________________________________________________________________________________
activation_106 (Activation)     (None, 16, 16, 64)   0           batch_normalization_106[0][0]    
__________________________________________________________________________________________________
activation_108 (Activation)     (None, 16, 16, 64)   0           batch_normalization_108[0][0]    
__________________________________________________________________________________________________
activation_111 (Activation)     (None, 16, 16, 96)   0           batch_normalization_111[0][0]    
__________________________________________________________________________________________________
activation_112 (Activation)     (None, 16, 16, 64)   0           batch_normalization_112[0][0]    
__________________________________________________________________________________________________
mixed1 (Concatenate)            (None, 16, 16, 288)  0           activation_106[0][0]             
                                                                 activation_108[0][0]             
                                                                 activation_111[0][0]             
                                                                 activation_112[0][0]             
__________________________________________________________________________________________________
conv2d_116 (Conv2D)             (None, 16, 16, 64)   18432       mixed1[0][0]                     
__________________________________________________________________________________________________
batch_normalization_116 (BatchN (None, 16, 16, 64)   192         conv2d_116[0][0]                 
__________________________________________________________________________________________________
activation_116 (Activation)     (None, 16, 16, 64)   0           batch_normalization_116[0][0]    
__________________________________________________________________________________________________
conv2d_114 (Conv2D)             (None, 16, 16, 48)   13824       mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_117 (Conv2D)             (None, 16, 16, 96)   55296       activation_116[0][0]             
__________________________________________________________________________________________________
batch_normalization_114 (BatchN (None, 16, 16, 48)   144         conv2d_114[0][0]                 
__________________________________________________________________________________________________
batch_normalization_117 (BatchN (None, 16, 16, 96)   288         conv2d_117[0][0]                 
__________________________________________________________________________________________________
activation_114 (Activation)     (None, 16, 16, 48)   0           batch_normalization_114[0][0]    
__________________________________________________________________________________________________
activation_117 (Activation)     (None, 16, 16, 96)   0           batch_normalization_117[0][0]    
__________________________________________________________________________________________________
average_pooling2d_11 (AveragePo (None, 16, 16, 288)  0           mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_113 (Conv2D)             (None, 16, 16, 64)   18432       mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_115 (Conv2D)             (None, 16, 16, 64)   76800       activation_114[0][0]             
__________________________________________________________________________________________________
conv2d_118 (Conv2D)             (None, 16, 16, 96)   82944       activation_117[0][0]             
__________________________________________________________________________________________________
conv2d_119 (Conv2D)             (None, 16, 16, 64)   18432       average_pooling2d_11[0][0]       
__________________________________________________________________________________________________
batch_normalization_113 (BatchN (None, 16, 16, 64)   192         conv2d_113[0][0]                 
__________________________________________________________________________________________________
batch_normalization_115 (BatchN (None, 16, 16, 64)   192         conv2d_115[0][0]                 
__________________________________________________________________________________________________
batch_normalization_118 (BatchN (None, 16, 16, 96)   288         conv2d_118[0][0]                 
__________________________________________________________________________________________________
batch_normalization_119 (BatchN (None, 16, 16, 64)   192         conv2d_119[0][0]                 
__________________________________________________________________________________________________
activation_113 (Activation)     (None, 16, 16, 64)   0           batch_normalization_113[0][0]    
__________________________________________________________________________________________________
activation_115 (Activation)     (None, 16, 16, 64)   0           batch_normalization_115[0][0]    
__________________________________________________________________________________________________
activation_118 (Activation)     (None, 16, 16, 96)   0           batch_normalization_118[0][0]    
__________________________________________________________________________________________________
activation_119 (Activation)     (None, 16, 16, 64)   0           batch_normalization_119[0][0]    
__________________________________________________________________________________________________
mixed2 (Concatenate)            (None, 16, 16, 288)  0           activation_113[0][0]             
                                                                 activation_115[0][0]             
                                                                 activation_118[0][0]             
                                                                 activation_119[0][0]             
__________________________________________________________________________________________________
conv2d_121 (Conv2D)             (None, 16, 16, 64)   18432       mixed2[0][0]                     
__________________________________________________________________________________________________
batch_normalization_121 (BatchN (None, 16, 16, 64)   192         conv2d_121[0][0]                 
__________________________________________________________________________________________________
activation_121 (Activation)     (None, 16, 16, 64)   0           batch_normalization_121[0][0]    
__________________________________________________________________________________________________
conv2d_122 (Conv2D)             (None, 16, 16, 96)   55296       activation_121[0][0]             
__________________________________________________________________________________________________
batch_normalization_122 (BatchN (None, 16, 16, 96)   288         conv2d_122[0][0]                 
__________________________________________________________________________________________________
activation_122 (Activation)     (None, 16, 16, 96)   0           batch_normalization_122[0][0]    
__________________________________________________________________________________________________
conv2d_120 (Conv2D)             (None, 7, 7, 384)    995328      mixed2[0][0]                     
__________________________________________________________________________________________________
conv2d_123 (Conv2D)             (None, 7, 7, 96)     82944       activation_122[0][0]             
__________________________________________________________________________________________________
batch_normalization_120 (BatchN (None, 7, 7, 384)    1152        conv2d_120[0][0]                 
__________________________________________________________________________________________________
batch_normalization_123 (BatchN (None, 7, 7, 96)     288         conv2d_123[0][0]                 
__________________________________________________________________________________________________
activation_120 (Activation)     (None, 7, 7, 384)    0           batch_normalization_120[0][0]    
__________________________________________________________________________________________________
activation_123 (Activation)     (None, 7, 7, 96)     0           batch_normalization_123[0][0]    
__________________________________________________________________________________________________
max_pooling2d_6 (MaxPooling2D)  (None, 7, 7, 288)    0           mixed2[0][0]                     
__________________________________________________________________________________________________
mixed3 (Concatenate)            (None, 7, 7, 768)    0           activation_120[0][0]             
                                                                 activation_123[0][0]             
                                                                 max_pooling2d_6[0][0]            
__________________________________________________________________________________________________
conv2d_128 (Conv2D)             (None, 7, 7, 128)    98304       mixed3[0][0]                     
__________________________________________________________________________________________________
batch_normalization_128 (BatchN (None, 7, 7, 128)    384         conv2d_128[0][0]                 
__________________________________________________________________________________________________
activation_128 (Activation)     (None, 7, 7, 128)    0           batch_normalization_128[0][0]    
__________________________________________________________________________________________________
conv2d_129 (Conv2D)             (None, 7, 7, 128)    114688      activation_128[0][0]             
__________________________________________________________________________________________________
batch_normalization_129 (BatchN (None, 7, 7, 128)    384         conv2d_129[0][0]                 
__________________________________________________________________________________________________
activation_129 (Activation)     (None, 7, 7, 128)    0           batch_normalization_129[0][0]    
__________________________________________________________________________________________________
conv2d_125 (Conv2D)             (None, 7, 7, 128)    98304       mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_130 (Conv2D)             (None, 7, 7, 128)    114688      activation_129[0][0]             
__________________________________________________________________________________________________
batch_normalization_125 (BatchN (None, 7, 7, 128)    384         conv2d_125[0][0]                 
__________________________________________________________________________________________________
batch_normalization_130 (BatchN (None, 7, 7, 128)    384         conv2d_130[0][0]                 
__________________________________________________________________________________________________
activation_125 (Activation)     (None, 7, 7, 128)    0           batch_normalization_125[0][0]    
__________________________________________________________________________________________________
activation_130 (Activation)     (None, 7, 7, 128)    0           batch_normalization_130[0][0]    
__________________________________________________________________________________________________
conv2d_126 (Conv2D)             (None, 7, 7, 128)    114688      activation_125[0][0]             
__________________________________________________________________________________________________
conv2d_131 (Conv2D)             (None, 7, 7, 128)    114688      activation_130[0][0]             
__________________________________________________________________________________________________
batch_normalization_126 (BatchN (None, 7, 7, 128)    384         conv2d_126[0][0]                 
__________________________________________________________________________________________________
batch_normalization_131 (BatchN (None, 7, 7, 128)    384         conv2d_131[0][0]                 
__________________________________________________________________________________________________
activation_126 (Activation)     (None, 7, 7, 128)    0           batch_normalization_126[0][0]    
__________________________________________________________________________________________________
activation_131 (Activation)     (None, 7, 7, 128)    0           batch_normalization_131[0][0]    
__________________________________________________________________________________________________
average_pooling2d_12 (AveragePo (None, 7, 7, 768)    0           mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_124 (Conv2D)             (None, 7, 7, 192)    147456      mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_127 (Conv2D)             (None, 7, 7, 192)    172032      activation_126[0][0]             
__________________________________________________________________________________________________
conv2d_132 (Conv2D)             (None, 7, 7, 192)    172032      activation_131[0][0]             
__________________________________________________________________________________________________
conv2d_133 (Conv2D)             (None, 7, 7, 192)    147456      average_pooling2d_12[0][0]       
__________________________________________________________________________________________________
batch_normalization_124 (BatchN (None, 7, 7, 192)    576         conv2d_124[0][0]                 
__________________________________________________________________________________________________
batch_normalization_127 (BatchN (None, 7, 7, 192)    576         conv2d_127[0][0]                 
__________________________________________________________________________________________________
batch_normalization_132 (BatchN (None, 7, 7, 192)    576         conv2d_132[0][0]                 
__________________________________________________________________________________________________
batch_normalization_133 (BatchN (None, 7, 7, 192)    576         conv2d_133[0][0]                 
__________________________________________________________________________________________________
activation_124 (Activation)     (None, 7, 7, 192)    0           batch_normalization_124[0][0]    
__________________________________________________________________________________________________
activation_127 (Activation)     (None, 7, 7, 192)    0           batch_normalization_127[0][0]    
__________________________________________________________________________________________________
activation_132 (Activation)     (None, 7, 7, 192)    0           batch_normalization_132[0][0]    
__________________________________________________________________________________________________
activation_133 (Activation)     (None, 7, 7, 192)    0           batch_normalization_133[0][0]    
__________________________________________________________________________________________________
mixed4 (Concatenate)            (None, 7, 7, 768)    0           activation_124[0][0]             
                                                                 activation_127[0][0]             
                                                                 activation_132[0][0]             
                                                                 activation_133[0][0]             
__________________________________________________________________________________________________
conv2d_138 (Conv2D)             (None, 7, 7, 160)    122880      mixed4[0][0]                     
__________________________________________________________________________________________________
batch_normalization_138 (BatchN (None, 7, 7, 160)    480         conv2d_138[0][0]                 
__________________________________________________________________________________________________
activation_138 (Activation)     (None, 7, 7, 160)    0           batch_normalization_138[0][0]    
__________________________________________________________________________________________________
conv2d_139 (Conv2D)             (None, 7, 7, 160)    179200      activation_138[0][0]             
__________________________________________________________________________________________________
batch_normalization_139 (BatchN (None, 7, 7, 160)    480         conv2d_139[0][0]                 
__________________________________________________________________________________________________
activation_139 (Activation)     (None, 7, 7, 160)    0           batch_normalization_139[0][0]    
__________________________________________________________________________________________________
conv2d_135 (Conv2D)             (None, 7, 7, 160)    122880      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_140 (Conv2D)             (None, 7, 7, 160)    179200      activation_139[0][0]             
__________________________________________________________________________________________________
batch_normalization_135 (BatchN (None, 7, 7, 160)    480         conv2d_135[0][0]                 
__________________________________________________________________________________________________
batch_normalization_140 (BatchN (None, 7, 7, 160)    480         conv2d_140[0][0]                 
__________________________________________________________________________________________________
activation_135 (Activation)     (None, 7, 7, 160)    0           batch_normalization_135[0][0]    
__________________________________________________________________________________________________
activation_140 (Activation)     (None, 7, 7, 160)    0           batch_normalization_140[0][0]    
__________________________________________________________________________________________________
conv2d_136 (Conv2D)             (None, 7, 7, 160)    179200      activation_135[0][0]             
__________________________________________________________________________________________________
conv2d_141 (Conv2D)             (None, 7, 7, 160)    179200      activation_140[0][0]             
__________________________________________________________________________________________________
batch_normalization_136 (BatchN (None, 7, 7, 160)    480         conv2d_136[0][0]                 
__________________________________________________________________________________________________
batch_normalization_141 (BatchN (None, 7, 7, 160)    480         conv2d_141[0][0]                 
__________________________________________________________________________________________________
activation_136 (Activation)     (None, 7, 7, 160)    0           batch_normalization_136[0][0]    
__________________________________________________________________________________________________
activation_141 (Activation)     (None, 7, 7, 160)    0           batch_normalization_141[0][0]    
__________________________________________________________________________________________________
average_pooling2d_13 (AveragePo (None, 7, 7, 768)    0           mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_134 (Conv2D)             (None, 7, 7, 192)    147456      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_137 (Conv2D)             (None, 7, 7, 192)    215040      activation_136[0][0]             
__________________________________________________________________________________________________
conv2d_142 (Conv2D)             (None, 7, 7, 192)    215040      activation_141[0][0]             
__________________________________________________________________________________________________
conv2d_143 (Conv2D)             (None, 7, 7, 192)    147456      average_pooling2d_13[0][0]       
__________________________________________________________________________________________________
batch_normalization_134 (BatchN (None, 7, 7, 192)    576         conv2d_134[0][0]                 
__________________________________________________________________________________________________
batch_normalization_137 (BatchN (None, 7, 7, 192)    576         conv2d_137[0][0]                 
__________________________________________________________________________________________________
batch_normalization_142 (BatchN (None, 7, 7, 192)    576         conv2d_142[0][0]                 
__________________________________________________________________________________________________
batch_normalization_143 (BatchN (None, 7, 7, 192)    576         conv2d_143[0][0]                 
__________________________________________________________________________________________________
activation_134 (Activation)     (None, 7, 7, 192)    0           batch_normalization_134[0][0]    
__________________________________________________________________________________________________
activation_137 (Activation)     (None, 7, 7, 192)    0           batch_normalization_137[0][0]    
__________________________________________________________________________________________________
activation_142 (Activation)     (None, 7, 7, 192)    0           batch_normalization_142[0][0]    
__________________________________________________________________________________________________
activation_143 (Activation)     (None, 7, 7, 192)    0           batch_normalization_143[0][0]    
__________________________________________________________________________________________________
mixed5 (Concatenate)            (None, 7, 7, 768)    0           activation_134[0][0]             
                                                                 activation_137[0][0]             
                                                                 activation_142[0][0]             
                                                                 activation_143[0][0]             
__________________________________________________________________________________________________
conv2d_148 (Conv2D)             (None, 7, 7, 160)    122880      mixed5[0][0]                     
__________________________________________________________________________________________________
batch_normalization_148 (BatchN (None, 7, 7, 160)    480         conv2d_148[0][0]                 
__________________________________________________________________________________________________
activation_148 (Activation)     (None, 7, 7, 160)    0           batch_normalization_148[0][0]    
__________________________________________________________________________________________________
conv2d_149 (Conv2D)             (None, 7, 7, 160)    179200      activation_148[0][0]             
__________________________________________________________________________________________________
batch_normalization_149 (BatchN (None, 7, 7, 160)    480         conv2d_149[0][0]                 
__________________________________________________________________________________________________
activation_149 (Activation)     (None, 7, 7, 160)    0           batch_normalization_149[0][0]    
__________________________________________________________________________________________________
conv2d_145 (Conv2D)             (None, 7, 7, 160)    122880      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_150 (Conv2D)             (None, 7, 7, 160)    179200      activation_149[0][0]             
__________________________________________________________________________________________________
batch_normalization_145 (BatchN (None, 7, 7, 160)    480         conv2d_145[0][0]                 
__________________________________________________________________________________________________
batch_normalization_150 (BatchN (None, 7, 7, 160)    480         conv2d_150[0][0]                 
__________________________________________________________________________________________________
activation_145 (Activation)     (None, 7, 7, 160)    0           batch_normalization_145[0][0]    
__________________________________________________________________________________________________
activation_150 (Activation)     (None, 7, 7, 160)    0           batch_normalization_150[0][0]    
__________________________________________________________________________________________________
conv2d_146 (Conv2D)             (None, 7, 7, 160)    179200      activation_145[0][0]             
__________________________________________________________________________________________________
conv2d_151 (Conv2D)             (None, 7, 7, 160)    179200      activation_150[0][0]             
__________________________________________________________________________________________________
batch_normalization_146 (BatchN (None, 7, 7, 160)    480         conv2d_146[0][0]                 
__________________________________________________________________________________________________
batch_normalization_151 (BatchN (None, 7, 7, 160)    480         conv2d_151[0][0]                 
__________________________________________________________________________________________________
activation_146 (Activation)     (None, 7, 7, 160)    0           batch_normalization_146[0][0]    
__________________________________________________________________________________________________
activation_151 (Activation)     (None, 7, 7, 160)    0           batch_normalization_151[0][0]    
__________________________________________________________________________________________________
average_pooling2d_14 (AveragePo (None, 7, 7, 768)    0           mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_144 (Conv2D)             (None, 7, 7, 192)    147456      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_147 (Conv2D)             (None, 7, 7, 192)    215040      activation_146[0][0]             
__________________________________________________________________________________________________
conv2d_152 (Conv2D)             (None, 7, 7, 192)    215040      activation_151[0][0]             
__________________________________________________________________________________________________
conv2d_153 (Conv2D)             (None, 7, 7, 192)    147456      average_pooling2d_14[0][0]       
__________________________________________________________________________________________________
batch_normalization_144 (BatchN (None, 7, 7, 192)    576         conv2d_144[0][0]                 
__________________________________________________________________________________________________
batch_normalization_147 (BatchN (None, 7, 7, 192)    576         conv2d_147[0][0]                 
__________________________________________________________________________________________________
batch_normalization_152 (BatchN (None, 7, 7, 192)    576         conv2d_152[0][0]                 
__________________________________________________________________________________________________
batch_normalization_153 (BatchN (None, 7, 7, 192)    576         conv2d_153[0][0]                 
__________________________________________________________________________________________________
activation_144 (Activation)     (None, 7, 7, 192)    0           batch_normalization_144[0][0]    
__________________________________________________________________________________________________
activation_147 (Activation)     (None, 7, 7, 192)    0           batch_normalization_147[0][0]    
__________________________________________________________________________________________________
activation_152 (Activation)     (None, 7, 7, 192)    0           batch_normalization_152[0][0]    
__________________________________________________________________________________________________
activation_153 (Activation)     (None, 7, 7, 192)    0           batch_normalization_153[0][0]    
__________________________________________________________________________________________________
mixed6 (Concatenate)            (None, 7, 7, 768)    0           activation_144[0][0]             
                                                                 activation_147[0][0]             
                                                                 activation_152[0][0]             
                                                                 activation_153[0][0]             
__________________________________________________________________________________________________
conv2d_158 (Conv2D)             (None, 7, 7, 192)    147456      mixed6[0][0]                     
__________________________________________________________________________________________________
batch_normalization_158 (BatchN (None, 7, 7, 192)    576         conv2d_158[0][0]                 
__________________________________________________________________________________________________
activation_158 (Activation)     (None, 7, 7, 192)    0           batch_normalization_158[0][0]    
__________________________________________________________________________________________________
conv2d_159 (Conv2D)             (None, 7, 7, 192)    258048      activation_158[0][0]             
__________________________________________________________________________________________________
batch_normalization_159 (BatchN (None, 7, 7, 192)    576         conv2d_159[0][0]                 
__________________________________________________________________________________________________
activation_159 (Activation)     (None, 7, 7, 192)    0           batch_normalization_159[0][0]    
__________________________________________________________________________________________________
conv2d_155 (Conv2D)             (None, 7, 7, 192)    147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_160 (Conv2D)             (None, 7, 7, 192)    258048      activation_159[0][0]             
__________________________________________________________________________________________________
batch_normalization_155 (BatchN (None, 7, 7, 192)    576         conv2d_155[0][0]                 
__________________________________________________________________________________________________
batch_normalization_160 (BatchN (None, 7, 7, 192)    576         conv2d_160[0][0]                 
__________________________________________________________________________________________________
activation_155 (Activation)     (None, 7, 7, 192)    0           batch_normalization_155[0][0]    
__________________________________________________________________________________________________
activation_160 (Activation)     (None, 7, 7, 192)    0           batch_normalization_160[0][0]    
__________________________________________________________________________________________________
conv2d_156 (Conv2D)             (None, 7, 7, 192)    258048      activation_155[0][0]             
__________________________________________________________________________________________________
conv2d_161 (Conv2D)             (None, 7, 7, 192)    258048      activation_160[0][0]             
__________________________________________________________________________________________________
batch_normalization_156 (BatchN (None, 7, 7, 192)    576         conv2d_156[0][0]                 
__________________________________________________________________________________________________
batch_normalization_161 (BatchN (None, 7, 7, 192)    576         conv2d_161[0][0]                 
__________________________________________________________________________________________________
activation_156 (Activation)     (None, 7, 7, 192)    0           batch_normalization_156[0][0]    
__________________________________________________________________________________________________
activation_161 (Activation)     (None, 7, 7, 192)    0           batch_normalization_161[0][0]    
__________________________________________________________________________________________________
average_pooling2d_15 (AveragePo (None, 7, 7, 768)    0           mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_154 (Conv2D)             (None, 7, 7, 192)    147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_157 (Conv2D)             (None, 7, 7, 192)    258048      activation_156[0][0]             
__________________________________________________________________________________________________
conv2d_162 (Conv2D)             (None, 7, 7, 192)    258048      activation_161[0][0]             
__________________________________________________________________________________________________
conv2d_163 (Conv2D)             (None, 7, 7, 192)    147456      average_pooling2d_15[0][0]       
__________________________________________________________________________________________________
batch_normalization_154 (BatchN (None, 7, 7, 192)    576         conv2d_154[0][0]                 
__________________________________________________________________________________________________
batch_normalization_157 (BatchN (None, 7, 7, 192)    576         conv2d_157[0][0]                 
__________________________________________________________________________________________________
batch_normalization_162 (BatchN (None, 7, 7, 192)    576         conv2d_162[0][0]                 
__________________________________________________________________________________________________
batch_normalization_163 (BatchN (None, 7, 7, 192)    576         conv2d_163[0][0]                 
__________________________________________________________________________________________________
activation_154 (Activation)     (None, 7, 7, 192)    0           batch_normalization_154[0][0]    
__________________________________________________________________________________________________
activation_157 (Activation)     (None, 7, 7, 192)    0           batch_normalization_157[0][0]    
__________________________________________________________________________________________________
activation_162 (Activation)     (None, 7, 7, 192)    0           batch_normalization_162[0][0]    
__________________________________________________________________________________________________
activation_163 (Activation)     (None, 7, 7, 192)    0           batch_normalization_163[0][0]    
__________________________________________________________________________________________________
mixed7 (Concatenate)            (None, 7, 7, 768)    0           activation_154[0][0]             
                                                                 activation_157[0][0]             
                                                                 activation_162[0][0]             
                                                                 activation_163[0][0]             
__________________________________________________________________________________________________
conv2d_166 (Conv2D)             (None, 7, 7, 192)    147456      mixed7[0][0]                     
__________________________________________________________________________________________________
batch_normalization_166 (BatchN (None, 7, 7, 192)    576         conv2d_166[0][0]                 
__________________________________________________________________________________________________
activation_166 (Activation)     (None, 7, 7, 192)    0           batch_normalization_166[0][0]    
__________________________________________________________________________________________________
conv2d_167 (Conv2D)             (None, 7, 7, 192)    258048      activation_166[0][0]             
__________________________________________________________________________________________________
batch_normalization_167 (BatchN (None, 7, 7, 192)    576         conv2d_167[0][0]                 
__________________________________________________________________________________________________
activation_167 (Activation)     (None, 7, 7, 192)    0           batch_normalization_167[0][0]    
__________________________________________________________________________________________________
conv2d_164 (Conv2D)             (None, 7, 7, 192)    147456      mixed7[0][0]                     
__________________________________________________________________________________________________
conv2d_168 (Conv2D)             (None, 7, 7, 192)    258048      activation_167[0][0]             
__________________________________________________________________________________________________
batch_normalization_164 (BatchN (None, 7, 7, 192)    576         conv2d_164[0][0]                 
__________________________________________________________________________________________________
batch_normalization_168 (BatchN (None, 7, 7, 192)    576         conv2d_168[0][0]                 
__________________________________________________________________________________________________
activation_164 (Activation)     (None, 7, 7, 192)    0           batch_normalization_164[0][0]    
__________________________________________________________________________________________________
activation_168 (Activation)     (None, 7, 7, 192)    0           batch_normalization_168[0][0]    
__________________________________________________________________________________________________
conv2d_165 (Conv2D)             (None, 3, 3, 320)    552960      activation_164[0][0]             
__________________________________________________________________________________________________
conv2d_169 (Conv2D)             (None, 3, 3, 192)    331776      activation_168[0][0]             
__________________________________________________________________________________________________
batch_normalization_165 (BatchN (None, 3, 3, 320)    960         conv2d_165[0][0]                 
__________________________________________________________________________________________________
batch_normalization_169 (BatchN (None, 3, 3, 192)    576         conv2d_169[0][0]                 
__________________________________________________________________________________________________
activation_165 (Activation)     (None, 3, 3, 320)    0           batch_normalization_165[0][0]    
__________________________________________________________________________________________________
activation_169 (Activation)     (None, 3, 3, 192)    0           batch_normalization_169[0][0]    
__________________________________________________________________________________________________
max_pooling2d_7 (MaxPooling2D)  (None, 3, 3, 768)    0           mixed7[0][0]                     
__________________________________________________________________________________________________
mixed8 (Concatenate)            (None, 3, 3, 1280)   0           activation_165[0][0]             
                                                                 activation_169[0][0]             
                                                                 max_pooling2d_7[0][0]            
__________________________________________________________________________________________________
conv2d_174 (Conv2D)             (None, 3, 3, 448)    573440      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_174 (BatchN (None, 3, 3, 448)    1344        conv2d_174[0][0]                 
__________________________________________________________________________________________________
activation_174 (Activation)     (None, 3, 3, 448)    0           batch_normalization_174[0][0]    
__________________________________________________________________________________________________
conv2d_171 (Conv2D)             (None, 3, 3, 384)    491520      mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_175 (Conv2D)             (None, 3, 3, 384)    1548288     activation_174[0][0]             
__________________________________________________________________________________________________
batch_normalization_171 (BatchN (None, 3, 3, 384)    1152        conv2d_171[0][0]                 
__________________________________________________________________________________________________
batch_normalization_175 (BatchN (None, 3, 3, 384)    1152        conv2d_175[0][0]                 
__________________________________________________________________________________________________
activation_171 (Activation)     (None, 3, 3, 384)    0           batch_normalization_171[0][0]    
__________________________________________________________________________________________________
activation_175 (Activation)     (None, 3, 3, 384)    0           batch_normalization_175[0][0]    
__________________________________________________________________________________________________
conv2d_172 (Conv2D)             (None, 3, 3, 384)    442368      activation_171[0][0]             
__________________________________________________________________________________________________
conv2d_173 (Conv2D)             (None, 3, 3, 384)    442368      activation_171[0][0]             
__________________________________________________________________________________________________
conv2d_176 (Conv2D)             (None, 3, 3, 384)    442368      activation_175[0][0]             
__________________________________________________________________________________________________
conv2d_177 (Conv2D)             (None, 3, 3, 384)    442368      activation_175[0][0]             
__________________________________________________________________________________________________
average_pooling2d_16 (AveragePo (None, 3, 3, 1280)   0           mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_170 (Conv2D)             (None, 3, 3, 320)    409600      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_172 (BatchN (None, 3, 3, 384)    1152        conv2d_172[0][0]                 
__________________________________________________________________________________________________
batch_normalization_173 (BatchN (None, 3, 3, 384)    1152        conv2d_173[0][0]                 
__________________________________________________________________________________________________
batch_normalization_176 (BatchN (None, 3, 3, 384)    1152        conv2d_176[0][0]                 
__________________________________________________________________________________________________
batch_normalization_177 (BatchN (None, 3, 3, 384)    1152        conv2d_177[0][0]                 
__________________________________________________________________________________________________
conv2d_178 (Conv2D)             (None, 3, 3, 192)    245760      average_pooling2d_16[0][0]       
__________________________________________________________________________________________________
batch_normalization_170 (BatchN (None, 3, 3, 320)    960         conv2d_170[0][0]                 
__________________________________________________________________________________________________
activation_172 (Activation)     (None, 3, 3, 384)    0           batch_normalization_172[0][0]    
__________________________________________________________________________________________________
activation_173 (Activation)     (None, 3, 3, 384)    0           batch_normalization_173[0][0]    
__________________________________________________________________________________________________
activation_176 (Activation)     (None, 3, 3, 384)    0           batch_normalization_176[0][0]    
__________________________________________________________________________________________________
activation_177 (Activation)     (None, 3, 3, 384)    0           batch_normalization_177[0][0]    
__________________________________________________________________________________________________
batch_normalization_178 (BatchN (None, 3, 3, 192)    576         conv2d_178[0][0]                 
__________________________________________________________________________________________________
activation_170 (Activation)     (None, 3, 3, 320)    0           batch_normalization_170[0][0]    
__________________________________________________________________________________________________
mixed9_0 (Concatenate)          (None, 3, 3, 768)    0           activation_172[0][0]             
                                                                 activation_173[0][0]             
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 3, 3, 768)    0           activation_176[0][0]             
                                                                 activation_177[0][0]             
__________________________________________________________________________________________________
activation_178 (Activation)     (None, 3, 3, 192)    0           batch_normalization_178[0][0]    
__________________________________________________________________________________________________
mixed9 (Concatenate)            (None, 3, 3, 2048)   0           activation_170[0][0]             
                                                                 mixed9_0[0][0]                   
                                                                 concatenate_2[0][0]              
                                                                 activation_178[0][0]             
__________________________________________________________________________________________________
conv2d_183 (Conv2D)             (None, 3, 3, 448)    917504      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_183 (BatchN (None, 3, 3, 448)    1344        conv2d_183[0][0]                 
__________________________________________________________________________________________________
activation_183 (Activation)     (None, 3, 3, 448)    0           batch_normalization_183[0][0]    
__________________________________________________________________________________________________
conv2d_180 (Conv2D)             (None, 3, 3, 384)    786432      mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_184 (Conv2D)             (None, 3, 3, 384)    1548288     activation_183[0][0]             
__________________________________________________________________________________________________
batch_normalization_180 (BatchN (None, 3, 3, 384)    1152        conv2d_180[0][0]                 
__________________________________________________________________________________________________
batch_normalization_184 (BatchN (None, 3, 3, 384)    1152        conv2d_184[0][0]                 
__________________________________________________________________________________________________
activation_180 (Activation)     (None, 3, 3, 384)    0           batch_normalization_180[0][0]    
__________________________________________________________________________________________________
activation_184 (Activation)     (None, 3, 3, 384)    0           batch_normalization_184[0][0]    
__________________________________________________________________________________________________
conv2d_181 (Conv2D)             (None, 3, 3, 384)    442368      activation_180[0][0]             
__________________________________________________________________________________________________
conv2d_182 (Conv2D)             (None, 3, 3, 384)    442368      activation_180[0][0]             
__________________________________________________________________________________________________
conv2d_185 (Conv2D)             (None, 3, 3, 384)    442368      activation_184[0][0]             
__________________________________________________________________________________________________
conv2d_186 (Conv2D)             (None, 3, 3, 384)    442368      activation_184[0][0]             
__________________________________________________________________________________________________
average_pooling2d_17 (AveragePo (None, 3, 3, 2048)   0           mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_179 (Conv2D)             (None, 3, 3, 320)    655360      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_181 (BatchN (None, 3, 3, 384)    1152        conv2d_181[0][0]                 
__________________________________________________________________________________________________
batch_normalization_182 (BatchN (None, 3, 3, 384)    1152        conv2d_182[0][0]                 
__________________________________________________________________________________________________
batch_normalization_185 (BatchN (None, 3, 3, 384)    1152        conv2d_185[0][0]                 
__________________________________________________________________________________________________
batch_normalization_186 (BatchN (None, 3, 3, 384)    1152        conv2d_186[0][0]                 
__________________________________________________________________________________________________
conv2d_187 (Conv2D)             (None, 3, 3, 192)    393216      average_pooling2d_17[0][0]       
__________________________________________________________________________________________________
batch_normalization_179 (BatchN (None, 3, 3, 320)    960         conv2d_179[0][0]                 
__________________________________________________________________________________________________
activation_181 (Activation)     (None, 3, 3, 384)    0           batch_normalization_181[0][0]    
__________________________________________________________________________________________________
activation_182 (Activation)     (None, 3, 3, 384)    0           batch_normalization_182[0][0]    
__________________________________________________________________________________________________
activation_185 (Activation)     (None, 3, 3, 384)    0           batch_normalization_185[0][0]    
__________________________________________________________________________________________________
activation_186 (Activation)     (None, 3, 3, 384)    0           batch_normalization_186[0][0]    
__________________________________________________________________________________________________
batch_normalization_187 (BatchN (None, 3, 3, 192)    576         conv2d_187[0][0]                 
__________________________________________________________________________________________________
activation_179 (Activation)     (None, 3, 3, 320)    0           batch_normalization_179[0][0]    
__________________________________________________________________________________________________
mixed9_1 (Concatenate)          (None, 3, 3, 768)    0           activation_181[0][0]             
                                                                 activation_182[0][0]             
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 3, 3, 768)    0           activation_185[0][0]             
                                                                 activation_186[0][0]             
__________________________________________________________________________________________________
activation_187 (Activation)     (None, 3, 3, 192)    0           batch_normalization_187[0][0]    
__________________________________________________________________________________________________
mixed10 (Concatenate)           (None, 3, 3, 2048)   0           activation_179[0][0]             
                                                                 mixed9_1[0][0]                   
                                                                 concatenate_3[0][0]              
                                                                 activation_187[0][0]             
__________________________________________________________________________________________________
global_average_pooling2d_1 (Glo (None, 2048)         0           mixed10[0][0]                    
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 1024)         2098176     global_average_pooling2d_1[0][0] 
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 1024)         0           dense_2[0][0]                    
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 1)            1025        dropout_1[0][0]                  
==================================================================================================
Total params: 23,901,985
Trainable params: 2,099,201
Non-trainable params: 21,802,784
__________________________________________________________________________________________________
None

Compile the Model

model.compile(optimizer = RMSprop(lr=0.0001), 
              loss = 'binary_crossentropy', 
              metrics = ['acc'])

Train the Model

A Model Saver

checkpoint = tensorflow.keras.callbacks.ModelCheckpoint(
    str(MODELS/"inception_transfer.hdf5"), monitor="val_acc", verbose=1, 
    save_best_only=True)
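
Once it finishes, the best checkpoint can be brought back with load_model. This is a minimal sketch, not part of the original run, and it assumes the file above was actually written:

best_model = tensorflow.keras.models.load_model(
    str(MODELS/"inception_transfer.hdf5"))
print(best_model.summary())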

A Data Generator

This bundles up the steps to build the data generator.

class Data:
    """creates the data generators

    Args:
     path: path to the images
     validation_split: fraction that goes to the validation set
     batch_size: size for the batches in the epochs
    """
    def __init__(self, path: str, validation_split: float=0.2,
                 batch_size: int=20) -> None:
        self.path = path
        self.validation_split = validation_split
        self.batch_size = batch_size
        self._data_generator = None
        self._testing_data_generator = None
        self._training_generator = None
        self._validation_generator = None
        return

    @property
    def data_generator(self) -> ImageDataGenerator:
        """The data generator for training and validation"""
        if self._data_generator is None:
            self._data_generator = ImageDataGenerator(
                rescale=1/255,
                rotation_range=40,
                width_shift_range=0.2,
                height_shift_range=0.2,
                horizontal_flip=True,
                shear_range=0.2,
                zoom_range=0.2,
                fill_mode="nearest",
                validation_split=self.validation_split)
        return self._data_generator

    @property
    def training_generator(self):
        """The training data generator"""
        if self._training_generator is None:
            self._training_generator = (self.data_generator
                                        .flow_from_directory)(
                                            self.path,
                                            batch_size=self.batch_size,
                                            class_mode="binary",
                                            target_size=(150, 150),
                                            subset="training",
            )
        return self._training_generator

    @property
    def validation_generator(self):
        """the validation data generator"""
        if self._validation_generator is None:
            self._validation_generator = (self.data_generator
                                          .flow_from_directory)(
                                              self.path,
                                              batch_size=self.batch_size,
                                              class_mode="binary",
                                              target_size=(150, 150),
                                              subset="validation",
            )
        return self._validation_generator

    def __str__(self) -> str:
        return (f"(Data) - Path: {self.path}, "
                f"Validation Split: {self.validation_split},"
                f"Batch Size: {self.batch_size}")

A Model Builder

class Network:
    """The model to categorize the images

    Args:
     path: path to the training data
     epochs: number of epochs to train
     batch_size: size of the batches for each epoch
     convolution_layers: layers of cnn/max-pooling
     callbacks: things to stop the training
     set_steps: whether to set the training steps-per-epoch
    """
    def __init__(self, path: str, epochs: int=15,
                 batch_size: int=128, convolution_layers: int=3,
                 set_steps: bool=True,
                 callbacks: list=None) -> None:
        self.path = path
        self.epochs = epochs
        self.batch_size = batch_size
        self.convolution_layers = convolution_layers
        self.set_steps = set_steps
        self.callbacks = callbacks
        self._data = None
        self._model = None
        self.history = None
        return

    @property
    def data(self) -> Data:
        """The data generator builder"""
        if self._data is None:
            self._data = Data(self.path, batch_size=self.batch_size)
        return self._data

    @property
    def model(self) -> tensorflow.keras.models.Sequential:
        """The neural network"""
        if self._model is None:
            self._model = tensorflow.keras.models.Sequential([
                tensorflow.keras.layers.Conv2D(
                    32, (3,3), activation='relu', 
                    input_shape=(150, 150, 3)),
                tensorflow.keras.layers.MaxPooling2D(2,2)])
            self._model.add(
                tensorflow.keras.layers.Conv2D(
                    64, (3,3), activation='relu'))
            self._model.add(
                tensorflow.keras.layers.MaxPooling2D(2,2))

            for layer in range(self.convolution_layers - 2):
                self._model.add(tensorflow.keras.layers.Conv2D(
                    128, (3,3), activation='relu'))
                self._model.add(tensorflow.keras.layers.MaxPooling2D(2,2))
            for layer in [
                    tensorflow.keras.layers.Flatten(), 
                    tensorflow.keras.layers.Dense(512, activation='relu'), 
                    tensorflow.keras.layers.Dense(1, activation='sigmoid')]:
                self._model.add(layer)

            self._model.compile(optimizer=RMSprop(lr=0.001),
                                loss='binary_crossentropy',
                                metrics=['acc'])
        return self._model

    def summary(self) -> None:
        """Prints the model summary"""
        print(self.model.summary())
        return

    def train(self) -> None:
        """Trains the model"""
        callbacks = self.callbacks if self.callbacks else []
        arguments = dict(
            generator=self.data.training_generator,
            validation_data=self.data.validation_generator,
            epochs = self.epochs,
            callbacks = callbacks,
            verbose=2,
        )
        if self.set_steps:
            arguments["steps_per_epoch"] = int(
                self.data.training_generator.samples/self.batch_size)
            arguments["validation_steps"] = int(
                self.data.validation_generator.samples/self.batch_size)

        self.history = self.model.fit_generator(**arguments)
        return

    def __str__(self) -> str:
        return (f"(Network) - \nPath: {self.path}\n Epochs: {self.epochs}\n "
                f"Batch Size: {self.batch_size}\n Callbacks: {self.callbacks}\n"
                f"Data: {self.data}\n"
                f"Callbacks: {self.callbacks}")

Train It

network = Network(str(training_path), 
                  set_steps = True,
                  epochs = 10,
                  callbacks=[checkpoint],
                  batch_size=1)
network._model = model
with TIMER:
    network.train()
2019-08-03 19:28:04,102 graeae.timers.timer start: Started: 2019-08-03 19:28:04.102954
I0803 19:28:04.102986 139918777980736 timer.py:70] Started: 2019-08-03 19:28:04.102954
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/10

Epoch 00001: val_acc improved from -inf to 0.43660, saving model to /home/athena/models/dogs-vs-cats/inception_transfer.hdf5
20000/20000 - 615s - loss: 0.7032 - acc: 0.4977 - val_loss: 0.8069 - val_acc: 0.4366
Epoch 2/10

Epoch 00002: val_acc improved from 0.43660 to 0.43780, saving model to /home/athena/models/dogs-vs-cats/inception_transfer.hdf5
20000/20000 - 631s - loss: 0.6933 - acc: 0.5049 - val_loss: 0.7958 - val_acc: 0.4378
Epoch 3/10

Epoch 00003: val_acc did not improve from 0.43780
20000/20000 - 670s - loss: 0.6932 - acc: 0.4990 - val_loss: 0.8142 - val_acc: 0.4230
Epoch 4/10

Epoch 00004: val_acc improved from 0.43780 to 0.45020, saving model to /home/athena/models/dogs-vs-cats/inception_transfer.hdf5
20000/20000 - 666s - loss: 0.6932 - acc: 0.4990 - val_loss: 0.7856 - val_acc: 0.4502
Epoch 5/10

Epoch 00005: val_acc did not improve from 0.45020
20000/20000 - 636s - loss: 0.6932 - acc: 0.4983 - val_loss: 0.7982 - val_acc: 0.4312
Epoch 6/10

Epoch 00006: val_acc did not improve from 0.45020
20000/20000 - 618s - loss: 0.6932 - acc: 0.4999 - val_loss: 0.8018 - val_acc: 0.4326
Epoch 7/10

Epoch 00007: val_acc did not improve from 0.45020
20000/20000 - 614s - loss: 0.6932 - acc: 0.4999 - val_loss: 0.7870 - val_acc: 0.4484
Epoch 8/10

Epoch 00008: val_acc improved from 0.45020 to 0.45660, saving model to /home/athena/models/dogs-vs-cats/inception_transfer.hdf5
20000/20000 - 607s - loss: 0.6932 - acc: 0.4981 - val_loss: 0.7773 - val_acc: 0.4566
Epoch 9/10

Epoch 00009: val_acc did not improve from 0.45660
20000/20000 - 608s - loss: 0.6932 - acc: 0.4891 - val_loss: 0.7811 - val_acc: 0.4414
Epoch 10/10

Epoch 00010: val_acc did not improve from 0.45660
20000/20000 - 619s - loss: 0.6932 - acc: 0.5010 - val_loss: 0.7878 - val_acc: 0.4474
2019-08-03 21:12:49,142 graeae.timers.timer end: Ended: 2019-08-03 21:12:49.142478
I0803 21:12:49.142507 139918777980736 timer.py:77] Ended: 2019-08-03 21:12:49.142478
2019-08-03 21:12:49,143 graeae.timers.timer end: Elapsed: 1:44:45.039524
I0803 21:12:49.143225 139918777980736 timer.py:78] Elapsed: 1:44:45.039524

End

Cats and Dogs

Beginning

This extends the previous lessons to harder images. We started with monochrome handwritten digits, moved on to items of clothing, and then to artificially generated images of horses and humans. The problem with those images is that they were artificially uniform. In real life your subject won't be centered and isolated, and most pictures are messier, so this exercise begins with images that were taken by people in the real world and thus aren't as orderly as the previous ones.

The dataset we're going to use here is the Dogs vs. Cats set hosted on Kaggle. It is a subset of a larger dataset compiled to test whether computers could pass Animal Species Image Recognition for Restricting Access (ASIRRA), an alternative form of CAPTCHA whose creators assumed that computers wouldn't be able to pass its test of picking out only the cats in a set of twelve photographs of cats and dogs.

Imports

Python

from csv import DictWriter
from datetime import datetime, timedelta
from functools import partial
from pathlib import Path
import random

PyPi

from holoviews.operation.datashader import datashade
from tabulate import tabulate
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import (ImageDataGenerator,
                                                  img_to_array, 
                                                  load_img)
import holoviews
import hvplot.pandas
import keras
import matplotlib.pyplot as pyplot
import numpy
import pandas
import tensorflow

Graeae

from graeae import SubPathLoader, EmbedHoloviews, Timer

Set Up

Plotting

Embed = partial(EmbedHoloviews, 
                folder_path="../../files/posts/keras/cats-and-dogs/")
holoviews.extension("bokeh")

The Timer

TIMER = Timer()

The Datasets

environment = SubPathLoader("DATASETS")
base_path = Path(environment["DOGS_VS_CATS"]).expanduser()
for item in base_path.iterdir():
    print(item)
WARNING: Logging before flag parsing goes to stderr.
I0727 20:33:31.896675 140643102242624 environment.py:35] Environment Path: /home/athena/.env
I0727 20:33:31.958876 140643102242624 environment.py:90] Environment Path: /home/athena/.config/datasets/env
/home/athena/data/datasets/images/dogs-vs-cats/train
/home/athena/data/datasets/images/dogs-vs-cats/exercise
/home/athena/data/datasets/images/dogs-vs-cats/test1

This is the dataset downloaded from Kaggle. Since it's a competition, the test1 directory has the test images without labels (although, as noted on the site, you could inspect them all and label them by hand) and the training set has the labeled images. Since the convention for the neural-network libraries is to ignore the file names and use the folder names as the labels instead, I made two sub-directories - one for dogs and one for cats - and moved the images into them.
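
Something like this sketch would do that sorting automatically. It's hypothetical (I moved the files by hand) and assumes the Kaggle file names start with the class, e.g. cat.0.jpg and dog.0.jpg:

def sort_into_classes(path: Path) -> None:
    """Move 'cat.*' and 'dog.*' images into cats and dogs sub-directories"""
    for species in ("cats", "dogs"):
        (path/species).mkdir(exist_ok=True)
    for image in path.glob("*.jpg"):
        species = "cats" if image.name.startswith("cat") else "dogs"
        image.rename(path/species/image.name)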

training_path = base_path/"train"
testing_path = base_path/"test1"
for item in training_path.iterdir():
    print(item)
/home/athena/data/datasets/images/dogs-vs-cats/train/dogs
/home/athena/data/datasets/images/dogs-vs-cats/train/cats

In the example exercise the datasets were further subdivided into training and validation sets, but Keras has since added validation-set splitting to its image generator, so that shouldn't be necessary.
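
In other words, you build one ImageDataGenerator with validation_split set and call flow_from_directory twice with different subset arguments. A minimal sketch:

generator = ImageDataGenerator(rescale=1/255, validation_split=0.2)
training = generator.flow_from_directory(
    str(training_path), subset="training",
    class_mode="binary", target_size=(150, 150))
validation = generator.flow_from_directory(
    str(training_path), subset="validation",
    class_mode="binary", target_size=(150, 150))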

The Model Storage

This is just the path to the folder to store models.

MODELS = Path("~/models/dogs-vs-cats").expanduser()

Middle

Helpers

def best_validation(data: pandas.DataFrame) -> tuple:
    """Gets and prints the rows with the best validation performance"""
    best_loss = data[data.validation_loss==data.validation_loss.min()]
    best_accuracy = data[data.validation_accuracy==data.validation_accuracy.max()]
    print(f"Best Accuracy: {best_accuracy.validation_accuracy.iloc[0]:.2f} "
          f"(loss={best_accuracy.validation_loss.iloc[0]:.2f})"
          f" Epoch: {best_accuracy.index[0]}")

    print(f"Best Loss: {best_loss.validation_loss.iloc[0]:.2f} "
          f"(accuracy={best_loss.validation_accuracy.iloc[0]:.2f}, "
          f"Epoch: {best_loss.index[0]})")
    return best_accuracy, best_loss
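
A quick check with a toy frame (made-up numbers, just to show what it prints):

frame = pandas.DataFrame(dict(
    validation_loss=[0.60, 0.45, 0.52],
    validation_accuracy=[0.70, 0.81, 0.78]))
best_accuracy, best_loss = best_validation(frame)
Best Accuracy: 0.81 (loss=0.45) Epoch: 1
Best Loss: 0.45 (accuracy=0.81, Epoch: 1)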

Looking at the Cats and Dogs

How much data do we have?

cat_images = list((training_path/'cats').iterdir())
dog_images = list((training_path/'dogs').iterdir())
test_images = list(testing_path.iterdir())
print(f"Training Cat Images: {len(cat_images):,}")
print(f"Training Dog Images: {len(dog_images):,}")
print(f"Testing Images: {len(test_images):,}")
Training Cat Images: 12,500
Training Dog Images: 12,500
Testing Images: 12,500

Note that we haven't separated out the validation set yet. If we do an 80-20 split then we will have:

training_count = len(cat_images)
print(f"Training images per species: {int(training_count * .8):,}")
print(f"Validation images per species: {int(training_count * .2):,}")
Training images per species: 10,000
Validation images per species: 2,500

So 20,000 training images in total.

Looking at Some Images

height = width = 250
count = 4
columns = 4

indices = [random.randrange(len(cat_images)) for index in range(count)]
cat_plots = []
dog_plots = []
for index in indices:
    cat_plots.append(holoviews.RGB.load_image(
        str(cat_images[index])).opts(
            height=height,
            width=width))
    dog_plots.append(holoviews.RGB.load_image(
        str(dog_images[index])).opts(
            height=height,
            width=width))

plot = holoviews.Layout(cat_plots + dog_plots).cols(columns).opts(
    title="Dogs vs Cats"
)
Embed(plot=plot, file_name="dogs_and_cats", height_in_pixels=600)()

[Figure missing: a four-column grid of four random cat images and four random dog images, titled "Dogs vs Cats".]

So the quality and setting vary, but they appear to be the mug-shot type of photograph that most people tend to take. Some of them have borders, and the images aren't uniformly sized.

The Data Preprocessor

This will load and prepare the image batches for training and validation.

A Data Generator

This bundles up the steps to build the data generator.

class Data:
    """creates the data generators

    Args:
     path: path to the images
     validation_split: fraction that goes to the validation set
     batch_size: size for the batches in the epochs
    """
    def __init__(self, path: str, validation_split: float=0.2,
                 batch_size: int=20) -> None:
        self.path = path
        self.validation_split = validation_split
        self.batch_size = batch_size
        self._data_generator = None
        self._testing_data_generator = None
        self._training_generator = None
        self._validation_generator = None
        return

    @property
    def data_generator(self) -> ImageDataGenerator:
        """The data generator for training and validation"""
        if self._data_generator is None:
            self._data_generator = ImageDataGenerator(
                rescale=1/255,
                rotation_range=40,
                width_shift_range=0.2,
                height_shift_range=0.2,
                horizontal_flip=True,
                shear_range=0.2,
                zoom_range=0.2,
                fill_mode="nearest",
                validation_split=self.validation_split)
        return self._data_generator

    @property
    def training_generator(self):
        """The training data generator"""
        if self._training_generator is None:
            self._training_generator = (self.data_generator
                                        .flow_from_directory)(
                                            self.path,
                                            batch_size=self.batch_size,
                                            class_mode="binary",
                                            target_size=(150, 150),
                                            subset="training",
            )
        return self._training_generator

    @property
    def validation_generator(self):
        """the validation data generator"""
        if self._validation_generator is None:
            self._validation_generator = (self.data_generator
                                          .flow_from_directory)(
                                              self.path,
                                              batch_size=self.batch_size,
                                              class_mode="binary",
                                              target_size=(150, 150),
                                              subset="validation",
            )
        return self._validation_generator

    def __str__(self) -> str:
        return (f"(Data) - Path: {self.path}, "
                f"Validation Split: {self.validation_split},"
                f"Batch Size: {self.batch_size}")

Some Callbacks To Stop Training

The Good Enough Callback

If our model is doing good enough on the validation set then let's stop.

class Stop(tensorflow.keras.callbacks.Callback):
    """Something to stop if we are good enough

    Args:
     minimum_accuracy: validation accuracy needed to quit
     maximum_loss: validation loss needed to quit
     check_after_batch: if True, try to interrupt after each batch, end of epoch otherwise
     call_on_stopping: function or method to call when training is interrupted
    """
    def __init__(self, minimum_accuracy: float=0.94, maximum_loss: float=0.24,
                 check_after_batch: bool=False, call_on_stopping: callable=None) -> None:
        self.minimum_accuracy = minimum_accuracy
        self.maximum_loss = maximum_loss
        self.call_on_stopping = call_on_stopping
        self.check_after_batch = check_after_batch
        if check_after_batch:
            self.on_batch_end = self.on_end_handler
        else:
            self.on_epoch_end = self.on_end_handler
        return

    def on_end_handler(self, count, logs={}):
        if ("val_acc" in logs
            and logs.get("val_acc") >= self.minimum_accuracy 
            and logs.get("val_loss") <= self.maximum_loss):
            print(f"Stopping point reached at {count}")
            self.model.stop_training = True
            if self.call_on_stopping is not None:
                self.call_on_stopping(self.model)
        return

    def __str__(self) -> str:
        return (f"(Stop) - Minimum Accuracy: {self.minimum_accuracy}, "
                f"Maximum Loss: {self.maximum_loss}, "
                "By Batch: {self.check_after_batch}, "
                f"Call on Stop: {self.call_on_stopping}")

A CSV Writer

class CSVLog(tensorflow.keras.callbacks.Callback):
    """A callback to store the performance

    Args:
     path: path to the output file
    """
    def __init__(self, path: Path) -> None:
        self.path = path
        self._file_pointer = None
        self._writer = None
        return

    @property
    def file_pointer(self):
        """The write file"""
        if self._file_pointer is None:
            self._file_pointer = self.path.open("w")
        return self._file_pointer

    @property
    def writer(self) -> DictWriter:
        """a file writer"""
        if self._writer is None:
            self._writer = DictWriter(
                self.file_pointer,
                ["loss","acc","val_loss","val_acc"])
            self._writer.writeheader()
        return self._writer

    def on_epoch_end(self, count, logs={}) -> None:
        """method to call at end of each epoch

       writes to the csv file

       Args:
        count: number of epochs run so far
        logs: dict with the epoch values
       """
        if ("val_acc" in logs):
            self.writer.writerow(logs)
        return
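
It gets wired in the same way as the other callbacks; for example (a sketch, with a hypothetical output path):

csv_log = CSVLog(Path("~/logs/dogs-vs-cats-performance.csv").expanduser())
# then pass callbacks=[csv_log, ...] when building the Network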

A Timed Stop Callback

This timer callback comes from Make best use of a Kernel's limited uptime (Keras), a Kaggle kernel from the Digit Recognizer competition.

class TimedStop(tensorflow.keras.callbacks.Callback):
    """A callback to stop if we run out of time

    Args:
     minutes_allowed: the most time (in minutes) to allow for training the model
     check_after_batch: if True, try to stop after each batch, end of epoch otherwise
     call_on_stopping: function or method to call when training is interrupted
    """
    def __init__(self, minutes_allowed: int=350, check_after_batch: bool=False,
                 call_on_stopping: callable=None) -> None:
        self.elapsed_allowed = timedelta(minutes = minutes_allowed)
        self.call_on_stopping = call_on_stopping
        self.check_after_batch = check_after_batch

        # These are part of the Callback class
        if check_after_batch:
            self.on_batch_end = self.on_end_handler
        else:
            self.on_epoch_end = self.on_end_handler
        return

    def on_train_begin(self, logs: dict):
        """Called by keras at the start of training"""
        self.start_time = datetime.now()
        self.longest_elapsed = timedelta()
        self.previous_end_time = self.start_time
        self.total_elapsed = timedelta()
        return

    def on_end_handler(self, count: int, logs: dict):
        """called when an epoch or batch ends (depending on ```check_after_batch```)

       Args:
        count: the current batch or epoch
        logs: the log for the batch or epoch
       """
        now = datetime.now()
        self.total_elapsed = now - self.start_time
        this_elapsed = now - self.previous_end_time
        self.previous_end_time = now

        if this_elapsed > self.longest_elapsed:
            self.longest_elapsed = this_elapsed

        if (self.elapsed_allowed - self.total_elapsed) < self.longest_elapsed:
            # estimated time remaining will exceed our time-out
            self.model.stop_training = True
            print(f"TimedStop: Training out of time (Elapsed = {self.total_elapsed})")
            print(f"TimedStop: Longest Epoch = {self.longest_elapsed}")
            if self.call_on_stopping is not None:
                self.call_on_stopping(self.model)
        return

    def __str__(self) -> str:
        return (f"(TimedStop) - Minutes Allowed: {self.elapsed_allowed}, "
                f"Check After Each Batch: {self.check_after_batch}, "
                f"Call On Stopping: {self.call_on_stopping}")

Saving the Model

This gets used by the callbacks to save the model.

  • Just the Weights

    This saves only the weights, so you can re-use the model, but it doesn't save the optimizer state, meaning you can't pick up training where you left off after re-loading it (so probably not a good idea for what I'm doing, which is just exploring).

    file_name = "model_weights"
    
    def save_model_weights(model, name=file_name):
        """Save the models weights to disk
    
        Args:
         model: model whose weights to save
         name: base name for the file
        """
        out_name = f"{name}.h5"
        print(f"Saving the Model Weights to {out_name}")
        model.save_weights(f"{out_name}")
        return
    

    This stores everything so you can re-train it after loading the model (there's a reloading sketch below).

    def save_model(model: tensorflow.keras.models.Sequential, path: Path) -> None:
        """Save the model with the training state
    
        This allows you to load it and continue training
    
        Args:
         model: the model to save
         path: where to store the model
        """
        model.save(str(path))
        print(f"Saving the model to {path}")
        return
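
    Reloading a model saved this way brings the optimizer state back as well, so training can resume. A minimal sketch, with a hypothetical file name:

    restored = tensorflow.keras.models.load_model(
        str(MODELS/"cnn_full_model.h5"))  # hypothetical saved-model path
    print(restored.summary())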
    

Building the Model

This is going to be a model with three convolutional layers feeding a dense layer with 512 neurons. We're going to resize the images to 150 x 150 (with three color channels), so the first layer has to be set to expect that shape.

class Network:
    """The model to categorize the images

    Args:
     path: path to the training data
     epochs: number of epochs to train
     batch_size: size of the batches for each epoch
     convolution_layers: layers of cnn/max-pooling
     callbacks: things to stop the training
     set_steps: whether to set the training steps-per-epoch
    """
    def __init__(self, path: str, epochs: int=15,
                 batch_size: int=128, convolution_layers: int=3,
                 set_steps: bool=True,
                 callbacks: list=None) -> None:
        self.path = path
        self.epochs = epochs
        self.batch_size = batch_size
        self.convolution_layers = convolution_layers
        self.set_steps = set_steps
        self.callbacks = callbacks
        self._data = None
        self._model = None
        self.history = None
        return

    @property
    def data(self) -> Data:
        """The data generator builder"""
        if self._data is None:
            self._data = Data(self.path, batch_size=self.batch_size)
        return self._data

    @property
    def model(self) -> tensorflow.keras.models.Sequential:
        """The neural network"""
        if self._model is None:
            self._model = tensorflow.keras.models.Sequential([
                tensorflow.keras.layers.Conv2D(
                    32, (3,3), activation='relu', 
                    input_shape=(150, 150, 3)),
                tensorflow.keras.layers.MaxPooling2D(2,2)])
            self._model.add(
                tensorflow.keras.layers.Conv2D(
                    64, (3,3), activation='relu'))
            self._model.add(
                tensorflow.keras.layers.MaxPooling2D(2,2))

            for layer in range(self.convolution_layers - 2):
                self._model.add(tensorflow.keras.layers.Conv2D(
                    128, (3,3), activation='relu'))
                self._model.add(tensorflow.keras.layers.MaxPooling2D(2,2))
            for layer in [
                    tensorflow.keras.layers.Flatten(), 
                    tensorflow.keras.layers.Dense(512, activation='relu'), 
                    tensorflow.keras.layers.Dense(1, activation='sigmoid')]:
                self._model.add(layer)

            self._model.compile(optimizer=RMSprop(lr=0.001),
                                loss='binary_crossentropy',
                                metrics=['acc'])
        return self._model

    def summary(self) -> None:
        """Prints the model summary"""
        print(self.model.summary())
        return

    def train(self) -> None:
        """Trains the model"""
        callbacks = self.callbacks if self.callbacks else []
        arguments = dict(
            generator=self.data.training_generator,
            validation_data=self.data.validation_generator,
            epochs = self.epochs,
            callbacks = callbacks,
            verbose=2,
        )
        if self.set_steps:
            arguments["steps_per_epoch"] = int(
                self.data.training_generator.samples/self.batch_size)
            arguments["validation_steps"] = int(
                self.data.validation_generator.samples/self.batch_size)

        self.history = self.model.fit_generator(**arguments)
        return

    def __str__(self) -> str:
        return (f"(Network) - \nPath: {self.path}\n Epochs: {self.epochs}\n "
                f"Batch Size: {self.batch_size}\n Callbacks: {self.callbacks}\n"
                f"Data: {self.data}\n"
                f"Callbacks: {self.callbacks}")

Warning: I adapted the class definitions after running the fitting, so don't run these cells again or the output will change.

network = Network(str(training_path))
network.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_3 (Conv2D)            (None, 148, 148, 16)      448       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 74, 74, 16)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 72, 72, 32)        4640      
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 36, 36, 32)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 34, 34, 64)        18496     
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 17, 17, 64)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 18496)             0         
_________________________________________________________________
dense_2 (Dense)              (None, 512)               9470464   
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 513       
=================================================================
Total params: 9,494,561
Trainable params: 9,494,561
Non-trainable params: 0
_________________________________________________________________
None

Training the Model

network.train()
Found 20000 images belonging to 2 classes.
Found 25000 images belonging to 2 classes.
Epoch 1/15
20000/20000 - 3149s - loss: 0.2677 - acc: 0.8957 - val_loss: 0.2148 - val_acc: 0.9392
Epoch 2/15
20000/20000 - 3136s - loss: 0.2614 - acc: 0.9259 - val_loss: 0.2645 - val_acc: 0.9149
Epoch 3/15
20000/20000 - 3122s - loss: 0.2803 - acc: 0.9262 - val_loss: 0.2668 - val_acc: 0.9334
Epoch 4/15
20000/20000 - 3136s - loss: 0.2785 - acc: 0.9325 - val_loss: 0.3490 - val_acc: 0.9104
Epoch 5/15
20000/20000 - 3211s - loss: 0.2813 - acc: 0.9344 - val_loss: 0.2397 - val_acc: 0.9447
Epoch 6/15
20000/20000 - 3107s - loss: 0.3108 - acc: 0.9273 - val_loss: 0.3096 - val_acc: 0.9277
Epoch 7/15
20000/20000 - 3116s - loss: 0.3351 - acc: 0.9241 - val_loss: 0.2756 - val_acc: 0.9365
Epoch 8/15
20000/20000 - 3112s - loss: 0.3723 - acc: 0.9201 - val_loss: 0.5705 - val_acc: 0.9031
Epoch 9/15
20000/20000 - 3175s - loss: 0.3909 - acc: 0.9169 - val_loss: 0.3232 - val_acc: 0.8964
Epoch 10/15
20000/20000 - 3176s - loss: 0.3838 - acc: 0.9179 - val_loss: 0.4460 - val_acc: 0.8797
Epoch 11/15
20000/20000 - 3180s - loss: 0.4013 - acc: 0.9117 - val_loss: 1.5652 - val_acc: 0.8393
Epoch 12/15
20000/20000 - 3176s - loss: 0.3854 - acc: 0.9070 - val_loss: 0.2922 - val_acc: 0.9028
Epoch 13/15
20000/20000 - 3174s - loss: 0.3736 - acc: 0.9087 - val_loss: 0.3601 - val_acc: 0.8856
Epoch 14/15
20000/20000 - 3184s - loss: 0.4744 - acc: 0.9065 - val_loss: 0.5963 - val_acc: 0.7796
Epoch 15/15
20000/20000 - 3149s - loss: 0.4771 - acc: 0.8997 - val_loss: 0.3037 - val_acc: 0.8814

It looks like the model is starting to overfit; maybe we need a callback.

Take Two

I noticed that I accidentally used a batch-size of 20, which makes the model converge in fewer epochs but makes each epoch take longer, so besides adding the callback I'm going to change the batch-size.

callback = Stop()
network = Network(str(training_path), callbacks=[callback])
network.data.batch_size = 512
network.data._training_generator = None
network.data._validation_generator = None
TIMER.message = "Finished training the cats and dogs network"
with TIMER:
    network.train()
2019-07-11 04:39:19,803 graeae.timers.timer start: Started: 2019-07-11 04:39:19.803252
I0711 04:39:19.803283 140573992724288 timer.py:70] Started: 2019-07-11 04:39:19.803252
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/15
20000/20000 - 1765s - loss: 0.3099 - acc: 0.8796 - val_loss: 0.9553 - val_acc: 0.6900
Epoch 2/15
20000/20000 - 1764s - loss: 0.3434 - acc: 0.8976 - val_loss: 0.7495 - val_acc: 0.7854
Epoch 3/15
20000/20000 - 1765s - loss: 0.3718 - acc: 0.8923 - val_loss: 1.2745 - val_acc: 0.8064
Epoch 4/15
20000/20000 - 1774s - loss: 0.3892 - acc: 0.8904 - val_loss: 0.4991 - val_acc: 0.7906
Epoch 5/15
20000/20000 - 1775s - loss: 0.4290 - acc: 0.8786 - val_loss: 0.5698 - val_acc: 0.8338
Epoch 6/15
20000/20000 - 1774s - loss: 0.4889 - acc: 0.8847 - val_loss: 0.5719 - val_acc: 0.8078
Epoch 7/15
20000/20000 - 1774s - loss: 0.3992 - acc: 0.8883 - val_loss: 0.4647 - val_acc: 0.8258
Epoch 8/15
20000/20000 - 1774s - loss: 0.4458 - acc: 0.8853 - val_loss: 0.6354 - val_acc: 0.8098
Epoch 9/15
20000/20000 - 1774s - loss: 0.3810 - acc: 0.8856 - val_loss: 0.5611 - val_acc: 0.8198
Epoch 10/15
20000/20000 - 1773s - loss: 0.3894 - acc: 0.8886 - val_loss: 0.4861 - val_acc: 0.8476
Epoch 11/15
20000/20000 - 1771s - loss: 0.4517 - acc: 0.8972 - val_loss: 0.4623 - val_acc: 0.8420
Epoch 12/15
20000/20000 - 1755s - loss: 0.3626 - acc: 0.9030 - val_loss: 0.5481 - val_acc: 0.8310
Epoch 13/15
20000/20000 - 1753s - loss: 0.3574 - acc: 0.9054 - val_loss: 0.5138 - val_acc: 0.8430
Epoch 14/15
20000/20000 - 1753s - loss: 0.3658 - acc: 0.9132 - val_loss: 1.0924 - val_acc: 0.8164
Epoch 15/15
20000/20000 - 1753s - loss: 0.3331 - acc: 0.9145 - val_loss: 0.5148 - val_acc: 0.8312
2019-07-11 12:00:57,205 graeae.timers.timer end: Ended: 2019-07-11 12:00:57.205007
I0711 12:00:57.205034 140573992724288 timer.py:77] Ended: 2019-07-11 12:00:57.205007
2019-07-11 12:00:57,205 graeae.timers.timer end: Elapsed: 7:21:37.401755
I0711 12:00:57.205761 140573992724288 timer.py:78] Elapsed: 7:21:37.401755

So the loss and accuracy never matched what we got with the tiny batch size; on the other hand, it took less than half the time. Maybe something in between will work.

Take Three

  • Train Again

    I'm going to use my TimedStop class's default of 350 minutes, which is based on the time-limit for Kaggle kernels. If the model can reach the accuracy we want before then, it should be trainable in one kernel session (I think; I haven't actually tried it yet).

    path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_128").expanduser()
    saver = partial(save_model_weights, name=str(path))
    good_enough = Stop(call_on_stopping=saver)
    out_of_time = TimedStop(call_on_stopping=saver)
    
    network = Network(str(training_path), 
                      callbacks=[good_enough, out_of_time],
                      batch_size=128)
    with TIMER:
        network.train()
    
    2019-07-13 12:06:40,651 graeae.timers.timer start: Started: 2019-07-13 12:06:40.651269
    I0713 12:06:40.651483 140400353847104 timer.py:70] Started: 2019-07-13 12:06:40.651269
    Found 20000 images belonging to 2 classes.
    Found 5000 images belonging to 2 classes.
    Epoch 1/15
    TimedStop: Training out of time (Elapsed = 3:19:17.730138)
    20000/20000 - 11911s - loss: 0.0634 - acc: 0.9821 - val_loss: 4.4497 - val_acc: 0.8038
    2019-07-13 15:25:59,863 graeae.timers.timer end: Ended: 2019-07-13 15:25:59.863265
    I0713 15:25:59.863302 140400353847104 timer.py:77] Ended: 2019-07-13 15:25:59.863265
    2019-07-13 15:25:59,864 graeae.timers.timer end: Elapsed: 3:19:19.211996
    I0713 15:25:59.864300 140400353847104 timer.py:78] Elapsed: 3:19:19.211996
    

    So, we have several problems here:

    • It quit after just over 3 hours, not just under 6
    • It only did one epoch
    • It didn't come close to reaching the accuracy I wanted

    The early timeout is probably because the first epoch took over 3 hours, so another epoch probably wouldn't have finished in the allotted time. Let's double-check my times.

    print(f"Batch 20 First Epoch: {3136/3600:0.2f} Hours")
    print(f"Batch 512 First Epoch: {1765/3600: 0.2f} Hours")
    print(f"Batch 128 First Epoch: {11911/3600: 0.2f} Hours")
    
    Batch 20 First Epoch: 0.87 Hours
    Batch 512 First Epoch:  0.49 Hours
    Batch 128 First Epoch:  3.31 Hours
    

    So something went really wrong with this last run (either my timer is wrong or the reported epoch time is wrong). What is it?

Take Four

Try one that times out early.

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_256").expanduser()
saver = partial(save_model_weights, name=str(path))
good_enough = Stop(call_on_stopping=saver)
out_of_time = TimedStop(1, call_on_stopping=saver, check_after_batch=True)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  batch_size=256)
print(str(network))
with TIMER:
    network.train()
2019-07-13 16:56:05,428 graeae.timers.timer start: Started: 2019-07-13 16:56:05.428254
I0713 16:56:05.428468 140249436239680 timer.py:70] Started: 2019-07-13 16:56:05.428254
W0713 16:56:05.429713 140249436239680 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0713 16:56:05.520384 140249436239680 deprecation.py:323] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
(Network) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Epochs: 15, Batch Size: 256, Callbacks: [<__main__.Stop object at 0x7f8dcc0b8550>, <__main__.TimedStop object at 0x7f8dcc0b8048>],Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 256
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/15
2019-07-13 16:56:36,641 graeae.timers.timer end: Ended: 2019-07-13 16:56:36.641438
I0713 16:56:36.641472 140249436239680 timer.py:77] Ended: 2019-07-13 16:56:36.641438
2019-07-13 16:56:36,643 graeae.timers.timer end: Elapsed: 0:00:31.213184
I0713 16:56:36.643917 140249436239680 timer.py:78] Elapsed: 0:00:31.213184
TimedStop: Training out of time (Elapsed = 0:00:30.441487)
Saving the Model Weights to /home/athena/models/dogs-vs-cats/cnn_weights_batch_size_256.h5
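
For the record, the saver those callbacks call is just a thin wrapper over Keras' weight-saving (a sketch; the signature and the .h5 suffix handling are assumptions based on the partial and the log line above):

def save_model_weights(model, name):
    """Saves the model's weights (hypothetical sketch of the helper)."""
    output = f"{name}.h5"
    print(f"Saving the Model Weights to {output}")
    model.save_weights(output)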

Take Five

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_256").expanduser()
saver = partial(save_model_weights, name=str(path))
good_enough = Stop(on_interrupt=saver)
out_of_time = TimedStop(on_interrupt=saver, by_batch=True)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  batch_size=256)
print(str(network))
with TIMER:
    network.train()
2019-07-13 16:59:47,492 graeae.timers.timer start: Started: 2019-07-13 16:59:47.491986
I0713 16:59:47.492009 140249436239680 timer.py:70] Started: 2019-07-13 16:59:47.491986
(Network) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Epochs: 15, Batch Size: 256, Callbacks: [<__main__.Stop object at 0x7f8db84fa940>, <__main__.TimedStop object at 0x7f8db84fa908>],Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 256
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/15
2019-07-13 19:54:49,091 graeae.timers.timer end: Ended: 2019-07-13 19:54:49.091389
I0713 19:54:49.091431 140249436239680 timer.py:77] Ended: 2019-07-13 19:54:49.091389
2019-07-13 19:54:49,093 graeae.timers.timer end: Elapsed: 2:55:01.599403
I0713 19:54:49.093391 140249436239680 timer.py:78] Elapsed: 2:55:01.599403
TimedStop: Training out of time (Elapsed = 2:55:00.848875)
Saving the Model Weights to /home/athena/models/dogs-vs-cats/cnn_weights_batch_size_256.h5

Take Six

According to what I've read, a smaller batch size is supposed to be faster and a larger batch size slower, but that isn't what I saw with the 512 batch size. I don't really see any pattern, though. Let's try this again.

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_32").expanduser()
saver = partial(save_model_weights, name=str(path))
good_enough = Stop(call_on_stopping=saver, check_after_batch=True)
out_of_time = TimedStop(call_on_stopping=saver)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  convolution_layers=5,
                  batch_size=32)
print(str(network))
with TIMER:
    network.train()
2019-07-14 12:32:06,967 graeae.timers.timer start: Started: 2019-07-14 12:32:06.966983
I0714 12:32:06.967006 140261637994304 timer.py:70] Started: 2019-07-14 12:32:06.966983
(Network) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Epochs: 15, Batch Size: 32, Callbacks: [<__main__.Stop object at 0x7f902c28e860>, <__main__.TimedStop object at 0x7f902c28e278>],Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 32,Callbacks: [<__main__.Stop object at 0x7f902c28e860>, <__main__.TimedStop object at 0x7f902c28e278>]
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/15
625/625 - 87s - loss: 0.6216 - acc: 0.6469 - val_loss: 0.5601 - val_acc: 0.6947
Epoch 2/15
625/625 - 85s - loss: 0.5002 - acc: 0.7568 - val_loss: 0.4532 - val_acc: 0.7833
Epoch 3/15
625/625 - 85s - loss: 0.4236 - acc: 0.8059 - val_loss: 0.3850 - val_acc: 0.8269
Epoch 4/15
625/625 - 85s - loss: 0.3618 - acc: 0.8400 - val_loss: 0.3532 - val_acc: 0.8397
Epoch 5/15
625/625 - 85s - loss: 0.3117 - acc: 0.8667 - val_loss: 0.4227 - val_acc: 0.8177
Epoch 6/15
625/625 - 85s - loss: 0.2683 - acc: 0.8845 - val_loss: 0.3806 - val_acc: 0.8438
Epoch 7/15
625/625 - 85s - loss: 0.2356 - acc: 0.9021 - val_loss: 0.3288 - val_acc: 0.8682
Epoch 8/15
625/625 - 85s - loss: 0.2065 - acc: 0.9143 - val_loss: 0.3912 - val_acc: 0.8678
Epoch 9/15
625/625 - 85s - loss: 0.1854 - acc: 0.9255 - val_loss: 0.3263 - val_acc: 0.8812
Epoch 10/15
625/625 - 85s - loss: 0.1629 - acc: 0.9359 - val_loss: 0.3515 - val_acc: 0.8768
Epoch 11/15
625/625 - 85s - loss: 0.1475 - acc: 0.9408 - val_loss: 0.4157 - val_acc: 0.8808
Epoch 12/15
625/625 - 85s - loss: 0.1342 - acc: 0.9486 - val_loss: 0.3752 - val_acc: 0.8740
Epoch 13/15
625/625 - 85s - loss: 0.1276 - acc: 0.9536 - val_loss: 0.4363 - val_acc: 0.8744
Epoch 14/15
625/625 - 85s - loss: 0.1233 - acc: 0.9542 - val_loss: 0.3375 - val_acc: 0.8714
Epoch 15/15
625/625 - 85s - loss: 0.1149 - acc: 0.9568 - val_loss: 0.3805 - val_acc: 0.8788
2019-07-14 12:53:20,998 graeae.timers.timer end: Ended: 2019-07-14 12:53:20.998164
I0714 12:53:20.998192 140261637994304 timer.py:77] Ended: 2019-07-14 12:53:20.998164
2019-07-14 12:53:20,999 graeae.timers.timer end: Elapsed: 0:21:14.031181
I0714 12:53:20.999267 140261637994304 timer.py:78] Elapsed: 0:21:14.031181

So I think I fixed the time problem; let's try upping the batch size.

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_128").expanduser()
saver = partial(save_model_weights, name=str(path))
good_enough = Stop(call_on_stopping=saver, check_after_batch=True)
out_of_time = TimedStop(call_on_stopping=saver)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  convolution_layers=5,
                  batch_size=128)
print(str(network))
with TIMER:
    network.train()
2019-07-14 13:29:11,456 graeae.timers.timer start: Started: 2019-07-14 13:29:11.456042
I0714 13:29:11.456063 140261637994304 timer.py:70] Started: 2019-07-14 13:29:11.456042
(Network) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Epochs: 15, Batch Size: 128, Callbacks: [<__main__.Stop object at 0x7f8fd911c9b0>, <__main__.TimedStop object at 0x7f902c09da20>],Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 128,Callbacks: [<__main__.Stop object at 0x7f8fd911c9b0>, <__main__.TimedStop object at 0x7f902c09da20>]
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/15
156/156 - 89s - loss: 0.6749 - acc: 0.5787 - val_loss: 0.6582 - val_acc: 0.6058
Epoch 2/15
156/156 - 89s - loss: 0.6266 - acc: 0.6545 - val_loss: 0.5754 - val_acc: 0.6967
Epoch 3/15
156/156 - 89s - loss: 0.5507 - acc: 0.7226 - val_loss: 0.5372 - val_acc: 0.7278
Epoch 4/15
156/156 - 89s - loss: 0.4921 - acc: 0.7646 - val_loss: 0.4557 - val_acc: 0.7889
Epoch 5/15
156/156 - 88s - loss: 0.4421 - acc: 0.7952 - val_loss: 0.5066 - val_acc: 0.7622
Epoch 6/15
156/156 - 87s - loss: 0.3993 - acc: 0.8194 - val_loss: 0.5141 - val_acc: 0.7684
Epoch 7/15
156/156 - 87s - loss: 0.3602 - acc: 0.8401 - val_loss: 0.4552 - val_acc: 0.7887
Epoch 8/15
156/156 - 87s - loss: 0.3194 - acc: 0.8634 - val_loss: 0.3735 - val_acc: 0.8401
Epoch 9/15
156/156 - 86s - loss: 0.2790 - acc: 0.8792 - val_loss: 0.3472 - val_acc: 0.8472
Epoch 10/15
156/156 - 86s - loss: 0.2460 - acc: 0.8964 - val_loss: 0.3934 - val_acc: 0.8281
Epoch 11/15
156/156 - 85s - loss: 0.2144 - acc: 0.9108 - val_loss: 0.3637 - val_acc: 0.8558
Epoch 12/15
156/156 - 85s - loss: 0.1857 - acc: 0.9249 - val_loss: 0.3656 - val_acc: 0.8516
Epoch 13/15
156/156 - 85s - loss: 0.1590 - acc: 0.9364 - val_loss: 0.3921 - val_acc: 0.8616
Epoch 14/15
156/156 - 84s - loss: 0.1400 - acc: 0.9449 - val_loss: 0.4903 - val_acc: 0.8297
Epoch 15/15
156/156 - 84s - loss: 0.1142 - acc: 0.9553 - val_loss: 0.4341 - val_acc: 0.8580
2019-07-14 13:50:52,575 graeae.timers.timer end: Ended: 2019-07-14 13:50:52.575319
I0714 13:50:52.575360 140261637994304 timer.py:77] Ended: 2019-07-14 13:50:52.575319
2019-07-14 13:50:52,576 graeae.timers.timer end: Elapsed: 0:21:41.119277
I0714 13:50:52.576494 140261637994304 timer.py:78] Elapsed: 0:21:41.119277

So it's now way faster, but it doesn't reach the training or validation accuracy we want. That's an incredible speedup, though. I guess I really wasn't using it correctly before (you need to divide the number of samples by the batch size when setting the steps_per_epoch in the model.fit_generator call).

In the model.fit_generator:

steps_per_epoch=int(self.data.training_generator.samples/self.batch_size),
validation_steps=int(self.data.validation_generator.samples/self.batch_size),
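
For context, here's roughly where those two lines sit in the training call (a sketch of the shape of Network.train, not the verbatim method; the data and generator attributes come from my classes, the rest is the standard fit_generator signature):

# one epoch = one pass over the data: steps are sample-count / batch-size,
# so skipping the division makes each "epoch" run many passes' worth of batches
self.model.fit_generator(
    generator=self.data.training_generator,
    steps_per_epoch=int(self.data.training_generator.samples/self.batch_size),
    validation_data=self.data.validation_generator,
    validation_steps=int(self.data.validation_generator.samples/self.batch_size),
    epochs=self.epochs,
    callbacks=self.callbacks,
    verbose=2,
)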

Take Seven

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_256").expanduser()
saver = partial(save_model_weights, name=str(path))
good_enough = Stop(call_on_stopping=saver,
                   minimum_accuracy=95)
out_of_time = TimedStop(call_on_stopping=saver)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  convolution_layers=5,
                  batch_size=256)
print(str(network))
with TIMER:
    network.train()
2019-07-14 14:51:25,349 graeae.timers.timer start: Started: 2019-07-14 14:51:25.349514
I0714 14:51:25.349538 140261637994304 timer.py:70] Started: 2019-07-14 14:51:25.349514
(Network) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Epochs: 15, Batch Size: 256, Callbacks: [<__main__.Stop object at 0x7f90993397f0>, <__main__.TimedStop object at 0x7f9099339e80>],Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 256,Callbacks: [<__main__.Stop object at 0x7f90993397f0>, <__main__.TimedStop object at 0x7f9099339e80>]
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/15
78/78 - 89s - loss: 0.6720 - acc: 0.5733 - val_loss: 0.6325 - val_acc: 0.6295
Epoch 2/15
78/78 - 88s - loss: 0.6036 - acc: 0.6672 - val_loss: 0.6576 - val_acc: 0.6081
Epoch 3/15
78/78 - 87s - loss: 0.5626 - acc: 0.7084 - val_loss: 0.5236 - val_acc: 0.7358
Epoch 4/15
78/78 - 87s - loss: 0.5266 - acc: 0.7351 - val_loss: 0.4959 - val_acc: 0.7584
Epoch 5/15
78/78 - 85s - loss: 0.4957 - acc: 0.7556 - val_loss: 0.4607 - val_acc: 0.7767
Epoch 6/15
78/78 - 84s - loss: 0.4644 - acc: 0.7808 - val_loss: 0.4637 - val_acc: 0.7800
Epoch 7/15
78/78 - 85s - loss: 0.4296 - acc: 0.7976 - val_loss: 0.4066 - val_acc: 0.8100
Epoch 8/15
78/78 - 85s - loss: 0.3963 - acc: 0.8205 - val_loss: 0.4073 - val_acc: 0.8201
Epoch 9/15
78/78 - 83s - loss: 0.3736 - acc: 0.8294 - val_loss: 0.3809 - val_acc: 0.8248
Epoch 10/15
78/78 - 83s - loss: 0.3452 - acc: 0.8414 - val_loss: 0.4528 - val_acc: 0.7800
Epoch 11/15
78/78 - 81s - loss: 0.3128 - acc: 0.8624 - val_loss: 0.3582 - val_acc: 0.8452
Epoch 12/15
78/78 - 81s - loss: 0.2872 - acc: 0.8753 - val_loss: 0.4267 - val_acc: 0.8059
Epoch 13/15
78/78 - 80s - loss: 0.2662 - acc: 0.8864 - val_loss: 0.3332 - val_acc: 0.8557
Epoch 14/15
78/78 - 80s - loss: 0.2301 - acc: 0.9052 - val_loss: 0.3384 - val_acc: 0.8600
Epoch 15/15
78/78 - 79s - loss: 0.2084 - acc: 0.9122 - val_loss: 0.3857 - val_acc: 0.8298
2019-07-14 15:12:21,858 graeae.timers.timer end: Ended: 2019-07-14 15:12:21.857979
I0714 15:12:21.858018 140261637994304 timer.py:77] Ended: 2019-07-14 15:12:21.857979
2019-07-14 15:12:21,859 graeae.timers.timer end: Elapsed: 0:20:56.508465
I0714 15:12:21.859612 140261637994304 timer.py:78] Elapsed: 0:20:56.508465

So doubling the batch size doesn't change the time or the accuracy?
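
That actually makes sense in hindsight: with steps_per_epoch set to samples divided by batch size, every epoch still sees (nearly) all 20,000 training images, so the batch size mostly changes how the work is chunked, not how much of it there is. A quick sanity check, which also reproduces the step counts in the logs:

# per-epoch workload is (almost) constant once steps = samples // batch_size;
# 512 loses a few images to integer division (39 * 512 = 19,968)
for batch_size in (32, 128, 256, 512):
    steps = 20000 // batch_size
    print(f"batch size {batch_size:3}: {steps:4} steps x {batch_size:3} "
          f"= {steps * batch_size:,} images per epoch")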

Take Eight

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_512").expanduser()
saver = partial(save_model_weights, name=str(path))
good_enough = Stop(call_on_stopping=saver,
                   minimum_accuracy=95)
out_of_time = TimedStop(call_on_stopping=saver)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  convolution_layers=5,
                  batch_size=512)
print(str(network))
with TIMER:
    network.train()
2019-07-14 15:21:24,657 graeae.timers.timer start: Started: 2019-07-14 15:21:24.657613
I0714 15:21:24.657636 140261637994304 timer.py:70] Started: 2019-07-14 15:21:24.657613
(Network) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Epochs: 15, Batch Size: 512, Callbacks: [<__main__.Stop object at 0x7f8fd8224128>, <__main__.TimedStop object at 0x7f90990c6e80>],Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 512,Callbacks: [<__main__.Stop object at 0x7f8fd8224128>, <__main__.TimedStop object at 0x7f90990c6e80>]
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/15
39/39 - 88s - loss: 0.6929 - acc: 0.5590 - val_loss: 0.6546 - val_acc: 0.6135
Epoch 2/15
39/39 - 86s - loss: 0.6552 - acc: 0.6109 - val_loss: 0.6258 - val_acc: 0.6523
Epoch 3/15
39/39 - 83s - loss: 0.6128 - acc: 0.6612 - val_loss: 0.5904 - val_acc: 0.6918
Epoch 4/15
39/39 - 83s - loss: 0.5830 - acc: 0.6929 - val_loss: 0.5612 - val_acc: 0.7198
Epoch 5/15
39/39 - 80s - loss: 0.5767 - acc: 0.6999 - val_loss: 0.5494 - val_acc: 0.7177
Epoch 6/15
39/39 - 80s - loss: 0.5416 - acc: 0.7262 - val_loss: 0.5036 - val_acc: 0.7591
Epoch 7/15
39/39 - 77s - loss: 0.5199 - acc: 0.7424 - val_loss: 0.4987 - val_acc: 0.7602
Epoch 8/15
39/39 - 76s - loss: 0.5040 - acc: 0.7501 - val_loss: 0.5018 - val_acc: 0.7565
Epoch 9/15
39/39 - 74s - loss: 0.4735 - acc: 0.7743 - val_loss: 0.5063 - val_acc: 0.7517
Epoch 10/15
39/39 - 74s - loss: 0.4635 - acc: 0.7778 - val_loss: 0.4747 - val_acc: 0.7682
Epoch 11/15
39/39 - 72s - loss: 0.4403 - acc: 0.7953 - val_loss: 0.4519 - val_acc: 0.7923
Epoch 12/15
39/39 - 72s - loss: 0.4175 - acc: 0.8014 - val_loss: 0.5312 - val_acc: 0.7385
Epoch 13/15
39/39 - 71s - loss: 0.4099 - acc: 0.8135 - val_loss: 0.4505 - val_acc: 0.7899
Epoch 14/15
39/39 - 71s - loss: 0.3855 - acc: 0.8246 - val_loss: 0.4089 - val_acc: 0.8121
Epoch 15/15
39/39 - 73s - loss: 0.3622 - acc: 0.8373 - val_loss: 0.4049 - val_acc: 0.8166
2019-07-14 15:40:46,189 graeae.timers.timer end: Ended: 2019-07-14 15:40:46.189544
I0714 15:40:46.189586 140261637994304 timer.py:77] Ended: 2019-07-14 15:40:46.189544
2019-07-14 15:40:46,191 graeae.timers.timer end: Elapsed: 0:19:21.531931
I0714 15:40:46.191108 140261637994304 timer.py:78] Elapsed: 0:19:21.531931

So now it looks like, past a certain point, the batch size doesn't make a difference, and we don't get anywhere near the performance I got on that first run.

Take Nine

What if we cut down the number of convolutional layers?

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_256_3").expanduser()
saver = partial(save_model_weights, name=str(path))
good_enough = Stop(call_on_stopping=saver,
                   minimum_accuracy=95)
out_of_time = TimedStop(call_on_stopping=saver)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  convolution_layers=3,
                  batch_size=256)
print(str(network))
with TIMER:
    network.train()
2019-07-14 15:48:37,805 graeae.timers.timer start: Started: 2019-07-14 15:48:37.805388
I0714 15:48:37.805425 140261637994304 timer.py:70] Started: 2019-07-14 15:48:37.805388
(Network) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Epochs: 15, Batch Size: 256, Callbacks: [<__main__.Stop object at 0x7f8fb40cc668>, <__main__.TimedStop object at 0x7f8fb40ccb38>],Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 256,Callbacks: [<__main__.Stop object at 0x7f8fb40cc668>, <__main__.TimedStop object at 0x7f8fb40ccb38>]
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/15
78/78 - 90s - loss: 0.7905 - acc: 0.5876 - val_loss: 0.6102 - val_acc: 0.6686
Epoch 2/15
78/78 - 88s - loss: 0.6018 - acc: 0.6747 - val_loss: 0.4995 - val_acc: 0.7588
Epoch 3/15
78/78 - 88s - loss: 0.5166 - acc: 0.7430 - val_loss: 0.4653 - val_acc: 0.7815
Epoch 4/15
78/78 - 87s - loss: 0.4609 - acc: 0.7849 - val_loss: 0.4388 - val_acc: 0.7975
Epoch 5/15
78/78 - 86s - loss: 0.4251 - acc: 0.8010 - val_loss: 0.4319 - val_acc: 0.7973
Epoch 6/15
78/78 - 86s - loss: 0.3864 - acc: 0.8254 - val_loss: 0.4892 - val_acc: 0.7574
Epoch 7/15
78/78 - 85s - loss: 0.3516 - acc: 0.8420 - val_loss: 0.4605 - val_acc: 0.7958
Epoch 8/15
78/78 - 84s - loss: 0.3164 - acc: 0.8619 - val_loss: 0.4136 - val_acc: 0.8178
Epoch 9/15
78/78 - 84s - loss: 0.2688 - acc: 0.8877 - val_loss: 0.5011 - val_acc: 0.7611
Epoch 10/15
78/78 - 83s - loss: 0.2265 - acc: 0.9060 - val_loss: 0.5449 - val_acc: 0.7876
Epoch 11/15
78/78 - 82s - loss: 0.2066 - acc: 0.9149 - val_loss: 0.5982 - val_acc: 0.7741
Epoch 12/15
78/78 - 82s - loss: 0.1662 - acc: 0.9397 - val_loss: 0.5129 - val_acc: 0.8244
Epoch 13/15
78/78 - 81s - loss: 0.1327 - acc: 0.9505 - val_loss: 0.6415 - val_acc: 0.8014
Epoch 14/15
78/78 - 82s - loss: 0.1043 - acc: 0.9643 - val_loss: 0.6042 - val_acc: 0.8168
Epoch 15/15
78/78 - 80s - loss: 0.0813 - acc: 0.9746 - val_loss: 0.6658 - val_acc: 0.8201
2019-07-14 16:09:47,278 graeae.timers.timer end: Ended: 2019-07-14 16:09:47.278102
I0714 16:09:47.278141 140261637994304 timer.py:77] Ended: 2019-07-14 16:09:47.278102
2019-07-14 16:09:47,279 graeae.timers.timer end: Elapsed: 0:21:09.472714
I0714 16:09:47.279699 140261637994304 timer.py:78] Elapsed: 0:21:09.472714

Nope, it's almost as if I didn't do anything to change it.

Take Ten

What if I get rid of the steps per epoch?

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_20_3").expanduser()
saver = partial(save_model_weights, name=str(path))
good_enough = Stop(call_on_stopping=saver,
                   minimum_accuracy=95)
out_of_time = TimedStop(call_on_stopping=saver)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  convolution_layers=3,
                  set_steps = True,
                  batch_size=20)
print(str(network))
with TIMER:
    network.train()
2019-07-14 19:08:19,353 graeae.timers.timer start: Started: 2019-07-14 19:08:19.353471
I0714 19:08:19.353494 140261637994304 timer.py:70] Started: 2019-07-14 19:08:19.353471
(Network) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Epochs: 15, Batch Size: 20, Callbacks: [<__main__.Stop object at 0x7f8dfa4dd048>, <__main__.TimedStop object at 0x7f8dfa4dd470>],Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 20,Callbacks: [<__main__.Stop object at 0x7f8dfa4dd048>, <__main__.TimedStop object at 0x7f8dfa4dd470>]
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/15
1000/1000 - 155s - loss: 0.6403 - acc: 0.6418 - val_loss: 0.5868 - val_acc: 0.6744
Epoch 2/15
1000/1000 - 153s - loss: 0.5746 - acc: 0.7024 - val_loss: 0.5579 - val_acc: 0.7180
Epoch 3/15
1000/1000 - 152s - loss: 0.5397 - acc: 0.7315 - val_loss: 0.5599 - val_acc: 0.7248
Epoch 4/15
1000/1000 - 152s - loss: 0.5237 - acc: 0.7452 - val_loss: 0.4945 - val_acc: 0.7576
Epoch 5/15
1000/1000 - 152s - loss: 0.5110 - acc: 0.7520 - val_loss: 0.4821 - val_acc: 0.7710
Epoch 6/15
1000/1000 - 152s - loss: 0.5012 - acc: 0.7627 - val_loss: 0.7423 - val_acc: 0.7008
Epoch 7/15
1000/1000 - 152s - loss: 0.4913 - acc: 0.7668 - val_loss: 0.4916 - val_acc: 0.7732
Epoch 8/15
1000/1000 - 152s - loss: 0.4834 - acc: 0.7726 - val_loss: 0.5013 - val_acc: 0.7634
Epoch 9/15
1000/1000 - 152s - loss: 0.4765 - acc: 0.7810 - val_loss: 0.4503 - val_acc: 0.7930
Epoch 10/15
1000/1000 - 152s - loss: 0.4712 - acc: 0.7858 - val_loss: 0.4780 - val_acc: 0.7872
Epoch 11/15
1000/1000 - 152s - loss: 0.4663 - acc: 0.7882 - val_loss: 0.4645 - val_acc: 0.7810
Epoch 12/15
1000/1000 - 152s - loss: 0.4590 - acc: 0.7926 - val_loss: 0.4233 - val_acc: 0.8208
Epoch 13/15
1000/1000 - 152s - loss: 0.4566 - acc: 0.7958 - val_loss: 0.4494 - val_acc: 0.8020
Epoch 14/15
1000/1000 - 154s - loss: 0.4499 - acc: 0.7985 - val_loss: 0.4125 - val_acc: 0.8248
Epoch 15/15
1000/1000 - 157s - loss: 0.4453 - acc: 0.8014 - val_loss: 0.4875 - val_acc: 0.7510
2019-07-14 19:46:32,393 graeae.timers.timer end: Ended: 2019-07-14 19:46:32.393181
I0714 19:46:32.393228 140261637994304 timer.py:77] Ended: 2019-07-14 19:46:32.393181
2019-07-14 19:46:32,394 graeae.timers.timer end: Elapsed: 0:38:13.039710
I0714 19:46:32.394544 140261637994304 timer.py:78] Elapsed: 0:38:13.039710

Take Eleven

I updated the model to have twice as many neurons at each layer as it did before and added fill_mode="nearest" to the image generator. Let's see how it does.
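
For reference, fill_mode tells the ImageDataGenerator how to fill in the pixels that shifts and rotations expose (a sketch of the sort of generator setup I mean; the exact augmentation values in my Data class may differ):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

generator = ImageDataGenerator(
    rescale=1/255,           # scale pixel values to [0, 1]
    rotation_range=40,       # random rotations of up to 40 degrees
    width_shift_range=0.2,   # random horizontal shifts
    height_shift_range=0.2,  # random vertical shifts
    horizontal_flip=True,    # mirror images at random
    fill_mode="nearest",     # fill exposed pixels with their nearest neighbors
    validation_split=0.2,    # matches the 20,000/5,000 split in the logs
)

Here's the run with those changes: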

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_32_5_double").expanduser()
saver = partial(save_model_weights, name=str(path))
good_enough = Stop(call_on_stopping=saver,
                   minimum_accuracy=.95)
out_of_time = TimedStop(call_on_stopping=saver)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  convolution_layers=5,
                  set_steps = True,
                  epochs = 100,
                  batch_size=32)
print(str(network))
with TIMER:
    network.train()
2019-07-15 22:39:43,846 graeae.timers.timer start: Started: 2019-07-15 22:39:43.846462
I0715 22:39:43.846729 140093074827072 timer.py:70] Started: 2019-07-15 22:39:43.846462
(Network) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Epochs: 100, Batch Size: 32, Callbacks: [<__main__.Stop object at 0x7f6964315f98>, <__main__.TimedStop object at 0x7f6964315f60>],Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 32,Callbacks: [<__main__.Stop object at 0x7f6964315f98>, <__main__.TimedStop object at 0x7f6964315f60>]
Found 20000 images belonging to 2 classes.
W0715 22:39:44.526424 140093074827072 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Found 5000 images belonging to 2 classes.
W0715 22:39:44.941590 140093074827072 deprecation.py:323] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Epoch 1/100
625/625 - 443s - loss: 0.6784 - acc: 0.5767 - val_loss: 0.6511 - val_acc: 0.6290
Epoch 2/100
625/625 - 153s - loss: 0.6261 - acc: 0.6617 - val_loss: 0.6095 - val_acc: 0.6833
Epoch 3/100
625/625 - 151s - loss: 0.5860 - acc: 0.6988 - val_loss: 0.5633 - val_acc: 0.7256
Epoch 4/100
625/625 - 151s - loss: 0.5576 - acc: 0.7165 - val_loss: 0.5596 - val_acc: 0.7079
Epoch 5/100
625/625 - 152s - loss: 0.5230 - acc: 0.7480 - val_loss: 0.4567 - val_acc: 0.7843
Epoch 6/100
625/625 - 151s - loss: 0.4899 - acc: 0.7704 - val_loss: 0.4682 - val_acc: 0.7778
Epoch 7/100
625/625 - 155s - loss: 0.4654 - acc: 0.7873 - val_loss: 0.4377 - val_acc: 0.8097
Epoch 8/100
625/625 - 152s - loss: 0.4426 - acc: 0.8019 - val_loss: 0.4038 - val_acc: 0.8249
Epoch 9/100
625/625 - 153s - loss: 0.4338 - acc: 0.8034 - val_loss: 0.3988 - val_acc: 0.8223
Epoch 10/100
625/625 - 152s - loss: 0.4278 - acc: 0.8113 - val_loss: 0.4619 - val_acc: 0.7979
Epoch 11/100
625/625 - 153s - loss: 0.4224 - acc: 0.8101 - val_loss: 0.4608 - val_acc: 0.7710
Epoch 12/100
625/625 - 153s - loss: 0.4223 - acc: 0.8140 - val_loss: 0.4496 - val_acc: 0.7726
Epoch 13/100
625/625 - 153s - loss: 0.4285 - acc: 0.8093 - val_loss: 0.4147 - val_acc: 0.8255
Epoch 14/100
625/625 - 156s - loss: 0.4224 - acc: 0.8070 - val_loss: 0.4451 - val_acc: 0.7885
Epoch 15/100
625/625 - 155s - loss: 0.4428 - acc: 0.8092 - val_loss: 0.4191 - val_acc: 0.8177
Epoch 16/100
625/625 - 156s - loss: 0.4267 - acc: 0.8089 - val_loss: 0.3791 - val_acc: 0.8127
Epoch 17/100
625/625 - 155s - loss: 0.4328 - acc: 0.8067 - val_loss: 0.6180 - val_acc: 0.7839
Epoch 18/100
625/625 - 156s - loss: 0.4361 - acc: 0.8083 - val_loss: 0.6032 - val_acc: 0.6785
Epoch 19/100
625/625 - 156s - loss: 0.4203 - acc: 0.8153 - val_loss: 0.3815 - val_acc: 0.8335
Epoch 20/100
625/625 - 156s - loss: 0.4403 - acc: 0.8116 - val_loss: 0.3653 - val_acc: 0.8393
Epoch 21/100
625/625 - 155s - loss: 0.4216 - acc: 0.8105 - val_loss: 0.4811 - val_acc: 0.7728
Epoch 22/100
625/625 - 155s - loss: 0.4423 - acc: 0.8049 - val_loss: 2.2071 - val_acc: 0.5913
Epoch 23/100
625/625 - 152s - loss: 0.4417 - acc: 0.8063 - val_loss: 0.3568 - val_acc: 0.8442
Epoch 24/100
625/625 - 151s - loss: 0.4353 - acc: 0.8105 - val_loss: 0.3727 - val_acc: 0.8405
Epoch 25/100
625/625 - 150s - loss: 0.4409 - acc: 0.8099 - val_loss: 0.4661 - val_acc: 0.7604
Epoch 26/100
625/625 - 151s - loss: 0.4467 - acc: 0.8021 - val_loss: 0.4138 - val_acc: 0.8227
Epoch 27/100
625/625 - 151s - loss: 0.4465 - acc: 0.8078 - val_loss: 0.3894 - val_acc: 0.8407
Epoch 28/100
625/625 - 156s - loss: 0.4269 - acc: 0.8126 - val_loss: 0.4504 - val_acc: 0.7901
Epoch 29/100
625/625 - 156s - loss: 0.4354 - acc: 0.8102 - val_loss: 0.3996 - val_acc: 0.8546
Epoch 30/100
625/625 - 156s - loss: 0.4677 - acc: 0.8122 - val_loss: 0.3758 - val_acc: 0.8508
Epoch 31/100
625/625 - 155s - loss: 0.4282 - acc: 0.8164 - val_loss: 0.4889 - val_acc: 0.7875
Epoch 32/100
625/625 - 156s - loss: 0.4306 - acc: 0.8108 - val_loss: 0.4583 - val_acc: 0.7562
Epoch 33/100
625/625 - 155s - loss: 0.4280 - acc: 0.8123 - val_loss: 0.4034 - val_acc: 0.7897
Epoch 34/100
625/625 - 157s - loss: 0.4316 - acc: 0.8058 - val_loss: 0.3685 - val_acc: 0.8462
Epoch 35/100
625/625 - 155s - loss: 0.4248 - acc: 0.8133 - val_loss: 0.5994 - val_acc: 0.8433
Epoch 36/100
625/625 - 156s - loss: 0.4450 - acc: 0.8112 - val_loss: 0.5043 - val_acc: 0.7099
Epoch 37/100
625/625 - 156s - loss: 0.4467 - acc: 0.8085 - val_loss: 0.3921 - val_acc: 0.8297
Epoch 38/100
625/625 - 156s - loss: 0.4377 - acc: 0.8106 - val_loss: 0.3842 - val_acc: 0.8403
Epoch 39/100
625/625 - 156s - loss: 0.4586 - acc: 0.8156 - val_loss: 0.6008 - val_acc: 0.6717
Epoch 40/100
625/625 - 156s - loss: 0.4234 - acc: 0.8180 - val_loss: 0.4325 - val_acc: 0.7943
Epoch 41/100
625/625 - 156s - loss: 0.4395 - acc: 0.8097 - val_loss: 0.3536 - val_acc: 0.8498
Epoch 42/100
625/625 - 156s - loss: 0.4453 - acc: 0.8062 - val_loss: 0.4309 - val_acc: 0.7971
Epoch 43/100
625/625 - 155s - loss: 0.4425 - acc: 0.8134 - val_loss: 0.3935 - val_acc: 0.8091
Epoch 44/100
625/625 - 156s - loss: 0.4301 - acc: 0.8153 - val_loss: 0.4786 - val_acc: 0.8207
Epoch 45/100
625/625 - 155s - loss: 0.4605 - acc: 0.8120 - val_loss: 0.3412 - val_acc: 0.8546
Epoch 46/100
625/625 - 156s - loss: 0.4261 - acc: 0.8221 - val_loss: 0.8955 - val_acc: 0.7194
Epoch 47/100
625/625 - 156s - loss: 0.4215 - acc: 0.8214 - val_loss: 0.4956 - val_acc: 0.8155
Epoch 48/100
625/625 - 156s - loss: 0.4270 - acc: 0.8144 - val_loss: 0.3912 - val_acc: 0.8331
Epoch 49/100
625/625 - 156s - loss: 0.4268 - acc: 0.8167 - val_loss: 0.3492 - val_acc: 0.8474
Epoch 50/100
625/625 - 155s - loss: 0.4275 - acc: 0.8135 - val_loss: 0.4503 - val_acc: 0.8383
Epoch 51/100
625/625 - 156s - loss: 0.4980 - acc: 0.8142 - val_loss: 0.4140 - val_acc: 0.8147
Epoch 52/100
625/625 - 155s - loss: 0.4528 - acc: 0.8221 - val_loss: 0.3570 - val_acc: 0.8456
Epoch 53/100
625/625 - 156s - loss: 0.4201 - acc: 0.8212 - val_loss: 0.5224 - val_acc: 0.6943
Epoch 54/100
625/625 - 153s - loss: 0.4399 - acc: 0.8113 - val_loss: 0.4587 - val_acc: 0.7782
Epoch 55/100
625/625 - 151s - loss: 0.4306 - acc: 0.8201 - val_loss: 0.3410 - val_acc: 0.8612
Epoch 56/100
625/625 - 150s - loss: 0.4986 - acc: 0.8180 - val_loss: 0.3842 - val_acc: 0.8345
Epoch 57/100
625/625 - 150s - loss: 0.4276 - acc: 0.8191 - val_loss: 0.4859 - val_acc: 0.7352
Epoch 58/100
625/625 - 150s - loss: 0.4344 - acc: 0.8084 - val_loss: 0.5267 - val_acc: 0.7242
Epoch 59/100
625/625 - 150s - loss: 0.4488 - acc: 0.8106 - val_loss: 0.3798 - val_acc: 0.8311
Epoch 60/100
625/625 - 150s - loss: 0.4445 - acc: 0.8111 - val_loss: 0.3818 - val_acc: 0.8367
Epoch 61/100
625/625 - 150s - loss: 0.5270 - acc: 0.8143 - val_loss: 0.3891 - val_acc: 0.8171
Epoch 62/100
625/625 - 150s - loss: 0.4444 - acc: 0.8131 - val_loss: 0.4622 - val_acc: 0.8061
Epoch 63/100
625/625 - 150s - loss: 0.4488 - acc: 0.8164 - val_loss: 0.4466 - val_acc: 0.7983
Epoch 64/100
625/625 - 150s - loss: 0.5321 - acc: 0.7947 - val_loss: 0.3542 - val_acc: 0.8546
Epoch 65/100
625/625 - 150s - loss: 0.4685 - acc: 0.7996 - val_loss: 0.3638 - val_acc: 0.8502
Epoch 66/100
625/625 - 151s - loss: 0.5376 - acc: 0.8097 - val_loss: 0.3572 - val_acc: 0.8464
Epoch 67/100
TimedStop: Training out of time (Elapsed = 2:56:51.937316)
Saving the Model Weights to /home/athena/models/dogs-vs-cats/cnn_weights_batch_size_32_5_double.h5
625/625 - 150s - loss: 0.4454 - acc: 0.8030 - val_loss: 0.3283 - val_acc: 0.8650
2019-07-16 01:36:37,054 graeae.timers.timer end: Ended: 2019-07-16 01:36:37.054383
I0716 01:36:37.054409 140093074827072 timer.py:77] Ended: 2019-07-16 01:36:37.054383
2019-07-16 01:36:37,055 graeae.timers.timer end: Elapsed: 2:56:53.207921
I0716 01:36:37.055413 140093074827072 timer.py:78] Elapsed: 2:56:53.207921

So we didn't quite get there. Maybe more training on the same model? Is it really getting better? The best validation accuracy (0.8650) actually came on the final epoch, with epoch 55 (0.8612) close behind. Maybe 86% is as good as it gets, but we might as well try.

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_32_5_double").expanduser()
saver = partial(save_model_weights, name=str(path))
good_enough = Stop(call_on_stopping=saver,
                   minimum_accuracy=.95)
out_of_time = TimedStop(call_on_stopping=saver)

with TIMER:
    network.train()

Take Twelve

path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_32_5_double").expanduser()
saver = partial(save_model, path=path)
good_enough = Stop(call_on_stopping=saver,
                   minimum_accuracy=95)
out_of_time = TimedStop(call_on_stopping=saver)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  convolution_layers=5,
                  set_steps = True,
                  epochs = 100,
                  batch_size=32)
print(str(network))
with TIMER:
    network.train()
2019-07-17 23:02:00,693 graeae.timers.timer start: Started: 2019-07-17 23:02:00.692917
I0717 23:02:00.693119 140065468053312 timer.py:70] Started: 2019-07-17 23:02:00.692917
(Network) - 
Path: /home/athena/data/datasets/images/dogs-vs-cats/train
 Epochs: 100
 Batch Size: 32
 Callbacks: [<__main__.Stop object at 0x7f62f68a6c18>, <__main__.TimedStop object at 0x7f62f68a6da0>]
Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 32
Callbacks: [<__main__.Stop object at 0x7f62f68a6c18>, <__main__.TimedStop object at 0x7f62f68a6da0>]
Found 20000 images belonging to 2 classes.
W0717 23:02:01.406333 140065468053312 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Found 5000 images belonging to 2 classes.
W0717 23:02:01.884152 140065468053312 deprecation.py:323] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Epoch 1/100
625/625 - 533s - loss: 0.6770 - acc: 0.5823 - val_loss: 0.6773 - val_acc: 0.5877
Epoch 2/100
625/625 - 157s - loss: 0.6280 - acc: 0.6556 - val_loss: 0.5860 - val_acc: 0.6879
Epoch 3/100
625/625 - 156s - loss: 0.5955 - acc: 0.6870 - val_loss: 0.5364 - val_acc: 0.7322
Epoch 4/100
625/625 - 157s - loss: 0.5579 - acc: 0.7225 - val_loss: 0.5200 - val_acc: 0.7648
Epoch 5/100
625/625 - 157s - loss: 0.5208 - acc: 0.7455 - val_loss: 0.5350 - val_acc: 0.7410
Epoch 6/100
625/625 - 157s - loss: 0.4874 - acc: 0.7736 - val_loss: 0.4713 - val_acc: 0.7740
Epoch 7/100
625/625 - 157s - loss: 0.4630 - acc: 0.7847 - val_loss: 0.5257 - val_acc: 0.7530
Epoch 8/100
625/625 - 157s - loss: 0.4499 - acc: 0.7923 - val_loss: 0.5602 - val_acc: 0.7294
Epoch 9/100
625/625 - 157s - loss: 0.4299 - acc: 0.8048 - val_loss: 0.5801 - val_acc: 0.6901
Epoch 10/100
625/625 - 157s - loss: 0.4296 - acc: 0.8101 - val_loss: 0.5544 - val_acc: 0.7642
Epoch 11/100
625/625 - 156s - loss: 0.4328 - acc: 0.8066 - val_loss: 0.5598 - val_acc: 0.7354
Epoch 12/100
625/625 - 157s - loss: 0.4372 - acc: 0.8080 - val_loss: 0.3635 - val_acc: 0.8411
Epoch 13/100
625/625 - 156s - loss: 0.4376 - acc: 0.8092 - val_loss: 0.4418 - val_acc: 0.8003
Epoch 14/100
625/625 - 156s - loss: 0.4228 - acc: 0.8147 - val_loss: 0.3530 - val_acc: 0.8454
Epoch 15/100
625/625 - 156s - loss: 0.4181 - acc: 0.8167 - val_loss: 0.4420 - val_acc: 0.7879
Epoch 16/100
625/625 - 156s - loss: 0.4295 - acc: 0.8105 - val_loss: 0.3515 - val_acc: 0.8411
Epoch 17/100
625/625 - 156s - loss: 0.4400 - acc: 0.8087 - val_loss: 0.3885 - val_acc: 0.8323
Epoch 18/100
625/625 - 156s - loss: 0.4215 - acc: 0.8145 - val_loss: 0.3775 - val_acc: 0.8417
Epoch 19/100
625/625 - 156s - loss: 0.4323 - acc: 0.8159 - val_loss: 0.4359 - val_acc: 0.8065
Epoch 20/100
625/625 - 156s - loss: 0.4181 - acc: 0.8089 - val_loss: 0.5274 - val_acc: 0.8474
Epoch 21/100
625/625 - 156s - loss: 0.4392 - acc: 0.8116 - val_loss: 0.5858 - val_acc: 0.7163
Epoch 22/100
625/625 - 156s - loss: 0.4432 - acc: 0.8093 - val_loss: 0.5178 - val_acc: 0.7192
Epoch 23/100
625/625 - 156s - loss: 0.4629 - acc: 0.8091 - val_loss: 0.3340 - val_acc: 0.8604
Epoch 24/100
625/625 - 156s - loss: 0.4236 - acc: 0.8150 - val_loss: 0.3742 - val_acc: 0.8357
Epoch 25/100
625/625 - 156s - loss: 0.4257 - acc: 0.8152 - val_loss: 0.3582 - val_acc: 0.8367
Epoch 26/100
625/625 - 157s - loss: 0.4133 - acc: 0.8190 - val_loss: 0.3847 - val_acc: 0.8339
Epoch 27/100
625/625 - 156s - loss: 0.4298 - acc: 0.8136 - val_loss: 0.3627 - val_acc: 0.8446
Epoch 28/100
625/625 - 155s - loss: 0.4335 - acc: 0.8127 - val_loss: 0.4581 - val_acc: 0.8261
Epoch 29/100
625/625 - 156s - loss: 0.4234 - acc: 0.8123 - val_loss: 0.3819 - val_acc: 0.8333
Epoch 30/100
625/625 - 156s - loss: 0.4213 - acc: 0.8152 - val_loss: 0.3809 - val_acc: 0.8596
Epoch 31/100
625/625 - 156s - loss: 0.4399 - acc: 0.8029 - val_loss: 0.4879 - val_acc: 0.7662
Epoch 32/100
625/625 - 157s - loss: 0.4406 - acc: 0.8090 - val_loss: 0.3731 - val_acc: 0.8668
Epoch 33/100
625/625 - 156s - loss: 0.4336 - acc: 0.8162 - val_loss: 0.4306 - val_acc: 0.8389
Epoch 34/100
625/625 - 153s - loss: 0.4524 - acc: 0.8102 - val_loss: 0.3486 - val_acc: 0.8538
Epoch 35/100
625/625 - 151s - loss: 0.4306 - acc: 0.8127 - val_loss: 0.4819 - val_acc: 0.7923
Epoch 36/100
625/625 - 151s - loss: 0.4350 - acc: 0.8109 - val_loss: 0.3536 - val_acc: 0.8506
Epoch 37/100
625/625 - 151s - loss: 0.4480 - acc: 0.8073 - val_loss: 0.5374 - val_acc: 0.8165
Epoch 38/100
625/625 - 151s - loss: 0.4411 - acc: 0.8195 - val_loss: 1.1689 - val_acc: 0.6763
Epoch 39/100
625/625 - 151s - loss: 0.4652 - acc: 0.8189 - val_loss: 0.3614 - val_acc: 0.8395
Epoch 40/100
625/625 - 151s - loss: 0.4442 - acc: 0.8166 - val_loss: 0.3833 - val_acc: 0.8261
Epoch 41/100
625/625 - 151s - loss: 0.4259 - acc: 0.8128 - val_loss: 0.3824 - val_acc: 0.8349
Epoch 42/100
625/625 - 151s - loss: 0.4336 - acc: 0.8122 - val_loss: 0.3564 - val_acc: 0.8508
Epoch 43/100
625/625 - 151s - loss: 0.4268 - acc: 0.8151 - val_loss: 0.3299 - val_acc: 0.8636
Epoch 44/100
625/625 - 151s - loss: 0.4206 - acc: 0.8184 - val_loss: 0.3538 - val_acc: 0.8462
Epoch 45/100
625/625 - 151s - loss: 0.4318 - acc: 0.8173 - val_loss: 0.3854 - val_acc: 0.8482
Epoch 46/100
625/625 - 151s - loss: 0.4521 - acc: 0.8120 - val_loss: 0.3691 - val_acc: 0.8568
Epoch 47/100
625/625 - 150s - loss: 0.4561 - acc: 0.8090 - val_loss: 0.3810 - val_acc: 0.8474
Epoch 48/100
625/625 - 151s - loss: 0.4358 - acc: 0.8128 - val_loss: 0.4755 - val_acc: 0.8041
Epoch 49/100
625/625 - 151s - loss: 0.4296 - acc: 0.8132 - val_loss: 0.4490 - val_acc: 0.8618
Epoch 50/100
625/625 - 151s - loss: 0.4385 - acc: 0.8183 - val_loss: 0.3964 - val_acc: 0.8377
Epoch 51/100
625/625 - 150s - loss: 0.4368 - acc: 0.8164 - val_loss: 0.3463 - val_acc: 0.8592
Epoch 52/100
625/625 - 151s - loss: 0.4702 - acc: 0.8134 - val_loss: 0.3882 - val_acc: 0.8365
Epoch 53/100
625/625 - 151s - loss: 0.4555 - acc: 0.8025 - val_loss: 0.4516 - val_acc: 0.7558
Epoch 54/100
625/625 - 150s - loss: 0.4502 - acc: 0.8059 - val_loss: 0.4236 - val_acc: 0.8035
Epoch 55/100
625/625 - 151s - loss: 0.4829 - acc: 0.7970 - val_loss: 0.3800 - val_acc: 0.8349
Epoch 56/100
625/625 - 150s - loss: 0.4554 - acc: 0.8055 - val_loss: 0.4774 - val_acc: 0.8023
Epoch 57/100
625/625 - 151s - loss: 0.5059 - acc: 0.7813 - val_loss: 0.5145 - val_acc: 0.7332
Epoch 58/100
625/625 - 151s - loss: 0.4680 - acc: 0.8015 - val_loss: 0.3994 - val_acc: 0.8201
Epoch 59/100
625/625 - 151s - loss: 0.4938 - acc: 0.7962 - val_loss: 0.3682 - val_acc: 0.8421
Epoch 60/100
625/625 - 151s - loss: 0.4870 - acc: 0.7883 - val_loss: 0.3798 - val_acc: 0.8249
Epoch 61/100
625/625 - 151s - loss: 0.4785 - acc: 0.7902 - val_loss: 0.4609 - val_acc: 0.7967
Epoch 62/100
625/625 - 151s - loss: 0.4948 - acc: 0.7840 - val_loss: 0.4502 - val_acc: 0.7788
Epoch 63/100
625/625 - 150s - loss: 0.5021 - acc: 0.7847 - val_loss: 0.3829 - val_acc: 0.8307
Epoch 64/100
625/625 - 151s - loss: 0.4982 - acc: 0.7882 - val_loss: 0.6872 - val_acc: 0.6755
Epoch 65/100
625/625 - 151s - loss: 0.5514 - acc: 0.7707 - val_loss: 0.4682 - val_acc: 0.7804
Epoch 66/100
TimedStop: Training out of time (Elapsed = 2:55:40.163015)
625/625 - 151s - loss: 0.5037 - acc: 0.7713 - val_loss: 0.5198 - val_acc: 0.8091
2019-07-18 01:57:42,823 graeae.timers.timer end: Ended: 2019-07-18 01:57:42.823388
I0718 01:57:42.823415 140065468053312 timer.py:77] Ended: 2019-07-18 01:57:42.823388
2019-07-18 01:57:42,824 graeae.timers.timer end: Elapsed: 2:55:42.130471
I0718 01:57:42.824334 140065468053312 timer.py:78] Elapsed: 2:55:42.130471
data = pandas.read_csv("~/cats_vs_dogs.csv")
best_accuracy = data.validation_accuracy.max()
best_loss = data.validation_loss.min()
accuracy_slice = data[data.validation_accuracy==best_accuracy]
loss_slice = data[data.validation_loss==best_loss]

accuracy_index = accuracy_slice.index[0]
loss_index = loss_slice.index[0]
print(f"Best Accuracy: {best_accuracy} "
      f"(loss={accuracy_slice.validation_loss.iloc[0]})"
      f" Epoch: {accuracy_index + 1}")
print(f"Best Loss: {best_loss} (accuracy="
      f"{loss_slice.validation_accuracy.iloc[0]})"
      f" Epoch: {loss_index + 1}")
Best Accuracy: 0.8668 (loss=0.3731) Epoch: 32
Best Loss: 0.3299 (accuracy=0.8636) Epoch: 43
line_1 = holoviews.VLine(accuracy_index, label="Best Accuracy")
line_2 = holoviews.VLine(loss_index, label="Best Loss")

curves = [holoviews.Curve(data, ("index", "Epoch"), "training_loss", 
                          label="Training Loss",),
          holoviews.Curve(data, ("index", "Epoch"), "training_accuracy", 
                          label="Training Accuracy").opts(tools=["hover"]),
          holoviews.Curve(data, ("index", "Epoch"), "validation_loss", 
                          label="Validation Loss",).opts(tools=["hover"]),
          holoviews.Curve(data, ("index", "Epoch"), "validation_accuracy", 
                          label="Validation Accuracy").opts(tools=["hover"]),
          line_1, line_2]
plot = holoviews.Overlay(curves).opts(tools=["hover"], height=800, width=1000, 
                                      ylabel="Performance", 
                                      title="Training vs Validation")
Embed(plot=plot, file_name="training_validation_loss_12")()

Figure Missing

There's more variance in the validation performance than I thought there would be. Although the best accuracy and loss come later, Epoch 22 (zero-based) has a better loss (0.334) than the best accuracy's loss (0.3731), and around the same accuracy (0.8604) as the best loss's accuracy (0.8636).

This next part won't work yet because I don't have the test data loaded and I need to save the predictions of the stored model.

predictions = model.predict(x_test)

loaded_model = tensorflow.keras.models.load_model(str(path))
new_predictions = loaded_model.predict(x_test)
numpy.testing.assert_allclose(predictions, new_predictions, atol=1e-6)

Take Thirteen

data = pandas.read_csv("~/cats_vs_dogs.csv")
print(tabulate(data[data.validation_accuracy>=0.86], 
      headers="keys", tablefmt="orgtbl"))
|    | training_loss | training_accuracy | validation_loss | validation_accuracy |
|----+---------------+-------------------+-----------------+---------------------|
| 22 |        0.4629 |            0.8091 |          0.334  |              0.8604 |
| 31 |        0.4406 |            0.809  |          0.3731 |              0.8668 |
| 42 |        0.4268 |            0.8151 |          0.3299 |              0.8636 |
| 48 |        0.4296 |            0.8132 |          0.449  |              0.8618 |
path = Path("~/models/dogs-vs-cats/cnn_weights_batch_size_32_5_double").expanduser()
saver = partial(save_model, path=path)
good_enough = Stop(call_on_stopping=saver,
                   minimum_accuracy=0.86)
out_of_time = TimedStop(call_on_stopping=saver)

network = Network(str(training_path), 
                  callbacks=[good_enough, out_of_time],
                  convolution_layers=5,
                  set_steps = True,
                  epochs = 100,
                  batch_size=64)
print(str(network))
with TIMER:
    network.train()
2019-07-18 22:28:08,881 graeae.timers.timer start: Started: 2019-07-18 22:28:08.881418
I0718 22:28:08.881625 139703649425216 timer.py:70] Started: 2019-07-18 22:28:08.881418
(Network) - 
Path: /home/athena/data/datasets/images/dogs-vs-cats/train
 Epochs: 100
 Batch Size: 64
 Callbacks: [<__main__.Stop object at 0x7f0eb5d9f860>, <__main__.TimedStop object at 0x7f0eb5d9f828>]
Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 64
Callbacks: [<__main__.Stop object at 0x7f0eb5d9f860>, <__main__.TimedStop object at 0x7f0eb5d9f828>]
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
W0718 22:28:09.550659 139703649425216 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0718 22:28:10.023711 139703649425216 deprecation.py:323] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Epoch 1/100
312/312 - 523s - loss: 0.6780 - acc: 0.5673 - val_loss: 0.6886 - val_acc: 0.5978
Epoch 2/100
312/312 - 158s - loss: 0.6415 - acc: 0.6353 - val_loss: 0.5944 - val_acc: 0.6789
Epoch 3/100
312/312 - 154s - loss: 0.6050 - acc: 0.6724 - val_loss: 0.5635 - val_acc: 0.7155
Epoch 4/100
312/312 - 153s - loss: 0.5789 - acc: 0.7044 - val_loss: 0.6398 - val_acc: 0.6212
Epoch 5/100
312/312 - 152s - loss: 0.5501 - acc: 0.7231 - val_loss: 0.5117 - val_acc: 0.7480
Epoch 6/100
312/312 - 153s - loss: 0.5155 - acc: 0.7478 - val_loss: 0.4671 - val_acc: 0.7788
Epoch 7/100
312/312 - 157s - loss: 0.4841 - acc: 0.7725 - val_loss: 0.4228 - val_acc: 0.8041
Epoch 8/100
312/312 - 157s - loss: 0.4513 - acc: 0.7875 - val_loss: 0.4660 - val_acc: 0.7877
Epoch 9/100
312/312 - 154s - loss: 0.4275 - acc: 0.8033 - val_loss: 0.3893 - val_acc: 0.8225
Epoch 10/100
312/312 - 157s - loss: 0.4140 - acc: 0.8084 - val_loss: 0.3792 - val_acc: 0.8249
Epoch 11/100
312/312 - 155s - loss: 0.3907 - acc: 0.8258 - val_loss: 0.3678 - val_acc: 0.8299
Epoch 12/100
312/312 - 156s - loss: 0.3772 - acc: 0.8301 - val_loss: 0.4905 - val_acc: 0.7558
Epoch 13/100
312/312 - 155s - loss: 0.3702 - acc: 0.8354 - val_loss: 0.5838 - val_acc: 0.7200
Epoch 14/100
312/312 - 155s - loss: 0.3537 - acc: 0.8412 - val_loss: 0.4032 - val_acc: 0.8225
Epoch 15/100
312/312 - 154s - loss: 0.3501 - acc: 0.8448 - val_loss: 0.3675 - val_acc: 0.8530
Epoch 16/100
312/312 - 150s - loss: 0.3480 - acc: 0.8451 - val_loss: 0.2927 - val_acc: 0.8770
Epoch 17/100
312/312 - 152s - loss: 0.3381 - acc: 0.8518 - val_loss: 0.2863 - val_acc: 0.8746
Epoch 18/100
312/312 - 152s - loss: 0.3386 - acc: 0.8491 - val_loss: 0.3782 - val_acc: 0.8419
Epoch 19/100
312/312 - 153s - loss: 0.3389 - acc: 0.8527 - val_loss: 0.2987 - val_acc: 0.8682
Epoch 20/100
312/312 - 152s - loss: 0.3294 - acc: 0.8589 - val_loss: 0.3645 - val_acc: 0.8456
Epoch 21/100
312/312 - 154s - loss: 0.3239 - acc: 0.8584 - val_loss: 1.4912 - val_acc: 0.6442
Epoch 22/100
312/312 - 153s - loss: 0.3208 - acc: 0.8629 - val_loss: 0.3812 - val_acc: 0.8243
Epoch 23/100
312/312 - 154s - loss: 0.3253 - acc: 0.8642 - val_loss: 0.2765 - val_acc: 0.8918
Epoch 24/100
312/312 - 154s - loss: 0.3252 - acc: 0.8615 - val_loss: 0.2959 - val_acc: 0.8704
Epoch 25/100
312/312 - 154s - loss: 0.3158 - acc: 0.8639 - val_loss: 0.2870 - val_acc: 0.8804
Epoch 26/100
312/312 - 154s - loss: 0.3332 - acc: 0.8594 - val_loss: 0.2946 - val_acc: 0.8752
Epoch 27/100
312/312 - 154s - loss: 0.3346 - acc: 0.8546 - val_loss: 0.3017 - val_acc: 0.8754
Epoch 28/100
312/312 - 153s - loss: 0.3245 - acc: 0.8595 - val_loss: 0.3335 - val_acc: 0.8401
Epoch 29/100
312/312 - 153s - loss: 0.3180 - acc: 0.8626 - val_loss: 0.3673 - val_acc: 0.8237
Epoch 30/100
312/312 - 153s - loss: 0.3224 - acc: 0.8610 - val_loss: 0.2796 - val_acc: 0.8860
Epoch 31/100
312/312 - 153s - loss: 0.3227 - acc: 0.8613 - val_loss: 0.4363 - val_acc: 0.8173
Epoch 32/100
312/312 - 153s - loss: 0.3156 - acc: 0.8676 - val_loss: 0.5863 - val_acc: 0.8027
Epoch 33/100
312/312 - 154s - loss: 0.3222 - acc: 0.8623 - val_loss: 0.2824 - val_acc: 0.8870
Epoch 34/100
312/312 - 153s - loss: 0.3183 - acc: 0.8629 - val_loss: 0.2856 - val_acc: 0.8790
Epoch 35/100
312/312 - 153s - loss: 0.3152 - acc: 0.8647 - val_loss: 0.3953 - val_acc: 0.8267
Epoch 36/100
312/312 - 154s - loss: 0.3134 - acc: 0.8643 - val_loss: 0.2576 - val_acc: 0.8840
Epoch 37/100
312/312 - 154s - loss: 0.3296 - acc: 0.8595 - val_loss: 0.2680 - val_acc: 0.8838
Epoch 38/100
312/312 - 153s - loss: 0.3285 - acc: 0.8552 - val_loss: 0.3820 - val_acc: 0.8504
Epoch 39/100
312/312 - 154s - loss: 0.3098 - acc: 0.8653 - val_loss: 0.4445 - val_acc: 0.8089
Epoch 40/100
312/312 - 153s - loss: 0.3272 - acc: 0.8579 - val_loss: 0.3113 - val_acc: 0.8758
Epoch 41/100
312/312 - 153s - loss: 0.3382 - acc: 0.8570 - val_loss: 0.3215 - val_acc: 0.8640
Epoch 42/100
312/312 - 154s - loss: 0.3239 - acc: 0.8609 - val_loss: 0.2939 - val_acc: 0.8784
Epoch 43/100
312/312 - 154s - loss: 0.3302 - acc: 0.8582 - val_loss: 0.3737 - val_acc: 0.8431
Epoch 44/100
312/312 - 153s - loss: 0.3178 - acc: 0.8657 - val_loss: 0.2844 - val_acc: 0.8844
Epoch 45/100
312/312 - 153s - loss: 0.3260 - acc: 0.8601 - val_loss: 0.3615 - val_acc: 0.8333
Epoch 46/100
312/312 - 153s - loss: 0.3248 - acc: 0.8603 - val_loss: 0.3506 - val_acc: 0.8488
Epoch 47/100
312/312 - 153s - loss: 0.3260 - acc: 0.8616 - val_loss: 0.2585 - val_acc: 0.8940
Epoch 48/100
312/312 - 154s - loss: 0.3168 - acc: 0.8647 - val_loss: 0.2970 - val_acc: 0.8728
Epoch 49/100
312/312 - 154s - loss: 0.3418 - acc: 0.8630 - val_loss: 0.2806 - val_acc: 0.8852
Epoch 50/100
312/312 - 154s - loss: 0.3146 - acc: 0.8636 - val_loss: 0.2879 - val_acc: 0.8822
Epoch 51/100
312/312 - 153s - loss: 0.3185 - acc: 0.8639 - val_loss: 0.3118 - val_acc: 0.8668
Epoch 52/100
312/312 - 154s - loss: 0.3136 - acc: 0.8610 - val_loss: 0.2760 - val_acc: 0.8852
Epoch 53/100
312/312 - 153s - loss: 0.3079 - acc: 0.8704 - val_loss: 0.3753 - val_acc: 0.8329
Epoch 54/100
312/312 - 153s - loss: 0.3078 - acc: 0.8688 - val_loss: 0.2850 - val_acc: 0.8790
Epoch 55/100
312/312 - 153s - loss: 0.3123 - acc: 0.8658 - val_loss: 0.2774 - val_acc: 0.8884
Epoch 56/100
312/312 - 154s - loss: 0.3161 - acc: 0.8685 - val_loss: 0.3321 - val_acc: 0.8584
Epoch 57/100
312/312 - 153s - loss: 0.3084 - acc: 0.8662 - val_loss: 0.2941 - val_acc: 0.8710
Epoch 58/100
312/312 - 154s - loss: 0.3224 - acc: 0.8651 - val_loss: 0.2577 - val_acc: 0.8906
Epoch 59/100
312/312 - 154s - loss: 0.3105 - acc: 0.8639 - val_loss: 0.3438 - val_acc: 0.8722
Epoch 60/100
312/312 - 153s - loss: 0.3070 - acc: 0.8684 - val_loss: 0.2602 - val_acc: 0.8898
Epoch 61/100
312/312 - 153s - loss: 0.3140 - acc: 0.8658 - val_loss: 0.2677 - val_acc: 0.8882
Epoch 62/100
312/312 - 153s - loss: 0.3096 - acc: 0.8684 - val_loss: 0.2614 - val_acc: 0.8842
Epoch 63/100
312/312 - 153s - loss: 0.3081 - acc: 0.8686 - val_loss: 0.2908 - val_acc: 0.8740
Epoch 64/100
312/312 - 153s - loss: 0.3134 - acc: 0.8668 - val_loss: 0.4757 - val_acc: 0.7314
Epoch 65/100
312/312 - 154s - loss: 0.3085 - acc: 0.8699 - val_loss: 0.3414 - val_acc: 0.8538
Epoch 66/100
TimedStop: Training out of time (Elapsed = 2:55:50.573494)
312/312 - 154s - loss: 0.3208 - acc: 0.8635 - val_loss: 0.3494 - val_acc: 0.8399
2019-07-19 01:24:01,375 graeae.timers.timer end: Ended: 2019-07-19 01:24:01.375685
I0719 01:24:01.375718 139703649425216 timer.py:77] Ended: 2019-07-19 01:24:01.375685
2019-07-19 01:24:01,376 graeae.timers.timer end: Elapsed: 2:55:52.494267
I0719 01:24:01.376679 139703649425216 timer.py:78] Elapsed: 2:55:52.494267

Weirdly, we got up to 89% validation accuracy this time but it didn't stop. Looking at the Stop callback, I had the maximum loss set to 0.24, which is probably why it never triggered, but it looks like I should be able to get better accuracy anyway.
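
For reference, here's a minimal sketch of the kind of Stop condition I mean (illustrative, not my actual class; the point is that both thresholds have to pass before it stops, which is how a 0.24 maximum loss kept an 89% accuracy epoch from triggering it):

import tensorflow

class Stop(tensorflow.keras.callbacks.Callback):
    """Stops training once validation accuracy and loss are both good enough."""
    def __init__(self, minimum_accuracy=0.9, maximum_loss=0.24,
                 call_on_stopping=None):
        super().__init__()
        self.minimum_accuracy = minimum_accuracy
        self.maximum_loss = maximum_loss
        self.call_on_stopping = call_on_stopping

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        # both conditions must hold, so a high-accuracy epoch whose loss is
        # above the ceiling won't stop the training
        if (logs.get("val_acc", 0) >= self.minimum_accuracy
                and logs.get("val_loss", float("inf")) <= self.maximum_loss):
            self.model.stop_training = True
            if self.call_on_stopping is not None:
                self.call_on_stopping(self.model)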

data = pandas.read_csv("~/cats_vs_dogs_2.csv")
print(tabulate(data[data.validation_accuracy>=0.88],
      headers="keys", tablefmt="orgtbl"))
|    | training_loss | training_accuracy | validation_loss | validation_accuracy |
|----+---------------+-------------------+-----------------+---------------------|
| 22 |        0.3253 |            0.8642 |          0.2765 |              0.8918 |
| 24 |        0.3158 |            0.8639 |          0.287  |              0.8804 |
| 29 |        0.3224 |            0.861  |          0.2796 |              0.886  |
| 32 |        0.3222 |            0.8623 |          0.2824 |              0.887  |
| 35 |        0.3134 |            0.8643 |          0.2576 |              0.884  |
| 36 |        0.3296 |            0.8595 |          0.268  |              0.8838 |
| 43 |        0.3178 |            0.8657 |          0.2844 |              0.8844 |
| 46 |        0.326  |            0.8616 |          0.2585 |              0.894  |
| 48 |        0.3418 |            0.863  |          0.2806 |              0.8852 |
| 49 |        0.3146 |            0.8636 |          0.2879 |              0.8822 |
| 51 |        0.3136 |            0.861  |          0.276  |              0.8852 |
| 54 |        0.3123 |            0.8658 |          0.2774 |              0.8884 |
| 57 |        0.3224 |            0.8651 |          0.2577 |              0.8906 |
| 59 |        0.307  |            0.8684 |          0.2602 |              0.8898 |
| 60 |        0.314  |            0.8658 |          0.2677 |              0.8882 |
| 61 |        0.3096 |            0.8684 |          0.2614 |              0.8842 |

By increasing the batch size we got better results faster, so maybe it isn't done learning yet, or maybe an even larger batch size would help. It doesn't seem to be improving by much, though.

best_accuracy = data.validation_accuracy.max()
best_loss = data.validation_loss.min()
accuracy_index = data[data.validation_accuracy==best_accuracy].index[0]
loss_index = data[data.validation_loss==best_loss].index[0]

print(f"Highest Accuracy: {best_accuracy} Epoch: {accuracy_index + 1}")
print(f"Lowest Loss: {best_loss} Epoch: {loss_index}")
Highest Accuracy: 0.894 Epoch: 47
Lowest Loss: 0.2576 Epoch: 35
line_1 = holoviews.VLine(accuracy_index, label="Best Accuracy Epoch")
line_2 = holoviews.VLine(loss_index, label="Best Loss Epoch")
accuracy_line = holoviews.HLine(best_accuracy, label="Highest Accuracy")
loss_line = holoviews.HLine(best_loss, label="Lowest Loss")

curves = [holoviews.Curve(data, ("index", "Epoch"), "training_loss", 
                          label="Training Loss",),
          holoviews.Curve(data, ("index", "Epoch"), "training_accuracy", 
                          label="Training Accuracy").opts(tools=["hover"]),
          holoviews.Curve(data, ("index", "Epoch"), "validation_loss", 
                          label="Validation Loss",).opts(tools=["hover"]),
          holoviews.Curve(data, ("index", "Epoch"), "validation_accuracy", 
                          label="Validation Accuracy").opts(tools=["hover"]),
          line_1, line_2, accuracy_line, loss_line]
plot = holoviews.Overlay(curves).opts(tools=["hover"], height=800, width=1000, 
                                      ylabel="Performance", 
                                      title="Training vs Validation")
Embed(plot=plot, file_name="training_validation_loss_13")()

Figure Missing

It looks like it might be plateauing. Only one way to really find out, I guess: more training.

  • Take Thirteen Point Two

    I'll re-train it without using the Stop condition to see if it gets better than I was allowing it to get.

    saver = partial(save_model, path=path)
    # the Stop condition is built but deliberately left out of the callbacks
    # below, so only the time limit can end this run
    good_enough = Stop(call_on_stopping=saver,
                       minimum_accuracy=0.90)
    out_of_time = TimedStop(call_on_stopping=saver)
    
    network = Network(str(training_path), 
                      callbacks=[out_of_time],
                      convolution_layers=5,
                      set_steps = True,
                      epochs = 100,
                      batch_size=64)
    network._model = tensorflow.keras.models.load_model(str(path))
    with TIMER:
        network.train()
    
    W0720 10:27:38.686592 139935525873472 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling GlorotUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
    Instructions for updating:
    Call initializer instance with the dtype argument instead of passing it to the constructor
    W0720 10:27:38.688725 139935525873472 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
    Instructions for updating:
    Call initializer instance with the dtype argument instead of passing it to the constructor
    W0720 10:27:38.690803 139935525873472 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
    Instructions for updating:
    Call initializer instance with the dtype argument instead of passing it to the constructor
    W0720 10:27:53.533317 139935525873472 deprecation.py:323] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
    Instructions for updating:
    Use tf.where in 2.0, which has the same broadcast rule as np.where
    2019-07-20 10:27:54,312 graeae.timers.timer start: Started: 2019-07-20 10:27:54.312000
    I0720 10:27:54.312226 139935525873472 timer.py:70] Started: 2019-07-20 10:27:54.312000
    Found 20000 images belonging to 2 classes.
    Found 5000 images belonging to 2 classes.
    Epoch 1/100
    312/312 - 448s - loss: 0.3622 - acc: 0.8679 - val_loss: 0.2536 - val_acc: 0.8898
    Epoch 2/100
    312/312 - 154s - loss: 0.3059 - acc: 0.8716 - val_loss: 0.2927 - val_acc: 0.8932
    Epoch 3/100
    312/312 - 154s - loss: 0.3164 - acc: 0.8691 - val_loss: 0.3203 - val_acc: 0.8620
    Epoch 4/100
    312/312 - 154s - loss: 0.3190 - acc: 0.8620 - val_loss: 1.0261 - val_acc: 0.6781
    Epoch 5/100
    312/312 - 153s - loss: 0.3308 - acc: 0.8662 - val_loss: 0.2791 - val_acc: 0.8736
    Epoch 6/100
    312/312 - 152s - loss: 0.3234 - acc: 0.8633 - val_loss: 0.2434 - val_acc: 0.9054
    Epoch 7/100
    312/312 - 151s - loss: 0.3040 - acc: 0.8702 - val_loss: 0.2895 - val_acc: 0.8828
    Epoch 8/100
    312/312 - 150s - loss: 0.3063 - acc: 0.8714 - val_loss: 0.2788 - val_acc: 0.8744
    Epoch 9/100
    312/312 - 150s - loss: 0.3043 - acc: 0.8717 - val_loss: 0.4412 - val_acc: 0.8011
    Epoch 10/100
    312/312 - 150s - loss: 0.3059 - acc: 0.8690 - val_loss: 0.3050 - val_acc: 0.8784
    Epoch 11/100
    312/312 - 150s - loss: 0.3368 - acc: 0.8633 - val_loss: 0.2813 - val_acc: 0.8864
    Epoch 12/100
    312/312 - 150s - loss: 0.3282 - acc: 0.8633 - val_loss: 0.4125 - val_acc: 0.8209
    Epoch 13/100
    312/312 - 150s - loss: 0.3091 - acc: 0.8701 - val_loss: 0.3506 - val_acc: 0.8842
    Epoch 14/100
    312/312 - 149s - loss: 0.3369 - acc: 0.8585 - val_loss: 0.2702 - val_acc: 0.8908
    Epoch 15/100
    312/312 - 150s - loss: 0.3253 - acc: 0.8624 - val_loss: 0.2360 - val_acc: 0.8996
    Epoch 16/100
    312/312 - 149s - loss: 0.3206 - acc: 0.8622 - val_loss: 0.4118 - val_acc: 0.8454
    Epoch 17/100
    312/312 - 149s - loss: 0.3396 - acc: 0.8629 - val_loss: 0.4940 - val_acc: 0.7899
    Epoch 18/100
    312/312 - 149s - loss: 0.3190 - acc: 0.8632 - val_loss: 0.2928 - val_acc: 0.8762
    Epoch 19/100
    312/312 - 150s - loss: 0.3154 - acc: 0.8658 - val_loss: 0.2806 - val_acc: 0.8914
    Epoch 20/100
    312/312 - 151s - loss: 0.3241 - acc: 0.8632 - val_loss: 0.2797 - val_acc: 0.8778
    Epoch 21/100
    312/312 - 150s - loss: 0.4349 - acc: 0.8620 - val_loss: 0.2824 - val_acc: 0.8836
    Epoch 22/100
    312/312 - 150s - loss: 0.3255 - acc: 0.8626 - val_loss: 0.3107 - val_acc: 0.8736
    Epoch 23/100
    312/312 - 150s - loss: 0.3139 - acc: 0.8678 - val_loss: 0.2830 - val_acc: 0.8796
    Epoch 24/100
    312/312 - 151s - loss: 0.3231 - acc: 0.8641 - val_loss: 0.3103 - val_acc: 0.8874
    Epoch 25/100
    312/312 - 151s - loss: 0.3222 - acc: 0.8647 - val_loss: 0.7359 - val_acc: 0.6821
    Epoch 26/100
    312/312 - 151s - loss: 0.3140 - acc: 0.8657 - val_loss: 0.5905 - val_acc: 0.8235
    Epoch 27/100
    312/312 - 150s - loss: 0.3431 - acc: 0.8684 - val_loss: 0.3469 - val_acc: 0.8776
    Epoch 28/100
    312/312 - 149s - loss: 0.3124 - acc: 0.8681 - val_loss: 0.3550 - val_acc: 0.8444
    Epoch 29/100
    312/312 - 149s - loss: 0.3159 - acc: 0.8682 - val_loss: 0.3107 - val_acc: 0.8642
    Epoch 30/100
    312/312 - 149s - loss: 0.3177 - acc: 0.8621 - val_loss: 0.2972 - val_acc: 0.8700
    Epoch 31/100
    312/312 - 149s - loss: 0.3370 - acc: 0.8636 - val_loss: 0.2857 - val_acc: 0.8876
    Epoch 32/100
    312/312 - 149s - loss: 0.3476 - acc: 0.8602 - val_loss: 0.2762 - val_acc: 0.8866
    Epoch 33/100
    312/312 - 150s - loss: 0.3383 - acc: 0.8600 - val_loss: 0.3066 - val_acc: 0.8874
    Epoch 34/100
    312/312 - 150s - loss: 0.3316 - acc: 0.8681 - val_loss: 0.4850 - val_acc: 0.7244
    Epoch 35/100
    312/312 - 149s - loss: 0.3158 - acc: 0.8642 - val_loss: 0.2958 - val_acc: 0.8748
    Epoch 36/100
    312/312 - 151s - loss: 0.3285 - acc: 0.8597 - val_loss: 0.2739 - val_acc: 0.8762
    Epoch 37/100
    312/312 - 150s - loss: 0.3239 - acc: 0.8622 - val_loss: 0.3081 - val_acc: 0.8714
    Epoch 38/100
    312/312 - 150s - loss: 0.3277 - acc: 0.8601 - val_loss: 0.3068 - val_acc: 0.8882
    Epoch 39/100
    312/312 - 152s - loss: 0.3228 - acc: 0.8601 - val_loss: 0.3480 - val_acc: 0.8546
    Epoch 40/100
    312/312 - 152s - loss: 0.3631 - acc: 0.8633 - val_loss: 0.2939 - val_acc: 0.8906
    Epoch 41/100
    312/312 - 152s - loss: 0.3313 - acc: 0.8592 - val_loss: 0.2674 - val_acc: 0.8918
    Epoch 42/100
    312/312 - 152s - loss: 0.4150 - acc: 0.8648 - val_loss: 0.2846 - val_acc: 0.8796
    Epoch 43/100
    312/312 - 154s - loss: 0.3161 - acc: 0.8665 - val_loss: 0.2665 - val_acc: 0.8858
    Epoch 44/100
    312/312 - 153s - loss: 0.3330 - acc: 0.8627 - val_loss: 0.2977 - val_acc: 0.8726
    Epoch 45/100
    312/312 - 149s - loss: 0.3250 - acc: 0.8668 - val_loss: 0.2746 - val_acc: 0.8796
    Epoch 46/100
    312/312 - 153s - loss: 0.3091 - acc: 0.8715 - val_loss: 0.2864 - val_acc: 0.8924
    Epoch 47/100
    312/312 - 151s - loss: 0.3288 - acc: 0.8622 - val_loss: 0.2662 - val_acc: 0.8886
    Epoch 48/100
    312/312 - 149s - loss: 0.3594 - acc: 0.8589 - val_loss: 0.2818 - val_acc: 0.8842
    Epoch 49/100
    312/312 - 148s - loss: 0.3275 - acc: 0.8604 - val_loss: 0.4913 - val_acc: 0.7610
    Epoch 50/100
    312/312 - 149s - loss: 0.3611 - acc: 0.8586 - val_loss: 0.3517 - val_acc: 0.8438
    Epoch 51/100
    312/312 - 148s - loss: 0.3413 - acc: 0.8568 - val_loss: 0.3549 - val_acc: 0.8602
    Epoch 52/100
    312/312 - 148s - loss: 0.3275 - acc: 0.8627 - val_loss: 0.2567 - val_acc: 0.8926
    Epoch 53/100
    312/312 - 148s - loss: 0.3679 - acc: 0.8592 - val_loss: 0.3676 - val_acc: 0.8554
    Epoch 54/100
    312/312 - 148s - loss: 0.3332 - acc: 0.8580 - val_loss: 0.2862 - val_acc: 0.8754
    Epoch 55/100
    312/312 - 148s - loss: 0.3254 - acc: 0.8644 - val_loss: 11.4265 - val_acc: 0.5905
    Epoch 56/100
    312/312 - 148s - loss: 0.3735 - acc: 0.8634 - val_loss: 0.3241 - val_acc: 0.8728
    Epoch 57/100
    312/312 - 148s - loss: 0.3401 - acc: 0.8588 - val_loss: 0.2946 - val_acc: 0.8746
    Epoch 58/100
    312/312 - 149s - loss: 0.4813 - acc: 0.8624 - val_loss: 0.3596 - val_acc: 0.8452
    Epoch 59/100
    312/312 - 148s - loss: 0.3279 - acc: 0.8633 - val_loss: 0.2789 - val_acc: 0.8734
    Epoch 60/100
    312/312 - 147s - loss: 0.3375 - acc: 0.8591 - val_loss: 0.3680 - val_acc: 0.8331
    Epoch 61/100
    312/312 - 148s - loss: 0.3359 - acc: 0.8585 - val_loss: 0.3290 - val_acc: 0.8802
    Epoch 62/100
    312/312 - 148s - loss: 0.3259 - acc: 0.8633 - val_loss: 0.3308 - val_acc: 0.8676
    Epoch 63/100
    312/312 - 149s - loss: 0.3290 - acc: 0.8598 - val_loss: 0.2710 - val_acc: 0.8850
    Epoch 64/100
    312/312 - 148s - loss: 0.3383 - acc: 0.8507 - val_loss: 0.2740 - val_acc: 0.8838
    Epoch 65/100
    312/312 - 149s - loss: 0.3287 - acc: 0.8627 - val_loss: 0.3301 - val_acc: 0.8540
    Epoch 66/100
    312/312 - 149s - loss: 1.9072 - acc: 0.8593 - val_loss: 0.3180 - val_acc: 0.8602
    Epoch 67/100
    312/312 - 148s - loss: 0.3265 - acc: 0.8597 - val_loss: 0.2972 - val_acc: 0.8708
    Epoch 68/100
    312/312 - 150s - loss: 0.3515 - acc: 0.8613 - val_loss: 0.4104 - val_acc: 0.8444
    Epoch 69/100
    TimedStop: Training out of time (Elapsed = 2:57:23.471615)
    TimedStop: Longest Epoch = 2:57:23.471615
    312/312 - 148s - loss: 0.3617 - acc: 0.8605 - val_loss: 0.6406 - val_acc: 0.8221
    2019-07-20 13:25:18,577 graeae.timers.timer end: Ended: 2019-07-20 13:25:18.577775
    I0720 13:25:18.577804 139935525873472 timer.py:77] Ended: 2019-07-20 13:25:18.577775
    2019-07-20 13:25:18,578 graeae.timers.timer end: Elapsed: 2:57:24.265775
    I0720 13:25:18.578630 139935525873472 timer.py:78] Elapsed: 2:57:24.265775
    
    data = pandas.read_csv("~/cats_vs_dogs_3.csv")
    print(tabulate(data[data.validation_accuracy>=0.89],
          headers="keys", tablefmt="orgtbl"))
    
    |    |   training_loss |   training_accuracy |   validation_loss |   validation_accuracy |
    |----+-----------------+---------------------+-------------------+-----------------------|
    |  1 |          0.3059 |              0.8716 |            0.2927 |                0.8932 |
    |  5 |          0.3234 |              0.8633 |            0.2434 |                0.9054 |
    | 13 |          0.3369 |              0.8585 |            0.2702 |                0.8908 |
    | 14 |          0.3253 |              0.8624 |            0.236  |                0.8996 |
    | 18 |          0.3154 |              0.8658 |            0.2806 |                0.8914 |
    | 39 |          0.3631 |              0.8633 |            0.2939 |                0.8906 |
    | 40 |          0.3313 |              0.8592 |            0.2674 |                0.8918 |
    | 45 |          0.3091 |              0.8715 |            0.2864 |                0.8924 |
    | 51 |          0.3275 |              0.8627 |            0.2567 |                0.8926 |
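
    These per-epoch CSVs were written out during training. In case it's useful, here's a minimal sketch of one way to produce an equivalent file with a stock Keras callback. Note that this is not how it was done here, and the stock CSVLogger uses Keras's own column names (epoch, acc, loss, val_acc, val_loss), so the columns would need renaming to match this post's:

    from pathlib import Path
    import tensorflow

    # Hypothetical: append one row of metrics per epoch to a CSV file.
    csv_logger = tensorflow.keras.callbacks.CSVLogger(
        str(Path("~/cats_vs_dogs_3.csv").expanduser()))
    # model.fit(..., callbacks=[csv_logger])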
    best_accuracy = data.validation_accuracy.max()
    best_loss = data.validation_loss.min()
    accuracy_index = data[data.validation_accuracy==best_accuracy].index[0]
    loss_index = data[data.validation_loss==best_loss].index[0]
    
    print(f"Highest Accuracy: {best_accuracy} Epoch: {accuracy_index + 1}")
    print(f"Lowest Loss: {best_loss} Epoch: {loss_index}")
    
    Highest Accuracy: 0.9054 Epoch: 6
    Lowest Loss: 0.236 Epoch: 15
    
    line_1 = holoviews.VLine(accuracy_index, label="Best Accuracy Epoch")
    line_2 = holoviews.VLine(loss_index, label="Best Loss Epoch")
    accuracy_line = holoviews.HLine(best_accuracy, label="Highest Accuracy")
    loss_line = holoviews.HLine(best_loss, label="Lowest Loss")
    
    curves = [holoviews.Curve(data, ("index", "Epoch"), "training_loss", 
                              label="Training Loss",),
              holoviews.Curve(data, ("index", "Epoch"), "training_accuracy", 
                              label="Training Accuracy").opts(tools=["hover"]),
              holoviews.Curve(data, ("index", "Epoch"), "validation_loss", 
                              label="Validation Loss",).opts(tools=["hover"]),
              holoviews.Curve(data, ("index", "Epoch"), "validation_accuracy", 
                              label="Validation Accuracy").opts(tools=["hover"]),
              line_1, line_2, accuracy_line, loss_line]
    plot = holoviews.Overlay(curves).opts(tools=["hover"], height=800, width=1000, 
                                          ylabel="Performance", 
                                          title="Training vs Validation")
    Embed(plot=plot, file_name="training_validation_loss_13")()
    

    Figure Missing

    So, I'm not sure what to make of this. It looks like it improved a little but peaked early on, and yet it kept climbing back up to a high level of validation accuracy. Is it overfitting or not? I guess I'll train it some more and find out.

    with TIMER:
        network.train()
    
    2019-07-20 13:37:51,641 graeae.timers.timer start: Started: 2019-07-20 13:37:51.641628
    I0720 13:37:51.641655 139935525873472 timer.py:70] Started: 2019-07-20 13:37:51.641628
    Epoch 1/100
    312/312 - 152s - loss: 0.3367 - acc: 0.8564 - val_loss: 0.4919 - val_acc: 0.7398
    Epoch 2/100
    312/312 - 152s - loss: 0.3354 - acc: 0.8538 - val_loss: 0.4213 - val_acc: 0.7724
    Epoch 3/100
    312/312 - 151s - loss: 0.3568 - acc: 0.8515 - val_loss: 0.4105 - val_acc: 0.8249
    Epoch 4/100
    312/312 - 152s - loss: 0.3413 - acc: 0.8557 - val_loss: 0.2920 - val_acc: 0.8850
    Epoch 5/100
    312/312 - 154s - loss: 0.3542 - acc: 0.8528 - val_loss: 0.2883 - val_acc: 0.8846
    Epoch 6/100
    312/312 - 153s - loss: 0.3494 - acc: 0.8558 - val_loss: 0.4051 - val_acc: 0.7983
    Epoch 7/100
    312/312 - 151s - loss: 0.3653 - acc: 0.8487 - val_loss: 0.3081 - val_acc: 0.8710
    Epoch 8/100
    312/312 - 152s - loss: 0.3699 - acc: 0.8447 - val_loss: 0.2843 - val_acc: 0.8882
    Epoch 9/100
    312/312 - 151s - loss: 0.8043 - acc: 0.8519 - val_loss: 0.3204 - val_acc: 0.8518
    Epoch 10/100
    312/312 - 151s - loss: 0.3429 - acc: 0.8528 - val_loss: 0.2779 - val_acc: 0.8796
    Epoch 11/100
    312/312 - 150s - loss: 0.3486 - acc: 0.8572 - val_loss: 0.3647 - val_acc: 0.8435
    Epoch 12/100
    312/312 - 149s - loss: 0.3324 - acc: 0.8572 - val_loss: 0.3560 - val_acc: 0.8602
    Epoch 13/100
    312/312 - 149s - loss: 0.3465 - acc: 0.8529 - val_loss: 0.3772 - val_acc: 0.8584
    Epoch 14/100
    312/312 - 148s - loss: 0.4045 - acc: 0.8446 - val_loss: 0.4425 - val_acc: 0.7360
    Epoch 15/100
    312/312 - 150s - loss: 6.0508 - acc: 0.8242 - val_loss: 0.3422 - val_acc: 0.8460
    Epoch 16/100
    312/312 - 150s - loss: 0.3486 - acc: 0.8495 - val_loss: 0.4366 - val_acc: 0.7712
    Epoch 17/100
    312/312 - 149s - loss: 0.3509 - acc: 0.8523 - val_loss: 0.2882 - val_acc: 0.8730
    Epoch 18/100
    312/312 - 149s - loss: 0.3509 - acc: 0.8513 - val_loss: 0.3403 - val_acc: 0.8508
    Epoch 19/100
    312/312 - 148s - loss: 0.3557 - acc: 0.8486 - val_loss: 0.2864 - val_acc: 0.8886
    Epoch 20/100
    312/312 - 148s - loss: 0.3596 - acc: 0.8491 - val_loss: 0.3974 - val_acc: 0.7899
    Epoch 21/100
    312/312 - 149s - loss: 0.3630 - acc: 0.8465 - val_loss: 0.3558 - val_acc: 0.8482
    Epoch 22/100
    312/312 - 149s - loss: 0.3745 - acc: 0.8441 - val_loss: 0.2867 - val_acc: 0.8870
    Epoch 23/100
    312/312 - 148s - loss: 0.3749 - acc: 0.8490 - val_loss: 0.2815 - val_acc: 0.8806
    Epoch 24/100
    312/312 - 149s - loss: 0.5479 - acc: 0.8500 - val_loss: 0.3382 - val_acc: 0.8830
    Epoch 25/100
    312/312 - 148s - loss: 0.3600 - acc: 0.8485 - val_loss: 0.5508 - val_acc: 0.6703
    Epoch 26/100
    312/312 - 148s - loss: 0.3494 - acc: 0.8533 - val_loss: 0.4929 - val_acc: 0.7678
    Epoch 27/100
    312/312 - 148s - loss: 0.3487 - acc: 0.8493 - val_loss: 0.3008 - val_acc: 0.8728
    Epoch 28/100
    312/312 - 148s - loss: 0.3657 - acc: 0.8483 - val_loss: 0.3355 - val_acc: 0.8668
    Epoch 29/100
    312/312 - 148s - loss: 0.3730 - acc: 0.8422 - val_loss: 0.7513 - val_acc: 0.7348
    Epoch 30/100
    312/312 - 148s - loss: 0.3703 - acc: 0.8477 - val_loss: 0.3497 - val_acc: 0.8558
    Epoch 31/100
    312/312 - 148s - loss: 0.3626 - acc: 0.8453 - val_loss: 0.2778 - val_acc: 0.8940
    Epoch 32/100
    312/312 - 148s - loss: 0.3627 - acc: 0.8470 - val_loss: 0.2933 - val_acc: 0.8768
    Epoch 33/100
    312/312 - 148s - loss: 0.4131 - acc: 0.8341 - val_loss: 0.3067 - val_acc: 0.8698
    Epoch 34/100
    312/312 - 149s - loss: 0.3766 - acc: 0.8491 - val_loss: 0.5967 - val_acc: 0.6657
    Epoch 35/100
    312/312 - 148s - loss: 0.5109 - acc: 0.8372 - val_loss: 0.3495 - val_acc: 0.8440
    Epoch 36/100
    312/312 - 148s - loss: 0.3736 - acc: 0.8472 - val_loss: 0.7757 - val_acc: 0.7460
    Epoch 37/100
    312/312 - 149s - loss: 0.3557 - acc: 0.8509 - val_loss: 0.3210 - val_acc: 0.8726
    Epoch 38/100
    312/312 - 149s - loss: 0.3841 - acc: 0.8386 - val_loss: 0.2899 - val_acc: 0.8640
    Epoch 39/100
    312/312 - 149s - loss: 0.4163 - acc: 0.8462 - val_loss: 0.4341 - val_acc: 0.7977
    Epoch 40/100
    312/312 - 148s - loss: 0.3674 - acc: 0.8446 - val_loss: 0.5823 - val_acc: 0.8526
    Epoch 41/100
    312/312 - 148s - loss: 0.3612 - acc: 0.8510 - val_loss: 0.2640 - val_acc: 0.8804
    Epoch 42/100
    312/312 - 149s - loss: 0.3752 - acc: 0.8491 - val_loss: 0.4750 - val_acc: 0.7726
    Epoch 43/100
    312/312 - 149s - loss: 0.3563 - acc: 0.8424 - val_loss: 0.3072 - val_acc: 0.8810
    Epoch 44/100
    312/312 - 149s - loss: 0.3587 - acc: 0.8474 - val_loss: 0.3906 - val_acc: 0.8866
    Epoch 45/100
    312/312 - 149s - loss: 0.3615 - acc: 0.8461 - val_loss: 0.5745 - val_acc: 0.7776
    Epoch 46/100
    312/312 - 149s - loss: 0.4017 - acc: 0.8384 - val_loss: 0.3786 - val_acc: 0.8722
    Epoch 47/100
    312/312 - 149s - loss: 0.3969 - acc: 0.8383 - val_loss: 0.4942 - val_acc: 0.7286
    Epoch 48/100
    312/312 - 148s - loss: 0.3628 - acc: 0.8395 - val_loss: 0.3358 - val_acc: 0.8606
    Epoch 49/100
    312/312 - 149s - loss: 0.3804 - acc: 0.8353 - val_loss: 0.3131 - val_acc: 0.8698
    Epoch 50/100
    312/312 - 150s - loss: 0.3997 - acc: 0.8378 - val_loss: 0.4310 - val_acc: 0.7873
    Epoch 51/100
    312/312 - 152s - loss: 0.3905 - acc: 0.8364 - val_loss: 0.3262 - val_acc: 0.8750
    Epoch 52/100
    312/312 - 151s - loss: 0.4109 - acc: 0.8449 - val_loss: 0.7448 - val_acc: 0.7512
    Epoch 53/100
    312/312 - 150s - loss: 0.3737 - acc: 0.8426 - val_loss: 0.3926 - val_acc: 0.8317
    Epoch 54/100
    312/312 - 148s - loss: 0.3573 - acc: 0.8420 - val_loss: 0.3395 - val_acc: 0.8516
    Epoch 55/100
    312/312 - 149s - loss: 0.3886 - acc: 0.8314 - val_loss: 0.3221 - val_acc: 0.8423
    Epoch 56/100
    312/312 - 149s - loss: 0.3991 - acc: 0.8325 - val_loss: 0.4920 - val_acc: 0.8243
    Epoch 57/100
    312/312 - 149s - loss: 0.3845 - acc: 0.8332 - val_loss: 0.3332 - val_acc: 0.8518
    Epoch 58/100
    312/312 - 149s - loss: 0.3764 - acc: 0.8336 - val_loss: 0.5819 - val_acc: 0.7029
    Epoch 59/100
    312/312 - 149s - loss: 0.4083 - acc: 0.8326 - val_loss: 0.3206 - val_acc: 0.8494
    Epoch 60/100
    312/312 - 149s - loss: 0.4010 - acc: 0.8372 - val_loss: 1.2613 - val_acc: 0.6791
    Epoch 61/100
    312/312 - 149s - loss: 0.4093 - acc: 0.8316 - val_loss: 0.4272 - val_acc: 0.7905
    Epoch 62/100
    312/312 - 149s - loss: 0.4389 - acc: 0.8319 - val_loss: 0.3596 - val_acc: 0.8486
    Epoch 63/100
    312/312 - 148s - loss: 0.4066 - acc: 0.8385 - val_loss: 0.3253 - val_acc: 0.8774
    Epoch 64/100
    312/312 - 148s - loss: 0.4000 - acc: 0.8325 - val_loss: 0.3190 - val_acc: 0.8920
    Epoch 65/100
    312/312 - 149s - loss: 0.3924 - acc: 0.8355 - val_loss: 0.5483 - val_acc: 0.6829
    Epoch 66/100
    312/312 - 150s - loss: 0.4344 - acc: 0.8272 - val_loss: 0.4520 - val_acc: 0.7971
    Epoch 67/100
    312/312 - 150s - loss: 0.3973 - acc: 0.8338 - val_loss: 0.2962 - val_acc: 0.8770
    Epoch 68/100
    312/312 - 150s - loss: 0.3868 - acc: 0.8299 - val_loss: 0.3322 - val_acc: 0.8510
    Epoch 69/100
    312/312 - 150s - loss: 0.4051 - acc: 0.8243 - val_loss: 0.4731 - val_acc: 0.8295
    Epoch 70/100
    312/312 - 150s - loss: 0.4241 - acc: 0.8228 - val_loss: 0.3180 - val_acc: 0.8682
    Epoch 71/100
    TimedStop: Training out of time (Elapsed = 2:56:41.032776)
    TimedStop: Longest Epoch = 2:56:41.032776
    312/312 - 149s - loss: 0.4185 - acc: 0.8227 - val_loss: 0.3691 - val_acc: 0.8041
    2019-07-20 16:34:32,739 graeae.timers.timer end: Ended: 2019-07-20 16:34:32.739264
    I0720 16:34:32.739288 139935525873472 timer.py:77] Ended: 2019-07-20 16:34:32.739264
    2019-07-20 16:34:32,740 graeae.timers.timer end: Elapsed: 2:56:41.097636
    I0720 16:34:32.740128 139935525873472 timer.py:78] Elapsed: 2:56:41.097636
    
    data = pandas.read_csv("~/cats_vs_dogs_4.csv")
    print(tabulate(data[data.validation_accuracy>=0.89],
          headers="keys", tablefmt="orgtbl"))
    
    |    |   training_loss |   training_accuracy |   validation_loss |   validation_accuracy |
    |----+-----------------+---------------------+-------------------+-----------------------|
    | 30 |          0.3626 |              0.8453 |            0.2778 |                 0.894 |
    | 63 |          0.4    |              0.8325 |            0.319  |                 0.892 |

    So, we don't see a lot of degradation, but we also don't see much improvement.

    best_accuracy = data.validation_accuracy.max()
    best_loss = data.validation_loss.min()
    accuracy_index = data[data.validation_accuracy==best_accuracy].index[0]
    loss_index = data[data.validation_loss==best_loss].index[0]
    
    print(f"Highest Accuracy: {best_accuracy} Epoch: {accuracy_index + 1}")
    print(f"Lowest Loss: {best_loss} Epoch: {loss_index}")
    
    Highest Accuracy: 0.894 Epoch: 31
    Lowest Loss: 0.264 Epoch: 41
    
    line_1 = holoviews.VLine(accuracy_index, label="Best Accuracy Epoch")
    line_2 = holoviews.VLine(loss_index, label="Best Loss Epoch")
    accuracy_line = holoviews.HLine(best_accuracy, label="Highest Accuracy")
    loss_line = holoviews.HLine(best_loss, label="Lowest Loss")
    
    curves = [holoviews.Curve(data, ("index", "Epoch"), "training_loss", 
                              label="Training Loss",),
              holoviews.Curve(data, ("index", "Epoch"), "training_accuracy", 
                              label="Training Accuracy").opts(tools=["hover"]),
              holoviews.Curve(data, ("index", "Epoch"), "validation_loss", 
                              label="Validation Loss",).opts(tools=["hover"]),
              holoviews.Curve(data, ("index", "Epoch"), "validation_accuracy", 
                              label="Validation Accuracy").opts(tools=["hover"]),
              line_1, line_2, accuracy_line, loss_line]
    plot = holoviews.Overlay(curves).opts(tools=["hover"], height=800, width=1000, 
                                          ylabel="Performance", 
                                          title="Training vs Validation")
    Embed(plot=plot, file_name="training_validation_loss_13")()
    

    Figure Missing

Take Fourteen

Okay, I'm going to say that 90% is our target.

saver = partial(save_model, path=path)
good_enough = Stop(call_on_stopping=saver,
                   minimum_accuracy=0.9)

network = Network(str(training_path), 
                  callbacks=[good_enough],
                  convolution_layers=5,
                  set_steps = True,
                  epochs = 100,
                  batch_size=128)
print(str(network))
with TIMER:
    network.train()
2019-07-20 18:21:55,292 graeae.timers.timer start: Started: 2019-07-20 18:21:55.292917
I0720 18:21:55.292948 139935525873472 timer.py:70] Started: 2019-07-20 18:21:55.292917
(Network) - 
Path: /home/athena/data/datasets/images/dogs-vs-cats/train
 Epochs: 100
 Batch Size: 128
 Callbacks: [<__main__.Stop object at 0x7f440c04d518>]
Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 128
Callbacks: [<__main__.Stop object at 0x7f440c04d518>]
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/100
156/156 - 158s - loss: 0.7231 - acc: 0.5330 - val_loss: 0.6808 - val_acc: 0.5960
Epoch 2/100
156/156 - 158s - loss: 0.6639 - acc: 0.6077 - val_loss: 0.6349 - val_acc: 0.6396
Epoch 3/100
156/156 - 157s - loss: 0.6362 - acc: 0.6355 - val_loss: 0.6278 - val_acc: 0.6306
Epoch 4/100
156/156 - 158s - loss: 0.6144 - acc: 0.6669 - val_loss: 0.6162 - val_acc: 0.6522
Epoch 5/100
156/156 - 157s - loss: 0.5877 - acc: 0.6917 - val_loss: 0.5351 - val_acc: 0.7424
Epoch 6/100
156/156 - 152s - loss: 0.5736 - acc: 0.7032 - val_loss: 0.5579 - val_acc: 0.7125
Epoch 7/100
156/156 - 152s - loss: 0.5476 - acc: 0.7265 - val_loss: 0.5104 - val_acc: 0.7510
Epoch 8/100
156/156 - 151s - loss: 0.5268 - acc: 0.7372 - val_loss: 0.4760 - val_acc: 0.7692
Epoch 9/100
156/156 - 150s - loss: 0.5057 - acc: 0.7574 - val_loss: 0.4546 - val_acc: 0.7800
Epoch 10/100
156/156 - 149s - loss: 0.4862 - acc: 0.7667 - val_loss: 0.4432 - val_acc: 0.7987
Epoch 11/100
156/156 - 149s - loss: 0.4697 - acc: 0.7787 - val_loss: 0.4413 - val_acc: 0.7941
Epoch 12/100
156/156 - 151s - loss: 0.4419 - acc: 0.7924 - val_loss: 0.5044 - val_acc: 0.7454
Epoch 13/100
156/156 - 150s - loss: 0.4204 - acc: 0.8046 - val_loss: 0.4150 - val_acc: 0.8059
Epoch 14/100
156/156 - 150s - loss: 0.4038 - acc: 0.8155 - val_loss: 0.3776 - val_acc: 0.8245
Epoch 15/100
156/156 - 149s - loss: 0.3849 - acc: 0.8244 - val_loss: 0.4047 - val_acc: 0.8033
Epoch 16/100
156/156 - 149s - loss: 0.3745 - acc: 0.8311 - val_loss: 0.6245 - val_acc: 0.6635
Epoch 17/100
156/156 - 150s - loss: 0.3556 - acc: 0.8410 - val_loss: 0.3274 - val_acc: 0.8544
Epoch 18/100
156/156 - 150s - loss: 0.3434 - acc: 0.8447 - val_loss: 0.3143 - val_acc: 0.8606
Epoch 19/100
156/156 - 150s - loss: 0.3356 - acc: 0.8473 - val_loss: 0.3181 - val_acc: 0.8600
Epoch 20/100
156/156 - 148s - loss: 0.3295 - acc: 0.8551 - val_loss: 0.3042 - val_acc: 0.8630
Epoch 21/100
156/156 - 150s - loss: 0.3165 - acc: 0.8612 - val_loss: 0.2961 - val_acc: 0.8668
Epoch 22/100
156/156 - 149s - loss: 0.3060 - acc: 0.8650 - val_loss: 0.3530 - val_acc: 0.8484
Epoch 23/100
156/156 - 148s - loss: 0.3062 - acc: 0.8659 - val_loss: 0.3327 - val_acc: 0.8494
Epoch 24/100
156/156 - 150s - loss: 0.2942 - acc: 0.8725 - val_loss: 0.2533 - val_acc: 0.8944
Epoch 25/100
156/156 - 150s - loss: 0.2897 - acc: 0.8712 - val_loss: 0.3440 - val_acc: 0.8458
Epoch 26/100
156/156 - 149s - loss: 0.2829 - acc: 0.8769 - val_loss: 0.2627 - val_acc: 0.8842
Epoch 27/100
156/156 - 150s - loss: 0.2691 - acc: 0.8836 - val_loss: 0.2857 - val_acc: 0.8840
Epoch 28/100
156/156 - 149s - loss: 0.2814 - acc: 0.8780 - val_loss: 0.2717 - val_acc: 0.8800
Epoch 29/100
156/156 - 150s - loss: 0.2632 - acc: 0.8826 - val_loss: 0.3521 - val_acc: 0.8584
Epoch 30/100
156/156 - 150s - loss: 0.2652 - acc: 0.8868 - val_loss: 0.2736 - val_acc: 0.8872
Epoch 31/100
156/156 - 148s - loss: 0.2591 - acc: 0.8875 - val_loss: 0.2842 - val_acc: 0.8782
Epoch 32/100
156/156 - 150s - loss: 0.2569 - acc: 0.8851 - val_loss: 0.2657 - val_acc: 0.9014
Epoch 33/100
156/156 - 149s - loss: 0.2623 - acc: 0.8866 - val_loss: 0.2539 - val_acc: 0.8894
Epoch 34/100
156/156 - 150s - loss: 0.2564 - acc: 0.8893 - val_loss: 0.3088 - val_acc: 0.8606
Epoch 35/100
156/156 - 151s - loss: 0.2496 - acc: 0.8924 - val_loss: 0.2415 - val_acc: 0.8996
Epoch 36/100
156/156 - 147s - loss: 0.2459 - acc: 0.8918 - val_loss: 0.2661 - val_acc: 0.8816
Epoch 37/100
156/156 - 147s - loss: 0.2473 - acc: 0.8939 - val_loss: 0.2842 - val_acc: 0.8826
Epoch 38/100
156/156 - 146s - loss: 0.2418 - acc: 0.8948 - val_loss: 0.2875 - val_acc: 0.8864
Epoch 39/100
156/156 - 145s - loss: 0.2502 - acc: 0.8924 - val_loss: 0.2174 - val_acc: 0.9089
Epoch 40/100
156/156 - 146s - loss: 0.2416 - acc: 0.8966 - val_loss: 0.2251 - val_acc: 0.9058
Epoch 41/100
156/156 - 147s - loss: 0.2489 - acc: 0.8934 - val_loss: 0.2548 - val_acc: 0.8958
Epoch 42/100
156/156 - 150s - loss: 0.2341 - acc: 0.9011 - val_loss: 0.2150 - val_acc: 0.9065
Epoch 43/100
156/156 - 151s - loss: 0.2400 - acc: 0.9011 - val_loss: 0.2103 - val_acc: 0.9131
Epoch 44/100
156/156 - 150s - loss: 0.2340 - acc: 0.8990 - val_loss: 0.6287 - val_acc: 0.7945
Epoch 45/100
156/156 - 150s - loss: 0.2359 - acc: 0.8981 - val_loss: 0.2213 - val_acc: 0.9131
Epoch 46/100
156/156 - 150s - loss: 0.2277 - acc: 0.9034 - val_loss: 0.2491 - val_acc: 0.8920
Epoch 47/100
156/156 - 151s - loss: 0.2433 - acc: 0.8946 - val_loss: 0.2540 - val_acc: 0.8990
Epoch 48/100
156/156 - 145s - loss: 0.2354 - acc: 0.9000 - val_loss: 0.3256 - val_acc: 0.8464
Epoch 49/100
156/156 - 146s - loss: 0.2374 - acc: 0.8994 - val_loss: 0.2516 - val_acc: 0.8932
Epoch 50/100
156/156 - 145s - loss: 0.2335 - acc: 0.8997 - val_loss: 0.2138 - val_acc: 0.9127
Epoch 51/100
156/156 - 145s - loss: 0.2283 - acc: 0.9040 - val_loss: 0.2800 - val_acc: 0.9087
Epoch 52/100
156/156 - 146s - loss: 0.2367 - acc: 0.8992 - val_loss: 0.2279 - val_acc: 0.9113
Epoch 53/100
156/156 - 146s - loss: 0.2316 - acc: 0.9008 - val_loss: 0.2157 - val_acc: 0.9123
Epoch 54/100
156/156 - 146s - loss: 0.2236 - acc: 0.9059 - val_loss: 0.2653 - val_acc: 0.8946
Epoch 55/100
156/156 - 145s - loss: 0.2348 - acc: 0.8993 - val_loss: 0.2310 - val_acc: 0.9077
Epoch 56/100
156/156 - 144s - loss: 0.2383 - acc: 0.8982 - val_loss: 0.3550 - val_acc: 0.8544
Epoch 57/100
156/156 - 147s - loss: 0.2292 - acc: 0.9028 - val_loss: 0.2356 - val_acc: 0.9095
Epoch 58/100
156/156 - 146s - loss: 0.2315 - acc: 0.9042 - val_loss: 0.2088 - val_acc: 0.9163
Epoch 59/100
156/156 - 146s - loss: 0.2256 - acc: 0.9037 - val_loss: 0.4032 - val_acc: 0.8149
Epoch 60/100
156/156 - 145s - loss: 0.2302 - acc: 0.9024 - val_loss: 0.2611 - val_acc: 0.8918
Epoch 61/100
156/156 - 144s - loss: 0.2227 - acc: 0.9048 - val_loss: 0.2068 - val_acc: 0.9103
Epoch 62/100
156/156 - 145s - loss: 0.2353 - acc: 0.8988 - val_loss: 0.2446 - val_acc: 0.8862
Epoch 63/100
156/156 - 146s - loss: 0.2313 - acc: 0.9008 - val_loss: 0.2295 - val_acc: 0.9119
Epoch 64/100
156/156 - 144s - loss: 0.2237 - acc: 0.9042 - val_loss: 0.2193 - val_acc: 0.9179
Epoch 65/100
156/156 - 145s - loss: 0.2292 - acc: 0.9055 - val_loss: 0.2427 - val_acc: 0.8960
Epoch 66/100
156/156 - 145s - loss: 0.2353 - acc: 0.9015 - val_loss: 0.4461 - val_acc: 0.8035
Epoch 67/100
156/156 - 145s - loss: 0.2205 - acc: 0.9074 - val_loss: 0.2056 - val_acc: 0.9107
Epoch 68/100
156/156 - 145s - loss: 0.2257 - acc: 0.9035 - val_loss: 0.2528 - val_acc: 0.8846
Epoch 69/100
156/156 - 146s - loss: 0.2275 - acc: 0.9053 - val_loss: 0.2302 - val_acc: 0.9034
Epoch 70/100
156/156 - 144s - loss: 0.2261 - acc: 0.9032 - val_loss: 0.2248 - val_acc: 0.9107
Epoch 71/100
156/156 - 145s - loss: 0.2280 - acc: 0.9005 - val_loss: 0.2289 - val_acc: 0.9060
Epoch 72/100
156/156 - 145s - loss: 0.2276 - acc: 0.9070 - val_loss: 0.2604 - val_acc: 0.8960
Epoch 73/100
156/156 - 145s - loss: 0.2312 - acc: 0.9032 - val_loss: 0.2774 - val_acc: 0.8874
Epoch 74/100
156/156 - 145s - loss: 0.2249 - acc: 0.9062 - val_loss: 0.2025 - val_acc: 0.9173
Epoch 75/100
156/156 - 145s - loss: 0.2332 - acc: 0.9044 - val_loss: 0.2055 - val_acc: 0.9151
Epoch 76/100
156/156 - 145s - loss: 0.2238 - acc: 0.9065 - val_loss: 0.2933 - val_acc: 0.8954
Epoch 77/100
156/156 - 145s - loss: 0.2312 - acc: 0.9021 - val_loss: 0.2556 - val_acc: 0.8940
Epoch 78/100
156/156 - 145s - loss: 0.2186 - acc: 0.9070 - val_loss: 0.1914 - val_acc: 0.9217
Epoch 79/100
156/156 - 146s - loss: 0.2315 - acc: 0.9055 - val_loss: 0.2192 - val_acc: 0.9087
Epoch 80/100
156/156 - 145s - loss: 0.2264 - acc: 0.9039 - val_loss: 0.2494 - val_acc: 0.8862
Epoch 81/100
156/156 - 145s - loss: 0.2209 - acc: 0.9086 - val_loss: 0.2026 - val_acc: 0.9185
Epoch 82/100
156/156 - 145s - loss: 0.2356 - acc: 0.9026 - val_loss: 0.2011 - val_acc: 0.9163
Epoch 83/100
156/156 - 144s - loss: 0.2236 - acc: 0.9059 - val_loss: 0.1996 - val_acc: 0.9229
Epoch 84/100
156/156 - 145s - loss: 0.2277 - acc: 0.9023 - val_loss: 0.2071 - val_acc: 0.9157
Epoch 85/100
156/156 - 145s - loss: 0.2365 - acc: 0.9053 - val_loss: 0.2343 - val_acc: 0.9115
Epoch 86/100
156/156 - 145s - loss: 0.2237 - acc: 0.9026 - val_loss: 0.2111 - val_acc: 0.9187
Epoch 87/100
156/156 - 144s - loss: 0.2285 - acc: 0.9022 - val_loss: 0.1987 - val_acc: 0.9167
Epoch 88/100
156/156 - 146s - loss: 0.2390 - acc: 0.8996 - val_loss: 0.2169 - val_acc: 0.9069
Epoch 89/100
156/156 - 144s - loss: 0.2478 - acc: 0.9030 - val_loss: 0.2021 - val_acc: 0.9195
Epoch 90/100
156/156 - 146s - loss: 0.2221 - acc: 0.9037 - val_loss: 0.3306 - val_acc: 0.8758
Epoch 91/100
156/156 - 146s - loss: 0.2342 - acc: 0.9005 - val_loss: 0.3883 - val_acc: 0.8183
Epoch 92/100
156/156 - 145s - loss: 0.2286 - acc: 0.9052 - val_loss: 0.2461 - val_acc: 0.8920
Epoch 93/100
156/156 - 146s - loss: 0.2260 - acc: 0.9060 - val_loss: 0.2327 - val_acc: 0.9155
Epoch 94/100
156/156 - 144s - loss: 0.2391 - acc: 0.9017 - val_loss: 0.2106 - val_acc: 0.9151
Epoch 95/100
156/156 - 145s - loss: 0.2320 - acc: 0.9008 - val_loss: 0.2006 - val_acc: 0.9217
Epoch 96/100
156/156 - 146s - loss: 0.2305 - acc: 0.9044 - val_loss: 0.2095 - val_acc: 0.9129
Epoch 97/100
156/156 - 145s - loss: 0.2312 - acc: 0.9029 - val_loss: 0.2112 - val_acc: 0.9123
Epoch 98/100
156/156 - 145s - loss: 0.2214 - acc: 0.9055 - val_loss: 0.1963 - val_acc: 0.9203
Epoch 99/100
156/156 - 144s - loss: 0.2364 - acc: 0.9029 - val_loss: 0.2055 - val_acc: 0.9085
Epoch 100/100
156/156 - 146s - loss: 0.2198 - acc: 0.9074 - val_loss: 0.2404 - val_acc: 0.9006
2019-07-20 22:27:48,713 graeae.timers.timer end: Ended: 2019-07-20 22:27:48.713962
I0720 22:27:48.713997 139935525873472 timer.py:77] Ended: 2019-07-20 22:27:48.713962
2019-07-20 22:27:48,714 graeae.timers.timer end: Elapsed: 4:05:53.421045
I0720 22:27:48.714990 139935525873472 timer.py:78] Elapsed: 4:05:53.421045

So, two things to notice. One is that it did even better than I thought it would; the other is that it didn't stop. The fact that it didn't stop is because I was assigning Stop.on_epoch_end to itself instead of assigning Stop.on_end_handler to it (so Stop.on_end_handler was never called). Hopefully I've fixed it.
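
For reference, here's a minimal sketch of what the fixed wiring should look like. The real Stop class is defined earlier in this document and isn't shown here, so the details below are reconstructed from the prose and the Take Fourteen code, not the actual implementation:

import tensorflow

class Stop(tensorflow.keras.callbacks.Callback):
    """Sketch: stop training once validation accuracy is good enough."""
    def __init__(self, minimum_accuracy: float=0.9, call_on_stopping=None):
        super().__init__()
        self.minimum_accuracy = minimum_accuracy
        self.call_on_stopping = call_on_stopping

    def on_epoch_end(self, epoch, logs=None):
        # Keras calls this hook directly, so no re-assignment of
        # on_epoch_end is needed (which was the source of the bug).
        logs = logs or {}
        if logs.get("val_acc", 0) >= self.minimum_accuracy:
            self.model.stop_training = True
            if self.call_on_stopping is not None:
                self.call_on_stopping(self.model)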

Take Fifteen

I originally set this up with a batch-size of 256 and no callbacks and then had to kill it because it hadn't stopped by the next morning, when I had to go to work. That's probably something to be dealt with when training a real model, but in this case I'm just doing an exercise, so I'll try it with 256 and then have it time out after 8 hours.
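
The TimedStop messages in the earlier output come from another callback defined earlier in this document. The idea is to stop before a wall-clock budget runs out; here is a minimal sketch (the names and the headroom heuristic are assumptions, not the actual implementation):

from datetime import datetime, timedelta
import tensorflow

class TimedStop(tensorflow.keras.callbacks.Callback):
    """Sketch: stop training before a wall-clock time limit."""
    def __init__(self, limit: timedelta=timedelta(hours=8)):
        super().__init__()
        self.limit = limit
        self.longest_epoch = timedelta(0)

    def on_train_begin(self, logs=None):
        self.started = datetime.now()

    def on_epoch_begin(self, epoch, logs=None):
        self.epoch_started = datetime.now()

    def on_epoch_end(self, epoch, logs=None):
        now = datetime.now()
        self.longest_epoch = max(self.longest_epoch, now - self.epoch_started)
        elapsed = now - self.started
        # Quit if another epoch as long as the longest so far wouldn't fit.
        if elapsed + self.longest_epoch > self.limit:
            print(f"TimedStop: Training out of time (Elapsed = {elapsed})")
            print(f"TimedStop: Longest Epoch = {self.longest_epoch}")
            self.model.stop_training = True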

network = Network(str(training_path), 
                  convolution_layers=5,
                  set_steps = True,
                  epochs = 100,
                  batch_size=256)
print(str(network))
with TIMER:
    network.train()
2019-07-22 22:29:32,865 graeae.timers.timer start: Started: 2019-07-22 22:29:32.865426
I0722 22:29:32.865622 140052197230400 timer.py:70] Started: 2019-07-22 22:29:32.865426
(Network) - 
Path: /home/athena/data/datasets/images/dogs-vs-cats/train
 Epochs: 100
 Batch Size: 256
 Callbacks: None
Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 256
Callbacks: None
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
W0722 22:29:33.512133 140052197230400 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0722 22:29:33.928766 140052197230400 deprecation.py:323] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Epoch 1/100
78/78 - 429s - loss: 0.6963 - acc: 0.5277 - val_loss: 0.6838 - val_acc: 0.6053
Epoch 2/100
78/78 - 152s - loss: 0.6823 - acc: 0.5664 - val_loss: 0.7476 - val_acc: 0.5107
Epoch 3/100
78/78 - 152s - loss: 0.6827 - acc: 0.6109 - val_loss: 0.6231 - val_acc: 0.6567
Epoch 4/100
78/78 - 154s - loss: 0.6475 - acc: 0.6241 - val_loss: 0.6528 - val_acc: 0.6182
Epoch 5/100
78/78 - 155s - loss: 0.6304 - acc: 0.6465 - val_loss: 0.5851 - val_acc: 0.6846
Epoch 6/100
78/78 - 155s - loss: 0.6161 - acc: 0.6607 - val_loss: 0.5997 - val_acc: 0.6780
Epoch 7/100
78/78 - 149s - loss: 0.5978 - acc: 0.6828 - val_loss: 0.5866 - val_acc: 0.6748
Epoch 8/100
78/78 - 150s - loss: 0.5883 - acc: 0.6894 - val_loss: 0.5663 - val_acc: 0.7079
Epoch 9/100
78/78 - 149s - loss: 0.5803 - acc: 0.6977 - val_loss: 0.6254 - val_acc: 0.6332
Epoch 10/100
78/78 - 147s - loss: 0.5617 - acc: 0.7119 - val_loss: 0.5867 - val_acc: 0.6865
Epoch 11/100
78/78 - 146s - loss: 0.5559 - acc: 0.7217 - val_loss: 0.5812 - val_acc: 0.6959
Epoch 12/100
78/78 - 141s - loss: 0.5521 - acc: 0.7208 - val_loss: 0.5157 - val_acc: 0.7605
Epoch 13/100
78/78 - 143s - loss: 0.5440 - acc: 0.7283 - val_loss: 0.5218 - val_acc: 0.7541
Epoch 14/100
78/78 - 143s - loss: 0.5225 - acc: 0.7407 - val_loss: 0.4998 - val_acc: 0.7593
Epoch 15/100
78/78 - 142s - loss: 0.5044 - acc: 0.7538 - val_loss: 0.4718 - val_acc: 0.7695
Epoch 16/100
78/78 - 141s - loss: 0.4946 - acc: 0.7607 - val_loss: 0.5743 - val_acc: 0.6912
Epoch 17/100
78/78 - 141s - loss: 0.4643 - acc: 0.7780 - val_loss: 0.4124 - val_acc: 0.8156
Epoch 18/100
78/78 - 142s - loss: 0.4480 - acc: 0.7894 - val_loss: 0.4762 - val_acc: 0.7593
Epoch 19/100
78/78 - 143s - loss: 0.4426 - acc: 0.7880 - val_loss: 0.4288 - val_acc: 0.8074
Epoch 20/100
78/78 - 141s - loss: 0.4271 - acc: 0.8006 - val_loss: 0.3893 - val_acc: 0.8248
Epoch 21/100
78/78 - 142s - loss: 0.4111 - acc: 0.8073 - val_loss: 0.3602 - val_acc: 0.8355
Epoch 22/100
78/78 - 144s - loss: 0.4066 - acc: 0.8094 - val_loss: 0.3546 - val_acc: 0.8403
Epoch 23/100
78/78 - 143s - loss: 0.3927 - acc: 0.8160 - val_loss: 0.3536 - val_acc: 0.8368
Epoch 24/100
78/78 - 138s - loss: 0.3839 - acc: 0.8228 - val_loss: 0.3409 - val_acc: 0.8481
Epoch 25/100
78/78 - 141s - loss: 0.3704 - acc: 0.8290 - val_loss: 0.3364 - val_acc: 0.8514
Epoch 26/100
78/78 - 141s - loss: 0.3633 - acc: 0.8362 - val_loss: 0.3329 - val_acc: 0.8577
Epoch 27/100
78/78 - 142s - loss: 0.3434 - acc: 0.8447 - val_loss: 0.3131 - val_acc: 0.8670
Epoch 28/100
78/78 - 142s - loss: 0.3481 - acc: 0.8401 - val_loss: 0.3515 - val_acc: 0.8433
Epoch 29/100
78/78 - 142s - loss: 0.3225 - acc: 0.8555 - val_loss: 0.4498 - val_acc: 0.8049
Epoch 30/100
78/78 - 142s - loss: 0.3280 - acc: 0.8482 - val_loss: 0.3338 - val_acc: 0.8475
Epoch 31/100
78/78 - 144s - loss: 0.3252 - acc: 0.8534 - val_loss: 0.3694 - val_acc: 0.8339
Epoch 32/100
78/78 - 142s - loss: 0.3232 - acc: 0.8524 - val_loss: 0.3124 - val_acc: 0.8571
Epoch 33/100
78/78 - 142s - loss: 0.3011 - acc: 0.8619 - val_loss: 0.3094 - val_acc: 0.8657
Epoch 34/100
78/78 - 143s - loss: 0.2908 - acc: 0.8702 - val_loss: 0.3324 - val_acc: 0.8540
Epoch 35/100
78/78 - 140s - loss: 0.2943 - acc: 0.8697 - val_loss: 0.2880 - val_acc: 0.8717
Epoch 36/100
78/78 - 142s - loss: 0.2848 - acc: 0.8707 - val_loss: 0.2671 - val_acc: 0.8894
Epoch 37/100
78/78 - 143s - loss: 0.2783 - acc: 0.8774 - val_loss: 0.2870 - val_acc: 0.8756
Epoch 38/100
78/78 - 141s - loss: 0.2876 - acc: 0.8750 - val_loss: 0.2666 - val_acc: 0.8855
Epoch 39/100
78/78 - 139s - loss: 0.2729 - acc: 0.8790 - val_loss: 0.2557 - val_acc: 0.8914
Epoch 40/100
78/78 - 137s - loss: 0.2667 - acc: 0.8809 - val_loss: 0.2424 - val_acc: 0.8929
Epoch 41/100
78/78 - 137s - loss: 0.2626 - acc: 0.8824 - val_loss: 0.2746 - val_acc: 0.8820
Epoch 42/100
78/78 - 135s - loss: 0.2515 - acc: 0.8925 - val_loss: 0.2525 - val_acc: 0.8912
Epoch 43/100
78/78 - 137s - loss: 0.2487 - acc: 0.8905 - val_loss: 0.2516 - val_acc: 0.8956
Epoch 44/100
78/78 - 137s - loss: 0.2544 - acc: 0.8856 - val_loss: 0.2444 - val_acc: 0.8914
Epoch 45/100
78/78 - 138s - loss: 0.2484 - acc: 0.8899 - val_loss: 0.2727 - val_acc: 0.8785
Epoch 46/100
78/78 - 135s - loss: 0.2447 - acc: 0.8919 - val_loss: 0.2385 - val_acc: 0.9032
Epoch 47/100
78/78 - 138s - loss: 0.2349 - acc: 0.8973 - val_loss: 0.2296 - val_acc: 0.8999
Epoch 48/100
78/78 - 137s - loss: 0.2324 - acc: 0.8966 - val_loss: 0.2772 - val_acc: 0.8843
Epoch 49/100
78/78 - 136s - loss: 0.2383 - acc: 0.8948 - val_loss: 0.2698 - val_acc: 0.8820
Epoch 50/100
78/78 - 138s - loss: 0.2293 - acc: 0.9000 - val_loss: 0.2763 - val_acc: 0.8789
Epoch 51/100
78/78 - 137s - loss: 0.2300 - acc: 0.9006 - val_loss: 0.2231 - val_acc: 0.9060
Epoch 52/100
78/78 - 135s - loss: 0.2236 - acc: 0.9029 - val_loss: 0.2321 - val_acc: 0.9021
Epoch 53/100
78/78 - 137s - loss: 0.2256 - acc: 0.9012 - val_loss: 0.2229 - val_acc: 0.9075
Epoch 54/100
78/78 - 138s - loss: 0.2147 - acc: 0.9079 - val_loss: 0.2330 - val_acc: 0.9040
Epoch 55/100
78/78 - 137s - loss: 0.2192 - acc: 0.9066 - val_loss: 0.2390 - val_acc: 0.9040
Epoch 56/100
78/78 - 138s - loss: 0.2138 - acc: 0.9068 - val_loss: 0.2234 - val_acc: 0.9077
Epoch 57/100
78/78 - 137s - loss: 0.2115 - acc: 0.9078 - val_loss: 0.2734 - val_acc: 0.8810
Epoch 58/100
78/78 - 136s - loss: 0.2312 - acc: 0.8987 - val_loss: 0.2320 - val_acc: 0.9038
Epoch 59/100
78/78 - 138s - loss: 0.2054 - acc: 0.9129 - val_loss: 0.2188 - val_acc: 0.9120
Epoch 60/100
78/78 - 137s - loss: 0.2215 - acc: 0.9047 - val_loss: 0.2329 - val_acc: 0.9019
Epoch 61/100
78/78 - 137s - loss: 0.2088 - acc: 0.9087 - val_loss: 0.2357 - val_acc: 0.8937
Epoch 62/100
78/78 - 135s - loss: 0.2052 - acc: 0.9113 - val_loss: 0.2478 - val_acc: 0.8890
Epoch 63/100
78/78 - 138s - loss: 0.2013 - acc: 0.9138 - val_loss: 0.2079 - val_acc: 0.9134
Epoch 64/100
78/78 - 137s - loss: 0.2042 - acc: 0.9135 - val_loss: 0.2106 - val_acc: 0.9141
Epoch 65/100
78/78 - 137s - loss: 0.2036 - acc: 0.9122 - val_loss: 0.2266 - val_acc: 0.9054
Epoch 66/100
78/78 - 137s - loss: 0.1954 - acc: 0.9158 - val_loss: 0.2151 - val_acc: 0.9087
Epoch 67/100
78/78 - 137s - loss: 0.2013 - acc: 0.9125 - val_loss: 0.2062 - val_acc: 0.9116
Epoch 68/100
78/78 - 137s - loss: 0.1969 - acc: 0.9140 - val_loss: 0.2053 - val_acc: 0.9122
Epoch 69/100
78/78 - 137s - loss: 0.1944 - acc: 0.9152 - val_loss: 0.2159 - val_acc: 0.9112
Epoch 70/100
78/78 - 137s - loss: 0.1899 - acc: 0.9172 - val_loss: 0.2509 - val_acc: 0.8927
Epoch 71/100
78/78 - 137s - loss: 0.1966 - acc: 0.9187 - val_loss: 0.2110 - val_acc: 0.9128
Epoch 72/100
78/78 - 137s - loss: 0.1847 - acc: 0.9215 - val_loss: 0.2299 - val_acc: 0.9093
Epoch 73/100
78/78 - 137s - loss: 0.1838 - acc: 0.9227 - val_loss: 0.1841 - val_acc: 0.9278
Epoch 74/100
78/78 - 137s - loss: 0.1966 - acc: 0.9176 - val_loss: 0.2924 - val_acc: 0.8742
Epoch 75/100
78/78 - 138s - loss: 0.1842 - acc: 0.9220 - val_loss: 0.1932 - val_acc: 0.9223
Epoch 76/100
78/78 - 137s - loss: 0.1844 - acc: 0.9210 - val_loss: 0.2169 - val_acc: 0.9198
Epoch 77/100
78/78 - 137s - loss: 0.1906 - acc: 0.9219 - val_loss: 0.2597 - val_acc: 0.8818
Epoch 78/100
78/78 - 137s - loss: 0.1735 - acc: 0.9255 - val_loss: 0.1994 - val_acc: 0.9204
Epoch 79/100
78/78 - 137s - loss: 0.1826 - acc: 0.9230 - val_loss: 0.1862 - val_acc: 0.9245
Epoch 80/100
78/78 - 137s - loss: 0.1847 - acc: 0.9239 - val_loss: 0.2384 - val_acc: 0.9122
Epoch 81/100
78/78 - 152s - loss: 0.1843 - acc: 0.9226 - val_loss: 0.2039 - val_acc: 0.9151
Epoch 82/100
78/78 - 151s - loss: 0.1778 - acc: 0.9225 - val_loss: 0.2089 - val_acc: 0.9192
Epoch 83/100
78/78 - 150s - loss: 0.1765 - acc: 0.9225 - val_loss: 0.1998 - val_acc: 0.9134
Epoch 84/100
78/78 - 148s - loss: 0.1780 - acc: 0.9238 - val_loss: 0.1837 - val_acc: 0.9266
Epoch 85/100
78/78 - 147s - loss: 0.1739 - acc: 0.9259 - val_loss: 0.2011 - val_acc: 0.9204
Epoch 86/100
78/78 - 146s - loss: 0.1718 - acc: 0.9271 - val_loss: 0.1885 - val_acc: 0.9307
Epoch 87/100
78/78 - 144s - loss: 0.1775 - acc: 0.9242 - val_loss: 0.2126 - val_acc: 0.9213
Epoch 88/100
78/78 - 142s - loss: 0.1732 - acc: 0.9265 - val_loss: 0.1929 - val_acc: 0.9204
Epoch 89/100
78/78 - 141s - loss: 0.1773 - acc: 0.9242 - val_loss: 0.2050 - val_acc: 0.9243
Epoch 90/100
78/78 - 140s - loss: 0.1745 - acc: 0.9278 - val_loss: 0.2168 - val_acc: 0.9122
Epoch 91/100
78/78 - 139s - loss: 0.1802 - acc: 0.9214 - val_loss: 0.2263 - val_acc: 0.9079
Epoch 92/100
78/78 - 139s - loss: 0.1642 - acc: 0.9320 - val_loss: 0.2503 - val_acc: 0.9087
Epoch 93/100
78/78 - 136s - loss: 0.1705 - acc: 0.9285 - val_loss: 0.2184 - val_acc: 0.9126
Epoch 94/100
78/78 - 136s - loss: 0.1714 - acc: 0.9253 - val_loss: 0.2365 - val_acc: 0.9130
Epoch 95/100
78/78 - 136s - loss: 0.1678 - acc: 0.9305 - val_loss: 0.1882 - val_acc: 0.9231
Epoch 96/100
78/78 - 137s - loss: 0.1923 - acc: 0.9212 - val_loss: 0.1931 - val_acc: 0.9211
Epoch 97/100
78/78 - 136s - loss: 0.1574 - acc: 0.9335 - val_loss: 0.2011 - val_acc: 0.9215
Epoch 98/100
78/78 - 136s - loss: 0.1673 - acc: 0.9280 - val_loss: 0.2561 - val_acc: 0.9032
Epoch 99/100
78/78 - 137s - loss: 0.1670 - acc: 0.9304 - val_loss: 0.2018 - val_acc: 0.9180
Epoch 100/100
78/78 - 137s - loss: 0.1663 - acc: 0.9292 - val_loss: 0.2245 - val_acc: 0.9065
2019-07-23 02:29:24,212 graeae.timers.timer end: Ended: 2019-07-23 02:29:24.212763
I0723 02:29:24.212822 140052197230400 timer.py:77] Ended: 2019-07-23 02:29:24.212763
2019-07-23 02:29:24,214 graeae.timers.timer end: Elapsed: 3:59:51.347337
I0723 02:29:24.214272 140052197230400 timer.py:78] Elapsed: 3:59:51.347337
print(f"About {(4 * 60)/100} minutes per epoch")
About 2.4 minutes per epoch

So with a batch size of 256 it doesn't take an unreasonable amount of time to train.

data = pandas.read_csv("~/cats_vs_dogs_5.csv")
curves = [holoviews.Curve(data, ("index", "Epoch"), "training_loss", 
                          label="Training Loss",),
          holoviews.Curve(data, ("index", "Epoch"), "training_accuracy", 
                          label="Training Accuracy").opts(tools=["hover"]),
          holoviews.Curve(data, ("index", "Epoch"), "validation_loss", 
                          label="Validation Loss",).opts(tools=["hover"]),
          holoviews.Curve(data, ("index", "Epoch"), "validation_accuracy", 
                          label="Validation Accuracy").opts(tools=["hover"]),
]
plot = holoviews.Overlay(curves).opts(tools=["hover"], height=800, width=1000, 
                                      ylabel="Performance", 
                                      title="Training vs Validation")
Embed(plot=plot, file_name="training_validation_loss_5")()

Figure Missing

best_validation(data)
Best Accuracy: 0.9307 (loss=0.1885) Epoch: 85
Best Loss: 0.1837 (accuracy=0.9266, Epoch: 83)
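
best_validation is a small helper defined earlier in this document; a minimal sketch consistent with its output (reconstructed, not the original) would be:

def best_validation(data):
    """Sketch: print the best validation accuracy and loss rows."""
    best = data.loc[data.validation_accuracy.idxmax()]
    least = data.loc[data.validation_loss.idxmin()]
    print(f"Best Accuracy: {best.validation_accuracy} "
          f"(loss={best.validation_loss}) Epoch: {best.name}")
    print(f"Best Loss: {least.validation_loss} "
          f"(accuracy={least.validation_accuracy}, Epoch: {least.name})")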

The model is doing pretty well already. Unfortunately, I didn't save it before turning off my machine…
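
save_model is another helper from earlier in the document; judging from the "Saving the model to …" line in the Take Sixteen output below, it's roughly this sketch (assumed, not copied):

from pathlib import Path

def save_model(model, path: Path) -> None:
    """Sketch: save a Keras model, logging where it went."""
    print(f"Saving the model to {path}")
    model.save(str(path))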

Take Sixteen

network = Network(str(training_path), 
                  convolution_layers=5,
                  set_steps = True,
                  epochs = 100,
                  batch_size=256)
print(str(network))
with TIMER:
    network.train()

save_model(network.model, MODELS/"layers_5_batch_size_256.h5")
2019-07-27 11:17:39,023 graeae.timers.timer start: Started: 2019-07-27 11:17:39.023595
I0727 11:17:39.023817 140052539918144 timer.py:70] Started: 2019-07-27 11:17:39.023595
(Network) - 
Path: /home/athena/data/datasets/images/dogs-vs-cats/train
 Epochs: 100
 Batch Size: 256
 Callbacks: None
Data: (Data) - Path: /home/athena/data/datasets/images/dogs-vs-cats/train, Validation Split: 0.2,Batch Size: 256
Callbacks: None
Found 20000 images belonging to 2 classes.
W0727 11:17:39.640755 140052539918144 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Found 5000 images belonging to 2 classes.
W0727 11:17:39.796763 140052539918144 deprecation.py:323] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Epoch 1/100
78/78 - 368s - loss: 0.6996 - acc: 0.5326 - val_loss: 0.6777 - val_acc: 0.5983
Epoch 2/100
78/78 - 235s - loss: 0.6823 - acc: 0.5713 - val_loss: 0.6554 - val_acc: 0.6205
Epoch 3/100
78/78 - 229s - loss: 0.6718 - acc: 0.5961 - val_loss: 0.6645 - val_acc: 0.5913
Epoch 4/100
78/78 - 229s - loss: 0.6458 - acc: 0.6334 - val_loss: 0.5993 - val_acc: 0.6822
Epoch 5/100
78/78 - 227s - loss: 0.6395 - acc: 0.6479 - val_loss: 0.5940 - val_acc: 0.6984
Epoch 6/100
78/78 - 234s - loss: 0.6139 - acc: 0.6639 - val_loss: 0.6270 - val_acc: 0.6340
Epoch 7/100
78/78 - 228s - loss: 0.6043 - acc: 0.6742 - val_loss: 0.5533 - val_acc: 0.7231
Epoch 8/100
78/78 - 228s - loss: 0.5862 - acc: 0.6912 - val_loss: 0.5496 - val_acc: 0.7243
Epoch 9/100
78/78 - 227s - loss: 0.5698 - acc: 0.7089 - val_loss: 0.5726 - val_acc: 0.7007
Epoch 10/100
78/78 - 230s - loss: 0.5673 - acc: 0.7010 - val_loss: 0.5106 - val_acc: 0.7432
Epoch 11/100
78/78 - 226s - loss: 0.5416 - acc: 0.7245 - val_loss: 0.5233 - val_acc: 0.7387
Epoch 12/100
78/78 - 230s - loss: 0.5411 - acc: 0.7306 - val_loss: 0.5489 - val_acc: 0.7132
Epoch 13/100
78/78 - 230s - loss: 0.5301 - acc: 0.7332 - val_loss: 0.5358 - val_acc: 0.7163
Epoch 14/100
78/78 - 225s - loss: 0.5200 - acc: 0.7432 - val_loss: 0.4981 - val_acc: 0.7463
Epoch 15/100
78/78 - 225s - loss: 0.5004 - acc: 0.7521 - val_loss: 0.4841 - val_acc: 0.7718
Epoch 16/100
78/78 - 226s - loss: 0.4934 - acc: 0.7590 - val_loss: 0.4679 - val_acc: 0.7640
Epoch 17/100
78/78 - 223s - loss: 0.4709 - acc: 0.7733 - val_loss: 0.6309 - val_acc: 0.7130
Epoch 18/100
78/78 - 224s - loss: 0.4722 - acc: 0.7707 - val_loss: 0.4204 - val_acc: 0.7983
Epoch 19/100
78/78 - 226s - loss: 0.4553 - acc: 0.7832 - val_loss: 0.4041 - val_acc: 0.8129
Epoch 20/100
78/78 - 222s - loss: 0.4325 - acc: 0.7968 - val_loss: 0.3785 - val_acc: 0.8326
Epoch 21/100
78/78 - 226s - loss: 0.4227 - acc: 0.8004 - val_loss: 0.5409 - val_acc: 0.7358
Epoch 22/100
78/78 - 226s - loss: 0.4090 - acc: 0.8063 - val_loss: 0.3555 - val_acc: 0.8438
Epoch 23/100
78/78 - 228s - loss: 0.3903 - acc: 0.8179 - val_loss: 0.3571 - val_acc: 0.8429
Epoch 24/100
78/78 - 226s - loss: 0.3745 - acc: 0.8307 - val_loss: 0.3516 - val_acc: 0.8421
Epoch 25/100
78/78 - 227s - loss: 0.3728 - acc: 0.8304 - val_loss: 0.3703 - val_acc: 0.8285
Epoch 26/100
78/78 - 228s - loss: 0.3584 - acc: 0.8344 - val_loss: 0.3641 - val_acc: 0.8392
Epoch 27/100
78/78 - 227s - loss: 0.3481 - acc: 0.8400 - val_loss: 0.3430 - val_acc: 0.8386
Epoch 28/100
78/78 - 228s - loss: 0.3265 - acc: 0.8497 - val_loss: 0.3399 - val_acc: 0.8470
Epoch 29/100
78/78 - 227s - loss: 0.3356 - acc: 0.8459 - val_loss: 0.3095 - val_acc: 0.8629
Epoch 30/100
78/78 - 226s - loss: 0.3167 - acc: 0.8592 - val_loss: 0.3169 - val_acc: 0.8536
Epoch 31/100
78/78 - 227s - loss: 0.3155 - acc: 0.8599 - val_loss: 0.2829 - val_acc: 0.8820
Epoch 32/100
78/78 - 235s - loss: 0.2984 - acc: 0.8664 - val_loss: 0.2902 - val_acc: 0.8688
Epoch 33/100
78/78 - 236s - loss: 0.3002 - acc: 0.8642 - val_loss: 0.3434 - val_acc: 0.8586
Epoch 34/100
78/78 - 238s - loss: 0.3082 - acc: 0.8629 - val_loss: 0.3796 - val_acc: 0.8271
Epoch 35/100
78/78 - 236s - loss: 0.2828 - acc: 0.8756 - val_loss: 0.2850 - val_acc: 0.8719
Epoch 36/100
78/78 - 235s - loss: 0.2877 - acc: 0.8716 - val_loss: 0.2975 - val_acc: 0.8664
Epoch 37/100
78/78 - 238s - loss: 0.2839 - acc: 0.8723 - val_loss: 0.2544 - val_acc: 0.8910
Epoch 38/100
78/78 - 230s - loss: 0.2749 - acc: 0.8801 - val_loss: 0.2558 - val_acc: 0.8845
Epoch 39/100
78/78 - 228s - loss: 0.2603 - acc: 0.8852 - val_loss: 0.2621 - val_acc: 0.8797
Epoch 40/100
78/78 - 258s - loss: 0.2664 - acc: 0.8841 - val_loss: 0.2513 - val_acc: 0.8937
Epoch 41/100
78/78 - 255s - loss: 0.2548 - acc: 0.8865 - val_loss: 0.2523 - val_acc: 0.8929
Epoch 42/100
78/78 - 255s - loss: 0.2561 - acc: 0.8866 - val_loss: 0.2484 - val_acc: 0.8962
Epoch 43/100
78/78 - 247s - loss: 0.2575 - acc: 0.8870 - val_loss: 0.2452 - val_acc: 0.8960
Epoch 44/100
78/78 - 247s - loss: 0.2562 - acc: 0.8882 - val_loss: 0.3566 - val_acc: 0.8390
Epoch 45/100
78/78 - 251s - loss: 0.2493 - acc: 0.8903 - val_loss: 0.2950 - val_acc: 0.8731
Epoch 46/100
78/78 - 262s - loss: 0.2386 - acc: 0.8957 - val_loss: 0.2778 - val_acc: 0.8806
Epoch 47/100
78/78 - 226s - loss: 0.2519 - acc: 0.8948 - val_loss: 0.2276 - val_acc: 0.9046
Epoch 48/100
78/78 - 234s - loss: 0.2386 - acc: 0.8963 - val_loss: 0.3851 - val_acc: 0.8037
Epoch 49/100
78/78 - 271s - loss: 0.2299 - acc: 0.8997 - val_loss: 0.2298 - val_acc: 0.9015
Epoch 50/100
78/78 - 242s - loss: 0.2283 - acc: 0.9016 - val_loss: 0.2885 - val_acc: 0.8748
Epoch 51/100
78/78 - 234s - loss: 0.2226 - acc: 0.9026 - val_loss: 0.2814 - val_acc: 0.8771
Epoch 52/100
78/78 - 263s - loss: 0.2303 - acc: 0.9002 - val_loss: 0.2829 - val_acc: 0.8781
Epoch 53/100
78/78 - 253s - loss: 0.2259 - acc: 0.9031 - val_loss: 0.2312 - val_acc: 0.9042
Epoch 54/100
78/78 - 241s - loss: 0.2177 - acc: 0.9052 - val_loss: 0.2423 - val_acc: 0.8986
Epoch 55/100
78/78 - 251s - loss: 0.2285 - acc: 0.9001 - val_loss: 0.2126 - val_acc: 0.9124
Epoch 56/100
78/78 - 257s - loss: 0.2200 - acc: 0.9040 - val_loss: 0.2224 - val_acc: 0.9062
Epoch 57/100
78/78 - 249s - loss: 0.2151 - acc: 0.9066 - val_loss: 0.2279 - val_acc: 0.9067
Epoch 58/100
78/78 - 256s - loss: 0.2106 - acc: 0.9087 - val_loss: 0.2813 - val_acc: 0.8851
Epoch 59/100
78/78 - 255s - loss: 0.2091 - acc: 0.9098 - val_loss: 0.2058 - val_acc: 0.9128
Epoch 60/100
78/78 - 258s - loss: 0.2108 - acc: 0.9108 - val_loss: 0.2587 - val_acc: 0.8949
Epoch 61/100
78/78 - 257s - loss: 0.2065 - acc: 0.9126 - val_loss: 0.2341 - val_acc: 0.8972
Epoch 62/100
78/78 - 259s - loss: 0.2032 - acc: 0.9121 - val_loss: 0.1973 - val_acc: 0.9137
Epoch 63/100
78/78 - 257s - loss: 0.2032 - acc: 0.9127 - val_loss: 0.1991 - val_acc: 0.9190
Epoch 64/100
78/78 - 256s - loss: 0.2000 - acc: 0.9143 - val_loss: 0.2605 - val_acc: 0.8888
Epoch 65/100
78/78 - 258s - loss: 0.2008 - acc: 0.9140 - val_loss: 0.2061 - val_acc: 0.9186
Epoch 66/100
78/78 - 253s - loss: 0.1934 - acc: 0.9175 - val_loss: 0.1978 - val_acc: 0.9192
Epoch 67/100
78/78 - 237s - loss: 0.1996 - acc: 0.9138 - val_loss: 0.1912 - val_acc: 0.9186
Epoch 68/100
78/78 - 229s - loss: 0.1976 - acc: 0.9150 - val_loss: 0.2157 - val_acc: 0.9134
Epoch 69/100
78/78 - 236s - loss: 0.1909 - acc: 0.9179 - val_loss: 0.1942 - val_acc: 0.9204
Epoch 70/100
78/78 - 249s - loss: 0.1929 - acc: 0.9187 - val_loss: 0.2266 - val_acc: 0.9132
Epoch 71/100
78/78 - 265s - loss: 0.1883 - acc: 0.9174 - val_loss: 0.2101 - val_acc: 0.9126
Epoch 72/100
78/78 - 280s - loss: 0.1924 - acc: 0.9165 - val_loss: 0.2090 - val_acc: 0.9118
Epoch 73/100
78/78 - 284s - loss: 0.1932 - acc: 0.9177 - val_loss: 0.1864 - val_acc: 0.9250
Epoch 74/100
78/78 - 281s - loss: 0.1797 - acc: 0.9220 - val_loss: 0.2070 - val_acc: 0.9198
Epoch 75/100
78/78 - 261s - loss: 0.1843 - acc: 0.9212 - val_loss: 0.2201 - val_acc: 0.9052
Epoch 76/100
78/78 - 233s - loss: 0.1933 - acc: 0.9199 - val_loss: 0.1908 - val_acc: 0.9180
Epoch 77/100
78/78 - 232s - loss: 0.1858 - acc: 0.9214 - val_loss: 0.3415 - val_acc: 0.8300
Epoch 78/100
78/78 - 227s - loss: 0.1841 - acc: 0.9193 - val_loss: 0.2657 - val_acc: 0.9023
Epoch 79/100
78/78 - 226s - loss: 0.1886 - acc: 0.9203 - val_loss: 0.1911 - val_acc: 0.9268
Epoch 80/100
78/78 - 237s - loss: 0.1872 - acc: 0.9219 - val_loss: 0.1882 - val_acc: 0.9245
Epoch 81/100
78/78 - 260s - loss: 0.1839 - acc: 0.9220 - val_loss: 0.2338 - val_acc: 0.9040
Epoch 82/100
78/78 - 262s - loss: 0.1866 - acc: 0.9218 - val_loss: 0.2314 - val_acc: 0.8986
Epoch 83/100
78/78 - 261s - loss: 0.1761 - acc: 0.9257 - val_loss: 0.2218 - val_acc: 0.9038
Epoch 84/100
78/78 - 263s - loss: 0.1791 - acc: 0.9236 - val_loss: 0.3364 - val_acc: 0.8466
Epoch 85/100
78/78 - 262s - loss: 0.1814 - acc: 0.9217 - val_loss: 0.1917 - val_acc: 0.9229
Epoch 86/100
78/78 - 262s - loss: 0.1838 - acc: 0.9213 - val_loss: 0.1783 - val_acc: 0.9264
Epoch 87/100
78/78 - 264s - loss: 0.1790 - acc: 0.9269 - val_loss: 0.2196 - val_acc: 0.9110
Epoch 88/100
78/78 - 264s - loss: 0.1730 - acc: 0.9285 - val_loss: 0.1876 - val_acc: 0.9280
Epoch 89/100
78/78 - 230s - loss: 0.1782 - acc: 0.9227 - val_loss: 0.2387 - val_acc: 0.9046
Epoch 90/100
78/78 - 229s - loss: 0.1719 - acc: 0.9255 - val_loss: 0.2171 - val_acc: 0.9258
Epoch 91/100
78/78 - 232s - loss: 0.1743 - acc: 0.9279 - val_loss: 0.2246 - val_acc: 0.9134
Epoch 92/100
78/78 - 229s - loss: 0.1688 - acc: 0.9287 - val_loss: 0.2299 - val_acc: 0.9141
Epoch 93/100
78/78 - 227s - loss: 0.1763 - acc: 0.9224 - val_loss: 0.1850 - val_acc: 0.9264
Epoch 94/100
78/78 - 230s - loss: 0.1810 - acc: 0.9222 - val_loss: 0.1880 - val_acc: 0.9260
Epoch 95/100
78/78 - 229s - loss: 0.1788 - acc: 0.9243 - val_loss: 0.1884 - val_acc: 0.9141
Epoch 96/100
78/78 - 231s - loss: 0.1688 - acc: 0.9302 - val_loss: 0.2187 - val_acc: 0.9100
Epoch 97/100
78/78 - 228s - loss: 0.1721 - acc: 0.9265 - val_loss: 0.2140 - val_acc: 0.9102
Epoch 98/100
78/78 - 231s - loss: 0.1677 - acc: 0.9312 - val_loss: 0.1884 - val_acc: 0.9278
Epoch 99/100
78/78 - 228s - loss: 0.1691 - acc: 0.9275 - val_loss: 0.1891 - val_acc: 0.9229
Epoch 100/100
78/78 - 229s - loss: 0.1684 - acc: 0.9265 - val_loss: 0.1866 - val_acc: 0.9200
2019-07-27 18:00:56,707 graeae.timers.timer end: Ended: 2019-07-27 18:00:56.707395
I0727 18:00:56.707435 140052539918144 timer.py:77] Ended: 2019-07-27 18:00:56.707395
2019-07-27 18:00:56,708 graeae.timers.timer end: Elapsed: 6:43:17.683800
I0727 18:00:56.708784 140052539918144 timer.py:78] Elapsed: 6:43:17.683800
Saving the model to /home/athena/models/dogs-vs-cats/layers_5_batch_size_256.h5
data = pandas.read_csv("~/cats_vs_dogs_6.csv")
curves = [holoviews.Curve(data, ("index", "Epoch"), "training_loss", 
                          label="Training Loss",),
          holoviews.Curve(data, ("index", "Epoch"), "training_accuracy", 
                          label="Training Accuracy").opts(tools=["hover"]),
          holoviews.Curve(data, ("index", "Epoch"), "validation_loss", 
                          label="Validation Loss",).opts(tools=["hover"]),
          holoviews.Curve(data, ("index", "Epoch"), "validation_accuracy", 
                          label="Validation Accuracy").opts(tools=["hover"]),
]
plot = holoviews.Overlay(curves).opts(tools=["hover"], height=800, width=1000, 
                                      ylabel="Performance", 
                                      title="Training vs Validation")
Embed(plot=plot, file_name="training_validation_loss_5")()

Figure Missing

best_validation(data)
Best Accuracy: 0.928 (loss=0.1876) Epoch: 87
Best Loss: 0.1783 (accuracy=0.9264, Epoch: 85)
Another Training Session
    model_path = MODELS/"layers_5_batch_size_256.h5"
    network = Network(str(training_path), 
                      convolution_layers=5,
                      set_steps = True,
                      epochs = 100,
                      batch_size=256)
    model = tensorflow.keras.models.load_model(str(model_path))
    # Swap the network's freshly-built model for the saved one so training
    # resumes where the last session left off.
    network._model = model
    with TIMER:
        network.train()
    
    network.model.save(str(model_path))
    
    W0728 10:40:02.467012 140643102242624 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling GlorotUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
    Instructions for updating:
    Call initializer instance with the dtype argument instead of passing it to the constructor
    W0728 10:40:02.469725 140643102242624 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
    Instructions for updating:
    Call initializer instance with the dtype argument instead of passing it to the constructor
    W0728 10:40:02.471851 140643102242624 deprecation.py:506] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
    Instructions for updating:
    Call initializer instance with the dtype argument instead of passing it to the constructor
    W0728 10:40:16.260661 140643102242624 deprecation.py:323] From /home/athena/.virtualenvs/In-Too-Deep/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
    Instructions for updating:
    Use tf.where in 2.0, which has the same broadcast rule as np.where
    2019-07-28 10:40:16,840 graeae.timers.timer start: Started: 2019-07-28 10:40:16.840941
    I0728 10:40:16.840974 140643102242624 timer.py:70] Started: 2019-07-28 10:40:16.840941
    Found 20000 images belonging to 2 classes.
    Found 5000 images belonging to 2 classes.
    Epoch 1/100
    78/78 - 422s - loss: 0.1641 - acc: 0.9317 - val_loss: 0.2265 - val_acc: 0.9157
    Epoch 2/100
    78/78 - 151s - loss: 0.1770 - acc: 0.9264 - val_loss: 0.1841 - val_acc: 0.9235
    Epoch 3/100
    78/78 - 149s - loss: 0.1696 - acc: 0.9282 - val_loss: 0.3041 - val_acc: 0.8935
    Epoch 4/100
    78/78 - 148s - loss: 0.1678 - acc: 0.9290 - val_loss: 0.1862 - val_acc: 0.9211
    Epoch 5/100
    78/78 - 146s - loss: 0.1722 - acc: 0.9272 - val_loss: 0.2598 - val_acc: 0.8750
    Epoch 6/100
    78/78 - 145s - loss: 0.1701 - acc: 0.9299 - val_loss: 0.1904 - val_acc: 0.9235
    Epoch 7/100
    78/78 - 144s - loss: 0.1605 - acc: 0.9331 - val_loss: 0.2940 - val_acc: 0.8616
    Epoch 8/100
    78/78 - 143s - loss: 0.1789 - acc: 0.9242 - val_loss: 0.2029 - val_acc: 0.9192
    Epoch 9/100
    78/78 - 143s - loss: 0.1604 - acc: 0.9313 - val_loss: 0.2291 - val_acc: 0.9097
    Epoch 10/100
    78/78 - 143s - loss: 0.1688 - acc: 0.9291 - val_loss: 0.1833 - val_acc: 0.9280
    Epoch 11/100
    78/78 - 139s - loss: 0.1717 - acc: 0.9291 - val_loss: 0.2187 - val_acc: 0.9110
    Epoch 12/100
    78/78 - 138s - loss: 0.1703 - acc: 0.9293 - val_loss: 0.2419 - val_acc: 0.9108
    Epoch 13/100
    78/78 - 139s - loss: 0.1663 - acc: 0.9276 - val_loss: 0.2721 - val_acc: 0.8812
    Epoch 14/100
    78/78 - 138s - loss: 0.1744 - acc: 0.9272 - val_loss: 0.2325 - val_acc: 0.9062
    Epoch 15/100
    78/78 - 137s - loss: 0.1662 - acc: 0.9297 - val_loss: 0.2134 - val_acc: 0.9089
    Epoch 16/100
    78/78 - 137s - loss: 0.1745 - acc: 0.9301 - val_loss: 0.1962 - val_acc: 0.9169
    Epoch 17/100
    78/78 - 137s - loss: 0.1670 - acc: 0.9288 - val_loss: 0.1876 - val_acc: 0.9276
    Epoch 18/100
    78/78 - 136s - loss: 0.1725 - acc: 0.9277 - val_loss: 0.2187 - val_acc: 0.9270
    Epoch 19/100
    78/78 - 138s - loss: 0.1638 - acc: 0.9304 - val_loss: 0.2063 - val_acc: 0.9102
    Epoch 20/100
    78/78 - 137s - loss: 0.1696 - acc: 0.9276 - val_loss: 0.1685 - val_acc: 0.9346
    Epoch 21/100
    78/78 - 137s - loss: 0.1664 - acc: 0.9296 - val_loss: 0.4021 - val_acc: 0.7954
    Epoch 22/100
    78/78 - 136s - loss: 0.1626 - acc: 0.9306 - val_loss: 0.1839 - val_acc: 0.9317
    Epoch 23/100
    78/78 - 136s - loss: 0.1783 - acc: 0.9309 - val_loss: 0.1931 - val_acc: 0.9237
    Epoch 24/100
    78/78 - 136s - loss: 0.1615 - acc: 0.9309 - val_loss: 0.2030 - val_acc: 0.9153
    Epoch 25/100
    78/78 - 138s - loss: 0.1595 - acc: 0.9318 - val_loss: 0.2336 - val_acc: 0.9091
    Epoch 26/100
    78/78 - 136s - loss: 0.1671 - acc: 0.9299 - val_loss: 0.1836 - val_acc: 0.9342
    Epoch 27/100
    78/78 - 136s - loss: 0.1716 - acc: 0.9287 - val_loss: 0.1740 - val_acc: 0.9289
    Epoch 28/100
    78/78 - 137s - loss: 0.1597 - acc: 0.9327 - val_loss: 0.1679 - val_acc: 0.9324
    Epoch 29/100
    78/78 - 141s - loss: 0.1617 - acc: 0.9313 - val_loss: 0.1904 - val_acc: 0.9326
    Epoch 30/100
    78/78 - 138s - loss: 0.1638 - acc: 0.9335 - val_loss: 0.2675 - val_acc: 0.8966
    Epoch 31/100
    78/78 - 138s - loss: 0.1716 - acc: 0.9284 - val_loss: 0.1997 - val_acc: 0.9141
    Epoch 32/100
    78/78 - 139s - loss: 0.1579 - acc: 0.9338 - val_loss: 0.2569 - val_acc: 0.8857
    Epoch 33/100
    78/78 - 139s - loss: 0.1613 - acc: 0.9324 - val_loss: 0.2751 - val_acc: 0.8921
    Epoch 34/100
    78/78 - 136s - loss: 0.1663 - acc: 0.9295 - val_loss: 0.3323 - val_acc: 0.8645
    Epoch 35/100
    78/78 - 139s - loss: 0.1756 - acc: 0.9301 - val_loss: 0.2702 - val_acc: 0.8945
    Epoch 36/100
    78/78 - 137s - loss: 0.1630 - acc: 0.9326 - val_loss: 0.2327 - val_acc: 0.9291
    Epoch 37/100
    78/78 - 141s - loss: 0.1636 - acc: 0.9319 - val_loss: 0.1659 - val_acc: 0.9346
    Epoch 38/100
    78/78 - 138s - loss: 0.1639 - acc: 0.9317 - val_loss: 0.1837 - val_acc: 0.9270
    Epoch 39/100
    78/78 - 139s - loss: 0.1585 - acc: 0.9334 - val_loss: 0.2631 - val_acc: 0.8863
    Epoch 40/100
    78/78 - 138s - loss: 0.1640 - acc: 0.9301 - val_loss: 0.2654 - val_acc: 0.9081
    Epoch 41/100
    78/78 - 140s - loss: 0.1575 - acc: 0.9341 - val_loss: 0.2170 - val_acc: 0.9200
    Epoch 42/100
    78/78 - 138s - loss: 0.1636 - acc: 0.9338 - val_loss: 0.2877 - val_acc: 0.8785
    Epoch 43/100
    78/78 - 137s - loss: 0.1598 - acc: 0.9329 - val_loss: 0.4987 - val_acc: 0.8964
    Epoch 44/100
    78/78 - 137s - loss: 0.1695 - acc: 0.9275 - val_loss: 0.1844 - val_acc: 0.9338
    Epoch 45/100
    78/78 - 139s - loss: 0.1654 - acc: 0.9313 - val_loss: 0.2525 - val_acc: 0.9042
    Epoch 46/100
    78/78 - 139s - loss: 0.1660 - acc: 0.9308 - val_loss: 0.1734 - val_acc: 0.9322
    Epoch 47/100
    78/78 - 137s - loss: 0.1608 - acc: 0.9345 - val_loss: 0.2090 - val_acc: 0.9289
    Epoch 48/100
    78/78 - 138s - loss: 0.1694 - acc: 0.9310 - val_loss: 0.2494 - val_acc: 0.8869
    Epoch 49/100
    78/78 - 138s - loss: 0.1603 - acc: 0.9353 - val_loss: 0.2201 - val_acc: 0.9085
    Epoch 50/100
    78/78 - 135s - loss: 0.1665 - acc: 0.9301 - val_loss: 0.2500 - val_acc: 0.9069
    Epoch 51/100
    78/78 - 139s - loss: 0.1531 - acc: 0.9353 - val_loss: 0.1769 - val_acc: 0.9346
    Epoch 52/100
    78/78 - 142s - loss: 0.1642 - acc: 0.9339 - val_loss: 0.1953 - val_acc: 0.9254
    Epoch 53/100
    78/78 - 140s - loss: 0.1620 - acc: 0.9327 - val_loss: 0.1592 - val_acc: 0.9354
    Epoch 54/100
    78/78 - 137s - loss: 0.1533 - acc: 0.9370 - val_loss: 0.1649 - val_acc: 0.9348
    Epoch 55/100
    78/78 - 136s - loss: 0.1652 - acc: 0.9323 - val_loss: 0.1732 - val_acc: 0.9287
    Epoch 56/100
    78/78 - 137s - loss: 0.1527 - acc: 0.9369 - val_loss: 0.3824 - val_acc: 0.8201
    Epoch 57/100
    78/78 - 136s - loss: 0.1618 - acc: 0.9312 - val_loss: 0.2443 - val_acc: 0.8943
    Epoch 58/100
    78/78 - 137s - loss: 0.1553 - acc: 0.9366 - val_loss: 0.1598 - val_acc: 0.9369
    Epoch 59/100
    78/78 - 136s - loss: 0.1550 - acc: 0.9332 - val_loss: 0.2080 - val_acc: 0.9097
    Epoch 60/100
    78/78 - 136s - loss: 0.1597 - acc: 0.9351 - val_loss: 0.1973 - val_acc: 0.9336
    Epoch 61/100
    78/78 - 136s - loss: 0.1565 - acc: 0.9349 - val_loss: 0.1700 - val_acc: 0.9342
    Epoch 62/100
    78/78 - 137s - loss: 0.1571 - acc: 0.9332 - val_loss: 0.1830 - val_acc: 0.9309
    Epoch 63/100
    78/78 - 138s - loss: 0.1575 - acc: 0.9349 - val_loss: 0.2413 - val_acc: 0.9252
    Epoch 64/100
    78/78 - 139s - loss: 0.1607 - acc: 0.9320 - val_loss: 0.1989 - val_acc: 0.9225
    Epoch 65/100
    78/78 - 143s - loss: 0.2276 - acc: 0.9272 - val_loss: 0.1754 - val_acc: 0.9309
    Epoch 66/100
    78/78 - 139s - loss: 0.1510 - acc: 0.9374 - val_loss: 0.1707 - val_acc: 0.9330
    Epoch 67/100
    78/78 - 144s - loss: 0.1578 - acc: 0.9321 - val_loss: 0.2149 - val_acc: 0.9192
    Epoch 68/100
    78/78 - 141s - loss: 0.1557 - acc: 0.9345 - val_loss: 0.2180 - val_acc: 0.9149
    Epoch 69/100
    78/78 - 142s - loss: 0.1721 - acc: 0.9329 - val_loss: 0.1878 - val_acc: 0.9223
    Epoch 70/100
    78/78 - 142s - loss: 0.1603 - acc: 0.9339 - val_loss: 0.2094 - val_acc: 0.9262
    Epoch 71/100
    78/78 - 143s - loss: 0.1593 - acc: 0.9328 - val_loss: 0.1864 - val_acc: 0.9245
    Epoch 72/100
    78/78 - 145s - loss: 0.1526 - acc: 0.9357 - val_loss: 0.3221 - val_acc: 0.8830
    Epoch 73/100
    78/78 - 145s - loss: 0.1633 - acc: 0.9311 - val_loss: 0.2339 - val_acc: 0.9202
    Epoch 74/100
    78/78 - 144s - loss: 0.1605 - acc: 0.9316 - val_loss: 0.2211 - val_acc: 0.9272
    Epoch 75/100
    78/78 - 143s - loss: 0.1428 - acc: 0.9407 - val_loss: 0.1861 - val_acc: 0.9270
    Epoch 76/100
    78/78 - 139s - loss: 0.1517 - acc: 0.9373 - val_loss: 0.1944 - val_acc: 0.9344
    Epoch 77/100
    78/78 - 139s - loss: 0.1684 - acc: 0.9312 - val_loss: 0.1645 - val_acc: 0.9332
    Epoch 78/100
    78/78 - 138s - loss: 0.1628 - acc: 0.9346 - val_loss: 0.2015 - val_acc: 0.9229
    Epoch 79/100
    78/78 - 136s - loss: 0.1520 - acc: 0.9387 - val_loss: 0.1919 - val_acc: 0.9272
    Epoch 80/100
    78/78 - 136s - loss: 0.1570 - acc: 0.9357 - val_loss: 0.2247 - val_acc: 0.9305
    Epoch 81/100
    78/78 - 152s - loss: 0.1580 - acc: 0.9358 - val_loss: 0.1721 - val_acc: 0.9309
    Epoch 82/100
    78/78 - 150s - loss: 0.1471 - acc: 0.9374 - val_loss: 0.1836 - val_acc: 0.9317
    Epoch 83/100
    78/78 - 149s - loss: 0.1565 - acc: 0.9334 - val_loss: 0.2132 - val_acc: 0.9217
    Epoch 84/100
    78/78 - 148s - loss: 0.1466 - acc: 0.9372 - val_loss: 0.1786 - val_acc: 0.9276
    Epoch 85/100
    78/78 - 147s - loss: 0.1577 - acc: 0.9346 - val_loss: 0.2986 - val_acc: 0.9050
    Epoch 86/100
    78/78 - 147s - loss: 0.1551 - acc: 0.9359 - val_loss: 0.1581 - val_acc: 0.9400
    Epoch 87/100
    78/78 - 144s - loss: 0.1620 - acc: 0.9324 - val_loss: 0.1769 - val_acc: 0.9363
    Epoch 88/100
    78/78 - 143s - loss: 0.1593 - acc: 0.9355 - val_loss: 0.2341 - val_acc: 0.9023
    Epoch 89/100
    78/78 - 141s - loss: 0.1557 - acc: 0.9346 - val_loss: 0.3245 - val_acc: 0.8935
    Epoch 90/100
    78/78 - 140s - loss: 0.1582 - acc: 0.9346 - val_loss: 0.2618 - val_acc: 0.9044
    Epoch 91/100
    78/78 - 139s - loss: 0.1514 - acc: 0.9388 - val_loss: 0.3941 - val_acc: 0.8302
    Epoch 92/100
    78/78 - 137s - loss: 0.1688 - acc: 0.9331 - val_loss: 0.1791 - val_acc: 0.9317
    Epoch 93/100
    78/78 - 138s - loss: 0.1533 - acc: 0.9365 - val_loss: 0.2477 - val_acc: 0.9169
    Epoch 94/100
    78/78 - 136s - loss: 0.1485 - acc: 0.9381 - val_loss: 0.2371 - val_acc: 0.9354
    Epoch 95/100
    78/78 - 136s - loss: 0.1656 - acc: 0.9321 - val_loss: 0.1838 - val_acc: 0.9221
    Epoch 96/100
    78/78 - 136s - loss: 0.1544 - acc: 0.9346 - val_loss: 0.1894 - val_acc: 0.9344
    Epoch 97/100
    78/78 - 138s - loss: 0.1529 - acc: 0.9378 - val_loss: 0.2735 - val_acc: 0.9192
    Epoch 98/100
    78/78 - 135s - loss: 0.1557 - acc: 0.9370 - val_loss: 0.1895 - val_acc: 0.9309
    Epoch 99/100
    78/78 - 137s - loss: 0.1609 - acc: 0.9349 - val_loss: 0.1683 - val_acc: 0.9375
    Epoch 100/100
    78/78 - 137s - loss: 0.1512 - acc: 0.9368 - val_loss: 0.5123 - val_acc: 0.8454
    2019-07-28 14:37:54,241 graeae.timers.timer end: Ended: 2019-07-28 14:37:54.241406
    I0728 14:37:54.241441 140643102242624 timer.py:77] Ended: 2019-07-28 14:37:54.241406
    2019-07-28 14:37:54,242 graeae.timers.timer end: Elapsed: 3:57:37.400465
    I0728 14:37:54.242489 140643102242624 timer.py:78] Elapsed: 3:57:37.400465
    
    data = pandas.DataFrame(network.model.history.history).rename(
        columns={
            "loss": "Training Loss",
            "acc": "Training Accuracy",
            "val_loss": "Validation Loss",
            "val_acc": "Validation Accuracy"
        })
    best_validation(data.rename(columns={
        "Validation Loss": "validation_loss", 
        "Validation Accuracy": "validation_accuracy"}))
    
    Best Accuracy: 0.94 (loss=0.16) Epoch: 85
    Best Loss: 0.16 (accuracy=0.94, Epoch: 85)
    
    line = holoviews.VLine(85)
    plot = data.hvplot(x="index", xlabel="Epoch", value_label="Performance").opts(
        tools=["hover"],
        title="Accuracy and Loss For 100 Epochs",
        width=1000,
        height=800,
    ) * line
    Embed(plot=plot, file_name="five_layers_batch_size_256")()
    

    Figure Missing

    By re-training I managed to squeeze out about another percentage point of validation accuracy. I might be able to get one more, but at this point it's probably good enough.
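
    Rather than re-training by hand and then saving the result, a ModelCheckpoint callback could save the best weights automatically. A minimal sketch, assuming the model_path defined above:

    checkpoint = tensorflow.keras.callbacks.ModelCheckpoint(
        str(model_path),
        monitor="val_acc",     # track the validation accuracy
        save_best_only=True,   # only overwrite the file when the metric improves
        verbose=1)

    Adding this to the callbacks passed to the training call would keep the best epoch's weights without the manual bookkeeping.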

Using the Model

test_path = Path("~/data/datasets/images/dogs-vs-cats/test1/").expanduser()
test_image = random.choice(list(test_path.iterdir()))
plot = holoviews.RGB.load_image(str(test_image)).opts(width=800, height=800)
Embed(plot=plot, file_name="test_image")()

Figure Missing

image = keras.preprocessing.image.load_img(test_image, target_size=(150, 150))
x = keras.preprocessing.image.img_to_array(image)
x = numpy.expand_dims(x, axis=0)
predictions = network.model.predict(x)
print(predictions[0])
classification = "dog" if predictions[0] > 0.5 else "cat"  # threshold the sigmoid output
print(f"{test_image.name} is a {classification}")
[1.]
8595.jpg is a dog

If you run this more than once you'll get a different image each time, but the model seems to have a bias towards predicting dogs.
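
To check that bias a little more systematically, we could score a sample of test images in one batch. This is a hypothetical snippet, assuming the test_path, network, and keras imports used above, and that the training generator rescaled the images by 1/255:

sample = random.sample(list(test_path.iterdir()), 32)
batch = numpy.stack(
    [keras.preprocessing.image.img_to_array(
        keras.preprocessing.image.load_img(path, target_size=(150, 150)))
     for path in sample]) / 255.0  # rescale to match the (assumed) training preprocessing
predictions = network.model.predict(batch)
# count how many of the sampled images get the "dog" label
print(f"Predicted 'dog' for {(predictions > 0.5).sum()} of {len(sample)} images.")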

Visualizing Intermediate Representations

Let's define a new Model that takes an image as input and outputs the intermediate representations for every layer in the previous model after the first.

successive_outputs = [layer.output for layer in model.layers[1:]]
visualization_model = tensorflow.keras.models.Model(
    inputs = model.input, 
    outputs = successive_outputs)

Now prepare a random input image of a cat or dog from the test set.

image_path = random.choice(list(test_path.iterdir()))
image = keras.preprocessing.image.load_img(image_path, target_size=(150, 150))  # this is a PIL image

x = keras.preprocessing.image.img_to_array(image)
print(f"X ({type(x)}): {x.shape}")
x   = x.reshape((1,) + x.shape)
print(f"X re-shape: {x.shape}")
# Rescale by 1/255
x /= 255.0

successive_feature_maps = visualization_model.predict(x)

# These are the names of the layers (skipping the input layer to match the
# outputs above) so we can use them to label the plots
layer_names = [layer.name for layer in model.layers[1:]]
X (<class 'numpy.ndarray'>): (150, 150, 3)
X re-shape: (1, 150, 150, 3)

The predict call above ran our image through the network, giving us the intermediate representations for this image. Now let's display them.

from matplotlib import pyplot

# Display the representations for the convolution / max-pooling layers
for layer_name, feature_map in zip(layer_names, successive_feature_maps):
    # only the conv / maxpool layers have 4D output, not the fully-connected layers
    if len(feature_map.shape) == 4:
        n_features = feature_map.shape[-1]  # number of features in the feature map
        size = feature_map.shape[1]  # the feature map's shape is (1, size, size, n_features)

        # We will tile our images in this matrix
        display_grid = numpy.zeros((size, size * n_features))

        # Postprocess each feature to be visually palatable
        for i in range(n_features):
            x = feature_map[0, :, :, i]
            x -= x.mean()
            x /= x.std() + 1e-8  # the epsilon avoids dividing by zero for dead filters
            x *= 64
            x += 128
            x = numpy.clip(x, 0, 255).astype('uint8')
            # Tile each filter into a horizontal grid
            display_grid[:, i * size: (i + 1) * size] = x

        # Display the grid
        scale = 40.0 / n_features
        pyplot.figure(figsize=(scale * n_features, scale))
        pyplot.title(layer_name)
        pyplot.grid(False)
        pyplot.imshow(display_grid, aspect='auto', cmap='viridis')

Figure Missing (layer_visualisation.png)
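
Since the rest of this post embeds holoviews plots rather than matplotlib figures, the last grid could be embedded the same way. A sketch, assuming the display_grid and layer_name left over from the loop above:

plot = holoviews.Image(display_grid).opts(
    width=1000,
    height=200,
    cmap="viridis",
    title=layer_name,
    xaxis=None,
    yaxis=None)
Embed(plot=plot, file_name="layer_visualization")()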

End

Adding Automatic Validation

Beginning

Imports

Python

from functools import partial
from pathlib import Path

PyPi

from holoviews.operation.datashader import datashade
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import cv2
import holoviews
import numpy
import tensorflow

Graeae

from graeae import EmbedHoloviews, ZipDownloader

Set Up

The Plotting

Embed = partial(
    EmbedHoloviews, 
    folder_path="../../files/posts/keras/adding-automatic-validation/")
holoviews.extension("bokeh")

The Training Images

URL = ("https://storage.googleapis.com/"
       "laurencemoroney-blog.appspot.com/"
       "horse-or-human.zip")
BASE = "~/data/datasets/images/horse-or-human/"
TARGET = f"{BASE}training"
download = ZipDownloader(url=URL, target=TARGET)
download()
training_path = download.target
Files exist, not downloading

The Validation Images

URL = (
    "https://storage.googleapis.com/"
    "laurencemoroney-blog.appspot.com/"
    "validation-horse-or-human.zip")
TARGET = f"{BASE}validation"
download = ZipDownloader(url=URL, target=TARGET)
download()
validation_path = download.target
Downloading the zip file

Middle

Examining the Data

The training set is the same one that I used before to train a model to recognize whether a picture contained a human or a horse, but the validation set is new.

print("Training")
for path in training_path.iterdir():
    print(path.name)

print("\nValidation")
for path in validation_path.iterdir():
    print(path.name)
Training
horses
humans

Validation
horses
humans

How many images do we have?

print("Training")
for path in training_path.iterdir():
    print(f"{len(list(path.iterdir())):,} images in {path.name}")

print("\nValidation")
for path in validation_path.iterdir():
    print(f"{len(list(path.iterdir())):,} images in {path.name}")    
Training
500 images in horses
527 images in humans

Validation
128 images in horses
128 images in humans

Looking At A Few Images

As I noted, the training set is the same one I looked at before, but still, it never hurts to look.

height = width = 300
human_files = list((training_path/"humans").iterdir())
horse_files = list((training_path/"horses").iterdir())
human_indexes = numpy.random.randint(0, len(human_files), 2)
horse_indexes = numpy.random.randint(0, len(horse_files), 2)

humans = [holoviews.RGB.load_image(str(human_files[index])).opts(
    width = width,
    height = height,
) for index in human_indexes]
horses = [holoviews.RGB.load_image(str(horse_files[index])).opts(
    width = width,
    height = height,
) for index in horse_indexes]
plot = holoviews.Layout(humans + horses).cols(2).opts(
    title="Sample Training Images"
)
Embed(plot=plot, file_name="training_images", height_in_pixels=700)()

Figure Missing

Preprocessing the Data

When we train the model we'll use a batch generator. This next bit of code is just a convenience class to bundle the code together.

class Data:
    """creates the data generator

    Args:
     path: path to the dataset
     target_size: tuple of pixel size for the generated images
    """
    def __init__(self, path: str, target_size: tuple=(300, 300)) -> None:
        self.path = path
        self.target_size = target_size
        self._batches = None
        return

    @property
    def batches(self) -> tensorflow.keras.preprocessing.image.DirectoryIterator:
        """Generator of image batches"""
        if self._batches is None:
            data_generator = ImageDataGenerator(rescale=1/255)
            self._batches = data_generator.flow_from_directory(
                self.path,
                target_size=self.target_size,
                batch_size=128,
                class_mode="binary",
            )
        return self._batches
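
As a quick sanity check, we can pull one batch from the generator to confirm the shapes. A hypothetical snippet, assuming the training_path from the download step:

batches = Data(str(training_path)).batches
images, labels = next(batches)
print(images.shape, labels.shape)

This should report a batch of 128 images of 300 x 300 pixels with three color channels, along with 128 binary labels.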

The Model

This bundles together the different parts needed to train and use the model.

class Model:
    """A CNN Builder

    Args:
     training_path: training data folder path
     validation_path: validation data folder path
     image_size: single-dimension for the inputs to the model
     epochs: number of training epochs
     callback: something to stop the training
    """
    def __init__(self, training_path: str, validation_path: str, 
                 image_size: int=300,
                 epochs: int=15, 
                 callback: tensorflow.keras.callbacks.Callback=None) -> None:
        self.training_path = training_path
        self.validation_path = validation_path
        self.image_size = image_size
        self.epochs = epochs
        self.callback = callback
        self._model = None
        self._training_data = None
        self._validation_data = None
        return

    @property
    def training_data(self) -> (tensorflow.keras.preprocessing
                                     .image.DirectoryIterator):
        """generator of training data batches"""
        if self._training_data is None:
            self._training_data = Data(
                self.training_path,
                (self.image_size, self.image_size)).batches
        return self._training_data

    @property
    def validation_data(self) -> (tensorflow.keras.preprocessing
                                       .image.DirectoryIterator):
        """generator of validation batches"""
        if self._validation_data is None:
            self._validation_data = Data(
                self.validation_path,
                (self.image_size, self.image_size)).batches
        return self._validation_data

    @property
    def model(self) -> tensorflow.keras.models.Sequential:
        """A model with five CNN layers"""
        if self._model is None:
            self._model = tensorflow.keras.models.Sequential()
            for layer in (
                    tensorflow.keras.layers.Conv2D(
                        16, (3,3), 
                        activation='relu', 
                        input_shape=(self.image_size, self.image_size, 3)),
                    tensorflow.keras.layers.MaxPooling2D(2, 2),

                    tensorflow.keras.layers.Conv2D(32, (3,3), 
                                                   activation='relu'),
                    tensorflow.keras.layers.MaxPooling2D(2,2),

                    tensorflow.keras.layers.Conv2D(64, (3,3), 
                                                   activation='relu'),
                    tensorflow.keras.layers.MaxPooling2D(2,2),

                    tensorflow.keras.layers.Conv2D(64, (3,3), 
                                                   activation='relu'),
                    tensorflow.keras.layers.MaxPooling2D(2,2),

                    tensorflow.keras.layers.Conv2D(64, (3,3), 
                                                   activation='relu'),
                    tensorflow.keras.layers.MaxPooling2D(2,2),

                    tensorflow.keras.layers.Flatten(),

                    tensorflow.keras.layers.Dense(512, 
                                                  activation='relu'),
                    tensorflow.keras.layers.Dense(1, activation='sigmoid'),
            ):
                self._model.add(layer)

            self._model.compile(loss='binary_crossentropy',
                                optimizer=RMSprop(lr=0.001),
                                metrics=['acc'])
        return self._model

    def print_summary(self) -> None:
        """Prints a summary of the model's layers"""
        print(self.model.summary())
        return

    def train(self) -> None:
        """Trains the model"""
        fit = partial(self.model.fit_generator,
                      self.training_data,
                      steps_per_epoch=8,  
                      epochs=self.epochs,
                      verbose=2,
                      validation_data = self.validation_data,
                      validation_steps=8)
        if self.callback:
            fit(callbacks=[self.callback])
        else:
            fit()
        return

    def predict(self, image) -> str:
        """Predicts whether the image contains a horse or a human

        Returns:
         label: label for the image
        """
        classes = self.model.predict(image)
        return "human" if classes[0] > 0.5 else "horse"  # threshold the sigmoid output

Training The Model

model = Model(str(training_path), str(validation_path))
model.train()
Found 1027 images belonging to 2 classes.
Found 256 images belonging to 2 classes.
Epoch 1/15
8/8 - 9s - loss: 1.5885 - acc: 0.5640 - val_loss: 0.9410 - val_acc: 0.5000
Epoch 2/15
8/8 - 7s - loss: 0.7624 - acc: 0.6407 - val_loss: 0.7195 - val_acc: 0.5000
Epoch 3/15
8/8 - 7s - loss: 0.8388 - acc: 0.6908 - val_loss: 0.6150 - val_acc: 0.6758
Epoch 4/15
8/8 - 7s - loss: 0.3347 - acc: 0.8818 - val_loss: 1.4559 - val_acc: 0.7070
Epoch 5/15
8/8 - 7s - loss: 0.2710 - acc: 0.8832 - val_loss: 1.2360 - val_acc: 0.8242
Epoch 6/15
8/8 - 6s - loss: 0.1465 - acc: 0.9433 - val_loss: 1.5440 - val_acc: 0.8320
Epoch 7/15
8/8 - 6s - loss: 0.4357 - acc: 0.8454 - val_loss: 1.2532 - val_acc: 0.8242
Epoch 8/15
8/8 - 6s - loss: 0.3896 - acc: 0.8888 - val_loss: 1.4711 - val_acc: 0.8008
Epoch 9/15
8/8 - 5s - loss: 0.1057 - acc: 0.9588 - val_loss: 2.0512 - val_acc: 0.8164
Epoch 10/15
8/8 - 5s - loss: 0.1610 - acc: 0.9366 - val_loss: 1.3215 - val_acc: 0.6602
Epoch 11/15
8/8 - 8s - loss: 0.0889 - acc: 0.9736 - val_loss: 1.7946 - val_acc: 0.8281
Epoch 12/15
8/8 - 7s - loss: 0.0163 - acc: 0.9944 - val_loss: 1.6159 - val_acc: 0.8672
Epoch 13/15
8/8 - 7s - loss: 0.5203 - acc: 0.8915 - val_loss: 0.9708 - val_acc: 0.8125
Epoch 14/15
8/8 - 6s - loss: 0.1073 - acc: 0.9800 - val_loss: 1.1768 - val_acc: 0.8438
Epoch 15/15
8/8 - 7s - loss: 0.0305 - acc: 0.9922 - val_loss: 1.4107 - val_acc: 0.8555

It looks like the accuracy for both the training and the validation sets is going up. Maybe a little more training will help.

model.epochs = 5
model.train()
Epoch 1/5
8/8 - 7s - loss: 0.0109 - acc: 0.9978 - val_loss: 1.6156 - val_acc: 0.8672
Epoch 2/5
8/8 - 7s - loss: 0.0067 - acc: 0.9989 - val_loss: 2.5671 - val_acc: 0.8242
Epoch 3/5
8/8 - 7s - loss: 0.2348 - acc: 0.9477 - val_loss: 1.2397 - val_acc: 0.8633
Epoch 4/5
8/8 - 7s - loss: 0.0132 - acc: 0.9961 - val_loss: 1.5193 - val_acc: 0.8750
Epoch 5/5
8/8 - 7s - loss: 0.0101 - acc: 0.9978 - val_loss: 0.9305 - val_acc: 0.8945

Everything is still improving. Try a little more.

model.epochs = 10
model.train()
Epoch 1/10
8/8 - 8s - loss: 0.0413 - acc: 0.9844 - val_loss: 0.8631 - val_acc: 0.9062
Epoch 2/10
8/8 - 7s - loss: 0.2625 - acc: 0.9244 - val_loss: 1.3837 - val_acc: 0.8438
Epoch 3/10
8/8 - 7s - loss: 0.7150 - acc: 0.8776 - val_loss: 8.2253 - val_acc: 0.6328
Epoch 4/10
8/8 - 7s - loss: 0.0937 - acc: 0.9785 - val_loss: 1.9342 - val_acc: 0.8281
Epoch 5/10
8/8 - 7s - loss: 0.0126 - acc: 0.9978 - val_loss: 1.7459 - val_acc: 0.8672
Epoch 6/10
8/8 - 7s - loss: 0.0064 - acc: 1.0000 - val_loss: 1.8857 - val_acc: 0.8633
Epoch 7/10
8/8 - 6s - loss: 0.0025 - acc: 1.0000 - val_loss: 2.1456 - val_acc: 0.8672
Epoch 8/10
8/8 - 6s - loss: 0.0027 - acc: 1.0000 - val_loss: 2.0877 - val_acc: 0.8711
Epoch 9/10
8/8 - 6s - loss: 9.8538e-04 - acc: 1.0000 - val_loss: 2.3224 - val_acc: 0.8672
Epoch 10/10
8/8 - 6s - loss: 4.4454e-04 - acc: 1.0000 - val_loss: 2.8453 - val_acc: 0.8672

The training loss and accuracy keep getting better, but it looks like the model starts overfitting after about epoch 21 (counting across the three runs), since the validation metrics start to get worse.
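
To pin down where the validation metrics peaked, we can inspect the history directly. A sketch, assuming pandas is imported and that model.model.history still holds the most recent run:

history = pandas.DataFrame(model.model.history.history)
best = history["val_acc"].idxmax()
print(f"Best validation accuracy: {history['val_acc'][best]:.2%} "
      f"(epoch {best + 1}, validation loss: {history['val_loss'][best]:.4f})")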

I'll try making a callback that stops the training when the validation accuracy reaches 90 %.

class Stop(tensorflow.keras.callbacks.Callback):
    """Stops the training once the validation accuracy reaches 90 %"""
    def on_epoch_end(self, epoch, logs=None):
        # logs can be None, and "val_acc" is absent if validation didn't run
        if (logs or {}).get("val_acc", 0) >= 0.9:
            print(f"Stopping point reached at epoch {epoch}")
            self.model.stop_training = True
callback = Stop()
model = Model(str(training_path), str(validation_path), 
              epochs=30,
              callback=callback)
model.train()
Found 1027 images belonging to 2 classes.
Found 256 images belonging to 2 classes.
Epoch 1/30
8/8 - 8s - loss: 1.7387 - acc: 0.5006 - val_loss: 0.6752 - val_acc: 0.5000
Epoch 2/30
8/8 - 7s - loss: 0.6397 - acc: 0.6630 - val_loss: 0.4168 - val_acc: 0.8438
Epoch 3/30
8/8 - 7s - loss: 0.8124 - acc: 0.6162 - val_loss: 0.5096 - val_acc: 0.7617
Epoch 4/30
8/8 - 7s - loss: 0.3740 - acc: 0.8498 - val_loss: 0.8950 - val_acc: 0.7891
Epoch 5/30
8/8 - 7s - loss: 0.2619 - acc: 0.8867 - val_loss: 0.8874 - val_acc: 0.8477
Epoch 6/30
8/8 - 6s - loss: 0.2136 - acc: 0.9010 - val_loss: 0.5653 - val_acc: 0.8789
Epoch 7/30
8/8 - 6s - loss: 0.0980 - acc: 0.9566 - val_loss: 1.4001 - val_acc: 0.8320
Epoch 8/30
8/8 - 6s - loss: 0.2865 - acc: 0.8665 - val_loss: 0.5963 - val_acc: 0.8906
Epoch 9/30
8/8 - 5s - loss: 0.1949 - acc: 0.9288 - val_loss: 0.9161 - val_acc: 0.8984
Epoch 10/30
8/8 - 5s - loss: 0.1328 - acc: 0.9488 - val_loss: 1.7331 - val_acc: 0.8164
Epoch 11/30
8/8 - 7s - loss: 0.1825 - acc: 0.9266 - val_loss: 1.1965 - val_acc: 0.8438
Epoch 12/30
8/8 - 7s - loss: 0.1108 - acc: 0.9633 - val_loss: 1.8896 - val_acc: 0.7852
Epoch 13/30
8/8 - 7s - loss: 0.0309 - acc: 0.9883 - val_loss: 1.7577 - val_acc: 0.8477
Epoch 14/30
8/8 - 7s - loss: 0.0140 - acc: 0.9956 - val_loss: 2.0667 - val_acc: 0.8320
Epoch 15/30
8/8 - 6s - loss: 1.5402 - acc: 0.8359 - val_loss: 1.3396 - val_acc: 0.8203
Epoch 16/30
8/8 - 6s - loss: 0.0144 - acc: 0.9990 - val_loss: 1.8488 - val_acc: 0.8203
Epoch 17/30
8/8 - 6s - loss: 0.0092 - acc: 0.9989 - val_loss: 2.0972 - val_acc: 0.8320
Epoch 18/30
8/8 - 5s - loss: 0.0031 - acc: 1.0000 - val_loss: 1.9660 - val_acc: 0.8594
Epoch 19/30
8/8 - 6s - loss: 0.0752 - acc: 0.9785 - val_loss: 2.6233 - val_acc: 0.7578
Epoch 20/30
8/8 - 7s - loss: 0.0086 - acc: 0.9987 - val_loss: 2.2535 - val_acc: 0.8203
Epoch 21/30
8/8 - 7s - loss: 0.0012 - acc: 1.0000 - val_loss: 2.5086 - val_acc: 0.8242
Epoch 22/30
8/8 - 7s - loss: 8.1537e-04 - acc: 1.0000 - val_loss: 2.6183 - val_acc: 0.8203
Epoch 23/30
8/8 - 7s - loss: 4.3476e-04 - acc: 1.0000 - val_loss: 2.5576 - val_acc: 0.8477
Epoch 24/30
8/8 - 7s - loss: 1.6678e-04 - acc: 1.0000 - val_loss: 2.7958 - val_acc: 0.8398
Epoch 25/30
8/8 - 6s - loss: 2.6736e-04 - acc: 1.0000 - val_loss: 2.8162 - val_acc: 0.8398
Epoch 26/30
8/8 - 6s - loss: 6.3831e-05 - acc: 1.0000 - val_loss: 3.0070 - val_acc: 0.8398
Epoch 27/30
8/8 - 6s - loss: 3.5260e-05 - acc: 1.0000 - val_loss: 3.4427 - val_acc: 0.8320
Epoch 28/30
8/8 - 5s - loss: 2.8581e-05 - acc: 1.0000 - val_loss: 3.0836 - val_acc: 0.8594
Epoch 29/30
8/8 - 7s - loss: 1.9179 - acc: 0.8610 - val_loss: 1.5853 - val_acc: 0.8281
Epoch 30/30
8/8 - 7s - loss: 0.0118 - acc: 0.9951 - val_loss: 2.7055 - val_acc: 0.8086

This time the validation accuracy never reached 90 % the way it did previously, so the callback never fired. Maybe I'll just set it to use 21 epochs.
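
Keras also ships a built-in alternative to a hand-rolled threshold callback: EarlyStopping, which stops once the monitored metric stops improving. A sketch, assuming this version of tensorflow supports restore_best_weights:

early_stopping = tensorflow.keras.callbacks.EarlyStopping(
    monitor="val_acc",          # the validation-accuracy key in these logs
    mode="max",                 # a higher value is better
    patience=5,                 # tolerate five epochs without improvement
    restore_best_weights=True)  # roll back to the best epoch when stopping

Passing this as the callback argument to Model would slot it into the callbacks list without any other changes. For now, though, let's just fix the epoch count.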

model = Model(str(training_path), str(validation_path), epochs=21)
model.train()
Found 1027 images belonging to 2 classes.
Found 256 images belonging to 2 classes.
Epoch 1/21
8/8 - 8s - loss: 0.8662 - acc: 0.5428 - val_loss: 0.6637 - val_acc: 0.5000
Epoch 2/21
8/8 - 7s - loss: 0.7301 - acc: 0.6118 - val_loss: 0.5114 - val_acc: 0.8398
Epoch 3/21
8/8 - 7s - loss: 0.5781 - acc: 0.8516 - val_loss: 0.4985 - val_acc: 0.8203
Epoch 4/21
8/8 - 6s - loss: 0.6889 - acc: 0.8346 - val_loss: 0.8576 - val_acc: 0.7969
Epoch 5/21
8/8 - 6s - loss: 0.2113 - acc: 0.9310 - val_loss: 2.0597 - val_acc: 0.6875
Epoch 6/21
8/8 - 6s - loss: 0.3143 - acc: 0.8865 - val_loss: 0.8110 - val_acc: 0.8320
Epoch 7/21
8/8 - 6s - loss: 0.1289 - acc: 0.9570 - val_loss: 1.1169 - val_acc: 0.8672
Epoch 8/21
8/8 - 6s - loss: 0.1513 - acc: 0.9288 - val_loss: 1.1159 - val_acc: 0.8398
Epoch 9/21
8/8 - 6s - loss: 0.0882 - acc: 0.9700 - val_loss: 1.4653 - val_acc: 0.8125
Epoch 10/21
8/8 - 5s - loss: 0.1803 - acc: 0.9522 - val_loss: 1.2575 - val_acc: 0.8711
Epoch 11/21
8/8 - 7s - loss: 0.0753 - acc: 0.9766 - val_loss: 1.0846 - val_acc: 0.8633
Epoch 12/21
8/8 - 8s - loss: 0.1993 - acc: 0.9580 - val_loss: 0.9569 - val_acc: 0.8672
Epoch 13/21
8/8 - 8s - loss: 0.0452 - acc: 0.9867 - val_loss: 1.1035 - val_acc: 0.8906
Epoch 14/21
8/8 - 6s - loss: 0.0139 - acc: 0.9948 - val_loss: 1.7541 - val_acc: 0.8516
Epoch 15/21
8/8 - 6s - loss: 0.0191 - acc: 0.9911 - val_loss: 1.6554 - val_acc: 0.8555
Epoch 16/21
8/8 - 6s - loss: 0.0327 - acc: 0.9967 - val_loss: 10.3868 - val_acc: 0.6523
Epoch 17/21
8/8 - 6s - loss: 2.2541 - acc: 0.9004 - val_loss: 0.9508 - val_acc: 0.8594
Epoch 18/21
8/8 - 6s - loss: 0.0282 - acc: 0.9889 - val_loss: 1.3172 - val_acc: 0.8672
Epoch 19/21
8/8 - 5s - loss: 0.0064 - acc: 0.9989 - val_loss: 1.6202 - val_acc: 0.8477
Epoch 20/21
8/8 - 7s - loss: 0.0033 - acc: 1.0000 - val_loss: 2.0371 - val_acc: 0.8125
Epoch 21/21
8/8 - 8s - loss: 0.0066 - acc: 0.9990 - val_loss: 1.8340 - val_acc: 0.8672

It looks like the 90 % validation accuracy was a fluke.

Looking At Some Predictions

These are the same images I tested previously. The architecture of the model is the same, but I didn't train it for as many epochs on the current pass through this data set.

test_path = Path("~/test_images/").expanduser()
height = width = 400
plots = [datashade(holoviews.RGB.load_image(str(path))).opts(
    title=f"{path.name}",
    height=height,
    width=width
) for path in test_path.iterdir()]
plot = holoviews.Layout(plots).cols(2).opts(title="Test Images")
Embed(plot=plot, file_name="test_images", height_in_pixels=900)()

Figure Missing

target_size = (300, 300)

images = (("horse.jpg", "Horse"), 
          ("centaur.jpg", "Centaur"), 
          ("tomb_figure.jpg", "Statue of a Man Riding a Horse"),
          ("rembrandt.jpg", "Woman"))
for filename, label in images:
    loaded = cv2.imread(str(test_path/filename))
    x = cv2.resize(loaded, target_size)
    x = numpy.reshape(x, (1, 300, 300, 3))
    prediction = model.predict(x)
    print(f"The {label} is a {prediction}.")
The Horse is a horse.
The Centaur is a horse.
The Statue of a Man Riding a Horse is a human.
The Woman is a horse.

Well, now it got the horse right and the woman wrong. Peculiar. One likely culprit is the preprocessing: the training images were rescaled by 1/255 but these test images aren't, and cv2.imread loads them in BGR channel order rather than the RGB that the generator produced.
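
Here's a sketch of the prediction loop with matching preprocessing, assuming the model, test_path, images, and target_size defined above:

for filename, label in images:
    loaded = cv2.imread(str(test_path/filename))
    # match the training preprocessing: RGB channel order and 1/255 rescaling
    x = cv2.cvtColor(loaded, cv2.COLOR_BGR2RGB)
    x = cv2.resize(x, target_size)
    x = numpy.reshape(x, (1, 300, 300, 3)) / 255.0
    print(f"The {label} is a {model.predict(x)}.")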

A re-try with smaller images.

model = Model(str(training_path), str(validation_path), 
              image_size=150, 
              epochs=21)
model.train()
Found 1027 images belonging to 2 classes.
Found 256 images belonging to 2 classes.
Epoch 1/21
8/8 - 6s - loss: 0.7257 - acc: 0.5072 - val_loss: 0.6794 - val_acc: 0.6719
Epoch 2/21
8/8 - 6s - loss: 0.6691 - acc: 0.6118 - val_loss: 0.4503 - val_acc: 0.8633
Epoch 3/21
8/8 - 6s - loss: 0.5535 - acc: 0.7402 - val_loss: 0.4486 - val_acc: 0.7969
Epoch 4/21
8/8 - 5s - loss: 0.5850 - acc: 0.7959 - val_loss: 0.4330 - val_acc: 0.8555
Epoch 5/21
8/8 - 4s - loss: 0.1967 - acc: 0.9321 - val_loss: 1.1319 - val_acc: 0.7891
Epoch 6/21
8/8 - 4s - loss: 0.1969 - acc: 0.9310 - val_loss: 0.8440 - val_acc: 0.8125
Epoch 7/21
8/8 - 4s - loss: 0.1309 - acc: 0.9522 - val_loss: 1.4648 - val_acc: 0.8008
Epoch 8/21
8/8 - 5s - loss: 0.2732 - acc: 0.9023 - val_loss: 0.8364 - val_acc: 0.8398
Epoch 9/21
8/8 - 4s - loss: 0.1071 - acc: 0.9611 - val_loss: 1.2082 - val_acc: 0.8359
Epoch 10/21
8/8 - 4s - loss: 0.0725 - acc: 0.9711 - val_loss: 1.9165 - val_acc: 0.7148
Epoch 11/21
8/8 - 6s - loss: 0.2651 - acc: 0.9062 - val_loss: 0.8687 - val_acc: 0.8398
Epoch 12/21
8/8 - 6s - loss: 0.0568 - acc: 0.9789 - val_loss: 1.0587 - val_acc: 0.8359
Epoch 13/21
8/8 - 5s - loss: 0.1405 - acc: 0.9522 - val_loss: 1.3749 - val_acc: 0.7773
Epoch 14/21
8/8 - 5s - loss: 0.2003 - acc: 0.9395 - val_loss: 0.7942 - val_acc: 0.8555
Epoch 15/21
8/8 - 4s - loss: 0.0313 - acc: 0.9889 - val_loss: 0.8540 - val_acc: 0.8594
Epoch 16/21
8/8 - 4s - loss: 0.0280 - acc: 0.9922 - val_loss: 0.9602 - val_acc: 0.8516
Epoch 17/21
8/8 - 4s - loss: 0.1560 - acc: 0.9544 - val_loss: 0.6488 - val_acc: 0.8359
Epoch 18/21
8/8 - 4s - loss: 0.0366 - acc: 0.9933 - val_loss: 1.0103 - val_acc: 0.8555
Epoch 19/21
8/8 - 4s - loss: 0.0238 - acc: 0.9967 - val_loss: 0.7084 - val_acc: 0.8555
Epoch 20/21
8/8 - 5s - loss: 0.0555 - acc: 0.9778 - val_loss: 0.9348 - val_acc: 0.8594
Epoch 21/21
8/8 - 6s - loss: 0.0046 - acc: 1.0000 - val_loss: 1.1267 - val_acc: 0.8633
target_size = (150, 150)

images = (("horse.jpg", "Horse"), 
          ("centaur.jpg", "Centaur"), 
          ("tomb_figure.jpg", "Statue of a Man Riding a Horse"),
          ("rembrandt.jpg", "Woman"))
for filename, label in images:
    loaded = cv2.imread(str(test_path/filename))
    x = cv2.resize(loaded, target_size)
    x = numpy.reshape(x, (1, 150, 150, 3))
    prediction = model.predict(x)
    print(f"The {label} is a {prediction}.")
The Horse is a horse.
The Centaur is a horse.
The Statue of a Man Riding a Horse is a horse.
The Woman is a horse.

Although the training looked about the same, apart from reaching a higher training accuracy, the model now appears to predict that everything is a horse.

End

Source

This is a walk-through of the Course 1 - Part 8 - Lesson 3 - Notebook.ipynb on GitHub.