This tutorial shows how to build a Convolutional Neural Network for recognizing images from the CIFAR-10 dataset, covering the following concepts.

  1. Data Augmentation
  2. Save/load checkpoint
  3. Save Model
  4. Visualizing Learning in TensorBoard

Why the Keras API


Keras is an API that makes building deep-learning models easier and faster. It’s a deep-learning toolbox. It’s all about ease of use, reducing complexity, and reducing cognitive load.

The key idea behind Keras is to put deep learning into the hands of everyone. In the future, deep learning will be part of every developer’s toolbox, not just a tool for experts and researchers. The Keras API is what will make that happen.

Keras is not a library or a code base. It’s more of an API specification, one that has several different implementations, including Theano and TensorFlow implementations.

If you’re a TensorFlow user, it gives you access to the full scope of the Keras API to make your life easier without leaving your existing TensorFlow workflow. You can start using the Keras API with no loss of flexibility. You don’t have to adopt all of Keras; you can use just the layers you need.
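As a minimal sketch (the layer sizes here are just for illustration), a single Keras layer can be called as a plain function on a tensor inside an otherwise ordinary TensorFlow workflow:

import tensorflow as tf

# Any tensor can be passed through an individual Keras layer;
# here we use an Input tensor for a 32x32 RGB image.
inputs = tf.keras.layers.Input(shape=(32, 32, 3))
features = tf.keras.layers.Conv2D(16, kernel_size=(3, 3), activation=tf.nn.relu)(inputs)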

CIFAR-10 dataset

The size of every image in this dataset is 32x32x3 (RGB). There are 50,000 images for training a model and 10,000 images for evaluating its performance. The ten classes, along with ten randomly selected images from each, can be seen in the picture below.

CIFAR10 Dataset

Load Data

The keras.datasets module includes a method to load and fetch the CIFAR-10 dataset.

import os

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

import tensorflow_model  # local module containing the cnn_model() function defined below

# Human-readable names for the ten CIFAR-10 label indices
class_mapping = ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"]

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Keep the first 44 raw test images and integer labels for visualizing predictions later
x_predict = x_test[:44]
y_predict = y_test[:44]

num_class = 10

# One-hot encode the integer labels
y_train = tf.keras.utils.to_categorical(y_train, num_class)
y_test = tf.keras.utils.to_categorical(y_test, num_class)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0
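As a quick sanity check (not part of the training pipeline), you can print the array shapes to confirm the dataset sizes described above:

print(x_train.shape)  # (50000, 32, 32, 3)
print(x_test.shape)   # (10000, 32, 32, 3)
print(y_train.shape)  # (50000, 10) after one-hot encoding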

Rescale (1/255)

Our original images consist of RGB coefficients in the 0-255 range, but such values would be too high for our model to process (given a typical learning rate), so we target values between 0 and 1 instead by scaling with a 1/255 factor.

Data Augmentation

You can augment data via a number of random transformations so that the model never sees the exact same image twice. This helps prevent overfitting and helps the model generalize better.

In Keras, this can be done via the tf.keras.preprocessing.image.ImageDataGenerator class. This class allows you to configure random transformations and normalization operations to be applied to your image data during training, and to instantiate generators of augmented image batches (and labels) via .flow(data, labels) or .flow_from_directory(directory). These generators can then be used with the Keras model methods that accept data generators as inputs: fit_generator, evaluate_generator and predict_generator.

data_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=90,        # randomly rotate images by up to 90 degrees
    width_shift_range=0.2,    # randomly shift images horizontally by up to 20% of the width
    height_shift_range=0.2,   # randomly shift images vertically by up to 20% of the height
    shear_range=0.2,          # randomly apply shearing transformations
    zoom_range=0.2,           # randomly zoom inside images
    horizontal_flip=True,     # randomly flip images horizontally
    fill_mode='nearest')      # fill newly created pixels with the nearest existing pixel

data_gen.fit(x_train)

These are just a few of the options available; for more, see the documentation.
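To get a feel for what these transformations do, here is a small sketch (not part of the training pipeline) that uses the generator's .flow() method to plot the first batch of augmented images:

# Draw one batch of 9 augmented images and display them in a 3x3 grid
for batch_x, batch_y in data_gen.flow(x_train, y_train, batch_size=9):
    for i in range(9):
        plt.subplot(3, 3, i + 1)
        plt.imshow(batch_x[i])
        plt.axis('off')
    plt.show()
    break  # the generator loops forever, so stop after one batch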

Create Model

In our case, we will use a small convnet alongside data augmentation and dropout. Dropout helps reduce overfitting by preventing a layer from seeing the exact same pattern twice. You could say that both dropout and data augmentation tend to disrupt random correlations occurring in your data.

BatchNormalization normalizes the activations of the previous layer at each batch, i.e. it applies a transformation that keeps the mean activation close to 0 and the activation standard deviation close to 1. It addresses the problem of internal covariate shift, and it also acts as a regularizer, in some cases eliminating the need for Dropout.

CNN model for CIFAR10

The code snippet below is our TensorFlow model using the Keras API: three blocks, each a pair of convolution layers with batch normalization and ReLU activations, followed by a max-pooling layer and dropout.

def cnn_model():
    input_layer = tf.keras.layers.Input(shape=(32, 32, 3), name="input_layer")
    use_bias = True

    # Conv1
    conv = tf.keras.layers.Conv2D(32,
                                  kernel_size=(3, 3),
                                  padding='same',
                                  use_bias=use_bias,
                                  activation=None)(input_layer)
    bn = tf.keras.layers.BatchNormalization(epsilon=1e-06, axis=-1, momentum=0.9)(conv)
    activation = tf.keras.layers.Activation(tf.nn.relu)(bn)

    # Conv2
    conv = tf.keras.layers.Conv2D(32,
                                  kernel_size=(3, 3),
                                  padding='same',
                                  use_bias=use_bias,
                                  activation=None)(activation)
    bn = tf.keras.layers.BatchNormalization(epsilon=1e-06, axis=-1, momentum=0.9)(conv)
    activation = tf.keras.layers.Activation(tf.nn.relu)(bn)

    # MaxPooling1
    max_pool = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(activation)
    dropout = tf.keras.layers.Dropout(0.2)(max_pool)

    # Conv3
    conv = tf.keras.layers.Conv2D(64,
                                  kernel_size=(3, 3),
                                  padding='same',
                                  use_bias=use_bias,
                                  activation=None)(dropout)
    bn = tf.keras.layers.BatchNormalization(epsilon=1e-06, axis=-1, momentum=0.9)(conv)
    activation = tf.keras.layers.Activation(tf.nn.relu)(bn)

    # Conv4
    conv = tf.keras.layers.Conv2D(64,
                                  kernel_size=(3, 3),
                                  padding='same',
                                  use_bias=use_bias,
                                  activation=None)(activation)
    bn = tf.keras.layers.BatchNormalization(epsilon=1e-06, axis=-1, momentum=0.9)(conv)
    activation = tf.keras.layers.Activation(tf.nn.relu)(bn)

    # MaxPooling2
    max_pool = tf.keras.layers.MaxPooling2D()(activation)
    dropout = tf.keras.layers.Dropout(0.3)(max_pool)

    # Conv5
    conv = tf.keras.layers.Conv2D(128,
                                  kernel_size=(3, 3),
                                  padding='same',
                                  use_bias=use_bias,
                                  activation=None)(dropout)
    bn = tf.keras.layers.BatchNormalization(epsilon=1e-06, axis=-1, momentum=0.9)(conv)
    activation = tf.keras.layers.Activation(tf.nn.relu)(bn)
    # Conv6
    conv = tf.keras.layers.Conv2D(128,
                                  kernel_size=(3, 3),
                                  padding='same',
                                  use_bias=use_bias,
                                  activation=None)(activation)
    bn = tf.keras.layers.BatchNormalization(epsilon=1e-06, axis=-1, momentum=0.9)(conv)
    activation = tf.keras.layers.Activation(tf.nn.relu)(bn)

    # MaxPooling3
    max_pool = tf.keras.layers.MaxPooling2D()(activation)
    dropout = tf.keras.layers.Dropout(0.4)(max_pool)

    # Flatten
    flatten = tf.keras.layers.Flatten()(dropout)

    # Softmax output layer
    output = tf.keras.layers.Dense(10, activation=tf.nn.softmax, name='output')(flatten)

    return tf.keras.Model(inputs=input_layer, outputs=output)

Compile Model

We are now ready to compile our model. Categorical cross-entropy has been chosen as the loss function because we have more than two classes.

model = tensorflow_model.cnn_model()

model.summary()

opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)
model.compile(loss=tf.keras.losses.categorical_crossentropy, optimizer=opt, metrics=['accuracy'])

Visualizing Learning in TensorBoard

This code writes a log for TensorBoard, which allows you to visualize dynamic graphs of your training and test metrics, as well as activation histograms for the different layers in your model. See the tf.keras.callbacks.TensorBoard documentation for more options.

log_dir = "training_3"
batch_size = 64

tbCallback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=50,
                                            write_graph=True, write_grads=True, batch_size=batch_size,
                                            write_images=True)

Run the following command in a command prompt after training starts.

$ tensorboard --logdir training_3

Save and restore models

Model progress can be saved during and after training. This means a model can pick up where it left off and avoid long training times, in case the training process is interrupted.

checkpoint_path = os.path.join(log_dir, "cp.ckpt")

# Create checkpoint callback
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path, verbose=1, save_weights_only=True,
    # Save weights every 5 epochs
    period=5)

Then load the weights from the checkpoint, and re-evaluate:

model.load_weights(checkpoint_path)
loss, acc = model.evaluate(x_test, y_test)
print("Restored model, accuracy: {:5.2f}%".format(100 * acc))
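If you save multiple, differently named checkpoints over time (for example by including the epoch number in the file name), tf.train.latest_checkpoint can look up the most recent one. A minimal sketch, assuming the checkpoints live in log_dir:

# Find and load the newest checkpoint in the directory
latest = tf.train.latest_checkpoint(log_dir)
model.load_weights(latest)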

Train Model

We are now ready to train our model using fit_generator().

def lr_schedule(epoch):
    # Step the learning rate down as training progresses
    lrate = 0.001
    if epoch > 100:
        lrate = 0.0003
    elif epoch > 75:
        lrate = 0.0005

    return lrate

model.fit_generator(data_gen.flow(x_train, y_train, batch_size=batch_size),
                    steps_per_epoch=x_train.shape[0] // batch_size, epochs=125,
                    verbose=1, validation_data=(x_test, y_test),
                    callbacks=[tf.keras.callbacks.LearningRateScheduler(lr_schedule), cp_callback, tbCallback])

The output of the above Python implementation for the object recognition task is shown below:

CIFAR-10 Output

Evaluate and predict

The tf.keras.Model.evaluate and tf.keras.Model.predict methods can accept NumPy arrays as well as a tf.data.Dataset.

scores = model.evaluate(x_test, y_test, batch_size=batch_size, verbose=1)

print('\nTest accuracy: %.3f loss: %.3f' % (scores[1] * 100, scores[0]))
CIFAR-10 evaluate model
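If you prefer the Dataset API, a minimal sketch of the equivalent evaluation might look like this (the steps argument is required when passing a dataset in older TensorFlow versions):

# Wrap the test arrays in a batched tf.data.Dataset and evaluate on it
test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(batch_size)
scores = model.evaluate(test_dataset, steps=x_test.shape[0] // batch_size)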

Once you have fit a final deep learning model in Keras, you can use it to make predictions on new data instances.

result = model.predict(x_predict / 255.0)  # scale the raw images just like the training data

pos = 1
for img, lbl, predict_lbl in zip(x_predict, y_predict, result):
    # Pick the class with the highest predicted probability
    output = np.argmax(predict_lbl, axis=None)
    plt.subplot(4, 11, pos)
    plt.imshow(img)
    plt.axis('off')
    if output == lbl[0]:
        plt.title(class_mapping[output])
    else:
        # Show "predicted/actual" in red for misclassified images
        plt.title(class_mapping[output] + "/" + class_mapping[lbl[0]], color='#ff0000')
    pos += 1

plt.show()
Tensorflow Keras model prediction

Save Model

You can save the entire model to a file that contains the weight values, the model’s configuration, and the optimizer’s configuration. This allows you to checkpoint a model and resume training later—from the exact same state—without access to the original code. Keras provides a basic save format using the HDF5 standard.

Install the dependencies needed for saving in HDF5 format:

!pip install -q h5py pyyaml
model.save('my_model.h5')
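To restore the model later, recreate it from the file; the weights and the optimizer configuration come back with it:

# Recreate the exact same model, including weights and optimizer state
new_model = tf.keras.models.load_model('my_model.h5')
new_model.summary()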

In this tutorial, you discovered how to train a CNN image classification model using the TensorFlow Keras high-level API.

Specifically, you learned:

  • How to save and load a checkpoint.
  • How to use the TensorBoard callback of Keras.
  • How to save the model.

Download this project from GitHub