TensorFlow provides a high-level API that makes it easy to build a neural network. The layers module enables you to build fully connected and convolutional layers, add activation functions, and apply dropout regularization and batch normalization.

The tf.data API enables you to build input pipelines. For example, the pipeline for an image model might aggregate data from files in a distributed file system, apply random perturbations to each image, and merge randomly selected images into a batch for training.

Estimators encapsulate training, evaluation, prediction, and export for serving. In this tutorial, you’ll learn how to use TensorFlow’s high-level API to build a convolutional neural network that classifies each image in the Planes in Satellite Imagery dataset as plane or no-plane.


TensorFlow Classify planes or not

Kaggle’s Planes in Satellite Imagery dataset consists of 8000 color images of planes and 24000 color images of non-planes, which we use for training and testing. It is provided as a zipped directory, planesnet.zip, that contains the entire dataset as 20×20 RGB .png images. Each image filename follows a specific format: {label}__{scene id}__{longitude}_{latitude}.png. The label is 1 or 0, representing the “plane” and “no-plane” class, respectively.
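Because the label is just the first field of the filename, it can be recovered without opening the image. Here is a minimal sketch of that parsing in plain Python. The helper name parse_filename, the PlaneFile tuple and the example path are my own illustrations, not part of the dataset tooling, and the sketch assumes the fields are separated exactly as in the format above.

```python
import os
from typing import NamedTuple


class PlaneFile(NamedTuple):
    """Fields encoded in a filename (hypothetical container for this sketch)."""
    label: int
    scene_id: str
    longitude: float
    latitude: float


def parse_filename(path):
    """Split '{label}__{scene id}__{longitude}_{latitude}.png' into its fields."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # Double underscores separate the three top-level fields.
    label, scene_id, coords = stem.split("__")
    # A single underscore separates longitude from latitude.
    longitude, latitude = coords.split("_")
    return PlaneFile(int(label), scene_id, float(longitude), float(latitude))


# Illustrative path only; real scene ids and coordinates will differ.
example = parse_filename("/data/planesnet/1__sceneA__-118.43_33.93.png")
print(example.label, example.scene_id)
```

Only the label is needed for training; the other fields are parsed here just to show the full format.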

Build Convolutional Neural Networks Model

The first step in creating a model function is to write branching code that implements prediction, evaluation, and training. The model function gets invoked whenever someone calls the Estimator’s train, evaluate, or predict methods.

Figure: Convolutional neural network using the high-level API

This is what we will build. The network takes a 20x20x3 image as input and passes it through three convolutional layers. To connect the last convolutional layer to the softmax layer, which classifies the image as airplane or non-airplane, we reshape the last data cube and flatten it into one big vector. We then apply normal dense layers to this vector and end with a softmax activation and a softmax cross-entropy loss, because this is a classifier.
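To see how large that flattened vector is, we can trace the shape of the data cube through the three convolutional layers. The sketch below is plain Python, not TensorFlow code. It assumes "same" padding, where the output size is the input size divided by the stride and rounded up, and takes the filter counts and strides from the model function that follows. The helper names are mine.

```python
import math

# (filters, stride) for the three convolutional layers of the model
LAYERS = [(16, 1), (32, 2), (64, 2)]


def same_padding_output(size, stride):
    """Spatial output size of a convolution that uses "same" padding."""
    return math.ceil(size / stride)


def trace_shapes(height=20, width=20, channels=3):
    """Return the (height, width, channels) shape after each layer."""
    shapes = [(height, width, channels)]
    for filters, stride in LAYERS:
        height = same_padding_output(height, stride)
        width = same_padding_output(width, stride)
        shapes.append((height, width, filters))
    return shapes


shapes = trace_shapes()
last_height, last_width, last_channels = shapes[-1]
flat_size = last_height * last_width * last_channels

print(shapes)     # [(20, 20, 3), (20, 20, 16), (10, 10, 32), (5, 5, 64)]
print(flat_size)  # 1600
```

The result, 1600, is the same number as the 4 * 16 * 5 * 5 used when the model flattens the last layer.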

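# Written against the TensorFlow 1.x API (tf.layers, tf.contrib, tf.estimator)
import math

import tensorflow as tf


def model_fn(features, labels, mode):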
    def learn_rate(lr, step):
        return 0.0001 + tf.train.exponential_decay(lr, step, 800, 1 / math.e)

    # Input Layer
    input_layer = tf.reshape(features["image"], [-1, 20, 20, 3])
    input_layer = tf.to_float(input_layer) / 255.0

    Y_ = labels

    # 1 layer [filter_size:4x4, stride:1, padding:same, filters:16]
    conv1 = tf.layers.conv2d(input_layer, filters=16, kernel_size=[4, 4], strides=1, padding="same", activation=None)
    batch_norm1 = tf.layers.batch_normalization(conv1, axis=-1, momentum=0.993, epsilon=1e-5, center=True,
                                                scale=False, training=(mode == tf.estimator.ModeKeys.TRAIN))
    relu1 = tf.nn.relu(batch_norm1)

    # 2 layer
    conv2 = tf.layers.conv2d(relu1, filters=32, kernel_size=[3, 3], strides=2, padding="same", activation=None)
    batch_norm2 = tf.layers.batch_normalization(conv2, axis=-1, momentum=0.993, epsilon=1e-5, center=True,
                                                scale=False, training=(mode == tf.estimator.ModeKeys.TRAIN))
    relu2 = tf.nn.relu(batch_norm2)

    # 3 layer
    conv3 = tf.layers.conv2d(relu2, filters=64, kernel_size=[2, 2], strides=2, padding="same", activation=None)
    batch_norm3 = tf.layers.batch_normalization(conv3, axis=-1, momentum=0.993, epsilon=1e-5, center=True,
                                                scale=False, training=(mode == tf.estimator.ModeKeys.TRAIN))
    relu3 = tf.nn.relu(batch_norm3)

    # Flatten all values for fully connected layer
    relu3_flat = tf.reshape(relu3, [-1, 4 * 16 * 5 * 5])

    # Dense Layer
    dense = tf.layers.dense(relu3_flat, 80, activation=None, use_bias=False)
    batch_norm_dense = tf.layers.batch_normalization(dense, axis=-1, momentum=0.993, epsilon=1e-5, center=True,
                                                     scale=False, training=(mode == tf.estimator.ModeKeys.TRAIN))
    relu_dense = tf.nn.relu(batch_norm_dense)

    # Logits Layer
    Ylogits = tf.layers.dense(relu_dense, 2)

    predictions = {
        "classes": tf.argmax(input=Ylogits, axis=1),
        "probabilities": tf.nn.softmax(Ylogits, name="softmax_tensor")
    }

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    predict = tf.nn.softmax(Ylogits)
    classes = tf.cast(tf.argmax(predict, 1), tf.uint8)

    # Calculate Loss for TRAIN and EVAL modes
    loss = tf.reduce_mean(tf.losses.softmax_cross_entropy(tf.one_hot(Y_, 2), Ylogits)) * 100

    # Configure the Training Op for TRAIN mode
    if mode == tf.estimator.ModeKeys.TRAIN:
        train_op = tf.contrib.layers.optimize_loss(loss, tf.train.get_global_step(), learning_rate=0.01,
                                                   optimizer="Adam", learning_rate_decay_fn=learn_rate)
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # Add evaluation metrics
    eval_metrics = {'accuracy': tf.metrics.accuracy(classes, Y_)}

    return tf.estimator.EstimatorSpec(
        mode=mode, loss=loss, eval_metric_ops=eval_metrics)

The model function contains your layers. It returns the predictions and the loss. You put the loss into an optimizer, which gives you the training operation that the Estimator will run in a loop, and you add whatever evaluation metrics you care about.
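For intuition, the loss and the learning-rate schedule used in model_fn can be reproduced in a few lines of plain Python. This is a sketch under my own reading of the code, not TensorFlow's implementation: the loss is shown for a single example (the model averages it over the batch), and the schedule assumes the decay function is applied to the base learning rate of 0.01 with continuous (non-staircase) decay. The function names are mine.

```python
import math


def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the largest logit keeps exp() numerically stable.
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_cross_entropy(logits, label, scale=100.0):
    """Softmax cross-entropy for one example, multiplied by 100 as in model_fn."""
    return -math.log(softmax(logits)[label]) * scale


def learning_rate(step, base_lr=0.01, floor=0.0001, decay_steps=800):
    """floor + base_lr * (1/e) ** (step / decay_steps)."""
    return floor + base_lr * math.exp(-step / decay_steps)


# An undecided network (equal logits) pays log(2) * 100 per example.
print(scaled_cross_entropy([0.0, 0.0], label=0))
# The learning rate decays from 0.0101 towards the 0.0001 floor.
for step in (0, 800, 8000):
    print(step, learning_rate(step))
```

Every 800 steps the decaying part of the learning rate shrinks by a factor of e, while the 0.0001 floor keeps training from stalling completely.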

Load Training and Test Data

First, download the data from Kaggle and extract the label from each file name. Each image filename follows the format {label}__{scene id}__{longitude}_{latitude}.png, where the label is 1 or 0, representing the “plane” and “no-plane” class, respectively.

import os
from os import listdir
from os.path import isfile, join

import tensorflow as tf

DATA_DIR = "/home/manu/PycharmProjects/DataSet/planesnet/planesnet"

image_files = [join(DATA_DIR, f) for f in listdir(DATA_DIR) if isfile(join(DATA_DIR, f))]

TEST_SIZE = 5000

test_file = image_files[:TEST_SIZE]
train_file = image_files[TEST_SIZE:]

def load_dataset(filenames):
    labels = list(map(lambda filename: int(os.path.basename(filename)[0:1] == '1'), filenames))
    return tf.data.Dataset.from_tensor_slices((tf.constant(filenames), tf.constant(labels)))

If all of your input data fits in memory, the simplest way to create a Dataset from it is to convert it to Tensor objects and use Dataset.from_tensor_slices(). You can process large datasets that do not fit in memory by converting them to the TFRecord format.

The function uses tf.data.Dataset.from_tensor_slices to create a tf.data.Dataset representing slices of its inputs. The inputs are sliced across the first dimension. Here they are the list of filenames and the list of labels, so with 27000 training files, from_tensor_slices returns a Dataset object containing 27000 slices, each one a (filename, label) pair. The 20×20 images themselves are only read and decoded later, in the input function.
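As a rough analogy, slicing two equally long lists across the first dimension behaves like pairing them up element by element. The sketch below mimics that in plain Python. It is only an illustration of the pairing, not how tf.data works internally, and the function name and example filenames are made up.

```python
import os


def label_from_filename(filename):
    """Same rule as load_dataset: the first character of the name is the label."""
    return int(os.path.basename(filename)[0:1] == '1')


def slice_pairs(filenames, labels):
    """Plain-Python stand-in for slicing (filenames, labels) on the first axis."""
    if len(filenames) != len(labels):
        raise ValueError("filenames and labels must have the same length")
    return list(zip(filenames, labels))


# Hypothetical filenames that follow the dataset's naming format.
filenames = [
    "/data/1__sceneA__-118.43_33.93.png",
    "/data/0__sceneB__-122.35_37.75.png",
    "/data/0__sceneC__-118.40_33.94.png",
]
labels = [label_from_filename(f) for f in filenames]

for filename, label in slice_pairs(filenames, labels):
    print(label, os.path.basename(filename))
```

Each element of the result plays the role of one slice of the Dataset: one filename together with its label.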

Input Function

The training input function defines how your data goes into the model, and I use the Dataset API for it. That’s a good fit because the Dataset API is designed for out-of-memory datasets. You define what your dataset is, and then, as your model trains, the data is loaded progressively: consuming elements triggers the loading of additional files from disk.

def load(filename, label):
    return tf.read_file(filename), label

def decode(img_bytes, label):
    img_decoded = tf.image.decode_image(img_bytes, channels=3)
    return img_decoded, label

def features_and_labels(dataset):
    it = dataset.make_one_shot_iterator()
    images, labels = it.get_next()
    features = {'image': images}
    return features, labels

def dataset_input_fn(dataset):
    dataset = dataset.map(load)
    dataset = dataset.map(decode)
    dataset = dataset.shuffle(20)
    dataset = dataset.batch(1)
    dataset = dataset.repeat()
    return features_and_labels(dataset)

def dataset_eval_input_fn(dataset):
    dataset = dataset.map(load)
    dataset = dataset.map(decode)
    dataset = dataset.batch(TEST_SIZE)
    return features_and_labels(dataset)

I’m reading images here, so the dataset is initialized from files. I apply some loading operations, which load those files and decompress them. I then shuffle the data and group it into batches, because training always processes data in batches, and I repeat the dataset indefinitely. With that, my dataset is done.
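To make the shuffle and batch steps less of a black box, here is a small plain-Python imitation of a buffered shuffle followed by batching. It is my own simplified model of the behaviour, not TensorFlow code: a buffer of fixed size is filled first, and each new element then replaces a randomly chosen one, which is handed on. All names are hypothetical.

```python
import random
from itertools import islice


def shuffle_buffer(items, buffer_size, rng):
    """Yield items in a randomized order using a fixed-size buffer."""
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)      # still filling the buffer
            continue
        index = rng.randrange(buffer_size)
        yield buffer[index]          # hand out a random buffered item
        buffer[index] = item         # and replace it with the new one
    rng.shuffle(buffer)
    yield from buffer                # drain what is left at the end


def batch(items, batch_size):
    """Group consecutive items into lists of at most batch_size."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


rng = random.Random(0)
batches = list(batch(shuffle_buffer(range(10), buffer_size=4, rng=rng), batch_size=3))
print(batches)
```

One thing this makes visible is that an element can only move as far as the buffer allows: the first element handed out always comes from the first buffer_size inputs. With a small buffer and many files, the order therefore stays close to the original listing.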

Create Estimators

An Estimator is TensorFlow’s high-level representation of a complete model. It handles the details of initialization, logging, saving and restoring, and many other features so you can concentrate on your model.

training_config = tf.contrib.learn.RunConfig(save_checkpoints_secs=None, save_checkpoints_steps=500)

estimator = tf.estimator.Estimator(model_fn=model.model_fn, model_dir="/tmp/cnn_data", config=training_config)

I advise you to wrap your model in the Estimator API. In an Estimator, TensorFlow has written for you a ton of boilerplate code that is not interesting to write. It regularly outputs checkpoints, so that if your training crashes after 24 hours you can restart from where you were, and it exports the model at the end, so that you have something ready to deploy to a serving infrastructure. The distribution algorithms for distributed training are also baked into the Estimator.

Train the model

Train the model by calling the Estimator’s train method as follows:

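dataset_train = load_dataset(train_file)
dataset_test = load_dataset(test_file)

# Log the softmax probabilities every 50 training iterations
tensors_to_log = {"probabilities": "softmax_tensor"}
logging_hook = tf.train.LoggingTensorHook(
    tensors=tensors_to_log, every_n_iter=50)

# The training dataset repeats indefinitely, so pass steps or max_steps to bound the run
estimator.train(input_fn=lambda: dataset_input_fn(dataset_train),
                hooks=[logging_hook])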

Here we wrap up our input_fn in a lambda to capture the arguments.
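The pattern itself is plain Python and can be tried without TensorFlow. In this sketch, fake_input_fn and call_input_fn are hypothetical stand-ins for dataset_input_fn and for the caller that invokes input_fn with no arguments. functools.partial is shown as an equivalent alternative to the lambda.

```python
from functools import partial


def fake_input_fn(dataset):
    """Hypothetical stand-in for dataset_input_fn: it needs one argument."""
    features = {"image": list(dataset)}
    labels = [0] * len(dataset)
    return features, labels


def call_input_fn(input_fn):
    """Mimics the caller's side: input_fn is invoked without arguments."""
    return input_fn()


dataset_train = ["a.png", "b.png"]

# Both wrappers bind dataset_train now and defer the call until later.
via_lambda = call_input_fn(lambda: fake_input_fn(dataset_train))
via_partial = call_input_fn(partial(fake_input_fn, dataset_train))

print(via_lambda == via_partial)
```

Passing fake_input_fn(dataset_train) directly would run the function immediately and hand over its result instead of a callable, which is why the wrapper is needed.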

Evaluate the trained model

The final step is to evaluate the trained model. We can get some statistics on its performance. The following code block evaluates the accuracy of the trained model on the test data:

eval_results = estimator.evaluate(input_fn=lambda: dataset_eval_input_fn(dataset_test))
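Since model_fn registers its metric under the name 'accuracy', that is the key I would expect to find in eval_results. When reading the number, it helps to know what accuracy means and what a trivial classifier would score. The sketch below is plain Python with made-up predictions. The baseline uses the class counts of the full dataset and is only indicative, because the class mix of the 5000 test files depends on how the files were listed.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Class counts of the full dataset, as described at the top of the tutorial.
PLANE_IMAGES = 8000
NO_PLANE_IMAGES = 24000

# Accuracy of a model that always answers "no-plane" on the full dataset.
baseline = NO_PLANE_IMAGES / (PLANE_IMAGES + NO_PLANE_IMAGES)

# Made-up labels and predictions, just to exercise the function.
labels = [1, 0, 0, 0, 1, 0, 0, 0]
predictions = [1, 0, 0, 1, 0, 0, 0, 0]

print(accuracy(labels, predictions))  # 0.75
print(baseline)                       # 0.75
```

A trained model should be judged against that baseline: on data with this class mix, an accuracy near 0.75 would be no better than always predicting “no-plane”.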

Download this project from GitHub
