TensorFlow provides a high-level API that makes it easy to build a neural network. The layers module enables you to build fully connected and convolutional layers, add activation functions, and apply dropout regularization and batch normalization.
The tf.data API enables you to build input pipelines. For example, the pipeline for an image model might aggregate data from files in a distributed file system, apply random perturbations to each image, and merge randomly selected images into a batch for training.
Estimators encapsulate training, evaluation, prediction, and export for serving. In this tutorial, you’ll learn how to use the TensorFlow high-level API to build a convolutional neural network that classifies images from the Planes in Satellite Imagery dataset as plane or not plane.
Dataset
Kaggle’s Planes in Satellite Imagery dataset consists of 8000 color images of planes and 24000 color images without planes, which we use to train and test the model. It is provided as a zipped directory, planesnet.zip, that contains the entire dataset as 20×20 RGB .png images. Each image filename follows a specific format: {label}__{scene id}__{longitude}_{latitude}.png. The label is 1 or 0, representing the “plane” and “no-plane” class, respectively.
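To make the filename format concrete, here is a minimal sketch that splits such a name into its four fields. The helper name parse_planesnet_name and the example file name are my own illustrations, not part of the dataset tooling; the sketch only assumes the pattern described above.

```python
import os


def parse_planesnet_name(path):
    """Split a PlanesNet-style file name into (label, scene id, longitude, latitude)."""
    stem, _ = os.path.splitext(os.path.basename(path))
    # The three top-level fields are separated by a double underscore.
    label, scene_id, coords = stem.split("__")
    # Longitude and latitude are separated by a single underscore.
    longitude, latitude = coords.split("_")
    return int(label), scene_id, float(longitude), float(latitude)


# Hypothetical file name that follows the documented pattern.
example = "1__20160714_165520_0c59__-118.4316_33.9376.png"
fields = parse_planesnet_name(example)
```

For classification only the first field matters, which is why the loading code later looks at nothing but the first character of the name.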
Build Convolutional Neural Networks Model
The first step in creating a model function is to write branching code that implements prediction, evaluation, and training. The model function is invoked whenever someone calls the Estimator’s train, evaluate, or predict methods.
This is what we will build: three convolutional layers that take a 20x20x3 image as input. We need to connect them to our softmax layer, which classifies the image as airplane or non-airplane. To do that, we reshape the last data cube, flattening it out into one big vector. We apply normal dense layers to this vector and end up with our softmax activation and a softmax cross-entropy loss, because this is a classifier.
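As a reminder of what the last step computes, here is a small plain-Python sketch of softmax followed by cross-entropy for a two-class output. It illustrates the math only and is not the TensorFlow implementation; the logit values are made up.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot, logits):
    """Negative log-probability assigned to the true class."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


logits = [0.5, 2.0]  # hypothetical scores for [no-plane, plane]
probs = softmax(logits)
loss_if_plane = cross_entropy([0.0, 1.0], logits)
loss_if_no_plane = cross_entropy([1.0, 0.0], logits)
```

The loss is small when the model puts most of its probability on the correct class and grows as that probability shrinks, which is what makes it a suitable training signal for a classifier.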
import math

import tensorflow as tf


def model_fn(features, labels, mode):
    is_training = (mode == tf.estimator.ModeKeys.TRAIN)

    def learn_rate(lr, step):
        # Exponential decay of the learning rate towards a floor of 0.0001.
        return 0.0001 + tf.train.exponential_decay(lr, step, 800, 1 / math.e)

    # Input layer: scale pixel values to [0, 1].
    input_layer = tf.reshape(features["image"], [-1, 20, 20, 3])
    input_layer = tf.to_float(input_layer) / 255.0
    Y_ = labels

    # Layer 1: 4x4 kernel, stride 1, "same" padding, 16 filters -> 20x20x16
    conv1 = tf.layers.conv2d(input_layer, filters=16, kernel_size=[4, 4], strides=1,
                             padding="same", activation=None, use_bias=False)
    batch_norm1 = tf.layers.batch_normalization(conv1, axis=-1, momentum=0.993, epsilon=1e-5,
                                                center=True, scale=False, training=is_training)
    relu1 = tf.nn.relu(batch_norm1)

    # Layer 2: 3x3 kernel, stride 2, 32 filters -> 10x10x32
    conv2 = tf.layers.conv2d(relu1, filters=32, kernel_size=[3, 3], strides=2,
                             padding="same", activation=None, use_bias=False)
    batch_norm2 = tf.layers.batch_normalization(conv2, axis=-1, momentum=0.993, epsilon=1e-5,
                                                center=True, scale=False, training=is_training)
    relu2 = tf.nn.relu(batch_norm2)

    # Layer 3: 2x2 kernel, stride 2, 64 filters -> 5x5x64
    conv3 = tf.layers.conv2d(relu2, filters=64, kernel_size=[2, 2], strides=2,
                             padding="same", activation=None, use_bias=False)
    batch_norm3 = tf.layers.batch_normalization(conv3, axis=-1, momentum=0.993, epsilon=1e-5,
                                                center=True, scale=False, training=is_training)
    relu3 = tf.nn.relu(batch_norm3)

    # Flatten all values for the fully connected layer (5 x 5 x 64 = 1600 values).
    relu3_flat = tf.reshape(relu3, [-1, 4 * 16 * 5 * 5])

    # Dense layer
    dense = tf.layers.dense(relu3_flat, 80, activation=None, use_bias=False)
    batch_norm_dense = tf.layers.batch_normalization(dense, axis=-1, momentum=0.993, epsilon=1e-5,
                                                     center=True, scale=False, training=is_training)
    relu_dense = tf.nn.relu(batch_norm_dense)

    # Logits layer: one score per class
    Ylogits = tf.layers.dense(relu_dense, 2)

    predictions = {
        "classes": tf.argmax(input=Ylogits, axis=1),
        "probabilities": tf.nn.softmax(Ylogits, name="softmax_tensor")
    }
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    predict = tf.nn.softmax(Ylogits)
    classes = tf.cast(tf.argmax(predict, 1), tf.uint8)

    # Calculate the loss for TRAIN and EVAL modes
    loss = tf.reduce_mean(tf.losses.softmax_cross_entropy(tf.one_hot(Y_, 2), Ylogits)) * 100

    # Configure the training op for TRAIN mode
    if mode == tf.estimator.ModeKeys.TRAIN:
        train_op = tf.contrib.layers.optimize_loss(loss, tf.train.get_global_step(),
                                                   learning_rate=0.01, optimizer="Adam",
                                                   learning_rate_decay_fn=learn_rate)
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # Add evaluation metrics for EVAL mode
    eval_metrics = {'accuracy': tf.metrics.accuracy(labels=Y_, predictions=classes)}
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metrics)
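The flattened size used in the reshape can be checked by hand. With "same" padding, a convolution with stride s turns a side of length n into ceil(n / s). The sketch below applies that rule to the three layers, using the strides and filter counts from the model code; the helper name is my own.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size of a convolution with "same" padding."""
    return math.ceil(size / stride)


# (filters, stride) for the three convolutional layers of the model.
layers = [(16, 1), (32, 2), (64, 2)]

size = 20  # the input images are 20x20
shapes = []
for filters, stride in layers:
    size = same_padding_output(size, stride)
    shapes.append((size, size, filters))

height, width, channels = shapes[-1]
flat_size = height * width * channels
```

The last cube is 5x5x64, so flattening it gives 1600 values, the same number as the 4 * 16 * 5 * 5 written in the reshape.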
The model function defines your layers. It returns the predictions and the loss. You pass the loss to an optimizer, which gives you the training operation that the Estimator runs in a loop, and you add whatever evaluation metrics you care about.
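The learning rate handed to the optimizer is not constant. Assuming the default, non-staircase behaviour of tf.train.exponential_decay, the learn_rate helper in the model evaluates to 0.0001 + lr * (1/e)^(step / 800). Here is a plain-Python sketch of that formula; the parameter names decay_steps, decay_rate and floor are mine.

```python
import math


def learn_rate(lr, step, decay_steps=800, decay_rate=1 / math.e, floor=0.0001):
    """Exponentially decayed learning rate that never drops below `floor`."""
    return floor + lr * decay_rate ** (step / decay_steps)


# Learning rate at a few points of the 10000-step training run.
rates = {step: learn_rate(0.01, step) for step in (0, 800, 4000, 10000)}
```

The rate starts at 0.0101, falls by a factor of e in its decaying part every 800 steps, and levels off just above 0.0001 towards the end of training.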
Load Training and Test Data
First, download the data from Kaggle and extract the label from each file name. Each filename follows the format {label}__{scene id}__{longitude}_{latitude}.png, so its first character is the label: 1 or 0, representing the “plane” and “no-plane” class, respectively.
import os
from os import listdir
from os.path import isfile, join

import tensorflow as tf

DATA_DIR = "/home/manu/PycharmProjects/DataSet/planesnet/planesnet"
image_files = [join(DATA_DIR, f) for f in listdir(DATA_DIR) if isfile(join(DATA_DIR, f))]

# Hold out the first 5000 files for testing.
TEST_SIZE = 5000
test_file = image_files[:TEST_SIZE]
train_file = image_files[TEST_SIZE:]


def load_dataset(filenames):
    # The first character of the name is the label.
    labels = [int(os.path.basename(f)[0:1] == '1') for f in filenames]
    return tf.data.Dataset.from_tensor_slices((tf.constant(filenames), tf.constant(labels)))
If all of your input data fits in memory, the simplest way to create a Dataset from it is to convert it to Tensor objects and use Dataset.from_tensor_slices(). You can process large datasets that do not fit in memory by converting them to TFRecord files.
The function uses tf.data.Dataset.from_tensor_slices to create a tf.data.Dataset representing slices of the tensors it is given. The tensors are sliced across the first dimension. Here the input is a vector of file names and a vector of labels, so for the 27000 training files from_tensor_slices returns a Dataset object containing 27000 slices, each one a (filename, label) pair. The 20×20 images themselves are read later, in the input function.
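Slicing a pair of equally long tensors across the first dimension behaves much like zipping two Python lists. The sketch below is only an analogy in plain Python, not the tf.data implementation, and the file names in it are invented.

```python
import os

# Hypothetical file names; the first character is the label.
filenames = ["/data/1__a__0.1_0.2.png", "/data/0__b__0.3_0.4.png", "/data/0__c__0.5_0.6.png"]

# Same label rule as load_dataset.
labels = [int(os.path.basename(f)[0:1] == "1") for f in filenames]

# One slice per position along the first dimension: a (filename, label) pair.
slices = list(zip(filenames, labels))
```

Each element of the resulting dataset carries one file name and its label, so any later map call receives the two together.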
Input Function
The training input function defines how your data goes into the model, and I use the Dataset API for it. That works well because the Dataset API is designed for datasets that do not fit in memory. You define what your dataset is, and then, as your model trains, consuming the data triggers the loading of additional files from disk.
def load(filename, label):
    # Read the raw bytes of one image file.
    return tf.read_file(filename), label


def decode(img_bytes, label):
    # Decode the bytes into a 3-channel image tensor.
    img_decoded = tf.image.decode_image(img_bytes, channels=3)
    return img_decoded, label


def features_and_labels(dataset):
    it = dataset.make_one_shot_iterator()
    images, labels = it.get_next()
    features = {'image': images}
    return features, labels


def dataset_input_fn(dataset):
    dataset = dataset.map(load)
    dataset = dataset.map(decode)
    dataset = dataset.shuffle(20)
    dataset = dataset.batch(1)  # one image per training step
    dataset = dataset.repeat()
    return features_and_labels(dataset)


def dataset_eval_input_fn(dataset):
    dataset = dataset.map(load)
    dataset = dataset.map(decode)
    dataset = dataset.batch(TEST_SIZE)  # evaluate the whole test set in one batch
    return features_and_labels(dataset)
I’m reading images here, so the dataset is initialized from file names. I apply some loading operations, which read those files and decode them. I then shuffle the data and group it into batches, because training always proceeds batch by batch, and I repeat the dataset indefinitely. With that, my dataset is done.
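To see what shuffle, batch and repeat do to a stream of elements, here is a rough plain-Python imitation built from generators. It is a sketch of the idea under my own simplifications, not the tf.data implementation; the function names and the toy input range(10) are assumptions.

```python
import itertools
import random


def shuffle_buffer(items, buffer_size, rng):
    """Locally shuffle a stream using a fixed-size buffer."""
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new item in its place.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # The input is exhausted: drain the buffer in random order.
    rng.shuffle(buffer)
    yield from buffer


def batch(items, batch_size):
    """Group consecutive elements into lists of at most batch_size."""
    it = iter(items)
    while True:
        chunk = list(itertools.islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def repeat(make_epoch):
    """Restart the pipeline forever, one epoch after another."""
    while True:
        yield from make_epoch()


rng = random.Random(0)


def one_epoch():
    return batch(shuffle_buffer(range(10), buffer_size=4, rng=rng), batch_size=3)


stream = repeat(one_epoch)
first_epoch = list(itertools.islice(stream, 4))
second_epoch = list(itertools.islice(stream, 4))
```

A small buffer only mixes neighbouring elements, so with shuffle(20) the training order stays close to the file order. A buffer as large as the dataset would give a full shuffle.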
Create Estimators
An Estimator is TensorFlow’s high-level representation of a complete model. It handles the details of initialization, logging, saving and restoring, and many other features so you can concentrate on your model.
# Save a checkpoint every 500 steps.
training_config = tf.contrib.learn.RunConfig(save_checkpoints_secs=None,
                                             save_checkpoints_steps=500)
estimator = tf.estimator.Estimator(model_fn=model.model_fn,
                                   model_dir="/tmp/cnn_data",
                                   config=training_config)
I advise you to wrap your model in the Estimator API, because TensorFlow has already written a ton of boilerplate code for you that is not interesting to write. It regularly writes checkpoints, so if your training crashes after 24 hours you can restart from where you were. It exports the model at the end, so that you have something ready to deploy to a serving infrastructure. The algorithms for distributed training are also baked into the Estimator.
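The value of periodic checkpoints is easiest to see in a toy loop. The sketch below is my own illustration of the idea and has nothing to do with how the Estimator stores its checkpoints: it saves a small state dictionary to a JSON file every 500 steps, simulates a crash, and then resumes from the last saved state.

```python
import json
import os
import tempfile


def train(model_dir, total_steps, save_every=500, crash_at=None):
    """Toy training loop that checkpoints its state and resumes from it."""
    path = os.path.join(model_dir, "checkpoint.json")
    state = {"step": 0, "weight_updates": 0}
    if os.path.exists(path):
        # Resume from the most recent checkpoint instead of starting over.
        with open(path) as f:
            state = json.load(f)
    while state["step"] < total_steps:
        if state["step"] == crash_at:
            raise RuntimeError("simulated crash")
        state["step"] += 1
        state["weight_updates"] += 1  # stand-in for a real optimizer step
        if state["step"] % save_every == 0:
            with open(path, "w") as f:
                json.dump(state, f)
    return state


model_dir = tempfile.mkdtemp()
try:
    train(model_dir, total_steps=2000, crash_at=1200)
except RuntimeError:
    pass

with open(os.path.join(model_dir, "checkpoint.json")) as f:
    saved_step_after_crash = json.load(f)["step"]

resumed = train(model_dir, total_steps=2000)
```

The crash at step 1200 costs only the 200 steps since the last checkpoint. The second call picks up at step 1000 and finishes the run.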
Train the model
Train the model by calling the Estimator’s train method as follows:
dataset_train = load_dataset(train_file)
dataset_test = load_dataset(test_file)

# Log the softmax probabilities every 50 steps.
tensors_to_log = {"probabilities": "softmax_tensor"}
logging_hook = tf.train.LoggingTensorHook(tensors=tensors_to_log, every_n_iter=50)

estimator.train(input_fn=lambda: dataset_input_fn(dataset_train),
                steps=10000,
                hooks=[logging_hook])
Here we wrap our input_fn in a lambda to capture the arguments.
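The reason for the lambda is that the Estimator calls input_fn itself and does not know about our dataset argument. A minimal plain-Python sketch of the pattern follows. The stand-in functions are simplified placeholders of my own, and functools.partial is shown only as an equivalent way to bind the argument.

```python
import functools


def dataset_input_fn(dataset):
    """Stand-in for the real input function: returns (features, labels)."""
    return {"image": dataset}, [0] * len(dataset)


def call_without_arguments(input_fn):
    """Mimics a caller that invokes input_fn without passing our dataset."""
    return input_fn()


dataset_train = ["img0", "img1"]  # placeholder for a tf.data.Dataset

# Both callables take no arguments and remember dataset_train.
via_lambda = call_without_arguments(lambda: dataset_input_fn(dataset_train))
via_partial = call_without_arguments(functools.partial(dataset_input_fn, dataset_train))
```

Either form defers the call until the training loop asks for data, and the input function is still invoked with the dataset we chose.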
Evaluate the trained model
The final step is to evaluate the trained model. We can get some statistics on its performance. The following code block evaluates the accuracy of the trained model on the test data:
eval_results = estimator.evaluate(input_fn=lambda: dataset_eval_input_fn(dataset_test))
print(eval_results)
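The accuracy reported by the evaluation is simply the fraction of test images whose predicted class matches the label. A plain-Python sketch of that metric, with made-up labels and predictions:

```python
def accuracy(labels, predictions):
    """Fraction of predictions that equal the corresponding label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


labels = [1, 0, 0, 1, 0]     # hypothetical ground truth
predicted = [1, 0, 1, 1, 0]  # hypothetical model output
score = accuracy(labels, predicted)
```

Keep in mind that three quarters of this dataset are non-plane images, so a model that always answers "no plane" already scores 0.75. Accuracy should be read against that baseline.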
Download this project from GitHub