Feeding your own data set into the CNN model in TensorFlow

I'm assuming you already know a fair bit about neural networks and convolutional neural networks, as I won't go into too much detail about their background and how they work. I am using TensorFlow as the machine learning framework. In case you are not familiar with TensorFlow, make sure to check out my recent post on getting started with TensorFlow.


The Kaggle Dogs vs. Cats dataset consists of 25,000 color images of dogs and cats that we are supposed to use for training. The images come in different sizes, with pixel intensities represented as [0, 255] integer values in RGB color space.


Before you run the training script for the first time, you will need to convert the data to the native TFRecord format. The TFRecord format consists of a set of sharded files where each entry is a serialized tf.Example proto. Each tf.Example proto contains the image (JPEG encoded) as well as metadata such as the label, height, width, and number of channels. Google provides a single script for converting image data to TFRecord format.
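To make the format concrete, here is a minimal sketch of how a single JPEG image and its metadata could be wrapped in a tf.Example and written to a shard file. The feature keys, file names, and sizes below are illustrative assumptions, not necessarily what the Google conversion script uses:

```python
import tensorflow as tf

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def image_to_example(jpeg_bytes, label, height, width, channels):
    # Wrap the JPEG-encoded image plus its metadata in a tf.Example proto.
    return tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': _bytes_feature(jpeg_bytes),
        'image/class/label': _int64_feature(label),
        'image/height': _int64_feature(height),
        'image/width': _int64_feature(width),
        'image/channels': _int64_feature(channels),
    }))

# Write one serialized example into a training shard (values are illustrative).
with tf.python_io.TFRecordWriter('train-00000-of-00002') as writer:
    jpeg_bytes = open('dog.1.jpg', 'rb').read()
    example = image_to_example(jpeg_bytes, label=1, height=375, width=500, channels=3)
    writer.write(example.SerializeToString())
```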

When the script finishes, you will find two shards each for the training and validation data in DATA_DIR. The files will match the patterns train-?????-of-00002 and validation-?????-of-00002, respectively.

Convolution neural network architecture

A ConvNet is a sequence of layers, and every layer of a ConvNet transforms one volume of activations to another through a differentiable function. We use three main types of layers to build ConvNet architectures: the Convolutional Layer, the Pooling Layer, and the Fully-Connected Layer. We will stack these layers to form a full ConvNet architecture.

Building the CNN for Image Classifier

You need to know the building blocks before you can assemble a full convolutional neural network. Let's look at an example: say you're inputting a 252x252x3 RGB image and trying to recognize either a dog or a cat. Let's build a neural network to do this.
The network used in this post is inspired by, and quite similar to, one of the classic neural networks, LeNet-5. What is shown here isn't exactly LeNet-5, but many of the parameter choices were inspired by it.
Start with the 252x252x3 input image. Say the first layer uses 32 5x5 filters with a stride of 1 and same padding, so the output of this layer has the same height and width as the input: 252x252x32. Call this layer conv1. Next, let's apply a pooling layer. I'm going to apply max pooling here with a 2x2 filter and a stride of 2. This reduces the height and width of the representation by a factor of 2, so 252x252x32 becomes 126x126x32; the number of channels remains the same. We are going to call this layer max pooling 1.
Next, given the 126x126x32 volume, let's apply another convolution layer, this time with a 5x5 filter, a stride of 1, same padding, and 64 filters, so you end up with a 126x126x64 volume; call it conv2. Then let's do max pooling again with a 2x2 filter and a stride of 2, which halves the height and width of the 126x126x64 volume to 63x63x64.
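Here is a sketch of those four layers using the tf.layers API, assuming input_layer is the [batch_size, 252, 252, 3] image batch; the ReLU activations are an assumption not spelled out above:

```python
# conv1: 32 5x5 filters, stride 1, same padding -> [batch, 252, 252, 32]
conv1 = tf.layers.conv2d(
    inputs=input_layer, filters=32, kernel_size=[5, 5],
    padding="same", activation=tf.nn.relu)

# max pooling 1: 2x2 filter, stride 2 -> [batch, 126, 126, 32]
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

# conv2: 64 5x5 filters, stride 1, same padding -> [batch, 126, 126, 64]
conv2 = tf.layers.conv2d(
    inputs=pool1, filters=64, kernel_size=[5, 5],
    padding="same", activation=tf.nn.relu)

# max pooling 2: 2x2 filter, stride 2 -> [batch, 63, 63, 64]
pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)
```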

Dense Layer

Next, we want to add a dense layer (with 1,024 neurons and ReLU activation) to our CNN to perform classification on the features extracted by the convolution/pooling layers. Before we connect the layer, we’ll flatten our feature map (max pooling 2) to shape [batch_size, features], so that our tensor has only two dimensions:
63x63x64 = 254,016, so let's flatten the output into a 254016x1-dimensional vector; we can also think of this flattened result as just a set of neurons. We then take these 254,016 units and build the next layer with 1,024 units. This is actually our first fully connected layer; I'm going to call it FC2 because we have 254,016 units densely connected to 1,024 units. This fully connected layer is just like a standard neural network layer: you have a weight matrix, call it W3, of dimension 1024x254016, and it is called fully connected because each of the 254,016 units is connected to each of the 1,024 units. You also have a bias parameter that is just 1024-dimensional, because there are 1,024 outputs.
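As a sketch, the flatten and dense steps described above look like this (pool2 is the 63x63x64 output of max pooling 2 from the earlier sketch):

```python
# Flatten the 63x63x64 feature map to [batch_size, 254016].
pool2_flat = tf.reshape(pool2, [-1, 63 * 63 * 64])

# Fully connected layer with 1,024 units and ReLU activation.
dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
```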

Logits Layer

Finally, you now have 1,024 real numbers that you can feed to a softmax unit. If you're classifying images as either dog or cat, this is a softmax with 2 outputs. So this is a reasonably typical example of what a convolutional network looks like.
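A minimal sketch of the logits layer, producing one raw score per class on top of the dense layer above:

```python
# Logits layer: 2 raw scores per example, one for dog and one for cat.
logits = tf.layers.dense(inputs=dense, units=2)
```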

Generate Predictions

The logits layer of our model returns our predictions as raw values in a [batch_size, 2]-dimensional tensor. Let’s convert these raw values into two different formats that our model function can return:

  • The predicted class for each example: dog or cat
  • The probability of each class for each example

Our predicted class is the element in the corresponding row of the logits tensor with the highest raw value. We can find the index of this element using the tf.argmax function:
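For example (logits is the output of the logits layer above):

```python
# The index of the largest logit along axis 1 is the predicted class.
predicted_classes = tf.argmax(input=logits, axis=1)
```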

The input argument specifies the tensor from which to extract maximum values (here, logits). The axis argument specifies the axis of the input tensor along which to find the greatest value. Here, we want to find the largest value along the dimension with index 1, which corresponds to our predictions (recall that our logits tensor has shape [batch_size, 2]).

We can derive probabilities from our logits layer by applying softmax activation using tf.nn.softmax:
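A sketch of the resulting predictions dict, naming the softmax op softmax_tensor so it can be found later by the logging hook; returning an EstimatorSpec in PREDICT mode follows the usual Estimator pattern and is an assumption here:

```python
predictions = {
    # The predicted class for each example (0 or 1).
    "classes": tf.argmax(input=logits, axis=1),
    # Per-class probabilities; the name lets the logging hook find this tensor.
    "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
}
if mode == tf.estimator.ModeKeys.PREDICT:
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
```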

Calculate Loss

For training and evaluation, we need to define a loss function that measures how closely the model’s predictions match the target classes. For classification problems, cross entropy is typically used as the loss metric. The following code calculates cross entropy when the model runs in either TRAIN or EVAL mode:
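A minimal sketch, assuming labels holds the integer class labels supplied by the input function:

```python
# Cross-entropy between the integer labels and the raw logits.
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
```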

Training Operation

We defined the loss for the model as the softmax cross-entropy of the logits layer and our labels. Let's configure our model to optimize this loss value during training. We'll use a learning rate of 0.001 and stochastic gradient descent as the optimization algorithm:
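A sketch of the training op inside cnn_model_fn:

```python
if mode == tf.estimator.ModeKeys.TRAIN:
    # Stochastic gradient descent with a learning rate of 0.001.
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
    train_op = optimizer.minimize(
        loss=loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
```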

Add evaluation metrics

Define eval_metric_ops dict in EVAL mode as follows:
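A sketch, reusing the predictions dict defined earlier:

```python
eval_metric_ops = {
    "accuracy": tf.metrics.accuracy(
        labels=labels, predictions=predictions["classes"])
}
return tf.estimator.EstimatorSpec(
    mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)
```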

Load Training and Test Data

Convert whatever data you have into a TFRecords-supported format. This approach makes it easier to mix and match data sets. The recommended format for TensorFlow is a TFRecords file containing tf.train.Example protocol buffers, which contain Features as a field.

To read a file of TFRecords, use tf.TFRecordReader with the tf.parse_single_example decoder. The parse_single_example op decodes the example protocol buffers into tensors.
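A sketch of such a decoder, assuming the feature keys used in the conversion sketch earlier (the actual keys depend on the conversion script you used):

```python
def parse_record(serialized_example):
    # Parse a single serialized tf.Example into an image tensor and a label.
    features = tf.parse_single_example(
        serialized_example,
        features={
            'image/encoded': tf.FixedLenFeature([], tf.string),
            'image/class/label': tf.FixedLenFeature([], tf.int64),
        })
    image = tf.image.decode_jpeg(features['image/encoded'], channels=3)
    label = tf.cast(features['image/class/label'], tf.int32)
    return image, label
```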

Train a model with a different image size

The simplest solution is to artificially resize your images to 252x252 pixels. See the Images section of the TensorFlow documentation for the available resizing, cropping, and padding methods. Note that the entire model architecture is predicated on a 252x252 input image, so if you wish to change the input image size, you may need to redesign the entire model architecture.
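For instance, resizing each decoded image to the expected input size might look like this; the [0, 1] scaling is an optional assumption, not something the post prescribes:

```python
# Resize to the 252x252 input size the model expects (result is float32).
image = tf.image.resize_images(image, [252, 252])
# Optionally scale pixel values from [0, 255] to [0, 1].
image = image / 255.0
```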

Fused decode and crop

If inputs are JPEG images that also require cropping, use fused tf.image.decode_and_crop_jpeg to speed up preprocessing. tf.image.decode_and_crop_jpeg only decodes the part of the image within the crop window. This significantly speeds up the process if the crop window is much smaller than the full image. For image data, this approach could speed up the input pipeline by up to 30%.
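A sketch of the fused call; image_bytes is the raw JPEG string from the tf.Example, and offset_y and offset_x are hypothetical crop offsets you would compute yourself:

```python
# crop_window is [crop_y, crop_x, crop_height, crop_width].
crop_window = tf.stack([offset_y, offset_x, 252, 252])
# Decode only the pixels inside the crop window instead of the whole JPEG.
image = tf.image.decode_and_crop_jpeg(image_bytes, crop_window, channels=3)
```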

Create input functions

You must create input functions to supply data for training, evaluating, and prediction. An input function is a function that returns the following two-element tuple:

  • “features” – A Python dictionary in which:
    • Each key is the name of a feature.
    • Each value is an array containing all of that feature’s values.
  • “label” – An array containing the values of the label for every example.

The Dataset API can handle a lot of common cases for you. Using the Dataset API, you can easily read in records from a large collection of files in parallel and join them into a single stream.
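Putting those pieces together, a training input function built on the Dataset API might look like the sketch below; the feature key "x", the batch size, and the shuffle buffer are assumptions, and parse_record is the decoder sketched earlier:

```python
def train_input_fn(data_dir, batch_size=100):
    # Read the training shards written by the conversion step.
    files = tf.data.Dataset.list_files(data_dir + '/train-?????-of-00002')
    dataset = tf.data.TFRecordDataset(files)
    # Decode each record, then resize to the model's 252x252 input size.
    dataset = dataset.map(parse_record)
    dataset = dataset.map(
        lambda image, label: (tf.image.resize_images(image, [252, 252]), label))
    dataset = dataset.shuffle(buffer_size=1000).repeat().batch(batch_size)
    images, labels = dataset.make_one_shot_iterator().get_next()
    # Return the ("features", "label") tuple the Estimator expects.
    return {'x': images}, labels
```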

Create the Estimator

Next, let's create an Estimator (a TensorFlow class for performing high-level model training, evaluation, and inference) for our model. Add the following code to main():
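A sketch, using the variable name the rest of the post refers to:

```python
# Create the Estimator, saving checkpoints under /tmp/convnet_model.
mnist_classifier = tf.estimator.Estimator(
    model_fn=cnn_model_fn, model_dir="/tmp/convnet_model")
```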

The model_fn argument specifies the model function to use for training, evaluation, and prediction; we pass it the cnn_model_fn that we have created. The model_dir argument specifies the directory where model data (checkpoints) will be saved (here, we specify the temp directory /tmp/convnet_model, but feel free to change it to another directory of your choice).

Set Up a Logging Hook

CNNs can take a while to train, so let's set up some logging to track progress during training. We can use TensorFlow's tf.train.SessionRunHook to create a tf.train.LoggingTensorHook that will log the probability values from the softmax layer of our CNN. Add the following to main():
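A sketch of the hook setup:

```python
# Log the softmax probabilities (by tensor name) every 50 training steps.
tensors_to_log = {"probabilities": "softmax_tensor"}
logging_hook = tf.train.LoggingTensorHook(
    tensors=tensors_to_log, every_n_iter=50)
```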

We store a dict of the tensors we want to log in tensors_to_log. Each key is a label of our choice that will be printed in the log output, and the corresponding value is the name of a Tensor in the TensorFlow graph. Here, our probabilities can be found in softmax_tensor, the name we gave our softmax operation earlier when we generated the probabilities in cnn_model_fn.

Next, we create the LoggingTensorHook, passing tensors_to_log to the tensors argument. We set every_n_iter=50, which specifies that probabilities should be logged after every 50 steps of training.

Train the Model

Now we're ready to train our model, which we can do by creating train_input_fn and calling train() on mnist_classifier. Add the following to main():
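A sketch of the training call; the step count and the lambda wrapping of the input function sketched earlier are assumptions:

```python
# Train the model, logging probabilities via the hook defined above.
mnist_classifier.train(
    input_fn=lambda: train_input_fn(DATA_DIR),
    steps=20000,
    hooks=[logging_hook])
```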

Evaluate the Model

Once training is complete, we want to evaluate our model to determine its accuracy on the test set. We call the evaluate method, which evaluates the metrics we specified in the eval_metric_ops argument in cnn_model_fn. Add the following to main():
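A sketch of the evaluation call; eval_input_fn is a hypothetical input function that reads the validation shards:

```python
# Evaluate on the validation data and print the accuracy and loss.
eval_results = mnist_classifier.evaluate(
    input_fn=lambda: eval_input_fn(DATA_DIR))
print(eval_results)
```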

Run the Model

We've coded the CNN model function, the Estimator, and the training/evaluation logic; now let's run the Python script.

Training CNNs is quite computationally intensive, and the estimated completion time of the Python script will vary depending on your processor. To train more quickly, you can decrease the number of steps passed to train(), but note that this will affect accuracy.

Download this project from GitHub

