Feeding your own data set into the CNN model in TensorFlow

I’m assuming you already know a fair bit about Neural Networks and Convolutional Neural Networks, as I won’t go into too much detail about their background and how they work. I am using TensorFlow as the Machine Learning framework. In case you are not familiar with TensorFlow, make sure to check out my recent post on getting started with TensorFlow.


The Kaggle Dog vs Cat dataset consists of 25,000 color images of dogs and cats that we are supposed to use for training. Each image is a different size, with pixel intensities represented as [0, 255] integer values in RGB color space.


Before you run the training script for the first time, you will need to convert the data to the native TFRecord format. The TFRecord format consists of a set of sharded files where each entry is a serialized tf.Example proto. Each tf.Example proto contains the image (JPEG encoded) as well as metadata such as the label, height, width, and number of channels. Google provides a single script for converting image data to TFRecord format.

When the script finishes you will find 2 shards for the training and validation files in the DATA_DIR. The files will match the patterns train-?????-of-00002 and validation-?????-of-00002, respectively.
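The shard filenames follow a simple numbering convention. A minimal pure-Python sketch (the helper name is hypothetical, not part of Google's script) of how such names are generated:

```python
def shard_filename(split, shard_index, num_shards):
    """Build a sharded filename like 'train-00000-of-00002'."""
    return "%s-%05d-of-%05d" % (split, shard_index, num_shards)

# With 2 shards per split, the training files would be named:
train_files = [shard_filename("train", i, 2) for i in range(2)]
```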

Convolutional neural network architecture

ConvNet is a sequence of layers, and every layer of a ConvNet transforms one volume of activations to another through a differentiable function. We use three main types of layers to build ConvNet architectures: Convolutional Layer, Pooling Layer, and Fully-Connected Layer. We will stack these layers to form a full ConvNet architecture.

Building the CNN for Image Classifier

You need to know the building blocks to build a full convolutional neural network. Let’s look at an example: say you’re inputting a 252x252x3 RGB image and trying to recognize either Dog or Cat. Let’s build a neural network to do this.
What we’re going to use in this post is inspired by, and actually quite similar to, one of the classic neural networks, LeNet-5. What we’ll show here isn’t exactly LeNet-5, but many of the parameter choices were inspired by it.
Convolutional neural network architecture
Given the 252x252x3 input image, let’s say the first layer uses 32 5x5 filters with a stride of 1 and “same” padding, so the output of this layer has the same height and width as the input: 252x252x32. Call this layer conv1. Next, let’s apply a pooling layer; I’m going to apply max pooling here with a 2x2 filter and strides=2. This reduces the height and width of the representation by a factor of 2, so 252x252x32 becomes 126x126x32; the number of channels remains the same. We are going to call this layer max pooling 1.
Next, given the 126x126x32 volume, let’s apply another convolutional layer, this time with 64 5x5 filters and a stride of 1, so you end up with a 126x126x64 volume called conv2. Then let’s do max pooling again with a 2x2 filter and strides of 2, which halves the height and width of the 126x126x64 volume.
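The spatial dimensions above can be checked with the standard output-size formulas. A small pure-Python sketch (illustration only, not part of the training script) that reproduces the numbers in this walkthrough:

```python
def conv_output_size(n, f, stride, padding):
    """Output height/width of a conv layer on an n x n input with an f x f filter."""
    if padding == "same":
        return (n + stride - 1) // stride  # 'same' keeps the size when stride is 1
    return (n - f) // stride + 1           # 'valid' padding

def pool_output_size(n, f, stride):
    """Output height/width of a pooling layer (no padding)."""
    return (n - f) // stride + 1

size = conv_output_size(252, 5, 1, "same")   # conv1: 252
size = pool_output_size(size, 2, 2)          # max pooling 1: 126
size = conv_output_size(size, 5, 1, "same")  # conv2: 126
size = pool_output_size(size, 2, 2)          # max pooling 2: 63
```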

Dense Layer

Next, we want to add a dense layer (with 1,024 neurons and ReLU activation) to our CNN to perform classification on the features extracted by the convolution/pooling layers. Before we connect the layer, we’ll flatten our feature map (max pooling 2) to shape [batch_size, features], so that our tensor has only two dimensions:
63x63x64 = 254,016, so let’s now flatten the output into a 254,016-dimensional vector; you can also think of this as flattening the result into a set of neurons. We then take these 254,016 units and build the next layer with 1,024 units: this is our first fully connected layer, with 254,016 units densely connected to 1,024 units. This fully connected layer is just a standard neural network layer: you have a weight matrix, call it W3, of dimension 1024x254016, and it is called fully connected because each of the 254,016 input units is connected to each of the 1,024 output units. You also have a bias parameter that is 1,024-dimensional, one per output unit.
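These counts are easy to sanity-check in plain Python; a tiny sketch of the parameter count for this fully connected layer:

```python
# Flattened feature map: 63 x 63 x 64 values per example
flat_units = 63 * 63 * 64          # 254016
fc_units = 1024

# Fully connected layer: weight matrix (1024 x 254016) plus a 1024-dim bias
weight_params = fc_units * flat_units
bias_params = fc_units
total_params = weight_params + bias_params
```

This also shows why fully connected layers dominate the parameter count of a network like this.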

Logits Layer

Finally, you now have 1,024 real numbers that you can feed to a softmax unit. If you’re classifying images as either dog or cat, this is a softmax with 2 outputs. This is a reasonably typical example of what a convolutional network looks like.

Generate Predictions

The logits layer of our model returns our predictions as raw values in a [batch_size, 2]-dimensional tensor. Let’s convert these raw values into two different formats that our model function can return:

  • The predicted class for each example: Dog or Cat

Our predicted class is the element in the corresponding row of the logits tensor with the highest raw value. We can find the index of this element using the tf.argmax function:

The input argument specifies the tensor from which to extract maximum values, here logits. The axis argument specifies the axis of the input tensor along which to find the greatest value. Here, we want to find the largest value along the dimension with index 1, which corresponds to our predictions (recall that our logits tensor has shape [batch_size, 2]).
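In plain Python, the argmax over axis 1 of a [batch_size, 2] logits tensor looks like this (toy values, purely for illustration of what tf.argmax computes):

```python
# Toy logits for a batch of 3 examples, shape [3, 2]; column 0 = dog, column 1 = cat
logits = [[2.5, 0.3],
          [0.1, 1.9],
          [4.0, 3.2]]

# tf.argmax(input=logits, axis=1) picks the index of the largest value per row
predicted_classes = [row.index(max(row)) for row in logits]
```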

We can derive probabilities from our logits layer by applying softmax activation using tf.nn.softmax:
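To illustrate what tf.nn.softmax computes, here is a plain-Python sketch of the softmax function itself (not the TensorFlow op):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    shifted = [x - max(logits) for x in logits]  # shift for numerical stability
    exps = [math.exp(x) for x in shifted]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0])  # e.g. dog vs. cat logits for one example
```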

Calculate Loss

For training and evaluation, we need to define a loss function that measures how closely the model’s predictions match the target classes. For classification problems, cross entropy is typically used as the loss metric. The following code calculates cross entropy when the model runs in either TRAIN or EVAL mode:
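Cross entropy for a single example is just the negative log probability the model assigns to the true class; a minimal sketch (illustration only, not the TensorFlow loss op):

```python
import math

def cross_entropy(probs, true_class):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[true_class])

# A confident, correct prediction gives low loss;
# a confident, wrong prediction gives high loss.
low_loss = cross_entropy([0.9, 0.1], 0)
high_loss = cross_entropy([0.9, 0.1], 1)
```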

Training Operation

We defined the loss for the model as the softmax cross-entropy of the logits layer and our labels. Let’s configure our model to optimize this loss value during training. We’ll use a learning rate of 0.001 and stochastic gradient descent as the optimization algorithm:
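A single stochastic gradient descent update just moves each weight against its gradient, scaled by the learning rate. A toy sketch of the update rule itself (not the tf.train optimizer call):

```python
def sgd_step(weights, grads, learning_rate=0.001):
    """One SGD update: w <- w - lr * dL/dw."""
    return [w - learning_rate * g for w, g in zip(weights, grads)]

weights = [0.5, -0.2]
weights = sgd_step(weights, [1.0, -2.0])  # roughly [0.499, -0.198]
```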

Add evaluation metrics

Define eval_metric_ops dict in EVAL mode as follows:

Load Training and Test Data

Convert whatever data you have into a TFRecord-supported format. This approach makes it easier to mix and match data sets. The recommended format for TensorFlow is a TFRecords file containing tf.train.Example protocol buffers, which contain Features as a field.

To read a file of TFRecords, use tf.TFRecordReader with the tf.parse_single_example decoder. The parse_single_example op decodes the example protocol buffers into tensors.

Train a model with a different image size.

The simplest solution is to artificially resize your images to 252x252 pixels. See the Images section for many resizing, cropping, and padding methods. Note that the entire model architecture is predicated on a 252x252 image; thus, if you wish to change the input image size, you may need to redesign the entire model architecture.

Fused decode and crop

If inputs are JPEG images that also require cropping, use fused tf.image.decode_and_crop_jpeg to speed up preprocessing. tf.image.decode_and_crop_jpeg only decodes the part of the image within the crop window. This significantly speeds up the process if the crop window is much smaller than the full image. For image data, this approach could speed up the input pipeline by up to 30%.

Create input functions

You must create input functions to supply data for training, evaluating, and prediction. An input function is a function that returns the following two-element tuple:

  • “features” – A Python dictionary in which:
    • Each key is the name of a feature.
    • Each value is an array containing all of that feature’s values.
  • “label” – An array containing the values of the label for every example.
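A minimal, hypothetical input function in exactly this two-element-tuple shape (toy in-memory data, not the Dataset API version; the feature names are made up for illustration):

```python
def toy_input_fn():
    """Return (features, labels) in the shape Estimators expect."""
    features = {
        "width":  [252, 252, 252],   # one entry per example
        "height": [252, 252, 252],
    }
    labels = [0, 1, 0]               # e.g. 0 = dog, 1 = cat
    return features, labels

features, labels = toy_input_fn()
```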

The Dataset API can handle a lot of common cases for you. Using the Dataset API, you can easily read in records from a large collection of files in parallel and join them into a single stream.

Create the Estimator

Next, let’s create an Estimator, a TensorFlow class for performing high-level model training, evaluation, and inference, for our model. Add the following code to main():

The model_fn argument specifies the model function to use for training, evaluation, and prediction; we pass it the cnn_model_fn that we have created. The model_dir argument specifies the directory where model data (checkpoints) will be saved (here, we specify the temp directory /tmp/convnet_model, but feel free to change it to another directory of your choice).

Set Up a Logging Hook

Since CNNs can take a while to train, let’s set up some logging so we can track progress during training. We can use TensorFlow’s tf.train.SessionRunHook to create a tf.train.LoggingTensorHook that will log the probability values from the softmax layer of our CNN. Add the following to main().

We store a dict of the tensors we want to log in tensors_to_log. Each key is a label of our choice that will be printed in the log output, and the corresponding value is the name of a Tensor in the TensorFlow graph. Here, our probabilities can be found in softmax_tensor, the name we gave our softmax operation earlier when we generated the probabilities in cnn_model_fn.

Next, we create the LoggingTensorHook, passing tensors_to_log to the tensors argument. We set every_n_iter=50, which specifies that probabilities should be logged after every 50 steps of training.

Train the Model

Now we’re ready to train our model, which we can do by creating train_input_fn and calling train() on mnist_classifier. Add the following to main():

Evaluate the Model

Once training is complete, we want to evaluate our model to determine its accuracy on the test set. We call the evaluate method, which evaluates the metrics we specified in the eval_metric_ops argument in cnn_model_fn. Add the following to main():

Run the Model

We’ve coded the CNN model function, Estimator, and the training/evaluation logic; now let’s run the Python script.

Training CNNs is quite computationally intensive. The estimated completion time of the Python script will vary depending on your processor. To train more quickly, you can decrease the number of steps passed to train(), but note that this will affect accuracy.

Download this project from GitHub








Convert a directory of images to TFRecords

In this post, I’ll show you how you can convert the dataset into a TFRecord file so you can fine-tune the model.

Before you run the training script for the first time, you will need to convert the image data to the native TFRecord format. The TFRecord format consists of a set of sharded files where each entry is a serialized tf.Example proto. Each tf.Example proto contains the image as well as metadata such as label and bounding box information.

The TFRecord file format is a simple record-oriented binary format that many TensorFlow applications use for training data. It is the default file format for TensorFlow. Binary files are sometimes easier to use because you don’t have to specify different directories for images and annotations. When you store your data in a binary file, you have your data in one block of memory, compared to storing each image and annotation separately. Opening a file is a considerably time-consuming operation, especially on an HDD. Overall, using binary files makes the data easier to distribute and better aligned for efficient reading.

This native file format allows you to shuffle, batch, and split datasets with TensorFlow’s own functions. Most batch operations aren’t done directly on images; rather, the images are first converted into a single TFRecord file.

Convert images into a TFRecord

Before you start any training, you’ll need a set of images to teach the model about the new classes you want to recognize. When you are working with an image dataset, what is the first thing you do? Split it into training and validation sets.

Here’s an example, which assumes you have a folder containing class-named subfolders, each full of images for each label. The example folder animal_photos should have a structure like this:

The subfolder names are important, since they define what label is applied to each image, but the filenames themselves don’t matter. The label for each image is taken from the name of the subfolder it’s in.

The list of valid labels is held in a label file. The code assumes that the file contains entries like the following:

where each line corresponds to a label. The script maps each label contained in the file to an integer corresponding to the line number, starting from 0.
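That mapping can be sketched in a couple of lines, using an in-memory list in place of the actual label file (labels here are hypothetical):

```python
# Contents of a hypothetical labels file, one label per line
label_lines = ["cat", "dog"]

# Map each label to its line number, starting from 0
label_to_index = {label: i for i, label in enumerate(label_lines)}
```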

Code Organization

The code for this tutorial resides in data/build_image_data.py. Change train_directory (the path containing the training image data), validation_directory (the path containing the validation image data), output_directory (which will contain the TFRecord files after you run the script), and labels_file (the file that holds the list of valid labels).

This TensorFlow script converts the training and evaluation data into a sharded data set consisting of TFRecord files,

where we have selected 1024 and 128 shards for each data set. Each record within the TFRecord file is a serialized Example proto.




Deep learning model for Car Price prediction using TensorFlow

Before AI, image search looked at the metadata to find where the images are; now, with AI, the computer looks at the images themselves. If you search for, say, Tiger, it looks at the images, all of them in the world, and whenever it sees one that has a tiger in it, it returns it. AI is now going to be everywhere, and machine learning is everywhere. What makes this image search work, in reality, is TensorFlow. Once you’ve written one of these models in TensorFlow, you can deploy it anywhere, including mobile. TensorFlow is truly what enables apps to classify images.

TensorFlow gives you distribution out of the box, so you can run it in the cloud if you need to. It works on all of the hardware you need it to work on. It’s fast and it’s flexible, and what I’m going to tell you today is that it’s also super easy to get started.

TensorFlow programming environment

The generic thing people used to say is that TensorFlow is pretty low level: you’re thinking about multiplying matrices, adding vectors together, that kind of thing. What TensorFlow built on top of this is libraries that help you do more complex things more easily. TensorFlow built a library of layers to help you build models, and it built training infrastructure that helps you actually train a model, evaluate a model, and put it into production. This you can do with Keras or with Estimators. Finally, it built models in a box: full, complete machine learning algorithms that are ready to run, and all you have to do is instantiate one and go. That’s mostly what I’m going to talk about today.

So usually, when you talk about your first model in TensorFlow, it’s something simple, like fitting a line to a bunch of points. But nobody is actually interested in fitting a line to a bunch of points, distributed; it doesn’t really happen all that much in reality. So we’re not going to do that. I’m going to show you instead how to handle a variety of features, and then train and evaluate different types of models. We do that on a data set of cars. So the first model today will be about predicting the price of a car from a bunch of features, information about the car.


The next thing I can do with TensorBoard is actually look at the model that was created, at the lower levels of the model, at what we call the graph. TensorFlow works by generating a graph, and this graph is then shipped to all of the distributed workers it has and executed there. You don’t have to worry about this too much, but it’s awfully useful to be able to inspect this graph when you’re debugging.

Launching TensorBoard from Python

To run TensorBoard, use the following code. logdir points to the directory where the FileWriter serialized its data. Once TensorBoard is running, navigate your web browser to localhost:6006 to view it.

Data Set

The first thing to do is download the dataset. We’re using pandas to read the CSV file. This is easy for small datasets, but unwieldy for large and complex ones.

The CSV file does not have a header, so we have to fill in column names. We also have to specify the dtypes.

The training set contains the examples that we’ll use to train the model; the test set contains the examples that we’ll use to evaluate the trained model’s effectiveness.

The training set and test set started out as a single data set. Then, we split the examples, with the majority going into the training set and the remainder going into the test set. Adding examples to the training set usually builds a better model; however, adding more examples to the test set enables us to better gauge the model’s effectiveness. Regardless of the split, the examples in the test set must be separate from the examples in the training set. Otherwise, you can’t accurately determine the model’s effectiveness.

Feature Columns

Feature Columns are the intermediaries between raw data and Estimators. Feature columns are very rich, enabling you to transform a diverse range of raw data into formats that Estimators can use, allowing easy experimentation.

Every neuron in a neural network performs multiplication and addition operations on weights and input data. Real-life input data often contains non-numerical (categorical) data. For example, consider a fuel-type feature that can contain the following two non-numerical values:

  • gas
  • diesel

ML models generally represent categorical values as simple vectors in which a 1 represents the presence of a value and a 0 represents the absence of a value. For example, when fuel_type is set to diesel, an ML model would usually represent fuel_type as [1,0], meaning:

  • 0: gas is absent
  • 1: diesel is present

So, although raw data can be numerical or categorical, an ML model represents all features as numbers.
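The one-hot idea above is easy to sketch in plain Python (vocabulary order matches the fuel-type example; this is an illustration, not the tf.feature_column implementation):

```python
def one_hot(value, vocabulary):
    """Represent a categorical value as a one-hot vector over a vocabulary."""
    return [1 if v == value else 0 for v in vocabulary]

fuel_types = ["diesel", "gas"]
encoded = one_hot("diesel", fuel_types)  # [1, 0]
```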

Numeric column

The price predictor calls the tf.feature_column.numeric_column function for numeric input features:

Categorical column

We cannot input strings directly to a model. Instead, we must first map strings to numeric or categorical values. Categorical vocabulary columns provide a good way to represent strings as a one-hot vector.

Hashed Column

Sometimes the number of categories is so big that it’s not possible to have an individual category for each vocabulary word or integer, because that would consume too much memory. For these cases, we can instead turn the question around and ask, “How many categories am I willing to have for my input?” The tf.feature_column.categorical_column_with_hash_bucket function enables you to specify the number of categories.
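The idea behind a hashed column can be sketched in a few lines. This uses a stable stdlib hash for illustration; the real tf.feature_column implementation uses its own fingerprint function:

```python
import hashlib

def hash_bucket(value, num_buckets):
    """Assign a string to one of num_buckets categories via a stable hash."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

bucket = hash_bucket("some_rare_category", 100)  # always in [0, 100)
```

The trade-off is that unrelated values may collide into the same bucket, which the model then cannot distinguish.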

Create Input Functions

I still have to give it some input data. TensorFlow has off-the-shelf input pipelines for most formats. In particular, in this example, I’m using input from pandas, so I’m going to read input from a pandas DataFrame.

What I’m telling it here is that I want to use batches of 64, so each iteration of the algorithm will use 64 input data pieces. I’m going to shuffle the input, which is always a good thing to do when you’re training; please always shuffle the input. num_epochs=None means to cycle through the data indefinitely: when you’re done with the data, just go through it again.
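What shuffled batching with optional indefinite epochs means, as a toy generator (illustration only, not the pandas input function):

```python
import random

def batches(data, batch_size, num_epochs=None, seed=0):
    """Yield shuffled batches; num_epochs=None cycles through the data forever."""
    rng = random.Random(seed)
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        shuffled = data[:]
        rng.shuffle(shuffled)
        for i in range(0, len(shuffled), batch_size):
            yield shuffled[i:i + batch_size]
        epoch += 1

gen = batches(list(range(10)), batch_size=4, num_epochs=1)
first = next(gen)
```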

Instantiate an Estimator

We specify what kind of machine learning algorithm we want to apply to predict the car price. In my case here, I’m going to use a linear regression first, which is about the simplest way to learn something, and all I have to do is tell it: you’re going to use these input features that I’ve just declared.

Train, Evaluate, and Predict

Now that we have an Estimator object, we can call methods to do the following:

  • Train the model.
  • Evaluate the trained model.
  • Use the trained model to make predictions.

Train the model

Train the model by calling the Estimator’s train method as follows:

The steps argument tells the method to stop training after a number of training steps.

Evaluate the trained model

Now that the model has been trained, we can get some statistics on its performance. The following code block evaluates the accuracy of the trained model on the test data:

Unlike our call to the train method, we did not pass the steps argument to evaluate. Our eval_input_fn only yields a single epoch of data.

Making predictions (inferring) from the trained model

We now have a trained model that produces good evaluation results. We can now use the trained model to predict the price of a car based on some unlabeled measurements. As with training and evaluation, we make predictions using a single function call:

Deep Neural Network

We obviously have to change the name of the class that we’re using. Then we’ll also have to adapt the inputs to something that this new model can use. In this case, a DNN model can’t use categorical features directly; we have to do something to them. The two things you can typically do to a categorical feature to make it work with a deep neural network are to embed it or to transform it into what’s called a one-hot or indicator column. So we do this by simply saying: make me an embedding, and, for the cylinders, make an indicator column, because there are not so many values there. Usually, this is fairly complicated stuff, and you have to write a lot of code.

Most of these more complicated models also have hyperparameters. In this case, for the DNN, we basically tell it: make me a three-layer neural network with layer sizes 50, 30, and 10. That’s really all you need; this is a very high-level interface.


Estimators are TensorFlow implementations of complete machine learning models. You can get started with them extremely quickly. They come with all of the integrations: TensorBoard visualization, serving in production, different hardware, different use cases. They obviously work in distributed settings; we use them in data centers, and you can use them on your home computer network if that’s what you’d like, or in flocks of mobile devices. Everything is possible. They run on all kinds of different hardware: they will run on TPU, and they always run on GPU and CPU.

Download this project from GitHub




Image Classify Using TensorFlow Lite

We know that machine learning adds great power to your mobile app. TensorFlow Lite is a lightweight ML library for mobile and embedded devices. TensorFlow works well on large devices, and TensorFlow Lite works really well on small devices: it’s easier, faster, and smaller to work with on mobile devices.

Getting Started with an Android App

This post contains an example application using TensorFlow Lite for Android App. The app is a simple camera app that classifies images continuously using a quantized MobileNets model.

Step 1: Decide which Model to use

Depending on the use case, you may choose to use one of the popular open-sourced models such as InceptionV3 or MobileNets, re-train these models with your own custom data set, or even build your own custom model. In this example, we use a pre-trained MobileNets model.

Step 2: Add TensorFlow Lite Android AAR

Android apps need to be written in Java, and core TensorFlow is in C++, so a JNI library is provided to interface between the two. Its interface is aimed only at inference, so it provides the ability to load a graph, set up inputs, and run the model to calculate particular outputs.

This app uses a pre-compiled TFLite Android Archive (AAR). This AAR is hosted on jcenter.

The following lines in the app’s build.gradle file include the newest version of the AAR, from the TensorFlow maven repository, in the project.

We use the following block to instruct the Android Asset Packaging Tool that .lite or .tflite assets should not be compressed. This is important, as the .lite file will be memory-mapped, and that will not work when the file is compressed.

Step 3: Add your model files to the project

Download the quantized Mobilenet TensorFlow Lite model from here, unzip and copy mobilenet_quant_v1_224.tflite and label.txt to the assets directory: src/main/assets

Step 4: Load TensorFlow Lite Model

TensorFlow Lite’s Java API supports on-device inference and is provided as an Android Studio Library that allows loading models, feeding inputs, and retrieving inference outputs.

The Interpreter.java class drives model inference with TensorFlow Lite. In most cases, this is the only class an app developer will need. To initialize an Interpreter with a model file, construct it with a MappedByteBuffer:

This byte buffer is sized to contain the image data once converted to float. The interpreter can accept float arrays directly as input, but the ByteBuffer is more efficient as it avoids extra copies in the interpreter.

The following lines load the label list and create the output buffer:

The output buffer is a float array with one element for each label where the model will write the output probabilities.

Running Model Inference

If a model takes only one input and returns only one output, the following will trigger an inference run:

For models with multiple inputs, or multiple outputs, use:

where each entry in inputs corresponds to an input tensor and map_of_indices_to_outputs maps indices of output tensors to the corresponding output data. In both cases the tensor indices should correspond to the values given to the TensorFlow Lite Optimized Converter when the model was created. Be aware that the order of tensors in input must match the order given to the TensorFlow Lite Optimized Converter.

The following method takes a Bitmap as input, runs the model, and returns the text to print in the app.

This method does three things. First converts and copies the input Bitmap to the imgData ByteBuffer for input to the model. Then it calls the interpreter’s run method, passing the input buffer and the output array as arguments. The interpreter sets the values in the output array to the probability calculated for each class. The input and output nodes are defined by the arguments to the toco conversion step that created the .lite model file earlier.


The app resizes each camera image frame to 224 x 224 pixels to match the quantized Mobilenet model being used. The resized image is converted, row by row, into a ByteBuffer of size 1 * 224 * 224 * 3 bytes, where 1 is the number of images in a batch, 224 * 224 is the width and height of the image, and 3 bytes represent the three color channels of a pixel. This app uses the TensorFlow Lite Java inference API for models which take a single input and provide a single output. The output is a two-dimensional array, with the first dimension being the category index and the second dimension being the confidence of classification. The Mobilenet model has 1001 unique categories, and the app sorts the probabilities of all the categories and displays the top three. The Mobilenet quantized model is bundled within the assets directory of the app.
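The buffer-size arithmetic and the top-three selection are easy to sketch in pure Python (the app itself does this in Java; the category names and probabilities here are hypothetical):

```python
# Input buffer: 1 image x 224 x 224 pixels x 3 color bytes
buffer_size = 1 * 224 * 224 * 3

# Sort hypothetical category probabilities and keep the top three
probabilities = {"tabby": 0.62, "tiger cat": 0.21, "lynx": 0.09, "remote": 0.01}
top_three = sorted(probabilities, key=probabilities.get, reverse=True)[:3]
```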


Download this project from GitHub

Related Post

TensorFlow Lite

Train Image classifier with TensorFlow


Speech Recognition Using TensorFlow

This tutorial will show you how to run a simple speech recognition model built by the audio training tutorial. It listens for a small set of words and displays them in the UI when they are recognized.

It’s important to know that real speech and audio recognition systems are much more complex, but like MNIST for images, this should give you a basic understanding of the techniques involved. Once you’ve completed this tutorial, you’ll have an application that tries to classify a one-second audio clip as either silence, an unknown word, “yes”, “no”, “up”, “down”, “left”, “right”, “on”, “off”, “stop”, or “go”.

TensorFlow speech recognition model


You can train your model on a desktop, a laptop, or a server, and then use that pre-trained model on a mobile device. So no training happens on the device; the training happens on a bigger machine, either a server or your laptop. You can download a pretrained model from tensorflow.org.

2. Adding Dependencies

The TensorFlow Inference Interface is available as a JCenter package and can be included quite simply in your Android project with a couple of lines in the project’s build.gradle file:

Add the following dependency in the app’s build.gradle:

This will tell Gradle to use the latest version of the TensorFlow AAR that has been released to https://bintray.com/google/tensorflow/tensorflow-android. You may replace the + with an explicit version label if you wish to use a specific release of TensorFlow in your app.

3. Add Pre-trained Model to Project

You need the pre-trained model and the label file. You can download the model from here. Unzip the zip file and you will get conv_actions_labels.txt (the labels) and conv_actions_frozen.pb (the pre-trained model).

Put conv_actions_labels.txt and conv_actions_frozen.pb into android/assets directory.

4. Microphone Permission

To request microphone access, you should request the RECORD_AUDIO permission in your manifest file as below:

Since Android 6.0 Marshmallow, the application is not granted any permissions at installation time. Instead, the application has to ask the user for each permission at runtime.

5. Recording Audio

The AudioRecord class manages the audio resources for Java applications to record audio from the audio input hardware of the platform. This is achieved by “pulling” (reading) the data from the AudioRecord object. The application is responsible for polling the AudioRecord object in time using read(short[], int, int).

6. Run TensorFlow Model

The TensorFlowInferenceInterface class provides a smaller API surface suitable for inference and for summarizing the performance of model execution.

7. Recognize Commands

The RecognizeCommands class is fed the output of running the TensorFlow model over time; it averages the signals and returns information about a label when it has enough evidence to think that a recognized word has been found. The implementation is fairly small, just keeping track of the last few predictions and averaging them.
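The averaging idea can be sketched as keeping the last few score vectors and averaging them element-wise. This is a simplified Python illustration of the concept; the real RecognizeCommands class (Java/C++) also applies thresholds and timing rules:

```python
from collections import deque

class ScoreAverager:
    """Average the last `window` prediction vectors, element-wise."""
    def __init__(self, window=3):
        self.history = deque(maxlen=window)

    def add(self, scores):
        self.history.append(scores)
        n = len(self.history)
        return [sum(s[i] for s in self.history) / n
                for i in range(len(scores))]

avg = ScoreAverager(window=2)
avg.add([1.0, 0.0])
smoothed = avg.add([0.0, 1.0])  # -> [0.5, 0.5]
```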

The demo app updates its UI of results automatically based on the labels text file you copy into assets alongside your frozen graph, which means you can easily try out different models without needing to make any code changes. You will need to update LABEL_FILENAME and MODEL_FILENAME to point to the files you’ve added if you change the paths, though.


You can easily replace it with a model you’ve trained yourself. If you do this, you’ll need to make sure that the constants in the main MainActivity Java source file like SAMPLE_RATE and SAMPLE_DURATION match any changes you’ve made to the defaults while training. You’ll also see that there’s a Java version of the RecognizeCommands module that’s very similar to the C++ version in this tutorial. If you’ve tweaked parameters for that, you can also update them in MainActivity to get the same results as in your server testing.


Download this project from GitHub


Related Post

Android TensorFlow Machine Learning

Google Cloud Speech API in Android APP




Train your Object Detection model locally with TensorFlow

In this post, we’re going to train machine learning models capable of localizing and identifying multiple objects in an image. You’ll need to install TensorFlow and you’ll need to understand how to use the command line.

Tensorflow Object Detection API

The TensorFlow Object Detection API is an open source framework built on top of TensorFlow that makes it easy to construct, train and deploy object detection models.

This post walks through the steps required to train an object detection model locally.

1. Cloning the Object Detection API repository

Alternatively, you can download the repository directly as a ZIP file.


The TensorFlow Object Detection API depends on the following libraries:

  • Protobuf 2.6
  • Pillow 1.0
  • Lxml
  • Jupyter notebook
  • Matplotlib

The TensorFlow Object Detection API uses Protobufs to configure model and training parameters. Before the framework can be used, the Protobuf libraries must be compiled. This should be done by running the following command from the tensorflow/models directory:
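The compilation command (shown as an image in the original post) is, per the Object Detection API's installation instructions of the time:

```shell
# From tensorflow/models/
protoc object_detection/protos/*.proto --python_out=.
```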

Add Libraries to PYTHONPATH

When running locally, the tensorflow/models/ and slim directories should be appended to PYTHONPATH. This can be done by running the following from tensorflow/models/:
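The command itself was shown as a screenshot; per the API's installation instructions it is:

```shell
# From tensorflow/models/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
```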

Note: This command needs to run from every new terminal you start. If you wish to avoid running this manually, you can add it as a new line to the end of your ~/.bashrc file.

Testing the Installation

You can test that you have correctly installed the Tensorflow Object Detection API by running the following command:
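The test command (shown as an image in the original post) comes from the API's installation instructions:

```shell
# From tensorflow/models/
python object_detection/builders/model_builder_test.py
```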

The above command generates the following output.

Install Object Detection API

3.Preparing Inputs

The TensorFlow Object Detection API reads data using the TFRecord file format. Two sample scripts (create_pascal_tf_record.py and create_pet_tf_record.py) are provided to convert a dataset to TFRecords.

Directory Structure for Training input data

  • To prepare the input file for the sample scripts you need two things. First, you need an RGB image encoded as jpg or png; second, you need a list of bounding boxes (xmin, ymin, xmax, ymax) for the image and the class of the object in each bounding box.
  • I scraped 200 pet images from Google Images. Here is a subset of the pet image data set that I collected in the images folder:


Afterward, I labeled them manually with LabelImg. LabelImg is a graphical image annotation tool written in Python. It's super easy to use, and the annotations are saved as XML files. Save the image annotation XMLs in the /annotations/xmls folder.

Image Annotation

Create trainval.txt in the annotations folder, which contains the names of the images without extensions. Use the following command to generate trainval.txt.
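The command was shown as a screenshot in the original post; one way to produce the list with standard shell tools, assuming the images are .jpg files in the images folder, is:

```shell
# From the directory containing images/ and annotations/:
# list the image file names, strip the .jpg extension, one name per line.
ls images | grep '\.jpg$' | sed 's/\.jpg$//' > annotations/trainval.txt
```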

Label Maps

Each dataset is required to have a label map associated with it. This label map defines a mapping from string class names to integer class IDs. Label maps should always start from ID 1. Create a label.pbtxt file with the following label map:
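Since this training uses a single class (num_classes is set to one later in the post), a minimal label.pbtxt would look like the following; the class name 'pet' is illustrative, so use your own:

```
item {
  id: 1
  name: 'pet'
}
```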

Generating the Pet TFRecord files.

Run the following commands.
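The commands were shown as images in the original post. Based on the API's sample script, an invocation would look roughly like this (paths are illustrative):

```shell
# From tensorflow/models/, with images/ and annotations/ under the data dir
python object_detection/create_pet_tf_record.py \
    --label_map_path=label.pbtxt \
    --data_dir=`pwd` \
    --output_dir=`pwd`
```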

You should end up with two TFRecord files named pet_train.record and pet_val.record in the tensorflow/models directory.

4.Training the model

After creating the required input files for the API, you can now train your model. For training, you need the following command:
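The training command (shown as an image in the original post) follows the API's documented pattern; the config and directory paths here are illustrative:

```shell
# From tensorflow/models/
python object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=ssd_mobilenet_v1_pets.config \
    --train_dir=${PATH_TO_TRAIN_DIR}
```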

The repo also provides sample config files for the object detection training pipeline. For my training, I used ssd_mobilenet_v1_pets.config as a basis. I needed to adjust num_classes to one and set the paths (PATH_TO_BE_CONFIGURED) for the model checkpoint, the train and test data files, and the label map. For other configuration, like the learning rate and batch size, I used the default settings.

Running the Evaluation Job

Evaluation is run as a separate job. The eval job will periodically poll the train directory for new checkpoints and evaluate them on a test dataset. The job can be run using the following command:
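Per the API's documentation, the eval command is (matching the variables described below):

```shell
# From tensorflow/models/
python object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
    --checkpoint_dir=${PATH_TO_TRAIN_DIR} \
    --eval_dir=${PATH_TO_EVAL_DIR}
```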

where ${PATH_TO_YOUR_PIPELINE_CONFIG} points to the pipeline config, ${PATH_TO_TRAIN_DIR} points to the directory in which training checkpoints were saved (same as the training job), and ${PATH_TO_EVAL_DIR} points to the directory in which evaluation events will be saved. As with the training job, the eval job runs until terminated by default.

Running Tensorboard

Progress for training and eval jobs can be inspected using Tensorboard. If using the recommended directory structure, Tensorboard can be run using the following command:
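The command (shown as an image in the original post) is the standard TensorBoard invocation:

```shell
tensorboard --logdir=${PATH_TO_MODEL_DIRECTORY}
```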

where ${PATH_TO_MODEL_DIRECTORY} points to the directory that contains the train and eval directories. Please note it may take TensorBoard a couple of minutes to populate with data.

5.Exporting the Tensorflow Graph

After your model has been trained, you should export it to a TensorFlow graph proto. First, you need to identify a candidate checkpoint to export. The checkpoint will typically consist of three files in the pet folder:

  1. model.ckpt-${CHECKPOINT_NUMBER}.data-00000-of-00001
  2. model.ckpt-${CHECKPOINT_NUMBER}.index
  3. model.ckpt-${CHECKPOINT_NUMBER}.meta

Run the following command to export the TensorFlow graph. Change the checkpoint number.
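The export command (shown as an image in the original post) follows the API's export_inference_graph script; config path and output directory here are illustrative:

```shell
# From tensorflow/models/
python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path ssd_mobilenet_v1_pets.config \
    --trained_checkpoint_prefix pet/model.ckpt-${CHECKPOINT_NUMBER} \
    --output_directory output_inference_graph
```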

Related Post

Introduction TensorFlow Machine Learning Library

TensorFlow Lite

Train Image classifier with TensorFlow

Android TensorFlow Machine Learning


TensorFlow is an open source software library for machine learning. It was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence Research organization for the purposes of conducting machine learning and deep neural networks research.

One of TensorFlow's main features is that numeric computation is expressed as a computational graph. A TensorFlow program is a graph whose nodes are operations (shorthand: ops) in your code. An operation takes any number of inputs and produces a single output. The edges between the nodes are the tensors that flow between them.

The best way to think about tensors in practice is as n-dimensional arrays. The advantage of using flow graphs as the backbone of a deep learning framework is that they let you build complex models out of small, simple operations. This makes gradient calculation extremely simple when we get to it; you will be very grateful for automatic differentiation when you are coding large models in your projects.

Another way of thinking about a TensorFlow graph is that each operation is a function that can be evaluated at that point.

Neural Network computational graph


Here is what the computational graph of a neural network with one hidden layer might look like in TensorFlow.



We have a hidden layer that we are trying to compute as the ReLU activation of a parameter matrix W times an input x, plus a bias term: h = ReLU(Wx + b).

ReLU is an activation function standing for rectified linear unit. Applying a nonlinear function to the linear input is what gives neural networks their expressive power. The ReLU takes the max of its input and zero: ReLU(z) = max(z, 0).

We have variables b and W, a placeholder for x, and nodes for each of the operations in our graph.

Variables are stateful nodes that output their current value. In our case, the variables b and W retain their current values over multiple executions, and it is easy to restore saved values to variables.

Variables have a number of other useful features. They can be saved to disk during and after training, which allows people from different companies and groups to save, store, and send their model parameters to others. They also receive gradient updates by default: the updates apply to all of the variables in your graph.

Variables are the things you want to tune to minimize the loss. It is really important to remember that variables in the graph, like b and W, are still operations.

All of the nodes in the graph are operations. When you evaluate the operation corresponding to a variable at run time, you get the current value of that variable.

Placeholders (like x) are nodes whose values are fed in at execution time. They hold the values we are going to add into the computation during training, so they are our inputs.

So for placeholders, we don't give any initial values. We just assign a data type and a tensor shape, so the graph still knows what to compute even though it doesn't have any stored values yet.

The third type of node is mathematical operations: matrix multiplication, addition, and ReLU. All of these are nodes in your TensorFlow graph. It is very important that we call TensorFlow's mathematical operations, as opposed to NumPy operations.
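To make the graph concrete, here is the same computation, h = ReLU(Wx + b), written eagerly in plain NumPy. This is a sketch for intuition, not TensorFlow code, and the numbers are made up:

```python
import numpy as np

def relu(z):
    # ReLU(z) = max(z, 0), applied element-wise
    return np.maximum(z, 0)

W = np.array([[1.0, -2.0],
              [0.5,  1.0]])   # parameter matrix (a Variable in TensorFlow)
b = np.array([0.0, -1.0])     # bias term (also a Variable)
x = np.array([1.0, 1.0])      # input (a placeholder in TensorFlow)

h = relu(W @ x + b)           # the hidden layer; negative entries clip to 0
print(h)
```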

Related Post

TensorFlow Lite

Install TensorFlow

Train Image classifier with TensorFlow

Train your Object Detection model locally with TensorFlow

Android TensorFlow Machine Learning


Android TensorFlow Machine Learning

TensorFlow is an open source software library for machine learning, developed by Google and currently used in many of their projects.

In this article, we will create an Android app that can recognize five types of fruits.

Machine Learning

There are so many buzzwords, such as AI, Machine Learning, Neural Network, and Deep Learning. What's the difference?

AI, or Artificial Intelligence, you can say, is a science for making smart things, like building an autonomous driving car or having a computer draw a beautiful picture or compose music. One way to realize that vision of AI is machine learning: a technology where you have the computer train itself to process the data, rather than having human programmers instruct every step.

One of the many different algorithms in ML is the neural network. Since around 2012, Google has seen a big breakthrough in the world of neural networks, especially for image recognition, voice recognition, natural language processing, and many other applications.

Neural Network

You can think of it just like a function in mathematics or a function in a programming language. You can put any kind of data in as input and do some matrix operations or calculations inside the neural network, and you get an output vector which holds labels or predicted values.


For example, if you have a bunch of images, you can train the neural network to classify which is an image of a cat and which is an image of a dog. This is just one example of the use cases of neural networks; you can apply the technology to solve any kind of business problem you have.
There are so many possible use cases for the combination of ML and mobile applications, starting from image recognition, OCR, speech-to-text, and text-to-speech, to translation. You can also apply machine learning to mobile-specific applications such as motion detection or GPS location tracking.

Why do you want to run machine learning model inside your mobile applications?

By using machine learning on the device, you can reduce a significant amount of traffic, and you can get much faster responses from your cloud services, because you can extract the meaning from the raw data locally. For example, if you are using machine learning for image recognition, the easiest way to implement it is to send all the raw image data taken by the camera to the server. But instead, you can have the machine learning model running inside your mobile application, so that the application can recognize what kind of object is in each image and send just the label, such as a flower or a human face, to the server. That can reduce the traffic to 1/10 or 1/100: a significant saving.

Build an application that is powered by machine learning

The starting point could be TensorFlow, the open-source library for machine intelligence from Google. TensorFlow is the framework for building machine learning or AI-based services developed in Google, which open-sourced it in November 2015. It is the most popular framework in the world for building neural networks or deep learning. One benefit you get with TensorFlow is ease of development: it's really easy to get started, and you can write just a few lines of Python code.

TensorFlow is very valuable for people like me who don't have a sophisticated mathematical background. When you start reading a textbook on neural networks, you find many mathematical equations, like differentiation, backpropagation, and gradient descent, and you really don't want to implement everything yourself. Instead, you can just download TensorFlow and write a single line of Python code, like GradientDescentOptimizer. That single line encapsulates all these complicated algorithms, such as gradient descent, backpropagation, or any other recent algorithm implemented by Google engineers, so you don't have to implement the neural network technologies from scratch yourself. The other main benefits of TensorFlow are portability and scalability.

Implement TensorFlow in Android

Android just added a Gradle integration, which makes this step a lot easier. Just add one line to build.gradle, and Gradle takes care of the rest of the steps. The library archive holding the TensorFlow shared object is downloaded from JCenter and linked against the application automatically.
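The one line in question, for the TensorFlow Android inference library published on JCenter at the time, looks like this in build.gradle:

```
dependencies {
    compile 'org.tensorflow:tensorflow-android:+'
}
```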

There is a released Android inference library to integrate TensorFlow into Java applications.

Add your model to the project

We need the pre-trained model and label file. In the previous tutorial, we trained a model which does object detection on a given image. You can download the model from here. Unzip the zip file to get retrained_labels.txt (labels for the objects) and rounded_graph.pb (the pre-trained model).

Put retrained_labels.txt and rounded_graph.pb into android/assets directory.


First, create the TensorFlow inference interface, opening the model file from the assets in the APK. Then set up the input feed using the feed API; on mobile, the input feed tends to be retrieved from various sensors, like the camera or accelerometer. Then run the inference, and finally fetch the results using the fetch method. Note that these calls are all blocking, so you should run them in a worker thread rather than the main thread, because the API can take a long time. This is the Java API; you can use the regular C++ API as well.
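A sketch of that feed → run → fetch sequence with the Java inference API, written as it would appear inside an Activity; the node names and sizes below are illustrative, not taken from the post:

```java
// Create the inference interface, loading the model from the APK assets.
TensorFlowInferenceInterface inference =
        new TensorFlowInferenceInterface(getAssets(), "file:///android_asset/rounded_graph.pb");

int inputSize = 299;                                   // Inception v3 input size
float[] pixels = new float[inputSize * inputSize * 3]; // filled from the camera
float[] outputs = new float[5];                        // one score per label

inference.feed("input", pixels, 1, inputSize, inputSize, 3); // set up the input feed
inference.run(new String[] {"final_result"});                // run the graph
inference.fetch("final_result", outputs);                    // read back the scores
```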

Download this project from GitHub


Related Post

Image Classify Using TensorFlow Lite

Google Cloud Vision API in Android APP

Introduction TensorFlow Machine Learning Library

TensorFlow Lite

Train Image classifier with TensorFlow

Train your Object Detection model locally with TensorFlow

Speech Recognition Using TensorFlow




Train Image classifier with TensorFlow

In this post, we’re going to train an Image Classifier for the Android Machine Learning demo app. You’ll need to install TensorFlow and you’ll need to understand how to use the command line.

What is Machine Learning? and Why important?

You can think of Machine Learning as a subfield of AI. Machine Learning is the study of algorithms that learn from examples and experience instead of relying on hard-coded rules. That's the state of the art.
Can you write code to tell the difference between an apple and an orange? It would take a file as input, do some analysis, and output the type of fruit.

We’re going to train a classifier.For now, you can think of a classifier as a function.It takes some data as input and assigns a label to It as output.

The technique of writing the classifier automatically is called supervised learning. It begins with examples of the problem you want to solve.

To use supervised learning, we’ll follow a few standard steps.

1.Collect training data

We’re going to write a function to classify a piece of fruit Image. For starters, it will take an image of the fruit as input and predict whether it’s an apple or oranges as output.The more training data you have, the better a classifier you can create (at least 30 images of each, more is better).

Image classifier folder

We will create a ~/tf_files/fruits folder and place each set of JPEG images in subdirectories (such as ~/tf_files/fruits/apple, ~/tf_files/fruits/orange, etc.).
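For example, on Linux or macOS the folder layout can be created with:

```shell
# One subdirectory per class under ~/tf_files/fruits
mkdir -p ~/tf_files/fruits/apple ~/tf_files/fruits/orange
```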

A quick way to download multiple images at once is to search something on Google Images, and use a Chrome extension for batch download.

2.Train an Image Classifier with TensorFLow for Poets

You want to build a classifier that can tell the difference between a picture of an apple and an orange. TensorFlow for Poets is a great way to get started learning about and working with image classification.
To train our classifier we'll basically just need to run a couple of scripts. To train an image classifier with TensorFlow for Poets, we only need to provide one thing: training data.

The classifier we’ll be using is called a neural network.At a high level, that’s just another type of classifier, like the nearest neighbor one wrote last lime.The different a neural network can learn more complex functions.

Training Inception

Step1: The retrain.py script is part of the TensorFlow repo. You need to download it manually to the current directory (~/tf_files):

download retrain.py
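The download command was shown as an image; at the time, the script lived in the TensorFlow examples tree and could be fetched like this (it has since moved around in the repo, so the exact path may differ):

```shell
curl -O https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/image_retraining/retrain.py
```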
Now we have a trainer and we have data (images), so let's train! We will train the Inception v3 network.

Step2: Before starting the training, activate TensorFlow.

active tensorflow environment

Step3: Start your image retraining with one big command.
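The big command was shown as a screenshot; a typical retrain.py invocation for this setup looks roughly like the following (the flag values are illustrative defaults, not copied from the post):

```shell
python retrain.py \
  --bottleneck_dir=$HOME/tf_files/bottlenecks \
  --how_many_training_steps=500 \
  --model_dir=$HOME/tf_files/inception \
  --output_graph=$HOME/tf_files/retrained_graph.pb \
  --output_labels=$HOME/tf_files/retrained_labels.txt \
  --image_dir=$HOME/tf_files/fruits
```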

Train Image classifier

These commands will make TensorFlow download the inception model and retrain it to detect images from ~/tf_files/fruits.

Train Image classifier accuracy

This operation can take several minutes depending on how many images you have and how many training steps you specified.

The script will generate two files: the model in a protobuf file (retrained_graph.pb) and a label list of all the objects it can recognize (retrained_labels.txt).

retrained graph files

Clone the Git repository for test model

The following command will clone the Git repository containing the files required for the test model.
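The clone command, pointing at the TensorFlow for Poets 2 codelab repository, is:

```shell
git clone https://github.com/googlecodelabs/tensorflow-for-poets-2
cd tensorflow-for-poets-2
```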

Copy tf file

The repo contains two directories: android/ and scripts/.

1.android/: Directory contains nearly all the files necessary to build a simple Android app that classifies images.

2.scripts/: Directory contains the python scripts. These include scripts to prepare, test and evaluate the model.

Now copy the tf_files directory from the first part into the tensorflow-for-poets-2 working directory.

Test the Model

The scripts/ directory contains a simple command line script, label_image.py, to test the network.
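The test command was shown as an image; per the codelab's conventions it runs the script as a module, with an image path of your choosing (the path below is illustrative):

```shell
python -m scripts.label_image \
    --graph=tf_files/retrained_graph.pb \
    --image=tf_files/fruits/apple/example.jpg
```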

test trained model

Optimize model for Android

TensorFlow installation includes a tool, optimize_for_inference, that removes all nodes that aren’t needed for a given set of input and output nodes.
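The command was shown as a screenshot; for the retrained Inception v3 graph, the input and output node names are "Mul" and "final_result", so the invocation looks like this:

```shell
python -m tensorflow.python.tools.optimize_for_inference \
    --input=tf_files/retrained_graph.pb \
    --output=tf_files/optimized_graph.pb \
    --input_names="Mul" \
    --output_names="final_result"
```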

Optimize for inference

It creates a new file at tf_files/optimized_graph.pb.

Make the model compressible

The retrained model is still 84MB in size at this point. That large download size may be a limiting factor for any app that includes it.

Neural network operation requires a bunch of matrix calculations, which means tons of multiply and add operations. Current mobile devices are capable of doing some of them with specialized hardware.


Quantization is one of the techniques to reduce both memory footprint and compute load. Usually, TensorFlow uses single precision floating point for input, math, and output. As you know, a single precision float takes 32 bits. We can reduce the precision to 16 bits, 8 bits, or even less while keeping a good result, because the learning process involves some noise by nature, so adding some extra noise doesn't matter much. Quantizing the weights is an optimization for storage size, which reduces the precision of the constant nodes in the graph file.

Quantize Image

Quantized calculations.

With quantized calculations, we reduce computing precision by using the quantized values directly. This is good first for memory bandwidth, which is a limiting factor in mobile devices; also, hardware can handle these lower-precision values faster than single precision floating point values.

Now use the quantize_graph script to apply changes:
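The command was shown as an image; per the codelab's quantize_graph script, it looks like this:

```shell
python -m scripts.quantize_graph \
    --input=tf_files/optimized_graph.pb \
    --output=tf_files/rounded_graph.pb \
    --output_node_names=final_result \
    --mode=weights_rounded
```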

quantize graph

It does this without any changes to the structure of the network; it simply quantizes the constants in place. It creates a new file at tf_files/rounded_graph.pb.


Every mobile app distribution system compresses the package before distribution. So test how much the graph can be compressed:
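One simple way to check (a sketch; the original post showed the result as a screenshot) is to gzip the quantized graph and inspect the compression ratio:

```shell
gzip -c tf_files/rounded_graph.pb > /tmp/rounded_graph.pb.gz
gzip -l /tmp/rounded_graph.pb.gz   # shows compressed size and ratio
```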


Quantize compare

You should see a significant improvement; I get about a 73% size reduction.


Related Post

Image Classify Using TensorFlow Lite

Train your Object Detection model locally with TensorFlow

Android TensorFlow Machine Learning

Introduction TensorFlow Machine Learning Library

TensorFlow Lite


Install TensorFlow

Installing TensorFlow for CPU on Ubuntu using virtualenv.

This blog explains how to install TensorFlow on Ubuntu using Virtualenv. Virtualenv is a virtual Python environment isolated from other Python development, incapable of interfering with or being affected by other Python programs on the same machine. To start working with TensorFlow, you simply need to “activate” the virtual environment. All in all, virtualenv provides a safe and reliable mechanism for installing and running TensorFlow.

Take the following steps to install TensorFlow with virtualenv.


Step 1: Install pip and virtualenv by issuing the following commands:
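The command was shown as a screenshot; per the TensorFlow install documentation for Ubuntu at the time, it is:

```shell
sudo apt-get install python-pip python-dev python-virtualenv
```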

TensorFlow Step 1

Step 2: Create a virtualenv environment by issuing the following commands:
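The command was shown as a screenshot; the TensorFlow docs of the time created the environment in ~/tensorflow (the target directory is a convention, not a requirement):

```shell
virtualenv --system-site-packages ~/tensorflow
```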

The preceding command should produce output like the following:

Tensorflow step 2

Step 3:

Activate the virtualenv environment by issuing one of the following commands:
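The commands were shown as a screenshot; assuming the ~/tensorflow environment created above, they are one of:

```shell
source ~/tensorflow/bin/activate      # bash, sh, ksh, or zsh
source ~/tensorflow/bin/activate.csh  # csh or tcsh
```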

The preceding source command should change your prompt to the following:

TensorFlow step 3


Step 4:

Issue one of the following commands to install TensorFlow in the active virtualenv environment:
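The commands were shown as a screenshot; for a CPU-only install they are:

```shell
pip install --upgrade tensorflow   # for Python 2.7
pip3 install --upgrade tensorflow  # for Python 3.x
```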

TensorFlow step 4


No module named tensorflow

Activate TensorFlow

Note that you must activate the virtualenv environment each time you use TensorFlow. If the virtualenv environment is not currently active, invoke one of the following commands:

When the virtualenv environment is active, you may run TensorFlow programs from this shell. Your prompt will become the following to indicate that your tensorflow environment is active:

active tensorflow environment


Deactivate TensorFlow

You may deactivate the TensorFlow environment by invoking the deactivate function as follows:
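The command is simply:

```shell
deactivate
```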

Run TensorFlow program

Enter the following short program inside the python interactive shell:
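The short program (shown as a screenshot in the original post) is the classic hello world, in the TensorFlow 1.x style this tutorial uses:

```python
import tensorflow as tf

# Build a constant node and run it in a session.
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
```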

The preceding program should print the following output.

TensorFlow Hello World



Related Post

Introduction TensorFlow Machine Learning Library

TensorFlow Lite

Train Image classifier with TensorFlow

Train your Object Detection model locally with TensorFlow

Android TensorFlow Machine Learning