A convolutional neural network (CNN) is a bit of a black box. An input image of raw pixels passes through many convolution and pooling layers, and we end up with a set of class scores, bounding boxes, labeled pixels, or something like that. But what are all these layers in the middle doing? What kinds of things in the input image are they looking for? In this post we visualize the intermediate convolutional layers to get a sense of how the ConvNet works.
import tensorflow as tf

# Assumed MNIST settings: 28x28 grayscale images, 10 digit classes.
input_shape = (28, 28, 1)
num_class = 10

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                                 input_shape=input_shape, name='input_layer'))
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu', name='conv_1'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2), name='pool_1'))
model.add(tf.keras.layers.Dropout(0.25, name='dropout_1'))
model.add(tf.keras.layers.Flatten(name='flatten_1'))
model.add(tf.keras.layers.Dense(128, activation='relu', name='dense_1'))
model.add(tf.keras.layers.Dropout(0.5, name='dropout_2'))
model.add(tf.keras.layers.Dense(num_class, activation='softmax', name='output_layer'))
Here we create a simple ConvNet for MNIST digit classification. You can assign a name to each layer using the name argument of the layer.
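The code above does not show the training step, but the visualizations below only make sense once the filters have learned something. A minimal training sketch, assuming the standard MNIST dataset from tf.keras.datasets (the optimizer, batch size, and epoch count are illustrative choices):
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0  # scale pixels to [0, 1]
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # integer labels 0-9
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=128, epochs=2,
          validation_data=(x_test, y_test))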
Model Summary
The model summary shows the output shape and the number of parameters of each layer, e.g. the shape of the activation volume produced by each convolutional layer.
model.summary()
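As a quick sanity check, we can also print each layer's output shape directly. With the assumed input_shape=(28, 28, 1) and num_class=10, the shapes work out roughly as in the comments below:
for layer in model.layers:
    print(layer.name, layer.output_shape)

# Expected shapes (3x3 'valid' convolution: 28 - 3 + 1 = 26, and so on):
# input_layer   (None, 26, 26, 32)
# conv_1        (None, 24, 24, 64)
# pool_1        (None, 12, 12, 64)
# dropout_1     (None, 12, 12, 64)
# flatten_1     (None, 9216)
# dense_1       (None, 128)
# dropout_2     (None, 128)
# output_layer  (None, 10)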

At the input of the convolutional network, our image is 28x28x1, and it then goes through several stages of convolution. The output of each convolutional layer is a three-dimensional chunk of numbers. We call that entire three-dimensional chunk an activation volume, and each of its two-dimensional channel slices an activation map. For example, the first convolutional layer turns the 28x28x1 input into a 26x26x32 activation volume, which we can slice into 32 activation maps of size 26x26.
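A small sketch of that slicing, assuming x_train holds the MNIST images as set up above:
# Truncate the network at the first conv layer and grab its activation volume
# for a single image.
first_conv = tf.keras.models.Model(inputs=model.input,
                                   outputs=model.layers[0].output)
volume = first_conv.predict(x_train[:1].reshape(1, 28, 28, 1))
print(volume.shape)                  # (1, 26, 26, 32) -- the activation volume
activation_map = volume[0, :, :, 0]  # one 26x26 activation map (channel 0)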
Change Names of Layers
We can change the name of a layer. Changing the name of a layer does not affect the accuracy of the model; names are simply descriptors. To get at a particular layer of the model, you can index into its layers list.
# The name property is read-only in tf.keras, so we set the private _name attribute.
model.layers[0]._name = 'conv_0'
print(model.layers[0].name)
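To list all layer indices and names at once (handy for picking which layer to visualize), something like:
for i, layer in enumerate(model.layers):
    print(i, layer.name)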
Create New Model
The new model has the same input as the original model, but its output is the output of a chosen convolutional layer, which is exactly the activation volume (the feature maps) of that layer.
import numpy as np
import matplotlib.pyplot as plt

def visualize_conv_layer(layer_name):
    # Build a model that maps the original input to the chosen layer's output.
    layer_output = model.get_layer(layer_name).output
    intermediate_model = tf.keras.models.Model(inputs=model.input, outputs=layer_output)

    # Run one MNIST image through the truncated model to get its activation volume.
    intermediate_prediction = intermediate_model.predict(x_train[2].reshape(1, 28, 28, 1))
    print(np.shape(intermediate_prediction))

    # Plot the first row_size * col_size activation maps as grayscale images.
    row_size = 4
    col_size = 8
    img_index = 0
    fig, ax = plt.subplots(row_size, col_size, figsize=(10, 8))
    for row in range(row_size):
        for col in range(col_size):
            ax[row][col].imshow(intermediate_prediction[0, :, :, img_index], cmap='gray')
            img_index = img_index + 1
    plt.show()
The human visual system is known to detect edges at its very earliest stages. It turns out that convolutional networks tend to do something similar in their first convolutional layers as well.
visualize_conv_layer('conv_0')
Our model has 32 filters in its first layer. We can get a sense of what this layer is looking for by simply visualizing its output. Each filter produces a single-channel 26x26 activation map, so we can visualize it as a little grayscale image; because there are 32 filters, we visualize 32 little 26x26 images.

Visualizing each of those 26x26 slices of the feature map as a grayscale image gives us some sense of what types of things in the input the features in that convolutional layer are looking for.
visualize_conv_layer('conv_1')
Layer 2 gives us a 24x24x64 tensor, but we can think of that as 64 different 24x24 images. Note that the 4x8 grid in visualize_conv_layer shows only the first 32 of them; a variation that plots all 64 is sketched below.
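A quick illustrative variation that plots all 64 activation maps of conv_1 in an 8x8 grid:
activations = tf.keras.models.Model(
    inputs=model.input,
    outputs=model.get_layer('conv_1').output
).predict(x_train[2].reshape(1, 28, 28, 1))

fig, ax = plt.subplots(8, 8, figsize=(10, 10))
for i in range(64):
    ax[i // 8][i % 8].imshow(activations[0, :, :, i], cmap='gray')
plt.show()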

The second convolutional layer receives this 32-channel input and applies 3x3 convolutions with 64 filters, so each filter spans a 3x3x32 block of numbers. The problem is that you can't really visualize those filters directly as images.
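You can confirm the filter shape by inspecting the layer's weights:
weights, biases = model.get_layer('conv_1').get_weights()
print(weights.shape)  # (3, 3, 32, 64): 64 filters, each 3x3 over 32 input channels
print(biases.shape)   # (64,)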
Most of these intermediate feature maps look kind of noisy. But one highlighted feature map seems to activate on the portions of the image corresponding to the digit, which suggests that this particular slice of the feature map at this layer of this particular network may be looking for digits or something like that.
Related Posts
- Calculate Output Size of Convolutional and Pooling layers in CNN.
- Explain Pooling layers: Max Pooling, Average Pooling, Global Average Pooling, and Global Max pooling.
- Calculate the number of parameters for a Convolutional and Dense layer in Keras