When dealing with convolutional networks, we have two ways to know what a model sees. First are the filters (weights)and second is the feature maps(activation map). In this tutorial, we will visualize feature maps in a convolutional neural network.

The idea of visualizing a feature map for a specific input image would be to understand what features of the input are detected or preserved in the feature maps. The expectation would be that the feature maps detect small or fine-grained detail. Let’s import all the libraries and modules first. 

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

from torch.utils.data import  DataLoader
from torchvision import models

import torchvision.transforms as transforms
import torchvision.datasets as dataset

import matplotlib.pyplot as plt
import numpy as np
import cv2 as cv

Pre-train VGG Model

We need a CNN model to visualize the feature map. Instead of fitting a model from scratch, we can use a pre-train state-of-the-art image classification model.

PyTorch provides many well-performing image classification models developed by different research groups for the ImageNet. One example is the VGG-16 model that achieved top results in the 2014 competition. This is a good model to use for visualization because it has a simple uniform structure of serially ordered convolutional and pooling layers.

modelVGG = models.vgg16(pretrained=True)

Running the example will load the model weights into memory and print a summary of the loaded model. 

VGG16 Arch

It is deep with 16 learned layers, and it performed very well, meaning that the filters and resulting feature maps will capture useful features.

In order to explore the feature maps, we need input for the VGG16 model that can be used to create activations. We will use a simple image of a bee.

Feature map Input

We need to load the bee image with the size expected by the model, in this case, 224×224. Next, the image object needs to be converted to a NumPy array of pixel data and expanded from a 3D array to a 4D array with the dimensions of [samples, rows, cols, channels], where we only have one sample.


transform = transforms.Compose([
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

Accessing Convolutional Layers

We need to save all the convolutional layers from the VGG net. We will traverse through all these nestings to retrieve the convolutional layers. The following code shows how to retrieve all the convolutional layers.



for child in model_children:
  if type(child)==nn.Conv2d:
  elif type(child)==nn.Sequential:
    for layer in child.children():
      if type(layer)==nn.Conv2d:

First, we initialize a no_of_layers variable to keep track of the number of convolutional layers. Next, we going through all the layers of the VGG16 model.

results = [conv_layers[0](img)]
for i in range(1, len(conv_layers)):
outputs = results

Give the image as an input to the first convolutional layer after that, we will use a for loop to pass the last layer’s outputs to the next layer until we reach the last convolutional layer.

Visualizing the Feature Maps

We know that the number of feature maps (e.g. depth or a number of channels) in deeper layers is much more than 1, such as 64, 256, or 512. We plot only 16 two-dimensional images as a 4×4 square of images.

for num_layer in range(len(outputs)):
    plt.figure(figsize=(50, 10))
    layer_viz = outputs[num_layer][0, :, :, :]
    layer_viz = layer_viz.data
    print("Layer ",num_layer+1)
    for i, filter in enumerate(layer_viz):
        if i == 16: 
        plt.subplot(2, 8, i + 1)
        plt.imshow(filter, cmap='gray')
PyTorch Feature Map

The feature maps are a result of applying filters to input images. Feature maps output by prior layers could provide insight into the internal representation that the model has of a specific input at a given point in the model.

Run this code in Google colab