There are two ways to define a model in PyTorch: using the Sequential class (suitable only for linear stacks of layers, which is by far the most common network architecture) or the standard PyTorch approach of creating a subclass of torch.nn.Module.

A Sequential model is useful for quickly defining a linear stack of layers (i.e., where one layer follows directly from the previous layer without any branching). We can define our MLP model using the Sequential class as shown:

seq_model = nn.Sequential(nn.Linear(1, 15),
                          nn.Linear(15, 1))


The nn package provides a simple way to chain modules through the nn.Sequential container. Instantiate the network and view it:

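Printing an nn.Sequential instance displays its submodules in order, which is a quick way to verify the architecture. A minimal sketch using the two-layer model defined above (output shown as comments):

```python
import torch.nn as nn

# Two-layer MLP: 1 input feature -> 15 hidden units -> 1 output feature
seq_model = nn.Sequential(nn.Linear(1, 15),
                          nn.Linear(15, 1))
print(seq_model)
# Sequential(
#   (0): Linear(in_features=1, out_features=15, bias=True)
#   (1): Linear(in_features=15, out_features=1, bias=True)
# )
```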

As a refresher, here’s a two-layer model defined using the Sequential class (note that each nn.Linear layer takes the number of input features and the number of output features as its first two arguments, so adjacent layers must agree on their shared dimension).

Because nn.Sequential is itself a module, we can call parameters() on it, which returns the weight and bias parameters of all the modules it contains. Let’s try it out!

Calling model.parameters() collects the weight and bias from both the first and second linear modules. It’s instructive to inspect the parameters by printing their shapes:

[param.shape for param in seq_model.parameters()]

#[torch.Size([15, 1]), torch.Size([15]), torch.Size([1, 15]), torch.Size([1])]

When inspecting the parameters of a model made up of several submodules, it is handy to identify parameters by name. There’s a method for that, called named_parameters: 

for name, param in seq_model.named_parameters():
    print(name, param.shape)

#0.weight torch.Size([15, 1])
#0.bias torch.Size([15])
#1.weight torch.Size([1, 15])
#1.bias torch.Size([1])

The name of each module in Sequential is simply the ordinal position at which the module appears in the arguments.
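Because submodules are named by position, a Sequential container can also be indexed like a list to retrieve an individual layer. A small sketch, reusing the two-layer model above:

```python
import torch.nn as nn

seq_model = nn.Sequential(nn.Linear(1, 15),
                          nn.Linear(15, 1))

# Index into the container by ordinal position
first_layer = seq_model[0]
print(first_layer)
# Linear(in_features=1, out_features=15, bias=True)
print(first_layer.weight.shape)
# torch.Size([15, 1])
```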


Interestingly, Sequential also accepts an OrderedDict, in which we can name each module passed to Sequential: 

from collections import OrderedDict

seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 8)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(8, 1))
]))

This allows us to get more explanatory names for submodules:

for name, param in seq_model.named_parameters():
    print(name, param.shape)

#hidden_linear.weight torch.Size([8, 1])
#hidden_linear.bias torch.Size([8])
#output_linear.weight torch.Size([1, 8])
#output_linear.bias torch.Size([1])

We can also access a particular parameter by using submodules as attributes, which is useful for inspecting individual weights and biases:

seq_model.output_linear.bias

#Parameter containing:
#tensor([0.2297], requires_grad=True)

When we want to build models that do more complex things than just applying one layer after another, we need to leave nn.Sequential for something that gives us added flexibility. PyTorch allows us to use any computation in our model by subclassing nn.Module.

class MNISTConvNet(nn.Module):
    def __init__(self):
        super(MNISTConvNet, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(1, 32, 5, padding='same'),
            nn.ReLU(),
            nn.MaxPool2d(2))  # 28x28 -> 14x14
        self.conv2 = nn.Sequential(
            nn.Conv2d(32, 64, 5, padding='same'),
            nn.ReLU(),
            nn.MaxPool2d(2))  # 14x14 -> 7x7
        self.fc1 = nn.Sequential(
            nn.Flatten(),
            nn.Linear(7*7*64, 1024),
            nn.ReLU(),
            nn.Linear(1024, 10))

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        return self.fc1(x)


This allows MNISTConvNet to have access to the parameters of its submodules without further action by the user:

mnist = MNISTConvNet()
numel_list = [p.numel() for p in mnist.parameters()]
sum(numel_list), numel_list

#(3274634, [800, 32, 51200, 64, 3211264, 1024, 10240, 10])
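As a sanity check, the network can be run on a dummy batch shaped like MNIST input (1 sample, 1 channel, 28x28 pixels). This sketch assumes a MNISTConvNet with max pooling after each conv block, so the spatial size reaches 7x7 before the fully connected layers; the class is repeated so the snippet runs standalone:

```python
import torch
import torch.nn as nn

# MNISTConvNet as defined above (repeated so this snippet runs on its own)
class MNISTConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(1, 32, 5, padding='same'), nn.ReLU(), nn.MaxPool2d(2))
        self.conv2 = nn.Sequential(
            nn.Conv2d(32, 64, 5, padding='same'), nn.ReLU(), nn.MaxPool2d(2))
        self.fc1 = nn.Sequential(
            nn.Flatten(), nn.Linear(7*7*64, 1024), nn.ReLU(), nn.Linear(1024, 10))

    def forward(self, x):
        return self.fc1(self.conv2(self.conv1(x)))

mnist = MNISTConvNet()
dummy = torch.randn(1, 1, 28, 28)  # one fake grayscale MNIST image
out = mnist(dummy)
print(out.shape)
# torch.Size([1, 10]) -- one logit per digit class
```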

Related Posts

Access PyTorch model weights and bias with its name and ‘requires_grad’ value

Convolutional Neural Network using Sequential model in PyTorch.

Print Computed Gradient Values of PyTorch Model

What is PyTorch nn.Parameters?

Difference between ‘register_buffer’ and ‘register_parameter’ of PyTorch Module.