There are two ways to define a model in PyTorch: with the nn.Sequential class, which works only for linear stacks of layers (by far the most common network architecture), or with the standard PyTorch approach of writing a subclass of torch.nn.Module.
A Sequential model is useful for quickly defining a linear stack of layers (i.e., where one layer follows directly from the previous layer without any branching). We can define our MLP model using the Sequential class as shown:
The nn module provides a simple way to concatenate modules through the nn.Sequential container. Instantiate and view the network:

import torch
import torch.nn as nn

seq_model = nn.Sequential(nn.Linear(1, 15),
                          nn.ReLU(),
                          nn.Linear(15, 1))
print(seq_model)
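Printing the model shows the submodules in the order they will be applied, along these lines:

#Sequential(
#  (0): Linear(in_features=1, out_features=15, bias=True)
#  (1): ReLU()
#  (2): Linear(in_features=15, out_features=1, bias=True)
#)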

This is a two-layer model defined using the Sequential class (note that the first layer is told the number of input features to expect, 1 in this case).
Because nn.Sequential is a module, we can call its parameters() method, which returns an iterator over the parameters (weights and biases) of all the modules it contains. Let's try it out! Calling seq_model.parameters() will collect the weight and bias from both the first and second linear modules. It's instructive to inspect the parameters in this case by printing their shapes:
[param.shape for param in seq_model.parameters()]
#[torch.Size([15, 1]), torch.Size([15]), torch.Size([1, 15]), torch.Size([1])]
When inspecting the parameters of a model made up of several submodules, it is handy to identify parameters by name. There’s a method for that, called named_parameters:
for name, param in seq_model.named_parameters():
    print(name, param.shape)
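The output looks like this:

#0.weight torch.Size([15, 1])
#0.bias torch.Size([15])
#2.weight torch.Size([1, 15])
#2.bias torch.Size([1])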

The name of each module in Sequential is just the ordinal with which the module appears in the arguments (the ReLU at position 1 has no parameters, which is why no 1.* entries show up). Since Sequential supports indexing, we can also reach a submodule's parameters by position:

seq_model[0].weight
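Indexing into Sequential returns the submodule itself, so we can inspect it directly (the weight values are randomly initialized, so yours will differ):

print(seq_model[0])
#Linear(in_features=1, out_features=15, bias=True)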
Interestingly, Sequential also accepts an OrderedDict, in which we can name each module passed to Sequential:
from collections import OrderedDict

seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 8)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(8, 1))
]))
print(seq_model)
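This time the printed model uses our names instead of ordinals:

#Sequential(
#  (hidden_linear): Linear(in_features=1, out_features=8, bias=True)
#  (hidden_activation): Tanh()
#  (output_linear): Linear(in_features=8, out_features=1, bias=True)
#)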
This gives us more explanatory names for the submodules. We can also access a particular parameter by referring to a submodule as an attribute, which is useful for inspecting parameters:
for name, param in seq_model.named_parameters():
    print(name, param.shape)
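Now the parameter names carry the submodule names we chose:

#hidden_linear.weight torch.Size([8, 1])
#hidden_linear.bias torch.Size([8])
#output_linear.weight torch.Size([1, 8])
#output_linear.bias torch.Size([1])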

seq_model.output_linear.bias
#Parameter containing:
#tensor([0.2297], requires_grad=True)
Access Weights of Submodules
When we want to build models that do more complex things than just applying one layer after another, we need to leave nn.Sequential for something that gives us added flexibility. PyTorch allows us to use any computation in our model by subclassing nn.Module.
class MNISTConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        # First conv block: 1 input channel -> 32 feature maps;
        # padding='same' keeps 28x28, the max-pool halves it to 14x14
        self.conv1 = nn.Sequential(
            nn.Conv2d(1, 32, 5, padding='same'),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        # Second conv block: 32 -> 64 feature maps, 14x14 -> 7x7
        self.conv2 = nn.Sequential(
            nn.Conv2d(32, 64, 5, padding='same'),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        # Classifier head: flatten the 7x7x64 activations and map to 10 classes
        self.fc1 = nn.Sequential(
            nn.Flatten(),
            nn.Linear(7*7*64, 1024),
            nn.Dropout(0.5),
            nn.Linear(1024, 10)
        )

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        return self.fc1(x)
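To sanity-check the architecture, instantiate the network and push a dummy batch of MNIST-sized inputs through it (the random batch below is just for illustration, not part of the original example):

mnist = MNISTConvNet()
dummy_batch = torch.randn(4, 1, 28, 28)  # four 1-channel 28x28 images
print(mnist(dummy_batch).shape)
#torch.Size([4, 10])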

This allows MNISTConvNet to have access to the parameters of its submodules without further action by the user:
numel_list = [p.numel() for p in mnist.parameters()]
sum(numel_list), numel_list
#(3274634, [800, 32, 51200, 64, 3211264, 1024, 10240, 10])
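The parameter names are hierarchical too: each name combines the attribute name of the submodule with the position of the layer inside its inner Sequential:

for name, param in mnist.named_parameters():
    print(name, param.shape)
#conv1.0.weight torch.Size([32, 1, 5, 5])
#conv1.0.bias torch.Size([32])
#conv2.0.weight torch.Size([64, 32, 5, 5])
#conv2.0.bias torch.Size([64])
#fc1.1.weight torch.Size([1024, 3136])
#fc1.1.bias torch.Size([1024])
#fc1.3.weight torch.Size([10, 1024])
#fc1.3.bias torch.Size([10])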
