PyTorch provides elegantly designed modules and functions such as torch.nn and torch.nn.functional to help you create neural network models. Layers are typically implemented either as torch.nn.Module objects or as torch.nn.functional functions. In this post, we will discuss the difference between nn.Dropout and functional.dropout, which one to use, and which one is better.
torch.nn vs torch.nn.functional
torch.nn.Module is the cornerstone of PyTorch. The way it works is that you first define an nn.Module object and then call it, which runs its forward method. This is an object-oriented way of doing things. PyTorch's nn classes make our code more concise and flexible: shorter, more understandable, and easier to adapt.
torch.nn.Module creates a callable that behaves like a function but can also hold state, such as a layer's weights and biases. It knows which Parameters it contains and can zero all their gradients, loop through them for weight updates, and so on.
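As a minimal sketch (using nn.Linear purely as an illustration), a module is callable like a function while also tracking its own parameters:

import torch
import torch.nn as nn

layer = nn.Linear(4, 2)                        # an nn.Module with weight and bias state
x = torch.randn(3, 4)
out = layer(x)                                 # callable, just like a function
print([p.shape for p in layer.parameters()])   # it knows its own Parameters
layer.zero_grad()                              # and can zero all their gradients at once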
On the other hand, nn.functional provides layers and activations as functions that can be called directly on the input rather than by defining an object first. For example, to apply a pooling layer, you call F.max_pool2d(x, 2).
This module contains the functional counterparts of the torch.nn library. As well as a wide range of loss and activation functions, you'll also find some convenient functions for creating neural nets, such as pooling functions. There are also functions for doing convolutions, linear layers, and so on, but these are usually better handled using other parts of the library.
By convention, the torch.nn.functional module is imported under the F namespace. It contains activation functions, loss functions, and so on, as well as stateless versions of layers such as convolutional and linear layers.
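For example (a minimal sketch), functional calls operate directly on tensors, with no objects to construct:

import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 28, 28)
x = F.relu(x)                              # activation as a plain function
x = F.max_pool2d(x, 2)                     # pooling as a plain function
logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
loss = F.cross_entropy(logits, targets)    # loss functions live here too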
Create a Model
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Stateful layers are created once and registered as sub-modules.
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)              # stateless ops come from the functional API
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
In a very simple use case, you just call the created layers one by one, passing the output of one layer into the next. The torch.nn layers contain trainable parameters, while the torch.nn.functional counterparts are purely functional. In your forward method you define the logic of the forward pass, and PyTorch's autograd derives the backward pass from it.
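A minimal usage sketch of the model above; the single-channel 28x28 input shape is an assumption that matches the 9216 input features of fc1 (64 feature maps of 12x12):

model = Net()
dummy = torch.randn(8, 1, 28, 28)   # batch of 8 single-channel 28x28 images (assumed shape)
log_probs = model(dummy)
print(log_probs.shape)              # torch.Size([8, 10])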
PyTorch Model State
The layers defined in nn.functional don't maintain any state. Thus, for torch.nn.functional.dropout you need to provide the probability p and the training flag to nn.functional.dropout along with your input.
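A minimal sketch of what that looks like in practice:

import torch
import torch.nn.functional as F

x = torch.randn(4, 10)
y_train = F.dropout(x, p=0.5, training=True)    # elements zeroed and rescaled by 1/(1-p)
y_eval = F.dropout(x, p=0.5, training=False)    # identity: nothing is dropped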
In contrast, nn.Dropout stores its own configuration (the probability p) and uses the module's training flag, taking care of all of that for you. Under the hood, whenever you pass a tensor to nn.Dropout, it forwards that input, together with its stored p and its self.training flag, to nn.functional.dropout.
So it's no different from calling the functional version yourself; the module just helps with the bookkeeping and tracking of whatever state it needs.
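A rough sketch of that equivalence, assuming the same random seed so both calls draw the same dropout mask:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 10)
drop = nn.Dropout(p=0.25)
drop.train()                                  # module is in training mode
torch.manual_seed(0)
a = drop(x)
torch.manual_seed(0)
b = F.dropout(x, p=0.25, training=True)
print(torch.equal(a, b))   # True: the module forwards to the functional call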
We can also think of the ReLU activation as a "layer". However, there are no tunable parameters associated with the ReLU activation function, and no "state" to keep track of, so it is not instantiated as a "layer" in the __init__ function.
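For instance, nn.ReLU() and F.relu are interchangeable precisely because there is nothing to store:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(2, 3)
print(torch.equal(nn.ReLU()(x), F.relu(x)))   # True: no parameters, no state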
The main difference between functional.dropout and nn.Dropout is that one has state and the other does not. The modules (nn.Module) internally use the functional API. There is no difference as long as you store the parameters somewhere: manually, if you prefer the functional API, or "automatically" in an nn.Module.
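A sketch of the "store the parameters manually" option, using a hypothetical ManualLinear module that holds its own nn.Parameter tensors and calls the functional API:

import torch
import torch.nn as nn
import torch.nn.functional as F

class ManualLinear(nn.Module):   # hypothetical name, for illustration only
    def __init__(self, in_features, out_features):
        super().__init__()
        # Parameters registered by hand instead of relying on nn.Linear.
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        return F.linear(x, self.weight, self.bias)   # functional call + hand-held state

out = ManualLinear(4, 2)(torch.randn(3, 4))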
Dropout is designed to be applied only during training, so when doing predictions or evaluations you want dropout to be turned off. nn.Dropout conveniently handles this and shuts dropout off as soon as your model enters evaluation mode, while nn.functional.dropout does not care about the evaluation/prediction mode unless you pass the training flag yourself.
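A short sketch of the difference in behavior (Net is the model defined above):

model = Net()
model.eval()                      # nn.Dropout layers now pass inputs through unchanged
with torch.no_grad():
    preds = model(torch.randn(1, 1, 28, 28))

# With the functional API, you must propagate the mode yourself, e.g. inside forward():
#   x = F.dropout(x, p=0.5, training=self.training)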
Having the nn.Module containers as an abstraction layer makes development easy and keeps the flexibility to use the functional API.