PyTorch provides elegantly designed modules and functions, such as torch.nn and torch.nn.functional, to help you create neural network models. Layers are typically implemented either as torch.nn.Module objects or as torch.nn.functional functions. In this post, we discuss the difference between nn.Dropout and functional.dropout: which one should you use, and is one better than the other?

torch.nn vs torch.nn.functional

torch.nn.Module is the cornerstone of PyTorch. You first define an nn.Module object and then call it on your input, which runs its forward method. This is an object-oriented way of doing things. PyTorch’s nn classes help make our code shorter, more understandable, and more flexible.

torch.nn.Module creates a callable, which behaves like a function but can also hold state such as neural network layer weights and biases. It knows which Parameters it contains and can zero all their gradients, loop through them for weight updates, and so on.
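As a minimal sketch of what that means (using a hypothetical nn.Linear layer purely for illustration):

import torch
import torch.nn as nn

# A module is callable and owns its state (weights and biases).
layer = nn.Linear(4, 2)

# It knows which Parameters it contains ...
for name, param in layer.named_parameters():
    print(name, param.shape)  # weight: torch.Size([2, 4]), bias: torch.Size([2])

# ... and can zero all their gradients in one call.
layer(torch.randn(3, 4)).sum().backward()
layer.zero_grad()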

On the other hand, nn.functional provides layers and activations in the form of functions that can be called directly on the input rather than first defining an object. For example, to apply max pooling you call F.max_pool2d(x, 2).

This module contains all the functions of the torch.nn library. As well as a wide range of loss and activation functions, you’ll also find some convenient functions for creating neural nets, such as pooling functions. There are also functions for convolutions, linear layers, and so on, but these are usually better handled using other parts of the library.

The torch.nn.functional module is conventionally imported as F. It contains activation functions, loss functions, etc., as well as stateless versions of layers such as convolutional and linear layers.
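Here is a quick sketch of the two call styles side by side; ReLU and max pooling are just illustrative choices, not tied to the model below:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 1, 8, 8)

# Module (object-oriented) style: create the layer objects, then call them.
relu = nn.ReLU()
pool = nn.MaxPool2d(2)
y_module = pool(relu(x))

# Functional style: call the functions directly on the input.
y_functional = F.max_pool2d(F.relu(x), 2)

# Both produce the same result for these stateless operations.
print(torch.equal(y_module, y_functional))  # True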

Create a Model

import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        # Stateful modules (conv, dropout, linear) mixed with functional calls (relu, pooling).
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

In a very simple use case, you just call the created layers one by one, passing the output of one layer to the next. The torch.nn layers contain trainable parameters, while the torch.nn.functional calls are purely functional. In the forward method, you define the logic of the forward pass; autograd derives the backward pass from it automatically.
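As a usage sketch (assuming the 28x28 single-channel input that the 9216-unit fc1 layer implies):

model = Net()
x = torch.randn(1, 1, 28, 28)  # batch of one 28x28 grayscale image
output = model(x)              # calling the module runs Net.forward
print(output.shape)            # torch.Size([1, 10])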

PyTorch Model State

The layers defined in nn.functional don’t maintain any state. Thus, for torch.nn.functional.dropout you have to pass the dropout probability p and the training flag along with your input on every call.

In contrast, nn.Dropout stores its own configuration (the probability p) and tracks the module’s training mode, taking care of all of that for you. Under the hood, whenever you pass a tensor to nn.Dropout, it simply forwards that input, together with its stored p and its self.training flag, to nn.functional.dropout.

So it’s no different from calling the functional version yourself; the module just handles the bookkeeping of its own settings for you.
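A minimal sketch of both styles inside a module (TinyNet is a hypothetical example, not part of the model above); the self.training hand-off is exactly what nn.Dropout does for you:

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    def __init__(self, use_module_dropout=True):
        super().__init__()
        self.use_module_dropout = use_module_dropout
        self.fc = nn.Linear(8, 8)
        self.dropout = nn.Dropout(0.25)  # stores p and follows the module's training mode

    def forward(self, x):
        x = self.fc(x)
        if self.use_module_dropout:
            x = self.dropout(x)  # p and self.training are handled for you
        else:
            # Functional equivalent: pass p and the training flag yourself.
            x = F.dropout(x, p=0.25, training=self.training)
        return x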

We can also think of the ReLU activation as a “layer”. However, there are no tunable parameters associated with the ReLU activation function, and no state to keep track of, so it is not instantiated as a “layer” in the __init__ function.

The main difference between functional.dropout and nn.Dropout is that one has state and the other does not. The modules (nn.Module) internally use the functional API, so there is no real difference as long as the configuration is stored somewhere: manually, if you prefer the functional API, or automatically inside an nn.Module.

Dropout is designed to be applied only during training, so when doing predictions or evaluation you want dropout turned off. nn.Dropout handles this conveniently and shuts dropout off as soon as your model enters evaluation mode, while nn.functional.dropout does not care about the evaluation/prediction mode unless you pass the training flag yourself.
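A short sketch of that difference at evaluation time (note that F.dropout defaults to training=True, so it keeps dropping elements unless you wire it to self.training as above):

import torch
import torch.nn as nn
import torch.nn.functional as F

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 10)

drop.eval()                                  # put the module in evaluation mode
print(drop(x))                               # dropout is off: the input passes through unchanged

print(F.dropout(x, p=0.5))                   # training defaults to True: elements are still dropped
print(F.dropout(x, p=0.5, training=False))   # explicitly disabled: the input passes through unchanged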

Having the nn.Module containers as an abstraction layer makes development easy while keeping the flexibility to use the functional API.
