Deep learning models are built from various types of layers, which together make up the architecture of the model. Each type of layer has its own strengths. For example, LSTM layers are mostly used in time series analysis and NLP, and convolutional layers in image processing.

The linear layer is typically used in the final stages of a neural network. It is also called a fully connected layer, or a Dense layer in Keras. This layer changes the dimensionality of the output of the preceding layer so that the model can more easily learn the relationship between the values in the data it is working on. In this article, we will discuss the Linear layer in detail with examples.

Linear layers use matrix multiplication to transform their input features into output features using a weight matrix. The input features received by a linear layer are passed as a flattened one-dimensional tensor (one per sample) and then multiplied by the weight matrix.

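import torch
import torch.nn as nn

# A batch of 10 samples with 4 input features each
input = torch.randn(10, 4)
# Weight matrix of shape (out_features, in_features) = (2, 4)
weight = torch.randn(2, 4)
# Multiply by the transposed weight matrix: (10, 4) x (4, 2) -> (10, 2)
output = torch.matmul(input, weight.t())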

This matrix multiplication produces the output features. The weight matrix defines the linear function. The mathematical notation for the linear transformation is:

  y = xW^T + b

  • W = weight matrix
  • x = input tensor
  • b = bias
  • y = output tensor

This equation is the general form of the transformation applied by a linear layer. The bias is an additive parameter of the layer. If you set bias=False, the layer drops the bias term, which can make sense in some cases, e.g. if the next layer is a BatchNorm layer with affine parameters, whose own shift makes the bias redundant.
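As a rough check of the formula, the sketch below computes the transformation by hand and compares it with what `nn.Linear` returns. The names `x`, `layer`, and `no_bias_layer` are chosen here for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

x = torch.randn(10, 4)                             # 10 samples, 4 features each
layer = nn.Linear(in_features=4, out_features=2)   # bias=True is the default

W, b = layer.weight, layer.bias                    # shapes (2, 4) and (2,)
y_manual = x @ W.t() + b                           # the formula written out by hand
y_layer = layer(x)                                 # the same computation done by the layer

print(torch.allclose(y_manual, y_layer, atol=1e-6))  # True

# With bias=False the layer has no bias parameter at all
no_bias_layer = nn.Linear(in_features=4, out_features=2, bias=False)
print(no_bias_layer.bias)  # None
```

The comparison uses `torch.allclose` rather than exact equality because the two code paths may round floating-point values slightly differently.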

Let’s see how to create a PyTorch Linear layer.
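A minimal sketch of creating such a layer is shown here. The variable name `linear_layer` and the choice of `bias=False` are assumptions made for illustration. The `grad_fn=<MmBackward0>` in the output shown later suggests a plain matrix multiplication without a bias term.

```python
import torch.nn as nn

# Assumption: no bias term, so the layer computes a pure matrix multiplication
linear_layer = nn.Linear(in_features=4, out_features=2, bias=False)

print(linear_layer)               # Linear(in_features=4, out_features=2, bias=False)
print(linear_layer.weight.shape)  # torch.Size([2, 4])
```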


Here we define a linear layer that accepts 4 input features and transforms them into 2 output features. We know that a weight matrix is used to perform this operation, but where does the weight matrix live? It lives inside the PyTorch Linear layer class and is created by PyTorch: the class uses the numbers passed into the constructor to create a 2×4 (out_features × in_features) weight matrix. Let’s now explicitly set the weight matrix of the Linear layer.
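One way to set the weights explicitly might look like the sketch below. It assumes a layer named `linear_layer` built with `bias=False` and a random 2×4 `weight` tensor like the one defined earlier. Both names are illustrative.

```python
import torch
import torch.nn as nn

# Weight tensor with the layer's expected shape: (out_features, in_features)
weight = torch.randn(2, 4)

linear_layer = nn.Linear(in_features=4, out_features=2, bias=False)

# Wrap the tensor in nn.Parameter so the module registers it as a learnable weight
linear_layer.weight = nn.Parameter(weight)
```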


PyTorch module weights need to be Parameter instances that live inside the neural network module, which is why we wrap the weight matrix tensor in a Parameter class instance. Let’s verify this by taking a look at the weights.
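A small sketch of that check, again assuming an illustrative layer named `linear_layer` whose weights were set from a random tensor, so the printed values will differ from run to run:

```python
import torch
import torch.nn as nn

linear_layer = nn.Linear(in_features=4, out_features=2, bias=False)
linear_layer.weight = nn.Parameter(torch.randn(2, 4))

# The weight is a Parameter, which is a Tensor subclass tracked by the module
print(isinstance(linear_layer.weight, nn.Parameter))  # True
print(linear_layer.weight.shape)                      # torch.Size([2, 4])
print(linear_layer.weight)                            # prints "Parameter containing: ..."
```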

#Parameter containing:
tensor([[ 1.1868, -1.1860, -0.0804, -0.4910],
        [ 0.8994,  0.0661, -0.1195,  1.0835]], requires_grad=True)

Let’s see how we can call our layer now by passing the input features tensor. We can call the object instance like this because PyTorch neural network modules are callable Python objects. 
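A sketch of such a call, with the illustrative names `x` for the input features tensor and `linear_layer` for the layer, and assuming the layer has no bias:

```python
import torch
import torch.nn as nn

x = torch.randn(10, 4)   # 10 samples, 4 input features each
linear_layer = nn.Linear(in_features=4, out_features=2, bias=False)

# Calling the module runs its forward pass
output = linear_layer(x)

print(output.shape)  # torch.Size([10, 2])

# Without a bias, this matches multiplying by the transposed weight matrix
print(torch.allclose(output, x @ linear_layer.weight.t(), atol=1e-6))  # True
```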


#tensor([[ 1.2080, -0.3693],
        [-2.2544, -1.1819],
        [-1.3294, -1.1608],
        [-0.0223,  0.8998],
        [ 0.4197,  0.7985],
        [ 0.7865,  1.4045],
        [ 1.1019, -1.5440],
        [ 0.2025,  1.4424],
        [ 0.4802, -1.5648],
        [-2.2520, -0.8136]], grad_fn=<MmBackward0>)

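As we’ve seen, when we multiply a 10×4 matrix by the transpose of a 2×4 matrix (that is, a 4×2 matrix), the result is a 10×2 matrix. This follows from the linear algebra rules for matrix multiplication: the inner dimensions must match, and the outer dimensions give the shape of the result.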
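The shape rule can be checked directly. In this sketch, `x` and `W` are illustrative random tensors with the shapes discussed above.

```python
import torch

x = torch.randn(10, 4)   # shape (10, 4)
W = torch.randn(2, 4)    # shape (2, 4), i.e. (out_features, in_features)

# (10, 4) x (2, 4) is not a valid product: the inner dimensions 4 and 2 differ
try:
    x @ W
    multiplied_without_transpose = True
except RuntimeError:
    multiplied_without_transpose = False

# (10, 4) x (4, 2) is valid and gives a (10, 2) result
y = x @ W.t()

print(multiplied_without_transpose)  # False
print(y.shape)                       # torch.Size([10, 2])
```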

PyTorch creates the weight matrix and initializes it with random values. This means that the linear functions from the two examples are indeed different. Remember that the values inside the weight matrix define the linear function. This demonstrates how the network’s mapping changes as the weights are updated during training: when we update the weights, we are changing the function.
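To illustrate, the sketch below builds two layers with their default random initialization and then nudges the weights of one of them. The in-place `add_` call is only a hypothetical stand-in for an optimizer step, and the names `layer_a` and `layer_b` are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

x = torch.randn(10, 4)

# Two layers of the same shape get different random weights
layer_a = nn.Linear(in_features=4, out_features=2, bias=False)
layer_b = nn.Linear(in_features=4, out_features=2, bias=False)
same_weights = torch.equal(layer_a.weight, layer_b.weight)

# Changing the weights changes the function the layer computes
before = layer_a(x).detach().clone()
with torch.no_grad():
    layer_a.weight.add_(0.1)   # stand-in for a training update
after = layer_a(x).detach()
same_output = torch.allclose(before, after)

print(same_weights, same_output)
```

Both flags are expected to come out as False: the two layers start from different weights, and the same input maps to a different output once the weights change.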
