The softmax activation function transforms a vector of K real values into K values between 0 and 1 that sum to 1, so they can be interpreted as probabilities. The input values can be positive, negative, zero, or greater than one.
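A minimal sketch of this (assuming PyTorch; the logits here are made up for illustration):

```python
import torch

# Logits can be negative, zero, or greater than one; softmax maps them
# to values in (0, 1) that sum to 1 along the chosen dimension.
logits = torch.tensor([2.0, -1.0, 0.0, 3.5])
probs = torch.softmax(logits, dim=0)

print(probs)        # approximately tensor([0.1765, 0.0088, 0.0239, 0.7909])
print(probs.sum())  # tensor(1.)
```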
One advantage of using sparse categorical cross-entropy is that it saves memory as well as computation time, because it simply uses a single integer for a class rather than a whole one-hot vector.
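A minimal sketch of the difference, assuming TensorFlow/Keras (which names this loss explicitly); the labels and predictions are made up:

```python
import numpy as np
import tensorflow as tf

# With the sparse variant, each label is a single integer class index
# instead of a whole one-hot vector.
y_true_sparse = np.array([2, 0])                          # one int per sample
y_true_onehot = np.array([[0., 0., 1.], [1., 0., 0.]])    # whole vector per sample
y_pred = np.array([[0.1, 0.2, 0.7], [0.8, 0.1, 0.1]])     # predicted probabilities

sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy()(y_true_sparse, y_pred)
dense_loss = tf.keras.losses.CategoricalCrossentropy()(y_true_onehot, y_pred)
# Both losses have the same value; the sparse version just skips storing one-hot vectors.
```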
Forcing your network to learn redundant representations might sound very inefficient. But in practice, it makes things more robust and prevents overfitting. It also makes your network act as if it is taking the consensus of an ensemble of networks.
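This is typically achieved with dropout; a minimal PyTorch sketch (layer sizes are illustrative):

```python
import torch
from torch import nn

# Dropout randomly zeroes activations during training, so no single unit can be
# relied on and the network is pushed toward redundant features. At eval time
# dropout is disabled, which roughly averages (takes the consensus of) the
# ensemble of thinned sub-networks seen during training.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # each hidden activation is dropped with probability 0.5
    nn.Linear(256, 10),
)

model.train()  # dropout active
model.eval()   # dropout off; outputs use the full network
```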
In a typical convolutional neural network, you start with a relatively large input image (e.g., 32x32x3); the height and width gradually shrink as you go deeper, whereas the number of channels generally increases. You see this general trend in many other convolutional neural networks.
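A minimal PyTorch sketch of that trend; the layer sizes are illustrative, not from any particular architecture:

```python
import torch
from torch import nn

# Height/width shrink at each pooling step while the channel count grows.
net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 32x32x3  -> 16x16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16x16x16 -> 8x8x32
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 8x8x32   -> 4x4x64
)

x = torch.randn(1, 3, 32, 32)   # a single 32x32 RGB image
print(net(x).shape)             # torch.Size([1, 64, 4, 4])
```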
A custom collate_fn can be used to customize collation, e.g., padding sequential data to the maximum length within a batch. collate_fn is called with a list of data samples each time; it is expected to collate the input samples into a batch for yielding from the data loader iterator.
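A minimal sketch of a padding collate_fn (the toy dataset and function name are hypothetical):

```python
import torch
from torch.utils.data import DataLoader
from torch.nn.utils.rnn import pad_sequence

def pad_collate(batch):
    # batch is a list of (sequence_tensor, label) pairs
    sequences, labels = zip(*batch)
    lengths = torch.tensor([len(s) for s in sequences])
    padded = pad_sequence(sequences, batch_first=True, padding_value=0)
    return padded, torch.tensor(labels), lengths

dataset = [(torch.tensor([1, 2, 3]), 0), (torch.tensor([4, 5]), 1)]
loader = DataLoader(dataset, batch_size=2, collate_fn=pad_collate)

padded, labels, lengths = next(iter(loader))
print(padded.shape)  # torch.Size([2, 3]) -- both sequences padded to the batch max length
```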
The pre-trained model is “frozen” and only the weights of the classifier get updated during training. In this case, the convolutional base extracts all the features associated with each image.
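A minimal sketch of this setup, assuming torchvision and a hypothetical 10-class task:

```python
import torch
from torch import nn
from torchvision import models

# Load a pre-trained convolutional base and freeze all of its weights.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier head; only its weights will be trained.
model.fc = nn.Linear(model.fc.in_features, 10)

# Only the classifier's parameters are given to the optimizer, so only they update.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```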
In this tutorial, we’ll discuss what regularization is and when and why it may be helpful to add it to our model.
PyTorch provides many classes to make data loading easy and your code more readable. In this tutorial, we will see how to load and preprocess/augment custom datasets.
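A minimal sketch of a custom Dataset; the file paths, labels, and class name here are hypothetical:

```python
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image

class CustomImageDataset(Dataset):
    def __init__(self, image_paths, labels, transform=None):
        self.image_paths = image_paths
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = Image.open(self.image_paths[idx]).convert("RGB")
        if self.transform:
            image = self.transform(image)   # preprocessing / augmentation hook
        return image, self.labels[idx]

transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
dataset = CustomImageDataset(["img0.jpg", "img1.jpg"], [0, 1], transform=transform)
loader = DataLoader(dataset, batch_size=2, shuffle=True)
```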
The network will already be able to extract generic features from your dataset, so it does not have to learn to extract them from scratch.
It is sometimes desirable to use a separate penalty with a different coefficient for each layer of the network.
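A minimal sketch of per-layer penalties, assuming PyTorch parameter groups with different weight_decay (L2) coefficients; the model and values are illustrative:

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(100, 50), nn.ReLU(), nn.Linear(50, 10))

# Each parameter group gets its own penalty coefficient.
optimizer = torch.optim.SGD(
    [
        {"params": model[0].parameters(), "weight_decay": 1e-4},  # first layer
        {"params": model[2].parameters(), "weight_decay": 1e-2},  # output layer
    ],
    lr=0.01,
)
```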