Keras accuracy metrics are functions used to evaluate the performance of your deep learning model. Keras provides a rich pool of built-in metrics, and which ones you use depends on your problem.

Choosing the right accuracy metric for your problem is usually a difficult task. You need to understand which metrics are already available in Keras and how to use them. It seems simple, but in practice it is not obvious. The choice depends on factors such as the output shape and the loss function.

Categorical Accuracy

These metrics are used for classification problems involving more than two classes. The MNIST dataset, for example, has 10 classes. Since we are classifying images into more than two classes, this is a multiclass classification problem.

Categorical Accuracy calculates the percentage of predicted values (yPred) that match with actual values (yTrue) for one-hot labels.

from keras import backend as K  # Keras 2.x backend API

def categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())

It uses the K.argmax method to compare the index of the maximal true value with the index of the maximal predicted value. In other words, “how often predictions have maximum in the same spot as true values”.

First, we identify the index at which the maximum value occurs using argmax(). If it is the same for both yPred and yTrue, the prediction is considered accurate.

Because yTrue is one-hot encoded here, yTrue and yPred have the same shape: the number of entries by the number of classes, that is (n, c). The metric computes the mean accuracy rate across all predictions.

We then calculate Categorical Accuracy by dividing the number of accurately predicted records by the total number of records. Because Categorical Accuracy only looks at the index of the maximum value, yPred can be either logits or predicted probabilities.
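The steps above can be reproduced outside Keras. The following is a minimal NumPy sketch, not the Keras implementation; the function name categorical_accuracy_np and the sample scores are made up for illustration. It also shows that logits and softmax probabilities give the same accuracy, since softmax does not change the position of the maximum.

```python
import numpy as np

def categorical_accuracy_np(y_true, y_pred):
    """Fraction of rows where argmax(y_true) equals argmax(y_pred)."""
    true_idx = np.argmax(y_true, axis=-1)   # index of the 1 in each one-hot row
    pred_idx = np.argmax(y_pred, axis=-1)   # index of the highest score
    return float(np.mean(true_idx == pred_idx))

# Four samples, three classes; labels are one-hot encoded.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

# Raw scores (logits) from a hypothetical model; the third row is wrong.
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 1.5, 0.3],
                   [1.2, 0.4, 0.9],
                   [0.1, 3.0, 0.2]])

# Softmax turns logits into probabilities without moving the argmax.
exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)

acc_logits = categorical_accuracy_np(y_true, logits)
acc_probs = categorical_accuracy_np(y_true, probs)
print(acc_logits, acc_probs)  # prints 0.75 0.75
```

Three of the four rows have their maximum in the same position as the label, so the result is 3/4.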

Sparse Categorical Accuracy

sparse_categorical_accuracy is similar to categorical_accuracy but is used when the targets are sparse, meaning each label is stored as an integer class index instead of a one-hot vector. A good example is working with text in deep learning problems such as word2vec, where one works with thousands of classes with the aim of predicting the next word. One-hot encoding such labels would make yTrue a huge matrix that is almost all zeros, so storing only the class index is a natural fit.

def sparse_categorical_accuracy(y_true, y_pred):
    # y_true holds class indices, so K.max replaces K.argmax on the label side
    return K.cast(K.equal(K.max(y_true, axis=-1),
                          K.cast(K.argmax(y_pred, axis=-1), K.floatx())),
                  K.floatx())

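For sparse categorical metrics, the shapes of yTrue and yPred are different: yTrue has shape (n, 1), one integer class index per entry, while yPred has shape (n, c), one score per class. The metric checks whether the maximal true value is equal to the index of the maximal predicted value.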

In categorical_accuracy you need to specify your target (y) as a one-hot encoded vector (e.g. in the case of 3 classes, when the true class is the second class, y should be (0, 1, 0)). In sparse_categorical_accuracy you only need to provide an integer for the true class (in the previous example it would be 1, as class indexing is 0-based).

In sparse categorical accuracy, the label does not have to be a bare integer. You may provide an array of length one that holds only the class index, since Keras takes the max value of the array. You may also provide an array of any length (for example, three values); Keras will take the maximum value of that array and check whether it equals the index of the max value in yPred.
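To make this behaviour concrete, here is a minimal NumPy sketch that mirrors the max/argmax comparison described above. It is an illustration under assumed inputs, not the Keras code itself; sparse_categorical_accuracy_np and the sample arrays are hypothetical.

```python
import numpy as np

def sparse_categorical_accuracy_np(y_true, y_pred):
    """Compare the max over y_true's last axis with argmax(y_pred)."""
    true_idx = np.max(y_true, axis=-1)      # class index stored in each label row
    pred_idx = np.argmax(y_pred, axis=-1)   # index of the highest score
    return float(np.mean(true_idx == pred_idx))

# Predicted probabilities for four samples and three classes.
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.5, 0.2, 0.3],
                   [0.2, 0.6, 0.2]])

# Usual case: one integer class index per sample, shape (4, 1).
y_true = np.array([[0], [1], [2], [1]])
sparse_acc = sparse_categorical_accuracy_np(y_true, y_pred)

# Longer label rows are reduced to their maximum before the comparison,
# so these rows behave like the labels 0, 1, 2, 1 again.
y_true_wide = np.array([[0, 0], [0, 1], [1, 2], [1, 0]])
wide_acc = sparse_categorical_accuracy_np(y_true_wide, y_pred)

print(sparse_acc, wide_acc)
```

Passing longer label rows works only because of the max reduction; in practice a single class index per sample is the clearer choice.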

Categorical accuracy and sparse categorical accuracy compute the same thing; the only difference is the label format. If your Yi are one-hot encoded, use categorical_accuracy. Examples for 3-class classification: [1,0,0], [0,1,0], [0,0,1]. If your Yi are integers, use sparse_categorical_accuracy. Examples for the same 3-class problem, with 0-based class indices: [0], [1], [2].
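The two label formats carry the same information, which the short NumPy sketch below illustrates by converting between them and computing both accuracies on the same predictions. The arrays are invented for the example, and the accuracy expressions are simplified stand-ins for the Keras metrics.

```python
import numpy as np

num_classes = 3

# Integer (sparse) labels, 0-based, one per sample.
int_labels = np.array([0, 1, 2, 1])

# Integer labels -> one-hot rows, by picking rows of the identity matrix.
one_hot = np.eye(num_classes, dtype=int)[int_labels]

# One-hot rows -> integer labels, by taking the position of the 1.
recovered = np.argmax(one_hot, axis=-1)

# Hypothetical model output for the four samples.
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.5, 0.2, 0.3],
                   [0.2, 0.6, 0.2]])
pred_idx = np.argmax(y_pred, axis=-1)

# Both formats lead to the same comparison and therefore the same accuracy.
categorical_acc = float(np.mean(np.argmax(one_hot, axis=-1) == pred_idx))
sparse_acc = float(np.mean(int_labels == pred_idx))
print(categorical_acc, sparse_acc)
```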

The choice depends entirely on how your labels are loaded. One advantage of the sparse form (sparse_categorical_accuracy, usually paired with the sparse_categorical_crossentropy loss) is that it saves memory as well as computation, because it uses a single integer per label rather than a whole vector.
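The memory difference is easy to estimate. The sketch below assumes 1,000 samples, 1,000 classes, int32 class indices and float32 one-hot vectors; these sizes and dtypes are illustrative choices, not values from a real dataset.

```python
import numpy as np

num_samples = 1000
num_classes = 1000  # assumed number of classes for this illustration

# Sparse format: one 4-byte integer per sample.
int_labels = (np.arange(num_samples) % num_classes).astype(np.int32)
int_bytes = int_labels.nbytes

# One-hot format: one float32 per sample per class, almost all zeros.
one_hot = np.zeros((num_samples, num_classes), dtype=np.float32)
one_hot[np.arange(num_samples), int_labels] = 1.0
one_hot_bytes = one_hot.nbytes

ratio = one_hot_bytes // int_bytes
print(int_bytes, one_hot_bytes, ratio)
```

With these assumptions the one-hot array is larger by a factor equal to the number of classes, so the gap widens as the class count grows.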

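In short, if the classes are mutually exclusive and your labels are stored as integers, use sparse_categorical_accuracy instead of categorical_accuracy. It reports the same accuracy as categorical_accuracy would on the equivalent one-hot labels, while using less memory.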