1. Introduction

In machine learning, we often measure the accuracy of our models. But are we measuring it correctly?

In this tutorial, we’ll talk about the difference between Top-1 Accuracy and Top-N Accuracy, and why they’re important.

2. Top-1 Accuracy

Let’s say we have a model that classifies images of animals, and we show it an image of a cat. Under Top-1 Accuracy, the prediction counts as correct if and only if the class the model considers most probable is “cat”.

Let’s expand our example to several predictions:

[Table not available: five example images with the model’s predictions for each]

Given this example, our model correctly classified 3 of the 5 images, for an accuracy of 60%. As we can see, Top-1 Accuracy is simply what we usually mean when we talk about accuracy.
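To make the calculation concrete, here is a minimal sketch in plain Python. The function name top1_accuracy, the class names, and the probabilities are all hypothetical; they are not the values from the table above, only chosen so that 3 of 5 predictions come out correct.

```python
def top1_accuracy(probabilities, true_labels):
    """Fraction of samples whose most probable class equals the true label."""
    correct = 0
    for probs, label in zip(probabilities, true_labels):
        # The Top-1 prediction is the class with the highest probability.
        predicted = max(probs, key=probs.get)
        if predicted == label:
            correct += 1
    return correct / len(true_labels)


# Hypothetical model outputs: one {class: probability} dict per image.
probabilities = [
    {"cat": 0.7, "dog": 0.2, "bird": 0.1},
    {"cat": 0.5, "dog": 0.4, "bird": 0.1},
    {"cat": 0.1, "dog": 0.3, "bird": 0.6},
    {"cat": 0.2, "dog": 0.7, "bird": 0.1},
    {"cat": 0.3, "dog": 0.6, "bird": 0.1},
]
true_labels = ["cat", "dog", "bird", "dog", "cat"]

# Images 1, 3 and 4 are classified correctly, so this prints 0.6.
print(top1_accuracy(probabilities, true_labels))
```

Note that the second and fifth images count as wrong even though the true label is the model’s second choice; Top-1 Accuracy gives no credit for a near miss.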

3. Top-N Accuracy

Top-N Accuracy takes the N predictions to which the model assigns the highest probability. If any of them matches the true label, the prediction counts as correct. Top-1 Accuracy is the special case in which only the single most probable prediction is taken into account.

Let’s use the same example as before, assuming a Top-3 Accuracy:

[Table not available: the same five example images with the model’s 3 most probable predictions for each]

Now, using the 3 most probable predictions, we can see that the model correctly classified 4 of the 5 images, for a Top-3 Accuracy of 80%.
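The same idea can be sketched in a few lines of plain Python. As an assumption for illustration, the model output is a dict of class probabilities per image; the function name top_n_accuracy and all the data are hypothetical, chosen so that the results land on 60% for N=1 and 80% for N=3.

```python
def top_n_accuracy(probabilities, true_labels, n):
    """Fraction of samples whose true label is among the n most probable classes."""
    correct = 0
    for probs, label in zip(probabilities, true_labels):
        # Sort the classes from most to least probable and keep the first n.
        top_n = sorted(probs, key=probs.get, reverse=True)[:n]
        if label in top_n:
            correct += 1
    return correct / len(true_labels)


# Hypothetical model outputs over five classes, one dict per image.
probabilities = [
    {"cat": 0.60, "dog": 0.20, "bird": 0.10, "horse": 0.05, "fish": 0.05},
    {"cat": 0.40, "dog": 0.35, "bird": 0.15, "horse": 0.05, "fish": 0.05},
    {"cat": 0.10, "dog": 0.10, "bird": 0.70, "horse": 0.05, "fish": 0.05},
    {"cat": 0.10, "dog": 0.30, "bird": 0.05, "horse": 0.50, "fish": 0.05},
    {"cat": 0.30, "dog": 0.40, "bird": 0.20, "horse": 0.07, "fish": 0.03},
]
true_labels = ["cat", "dog", "bird", "horse", "fish"]

print(top_n_accuracy(probabilities, true_labels, n=1))  # 0.6
print(top_n_accuracy(probabilities, true_labels, n=3))  # 0.8
```

The second image is what moves the score: “dog” is only the model’s second choice, so it is a miss for N=1 but a hit for N=3. The fifth image stays wrong because “fish” is ranked last.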

Notice that for N > K, Top-N Accuracy ≥ Top-K Accuracy: any prediction that counts as correct within the top K also counts as correct within the top N. In other words, as N grows, Top-N Accuracy can only increase or stay the same.

This gives us insight into how our model behaves. For example, if the Top-1 Accuracy is very low, we might conclude that our model hasn’t learned much about the dataset. However, if the accuracy rises significantly as N increases, the model is actually learning: it ranks the true label near the top, but still needs some fine-tuning to rank it first. This is especially helpful for classification problems with a large number of classes.

Depending on the problem, Top-N Accuracy might also be the more appropriate metric. Take a recommendation system: whether it’s for videos, music, or online shops, we value novelty and diversity. As customers, we’re looking for new and diverse videos, music, or products. Therefore, the goal is not to find the single most relevant recommendation, but a set of interesting ones. It can be more valuable to have the best prediction somewhere among a set of interesting predictions than to have just one good prediction.
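The relation between Top-N and Top-K Accuracy is easy to check if we compute, for each sample, the rank of the true label among the model’s predictions. A sample is a Top-N hit exactly when that rank is at most N, so the accuracy can never drop as N grows. The sketch below assumes hypothetical data and helper names (true_label_rank, accuracy_curve); it is one way to do this, not a reference implementation.

```python
def true_label_rank(probs, label):
    """1-based position of the true label when classes are sorted by probability."""
    ranking = sorted(probs, key=probs.get, reverse=True)
    return ranking.index(label) + 1


def accuracy_curve(probabilities, true_labels, max_n):
    """Top-N Accuracy for every N from 1 to max_n."""
    ranks = [true_label_rank(p, y) for p, y in zip(probabilities, true_labels)]
    # A sample is a Top-N hit exactly when the rank of its true label is <= N.
    return [sum(r <= n for r in ranks) / len(ranks) for n in range(1, max_n + 1)]


# Hypothetical outputs for four images whose true labels rank 1st, 2nd, 3rd and 4th.
probabilities = [
    {"cat": 0.50, "dog": 0.30, "bird": 0.15, "fish": 0.05},
    {"cat": 0.50, "dog": 0.30, "bird": 0.15, "fish": 0.05},
    {"cat": 0.10, "dog": 0.20, "bird": 0.30, "fish": 0.40},
    {"cat": 0.40, "dog": 0.10, "bird": 0.30, "fish": 0.20},
]
true_labels = ["cat", "dog", "dog", "dog"]

curve = accuracy_curve(probabilities, true_labels, max_n=4)
print(curve)  # [0.25, 0.5, 0.75, 1.0]
```

Here the Top-1 Accuracy is only 25%, yet the curve climbs steadily with N. That is the pattern described above: the model often places the true label close to the top without ranking it first.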

4. Conclusion

There are several ways to measure how good a model is, and it’s important to pick the one that best fits the problem at hand. In this tutorial, we’ve seen the difference between Top-1 Accuracy and Top-N Accuracy, how Top-N Accuracy can be the better fit for certain problems, and how comparing the two gives us a better understanding of our model.
