Hash vs. Message Authentication Code | Baeldung on Computer Science

1. Introduction

Cryptography is the practice of securing information and communication using algorithms and mathematical rules. It prevents third parties from reading a private message or tampering with it undetected. Modern cryptography relies on a number of primitives, including hashes, Message Authentication Codes (MACs), and digital signatures.

In this tutorial, we’ll learn about hash and MAC functions and the differences between them. First, we’ll provide a technical and conceptual comparison of both functions. Second, we’ll present HMAC, a technique that combines the two. Then, we’ll provide examples and use cases.

2. Hash and MAC: Main Differences

Let’s start with a technical comparison of how the two processes work. Hashing is a one-way process, not encryption: it maps a message of arbitrary size to a fixed-size output called a digest or a hash, and no key exists that turns the digest back into the message. A hash function is also deterministic, so the same input always produces the same digest, and the digest has the same size regardless of the input size. When we say a hash function, we typically refer to a cryptographic hash function. Examples of hashing algorithms are MD5, SHA-1, the SHA-2 family (which includes SHA-256), and SHA-3. Of these, MD5 and SHA-1 are no longer considered collision-resistant. So, a hash is a public, deterministic function that takes, as input, a single message composed of a sequence of bits.
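To make these properties concrete, here is a minimal sketch using Python’s standard hashlib module. The choice of SHA-256 and the helper name sha256_hex are assumptions made for illustration, not something the comparison above prescribes.

```python
import hashlib


def sha256_hex(message: bytes) -> str:
    """Return the SHA-256 digest of `message` as a hex string (illustrative helper)."""
    return hashlib.sha256(message).hexdigest()


short_digest = sha256_hex(b"hello")
long_digest = sha256_hex(b"hello" * 10_000)
changed_digest = sha256_hex(b"hellp")  # one character differs from b"hello"

# Fixed size: 256 bits are 64 hex characters, whatever the input length
print(len(short_digest), len(long_digest))

# Deterministic: hashing the same input again gives the same digest
print(short_digest == sha256_hex(b"hello"))

# Any change in the message gives a different digest
print(short_digest != changed_digest)
```

No key is involved anywhere, which is why anyone can recompute the digest of any message.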

A MAC, in turn, is computed from a message and a symmetric key shared by the sender and the receiver. Its output is called a tag, and a MAC that uses a cryptographic hash as part of its algorithm is often called a keyed hash function. Not every MAC is hash-based: popular examples are CBC-MAC (built on a block cipher such as DES), UMAC, and HMAC. Essentially, a MAC is an algorithm that takes, as input, a message combined with a shared secret key.

Let’s continue with a conceptual comparison where we define the security goals of both processes. Hash functions are used to check data integrity: any change in the original message results in a different hash. On its own, though, a hash doesn’t authenticate anything, since anyone who alters the message can also recompute the digest. In addition, given only a digest, an attacker typically has no practical way to recover the original message.

Meanwhile, MACs provide both data integrity and authentication. Any change in the message or the key results in a different MAC. Without the secret key, it is computationally infeasible for an attacker to produce a valid MAC for a new or altered message. The accompanying figure illustrates the MAC process: on receiving the original message and the MAC, the receiver computes its own MAC using the same shared key and checks whether the received MAC equals the calculated one.
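The sender and receiver steps can be sketched with Python’s standard hmac module. This is a sketch under stated assumptions: HMAC-SHA256 stands in for the MAC algorithm, and the key, the messages, and the helper names compute_tag and verify_tag are hypothetical.

```python
import hashlib
import hmac


def compute_tag(key: bytes, message: bytes) -> bytes:
    """Compute a MAC tag over `message` with the shared `key` (HMAC-SHA256 assumed)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare it to the received one."""
    expected_tag = compute_tag(key, message)
    # compare_digest runs in constant time, which avoids leaking
    # how many leading bytes of the tag matched
    return hmac.compare_digest(expected_tag, received_tag)


shared_key = b"example-shared-secret"  # hypothetical key, for illustration only
message = b"transfer 100 to account 42"
tag = compute_tag(shared_key, message)  # sender attaches this tag to the message

genuine_ok = verify_tag(shared_key, message, tag)
tampered_ok = verify_tag(shared_key, b"transfer 900 to account 42", tag)
wrong_key_ok = verify_tag(b"some-other-key", message, tag)

print(genuine_ok)    # True
print(tampered_ok)   # False
print(wrong_key_ok)  # False
```

Unlike a plain hash, an attacker who changes the message cannot recompute a matching tag without the shared key.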

3. HMAC, a Combination of Hash and MAC

HMAC stands for Hash-based Message Authentication Code. It is an authentication technique that combines a hash function and a secret key. Each variant is named after the hash function used to calculate the MAC, for example HMAC_MD5, HMAC_SHA1, and HMAC_SHA256.

HMAC derives two keys from the main secret key, let’s say K1 and K2, and performs two rounds of hashing. The first round generates an internal hash from the first key K1 and the original message. The second round creates the final HMAC code from the second key K2 and the resulting internal hash. The receiver computes its own HMAC the same way as the sender and compares it to the received HMAC to verify the authenticity and integrity of the message, as shown in the accompanying figure.
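The two rounds can be written out by hand and checked against the standard library. The sketch below follows the HMAC construction from RFC 2104 with SHA-256; reading K1 and K2 as the key XORed with the RFC’s inner and outer padding constants is our interpretation of the description above, and hmac_sha256_manual is a hypothetical name.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes input in 64-byte blocks


def hmac_sha256_manual(key: bytes, message: bytes) -> bytes:
    """Hand-written HMAC-SHA256, for illustration only (use the hmac module in practice)."""
    # Keys longer than one block are hashed first; shorter keys are zero-padded
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")

    k1 = bytes(b ^ 0x36 for b in key)  # K1: key XOR inner pad
    k2 = bytes(b ^ 0x5C for b in key)  # K2: key XOR outer pad

    # Round 1: internal hash over K1 followed by the message
    inner_hash = hashlib.sha256(k1 + message).digest()
    # Round 2: final code over K2 followed by the internal hash
    return hashlib.sha256(k2 + inner_hash).digest()


key = b"example-shared-secret"  # hypothetical key
message = b"hello, HMAC"

manual_tag = hmac_sha256_manual(key, message)
library_tag = hmac.new(key, message, hashlib.sha256).digest()
print(manual_tag == library_tag)  # True
```

The hand-written version is only there to show the structure; production code should call the hmac module directly.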

4. Use Cases

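Hash functions exhibit a number of properties that help define their applications. For example, they are helpful for storing passwords safely, thanks to their non-reversibility. If an attacker gains access to a user database, it is extremely difficult for them to retrieve a password from its hash. In practice, this holds only when each password is hashed with a unique salt and a deliberately slow algorithm, since fast, unsalted hashes of common passwords can be guessed by brute force.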
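As a minimal sketch of password storage, the example below uses PBKDF2 from Python’s hashlib, a salted and deliberately slow construction built on a hash function, rather than a single plain hash. The iteration count, salt length, and function names are illustrative assumptions, not recommendations from this article.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # illustrative work factor; tune it for real deployments


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


# Only the salt and the digest are stored, never the password itself
salt, stored = hash_password("correct horse battery staple")

right_ok = check_password("correct horse battery staple", salt, stored)
wrong_ok = check_password("password123", salt, stored)
print(right_ok)  # True
print(wrong_ok)  # False
```

Because each user gets a different salt, two users with the same password end up with different stored digests.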

Also, we mentioned earlier that hashes are deterministic functions. That is why they are useful for identifying files, particularly when distributing software. For instance, when downloading a Linux distribution, we can check whether the file has been modified or corrupted by computing its digest and comparing it to the digest published for the original file.
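A download check of this kind might look like the sketch below. It assumes the publisher lists a SHA-256 digest in hex; the function names are hypothetical, and a small temporary file stands in for the downloaded image.

```python
import hashlib
import os
import tempfile


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the one listed by the publisher."""
    return file_sha256(path) == published_hex.strip().lower()


# Demo: a temporary file plays the role of the downloaded image
content = b"pretend this is a Linux ISO image"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)
    demo_path = tmp.name

published = hashlib.sha256(content).hexdigest()  # what the publisher would list
intact_ok = matches_published_digest(demo_path, published)

with open(demo_path, "ab") as f:
    f.write(b"!")  # simulate corruption or tampering
modified_ok = matches_published_digest(demo_path, published)

os.remove(demo_path)
print(intact_ok)    # True
print(modified_ok)  # False
```

This check only detects changes if the published digest itself comes from a trusted source.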

In cryptography, a collision occurs when two different input messages produce the same hash output. A cryptographic hash function must have strong collision resistance, meaning it is hard to find two different input messages that generate the same hash output. This property is what allows other cryptographic primitives, such as digital signatures, to operate on the hash of a message rather than on the original message.

As for MACs, they are not commonly employed on their own. Instead, they are often integrated into an encryption scheme, forming what we call AEAD (Authenticated Encryption with Associated Data) schemes. They can also be built on hash functions, creating algorithms such as HMAC_SHA1 and HMAC_MD5.

As a real-life application, Message Authentication Codes are used in financial cryptography. More specifically, they are used when creating financial accounts in institutions such as banks and insurance companies. Also, electronic financial transfers (EFTs) frequently use MACs to preserve information integrity. A MAC verifies a message’s authenticity: that it truly originates from the legitimate sender and wasn’t altered in transit.

The following table contains the main differences between the two cryptographic primitives, Hash and MAC:

5. Conclusion

In this article, we covered Hash and MAC, the differences between them, and defined HMAC as a combination of both.
