This Blog Post assumes some familiarity with Hash Functions. If you want to brush up on them, you can check out my introduction to Hash Functions.
A Message Authentication Code (abbreviated as MAC) is a cryptographic primitive that protects the integrity of data. It essentially ensures that data that's sent over a channel hasn't been tampered with.
Intuitively one can think of a MAC as a Hash + Secret Key. Knowing the secret key one can check that the data is still intact. It's important to note that while a secret key is used, a MAC doesn't ensure confidentiality by performing data encryption!
A Message Authentication Code takes two inputs, namely a Secret Key and a Message and produces one output called the Authentication Tag.
A MAC's output is generated deterministically, meaning that the same combination of Secret Key and Message will always produce the same Authentication Tag.
Authentication Tags are usually sent alongside the message they were generated for.
To validate that the message wasn't modified the recipient would re-compute the Authentication Tag with the Message and Secret Key and check the output against the Authentication Tag that was transmitted.
If both Authentication Tags are equal, the message was transmitted unmodified.
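To make this flow concrete, here is a minimal sketch in Python. It assumes HMAC-SHA-256 from the standard library as the MAC; the helper names `create_tag` and `verify_tag` as well as the example message are made up for illustration.

```python
import hashlib
import hmac
import secrets


def create_tag(key: bytes, message: bytes) -> bytes:
    """Compute the Authentication Tag for a message (HMAC-SHA-256 assumed)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Re-compute the tag and compare it with the transmitted one."""
    expected = create_tag(key, message)
    return hmac.compare_digest(expected, tag)


# Sender side: both parties are assumed to already share this key.
key = secrets.token_bytes(32)
message = b"transfer 10 EUR to Alice"
tag = create_tag(key, message)

# Recipient side: the check passes for the original message only.
print(verify_tag(key, message, tag))
print(verify_tag(key, b"transfer 1000 EUR to Alice", tag))
```

Because the MAC is deterministic, calling `create_tag` twice with the same key and message yields the same tag, which is what makes the recipient's re-computation possible.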
Note that a MAC requires that both the sender and the receiver share the same secret key. We'll learn in an upcoming Blog Post about Key Exchanges and how we can ensure that both parties agree on the same key even if they'll never meet in person.
The Secret Key used in a MAC construction should be random, used only once and long enough (at least 128 bits) to ensure that the Authentication Tag can't be forged.
In addition to that, one should make sure to use an unambiguous serialization scheme if an Authentication Tag is generated for structured data. Otherwise, different combinations of fields could serialize to the same bytes and forgeries might be trivial.
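The following sketch shows why serialization matters. It assumes HMAC-SHA-256 and a hypothetical message made of two fields (a recipient and an amount); the function names and the length-prefix scheme are illustrative choices, not a standard.

```python
import hashlib
import hmac

KEY = b"\x00" * 32  # fixed demo key, never use a constant key in practice


def naive_tag(key: bytes, fields: list[bytes]) -> bytes:
    """Ambiguous: simply concatenates all fields before authenticating."""
    return hmac.new(key, b"".join(fields), hashlib.sha256).digest()


def length_prefixed_tag(key: bytes, fields: list[bytes]) -> bytes:
    """Unambiguous: every field is preceded by its length (8 bytes, big-endian)."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for field in fields:
        mac.update(len(field).to_bytes(8, "big"))
        mac.update(field)
    return mac.digest()


a = [b"alice", b"100"]
b = [b"alice1", b"00"]

# Both field lists concatenate to b"alice100", so the naive tags collide.
print(naive_tag(KEY, a) == naive_tag(KEY, b))
# With length prefixes the two inputs are distinguishable.
print(length_prefixed_tag(KEY, a) == length_prefixed_tag(KEY, b))
```

In the naive variant a tag issued for one pair of fields is also valid for the other, so an attacker gets a forgery without ever touching the key.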
Authentication Tag Length
Given that Authentication Tags are generated via Hash Functions they're susceptible to Birthday Attacks in which a malicious party tries to find a collision, meaning that the output of MAC(Key, Input A) equals that of MAC(Key, Input B).
Generally speaking, collisions aren't that exploitable in practice and the consensus is that Authentication Tags with a length of at least 128 bits are considered secure: as per the Birthday Bound, an attacker likely finds a collision only after about 2^(128/2) = 2^64 tries.
If you read my Blog Post on Hash Functions you might remember that the output of a Hash Function should be at least 256 bits long to ensure that collisions are only found after about 2^(256/2) = 2^128 operations, which implies 128-bit security. Why are we fine with 64-bit security for Authentication Tags? The reason is that the malicious actor needs to query an "Oracle" of sorts that computes a valid Authentication Tag for the given message (remember that only the Oracle has the Secret Key to create a tag). This means that the attacker can't compute Authentication Tags offline, which would be much faster than querying an Oracle to do the computation for them.
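To get a feeling for the Birthday Bound, the sketch below searches for a collision on deliberately tiny tags. It assumes HMAC-SHA-256 truncated to 16 bits, a toy size chosen only so the search finishes instantly; the key, the helper names and the counter-based messages are all made up for this demo.

```python
import hashlib
import hmac

KEY = b"\x01" * 32  # fixed demo key so the search is reproducible
TAG_BYTES = 2       # 16-bit tags: a toy size, far too short for real use


def short_tag(message: bytes) -> bytes:
    """HMAC-SHA-256 truncated to TAG_BYTES bytes."""
    return hmac.new(KEY, message, hashlib.sha256).digest()[:TAG_BYTES]


def find_collision() -> tuple[bytes, bytes, int]:
    """Tag counter-based messages until two of them share a tag."""
    seen: dict[bytes, bytes] = {}
    counter = 0
    while True:
        message = counter.to_bytes(8, "big")
        tag = short_tag(message)
        if tag in seen:
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1


first, second, tries = find_collision()
print(f"collision after {tries} tags (2^(16/2) = {2 ** 8})")
```

For 16-bit tags a collision typically shows up after a few hundred tags, in the order of 2^(16/2) = 256, rather than after 2^16. Note that this demo holds the key itself. A real attacker would have to ask the Oracle for every single one of those tags, which is exactly what makes 2^64 queries impractical.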
Constant Time Evaluation
Checking an authentication tag for validity should be done in constant time to ensure that an attacker can't measure how long it took until the check rejected a wrong Authentication Tag.
To outline why this is necessary, imagine a naive implementation of a check that starts with the first character of the Authentication Tag and returns an error as soon as it finds a mismatch. The longer this check takes, the more characters are valid. An attacker could use this function to slowly re-create a valid tag one character at a time.
This type of attack is called a Side-Channel Attack and isn't specific to Message Authentication Codes. It's a very common tool used to attack cryptographic constructions.
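The difference between the two kinds of checks can be sketched as follows. `naive_equal` is a hypothetical example of the leaky comparison described above, while `constant_time_equal` simply wraps `hmac.compare_digest` from Python's standard library, which is designed for exactly this purpose.

```python
import hmac


def naive_equal(expected: bytes, received: bytes) -> bool:
    """Leaky comparison: returns as soon as the first mismatching byte is found."""
    if len(expected) != len(received):
        return False
    for x, y in zip(expected, received):
        if x != y:
            return False  # early exit leaks how many leading bytes matched
    return True


def constant_time_equal(expected: bytes, received: bytes) -> bool:
    """Comparison whose running time doesn't depend on where the inputs differ."""
    return hmac.compare_digest(expected, received)


valid = bytes.fromhex("aabbccdd")
guess = bytes.fromhex("aabb0000")
print(naive_equal(valid, guess), constant_time_equal(valid, guess))
```

Both functions return the same result. Only the timing behaviour differs, which is why the naive version looks fine in tests and still leaks information in production.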
Having access to a valid Authentication Tag doesn't prevent anyone who has access to the message and the tag from resending it. It's valid after all.
Depending on the application in which MACs are used it might be necessary to forbid the resending of already sent / processed messages. Imagine a banking application where transferring money twice just by re-sending the message would be catastrophic.
To protect against such replay attacks one can use a nonce (number only used once) or counter that's auto-incremented. The recipient would need to store some state with the most recent number it saw and would reject any message with a number it already processed.
Note that one runs out of numbers eventually. This is especially true when using a counter whose value might "wrap around". To solve this issue a key rotation (agreeing on a new key) could be done in which case the counter would be reset to 0.
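A counter-based replay protection could look like the sketch below. It assumes HMAC-SHA-256 and a counter that is authenticated together with the message as a fixed-width 8-byte prefix; the `Receiver` class and the `make_tag` helper are hypothetical names used for illustration.

```python
import hashlib
import hmac


def make_tag(key: bytes, counter: int, message: bytes) -> bytes:
    """Authenticate the counter together with the message."""
    data = counter.to_bytes(8, "big") + message  # fixed-width counter prefix
    return hmac.new(key, data, hashlib.sha256).digest()


class Receiver:
    def __init__(self, key: bytes) -> None:
        self.key = key
        self.last_counter = -1  # state: highest counter accepted so far

    def accept(self, counter: int, message: bytes, tag: bytes) -> bool:
        expected = make_tag(self.key, counter, message)
        if not hmac.compare_digest(expected, tag):
            return False  # forged or corrupted message
        if counter <= self.last_counter:
            return False  # valid tag, but this number was already processed
        self.last_counter = counter
        return True


key = b"\x02" * 32  # fixed demo key
receiver = Receiver(key)
msg = b"transfer 10 EUR to Alice"
tag = make_tag(key, 0, msg)
print(receiver.accept(0, msg, tag))  # first delivery
print(receiver.accept(0, msg, tag))  # replay of the same message
```

Since the counter is part of the authenticated data, an attacker can't simply bump the number on a captured message without invalidating its tag.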
MAC Use Cases
MACs can be used for more than message authentication. The following is an overview of other problems MACs can be applied to.
Message Authentication and Integrity are the primary applications MACs are used for. With Message Authentication Codes we can ensure that the received data wasn't manipulated in transit.
MACs can be applied to both plaintexts and ciphertexts, depending on the need. It's worth highlighting that even though MACs use a secret key they won't encrypt data and therefore don't provide confidentiality!
Given that MAC constructions take a secret key as an input and produce a random-looking output in a deterministic way, they can be used to derive other, secret keys.
The idea here is that the main secret key is used to "extract" a random output which is then "expanded" into other keys. A popular standard that does exactly that is HKDF, a key derivation function that's formally defined in RFC 5869.
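The "extract" and "expand" steps can be sketched with HMAC-SHA-256 as follows. This is a simplified reading of RFC 5869 meant for illustration, and the input values at the bottom are made up. For real applications a vetted library implementation is the better choice.

```python
import hashlib
import hmac

HASH_LEN = 32  # output size of SHA-256 in bytes


def hkdf_extract(salt: bytes, input_key: bytes) -> bytes:
    """Extract: condense the input key material into a pseudorandom key."""
    if not salt:
        salt = bytes(HASH_LEN)  # an all-zero salt is used if none is given
    return hmac.new(salt, input_key, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Expand: stretch the pseudorandom key into `length` output bytes."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long")
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


prk = hkdf_extract(b"demo salt", b"main secret key")
encryption_key = hkdf_expand(prk, b"encryption", 32)
mac_key = hkdf_expand(prk, b"authentication", 32)
print(encryption_key != mac_key)
```

The `info` parameter is what separates the derived keys from each other: the same main secret yields independent-looking keys for different purposes.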
Deterministic Random Number Generation
A MAC implementation can act as a pseudorandom number generator that takes a seed (the secret key) and produces randomness deterministically.
While most MACs can be used for this purpose, not all MACs are pseudorandom functions and therefore shouldn't be used as a random number generator for cryptographic applications.
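As a rough sketch of the idea, the function below runs HMAC-SHA-256 over an incrementing counter with the seed as the key. This is a toy construction to illustrate the principle and not a vetted generator. Standardized designs such as HMAC_DRBG from NIST SP 800-90A are considerably more involved. The name `deterministic_bytes` is made up.

```python
import hashlib
import hmac


def deterministic_bytes(seed: bytes, length: int) -> bytes:
    """Produce `length` pseudorandom bytes from `seed` (toy counter-mode sketch)."""
    output = b""
    counter = 0
    while len(output) < length:
        block = hmac.new(seed, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


stream_a = deterministic_bytes(b"seed one", 48)
stream_b = deterministic_bytes(b"seed one", 48)
stream_c = deterministic_bytes(b"seed two", 48)
print(stream_a == stream_b, stream_a == stream_c)
```

The same seed always reproduces the same byte stream, while a different seed yields an unrelated-looking one.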
There are many different MAC implementations one can use in practice. However the most widely used MACs out there are HMAC (based on SHA-2) and KMAC (based on SHA-3).
HMAC (Hash-based Message Authentication Code) is a MAC construction that combines a hash function, typically one from the SHA-2 family, with a secret key.
An HMAC is created as follows:
Derive a key k1 from the secret key k with an "inner padding" (a constant)
Derive a key k2 from the secret key k with an "outer padding" (a constant)
Hash the concatenation of k1 with the message m to get an output O
Concatenate the key k2 with the output O and hash it to generate the Authentication Tag
In summary, HMAC looks like this
tag = SHA-2(k2 || SHA-2(k1 || m))
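The construction can be re-built by hand to see that there's no magic involved. The sketch below assumes SHA-256 as the SHA-2 variant and follows RFC 2104, where k1 and k2 are obtained by XORing the zero-padded key with the constants 0x36 and 0x5c. The function name `manual_hmac_sha256` is made up, and in practice one would always use the `hmac` module directly.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # block size of SHA-256 in bytes


def manual_hmac_sha256(key: bytes, message: bytes) -> bytes:
    """Hand-written HMAC-SHA-256, for illustration only."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()  # overly long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")    # pad the key to one full block
    k1 = bytes(b ^ 0x36 for b in key)       # key XOR inner padding
    k2 = bytes(b ^ 0x5C for b in key)       # key XOR outer padding
    output = hashlib.sha256(k1 + message).digest()
    return hashlib.sha256(k2 + output).digest()


key = b"my secret key"
message = b"hello world"
reference = hmac.new(key, message, hashlib.sha256).digest()
print(manual_hmac_sha256(key, message) == reference)
```

Comparing the result against the standard library is a cheap sanity check that the four steps were implemented in the right order.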
Given that HMAC uses SHA-2 under the hood, the length of the Authentication Tag depends on the SHA-2 hash function that's used. In practice, tags can be truncated based on the application's needs. However, always keep collisions in mind and ensure that the truncated tag is still at least 128 bits long.
KMAC (Keccak Message Authentication Code) is a MAC implementation based on cSHAKE, which is an extendable output function (XOF) that allows for variable-length outputs. cSHAKE in turn is based on SHA-3 which is an "optimized" Keccak hash implementation (that's where the K in KMAC comes from).
One can think of KMAC as a wrapper around cSHAKE. The cSHAKE function takes four inputs:
The main input
The requested output length
A function name (usually defined by NIST)
A customization string
To create an Authentication Tag, an encoding of the secret key, message and output length is created: encode(key, message, output length). This encoding is used as the main input. The function name parameter is set to "KMAC".
Next up, all inputs are passed into the cSHAKE function to generate an Authentication Tag.
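To give an idea of what the encoding step looks like, here is a sketch of the helper functions as I understand them from NIST SP 800-185, assuming KMAC128 (which uses a rate of 168 bytes). Python's standard library doesn't ship cSHAKE, so the sketch stops at building the main input. The name `kmac_main_input` is made up, and the final cSHAKE call is only described in a comment.

```python
KMAC128_RATE = 168  # rate of cSHAKE128 in bytes


def left_encode(x: int) -> bytes:
    """Big-endian encoding of x, preceded by its length in bytes."""
    n = max(1, (x.bit_length() + 7) // 8)
    return bytes([n]) + x.to_bytes(n, "big")


def right_encode(x: int) -> bytes:
    """Big-endian encoding of x, followed by its length in bytes."""
    n = max(1, (x.bit_length() + 7) // 8)
    return x.to_bytes(n, "big") + bytes([n])


def encode_string(s: bytes) -> bytes:
    """Prefix a byte string with its length in bits."""
    return left_encode(len(s) * 8) + s


def bytepad(data: bytes, width: int) -> bytes:
    """Prefix data with the width and zero-pad it to a multiple of width bytes."""
    padded = left_encode(width) + data
    return padded + bytes(-len(padded) % width)


def kmac_main_input(key: bytes, message: bytes, output_bits: int) -> bytes:
    """Build the main input: padded key, then message, then output length."""
    return bytepad(encode_string(key), KMAC128_RATE) + message + right_encode(output_bits)


main_input = kmac_main_input(b"\x03" * 32, b"hello", 256)
# This value would then be passed to cSHAKE128 together with the
# function name "KMAC" and an optional customization string.
print(len(main_input))
```

Encoding the lengths of the key and the output alongside the message is what keeps the input unambiguous, the same concern that motivates serialization schemes for structured data.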
If you want to dive deeper into the topics discussed in this blog post you might want to take a look at the following resources: