Table of contents
Open Table of contents
What it is
Poly1305 is a universal hash family that was invented by Daniel J. Bernstein and can be used for one-time message authentication codes (MAC).
The design of Poly1305 is optimized to take advantage of modern CPU architectures. It is therefore widely used in environments where hardware acceleration (e.g. via specific CPU instructions) is not available.
Bernstein proposed Poly1305 in his paper ”The Poly1305-AES message-authentication code” as an alternative MAC for AES. Since then, Poly1305 was also used in combination with other symmetric ciphers, most notably with ChaCha20 to create an Authenticated Encryption with Associated Data (AEAD) construction called ChaCha20-Poly1305.
This AEAD algorithm got standardized in RFC 8439 which also includes a specification of Poly1305 that we’ll base this writeup on.
How it works
Poly1305 can be used to generate a MAC for any arbitrary-length message using a per-message 32-byte key.
The first step that has to occur is to split the key into two subkeys and . is derived from the main key by taking its first 16-bytes, while is defined as the last 16-bytes.
The subkey is furthermore manipulated via an operation called “clamping” which clears the top four bits and bottom two bits. Doing so ensures that the top four bits are always smaller than 16, while the bottom two bits are divisible by 4. Without going to deep into the technical aspects, the key takeaway is that this manipulation is done to optimize for very fast implementations that exploit the underlying hardware to ensure the most efficient code execution possible.
In pseudocode this would look something like the following (where &=
is the bitwise AND
operator):
r[3] &= 15;
r[7] &= 15;
r[11] &= 15;
r[15] &= 15;
r[4] &= 252;
r[8] &= 252;
r[12] &= 252;
Or as a one-liner:
r &= 0x0ffffffc0ffffffc0ffffffc0fffffff
With the derivation of the subkeys and from the main key we can now take a look how the arbitrarily long message is prepared for the generation of a corresponding MAC via Poly1305.
Poly1305 operates on 17-byte blocks which means the message needs to be split up into blocks which are then fed into Poly1305 for processing.
Rather than splitting the message into 17-byte blocks, it’s split into 16-byte blocks first.
Next up, the bit is appended to every block which results in the block now being 17-bytes long.
The last block is furthermore padded with if it’s shorter than 17-bytes.
These block-wide manipulations are once again done to exploit the underlying machine’s hardware to ensure that the Poly1305 algorithm can be executed as fast as possible. It also increases the security. We’ll take a closer look why that is later.
To summarize, here’s a diagram that shows all the individual steps that are performed to turn an arbitrarily long message into processable message blocks.
With these preparations we’re now able to produce the message authentication code.
At first, an accumulator with the value is defined.
Next up, we iterate over every block and interpret its bits as a number defined in little endian order. This number is then added to the accumulator.
The accumulator value is then re-defined as the current accumulator value (the most recent addition we just did) multiplied by the clamped value that’s also interpreted as a little endian number modulo the prime (see the numbers and ? That’s where Poly1305 got its name from).
Once all the blocks were processed, we add (also interpreted as a little endian number) to the accumulator to derive its final value.
The MAC of the message is the 128 least significant bits (16-bytes) of the accumulator.
In pseudocode this would look like the following:
accumulator = 0
for (block in blocks) {
accumulator += blockToNumber(block)
accumulator = (clamp(r) * accumulator) % P
}
accumulator += s
mac = numberToBytes(accumulator)[0:16]
Poly1305 can also be expressed mathematically as:
Where is defined as the clamped value of .
Why it works
Poly1305 uses a universal hash function that’s fast, but not cryptographically secure. In fact, the only requirement is that two different messages that were hashed with this universal hash using the same key shouldn’t have the same hash value (i.e. ).
More specifically, Poly1305 is based on a polynomial evaluation hash of the form:
As one can see, the subkey is added to finalize the MAC. This is done to ensure that root-finding methods that could be used to extract from the polynomial evaluation become infeasible. It is therefore important that as otherwise the polynomial would only be evaluated at which makes it trivial to find the value which would then allow for the creation of forged MACs for any message.
As briefly mentioned in the beginning, Poly1305 is a one-time message authentication code. The key which is used to derive and should under no circumstance ever be reused.
Why that is is easy to see if we remember that should never be . An attacker could simply observe two MACs that were generated with the same key. Those two MACs can then be subtracted from each other to eliminate the value. A polynomial root-finding method can then be used to find and once is found, it’s easy to find the correct . Both and values can then be used to forge MACs for arbitrary messages.
Given these limitations of Poly1305, it’s most often used in an Authenticated Encryption with Associated Data (AEAD) construction like ChaCha20-Poly1305 where the key for Poly1305 is derived via a long-term key and nonce that’s used for the symmetric cipher (ChaCha20 in this case). One therefore doesn’t have to think about the Poly1305 key at all. This construction is called a Wegman-Carter Construction and is used in the aforementioned ChaCha20-Poly1305 algorithm.
The final thing that’s important to highlight is the extension of the message blocks by adding the bit number to every block to turn such a 16-byte block into a 17-byte block.
Adding this bit to the end of each block increases security as it ensures that the message length will always have an impact on the generated MAC. It’s essentially used to “detect trailing zeros” and ensures that there’s never an all-zero block. The value 0xCAFFE
, for example should have a different MAC than 0xCAFFE00
which will be ensured if the bit is appended.
References
The following resources have been invaluable for me to learn the concepts discussed in this article.
You should definitely give them a read if you want to dive deeper into the topic.
- Wikipedia - Poly1305
- Loup Vaillant - The Design of Poly1305
- Filippo Valsorda - A Go implementation of Poly1305 that makes sense
- Daniel J. Bernstein - A State-of-The-Art Message-Authentication Code
- Daniel J. Bernstein - The Poly1305-AES Message-Authentication Code
- RFC 7539 - ChaCha20 and Poly1305 for IETF Protocols
- RFC 8439 - ChaCha20 and Poly1305 for IETF Protocols
- Andrea Corbellini - Authenticated encryption: why you need it and how it works