Generic hashing
This API transforms an arbitrarily long input into a fixed-length fingerprint, and offers a lot of flexibility regarding the key and output sizes.
It accepts an optional key of hydro_hash_KEYBYTES bytes, and can produce a result of any size from 128 bits to 524,280 bits.
#define CONTEXT "Examples"
#define MESSAGE "Arbitrary data to hash"
#define MESSAGE_LEN 22
uint8_t hash[hydro_hash_BYTES];
hydro_hash_hash(hash, sizeof hash, MESSAGE, MESSAGE_LEN, CONTEXT, NULL);
#define CONTEXT "Examples"
#define MESSAGE "Arbitrary data to hash"
#define MESSAGE_LEN 22
uint8_t hash[hydro_hash_BYTES];
uint8_t key[hydro_hash_KEYBYTES];
hydro_hash_keygen(key);
hydro_hash_hash(hash, sizeof hash, MESSAGE, MESSAGE_LEN, CONTEXT, key);
#define CONTEXT "Examples"
#define MESSAGE_PART1 "Arbitrary data to hash"
#define MESSAGE_PART1_LEN 22
#define MESSAGE_PART2 "is longer than expected"
#define MESSAGE_PART2_LEN 23
uint8_t hash[hydro_hash_BYTES];
uint8_t key[hydro_hash_KEYBYTES];
hydro_hash_state state;
hydro_hash_keygen(key);
hydro_hash_init(&state, CONTEXT, key);
hydro_hash_update(&state, MESSAGE_PART1, MESSAGE_PART1_LEN);
hydro_hash_update(&state, MESSAGE_PART2, MESSAGE_PART2_LEN);
hydro_hash_final(&state, hash, sizeof hash);
This API computes a fixed-length fingerprint for an arbitrarily long message.
Sample use cases:
- File integrity checking
- Creating unique identifiers to index arbitrarily long data.
This API requires a context. Refer to the documentation on contexts for more information.
void hydro_hash_keygen(uint8_t *key);
The hydro_hash_keygen() function creates a secret key suitable for use with the hydro_hash_* API.
int hydro_hash_hash(uint8_t *out, size_t out_len, const void *in_,
size_t in_len, const char ctx[hydro_hash_CONTEXTBYTES], const uint8_t *key);
The hydro_hash_hash() function puts a fingerprint of the message in_, whose length is in_len bytes, into out.
The output size can be chosen by the application.
The minimum recommended output size is hydro_hash_BYTES. This size makes it practically impossible for two messages to produce the same fingerprint.
But for specific use cases, the size can be any value between hydro_hash_BYTES_MIN (included) and hydro_hash_BYTES_MAX (included).
key can be NULL. In this case, a message will always have the same fingerprint, similar to the MD5 or SHA-1 functions, for which hydro_hash_hash() is a faster and more secure alternative.
But a key can also be specified. A message will always have the same fingerprint for a given key, but different keys used to hash the same message are very likely to produce distinct fingerprints.
In particular, the key can be used to make sure that different applications generate different fingerprints even if they process the same data.
The key size is hydro_hash_KEYBYTES bytes.
int hydro_hash_init(hydro_hash_state *state,
const char ctx[hydro_hash_CONTEXTBYTES], const uint8_t *key);
int hydro_hash_update(hydro_hash_state *state, const void *in_, size_t in_len);
int hydro_hash_final(hydro_hash_state *state, uint8_t *out, size_t out_len);
The message doesn't have to be provided as a single chunk. The hash operation also supports a streaming API.
The hydro_hash_init() function initializes a state state with a context ctx and a key key (which can be NULL). The output length is not chosen at this point; it is passed to hydro_hash_final().
Each chunk of the complete message can then be sequentially processed by calling hydro_hash_update(), providing the previously initialized state state, a pointer to the chunk in_, and the length of the chunk in bytes, in_len.
The hydro_hash_final() function completes the operation and puts the final fingerprint into out as out_len bytes.
After hydro_hash_final() returns, the state should not be used any more.
This alternative API is especially useful to process very large files and data streams.
#define hydro_hash_BYTES 32
#define hydro_hash_BYTES_MAX 65535
#define hydro_hash_BYTES_MIN 16
#define hydro_hash_CONTEXTBYTES 8
#define hydro_hash_KEYBYTES 32
hydro_hash_state
Unlike MD5, SHA-1 and SHA-256, this function is safe against hash length extension attacks.
However, this API is not suitable for hashing passwords.
The construction used for generic hashing is similar to the NIST SP 800-185 KMAC construction, leveraging the Gimli hash function instead of cSHAKE.
H(pad(str_enc("kmac") || str_enc(context)) || pad(str_enc(k)) ||
msg || right_enc(msg_len) || 0x00)
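To give a sense of what a length encoding such as right_enc looks like, here is right_encode as defined in NIST SP 800-185: the big-endian bytes of the value, followed by one byte holding their count. This shows the NIST definition only; the text above says the construction is similar to KMAC, so the exact byte layout used internally by the library is an assumption here and may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* NIST SP 800-185 right_encode(x): the minimal big-endian encoding of x
 * (at least one byte), followed by a byte holding the number of bytes used.
 * `out` must have room for 9 bytes. Returns the number of bytes written. */
size_t
right_encode(uint64_t x, uint8_t out[9])
{
    size_t n = 1;
    size_t i;

    /* Find the smallest n >= 1 such that x < 2^(8n). */
    while (n < 8 && (x >> (8 * n)) != 0) {
        n++;
    }
    for (i = 0; i < n; i++) {
        out[i] = (uint8_t) (x >> (8 * (n - 1 - i)));
    }
    out[n] = (uint8_t) n;
    return n + 1;
}
```

With this definition, the value 0 is encoded as the two bytes 00 01, and 256 as the three bytes 01 00 02. Because the length is appended after the message, the total input size does not need to be known before hashing starts.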
We also define a variant that extends KMAC to include a 64-bit tweak, currently used for key derivation:
H(pad(str_enc("tmac") || str_enc(context)) || pad(str_enc(k)) ||
pad(right_enc(tweak)) || msg || right_enc(msg_len) || 0x00)