﻿ Pluto Scarab — Mixing Functions

# Hash Functions (continued)

## Mixing Functions

The heart of a hash function is its mixing step. The behavior of the mixing function largely determines whether the hash function is collision-resistant. Therefore it shouldn't be surprising that good mixing functions have the same attributes as good hash functions, namely collision resistance and uniform distribution.

The most notable difference between a mixing function and a hash function is that the input and output of a mixing function are the same size. The purpose of the mixing function is to "scramble" or mix the internal state of the hash function. The input to the function is the current internal state and the output of the function becomes the new internal state.

Because the input and output are the same size, mixing functions can attain complete collision-resistance, and they should. If the mixing function exhibits collisions, the hash function will exhibit more collisions than are necessary. Collisions are guaranteed not to happen if the mixing function is reversible, that is, it's possible to determine the input of the mixing function by examining the output. If the function is not reversible, it implies that there are at least two inputs that result in the same output. By definition, that is a collision.

## Reversible Operations

Figure 1 lists some reversible and non-reversible mixing operations. It is assumed that the "hash" variable is an unsigned integer and that the arithmetic operations are performed modulo 2n where n is the size of the hash variable in bits. It is also assumed that 0 < constant < 2n. A mixing function is guaranteed to be reversible if it consists of steps that are each reversible.

For the purposes of this article it's not important how to reverse these operations, but you may be curious and it's definitely not obvious for some of them. The ^= operation is easy—just repeat it. The += and -= operations can be reversed with -= and +=.

The ^= operation combined with << or >> shift is an interesting one. To reverse `hash ^= hash >> 9` for example, you would do `hash ^= (hash >> 9) ^ (hash >> 18) ^ (hash >> 27)`.

The += and -= operations combined with << or >> shifts are equivalent to *= and can be restated that way.

The *= operation is the most difficult to reverse. Look up modular inverse and Euclidean algorithm in your favorite search engine for details. Because of this difficulty, it's tempting to use this type of mixing step for one-way hash functions. Whether or not this is a good choice largely depends on the performance of the CPU. Some CPU's perform multiplication quickly and some take much longer than addition and subtraction. If multiplication is slow and the constant has a small number of bits set to 1, then equivalent operations using += and left-shifts will work.

## Avalanche!

The purpose of the mixing function is to spread the effect of each message bit throughout all the bits of the internal state. Ideally every bit in the hash state is affected by every bit in the message. And we want to do that as quickly as possible simply for the sake of program performance.

A function is said to satisfy the strict avalanche criterion if, whenever a single input bit is complemented (toggled between 0 and 1), each of the output bits should change with a probability of one half for an arbitrary selection of the remaining input bits.

The avalanche criterion can be tested for the hash function as a whole, or for just the mixing function. As a part of the overall evaluation of the hash function it makes sense to examine the avalanche behavior of the mixing function first.

The avalanche behavior can be exactly determined if the size of the hash state is small. If it is larger, such as 160 bits for the SHA-1 hash, it usually has to be estimated. It is possible to construct a mixing function with large bit size for which the avalanche behavior can be exactly calculated, but these mixing functions are usually slow.

## Calculating Avalanche

Let's look at a 4-bit mixing function to illustrate the general method for calculating the probabilities that each input but will affect each output bit. Figure 2 shows a mixing operation and lists all possible input values and their corresponding output values. By examining this table we can determine the probability that bit 1, for example, will affect bit 3.

To determine the effect of bit 1, we look at all eight possible combinations of the other three input bits and group the sixteen input values into eight pairs where each element of each pair differs only in bit 1. Input values 4 and 6 form such a pair if we are considering bit 1. These two input values match in bit positions 0, 2, and 3. The other pairs for bit 1 are 0 and 2, 1 and 3, 5 and 7, 8 and 10, etc.

Next we examine the corresponding pairs of output values to see which bits change when bit 1 changes. For input values 4 and 6 we see that the corresponding output values change in bits 1, 2, and 3 but output bit 0 does not change when bit 1 of the input changes. Therefore we say that bit 1 of the input affects bits 1, 2, and 3 of the output but not bit 0. If we look at all eight input pairs for bit 1, we find that bit 0 changes for none of the pairs, bit 1 changes for all of the pairs, bit 2 changes for 50% of the pairs, and bit 3 changes for 75% of the pairs. Therefore this mixing step by itself does not come close to satisfying the strict avalanche criterion. We want to see about 50% for each output bit position for each input bit.

This method of calculating the avalanche behavior is only feasible because there are only four bits to deal with. For a 32-bit mixing function, we would have to examine sixty-four-thousand million pairs. This can be done with a PC given sufficient time, but it would be out of the question for a 64-bit, 96-bit, or larger mixing function.

For larger mixing functions we can estimate the avalanche behavior by choosing input values at random and toggling each bit on and off to see which output bits are affected. If we keep a total of the results for each random choice, then if we choose a lot of random input values we can get a good estimate of the probabilities.

## Actual C# Code

Before we write a program to calculate bit-to-bit avalanche behavior, we'll need a "harness" into which we can strap different mixing functions. Figure 3 shows an abstract base class for a mixing function that ensures that each mixing function will have the same interface. I used an abstract class instead of an `interface` because we need to have a member variable for the hash state. This version has a 32-bit member variable called `state`.

Figure 3 also shows a 32-bit mixing function designed by Bob Jenkins. It is intended to be used for a hash table with integer keys. It's tempting to think you can tell which bits affect which other bits just by looking at the code. For example, after the first step `state += (state << 12)`, it's pretty clear that bits 0 through 19 will affect bits 12 through 31 in some way. But when you combine this step with subsequent steps it's possible that some of these effects will cancel out earlier effects. The last step shifts things 12 bits to the right, for example. And if you look at all the steps except the last and add up all the shifts, counting left shifts as positive and right shifts as negative, you'll see that the total amount of shifting before the last step is zero. It's complicated. We'll just have to measure it to see how it does.

We'll create a function for calculating the avalanche matrix. Each element `i,j` of this 32x32 matrix of floating-point values will tell us the probability that bit `i` of the input will affect bit `j` of the output. We want to see a value of 0.5 for every combination.

The best place to put this function is in the `MixingFunction` base class, which means the function automatically becomes available as a member function for any mixing function we create. It essentially becomes a feature of mixing functions in general, instead of having to create this method in some other class and then passing the mixing function as a parameter to the function.

Figure 4 shows the code for the `AvalancheMatrix` method of the `MixingFunction` class. It takes two parameters. The first is `trials` which is the number of random input vectors we're going to generate. Evaluating all 231 different combinations of input vectors for each bit combination will take too long, so we'll just pick about a million of them at random. That's the `trials` parameter.

The second parameter is `repetitions`. We'll see in a little while that even if a mixing function doesn't achieve avalanche in one call, it will sometimes do better if the mixing is performed two or more times in a row. The `repetitions` parameter lets us easily test the behavior of multiple applications of the mixing function.

The first thing the function does is to ensure the mixing function is a 32-bit mixing function, since this version of the algorithm doesn't support other sizes. We also check that the function parameters are valid.

Next we declare our variables, initialize some of them (C# initializes everything else to default values automatically), allocate some memory, and initialize the random number generator. I used RNGCryptoServiceProvider from the System.Security.Cryptography namespace because I know the System.Random class has linear correlations between values xn, xn-34, and xn-55 since it is a two-tap linear feedback PRNG. I'm not 100% sure what RNGCryptoServiceProvider is doing, but the scanty documentation I've seen suggest it uses entropy sources for seed generation and updating, and a cryptographic hash for output whitening. I like that better, although it's really slow.

Inside the main loop we start by getting 32 random bits and setting the internal state. We also save that state for later. Then we call the mixing function the appropriate number of times and save the new state.

Then we toggle each of the 32 input bits one at a time with the statement `state = save ^ (1U << i)`, then we mix that new state and compare it to the original mixed state by doing `outb = state ^ inb` which gives us 0's where the two states match and 1's where they differ. Then we count the 1's and keep a tally.

Finally, after all the trials are finished, we divide each count by the total number of trials to normalize them into the [0, 1] range. Remember that 0.5 is our goal.