Advanced Hash Functions
The Cyclic Redundancy Check (CRC) Algorithm
-
Let k[n] denote the nth bit of the key.
-
Divide the corresponding polynomial k[n]*x^n + k[n-1]*x^(n-1)
+ ... + k[0] by a ``magic'' polynomial.
-
Note: the coefficients of both polynomials are either 0 or 1.
-
Uses ``polynomial arithmetic mod 2'': all coefficients are calculated mod
2, so adding and subtracting coefficients are both the same as XOR.
-
The ``magic'' (divisor) polynomial is commonly called the ``generator''
polynomial.
-
Use the remainder as the hash code.
-
The CRC algorithm is widely used to detect errors in data storage
and in transmission over unreliable media.
-
With a well-chosen generator polynomial, it's possible to detect ALL:
-
single-bit errors
-
two-bit errors
-
errors where an odd number of bits are affected
-
``burst'' errors (bursts no longer than the degree of the generator polynomial)
-
The CRC algorithm makes a good hash function.
-
A single-bit difference between two keys always yields different hash
codes, and the codes typically differ in many bits.
-
Easy to apply to keys of any size.
-
The implementation can be reduced to repeated table lookup. (The
table is dependent on the generator polynomial.)
-
Efficient implementations are widely available as part of standard libraries.
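
The division and table-lookup steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 8-bit generator x^8 + x^2 + x + 1 (written 0x07 with the leading term implicit) is an arbitrary choice, and the names GENERATOR, crc8_bitwise, TABLE and crc8_table are made up for this example. Like most practical CRCs, the sketch divides the key polynomial multiplied by x^8 (the key with eight zero bits appended) rather than the bare key polynomial.

```python
# Assumed generator: x^8 + x^2 + x + 1. The x^8 term is implicit,
# so only the low 8 coefficients are stored.
GENERATOR = 0x07


def crc8_bitwise(data: bytes) -> int:
    """Long division mod 2, one bit at a time; returns the 8-bit remainder."""
    rem = 0
    for byte in data:
        rem ^= byte                      # bring the next 8 key bits into the remainder
        for _ in range(8):
            if rem & 0x80:               # leading coefficient is 1: subtract (XOR) the generator
                rem = ((rem << 1) ^ GENERATOR) & 0xFF
            else:                        # leading coefficient is 0: just shift
                rem = (rem << 1) & 0xFF
    return rem


# The table depends only on the generator: entry b is the remainder
# produced by the single byte b.
TABLE = [crc8_bitwise(bytes([b])) for b in range(256)]


def crc8_table(data: bytes) -> int:
    """Same remainder as crc8_bitwise, using one table lookup per byte."""
    rem = 0
    for byte in data:
        rem = TABLE[rem ^ byte]
    return rem


if __name__ == "__main__":
    key = b"hash me"
    flipped = bytes([key[0] ^ 0x01]) + key[1:]   # same key with one bit changed
    print(hex(crc8_bitwise(key)), hex(crc8_table(key)), hex(crc8_table(flipped)))
```

An 8-bit remainder only distinguishes 256 values, so a real hash table would more likely use a 32-bit CRC; Python's standard library provides one as zlib.crc32.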
Evaluating A Hash Function: A Case Study
-
No hash function is perfect! For any reasonable hash table size,
collisions will occur. We can only hope to minimize them.
-
Often, the best strategy is to test your hash function on representative
input: here
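
One way to run such a test is to hash a set of sample keys into a table and count the collisions. The sketch below is a rough illustration, not the case study itself: the helper names (additive_hash, crc_hash, collision_stats), the table size 1009 and the generated keys are all assumptions made for the example. It compares a naive sum-of-bytes hash with the standard library's zlib.crc32.

```python
import zlib
from collections import Counter


def additive_hash(key: bytes) -> int:
    """Naive hash: sum of the byte values (ignores byte order)."""
    return sum(key)


def crc_hash(key: bytes) -> int:
    """CRC-32 of the key, as provided by the standard library."""
    return zlib.crc32(key)


def collision_stats(hash_fn, keys, table_size):
    """Return (collisions, longest_chain) for keys hashed into table_size buckets.

    A key counts as a collision if its bucket was already occupied,
    so collisions = number of keys - number of non-empty buckets.
    """
    buckets = Counter(hash_fn(k) % table_size for k in keys)
    collisions = len(keys) - len(buckets)
    longest_chain = max(buckets.values())
    return collisions, longest_chain


if __name__ == "__main__":
    # Hypothetical "representative" input: replace with keys from the real workload.
    keys = [f"user{i:04d}".encode() for i in range(1000)]
    for name, fn in (("additive", additive_hash), ("crc32", crc_hash)):
        collisions, longest = collision_stats(fn, keys, table_size=1009)
        print(f"{name:8s} collisions={collisions} longest chain={longest}")
```

The numbers depend entirely on the keys and the table size, which is why the input should resemble the data the table will actually hold.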