Hash Function

  • Identify the steps of hashing (i.e., convert to hash code and compression).
  • Enumerate the properties of a good hash function.

Imagine the array where we store the keys has the capacity $M$. The job of the mystery function is to map "keys" to "indices" in the range $[0, M-1]$.

The process of mapping the keys to array indices is called hashing.

The mystery function is a hash function.

A hash function performs the following two steps:

  1. convert a key to an integer (hash code) in a range like $[0, N)$.

  2. map the hash code into the smaller range $[0, M - 1]$ where $M \ll N$.

A good hash function is

  • Uniform: maps keys to array indices as evenly as possible (ideally with equal likelihood of generating each index value).

  • Deterministic: a given key is always mapped to the same index (so we can look it up after insertion).

  • Cheap: a constant-time operation that is simple and fast to compute.

Aside: What is "hashing," again?!

A speaker may use the word "hashing" to only mean "generating hash code for any given key," or to the "design of hash functions," or to refer to "hash tables" (the entire process of building these data structures with collision resolution, etc.)! Case in point: when you speak with "others," do not assume they have the same definition of "hashing" as what is described above here.

Resources