More Linear Sorts

  • Explain and trace "Bucket sort," and "Radix sort" on a given sequence of data.
  • Justify the time and space complexity of these algorithms in the worst case, based on the data size and data range.

In Counting sort, If the range of potential values is large, then counting sort requires a lot of (auxiliary) space. There are variations of Counting sort that address this issue.

Bucket Sort

The idea of Bucket sort is to divide the range of values into $k$ equal-sized subintervals, or buckets, and then distribute the $n$ input numbers into the buckets. To produce the output, we simply sort the numbers in each bucket and then go through the buckets in order, listing the elements in each.

Demo

We can consider Bucket sort as a more generalized version of Counting sort. In the Bucket sort the values are mapped to the buckets whereas counting sort instead stores a single number, the count of items, per bucket. Moreover, Bucket sort can be used for non-integers, as long as the elements can be mapped to the buckets in constant time.

Analysis

Bucket sort runs in $\Omicron(n + d^2k)$ where $n$ is number of items, $k$ is the number of buckets, and $d$ is max (or average) items per bucket.

  • $\Omicron(k)$ to create the bucket table.
  • $\Omicron(n)$ to put $n$ values into $k$ buckets, $\Omicron(1)$ per value.
  • $\Omicron(k)$ to pull out in order times $\Omicron(d^2)$ worst-case sorting each bucket.

If $k \in \Omicron(n)$ and $d$ is a small constant then $\Omicron(n + d^2k)$ becomes $\Omicron(n)$. Here, we are making two assumptions:

  • The input values can be mapped to the range $[0,k)$ in $\Omicron(1)$ time.
  • The inputs are uniformly distributed over $[0, k)$; we don't expect many numbers to fall into each bucket.

Radix Sort

If you want to sort large integers, you can use Radix sort as long as the input numbers have a fixed number of digits. (It won't work for arbitrarily-large numbers.)

The idea behind Radix sort is sorting the input numbers one digit at a time. You can use Counting sort on each digit (start with least significant digit and move towards the most significant digit).

Demo
Analysis

Radix sort takes $\Omicron(\ell \times (n + k))$ where $\ell$ is the number of digits in each item, $n$ is the number of items to sort, and $k$ is the number of values each digit can have.

This time complexity assumes we are calling Counting sort one time for each of the $\ell$ digits in the input numbers, and counting sort has a time complexity of $\Omicron(n + k)$.

The space complexity also comes from counting sort, which requires $\Omicron(n + k)$ space to hold the counts, indices, and output lists.

Counting sort, Bucket sort, and Radix sort can be made stable. We leave it to you as an exercise to look up and figure out how!

Resources