Linear-Time Sorting and Selection

In Chapter 4 we proved a lower bound: every comparison-based sorting algorithm must make Ω(n log n) comparisons in the worst case. The efficient algorithms of Chapter 5 — merge sort, quicksort, heapsort — all meet this bound, and none can beat it. But what if we are willing to go beyond pairwise comparisons? If we know something about the structure of the keys — for instance, that they are integers in a bounded range — we can exploit that structure to sort in linear time. In this chapter we study three such algorithms: counting sort, radix sort, and bucket sort. We also turn to a related problem — selection — and present two algorithms that find the kth smallest element in O(n) time without sorting: randomized quickselect and the deterministic median-of-medians algorithm.

Breaking the comparison lower bound

The lower bound from Chapter 4 applies to comparison-based sorting: algorithms that learn about the input only by comparing pairs of elements. The decision-tree argument shows that any comparison-based algorithm must traverse a binary tree of height at least log₂(n!) = Ω(n log n), because there are n! possible permutations and each leaf of the decision tree corresponds to one permutation.

This lower bound does not apply if we use operations other than comparisons. If the keys are integers, we can look at individual digits. If the keys are bounded, we can use them as array indices. These non-comparison-based operations give us additional information that comparison-based algorithms cannot access, and this is what allows us to sort faster.

The trade-off is generality: comparison-based sorting works for any totally ordered type, while the algorithms in this chapter require specific key structure (integers, bounded range, uniform distribution).

Counting sort

Counting sort is the simplest linear-time sorting algorithm. It works for non-negative integer keys in a known range and sorts by counting how many times each value appears.

The algorithm

  1. Create an array counts of size max + 1, where max is the largest key, initialized to zeros.
  2. For each element in the input, increment counts[element].
  3. Compute prefix sums: replace each counts[i] with the sum of all counts for values ≤ i. After this step, counts[i] tells us the position after the last occurrence of value i in the sorted output.
  4. Walk the input array in reverse, placing each element at position counts[element] - 1 and decrementing the count. Walking in reverse ensures stability.

Implementation

export function countingSort(elements: number[]): number[] {
  if (elements.length <= 1) {
    return elements.slice(0);
  }

  const max = Math.max(...elements);
  const counts = new Array<number>(max + 1).fill(0);

  // Count occurrences
  for (const val of elements) {
    counts[val]!++;
  }

  // Compute prefix sums (cumulative counts)
  for (let i = 1; i <= max; i++) {
    counts[i]! += counts[i - 1]!;
  }

  // Build output array in reverse for stability
  const output = new Array<number>(elements.length);
  for (let i = elements.length - 1; i >= 0; i--) {
    const val = elements[i]!;
    counts[val]!--;
    output[counts[val]!] = val;
  }

  return output;
}

Tracing through an example

Let us sort A = [4, 2, 2, 8, 3, 3, 1].

Step 1–2: Count occurrences. The maximum value is 8, so we create counts of size 9:

Index   0   1   2   3   4   5   6   7   8
Count   0   1   2   2   1   0   0   0   1

Step 3: Prefix sums. Each entry becomes the cumulative count:

Index        0   1   2   3   4   5   6   7   8
Prefix sum   0   1   3   5   6   6   6   6   7

The prefix sum tells us: 0 elements are ≤ 0, 1 element is ≤ 1, 3 elements are ≤ 2, and so on.

Step 4: Place elements (reverse scan).

i   A[i]   counts[A[i]] before   Output position   counts[A[i]] after
6   1      1                     0                 0
5   3      5                     4                 4
4   3      4                     3                 3
3   8      7                     6                 6
2   2      3                     2                 2
1   2      2                     1                 1
0   4      6                     5                 5

Result: [1, 2, 2, 3, 3, 4, 8].

Notice that the two 2s and the two 3s appear in the same relative order as in the input — counting sort is stable.

Stability

Counting sort's stability is not an accident; it is a consequence of scanning the input in reverse during the placement step. When we encounter the last occurrence of a value (scanning right to left), we place it at the highest available position for that value. The second-to-last occurrence goes one position earlier, and so on. This preserves the original relative order among elements with equal keys.

Stability matters when sorting records by one key while preserving order on another, and it is essential for counting sort's role as a subroutine in radix sort.
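Stability is easiest to see with records rather than bare numbers. As an illustration only — countingSortBy is our sketch, not part of the chapter's code — the same four steps work for objects keyed by a non-negative integer:

```typescript
// Sketch: stable counting sort over records, keyed by a caller-supplied
// non-negative integer key. `countingSortBy` is an illustrative name.
export function countingSortBy<T>(items: T[], key: (item: T) => number): T[] {
  if (items.length <= 1) return items.slice(0);

  const max = Math.max(...items.map(key));
  const counts = new Array<number>(max + 1).fill(0);

  // Count occurrences of each key
  for (const item of items) {
    const k = key(item);
    counts[k] = counts[k]! + 1;
  }
  // Prefix sums
  for (let i = 1; i <= max; i++) {
    counts[i] = counts[i]! + counts[i - 1]!;
  }
  // Reverse scan preserves input order among equal keys
  const output = new Array<T>(items.length);
  for (let i = items.length - 1; i >= 0; i--) {
    const k = key(items[i]!);
    counts[k] = counts[k]! - 1;
    output[counts[k]!] = items[i]!;
  }
  return output;
}

const people = [
  { name: "ada", age: 30 },
  { name: "bob", age: 25 },
  { name: "cyd", age: 30 },
];
const byAge = countingSortBy(people, (p) => p.age);
// bob (25) comes first; ada stays ahead of cyd because both share age 30
```

Because the reverse scan is unchanged, equal-keyed records come out in their input order, which is exactly the property radix sort depends on.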

Complexity analysis

Time. The algorithm makes four passes:

  1. Finding the maximum: O(n).
  2. Counting occurrences: O(n).
  3. Computing prefix sums: O(k).
  4. Placing elements in the output: O(n).

Total: O(n + k), where k is the maximum value.

Space. The counts array uses O(k) space, and the output array uses O(n) space. Total: O(n + k).

When is counting sort practical? When k = O(n), counting sort runs in O(n) time and is excellent. When k ≫ n (for example, sorting 10 numbers scattered across the range [0, 10⁹]), the space and time become prohibitive, and a comparison-based sort would be faster.

Properties

Property   Counting sort
Time       O(n + k)
Space      O(n + k)
Stable     Yes
In-place   No
Key type   Non-negative integers in [0, k]

Radix sort

Radix sort extends counting sort to handle integers with many digits. Instead of sorting on the entire key at once (which would require a counts array as large as the key range), radix sort processes one digit at a time, from least significant to most significant.

The algorithm

  1. Find the maximum element to determine the number of digits d.
  2. For each digit position from least significant to most significant:
    • Sort the array by that digit using a stable sort (counting sort restricted to digits 0–9).

The key insight is that we must process digits from least significant to most significant, and each digit sort must be stable. After sorting by the units digit, elements with the same units digit are in a consistent order. When we then sort by the tens digit, stability ensures that elements with the same tens digit remain sorted by their units digit — and so on.

Why least significant digit first?

It may seem counterintuitive to start with the least significant digit. Consider sorting [38, 31, 45, 62, 60, 74]. If we sorted by the most significant digit first, we would get groups starting with 3, 4, 6, 7. But then sorting by the next digit within each group is exactly the original problem on smaller arrays — we have made no progress toward a linear-time algorithm.

LSD radix sort avoids this by exploiting stability. After sorting by digit i, the relative order of elements that agree on digit i is determined by the previous passes on digits 1 through i − 1. When we sort by digit i + 1, stability preserves this order among elements with the same digit at position i + 1.

Implementation

The digit-level sorting subroutine is a specialized counting sort that operates on a single digit position:

export function countingSortByDigit(
  elements: number[],
  exp: number,
): number[] {
  const n = elements.length;
  if (n <= 1) {
    return elements.slice(0);
  }

  const output = new Array<number>(n);
  const counts = new Array<number>(10).fill(0);

  // Count occurrences of each digit at position exp
  for (const val of elements) {
    const digit = Math.floor(val / exp) % 10;
    counts[digit]!++;
  }

  // Compute prefix sums
  for (let i = 1; i < 10; i++) {
    counts[i]! += counts[i - 1]!;
  }

  // Build output in reverse for stability
  for (let i = n - 1; i >= 0; i--) {
    const val = elements[i]!;
    const digit = Math.floor(val / exp) % 10;
    counts[digit]!--;
    output[counts[digit]!] = val;
  }

  return output;
}

The main radix sort function calls this subroutine for each digit position:

export function radixSort(elements: number[]): number[] {
  if (elements.length <= 1) {
    return elements.slice(0);
  }

  const max = Math.max(...elements);

  let result = elements.slice(0);

  // Process each digit position from least significant to most significant
  for (let exp = 1; Math.floor(max / exp) > 0; exp *= 10) {
    result = countingSortByDigit(result, exp);
  }

  return result;
}

Tracing through an example

Sort A = [170, 45, 75, 90, 802, 24, 2, 66].

Pass 1: Sort by units digit (exp = 1):

Element   Units digit
170       0
45        5
75        5
90        0
802       2
24        4
2         2
66        6

After stable sort by units digit: [170, 90, 802, 2, 24, 45, 75, 66].

Pass 2: Sort by tens digit (exp = 10):

Element   Tens digit
170       7
90        9
802       0
2         0
24        2
45        4
75        7
66        6

After stable sort by tens digit: [802, 2, 24, 45, 66, 170, 75, 90].

Notice that 802 and 2 both have tens digit 0, and they remain in the order established by Pass 1 (802 before 2) thanks to stability.

Pass 3: Sort by hundreds digit (exp = 100):

Element   Hundreds digit
802       8
2         0
24        0
45        0
66        0
170       1
75        0
90        0

After stable sort by hundreds digit: [2, 24, 45, 66, 75, 90, 170, 802].

Result: [2, 24, 45, 66, 75, 90, 170, 802]. Sorted!

Correctness

Claim. After i passes of LSD radix sort, the array is sorted with respect to the last i digits.

Proof by induction. After the first pass, the array is sorted by the units digit (the counting sort is correct). Assume that after i − 1 passes the array is sorted by the last i − 1 digits. Consider two elements x and y after pass i:

  • If x and y differ in digit i: the sort on digit i places them correctly.
  • If x and y have the same digit at position i: since the sort is stable, their relative order is preserved from the previous pass, which (by hypothesis) ordered them correctly by their last i − 1 digits.

In both cases, the elements are correctly ordered by their last i digits.

Complexity analysis

Time. Radix sort makes d passes, where d is the number of digits in the maximum element. Each pass is a counting sort with k = 10 (the radix), which takes O(n + 10) = O(n) time. Total: O(d(n + 10)) = O(dn).

For d = O(1) (bounded number of digits), this is O(n). More generally, if the values are in the range [0, n^c] for some constant c, then d = Θ(log n), and radix sort runs in Θ(n log n) — no better than comparison sort. Radix sort achieves true linear time only when d is bounded by a constant independent of n.

Space. Each counting sort pass uses O(n) auxiliary space (the 10-entry counts array plus the output array).

Properties

Property   Radix sort
Time       O(dn), where d = number of digits
Space      O(n)
Stable     Yes
Key type   Non-negative integers

Bucket sort

Bucket sort works well when the input is drawn from a uniform distribution over a known range. It distributes elements into equal-width buckets, sorts each bucket individually (typically with insertion sort), and concatenates the sorted buckets.

The algorithm

  1. Determine the range [min, max] of the input.
  2. Create n empty buckets (one per element, by default) spanning the range.
  3. Place each element in its bucket: element x goes to bucket ⌊((x − min) / (max − min)) · (n − 1)⌋, clamped to the last bucket.
  4. Sort each bucket using insertion sort.
  5. Concatenate all buckets.

Implementation

export function bucketSort(
  elements: number[],
  bucketCount?: number,
): number[] {
  const n = elements.length;
  if (n <= 1) {
    return elements.slice(0);
  }

  const max = Math.max(...elements);
  const min = Math.min(...elements);

  // If all elements are the same, return a copy
  if (max === min) {
    return elements.slice(0);
  }

  const numBuckets = bucketCount ?? n;
  const range = max - min;

  // Create empty buckets
  const buckets: number[][] = [];
  for (let i = 0; i < numBuckets; i++) {
    buckets.push([]);
  }

  // Distribute elements into buckets
  for (const val of elements) {
    let index = Math.floor(
      ((val - min) / range) * (numBuckets - 1)
    );
    if (index >= numBuckets) {
      index = numBuckets - 1;
    }
    buckets[index]!.push(val);
  }

  // Sort each bucket using insertion sort and concatenate
  const result: number[] = [];
  for (const bucket of buckets) {
    insertionSortInPlace(bucket);
    for (const val of bucket) {
      result.push(val);
    }
  }

  return result;
}

The subroutine insertionSortInPlace is efficient for the small bucket sizes expected under uniform distribution:

function insertionSortInPlace(arr: number[]): void {
  for (let i = 1; i < arr.length; i++) {
    const key = arr[i]!;
    let j = i - 1;
    while (j >= 0 && arr[j]! > key) {
      arr[j + 1] = arr[j]!;
      j--;
    }
    arr[j + 1] = key;
  }
}

Tracing through an example

Sort A = [0.17, 0.26, 0.12, 0.21, 0.39, 0.72, 0.23, 0.68, 0.78, 0.94] using 10 buckets.

Here the range is [0.12, 0.94], so each bucket covers an interval of width roughly 0.08.

Bucket   Elements
0        [0.17, 0.12]
1        [0.26, 0.21, 0.23]
3        [0.39]
6        [0.72, 0.68]
8        [0.78]
9        [0.94]

(Buckets not listed received no elements.)

After sorting each bucket and concatenating: [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94].

Complexity analysis

Expected time under uniform distribution. If the n elements are drawn independently and uniformly from [0, 1), then with n buckets each element lands in a random bucket. The expected number of elements per bucket is 1. By a balls-into-bins argument, the expected total cost of sorting all buckets is:

E[Σ_i n_i²] = Σ_i E[n_i²],

where n_i is the number of elements in bucket i. Since each element independently falls into bucket i with probability 1/n, we have E[n_i] = 1 and E[n_i²] = Var[n_i] + E[n_i]² = (1 − 1/n) + 1 = 2 − 1/n. Summing over n buckets:

Σ_i E[n_i²] = n(2 − 1/n) = 2n − 1 = O(n).

Including the distribution and concatenation steps, the total expected time is O(n).

Worst case. If all n elements fall into one bucket, we pay O(n²) for insertion sort on that bucket. This happens when the distribution is far from uniform.

Space. The buckets collectively hold n elements, plus O(n) for the bucket array structure. Total: O(n).

Properties

Property                               Bucket sort
Expected time (uniform distribution)   O(n)
Worst-case time                        O(n²)
Space                                  O(n)
Stable                                 Yes (with stable per-bucket sort)
Key type                               Numeric keys in a known range

Comparison of linear-time sorts

Algorithm       Time            Space      Stable   Assumptions
Counting sort   O(n + k)        O(n + k)   Yes      Integer keys in [0, k]
Radix sort      O(dn)           O(n)       Yes      Integer keys with d digits
Bucket sort     O(n) expected   O(n)       Yes      Uniformly distributed keys

All three algorithms achieve linear time under specific conditions. Counting sort is simplest and best when the key range k is not much larger than n. Radix sort extends counting sort to larger ranges by processing one digit at a time. Bucket sort is ideal for floating-point data with a known, roughly uniform distribution.

None of these algorithms contradicts the comparison lower bound — they bypass it by using non-comparison operations (indexing into an array by key value, extracting digits).

The selection problem

We now turn to a different problem. Given an unsorted array of n elements and an integer k (with 0 ≤ k ≤ n − 1), find the kth smallest element (counting from zero) — the element that would be at index k if the array were sorted.

Special cases include:

  • k = 0: the minimum (trivially solvable in O(n) time).
  • k = n − 1: the maximum.
  • k = ⌊n/2⌋: the median.

The naive approach is to sort the array (O(n log n)) and return the element at index k. Can we do better? Yes — we can solve the selection problem in O(n) time.
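For reference, the naive baseline is a few lines, and it doubles as a correctness oracle when testing the faster algorithms. The name selectBySorting is ours, not one used elsewhere in the chapter:

```typescript
// Naive selection: sort a copy and index into it. O(n log n) time, O(n) space.
export function selectBySorting(elements: number[], k: number): number {
  if (k < 0 || k >= elements.length) {
    throw new RangeError(`k=${k} is out of bounds`);
  }
  // JavaScript's default sort compares as strings, so pass a numeric comparator.
  const sorted = elements.slice(0).sort((a, b) => a - b);
  return sorted[k]!;
}

const third = selectBySorting([7, 5, 9, 1, 3, 8], 2);
// third is 5: index 2 of the sorted array [1, 3, 5, 7, 8, 9]
```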

Quickselect

Quickselect (also known as Hoare's selection algorithm) is the selection analogue of quicksort. Like quicksort, it partitions the array around a pivot. But unlike quicksort, it only recurses into one side — the side that contains the desired element.

The algorithm

  1. Choose a random pivot and partition the array.
  2. If the pivot lands at position k, we are done.
  3. If k < the pivot's position, recurse on the left partition.
  4. If k > the pivot's position, recurse on the right partition.

Implementation

export function quickselect(
  elements: number[],
  k: number,
): number {
  if (elements.length === 0) {
    throw new RangeError('Cannot select from an empty array');
  }
  if (k < 0 || k >= elements.length) {
    throw new RangeError(
      `k=${k} is out of bounds for array of length ${elements.length}`,
    );
  }

  const copy = elements.slice(0);
  return select(copy, 0, copy.length - 1, k);
}

function select(
  arr: number[],
  left: number,
  right: number,
  k: number,
): number {
  if (left === right) {
    return arr[left]!;
  }

  const pivotIndex = randomizedPartition(arr, left, right);

  if (k === pivotIndex) {
    return arr[pivotIndex]!;
  } else if (k < pivotIndex) {
    return select(arr, left, pivotIndex - 1, k);
  } else {
    return select(arr, pivotIndex + 1, right, k);
  }
}

The randomizedPartition function is identical to the one used in randomized quicksort: choose a random element, swap it to the end, partition using the Lomuto scheme.
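That listing is not reproduced here, so as a sketch (details may differ from Chapter 5's version), a Lomuto partition with a random pivot looks like this:

```typescript
// Lomuto partition with a random pivot. Afterwards, every element left of
// the returned index is <= the pivot value, and every element right of it
// is greater.
function randomizedPartition(arr: number[], left: number, right: number): number {
  // Pick a pivot uniformly at random and move it to the end.
  const choice = left + Math.floor(Math.random() * (right - left + 1));
  [arr[choice], arr[right]] = [arr[right]!, arr[choice]!];

  const pivot = arr[right]!;
  let store = left;
  for (let j = left; j < right; j++) {
    if (arr[j]! <= pivot) {
      [arr[store], arr[j]] = [arr[j]!, arr[store]!];
      store++;
    }
  }
  [arr[store], arr[right]] = [arr[right]!, arr[store]!];
  return store;
}

const demo = [7, 5, 9, 1, 3, 8];
const p = randomizedPartition(demo, 0, demo.length - 1);
// Invariant: demo[0..p-1] <= demo[p] < demo[p+1..]
```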

Tracing through an example

Find the 3rd smallest element (k = 2, zero-indexed) in A = [7, 5, 9, 1, 3, 8].

The sorted array would be [1, 3, 5, 7, 8, 9], so the answer is 5.

Iteration 1: Suppose the random pivot is 7 (index 0). After partitioning: [5, 1, 3, 7, 9, 8], pivot at index 3.

We want k = 2 < 3, so recurse on the left partition (indices 0–2).

Iteration 2: Suppose the random pivot is 1 (index 1 of the subarray). After partitioning: [1, 3, 5, 7, 9, 8], pivot at index 0.

We want k = 2 > 0, so recurse on the right partition (indices 1–2).

Iteration 3: Suppose the random pivot is 5 (index 2). After partitioning: [1, 3, 5, 7, 9, 8], pivot at index 2.

We want k = 2, and the pivot is at index 2. Done! Return arr[2] = 5.

Complexity analysis

Expected time. The analysis is similar to randomized quicksort. With a random pivot, the expected partition splits the array roughly in half. But unlike quicksort, we recurse into only one partition, so the expected work at each level halves:

n + n/2 + n/4 + ⋯ < 2n = O(n).

More precisely, the expected number of comparisons is at most 4n (by an analysis similar to the randomized quicksort proof, summing indicator random variables over pairs).

Worst case. If the pivot always lands at one extreme, we have:

T(n) = T(n − 1) + Θ(n) = Θ(n²).

This is the same worst case as quicksort, but it is extremely unlikely with random pivots.

Properties

Property          Quickselect
Expected time     O(n)
Worst-case time   O(n²)
Space             O(n) for the copy + O(log n) expected stack
Deterministic     No (randomized)

Median of medians

Can we achieve worst-case O(n) selection? The answer is yes, using a clever pivot-selection strategy called median of medians (also known as BFPRT, after its five inventors: Blum, Floyd, Pratt, Rivest, and Tarjan, 1973).

The idea: instead of choosing a random pivot, choose a pivot that is guaranteed to be near the median, ensuring that each partition eliminates a constant fraction of the elements.

The algorithm

  1. Divide the elements into groups of 5.
  2. Find the median of each group by sorting (5 elements can be sorted in constant time).
  3. Recursively compute the median of these medians.
  4. Use this "median of medians" as the pivot for partitioning.
  5. Recurse into the appropriate partition (just like quickselect).

Why groups of 5?

The choice of 5 is not arbitrary. It is the smallest odd group size that makes the recurrence work out to O(n). The median of medians is guaranteed to be larger than at least roughly 3n/10 elements and smaller than at least roughly 3n/10 elements. This means each recursive call operates on at most roughly 7n/10 elements.

Here is why: the median of medians is larger than the medians of half the groups (roughly n/10 groups), and each of those medians is larger than 2 elements in its group. Therefore, the pivot is larger than at least 3 · n/10 = 3n/10 elements. By symmetry, it is also smaller than at least 3n/10 elements. The worst-case partition therefore holds at most n − 3n/10 = 7n/10 elements.
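This guarantee can be checked empirically. The sketch below is ours (momPivotRank is a hypothetical helper that uses plain sorting for the tiny sub-steps, since only the pivot's rank is of interest); on a fixed 25-element permutation, the pivot's rank lands comfortably between 3n/10 and 7n/10:

```typescript
// Empirical check of the 3n/10 guarantee (a sketch, not the chapter's code).
// Returns the rank of the median-of-medians pivot: the number of elements
// strictly smaller than it.
function momPivotRank(arr: number[]): number {
  // Steps 1-2: medians of the groups of 5 (sorting each tiny group).
  const medians: number[] = [];
  for (let i = 0; i < arr.length; i += 5) {
    const group = arr.slice(i, i + 5).sort((a, b) => a - b);
    medians.push(group[Math.floor((group.length - 1) / 2)]!);
  }
  // Step 3: median of the medians (by sorting, since the list is short).
  medians.sort((a, b) => a - b);
  const pivot = medians[Math.floor((medians.length - 1) / 2)]!;
  return arr.filter((x) => x < pivot).length;
}

// A fixed permutation of 0..24, so the check is deterministic.
const sample = Array.from({ length: 25 }, (_, i) => (i * 7) % 25);
const rank = momPivotRank(sample);
// With n = 25, the guarantee puts roughly 3n/10 = 7.5 elements on each
// side, so the rank must fall between about 7 and about 17.
```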

Implementation

export function medianOfMedians(
  elements: number[],
  k: number,
): number {
  if (elements.length === 0) {
    throw new RangeError('Cannot select from an empty array');
  }
  if (k < 0 || k >= elements.length) {
    throw new RangeError(
      `k=${k} is out of bounds for array of length ${elements.length}`,
    );
  }

  const copy = elements.slice(0);
  return selectMoM(copy, 0, copy.length - 1, k);
}

The core recursive function:

function selectMoM(
  arr: number[],
  left: number,
  right: number,
  k: number,
): number {
  // Base case: small enough to sort directly
  if (right - left < 5) {
    insertionSortRange(arr, left, right);
    return arr[k]!;
  }

  // Step 1: Divide into groups of 5, find median of each
  const numGroups = Math.ceil((right - left + 1) / 5);
  for (let i = 0; i < numGroups; i++) {
    const groupLeft = left + i * 5;
    const groupRight = Math.min(groupLeft + 4, right);

    insertionSortRange(arr, groupLeft, groupRight);

    const medianIndex =
      groupLeft + Math.floor((groupRight - groupLeft) / 2);
    swap(arr, medianIndex, left + i);
  }

  // Step 2: Recursively find the median of the medians
  const medianOfMediansIndex =
    left + Math.floor((numGroups - 1) / 2);
  selectMoM(arr, left, left + numGroups - 1, medianOfMediansIndex);

  // Step 3: Partition around the median of medians
  const pivotIndex = partitionAroundPivot(
    arr, left, right, medianOfMediansIndex
  );

  if (k === pivotIndex) {
    return arr[pivotIndex]!;
  } else if (k < pivotIndex) {
    return selectMoM(arr, left, pivotIndex - 1, k);
  } else {
    return selectMoM(arr, pivotIndex + 1, right, k);
  }
}
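The listing relies on three helpers that are not shown in this chapter: swap, insertionSortRange, and partitionAroundPivot. Plausible reconstructions (the actual versions may differ in minor details) are:

```typescript
// Swap two slots of an array in place.
function swap(arr: number[], i: number, j: number): void {
  const tmp = arr[i]!;
  arr[i] = arr[j]!;
  arr[j] = tmp;
}

// Insertion sort restricted to arr[left..right], inclusive.
function insertionSortRange(arr: number[], left: number, right: number): void {
  for (let i = left + 1; i <= right; i++) {
    const key = arr[i]!;
    let j = i - 1;
    while (j >= left && arr[j]! > key) {
      arr[j + 1] = arr[j]!;
      j--;
    }
    arr[j + 1] = key;
  }
}

// Lomuto partition of arr[left..right] around the value currently at
// pivotIndex; returns the pivot's final index.
function partitionAroundPivot(
  arr: number[],
  left: number,
  right: number,
  pivotIndex: number,
): number {
  swap(arr, pivotIndex, right);
  const pivot = arr[right]!;
  let store = left;
  for (let j = left; j < right; j++) {
    if (arr[j]! < pivot) {
      swap(arr, store, j);
      store++;
    }
  }
  swap(arr, store, right);
  return store;
}

const xs = [5, 2, 9, 1, 7];
insertionSortRange(xs, 0, xs.length - 1); // xs is now [1, 2, 5, 7, 9]

const ys = [5, 2, 9, 1, 7];
const pi = partitionAroundPivot(ys, 0, ys.length - 1, 0); // pivot value 5
```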

Tracing through an example

Find the median (k = 7, zero-indexed) in:

A = [12, 3, 5, 7, 19, 26, 4, 1, 8, 15, 20, 11, 9, 2, 6].

The sorted array is [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 15, 19, 20, 26], so the answer at k = 7 is 8.

Step 1: Divide into groups of 5 and find medians.

Group   Elements             Sorted               Median
1       [12, 3, 5, 7, 19]    [3, 5, 7, 12, 19]    7
2       [26, 4, 1, 8, 15]    [1, 4, 8, 15, 26]    8
3       [20, 11, 9, 2, 6]    [2, 6, 9, 11, 20]    9

Step 2: Median of medians. The medians are [7, 8, 9]. The median of this group is 8.

Step 3: Partition around 8. Using 8 as the pivot, the 7 elements smaller than 8 go left, and the 7 elements larger than 8 go right.

The pivot lands at index 7. We want k = 7, and the pivot is at index 7. Done! Return 8.

Complexity analysis

Time. Let T(n) be the worst-case time for selecting from n elements.

The algorithm does the following work:

  • Sorting groups of 5: O(n) total.
  • Finding the median of medians recursively: T(n/5).
  • Partitioning: O(n).
  • Recursing into the larger partition: at most T(7n/10).

This gives the recurrence:

T(n) ≤ T(n/5) + T(7n/10) + cn.

We claim T(n) = O(n). To verify, assume T(m) ≤ dm for all m < n, for some constant d. Then:

T(n) ≤ dn/5 + 7dn/10 + cn = (9/10)dn + cn ≤ dn,

provided d ≥ 10c. Since 1/5 + 7/10 = 9/10 < 1, the two recursive calls together operate on a shrinking fraction of the input, and the algorithm runs in O(n) time.

Space. The recursion has depth O(log n) (each level reduces the problem by a constant factor), so the stack space is O(log n). Combined with the O(n) copy, total space is O(n).

Practical considerations

While the median-of-medians algorithm is a beautiful theoretical result — it proved that deterministic linear-time selection is possible — it is rarely used in practice. The constant factor hidden in the O(n) is large (roughly 5–10× slower than randomized quickselect for typical inputs). Randomized quickselect is almost always faster in practice because:

  1. It avoids the overhead of computing medians of groups.
  2. Random pivots are usually good enough.
  3. The probability of quadratic behavior is astronomically small.

The practical value of median of medians is primarily as a fallback: some implementations (e.g., the introselect algorithm in the C++ STL) start with quickselect and switch to median of medians if the recursion depth grows too large, guaranteeing a worst case of O(n) while maintaining fast average-case performance.
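To make the hybrid concrete, here is a compact sketch of the idea. This is our illustration, not the STL's code; for brevity the fallback simply sorts the remaining range, standing in for a true median-of-medians call, so this particular sketch guarantees only O(n log n) rather than O(n):

```typescript
// Introselect sketch: quickselect with a depth budget; on exhaustion, fall
// back to a guaranteed method. Here the fallback sorts the remaining range
// (O(n log n)) purely for brevity -- real introselect would switch to
// median of medians to keep the O(n) worst case.
export function introselect(elements: number[], k: number): number {
  const arr = elements.slice(0);
  let left = 0;
  let right = arr.length - 1;
  let depthBudget = 2 * Math.ceil(Math.log2(arr.length + 1));

  while (left < right) {
    if (depthBudget === 0) {
      // Fallback: sort arr[left..right] and read off the answer.
      const sortedRange = arr.slice(left, right + 1).sort((a, b) => a - b);
      return sortedRange[k - left]!;
    }
    depthBudget--;

    // Lomuto partition around a random pivot.
    const choice = left + Math.floor(Math.random() * (right - left + 1));
    [arr[choice], arr[right]] = [arr[right]!, arr[choice]!];
    const pivot = arr[right]!;
    let store = left;
    for (let j = left; j < right; j++) {
      if (arr[j]! <= pivot) {
        [arr[store], arr[j]] = [arr[j]!, arr[store]!];
        store++;
      }
    }
    [arr[store], arr[right]] = [arr[right]!, arr[store]!];

    if (k === store) return arr[store]!;
    if (k < store) right = store - 1;
    else left = store + 1;
  }
  return arr[left]!;
}
```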

Properties

Property          Median of medians
Worst-case time   O(n)
Space             O(n)
Deterministic     Yes
Practical         Slower than quickselect due to large constants

Chapter summary

In this chapter we studied algorithms that break the comparison-based sorting barrier and solve the selection problem in linear time:

  • Counting sort sorts non-negative integers in O(n + k) time by counting occurrences and computing prefix sums. It is stable and serves as a building block for radix sort.

  • Radix sort extends counting sort to handle integers with multiple digits, sorting digit by digit from least significant to most significant. It runs in O(dn) time where d is the number of digits. The key requirement is a stable subroutine sort.

  • Bucket sort distributes elements into buckets, sorts each bucket, and concatenates. Under a uniform distribution, the expected time is O(n). Its worst case is O(n²), when all elements land in one bucket.

  • Quickselect finds the kth smallest element in O(n) expected time by partitioning around a random pivot and recursing into one side. It is the practical algorithm of choice for selection.

  • Median of medians achieves worst-case O(n) selection through a carefully chosen pivot: the median of group medians. While theoretically optimal, its large constant factor makes it slower than randomized quickselect in practice.

The linear-time sorting algorithms teach an important lesson: algorithmic lower bounds depend on the model of computation. The Ω(n log n) bound is real for comparison-based sorting, but by stepping outside the comparison model — using integers as array indices, extracting digits — we can do better. The selection algorithms show that finding a single order statistic is fundamentally easier than fully sorting, requiring only O(n) time, whether by randomization or deterministically.

Exercises

Exercise 6.1. Trace through counting sort on the input [2, 5, 3, 0, 2, 3, 0, 3]. Show the counts array after each step (counting, prefix sums, placement). Verify that the sort is stable by tracking the original indices of elements with value 3.

Exercise 6.2. Radix sort processes digits from least significant to most significant, using a stable sort at each step. What goes wrong if we process digits from most significant to least significant? Give a concrete example where MSD radix sort (without special handling) produces incorrect output.

Exercise 6.3. Counting sort uses O(k) space for the counts array, where k is the maximum value. If we need to sort n integers in the range [0, n² − 1], we could use counting sort directly with k = n², or we could use radix sort with a base-n representation (2 digits). Compare the time and space complexity of both approaches.

Exercise 6.4. Consider a modification of quickselect where, instead of choosing a random pivot, we always choose the first element as the pivot. Describe an input of size n for which this modified quickselect takes Θ(n²) time to find the median. Then describe an input for which it takes O(n) time.

Exercise 6.5. The median-of-medians algorithm divides elements into groups of 5. What happens if we use groups of 3 instead? Set up the recurrence and show that it does not solve to O(n). What about groups of 7? (Hint: compute the fraction of elements guaranteed to be eliminated at each step for each group size.)