
Module 3: Sorting, Average-case and Randomization

Analyzing average-case run-time

Recall the definition of average-case run-time:

\[ T^{\text{avg}}_\mathcal{A}(n) = \sum_{\text{instance } I \text{ of size } n} T_\mathcal{A}(I) \cdot (\text{relative frequency of } I) \]

For this module:

  • Assume that the set \(\mathcal{I}_n\) of size-\(n\) instances is finite (or can be mapped to a finite set in a natural way)
  • Assume that all instances occur equally frequently

Then we can use the following simplified formula:

\[ T^{\text{avg}}(n) = \frac{\sum_{I: \text{ size}(I) = n} T(I)}{\#\text{instances of size } n} = \frac{1}{|\mathcal{I}_n|} \sum_{I \in \mathcal{I}_n} T(I) \]
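
For small \(n\), this simplified formula can be checked by brute force. A minimal sketch in Python, assuming we model the size-\(n\) instances as the \(n!\) permutations of \(\{0, \ldots, n-1\}\) and measure \(T(I)\) as the key-comparisons of insertion sort (both modelling choices are illustrative, not fixed by these notes):

```python
from itertools import permutations

def count_comparisons(A):
    """Key-comparisons made by insertion sort on A (our stand-in for T(I))."""
    A = list(A)
    count = 0
    for i in range(1, len(A)):
        j = i
        while j > 0:
            count += 1                        # one key-comparison: A[j-1] vs A[j]
            if A[j - 1] > A[j]:
                A[j - 1], A[j] = A[j], A[j - 1]
                j -= 1
            else:
                break
    return count

def average_case(n):
    """T_avg(n) = (1/|I_n|) * sum of T(I) over all size-n instances I."""
    instances = list(permutations(range(n)))
    return sum(count_comparisons(I) for I in instances) / len(instances)

print(average_case(4))  # averages over all 4! = 24 permutations
```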

Selection and quick-select

quick-select and the related quick-sort rely on two subroutines:

  • choose-pivot(A): Return an index \(p\) in \(A\). We will use the pivot-value \(v ← A[p]\) to rearrange the array.

  • partition(A, p): Rearrange \(A\) and return pivot-index \(i\) so that

  • the pivot-value \(v\) is in \(A[i]\),
  • all items in \(A[0, \ldots,i-1]\) are \(\leq v\), and
  • all items in \(A[i+1,\ldots, n-1]\) are \(\geq v\).

[Figure: the array before and after partition, with pivot-value \(v\) moved from index \(p\) to index \(i\)]

where \(p=\) index of pivot-value before partition (we choose it), and \(i=\) index of pivot-value after partition (no control).

Partition Algorithm

**partition(A, p)**
A: array of size n, p: integer s.t. 0 ≤ p < n

Create empty lists smaller, equal and larger.
v ← A[p]
for each element x in A do
    if x < v then smaller.append(x)
    else if x > v then larger.append(x)
    else equal.append(x)
i ← smaller.size
j ← equal.size
Overwrite A[0...i−1] by elements in smaller
Overwrite A[i...i+j−1] by elements in equal
Overwrite A[i+j...n−1] by elements in larger
return i
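
A direct Python transcription of this bucket-based partition (a sketch; as in the pseudocode, it uses \(\Theta(n)\) auxiliary space for the three lists):

```python
def partition(A, p):
    """Rearrange A around the pivot-value A[p] and return the pivot-index."""
    v = A[p]
    smaller = [x for x in A if x < v]
    equal   = [x for x in A if x == v]
    larger  = [x for x in A if x > v]
    i, j = len(smaller), len(equal)
    A[:i]      = smaller   # A[0 .. i-1]:   items < v
    A[i:i + j] = equal     # A[i .. i+j-1]: items == v
    A[i + j:]  = larger    # A[i+j .. n-1]: items > v
    return i
```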

Efficient In-Place partition (Hoare)

**partition(A, p)**
A: array of size n, p: integer s.t. 0 ≤ p < n

swap(A[n−1], A[p])
i ← −1, j ← n−1, v ← A[n−1]
loop
    do i ← i+1 while A[i] < v
    do j ← j−1 while j > i and A[j] > v
    if i ≥ j then break
    else swap(A[i], A[j])
end loop
swap(A[n−1], A[i])
return i


Running time: \(\Theta(n)\).

Observe: \(n\) key-comparisons (comparing two input-items).
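
The same scheme in Python (a sketch; the do-while loops become increment-then-scan loops, and moving the pivot-value to A[n−1] makes it a sentinel that stops the i-scan):

```python
def partition(A, p):
    """Hoare-style in-place partition around A[p]; returns the pivot-index."""
    n = len(A)
    A[n - 1], A[p] = A[p], A[n - 1]   # move pivot-value to the end
    v = A[n - 1]
    i, j = -1, n - 1
    while True:
        i += 1
        while A[i] < v:               # pivot at A[n-1] stops this scan
            i += 1
        j -= 1
        while j > i and A[j] > v:
            j -= 1
        if i >= j:
            break
        A[i], A[j] = A[j], A[i]
    A[n - 1], A[i] = A[i], A[n - 1]   # place pivot-value at its final index
    return i
```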

quick-select Algorithm

Selection: Given \(k\), want the item \(m\) that would be at index \(k\) if \(A\) were sorted, i.e., after rearranging \(A\) we have

\[ A[0], \ldots, A[k-1] \;\leq\; A[k] = m \;\leq\; A[k+1], \ldots, A[n-1] \]

**quick-select(A, k)**
A: array of size n, k: integer s.t. 0 ≤ k < n

1. p ← choose-pivot(A)
2. i ← partition(A, p)
3. if i = k then return A[i]
4. else if i > k then return quick-select(A[0...i−1], k)
5. else if i < k then return quick-select(A[i+1...n−1], k−(i+1))

Note

Concept Explanation from ChatGPT:

This is an implementation of the quick-select algorithm, which is used to find the k-th smallest element in an unordered list. This algorithm is closely related to the quicksort sorting algorithm and uses a similar approach of partitioning the array around a pivot element. Here's a step-by-step explanation of the code:

  1. Choose a Pivot: The first step in the quick-select function is to choose a pivot element from the array A. This is done by the choose-pivot(A) function. The method of choosing a pivot can vary, but common strategies include picking the first element, the last element, the middle element, or a random element of the array.
  2. Partition the Array: Once a pivot is chosen, the array is partitioned by the partition(A, p) function. The array is rearranged so that all elements less than the pivot come before the pivot, and all elements greater than the pivot come after it. The partition function returns the index i where the pivot element finally resides in the array.
  3. Determine the k-th Smallest Element:
    • Case 1: Pivot is the k-th Element: If the index i of the pivot element after partitioning is exactly equal to k, then the pivot element is the k-th smallest element, and it is returned.
    • Case 2: Pivot Index Greater than k: If i is greater than k, this indicates that the k-th smallest element is in the left partition of the array. The algorithm recursively calls quick-select on the sub-array consisting of elements before the pivot (A[0...i-1]) to find the k-th smallest element.
    • Case 3: Pivot Index Less than k: If i is less than k, this indicates that the k-th smallest element is in the right partition of the array. However, we need to adjust k because we are now looking for a different order statistic in the sub-array A[i+1...n-1]. Specifically, we are looking for the (k - (i+1))th smallest element in the right sub-array.
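
The three cases translate directly to Python. A sketch reusing the in-place partition from above, with a random choose-pivot (one of the strategies mentioned; note that A[:i] and A[i+1:] create copies in Python, unlike the sub-array views intended by the pseudocode):

```python
import random

def choose_pivot(A):
    # One common strategy: a uniformly random index (randomized-quick-select).
    return random.randrange(len(A))

def quick_select(A, k):
    """Return the item that would be at index k if A were sorted."""
    p = choose_pivot(A)
    i = partition(A, p)               # in-place partition sketched earlier
    if i == k:
        return A[i]
    elif i > k:
        return quick_select(A[:i], k)
    else:
        return quick_select(A[i + 1:], k - (i + 1))

print(quick_select([30, 60, 10, 0, 50, 20, 40], 3))  # 30, the median
```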


Analysis of quick-select

Let \(T(n,k)\) be the number of key-comparisons in a size-\(n\) array with parameter \(k\). This is asymptotically the same as the run-time. partition uses \(n\) key-comparisons.

Worst-case run-time:

  • Sub-array always gets smaller, so \(\leq n\) recursions.
  • Each takes \(\leq n\) comparisons \(\Rightarrow O(n^2)\) time.
  • This is tight: If the pivot-value is always the maximum and \(k = 0\), then
\[ T^{\text{worst}}(n, 0) \geq n + (n-1) + (n-2) + \ldots + 1 \in \Omega(n^2) \]

Best-case run-time:

  • First chosen pivot could be the \(k\)th element
  • No recursive calls; \(T(n, k) = n \in \Theta(n)\)

Randomized algorithms

A randomized algorithm is one which relies on some random numbers in addition to the input.

  • Randomization is often a good idea when an algorithm has bad worst-case run-time but seems to perform much better on most instances.
  • It can also (with restrictions) be used to bound the avg-case run-time.

Goal: Shift the dependency of run-time from what we can’t control (the input) to what we can control (the random numbers).

No more bad instances, just unlucky numbers.

Expected run-time

The run-time of the algorithm now depends on the random numbers.

Define \(T_\mathcal{A}(I,R)\) to be the run-time of a randomized algorithm \(\mathcal{A}\) for an instance \(I\) and the sequence \(R\) of outcomes of random trials.

The expected run-time \(T^{\text{exp}}(I)\) for instance \(I\) is the expected value:

\[ T^{\text{exp}}(I) = \mathbf{E}[T(I, R)] = \sum_{R} T(I, R) \cdot \Pr(R) \]

Now take the maximum over all instances of size \(n\) to define the expected run-time (or formally: worst-instance expected-luck run-time) of \(\mathcal{A}\):

\[ T^{\text{exp}}(n) := \max_{I \in \mathcal{I}_n} T^{\text{exp}}(I) \]
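
The expectation can be estimated empirically: fix an instance \(I\), sample many outcome-sequences \(R\), and average the comparison counts \(T(I, R)\). A self-contained sketch (the Lomuto-style partition and the particular instance are illustrative assumptions):

```python
import random

def rqs_comparisons(A, k, rng):
    """T(I, R): key-comparisons of randomized-quick-select on instance A."""
    A, count = list(A), 0
    lo, hi = 0, len(A) - 1
    while True:
        p = rng.randrange(lo, hi + 1)     # random trial: pivot index
        A[hi], A[p] = A[p], A[hi]         # partition A[lo..hi] around v
        v, i = A[hi], lo
        for j in range(lo, hi):
            count += 1                    # one key-comparison
            if A[j] < v:
                A[i], A[j] = A[j], A[i]
                i += 1
        A[hi], A[i] = A[i], A[hi]
        if i == k:
            return count
        elif i > k:
            hi = i - 1
        else:
            lo = i + 1

I = list(range(100))                      # one fixed instance of size 100
rng = random.Random(240)
trials = [rqs_comparisons(I, 50, rng) for _ in range(1000)]
print(sum(trials) / len(trials))          # estimate of T_exp(I): a small multiple of n
```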

In contrast to average-case analysis, the natural guess is usually correct for the expected run-time.

Are average-case run-time and expected run-time the same? No!


There is a relationship only if the randomization effectively achieves “choose the input instance randomly”.

Summary of SELECTION

  • randomized-quick-select has expected run-time \(\Theta(n)\).
    • The run-time bound is tight since partition takes \(\Omega(n)\) time.
    • If we're unlucky in the random numbers then the run-time is still \(\Omega(n^2)\).
  • So the expected run-time of shuffled-quick-select is \(\Theta(n)\).
  • So the run-time of quick-select on randomly chosen input is \(\Theta(n)\).
  • So the average-case run-time of quick-select is \(\Theta(n)\).
  • randomized-quick-select is generally the fastest solution to SELECTION.

Sorting and quick-sort

**quick-sort(A)**
A: array of size n

if n ≤ 1 then return
p ← choose-pivot(A)
i ← partition(A, p)
quick-sort(A[0...i−1])
quick-sort(A[i+1...n−1])
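
A runnable sketch, with index bounds replacing the pseudocode's sub-array slices and a Lomuto-style in-place partition using the first index as pivot (both illustrative choices):

```python
def quick_sort(A, lo=0, hi=None):
    """Sort A[lo..hi] in place."""
    if hi is None:
        hi = len(A) - 1
    if lo >= hi:                        # size <= 1: already sorted
        return
    p = lo                              # choose-pivot (illustration only)
    A[hi], A[p] = A[p], A[hi]           # partition A[lo..hi] around v
    v, i = A[hi], lo
    for j in range(lo, hi):
        if A[j] <= v:
            A[i], A[j] = A[j], A[i]
            i += 1
    A[hi], A[i] = A[i], A[hi]           # pivot-index i
    quick_sort(A, lo, i - 1)
    quick_sort(A, i + 1, hi)

A = [30, 60, 10, 0, 50, 20, 40]
quick_sort(A)
print(A)  # [0, 10, 20, 30, 40, 50, 60]
```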


quick-sort Analysis

Set \(T(A) :=\) number of key-comparisons for quick-sort on array \(A\).

Worst-case run-time: \(\Theta(n^2)\)

  • Sub-arrays get smaller \(\Rightarrow \leq n\) levels of recursion.
  • On each level there are \(\leq n\) items in total \(\Rightarrow \leq n\) key-comparisons.
  • So run-time in \(O(n^2)\); this is tight exactly as for quick-select.

Best-case run-time: \(\Theta(n \log n)\)

  • If pivot-index is always in the middle, then we recurse in two sub-arrays of size \(\leq n/2\).
  • \(T(n) \leq n + 2T(n/2) \in O(n \log n)\) exactly as for merge-sort.
  • This can be shown to be tight.

Summary of quick-sort

  • randomized-quick-sort has expected run-time \(\Theta(n \log n)\).
    • The run-time bound is tight since the best-case run-time is \(\Omega(n \log n)\)
    • If we're unlucky in the random numbers then the run-time is still \(\Omega(n^2)\)
  • Can show: This is the same as the run-time of quick-sort on randomly chosen input.
  • So the average-case run-time of quick-sort is \(\Theta(n \log n)\).

Auxiliary space?

  • Each nested recursion-call requires \(O(1)\) space on the call stack.
  • As described, quick-sort/randomized-quick-sort use \(\Omega(n)\) nested recursion-calls in the worst case.
  • So \(O(n)\) auxiliary space; this can be improved to \(O(\log n)\) with the variant below.

**randomized-quick-sort-improved(A, n)**

Initialize a stack S of index-pairs with { (0, n−1) }
while S is not empty
    (ℓ,r) ← S.pop()
    while (r−ℓ+1 > 10) do
        p ← ℓ + random(r−ℓ+1)
        i ← partition(A, ℓ,r, p)
        if (i−ℓ > r−i) then
            S.push( (ℓ, i−1) )
            ℓ ← i+1
        else
            S.push( (i+1,r) )
            r ← i−1
insertion-sort(A)
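
A Python transcription of this scheme (a sketch; partition_range is a range-based Lomuto variant, and the cutoff of 10 matches the pseudocode). The inner loop always continues into the smaller side, so the working interval at least halves per iteration; this is what caps the stack at \(O(\log n)\) pairs:

```python
import random

def partition_range(A, lo, hi, p):
    """Partition A[lo..hi] around A[p]; return the pivot-index."""
    A[hi], A[p] = A[p], A[hi]
    v, i = A[hi], lo
    for j in range(lo, hi):
        if A[j] <= v:
            A[i], A[j] = A[j], A[i]
            i += 1
    A[hi], A[i] = A[i], A[hi]
    return i

def insertion_sort(A):
    """Finish the nearly-sorted array: only short unsorted runs remain."""
    for i in range(1, len(A)):
        j = i
        while j > 0 and A[j - 1] > A[j]:
            A[j - 1], A[j] = A[j], A[j - 1]
            j -= 1

def randomized_quick_sort_improved(A):
    S = [(0, len(A) - 1)]               # explicit stack of index-pairs
    while S:
        lo, hi = S.pop()
        while hi - lo + 1 > 10:
            p = lo + random.randrange(hi - lo + 1)
            i = partition_range(A, lo, hi, p)
            if i - lo > hi - i:         # push the larger side ...
                S.append((lo, i - 1))
                lo = i + 1              # ... and iterate on the smaller
            else:
                S.append((i + 1, hi))
                hi = i - 1
    insertion_sort(A)                   # runs of length <= 10 remain
```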

Lower Bound for Comparison-Based Sorting


Comparison-based sorting has a worst-case lower bound of \(\Omega(n\log n)\) comparisons, but non-comparison-based sorting can achieve \(O(n)\) run-time (under restrictions!).

Theorem

Any comparison-based sorting algorithm requires in the worst case \(\Omega(n \log n)\) comparisons to sort \(n\) distinct items.

Note

Any comparison-based sorting algorithm can be expressed as a decision tree.
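
Proof idea, sketched: for \(n\) distinct items there are \(n!\) possible output orders, so the decision tree needs at least \(n!\) leaves; a binary tree of height \(h\) has at most \(2^h\) leaves, hence

\[ 2^h \geq n! \quad\Rightarrow\quad h \;\geq\; \log_2(n!) \;=\; \sum_{i=1}^{n} \log_2 i \;\geq\; \frac{n}{2}\log_2\frac{n}{2} \;\in\; \Omega(n\log n) \]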

Non-Comparison-Based Sorting

Assume keys are numbers in base \(R\) (\(R\): radix)

  • So all digits are in \(\{0, \ldots,R-1\}\)

Assume all keys have the same number \(m\) of digits.

  • This can be achieved by padding with leading 0s

(Single-digit) bucket-sort

**bucket-sort(A, n, sort-key(·))**
A: array of size n
sort-key(·): maps items of A to {0, ..., R−1}

Initialize an array B[0...R − 1] of empty queues (buckets)
for i ← 0 to n−1 do
    Append A[i] at end of B[sort-key(A[i])]
i ← 0
for j ← 0 to R − 1 do
    while B[j] is non-empty do
        move front element of B[j] to A[i++]

  • In our example, sort-key(A[i]) returns the last digit of A[i]
  • bucket-sort is stable: equal items stay in original order.
  • Run-time \(\Theta(n+R)\), auxiliary space \(\Theta(n+R)\)
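
A Python sketch using deques as the bucket queues (appending at the back and removing from the front is exactly what makes the sort stable):

```python
from collections import deque

def bucket_sort(A, R, sort_key):
    """Stable bucket-sort of A; sort_key maps items to {0, ..., R-1}."""
    B = [deque() for _ in range(R)]     # R empty buckets
    for x in A:
        B[sort_key(x)].append(x)        # equal items keep their original order
    i = 0
    for j in range(R):
        while B[j]:
            A[i] = B[j].popleft()       # move front element of B[j] to A[i]
            i += 1

A = [123, 230, 21, 320, 210, 232, 101]
bucket_sort(A, 10, lambda x: x % 10)    # sort-key: last base-10 digit
print(A)  # [230, 320, 210, 21, 101, 232, 123]
```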

Most-significant-digit(MSD)-radix-sort

**MSD-radix-sort(A, n, d ← 1)**
A: array of size n, contains m-digit radix-R numbers

if (d ≤ m and (n > 1))
    bucket-sort(A, n, ‘return dth digit of A[i]’)
    ℓ ← 0 // find sub-arrays and recurse
    for j ← 0 to R − 1
        Let r ≥ ℓ − 1 be maximal s.t. A[ℓ..r] have dth digit j
        MSD-radix-sort(A[ℓ..r],r−ℓ+1, d+1)
        ℓ ← r + 1

  • \(\Theta(m)\) levels of recursion in the worst case
  • \(\Theta(n)\) subproblems on most levels in the worst case
  • \(\Theta(R+\text{size of sub-array})\) time for each bucket-sort call
  • Run-time \(\Theta(mnR)\): slow, with many recursions and allocated arrays
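
A recursive Python sketch reusing bucket_sort from above; digits are extracted by integer arithmetic, and since Python slices are copies, each sub-array is written back after the recursive call (the pseudocode recurses on sub-array views instead):

```python
def msd_radix_sort(A, m, R, d=1):
    """Sort m-digit radix-R numbers in A; digit d = 1 is the most significant."""
    if d > m or len(A) <= 1:
        return
    digit = lambda x: (x // R ** (m - d)) % R   # d-th most significant digit
    bucket_sort(A, R, digit)
    lo = 0                                      # find sub-arrays and recurse
    for j in range(R):
        hi = lo
        while hi < len(A) and digit(A[hi]) == j:
            hi += 1
        sub = A[lo:hi]
        msd_radix_sort(sub, m, R, d + 1)
        A[lo:hi] = sub                          # write the sorted copy back
        lo = hi
```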

Least-significant-digit(LSD)-radix-sort

**LSD-radix-sort(A, n)**
A: array of size n, contains m-digit radix-R numbers

for d ← least significant to most significant digit do
    bucket-sort(A, n, ‘return dth digit of A[i]’)

Loop-invariant: \(A\) is sorted with respect to digits \(d, \ldots,m\) of each entry.

Time cost: \(\Theta(m(n+R))\)

Auxiliary space: \(\Theta(n+R)\)
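
The same bucket_sort yields a short LSD-radix-sort sketch (digit extraction by integer arithmetic is again an assumption about how the keys are represented):

```python
def lsd_radix_sort(A, m, R):
    """Sort m-digit radix-R numbers, least significant digit first."""
    for d in range(m):                  # d = 0 is the least significant digit
        bucket_sort(A, R, lambda x: (x // R ** d) % R)

A = [123, 230, 21, 320, 210, 232, 101]
lsd_radix_sort(A, 3, 10)
print(A)  # [21, 101, 123, 210, 230, 232, 320]
```

The stability of each bucket-sort pass is what maintains the loop-invariant above.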