Module 1: Introduction and Asymptotic Analysis¶
Random Access Machine (RAM) model¶

- Each memory cell stores one (finite-length) datum, typically a number, character, or reference. Assumption: cells are big enough to hold the items that we store.
- Any access to a memory location takes constant time.
- Any primitive operation takes constant time. (Add, subtract, multiply, divide, follow a reference, ...)
- Anything involving irrational numbers (e.g., computing \(\sqrt{2}\) exactly) is not a primitive operation.
These assumptions may not be valid for a “real” computer.
Running Time and Space¶
- The running time is the number of memory accesses plus the number of primitive operations.
- The space is the maximum number of memory cells ever in use.
- The size Size(\(I\)) of an instance \(I\) is the number of memory cells that \(I\) occupies.
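As an illustration of these counting conventions, here is a small Python sketch (the function `instrumented_sum` and its step-counting scheme are hypothetical, not part of the course material) that charges one unit per memory access and per primitive operation while summing an array:

```python
def instrumented_sum(A):
    """Sum the array A while tallying a rough RAM-model cost:
    one unit per memory access and one per primitive operation."""
    steps = 0
    total = 0            # one primitive operation (initialization)
    steps += 1
    for i in range(len(A)):
        total += A[i]    # one memory access + one addition
        steps += 2
    return total, steps

total, cost = instrumented_sum([3, 1, 4, 1, 5])
print(total, cost)       # the cost grows linearly in the size of A
```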
Order Notations¶
In this course, \(\log(x)\) denotes \(\log_2(x)\); that is, logarithms are base 2 unless stated otherwise.
The formal definition of \(O\)-notation is as follows.
Definition
Let \(f(x)\) and \(g(x)\) be two functions on the positive reals. We say that \(f(x)\) is asymptotically upper-bounded by \(g\) if there exist \(c > 0\) and \(n_0 \geq 0\) such that \(|f(x)| \leq c|g(x)|\) for all \(x \geq n_0\).
The set \(O(g(x))\) is the set of all those functions \(f\) on the positive reals that are asymptotically upper-bounded by \(g\).
Combining the two definitions, we say that \(f(x) \in O(g(x))\) (pronounced ‘\(f\) is in big-O of \(g\)’) if there exist \(c > 0\) and \(n_0 \geq 0\) such that \(|f(x)| \leq c|g(x)|\) for all \(x \geq n_0\).
As a general rule, using a bigger \(n_0\) can never hurt, but it does need to be a finite constant. With a bigger \(n_0\), it is (for \(O\)-notation) often possible to get by with a smaller \(c\).
How does one come up with the constants?
- Write down what you want to hold
- Once you have picked your \(c\), re-write what you need to show
- Resolve it for \(n\), and see what this tells you in terms of lower bounds for \(n\)
- Verify that it actually works, i.e., prove that \(f(n) \leq c\,g(n)\) for all \(n \geq n_0\) for your choice of \(c\) and \(n_0\) (see the worked example below).
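For example, to show that \(2n^2 + 3n \in O(n^2)\), this recipe gives:
\[
\text{want: } 2n^2 + 3n \leq c\,n^2 .
\]
Picking \(c = 3\), it remains to show \(3n \leq n^2\), which holds exactly when \(n \geq 3\). Verification: for all \(n \geq 3\) we have \(2n^2 + 3n \leq 2n^2 + n \cdot n = 3n^2\), so \(c = 3\) and \(n_0 = 3\) work.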
Definition
Let \(f(n, m)\) and \(g(n, m)\) be two functions with domain in \(\mathbb{R}^2\). We say that \(f(n, m) \in O(g(n, m))\) if there exist constants \(c > 0\) and \(n_0 \geq 0\) such that \(|f(n, m)| \leq c|g(n, m)|\) whenever \(n \geq n_0\) and \(m \geq n_0\).
Definition
Let \(f(x)\) and \(g(x)\) be two functions on the positive reals. We say that \(f(x) \in \Omega(g(x))\) (\(f\) is asymptotically lower-bounded by \(g\)) if there exist \(c > 0\) and \(n_0 \geq 0\) such that \(|f(x)| \geq c|g(x)|\) for all \(x \geq n_0\).
We only care about what happens for sufficiently large \(n\)!
Definition
Let \(f(x)\) and \(g(x)\) be two functions on the positive reals. We say that \(f(x) \in \Theta(g(x))\) if \(f(x) \in O(g(x))\) and \(f(x) \in \Omega(g(x))\). So there exist constants \(n_0 \geq 0\) and \(c_1, c_2 > 0\) such that \(c_1|g(x)| \leq |f(x)| \leq c_2|g(x)|\) for all \(x \geq n_0\).
We may be able to prove both the \(O\)-bound and the \(\Omega\)-bound in one shot, but this is rare. Usually, when proving a \(\Theta\)-bound, we have to prove the \(O\)-bound and the \(\Omega\)-bound separately.
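For example, to show \(\tfrac{1}{2}n^2 - 3n \in \Theta(n^2)\), the two bounds are typically found separately:
\[
\tfrac{1}{2}n^2 - 3n \leq \tfrac{1}{2}n^2 \quad \text{for all } n \geq 6 \ (\text{so } c_2 = \tfrac{1}{2}),
\qquad
\tfrac{1}{2}n^2 - 3n \geq \tfrac{1}{4}n^2 \iff \tfrac{1}{4}n^2 \geq 3n \iff n \geq 12 \ (\text{so } c_1 = \tfrac{1}{4}).
\]
Taking \(n_0 = 12\) makes both inequalities hold simultaneously.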
Common Growth Rates
Commonly encountered growth rates in analysis of algorithms include the following:
- \(\Theta(1)\) (constant),
- \(\Theta(\log n)\) (logarithmic),
- \(\Theta(n)\) (linear),
- \(\Theta(n \log n)\) (linearithmic),
- \(\Theta(n \log^k n)\), for some constant \(k\) (quasi-linear),
- \(\Theta(n^2)\) (quadratic),
- \(\Theta(n^3)\) (cubic),
- \(\Theta(2^n)\) (exponential).
These are sorted in increasing order of growth rate.
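To get a feel for how quickly these functions separate, here is a small self-contained Python sketch (the chosen sample sizes are arbitrary) that tabulates each growth rate:

```python
import math

# Evaluate each common growth rate at a few sample input sizes; the sample
# sizes are arbitrary and only meant to show how fast the values diverge.
growth_rates = {
    "1": lambda n: 1,
    "log n": lambda n: math.log2(n),
    "n": lambda n: n,
    "n log n": lambda n: n * math.log2(n),
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
    "2^n": lambda n: 2 ** n,
}

for name, f in growth_rates.items():
    values = [f(n) for n in (8, 16, 32, 64)]
    print(f"{name:>8}: {values}")
```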
Definition
Let \(f(x)\) and \(g(x)\) be two functions on the positive reals. We say that \(f(x) \in o(g(x))\) (\(f\) is asymptotically strictly smaller than \(g\)) if for any constant \(c > 0\) there exists an \(n_0 \geq 0\) such that \(|f(x)| \leq c|g(x)|\) for all \(x \geq n_0\).
The main difference between the definition of little-\(o\) and big-\(O\) is the change of quantifiers. For big-\(O\), the inequality must hold for one constant \(c\). You get to choose \(c\), and making it bigger will be to your advantage. In contrast, for little-\(o\) the inequality must hold for all constants \(c\). You have no control over \(c\), it will be given to you (and in particular, it could be arbitrarily small). Your only control is over \(n_0\).
Note that \(n_0\) will depend on \(c\), so it is really a function \(n_0(c)\). We also say 'the growth rate of \(f\) is less than the growth rate of \(g\)'. This is rarely proved from first principles; instead, one usually uses the limit rule (below).
Definition
We say that \(f(x) \in \omega(g(x))\) (\(f\) is asymptotically strictly larger than \(g\)) if for any constant \(c > 0\) there exists an \(n_0 \geq 0\) such that \(|f(x)| \geq c|g(x)|\) for all \(x \geq n_0\).
Symmetrically, we say 'the growth rate of \(f\) is greater than the growth rate of \(g\)'.
Lemma (The Limit Rule)
Suppose that \(f(n) > 0\) and \(g(n) > 0\) for all \(n \geq n_0\). Suppose that
\[
L = \lim_{n \to \infty} \frac{f(n)}{g(n)}
\]
exists (with \(L = \infty\) allowed). Then
\[
f(n) \in \begin{cases} o(g(n)) & \text{if } L = 0, \\ \Theta(g(n)) & \text{if } 0 < L < \infty, \\ \omega(g(n)) & \text{if } L = \infty. \end{cases}
\]
The rule tells us nothing if the limit does not exist! However, if the limit does not exist, then \(f(n) \notin o(g(n))\) (by the lemma further below).
The required limit can often be computed using L'Hôpital's rule: if the numerator and the denominator in a limit either both go to \(0\) or both go to infinity, then the limit is unchanged when the numerator and the denominator are replaced by their derivatives. Note that the limit rule gives sufficient (but not necessary) conditions for the stated conclusions to hold.
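For example, to compare \(n \log n\) with \(n^2\) via the limit rule:
\[
\lim_{n \to \infty} \frac{n \log n}{n^2} = \lim_{n \to \infty} \frac{\log n}{n}
\;\overset{\text{L'H}}{=}\; \lim_{n \to \infty} \frac{1/(n \ln 2)}{1} = 0,
\]
so \(n \log n \in o(n^2)\), and hence also \(n \log n \in O(n^2)\) and \(n \log n \notin \Omega(n^2)\).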
Algebra of Order Notations
Identity rule: \(f(n) \in \Theta(f(n))\)
Transitivity:
- If \(f(n) \in O(g(n))\) and \(g(n) \in O(h(n))\) then \(f(n) \in O(h(n))\).
- If \(f(n) \in \Omega(g(n))\) and \(g(n) \in \Omega(h(n))\) then \(f(n) \in \Omega(h(n))\).
- If \(f(n) \in O(g(n))\) and \(g(n) \in o(h(n))\) then \(f(n) \in o(h(n))\).
- \(\ldots\)
Maximum rules: Suppose that \(f(n) > 0\) and \(g(n) > 0\) for all \(n \geq n_0\). Then:
- \(f(n) + g(n) \in O(\max\{f(n), g(n)\})\)
- \(f(n) + g(n) \in \Omega(\max\{f(n), g(n)\})\)
Key proof-ingredient: \(\max\{f(n), g(n)\} \leq f(n) + g(n) \leq 2\max\{f(n), g(n)\}\)
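Both rules follow directly from this ingredient, since \(f(n), g(n) > 0\):
\[
f(n) + g(n) \leq 2\max\{f(n), g(n)\} \;\Rightarrow\; f(n) + g(n) \in O(\max\{f(n), g(n)\}) \quad (c = 2),
\]
\[
f(n) + g(n) \geq \max\{f(n), g(n)\} \;\Rightarrow\; f(n) + g(n) \in \Omega(\max\{f(n), g(n)\}) \quad (c = 1).
\]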
Relationships between Order Notations
- \(f(n) \in \Theta(g(n)) \iff g(n) \in \Theta(f(n))\)
- \(f(n) \in O(g(n)) \iff g(n) \in \Omega(f(n))\)
- \(f(n) \in o(g(n)) \iff g(n) \in \omega(f(n))\)
- \(f(n) \in \Theta(g(n)) \iff f(n) \in O(g(n)) \text{ and } f(n) \in \Omega(g(n))\)
- \(f(n) \in o(g(n)) \implies f(n) \in O(g(n))\)
- \(f(n) \in o(g(n)) \implies f(n) \not\in \Omega(g(n))\)
- \(f(n) \in \omega(g(n)) \implies f(n) \in \Omega(g(n))\)
- \(f(n) \in \omega(g(n)) \implies f(n) \not\in O(g(n))\)
When doing arithmetic with asymptotic notations, do not write \(O(n)+O(n) = O(n)\). Instead, replace ‘\(\Theta(f(n))\)’ by ‘\(c \cdot f(n) \text{ for some constant } c>0\)’.
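For example, if one part of an algorithm takes at most \(c_1 n\) steps and another takes at most \(c_2 n\) steps, write that the total is at most \((c_1 + c_2)n\) steps, which is in \(O(n)\), rather than writing \(O(n) + O(n) = O(n)\).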
The running time of an algorithm depends on the input size \(n\).
Techniques for Run-time Analysis
- Identify primitive operations that require \(\Theta(1)\) time. (When doing the arithmetic, assume each requires time \(c\) for some constant \(c > 0\).)
- The complexity of a loop is expressed as the sum of the complexities of each iteration of the loop.
- Nested loops: start with the innermost loop and proceed outwards. This gives nested summations.
**insertion-sort(A, n)**
A: array of size n
for (i ← 1; i < n; i++) do
    // A[0..i−1] is already sorted; move A[i] left into its correct position
    for (j ← i; j > 0 and A[j−1] > A[j]; j--) do
        swap A[j] and A[j − 1]
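Following the nested-summation approach for insertion-sort: in the worst case (a reverse-sorted array) the inner loop runs \(i\) times in iteration \(i\) of the outer loop, so the total work is proportional to
\[
\sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} \in \Theta(n^2),
\]
while in the best case (an already-sorted array) the inner loop stops immediately in every iteration and the running time is \(\Theta(n)\).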
Complexity of Algorithms¶
Worst-case (best-case) complexity of an algorithm
The worst-case (best-case) running time of an algorithm \(\mathcal{A}\) is a function \(T: \mathbb{Z}^+ \rightarrow \mathbb{R}\) mapping \(n\) (the input size) to the longest (shortest) running time for any input instance of size \(n\). Writing \(T_{\mathcal{A}}(I)\) for the running time of \(\mathcal{A}\) on instance \(I\):
- Worst-case running time: \(T_{\mathcal{A}}^{\,worst}(n) = \max\{\, T_{\mathcal{A}}(I) : \text{Size}(I) = n \,\}\)
- Best-case running time: \(T_{\mathcal{A}}^{\,best}(n) = \min\{\, T_{\mathcal{A}}(I) : \text{Size}(I) = n \,\}\)
Average-case complexity of an algorithm
The average-case running time of an algorithm \(\mathcal{A}\) is a function \(T: \mathbb{Z}^+ \rightarrow \mathbb{R}\) mapping \(n\) (the input size) to the average running time of \(\mathcal{A}\) over all instances of size \(n\):
\[
T_{\mathcal{A}}^{\,avg}(n) = \frac{1}{|\{ I : \text{Size}(I) = n \}|} \sum_{I : \,\text{Size}(I) = n} T_{\mathcal{A}}(I).
\]
Lemma
Let \(f(n)\), \(g(n)\) be two functions that satisfy \(f(n), g(n) > 0\) for sufficiently large \(n\). Then \(f(n) \in o(g(n))\) if and only if
\[
\lim_{n \to \infty} \frac{f(n)}{g(n)}
\]
exists and equals \(0\).
Polynomial-rule:
If \(f(n) = c_d n^d + c_{d-1} n^{d-1} + \cdots + c_0\) is a polynomial of degree \(d\) with \(c_d > 0\), then \(f(n) \in \Theta(n^d)\) (in particular, \(f(n) \in \Omega(n^d)\)).
Logarithm-rules: \(\log_b(n) = \frac{\log(n)}{\log(b)} \in \Theta(\log n)\) for any base \(b > 1\), so the base of a logarithm is irrelevant inside order notation.
Lemma (Generalized log-rule)
For any positive constants \(c > 0\) and \(d > 0\), we have \((\log n)^c \in o(n^d)\).
Lemma
Let \(f(n)\), \(g(n)\) be two functions that satisfy \(f(n), g(n) > 0\) for sufficiently large \(n\). If \(f(n) \in o(g(n))\), then \(f(n) \not\in \Omega(g(n))\).
Example: merge-sort¶
**merge-sort(A, n)**
A: array of size n
if (n ≤ 1) then return
else
    m = ⌊(n − 1)/2⌋
    merge-sort(A[0..m], m + 1)
    merge-sort(A[m + 1..n − 1], n − m − 1)
    merge(A[0..m], A[m + 1..n − 1])
Better implementation:
**merge-sort(A, n, ℓ ← 0, r ← n − 1, S ← NULL)**
A: array of size n, 0 ≤ ℓ ≤ r ≤ n − 1
if S is NULL then initialize it as an array S[0..n − 1]
if (r ≤ ℓ) then
    return
else
    m = ⌊(r + ℓ)/2⌋
    merge-sort(A, n, ℓ, m, S)
    merge-sort(A, n, m + 1, r, S)
    merge(A, ℓ, m, r, S)
Sub-routine merge:
**merge(A, ℓ, m, r, S)**
A[0..n − 1] is an array, A[ℓ..m] is sorted, A[m + 1..r] is sorted
S[0..n − 1] is an array
copy A[ℓ..r] into S[ℓ..r]
(iL, iR) ← (ℓ, m + 1) // start-indices of the two subarrays in S
for (k ← ℓ; k ≤ r; k++) do // fill-index for the result
    if (iL > m) A[k] ← S[iR++]
    else if (iR > r) A[k] ← S[iL++]
    else if (S[iL] ≤ S[iR]) A[k] ← S[iL++]
    else A[k] ← S[iR++]
Time complexity of merge: \(\Theta(n)\), where \(n = r − \ell + 1\) is the number of elements being merged.
Note
❗
Concept Explanation from ChatGPT:
This pseudocode outlines the merge step of merge sort, which combines two sorted subarrays into a single sorted array. Let's break it down step by step:
- Inputs and preconditions: A is the original array; ℓ, m, r are indices into A such that A[ℓ..m] and A[m + 1..r] are sorted subarrays; S is a temporary array used to hold a copy of the elements while they are merged back into A.
- Step 1: Copy the portion of A we are interested in (A[ℓ..r]) into the temporary array S. This prepares for merging without overwriting values we still need.
- Step 2: Initialize two pointers, iL and iR, to the start of the two subarrays in S: iL starts at ℓ and iR starts at m + 1. These pointers track which elements from the left and right subarrays have already been merged.
- Step 3: Use a loop with index k running from ℓ to r to fill A with the merged elements. Inside the loop:
  - Step 4: If iL has passed the end of the left subarray (iL > m), all elements of the left subarray are already in A, so take the next element from the right subarray (S[iR]) and increment iR.
  - Step 5: If iR has passed the end of the right subarray (iR > r), all elements of the right subarray are already in A, so take the next element from the left subarray (S[iL]) and increment iL.
  - Step 6: If neither subarray is exhausted, compare S[iL] and S[iR]; if the left element is smaller or equal, place it into A[k] and increment iL.
  - Step 7: Otherwise the right element (S[iR]) is smaller, so place it into A[k] and increment iR.
By the end of the loop, all elements from the two subarrays have been merged back into A in sorted order between indices ℓ and r. Merge sort repeats this process for each pair of subarrays produced by the recursion until the entire array is sorted.
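For reference, here is a minimal runnable Python sketch of the same index-based merge sort; the names (`merge_sort`, `merge`, `lo`, `mid`, `hi`, `S`) are my own choices, and the structure mirrors the pseudocode above under the same assumptions (a single shared auxiliary array, merging in place).

```python
def merge(A, lo, mid, hi, S):
    """Merge the sorted runs A[lo..mid] and A[mid+1..hi] back into A, using S as scratch."""
    S[lo:hi + 1] = A[lo:hi + 1]          # copy the range we are about to merge
    iL, iR = lo, mid + 1                 # heads of the left and right runs in S
    for k in range(lo, hi + 1):          # fill A[k] with the next smallest element
        if iL > mid:                     # left run exhausted
            A[k] = S[iR]; iR += 1
        elif iR > hi:                    # right run exhausted
            A[k] = S[iL]; iL += 1
        elif S[iL] <= S[iR]:             # left head is smaller (or equal): stable
            A[k] = S[iL]; iL += 1
        else:
            A[k] = S[iR]; iR += 1

def merge_sort(A, lo=0, hi=None, S=None):
    """Sort A[lo..hi] in place; one auxiliary array S is shared by all recursive calls."""
    if hi is None:
        hi = len(A) - 1
    if S is None:
        S = [None] * len(A)
    if hi <= lo:
        return
    mid = (lo + hi) // 2
    merge_sort(A, lo, mid, S)
    merge_sort(A, mid + 1, hi, S)
    merge(A, lo, mid, hi, S)

A = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(A)
print(A)   # [1, 2, 2, 3, 4, 5, 6, 7]
```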
The run-time of merge-sort¶
- The recurrence relation for \(T(n)\) is as follows (a constant factor \(c\) replaces the \(\Theta\)'s):
\[
T(n) = \begin{cases} T\!\left(\lceil n/2 \rceil\right) + T\!\left(\lfloor n/2 \rfloor\right) + cn & \text{if } n \geq 2, \\ c & \text{if } n \leq 1. \end{cases}
\]
- Initializing the auxiliary array \(S\) also takes time \(\Theta(n)\).
- The following is the corresponding sloppy recurrence (it has floors and ceilings removed):
\[
T(n) = 2\,T(n/2) + cn.
\]
- When \(n\) is a power of \(2\), then the exact and sloppy recurrences are identical and can easily be solved by various methods.
- E.g. prove by induction that \(T(n) = cn \log(2n) \in \Theta(n \log n)\).
- It is possible to show that \(T(n) \in \Theta(n \log n)\) for all \(n\) by analyzing the exact recurrence.
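As a sanity check of the induction suggested above (sloppy recurrence, \(T(1) = c\), \(n\) a power of 2): the induction hypothesis gives \(T(n/2) = c\,\tfrac{n}{2}\log(2 \cdot \tfrac{n}{2}) = c\,\tfrac{n}{2}\log n\), so
\[
T(n) = 2\,T(n/2) + cn = 2 \cdot c\,\tfrac{n}{2}\log n + cn = cn(\log n + 1) = cn \log(2n),
\]
and the base case holds since \(T(1) = c \cdot 1 \cdot \log 2 = c\).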
Explaining the solution of a problem¶
- Describe the overall idea
- Give pseudo-code or detailed description
- Argue correctness
- Typically state loop invariants or other key ingredients
- Sometimes obviously enough from idea-description and comments
- Analyze the run-time
- First analyze work done outside recursions
- If applicable, analyze subroutines separately
- If there are recursions: how big are the subproblems? The run-time then becomes a recursive function.
Recall
- Writing \(O(n)+O(n)=O(n)\) is very bad style
- It even occasionally leads to incorrect results
Some Recurrence Relations¶

- These bounds are tight if the upper bounds are tight
(*) These may or may not get used later in the course.