Introduction¶
Time Complexity¶
Roughly, the time complexity of an algorithm is the number of operations that the algorithm requires. However, the exact count is machine-dependent (it depends on which primitive operations are supported) and can be difficult to work out. So the standard practice is to use asymptotic time complexity to analyze algorithms.
Asymptotic Time Complexity¶
Given two functions \(f(n), g(n)\), we say:
- Upper bound (big-O): \(g(n) = O(f(n)) \quad \text{if} \quad \lim_{n \to \infty} \frac{g(n)}{f(n)} \leq c \quad \text{for some constant } c \text{ (independent of } n \text{)}\)
- Lower bound (big-Ω): \(g(n) = \Omega(f(n)) \quad \text{if} \quad \lim_{n \to \infty} \frac{g(n)}{f(n)} \geq c \quad \text{for some constant } c > 0\)
- Same order (big-Θ): \(g(n) = \Theta(f(n)) \quad \text{if} \quad \lim_{n \to \infty} \frac{g(n)}{f(n)} = c \quad \text{for some constant } c > 0\)
- Loose upper bound (small-o): \(g(n) = o(f(n)) \quad \text{if} \quad \lim_{n \to \infty} \frac{g(n)}{f(n)} = 0\)
- Loose lower bound (small-ω): \(g(n) = \omega(f(n)) \quad \text{if} \quad \lim_{n \to \infty} \frac{g(n)}{f(n)} = \infty\)
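As an informal sanity check, these limit-based definitions can be estimated numerically. A small Python sketch (the example functions \(g, f\) are chosen here purely for illustration): the ratio \(g(n)/f(n) = (3n^2 + 5n)/n^2\) tends to the positive constant 3, witnessing \(g(n) = \Theta(f(n))\).

```python
# Numerical sketch (example functions chosen for illustration):
# the ratio g(n)/f(n) tends to the positive constant 3, so
# g(n) = Theta(f(n)) under the limit-based definition above.

def g(n):
    return 3 * n**2 + 5 * n

def f(n):
    return n**2

for n in (10, 1_000, 1_000_000):
    print(n, g(n) / f(n))  # ratio approaches 3 as n grows
```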
Worst Case Complexity¶
- An algorithm has time complexity \(O(f(n))\) if it requires at most \(O(f(n))\) primitive operations for all inputs of size \(n\) (e.g., \(n\) bits, \(n\) numbers, \(n\) vertices, etc.).
- This analysis focuses only on deterministic algorithms.
- Asymptotic time complexity ignores constant factors and lower-order terms.
- Example:
- An algorithm with running time \(2^{2^{100}}n^2\) is asymptotically faster than one with running time \(n^3\).
- Practically, \(n^3\) might be faster for small \(n\).
- This situation is rare, but if it occurs:
- The quadratic algorithm can usually be improved to have a smaller constant.
- \(2^{2^{100}}n^2\) does run faster than \(n^3\) for large enough \(n\), making it theoretically more efficient.
- With faster computers, we can handle larger problem sizes, making asymptotic analysis more applicable.
- Even if asymptotic analysis doesn't always reflect practical performance, it is:
- A good general measure.
- Theoretically sound.
"Good" Algorithms¶
- For most optimization problems (e.g., the Traveling Salesman Problem), a straightforward exponential-time algorithm exists.
- For such problems, the goal is to design a polynomial-time algorithm with running time \(O(n^c)\) for some constant \(c\) independent of \(n\).
- Even though an algorithm with time complexity \(10^{10^{10}} \cdot n^{100}\) may not run faster than a \(2^n\) time algorithm in practice, it is still a major theoretical achievement.
- It implies the problem has fundamental properties that allow it to be solved faster than brute-force enumeration.
- This distinction sets such problems apart from those where brute-force is the best known method.
- The topic of polynomial-time computation will be revisited later in the course.
Computation Models¶
- When stating that an algorithm has a time complexity such as \(O(n^2)\), we must specify which primitive operations are assumed.
- In this course, we use the word-RAM model:
- Memory access at any array position is done in constant time.
- Word-level operations (e.g., addition, read/write) also take constant time.
- Example:
- In a graph with \(n\) vertices, identifying a vertex needs \(\log_2 n\) bits.
- We assume these bits fit into one word (e.g., 32 or 64 bits on most computers), so comparisons take constant time \(O(1)\).
- This model works well in practice:
- Because word size is large enough for memory access to take nearly constant time.
- Most data fits into main memory.
- Example:
- Binary search in word-RAM takes \(O(\log n)\) word operations.
- Not feasible on a Turing machine with 1D tape (no random access).
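A minimal binary search sketch (Python, 0-indexed): each comparison and index computation counts as one constant-time word operation in the word-RAM model, and the search range halves every step, giving \(O(\log n)\) word operations on a sorted array.

```python
def binary_search(a, target):
    """Return an index of target in sorted array a, or -1 if absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2       # constant-time index arithmetic (word-RAM)
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1           # target lies in the right half
        else:
            hi = mid - 1           # target lies in the left half
    return -1                      # not found
```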
- For numerical algorithms like determinant computation:
- Intermediate values can grow exponentially.
- It's unreasonable to assume constant-time arithmetic.
- For these, we use bit-complexity:
- Count the number of bit operations required.
- Be aware of these assumptions in time complexity, especially for numerical problems.
- In this course:
- The difference between word-complexity and bit-complexity is minor (e.g., a \(\log n\) factor).
- So, we stick to the word-RAM model.
3SUM Problem Summary¶
Problem¶
Given an array \(a_1, a_2, ..., a_n\) and a number \(c\), determine if there exist indices \(i, j, k\) such that \(a_i + a_j + a_k = c\)
Algorithm 1: Brute Force¶
- Enumerate all triples \(a_i, a_j, a_k\).
- Time complexity: \(O(n^3)\)
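A minimal brute-force sketch in Python, assuming the three indices must be distinct (\(i < j < k\)):

```python
from itertools import combinations

def three_sum_brute(a, c):
    """Check all O(n^3) value triples at distinct indices i < j < k."""
    for x, y, z in combinations(a, 3):
        if x + y + z == c:
            return True
    return False
```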
Algorithm 2: Sorting + Binary Search¶
- Rewrite as: \(a_i + a_j = c - a_k\)
- Enumerate all \(a_i + a_j\) and check if \(c - a_i - a_j\) exists in the array.
- Sort the array first, then binary search for each \(c - a_i - a_j\).
- Time complexity: \(O(n \log n + n^2 \log n) = O(n^2 \log n)\)
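A Python sketch of this approach using the standard `bisect` module. The walk over a run of equal values is an extra detail (an assumption here) to keep the three indices distinct; it visits at most a few positions, so the \(O(n^2 \log n)\) bound is unaffected.

```python
import bisect

def three_sum_search(a, c):
    """Sort once, then binary-search c - a[i] - a[j] for every pair."""
    a = sorted(a)                             # O(n log n)
    n = len(a)
    for i in range(n):                        # O(n^2) pairs
        for j in range(i + 1, n):
            target = c - a[i] - a[j]
            k = bisect.bisect_left(a, target)  # O(log n) binary search
            # Walk the run of equal values to find an index distinct
            # from i and j (only i and j can be excluded, so this is short).
            while k < n and a[k] == target:
                if k != i and k != j:
                    return True
                k += 1
    return False
```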
Algorithm 3: Reduce to 2SUM¶
- Fix each \(a_k\) and check whether \(a_i + a_j = c - a_k\) for some \(i, j\).
- For each \(k\), reduce to a 2SUM problem with target \(b = c - a_k\).
- Sort the array once, then for each \(b\), use two pointers to solve 2SUM in \(O(n)\).
- Time complexity: \(O(n^2)\)
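A Python sketch of the full \(O(n^2)\) algorithm, assuming the three indices must be distinct. The inner loop is the two-pointer 2SUM routine; skipping the fixed index \(k\) is a small extra step to keep the triple valid.

```python
def three_sum(a, c):
    """Sort once, then run two-pointer 2SUM for each fixed index k."""
    a = sorted(a)                 # O(n log n), done once
    n = len(a)
    for k in range(n):            # O(n) choices of the fixed element
        b = c - a[k]              # 2SUM target for the remaining pair
        L, R = 0, n - 1
        while L < R:              # O(n) two-pointer scan
            if L == k:            # skip the fixed index so the
                L += 1            # three indices stay distinct
                continue
            if R == k:
                R -= 1
                continue
            s = a[L] + a[R]
            if s == b:
                return True
            if s < b:
                L += 1            # sum too small: advance left pointer
            else:
                R -= 1            # sum too big: retreat right pointer
    return False
```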
2SUM (Given Sorted Array)¶
- Use two pointers \(L = 1, R = n\): start from both ends of the sorted array and move inward.
- If sum is too big, move right pointer left. If too small, move left pointer right.
- Loop stops when \(L \geq R\).
- Time complexity: \(O(n)\)
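The two-pointer routine above, sketched in Python (0-indexed here, while the notes use \(L = 1, R = n\)):

```python
def two_sum_sorted(a, b):
    """Return indices (L, R) with a[L] + a[R] == b in sorted a, else None."""
    L, R = 0, len(a) - 1
    while L < R:
        s = a[L] + a[R]
        if s == b:
            return (L, R)   # found a pair
        if s < b:
            L += 1          # sum too small: move left pointer right
        else:
            R -= 1          # sum too big: move right pointer left
    return None             # no pair exists
```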
Correctness¶
- If the algorithm finds \(a_i + a_j = b\), the answer DONE is clearly correct.
- It remains to prove: if there exist \(i, j\) with \(a_i + a_j = b\), the algorithm will not return NONE.

If there are no indices \(i, j\) such that \(a_i + a_j = b\), then there is nothing to find, so returning NONE is correct.
Suppose \(a_i + a_j = b\) for some \(i < j\).
Since \(L\) moves right from \(1\) and \(R\) moves left from \(n\) one step at a time until \(L = R\), there must be a first iteration where either \(L = i\) or \(R = j\).
Without loss of generality, assume \(L = i\) happens first, so at that moment \(R > j\). (The other case is symmetric.)
As long as \(R > j\), the sorted order gives \(a_L + a_R = a_i + a_R \geq a_i + a_j = b\): if equality holds, the algorithm returns DONE with a valid pair; otherwise the sum is too big, so the algorithm decreases \(R\) while \(L\) stays at \(i\).
Hence \(R\) decreases in each iteration until \(R = j\).
At that point, \(a_L + a_R = a_i + a_j = b\), so the algorithm correctly finds the pair.
Time Complexity¶
The algorithm runs in \(O(n)\) time because:
- In each iteration, either \(L\) increases or \(R\) decreases.
- Thus, \(R - L\) decreases by 1 each time.
- The maximum number of iterations is \(n - 1\) (initially, \(R-L=n-1\)), so total time is \(O(n)\).
Remark (Exam Tip)¶
- Stating Algorithm 2 or 3 alone earns partial marks; full marks require both the algorithm and a correctness proof.
- No need to write full code; describing the idea clearly is sufficient.
- Homework will focus more on correctness proofs.