Notes@HKU by Jax

Algorithm Analysis

We usually determine the efficiency of an algorithm by analyzing its time and space complexity.

Size of input

An input is a data structure that is given to an algorithm to solve a problem. The size of an input is the number $\mathbf{n}$ of elements in the input.

How do we assess time complexity?

When we analyze time complexity, we are interested in the efficiency of an algorithm as a function of the size of the input.

As different machines have varying processor speeds, wall-clock time is not a good measure of an algorithm's efficiency.

We can instead count the number of operations an algorithm performs relative to the size of the input, which isolates the algorithm from the machine it runs on.

However, different operations themselves have varying costs, so we consider the growth rate of the total number of operations as a function of the input size. This kind of analysis is called asymptotic analysis.
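
For instance (an added sketch, not from the original notes), we can count operations directly instead of timing the program; the count below depends only on the input size $n$, never on the machine. The helper `linear_scan` and its `ops` counter are hypothetical names introduced for illustration.

def linear_scan(arr, target):               # hypothetical helper for illustration
    ops = 0                                 # number of comparisons performed
    for x in arr:
        ops += 1                            # one comparison per element
        if x == target:
            return True, ops
    return False, ops

print(linear_scan(list(range(10)), -1))     # (False, 10)
print(linear_scan(list(range(100)), -1))    # (False, 100)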

Time complexity

The rate of growth of the total number of operations an algorithm performs relative to the size of the input.

Space complexity

The rate of growth of the total amount of memory an algorithm uses relative to the size of the input.

Asymptotic notation

Growth rate / Complexity: The basis of asymptotic notations

We define a function $\mathbf{T(n)}$ to represent the total number of operations an algorithm performs with respect to $\mathbf{n}$ (the size of the input).

When we say $T(n)$ is $A$ of $g(n)$ (or $T(n)\in A(g(n))$), we are saying that the growth rate of $T(n)$ is bounded by $g(n)$, up to a constant factor, under the specific mathematical inequality defined by $A$.

General definition of asymptotic notations

$$T(n) \in A(g(n))\ \text{iff}\ \exists\ c > 0,\ n_0 > 0: T(n) \simeq c\cdot g(n)\ \forall\ n \geq n_0$$

$$T(n) \in a(g(n))\ \text{iff}\ \forall\ c > 0\ \exists\ n_0 \geq 0: T(n) \sim c\cdot g(n)\ \forall\ n \geq n_0$$

The first line defines the non-strict notations ($A$ being $O$, $\Omega$ or $\Theta$, with $\simeq$ standing for the corresponding non-strict inequality); the second defines the strict notations ($a$ being $o$ or $\omega$, with $\sim$ standing for the corresponding strict inequality), as listed in the table below.

Note that $\in$ is interchangeable with $=$ in this context; the choice of symbol does not change the meaning.

To prove non-strict boundaries, we need to show the existence of constants $c, n_0$ that satisfy the inequality.
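
For instance (an illustrative proof, not one of the original examples), to show $3n^2 + 5n \in O(n^2)$ we can pick $c = 4$ and $n_0 = 5$:

$$3n^2 + 5n \leq 3n^2 + n^2 = 4n^2 = c\cdot n^2 \quad \forall\ n \geq n_0 = 5,$$

since $5n \leq n^2$ whenever $n \geq 5$. Exhibiting this pair $(c, n_0)$ completes the proof.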


Definition of all asymptotic notations

| Notation | Condition | Asymptotic boundary | Name |
| --- | --- | --- | --- |
| $O(g)$ | $T(n) \leq c\cdot g(n)$ | Upper | Big O |
| $\Omega(g)$ | $T(n) \geq c\cdot g(n)$ | Lower | Big Omega |
| $\Theta(g)$ | $c_1\cdot g(n) \leq T(n) \leq c_2\cdot g(n)$ | Tight | Big Theta / Order of growth |
| $o(g)$ | $T(n) < c\cdot g(n)$ | Strictly upper | Little o |
| $\omega(g)$ | $T(n) > c\cdot g(n)$ | Strictly lower | Little omega |

Growth rate functions

We use $g(n)$ to describe the complexity of $T(n)$. The following are the common growth rate functions, in increasing order of growth:

$$1 < \log n < \sqrt{n} < n < n\log n < n^2 < n^3 < 2^n < n! < n^n$$

Note that the higher the growth rate, the more complex the algorithm.

Note the following growth rate equivalences:

  • $\log(n!)=\Theta(n\log n)$ (proof not required, related to the lower bound of comparison sorting algorithms; a short sketch is given below)
  • $\Theta(\log_2 n)=\Theta(\log_{10} n)$, since $\log_2 n = \frac{\log n}{\log 2} = c\times\log n$
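
A quick sketch of the first equivalence (added for intuition, not part of the original notes):

$$\log(n!) = \sum_{i=1}^{n}\log i \leq n\log n, \qquad \log(n!) \geq \sum_{i=\lceil n/2\rceil}^{n}\log i \geq \frac{n}{2}\log\frac{n}{2},$$

so $\log(n!)$ is bounded above and below by constant multiples of $n\log n$ for large $n$, i.e. $\log(n!)\in\Theta(n\log n)$.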

Identifying asymptotic growths

Non-strict: the term with the highest growth rate in $T(n)$ is $g(n)$.

Strict: choose the next (for a strictly upper bound) or previous (for a strictly lower bound) growth rate function from the list above.

You can use the same logic to determine whether some $f(n)$ is $A/a$ of $g(n)$.

Note that since the definitions are specified by an inequality, multiple $g(n)$ can satisfy the condition for the same $T(n)$, and all of them are valid. However, the most useful $g(n)$ for analysis is the one closest to the condition boundary (the tightest bound).

If the function fluctuates, it does not have a consistent growth rate unless a boundary is specified.
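
As a quick illustration (an added example, not from the original notes): for $T(n) = 3n^2 + 10n\log n + 5$, the highest-growth term is $n^2$, so

$$T(n)\in\Theta(n^2),\qquad T(n)\in o(n^3),\qquad T(n)\in\omega(n\log n),$$

while looser bounds such as $T(n)\in O(n^3)$ are also valid but less useful.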


Big Theta of recurrence relations

When asked to find the asymptotic notation of a recurrence relation, we are required to give the Big Theta notation of the relation.

The straightforward way is to find the closed-form expression and identify its growth rate. However, this is not always possible.

Useful growth rates

  • Sum of square series: $\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6} \in \Theta(n^3)$
  • Sum of series: $\sum_{i=1}^{n} i = \frac{n(n+1)}{2} \in \Theta(n^2)$ (used in the worked example below)
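
For example (an added derivation, not from the original notes), unrolling the recurrence $T(n) = T(n-1) + n$ with $T(1) = 1$ gives a closed form directly from the sum-of-series fact above:

$$T(n) = n + (n-1) + \dots + 2 + 1 = \sum_{i=1}^{n} i = \frac{n(n+1)}{2} \in \Theta(n^2).$$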

Time complexity of code

Operations on time complexities

By definition:

$$O(g)+O(g)=k\times O(g)=O(g)$$

$$O(g)\times O(g)=O(g\times g)$$

$$O(g_1)+O(g_2)=O(g_2),\quad g_2 > g_1\text{ in terms of order of growth}$$

Our goal is to examine how the runtime grows with respect to the input size. Consider the following example:

def find_max(arr):
    max_val = arr[0]        # 1
    for num in arr:         # n
        if num > max_val:   # n
            max_val = num   # n
    return max_val          # 1

It should be acknowledged that each operation / line takes a different amount of time to compute. However, as we are interested in the growth rate of the runtime $T(n)$, we can treat every operation as taking $1$ unit of time.

Hence, $T(n)=O(1)+O(n)+O(n)+O(n)+O(1)=O(n)$.
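
As a further illustration (an added sketch, not from the original notes), the same line-by-line counting applied to a hypothetical nested-loop function gives a quadratic bound:

def count_pairs(arr):                   # hypothetical example function
    n = len(arr)                        # 1
    count = 0                           # 1
    for i in range(n):                  # n
        for j in range(n):              # n * n
            if arr[i] + arr[j] == 0:    # n * n
                count += 1              # at most n * n
    return count                        # 1

print(count_pairs([1, -1, 2]))          # 2

Hence $T(n)=O(1)+O(1)+O(n)+O(n^2)+O(n^2)+O(n^2)+O(1)=O(n^2)$, matching the $n^k$ row in the table below.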

Typical growth rates and common occurrences

| Growth | Occurrence |
| --- | --- |
| $1$ | Statements that run a fixed number of times, regardless of the input size |
| $\log n$ | Algorithms that solve a big problem by transforming it into a series of smaller problems, cutting the problem size by a constant fraction at each step |
| $n$ | "for" loops that perform $n$ iterations per program run |
| $n\log n$ | Algorithms that solve a problem by breaking it up into smaller sub-problems, solving them independently, and then combining the solutions |
| $n^k$ | $k$ nested "for" loops, each performing $n$ iterations per program run |
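
For instance (an added sketch, not from the original notes), binary search cuts the problem size in half at each step, so its loop runs $O(\log n)$ times, matching the $\log n$ row above:

def binary_search(arr, target):     # arr is assumed to be sorted
    lo, hi = 0, len(arr) - 1
    while lo <= hi:                 # repeats about log2(n) times
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1            # discard the lower half
        else:
            hi = mid - 1            # discard the upper half
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))    # 3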
