Notes@HKU by Jax

Searching

When we search for data, we want to find the data as quickly as possible.

Searching through unorganized data must be $O(n)$, as we have to look at every element to find the data. This is called linear / sequential search.

This is not necessarily true for organized data, where we can gain information about the data without looking at every element, such as with a sorted list.

We will focus on searching algorithms on a sorted list in this section.

Jump search

Works by:

  • Access linearly every $\lfloor\sqrt{n}\rfloor$ elements
  • If the element is greater than the target, go back to the previous jump
  • Linearly search the elements in that jump

Amount of comparisons $= \frac{n}{\sqrt{n}} + \sqrt{n} - 1$

Time complexity: $O(\sqrt{n})$
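A minimal sketch of jump search on a sorted Python list; the function name `jump_search` and the use of `math.isqrt` are my own choices for illustration, not from the notes:

```python
import math

def jump_search(A, target):
    """Return the index of target in sorted list A, or -1 if absent."""
    n = len(A)
    step = math.isqrt(n) or 1          # jump size ~ floor(sqrt(n))
    prev = 0
    # Jump ahead in blocks of `step` until the block's last element reaches the target
    while prev < n and A[min(prev + step, n) - 1] < target:
        prev += step
    # Linearly search inside the block that may contain the target
    for i in range(prev, min(prev + step, n)):
        if A[i] == target:
            return i
    return -1
```

For example, `jump_search([1, 3, 5, 7, 9, 11], 7)` jumps in blocks of 2 and returns index 3.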

Binary search

Works by:

  • Start at the middle of the list
  • If the element is greater than the target, search the left half
  • If the element is less than the target, search the right half

Time complexity: $O(\log n)$

Remember that the middle must be recalculated each time.
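A minimal iterative sketch (function and variable names are my own), recalculating the middle on every iteration as noted above:

```python
def binary_search(A, target):
    """Return the index of target in sorted list A, or -1 if absent."""
    low, high = 0, len(A) - 1
    while low <= high:
        mid = (low + high) // 2        # middle is recalculated each iteration
        if A[mid] == target:
            return mid
        elif A[mid] > target:          # middle element too big: search the left half
            high = mid - 1
        else:                          # middle element too small: search the right half
            low = mid + 1
    return -1
```

For example, `binary_search([1, 3, 5, 7, 9], 7)` probes indices 2 then 3 and returns 3.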

Interpolation search

An improved version of binary search. Instead of searching the middle, we search the interpolated position (probe) based on the target.

$\text{probe} = \text{low} + \frac{\text{high} - \text{low}}{A[\text{high}] - A[\text{low}]} \times (\text{target} - A[\text{low}])$

Time complexity: $\Theta(\log \log n)$ on average, $O(n)$ in the worst case

Note: This is only effective when the data is evenly distributed (roughly linear). Otherwise, such as when the data grows exponentially, it degrades to $O(n)$ time complexity.
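A sketch of interpolation search using the probe formula above. It assumes integer keys, and the guard against a zero denominator (all keys in the range equal) is my own addition:

```python
def interpolation_search(A, target):
    """Return the index of target in sorted list A of integers, or -1 if absent."""
    low, high = 0, len(A) - 1
    while low <= high and A[low] <= target <= A[high]:
        if A[high] == A[low]:                      # avoid division by zero
            return low if A[low] == target else -1
        # Probe position interpolated from the key values
        probe = low + (high - low) * (target - A[low]) // (A[high] - A[low])
        if A[probe] == target:
            return probe
        elif A[probe] < target:
            low = probe + 1
        else:
            high = probe - 1
    return -1
```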

Tree searching

We can use a tree to search for data. BSTs give $\Theta(\log n)$ search on average, and AVL trees guarantee it, as we don't have to access every element to gain information about them.

This is built upon the idea of binary search.

Binary search tree

Binary search tree (BST)

A binary tree (at most two child nodes per node) that satisfies the following properties:

  • Left node: key $\leq$ root.key
  • Right node: key $>$ root.key

The height of a balanced BST is $\log n$.
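A minimal sketch of a BST node and insertion that follows these two properties; the `Node` class and `insert` function are my own illustration, not from the notes:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the BST rooted at root; return the (possibly new) root."""
    if root is None:
        return Node(key)
    if key <= root.key:                 # keys <= root go to the left subtree
        root.left = insert(root.left, key)
    else:                               # keys > root go to the right subtree
        root.right = insert(root.right, key)
    return root
```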

Time complexities

Consider the height of a BST:

  • Min height: $\lfloor\log_2 n\rfloor + 1$
  • Max height: $n$ (all nodes are in a straight line)

We can then conclude that the time complexities for insert, search and delete are $\Theta(\log n)$ when the tree is balanced and $O(n)$ in the worst case.

Traversal methods

The following are three recursive methods to traverse a tree:

  • Pre-order: root $\to$ left $\to$ right
  • In-order: left $\to$ root $\to$ right
  • Post-order: left $\to$ right $\to$ root

Where you access the root and recursively traverse the left and right subtrees. When we consider trees, we always think recursively. These are all DFS methods.
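Minimal recursive sketches of the three traversals, assuming the hypothetical `Node` class from the BST sketch above:

```python
def preorder(node):
    if node is None:
        return
    print(node.key)        # root first
    preorder(node.left)
    preorder(node.right)

def inorder(node):
    if node is None:
        return
    inorder(node.left)
    print(node.key)        # root between the subtrees; yields sorted order on a BST
    inorder(node.right)

def postorder(node):
    if node is None:
        return
    postorder(node.left)
    postorder(node.right)
    print(node.key)        # root last
```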

To search for a node in a BST, we simply capitalize on the properties of the tree:

  1. Start at the root
  2. If the key is less than the root, go left
  3. If the key is greater than the root, go right
  4. Return if equal

Remember, we always consider trees recursively.
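A recursive sketch of BST search following the steps above, again assuming the hypothetical `Node` class:

```python
def search(node, key):
    """Return the node holding key, or None if key is not in the tree."""
    if node is None or node.key == key:   # return if equal (or the key is absent)
        return node
    if key < node.key:                    # smaller keys are in the left subtree
        return search(node.left, key)
    return search(node.right, key)        # larger keys are in the right subtree
```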

Balancing trees

The balance factor

$B(n) = \text{left.height} - \text{right.height}$

A perfectly balanced tree has $B(n) = 0$.
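A sketch of computing the balance factor recursively, assuming the same `Node` class; treating an empty subtree as height 0 is one common convention (my assumption here):

```python
def height(node):
    """Height of the subtree rooted at node; an empty subtree counts as 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    """B(n) = left.height - right.height."""
    return height(node.left) - height(node.right)
```

Real AVL implementations usually cache each node's height instead of recomputing it on every call.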

AVL tree

A self-balancing binary search tree where $|B(n)| \leq 1$.

Self-balancing means that when modifying the tree, it will rotate the tree to fix imbalance.

Time complexities

The time complexities for insert, search and delete in an AVL tree are all $\Theta(\log n)$.

Rotations

After inserting a node at the correct position, or deleting a node, the tree may become unbalanced. We need to check each processed node for imbalance and rotate the tree to fix it.

The following are two types of rotation:

(Figure: the two rotations, left and right)

Notice that they are mirror images of each other.

Consider the following balance conditions for each processed node to decide which rotations to apply:

l = node.left
r = node.right
if B(node) > 1:            # left-heavy
    if B(l) < 0:           # left-right case: rotate the left child first
        rotateLeft(l)
    rotateRight(node)
elif B(node) < -1:         # right-heavy
    if B(r) > 0:           # right-left case: rotate the right child first
        rotateRight(r)
    rotateLeft(node)
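For reference, a minimal sketch of the two rotations used above, assuming the same `Node` class. These return the new subtree root, so the caller must re-attach it to the parent, whereas the pseudocode above assumes the rotation re-links the parent in place; real AVL implementations also update cached heights here:

```python
def rotateRight(node):
    """Rotate node to the right; its left child becomes the new subtree root."""
    pivot = node.left
    node.left = pivot.right
    pivot.right = node
    return pivot            # caller re-attaches the returned subtree root

def rotateLeft(node):
    """Rotate node to the left; its right child becomes the new subtree root."""
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    return pivot
```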
