
Probabilities

Definitions

Complement

The complement of an event $A$ is denoted by $A^c$ and is defined as the event that $A$ does not occur.

$$P(A^c) = 1 - P(A)$$

Mutually exclusive

Two events $A$ and $B$ are said to be mutually exclusive if they cannot occur at the same time.

$$P(A \cap B) = 0$$

Example

For example, when tossing a coin, the events "heads" and "tails" are mutually exclusive.
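As a quick sanity check, here is a minimal Python sketch (the fair-die model and events are my own, not from the notes) that verifies the complement rule and mutual exclusivity by enumerating outcomes:

```python
from fractions import Fraction

# Uniform sample space of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}
p = lambda event: Fraction(len(event), len(omega))  # P(E) = |E| / |Omega|

A = {2, 4, 6}       # "roll is even"
A_c = omega - A     # complement: "roll is odd"

assert p(A_c) == 1 - p(A)   # P(A^c) = 1 - P(A)
assert p(A & A_c) == 0      # A and A^c are mutually exclusive
```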

Union

The union of two events $A$ and $B$ is denoted by $A \cup B$ and is defined as the event that at least one of the events occurs.

The probability of the union of two events is given by:

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

This is the $n = 2$ case of the general inclusion-exclusion principle:

$$P\left(\bigcup_{i=1}^n A_i\right) = \sum_{i=1}^n P(A_i) - \sum_{1 \leq i < j \leq n} P(A_i \cap A_j) + \sum_{1 \leq i < j < k \leq n} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P(A_1 \cap A_2 \cap \cdots \cap A_n)$$

For events that are mutually exclusive, the formula simplifies to:

$$P\left(\bigcup_{i=1}^n A_i\right) = \sum_{i=1}^n P(A_i)$$

That is, the probability of the union is just the sum of the probabilities of the events.
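To make the inclusion-exclusion machinery concrete, here is a small Python sketch (the 12-outcome uniform space and the three divisibility events are invented for illustration) that checks the alternating sum against the exact probability of the union, and checks the union bound from the next subsection along the way:

```python
from fractions import Fraction
from itertools import combinations

omega = set(range(1, 13))                           # uniform 12-outcome space
p = lambda event: Fraction(len(event), len(omega))

events = [
    {n for n in omega if n % 2 == 0},               # multiples of 2
    {n for n in omega if n % 3 == 0},               # multiples of 3
    {n for n in omega if n % 4 == 0},               # multiples of 4
]

# Alternating sum over all non-empty subcollections of the events.
incl_excl = sum(
    (-1) ** (k + 1) * p(set.intersection(*combo))
    for k in range(1, len(events) + 1)
    for combo in combinations(events, k)
)

union = set.union(*events)
assert incl_excl == p(union)                        # inclusion-exclusion is exact
assert p(union) <= sum(p(e) for e in events)        # union bound holds
```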

Union bound

$$P\left(\bigcup_{i=1}^n A_i\right) \leq \sum_{i=1}^n P(A_i)$$
Proof

$$\begin{aligned} \text{Base case: } n = 1 &\to P(A_1) \leq P(A_1)\\ \text{Inductive step: assume true for some } n = k&: \quad P\left(\bigcup_{i=1}^k A_i\right) \leq \sum_{i=1}^k P(A_i)\\ \text{Then for } n = k+1: \quad P\left(\bigcup_{i=1}^{k+1} A_i\right) &= P\left(\left(\bigcup_{i=1}^k A_i\right) \cup A_{k+1}\right)\\ \left(\text{as } P(A \cup B) = P(A) + P(B) - P(A \cap B)\right) &= P\left(\bigcup_{i=1}^k A_i\right) + P(A_{k+1}) - P\left(\left(\bigcup_{i=1}^k A_i\right) \cap A_{k+1}\right)\\ \left(\because P\left(\left(\bigcup_{i=1}^k A_i\right) \cap A_{k+1}\right) \geq 0\right) &\leq P\left(\bigcup_{i=1}^k A_i\right) + P(A_{k+1})\\ &\leq \sum_{i=1}^{k} P(A_i) + P(A_{k+1}) = \sum_{i=1}^{k+1} P(A_i) \end{aligned}$$

Conditionals

Conditional probabilities

The probability of $A$ given that $B$ occurs: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
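A minimal Python sketch of the definition, assuming the same fair-die model as above (the events are my own choice):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
p = lambda event: Fraction(len(event), len(omega))

A = {6}            # "roll is a six"
B = {2, 4, 6}      # "roll is even"

p_A_given_B = p(A & B) / p(B)   # P(A | B) = P(A ∩ B) / P(B)
assert p_A_given_B == Fraction(1, 3)
```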

Exhaustive events

A set of events is exhaustive if at least one of them must occur in the sample space (the universe of possible outcomes).

A set of events partitions a sample space if the events are:

  1. Mutually exclusive
  2. Collectively exhaustive
  3. $P(A_i) \neq 0$ for all $i$

Law of Total Probability

For events $E_1, E_2, \ldots, E_n$ that partition a sample space $S$:

$$P(A) = \sum_{i=1}^n P(A \mid E_i)\,P(E_i)$$
Proof

Decompose $A$ over the partition, then convert to probabilities:

$$\begin{aligned} A &= A \cap S\\ &= A \cap (E_1 \cup E_2 \cup \cdots \cup E_n)\\ &= (A \cap E_1) \cup (A \cap E_2) \cup \cdots \cup (A \cap E_n)\\ \because E_i \cap E_j = \emptyset \text{ for } i \neq j \implies P(A) &= P(A \cap E_1) + P(A \cap E_2) + \cdots + P(A \cap E_n)\\ \because P(A \mid E_i) = \frac{P(A \cap E_i)}{P(E_i)} \implies P(A \cap E_i) &= P(A \mid E_i)\,P(E_i)\\ \therefore P(A) &= \sum_{i=1}^n P(A \mid E_i)\,P(E_i) \end{aligned}$$
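The identity is easy to check numerically. A short Python sketch, using an invented partition of a 12-outcome uniform space (the partition and the event $A$ are illustrative, not from the notes):

```python
from fractions import Fraction

omega = set(range(1, 13))
p = lambda event: Fraction(len(event), len(omega))

# E_1, E_2, E_3 are mutually exclusive and collectively exhaustive.
E = [{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}]
A = {2, 3, 5, 7, 11}    # "outcome is prime"

p_A_given = lambda Ei: p(A & Ei) / p(Ei)          # P(A | E_i)
total = sum(p_A_given(Ei) * p(Ei) for Ei in E)    # Σ P(A|E_i) P(E_i)
assert total == p(A)
```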

Bayes' Theorem

$$\begin{aligned} P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}\\ &= \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A^c)\,P(A^c)} \end{aligned}$$

We can generalize the theorem via the law of total probability. Suppose that $A$ is an event from a sample space $S$ that is partitioned by events $E_1, E_2, \ldots, E_n$. For any event $E_k$ in the partition:

$$P(E_k \mid A) = \frac{P(A \mid E_k)\,P(E_k)}{\sum_{i=1}^n P(A \mid E_i)\,P(E_i)}$$
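As a worked example, here is a Python sketch of the generalized form. The scenario and all numbers are hypothetical (three machines with different production shares and defect rates); the point is only the mechanics of the formula:

```python
from fractions import Fraction

# Hypothetical priors P(E_i): each machine's share of production.
p_E = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
# Hypothetical likelihoods P(A | E_i): each machine's defect rate.
p_A_given_E = [Fraction(1, 100), Fraction(2, 100), Fraction(5, 100)]

# Denominator via the law of total probability: P(A) = Σ P(A|E_i) P(E_i).
p_A = sum(pa * pe for pa, pe in zip(p_A_given_E, p_E))

# Posterior P(E_k | A): which machine likely produced a defective item.
posterior = [pa * pe / p_A for pa, pe in zip(p_A_given_E, p_E)]
assert sum(posterior) == 1      # posteriors over a partition sum to 1
```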

Independence

Independent events

$A$ and $B$ are said to be independent if $P(A) \times P(B) = P(A \cap B)$.

Being independent means that the occurrence of one event has no influence on the probability of the other.

Pairwise independence

For all pairs of events $A$ and $B$ in a set of events, $P(A \cap B) = P(A) \times P(B)$.

Mutually independent

The multiplication rule holds for every subcollection of events in the set; in particular, $P(\cap^n_{i=1} A_i) = P(A_1) \times P(A_2) \times \cdots \times P(A_n)$.

By definition, all events inside a mutually independent set of events are pairwise independent. The reverse might not hold: $n$ pairwise independent events might not be mutually independent, as the sketch below shows.
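The classic counterexample is two fair coin tosses with the events "first toss is heads", "second toss is heads", and "both tosses agree". The example is standard; the Python sketch is my own:

```python
from fractions import Fraction
from itertools import product

omega = set(product("HT", repeat=2))    # two fair coin tosses, uniform
p = lambda event: Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == "H"}   # first toss is heads
B = {w for w in omega if w[1] == "H"}   # second toss is heads
C = {w for w in omega if w[0] == w[1]}  # both tosses agree

# Pairwise independent: every pair satisfies the product rule.
assert p(A & B) == p(A) * p(B)
assert p(A & C) == p(A) * p(C)
assert p(B & C) == p(B) * p(C)

# ...but not mutually independent: the triple product rule fails.
assert p(A & B & C) != p(A) * p(B) * p(C)   # 1/4 vs 1/8
```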

Random variables

Notation

Random variables are denoted with capital letters $X$.

The possible outcomes are denoted with lowercase letters $x$.

The probability that the outcome of $X$ is $x$ is denoted by $P(X=x)$.

Expected value and Variance

$E(X^n)=\sum(x^n\cdot P(X=x))$ gives the $n$-th moment; taking $n = 1$ gives the expected value $E(X)$, which represents the mean value (outcome) of the random variable.

$Var(X)=E(X^2)-E(X)^2$ gives the variance, which is a measure of the variability of the random variable's outcomes.
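Both formulas are direct sums over the pmf. A minimal Python sketch for a fair six-sided die (my own example):

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}   # P(X = x) for a fair die

E_X = sum(x * px for x, px in pmf.items())       # E(X)   = Σ x  P(X=x)
E_X2 = sum(x**2 * px for x, px in pmf.items())   # E(X^2) = Σ x² P(X=x)
Var_X = E_X2 - E_X**2                            # Var(X) = E(X^2) - E(X)^2

assert E_X == Fraction(7, 2)
assert Var_X == Fraction(35, 12)
```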

Operations on E(X) and Var(X)

$E(X+Y)=E(X)+E(Y)$: expectation is linear, so any addition or subtraction inside $E(\cdot)$ can be expanded term by term.

If XX and YY are independent:

$$Var(X+Y) = Var(X) + Var(Y),\quad\quad E(XY) = E(X)\cdot E(Y)$$

For $Y = aX + b$:

$$E(Y) = aE(X) + b,\quad\quad Var(Y) = a^2\,Var(X)$$
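These identities can be checked exactly on the same die pmf; $a = 3$ and $b = 5$ below are arbitrary constants of my choosing:

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}            # fair die again
E = lambda f: sum(f(x) * px for x, px in pmf.items())     # E[f(X)]

a, b = 3, 5
E_X = E(lambda x: x)
Var_X = E(lambda x: x**2) - E_X**2

E_Y = E(lambda x: a * x + b)                    # E(aX + b)
Var_Y = E(lambda x: (a * x + b) ** 2) - E_Y**2  # Var(aX + b)

assert E_Y == a * E_X + b
assert Var_Y == a**2 * Var_X
```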

Binomial distribution

Bernoulli trials

A Bernoulli trial is an experiment or process that results in a binary outcome: success or failure.

The probability of success is denoted by $p$ and the probability of failure is denoted by $q = 1 - p$.

Binomial distribution

The binomial distribution is a discrete probability distribution that describes the number of successes in a fixed number of independent Bernoulli trials.

The probability of getting exactly $k$ successes in $n$ trials is given by:

$$P(X=k) = \binom{n}{k} p^k q^{n-k}$$

Where $\binom{n}{k}$ is the binomial coefficient, which represents the number of ways to choose $k$ successes from $n$ trials.
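A short Python sketch of the pmf using math.comb; the parameters $n = 10$, $p = 1/2$ are an arbitrary example:

```python
from fractions import Fraction
from math import comb

def binom_pmf(k: int, n: int, p: Fraction) -> Fraction:
    """P(X = k) for X ~ Binomial(n, p), with q = 1 - p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, Fraction(1, 2)               # e.g. 10 fair coin flips
assert binom_pmf(3, n, p) == Fraction(15, 128)

# Sanity check: the pmf sums to 1 over k = 0..n.
assert sum(binom_pmf(k, n, p) for k in range(n + 1)) == 1
```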
