Chapter 1 - Basics and Bayes' Rule
De Morgan's Laws
- \( (S \cap T)^c = S^c \cup T^c \) and \( (S \cup T)^c = S^c \cap T^c \)
- \( (S^c \cap T^c)^c = S \cup T \)
Problem 4. Parking lot problem
Mary and Tom park their cars in an empty parking lot with \( n \geq 2 \) consecutive parking spaces (i.e., \( n \) spaces in a row, where only one car fits in each space). Mary and Tom pick parking spaces at random; of course, they must each choose a different space. (All pairs of distinct parking spaces are equally likely.) What is the probability that there is at most one empty parking space between them?
View answer
- When the first car is at the head or tail of the row (2 positions), the second car has 2 valid choices for each, giving 4 configurations.
- When the first car is one space in from the head or tail (2 positions), the second car has 3 valid choices for each, giving 6 configurations.
- When the first car is at any of the remaining \( n-4 \) positions, the second car has 4 valid choices for each, giving \( 4(n-4) \) configurations.
- The total number of ordered configurations is \( n(n-1) \).
- So the answer is \( \frac{4(n-4) + 4 + 6}{n(n-1)} = \frac{4n-6}{n(n-1)} \). (The casework assumes \( n \geq 4 \), but the simplified formula also gives the correct answer of 1 for \( n = 2, 3 \).)
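A quick way to validate this kind of counting argument is brute-force enumeration. Below is a minimal sketch (the helper name `parking_probability` is mine, not from the notes) that enumerates all ordered pairs of distinct spaces and compares against the closed form:

```python
# Sketch: brute-force check of the parking-lot answer (4n - 6) / (n(n - 1)).
from fractions import Fraction
from itertools import permutations

def parking_probability(n):
    """Exact probability that the two cars end up with at most
    one empty space between them, by enumerating all ordered pairs."""
    # |a - b| <= 2 means the cars are 0, 1, or 2 spaces apart,
    # i.e., at most one empty space between them.
    favorable = sum(1 for a, b in permutations(range(n), 2) if abs(a - b) <= 2)
    return Fraction(favorable, n * (n - 1))

for n in range(2, 10):
    assert parking_probability(n) == Fraction(4 * n - 6, n * (n - 1))
print("formula (4n - 6) / (n(n - 1)) confirmed for n = 2..9")
```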
Three important tools
- Multiplication rule
- Total probability theorem
- Bayes' rule (-> inference)
Conditional probability
\[ P(A | B) = \frac{ P(A \cap B)}{ P(B)} \]
\[ P(A | B) P(B) = P(B | A) P(A) \]
Combining this with the total probability theorem (below) gives Bayes' rule:
\[ P(A | B) = \frac{ P(B | A) P(A) } { P(B | A) P(A) + P(B | A^C) P( A^C)} \]
Multiplication rule
\[ P(A \cap B) = P(A) P(B | A), \qquad P(A \cap B \cap C) = P(A) P(B | A) P(C | A \cap B) \]
Total probability theorem
\( P(B) = P(B | A) P(A) + P(B | A^C) P( A^C) \) can be generalized to:
\[ P(B) = \sum_{i} P(B | A_i) P(A_i) \]
given that the events \( A_i \) are disjoint and \( \sum_i P(A_i) = 1 \) (i.e., they partition the sample space).
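As a concrete illustration of the total probability theorem feeding into Bayes' rule, here is a small numeric sketch; the scenario and all numbers (0.01, 0.95, 0.10) are hypothetical, chosen only for illustration:

```python
# Sketch: Bayes' rule via the total probability theorem,
# using a made-up diagnostic-test example (all numbers hypothetical).
p_A      = 0.01   # P(A): prior, e.g. P(disease)
p_B_A    = 0.95   # P(B | A): e.g. P(positive test | disease)
p_B_notA = 0.10   # P(B | A^c): false-positive rate

# Total probability theorem: P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)
p_B = p_B_A * p_A + p_B_notA * (1 - p_A)

# Bayes' rule: P(A | B) = P(B|A)P(A) / P(B)
p_A_B = p_B_A * p_A / p_B
print(f"P(B) = {p_B:.4f}, P(A|B) = {p_A_B:.4f}")  # posterior ~ 0.0876
```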
Definition of independence:
\[ P(A \cap B) = P(A) \cdot P(B) \]
This also implies \( P(B | A) = P(B) \) and \( P(A | B) = P(A) \), as well as \( P(A \cap B^C) = P(A) \cdot P(B^C) \).
Note that independence has no relation to disjointness (indeed, two disjoint events with positive probability are never independent, since the occurrence of one rules out the other). Independence is a relation about information, and it is important to always keep in mind its intuitive meaning: two events are independent if the occurrence of one does not change our beliefs about the other.
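To make the point concrete, here is a sketch that checks independence on the two-dice sample space; the events chosen (first die even, sum equals 7) are my own example and are independent despite overlapping:

```python
# Sketch: checking independence numerically on a two-dice sample space.
# A = "first die shows an even number", B = "sum of the dice is 7".
# These events are independent even though they are clearly not disjoint.
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes

def prob(event):
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] % 2 == 0
B = lambda w: w[0] + w[1] == 7
AB = lambda w: A(w) and B(w)

assert prob(AB) == prob(A) * prob(B)   # 1/12 == 1/2 * 1/6
print(prob(A), prob(B), prob(AB))      # 1/2 1/6 1/12
```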
Conditional independence
\[ P(A \cap B | C) = P(A | C) \cdot P(B | C) \]
Note that independence does not imply conditional independence, and vice versa.
Permutation & Combination
Permutation - pick \( r \) items out of \( n \) items, order matters:
\[
P = \frac{n!}{(n-r)!}
\]
Combination - pick \( r \) items out of \( n \) items, order does not matter:
\[
C = \frac{P}{r!} = \frac{n!}{(n-r)! \cdot r!}
\]
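For reference, the standard library computes both counts directly (`math.perm` and `math.comb`, available in Python 3.8+); a minimal sketch:

```python
# Sketch: permutation and combination counts via the standard library.
import math

n, r = 5, 2
print(math.perm(n, r))   # 20 = 5!/3!          (order matters)
print(math.comb(n, r))   # 10 = 5!/(3! * 2!)   (order does not matter)
# C = P / r!, as in the formula above:
assert math.comb(n, r) == math.perm(n, r) // math.factorial(r)
```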
Number of subsets of an \( n \)-element set:
\[
2^n
\]
\[ \sum_{k=0}^{n} \left( \begin{array}{c} n \\ k \end{array} \right) = 2^n \]
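A small sketch verifying the subset count and the binomial-sum identity above by direct enumeration (variable names are mine):

```python
# Sketch: enumerate all subsets of an n-element set and check that
# there are 2^n of them, which is exactly sum_k C(n, k).
import math
from itertools import combinations

n = 6
items = range(n)
# Subsets of size k, for every k from 0 to n:
subsets = [s for k in range(n + 1) for s in combinations(items, k)]
assert len(subsets) == 2 ** n == sum(math.comb(n, k) for k in range(n + 1))
print(len(subsets))  # 64
```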
Number of possible outcomes when a fair 6-sided die is tossed \( n \) times in a row:
\[
6^n
\]
Binomial coefficient \( \left( \begin{array}{c} n \\ k \end{array} \right) \)
- \( n \geq 1 \) independent coin tosses with P(H) = p: \[ P(k \text{ heads}) = \left( \begin{array}{c} n \\ k \end{array} \right) p^k (1-p)^{n-k} \]
\[ \sum_{k=0}^{n} \left( \begin{array}{c} n \\ k \end{array} \right) p^k (1-p)^{n-k} = 1 \]
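A short sketch of this PMF and the normalization identity; `binomial_pmf` is my own helper name:

```python
# Sketch: the binomial PMF for n tosses with P(H) = p,
# and a check that it sums to 1 over k = 0..n.
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
print(binomial_pmf(3, n, p))   # P(exactly 3 heads) ~ 0.2668
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
assert abs(total - 1.0) < 1e-12
```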
Multinomial coefficient - the number of ways to partition \( n \) items into groups of sizes \( n_1, \dots, n_r \) (with \( n_1 + \cdots + n_r = n \)):
\[ \left( \begin{array}{c} n \\ n_1, n_2, \dots, n_r \end{array} \right) = \frac{n!}{n_1! \, n_2! \cdots n_r!} \]
Chapter 2 - Discrete Random Variables
Random variable examples:
- Bernoulli
- Uniform
- Binomial
- Geometric
Random Variables
Random variables can be understood as a function that maps a random outcome (a sample point) to a numerical value.
A random variable can be seen as a function!!!
Notation: random variable \( X \), numerical value \( x \).
Probability Mass Function (PMF)
The PMF is also called the probability distribution of the random variable.
Bernoulli
- X = 1, with probability p,
- X = 0, with probability 1 - p.
E[X] = p;
var(X) = E[X^2] - (E[X])^2 = p - p^2 = p(1-p)
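A Monte Carlo sanity check of these two Bernoulli formulas (sample estimates, so the match is approximate):

```python
# Sketch: estimate E[X] and var(X) for Bernoulli(p) by simulation
# and compare against p and p(1 - p).
import random
from statistics import fmean, pvariance

p, trials = 0.3, 100_000
samples = [1 if random.random() < p else 0 for _ in range(trials)]
print(fmean(samples), p)                 # sample mean ~ 0.3
print(pvariance(samples), p * (1 - p))   # sample variance ~ 0.21
```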
Uniform
- each outcome has the same probability
- \( p_X(x) = 1 / (b-a+1) \) for \( x \in \{a, a+1, \dots, b\} \)
- E[X] = (a + b) / 2
- var(X) = 1/12 * (b-a) * (b-a+2)
For the continuous case (uniform on the interval \( [a, b] \)):
- E[X] = (a + b) / 2
- var(X) = 1/12 * (b-a)^2
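The discrete-uniform formulas can be verified by direct enumeration; a sketch with exact rational arithmetic (the bounds a = 3, b = 9 are arbitrary):

```python
# Sketch: verify the discrete-uniform mean and variance on {a, ..., b}
# by direct enumeration against the closed forms above.
from fractions import Fraction

a, b = 3, 9
values = range(a, b + 1)
p = Fraction(1, b - a + 1)            # each value has PMF 1/(b-a+1)
mean = sum(p * x for x in values)
var = sum(p * (x - mean) ** 2 for x in values)
assert mean == Fraction(a + b, 2)
assert var == Fraction((b - a) * (b - a + 2), 12)
print(mean, var)   # 6 and 4
```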
Binomial
\( n \) independent tosses of a coin with P(Heads) = p, with \( X \) = number of heads observed.
E(X) = E(X1) + E(X2) ... + E(Xn) = p + p + p + ... p = n * p
var(X) = var(X1 + X2 + ...) = var(X1) + var(X2)... = n * var(X1) = n * BernoulliVar = np(1-p)
Note: the second equality above holds because the tosses are independent (variances add only for independent random variables).
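A simulation sketch of this decomposition: drawing X as a sum of n independent Bernoulli(p) tosses and comparing the sample moments with np and np(1-p):

```python
# Sketch: X = X1 + ... + Xn as a sum of independent Bernoulli(p) tosses;
# check E[X] = np and var(X) = np(1 - p) by simulation.
import random
from statistics import fmean, pvariance

n, p, trials = 20, 0.4, 50_000
samples = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]
print(fmean(samples), n * p)                 # ~ 8.0
print(pvariance(samples), n * p * (1 - p))   # ~ 4.8
```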
Geometric
- Infinitely many independent tosses of a coin with P(Heads) = p; \( X \) = number of tosses until the first Heads: \( P(X=k) = (1-p)^{k-1} p \). If p = 0.5, the probabilities are 1/2, 1/4, 1/8, ...
- Models the time we have to wait until something happens; the continuous counterpart is the exponential distribution.
- Memorylessness: given that the first toss was Tails, the remaining number of tosses until the first Heads is again geometric with the same p; equivalently, \( P(X > m + n \mid X > m) = P(X > n) \).
- E[X] = 1 / p
- var(X) = (1-p)/p^2
For the continuous case (the exponential distribution with parameter \( \lambda \)):
- E[X] = \( 1/\lambda \)
- var(X) = \( 1/\lambda^2 \)
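To close, a simulation sketch of the geometric distribution; `wait_for_heads` is my own helper name. It checks E[X] = 1/p and illustrates memorylessness: conditioned on X > 3, the remaining wait X - 3 has the same mean:

```python
# Sketch: simulate the geometric distribution and its memorylessness.
import random
from statistics import fmean

def wait_for_heads(p):
    """Number of independent tosses until the first Heads."""
    k = 1
    while random.random() >= p:   # keep tossing while we see Tails
        k += 1
    return k

p, trials = 0.2, 100_000
samples = [wait_for_heads(p) for _ in range(trials)]
print(fmean(samples), 1 / p)   # sample mean ~ 5 = 1/p

# Memorylessness: given X > 3, the remaining wait X - 3 is again geometric.
tail = [x - 3 for x in samples if x > 3]
print(fmean(tail))             # ~ 5 again
```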