Markov Chain
Introduction to Markov Chains
Recommended Prerequisites
- Probability
- Probability II
- Sampling
- Introduction to Multivariate Distributions
Basic Concepts
Stochastic Process
A stochastic process is a collection of random variables \(\{X_t\}_{t\in T}\) indexed by a set \(T\), often interpreted as time.
Markov Property
A stochastic process \(\{X_t\}_{t\geq 0}\) satisfies the Markov property if, for all \(t\geq 0\) and for all states \(x_0, x_1, \dots, x_{t}, x_{t+1}\),
$$P(X_{t+1}=x_{t+1}|X_{t}=x_{t},X_{t-1}=x_{t-1},\dots,X_0=x_0)=P(X_{t+1}=x_{t+1}|X_t=x_t)$$
In other words, only the present state affects the probability of the next transition.
Markov Chain
A Markov Chain is a discrete-time stochastic process with the Markov property and a discrete state space \(S\). This state space can be:
- Finite: \(S=\{1,2,\dots,N\}\)
- Countably Infinite: \(S=\mathbb{N}\) or any other countably infinite set.
Transition Probability
The dynamics of a Markov Chain are characterized by its transition probabilities. For a time-homogeneous chain (assumed throughout), these do not depend on \(t\):
$$P_{ij}=P(X_{t+1}=j|X_{t}=i),\quad\forall i,j\in S$$
These transition probabilities can be arranged into a transition matrix where:
- Each element \(P_{ij}\) represents the probability of transitioning from state \(i\) to state \(j\) in one time step
- Each row sums to 1: \(\sum_{j}P_{ij}=1\)
- All entries are non-negative: \(P_{ij}\geq 0\)
The second and third characteristics imply that \(P_{ij}\leq 1\).
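As a minimal sketch (assuming NumPy; the two-state "weather" chain below is hypothetical), a transition matrix can be stored as a 2D array and checked against these properties:

```python
import numpy as np

# Hypothetical two-state weather chain: state 0 = sunny, state 1 = rainy.
# P[i, j] is the probability of moving from state i to state j in one step.
P = np.array([
    [0.9, 0.1],  # sunny -> sunny, sunny -> rainy
    [0.5, 0.5],  # rainy -> sunny, rainy -> rainy
])

assert np.all(P >= 0), "all entries must be non-negative"
assert np.allclose(P.sum(axis=1), 1.0), "each row must sum to 1"
```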
Absorbing State
An absorbing state \(i\) satisfies \(P_{ii}=1\). Once entered, the chain remains in this state indefinitely.
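As an illustration (a hypothetical three-state chain in which state 2 is absorbing), the rows of \(\mathbf{P}^n\) show the chain being absorbed from every starting state:

```python
import numpy as np

# Hypothetical chain where state 2 is absorbing: P[2, 2] = 1.
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.0, 0.0, 1.0]])

# For large n, every row of P^n puts almost all its mass on state 2:
# absorption is eventually certain from any starting state here.
print(np.linalg.matrix_power(P, 100)[:, 2])  # all entries close to 1
```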
Intermediate Concepts
Chapman-Kolmogorov Equations
The \(n\)-step transition probability is the probability of transitioning from state \(i\) to state \(j\) in \(n\) steps:
$$P_{ij}^{(n)}=P(X_{t+n}=j|X_t=i)$$
This leads to the Chapman-Kolmogorov Equations:
$$P_{ij}^{(n+m)}=\sum_{k\in S}P_{ik}^{(n)}P_{kj}^{(m)}$$
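In matrix form, \(P_{ij}^{(n)}\) is the \((i,j)\) entry of \(\mathbf{P}^n\), so the Chapman-Kolmogorov equations reduce to \(\mathbf{P}^{n+m}=\mathbf{P}^n\mathbf{P}^m\). A quick numerical check, reusing the hypothetical two-state chain from above:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])  # hypothetical two-state chain

n, m = 3, 4
P_n = np.linalg.matrix_power(P, n)       # n-step transition probabilities
P_m = np.linalg.matrix_power(P, m)       # m-step transition probabilities
P_nm = np.linalg.matrix_power(P, n + m)  # (n+m)-step transition probabilities

# Chapman-Kolmogorov in matrix form: P^(n+m) = P^n P^m.
assert np.allclose(P_nm, P_n @ P_m)
```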
Stationary Distribution
A stationary distribution \(\pi=(\pi_i)_{i\in S}\) is a probability distribution over \(S\) that satisfies:
$$\pi_j=\sum_{i\in S}\pi_i P_{ij},\quad \forall j\in S,$$
or alternatively written:
$$\vec{\pi}=\vec{\pi}\mathbf{P}$$
where \(\vec{\pi}\) is the row vector of probabilities over the states and \(\mathbf{P}\) is the transition matrix.
The chain's initial distribution need not equal \(\pi\); the more interesting question is whether the distribution of \(X_t\) converges to \(\pi\).
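Before turning to convergence, here is a minimal sketch (one of several equivalent approaches) of computing \(\vec{\pi}\) numerically, as the left eigenvector of \(\mathbf{P}\) with eigenvalue 1, normalized to sum to 1:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])  # hypothetical two-state chain

# Left eigenvectors of P are right eigenvectors of P^T.
eigvals, eigvecs = np.linalg.eig(P.T)

# Pick the eigenvector whose eigenvalue is closest to 1 and normalize
# it so that its entries sum to 1 (a probability distribution).
idx = np.argmin(np.abs(eigvals - 1.0))
pi = np.real(eigvecs[:, idx])
pi = pi / pi.sum()

assert np.allclose(pi @ P, pi)  # stationarity: pi P = pi
print(pi)                       # [5/6, 1/6] for this chain
```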
Convergence
The distribution of \(X_t\) converges to the stationary distribution \(\pi\) as \(t\to\infty\) if:
$$\lim_{t\to\infty}P(X_t=j|X_0=i)=\pi_j,\quad \forall i,j\in S$$
Total Variation Distance
The total variation distance between two probability distributions \(\mu\) and \(\nu\) on \(S\) is:
$$||\mu-\nu||_{TV}=\frac{1}{2}\sum_{i\in S}|\mu_i-\nu_i|$$
The mixing time is the number of steps needed for the chain's distribution to come within a given total variation distance \(\epsilon\) of the stationary distribution.
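A sketch of both ideas for the same hypothetical two-state chain; the threshold \(\epsilon=0.01\) is an arbitrary choice for illustration:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi = np.array([5/6, 1/6])  # stationary distribution of this chain

def tv_distance(mu, nu):
    """Total variation distance: half the L1 distance."""
    return 0.5 * np.abs(mu - nu).sum()

mu = np.array([0.0, 1.0])  # start deterministically in state 1
for t in range(1, 50):
    mu = mu @ P                     # distribution of X_t
    if tv_distance(mu, pi) < 0.01:  # within epsilon of stationarity
        print(f"within 0.01 of pi after {t} steps")
        break
```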
Ergodicity
For a Markov Chain to be ergodic, it must satisfy the following three properties:
- Irreducible
- Aperiodic
- Positive Recurrent
Irreducibility
A Markov chain state \(j\) is accessible from state \(i\) (denoted \(i\rightarrow j\)) if there exists
\(n\geq 0\) such that \(P_{ij}^{(n)}\gt 0\) where \(P_{ij}^{(n)}\) is the \(n\)-step transition probability.
It is said that states \(i\) and \(j\) communicate (denoted \(i\leftrightarrow j\)) if \(i\rightarrow j\) and \(j\rightarrow i\).
A Markov Chain is irreducible if every pair of states communicates, meaning that it's possible to reach any state from any other state.
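Irreducibility can be checked mechanically for a finite chain. A sketch, using the fact that for an \(N\)-state chain, \(j\) is accessible from \(i\) iff \((\mathbf{I}+\mathbf{P})^{N-1}\) has a positive \((i,j)\) entry:

```python
import numpy as np

def is_irreducible(P):
    """Check that every pair of states communicates.

    j is accessible from i iff P^n has a positive (i, j) entry for
    some 0 <= n <= N-1, i.e. iff (I + P)^(N-1) is entrywise positive.
    """
    N = P.shape[0]
    reach = np.linalg.matrix_power(np.eye(N) + P, N - 1)
    return bool(np.all(reach > 0))

# Irreducible chain: both states reach each other.
P1 = np.array([[0.9, 0.1],
               [0.5, 0.5]])
# Reducible chain: state 1 is absorbing, so state 0 is not accessible from it.
P2 = np.array([[0.9, 0.1],
               [0.0, 1.0]])

print(is_irreducible(P1))  # True
print(is_irreducible(P2))  # False
```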
Periodicity
The period of a state \(i\) is:
$$d(i)=\text{gcd}\{n\geq 1: P_{ii}^{(n)}\gt 0\}$$
- A state \(i\) is aperiodic if \(d(i)=1\)
- A state \(i\) is periodic if \(d(i)\gt 1\)
An aperiodic chain is one where all states are aperiodic.
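A sketch that estimates \(d(i)\) for small finite chains by collecting the return lengths \(n\leq n_{\max}\) with \(P_{ii}^{(n)}\gt 0\) and taking their gcd (the cutoff \(n_{\max}\) is a practical approximation; the definition ranges over all \(n\geq 1\)):

```python
import numpy as np
from functools import reduce
from math import gcd

def period(P, i, max_n=100):
    """Estimate the period of state i: gcd of all n <= max_n
    such that the chain can return to i in exactly n steps."""
    returns = []
    Pn = np.eye(P.shape[0])
    for n in range(1, max_n + 1):
        Pn = Pn @ P
        if Pn[i, i] > 0:
            returns.append(n)
    return reduce(gcd, returns) if returns else 0

# Deterministic 2-cycle: returns to state 0 only at even times, so d(0) = 2.
P_cycle = np.array([[0.0, 1.0],
                    [1.0, 0.0]])
# Adding self-loops ("laziness") makes every state aperiodic: d(0) = 1.
P_lazy = np.array([[0.5, 0.5],
                   [0.5, 0.5]])

print(period(P_cycle, 0))  # 2
print(period(P_lazy, 0))   # 1
```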
Recurrence and Transience
- A state \(i\) is recurrent if, starting from \(i\), the chain returns to \(i\) with probability 1
- A state \(i\) is transient if there is a non-zero probability that the chain will never return to \(i\) after leaving it.
There are two categories of recurrent states:
- Positive Recurrent: If the expected return time to state \(i\) is finite.
- Null Recurrent: If the expected return time to state \(i\) is infinite, even though the chain returns to \(i\) with probability 1.
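For a positive recurrent, irreducible chain, a standard identity (Kac's formula) relates return times to the stationary distribution: the expected return time to state \(i\) equals \(1/\pi_i\). A simulation sketch for the hypothetical two-state chain:

```python
import numpy as np

rng = np.random.default_rng(1)
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])  # hypothetical two-state chain, pi = [5/6, 1/6]

def mean_return_time(P, i, trials=10_000):
    """Estimate the expected return time to state i by simulation."""
    total = 0
    for _ in range(trials):
        x, steps = i, 0
        while True:
            x = rng.choice(P.shape[0], p=P[x])  # sample the next state
            steps += 1
            if x == i:
                break
        total += steps
    return total / trials

print(mean_return_time(P, 0))  # close to 1 / (5/6) = 1.2
print(mean_return_time(P, 1))  # close to 1 / (1/6) = 6.0
```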
Reversible Markov Chains
Detailed Balance Condition
A Markov Chain satisfies the detailed balance condition with respect to a distribution \(\pi\) if:
$$\pi_{i}P_{ij}=\pi_{j}P_{ji},\quad\forall i,j\in S$$
This means the flow of probability from \(i\) to \(j\) is exactly balanced by the flow from \(j\) to \(i\); summing the condition over \(i\) shows that any such \(\pi\) is stationary.
A chain is reversible if it satisfies the detailed balance condition with respect to its stationary distribution.
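A sketch verifying detailed balance for a hypothetical lazy random walk on the path \(\{0,1,2\}\) (a birth-death chain, which is always reversible with respect to its stationary distribution):

```python
import numpy as np

# Hypothetical lazy random walk on {0, 1, 2}.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
pi = np.array([0.25, 0.50, 0.25])  # its stationary distribution

# Detailed balance: pi_i P_ij == pi_j P_ji for every pair (i, j),
# i.e. the "flow" matrix is symmetric.
flow = pi[:, None] * P  # flow[i, j] = pi_i * P_ij
assert np.allclose(flow, flow.T)

# Summing detailed balance over i recovers stationarity: pi P = pi.
assert np.allclose(pi @ P, pi)
```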
Limit Theorems for Markov Chains
Law of Large Numbers
For an ergodic Markov Chain, the time average of a function converges to its expectation under the stationary distribution:
$$\lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n}f(X_t)=\mathbb{E}_{\pi}[f(X)],\quad\text{almost surely}$$
Central Limit Theorem
For an ergodic Markov Chain (under suitable moment conditions on \(f\)), the normalized time average is asymptotically normal:
$$\sqrt{n}\left(\frac{1}{n}\sum_{t=1}^{n}f(X_t)-\mathbb{E}_{\pi}[f(X)]\right)\overset{d}{\rightarrow}\mathcal{N}(0,\sigma^2)$$
where \(\sigma^2\) is the asymptotic variance, which accounts for autocorrelation along the chain.
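A simulation sketch of the Law of Large Numbers for the hypothetical two-state chain, taking \(f\) to be the indicator of state 0 (an arbitrary choice), so that \(\mathbb{E}_{\pi}[f(X)]=\pi_0\):

```python
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])  # hypothetical two-state chain
pi = np.array([5/6, 1/6])   # its stationary distribution

# Simulate one long trajectory and track visits to state 0.
n = 100_000
x = 0
visits_to_0 = 0
for _ in range(n):
    x = rng.choice(2, p=P[x])  # sample X_{t+1} from row x of P
    visits_to_0 += (x == 0)

# LLN: the time average of f(X_t) = 1{X_t = 0} approaches pi_0.
print(visits_to_0 / n)  # close to 5/6 ~ 0.833
print(pi[0])
```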