Distribution Textbook (Work in Progress)

by John Della Rosa

Markov Chain Monte Carlo

Introduction to MCMC

Recommended Prerequisites

  1. Probability
  2. Probability II
  3. Sampling
  4. Introduction to Markov Chains

Theory

By constructing a Markov Chain whose stationary distribution is the target distribution, \(\pi\), and simulating the chain for a sufficient number of steps, we can obtain samples from \(\pi\).

Coupling

Coupling involves constructing two copies of a Markov Chain on the same probability space such that they eventually meet and, from that point on, evolve together. Coupling arguments are a standard tool for bounding how quickly a chain converges to its stationary distribution.

Example: Simple Random Walk

Consider a simple random walk on the state space \(S=\{0,1,\dots,N-1\}\), the integers modulo \(N\) (a circle of \(N\) states). Its stationary distribution is uniform over \(S\), so, provided the chain is aperiodic (for example, for odd \(N\) or a lazy variant that stays put with some probability), running it for enough steps yields approximately uniform samples from \(S\).
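Below is a minimal simulation sketch in Python with NumPy (the lazy variant, \(N=10\), the starting points, and the run length are illustrative choices, not from the text). It couples two copies of a lazy random walk on \(S\) by running them independently until they first meet, and then checks that the long-run occupancy of a single chain is close to uniform.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10  # number of states on the circle (illustrative choice)

def lazy_step(x):
    """One step of a lazy simple random walk on Z_N:
    stay with prob 1/2, otherwise move +1 or -1 (mod N)."""
    u = rng.random()
    if u < 0.5:
        return x
    return (x + (1 if u < 0.75 else -1)) % N

# Coupling: run two copies from different starts, independently,
# until they first meet; after meeting they would share every move,
# so the coupled pair remains equal forever.
x, y, t = 0, N // 2, 0
while x != y:
    x, y, t = lazy_step(x), lazy_step(y), t + 1
print(f"chains coupled after {t} steps")

# Long-run occupancy of one chain approximates the uniform distribution.
counts = np.zeros(N)
state = 0
for _ in range(200_000):
    state = lazy_step(state)
    counts[state] += 1
print(np.round(counts / counts.sum(), 3))  # each entry is roughly 1/N = 0.1
```

The lazy step is what makes the chain aperiodic; without it, a walk on an even cycle alternates between even and odd states and never converges in distribution.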

Metropolis Algorithm

The Metropolis Algorithm is one of the most common MCMC methods. Its primary purpose is to generate a sequence of samples from a target distribution \(\pi(x)\), especially when \(\pi(x)\) is known only up to a normalizing constant or is too complex for direct sampling.

Conceptual Framework

Markov Chain Construction

The algorithm constructs a Markov Chain whose stationary distribution is the desired target distribution \(\pi(x)\).

Proposal Mechanism

At each iteration, a new candidate state \(x^*\) is proposed based on the current state \(x_t\), using a proposal distribution \(q(x^*|x_t)\).

Acceptance Criterion

The proposed state \(x^*\) is accepted or rejected based on an acceptance probability \(\alpha\), which ensures that the chain has the correct stationary distribution.

Detailed Balance Condition

The Metropolis algorithm satisfies the detailed balance condition, which is a sufficient condition for the chain to have \(\pi(x)\) as its stationary distribution: $$\pi(x_{t})q(x^*|x_t)\alpha(x_t,x^*)=\pi(x^*)q(x_t|x^*)\alpha(x^*,x_t)$$
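As a quick numerical illustration (a sketch assuming Python with NumPy; the three-state target weights and the uniform proposal are invented for the example), we can build the Metropolis transition matrix on a small finite state space and verify both detailed balance and stationarity directly:

```python
import numpy as np

# Unnormalized target weights on a 3-state space (illustrative values).
w = np.array([1.0, 2.0, 4.0])
pi = w / w.sum()

n = len(pi)
q = np.full((n, n), 1.0 / n)  # symmetric (uniform) proposal q(j|i)

# Metropolis transition matrix:
# P[i, j] = q(j|i) * min(1, pi_j / pi_i) for j != i,
# with the leftover probability placed on the diagonal (rejections).
P = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            P[i, j] = q[i, j] * min(1.0, pi[j] / pi[i])
    P[i, i] = 1.0 - P[i].sum()

# Detailed balance: pi_i * P_ij == pi_j * P_ji for all i, j.
flow = pi[:, None] * P
assert np.allclose(flow, flow.T)

# Consequently pi is stationary: pi P = pi.
assert np.allclose(pi @ P, pi)
print("detailed balance and stationarity verified")
```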

Symmetric Proposal Distribution

In the original Metropolis algorithm, the proposal distribution \(q(x^*|x_t)\) is symmetric: $$q(x^*|x_t)=q(x_t|x^*)$$ This symmetry simplifies the acceptance probability to: $$\alpha=\min\left(1,\frac{\pi(x^*)}{\pi(x_t)}\right)$$
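The cancellation can be seen numerically (a small sketch in Python with NumPy; the particular values \(x_t=0\), \(x^*=1.5\), \(\sigma=2\) anticipate the worked example below):

```python
import numpy as np

def q_pdf(a, b, sigma=2.0):
    """Normal proposal density q(a | b) = N(a; b, sigma^2)."""
    return np.exp(-(a - b) ** 2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

x_t, x_star = 0.0, 1.5
# The density depends only on (a - b)^2, so q(x*|x_t) = q(x_t|x*)
# and the q terms cancel in the acceptance ratio.
print(np.isclose(q_pdf(x_star, x_t), q_pdf(x_t, x_star)))  # True
```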

Random Walk Behavior

The algorithm often employs a random walk proposal, where the next state is proposed by adding a random perturbation to the current state.

Example

Suppose we want to sample from a univariate target distribution \(\pi(x)\) defined on \(\mathbb{R}\). The target distribution is proportional to: $$\pi(x)\propto e^{-(x^4-16x^2+5x)}$$ This density is bimodal and does not correspond to any standard family, making direct sampling challenging.

Define Target Distribution

We define the unnormalized target density function: $$\pi(x)\propto e^{-(x^4-16x^2+5x)}$$ The normalizing constant isn't needed for the Metropolis algorithm.
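In code, only the unnormalized log-density is required (a minimal sketch in Python with NumPy; working with \(\log\pi\) rather than \(\pi\) avoids numerical overflow, since the exponents here can be large):

```python
import numpy as np

def log_target(x):
    """Unnormalized log-density: log pi(x) = -(x^4 - 16 x^2 + 5 x).
    The normalizing constant is omitted; Metropolis only needs ratios."""
    return -(x**4 - 16 * x**2 + 5 * x)

# The density has two modes (near x = -2.9 and x = 2.75);
# comparing a few points shows where the mass lies.
for x in (-2.9, 0.0, 2.75):
    print(x, log_target(x))
```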

Choose the Proposal Distribution

Select a proposal distribution \(q(x'|x)\), which is the distribution of proposing a new state \(x'\) given the current state \(x\). We'll use a symmetric normal proposal distribution centered at the current state: $$q(x'|x)=\mathcal{N}(x';x,\sigma^2)$$ where \(\sigma^2\) is the variance of the proposal distribution. Example: Set \(\sigma=2\).

Initialize the Chain

Choose an initial value \(x_0\) for the Markov Chain. $$x_0=0$$

Metropolis Algorithm Iteration

For \(t=0\) to \(T-1\), perform the following steps to generate \(x_{t+1}\):
Generate a Proposal
Sample a candidate value \(x^*\) from the proposal distribution: $$x^*\sim q(x^*|x_t)=\mathcal{N}(x^*;x_t,\sigma^2)$$ Example: with the current state \(x_t=0\), suppose the sampled candidate is \(x^*=1.5\).
Compute the Acceptance Probability
Since the proposal distribution is symmetric, \(q(x^*|x_t)=q(x_t|x^*)\), so the acceptance probability simplifies to: $$\alpha=\min\left(1,\frac{\pi(x^*)}{\pi(x_t)}\right)$$ Compute the ratio of target densities:
  1. Compute \(\pi(x^*)\): $$\pi(x^*)=e^{-(x^{*4}-16x^{*2}+5x^*)}$$
  2. Compute \(\pi(x_t)\) similarly
Example: $$\pi(1.5)=e^{-(1.5^4-16\times 1.5^2+5\times 1.5)}=e^{23.4375}$$ $$\pi(0)=e^{-(0^4-16\times0^2+5\times0)}=e^{0}=1$$ The ratio \(\pi(1.5)/\pi(0)=e^{23.4375}\) is greater than 1, so the acceptance probability is \(\alpha=1\).
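These numbers are easy to verify in log space (a quick Python check; the values match the hand calculation above):

```python
# Check the worked example: pi(1.5) / pi(0) in log space.
log_pi_15 = -(1.5**4 - 16 * 1.5**2 + 5 * 1.5)  # = 23.4375
log_pi_0 = -(0.0**4 - 16 * 0.0**2 + 5 * 0.0)   # = 0.0
print(log_pi_15, log_pi_0)
# alpha = min(1, exp(23.4375 - 0)) = 1, so the proposal is accepted.
```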
Accept or Reject the Proposal
Draw \(u\sim\text{Uniform}(0,1)\). If \(u\le\alpha\), accept the proposal and set \(x_{t+1}=x^*\); otherwise, reject it and set \(x_{t+1}=x_t\).
Example: here \(\alpha=1\), so the proposal is accepted and \(x_1=1.5\).
Record the State
Record \(x_{t+1}\) and continue the process for a large number of iterations to obtain a sequence of samples \(\{x_t\}_{t=1}^T\).
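Putting the steps together, here is a complete sketch of the sampler in Python with NumPy (the log-space acceptance test, the seed, and \(T=50{,}000\) are implementation choices; \(\sigma=2\) and \(x_0=0\) follow the example above):

```python
import numpy as np

def log_target(x):
    """Unnormalized log-density of the target: log pi(x) up to a constant."""
    return -(x**4 - 16 * x**2 + 5 * x)

def metropolis(log_pi, x0=0.0, sigma=2.0, T=50_000, seed=0):
    """Random-walk Metropolis with a symmetric normal proposal N(x_t, sigma^2)."""
    rng = np.random.default_rng(seed)
    samples = np.empty(T)
    x, log_px = x0, log_pi(x0)
    for t in range(T):
        x_star = x + sigma * rng.standard_normal()  # propose a candidate
        log_px_star = log_pi(x_star)
        # Accept with prob min(1, pi(x*)/pi(x_t)), computed in log space.
        if np.log(rng.random()) < log_px_star - log_px:
            x, log_px = x_star, log_px_star
        samples[t] = x                              # record the state
    return samples

samples = metropolis(log_target)
print("mean:", samples.mean(), "fraction with x < 0:", (samples < 0).mean())
```

Because the target is bimodal, the fraction of samples on each side of zero indicates how well the chain moves between the two modes; a proposal scale that is too small can leave the chain stuck in one mode.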
Post-Processing

Discard an initial burn-in portion of the chain (samples drawn before it has approximately reached \(\pi\)) and, optionally, thin the remaining samples to reduce autocorrelation before using them for estimation.

Markov Chain Monte Carlo Practice Problems