Gibbs Sampling
Recommended Prerequisites
- Probability
- Probability II
- Sampling
- Introduction to Markov Chains
- Markov Chain Monte Carlo
Gibbs Sampling
Gibbs sampling is particularly useful when sampling directly from the joint distribution is difficult, but the conditional distribution of each variable given the others is tractable.
Gibbs sampling generates a sequence of samples by iteratively sampling each variable from its conditional distribution, given the current values of all other variables. This process constructs a Markov chain whose stationary distribution is the joint distribution \(p(X)\).
Requirements
- A target joint distribution \(p(X_1,X_2,\dots,X_n)\) from which we wish to sample
- The ability to sample from the conditional distributions \(p(X_i|X_{-i})\) for each \(i\), where \(X_{-i}\) denotes all variables except \(X_i\)
Algorithm
Initialization
Start with an initial state
$$X^{(0)}=(X_{1}^{(0)},X_{2}^{(0)},\dots,X_{n}^{(0)})$$
Iteration
For \(t=1\) to \(N\):
- For \(i=1\) to \(n\):
- Sample \(X_{i}^{(t)}\) from the conditional distribution
\(p(X_i|X_{1}^{(t)},\dots,X_{i-1}^{(t)},X_{i+1}^{(t-1)},\dots,X_{n}^{(t-1)})\)
- This results in a new state \(X^{(t)}=(X_{1}^{(t)},X_{2}^{(t)},\dots,X_{n}^{(t)})\)
Repeat
Continue the process to generate a sequence of samples \(\left\{X^{(t)}\right\}\)
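The iteration above can be sketched as a generic routine. This is a minimal sketch: the function and parameter names are illustrative, and each conditional sampler is assumed to be supplied by the user.

```python
def gibbs_sample(initial_state, conditional_samplers, num_iterations):
    """Generic Gibbs sampler sketch (illustrative names).

    conditional_samplers[i] is a user-supplied function that, given the
    current full state, draws variable i from p(X_i | X_{-i}).
    """
    state = list(initial_state)
    samples = []
    for _ in range(num_iterations):
        # Update each coordinate in turn, always conditioning on the
        # freshest values of the other coordinates.
        for i, sampler in enumerate(conditional_samplers):
            state[i] = sampler(state)
        samples.append(tuple(state))
    return samples
```

Because each coordinate is updated in place, the sampler for \(X_i\) automatically sees the already-updated values \(X_1^{(t)},\dots,X_{i-1}^{(t)}\), matching the conditional in the iteration step above.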
Implementation
- Ordering: Variables can be updated sequentially or in random order.
- Block Sampling: Groups of variables can be updated simultaneously if their joint conditional distribution is known.
Example
Suppose we have a bivariate normal distribution with variables \(X\) and \(Y\), mean vector \(\mu=(\mu_X,\mu_Y)\), and covariance matrix:
$$\Sigma=\begin{bmatrix}\sigma_{X}^{2} & \rho\sigma_X\sigma_Y\\
\rho\sigma_X\sigma_Y & \sigma_{Y}^{2}\end{bmatrix}$$
Conditional Distributions
In a multivariate normal distribution, the conditional distributions are also normal. The conditional distribution of \(X\) given \(Y\) is:
$$p(X|Y)=\mathcal{N}(X;\mu_{X|Y},\sigma_{X|Y}^2)$$
where
- \(\mu_{X|Y}=\mu_{X}+\rho\frac{\sigma_X}{\sigma_Y}(Y-\mu_Y)\)
- \(\sigma_{X|Y}^2=(1-\rho^2)\sigma_{X}^{2}\)
Similarly for \(Y|X\):
- \(\mu_{Y|X}=\mu_{Y}+\rho\frac{\sigma_Y}{\sigma_X}(X-\mu_X)\)
- \(\sigma_{Y|X}^2=(1-\rho^2)\sigma_{Y}^{2}\)
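These conditional parameters are straightforward to compute; a small helper (the function name is illustrative) makes the formulas concrete:

```python
def conditional_params(mu_x, mu_y, sigma_x, sigma_y, rho, y):
    """Mean and variance of p(X | Y = y) for the bivariate normal above."""
    mean = mu_x + rho * (sigma_x / sigma_y) * (y - mu_y)
    var = (1 - rho ** 2) * sigma_x ** 2
    return mean, var
```

By symmetry, swapping the roles of the \(X\) and \(Y\) parameters gives the mean and variance of \(p(Y|X=x)\).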
Algorithm Steps
Initialization
- Choose initial values \(X^{(0)}\) and \(Y^{(0)}\)
- For simplicity, let \(X^{(0)}=\mu_X\) and \(Y^{(0)}=\mu_Y\)
Iteration
For \(t=1\) to \(N\):
- Sample X Given Y
Compute \(\mu_{X|Y^{(t-1)}}=\mu_X+\rho\frac{\sigma_X}{\sigma_Y}(Y^{(t-1)}-\mu_Y)\)
Sample \(X^{(t)}\sim\mathcal{N}(\mu_{X|Y^{(t-1)}},\sigma_{X|Y}^{2})\)
- Sample Y Given X
Compute \(\mu_{Y|X^{(t)}}=\mu_Y+\rho\frac{\sigma_Y}{\sigma_X}(X^{(t)}-\mu_X)\), using the freshly sampled \(X^{(t)}\)
Sample \(Y^{(t)}\sim\mathcal{N}(\mu_{Y|X^{(t)}},\sigma_{Y|X}^{2})\)
Repeat
Continue for \(N\) iterations.
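Putting these steps together, a minimal NumPy sketch of the bivariate normal example might look as follows (the function name and parameterization are illustrative, not a standard API):

```python
import numpy as np

def gibbs_bivariate_normal(mu_x, mu_y, sigma_x, sigma_y, rho,
                           num_iterations, rng=None):
    """Gibbs sampler for a bivariate normal, following the steps above."""
    rng = np.random.default_rng() if rng is None else rng
    x, y = mu_x, mu_y  # initialize at the mean for simplicity
    # The conditional standard deviations do not depend on the state.
    sd_x = np.sqrt(1 - rho ** 2) * sigma_x
    sd_y = np.sqrt(1 - rho ** 2) * sigma_y
    samples = np.empty((num_iterations, 2))
    for t in range(num_iterations):
        # Sample X given the current Y
        x = rng.normal(mu_x + rho * (sigma_x / sigma_y) * (y - mu_y), sd_x)
        # Sample Y given the freshly updated X
        y = rng.normal(mu_y + rho * (sigma_y / sigma_x) * (x - mu_x), sd_y)
        samples[t] = (x, y)
    return samples
```

With a large number of iterations, the sample means and the sample correlation should approach \(\mu\) and \(\rho\); in practice, an initial burn-in portion of the chain is often discarded before computing such estimates.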
Gibbs Sampling Practice Problems