MCMC Algorithms
- In this talk we will cover some common MCMC algorithms.
Metropolis-Hastings
- The procedure MetropolisHastings takes as input:
  - the target distribution p(x),
  - the proposal distribution q(x,y),
  - the initial sample x_0, and
  - the total number of samples to generate N.
- The procedure returns:
- the sequence of samples (x_0, x_1, \ldots, x_N).
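The procedure above can be sketched as follows. This is a minimal illustration, not a definitive implementation; the function and argument names are ours, and we work with log densities for numerical stability.

```python
import numpy as np

def metropolis_hastings(log_p, q_sample, log_q, x0, n_samples, rng=None):
    """Metropolis-Hastings sampler.

    log_p    : log of the (possibly unnormalized) target density p(x)
    q_sample : draws a proposal y ~ q(. | x)
    log_q    : log proposal density, log q(y | x)
    x0       : initial sample x_0
    n_samples: total number of samples N to generate
    """
    rng = rng or np.random.default_rng()
    samples = [x0]
    x = x0
    for _ in range(n_samples):
        y = q_sample(x, rng)
        # Acceptance ratio, including the proposal correction q(x|y)/q(y|x)
        log_alpha = log_p(y) - log_p(x) + log_q(x, y) - log_q(y, x)
        if np.log(rng.uniform()) < log_alpha:
            x = y
        samples.append(x)
    return samples

# Example: sample a standard normal with a Gaussian random-walk proposal.
# (The proposal is symmetric, so the q terms cancel; we keep them for generality.)
log_p = lambda x: -0.5 * x**2
q_sample = lambda x, rng: x + rng.normal(scale=1.0)
log_q = lambda y, x: -0.5 * (y - x)**2
chain = metropolis_hastings(log_p, q_sample, log_q, x0=0.0, n_samples=5000,
                            rng=np.random.default_rng(0))
```

The returned list is the sequence (x_0, x_1, \ldots, x_N), so it has N+1 entries.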
Gibbs Sampling
Historical note
Gibbs sampling was introduced in 1984 by the brothers Stuart and Donald Geman as a method for image restoration. It is particularly useful when the joint distribution is complex, but the conditional distributions are easier to sample from.
Gibbs sampling is applicable when the joint distribution is difficult to sample from directly, but we can sample from the conditional distribution of each variable. We sample from each conditional in turn, substituting the outcomes into the conditioning terms of subsequent samples. It can be shown that such a sequence of samples constitutes a Markov chain, and that the stationary distribution of this Markov chain is the required joint distribution.
- The procedure GibbsSampling takes as input:
  - the joint distribution p(x,y),
  - the initial values for x and y (x^{(0)} and y^{(0)}), and
  - the total number of samples to generate N.
- The procedure returns:
- the sequences of samples for x and y, (x_1, \ldots, x_N) and (y_1, \ldots, y_N), respectively.
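As a concrete sketch of this procedure, here is a Gibbs sampler for a bivariate normal with correlation rho, where both conditionals are known normals (x | y ~ N(rho·y, 1 − rho²) and symmetrically). The example and its names are ours, chosen because the conditionals have closed form:

```python
import numpy as np

def gibbs_bivariate_normal(rho, x0, y0, n_samples, rng=None):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = rng or np.random.default_rng(0)
    xs, ys = [], []
    x, y = x0, y0
    sd = np.sqrt(1.0 - rho**2)        # conditional standard deviation
    for _ in range(n_samples):
        x = rng.normal(rho * y, sd)   # sample x ~ p(x | y)
        y = rng.normal(rho * x, sd)   # sample y ~ p(y | x), using the new x
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = gibbs_bivariate_normal(rho=0.8, x0=0.0, y0=0.0, n_samples=10000)
```

Note how each conditional draw uses the most recent value of the other variable, exactly as in the procedure description: the outcome of each sample is substituted into the conditioning term of the next.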
Limitations
- There are two ways that Gibbs sampling can fail:
- When there are islands of high-probability states, with no paths between them.
- When all states have nonzero probability and there is only a single island of high-probability states, but the conditional moves are so small (e.g. because the variables are strongly correlated) that the chain mixes too slowly to be useful.
Inverse Sampling algorithm
Inverse transform sampling is a method for generating random numbers from any probability distribution by using its inverse cumulative distribution function (CDF). The basic idea is to generate a uniform random number and then transform it using the inverse CDF to obtain a sample from the desired distribution.
- The procedure InverseSampling takes as input:
  - the inverse cumulative distribution function F^{-1}(u), and
  - the number of samples to generate N.
- The procedure generates N uniform random numbers u_1, \ldots, u_N and computes the corresponding samples x_1, \ldots, x_N using the inverse cumulative distribution function F^{-1}(u).
- The procedure returns:
  - the sequence of samples (x_1, \ldots, x_N).

Note that in this algorithm we assume the inverse cumulative distribution function F^{-1}(u) is available, and can be used to generate samples from a distribution with cumulative distribution function F(x).
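A minimal sketch of this procedure, using the exponential distribution as the example (its CDF F(x) = 1 − exp(−rate·x) inverts in closed form to F^{-1}(u) = −log(1 − u)/rate); the function names are ours:

```python
import numpy as np

def inverse_sampling(inv_cdf, n_samples, rng=None):
    """Generate n_samples draws via inverse transform sampling."""
    rng = rng or np.random.default_rng(0)
    u = rng.uniform(size=n_samples)   # u_1, ..., u_N ~ Uniform(0, 1)
    return inv_cdf(u)                 # x_i = F^{-1}(u_i)

# Example: Exponential(rate) has F(x) = 1 - exp(-rate * x),
# so F^{-1}(u) = -log(1 - u) / rate.
rate = 2.0
samples = inverse_sampling(lambda u: -np.log(1.0 - u) / rate, 100000)
```

The sample mean should be close to 1/rate, the mean of the exponential distribution.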
EM algorithm for mixtures
The expectation-maximization algorithm, usually known as the EM algorithm, is a widely used computational method for maximum likelihood estimation in statistical models, which allows us to find point estimates of parameters.
- It is simple to code.
- It is monotonic: every iteration increases the likelihood.
- It is often quick to find a "reasonably good" solution, although it can be slow if a very precise solution is required.
Next we describe the EM algorithm for a particular problem – MLE of the mixture proportions, where it has a particularly simple and intuitive form.
Note that EM is not an MCMC algorithm, and we can also do inference on mixture models using MCMC methods. However, EM can help us find point estimates of parameters, e.g. the mixture proportions.
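The MLE-of-mixture-proportions problem described above can be sketched as follows: the component densities f_k are assumed known, and only the weights pi_k are estimated. The E-step computes each component's responsibility for each point; the M-step sets the new weights to the average responsibilities. The setup and names here are illustrative, not a particular library's API:

```python
import numpy as np

def em_mixture_proportions(x, densities, n_iters=100):
    """EM for the mixing proportions of a mixture with KNOWN component
    densities f_k; only the weights pi_k are estimated."""
    K = len(densities)
    F = np.column_stack([f(x) for f in densities])  # f_k(x_i), shape (n, K)
    pi = np.full(K, 1.0 / K)                        # start from equal weights
    for _ in range(n_iters):
        # E-step: responsibility of component k for point i
        w = F * pi
        w /= w.sum(axis=1, keepdims=True)
        # M-step: new weights are the average responsibilities
        pi = w.mean(axis=0)
    return pi

# Example: two known normal components, true weights (0.3, 0.7).
rng = np.random.default_rng(1)
n = 20000
z = rng.uniform(size=n) < 0.3
x = np.where(z, rng.normal(-2, 1, n), rng.normal(2, 1, n))
norm_pdf = lambda mu: (lambda t: np.exp(-0.5 * (t - mu)**2) / np.sqrt(2 * np.pi))
pi_hat = em_mixture_proportions(x, [norm_pdf(-2.0), norm_pdf(2.0)])
```

Each iteration provably does not decrease the likelihood, which is the monotonicity property noted above.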
Hamiltonian Monte Carlo algorithm
Hamiltonian Monte Carlo, or HMC, is a Markov chain Monte Carlo (MCMC) algorithm that leverages concepts from physics to efficiently sample from complex probability distributions. It is particularly useful in high-dimensional spaces where traditional MCMC methods may struggle. The leapfrog method is used to simulate Hamiltonian dynamics, which allows the algorithm to propose new states that are distant from the current state, reducing random-walk behavior and improving sampling efficiency.
In this algorithm statement, the Hamiltonian Monte Carlo algorithm is encapsulated in a procedure called HamiltonianMC.
- The procedure HamiltonianMC takes as input:
- the target distribution \pi(x),
- its gradient \nabla \log \pi(x),
- the number of steps to simulate Hamiltonian dynamics L,
- the step size \epsilon, and
- the total number of samples to generate M.
- The procedure returns:
- the sequence of samples (x_1, \ldots, x_M).
Note that in this algorithm we first sample a momentum variable p from a normal distribution, then simulate Hamiltonian dynamics for L steps using the leapfrog method. We then compute the Metropolis-Hastings acceptance probability based on the updated proposal, and accept or reject the proposal accordingly. We repeat this process for M iterations to generate the desired samples.
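The steps just described can be sketched as follows. This is a bare-bones illustration under the stated inputs (target log density, its gradient, L, epsilon, M); the identity-mass kinetic energy and all names are our assumptions:

```python
import numpy as np

def hamiltonian_mc(log_pi, grad_log_pi, x0, L, eps, M, rng=None):
    """Hamiltonian Monte Carlo with leapfrog integration (identity mass matrix)."""
    rng = rng or np.random.default_rng(0)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    samples = []
    for _ in range(M):
        p = rng.normal(size=x.shape)              # resample momentum p ~ N(0, I)
        x_new, p_new = x.copy(), p.copy()
        # Leapfrog: half step in momentum, L position steps, final half step
        p_new += 0.5 * eps * grad_log_pi(x_new)
        for step in range(L):
            x_new += eps * p_new
            if step != L - 1:
                p_new += eps * grad_log_pi(x_new)
        p_new += 0.5 * eps * grad_log_pi(x_new)
        # Metropolis acceptance on the joint Hamiltonian H = -log pi(x) + |p|^2 / 2
        log_accept = (log_pi(x_new) - 0.5 * p_new @ p_new) \
                   - (log_pi(x) - 0.5 * p @ p)
        if np.log(rng.uniform()) < log_accept:
            x = x_new
        samples.append(x.copy())
    return samples

# Example: standard normal target, started away from the mode.
log_pi = lambda x: -0.5 * float(x @ x)
grad_log_pi = lambda x: -x
chain = hamiltonian_mc(log_pi, grad_log_pi, x0=[3.0], L=10, eps=0.1, M=2000)
```

With a small enough step size the leapfrog integrator nearly conserves the Hamiltonian, so distant proposals are still accepted with high probability.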
Citation
@online{bochman2023,
author = {Bochman, Oren},
title = {MCMC Algorithms},
date = {2023-04-22},
url = {https://orenbochman.github.io/posts/2023/2023-04-22-mcmc-algs/},
langid = {en}
}