Distribution Sampling
Introduction to Distribution Sampling
Recommended Prerequesites
- Probability
- Probability II
Introduction
If we have a distribution that we have fitted to data, we might want to sample from it, perhaps because we are doing some sort of Monte Carlo simulation.
The goal is to generate random samples that follow a specified distribution. This chapter will be more on the applied-side, looking at algorithms for generating (pseudo) random variables from a given distribution.
Continuous Distributions
Inverse Transform Sampling
Introduction
The Inverse Transform Sampling method is a relatively simple and straightforward method for sampling from a continuous distribution. Recall in a
previous chapter, we learned about the quantile function, which takes a probability and maps it to a value x, such that \(F_X(x)=p\). We will not worry about the more general definition if you don't have strict monotonicity; instead just think about it conceptually.
Now, there is a fact called the Probability Integral Transform; this states that for a distribution X, \(F(X)\sim\mathcal{U}(0,1)\).
Combining these two facts gives some intuition for how the Inverse Transform Sampling method works.
Algorithm
Step 1: Generate a Uniform Random Variable
We will take it as a given that we possess the ability to sample from \(U\sim \mathcal{U}(0,1)\). This is a safe assumption which computers are able to do. As for how this occurs, that is beyond the scope of this book.
Step 2: Apply the Inverse CDF
If \(F_X(x)\) is the CDF of target distribution X, then \(X=F_{X}^{-1}(U)\) is a sample from the distribution of X.
Example
Let us consider an exponentially distributed variable X with rate parameter \(\lambda\), its CDF is given by:
$$F_{X}(x)=1-\exp\left(-\lambda x\right)$$
Through algebra, we can actually solve for the quantile function:
$$p=1-\exp\left(-\lambda x\right)$$
$$x=\frac{-\log(1-p)}{\lambda}$$
Thus, we can generate an exponentially distributed realization x by plugging in a U(0,1) realization p.
Advantages
Inverse transform sampling is conceptually simple and works for any distribution with a known CDF.
Disadvantages
Most distributions don't have a closed form quantile function like the Exponential distribution. It is possible to still use inverse transform sampling in this case, but it involves computationally expensive numerical approximations. If the specified distribution has a closed form CDF, root-finding techniques can be employed to calculate the quantile.
Rejection Sampling
Also see
interactive simulator.
Motivation
In inverse transform sampling, we directly sample from a uniform distribution and use the inverse of the CDF to get samples from a target distribution. This works well when the CDF is known and easily invertible, which isn't always the case.
Algorithm
Choice of Sampling Distribution
Choose a distribution \(g(x)\) from which it is easy to sample and whose pdf dominates the target distribution \(f(x)\) when multiplied by a constant M.
$$f(x)\leq Mg(x)\quad \forall x$$
This implies that the support of \(G\) must be a superset of F. This requirement makes sense because if certain values of F would not possible to be drawn, then you would not truly be sampling from F.
We then draw a candidate sample \(x^{\star}\) from \(g(x)\)
Acceptance Ratio
Imagine you drew a vertical line at \(x^{\star}\) starting at y=0 all the way up to \(y=Mg(x^\star)\). Now, pick a random point on that line, which we will call \(y=y^{\star}\).
What is the probability that \(y^\star\leq f(x^\star)\)?
$$P(y^{\star} \leq f(x^\star))=P(U(0,Mg(x^\star))\leq f(x^\star))=\frac{f(x^\star)}{Mg(x^\star)}$$
Thus, we have our acceptance rate.
Accept or Reject
Now, generate a uniform random variable \(u\sim U(0,1)\). If \(u\leq \frac{f(x^\star)}{Mg(x^\star)}\), accept the sample; otherwise, reject it and draw another value of \(x^\star\) from \(g(x)\). As an easy example, if \(x^\star\) is outside of F's support, the acceptance rate is 0, and you would reject it, which is intuitive.
Advantages
- Simple to implement
- Doesn't require calculation of quantile function or CDF
Disadvantages
- Since the acceptance rate depends on the choice of distribution, if we have a mismatch of distributions, there can be low acceptance rates. This causes many cycles of the algorithm before a sample is generated. This can become very inefficient.
Ziggurat Algorithm
Also see
interactive simulator.
Introduction
The Ziggurat algorithm is a type of rejection sampling that can be used for unimodal symmetric or monotonically decreasing pdfs.
These assumptions about the target distribution allow it to be more efficients.
It achieves high efficiency by dividing the probability density function (PDF) of the target distribution into horizontal layers or rectangles. Most samples are drawn directly from these rectangles, which speeds up the process. Only a small percentage of samples require further computation, significantly reducing the number of expensive evaluations (like computing exponentials or logarithms) compared to traditional methods like inverse transform or rejection sampling.
Examples of distributions it would be appropriate for are Gaussian and Expoential distributions.
Algorithm
Precompute Rectangles
- The target PDF (e.g., normal or exponential) is divided into a series of horizontal rectangles.
- Each rectangle is constructed to have an equal area, with the width and height determined by the distribution’s properties.
- Precompute values like the height, width, and cumulative areas of these rectangles. These values are stored for quick lookup during sampling.
Sampling
- A random rectangle is chosen based on the precomputed areas. This random choice is efficient since the areas are equal.
- Next, a random point is generated inside the chosen rectangle, with both x (horizontal) and y (vertical) coordinates being uniformly sampled.
- If this layer is the base, there may be a special sampling method, especially if the support is not bounded, thus the rectangle width would be undefined. This can be the case with the normal distribution for example.
Acceptance Check
- If the random point lies completely inside the rectangle and under the PDF curve, the sample is immediately accepted.
- If the point is in the uppermost rectangle, additional checks are required because this top rectangle exceeds the actual PDF curve in height. A more detailed evaluation of the PDF value is done to determine if the point is valid.
If Acceptance Fails
If the point fails the acceptance check (i.e., it lies outside the PDF curve in the uppermost rectangle), it is rejected, and the process repeats with another random point. Most points are accepted after one or two attempts, but even in cases of rejection, the algorithm is still faster than others like inverse transform sampling due to the minimal number of function evaluations required.