Distribution Sampling

Introduction to Distribution Sampling

Recommended Prerequesites

Introduction

If we have a distribution that we have fitted to data, we might want to sample from it, perhaps because we are doing some sort of Monte Carlo simulation. The goal is to generate random samples that follow a specified distribution. This chapter will be more on the applied-side, looking at algorithms for generating (pseudo) random variables from a given distribution.

Continuous Distributions

Inverse Transform Sampling

Introduction

The Inverse Transform Sampling method is a relatively simple and straightforward method for sampling from a continuous distribution. Recall in a previous chapter, we learned about the quantile function, which takes a probability and maps it to a value x, such that $F_X(x)=p$. We will not worry about the more general definition if you don't have strict monotonicity; instead just think about it conceptually.
Now, there is a fact called the Probability Integral Transform; this states that for a distribution X, $F(X)\sim\mathcal{U}(0,1)$.
Combining these two facts gives some intuition for how the Inverse Transform Sampling method works.

Algorithm

Step 1: Generate a Uniform Random Variable

We will take it as a given that we possess the ability to sample from $U\sim \mathcal{U}(0,1)$. This is a safe assumption which computers are able to do. As for how this occurs, that is beyond the scope of this book.

Step 2: Apply the Inverse CDF

If $F_X(x)$ is the CDF of target distribution X, then $X=F_{X}^{-1}(U)$ is a sample from the distribution of X.

Example

Let us consider an exponentially distributed variable X with rate parameter $\lambda$, its CDF is given by: $$F_{X}(x)=1-\exp\left(-\lambda x\right)$$ Through algebra, we can actually solve for the quantile function: $$p=1-\exp\left(-\lambda x\right)$$ $$x=\frac{-\log(1-p)}{\lambda}$$ Thus, we can generate an exponentially distributed realization x by plugging in a U(0,1) realization p.

Advantages

Inverse transform sampling is conceptually simple and works for any distribution with a known CDF.

Disadvantages

Most distributions don't have a closed form quantile function like the Exponential distribution. It is possible to still use inverse transform sampling in this case, but it involves computationally expensive numerical approximations. If the specified distribution has a closed form CDF, root-finding techniques can be employed to calculate the quantile.

Rejection Sampling

Also see interactive simulator.

Motivation

In inverse transform sampling, we directly sample from a uniform distribution and use the inverse of the CDF to get samples from a target distribution. This works well when the CDF is known and easily invertible, which isn't always the case.

Algorithm

Choice of Sampling Distribution

Choose a distribution $g(x)$ from which it is easy to sample and whose pdf dominates the target distribution $f(x)$ when multiplied by a constant M. $$f(x)\leq Mg(x)\quad \forall x$$ This implies that the support of $G$ must be a superset of F. This requirement makes sense because if certain values of F would not possible to be drawn, then you would not truly be sampling from F.

We then draw a candidate sample $x^{\star}$ from $g(x)$

Acceptance Ratio

Imagine you drew a vertical line at $x^{\star}$ starting at y=0 all the way up to $y=Mg(x^\star)$. Now, pick a random point on that line, which we will call $y=y^{\star}$. What is the probability that $y^\star\leq f(x^\star)$? $$P(y^{\star} \leq f(x^\star))=P(U(0,Mg(x^\star))\leq f(x^\star))=\frac{f(x^\star)}{Mg(x^\star)}$$ Thus, we have our acceptance rate.

Accept or Reject

Now, generate a uniform random variable $u\sim U(0,1)$. If $u\leq \frac{f(x^\star)}{Mg(x^\star)}$, accept the sample; otherwise, reject it and draw another value of $x^\star$ from $g(x)$. As an easy example, if $x^\star$ is outside of F's support, the acceptance rate is 0, and you would reject it, which is intuitive.

Advantages

Simple to implement
Doesn't require calculation of quantile function or CDF

Disadvantages

Since the acceptance rate depends on the choice of distribution, if we have a mismatch of distributions, there can be low acceptance rates. This causes many cycles of the algorithm before a sample is generated. This can become very inefficient.

Ziggurat Algorithm

Also see interactive simulator.

Introduction

The Ziggurat algorithm is a type of rejection sampling that can be used for unimodal symmetric or monotonically decreasing pdfs. These assumptions about the target distribution allow it to be more efficients. It achieves high efficiency by dividing the probability density function (PDF) of the target distribution into horizontal layers or rectangles. Most samples are drawn directly from these rectangles, which speeds up the process. Only a small percentage of samples require further computation, significantly reducing the number of expensive evaluations (like computing exponentials or logarithms) compared to traditional methods like inverse transform or rejection sampling.

Examples of distributions it would be appropriate for are Gaussian and Expoential distributions.

Algorithm

Precompute Rectangles

The target PDF (e.g., normal or exponential) is divided into a series of horizontal rectangles.
Each rectangle is constructed to have an equal area, with the width and height determined by the distribution’s properties.
Precompute values like the height, width, and cumulative areas of these rectangles. These values are stored for quick lookup during sampling.

Sampling

A random rectangle is chosen based on the precomputed areas. This random choice is efficient since the areas are equal.
Next, a random point is generated inside the chosen rectangle, with both x (horizontal) and y (vertical) coordinates being uniformly sampled.
If this layer is the base, there may be a special sampling method, especially if the support is not bounded, thus the rectangle width would be undefined. This can be the case with the normal distribution for example.

Acceptance Check

If the random point lies completely inside the rectangle and under the PDF curve, the sample is immediately accepted.
If the point is in the uppermost rectangle, additional checks are required because this top rectangle exceeds the actual PDF curve in height. A more detailed evaluation of the PDF value is done to determine if the point is valid.

If Acceptance Fails

If the point fails the acceptance check (i.e., it lies outside the PDF curve in the uppermost rectangle), it is rejected, and the process repeats with another random point. Most points are accepted after one or two attempts, but even in cases of rejection, the algorithm is still faster than others like inverse transform sampling due to the minimal number of function evaluations required.

Distribution Textbook (Work in Progress)

by John Della Rosa

Distribution Sampling

Introduction to Distribution Sampling

Recommended Prerequesites

Introduction

Continuous Distributions

Inverse Transform Sampling

Introduction

Algorithm

Step 1: Generate a Uniform Random Variable

Step 2: Apply the Inverse CDF

Example

Advantages

Disadvantages

Rejection Sampling

Motivation

Algorithm

Choice of Sampling Distribution

Acceptance Ratio

Accept or Reject

Advantages

Disadvantages

Ziggurat Algorithm

Introduction

Algorithm

Precompute Rectangles

Sampling

Acceptance Check

If Acceptance Fails

Continuous Sampling Practice Problems