Distribution Textbook (Work in Progress)

by John Della Rosa

Compound Distributions

Introduction to Compound Distributions

Recommended Prerequisites

  1. Probability
  2. Probability 2
  3. Maximum Likelihood Estimation
  4. Mixture Distributions

Definition

In probability and statistics, a compound distribution arises when the distribution of one random variable depends on another random variable. This is often the case when the parameters of a distribution are themselves variable. For example, if we model the number of insurance claims using a Poisson distribution, but the rate of claims (the parameter \(\lambda\)) is itself a random variable, we obtain a compound distribution, commonly called a mixed Poisson distribution (the name "compound Poisson distribution" usually refers to something else: the sum of a Poisson-distributed number of i.i.d. terms). This is an extension of the idea of conditional distributions: the conditional distribution is now conditioned on a random variable.
In a previous chapter, we covered mixture distributions, where a random variable selects a distribution and a value is then drawn from it. A compound distribution works the same way, except that the random variable selects among many (possibly uncountably many) members of the same family of distributions, each with a different value of a parameter.

From another perspective, we can view our primary distribution as being sampled. That sampled value is then plugged into another distribution as a parameter, and then we draw from that second distribution. In some sense, compound distributions are to integration what mixture distributions are to summation.
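The two-stage view above can be sketched in a few lines of Python. This is only an illustration, not the method used by the tool on this page: the helper name `sample_compound` and the Normal-Normal choice of distributions are assumptions made for the example.

```python
import random
import statistics

def sample_compound(draw_parameter, draw_given_parameter, n, rng):
    """Draw n values from a compound distribution by two-stage sampling."""
    sample = []
    for _ in range(n):
        theta = draw_parameter(rng)                      # stage 1: primary distribution
        sample.append(draw_given_parameter(rng, theta))  # stage 2: secondary distribution
    return sample

rng = random.Random(0)  # fixed seed so the run is reproducible

# Illustrative (assumed) choice: mu ~ Normal(0, sd=2), then X | mu ~ Normal(mu, sd=1).
xs = sample_compound(
    lambda r: r.gauss(0.0, 2.0),
    lambda r, mu: r.gauss(mu, 1.0),
    20000,
    rng,
)

print(statistics.fmean(xs), statistics.pvariance(xs))
```

For this particular pair the marginal distribution of \(X\) is again Normal, with mean 0 and variance \(2^2+1^2=5\), so the printed sample mean and variance should land near those values.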

An Aside: Common Conditional Probability Rules

Law of Total Probability

$$P(A)=\sum_{n}P(A|B_n)P(B_n)$$ $$P(A)=\int_{-\infty}^{\infty}P(A|X=x)dF_X(x)$$ $$=\int_{-\infty}^{\infty}P(A|X=x)f_{X}(x)dx$$

Law of iterated expectations

A useful formula is the law of iterated expectations, which relates the unconditional expectation to the conditional expectations. This is essentially an extension of the Law of Total Probability, which becomes apparent if you write out the expectations explicitly. $$\mathbb{E}[X]=\mathbb{E}[\mathbb{E}[X|Y]]$$
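Writing the expectations out makes the connection explicit. As a sketch, assuming \(X\) and \(Y\) are jointly continuous with densities \(f_{X|Y}\) and \(f_Y\), and that the order of integration can be exchanged:

$$\mathbb{E}[\mathbb{E}[X|Y]]=\int_{-\infty}^{\infty}\mathbb{E}[X|Y=y]f_Y(y)dy=\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty}xf_{X|Y}(x|y)dx\right)f_Y(y)dy$$ $$=\int_{-\infty}^{\infty}x\left(\int_{-\infty}^{\infty}f_{X|Y}(x|y)f_Y(y)dy\right)dx=\int_{-\infty}^{\infty}xf_X(x)dx=\mathbb{E}[X]$$

The inner integral in the second line is the Law of Total Probability applied to the density of \(X\).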

Law of Total Variance

$$\text{Var}(X)=\mathbb{E}[\text{Var}(X|Y)]+\text{Var}(\mathbb{E}[X|Y])$$
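As a short worked example of both laws, consider the Poisson case with a random rate. The symbol \(\Lambda\) for the random rate is notation introduced here for illustration, and \(\Lambda\) is assumed to have a finite mean and variance. If \(X|\Lambda=\lambda\sim\text{Poisson}(\lambda)\), then \(\mathbb{E}[X|\Lambda]=\Lambda\) and \(\text{Var}(X|\Lambda)=\Lambda\), so

$$\mathbb{E}[X]=\mathbb{E}[\mathbb{E}[X|\Lambda]]=\mathbb{E}[\Lambda]$$ $$\text{Var}(X)=\mathbb{E}[\text{Var}(X|\Lambda)]+\text{Var}(\mathbb{E}[X|\Lambda])=\mathbb{E}[\Lambda]+\text{Var}(\Lambda)$$

Unless \(\Lambda\) is constant, the variance of \(X\) exceeds its mean, whereas an ordinary Poisson distribution has variance equal to its mean.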

Returning to Defining Compound Distributions

A compound distribution is a probability distribution of a random variable \(X\) where the distribution of \(X\) depends on another random variable \(Y\). This can be expressed as $$X|Y=y\sim f_{X|Y}(x|y)$$ where \(X\) has a conditional distribution \(f_{X|Y}(x|y)\), and \(Y\) follows a marginal distribution \(g_Y(y)\).

The unconditional (or marginal) distribution of \(X\) is found by integrating over the distribution of \(Y\): $$f_{X}(x)=\int_{-\infty}^{\infty}f_{X|Y}(x|y)g_{Y}(y)dy$$ The distribution of \(X\) is "compounded" by the randomness of \(Y\). This should look familiar: it is the Law of Total Probability, restated in terms of density functions instead of the probability of an event \(P(A)\).
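The marginal integral can be checked numerically for a case with a known closed form. The sketch below assumes an Exponential conditional distribution whose rate \(y\) follows a Gamma distribution with shape \(a\) and rate \(b\); the marginal is then the Lomax (Pareto type II) density \(ab^a/(b+x)^{a+1}\). The function names, the midpoint rule, and the truncation of the integral at `upper` are choices made for illustration, and the rule is only reliable here for shape values of at least 1.

```python
import math

def gamma_pdf(y, shape, rate):
    """Density g_Y(y) of a Gamma(shape, rate) distribution for y > 0."""
    return rate**shape * y ** (shape - 1) * math.exp(-rate * y) / math.gamma(shape)

def exponential_pdf(x, rate):
    """Conditional density f_{X|Y}(x|y) of an Exponential with rate y."""
    return rate * math.exp(-rate * x)

def marginal_pdf(x, shape, rate, upper=60.0, steps=60000):
    """Midpoint-rule approximation of the integral of f(x|y) g(y) over 0 < y < upper."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += exponential_pdf(x, y) * gamma_pdf(y, shape, rate)
    return total * h

def lomax_pdf(x, shape, rate):
    """Closed-form marginal density for the Gamma-Exponential pair."""
    return shape * rate**shape / (rate + x) ** (shape + 1)

print(marginal_pdf(1.0, 2.0, 2.0), lomax_pdf(1.0, 2.0, 2.0))
```

With shape 2 and rate 2, the closed form at \(x=1\) is \(8/27\), and the numerical integral should agree with it to several decimal places.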

Compound Distribution Generator

User Guide

This tool allows you to generate compound distributions, where a parameter is drawn from one distribution (the primary distribution) and then used as a parameter for a second distribution (the secondary distribution). Follow these steps to use the generator:

Step 1: Select a Primary Distribution

The primary distribution generates the parameter for the secondary distribution. Select one from the options the tool offers, then enter the parameters that distribution requires.

After selecting the primary distribution, the support of that distribution will be shown, which indicates the range of possible output values.

Step 2: Select a Secondary Distribution

The secondary distribution uses the parameter generated by the primary distribution. Select one from the options the tool offers.

When a secondary distribution is selected, any specific requirements for that distribution will be displayed. It is very important that the output (support) of the first distribution matches the input (valid parameter range) of the second distribution.
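The compatibility requirement can be expressed as an interval check. The sketch below is a hypothetical helper, not the tool's actual validation logic: it represents the support and the valid parameter range as `(low, high)` pairs and, as a simplification, ignores whether the endpoints are open or closed.

```python
import math

def support_is_compatible(primary_support, valid_parameter_range):
    """Return True if every value the primary can produce is a valid parameter."""
    p_low, p_high = primary_support
    v_low, v_high = valid_parameter_range
    return v_low <= p_low and p_high <= v_high

POSITIVE = (0.0, math.inf)        # e.g. Gamma support, Poisson rate
REAL_LINE = (-math.inf, math.inf) # e.g. Normal support
UNIT_INTERVAL = (0.0, 1.0)        # e.g. Beta support, Bernoulli probability

print(support_is_compatible(POSITIVE, POSITIVE))            # Gamma -> Poisson rate
print(support_is_compatible(REAL_LINE, POSITIVE))           # Normal -> Poisson rate
print(support_is_compatible(UNIT_INTERVAL, UNIT_INTERVAL))  # Beta -> Bernoulli p
```

The second pairing fails because a Normal draw can be negative, which is not a valid Poisson rate.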

Step 3: Generate the Compound Distribution

After choosing both distributions and entering the parameters, click the "Generate Compound Distribution" button to generate and plot the compound distribution. The tool will automatically validate the inputs and display alerts if any parameter falls outside of its valid range.

Example Use Case

Suppose you choose a Gamma distribution as the primary with a shape (k) of 2 and a rate (θ) of 2. The tool will generate positive values for λ. If you then choose a Poisson distribution as the secondary, the tool will use the λ values generated from the Gamma distribution to produce Poisson samples. You can then visualize the resulting compound distribution.
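This use case can be reproduced outside the tool with the Python standard library. The sketch below is an independent illustration rather than the tool's implementation, and it rests on one loud assumption: the value 2 is treated as a rate, so the Gamma mean is shape/rate = 1. If the tool instead interprets that parameter as a scale, the numbers will differ. Since the standard library has no Poisson sampler, the hypothetical helper `poisson_knuth` uses Knuth's multiplication method, which is adequate for small \(\lambda\).

```python
import math
import random
import statistics

def poisson_knuth(rng, lam):
    """Draw one Poisson(lam) value by multiplying uniforms until below exp(-lam)."""
    threshold = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def gamma_poisson_sample(shape, rate, n, seed=0):
    """Two-stage sampling: lambda ~ Gamma(shape, rate), then X ~ Poisson(lambda)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        # random.gammavariate takes (shape, scale), so pass scale = 1 / rate.
        lam = rng.gammavariate(shape, 1.0 / rate)
        sample.append(poisson_knuth(rng, lam))
    return sample

sample = gamma_poisson_sample(shape=2.0, rate=2.0, n=50000)
print(statistics.fmean(sample), statistics.pvariance(sample))
```

Under the rate assumption, the marginal distribution is negative binomial with mean \(k/\text{rate}=1\) and variance \(1+k/\text{rate}^2=1.5\), so the sample mean and variance should fall close to those values.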

Step 5: View Summary Statistics

After the distribution is generated, you can view summary statistics for the sample: mean, variance, standard deviation, minimum, maximum, skewness, and kurtosis.
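For reference, these statistics can be computed from a sample as sketched below. The function name `summarize` is hypothetical, and the formulas are assumptions: population (divide by \(n\)) moments and non-excess kurtosis. The tool may use different conventions, such as sample variance or excess kurtosis.

```python
import math

def summarize(sample):
    """Summary statistics from population central moments (requires a non-constant sample)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in sample) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in sample) / n  # fourth central moment
    return {
        "mean": mean,
        "variance": m2,
        "std_dev": math.sqrt(m2),
        "min": min(sample),
        "max": max(sample),
        "skewness": m3 / m2**1.5,
        "kurtosis": m4 / m2**2,  # subtract 3 for excess kurtosis
    }

stats = summarize([1, 2, 3, 4, 5])
print(stats)
```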

Step 6: Export the Data

You can export the generated data as a CSV file using the Export Data button. This allows you to save the sample for further analysis or reporting.
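If you generate samples in your own code instead, a similar export takes only a few lines with Python's `csv` module. This is a minimal sketch: the function name `export_sample_csv` and the single-column layout with a `value` header are assumptions, not the format the tool's export produces.

```python
import csv
import io

def export_sample_csv(sample, fileobj, header="value"):
    """Write one sample value per row, preceded by a header row."""
    writer = csv.writer(fileobj)
    writer.writerow([header])
    for x in sample:
        writer.writerow([x])

# Write to an in-memory buffer here; any writable text file object works.
buffer = io.StringIO()
export_sample_csv([1, 2.5, 0], buffer)
print(buffer.getvalue())
```

When writing to a real file, open it with `newline=""`, as the `csv` module documentation recommends, so that row endings are not translated twice.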

Compound Distribution Practice Problems