Distribution Textbook (Work in Progress)

by John Della Rosa

Compound Poisson Distributions

Introduction to Compound Poisson Distributions

Recommended Prerequisites

  1. Probability
  2. Probability 2
  3. Maximum Likelihood Estimation
  4. Method of Moments

Introduction

In previous chapters, we explored both mixture and compound distributions, which allow for modeling variability in the parameters or outcomes of random processes. Mixture distributions arise when data is generated from one of several possible distributions, with each distribution selected according to a mixing probability. Compound distributions, on the other hand, result from random variables whose parameters are themselves random, introducing an additional layer of complexity in describing stochastic processes.

A particularly important class of compound distributions, to which we now turn our focus, is the Compound Poisson distribution. This distribution is widely applicable in scenarios where the number of events occurring in a fixed period is uncertain and follows a Poisson distribution, and the outcome of each event is itself random; a classic example is the total insurance claim amount in a year, where the number of claims is Poisson-distributed and each claim size is random.

Definition

Let \(N\) be a Poisson-distributed random variable with parameter \(\lambda\gt 0\), representing the number of events that occur in a fixed interval. Assume that each event generates a random variable \(X_i\) from an iid sequence \(\left\{X_i\right\}\) with common distribution function \(F_X(x)\) and mean \(\mu_X=\mathbb{E}[X]\), and that the \(X_i\) are independent of \(N\). The Compound Poisson random variable \(S\) is defined as the sum of the \(N\) random variables: $$S=\sum_{i=1}^NX_i$$ If \(N=0\), then we define \(S=0\). The distribution of \(S\) is called a Compound Poisson distribution.

Wald's Equation

A powerful and satisfying result is Wald's equation (also known as Wald's identity or lemma), which gives the expectation of a sum of iid random variables when the number of summands is itself a random variable independent of the summands. Let \(X_1,X_2,\dots\) be iid random variables and let \(N\) be a nonnegative integer-valued random variable independent of the \(X_i\). If both have finite expectations, then $$\mathbb{E}\left[\sum_{i=1}^NX_i\right]=\mathbb{E}[N]\mathbb{E}[X]$$
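For the Compound Poisson sum \(S\), this can be verified directly by conditioning on \(N\) and applying the law of total expectation: $$\mathbb{E}[S]=\mathbb{E}\left[\mathbb{E}\left[\sum_{i=1}^NX_i\,\middle|\,N\right]\right]=\mathbb{E}[N\mu_X]=\mathbb{E}[N]\mathbb{E}[X]=\lambda\mu_X$$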

Variance

$$\text{Var}(S)=\lambda(\sigma_X^2+\mu_X^2)$$ where \(\mu_X\) and \(\sigma_X^2\) are the mean and variance of the individual event sizes, respectively.
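This follows from the law of total variance together with \(\mathbb{E}[N]=\text{Var}(N)=\lambda\): $$\text{Var}(S)=\mathbb{E}[N]\text{Var}(X)+(\mathbb{E}[X])^2\text{Var}(N)=\lambda\sigma_X^2+\lambda\mu_X^2=\lambda(\sigma_X^2+\mu_X^2)$$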

Special Cases

X is Exponentially Distributed

If the \(X_i\) follow an exponential distribution with rate \(\theta\), then conditional on \(N=n\) (for \(n\geq 1\)), \(S\) follows a Gamma distribution with shape parameter \(n\) and rate \(\theta\). Unconditionally, \(S\) has a point mass of \(e^{-\lambda}\) at zero and is otherwise a Poisson-weighted mixture of Gamma distributions.
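Writing this mixture out explicitly, the distribution of \(S\) in this case is $$P(S=0)=e^{-\lambda},\qquad f_S(s)=\sum_{n=1}^{\infty}\frac{e^{-\lambda}\lambda^n}{n!}\,\frac{\theta^n s^{n-1}e^{-\theta s}}{(n-1)!},\quad s>0$$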

X is a constant

If \(X_i=c\) for some constant \(c\), then \(S=cN\); that is, \(S\) is a scaled Poisson random variable taking values in \(\{0,c,2c,\dots\}\).

Sampling from a Compound Poisson

The steps for sampling from a Compound Poisson distribution are fairly intuitive given how the distribution is constructed:

Poisson Sampling

First, sample \(N\sim \text{Po}(\lambda)\), where \(N\) is the number of events occurring in the given time period.

Secondary Distribution Sampling

Next, draw \(N\) samples from the secondary distribution and sum them together: $$S=\sum_{i=1}^{N}X_i$$
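As an illustration, the following Python sketch implements these two steps using NumPy. The exponential secondary distribution, the rate \(\theta=2\), and the function and argument names are assumptions made purely for the example; the secondary sampler can be swapped for any other distribution.

import numpy as np

def sample_compound_poisson(lam, secondary_sampler, size, rng=None):
    # Draw `size` samples of S = X_1 + ... + X_N with N ~ Poisson(lam).
    # `secondary_sampler(n)` must return n iid draws of X (name is illustrative).
    rng = np.random.default_rng() if rng is None else rng
    counts = rng.poisson(lam, size=size)              # step 1: Poisson sampling
    totals = np.zeros(size)
    for i, n in enumerate(counts):
        if n > 0:
            totals[i] = secondary_sampler(n).sum()    # step 2: sum the N secondary draws
    return totals

# Example usage with exponential event sizes, rate theta = 2 (assumed for illustration)
rng = np.random.default_rng(0)
S = sample_compound_poisson(
    lam=5.0,
    secondary_sampler=lambda n: rng.exponential(scale=1 / 2, size=n),
    size=10_000,
    rng=rng,
)
print(S.mean(), S.var())   # compare with lambda*E[X] = 2.5 and lambda*E[X^2] = 2.5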

Estimation of Parameters

For a Compound Poisson distribution, there are two sets of parameters to estimate: the rate parameter \(\lambda\) of the Poisson distribution and the parameter(s) \(\theta\) of the secondary distribution of \(X\). This can make estimation tricky.

MLE

The likelihood function for a Compound Poisson distribution is given by $$L(\lambda,\theta;S_1,\dots,S_n)=\prod_{i=1}^{n}P(S_i|\lambda, \theta)$$ where \(P(S_i|\lambda,\theta)\) is the probability mass function (or density, if \(X\) is continuous) of the Compound Poisson distribution.

In practice, this is difficult because there are typically many combinations of event counts and event sizes that produce a given value of \(s\). The Compound Poisson PMF or PDF is therefore usually written as an infinite sum over the possible event counts, $$f_S(s|\lambda,\theta)=\sum_{N=0}^{\infty}\frac{e^{-\lambda}\lambda^N}{N!}f_{X}^{*N}(s|\theta)$$ where \(f_{X}^{*N}\) denotes the \(N\)-fold convolution of \(f_X\), so the likelihood becomes $$L(\lambda,\theta)=\prod_{i=1}^n\sum_{N=0}^\infty \frac{e^{-\lambda}\lambda^N}{N!}f_{X}^{*N}(S_i|\theta)$$ This is not a particularly nice expression to manipulate analytically, which is one reason the method of moments below is often used instead.
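To make this concrete, here is a minimal numerical-MLE sketch in Python, assuming (for illustration only) exponential event sizes with rate \(\theta\), so that the \(N\)-fold convolution is a Gamma\((N,\theta)\) density. The truncation point N_MAX, the simulated data, and the use of SciPy's general-purpose optimizer are all assumptions of this sketch, not prescriptions from the text.

import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)

# Simulated data, assuming lambda = 5 and exponential event sizes with rate 2
N = rng.poisson(5.0, size=500)
S = np.array([rng.exponential(scale=1 / 2, size=n).sum() for n in N])

N_MAX = 100  # truncation point for the infinite sum over event counts (assumption)

def neg_log_likelihood(params, s):
    # Truncated compound Poisson-exponential negative log-likelihood.
    lam, theta = params
    if lam <= 0 or theta <= 0:
        return np.inf
    n = np.arange(1, N_MAX + 1)
    pois = stats.poisson.pmf(n, lam)                          # P(N = n)
    ll = 0.0
    for si in s:
        if si == 0:
            ll += -lam                                        # point mass: P(S = 0) = exp(-lam)
        else:
            dens = stats.gamma.pdf(si, a=n, scale=1 / theta)  # Gamma(n, rate=theta) densities
            ll += np.log(pois @ dens)
    return -ll

result = optimize.minimize(neg_log_likelihood, x0=[1.0, 1.0],
                           args=(S,), method="Nelder-Mead")
print(result.x)   # estimates of (lambda, theta); should be near (5, 2)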

Method of Moments

Another parameter estimation method we discussed in a previous chapter is the method of moments. The Compound Poisson distribution has a moment-generating function given by: $$M_S(t)=\exp\left(\lambda[M_{X}(t)-1]\right)$$ From this, moments of the compound distribution can be calculated using the relationship: $$m_n=\mathbb{E}[S^n]=M_{S}^{(n)}(0)=\left.\frac{d^nM_S}{dt^n}\right|_{t=0}$$ Equivalently, the cumulant-generating function is \(K_S(t)=\lambda[M_X(t)-1]\), so the cumulants are simply \(\kappa_n(S)=\lambda\mathbb{E}[X^n]\). The first few moments of the Compound Poisson distribution are: $$\mathbb{E}[S]=\mathbb{E}[N]\mathbb{E}[X]=\lambda\mathbb{E}[X]$$ $$\text{Var}(S)=\mathbb{E}[N]\text{Var}(X)+(\mathbb{E}[X])^2\text{Var}(N)=\lambda(\text{Var}(X)+\mathbb{E}[X]^2)=\lambda\mathbb{E}[X^2]$$ $$\text{Skew}(S)=\frac{\kappa_3}{\kappa_2^{3/2}}=\frac{1}{\sqrt{\lambda}}\frac{\mathbb{E}[X^3]}{(\mathbb{E}[X^2])^{3/2}}$$ $$\text{ExKurt}(S)=\frac{\kappa_4}{\kappa_2^{2}}=\frac{1}{\lambda}\frac{\mathbb{E}[X^4]}{(\mathbb{E}[X^2])^{2}}$$ where \(\text{ExKurt}\) denotes the excess kurtosis.

Given a sample \(S_1,S_2,\dots,S_n\), the sample mean \(\bar{S}\) and sample variance \(s_S^2\) can be calculated as: $$\bar{S}=\frac{1}{n}\sum_{i=1}^nS_i$$ $$s_{S}^2=\frac{1}{n}\sum_{i=1}^n(S_i-\bar{S})^2$$ Matching these to the theoretical moments gives the system $$\bar{S}=\lambda\mathbb{E}[X]$$ $$s_{S}^2=\lambda\mathbb{E}[X^2]$$ which can be solved once a one-parameter secondary distribution is specified. For example, if \(X\) is itself Poisson-distributed with mean \(\mu_X\), so that \(\mathbb{E}[X^2]=\mu_X+\mu_X^2\), solving the two equations gives $$\hat{\lambda}=\frac{\bar{S}^2}{s_{S}^2-\bar{S}}$$ $$\hat{\mu}_X=\frac{\bar{S}}{\hat{\lambda}}$$
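Below is a brief Python sketch of this moment matching, assuming (as in the worked example above) a Poisson secondary distribution so that the closed-form solution applies; the data-generating step and the chosen parameter values are included only to make the example self-contained.

import numpy as np

rng = np.random.default_rng(1)

# Simulate data: N ~ Poisson(lam), X ~ Poisson(mu) -- assumptions for this example
lam_true, mu_true = 4.0, 2.5
N = rng.poisson(lam_true, size=50_000)
S = np.array([rng.poisson(mu_true, size=n).sum() for n in N])

# Method of moments estimates
S_bar = S.mean()
s2 = S.var()                       # matches lambda * E[X^2]
lam_hat = S_bar**2 / (s2 - S_bar)  # valid because Var(X) = E[X] for Poisson X
mu_hat = S_bar / lam_hat

print(lam_hat, mu_hat)             # should be near 4.0 and 2.5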

Compound Poisson Distribution Generator

[Interactive generator: select a discrete secondary distribution and view summary statistics (mean, variance, standard deviation, min, max, skewness, kurtosis) of the generated compound Poisson sample.]

Compound Poisson Distribution Practice Problems

  1. Mean and Variance of Compound Poisson Distribution:
    1. Given \( N(t) \sim \text{Poisson}(\lambda t) \) and \( X_i \) drawn from a distribution with mean \( \mu \) and variance \( \sigma^2 \), derive the mean and variance of the compound Poisson process \( S(t) \).
    2. If \( X_i \) follows an exponential distribution with rate parameter \(\theta\), compute the mean and variance of \( S(t) \).
  2. Simulation of a Compound Poisson Process:
    1. Write a Python (or R, MATLAB) program to simulate a compound Poisson process where the number of events \( N(t) \) follows \( \text{Poisson}(5) \), and the event sizes \( X_i \) are drawn from a normal distribution \( N(3, 1^2) \). Simulate 1000 realizations of the process and plot the resulting histogram of \( S(t) \).
  3. Fitting a Compound Poisson Distribution:
    1. Suppose you observe data that you believe follows a compound Poisson distribution with unknown parameters. How would you estimate the parameters \( \lambda \), \( \mu \), and \( \sigma^2 \) using the method of moments?
    2. Given the following data: \[ \{2.0, 4.1, 5.3, 7.8, 3.9, 6.2, 8.1, 5.5\} \] Fit a compound Poisson distribution assuming the event sizes follow a gamma distribution with shape parameter \( \alpha = 2 \) and scale parameter \( \beta = 1 \).
  4. Compound Poisson with Different Secondary Distributions:
    1. Explain how the choice of the secondary distribution \( X_i \) affects the overall behavior of the compound Poisson process. Compare the effects of using a gamma distribution versus a normal distribution for \( X_i \).
    2. For \( X_i \sim \text{Gamma}(2, 1) \), derive the mean and variance of the compound Poisson process \( S(t) \). How do these moments compare to using a normal distribution for \( X_i \)?
  5. Compound Poisson as a Limit of Discrete-Time Processes:
    1. Explain how a compound Poisson process can be viewed as the limit of a discrete-time process. Specifically, consider a sequence of Bernoulli trials with success probability \( p_n \) and event sizes \( X_i \). Show that as \( n \to \infty \) and \( p_n \to 0 \), the sum of these trials converges to a compound Poisson distribution.