Compound Poisson Distributions
Introduction to Compound Poisson Distributions
Recommended Prerequisites
- Probability
- Probability 2
- Maximum Likelihood Estimation
- Method of Moments
Introduction
In previous chapters, we explored both mixture and compound distributions, which allow for modeling variability in the parameters or outcomes of random processes. Mixture distributions arise when data is generated from one of several possible distributions, with each distribution selected according to a mixing probability. Compound distributions, on the other hand, result from random variables whose parameters are themselves random, introducing an additional layer of complexity in describing stochastic processes.
A particularly important class of compound distributions, which we turn our focus to now, is the Compound Poisson distribution. This distribution is widely applicable in scenarios where the number of events occurring in a fixed period is uncertain and follows a Poisson distribution, and the outcome of each event is itself random.
Definition
Let \(N\) be a Poisson-distributed random variable with parameter \(\lambda\gt 0\), representing the number of events that occur in a fixed interval.
Assume that each event generates a random variable \(X_i\) from some iid sequence \(\left\{X_i\right\}\) with a common distribution function \(F_X(x)\) and mean \(\mu_X=\mathbb{E}[X]\). The Compound Poisson random variable S is defined as the sum of the N random variables:
$$S=\sum_{i=1}^NX_i$$
If N=0, then we define S=0. The distribution of S is called a Compound Poisson distribution.
Wald's Equation
A powerful and useful result is Wald's equation (also known as Wald's identity or lemma), which gives the expectation of a sum of iid random variables when the number of summands is itself a random variable, independent of the summands.
Let \(X_1,X_2,\dots\) be iid random variables, and let \(N\) be a non-negative integer-valued random variable that is independent of the \(X_i\). If both have finite expectations, then
$$\mathbb{E}\left[\sum_{i=1}^N X_i\right]=\mathbb{E}[N]\,\mathbb{E}[X]$$
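As a quick sanity check, Wald's identity can be verified by simulation. Below is a minimal sketch, assuming NumPy is available and using (purely for illustration) a Poisson count with exponential summands:

```python
import numpy as np

# Illustrative check of Wald's identity: E[sum of N iid X_i] = E[N] * E[X].
rng = np.random.default_rng(0)
lam, mu_x = 4.0, 2.5          # E[N] = lam, E[X] = mu_x (arbitrary example values)
n_trials = 100_000

totals = np.empty(n_trials)
for t in range(n_trials):
    n = rng.poisson(lam)                             # random number of summands
    totals[t] = rng.exponential(mu_x, size=n).sum()  # sum is 0 when n == 0

print("empirical mean of the sum:", totals.mean())
print("E[N] * E[X]              :", lam * mu_x)
```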
Variance
$$\text{Var}(S)=\lambda(\sigma_X^2+\mu_X^2)$$
where \(\mu_X\) and \(\sigma_X^2\) are the mean and variance of the individual event sizes, respectively.
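This follows from the law of total variance, together with the fact that \(\mathbb{E}[N]=\text{Var}(N)=\lambda\) for a Poisson count:
$$\text{Var}(S)=\mathbb{E}[N]\,\text{Var}(X)+\text{Var}(N)\,(\mathbb{E}[X])^2=\lambda\sigma_X^2+\lambda\mu_X^2=\lambda(\sigma_X^2+\mu_X^2)$$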
Special Cases
X is Exponentially Distributed
If the \(X_i\) follow an exponential distribution with rate \(\theta\), then conditional on \(N=n\geq 1\), \(S\) follows a Gamma distribution with shape parameter \(n\) and rate \(\theta\).
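Averaging over the Poisson count, the unconditional distribution of \(S\) therefore has a point mass \(P(S=0)=e^{-\lambda}\) and, for \(s>0\), the density
$$f_S(s)=\sum_{n=1}^{\infty}\frac{e^{-\lambda}\lambda^n}{n!}\cdot\frac{\theta^n s^{n-1}e^{-\theta s}}{(n-1)!}$$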
X is a constant
If \(X_i=c\) for some constant \(c\), then \(S=cN\), a scaled Poisson random variable.
Sampling from a Compound Poisson
The steps for sampling from a Compound Poisson distribution are fairly intuitive given how the distribution is constructed:
Poisson Sampling
First, sample \(N\sim \text{Po}(\lambda)\), where \(N\) is the number of events occurring in the given time period.
Secondary Distribution Sampling
Next, sample from the secondary distribution \(N\) times and sum the results:
$$S=\sum_{i=1}^{N}X_i$$
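Putting the two steps together, here is a minimal simulation sketch, assuming NumPy and (purely for illustration) exponential event sizes with rate \(\theta\):

```python
import numpy as np

def sample_compound_poisson(lam, theta, size, rng=None):
    """Draw `size` realizations of S = X_1 + ... + X_N with N ~ Poisson(lam)
    and X_i ~ Exponential(rate=theta); S = 0 whenever N = 0."""
    rng = np.random.default_rng() if rng is None else rng
    counts = rng.poisson(lam, size=size)                          # step 1: Poisson sampling
    return np.array([rng.exponential(1.0 / theta, size=n).sum()   # step 2: sum the X_i
                     for n in counts])

samples = sample_compound_poisson(lam=3.0, theta=0.5, size=50_000,
                                  rng=np.random.default_rng(1))
print(samples.mean(), 3.0 / 0.5)   # both should be close to lam * E[X] = lam / theta
```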
Estimation of Parameters
For a Compound Poisson distribution, you now have the rate parameter from the Poisson distribution as well as the parameter(s) for \(X\), which can make estimation tricky.
MLE
The likelihood function for a Compound Poisson sample is given by
$$L(\lambda,\theta;S_1,\dots,S_n)=\prod_{i=1}^{n}P(S_i|\lambda, \theta)$$
where \(P(S_i|\lambda,\theta)\) is the probability mass function (or pdf if X is continuous) for the Compound Poisson distribution.
In practice this is difficult to work with, since a given value of \(s\) can arise from many different combinations of \(N\) and the \(X_i\). Compound Poisson PMFs and PDFs are therefore usually written as infinite sums over the unobserved count, with \(f_X^{*k}\) denoting the \(k\)-fold convolution of the distribution of \(X\) (and the \(k=0\) term a point mass at zero):
$$L(\lambda,\theta)=\prod_{i=1}^n\sum_{k=0}^\infty \frac{e^{-\lambda}\lambda^k}{k!}\,f_X^{*k}(S_i\mid\theta)$$
This is not a particularly nice expression to manipulate analytically.
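In practice, the likelihood is usually maximized numerically. The sketch below is one illustrative way to do this rather than a general-purpose estimator: it assumes NumPy and SciPy are available, takes the event sizes to be exponential with rate \(\theta\) (so the \(k\)-fold convolution is a Gamma density), and truncates the infinite sum at a fixed number of terms.

```python
import numpy as np
from scipy import optimize, stats

def neg_log_lik(params, s, n_terms=60):
    """Negative log-likelihood of a Poisson(lam)-Exponential(rate=theta)
    compound sum, truncating the sum over the event count at `n_terms`."""
    lam, theta = params
    if lam <= 0 or theta <= 0:
        return np.inf
    ll = np.sum(s == 0) * (-lam)                  # point mass: log P(S = 0) = -lam
    pos = s[s > 0]
    ks = np.arange(1, n_terms + 1)
    weights = stats.poisson.pmf(ks, lam)          # Poisson weights for k = 1..n_terms
    # k-fold convolution of Exponential(theta) is Gamma(shape=k, rate=theta)
    dens = stats.gamma.pdf(pos[:, None], a=ks, scale=1.0 / theta)
    ll += np.sum(np.log(dens @ weights))          # mixture density for each S_i > 0
    return -ll

# simulate data under the same assumptions, then fit by numerical optimization
rng = np.random.default_rng(2)
counts = rng.poisson(3.0, size=2000)
s = np.array([rng.exponential(2.0, size=n).sum() for n in counts])  # true theta = 0.5

fit = optimize.minimize(neg_log_lik, x0=[1.0, 1.0], args=(s,), method="Nelder-Mead")
print(fit.x)   # estimates of (lambda, theta), expected near (3.0, 0.5)
```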
Method of Moments
Another parameter estimation method we talked about in a previous chapter is
the method of moments.
The Compound Poisson distribution has a moment-generating function given by:
$$M_S(t)=\exp\left(\lambda[M_{X}(t)-1]\right)$$
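This follows by conditioning on \(N\): since \(\mathbb{E}[e^{tS}\mid N]=M_X(t)^N\) and \(\mathbb{E}[z^N]=e^{\lambda(z-1)}\) for a Poisson count,
$$M_S(t)=\mathbb{E}\left[\mathbb{E}\left[e^{tS}\mid N\right]\right]=\mathbb{E}\left[M_X(t)^N\right]=e^{\lambda(M_X(t)-1)}$$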
From this, moments of the compound distribution can be calculated using the relationship:
$$m_n=\mathbb{E}[S^n]=M_{S}^{(n)}(0)=\left.\frac{d^nM_S}{dt^n}\right|_{t=0}$$
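For example, differentiating once and evaluating at \(t=0\) (where \(M_X(0)=1\)) recovers the mean:
$$M_S'(t)=\lambda M_X'(t)\,e^{\lambda(M_X(t)-1)}\quad\Rightarrow\quad \mathbb{E}[S]=M_S'(0)=\lambda M_X'(0)=\lambda\,\mathbb{E}[X]$$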
The first few moments of the Compound Poisson distribution are:
$$\mathbb{E}[S]=\mathbb{E}[N]\mathbb{E}[X]$$
$$\mathbb{E}[S]=\lambda\mathbb{E}[X]$$
$$\text{Var}(S)=\mathbb{E}[N]\text{Var}(X)+(\mathbb{E}[X])^2\text{Var}(N)$$
$$\text{Var}(S)=\lambda(\text{Var}(X)+\mathbb{E}[X]^2)$$
Using the fact that the \(n\)-th cumulant of \(S\) is \(\lambda\,\mathbb{E}[X^n]\), the skewness and excess kurtosis are:
$$\text{Skew}(S)=\frac{\lambda\,\mathbb{E}[X^3]}{\left(\lambda\,\mathbb{E}[X^2]\right)^{3/2}}=\frac{1}{\sqrt{\lambda}}\frac{\mathbb{E}[X^3]}{\left(\mathbb{E}[X^2]\right)^{3/2}}$$
$$\text{ExKurt}(S)=\frac{\lambda\,\mathbb{E}[X^4]}{\left(\lambda\,\mathbb{E}[X^2]\right)^{2}}=\frac{1}{\lambda}\frac{\mathbb{E}[X^4]}{\left(\mathbb{E}[X^2]\right)^{2}}$$
Given a sample \(S_1,S_2,\dots,S_n\), the sample mean \(\bar{S}\) and sample variance \(s_S^2\) can be calculated as:
$$\bar{S}=\frac{1}{n}\sum_{i=1}^nS_i$$
$$s_{S}^2=\frac{1}{n}\sum_{i=1}^n(S_i-\bar{S})^2$$
Equating these to the theoretical moments gives
$$\bar{S}=\lambda\,\mathbb{E}[X]$$
$$s_{S}^2=\lambda\,\mathbb{E}[X^2]$$
These two equations alone do not determine both \(\lambda\) and the parameters of \(X\), so an assumption about the secondary distribution is needed. If, for example, the \(X_i\) are Poisson distributed with mean \(\mu_X\), then \(\mathbb{E}[X^2]=\mu_X^2+\mu_X\), so \(s_S^2-\bar{S}=\lambda\mu_X^2\) and the equations can be solved explicitly:
$$\hat{\lambda}=\frac{\bar{S}^2}{s_{S}^2-\bar{S}}$$
$$\hat{\mu}_X=\frac{\bar{S}}{\hat{\lambda}}$$
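A quick simulation check of these estimators, assuming (as above, and purely for illustration) Poisson-distributed event sizes and NumPy:

```python
import numpy as np

rng = np.random.default_rng(3)
lam_true, mu_x_true = 3.0, 2.0

# simulate S_1, ..., S_n with N ~ Poisson(lam) and X_i ~ Poisson(mu_x)
counts = rng.poisson(lam_true, size=100_000)
samples = np.array([rng.poisson(mu_x_true, size=n).sum() for n in counts])

s_bar = samples.mean()
s_var = samples.var()                  # 1/n version, matching the formula above

lam_hat = s_bar**2 / (s_var - s_bar)   # method-of-moments estimate of lambda
mu_x_hat = s_bar / lam_hat             # method-of-moments estimate of mu_X
print(lam_hat, mu_x_hat)               # expected near (3.0, 2.0)
```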