Distribution Textbook (Work in Progress)

by John Della Rosa


Introduction to Estimators

Recommended Prerequisites

  1. Probability


An estimator is a function or rule that takes a sample of data and produces an estimate of some population parameter.

Let \(\theta\) be a parameter of interest (e.g., the population mean, variance, or proportion), and let \(X_1,X_2,\dots,X_n\) be a random sample from a population. An estimator \(\hat{\theta}\) is a function of the sample: $$\hat{\theta}=\hat{\theta}(X_1,X_2,\dots,X_n)$$ An estimate is the specific value obtained by applying the estimator to a given data set. The same symbol \(\hat{\theta}\) is commonly used for both: the estimator is a random variable, while the estimate is a fixed number.


The sample mean, \(\bar{X}\), is an estimator for the population mean \(\mu\): $$\bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_i$$
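The distinction between an estimator (the rule) and an estimate (the number it returns for one data set) can be sketched in a few lines of Python. The function name `sample_mean` and the data values are made up for illustration.

```python
def sample_mean(xs):
    """Estimator: a rule that maps any sample to a number (here, the average)."""
    return sum(xs) / len(xs)

# Hypothetical data set, invented purely for illustration.
data = [2.0, 4.0, 4.0, 5.0, 10.0]

# Estimate: the specific value the estimator produces for this data set.
estimate = sample_mean(data)
print(estimate)  # prints 5.0
```

Applying the same function to a different sample would generally give a different estimate, which is why the estimator itself is treated as a random variable.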


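The bias of an estimator measures how far the expected value of the estimator is from the true parameter value. $$\text{Bias}(\hat{\theta})=\mathbb{E}[\hat{\theta}]-\theta$$ An estimator is called unbiased if its bias is 0, that is, if \(\mathbb{E}[\hat{\theta}]=\theta\).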


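The sample variance \(S^2=\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2\) is an unbiased estimator for the population variance \(\sigma^2\). However, the biased sample variance estimator \(\hat{\sigma}^2\): $$\hat{\sigma}^2=\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar{X})^2$$ has expected value \(\frac{n-1}{n}\sigma^2\) and therefore bias \(\frac{n-1}{n}\sigma^2-\sigma^2=-\frac{\sigma^2}{n}\), making it a biased estimator of the population variance. On average it underestimates \(\sigma^2\), although the bias shrinks to 0 as \(n\) grows.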
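A quick simulation can make this visible. The sketch below assumes standard normal data (so \(\sigma^2=1\)) and a sample size of \(n=5\). The function names, seed, and trial count are arbitrary choices for illustration. Averaged over many samples, the \(\frac{1}{n}\) estimator should land near \(\frac{n-1}{n}\sigma^2=0.8\), while the \(\frac{1}{n-1}\) version should land near 1.

```python
import random

def biased_var(xs):
    """Variance estimator that divides by n."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n

def unbiased_var(xs):
    """Variance estimator that divides by n - 1."""
    n = len(xs)
    return biased_var(xs) * n / (n - 1)

def average_estimates(n, trials, seed=0):
    """Average both estimators over many simulated N(0, 1) samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_biased = 0.0
    total_unbiased = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total_biased += biased_var(xs)
        total_unbiased += unbiased_var(xs)
    return total_biased / trials, total_unbiased / trials

avg_biased, avg_unbiased = average_estimates(n=5, trials=20000)
print(avg_biased, avg_unbiased)
```

The averages are Monte Carlo approximations of the expected values, so they will be close to, but not exactly, the theoretical numbers.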

Variance of an Estimator

The variance of an estimator measures the expected squared deviation of the estimator from its expected value. The variance of \(\hat{\theta}\) is given by: $$\text{Var}(\hat{\theta})=\mathbb{E}[(\hat{\theta}-\mathbb{E}[\hat{\theta}])^2]$$

Bias-Variance Trade-off

The mean squared error of an estimator can be broken down into the two previously mentioned quantities: $$\text{MSE}(\hat{\theta})=\mathbb{E}[(\hat{\theta}-\theta)^2]=\text{Var}(\hat{\theta})+[\text{Bias}(\hat{\theta})]^2$$
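The decomposition can be checked numerically. The sketch below simulates many values of the \(\frac{1}{n}\) variance estimator on standard normal samples (an assumption made for illustration, so the true parameter is \(\sigma^2=1\)) and compares the empirical MSE with the empirical variance plus squared bias. The names, seed, and sample sizes are arbitrary.

```python
import random
from statistics import pvariance

def biased_var(xs):
    """Variance estimator that divides by n."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n

rng = random.Random(1)  # fixed seed so the run is repeatable
sigma2 = 1.0            # true variance of the simulated N(0, 1) data
n = 5

# One estimate per simulated sample.
estimates = [
    biased_var([rng.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(5000)
]

# Empirical versions of the three quantities in the decomposition.
mse = sum((t - sigma2) ** 2 for t in estimates) / len(estimates)
var = pvariance(estimates)
bias = sum(estimates) / len(estimates) - sigma2

print(mse, var + bias ** 2)
```

The two printed numbers agree up to floating-point error, because the identity holds for empirical moments as well as for expectations, provided the variance is computed with the population (divide by the count) formula, as `pvariance` does.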


Formally stated, an estimator is consistent if, for every \(\varepsilon>0\), the probability that the estimator deviates from the true parameter by more than \(\varepsilon\) goes to 0 as the sample size goes to infinity: $$\lim_{n\rightarrow\infty}P(|\hat{\theta}-\theta|>\varepsilon)=0$$ The previously mentioned biased sample variance estimator is consistent: it is biased for every finite \(n\), but it still converges in probability to \(\sigma^2\).
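The definition can be illustrated by simulation. The sketch below estimates \(P(|\hat{\sigma}^2-\sigma^2|>\varepsilon)\) for the \(\frac{1}{n}\) variance estimator at a few sample sizes, assuming standard normal data (so \(\sigma^2=1\)) and \(\varepsilon=0.5\). These choices, along with the function names, seed, and trial count, are for illustration only.

```python
import random

def biased_var(xs):
    """Variance estimator that divides by n."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n

def exceed_fraction(n, eps, trials, rng, sigma2=1.0):
    """Fraction of simulated N(0, 1) samples of size n whose
    estimate misses sigma2 by more than eps."""
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(biased_var(xs) - sigma2) > eps:
            hits += 1
    return hits / trials

rng = random.Random(2)  # fixed seed so the run is repeatable
fractions = {
    n: exceed_fraction(n, eps=0.5, trials=500, rng=rng)
    for n in (10, 100, 1000)
}
print(fractions)
```

The fractions should shrink toward 0 as \(n\) increases. A simulation like this only suggests the limiting behavior; it is not a proof of consistency.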

Sufficient Statistics

