Lecture 10. C) Cramér-Rao Lower Bound
Cramér-Rao Lower Bound (CRLB)
It is possible to provide a meaningful lower bound to the variance of an estimator. (An example of a meaningless bound is zero.)
Let [math]X_{1},\ldots,X_{n}[/math] be a random sample from a distribution with marginal pmf/pdf [math]f\left(\left.\cdot\right|\theta\right)[/math], i.e., the pmf/pdf of a single observation.
Under some regularity conditions (the estimator has finite variance and differentiation under the integral sign is permitted),
[math]Var_{\theta}\left(\widehat{\theta}\right)\geq\frac{\left(\frac{d}{d\theta}E_{\theta}\left(\widehat{\theta}\right)\right)^{2}}{nE_{\theta}\left[\left(\frac{\partial}{\partial\theta}\log\,f\left(\left.X_{i}\right|\theta\right)\right)^{2}\right]}=\frac{\left(\frac{d}{d\theta}E_{\theta}\left(\widehat{\theta}\right)\right)^{2}}{nVar_{\theta}\left[\frac{\partial}{\partial\theta}\log\,f\left(\left.X_{i}\right|\theta\right)\right]},\,\forall\theta\in\Theta.[/math]
This is the version for the scalar case, but a multivariate analogue exists as well.
Notice that when [math]\widehat{\theta}[/math] is unbiased, we obtain the simplified inequality
[math]Var_{\theta}\left(\widehat{\theta}\right)\geq\frac{1}{nE_{\theta}\left[\left(\frac{\partial}{\partial\theta}\log\,f\left(\left.X_{i}\right|\theta\right)\right)^{2}\right]}=\frac{1}{nVar_{\theta}\left[\frac{\partial}{\partial\theta}\log\,f\left(\left.X_{i}\right|\theta\right)\right]},\,\forall\theta\in\Theta.[/math]
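To see the unbiased-case bound in action, consider a standard computation (not part of the original notes): [math]X_{i}\sim N\left(\mu,\sigma^{2}\right)[/math] with [math]\sigma^{2}[/math] known and [math]\widehat{\mu}=\bar{X}[/math]. Then [math]\frac{\partial}{\partial\mu}\log\,f\left(\left.X_{i}\right|\mu\right)=\frac{X_{i}-\mu}{\sigma^{2}}[/math], so [math]E_{\mu}\left[\left(\frac{X_{i}-\mu}{\sigma^{2}}\right)^{2}\right]=\frac{1}{\sigma^{2}}[/math] and the bound equals [math]\frac{\sigma^{2}}{n}[/math], which is exactly [math]Var_{\mu}\left(\bar{X}\right)[/math]: the sample mean attains the CRLB here.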
This result presents a few striking features.
First, the log-likelihood function shows up in the denominator.
Second, it is not evaluated at a particular realization [math]x_{i}[/math]. Rather, the expectation over [math]X_{i}[/math] of its squared derivative w.r.t. [math]\theta[/math] is taken.
Third, the last equality follows from a result that we state without proof:
[math]Var_{\theta}\left[\frac{\partial}{\partial\theta}\log\,f\left(\left.X_{i}\right|\theta\right)\right]=E_{\theta}\left[\left(\frac{\partial}{\partial\theta}\log\,f\left(\left.X_{i}\right|\theta\right)\right)^{2}\right][/math].
This is a special property of the log-likelihood function: the variance equals the second moment because the score has mean zero, [math]E_{\theta}\left[\frac{\partial}{\partial\theta}\log\,f\left(\left.X_{i}\right|\theta\right)\right]=0[/math].
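The zero-mean property can be verified in one line, assuming differentiation under the integral sign is allowed (written for the continuous case; sums replace integrals for pmfs): [math]E_{\theta}\left[\frac{\partial}{\partial\theta}\log\,f\left(\left.X_{i}\right|\theta\right)\right]=\int\frac{\frac{\partial}{\partial\theta}f\left(\left.x\right|\theta\right)}{f\left(\left.x\right|\theta\right)}f\left(\left.x\right|\theta\right)dx=\frac{d}{d\theta}\int f\left(\left.x\right|\theta\right)dx=\frac{d}{d\theta}1=0[/math].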
Fisher Information
We call the denominator, [math]I\left(\theta\right)=nE_{\theta}\left[\left(\frac{\partial}{\partial\theta}\log\,f\left(\left.X_{i}\right|\theta\right)\right)^{2}\right][/math], the Fisher information; its reciprocal is the lower bound on the variance of unbiased estimators.
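As a quick numerical illustration (a sketch, not part of the lecture; the Bernoulli model and the specific values of [math]p[/math] and the sample size are chosen for concreteness), the per-observation Fisher information can be approximated by simulating the squared score and comparing it with the closed form [math]\frac{1}{p\left(1-p\right)}[/math]:

```python
import numpy as np

rng = np.random.default_rng(0)

def fisher_info_bernoulli(p, num_samples=200_000):
    """Monte Carlo estimate of the per-observation Fisher information
    E[(d/dp log f(X|p))^2] for X ~ Bernoulli(p)."""
    x = rng.binomial(1, p, size=num_samples)
    # Score: derivative of x*log(p) + (1-x)*log(1-p) w.r.t. p
    score = x / p - (1 - x) / (1 - p)
    return np.mean(score**2)

p = 0.3
approx = fisher_info_bernoulli(p)
exact = 1 / (p * (1 - p))  # closed form for the Bernoulli model
print(approx, exact)       # the two should be close
```

Dividing by [math]n[/math] then gives the CRLB [math]\frac{p\left(1-p\right)}{n}[/math], the variance of the sample mean, consistent with the "applicable and attainable" case below.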
CRLB: Possible Cases
The CRLB is a weak bound, in the sense that even the UMVU estimator may fail to attain it.
Three possible cases can occur:
- The CRLB is applicable and attainable:
- Estimating [math]p[/math] when [math]X_{i}\sim Ber\left(p\right)[/math]
- Estimating [math]\mu[/math] when [math]X_{i}\sim N\left(\mu,\sigma^{2}\right)[/math] with [math]\sigma^{2}[/math] known.
- The CRLB is applicable, but not attainable:
- Estimating [math]\sigma^{2}[/math] when [math]X_{i}\sim N\left(\mu,\sigma^{2}\right)[/math]: [math]\widehat{\sigma^{2}}=s^{2},[/math] while [math]Var_{\sigma^{2}}\left(s^{2}\right)=\frac{2\sigma^{4}}{n-1}\gt \frac{2\sigma^{4}}{n}[/math], the latter of which is the CRLB.
- The CRLB is not applicable:
- Estimating [math]\theta[/math] when [math]X_{i}\sim U\left(0,\theta\right)[/math]: the support depends on [math]\theta[/math], so the regularity conditions fail. Indeed, [math]Var_{\theta}\left(\widehat{\theta}_{UMVU}\right)=\frac{1}{n\left(n+2\right)}\theta^{2}[/math], yet a formal computation of the CRLB yields [math]\infty[/math] or [math]\frac{\theta^{2}}{n}[/math] (depending on how the differentiation is carried out), so the "bound" does not hold.
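A small simulation (a sketch with arbitrary values of [math]\theta[/math] and [math]n[/math]) makes the last case concrete: the UMVU estimator [math]\frac{n+1}{n}\max_{i}X_{i}[/math] for [math]U\left(0,\theta\right)[/math] has variance [math]\frac{\theta^{2}}{n\left(n+2\right)}[/math], strictly below the formally computed "bound" [math]\frac{\theta^{2}}{n}[/math]:

```python
import numpy as np

rng = np.random.default_rng(1)

theta, n, reps = 2.0, 10, 100_000
# UMVU estimator for U(0, theta): (n+1)/n * max(X_1, ..., X_n)
samples = rng.uniform(0, theta, size=(reps, n))
umvu = (n + 1) / n * samples.max(axis=1)

var_umvu = umvu.var()                  # simulated variance of the UMVU
exact_var = theta**2 / (n * (n + 2))   # theoretical UMVU variance
formal_crlb = theta**2 / n             # formal (invalid) "bound"
print(var_umvu, exact_var, formal_crlb)
```

The simulated variance matches the theoretical value and falls below the formal CRLB, confirming that the bound simply does not apply in this non-regular model.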