Lecture 16. H) Theorem: Bernstein-von Mises


Theorem: Bernstein-von Mises

Let [math]\widehat{\theta}_{B}[/math] be the Bayesian point estimator, i.e., the posterior mean [math]\widehat{\theta}_{B}=E\left(\left.\theta\right|X\right)[/math], and let [math]\widehat{\theta}_{ML}[/math] be the MLE. Then,

[math]\sqrt{n}\left(\widehat{\theta}_{B}-\theta_{0}\right)\overset{d}{\rightarrow}N\left(0,I\left(\theta_{0}\right)^{-1}\right),\text{ where }\theta_{0}\text{ is the true value of }\theta.[/math]

and

[math]\sqrt{n}\left(\widehat{\theta}_{B}-\widehat{\theta}_{ML}\right)\overset{p}{\rightarrow}0[/math]

The second result is striking: it tells us that even after scaling by [math]\sqrt{n}[/math], the difference between the Bayes and ML estimators converges in probability to zero; the two estimators are asymptotically equivalent.
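To see both results concretely, here is a minimal simulation sketch (not from the lecture; the Bernoulli model, Beta prior parameters, and seed are assumptions for illustration). With a Beta(a, b) prior and s successes in n Bernoulli trials, the posterior is Beta(a + s, b + n - s), so the posterior mean has a closed form and can be compared directly to the MLE s/n:

```python
import numpy as np

# Sketch: Bernoulli(theta_0) data with a conjugate Beta(a, b) prior.
# Posterior: Beta(a + s, b + n - s); Bayes estimator = posterior mean.
rng = np.random.default_rng(0)
theta_0, a, b = 0.3, 2.0, 2.0  # true parameter and illustrative prior

for n in [100, 10_000, 1_000_000]:
    s = rng.binomial(n, theta_0)          # number of successes
    theta_ml = s / n                      # MLE for the Bernoulli model
    theta_bayes = (a + s) / (a + b + n)   # posterior mean under the Beta prior
    # Second result: sqrt(n) * (Bayes - MLE) -> 0 in probability
    print(n, np.sqrt(n) * (theta_bayes - theta_ml))
```

The printed differences shrink at rate [math]1/\sqrt{n}[/math], since in this model [math]\sqrt{n}\left(\widehat{\theta}_{B}-\widehat{\theta}_{ML}\right)=\sqrt{n}\left(a-\widehat{\theta}_{ML}\left(a+b\right)\right)/\left(a+b+n\right)[/math]: the prior's influence washes out faster than the sampling noise.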

In practice, researchers can estimate [math]I\left(\theta_{0}\right)^{-1}[/math] from the variance implied by the posterior [math]f_{\left.\theta\right|X}[/math] and use it for hypothesis testing. In more complicated cases, the prior need not belong to a conjugate family, in which case numerical methods are used to draw from the posterior, including the Gibbs sampler and the Metropolis-Hastings algorithm.
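As a sketch of that last point (illustrative, not from the lecture: the sample size, success count, prior shape, and proposal scale are all assumptions), a random-walk Metropolis-Hastings sampler needs only the log posterior up to a constant. Here it targets the posterior of a Bernoulli success probability under a non-conjugate prior:

```python
import numpy as np

# Random-walk Metropolis-Hastings for theta | X in a Bernoulli model
# with a hypothetical non-conjugate prior (no closed-form posterior).
rng = np.random.default_rng(1)
n, s = 200, 64  # assumed sample size and success count

def log_posterior(theta):
    if not 0.0 < theta < 1.0:
        return -np.inf                        # zero posterior outside (0, 1)
    log_lik = s * np.log(theta) + (n - s) * np.log(1.0 - theta)
    log_prior = -((theta - 0.5) ** 2) / 0.02  # hypothetical prior, up to a constant
    return log_lik + log_prior

theta, draws = 0.5, []
for _ in range(20_000):
    proposal = theta + rng.normal(scale=0.05)  # symmetric random-walk step
    # Accept with probability min(1, posterior ratio); the proposal is
    # symmetric, so the Hastings correction cancels.
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(theta):
        theta = proposal
    draws.append(theta)

draws = np.array(draws[5_000:])   # discard burn-in
print(draws.mean(), draws.var())  # estimates of E(theta|X) and posterior variance
```

The mean of the retained draws estimates [math]E\left(\left.\theta\right|X\right)[/math], and their variance gives the posterior variance used above for inference.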