Lecture 16. H) Theorem: Bernstein-von Mises
Theorem: Bernstein-von Mises
Let [math]\widehat{\theta}_{B}[/math] be the Bayes point estimator, i.e., the posterior mean [math]\widehat{\theta}_{B}=E\left(\left.\theta\right|X\right)[/math], and let [math]\widehat{\theta}_{ML}[/math] be the MLE. Then,
[math]\sqrt{n}\left(\widehat{\theta}_{B}-\theta_{0}\right)\overset{d}{\rightarrow}N\left(0,I\left(\theta_{0}\right)^{-1}\right),\text{ where }\theta_{0}\text{ is the true value of }\theta.[/math]
and
[math]\sqrt{n}\left(\widehat{\theta}_{B}-\widehat{\theta}_{ML}\right)\overset{p}{\rightarrow}0[/math]
The second result is striking: it tells us that, even after scaling by [math]\sqrt{n}[/math], the difference between the Bayes and ML estimators converges in probability to zero, so the two estimators are asymptotically equivalent.
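As an illustration (not part of the lecture), one can check the second result numerically in a conjugate Beta-Bernoulli model, where both estimators are available in closed form: the MLE is the sample mean, and under a [math]\text{Beta}(a,b)[/math] prior the posterior mean is [math]\left(a+\sum_{i}x_{i}\right)/\left(a+b+n\right)[/math]. The sketch below assumes an arbitrary [math]\text{Beta}(2,5)[/math] prior and true value [math]\theta_{0}=0.3[/math]; the printed gap [math]\sqrt{n}\left(\widehat{\theta}_{B}-\widehat{\theta}_{ML}\right)[/math] shrinks toward zero as [math]n[/math] grows.

```python
import numpy as np

rng = np.random.default_rng(0)
theta0 = 0.3        # true parameter (chosen for illustration)
a, b = 2.0, 5.0     # Beta prior hyperparameters (arbitrary choice)

for n in [10, 100, 1_000, 10_000, 100_000]:
    x = rng.binomial(1, theta0, size=n)
    theta_ml = x.mean()                        # MLE: sample mean
    theta_bayes = (a + x.sum()) / (a + b + n)  # posterior mean under Beta(a, b) prior
    print(n, np.sqrt(n) * (theta_bayes - theta_ml))
```

Here the gap equals [math]\left(na-(a+b)\sum_{i}x_{i}\right)/\left(\sqrt{n}\left(a+b+n\right)\right)=O_{p}\left(1/\sqrt{n}\right)[/math], so the influence of the prior washes out at the [math]\sqrt{n}[/math] scale.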
In practice, researchers can estimate [math]I\left(\theta\right)^{-1}[/math] from the variance of the posterior density [math]f_{\left.\theta\right|X}[/math] and use it for hypothesis testing. In more complicated cases the prior need not belong to a conjugate family, and the posterior is then explored numerically by drawing from it with methods such as the Gibbs sampler or the Metropolis-Hastings algorithm.
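Since the lecture only names these samplers, the following is a minimal random-walk Metropolis-Hastings sketch under assumed ingredients (a normal likelihood with known unit variance, a non-conjugate Cauchy prior, and a hand-chosen proposal scale) for drawing from [math]f_{\left.\theta\right|X}[/math] and estimating the posterior mean and variance.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(1.0, 1.0, size=200)   # illustrative data from N(theta0 = 1, 1)

def log_post(theta):
    # log posterior up to a constant: N(theta, 1) likelihood plus Cauchy(0, 1) prior
    return -0.5 * np.sum((x - theta) ** 2) - np.log(1 + theta ** 2)

theta, draws = 0.0, []
for _ in range(20_000):
    prop = theta + rng.normal(0, 0.2)          # random-walk proposal (scale chosen by hand)
    if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
        theta = prop                           # accept with the Metropolis-Hastings probability
    draws.append(theta)

draws = np.array(draws[5_000:])                # discard burn-in draws
print(draws.mean(), draws.var())               # posterior mean and variance from the chain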