Lecture 16. G) Multiple Observations
In the normal case for a single observation, the following holds:
[math]\left.\begin{array}{c} f_{\left.X\right|\mu}=N\left(\mu,\sigma^{2}\right)\\ f_{\mu}=N\left(\mu_{0},\sigma_{0}^{2}\right) \end{array}\right\} \Rightarrow f_{\left.\mu\right|X}=N\left(\frac{\sigma_{0}^{2}}{\sigma^{2}+\sigma_{0}^{2}}x+\frac{\sigma^{2}}{\sigma^{2}+\sigma_{0}^{2}}\mu_{0},\left(\frac{1}{\sigma^{2}}+\frac{1}{\sigma_{0}^{2}}\right)^{-1}\right)[/math]
Notice the following results:
- The posterior mean is a weighted average of the prior mean [math]\mu_{0}[/math] and the data [math]x[/math].
- The posterior variance does not depend on [math]x[/math] (this is a property of the Normal).
- Estimation: a large [math]\sigma_{0}^{2}[/math] (a diffuse prior) is often preferred, since it lets the data dominate the posterior mean when prior information is weak.
- Taking [math]\sigma_{0}^{2}=\infty[/math] yields an uninformative (improper) prior: the update is still possible as a limit, but the prior itself is not a well-defined distribution.
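The single-observation update can be checked numerically. Below is a minimal sketch (the function name `posterior_single` is mine, not from the lecture) that computes the posterior mean as the weighted average above and the posterior variance from the precisions:

```python
def posterior_single(x, sigma2, mu0, sigma02):
    """Posterior of mu given one observation x ~ N(mu, sigma2),
    with prior mu ~ N(mu0, sigma02). Returns (mean, variance)."""
    w = sigma02 / (sigma2 + sigma02)            # weight on the data x
    mean = w * x + (1 - w) * mu0                # weighted average of x and mu0
    var = 1.0 / (1.0 / sigma2 + 1.0 / sigma02)  # precisions add
    return mean, var

# Equal prior and sampling variances: posterior mean lands halfway
# between the data and the prior mean, as the formula predicts.
m, v = posterior_single(x=4.0, sigma2=1.0, mu0=0.0, sigma02=1.0)
print(m, v)  # 2.0 0.5
```

Note that the posterior variance computed here indeed never uses `x`, illustrating the second bullet above.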
Finally, notice that this result can be used to provide a generalization for the case of multiple observations.
2 Observations
Note that
[math]\begin{aligned} f\left(\left.\mu\right|x_{1},x_{2}\right) & =\frac{f\left(\mu,x_{1},x_{2}\right)}{f\left(x_{1},x_{2}\right)}\\ & =\frac{f\left(\left.x_{1}\right|\mu,x_{2}\right)f\left(\left.x_{2}\right|\mu\right)f\left(\mu\right)}{f\left(x_{1},x_{2}\right)}\\ & =\frac{f\left(\left.x_{1}\right|\mu\right)f\left(\left.x_{2}\right|\mu\right)f\left(\mu\right)}{f\left(x_{1},x_{2}\right)}\\ & \propto f\left(\left.x_{1}\right|\mu\right)f\left(\left.x_{2}\right|\mu\right)f\left(\mu\right)\end{aligned}[/math]
where the third equality follows from the fact that the data is a random sample: the observations are conditionally independent given [math]\mu[/math], so [math]f\left(\left.x_{1}\right|\mu,x_{2}\right)=f\left(\left.x_{1}\right|\mu\right)[/math].
We can take advantage of this result by calculating [math]f\left(\left.\mu\right|x_{1},x_{2}\right)[/math] sequentially:
First, we calculate
[math]f_{\left.\mu\right|x_{1}}\propto f_{\left.x_{1}\right|\mu}\,f_{\mu}=N\left(\frac{\sigma_{0}^{2}}{\sigma^{2}+\sigma_{0}^{2}}x_{1}+\frac{\sigma^{2}}{\sigma^{2}+\sigma_{0}^{2}}\mu_{0},\left(\frac{1}{\sigma^{2}}+\frac{1}{\sigma_{0}^{2}}\right)^{-1}\right)[/math]
Then, we use this posterior, [math]f_{\left.\mu\right|x_{1}}[/math], as the prior when updating for [math]x_{2}[/math], i.e.,
[math]f_{\left.\mu\right|x_{1},x_{2}}\propto\underset{\text{likelihood}}{\underbrace{f_{\left.x_{2}\right|\mu}}}\underset{\text{prior}}{\underbrace{f_{\left.\mu\right|x_{1}}}}[/math]
so that
[math]f_{\left.\mu\right|x_{1},x_{2}}=N\left(\frac{\sigma_{0}^{2}}{\sigma^{2}+2\sigma_{0}^{2}}\left(x_{1}+x_{2}\right)+\frac{\sigma^{2}}{\sigma^{2}+2\sigma_{0}^{2}}\mu_{0},\left(\frac{2}{\sigma^{2}}+\frac{1}{\sigma_{0}^{2}}\right)^{-1}\right)[/math]
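The sequential scheme can be verified against this closed form: applying the single-observation update twice (the posterior after [math]x_{1}[/math] serving as the prior for [math]x_{2}[/math]) must reproduce the two-observation posterior. A quick sketch (function names are mine):

```python
def posterior_single(x, sigma2, mu0, sigma02):
    """One-observation normal-normal update; returns (mean, variance)."""
    w = sigma02 / (sigma2 + sigma02)
    return w * x + (1 - w) * mu0, 1.0 / (1.0 / sigma2 + 1.0 / sigma02)

def posterior_two(x1, x2, sigma2, mu0, sigma02):
    """Closed-form posterior for two observations, as in the formula above."""
    denom = sigma2 + 2 * sigma02
    mean = sigma02 / denom * (x1 + x2) + sigma2 / denom * mu0
    var = 1.0 / (2.0 / sigma2 + 1.0 / sigma02)
    return mean, var

sigma2, mu0, sigma02 = 2.0, 1.0, 3.0
x1, x2 = 4.0, -1.0

# Sequential: posterior after x1 becomes the prior for the x2 update.
m1, v1 = posterior_single(x1, sigma2, mu0, sigma02)
m_seq, v_seq = posterior_single(x2, sigma2, m1, v1)

# Joint closed form.
m_joint, v_joint = posterior_two(x1, x2, sigma2, mu0, sigma02)

assert abs(m_seq - m_joint) < 1e-12
assert abs(v_seq - v_joint) < 1e-12
print(m_seq, v_seq)  # 1.375 0.75
```

The order of the observations does not matter either: swapping `x1` and `x2` yields the same posterior, since the closed form depends only on the sum [math]x_{1}+x_{2}[/math].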
[math]n[/math] Observations
For [math]x_{1},\ldots,x_{n}[/math], it follows by the same sequential argument that
[math]f_{\left.\mu\right|x_{1}..x_{n}}=N\left(\frac{\sigma_{0}^{2}}{\sigma^{2}+n\sigma_{0}^{2}}\sum_{i=1}^{n}x_{i}+\frac{\sigma^{2}}{\sigma^{2}+n\sigma_{0}^{2}}\mu_{0},\left(\frac{n}{\sigma^{2}}+\frac{1}{\sigma_{0}^{2}}\right)^{-1}\right).[/math]
Notice that, unlike in the previous example, the expression for the posterior distribution is "stable" even for a very large number of observations: as [math]n[/math] grows, the posterior mean converges to the sample mean and the posterior variance shrinks like [math]\sigma^{2}/n[/math], so the prior's influence vanishes.
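This concentration of the posterior can be seen numerically. A minimal sketch of the [math]n[/math]-observation formula (the function name `posterior_n` is mine):

```python
def posterior_n(xs, sigma2, mu0, sigma02):
    """Posterior of mu given n iid observations with known variance sigma2."""
    n = len(xs)
    denom = sigma2 + n * sigma02
    mean = sigma02 / denom * sum(xs) + sigma2 / denom * mu0
    var = 1.0 / (n / sigma2 + 1.0 / sigma02)
    return mean, var

# 1000 observations, all equal to 2.0, against a prior centered at 0:
# the posterior mean is pulled almost entirely to the sample mean,
# and the posterior variance is close to sigma2/n.
xs = [2.0] * 1000
m, v = posterior_n(xs, sigma2=1.0, mu0=0.0, sigma02=1.0)
print(round(m, 3), round(v, 6))  # 1.998 0.000999
```

With [math]n=1000[/math] and unit variances the posterior variance is [math]1/1001\approx\sigma^{2}/n[/math], illustrating that the prior contributes only a vanishing share of the posterior precision.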