Lecture 14. E) Central Limit Theorem

From Significant Statistics

Central Limit Theorem

Let [math]X_{1},\ldots,X_{n}[/math] be an i.i.d. random sample with mean [math]\mu[/math] and variance [math]\sigma^{2}>0[/math], whose moment generating function [math]M_{X}\left(t\right)[/math] exists (and is therefore twice differentiable) in a neighborhood of zero.

Then,

[math]\frac{\sqrt{n}\left(\overline{X}_{n}-\mu\right)}{\sigma}\overset{d}{\rightarrow}N\left(0,1\right)[/math]

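This is a striking result. As [math]n[/math] grows, the distribution of the standardized sample mean converges to the standard normal distribution, whatever the distribution of the [math]X_{i}[/math], provided the conditions above hold.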
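The statement can be illustrated with a small simulation. The sketch below is only an illustration under assumptions not made in the lecture: it draws the [math]X_{i}[/math] from an Exponential(1) distribution (so [math]\mu=\sigma=1[/math]), and the function names, sample size, number of replications and seed are arbitrary choices. It compares the empirical c.d.f. of the standardized sample mean with the standard normal c.d.f. at a few points.

```python
import math
import random
import statistics


def standardized_means(n, reps, rate=1.0, seed=0):
    """Return sqrt(n) * (sample mean - mu) / sigma for `reps` samples of size n.

    Assumption for illustration: X_i ~ Exponential(rate), so mu = sigma = 1/rate.
    """
    rng = random.Random(seed)
    mu = 1.0 / rate
    sigma = 1.0 / rate
    out = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(rate) for _ in range(n)) / n
        out.append(math.sqrt(n) * (xbar - mu) / sigma)
    return out


def normal_cdf(z):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def empirical_cdf(values, z):
    """Share of simulated values that are less than or equal to z."""
    return sum(v <= z for v in values) / len(values)


z_values = standardized_means(n=200, reps=2000)
for z in (-1.0, 0.0, 1.0):
    # Columns: evaluation point, simulated c.d.f., standard normal c.d.f.
    print(z, round(empirical_cdf(z_values, z), 3), round(normal_cdf(z), 3))
```

With these settings the two c.d.f. columns should be close to each other, although they will not coincide exactly because both [math]n[/math] and the number of replications are finite.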

Proof

We first define the standardized variable [math]Y_{i}=\frac{X_{i}-\mu}{\sigma}[/math], so that

[math]M_{Y}^{'}\left(0\right)=E\left(Y\right)=0[/math] and [math]M_{Y}^{''}\left(0\right)=E\left(Y^{2}\right)=Var\left(Y\right)=1[/math].

Notice that [math]\frac{1}{\sqrt{n}}\sum_{i=1}^{n}Y_{i}=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{X_{i}-\mu}{\sigma}=\frac{n}{\sqrt{n}}\frac{\overline{X}_{n}-\mu}{\sigma}=\sqrt{n}\frac{\overline{X}_{n}-\mu}{\sigma},[/math] which is our statistic of interest.

So, if we show that the m.g.f. of [math]\frac{1}{\sqrt{n}}\sum_{i=1}^{n}Y_{i}[/math] converges to the m.g.f. of [math]N\left(0,1\right)[/math] as [math]n\rightarrow\infty[/math], then, because we have assumed that [math]M_{X}\left(t\right)[/math] exists in a neighborhood of zero, this will imply

[math]\frac{1}{\sqrt{n}}\sum_{i=1}^{n}Y_{i}=\sqrt{n}\frac{\overline{X}_{n}-\mu}{\sigma}\overset{d}{\rightarrow}N\left(0,1\right)[/math] (Recall that the m.g.f. only identifies the distribution of [math]Z[/math] if [math]M_{Z}\left(t\right)[/math] exists in a neighborhood of zero).

Now, notice that if [math]Z_{n}=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}Y_{i}[/math],

[math]\begin{aligned}M_{Z_{n}}\left(t\right) & =E\left[\exp\left(t\frac{1}{\sqrt{n}}\sum_{i=1}^{n}Y_{i}\right)\right]=E\left[\prod_{i=1}^{n}\exp\left(t\frac{Y_{i}}{\sqrt{n}}\right)\right]\\ & \underset{(i.i.d.)}{=}\prod_{i=1}^{n}E\left[\exp\left(t\frac{Y_{i}}{\sqrt{n}}\right)\right]=\prod_{i=1}^{n}M_{Y_{i}}\left(\frac{t}{\sqrt{n}}\right)=\left[M_{Y}\left(\frac{t}{\sqrt{n}}\right)\right]^{n}.\end{aligned}[/math]
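The factorization of the m.g.f. can be checked numerically in a case where both sides are available in closed form. As an assumption made purely for illustration (it is not part of the lecture), take [math]Y_{i}[/math] equal to [math]+1[/math] or [math]-1[/math] with probability [math]\frac{1}{2}[/math] each, which has mean 0, variance 1 and [math]M_{Y}\left(t\right)=\cosh\left(t\right)[/math]. The helper names below are hypothetical.

```python
import math


def mgf_direct(t, n):
    """E[exp(t * Z_n)] summed over the exact distribution of Z_n.

    Z_n = (Y_1 + ... + Y_n) / sqrt(n), with Y_i = +1 or -1, each with
    probability 1/2 (an illustrative assumption). If k of the Y_i equal +1,
    the sum equals 2k - n, and k follows a Binomial(n, 1/2) distribution.
    """
    total = 0.0
    for k in range(n + 1):
        prob = math.comb(n, k) / 2**n
        z = (2 * k - n) / math.sqrt(n)
        total += prob * math.exp(t * z)
    return total


def mgf_product(t, n):
    """[M_Y(t / sqrt(n))]^n with M_Y(t) = cosh(t)."""
    return math.cosh(t / math.sqrt(n)) ** n


for n in (1, 5, 25):
    # The two columns after n should agree up to floating-point error.
    print(n, mgf_direct(0.7, n), mgf_product(0.7, n))
```

The first function takes the expectation over the distribution of [math]Z_{n}[/math] directly, while the second uses only the m.g.f. of a single [math]Y_{i}[/math], which is what makes the rest of the proof tractable.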

Let us now expand [math]M_{Y}\left(\frac{t}{\sqrt{n}}\right)[/math] around zero:

[math]M_{Y}\left(\frac{t}{\sqrt{n}}\right)=M_{Y}\left(0\right)+M_{Y}^{'}\left(0\right)\frac{t}{\sqrt{n}}+\frac{M_{Y}^{''}\left(0\right)}{2}\left(\frac{t}{\sqrt{n}}\right)^{2}+R_{Y}\left(\frac{t}{\sqrt{n}}\right)[/math]

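By Taylor's theorem, the remainder [math]R_{Y}\left(\frac{t}{\sqrt{n}}\right)[/math] vanishes faster than [math]\frac{1}{n}[/math] as [math]n[/math] approaches infinity, i.e., [math]nR_{Y}\left(\frac{t}{\sqrt{n}}\right)\rightarrow0[/math]. We will ignore it from now on (but it is possible to prove precisely that it does not affect the limit).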
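For the limit taken later in the proof, what matters is that the remainder vanishes even after being multiplied by [math]n[/math]. The sketch below illustrates this for one concrete case, chosen here as an assumption and not taken from the lecture: [math]Y[/math] equal to [math]+1[/math] or [math]-1[/math] with probability [math]\frac{1}{2}[/math] each, so that [math]M_{Y}\left(s\right)=\cosh\left(s\right)[/math] and the remainder has an explicit form.

```python
import math


def remainder(s):
    """R_Y(s) = M_Y(s) - (1 + s^2 / 2), assuming M_Y(s) = cosh(s)."""
    return math.cosh(s) - 1.0 - s**2 / 2.0


t = 1.5
for n in (10, 100, 1000, 10000):
    r = remainder(t / math.sqrt(n))
    # Columns: n, the remainder itself, and n times the remainder.
    print(n, r, n * r)
```

In this example the remainder behaves like [math]\frac{t^{4}}{24n^{2}}[/math] for large [math]n[/math], so the last column shrinks roughly like [math]\frac{1}{n}[/math].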

Notice also that [math]M_{Y}\left(0\right)=1,[/math] [math]M_{Y}^{'}\left(0\right)=E\left(Y\right)=0[/math] and [math]M_{Y}^{''}\left(0\right)=Var\left(Y\right)=1[/math], so that we obtain

[math]M_{Y}\left(\frac{t}{\sqrt{n}}\right)\simeq1+0+\frac{t^{2}}{2n}[/math]

Letting [math]n\rightarrow\infty[/math], we obtain [math]\lim_{n\rightarrow\infty}\left[M_{Y}\left(\frac{t}{\sqrt{n}}\right)\right]^{n}=\lim_{n\rightarrow\infty}\left(1+\frac{t^{2}}{2n}+\underset{\rightarrow0}{\underbrace{R_{Y}\left(\frac{t}{\sqrt{n}}\right)}}\right)^{n}=\lim_{n\rightarrow\infty}\left(1+\frac{t^{2}}{2n}\right)^{n}=\text{e}^{\frac{t^{2}}{2}},[/math] where the remainder can be dropped because [math]nR_{Y}\left(\frac{t}{\sqrt{n}}\right)\rightarrow0[/math].
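The convergence to [math]\text{e}^{\frac{t^{2}}{2}}[/math] can also be seen numerically. The sketch below evaluates both the exact expression [math]\left[M_{Y}\left(\frac{t}{\sqrt{n}}\right)\right]^{n}[/math] and the approximation [math]\left(1+\frac{t^{2}}{2n}\right)^{n}[/math] for increasing [math]n[/math]. As an illustrative assumption, it again uses [math]Y[/math] equal to [math]+1[/math] or [math]-1[/math] with probability [math]\frac{1}{2}[/math] each, for which [math]M_{Y}\left(t\right)=\cosh\left(t\right)[/math]; any other standardized distribution with a known m.g.f. could be substituted.

```python
import math


def mgf_power(t, n):
    """[M_Y(t / sqrt(n))]^n, assuming M_Y(t) = cosh(t)."""
    return math.cosh(t / math.sqrt(n)) ** n


def leading_terms(t, n):
    """(1 + t^2 / (2n))^n, i.e. the expansion with the remainder dropped."""
    return (1.0 + t**2 / (2.0 * n)) ** n


t = 1.0
target = math.exp(t**2 / 2.0)  # m.g.f. of N(0, 1) evaluated at t
for n in (1, 10, 100, 1000, 10000):
    # Columns: n, exact power, approximation, limiting value.
    print(n, mgf_power(t, n), leading_terms(t, n), target)
```

Both middle columns should approach the last one as [math]n[/math] increases, which is the content of the limit above for this particular choice of [math]Y[/math].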

The result is the m.g.f. of [math]N\left(0,1\right)[/math], thereby proving the CLT.