Lecture 14. A) Convergence

From Significant Statistics

Convergence

In this lecture we will focus on extremely useful results that occur when we consider large sample sizes. In order to analyze these results, we need to introduce a few concepts related to convergence of sequences of random variables.

A sequence of random variables [math]X_{n}[/math] converges to a random variable [math]X[/math]:

  • in probability if [math]\lim_{n\rightarrow\infty}P\left(\left|X_{n}-X\right|\geq\varepsilon\right)=0,\,\forall\varepsilon\gt 0[/math]
  • almost surely if [math]P\left(\lim_{n\rightarrow\infty}\left|X_{n}-X\right|\gt \varepsilon\right)=0,\,\forall\varepsilon\gt 0[/math]
  • in quadratic mean if [math]\lim_{n\rightarrow\infty}E\left[\left(X_{n}-X\right)^{2}\right]=0[/math]

The concepts above also apply when [math]X[/math] is a constant. In this case, we often denote it by [math]\mu[/math].

Some convergence concepts are stronger than others. The following facts are useful:

  • [math]X_{n}\overset{q.m.}{\rightarrow}X\Rightarrow X_{n}\overset{p}{\rightarrow}X[/math]
  • [math]X_{n}\overset{a.s.}{\rightarrow}X\Rightarrow X_{n}\overset{p}{\rightarrow}X[/math]
  • Quadratic mean convergence neither implies nor is implied by almost sure convergence
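The first implication can be sketched with Markov's inequality applied to [math]\left(X_{n}-X\right)^{2}[/math]. This is a standard argument, given here only as a sketch. For any [math]\varepsilon\gt 0[/math],

[math]P\left(\left|X_{n}-X\right|\geq\varepsilon\right)=P\left(\left(X_{n}-X\right)^{2}\geq\varepsilon^{2}\right)\leq\frac{E\left[\left(X_{n}-X\right)^{2}\right]}{\varepsilon^{2}}[/math]

If [math]X_{n}[/math] converges in quadratic mean, the right-hand side goes to zero as [math]n\rightarrow\infty[/math], so the left-hand side does too.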

Example: Convergence in Probability vs. Almost Sure Convergence

Let

[math]X_{n}=\begin{cases} 1, & \text{with probability }\frac{1}{n}\\ 0, & \text{with probability }1-\frac{1}{n} \end{cases}[/math]

[math]Y_{n}=\begin{cases} 1, & \text{with probability }\frac{1}{n^{2}}\\ 0, & \text{with probability }1-\frac{1}{n^{2}} \end{cases}[/math]

Do these random sequences converge in probability and/or almost surely to zero?

We first check that both sequences converge in probability to zero. For [math]X_{n}[/math], we require

[math]\begin{aligned} & \lim_{n\rightarrow\infty}P\left(\left|X_{n}-0\right|\geq\varepsilon\right)=0\\ \Leftrightarrow & \lim_{n\rightarrow\infty}P\left(X_{n}\geq\varepsilon\right)=0\end{aligned}[/math]

If [math]\varepsilon\gt 1[/math], the condition above is always satisfied, since [math]X_{n}\in\left\{ 0,1\right\}[/math].

For [math]\varepsilon\in\left(0,1\right][/math], we have [math]\begin{aligned} & \lim_{n\rightarrow\infty}P\left(X_{n}\geq\varepsilon\right)=0\\ \Leftrightarrow & \lim_{n\rightarrow\infty}P\left(X_{n}=1\right)=0\\ \Leftrightarrow & \lim_{n\rightarrow\infty}\frac{1}{n}=0\end{aligned}[/math]

which does indeed hold. The same method can be used to prove convergence in probability for [math]Y_{n}[/math].
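As a rough numerical check, the tail probabilities can be estimated by simulation. The sketch below is illustrative only: the function name `estimate_tail_prob`, the choice [math]\varepsilon=0.5[/math], the number of draws and the seed are all assumptions made for this example.

```python
import random

def estimate_tail_prob(success_prob, eps=0.5, draws=100_000, seed=0):
    """Monte Carlo estimate of P(Z >= eps) for Z in {0, 1} with P(Z = 1) = success_prob."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(draws):
        z = 1 if rng.random() < success_prob else 0
        if z >= eps:
            hits += 1
    return hits / draws

# X_n equals 1 with probability 1/n, Y_n with probability 1/n^2.
for n in (1, 10, 100, 1000):
    est_x = estimate_tail_prob(1 / n)
    est_y = estimate_tail_prob(1 / n**2)
    print(f"n={n:5d}  P(X_n>=0.5)~{est_x:.4f}  P(Y_n>=0.5)~{est_y:.6f}")
```

Both estimated probabilities should shrink toward zero as [math]n[/math] grows, with the one for [math]Y_{n}[/math] shrinking faster.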

Now, consider convergence almost surely.

For [math]X_{n}[/math], we require

[math]\begin{aligned} & P\left(\lim_{n\rightarrow\infty}\left|X_{n}-0\right|\gt \varepsilon\right)=0\\ \Leftrightarrow & P\left(\lim_{n\rightarrow\infty}X_{n}\gt \varepsilon\right)=0\\ \Rightarrow & P\left(\lim_{n\rightarrow\infty}X_{n}=1\right)=0\end{aligned}[/math]

where the last line follows from the fact that [math]X_{n}\in\left\{ 0,1\right\}[/math]. The condition can only be satisfied if, with probability one, [math]X_{n}[/math] eventually stops taking the value 1 as [math]n\rightarrow\infty[/math].

We will approach this problem indirectly.

Consider the sum, starting at a very high [math]n[/math], of the probabilities that [math]X_{i}=1[/math]: [math]\sum_{i=n}^{\infty}P\left(X_{i}=1\right)=\sum_{i=n}^{\infty}\frac{1}{i}[/math].

If this sum diverges, the ones never die out: assuming the [math]X_{i}[/math] are independent, the second Borel-Cantelli lemma implies that [math]X_{i}=1[/math] occurs infinitely often with probability one, however high the starting index [math]n[/math]. If, on the other hand, the sum converges, the first Borel-Cantelli lemma implies that, with probability one, the value 1 is observed only finitely many times, so the sequence is eventually equal to zero. Notice that

[math]\sum_{i=n}^{\infty}P\left(X_{i}=1\right)=\sum_{i=n}^{\infty}\frac{1}{i}=\infty[/math]

and

[math]\sum_{i=n}^{\infty}P\left(Y_{i}=1\right)=\sum_{i=n}^{\infty}\frac{1}{i^{2}}\lt \infty[/math]

(We don’t prove the results above here.)
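Although the two results are not proved here, partial sums give a quick numerical illustration. The sketch below is not a proof; the helper name `partial_sum` and the starting index 1000 are assumptions chosen for this example.

```python
import math

def partial_sum(term, start, stop):
    """Sum term(i) for i = start, ..., stop (inclusive)."""
    return math.fsum(term(i) for i in range(start, stop + 1))

start = 1000  # a "very high" starting index, chosen arbitrarily
for stop in (10**4, 10**5, 10**6):
    harmonic = partial_sum(lambda i: 1 / i, start, stop)
    squares = partial_sum(lambda i: 1 / i**2, start, stop)
    print(f"i={start}..{stop}: sum 1/i = {harmonic:.4f}, sum 1/i^2 = {squares:.6f}")
```

The partial sums of [math]\frac{1}{i}[/math] keep growing as the upper limit increases (roughly like a logarithm), while those of [math]\frac{1}{i^{2}}[/math] stay below [math]\frac{1}{999}[/math] for this starting index.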

While both sequences converge in probability to zero, only [math]Y_{n}[/math] converges almost surely (treating the draws as independent across [math]n[/math]). The reason is that the probabilities [math]P\left(X_{n}=1\right)=\frac{1}{n}[/math] decay too slowly: the sum of subsequent probabilities diverges, so ones keep appearing no matter how high [math]n[/math] is. The probabilities [math]P\left(Y_{n}=1\right)=\frac{1}{n^{2}}[/math] decay fast enough for the sum of subsequent probabilities to converge, so with probability one only finitely many [math]Y_{n}[/math] equal 1.
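The difference can also be made concrete by computing the exact probability that no 1 occurs between indices [math]n[/math] and [math]m[/math]. The sketch below assumes the draws are independent across indices, which the example does not state explicitly; the function name `prob_all_zero` is hypothetical.

```python
from fractions import Fraction

def prob_all_zero(prob_one, n, m):
    """P(Z_n = ... = Z_m = 0) for independent Z_i with P(Z_i = 1) = prob_one(i)."""
    result = Fraction(1)
    for i in range(n, m + 1):
        result *= 1 - prob_one(i)  # independence: probabilities multiply
    return result

n = 100
for m in (10**3, 10**4):
    p_x = prob_all_zero(lambda i: Fraction(1, i), n, m)
    p_y = prob_all_zero(lambda i: Fraction(1, i * i), n, m)
    print(f"m={m}: P(no X_i=1) = {float(p_x):.4f}, P(no Y_i=1) = {float(p_y):.4f}")
```

Under independence the products telescope to [math]\frac{n-1}{m}[/math] for [math]X[/math] and [math]\frac{(n-1)(m+1)}{nm}[/math] for [math]Y[/math]. As [math]m\rightarrow\infty[/math], the first goes to zero, so another 1 is certain to appear. The second goes to [math]\frac{n-1}{n}[/math], which is close to one for high [math]n[/math].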

Estimator Consistency

Because estimators are statistics, and therefore random variables, the convergence definitions above apply to them as well.

We say that an estimator [math]\widehat{\theta}[/math] is consistent if

[math]\widehat{\theta}\left(X_{1},\ldots,X_{n}\right)\overset{p}{\rightarrow}\theta,\,\forall\theta\in\Theta[/math],

where [math]X_{1},\ldots,X_{n}[/math] is a sequence of random variables (usually, data).

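For example, under standard regularity conditions, the maximum-likelihood estimator is consistent, i.e., [math]\widehat{\theta}_{ML}\overset{p}{\rightarrow}\theta_{0}[/math], where [math]\theta_{0}[/math] denotes the true parameter value.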
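Consistency can be illustrated by simulation in a simple case. The sketch below assumes Bernoulli data with success probability [math]\theta=0.3[/math], for which the sample mean is the maximum-likelihood estimator. The function name `miss_rate`, the tolerance 0.05, the number of repetitions and the seed are choices made for this example only.

```python
import random

def miss_rate(theta, n, eps=0.05, reps=1000, seed=0):
    """Fraction of simulated samples of size n whose mean misses theta by at least eps."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    misses = 0
    for _ in range(reps):
        # Sample mean of n Bernoulli(theta) draws.
        mean = sum(rng.random() < theta for _ in range(n)) / n
        if abs(mean - theta) >= eps:
            misses += 1
    return misses / reps

for n in (10, 100, 1000):
    print(f"n={n:5d}  estimated P(|mean - theta| >= 0.05) = {miss_rate(0.3, n):.3f}")
```

The estimated probability of missing [math]\theta[/math] by at least 0.05 should fall toward zero as [math]n[/math] grows, which is what convergence in probability requires.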