
# Convergence

In this lecture we focus on extremely useful results that arise when we consider large sample sizes. To analyze these results, we first need to introduce a few concepts related to the convergence of sequences of random variables.

A sequence of random variables $X_{n}$ converges to a random variable $X$:

• in probability if $\lim_{n\rightarrow\infty}P\left(\left|X_{n}-X\right|\geq\varepsilon\right)=0,\,\forall\varepsilon\gt 0$
• almost surely if $P\left(\lim_{n\rightarrow\infty}\left|X_{n}-X\right|\gt \varepsilon\right)=0,\,\forall\varepsilon\gt 0$
• in quadratic mean if $\lim_{n\rightarrow\infty}E\left[\left(X_{n}-X\right)^{2}\right]=0$

The definitions above also apply when the limiting random variable $X$ is a constant; in that case we often denote it by $\mu$.
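As an illustration of convergence in quadratic mean, the sketch below (all names are my own, for illustration) estimates $E\left[\left(\bar{X}_{n}-\mu\right)^{2}\right]$ for the sample mean of Bernoulli draws by Monte Carlo; this quantity should shrink roughly like $\mu(1-\mu)/n$.

```python
import random

def mean_sq_error(n, mu=0.5, reps=2000, seed=0):
    """Monte Carlo estimate of E[(X_bar_n - mu)^2] for n Bernoulli(mu) draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xbar = sum(1 if rng.random() < mu else 0 for _ in range(n)) / n
        total += (xbar - mu) ** 2
    return total / reps

# The estimated mean squared error should shrink roughly like mu*(1-mu)/n.
for n in (10, 100, 1000):
    print(n, mean_sq_error(n))
```

Since the mean squared error vanishes, $\bar{X}_{n}\overset{q.m.}{\rightarrow}\mu$, which by the first fact below also gives convergence in probability.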

Some convergence concepts are stronger than others. The following facts are useful:

• $X_{n}\overset{q.m.}{\rightarrow}X\Rightarrow X_{n}\overset{p}{\rightarrow}X$
• $X_{n}\overset{a.s.}{\rightarrow}X\Rightarrow X_{n}\overset{p}{\rightarrow}X$
• Convergence in quadratic mean neither implies, nor is implied by, almost sure convergence

## Example: Convergence in Probability vs. Almost Sure Convergence

Let

$X_{n}=\begin{cases} 1, & \text{with probability }\frac{1}{n}\\ 0, & \text{with probability }1-\frac{1}{n} \end{cases}$

$Y_{n}=\begin{cases} 1, & \text{with probability }\frac{1}{n^{2}}\\ 0, & \text{with probability }1-\frac{1}{n^{2}} \end{cases}$

Do these random sequences converge in probability and/or almost surely to zero?

We first check that both sequences converge in probability to zero. For $X_{n}$, we require

\begin{aligned} & \lim_{n\rightarrow\infty}P\left(\left|X_{n}-0\right|\geq\varepsilon\right)=0\\ \Leftrightarrow & \lim_{n\rightarrow\infty}P\left(X_{n}\geq\varepsilon\right)=0\end{aligned}

If $\varepsilon\gt 1$, the condition above is always satisfied, since $X_{n}\in\left\{ 0,1\right\}$.

For $\varepsilon\in\left(0,1\right)$, we have \begin{aligned} & \lim_{n\rightarrow\infty}P\left(X_{n}\geq\varepsilon\right)=0\\ \Leftrightarrow & \lim_{n\rightarrow\infty}P\left(X_{n}=1\right)=0\\ \Leftrightarrow & \lim_{n\rightarrow\infty}\frac{1}{n}=0\end{aligned}

which does indeed hold. The same method can be used to prove convergence in probability for $Y_{n}$.
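To see this numerically, the hedged sketch below (function name is my own) estimates $P\left(\left|X_{n}-0\right|\geq\varepsilon\right)$ by simulating many independent copies of $X_{n}$; for $\varepsilon\in(0,1)$ this probability is exactly $1/n$, so the estimates should shrink toward zero as $n$ grows.

```python
import random

def prob_exceeds(n, eps=0.5, reps=20000, seed=1):
    """Monte Carlo estimate of P(|X_n - 0| >= eps), where X_n = 1 w.p. 1/n."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(reps)
               if (1 if rng.random() < 1 / n else 0) >= eps)
    return hits / reps

# For eps in (0,1), P(X_n >= eps) = 1/n exactly, so estimates should track 1/n.
for n in (2, 10, 100):
    print(n, prob_exceeds(n))
```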

Now, consider convergence almost surely.

For $X_{n}$, we require

\begin{aligned} & P\left(\lim_{n\rightarrow\infty}\left|X_{n}-0\right|\gt \varepsilon\right)=0\\ \Leftrightarrow & P\left(\lim_{n\rightarrow\infty}X_{n}\gt \varepsilon\right)=0\\ \Rightarrow & P\left(\lim_{n\rightarrow\infty}X_{n}=1\right)=0\end{aligned}

where the last equation follows from the fact that $X_{n}\in\left\{ 0,1\right\}$, and that the condition can only be satisfied if the probability of $X_{n}$ equaling 1 vanishes as $n\rightarrow\infty$.

We will approach this problem indirectly.

Consider the tail sum, starting at a very high $n$, of the probabilities that $X_{i}=1$: $\sum_{i=n}^{\infty}P\left(X_{i}=1\right)=\sum_{i=n}^{\infty}\frac{1}{i}$.

If this sum diverges, then even at very high $n$ we still obtain $X_{i}=1$ with non-negligible probability, so that adding up all the subsequent probabilities creates a diverging sum. If, on the other hand, the sum converges, the events $X_{i}=1$ eventually stop occurring: by the Borel–Cantelli lemma, $\sum_{i}P\left(X_{i}=1\right)\lt \infty$ implies that with probability one only finitely many of these events happen. Notice that

$\sum_{i=n}^{\infty}P\left(X_{i}=1\right)=\sum_{i=n}^{\infty}\frac{1}{i}=\infty$

and

$\sum_{i=n}^{\infty}P\left(Y_{i}=1\right)=\sum_{i=n}^{\infty}\frac{1}{i^{2}}\lt \infty$

(We don’t prove these facts here; the first is the tail of the divergent harmonic series, the second the tail of a convergent $p$-series.)
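The divergence and convergence of these tail sums can be checked numerically with a short sketch (names are my own). Partial sums of $\sum 1/i$ keep growing as more terms are added, while partial sums of $\sum 1/i^{2}$ stabilize below the integral bound $1/(n-1)$.

```python
def tail_sum(p, n_start=100, terms=10**5):
    """Partial sum of 1/i**p for i = n_start, ..., n_start + terms - 1."""
    return sum(1.0 / i ** p for i in range(n_start, n_start + terms))

# The harmonic tail (p=1) keeps growing as `terms` increases: divergence.
# The p=2 tail stays below 1/(n_start - 1) = 1/99 no matter how many
# terms we add: convergence.
print(tail_sum(1))
print(tail_sum(2))
```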

While both sequences converge in probability to zero, only $Y_{n}$ converges almost surely. The reason is that the probability of observing $X_{n}=1$ decays so slowly that the tail sums diverge (ones keep occurring along the sequence), while the probability of observing $Y_{n}=1$ decays fast enough that the tail sums converge (ones eventually stop occurring).
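A single simulated sample path makes the contrast vivid. The hedged sketch below (an illustration, not part of the lecture) records the largest index at which each sequence equals 1 over a long horizon: for $X_{n}$ ones typically keep appearing far along the path, while for $Y_{n}$ they typically stop early.

```python
import random

def last_hit(prob_of_one, horizon=100000, seed=3):
    """Largest n <= horizon with the n-th variable equal to 1 (0 if none)."""
    rng = random.Random(seed)
    last = 0
    for n in range(1, horizon + 1):
        if rng.random() < prob_of_one(n):
            last = n
    return last

# X_n = 1 w.p. 1/n: ones typically keep appearing at very large n.
# Y_n = 1 w.p. 1/n^2: ones typically stop occurring early in the path.
print(last_hit(lambda n: 1 / n))
print(last_hit(lambda n: 1 / n ** 2))
```

With the same seed, both paths are driven by the same uniforms, and $1/n^{2}\leq 1/n$ guarantees every one in the $Y$ path is also a one in the $X$ path.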

## Estimator Consistency

Because estimators are statistics, we can apply the definitions we have found to them as well.

We say that an estimator $\widehat{\theta}$ is consistent if

$\widehat{\theta}\left(X_{1},\dots,X_{n}\right)\overset{p}{\rightarrow}\theta,\,\forall\theta\in\Theta$,

where $X_{1},\dots,X_{n}$ is a sequence of random variables (usually, data).

For example, under standard regularity conditions the maximum-likelihood estimator is consistent, i.e., $\widehat{\theta}_{ML}\overset{p}{\rightarrow}\theta_{0}$, where $\theta_{0}$ is the true parameter value.
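As a concrete check of consistency, the sketch below (names and parameter values are my own) uses the Bernoulli model, where the ML estimator of the success probability is simply the sample mean; as $n$ grows, the estimate should settle near the true value.

```python
import random

def bernoulli_mle(n, p_true=0.3, seed=7):
    """ML estimate of a Bernoulli parameter: the sample mean of n draws."""
    rng = random.Random(seed)
    return sum(1 if rng.random() < p_true else 0 for _ in range(n)) / n

# As n grows, the ML estimate should concentrate around p_true = 0.3,
# illustrating convergence in probability to the true parameter.
for n in (10, 1000, 100000):
    print(n, bernoulli_mle(n))
```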