Lecture 3. B) Moments
Moments of a Random Variable
We have already discussed the expectation of an r.v. [math]X[/math]. It is useful to also consider expectations of specific transformations of [math]X[/math], called moments:
- The [math]k^{\mbox{th}}[/math] moment of [math]X[/math] is [math]E\left(X^{k}\right)=\mu_{k}^{'}[/math], [math]k\in\mathbb{N}[/math]
- The expected value of [math]X[/math] can be written as [math]\mu[/math], and also as [math]\mu_{1}^{'}[/math] .
- The [math]k^{\mbox{th}}[/math] central/centered moment of [math]X[/math] is [math]E\left(\left(X-\mu\right)^{k}\right)=\mu_{k}[/math], [math]k\in\mathbb{N}[/math]
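As a quick numerical illustration (a hypothetical example, not from the lecture), the definitions above can be computed directly from a pmf; here we use a fair six-sided die:

```python
# Raw and central moments of a discrete r.v.: a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6  # uniform pmf

def raw_moment(k):
    """E(X^k) = sum over x of x^k * P(X = x)."""
    return sum(x**k * p for x in outcomes)

mu = raw_moment(1)  # first raw moment, E(X) = 3.5

def central_moment(k):
    """E((X - mu)^k)."""
    return sum((x - mu)**k * p for x in outcomes)

print(mu)                 # expected value
print(central_moment(2))  # second central moment (the variance), 35/12
```

The same two functions give every higher moment by changing [math]k[/math].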
Moments and Functions of Moments
- The variance of [math]X[/math] is [math]\mu_{2}=\sigma^{2}=E\left(\left(X-\mu\right)^{2}\right)[/math]
- The standard deviation of [math]X[/math] is [math]\sigma=\sqrt{\sigma^{2}}=\sqrt{E\left(\left(X-\mu\right)^{2}\right)}[/math]
- The skewness of [math]X[/math] is [math]\alpha_{3}=E\left(\left(\frac{X-\mu}{\sigma}\right)^{3}\right)=\frac{E\left(\left(X-\mu\right)^{3}\right)}{Var\left(X\right)^{\frac{3}{2}}}[/math]
- The kurtosis of [math]X[/math] is [math]\alpha_{4}=E\left(\left(\frac{X-\mu}{\sigma}\right)^{4}\right)=\frac{E\left(\left(X-\mu\right)^{4}\right)}{Var\left(X\right)^{2}}[/math]
Both skewness and kurtosis characterize the shape of the distribution. Both denominators are always positive, and so act as normalizers.
In terms of skewness, centering by [math]\mu[/math] and scaling by [math]\sigma[/math] make [math]\alpha_{3}[/math] invariant to location and scale, so it captures asymmetry alone: a distribution with a longer right tail has positive skewness, one with a longer left tail has negative skewness, and a symmetric distribution has [math]\alpha_{3}=0[/math].
As for kurtosis, both the numerator and denominator capture variability, but the numerator weights outliers more heavily. When [math]\alpha_{4}=3[/math], we say the distribution is mesokurtic: its tails carry the same relative mass and decay at the same rate as those of the normal distribution. When [math]\alpha_{4}\lt 3[/math], the distribution is platykurtic (lighter tails, so it appears flatter than the normal), and when [math]\alpha_{4}\gt 3[/math], we say the distribution is leptokurtic (heavier tails, with more mass far from the center).
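To make the shape measures concrete, here is a small Python sketch (the pmf is a hypothetical illustration) computing [math]\alpha_{3}[/math] and [math]\alpha_{4}[/math] for a distribution with a long right tail, which should give positive skewness:

```python
# Population skewness and kurtosis computed directly from a pmf.
# Hypothetical right-tailed pmf, chosen to illustrate positive skewness.
pmf = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}

mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu)**2 * p for x, p in pmf.items())
sigma = var ** 0.5

# Standardized third and fourth moments.
alpha3 = sum(((x - mu) / sigma)**3 * p for x, p in pmf.items())
alpha4 = sum(((x - mu) / sigma)**4 * p for x, p in pmf.items())

print(alpha3)  # > 0: the long right tail produces positive skewness
print(alpha4)  # compare against 3, the normal benchmark
```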
Standard Normal Distribution
Suppose [math]X\sim N\left(0,1\right)[/math], so [math]X[/math] is continuous with pdf [math]f_{X}\left(x\right)=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^{2}}{2}\right)[/math].
Useful fact: In the case of the normal, [math]xf_{X}\left(x\right)=-\frac{d}{dx}f_{X}\left(x\right)[/math].
So,
[math]E\left(X\right)=\int_{-\infty}^{\infty}tf_{X}\left(t\right)dt=-\int_{-\infty}^{\infty}\frac{d}{dt}f_{X}\left(t\right)dt=\left.-f_{X}\left(t\right)\right|_{-\infty}^{\infty}=-\left(0-0\right)=0[/math]
and
[math]Var\left(X\right)=E\left(\left(X-\mu\right)^{2}\right)=E\left(\left(X-0\right)^{2}\right)=E\left(X^{2}\right)=\int_{-\infty}^{\infty}t^{2}f_{X}\left(t\right)dt[/math]
where integration by parts yields
[math]\int_{-\infty}^{\infty}t^{2}f_{X}\left(t\right)dt=\underset{=0}{\underbrace{-\left.tf_{X}\left(t\right)\right|_{-\infty}^{\infty}}}+\underset{=1}{\underbrace{\int_{-\infty}^{\infty}f_{X}\left(t\right)dt}}=1.[/math]
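These two results can be checked numerically; the following sketch (not part of the lecture) approximates both integrals with a Riemann sum over [math]\left[-8,8\right][/math], outside of which the normal tails are negligible:

```python
import math

# Numeric check of E(X) = 0 and E(X^2) = 1 for the standard normal.
def phi(x):
    """Standard normal pdf."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

n = 16000
h = 16 / n
grid = [-8 + i * h for i in range(n + 1)]

mean = sum(t * phi(t) * h for t in grid)        # approximates E(X)
second = sum(t * t * phi(t) * h for t in grid)  # approximates E(X^2)

print(round(mean, 6), round(second, 6))
```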
Cauchy Distribution
In this case, [math]f_{X}\left(x\right)=\frac{1}{\pi}\frac{1}{1+x^{2}}[/math], and [math]\int_{-\infty}^{\infty}\left|t\right|f_{X}\left(t\right)dt=\infty[/math] (i.e., its first moment does not exist).
Also, notice that because [math]\left|t\right|^{k}\geq\left|t\right|[/math] whenever [math]\left|t\right|\geq1[/math] and [math]k\gt 1[/math], while the contribution of the integral over [math]\left[-1,1\right][/math] is always finite, the divergence of [math]\int_{-\infty}^{\infty}\left|t\right|f_{X}\left(t\right)dt[/math] forces [math]\int_{-\infty}^{\infty}\left|t\right|^{k}f_{X}\left(t\right)dt=\infty[/math] as well: the nonexistence of a moment implies the nonexistence of all higher moments.
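The divergence of the Cauchy first moment can be seen in closed form: the truncated integral [math]\int_{-A}^{A}\left|t\right|f_{X}\left(t\right)dt=\frac{1}{\pi}\ln\left(1+A^{2}\right)[/math], which grows without bound in [math]A[/math]. A short sketch (not from the lecture) evaluates it at increasing truncation points:

```python
import math

# Truncated absolute first moment of the standard Cauchy:
# integral of |t| / (pi * (1 + t^2)) over [-A, A] equals ln(1 + A^2) / pi.
def truncated_abs_moment(A):
    return math.log(1 + A * A) / math.pi

# Evaluate at A = 10, 100, ..., 100000: the sequence keeps growing,
# so the full integral (A -> infinity) diverges.
vals = [truncated_abs_moment(10.0 ** k) for k in range(1, 6)]
print(vals)
```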
Some Useful Identities
- [math]Var\left(X\right)=E\left(X^{2}\right)-E\left(X\right)^{2}[/math].
- [math]E\left(aX+b\right)=aE\left(X\right)+b[/math].
- [math]Var\left(aX+b\right)=a^{2}Var\left(X\right)[/math].
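The three identities above can be verified on any small pmf; here is a Python sketch (the pmf and the values of [math]a,b[/math] are arbitrary illustrations):

```python
# Verify E(aX + b) = aE(X) + b and Var(aX + b) = a^2 Var(X)
# on a hypothetical discrete pmf.
pmf = {-1: 0.25, 0: 0.5, 2: 0.25}
a, b = 3.0, -2.0

def E(g):
    """Expectation of g(X) under the pmf."""
    return sum(g(x) * p for x, p in pmf.items())

mean_X = E(lambda x: x)
var_X = E(lambda x: x**2) - mean_X**2  # Var(X) = E(X^2) - E(X)^2

mean_Y = E(lambda x: a * x + b)                  # E(aX + b)
var_Y = E(lambda x: (a * x + b)**2) - mean_Y**2  # Var(aX + b)

print(mean_Y, a * mean_X + b)  # should agree
print(var_Y, a * a * var_X)    # should agree
```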
Normal Distribution
An r.v. [math]X[/math] is normally distributed with mean [math]\mu[/math] and variance [math]\sigma^{2}[/math], denoted as [math]X\sim N\left(\mu,\sigma^{2}\right)[/math], if [math]X[/math] is continuous with pdf [math]f_{X}\left(x\right)=\frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left(-\frac{\left(x-\mu\right)^{2}}{2\sigma^{2}}\right),x\in\mathbb{R}[/math].
Here’s a helpful fact: If [math]Z\sim N\left(0,1\right)[/math], then [math]X=\mu+\sigma Z\sim N\left(\mu,\sigma^{2}\right)[/math].
From this, it follows that:
- [math]E\left(X\right)=E\left(\mu+\sigma Z\right)=\mu+\sigma\underset{=0}{\underbrace{E\left(Z\right)}}=\mu.[/math]
- [math]Var\left(X\right)=Var\left(\mu+\sigma Z\right)=\sigma^{2}\underset{=1}{\underbrace{Var\left(Z\right)}}=\sigma^{2}.[/math]
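As a final numerical check (a sketch, with [math]\mu=2,\sigma=1.5[/math] chosen arbitrarily for illustration), we can integrate [math]\mu+\sigma z[/math] against the standard normal pdf and confirm the mean and variance come out as [math]\mu[/math] and [math]\sigma^{2}[/math]:

```python
import math

# Check E(mu + sigma*Z) = mu and Var(mu + sigma*Z) = sigma^2
# by Riemann-sum integration against the standard normal pdf.
mu, sigma = 2.0, 1.5  # arbitrary illustration values

def phi(z):
    """Standard normal pdf."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

n = 16000
h = 16 / n
grid = [-8 + i * h for i in range(n + 1)]

mean_X = sum((mu + sigma * z) * phi(z) * h for z in grid)
var_X = sum((mu + sigma * z - mean_X)**2 * phi(z) * h for z in grid)

print(round(mean_X, 4), round(var_X, 4))  # approximately mu and sigma^2
```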