Lecture 11. G) Setting the Critical Value

From Significant Statistics

Setting the Critical Value

Suppose we would like the probability of a type 1 error to equal 5% exactly. The way to do this is to start with an unknown threshold [math]c[/math], write down the probability of a type 1 error, set it equal to 5%, and solve for [math]c[/math].

We rewrite our rule to reject [math]H_{0}[/math] when [math]\overline{X}\gt c[/math].

Then, [math]P_{\mu}\left(\text{type 1 error}\right)=P_{\mu}\left(\overline{X}\gt c\right)=1-P_{\mu}\left(\overline{X}\leq c\right)=1-\Phi\left(\frac{c-\mu}{\sqrt{\frac{1}{n}}}\right)[/math].

Under the null hypothesis that [math]\mu=0[/math], and for [math]n=20[/math], we want [math]P_{\mu}\left(\text{type 1 error}\right)=1-\Phi\left(c\sqrt{20}\right)=0.05[/math]. The solution is [math]c=\Phi^{-1}\left(0.95\right)/\sqrt{20}[/math]; since [math]\Phi^{-1}[/math] does not admit a closed form, we solve numerically, obtaining [math]c\simeq0.368[/math].

At this value, [math]P_{\mu}\left(\text{type 1 error}\right)=0.05[/math].
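As a quick check, the critical value can be computed directly with Python's standard library (a minimal sketch, assuming [math]\sigma^{2}=1[/math] as in the formula above):

```python
import math
from statistics import NormalDist  # standard-normal CDF and quantile (Python 3.8+)

n = 20
alpha = 0.05

# Solve 1 - Phi(c * sqrt(n)) = alpha  =>  c = Phi^{-1}(1 - alpha) / sqrt(n)
c = NormalDist().inv_cdf(1 - alpha) / math.sqrt(n)
print(round(c, 3))  # roughly 0.368
```

The only numerical step is the quantile [math]\Phi^{-1}\left(0.95\right)\simeq1.645[/math]; everything else is arithmetic.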

It is intuitive that as we increase the critical value (from zero in the previous example to 0.368 in this one), the probability of a type 1 error decreases (from 50% to 5%).

First, fix [math]\mu=0[/math], and imagine drawing multiple sample means. Clearly, fewer of them fall above [math]c=0.368[/math] than above [math]c=0[/math]. Given that we reject [math]H_{0}[/math] when [math]\overline{X}\gt c[/math], the likelihood of rejection decreases as [math]c[/math] increases. With [math]n=20[/math], at [math]c=0.368[/math] it is exactly 0.05.
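This thought experiment is easy to simulate: draw many sample means under the null and count how often they exceed the critical value (a sketch; the number of replications is an arbitrary choice):

```python
import math
import random

random.seed(0)
n, c, trials = 20, 0.368, 100_000

# Under H0 (mu = 0, sigma^2 = 1), the sample mean is distributed N(0, 1/n).
# Count how often a simulated sample mean lands above the critical value c.
rejections = sum(random.gauss(0, 1 / math.sqrt(n)) > c for _ in range(trials))
print(rejections / trials)  # close to 0.05
```

Replacing `c = 0.368` with `c = 0` makes the rejection frequency climb to about one half, matching the earlier example.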

A note on [math]n[/math]

You may feel a bit uncomfortable about the fact that we fixed [math]n[/math] in these examples. Clearly, [math]n[/math] affects the choice of critical value to use. For example, if [math]n[/math] were very high, a very small deviation from zero could already justify rejecting the null hypothesis.

When [math]n[/math] is given, this is not an issue. However, when the researcher has the ability to set [math]n[/math], she has two degrees of freedom, [math]c[/math] and [math]n[/math]. This can be useful in experimental design and data collection: with good information about [math]\sigma^{2}[/math], one can select [math]c[/math] and [math]n[/math] to fix the probability of a type 1 error and the probability of a type 2 error simultaneously (the latter at a specific value of [math]\mu[/math]), since the two requirements form a system of two equations in two unknowns.
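The two-equation system can be solved in closed form once we name the targets. In the sketch below, the alternative value [math]\mu_{1}=0.5[/math] and the type 2 error target [math]\beta=0.10[/math] are illustrative assumptions, not values from the text:

```python
import math
from statistics import NormalDist

sigma = 1.0   # assumed known standard deviation
mu1 = 0.5     # hypothetical alternative at which we control the type 2 error
alpha = 0.05  # target probability of a type 1 error (at mu = 0)
beta = 0.10   # target probability of a type 2 error (at mu = mu1)

z_a = NormalDist().inv_cdf(1 - alpha)
z_b = NormalDist().inv_cdf(1 - beta)

# Two equations in (c, n):
#   P_0(Xbar > c)    = alpha     =>  c = z_a * sigma / sqrt(n)
#   P_mu1(Xbar > c)  = 1 - beta  =>  (mu1 - c) * sqrt(n) / sigma = z_b
# Eliminating c gives sqrt(n) = (z_a + z_b) * sigma / mu1.
n = math.ceil(((z_a + z_b) * sigma / mu1) ** 2)
c = z_a * sigma / math.sqrt(n)
print(n, round(c, 3))
```

Rounding [math]n[/math] up to an integer makes both error probabilities at most their targets rather than exactly equal to them.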

Composite [math]H_{0}[/math]

We could have designed a test with a composite null hypothesis. In this case, we could select [math]c[/math], for example, by solving the problem

[math]\begin{aligned} \max_{c\in\mathbb{R},\mu\in\Theta_{0}} & \,P_{\mu}\left(\overline{X}\gt c\right)\\ \text{s.t. } & P_{\mu}\left(\overline{X}\gt c\right)\leq0.05,\end{aligned}[/math]

where

[math]\Theta_{0}[/math] is the set of [math]\mu[/math]’s contemplated in the null hypothesis.
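For a concrete illustration, take the hypothetical composite null [math]\Theta_{0}=\left\{ \mu\leq0\right\}[/math] (an assumed choice, not one made in the text). Since [math]P_{\mu}\left(\overline{X}\gt c\right)=1-\Phi\left(\left(c-\mu\right)\sqrt{n}\right)[/math] is increasing in [math]\mu[/math], the worst case over [math]\Theta_{0}[/math] sits at the boundary [math]\mu=0[/math], so the same [math]c\simeq0.368[/math] still works:

```python
import math
from statistics import NormalDist

n, c = 20, 0.368
Phi = NormalDist().cdf

# Scan the rejection probability over a grid of mu values in the
# hypothetical composite null Theta_0 = {mu <= 0}, here mu in [-2, 0].
grid = [-i / 100 for i in range(201)]
worst = max(1 - Phi((c - mu) * math.sqrt(n)) for mu in grid)
print(round(worst, 3))  # the maximum is attained at the boundary mu = 0
```

The grid search confirms the monotonicity argument: the constraint binds only at [math]\mu=0[/math].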

However, designing a test with a simple null hypothesis is simpler, because we only require the rejection probability to equal the specified level at a single value of [math]\mu[/math].