Lecture 11. C) Variation on a Theme

From Significant Statistics

Variation on the Theme

Let’s stop for a second and ask: “Why are we even doing this?” Why is it so important to determine whether [math]\mu=0[/math] or [math]\mu\gt 0[/math]? Why not simply estimate [math]\mu[/math] through maximum likelihood or the method of moments, and use whatever information we obtain? If we estimate [math]\mu=0.3[/math], well, then maybe [math]\mu[/math] is indeed 0.3 for all we know.

The debate about hypothesis testing dates back to the early 20th century, primarily between Ronald Fisher and Jerzy Neyman, and continued for several decades. A significant part of the debate was philosophical. What we teach and study today is a product of that debate, taking mostly the approach of Neyman and his co-author Egon Pearson, but still informed by ideas from Ronald Fisher. Fisher’s motivation was in part whether to maintain or reject a currently held scientific hypothesis. If the data disagreed with the current hypothesis sufficiently, in some formal way, then one could do away with it. In contrast, Neyman and Pearson’s approach pits two hypotheses against each other, and favors one or the other.

Over time, Neyman and Pearson’s approach gained momentum, probably because of the formal tools it relies on, as well as the Neyman-Pearson lemma, which establishes a form of optimality when selecting a test. The approach is agnostic in terms of the scientific method. In practice, though, the natural sciences employ this method conservatively: a current theory is disproved only if the observations from an experiment disagree with it and, statistically speaking, the chance that such a disagreement arose simply because of randomness is very, very small. (For example, our theory predicted that the chance of observing a sample mean higher than 0 was 0.00001%; yet we observed it.)
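To make the "very, very small chance" idea concrete, here is a minimal sketch of how such a probability could be computed. It is only an illustration under assumptions that are not part of this section: the data are taken to be normal with a known standard deviation `sigma`, the current theory says [math]\mu=0[/math], and the function name `upper_tail_probability` is made up for this example.

```python
import math


def upper_tail_probability(sample_mean, sigma, n, mu0=0.0):
    """Chance of a sample mean at least this large if the true mean is mu0.

    Assumes (for illustration only) n independent normal observations
    with known standard deviation sigma.
    """
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Upper tail of the standard normal: 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    # The same estimate (0.3) is weak or strong evidence depending on n.
    for n in (10, 400):
        p = upper_tail_probability(sample_mean=0.3, sigma=1.0, n=n)
        print(f"n = {n:3d}: chance under mu = 0 is {p:.3g}")
```

With these made-up numbers, an estimate of 0.3 from 10 observations would happen by pure randomness about 17% of the time, while the same estimate from 400 observations would be a roughly one-in-a-billion event. This is one way to see why the estimate alone does not settle whether [math]\mu=0[/math] or [math]\mu\gt 0[/math].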

In the social sciences, there exists a mild debate about how conservative hypothesis testing should be. Because humans are so volatile, data about their behavior is not always good enough to convincingly disprove a theory. And clearly, the social sciences face challenges in replicating experiments while keeping conditions completely stable. So, some authors have proposed that the dichotomous approach of lending support to one theory or another by partitioning the parameter space into two is inadequate for the social sciences. Instead, researchers should keep track of the parameters estimated over time, in different studies and experiments, and use that full information rather than a two-way verdict.

Some related debates still take place: For example, does the dichotomous approach provide too much freedom/incentive for scientists to interfere with experimental results?

The point of this section is to sensitize you to the fact that, despite its mathematical language, hypothesis testing is a tool that does the job you ask of it. By studying it and understanding it well, you will be able to decide whether it solves the particular problem you face, whether it needs a tweak or two, or whether it is completely inapplicable or inappropriate.

Keep in mind, though, that the theory is intricate and sometimes deceptive. The people I’ve talked to who know the most in the world about this area of statistics say as much. You may also want to keep in mind that many people know only a bit about it, yet will speak as if they were experts.

With that, let’s proceed into the jewel of the modern scientific method.