Lecture 18. A) Multicollinearity


Multicollinearity

Consider the case where [math]X[/math] is given by:

[math]X=\overset{\begin{array}{ccc} \beta_{0} & \beta_{1} & \beta_{2}\end{array}}{\left[\begin{array}{ccc} 1 & 1 & 0\\ 1 & 0 & 1\\ 1 & 1 & 0\\ 1 & 0 & 1\\ 1 & 1 & 0 \end{array}\right]}[/math]

Notice that [math]\beta_{0}=0;\beta_{1}=1;\beta_{2}=1[/math] predicts the same values of [math]y_{i}[/math] as [math]\beta_{0}=1;\beta_{1}=0;\beta_{2}=0[/math]. This happens because the first column of [math]X[/math] equals the sum of the other two, so the columns are linearly dependent and [math]X^{'}X[/math] is not invertible. In this case, there is no unique solution to [math]\widehat{\beta}_{OLS}=\text{argmin}_{\beta}\left(y-X\beta\right)^{'}\left(y-X\beta\right).[/math]
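
A quick numerical check of this example (a minimal sketch, assuming NumPy is available; the variable names are mine):

```python
import numpy as np

# The design matrix X from the example: the first column (the constant)
# equals the sum of the other two, so the columns are linearly dependent.
X = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 1, 0],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)

beta_a = np.array([0.0, 1.0, 1.0])   # beta_0 = 0, beta_1 = 1, beta_2 = 1
beta_b = np.array([1.0, 0.0, 0.0])   # beta_0 = 1, beta_1 = 0, beta_2 = 0

print(X @ beta_a)   # [1. 1. 1. 1. 1.]
print(X @ beta_b)   # [1. 1. 1. 1. 1.]  -> identical fitted values

# X'X is singular, so it cannot be inverted and OLS has no unique solution.
print(np.linalg.matrix_rank(X.T @ X))   # 2, not 3
```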

Issues also arise when two variables are almost collinear. In this case, it becomes difficult to identify the parameters of the two highly correlated variables separately. Small perturbations of the data can shift significance from one parameter to the other, and often only one of the two parameters will appear significant (although dropping the significant regressor will typically make the parameter of the remaining regressor significant).
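
This instability is easy to reproduce in a small simulation. The sketch below is illustrative only; the data-generating process, sample size, and perturbation size are assumptions, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)          # x2 is almost identical to x1
y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])

# OLS on the original data
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# OLS after a tiny perturbation of the outcome
beta_pert, *_ = np.linalg.lstsq(X, y + 0.01 * rng.normal(size=n), rcond=None)

print(beta_hat)    # the individual slopes are estimated very imprecisely...
print(beta_pert)   # ...and can change noticeably after a small perturbation,
                   # while the joint effect beta_1 + beta_2 stays close to 4.
```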

If one cares only about prediction, separating the two coefficients may not be crucial: the joint effect is what the researcher is interested in, and it matters little that the effects of the individual regressors are hard to disentangle. On the other hand, if one cares about the specific effect of each variable, there is no easy way around the problem.

Often, multicollinearity arises because of the so-called “dummy trap”. In the example above, [math]x_{1}[/math] could indicate young age and [math]x_{2}[/math] old age. By including a dummy for every possible case as well as a constant, we have effectively introduced “too many cases” and induced multicollinearity. The fix is to drop one of the dummies: if only [math]x_{1}[/math] is included, an observation with [math]x_{1}=0[/math] is old and its expected outcome is captured by [math]\beta_{0}[/math], while for an observation with [math]x_{1}=1[/math] (young) the expected outcome is [math]\beta_{0}+\beta_{1}[/math]. Alternatively, one could keep both dummies and remove the constant from the model.
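
A small sketch of the dummy trap, assuming [math]x_{1}[/math] indicates “young” and [math]x_{2}[/math] indicates “old” as above: including both dummies together with a constant produces a rank-deficient design, while dropping either one dummy or the constant restores full column rank.

```python
import numpy as np

young = np.array([1, 0, 1, 0, 1], dtype=float)
old = 1.0 - young                      # the two dummies always sum to one
const = np.ones(5)

X_full = np.column_stack([const, young, old])    # constant + both dummies
X_drop_dummy = np.column_stack([const, young])   # drop one dummy: "old" is the baseline
X_drop_const = np.column_stack([young, old])     # or drop the constant instead

print(np.linalg.matrix_rank(X_full))        # 2 out of 3 columns -> rank deficient (dummy trap)
print(np.linalg.matrix_rank(X_drop_dummy))  # 2 out of 2 columns -> full column rank
print(np.linalg.matrix_rank(X_drop_const))  # 2 out of 2 columns -> full column rank
```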

If multicollinearity is not due to the dummy trap, one can use specific regression methods (e.g., ridge regression, principal components regression) that shrink coefficients or combine variables according to specific criteria. These techniques are especially useful in regressions with many variables, sometimes even when [math]K\gt N[/math]. However, they do not solve the problem of separately identifying the effect of each variable. Ideally, collecting more or better data will solve the problem.
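
As an illustration, here is a minimal sketch of ridge regression applied to the collinear design from the start of this section, using the closed-form estimator [math]\left(X^{'}X+\lambda I\right)^{-1}X^{'}y[/math]. The penalty value and the made-up response vector are arbitrary choices, and the intercept is penalized here only to keep the sketch short.

```python
import numpy as np

X = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 1, 0],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
y = np.array([1.0, 1.0, 1.0, 1.0, 1.0])   # made-up response for illustration

lam = 0.1
K = X.shape[1]

# Ridge estimator: (X'X + lam * I)^{-1} X'y.
# The penalty makes the matrix invertible even though X'X itself is singular,
# so a unique (shrunken) estimate exists despite the perfect collinearity.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(K), X.T @ y)
print(beta_ridge)
```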