# Partitioned Regression

Partitioned regression is a method to understand how some parameters in OLS depend on others. Consider the decomposition of the linear regression equation

\begin{aligned} & y=X\beta+\varepsilon\\ \Leftrightarrow & y=\left[\begin{array}{cc} X_{1} & X_{2}\end{array}\right]\left[\begin{array}{c} \beta_{1}\\ \beta_{2} \end{array}\right]+\varepsilon\end{aligned}

where we partition the parameter vector $\beta$ into two sub-vectors $\beta_{1}$ and $\beta_{2}$, and the regressor matrix $X$ conformably into $X_{1}$ and $X_{2}$.

What is $\widehat{\beta}_{2}$?

Starting with the OLS normal equations,

\begin{aligned} & X^{'}X\beta=X^{'}y\\ \Leftrightarrow & \left[\begin{array}{c} X_{1}^{'}\\ X_{2}^{'} \end{array}\right]\left[\begin{array}{cc} X_{1} & X_{2}\end{array}\right]\left[\begin{array}{c} \beta_{1}\\ \beta_{2} \end{array}\right]=\left[\begin{array}{c} X_{1}^{'}\\ X_{2}^{'} \end{array}\right]y\\ \Leftrightarrow & \left[\begin{array}{cc} X_{1}^{'}X_{1} & X_{1}^{'}X_{2}\\ X_{2}^{'}X_{1} & X_{2}^{'}X_{2} \end{array}\right]\left[\begin{array}{c} \beta_{1}\\ \beta_{2} \end{array}\right]=\left[\begin{array}{c} X_{1}^{'}\\ X_{2}^{'} \end{array}\right]y\\ \Leftrightarrow & \left\{ \begin{array}{c} X_{1}^{'}X_{1}\beta_{1}+X_{1}^{'}X_{2}\beta_{2}=X_{1}^{'}y\\ X_{2}^{'}X_{1}\beta_{1}+X_{2}^{'}X_{2}\beta_{2}=X_{2}^{'}y \end{array}\right.\end{aligned}

Label the two equations immediately above as (1) and (2). Now, premultiply equation (1) by $X_{2}^{'}X_{1}\left(X_{1}^{'}X_{1}\right)^{-1}$ to obtain

\begin{aligned} & X_{2}^{'}X_{1}\left(X_{1}^{'}X_{1}\right)^{-1}X_{1}^{'}X_{1}\beta_{1}+X_{2}^{'}X_{1}\left(X_{1}^{'}X_{1}\right)^{-1}X_{1}^{'}X_{2}\beta_{2}=X_{2}^{'}X_{1}\left(X_{1}^{'}X_{1}\right)^{-1}X_{1}^{'}y\\ \Leftrightarrow & X_{2}^{'}X_{1}\beta_{1}+X_{2}^{'}X_{1}\left(X_{1}^{'}X_{1}\right)^{-1}X_{1}^{'}X_{2}\beta_{2}=X_{2}^{'}X_{1}\left(X_{1}^{'}X_{1}\right)^{-1}X_{1}^{'}y\end{aligned}

Subtracting this equation from equation (2) eliminates $\beta_{1}$ and yields:

$\left(X_{2}^{'}X_{2}-X_{2}^{'}X_{1}\left(X_{1}^{'}X_{1}\right)^{-1}X_{1}^{'}X_{2}\right)\beta_{2}=\left[X_{2}^{'}-X_{2}^{'}X_{1}\left(X_{1}^{'}X_{1}\right)^{-1}X_{1}^{'}\right]y$

Now, let $P_{1}=X_{1}\left(X_{1}^{'}X_{1}\right)^{-1}X_{1}^{'}$, to get

\begin{aligned} & X_{2}^{'}\left(I-P_{1}\right)X_{2}\beta_{2}=X_{2}^{'}\left(I-P_{1}\right)y\\ \Leftrightarrow & \widehat{\beta_{2}}=\left[X_{2}^{'}\left(I-P_{1}\right)X_{2}\right]^{-1}X_{2}^{'}\left(I-P_{1}\right)y\end{aligned}
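As a quick numerical check, the sketch below (using simulated data; the variable names and dimensions are illustrative, not from the text) verifies that the partitioned formula for $\widehat{\beta_{2}}$ matches the corresponding block of the full OLS solution:

```python
import numpy as np

# Simulated data: X1 holds an intercept plus one regressor, X2 holds two more.
rng = np.random.default_rng(0)
n = 100
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = rng.normal(size=(n, 2))
X = np.hstack([X1, X2])
y = X @ np.array([1.0, 2.0, -0.5, 0.3]) + rng.normal(size=n)

# Full OLS: solve the normal equations X'X beta = X'y.
beta_full = np.linalg.solve(X.T @ X, X.T @ y)

# Partitioned formula: beta2_hat = [X2'(I - P1)X2]^{-1} X2'(I - P1)y,
# with P1 = X1 (X1'X1)^{-1} X1'.
P1 = X1 @ np.linalg.solve(X1.T @ X1, X1.T)
M1 = np.eye(n) - P1
beta2_part = np.linalg.solve(X2.T @ M1 @ X2, X2.T @ M1 @ y)

print(np.allclose(beta_full[2:], beta2_part))  # True
```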

In order to interpret this equation, we need to understand the meaning of matrix $P_{1}$. In linear algebra, this matrix is called a projection matrix.

## Projections

Let $P_{X}=X\left(X^{'}X\right)^{-1}X^{'}$. When multiplied by a vector, matrix $P_{X}$ yields another vector that can be obtained by a weighted sum of vectors in $X$. Consider the following representation, which applies to the case where $N=3$ and $K=2$.

When multiplied by vector $y$, matrix $P_{X}$ yields vector $P_{X}y$, which lives in the column space of $X$, i.e., the space spanned by the columns of $X$. Any vector in $Col\left(X\right)$ can be written as a linear combination of the columns of $X$. In fact, notice that $P_{X}y=X\left(X^{'}X\right)^{-1}X^{'}y=X\widehat{\beta}_{OLS}$, i.e., it is the OLS prediction of $y$.

As for matrix $I-P_{X}$, when multiplied by $y$ it produces a vector that is orthogonal to the column space of $X$; in the figure above, it corresponds to the vertical dashed vector. Notice that

$\left(I-P_{X}\right)y=y-\widehat{y}=\widehat{\varepsilon},$

i.e., this matrix produces the vector of estimated residuals, which is orthogonal (in the geometric sense) to the column space of $X$.

Projection matrices are symmetric ($P_{X}^{'}=P_{X}$) and idempotent ($P_{X}P_{X}=P_{X}$), the latter meaning that repeated self-multiplication always yields the projection matrix itself.
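These properties are easy to verify numerically. The sketch below (with arbitrary simulated data) checks symmetry and idempotence, and confirms that $P_{X}y$ is the OLS fit while $\left(I-P_{X}\right)y$ is the residual vector, orthogonal to the columns of $X$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2
X = rng.normal(size=(n, k))
y = rng.normal(size=n)

PX = X @ np.linalg.solve(X.T @ X, X.T)  # projection onto Col(X)
MX = np.eye(n) - PX                     # projection onto the orthogonal complement

# Symmetry and idempotence
assert np.allclose(PX, PX.T)
assert np.allclose(PX @ PX, PX)

# PX @ y is the OLS fitted value; MX @ y is the residual, orthogonal to Col(X)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(PX @ y, X @ beta_hat)
assert np.allclose(X.T @ (MX @ y), 0)
```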

## Partitioned Regression (cont.)

Using the properties of projection matrices, the equation

$\widehat{\beta_{2}}=\left[X_{2}^{'}\left(I-P_{1}\right)X_{2}\right]^{-1}X_{2}^{'}\left(I-P_{1}\right)y$

can be rewritten as

$\widehat{\beta_{2}}=\left[X_{2}^{*'}X_{2}^{*}\right]^{-1}X_{2}^{*'}y^{*}$

where

$X_{2}^{*}=\left(I-P_{1}\right)X_{2}$ and $y^{*}=\left(I-P_{1}\right)y$ (notice that we are using both the symmetry and the idempotence of $I-P_{1}$, so that $\left(I-P_{1}\right)^{'}\left(I-P_{1}\right)=I-P_{1}$).

Notice that $y^{*}=\left(I-P_{1}\right)y$ is the vector of residuals from regressing $y$ on $X_{1}$, and $X_{2}^{*}=\left(I-P_{1}\right)X_{2}$ collects the residuals from regressing each of the variables in $X_{2}$ on $X_{1}$. Finally, $\widehat{\beta_{2}}$ is obtained by regressing the residuals $y^{*}$ on $X_{2}^{*}$. This result is known as the Frisch-Waugh-Lovell theorem.
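The two-step recipe above can be sketched directly (again with simulated data; the helper `resid` is an illustrative name, not from the text): residualize $y$ and $X_{2}$ on $X_{1}$, then regress the former residuals on the latter, and compare against the full OLS fit.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = rng.normal(size=(n, 2))
X = np.hstack([X1, X2])
y = X @ np.array([0.5, 1.0, -1.0, 2.0]) + rng.normal(size=n)

def resid(A, Z):
    """Residuals from regressing (each column of) A on Z, i.e. (I - P_Z) A."""
    return A - Z @ np.linalg.solve(Z.T @ Z, Z.T @ A)

y_star = resid(y, X1)     # residuals of y on X1
X2_star = resid(X2, X1)   # residuals of each column of X2 on X1

# Regress y* on X2*: recovers the beta2 block of the full OLS solution
beta2_two_step = np.linalg.solve(X2_star.T @ X2_star, X2_star.T @ y_star)
beta_full = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose(beta2_two_step, beta_full[2:]))  # True
```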

Beyond clarifying how OLS operates, partitioned regression can be used to derive the variance of two-stage estimators, in which first-stage estimates are plugged into a second stage that produces additional estimates. It can also inform variable-selection problems.