II.II.1 OLS for Multiple Regression
The general linear statistical model can be described in matrix notation as

y = Xß + e    (II.II.11)

where y is a stochastic T*1 vector, X is a deterministic (exogenous) T*K matrix, ß is a K*1 vector of invariant parameters to be estimated by OLS, e is a T*1 disturbance vector, T is the number of observations in the sample, and K is the number of exogenous variables on the right-hand side of the econometric equation.
It is furthermore assumed that

E(e) = 0 and E(ee') = σ^{2}I    (II.II.12)

which is the equivalent matrix expression of the weak set of assumptions under section II.I.3.
The least squares estimator minimizes e'e (the sum of squared residuals). Solving the normal equations X'Xb = X'y for b yields

b = (X'X)^{-1}X'y    (II.II.13)

where X'X must be a nonsingular symmetric K*K matrix.
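As an illustration (not part of the original text), the estimator (II.II.13) can be computed directly with NumPy; the variable names T, K, X, y, and b mirror the notation above, and the data are simulated for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 50, 3                       # observations and regressors

X = np.column_stack([np.ones(T),   # constant term
                     rng.normal(size=(T, K - 1))])
beta = np.array([1.0, 2.0, -0.5])  # true (invariant) parameters
e = rng.normal(size=T)             # disturbance vector
y = X @ beta + e

# OLS estimator b = (X'X)^{-1} X'y; solving the normal equations
# X'X b = X'y is numerically preferable to inverting X'X explicitly.
b = np.linalg.solve(X.T @ X, X.T @ y)

# np.linalg.lstsq minimizes e'e directly and gives the same solution.
b_check, *_ = np.linalg.lstsq(X, y, rcond=None)
```

In practice one would rarely invert X'X; a solver or a QR-based least-squares routine is the standard choice.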
The OLS estimator is unbiased,

E(b) = ß + (X'X)^{-1}E(X'e) = ß    (II.II.14)

since E(X'e) = 0 by assumption (X is exogenous). Note that if X is not exogenously given (and thus stochastic), the small-sample property of unbiasedness only holds if E(X'e) = 0.
Under the assumptions of OLS it can be proved that the covariance matrix of the parameters is

V(b) = E((b - ß)(b - ß)') = σ^{2}(X'X)^{-1}    (II.II.15)
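A small Monte Carlo sketch (my own illustration, with simulated data and a fixed regressor matrix) makes (II.II.14) and (II.II.15) concrete: over repeated samples the average of b approaches ß, and the empirical covariance of the draws approaches σ^{2}(X'X)^{-1}.

```python
import numpy as np

rng = np.random.default_rng(1)
T, K, sigma = 40, 2, 1.5
X = np.column_stack([np.ones(T), rng.normal(size=T)])  # fixed (exogenous) X
beta = np.array([0.5, 2.0])
XtX_inv = np.linalg.inv(X.T @ X)

# Draw many samples from y = X beta + e and re-estimate b each time.
draws = np.array([
    XtX_inv @ X.T @ (X @ beta + sigma * rng.normal(size=T))
    for _ in range(20000)
])

mean_b = draws.mean(axis=0)          # close to beta           (II.II.14)
cov_b = np.cov(draws, rowvar=False)  # close to the theoretical
theory = sigma**2 * XtX_inv          # sigma^2 (X'X)^{-1}      (II.II.15)
```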
The Gauss-Markov theorem states that if

E(e) = 0 and E(ee') = σ^{2}I    (II.II.16)

then any other linear unbiased estimator

b^{*} = D^{*}y with E(b^{*}) = ß    (II.II.17)

has a parameter covariance matrix which is at least as large as the covariance matrix of the OLS parameters,

V(b^{*}) - V(b) is positive semidefinite    (II.II.18)

This important theorem therefore proves that the OLS estimator is a best linear unbiased estimator (BLUE).
If D^{*} is a K*T matrix which is independent of y, and if

b^{*} = D^{*}y    (II.II.19)

the parameter vector is by definition a linear estimator, and if

D^{*} = (X'X)^{-1}X' + D    (II.II.110)

then it follows that

E(b^{*}) = ß + DXß + E(D^{*}e)    (II.II.111)

Evidently, it follows from (II.II.111) that the parameter vector can only be unbiased if DX = 0 and if E(D^{*}e) = 0.
Now what happens to the covariance matrix of this estimator? Using DX = 0 we find

V(b^{*}) = σ^{2}D^{*}D^{*}' = σ^{2}((X'X)^{-1} + DD')    (II.II.112)

which proves the theorem, since DD' is positive semidefinite (compare (II.II.112) with (II.II.15); Q.E.D.).
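The theorem can be illustrated numerically (my own sketch, with simulated data): build an alternative D^{*} = (X'X)^{-1}X' + D with DX = 0 by projecting an arbitrary matrix C off the column space of X, and compare the two covariance matrices.

```python
import numpy as np

rng = np.random.default_rng(2)
T, K, sigma = 30, 2, 1.0
X = np.column_stack([np.ones(T), rng.normal(size=T)])
XtX_inv = np.linalg.inv(X.T @ X)
D_ols = XtX_inv @ X.T                   # the OLS choice of D^{*}

# M = I - X(X'X)^{-1}X' annihilates X (M X = 0), so D = C M
# satisfies D X = 0 for any K*T matrix C.
M = np.eye(T) - X @ D_ols
C = 0.2 * rng.normal(size=(K, T))
D = C @ M                               # D X = 0 by construction
D_alt = D_ols + D                       # alternative linear unbiased estimator

V_ols = sigma**2 * XtX_inv              # (II.II.15)
V_alt = sigma**2 * (XtX_inv + D @ D.T)  # (II.II.112)

# V_alt - V_ols = sigma^2 D D' must be positive semidefinite.
eigvals = np.linalg.eigvalsh(V_alt - V_ols)
```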
It can be proved that

E(s^{2}) = σ^{2}, with s^{2} = e'e/(T - K)    (II.II.113)

where e = y - Xb now denotes the OLS residual vector; this states that the OLS estimator of the variance is unbiased. The operational formula for calculating this variance is

s^{2} = (y'y - b'X'y)/(T - K)    (II.II.114)
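The two forms of the variance estimator, e'e/(T - K) and the operational formula (II.II.114), can be checked against each other; the data below are simulated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
T, K = 60, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
y = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=T)

b = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ b                           # residual vector e

s2_resid = resid @ resid / (T - K)          # s^2 = e'e / (T - K)
s2_oper = (y @ y - b @ X.T @ y) / (T - K)   # (y'y - b'X'y) / (T - K)
```

The equality follows from the normal equations: e'e = y'y - 2b'X'y + b'X'Xb and X'Xb = X'y.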
The prediction of y values outside the sample range is

ŷ_{f} = X_{f}b    (II.II.115)

where X_{f} contains the out-of-sample values of the exogenous variables; this is an unbiased prediction function, since

E(ŷ_{f} - y_{f}) = E(X_{f}(b - ß) - e_{f}) = 0    (II.II.116)
(Example of an extrapolation forecast: figure omitted.)
The point forecast error can be found as

ŷ_{f} - y_{f} = X_{f}(b - ß) - e_{f}, with variance σ^{2}(1 + X_{f}(X'X)^{-1}X_{f}')    (II.II.117)

for a single out-of-sample row X_{f}, whereas the variance of the average (mean) forecast error is equal to

σ^{2}X_{f}(X'X)^{-1}X_{f}'    (II.II.118)
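A sketch of an out-of-sample prediction with the two forecast error variances (point versus average forecast); the row X_{f} and all data are simulated assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(4)
T, sigma = 80, 1.0
X = np.column_stack([np.ones(T), rng.normal(size=T)])
beta = np.array([2.0, 1.0])
y = X @ beta + sigma * rng.normal(size=T)

b = np.linalg.solve(X.T @ X, X.T @ y)
XtX_inv = np.linalg.inv(X.T @ X)

X_f = np.array([1.0, 3.0])      # out-of-sample regressor row
y_hat = X_f @ b                 # prediction (II.II.115)

q = X_f @ XtX_inv @ X_f         # X_f (X'X)^{-1} X_f'
var_point = sigma**2 * (1 + q)  # point forecast error variance  (II.II.117)
var_mean = sigma**2 * q         # average forecast error variance (II.II.118)
```

The point forecast variance exceeds the mean forecast variance by exactly σ^{2}, the contribution of the unpredictable disturbance e_{f}.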
The degree of explanation can be measured by the determination coefficient (R-squared)

R^{2} = 1 - e'e/((y - ȳ)'(y - ȳ))    (II.II.119)

where ȳ is the sample mean of y, by its degrees-of-freedom-adjusted version

R_{adj}^{2} = 1 - (1 - R^{2})(T - 1)/(T - K)    (II.II.120)

or by the F-statistic, which is defined as

F = (R^{2}/(K - 1))/((1 - R^{2})/(T - K))    (II.II.121)

where the F statistic tests the joint significance of all ß coefficients except for the constant term.
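These goodness-of-fit measures can be computed directly from the residuals; the sketch below uses simulated data and mirrors (II.II.119)-(II.II.121).

```python
import numpy as np

rng = np.random.default_rng(5)
T, K = 100, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=T)

b = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ b

rss = resid @ resid                        # residual sum of squares e'e
tss = ((y - y.mean())**2).sum()            # total sum of squares

r2 = 1 - rss / tss                         # (II.II.119)
r2_adj = 1 - (1 - r2) * (T - 1) / (T - K)  # (II.II.120)
F = (r2 / (K - 1)) / ((1 - r2) / (T - K))  # (II.II.121)
```

Equivalently, F = (ESS/(K - 1))/(RSS/(T - K)) with ESS = TSS - RSS, which is the explained-versus-residual form of the same statistic.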
To test the significance of a subset of m parameters (out of a total number of K), the following F test is used:

F = ((e_{r}'e_{r} - e'e)/m)/(e'e/(T - K))    (II.II.122)

where e_{r} is the residual vector of the restricted regression (with the m parameters set to zero); this is in fact a generalization of (II.II.121).
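A sketch of the subset F test (II.II.122) with simulated data, comparing the residual sums of squares of the restricted and unrestricted regressions; the helper function rss is my own, not from the text.

```python
import numpy as np

rng = np.random.default_rng(6)
T, K, m = 120, 4, 2             # test the last m of the K coefficients
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0, 0.0]) + rng.normal(size=T)

def rss(Xmat, y):
    """Residual sum of squares e'e of an OLS fit."""
    b = np.linalg.solve(Xmat.T @ Xmat, Xmat.T @ y)
    r = y - Xmat @ b
    return r @ r

rss_u = rss(X, y)               # unrestricted: all K regressors
rss_r = rss(X[:, :K - m], y)    # restricted: last m regressors dropped

# F test for the m restrictions (II.II.122), F(m, T - K) under the null.
F_sub = ((rss_r - rss_u) / m) / (rss_u / (T - K))
```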
The parameter estimates of a multiple and of a simple regression are related to each other. It is possible to prove that if all explanatory variables are independent (orthogonal), there is no difference between multiple and simple regression coefficients. Assume

y = x_{1}b_{1} + x_{2}b_{2} + ... + x_{K}b_{K} + e    (II.II.123)

where the x_{i} are the columns of X and e is the OLS residual vector. Premultiplying (II.II.123) by x_{i}', it is easily deduced that any multiple regression parameter can be computed by

b_{i} = (x_{i}'y - Σ_{j≠i} x_{i}'x_{j}b_{j} - x_{i}'e)/(x_{i}'x_{i})    (II.II.124)

Since it is assumed that the explanatory variables are orthogonal, it follows that

x_{i}'x_{j} = 0 for all i ≠ j    (II.II.125)

and due to the OLS assumptions (the normal equations) we know that

x_{i}'e = 0    (II.II.126)

On substituting (II.II.125) and (II.II.126) into (II.II.124) we obtain

b_{i} = x_{i}'y/(x_{i}'x_{i})    (II.II.127)

which is exactly the simple regression coefficient of y on x_{i} alone; this proves the theorem (Q.E.D.).
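The result can be demonstrated numerically (my own illustration): construct exactly orthogonal regressors via a QR decomposition and compare the multiple regression coefficients with the simple regression coefficients x_{i}'y/(x_{i}'x_{i}).

```python
import numpy as np

rng = np.random.default_rng(7)
T, K = 50, 3

# QR decomposition yields orthonormal columns, so X'X is the identity
# and the orthogonality condition (II.II.125) holds exactly.
Q, _ = np.linalg.qr(rng.normal(size=(T, K)))
X = Q
y = X @ np.array([3.0, -2.0, 1.0]) + rng.normal(size=T)

# Multiple regression: all K regressors jointly.
b_multiple = np.linalg.solve(X.T @ X, X.T @ y)

# Simple regression of y on each column separately: x_i'y / x_i'x_i.
b_simple = np.array([(X[:, i] @ y) / (X[:, i] @ X[:, i]) for i in range(K)])
```

With correlated (non-orthogonal) columns the two sets of coefficients would generally differ, which is exactly why (II.II.125) is needed in the proof.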
