• Nenhum resultado encontrado

Métodos Quantitativos dout 2019 - 2

N/A
N/A
Protected

Academic year: 2021

Share "Métodos Quantitativos dout 2019 - 2"

Copied!
38
0
0

Texto

(1)

Econometria aplicada 2018

(2)

Generalising the Simple Model to

Multiple Linear Regression

Before, we have used the model t = 1,2,...,T

But what if the dependent (y) variable depends on more than one independent variable?

It is very easy to generalise the simple model to one with k-1 regressors (independent variables).

t t

t x u

(3)

Multiple Regression and the Constant Term

Now we write

, t=1,2,...,T

Where is x1? It is the constant term. In fact the constant term is usually represented by a column of ones of length T:

1 is the coefficient attached to the constant term (which we called  before). t kt k t t t

x

x

x

u

y

1

2 2

3 3

...

             1 1 1 1  x

(4)

Different Ways of Expressing

the Multiple Linear Regression Model

We could write out a separate equation for every value of t:

We can write this in matrix form

y = X +u where y is T  1 X is T  k is k  1 u is T  1 T kT k T T T k k k k

u

x

x

x

y

u

x

x

x

y

u

x

x

x

y

...

...

...

3 3 2 2 1 2 2 32 3 22 2 1 2 1 1 31 3 21 2 1 1

(5)

Inside the Matrices of the

Multiple Linear Regression Model

e.g. if k is 2, we have 2 parameters, one of which is a column of ones:

T 1 T2 21 T1

Notice that the matrices written in this way are conformable.

1 21 1 2 22 1 2 2 2

1

1

1

T T T

y

x

u

y

x

u

y

x

u

  

 

  

 

 

  

 

 

  

 

 

  

 

  

 

(6)

How Do We Calculate the Parameters (the

)

in this Generalised Case?

Previously, we took the residual sum of squares, and minimised it w.r.t.  and .

In the matrix notation, we have

The RSS would be given by

             T u u u u ˆ ˆ ˆ ˆ 2 1 

1 2 2 2 2 2 1 2 1 2 ˆ ˆ ˆ ˆ' ˆ ˆ ˆT ˆ ˆ ... ˆT ˆt u u u u u u u u u u u                

(7)

The OLS Estimator for the

Multiple Regression Model

• In order to obtain the parameter estimates, 1, 2,..., k, we would minimise the RSS with respect to all the s.

It can be shown that

y X X X k                   2 1 1 ) ( ˆ ˆ ˆ ˆ     

(8)

Derivation of the OLS coefficient

estimator in matrix form

Derivation of the OLS coefficient estimator in matrix form

2

The linear model in matrix form is:

ˆ

ˆ

Hence, the residuals are:

ˆ

ˆ

ˆ

ˆ ˆ

We want to minimize the residual sum of squares, i.e.

'

Hence

ˆ

ˆ

ˆ

ˆ

ˆ

ˆ

ˆ ˆ

'

(

) '(

)

'

' '

'

' '

y

X

u

u

y X

u

u u L

L u u

y X

y X

y y

X y y X

X X

 

 

(9)

Derivation of the OLS coefficient

estimator in matrix form (cont.)

But

ˆ

' '

'

ˆ

because

ˆ

' '

( '

ˆ

) '

ˆ ' ' is (1 ) (

) (

1) 1 1

scalar

ˆ

( '

) ' is (1

) (

) (

1) 1 1

scalar

X y

y X

X y

y X

X y

k

k T

T

y X

T

T k

k

       

       

(10)

Derivation of the OLS coefficient

estimator in matrix form (cont.)

Let us check it out:

1 1 2 1 2 1 2 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 1 1 2 1 2 1 1 2 2 2 2 2 1 1 2 2 1 1 2 2

ˆ

' '

'

ˆ

since

1

1

ˆ ' '

(

)

(

)

and

1

ˆ

'

1

(

)

(

)

X y

y X

y

y

y

X y

x

x

y

x y

x y

y

y

x y

x y

x

y X

y

y

y

y

x y

x y

x

y

y

x y

x y

 

 

 







(11)

Derivation of the OLS coefficient

estimator in matrix form (cont.)

We have already seen that:

ˆ

ˆ

ˆ

ˆ

ˆ

ˆ

ˆ ˆ

'

(

) '(

)

'

' '

'

' '

ˆ

ˆ

ˆ

ˆ

ˆ

But, since ' '

'

then

'

2 '

' '

Now, we must differentiate:

ˆ

ˆ

ˆ

( ' )

( '

)

( ' '

)

2

ˆ

ˆ

ˆ

ˆ

L u u

y X

y X

y y

X y y X

X X

X y

y X

L

y y

y X

X X

L

y y

y X

X X

 

 

(12)

Derivation of the OLS coefficient

estimator in matrix form (cont.)

st

ˆ

ˆ

ˆ

( ' )

( '

)

( ' '

)

2

ˆ

ˆ

ˆ

ˆ

( ' )

1 term:

0

ˆ

L

y y

y X

X X

y y

(13)

Derivation of the OLS coefficient

estimator in matrix form (cont.)

nd

( '

ˆ

)

[( ' ) ' ]

ˆ

2 term: 2

2

2 '

ˆ

ˆ

y X

X y

X y

 

 

1 1 2 2 1 1 2 2 1

Let be a column vector and a column vector . Then we can write:

´

The derivative of ´ w.r.t. vector will be: ( ´ ) ( ( ´ ) n n n n a x a x a x a x a x a x x                                    a a x x a x a x x a x a x x    1 1 2 2 1 1 1 1 2 2 2 2 2 1 1 2 2 ( ) ´ ) ( ) ( ´ ) ( ) n n n n n n n n n a x a x a x x a a x a x a x a x x a a x a x a x x x                                                                     a x a a x      

(14)

Derivation of the OLS coefficient

estimator in matrix form (cont.)

rd ( ' 'ˆ ˆ)

3 term: ˆ

ˆ ' is a 1 k vector, ' is a symmetric square matrix, and is a k 1 vector.ˆ ˆ ˆ

Hence, ' ' has the same form as ' .

( ' ) But from the matrix chapter, we know that 2

X X X X X X x Ax x Ax Ax x              ˆ ˆ ( ' ' ) ˆ then  X Xˆ  2 'X X  

(15)

Derivation of the OLS coefficient

estimator in matrix form (cont.)

Hence we can write:

1 1 1 1 1

ˆ

ˆ

ˆ

( ' )

( '

)

( ' '

)

2

ˆ

ˆ

ˆ

ˆ

ˆ

ˆ

0 2 '

2 '

0

'

'

If we pre-multiply both sides by ( ' ) , we get:

ˆ

( ' ) ( ' )

( ' )

'

But ( ' ) ( ' )

ˆ ( ' )

'

L

y y

y X

X X

X y

X X

X X

X y

X X

X X

X X

X X

X y

X X

X X

I

X X

X y

    

 

(16)

Proof that is a minimum

 

1

st

derivative = β

2

nd

derivative =

If is a minimum, X´X must be a positive definite matrix

X´X is positive definite iff all elements in the leading

diagonal > 0 and det(X´X) > 0.

(17)

Proof that is a minimum

 

The elements in the leading diagonal are 2 and which are

both positive.

which is positive.

This can be generalized to a matrix with a T x k size.

This proves that is the estimator that minimizes u´u.

QED.

(18)

Calculating the Standard Errors for the

Multiple Regression Model

Check the dimensions: is k  1 as required.

But how do we calculate the standard errors of the coefficient estimates?Previously, to estimate the variance of the errors, 2, we used .

Now using the matrix notation, we use where k = number of regressors.

The OLS estimator of the variance of is given by the diagonal elements of , so that the variance of is the 1st element, the variance of

is the 2nd element, and variance of is the kth diagonal element.

s

2

 

T k

u u

' 

  s X X2( ' )1

1  2 k 2 ˆ2 2  

T u s t

(19)

Derivation of the OLS standard error

estimator in multiple regression

The variance of a vector of random variables is given by

the formula E[( −

)( −

)’]. Since and y = X

+ u:

(20)

Derivation of the OLS standard error

estimator in multiple regression (cont.)

= variance-covariance matrix of the coefficients

• The leading diagonal of s2(X’X)-1 are the variances of the coefficients  0,

1, 2,..., k, and the off-diagonal terms are the covariances between each pair of coefficients, i.e.  with  ;  with  , etc

(21)

Derivation of the OLS standard error estimator

in multiple regression (cont.)

The standard errors  i = 0, 1, 2, …, k are respectively the

square roots of the i

th

element of the leading diagonal of .

In a 3-regressor equation, we have:

This is the variance-covariance matrix of the regression

 

��� (^)=2 (�′ )− 1=

(

� �� ( ^�0) ��� ( ^�0, ^�1) ��� ( ^�0, ^�2) ���( ^�1, ^�0) � �� ( ^�1) ��� ( ^�1, ^�2) ���( ^�2, ^�0) ��� ( ^�2, ^�1) � �� ( ^�2)

)

(22)

Calculating Parameter and Standard Error Estimates

for Multiple Regression Models: An Example

Example: The following model with k=3 is estimated over 15 observations: and the following data have been calculated from the original X’s.

Calculate the coefficient estimates and their standard errors.

To calculate the coefficients, just multiply the matrix by the vector to obtain .

To calculate the standard errors, we need an estimate of 2.

( ' ) .. .. .. . . . ,( ' ) .. . , '  . X X    X y u u                     1 2 0 3535 10 6510 10 65 4 3 30 2 2 0 6 10 96

s

2

    

T k

RSS

15 3 0 91

10 96

.

.

u

x

x

y

1

2 2

3 3

X' X

1X' y

(23)

Calculating Parameter and Standard Error Estimates

for Multiple Regression Models: An Example (cont’d)

The variance-covariance matrix of is given by

The variances are on the leading diagonal:

We write:   s X X2 1 0 91 X X 1 320 0 91 594183 320 0 91 0 91 594 393 ( ' ) . ( ' ) .. .. .. . . .            Var SE Var SE Var SE (  ) . (  ) . (  ) . (  ) . (  ) . (  ) .

1 1 2 2 3 3 183 135 0 91 0 96 3 93 198       

1

.

10

 

4

.

40

 

19

.

88

ˆ

x

2t

x

3t

y

(24)

Testing Multiple Hypotheses: The F-test

We used the t-test to test hypotheses involving only one coefficient. But what if we want to test more than one coefficient simultaneously?

We do this using the F-test. The F-test involves estimating 2 regressions. The unrestricted regression is the one in which the coefficients are freely

determined by the data, as we have done before.

The restricted regression is the one in which the coefficients are restricted, i.e. the restrictions are imposed on some s.

(25)

The F-test:

Restricted and Unrestricted Regressions

Example

The general regression is

yt = 1 + 2x2t + 3x3t + 4x4t + ut (1)

• We want to test the restriction that 3+4 = 1. The unrestricted regression is (1) above, but what is the restricted regression?

yt = 1 + 2x2t + 3x3t + 4x4t + ut s.t. 3+4 = 1

• We substitute the restriction (3+4 = 1) into the regression so that it is automatically imposed on the data.

(26)

The F-test: Forming the Restricted Regression

yt = 1 + 2x2t + 3x3t + (13)x4t + ut

yt = 1 + 2x2t + 3x3t + x4t  3x4t + utGather terms in ’s together and rearrange

(yt  x4t) = 1 + 2x2t + 3(x3t x4t) + ut

This is the restricted regression. We actually estimate it by creating two new variables, call them, say, Pt and Qt.

Pt = yt  x4t

Qt = x3t  x4t so

(27)

Calculating the F-Test Statistic

The test statistic is given by

where URSS = RSS from unrestricted regression RRSS = RSS from restricted regression m = number of restrictions

T = number of observations

k = number of regressors in unrestricted regression

including a constant in the unrestricted regression (or the total number of parameters to be estimated).

(28)

The F-Distribution

The test statistic follows the F-distribution, which has 2 d.f. parameters.

The value of the degrees of freedom parameters are m and (T-k) respectively (the order of the d.f. parameters is important).

The appropriate critical value will be in column m, row (T-k).

The F-distribution has only positive values and is not symmetrical. We therefore only reject the null if the test statistic > critical F-value.

(29)

Determining the Number of Restrictions in an F-test

Examples :

H0: hypothesis No. of restrictions, m

1 + 2 = 2 1

2 = 1 and 3 = -1 2

2 = 0, 3 = 0 and 4 = 0 3

If the model is yt = 1 + 2x2t + 3x3t + 4x4t + ut, then the null hypothesis

H0: 2 = 0, and 3 = 0 and 4 = 0 is tested by the regression F-statistic. It tests the null hypothesis that all of the coefficients except the intercept coefficient are zero.

Note the form of the alternative hypothesis for all tests when more than one restriction is involved: H :   0, or   0 or   0

(30)

What we Cannot Test with Either an F or a t-test

We cannot test using this framework hypotheses which are not linear or which are multiplicative, e.g.

H0: 2 3 = 2 or H0: 2 2= 1

(31)

The Relationship between the t and the

F-Distributions

Any hypothesis which could be tested with a t-test could have been tested using an F-test, but not the other way around.

For example, consider the hypothesis H0: 2 = 0.5

H1: 2  0.5

We could have tested this using the usual t-test:

or it could be tested in the framework above for the F-test. Note that the two tests always give the same result since the

t-distribution is just a special case of the F-t-distribution.

For example, if we have some random variable Z, and Z  t(T-k) then

Z2  F(1,T-k) test stat SE    . (  )

2 2 0 5

(32)

F-test Example

Question: Suppose a researcher wants to test whether the returns on a company stock (y) show unit sensitivity to two factors (factor x2 and factor x3) among three considered. The regression is carried out on 144 monthly observations. The

regression is yt = 1 + 2x2t + 3x3t + 4x4t+ ut

- What are the restricted and unrestricted regressions?

- If the two RSS are 436.1 and 397.2 respectively, perform the test. • Solution:

Unit sensitivity implies H0: 2 = 1 and 3 = 1. The unrestricted regression is the one in the question. The restricted regression is (yt  x2t  x3t)= 1 + 4x4t + ut or letting

zt = yt  x2t  x3t, the restricted regression is zt = 1 + 4x4t + ut

In the F-test formula, T=144, k=4, m=2, RRSS=436.1, URSS=397.2

(33)

Goodness of Fit Statistics

Goodness of fit statistics is a measure of how well the sample regression function (srf) fits the data.

The most common goodness of fit statistic is known as R2.

We are interested in explaining the variability of y about its mean value , i.e. the total sum of squares, TSS:

We can split the TSS into 2 parts, the part which we have explained (known as the explained sum of squares, ESS) and the part which we did not explain using the model (the RSS).

 

  t t y y TSS 2

(34)

Defining R

2

That is, TSS = ESS + RSS

Our goodness of fit statistic is

But since TSS = ESS + RSS, we can also write

R2 must always lie between zero and one. To understand this, consider two

extremes

RSS = TSS i.e. ESS = 0 so R2 = ESS/TSS = 0

R2  ESSTSS R ESS TSS TSS RSS TSS RSS TSS 2  1

See next slide

 

  t t t t t t y y y u y 2 ˆ 2 ˆ2

(35)

Proof

(36)

The Limit Cases: R

2

= 0 and R

2

= 1

yt y xt yt xt

(37)

Problems with R

2

as a Goodness of Fit Measure

There are a number of them:

1. R2 is defined in terms of variation about the mean of y so that if a model

is reparameterised and the dependent variable changes, R2 will change.

2. R2 never falls if more regressors are added. to the regression, e.g.

consider:

Regression 1: yt = 1 + 2x2t + 3x3t + ut

Regression 2: y = 1 + 2x2t + 3x3t + 4x4t + ut

R2 will always be at least as high for regression 2 relative to regression 1.

3. R2 quite often takes on values of 0.9 or higher for time series

(38)

Adjusted R

2

=

 

Because of (2) in the last slide, a modification is made which takes into account the loss of degrees of freedom associated with adding extra variables. This is known as or adjusted R2

So if we add an extra regressor, k increases and unless R2 increases by

a more than offsetting amount, will fall.

When we analyse one regression alone, we simply use R2

We only use when we want to compare an existing regression with one ore more regressions which are obtained by adding more

regressors to the existing regression, i.e., when we have nested models.

 

       1 1 (1 2) 2 R k T T R

Referências

Documentos relacionados

Ao Dr Oliver Duenisch pelos contatos feitos e orientação de língua estrangeira Ao Dr Agenor Maccari pela ajuda na viabilização da área do experimento de campo Ao Dr Rudi Arno

Ousasse apontar algumas hipóteses para a solução desse problema público a partir do exposto dos autores usados como base para fundamentação teórica, da análise dos dados

didático e resolva as ​listas de exercícios (disponíveis no ​Classroom​) referentes às obras de Carlos Drummond de Andrade, João Guimarães Rosa, Machado de Assis,

i) A condutividade da matriz vítrea diminui com o aumento do tempo de tratamento térmico (Fig.. 241 pequena quantidade de cristais existentes na amostra já provoca um efeito

Outrossim, o fato das pontes miocárdicas serem mais diagnosticadas em pacientes com hipertrofia ventricular é controverso, pois alguns estudos demonstram correlação

The probability of attending school four our group of interest in this region increased by 6.5 percentage points after the expansion of the Bolsa Família program in 2007 and

60 Cfr. Escobar Fomos, Iván, Manual de Derecho Constitucional, 2a. Abad Yupanqui, Samuel, “El Ombudsman o Defensor del Pueblo en la Constitución peruana de 1993; retos y

não existe emissão esp Dntânea. De acordo com essa teoria, átomos excita- dos no vácuo não irradiam. Isso nos leva à idéia de que emissão espontânea está ligada à