Linear and Multiple Linear Regression

(1)

Linear and Multiple Linear Regression

Experimental Design

Mestrado em Sistemas de Produção em Agricultura Mediterrânica (Master's in Production Systems in Mediterranean Agriculture)

(2)

Simple Linear Regression Model

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$

X: independent variable

Y: dependent variable

$\beta_0$: intercept

$\beta_1$: regression coefficient (slope)

$\varepsilon_i$: random error or residual

From n pairs of observations $(x_i, y_i)$ we can estimate $\beta_0$ and $\beta_1$ by the method of least squares, that is, by minimizing the sum of the squared deviations of the observations from the regression line (SQE, the sum of squared errors):

$$SQE = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2$$
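As a rough illustration of the criterion that least squares minimizes, the sketch below evaluates SQE for a few candidate lines. The data and the candidate coefficients are hypothetical (they do not come from these slides), and the function name `sqe` is chosen only for this example.

```python
def sqe(beta0, beta1, xs, ys):
    """Sum of squared deviations of the observations from the line beta0 + beta1 * x."""
    return sum((y - beta0 - beta1 * x) ** 2 for x, y in zip(xs, ys))

# Hypothetical toy data, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# A few candidate (beta0, beta1) pairs; least squares prefers the smallest SQE.
candidates = [(2.2, 0.6), (2.0, 0.6), (2.2, 0.7), (3.0, 0.4)]
for b0, b1 in candidates:
    print(f"beta0={b0}, beta1={b1}: SQE={sqe(b0, b1, xs, ys):.2f}")

best = min(candidates, key=lambda c: sqe(c[0], c[1], xs, ys))
print("smallest SQE among the candidates:", best)
```

Among these candidates the pair (2.2, 0.6) gives the smallest SQE; it is in fact the least squares solution for this toy data set.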

(3)

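Least Squares Method

Setting the partial derivatives of SQE to zero gives the normal equations:

$$\frac{\partial SQE}{\partial \beta_0} = 0 \Leftrightarrow -2\sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right) = 0 \Leftrightarrow \sum_{i=1}^{n} y_i = n\beta_0 + \beta_1\sum_{i=1}^{n} x_i$$

$$\frac{\partial SQE}{\partial \beta_1} = 0 \Leftrightarrow -2\sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right)x_i = 0 \Leftrightarrow \sum_{i=1}^{n} x_i y_i = \beta_0\sum_{i=1}^{n} x_i + \beta_1\sum_{i=1}^{n} x_i^2$$

The least squares estimators are:

$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum_{i=1}^{n}\left(x_i - \bar x\right)\left(y_i - \bar y\right)}{\sum_{i=1}^{n}\left(x_i - \bar x\right)^2}$$

$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x$$

Regression line:

$$\hat y = \hat\beta_0 + \hat\beta_1 x$$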
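The closed-form estimators can be sketched in a few lines of Python. This is a minimal illustration under assumed toy data (not taken from the slides); the function name `least_squares` is hypothetical.

```python
def least_squares(xs, ys):
    """Return (beta0_hat, beta1_hat) for the simple linear regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    beta1_hat = s_xy / s_xx                # slope: S_xy / S_xx
    beta0_hat = y_bar - beta1_hat * x_bar  # intercept: y_bar - slope * x_bar
    return beta0_hat, beta1_hat

# Hypothetical toy data, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = least_squares(xs, ys)
print(f"fitted line: y_hat = {b0:.3f} + {b1:.3f} x")

# The estimates satisfy the first normal equation: sum(y) = n*b0 + b1*sum(x).
normal_eq_gap = sum(ys) - (len(xs) * b0 + b1 * sum(xs))
```

For this data set $S_{xy} = 6$ and $S_{xx} = 10$, so the slope is 0.6 and the intercept is $4 - 0.6 \times 3 = 2.2$.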

(4)

Assumptions

The residuals $\varepsilon_i$:

are normally distributed

have zero mean

have equal variances ($\sigma^2$)

are independent

(5)

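Coefficient of Determination

Percentage of the total variation that is explained by the linear relationship between X and Y.

$R^2 = 1$: the total variation of Y is fully explained by the variation of X

$R^2 = 0$: the variation of X contributes nothing to explaining the variation of Y

$$\underbrace{\sum_{i=1}^{n}\left(y_i - \bar y\right)^2}_{\text{Total variation}} = \underbrace{\sum_{i=1}^{n}\left(y_i - \hat y_i\right)^2}_{\text{Unexplained variation}} + \underbrace{\sum_{i=1}^{n}\left(\hat y_i - \bar y\right)^2}_{\text{Explained variation}}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat y_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar y\right)^2} = 1 - \frac{SQE}{S_{YY}}$$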
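The decomposition of the total variation and the resulting $R^2$ can be checked numerically. The sketch below uses hypothetical toy data (not from the slides) and a hypothetical helper name, `variation_decomposition`.

```python
def variation_decomposition(xs, ys):
    """Return (total, unexplained, explained, r_squared) for a simple linear fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * x for x in xs]
    total = sum((y - y_bar) ** 2 for y in ys)                    # S_YY
    unexplained = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # SQE
    explained = sum((f - y_bar) ** 2 for f in fitted)
    return total, unexplained, explained, 1 - unexplained / total

# Hypothetical toy data, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
total, unexplained, explained, r2 = variation_decomposition(xs, ys)
print(total, unexplained, explained, r2)
```

Here the total variation (6) splits into an unexplained part (2.4) and an explained part (3.6), so about 60% of the variation of Y is explained by X.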

(6)

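Inference

Under the assumptions on the residuals, the estimators are normally distributed:

$$\hat\beta_0 \sim N\left(\beta_0,\ \sigma^2\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right)\right)$$

$$\hat\beta_1 \sim N\left(\beta_1,\ \frac{\sigma^2}{S_{xx}}\right)$$

The variance $\sigma^2$ is estimated by the mean squared error (QME):

$$\hat\sigma^2 = QME = \frac{SQE}{n-2} = \frac{\sum_{i=1}^{n}\left(y_i - \hat y_i\right)^2}{n-2}$$

Replacing $\sigma^2$ by QME gives Student's t statistics with $n-2$ degrees of freedom:

$$\frac{\hat\beta_0 - \beta_0}{\sqrt{QME\left(\dfrac{1}{n} + \dfrac{\bar x^2}{S_{xx}}\right)}} \sim t_{n-2}$$

$$\frac{\hat\beta_1 - \beta_1}{\sqrt{\dfrac{QME}{S_{xx}}}} \sim t_{n-2}$$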
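A minimal sketch of how QME and the two standard errors (the denominators of the t statistics) might be computed. The data are hypothetical and the function name `standard_errors` is an assumption made for this example.

```python
import math

def standard_errors(xs, ys):
    """Return (QME, se(beta0_hat), se(beta1_hat)) for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sqe = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
    qme = sqe / (n - 2)                                     # estimate of sigma^2
    se_b0 = math.sqrt(qme * (1 / n + x_bar ** 2 / s_xx))
    se_b1 = math.sqrt(qme / s_xx)
    return qme, se_b0, se_b1

# Hypothetical toy data, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
qme, se_b0, se_b1 = standard_errors(xs, ys)
print(f"QME={qme:.3f}, se(b0)={se_b0:.4f}, se(b1)={se_b1:.4f}")
```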

(7)

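Inference

Confidence interval of level $(1-\alpha)100\%$ for $\beta_0$:

$$\left[\ \hat\beta_0 - t_{n-2,\,1-\alpha/2}\sqrt{QME\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right)}\ ,\ \ \hat\beta_0 + t_{n-2,\,1-\alpha/2}\sqrt{QME\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right)}\ \right]$$

Confidence interval of level $(1-\alpha)100\%$ for $\beta_1$:

$$\left[\ \hat\beta_1 - t_{n-2,\,1-\alpha/2}\sqrt{\frac{QME}{S_{xx}}}\ ,\ \ \hat\beta_1 + t_{n-2,\,1-\alpha/2}\sqrt{\frac{QME}{S_{xx}}}\ \right]$$

Hypothesis tests for $\beta_0$ and $\beta_1$ (the superscript 0 denotes the value specified under $H_0$):

$$H_0: \beta_0 = \beta_0^{0} \qquad T = \frac{\hat\beta_0 - \beta_0^{0}}{\sqrt{QME\left(\dfrac{1}{n} + \dfrac{\bar x^2}{S_{xx}}\right)}} \sim t_{n-2}$$

$$H_0: \beta_1 = \beta_1^{0} \qquad T = \frac{\hat\beta_1 - \beta_1^{0}}{\sqrt{\dfrac{QME}{S_{xx}}}} \sim t_{n-2}$$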
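The interval and the test statistic for the slope can be sketched as below. The inputs are hypothetical: an estimate of 0.6 with QME = 0.8 and S_xx = 10 from a toy sample of n = 5, and the critical value $t_{3,\,0.975} = 3.182$ taken from a standard t table (the Python standard library has no t quantile function, so it is entered by hand here). Function names are chosen for this example only.

```python
import math

def confidence_interval(estimate, std_error, t_crit):
    """Two-sided interval: estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

def t_statistic(estimate, std_error, hypothesized=0.0):
    """Test statistic T = (estimate - hypothesized value) / std_error."""
    return (estimate - hypothesized) / std_error

# Hypothetical inputs (toy sample with n = 5, so n - 2 = 3 degrees of freedom).
b1_hat = 0.6
se_b1 = math.sqrt(0.8 / 10)   # sqrt(QME / S_xx)
t_crit = 3.182                # t_{3, 0.975}, read from a t table

low, high = confidence_interval(b1_hat, se_b1, t_crit)
T = t_statistic(b1_hat, se_b1, hypothesized=0.0)
reject = abs(T) > t_crit      # two-sided test of H0: beta1 = 0 at the 5% level
print(f"95% CI for beta1: ({low:.3f}, {high:.3f}); T = {T:.3f}; reject H0: {reject}")
```

With so few observations the interval contains zero and the two-sided test does not reject $H_0: \beta_1 = 0$, which agrees with the interval.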

(8)

Prediction

Point estimate for a new observation $x_0$:

$$\hat y_0 = \hat\beta_0 + \hat\beta_1 x_0$$

Confidence interval of level $(1-\alpha)100\%$ for $E(Y \mid x_0)$:

$$\left[\ \hat y_0 - t_{n-2,\,1-\alpha/2}\sqrt{QME\left(\frac{1}{n} + \frac{\left(x_0 - \bar x\right)^2}{S_{xx}}\right)}\ ,\ \ \hat y_0 + t_{n-2,\,1-\alpha/2}\sqrt{QME\left(\frac{1}{n} + \frac{\left(x_0 - \bar x\right)^2}{S_{xx}}\right)}\ \right]$$

Prediction interval of level $(1-\alpha)100\%$ for $y_0$:

$$\left[\ \hat y_0 - t_{n-2,\,1-\alpha/2}\sqrt{QME\left(1 + \frac{1}{n} + \frac{\left(x_0 - \bar x\right)^2}{S_{xx}}\right)}\ ,\ \ \hat y_0 + t_{n-2,\,1-\alpha/2}\sqrt{QME\left(1 + \frac{1}{n} + \frac{\left(x_0 - \bar x\right)^2}{S_{xx}}\right)}\ \right]$$
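The difference between the interval for the mean response and the interval for a new observation is the extra "1 +" under the square root. A small sketch, assuming hypothetical toy data and a hand-entered critical value ($t_{3,\,0.975} = 3.182$ from a t table); the function name `intervals_at` is illustrative.

```python
import math

def intervals_at(x0, xs, ys, t_crit):
    """Return (y0_hat, CI half-width for E(Y|x0), PI half-width for y0)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    qme = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    y0_hat = b0 + b1 * x0
    spread = 1 / n + (x0 - x_bar) ** 2 / s_xx
    ci_half = t_crit * math.sqrt(qme * spread)        # mean response E(Y | x0)
    pi_half = t_crit * math.sqrt(qme * (1 + spread))  # single new observation y0
    return y0_hat, ci_half, pi_half

# Hypothetical toy data; t_crit = t_{3, 0.975} read from a t table.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
y0_hat, ci_half, pi_half = intervals_at(4, xs, ys, t_crit=3.182)
print(f"y0_hat = {y0_hat:.2f}, CI: +/- {ci_half:.3f}, PI: +/- {pi_half:.3f}")
```

The prediction interval is always wider than the confidence interval at the same $x_0$, because it also accounts for the variability of the new observation itself.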







(9)

(10)

Example:

X: Price; Y: Quantity

[Figure: scatter plot of Quantity against Price]

Fitted regression line:

$$\hat y = 66.25 - 0.8125x$$

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | 0.920 | 0.846 | 0.827 | 6.29670 |

Predictors: (Constant), Price

Coefficients (dependent variable: Quantity)

| Model 1 | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. | 95% CI for B, lower bound | 95% CI for B, upper bound |
|---|---|---|---|---|---|---|---|
| (Constant) | 66.250 | 4.840 | | 13.687 | 0.000 | 55.088 | 77.412 |
| Price | -0.813 | 0.123 | -0.920 | -6.630 | 0.000 | -1.095 | -0.530 |

[Figure: scatterplot of the regression standardized residuals against the regression standardized predicted values; dependent variable: Quantity]

[Figure: normal P-P plot, expected cumulative probability against observed cumulative probability; dependent variable: Quantity]
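The reported output can be cross-checked against the formulas of the previous slides. The sketch below only re-uses the rounded figures printed in the tables, so small rounding differences are expected; the variable names are chosen for this example.

```python
# Figures copied from the Model Summary and Coefficients tables (rounded as printed).
b0, se_b0, t_b0, lo_b0, hi_b0 = 66.250, 4.840, 13.687, 55.088, 77.412
b1, se_b1, t_b1, lo_b1, hi_b1 = -0.813, 0.123, -6.630, -1.095, -0.530
r, r_square = 0.920, 0.846

# t = B / Std. Error (test of H0: coefficient = 0).
t_check_b0 = b0 / se_b0
t_check_b1 = b1 / se_b1

# Half-width of the 95% interval divided by the standard error gives the
# t critical value that was used to build the interval.
mult_b0 = (hi_b0 - lo_b0) / (2 * se_b0)
mult_b1 = (hi_b1 - lo_b1) / (2 * se_b1)

print(f"t (constant): {t_check_b0:.3f} vs reported {t_b0}")
print(f"t (price):    {t_check_b1:.3f} vs reported {t_b1}")
print(f"implied t critical values: {mult_b0:.3f}, {mult_b1:.3f}")
print(f"R^2 from R:   {r ** 2:.4f} vs reported {r_square}")
```

The implied multiplier for the constant is about 2.306, which matches the tabulated value of $t_{8,\,0.975}$. That would correspond to $n - 2 = 8$, that is, 10 observations; the slides do not state the sample size, so this is an inference from the output rather than a reported fact. Note also that in simple regression the standardized coefficient (-0.920) equals the correlation between X and Y, whose absolute value is R.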

(11)

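Multiple Regression Model

$X_1, X_2, \dots, X_k$: independent variables

Y: dependent variable

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + \varepsilon_i$$

From $n > k$ observations we can estimate $\beta_i$ ($i = 0, 1, 2, \dots, k$) by the method of least squares:

$$SQE = \sum_{i=1}^{n} \varepsilon_i^2 = \varepsilon^T\varepsilon = \left(Y - X\beta\right)^T\left(Y - X\beta\right)$$

In matrix notation:

$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix};\quad X = \begin{bmatrix} 1 & X_{11} & \dots & X_{1k} \\ 1 & X_{21} & \dots & X_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n1} & \dots & X_{nk} \end{bmatrix};\quad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix};\quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$

$$Y = X\beta + \varepsilon$$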

(12)

Parameter Estimation

$$\hat\beta = \left(X^T X\right)^{-1} X^T Y$$

Variance-covariance matrix:

$$\hat\Sigma\left(\hat\beta\right) = \hat\sigma^2\left(X^T X\right)^{-1}$$

$$SQE = Y^T Y - \hat\beta^T X^T Y$$

$$\hat\sigma^2 = QME = \frac{SQE}{n-k-1}$$
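A minimal pure-Python sketch of these matrix formulas, assuming a small hypothetical data set with two predictors. It solves the normal equations $(X^TX)\hat\beta = X^TY$ by Gauss-Jordan elimination instead of forming the inverse explicitly; the helper names (`solve`, `fit`) are illustrative, and in practice a numerical library would normally be used.

```python
def solve(a, b):
    """Solve a x = b (a square, non-singular) by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def fit(x_rows, y):
    """Return (beta_hat, SQE, QME); x_rows holds the predictor values without the leading 1."""
    X = [[1.0] + list(row) for row in x_rows]        # design matrix with intercept column
    cols = list(zip(*X))                             # columns of X, i.e. rows of X^T
    XtX = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    XtY = [sum(u * v for u, v in zip(c, y)) for c in cols]
    beta = solve(XtX, XtY)                           # normal equations
    sqe = sum(v * v for v in y) - sum(b * t for b, t in zip(beta, XtY))  # Y'Y - beta'X'Y
    n, k = len(y), len(x_rows[0])
    return beta, sqe, sqe / (n - k - 1)

# Hypothetical toy data: two predictors coded as -1 / +1, four observations.
x_rows = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
y = [1, 3, 2, 6]
beta, sqe, qme = fit(x_rows, y)
print("beta_hat =", beta, " SQE =", sqe, " QME =", qme)
```

For this design $X^TX = 4I$, so the estimates can be verified by hand: $\hat\beta = (12/4,\ 4/4,\ 6/4) = (3,\ 1,\ 1.5)$ and $SQE = 50 - 49 = 1$.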

(13)

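Model Significance

Coefficient of determination

$$\underbrace{\sum_{i=1}^{n}\left(y_i - \bar y\right)^2}_{SQTot} = \underbrace{\sum_{i=1}^{n}\left(y_i - \hat y_i\right)^2}_{SQE} + \underbrace{\sum_{i=1}^{n}\left(\hat y_i - \bar y\right)^2}_{SQReg}$$

$$R^2 = \frac{SQReg}{SQTot} = 1 - \frac{SQE}{S_{YY}}$$

Hypotheses:

$$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0 \qquad H_1: \beta_i \neq 0 \text{ for at least one } i\ (i = 1, 2, \dots, k)$$

Analysis of variance table (OV: source of variation; GL: degrees of freedom; SQ: sum of squares; QM: mean square):

| OV | GL | SQ | QM | F |
|---|---|---|---|---|
| Regression | $k$ | $\hat\beta^T X^T Y - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2$ | $QMReg = \dfrac{SQReg}{k}$ | $F = \dfrac{QMReg}{QME}$ |
| Error | $n-k-1$ | $Y^T Y - \hat\beta^T X^T Y$ | $QME = \dfrac{SQE}{n-k-1}$ | |
| Total | $n-1$ | $Y^T Y - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2$ | | |

Reject $H_0$ at level $\alpha$ when $F > f_{k,\,n-k-1,\,1-\alpha}$.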
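The entries of the analysis of variance table can be sketched as a small function. The data are hypothetical: four observations, and fitted values assumed to come from a least squares fit with an intercept and k = 2 predictors (the identity SQReg = SQTot - SQE used below holds only for such a fit). The function name `anova` is illustrative.

```python
def anova(y, fitted, k):
    """ANOVA quantities for a least squares fit with an intercept and k predictors."""
    n = len(y)
    y_bar = sum(y) / n
    sq_tot = sum((v - y_bar) ** 2 for v in y)             # total, n - 1 df
    sq_e = sum((v - f) ** 2 for v, f in zip(y, fitted))   # error, n - k - 1 df
    sq_reg = sq_tot - sq_e                                # regression, k df
    qm_reg = sq_reg / k
    qme = sq_e / (n - k - 1)
    return {"SQReg": sq_reg, "SQE": sq_e, "SQTot": sq_tot,
            "QMReg": qm_reg, "QME": qme,
            "F": qm_reg / qme, "R2": sq_reg / sq_tot}

# Hypothetical observations and fitted values (assumed least squares fit, k = 2).
y = [1, 3, 2, 6]
fitted = [0.5, 3.5, 2.5, 5.5]
table = anova(y, fitted, k=2)
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

The resulting F would then be compared with the tabulated quantile $f_{k,\,n-k-1,\,1-\alpha}$, which is not computed here.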

(14)

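Partial Contributions

Determine whether the variables $X_1, X_2, \dots, X_r$ contribute significantly to the regression model.

Partition the coefficient vector and the design matrix accordingly:

$$\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix},\quad \beta_1 \text{ of dimension } r \times 1,\quad \beta_2 \text{ of dimension } (k+1-r) \times 1$$

$$Y = X_1\beta_1 + X_2\beta_2 + \varepsilon$$

$$H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0$$

Fit the model assuming $H_0: \beta_1 = 0$ is true:

$$Y = X_2\beta_2 + \varepsilon \ \rightarrow\ \hat\beta_2 = \left(X_2^T X_2\right)^{-1} X_2^T Y$$

$$SQReg\left(\beta_2\right) = \hat\beta_2^T X_2^T Y$$

$$SQReg\left(\beta_1 \mid \beta_2\right) = SQReg\left(\beta\right) - SQReg\left(\beta_2\right)$$

Reject $H_0$ at level $\alpha$ if and only if

$$F = \frac{SQReg\left(\beta_1 \mid \beta_2\right)/r}{QME} > f_{r,\,n-k-1,\,1-\alpha}$$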
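The extra sum of squares can be sketched by fitting the full and the reduced model and taking the difference of $\hat\beta^T X^T Y$. The example below is a pure-Python illustration on hypothetical data with k = 2 predictors, testing r = 1 coefficient (that of `x1`); all names are chosen for this sketch.

```python
def solve(a, b):
    """Solve a x = b (a square, non-singular) by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def sq_reg(columns, y):
    """beta_hat' X' Y for the model whose design matrix has the given columns."""
    XtX = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    XtY = [sum(u * v for u, v in zip(c, y)) for c in columns]
    beta = solve(XtX, XtY)
    return sum(b * t for b, t in zip(beta, XtY))

# Hypothetical toy data: intercept column, two predictors coded -1 / +1.
ones = [1, 1, 1, 1]
x1 = [-1, -1, 1, 1]
x2 = [-1, 1, -1, 1]
y = [1, 3, 2, 6]
n, k, r = 4, 2, 1

full = sq_reg([ones, x1, x2], y)       # SQReg(beta), full model
reduced = sq_reg([ones, x2], y)        # SQReg(beta_2), model fitted under H0
extra = full - reduced                 # SQReg(beta_1 | beta_2)
qme = (sum(v * v for v in y) - full) / (n - k - 1)   # QME of the full model
F = (extra / r) / qme
print(f"SQReg(full)={full}, SQReg(reduced)={reduced}, extra={extra}, F={F}")
```

The statistic F would then be compared with $f_{r,\,n-k-1,\,1-\alpha}$ from an F table.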

(15)

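Inference

Confidence interval of level $(1-\alpha)100\%$ for $\beta_j$:

$$\left[\ \hat\beta_j - t_{n-k-1,\,1-\alpha/2}\sqrt{\hat\sigma^2 C_{jj}}\ ,\ \ \hat\beta_j + t_{n-k-1,\,1-\alpha/2}\sqrt{\hat\sigma^2 C_{jj}}\ \right]$$

$C_{jj}$: diagonal element of the matrix $\left(X^T X\right)^{-1}$ corresponding to $\beta_j$

Hypothesis tests for $\beta_j$ (the superscript 0 denotes the value specified under $H_0$):

$$H_0: \beta_j = \beta_j^{0} \quad\rightarrow\quad \text{test statistic: } T = \frac{\hat\beta_j - \beta_j^{0}}{\sqrt{\hat\sigma^2 C_{jj}}} \sim t_{n-k-1}$$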

(16)

Inference

For a new point

$$x_0 = \begin{bmatrix} 1 \\ x_{01} \\ \vdots \\ x_{0k} \end{bmatrix},\qquad \hat E\left(Y_0\right) = \hat y_0 = x_0^T\hat\beta$$

Confidence interval of level $(1-\alpha)100\%$ for $E(Y_0)$:

$$\hat y_0 \pm t_{n-k-1,\,1-\alpha/2}\sqrt{\hat\sigma^2\, x_0^T\left(X^T X\right)^{-1} x_0}$$

Prediction of New Observations

$$\hat y_0 \pm t_{n-k-1,\,1-\alpha/2}\sqrt{\hat\sigma^2\left(1 + x_0^T\left(X^T X\right)^{-1} x_0\right)}$$







(17)

Example 1:

$X_1$: Fertilizer; $X_2$: Precipitation

Y: Production of a cereal (kg/ha)

Coefficients (dependent variable: Production)

| Model 1 | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. | 95% CI for B, lower bound | 95% CI for B, upper bound |
|---|---|---|---|---|---|---|---|
| (Constant) | 126.314 | 181.914 | | 0.694 | 0.518 | -341.311 | 593.939 |
| Fertilizer | 0.781 | 0.216 | 0.463 | 3.616 | 0.015 | 0.226 | 1.337 |
| Precip. | 1.032 | 0.241 | 0.547 | 4.276 | 0.008 | 0.412 | 1.652 |

ANOVA (predictors: (Constant), Precip., Fertilizer; dependent variable: Production)

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 1185768 | 2 | 592884.236 | 266.608 | 0.000 |
| Residual | 11119.028 | 5 | 2223.806 | | |
| Total | 1196887 | 7 | | | |

Model Summary (predictors: (Constant), Precip., Fertilizer; dependent variable: Production)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | 0.995 | 0.991 | 0.987 | 47.15724 |

Correlations

| | | Production | Fertilizer | Precip. |
|---|---|---|---|---|
| Pearson Correlation | Production | 1.000 | 0.978 | 0.983 |
| | Fertilizer | 0.978 | 1.000 | 0.942 |
| | Precip. | 0.983 | 0.942 | 1.000 |
| Sig. (1-tailed) | Production | . | 0.000 | 0.000 |
| | Fertilizer | 0.000 | . | 0.000 |
| | Precip. | 0.000 | 0.000 | . |
| N | Production | 8 | 8 | 8 |
| | Fertilizer | 8 | 8 | 8 |
| | Precip. | 8 | 8 | 8 |

Descriptive Statistics

| | Mean | Std. Deviation |
|---|---|---|
| Production | 1631.2500 | 413.50203 |
| Fertilizer | 450.0000 | 244.94897 |
| Precip. | 1117.5000 | 219.26826 |
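The figures in this output can be tied back to the formulas of the earlier slides. The sketch below re-uses only the values printed in the tables (some of them rounded, so small discrepancies are expected); the variable names are chosen for this example, and the critical value $t_{5,\,0.975} = 2.571$ mentioned afterwards comes from a standard t table.

```python
import math

# Values copied from the ANOVA table (n = 8 observations, k = 2 predictors).
sq_reg, df_reg = 1185768.0, 2
sq_e, df_e = 11119.028, 5
sq_tot, df_tot = 1196887.0, 7

qm_reg = sq_reg / df_reg                   # mean square, regression
qme = sq_e / df_e                          # mean square, error
f_stat = qm_reg / qme                      # F = QMReg / QME
r2 = sq_reg / sq_tot                       # R^2 = SQReg / SQTot
adj_r2 = 1 - qme / (sq_tot / df_tot)       # adjusted R^2
std_err_estimate = math.sqrt(qme)          # sigma_hat = sqrt(QME)

# Coefficient of Fertilizer: B, Std. Error and 95% bounds from the Coefficients table.
b, se, lo, hi = 0.781, 0.216, 0.226, 1.337
t_fert = b / se                            # t statistic for H0: beta = 0
mult = (hi - lo) / (2 * se)                # implied t critical value

print(f"QME = {qme:.3f}, F = {f_stat:.3f}")
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}, std. error = {std_err_estimate:.5f}")
print(f"t (Fertilizer) = {t_fert:.3f}, implied t critical value = {mult:.3f}")
```

The recomputed values agree with the reported F (266.608), R Square (0.991), Adjusted R Square (0.987) and Std. Error of the Estimate (47.15724) up to rounding, and the implied multiplier is close to $t_{5,\,0.975} = 2.571$, consistent with $n - k - 1 = 8 - 2 - 1 = 5$ degrees of freedom.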
