
MOMENTS AND LIKELIHOOD FOR GENERALIZED LINEAR MODELS*

From the document Categorical Data Analysis (pages 147-154)

Having introduced GLMs for binary and count data, we now turn our attention to details such as likelihood equations and methods for fitting them. The remainder of this chapter is somewhat technical, providing general results applying to most modeling methods presented in subsequent chapters.

See McCullagh and Nelder (1989) for further details.

It is helpful to extend the notation for a GLM so that it can handle many distributions that have a second parameter. The random component of the GLM specifies that the $N$ observations $(y_1, \ldots, y_N)$ on $Y$ are independent, with probability mass or density function for $y_i$ of form

$$f(y_i; \theta_i, \phi) = \exp\{[y_i\theta_i - b(\theta_i)]/a(\phi) + c(y_i, \phi)\}. \tag{4.14}$$

This is called the exponential dispersion family and $\phi$ is called the dispersion parameter (Jørgensen 1987). The parameter $\theta_i$ is the natural parameter.

When $\phi$ is known, (4.14) simplifies to the form (4.1) for the natural exponential family, which is

$$f(y_i; \theta_i) = a(\theta_i)\, b(y_i) \exp[y_i Q(\theta_i)].$$

We identify $Q(\theta)$ here with $\theta/a(\phi)$ in (4.14), $a(\theta)$ with $\exp[-b(\theta)/a(\phi)]$ in (4.14), and $b(y)$ with $\exp[c(y, \phi)]$ in (4.14). The more general formula (4.14) is not needed for one-parameter families such as the binomial and Poisson.

Usually, $a(\phi)$ has form $a(\phi) = \phi/\omega_i$ for a known weight $\omega_i$. For instance, when $y_i$ is a mean of $n_i$ independent readings, such as a sample proportion for $n_i$ Bernoulli trials, $\omega_i = n_i$ (Section 4.4.2).
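As an illustration of a family that does use the dispersion parameter, consider the normal distribution with mean $\mu_i$ and variance $\sigma^2$. This example is an addition for illustration rather than part of the passage above; it only expands the square in the usual normal density:

$$f(y_i; \mu_i, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(y_i - \mu_i)^2}{2\sigma^2}\right] = \exp\left\{\frac{y_i\mu_i - \mu_i^2/2}{\sigma^2} - \frac{y_i^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2)\right\}.$$

Matching terms with (4.14) suggests $\theta_i = \mu_i$, $b(\theta_i) = \theta_i^2/2$, $a(\phi) = \phi = \sigma^2$, and $c(y_i, \phi) = -y_i^2/(2\phi) - \frac{1}{2}\log(2\pi\phi)$, so here the natural parameter is the mean itself and the dispersion parameter is the variance.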

4.4.1 Mean and Variance Functions for the Random Component

General expressions for $E(Y_i)$ and $\mathrm{var}(Y_i)$ use terms in (4.14). Let $L_i = \log f(y_i; \theta_i, \phi)$ denote the contribution of $y_i$ to the log likelihood; that is, the log-likelihood function is $L = \sum_i L_i$. Then, from (4.14),

$$L_i = [y_i\theta_i - b(\theta_i)]/a(\phi) + c(y_i, \phi). \tag{4.15}$$

Therefore,

$$\partial L_i/\partial\theta_i = [y_i - b'(\theta_i)]/a(\phi), \qquad \partial^2 L_i/\partial\theta_i^2 = -b''(\theta_i)/a(\phi),$$

where $b'(\theta_i)$ and $b''(\theta_i)$ denote the first two derivatives of $b(\cdot)$ evaluated at $\theta_i$. We now apply the general likelihood results

$$E\left(\frac{\partial L}{\partial\theta}\right) = 0 \qquad \text{and} \qquad -E\left(\frac{\partial^2 L}{\partial\theta^2}\right) = E\left(\frac{\partial L}{\partial\theta}\right)^2,$$

which hold under regularity conditions satisfied by the exponential family (Cox and Hinkley 1974, Sec. 4.8). From the first formula applied with a single observation, $E[Y_i - b'(\theta_i)]/a(\phi) = 0$, or

$$\mu_i = E(Y_i) = b'(\theta_i). \tag{4.16}$$

INTRODUCTION TO GENERALIZED LINEAR MODELS


From the second formula,

$$b''(\theta_i)/a(\phi) = E\{[Y_i - b'(\theta_i)]/a(\phi)\}^2 = \mathrm{var}(Y_i)/[a(\phi)]^2,$$

so that

$$\mathrm{var}(Y_i) = b''(\theta_i)\, a(\phi). \tag{4.17}$$

In summary, the function $b(\cdot)$ in (4.14) determines moments of $Y_i$.

4.4.2 Mean and Variance Functions for Poisson and Binomial

We illustrate the mean and variance expressions for Poisson and binomial distributions. When $Y_i$ is Poisson,

$$f(y_i; \mu_i) = \frac{e^{-\mu_i}\mu_i^{y_i}}{y_i!} = \exp(y_i\log\mu_i - \mu_i - \log y_i!) = \exp[y_i\theta_i - \exp(\theta_i) - \log y_i!],$$

where $\theta_i = \log\mu_i$. This has exponential dispersion form (4.14) with $b(\theta_i) = \exp(\theta_i)$, $a(\phi) = 1$, and $c(y_i, \phi) = -\log y_i!$. The natural parameter is $\theta_i = \log\mu_i$. From (4.16) and (4.17),

$$E(Y_i) = b'(\theta_i) = \exp(\theta_i) = \mu_i, \qquad \mathrm{var}(Y_i) = b''(\theta_i) = \exp(\theta_i) = \mu_i.$$
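The Poisson result can be checked numerically. The sketch below is an illustration added here, not part of the original derivation: it differentiates $b(\theta) = \exp(\theta)$ by central differences and compares the results with moments summed directly from the Poisson probability mass function. The function names, the step size `h`, the truncation point `upper`, and the value `mu = 3.5` are all assumptions chosen for the example.

```python
import math

def b_poisson(theta):
    """Function b(theta) = exp(theta) from the Poisson form of (4.14)."""
    return math.exp(theta)

def derivative(f, x, h=1e-4):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Central-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def poisson_moments(mu, upper=200):
    """Mean and variance summed from the Poisson pmf, truncated at `upper`."""
    pmf = [math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))
           for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, var

mu = 3.5                       # hypothetical Poisson mean
theta = math.log(mu)           # natural parameter

mean_from_b = derivative(b_poisson, theta)         # (4.16)
var_from_b = second_derivative(b_poisson, theta)   # (4.17) with a(phi) = 1
mean_direct, var_direct = poisson_moments(mu)
```

Under these assumptions, `mean_from_b` and `var_from_b` should both agree with `mu` up to the finite-difference error, and so should the directly summed moments.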

Next, suppose that $n_iY_i$ has a $\mathrm{bin}(n_i, \pi_i)$ distribution; that is, here $y_i$ is the sample proportion rather than number of successes, so $E(Y_i)$ is independent of $n_i$. Let $\theta_i = \log[\pi_i/(1 - \pi_i)]$. Then, $\pi_i = \exp(\theta_i)/[1 + \exp(\theta_i)]$ and $\log(1 - \pi_i) = -\log[1 + \exp(\theta_i)]$. Extending (4.3), one can show that

$$f(y_i; \pi_i, n_i) = \binom{n_i}{n_iy_i}\pi_i^{n_iy_i}(1 - \pi_i)^{n_i - n_iy_i} = \exp\left\{\frac{y_i\theta_i - \log[1 + \exp(\theta_i)]}{1/n_i} + \log\binom{n_i}{n_iy_i}\right\}. \tag{4.18}$$

This has exponential dispersion form (4.14) with $b(\theta_i) = \log[1 + \exp(\theta_i)]$, $a(\phi) = 1/n_i$, and $c(y_i, \phi) = \log\binom{n_i}{n_iy_i}$. The natural parameter is the logit, $\theta_i = \log[\pi_i/(1 - \pi_i)]$. From (4.16) and (4.17),

$$E(Y_i) = b'(\theta_i) = \exp(\theta_i)/[1 + \exp(\theta_i)] = \pi_i,$$

$$\mathrm{var}(Y_i) = b''(\theta_i)\, a(\phi) = \exp(\theta_i)/\{[1 + \exp(\theta_i)]^2 n_i\} = \pi_i(1 - \pi_i)/n_i.$$
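The binomial expressions can be verified in the same spirit. The following sketch, added for illustration, enumerates the distribution of the sample proportion for one hypothetical pair `n = 12`, `pi = 0.3` and compares its mean and variance with $b'(\theta_i)$ and $b''(\theta_i)\,a(\phi)$; the function names are assumptions made for this example.

```python
import math

def binomial_proportion_moments(n, pi):
    """Mean and variance of Y = S/n for S ~ bin(n, pi), by direct enumeration."""
    pmf = [math.comb(n, s) * pi ** s * (1 - pi) ** (n - s) for s in range(n + 1)]
    mean = sum((s / n) * p for s, p in enumerate(pmf))
    var = sum((s / n - mean) ** 2 * p for s, p in enumerate(pmf))
    return mean, var

def b_prime(theta):
    """b'(theta) for b(theta) = log(1 + exp(theta))."""
    return math.exp(theta) / (1 + math.exp(theta))

def b_double_prime(theta):
    """b''(theta) for b(theta) = log(1 + exp(theta))."""
    return math.exp(theta) / (1 + math.exp(theta)) ** 2

n, pi = 12, 0.3                       # hypothetical number of trials and probability
theta = math.log(pi / (1 - pi))       # natural parameter (the logit)

mean_formula = b_prime(theta)             # (4.16)
var_formula = b_double_prime(theta) / n   # (4.17) with a(phi) = 1/n
mean_direct, var_direct = binomial_proportion_moments(n, pi)
```

Both routes should give a mean of `pi` and a variance of `pi * (1 - pi) / n`, which shows how the factor $a(\phi) = 1/n_i$ carries the dependence on the number of trials.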

4.4.3 Systematic Component and Link Function

Let $(x_{i1}, \ldots, x_{ip})$ denote values of explanatory variables for observation $i$. The systematic component of a GLM relates parameters $\{\eta_i\}$ to these variables using a linear predictor

$$\eta_i = \sum_j \beta_j x_{ij}, \qquad i = 1, \ldots, N.$$

In matrix form,

$$\boldsymbol{\eta} = \mathbf{X}\boldsymbol{\beta},$$

where $\boldsymbol{\eta} = (\eta_1, \ldots, \eta_N)'$, $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_p)'$ are column vectors of model parameters, and $\mathbf{X}$ is the $N \times p$ matrix of values of the explanatory variables for the $N$ subjects. In ordinary linear models, $\mathbf{X}$ is called the design matrix. It need not refer to an experimental design, however, and the GLM literature calls it the model matrix.

The GLM links $\eta_i$ to $\mu_i = E(Y_i)$ by a link function $g(\cdot)$. Thus, $\mu_i$ relates to the explanatory variables by

$$\eta_i = g(\mu_i) = \sum_j \beta_j x_{ij}, \qquad i = 1, \ldots, N.$$

The link function $g$ for which $g(\mu_i) = \theta_i$ in (4.14) is the canonical link. For it, the direct relationship

$$\theta_i = \sum_j \beta_j x_{ij}$$

occurs between the natural parameter and the linear predictor.

Since $\mu_i = b'(\theta_i)$, the natural parameter is the function of the mean, $\theta_i = (b')^{-1}(\mu_i)$, where $(b')^{-1}(\cdot)$ denotes the inverse function to $b'$. Thus, the canonical link is the inverse of $b'$. In the Poisson case, for instance, $b(\theta_i) = \exp(\theta_i)$, so $b'(\theta_i) = \exp(\theta_i) = \mu_i$. Thus, $(b')^{-1}(\cdot)$ is the inverse of the exponential function, which is the log function (i.e., $\theta_i = \log\mu_i$). The canonical link is the log link.
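To make the statement that the canonical link is the inverse of $b'$ concrete, the sketch below inverts $b'$ numerically by bisection and compares the result with the log and logit functions. It is an illustration under stated assumptions: the helper `invert_increasing`, its search interval, and the test values are hypothetical, and bisection is used only because $b'$ is increasing (its derivative $b''$ is a variance divided by $a(\phi)$, hence positive).

```python
import math

def invert_increasing(f, target, lo=-30.0, hi=30.0, tol=1e-12):
    """Solve f(theta) = target by bisection, assuming f is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def b_prime_poisson(theta):
    """b'(theta) for b(theta) = exp(theta)."""
    return math.exp(theta)

def b_prime_binomial(theta):
    """b'(theta) for b(theta) = log(1 + exp(theta))."""
    return 1.0 / (1.0 + math.exp(-theta))

mu = 2.7   # hypothetical Poisson mean
pi = 0.2   # hypothetical binomial probability

theta_poisson = invert_increasing(b_prime_poisson, mu)     # should match log(mu)
theta_binomial = invert_increasing(b_prime_binomial, pi)   # should match logit(pi)
```

The numerical inverses should agree with `math.log(mu)` and `math.log(pi / (1 - pi))`, consistent with the log link being canonical for the Poisson and the logit link being canonical for the binomial.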

4.4.4 Likelihood Equations for a GLM

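For $N$ independent observations, from (4.15) the log likelihood is

$$L(\boldsymbol{\beta}) = \sum_i L_i = \sum_i \log f(y_i; \theta_i, \phi) = \sum_i \frac{y_i\theta_i - b(\theta_i)}{a(\phi)} + \sum_i c(y_i, \phi). \tag{4.19}$$

The notation $L(\boldsymbol{\beta})$ reflects the dependence of $\boldsymbol{\theta}$ on the model parameters $\boldsymbol{\beta}$.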


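The likelihood equations are

$$\partial L(\boldsymbol{\beta})/\partial\beta_j = \sum_i \partial L_i/\partial\beta_j = 0$$

for all $j$. To differentiate the log likelihood (4.19), we use the chain rule,

$$\frac{\partial L_i}{\partial\beta_j} = \frac{\partial L_i}{\partial\theta_i}\, \frac{\partial\theta_i}{\partial\mu_i}\, \frac{\partial\mu_i}{\partial\eta_i}\, \frac{\partial\eta_i}{\partial\beta_j}. \tag{4.20}$$

Since $\partial L_i/\partial\theta_i = [y_i - b'(\theta_i)]/a(\phi)$, and since $\mu_i = b'(\theta_i)$ and $\mathrm{var}(Y_i) = b''(\theta_i)\, a(\phi)$ from (4.16) and (4.17),

$$\partial L_i/\partial\theta_i = (y_i - \mu_i)/a(\phi), \qquad \partial\mu_i/\partial\theta_i = b''(\theta_i) = \mathrm{var}(Y_i)/a(\phi).$$

Also, since $\eta_i = \sum_j \beta_j x_{ij}$,

$$\partial\eta_i/\partial\beta_j = x_{ij}.$$

Finally, since $\eta_i = g(\mu_i)$, $\partial\mu_i/\partial\eta_i$ depends on the link function for the model. In summary, substituting into (4.20) gives us

$$\frac{\partial L_i}{\partial\beta_j} = \frac{y_i - \mu_i}{a(\phi)}\, \frac{a(\phi)}{\mathrm{var}(Y_i)}\, \frac{\partial\mu_i}{\partial\eta_i}\, x_{ij} = \frac{(y_i - \mu_i)\, x_{ij}}{\mathrm{var}(Y_i)}\, \frac{\partial\mu_i}{\partial\eta_i}. \tag{4.21}$$

The likelihood equations are

$$\sum_{i=1}^{N} \frac{(y_i - \mu_i)\, x_{ij}}{\mathrm{var}(Y_i)}\, \frac{\partial\mu_i}{\partial\eta_i} = 0, \qquad j = 1, \ldots, p. \tag{4.22}$$

Although $\boldsymbol{\beta}$ does not appear in these equations, it is there implicitly through $\mu_i$, since $\mu_i = g^{-1}(\sum_j \beta_j x_{ij})$. Different link functions yield different sets of equations.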

Interestingly, the likelihood equations (4.22) depend on the distribution of $Y_i$ only through $\mu_i$ and $\mathrm{var}(Y_i)$. The variance itself depends on the mean through a particular functional form

$$\mathrm{var}(Y_i) = v(\mu_i)$$

for some function $v$, such as $v(\mu_i) = \mu_i$ for the Poisson, $v(\mu_i) = \mu_i(1 - \mu_i)$ for the Bernoulli, and $v(\mu_i) = \sigma^2$ (i.e., constant) for the normal. When $Y_i$ has distribution in the natural exponential family, the relationship between the mean and the variance characterizes the distribution (Jørgensen 1987). For instance, if $Y_i$ has distribution in the natural exponential family and if $v(\mu_i) = \mu_i$, then necessarily $Y_i$ has the Poisson distribution.
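Because (4.22) involves only the mean, the variance function, and the derivative of the inverse link, its left-hand side can be computed by one generic routine. The sketch below is a hypothetical illustration, not a fitting algorithm from the text: the function `glm_score`, the small data set, and the trial value of `beta` are assumptions, and the variance is taken as a function of the mean alone (so any weight in $a(\phi)$ is assumed to equal 1).

```python
import math

def glm_score(beta, X, y, inv_link, dmu_deta, variance):
    """Left-hand side of (4.22), returned as a list with one entry per beta_j."""
    u = [0.0] * len(beta)
    for x_i, y_i in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, x_i))   # linear predictor eta_i
        mu = inv_link(eta)                             # mu_i = g^{-1}(eta_i)
        factor = (y_i - mu) / variance(mu) * dmu_deta(eta)
        for j, x_ij in enumerate(x_i):
            u[j] += factor * x_ij
    return u

def poisson_variance(mu):
    """Variance function v(mu) = mu for the Poisson."""
    return mu

# Hypothetical data: an intercept column plus one predictor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 4.0, 9.0]
beta = [0.2, 0.6]   # trial value, not an ML estimate

# Log link: mu = exp(eta), so dmu/deta = exp(eta).
score_log = glm_score(beta, X, y, math.exp, math.exp, poisson_variance)

# Identity link: mu = eta, so dmu/deta = 1.
score_identity = glm_score(beta, X, y, lambda eta: eta, lambda eta: 1.0,
                           poisson_variance)
```

With the same Poisson variance function, the two links produce different score vectors at the same `beta`, which is the sense in which different link functions yield different sets of equations.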

4.4.5 Likelihood Equations for Binomial GLMs

Using notation from Section 4.4.2, suppose that $n_iY_i$ has a $\mathrm{bin}(n_i, \pi_i)$ distribution. Then $y_i$ is a sample proportion of successes for $n_i$ trials. The binomial GLM (4.8) for a single predictor extends with several predictors to

$$\pi_i = \Phi\left(\sum_j \beta_j x_{ij}\right), \tag{4.23}$$

where $\Phi$ is the standard cdf of some class of continuous distributions. Since $\pi_i = \mu_i = \Phi(\eta_i)$ with $\eta_i = \sum_j \beta_j x_{ij}$,

$$\partial\mu_i/\partial\eta_i = \phi(\eta_i) = \phi\left(\sum_j \beta_j x_{ij}\right),$$

where $\phi(u) = \partial\Phi(u)/\partial u$ (i.e., the probability density function corresponding to the cdf $\Phi$). Since $\mathrm{var}(Y_i) = \pi_i(1 - \pi_i)/n_i$, the likelihood equations (4.22) simplify to

$$\sum_i \frac{n_i(y_i - \pi_i)\, x_{ij}}{\pi_i(1 - \pi_i)}\, \phi\left(\sum_j \beta_j x_{ij}\right) = 0, \tag{4.24}$$

where $\pi_i = \Phi(\sum_j \beta_j x_{ij})$. These depend on the link function $\Phi^{-1}$ through the derivative of its inverse.

For the logit link, $\eta_i = \log[\pi_i/(1 - \pi_i)]$, so $\partial\eta_i/\partial\pi_i = 1/[\pi_i(1 - \pi_i)]$ and $\partial\mu_i/\partial\eta_i = \partial\pi_i/\partial\eta_i = \pi_i(1 - \pi_i)$. Then the likelihood equations (4.22) and (4.24) simplify to

$$\sum_i n_i(y_i - \pi_i)\, x_{ij} = 0, \tag{4.25}$$

where $\pi_i$ satisfies (4.23) with $\Phi$ the standard logistic cdf.
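The reduction from (4.24) to (4.25) can be checked numerically. In the sketch below, which is an added illustration with hypothetical data and function names, the general binomial score is evaluated with the standard logistic cdf and density and compared with the simplified logit form; the same general routine is then evaluated with the standard normal cdf and density (a probit model) to show that (4.24) does not simplify in the same way for other choices of $\Phi$.

```python
import math

def logistic_cdf(u):
    return 1.0 / (1.0 + math.exp(-u))

def logistic_pdf(u):
    p = logistic_cdf(u)
    return p * (1.0 - p)

def normal_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def binomial_score(beta, X, y, n, cdf, pdf):
    """Left-hand side of (4.24) for a binomial GLM with pi_i = cdf(eta_i)."""
    u = [0.0] * len(beta)
    for x_i, y_i, n_i in zip(X, y, n):
        eta = sum(b * x for b, x in zip(beta, x_i))
        pi = cdf(eta)
        factor = n_i * (y_i - pi) / (pi * (1.0 - pi)) * pdf(eta)
        for j, x_ij in enumerate(x_i):
            u[j] += factor * x_ij
    return u

def logit_score(beta, X, y, n):
    """Left-hand side of (4.25), the logit-link special case."""
    u = [0.0] * len(beta)
    for x_i, y_i, n_i in zip(X, y, n):
        eta = sum(b * x for b, x in zip(beta, x_i))
        pi = logistic_cdf(eta)
        for j, x_ij in enumerate(x_i):
            u[j] += n_i * (y_i - pi) * x_ij
    return u

# Hypothetical grouped data: intercept plus one predictor.
X = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
n = [10, 12, 8, 15]            # numbers of trials
y = [0.2, 0.25, 0.5, 0.8]      # sample proportions of successes
beta = [-0.5, 0.7]             # trial value, not an ML estimate

general = binomial_score(beta, X, y, n, logistic_cdf, logistic_pdf)
special = logit_score(beta, X, y, n)
probit = binomial_score(beta, X, y, n, normal_cdf, normal_pdf)
```

Here `general` and `special` should coincide up to rounding, because the logistic density equals $\pi_i(1 - \pi_i)$ and cancels the denominator in (4.24); no such cancellation occurs for `probit`.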

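4.4.6 Asymptotic Covariance Matrix of Model Parameter Estimators

The likelihood function for the GLM also determines the asymptotic covariance matrix of the ML estimator $\hat{\boldsymbol{\beta}}$. This matrix is the inverse of the information matrix $\mathcal{I}$, which has elements $E[-\partial^2 L(\boldsymbol{\beta})/\partial\beta_h\,\partial\beta_j]$. To find this, for the contribution $L_i$ to the log likelihood we use the helpful result

$$E\left(\frac{\partial^2 L_i}{\partial\beta_h\,\partial\beta_j}\right) = -E\left[\left(\frac{\partial L_i}{\partial\beta_h}\right)\left(\frac{\partial L_i}{\partial\beta_j}\right)\right],$$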


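which holds for exponential families (Cox and Hinkley 1974, Sec. 4.8). Thus,

$$E\left(\frac{\partial^2 L_i}{\partial\beta_h\,\partial\beta_j}\right) = -E\left[\frac{(Y_i - \mu_i)\, x_{ih}}{\mathrm{var}(Y_i)}\, \frac{\partial\mu_i}{\partial\eta_i}\, \frac{(Y_i - \mu_i)\, x_{ij}}{\mathrm{var}(Y_i)}\, \frac{\partial\mu_i}{\partial\eta_i}\right] \quad \text{from (4.21)}$$

$$= \frac{-x_{ih}\, x_{ij}}{\mathrm{var}(Y_i)} \left(\frac{\partial\mu_i}{\partial\eta_i}\right)^2.$$

Since $L(\boldsymbol{\beta}) = \sum_i L_i$,

$$E\left(-\frac{\partial^2 L(\boldsymbol{\beta})}{\partial\beta_h\,\partial\beta_j}\right) = \sum_{i=1}^{N} \frac{x_{ih}\, x_{ij}}{\mathrm{var}(Y_i)} \left(\frac{\partial\mu_i}{\partial\eta_i}\right)^2.$$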

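Generalizing from this typical element to the entire matrix, the information matrix has the form

$$\mathcal{I} = \mathbf{X}'\mathbf{W}\mathbf{X}, \tag{4.26}$$

where $\mathbf{W}$ is the diagonal matrix with main-diagonal elements

$$w_i = (\partial\mu_i/\partial\eta_i)^2/\mathrm{var}(Y_i). \tag{4.27}$$

The asymptotic covariance matrix of $\hat{\boldsymbol{\beta}}$ is estimated by

$$\widehat{\mathrm{cov}}(\hat{\boldsymbol{\beta}}) = \hat{\mathcal{I}}^{-1} = (\mathbf{X}'\hat{\mathbf{W}}\mathbf{X})^{-1}, \tag{4.28}$$

where $\hat{\mathbf{W}}$ is $\mathbf{W}$ evaluated at $\hat{\boldsymbol{\beta}}$. From (4.27), the form of $\mathbf{W}$ also depends on the link function. We'll see an example for Poisson GLMs next and for binomial GLMs in Section 5.5.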
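As a rough sketch of how (4.26) and (4.27) translate into a computation, the code below builds the weights $w_i$ and the matrix $\mathbf{X}'\mathbf{W}\mathbf{X}$ from supplied values of $\partial\mu_i/\partial\eta_i$ and $\mathrm{var}(Y_i)$. The function name, the data, and the choice of a binomial model with logit link as the test case are assumptions made for illustration; for that case the expressions in Section 4.4.5 imply $w_i = n_i\pi_i(1 - \pi_i)$.

```python
import math

def information_matrix(X, dmu_deta, var_y):
    """Return (X'WX, w) with w_i = (dmu_i/deta_i)^2 / var(Y_i), as in (4.26)-(4.27)."""
    p = len(X[0])
    w = [d * d / v for d, v in zip(dmu_deta, var_y)]
    info = [[sum(w_i * x_i[h] * x_i[j] for w_i, x_i in zip(w, X))
             for j in range(p)]
            for h in range(p)]
    return info, w

# Hypothetical grouped binomial data with a logit link.
X = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
n = [10, 12, 8, 15]
beta = [-0.5, 0.7]   # trial value standing in for an estimate

eta = [sum(b * x for b, x in zip(beta, x_i)) for x_i in X]
pi = [1.0 / (1.0 + math.exp(-e)) for e in eta]
dmu_deta = [p * (1.0 - p) for p in pi]                 # logit link
var_y = [p * (1.0 - p) / n_i for p, n_i in zip(pi, n)]  # variance of a proportion

info, w = information_matrix(X, dmu_deta, var_y)
```

Inverting `info` at the ML estimate would give the estimated covariance matrix (4.28); the inversion is left out here to keep the sketch short.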

4.4.7 Likelihood Equations and Covariance Matrix for Poisson Loglinear Model

The general Poisson loglinear model (4.4) has the matrix form

$$\log\boldsymbol{\mu} = \mathbf{X}\boldsymbol{\beta}.$$

For the log link, $\eta_i = \log\mu_i$, so $\mu_i = \exp(\eta_i)$ and $\partial\mu_i/\partial\eta_i = \exp(\eta_i) = \mu_i$. Since $\mathrm{var}(Y_i) = \mu_i$, the likelihood equations (4.22) simplify to

$$\sum_i (y_i - \mu_i)\, x_{ij} = 0. \tag{4.29}$$

These equate the sufficient statistics $\sum_i y_i x_{ij}$ for $\boldsymbol{\beta}$ to their expected values. Also, since

$$w_i = (\partial\mu_i/\partial\eta_i)^2/\mathrm{var}(Y_i) = \mu_i,$$

the estimated covariance matrix (4.28) of $\hat{\boldsymbol{\beta}}$ is $(\mathbf{X}'\hat{\mathbf{W}}\mathbf{X})^{-1}$, where $\hat{\mathbf{W}}$ is the diagonal matrix with elements of $\hat{\boldsymbol{\mu}}$ on the main diagonal.
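The following sketch solves (4.29) for a model with an intercept and one predictor and then forms the covariance matrix (4.28). It is an illustration under assumptions: this section does not specify a fitting algorithm, so Newton-Raphson is swapped in here, with the $2 \times 2$ linear system solved in closed form; the function names, the data, the starting value, and the fixed iteration count are all hypothetical.

```python
import math

def _pieces(b0, b1, x, y):
    """Fitted means, score vector (4.29), and distinct entries of X'WX with w_i = mu_i."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    score = (sum(yi - mi for yi, mi in zip(y, mu)),
             sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu)))
    info = (sum(mu),
            sum(mi * xi for xi, mi in zip(x, mu)),
            sum(mi * xi * xi for xi, mi in zip(x, mu)))
    return mu, score, info

def fit_poisson_loglinear(x, y, iterations=25):
    """Newton-Raphson for log(mu_i) = b0 + b1 * x_i; returns (b0, b1), cov, mu."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0   # start at the intercept-only fit
    for _ in range(iterations):
        _, (u0, u1), (i00, i01, i11) = _pieces(b0, b1, x, y)
        det = i00 * i11 - i01 * i01
        # Newton step: solve (X'WX) delta = score for the 2x2 case.
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    mu, _, (i00, i01, i11) = _pieces(b0, b1, x, y)
    det = i00 * i11 - i01 * i01
    cov = [[i11 / det, -i01 / det],
           [-i01 / det, i00 / det]]            # (X'WX)^{-1}, as in (4.28)
    return (b0, b1), cov, mu

# Hypothetical counts observed at five values of a single predictor.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 6.0, 7.0, 12.0]

beta_hat, cov_hat, mu_hat = fit_poisson_loglinear(x, y)
standard_errors = [math.sqrt(cov_hat[0][0]), math.sqrt(cov_hat[1][1])]
```

At convergence the fitted means should satisfy both equations in (4.29): the residuals `y - mu_hat` sum to zero, and so do the residuals weighted by `x`. The square roots of the diagonal of `cov_hat` then serve as estimated standard errors.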
