MOMENTS AND LIKELIHOOD FOR GENERALIZED LINEAR MODELS*

Having introduced GLMs for binary and count data, we now turn our attention to details such as likelihood equations and methods for fitting them. The remainder of this chapter is somewhat technical, providing general results applying to most modeling methods presented in subsequent chapters.

See McCullagh and Nelder (1989) for further details.

It is helpful to extend the notation for a GLM so that it can handle many distributions that have a second parameter. The random component of the GLM specifies that the $N$ observations $(y_1, \ldots, y_N)$ on $Y$ are independent, with probability mass or density function for $y_i$ of form

$$f(y_i; \theta_i, \phi) = \exp\{[y_i\theta_i - b(\theta_i)]/a(\phi) + c(y_i, \phi)\}. \tag{4.14}$$

This is called the exponential dispersion family, and $\phi$ is called the dispersion parameter (Jørgensen 1987). The parameter $\theta_i$ is the natural parameter.

When $\phi$ is known, (4.14) simplifies to the form (4.1) for the natural exponential family, which is

$$f(y_i; \theta_i) = a(\theta_i)\, b(y_i) \exp[y_i Q(\theta_i)].$$

We identify $Q(\theta)$ here with $\theta/a(\phi)$ in (4.14), $a(\theta)$ with $\exp[-b(\theta)/a(\phi)]$ in (4.14), and $b(y)$ with $\exp[c(y, \phi)]$ in (4.14). The more general formula (4.14) is not needed for one-parameter families such as the binomial and Poisson.

Usually, $a(\phi)$ has form $a(\phi) = \phi/\omega_i$ for a known weight $\omega_i$. For instance, when $y_i$ is a mean of $n_i$ independent readings, such as a sample proportion for $n_i$ Bernoulli trials, $\omega_i = n_i$ (Section 4.4.2).
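As an illustration of a family that does have a second parameter, the normal distribution can be arranged in form (4.14). This is a standard textbook example rather than one taken from the passage above; the symbol $\sigma^2$ for the normal variance is notation assumed here.

```latex
\begin{aligned}
f(y_i;\mu_i,\sigma^2)
  &= \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\left[-\frac{(y_i-\mu_i)^2}{2\sigma^2}\right] \\
  &= \exp\left\{\frac{y_i\mu_i-\mu_i^2/2}{\sigma^2}
     -\frac{y_i^2}{2\sigma^2}-\tfrac{1}{2}\log(2\pi\sigma^2)\right\}.
\end{aligned}
```

Matching terms with (4.14) gives $\theta_i = \mu_i$, $b(\theta_i) = \theta_i^2/2$, $\phi = \sigma^2$, $a(\phi) = \phi$, and $c(y_i, \phi) = -y_i^2/(2\phi) - \tfrac{1}{2}\log(2\pi\phi)$.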

4.4.1 Mean and Variance Functions for the Random Component

General expressions for $E(Y_i)$ and $\mathrm{var}(Y_i)$ use terms in (4.14). Let $L_i = \log f(y_i; \theta_i, \phi)$ denote the contribution of $y_i$ to the log likelihood; that is, the log-likelihood function is $L = \sum_i L_i$. Then, from (4.14),

$$L_i = [y_i\theta_i - b(\theta_i)]/a(\phi) + c(y_i, \phi). \tag{4.15}$$

Therefore,

$$\partial L_i/\partial\theta_i = [y_i - b'(\theta_i)]/a(\phi), \qquad \partial^2 L_i/\partial\theta_i^2 = -b''(\theta_i)/a(\phi),$$

where $b'(\theta_i)$ and $b''(\theta_i)$ denote the first two derivatives of $b(\cdot)$ evaluated at $\theta_i$. We now apply the general likelihood results

$$E\left(\frac{\partial L}{\partial\theta}\right) = 0 \quad\text{and}\quad -E\left(\frac{\partial^2 L}{\partial\theta^2}\right) = E\left[\left(\frac{\partial L}{\partial\theta}\right)^2\right],$$

which hold under regularity conditions satisfied by the exponential family (Cox and Hinkley 1974, Sec. 4.8). From the first formula applied with a single observation, $E[Y_i - b'(\theta_i)]/a(\phi) = 0$, or

$$\mu_i = E(Y_i) = b'(\theta_i). \tag{4.16}$$

INTRODUCTION TO GENERALIZED LINEAR MODELS

134

From the second formula,

$$\frac{b''(\theta_i)}{a(\phi)} = E\left[\left(\frac{Y_i - b'(\theta_i)}{a(\phi)}\right)^2\right] = \frac{\mathrm{var}(Y_i)}{[a(\phi)]^2},$$

so that

$$\mathrm{var}(Y_i) = b''(\theta_i)\,a(\phi). \tag{4.17}$$

In summary, the function $b(\cdot)$ in (4.14) determines moments of $Y_i$.

4.4.2 Mean and Variance Functions for Poisson and Binomial

We illustrate the mean and variance expressions for Poisson and binomial distributions. When $Y_i$ is Poisson,

$$f(y_i; \mu_i) = \frac{e^{-\mu_i}\mu_i^{y_i}}{y_i!} = \exp(y_i\log\mu_i - \mu_i - \log y_i!) = \exp[y_i\theta_i - \exp(\theta_i) - \log y_i!],$$

where $\theta_i = \log\mu_i$. This has exponential dispersion form (4.14) with $b(\theta_i) = \exp(\theta_i)$, $a(\phi) = 1$, and $c(y_i, \phi) = -\log y_i!$. The natural parameter is $\theta_i = \log\mu_i$. From (4.16) and (4.17),

$$E(Y_i) = b'(\theta_i) = \exp(\theta_i) = \mu_i, \qquad \mathrm{var}(Y_i) = b''(\theta_i) = \exp(\theta_i) = \mu_i.$$
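The Poisson identification can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `exp_dispersion_pmf`, the mean value 3.5, and the truncation of the support at 100 are all assumptions chosen for illustration.

```python
import math

def exp_dispersion_pmf(y, theta, b, a_phi, c):
    """Evaluate form (4.14): exp{[y*theta - b(theta)]/a(phi) + c(y, phi)}."""
    return math.exp((y * theta - b(theta)) / a_phi + c(y))

mu = 3.5                                  # illustrative Poisson mean (assumed)
theta = math.log(mu)                      # natural parameter, theta = log(mu)
b = math.exp                              # b(theta) = exp(theta)
c = lambda y: -math.lgamma(y + 1)         # c(y, phi) = -log(y!)

# Support truncated where the remaining tail mass is negligible for this mu.
support = range(100)
pmf = [exp_dispersion_pmf(y, theta, b, 1.0, c) for y in support]

# The same probabilities from the usual Poisson formula.
direct = [math.exp(-mu) * mu**y / math.factorial(y) for y in support]

# Moments computed by summation, to compare with b'(theta) and b''(theta).
mean = sum(y * p for y, p in zip(support, pmf))
var = sum((y - mean) ** 2 * p for y, p in zip(support, pmf))
```

Under these assumptions, `pmf` and `direct` should agree to rounding error, and both `mean` and `var` should be close to `mu`, as (4.16) and (4.17) predict.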

Next, suppose that $n_iY_i$ has a $\mathrm{bin}(n_i, \pi_i)$ distribution; that is, here $y_i$ is the sample proportion (rather than number) of successes, so $E(Y_i)$ is independent of $n_i$. Let $\theta_i = \log[\pi_i/(1 - \pi_i)]$. Then, $\pi_i = \exp(\theta_i)/[1 + \exp(\theta_i)]$ and $\log(1 - \pi_i) = -\log[1 + \exp(\theta_i)]$. Extending (4.3), one can show that

$$f(y_i; \pi_i, n_i) = \binom{n_i}{n_iy_i}\pi_i^{n_iy_i}(1 - \pi_i)^{n_i - n_iy_i} = \exp\left\{\frac{y_i\theta_i - \log[1 + \exp(\theta_i)]}{1/n_i} + \log\binom{n_i}{n_iy_i}\right\}. \tag{4.18}$$

This has exponential dispersion form (4.14) with $b(\theta_i) = \log[1 + \exp(\theta_i)]$, $a(\phi) = 1/n_i$, and $c(y_i, \phi) = \log\binom{n_i}{n_iy_i}$. The natural parameter is the logit, $\theta_i = \log[\pi_i/(1 - \pi_i)]$. From (4.16) and (4.17),

$$E(Y_i) = b'(\theta_i) = \exp(\theta_i)/[1 + \exp(\theta_i)] = \pi_i,$$

$$\mathrm{var}(Y_i) = b''(\theta_i)\,a(\phi) = \exp(\theta_i)/\{[1 + \exp(\theta_i)]^2 n_i\} = \pi_i(1 - \pi_i)/n_i.$$
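The binomial case can be checked the same way, this time also approximating $b'$ and $b''$ by finite differences. This is a rough sketch under assumed values ($n = 10$, $\pi = 0.3$, step size `h`); none of the names below come from the original text.

```python
import math

n, pi = 10, 0.3                       # illustrative values (assumed)
theta = math.log(pi / (1 - pi))       # logit natural parameter
a_phi = 1.0 / n                       # a(phi) = 1/n_i

def b(t):
    """b(theta) = log[1 + exp(theta)]."""
    return math.log1p(math.exp(t))

def pmf_proportion(k):
    """Density (4.18) evaluated at the sample proportion y = k/n."""
    y = k / n
    return math.exp((y * theta - b(theta)) / a_phi + math.log(math.comb(n, k)))

probs = [pmf_proportion(k) for k in range(n + 1)]

# Moments of the sample proportion, by direct summation over k = 0..n.
mean = sum((k / n) * p for k, p in enumerate(probs))
var = sum((k / n - mean) ** 2 * p for k, p in enumerate(probs))

# Central finite differences approximating b'(theta) and b''(theta).
h = 1e-4
b1 = (b(theta + h) - b(theta - h)) / (2 * h)
b2 = (b(theta + h) - 2 * b(theta) + b(theta - h)) / h**2
```

Here `b1` should be close to `mean` (both near $\pi$), and `b2 * a_phi` should be close to `var` (both near $\pi(1 - \pi)/n$), up to finite-difference error.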

4.4.3 Systematic Component and Link Function

Let $(x_{i1}, \ldots, x_{ip})$ denote values of explanatory variables for observation $i$. The systematic component of a GLM relates parameters $\{\eta_i\}$ to these variables using a linear predictor

$$\eta_i = \sum_j \beta_j x_{ij}, \qquad i = 1, \ldots, N.$$

In matrix form,

$$\boldsymbol{\eta} = \mathbf{X}\boldsymbol{\beta},$$

where $\boldsymbol{\eta} = (\eta_1, \ldots, \eta_N)'$ is the column vector of linear predictors, $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_p)'$ is the column vector of model parameters, and $\mathbf{X}$ is the $N \times p$ matrix of values of the explanatory variables for the $N$ subjects. In ordinary linear models, $\mathbf{X}$ is called the design matrix. It need not refer to an experimental design, however, and the GLM literature calls it the model matrix.

The GLM links $\eta_i$ to $\mu_i = E(Y_i)$ by a link function $g(\cdot)$. Thus, $\mu_i$ relates to the explanatory variables by

$$\eta_i = g(\mu_i) = \sum_j \beta_j x_{ij}, \qquad i = 1, \ldots, N.$$

The link function $g$ for which $g(\mu_i) = \theta_i$ in (4.14) is the canonical link. For it, the direct relationship

$$\theta_i = \sum_j \beta_j x_{ij}$$

occurs between the natural parameter and the linear predictor.

Since $\mu_i = b'(\theta_i)$, the natural parameter is the function of the mean, $\theta_i = (b')^{-1}(\mu_i)$, where $(b')^{-1}(\cdot)$ denotes the inverse function to $b'$. Thus, the canonical link is the inverse of $b'$. In the Poisson case, for instance, $b(\theta_i) = \exp(\theta_i)$, so $b'(\theta_i) = \exp(\theta_i) = \mu_i$. Thus, $(b')^{-1}(\cdot)$ is the inverse of the exponential function, which is the log function (i.e., $\theta_i = \log\mu_i$). The canonical link is the log link.
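To make "the canonical link is the inverse of $b'$" concrete, one can invert $b'$ numerically and compare with the closed-form link. The sketch below uses bisection, which is simply a convenient choice here; the function name `invert_increasing`, the search interval, and the test values are assumptions.

```python
import math

def invert_increasing(f, target, lo=-30.0, hi=30.0, tol=1e-12):
    """Solve f(t) = target by bisection, assuming f is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# First derivatives of b for the two families in Section 4.4.2.
b_prime_poisson = math.exp                              # b'(theta) = exp(theta)
b_prime_binomial = lambda t: 1 / (1 + math.exp(-t))     # exp(theta)/[1 + exp(theta)]

mu = 4.2                                                # assumed Poisson mean
theta_poisson = invert_increasing(b_prime_poisson, mu)

pi = 0.25                                               # assumed success probability
theta_binomial = invert_increasing(b_prime_binomial, pi)
```

The numerical inverses should match `math.log(mu)` for the Poisson (log link) and `math.log(pi / (1 - pi))` for the binomial (logit link).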

4.4.4 Likelihood Equations for a GLM

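For $N$ independent observations, from (4.15) the log likelihood is

$$L(\boldsymbol{\beta}) = \sum_i L_i = \sum_i \log f(y_i; \theta_i, \phi) = \sum_i \frac{y_i\theta_i - b(\theta_i)}{a(\phi)} + \sum_i c(y_i, \phi). \tag{4.19}$$

The notation $L(\boldsymbol{\beta})$ reflects the dependence of $\boldsymbol{\theta}$ on the model parameters $\boldsymbol{\beta}$.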


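The likelihood equations are

$$\partial L(\boldsymbol{\beta})/\partial\beta_j = \sum_i \partial L_i/\partial\beta_j = 0$$

for all $j$. To differentiate the log likelihood (4.19), we use the chain rule,

$$\frac{\partial L_i}{\partial\beta_j} = \frac{\partial L_i}{\partial\theta_i}\,\frac{\partial\theta_i}{\partial\mu_i}\,\frac{\partial\mu_i}{\partial\eta_i}\,\frac{\partial\eta_i}{\partial\beta_j}. \tag{4.20}$$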

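Since $\partial L_i/\partial\theta_i = [y_i - b'(\theta_i)]/a(\phi)$, and since $\mu_i = b'(\theta_i)$ and $\mathrm{var}(Y_i) = b''(\theta_i)\,a(\phi)$ from (4.16) and (4.17),

$$\partial L_i/\partial\theta_i = (y_i - \mu_i)/a(\phi), \qquad \partial\mu_i/\partial\theta_i = b''(\theta_i) = \mathrm{var}(Y_i)/a(\phi).$$

Also, since $\eta_i = \sum_j \beta_j x_{ij}$,

$$\partial\eta_i/\partial\beta_j = x_{ij}.$$

Finally, since $\eta_i = g(\mu_i)$, $\partial\mu_i/\partial\eta_i$ depends on the link function for the model. In summary, substituting into (4.20) gives us

$$\frac{\partial L_i}{\partial\beta_j} = \frac{y_i - \mu_i}{a(\phi)}\,\frac{a(\phi)}{\mathrm{var}(Y_i)}\,\frac{\partial\mu_i}{\partial\eta_i}\,x_{ij} = \frac{(y_i - \mu_i)x_{ij}}{\mathrm{var}(Y_i)}\,\frac{\partial\mu_i}{\partial\eta_i}. \tag{4.21}$$

The likelihood equations are

$$\sum_{i=1}^{N} \frac{(y_i - \mu_i)x_{ij}}{\mathrm{var}(Y_i)}\,\frac{\partial\mu_i}{\partial\eta_i} = 0, \qquad j = 1, \ldots, p. \tag{4.22}$$

Although $\boldsymbol{\beta}$ does not appear in these equations, it is there implicitly through $\mu_i$, since $\mu_i = g^{-1}(\sum_j \beta_j x_{ij})$. Different link functions yield different sets of equations.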

Interestingly, the likelihood equations (4.22) depend on the distribution of $Y_i$ only through $\mu_i$ and $\mathrm{var}(Y_i)$. The variance itself depends on the mean through a particular functional form

$$\mathrm{var}(Y_i) = v(\mu_i)$$

for some function $v$, such as $v(\mu_i) = \mu_i$ for the Poisson, $v(\mu_i) = \mu_i(1 - \mu_i)$ for the Bernoulli, and $v(\mu_i) = \sigma^2$ (i.e., constant) for the normal. When $Y_i$ has distribution in the natural exponential family, the relationship between the mean and the variance characterizes the distribution (Jørgensen 1987). For instance, if $Y_i$ has distribution in the natural exponential family and if $v(\mu_i) = \mu_i$, then necessarily $Y_i$ has the Poisson distribution.
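Because (4.22) needs only the inverse link, its derivative, and the variance function, the left-hand side can be coded once and reused across models. The sketch below is illustrative only: the function names, the made-up data, and the choice of a Poisson response with a (noncanonical) identity link are all assumptions. It compares (4.22) with a numerical gradient of the Poisson log likelihood.

```python
import math

def score(beta, X, y, inv_link, dmu_deta, variance):
    """Left-hand side of (4.22), one component per parameter beta_j."""
    u = [0.0] * len(beta)
    for x_i, y_i in zip(X, y):
        eta = sum(b_j * x_ij for b_j, x_ij in zip(beta, x_i))
        mu = inv_link(eta)
        common = (y_i - mu) / variance(mu) * dmu_deta(eta)
        for j, x_ij in enumerate(x_i):
            u[j] += common * x_ij
    return u

def poisson_loglik(beta, X, y, inv_link):
    """Poisson log likelihood: sum of y*log(mu) - mu - log(y!)."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        mu = inv_link(sum(b_j * x_ij for b_j, x_ij in zip(beta, x_i)))
        total += y_i * math.log(mu) - mu - math.lgamma(y_i + 1)
    return total

# Made-up data: an intercept column plus one predictor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [2, 3, 6, 7]
beta = [1.5, 1.2]                     # arbitrary point with all mu_i > 0

identity = lambda eta: eta            # identity link: mu = eta, dmu/deta = 1
u = score(beta, X, y, identity, lambda eta: 1.0, lambda mu: mu)

# Central-difference gradient of the log likelihood, for comparison.
h = 1e-6
numeric = []
for j in range(len(beta)):
    up, down = list(beta), list(beta)
    up[j] += h
    down[j] -= h
    numeric.append(
        (poisson_loglik(up, X, y, identity) - poisson_loglik(down, X, y, identity)) / (2 * h)
    )
```

The two vectors `u` and `numeric` should agree up to finite-difference error. Swapping in a different `inv_link` and `dmu_deta` pair changes the equations, which is the sense in which different links yield different sets of equations.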

4.4.5 Likelihood Equations for Binomial GLMs

Using notation from Section 4.4.2, suppose that $n_iY_i$ has a $\mathrm{bin}(n_i, \pi_i)$ distribution. Then $y_i$ is a sample proportion of successes for $n_i$ trials. The binomial GLM (4.8) for a single predictor extends with several predictors to

$$\pi_i = \Phi\Bigl(\sum_j \beta_j x_{ij}\Bigr), \tag{4.23}$$

where $\Phi$ is the standard cdf of some class of continuous distributions. Since $\pi_i = \mu_i = \Phi(\eta_i)$ with $\eta_i = \sum_j \beta_j x_{ij}$,

$$\partial\mu_i/\partial\eta_i = \phi(\eta_i) = \phi\Bigl(\sum_j \beta_j x_{ij}\Bigr),$$

where $\phi(u) = \partial\Phi(u)/\partial u$ (i.e., the probability density function corresponding to the cdf $\Phi$; in this section $\phi$ denotes that density, not the dispersion parameter of (4.14)). Since $\mathrm{var}(Y_i) = \pi_i(1 - \pi_i)/n_i$, the likelihood equations (4.22) simplify to

$$\sum_i \frac{n_i(y_i - \pi_i)x_{ij}}{\pi_i(1 - \pi_i)}\,\phi\Bigl(\sum_j \beta_j x_{ij}\Bigr) = 0, \tag{4.24}$$

where $\pi_i = \Phi(\sum_j \beta_j x_{ij})$. These depend on the link function $\Phi^{-1}$ through the derivative of its inverse.

For the logit link, $\eta_i = \log[\pi_i/(1 - \pi_i)]$, so $\partial\eta_i/\partial\pi_i = 1/[\pi_i(1 - \pi_i)]$ and $\partial\mu_i/\partial\eta_i = \partial\pi_i/\partial\eta_i = \pi_i(1 - \pi_i)$. Then the likelihood equations (4.22) and (4.24) simplify to

$$\sum_i n_i(y_i - \pi_i)x_{ij} = 0, \tag{4.25}$$

where $\pi_i$ satisfies (4.23) with $\Phi$ the standard logistic cdf.
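Equations (4.25) have no closed-form solution in general, but they are easy to solve iteratively. The following is a small sketch under stated assumptions: the grouped data are made up, and the Newton-type update (using the matrix of minus second derivatives of the log likelihood) is one convenient solver chosen here, not a method prescribed at this point in the text.

```python
import math

# Made-up grouped binomial data with a single predictor x.
x = [0.0, 1.0, 2.0, 3.0]
n = [10, 12, 8, 10]
successes = [2, 4, 5, 8]
y = [s / m for s, m in zip(successes, n)]     # sample proportions y_i

def probs(beta):
    """pi_i from (4.23) with Phi the standard logistic cdf."""
    return [1 / (1 + math.exp(-(beta[0] + beta[1] * xi))) for xi in x]

def score(beta):
    """Left-hand side of (4.25) for the intercept and the slope."""
    pi = probs(beta)
    u0 = sum(ni * (yi - p) for ni, yi, p in zip(n, y, pi))
    u1 = sum(ni * (yi - p) * xi for ni, yi, p, xi in zip(n, y, pi, x))
    return u0, u1

beta = [0.0, 0.0]
for _ in range(25):
    pi = probs(beta)
    w = [ni * p * (1 - p) for ni, p in zip(n, pi)]
    # Entries of the symmetric 2x2 matrix [[a, b], [b, d]] of minus second derivatives.
    a = sum(w)
    b = sum(wi * xi for wi, xi in zip(w, x))
    d = sum(wi * xi * xi for wi, xi in zip(w, x))
    u0, u1 = score(beta)
    det = a * d - b * b
    # Newton step: beta <- beta + inverse([[a, b], [b, d]]) @ (u0, u1).
    beta[0] += (d * u0 - b * u1) / det
    beta[1] += (a * u1 - b * u0) / det

fitted_total = sum(ni * p for ni, p in zip(n, probs(beta)))
```

At convergence both components of `score(beta)` should be essentially zero. With an intercept in the model, the first equation forces the fitted total number of successes, `fitted_total`, to equal the observed total.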

4.4.6 Asymptotic Covariance Matrix of Model Parameter Estimators

The likelihood function for the GLM also determines the asymptotic covariance matrix of the ML estimator $\hat{\boldsymbol{\beta}}$. This matrix is the inverse of the information matrix $\mathcal{I}$, which has elements $E[-\partial^2 L(\boldsymbol{\beta})/\partial\beta_h\,\partial\beta_j]$. To find this, for the contribution $L_i$ to the log likelihood we use the helpful result

$$E\left(\frac{\partial^2 L_i}{\partial\beta_h\,\partial\beta_j}\right) = -E\left[\left(\frac{\partial L_i}{\partial\beta_h}\right)\left(\frac{\partial L_i}{\partial\beta_j}\right)\right],$$


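which holds for exponential families (Cox and Hinkley 1974, Sec. 4.8). Thus, from (4.21),

$$E\left(\frac{\partial^2 L_i}{\partial\beta_h\,\partial\beta_j}\right) = -E\left[\frac{(Y_i - \mu_i)x_{ih}}{\mathrm{var}(Y_i)}\,\frac{\partial\mu_i}{\partial\eta_i}\cdot\frac{(Y_i - \mu_i)x_{ij}}{\mathrm{var}(Y_i)}\,\frac{\partial\mu_i}{\partial\eta_i}\right] = -\frac{x_{ih}x_{ij}}{\mathrm{var}(Y_i)}\left(\frac{\partial\mu_i}{\partial\eta_i}\right)^2.$$

Since $L(\boldsymbol{\beta}) = \sum_i L_i$,

$$E\left(-\frac{\partial^2 L(\boldsymbol{\beta})}{\partial\beta_h\,\partial\beta_j}\right) = \sum_{i=1}^{N} \frac{x_{ih}x_{ij}}{\mathrm{var}(Y_i)}\left(\frac{\partial\mu_i}{\partial\eta_i}\right)^2.$$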

Generalizing from this typical element to the entire matrix, the information matrix has the form

$$\mathcal{I} = \mathbf{X}'\mathbf{W}\mathbf{X}, \tag{4.26}$$

where $\mathbf{W}$ is the diagonal matrix with main-diagonal elements

$$w_i = (\partial\mu_i/\partial\eta_i)^2/\mathrm{var}(Y_i). \tag{4.27}$$

The asymptotic covariance matrix of $\hat{\boldsymbol{\beta}}$ is estimated by

$$\widehat{\mathrm{cov}}(\hat{\boldsymbol{\beta}}) = \hat{\mathcal{I}}^{-1} = (\mathbf{X}'\hat{\mathbf{W}}\mathbf{X})^{-1}, \tag{4.28}$$

where $\hat{\mathbf{W}}$ is $\mathbf{W}$ evaluated at $\hat{\boldsymbol{\beta}}$. From (4.27), the form of $\mathbf{W}$ also depends on the link function. We'll see an example for Poisson GLMs next and for binomial GLMs in Section 5.5.
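The dependence of $\mathbf{W}$ on the link can be seen by computing (4.27) for binomial proportions under two links. This is only a sketch: the data, the parameter value `beta` (a hypothetical plug-in standing in for an estimate), and all function names are assumptions, and the same `beta` is used for both links purely for illustration.

```python
import math

def weights(eta, n, inv_link, dmu_deta):
    """Diagonal of W from (4.27), with var(Y_i) = pi_i(1 - pi_i)/n_i."""
    out = []
    for eta_i, n_i in zip(eta, n):
        pi = inv_link(eta_i)
        var = pi * (1 - pi) / n_i
        out.append(dmu_deta(eta_i) ** 2 / var)
    return out

def information(X, w):
    """X'WX from (4.26), for a list-of-rows X and diagonal weights w."""
    p = len(X[0])
    return [[sum(w_i * r[h] * r[j] for w_i, r in zip(w, X)) for j in range(p)]
            for h in range(p)]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

logistic_cdf = lambda e: 1 / (1 + math.exp(-e))
logistic_pdf = lambda e: logistic_cdf(e) * (1 - logistic_cdf(e))
normal_cdf = lambda e: 0.5 * (1 + math.erf(e / math.sqrt(2)))
normal_pdf = lambda e: math.exp(-e * e / 2) / math.sqrt(2 * math.pi)

X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
n = [10, 12, 8, 10]
beta = [-1.4, 0.9]                    # hypothetical value at which W is evaluated
eta = [sum(b_j * v for b_j, v in zip(beta, row)) for row in X]

w_logit = weights(eta, n, logistic_cdf, logistic_pdf)    # logit link
w_probit = weights(eta, n, normal_cdf, normal_pdf)       # probit link
info_logit = information(X, w_logit)
cov_logit = inverse_2x2(info_logit)                      # as in (4.28)
```

For the logit link the weights reduce to $n_i\pi_i(1 - \pi_i)$, while the probit weights take a different form, so the two links give different information matrices even at the same linear predictor values.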

4.4.7 Likelihood Equations and Covariance Matrix for Poisson Loglinear Model

The general Poisson loglinear model (4.4) has the matrix form

$$\log\boldsymbol{\mu} = \mathbf{X}\boldsymbol{\beta}.$$

For the log link, $\eta_i = \log\mu_i$, so $\mu_i = \exp(\eta_i)$ and $\partial\mu_i/\partial\eta_i = \exp(\eta_i) = \mu_i$. Since $\mathrm{var}(Y_i) = \mu_i$, the likelihood equations (4.22) simplify to

$$\sum_i (y_i - \mu_i)x_{ij} = 0. \tag{4.29}$$

These equate the sufficient statistics $\sum_i y_i x_{ij}$ for $\boldsymbol{\beta}$ to their expected values. Also, since

$$w_i = (\partial\mu_i/\partial\eta_i)^2/\mathrm{var}(Y_i) = \mu_i,$$

the estimated covariance matrix (4.28) of $\hat{\boldsymbol{\beta}}$ is $(\mathbf{X}'\hat{\mathbf{W}}\mathbf{X})^{-1}$, where $\hat{\mathbf{W}}$ is the diagonal matrix with elements of $\hat{\boldsymbol{\mu}}$ on the main diagonal.
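The two results of this section can be combined in a short fitting routine. The sketch below is a rough illustration with made-up counts: it solves (4.29) by a Newton-type iteration whose update matrix is $\mathbf{X}'\mathbf{W}\mathbf{X}$ with $w_i = \mu_i$, then inverts that matrix at the solution to estimate the covariance as in (4.28). The starting value and the fixed iteration count are assumptions.

```python
import math

# Made-up count data with a single predictor x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 3, 4, 9, 14]

def fitted(beta):
    """mu_i = exp(beta_0 + beta_1 * x_i) under the log link."""
    return [math.exp(beta[0] + beta[1] * xi) for xi in x]

def info_entries(mu):
    """Entries a, b, d of X'WX = [[a, b], [b, d]] with w_i = mu_i."""
    a = sum(mu)
    b = sum(m * xi for m, xi in zip(mu, x))
    d = sum(m * xi * xi for m, xi in zip(mu, x))
    return a, b, d

beta = [math.log(sum(y) / len(y)), 0.0]       # start at the intercept-only fit
for _ in range(50):
    mu = fitted(beta)
    u0 = sum(yi - m for yi, m in zip(y, mu))                  # (4.29), intercept
    u1 = sum((yi - m) * xi for yi, m, xi in zip(y, mu, x))    # (4.29), slope
    a, b, d = info_entries(mu)
    det = a * d - b * b
    beta[0] += (d * u0 - b * u1) / det
    beta[1] += (a * u1 - b * u0) / det

mu_hat = fitted(beta)
a, b, d = info_entries(mu_hat)
det = a * d - b * b
cov = [[d / det, -b / det], [-b / det, a / det]]              # (X' W-hat X)^{-1}
se = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]

# Sufficient statistic for the slope and its fitted expected value.
observed_stat = sum(yi * xi for yi, xi in zip(y, x))
expected_stat = sum(m * xi for m, xi in zip(mu_hat, x))
```

At the solution, `observed_stat` and `expected_stat` should coincide, which is the statement that the likelihood equations equate the sufficient statistics to their expected values; `se` holds the corresponding estimated standard errors.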

From the document Categorical Data Analysis (pages 147-154)