5.1 INTERPRETING PARAMETERS IN LOGISTIC REGRESSION
FIGURE 5.1 Linear approximation to logistic regression curve.
point, substitute $-\alpha/\beta$ for $x$ in (5.1), or substitute $\pi(x) = \frac{1}{2}$ in (5.2) and solve for $x$. This $x$ value is sometimes called the median effective level and denoted $EL_{50}$. In toxicology studies it is called $LD_{50}$ (LD = lethal dose), the dose with a 50% chance of a lethal result.
From this linear approximation, near $x$ where $\pi(x) = \frac{1}{2}$, a change in $x$ of $1/\beta$ corresponds to a change in $\pi(x)$ of roughly $(1/\beta)(\beta/4) = \frac{1}{4}$; that is, $1/\beta$ approximates the distance between $x$ values where $\pi(x) = 0.25$ or $0.75$ (in reality, 0.27 and 0.73) and where $\pi(x) = 0.50$. The linear approximation works better for smaller changes in $x$, however.
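The tangent-line approximation can be checked numerically. The sketch below assumes the usual logistic curve $\pi(x) = \exp(\alpha + \beta x)/[1 + \exp(\alpha + \beta x)]$ for formula (5.1); the parameter values are hypothetical, chosen only for illustration.

```python
import math

def expit(t):
    """Logistic function exp(t) / (1 + exp(t))."""
    return 1.0 / (1.0 + math.exp(-t))

def pi(x, alpha, beta):
    """Logistic regression curve, assumed form of formula (5.1)."""
    return expit(alpha + beta * x)

# Hypothetical parameter values, for illustration only.
alpha, beta = -3.0, 0.5

x_half = -alpha / beta            # x value at which pi(x) = 1/2
slope_at_half = beta * 0.5 * 0.5  # beta * pi * (1 - pi) at pi = 1/2, i.e. beta / 4

# Moving 1/beta away from x_half: the tangent line predicts 0.25 and 0.75,
# whereas the curve itself gives about 0.27 and 0.73.
lower = pi(x_half - 1.0 / beta, alpha, beta)
upper = pi(x_half + 1.0 / beta, alpha, beta)

print(f"x at pi = 1/2: {x_half:.2f}")
print(f"tangent slope there: {slope_at_half:.3f}")
print(f"pi at x_half -/+ 1/beta: {lower:.3f}, {upper:.3f}")
```

Shrinking the step below $1/\beta$ brings the curve and the tangent line closer together, which is the sense in which the approximation works better for smaller changes in $x$.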
An alternative way to interpret the effect reports the values of $\pi(x)$ at certain $x$ values, such as their quartiles. This entails substituting those quartiles for $x$ into formula (5.1) for $\pi(x)$. The change in $\pi(x)$ over the middle half of $x$ values, from the lower quartile to the upper quartile of $x$, then describes the effect. It can be compared to the corresponding change over the middle half of values of other predictors.
The intercept parameter $\alpha$ is not usually of particular interest. However, by centering the predictor about 0 [i.e., replacing $x$ by $(x - \bar{x})$], $\alpha$ becomes the logit at that mean, and thus $e^{\alpha}/(1 + e^{\alpha}) = \pi(\bar{x})$. As in ordinary regression, centering is also helpful in complex models containing quadratic or interaction terms to reduce correlations among model parameter estimates.
5.1.2 Looking at the Data
In practice, these interpretations use formula (5.1) with ML estimates substituted for parameters. Before fitting the model and making such interpretations, look at the data to check that the logistic regression model is appropriate. Since $Y$ takes only values 0 and 1, it is difficult to check this by plotting $Y$ against $x$.
It can be helpful to plot sample proportions or logits against $x$. Let $n_i$ denote the number of observations at setting $i$ of $x$. Of them, let $y_i$ denote the number of "1" outcomes, with $p_i = y_i/n_i$. Sample logit $i$ is $\log[p_i/(1 - p_i)] = \log[y_i/(n_i - y_i)]$. This is not finite when $y_i = 0$ or $n_i$. An ad hoc adjustment adds a positive constant to the number of outcomes of the two types. The adjustment
$$\log \frac{y_i + \frac{1}{2}}{n_i - y_i + \frac{1}{2}}$$
is the least-biased estimator of this form of the true logit (Note 5.2). The plot of sample logits should be roughly linear.
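As a small illustration of the sample logit and its adjusted version, the following sketch computes both for grouped counts. The counts are hypothetical and the function names are introduced here only for the example.

```python
import math

def sample_logit(y, n):
    """Raw sample logit log[y / (n - y)]; infinite when y = 0 or y = n."""
    if y == 0:
        return -math.inf
    if y == n:
        return math.inf
    return math.log(y / (n - y))

def adjusted_logit(y, n):
    """Adjusted sample logit log[(y + 1/2) / (n - y + 1/2)]; always finite."""
    return math.log((y + 0.5) / (n - y + 0.5))

# Hypothetical grouped counts (y_i "1" outcomes out of n_i), for illustration only.
counts = [(0, 5), (3, 10), (6, 12), (9, 10), (7, 7)]

for y, n in counts:
    raw = sample_logit(y, n)
    adj = adjusted_logit(y, n)
    print(f"y={y:2d} n={n:2d}  raw={raw:8.3f}  adjusted={adj:7.3f}")
```

The first and last settings show why the adjustment is useful: the raw logit is infinite there, while the adjusted logit can still be plotted against $x$.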
When $X$ is continuous and all $n_i = 1$, or when it is essentially continuous and all $n_i$ are small, this is unsatisfactory. One could group the data with nearby $x$ values into categories before calculating sample proportions and sample logits. A better approach that does not require choosing arbitrary categories uses a smoothing mechanism to reveal trends. One such smoothing approach fits a generalized additive model (Section 4.8), which replaces the linear predictor of a GLM by a smooth function. Inspect a plot of the fit to see if severe discrepancies occur from the S-shaped trend predicted by logistic regression.
5.1.3 Horseshoe Crabs Revisited
To illustrate logistic regression, we reanalyze the horseshoe crab data introduced in Section 4.3.2. The binary response is whether a female crab has any male crabs residing nearby (satellites): $Y = 1$ if she has at least one satellite, and $Y = 0$ if she has none. We first use as a predictor the female crab's width.
Figure 4.7 plotted the data and showed the smoothed prediction of the mean provided by a generalized additive model (GAM), assuming a binomial response and logit link. The logistic regression model appears to be adequate. This is also suggested by the grouping of the data used to investigate the adequacy of Poisson regression models in Section 4.3.2 (Table 4.4). In each of the eight width categories, we computed the sample proportion of crabs having satellites and the mean width for the crabs in that category.
Figure 5.2 shows eight dots representing the sample proportions of female crabs having satellites plotted against the mean widths for the eight categories. The eight plotted sample proportions and the GAM smoothing curve both show a roughly increasing trend, so we proceed with fitting the logistic regression model with linear width predictor.
FIGURE 5.2 Observed and fitted proportions of satellites by width of female crab.
We defer to Section 5.5 details about ML fitting. Software (e.g., for SAS, see Table A.8) reports output such as Table 5.1 exhibits. For the ungrouped data from Table 4.3, let $\pi(x)$ denote the probability that a female horseshoe crab of width $x$ has a satellite. The ML fit is
$$\hat{\pi}(x) = \frac{\exp(-12.351 + 0.497x)}{1 + \exp(-12.351 + 0.497x)}.$$
TABLE 5.1 Computer Output for Logistic Regression Model with Horseshoe Crab Data
Criteria For Assessing Goodness Of Fit
Criterion             DF     Value
Deviance              171    194.4527
Pearson Chi-Square    171    165.1434
Log Likelihood               -97.2263

                         Std       Likelihood-Ratio       Wald
Parameter    Estimate    Error     95% Conf Limits        Chi-Sq    P>ChiSq
Intercept    -12.3508    2.6287    -17.8097   -7.4573     22.07     <.0001
width          0.4972    0.1017      0.3084    0.7090     23.89     <.0001
Substituting $x = 26.3$ cm, the mean width level in this sample, $\hat{\pi}(x) = 0.674$. The estimated probability equals $\frac{1}{2}$ when $x = -\hat{\alpha}/\hat{\beta} = 12.351/0.497 = 24.8$. Figure 5.2 plots $\hat{\pi}(x)$ against width.
The estimated odds of a satellite multiply by $\exp(\hat{\beta}) = \exp(0.497) = 1.64$ for each 1-cm increase in width; that is, there is a 64% increase. To convey the effect less technically, we could report the incremental rate of change in the probability of a satellite. At the mean width, $\hat{\pi}(x) = 0.674$, and $\hat{\pi}(x)$ increases by about $\hat{\beta}[\hat{\pi}(x)(1 - \hat{\pi}(x))] = 0.497(0.674)(0.326) = 0.11$ for a 1-cm increase in width. Or, we could report $\hat{\pi}(x)$ at the quartiles of $x$. The lower quartile, median, and upper quartile for width are 24.9, 26.1, and 27.7; $\hat{\pi}(x)$ at those values equals 0.51, 0.65, and 0.81, increasing by 0.30 over the $x$ values for the middle half of the sample.
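The width summaries above can be reproduced directly from the estimates in Table 5.1. This is a minimal sketch that takes the reported estimates as given and does not refit the model; the variable and function names are introduced here for illustration.

```python
import math

def expit(t):
    """Logistic function exp(t) / (1 + exp(t))."""
    return 1.0 / (1.0 + math.exp(-t))

# ML estimates as reported in Table 5.1 (intercept and width coefficient).
ALPHA_HAT = -12.3508
BETA_HAT = 0.4972

def pi_hat(width_cm):
    """Estimated probability of at least one satellite at a given width."""
    return expit(ALPHA_HAT + BETA_HAT * width_cm)

mean_width = 26.3                                # sample mean width (cm), from the text
p_mean = pi_hat(mean_width)

odds_multiplier = math.exp(BETA_HAT)             # effect on odds per 1-cm increase
rate_at_mean = BETA_HAT * p_mean * (1 - p_mean)  # slope of pi_hat at the mean width
median_effective_level = -ALPHA_HAT / BETA_HAT   # width at which pi_hat = 1/2

quartiles = [24.9, 26.1, 27.7]                   # width quartiles, from the text
p_quartiles = [pi_hat(w) for w in quartiles]

print(f"pi_hat at mean width: {p_mean:.3f}")
print(f"odds multiplier per cm: {odds_multiplier:.2f}")
print(f"rate of change at mean: {rate_at_mean:.2f}")
print(f"width where pi_hat = 1/2: {median_effective_level:.1f}")
print("pi_hat at quartiles:", [round(p, 2) for p in p_quartiles])
```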
The latter summary is useful for comparing the effects of predictors having different units. For instance, with crab weight as the predictor, $\text{logit}[\hat{\pi}(x)] = -3.695 + 1.815x$. A 1-kg increase in weight is not comparable to a 1-cm increase in width, so $\hat{\beta} = 0.497$ for $x$ = width is not comparable to $\hat{\beta} = 1.815$ for $x$ = weight. The quartiles for weight are 2.00, 2.35, and 2.85; $\hat{\pi}(x)$ at those values are 0.48, 0.64, and 0.81, increasing by 0.33 over the middle half of the sampled weights. The effect is similar to that of width.
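To put the two predictors on a common footing, one can compute the change in estimated probability between the lower and upper quartiles for each fit. The sketch below uses the rounded estimates and quartiles quoted in the text; the helper name `iqr_effect` is hypothetical.

```python
import math

def expit(t):
    """Logistic function exp(t) / (1 + exp(t))."""
    return 1.0 / (1.0 + math.exp(-t))

def iqr_effect(alpha, beta, lower_q, upper_q):
    """Change in fitted probability from the lower to the upper quartile of x."""
    return expit(alpha + beta * upper_q) - expit(alpha + beta * lower_q)

# Width fit (cm) and weight fit (kg), with quartiles as quoted in the text.
width_effect = iqr_effect(-12.351, 0.497, 24.9, 27.7)
weight_effect = iqr_effect(-3.695, 1.815, 2.00, 2.85)

print(f"width:  change over middle half = {width_effect:.2f}")
print(f"weight: change over middle half = {weight_effect:.2f}")
```

The two slope estimates differ by a factor of more than three, yet the quartile-to-quartile changes in probability are close, which is the comparison the text recommends.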
5.1.4 Logistic Regression with Retrospective Studies
Another property of logistic regression relates to situations in which the explanatory variable $X$ rather than the response variable $Y$ is random. This occurs with retrospective sampling designs, such as case-control biomedical studies (Section 2.1.6). For samples of subjects having $Y = 1$ (cases) and having $Y = 0$ (controls), the value of $X$ is observed. Evidence exists of an association if the distribution of $X$ values differs between cases and controls.
In retrospective studies, one can estimate odds ratios (Section 2.2.4). Effects in the logistic regression model refer to odds ratios. Thus, one can fit such models and estimate effects in case-control studies.
Here is a justification for this. Let $Z$ indicate whether a subject is sampled (1 = yes, 0 = no). Let $\rho_1 = P(Z = 1 \mid y = 1)$ denote the probability of sampling a case, and let $\rho_0 = P(Z = 1 \mid y = 0)$ denote the probability of sampling a control. Even though the conditional distribution of $Y$ given $X = x$ is not sampled, we need a model for $P(Y = 1 \mid z = 1, x)$, assuming that $P(Y = 1 \mid x)$ follows the logistic model. By Bayes' theorem,
$$P(Y = 1 \mid z = 1, x) = \frac{P(Z = 1 \mid y = 1, x)\, P(Y = 1 \mid x)}{\sum_{j=0}^{1} P(Z = 1 \mid y = j, x)\, P(Y = j \mid x)}. \qquad (5.3)$$
Now, suppose that $P(Z = 1 \mid y, x) = P(Z = 1 \mid y)$ for $y = 0$ and 1; that is, for each $y$, the sampling probabilities do not depend on $x$. For instance, often $x$ refers to exposure of some type, such as whether someone has been a smoker. Then, for cases and for controls, the probability of being sampled is the same for smokers and nonsmokers. Under this assumption, substituting $\rho_1$ and $\rho_0$ in (5.3) and dividing numerator and denominator by $P(Y = 0 \mid x)$, (5.3) simplifies to
$$P(Y = 1 \mid z = 1, x) = \frac{\rho_1 \exp(\alpha + \beta x)}{\rho_0 + \rho_1 \exp(\alpha + \beta x)}.$$
Then, dividing numerator and denominator by $\rho_0$ and using $\rho_1/\rho_0 = \exp[\log(\rho_1/\rho_0)]$ yields
$$\text{logit}[P(Y = 1 \mid z = 1, x)] = \alpha^* + \beta x$$
with $\alpha^* = \alpha + \log(\rho_1/\rho_0)$.
Thus, the logistic regression model holds with the same effect parameter $\beta$ as in the model for $P(Y = 1 \mid x)$. If the sampling rate for cases is 10 times that for controls, the intercept estimated is $\log(10) = 2.3$ larger than the one estimated with a prospective study. For related comments, see Anderson (1972), Breslow and Day (1980, p. 203), Breslow and Powers (1978), Carroll et al. (1995), Farewell (1979), Mantel (1973), Prentice (1976a), and Prentice and Pyke (1979).
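The intercept shift can be verified numerically by applying Bayes' theorem directly. In this sketch the model parameters and sampling probabilities are hypothetical; the sampling probabilities are chosen so that cases are sampled at 10 times the rate of controls.

```python
import math

def expit(t):
    """Logistic function exp(t) / (1 + exp(t))."""
    return 1.0 / (1.0 + math.exp(-t))

def logit(p):
    """Log odds log[p / (1 - p)]."""
    return math.log(p / (1.0 - p))

def p_case_given_sampled(x, alpha, beta, rho1, rho0):
    """P(Y=1 | Z=1, x) by Bayes' theorem, with sampling probabilities
    rho1 (cases) and rho0 (controls) that do not depend on x."""
    p1 = expit(alpha + beta * x)  # P(Y=1 | x) under the logistic model
    return rho1 * p1 / (rho1 * p1 + rho0 * (1.0 - p1))

# Hypothetical values, for illustration only.
alpha, beta = -4.0, 0.8
rho1, rho0 = 0.5, 0.05  # cases sampled at 10 times the rate of controls

shifts = []
for x in (0.0, 2.0, 5.0):
    retrospective = logit(p_case_given_sampled(x, alpha, beta, rho1, rho0))
    prospective = alpha + beta * x
    shifts.append(retrospective - prospective)
    print(f"x={x:.1f}  logit shift = {shifts[-1]:.4f}")

print(f"log(rho1/rho0) = {math.log(rho1 / rho0):.4f}")
```

The shift is the same at every $x$, so only the intercept changes and the slope $\beta$ is unaffected.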
With case-control studies, one cannot estimate $\beta$ in other binary-response models. Unlike the odds ratio, the effect for the conditional distribution of $X$ given $Y$ does not then equal that for $Y$ given $X$. This is an important advantage of the logit link and is a major reason why logit models have surpassed other models in popularity in biomedical studies.
Many case-control studies employ matching. Each case is matched with one or more control subjects. The controls are like the case on key characteristics such as age. The model and subsequent analysis should take the matching into account. In Section 10.2.5 we discuss logistic regression for matched case-control studies.
Regardless of the sampling mechanism, logistic regression may or may not describe a relationship well. In one special case, it necessarily holds. Given that $Y = i$, suppose that $X$ has a $N(\mu_i, \sigma^2)$ distribution, $i = 0, 1$. Then, by Bayes' theorem, $P(Y = 1 \mid X = x)$ equals (5.1) with $\beta = (\mu_1 - \mu_0)/\sigma^2$ (Cornfield 1962). When a population is a mixture of two types of subjects, one type with $Y = 1$ that is approximately normally distributed on $X$ and the other type with $Y = 0$ that is approximately normal on $X$ with similar variance, the logistic regression function (5.1) approximates well the curve for $\pi(x)$. If the distributions are normal but with different variances, the model also applies, but with a quadratic term (Anderson 1975). In that case, the relationship is nonmonotone, with $\pi(x)$ increasing and then decreasing, or the reverse (Problem 5.33).
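The normal-mixture result can be checked by computing the posterior logit from the two normal densities. The means, standard deviations, and prior probability below are hypothetical. With equal variances the logit rises by the same amount $(\mu_1 - \mu_0)/\sigma^2$ for every unit step in $x$; with unequal variances the step changes with $x$, reflecting the quadratic term.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a N(mu, sigma^2) distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posterior_logit(x, mu0, mu1, sigma0, sigma1, prior1):
    """logit of P(Y=1 | X=x) by Bayes' theorem for two normal populations."""
    num = prior1 * normal_pdf(x, mu1, sigma1)
    den = (1.0 - prior1) * normal_pdf(x, mu0, sigma0)
    return math.log(num / den)

# Hypothetical population values, for illustration only.
mu0, mu1, sigma, prior1 = 24.0, 27.0, 2.0, 0.6
beta = (mu1 - mu0) / sigma ** 2

# Equal variances: the logit changes by beta for each unit increase in x.
steps_equal = [
    posterior_logit(x + 1, mu0, mu1, sigma, sigma, prior1)
    - posterior_logit(x, mu0, mu1, sigma, sigma, prior1)
    for x in (22.0, 25.0, 28.0)
]

# Unequal variances (sigma0 = 2, sigma1 = 3): the unit-step change depends on x.
steps_unequal = [
    posterior_logit(x + 1, mu0, mu1, 2.0, 3.0, prior1)
    - posterior_logit(x, mu0, mu1, 2.0, 3.0, prior1)
    for x in (22.0, 25.0, 28.0)
]

print(f"beta = {beta:.3f}")
print("equal variances:  ", [round(s, 3) for s in steps_equal])
print("unequal variances:", [round(s, 3) for s in steps_unequal])
```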