CONFIDENCE INTERVALS FOR ASSOCIATION PARAMETERS The accuracy of estimators of association parameters is characterized by

CHAPTER 3

Inference for Contingency Tables

In this chapter we introduce inferential methods for contingency tables.

Many of these methods also play a vital role in analyses of later chapters for which categorical data need not have contingency table form. The methods assume Poisson, multinomial, or independent binomial sampling.

In Section 3.1 we present confidence intervals for measures of association for 2×2 tables, such as the odds ratio. Section 3.2 covers chi-squared tests of the hypothesis of independence between two categorical variables. Like any significance test, these have limited usefulness. In Section 3.3 we show how to follow up the test using residuals or the partitioning property of chi-squared to extract components that describe the evidence about the association. In Section 3.4 we present more powerful inference applicable with ordered categories. The methods of Sections 3.1 through 3.4 assume large samples. In Sections 3.5 and 3.6 we introduce small-sample methods.

3.1 CONFIDENCE INTERVALS FOR ASSOCIATION PARAMETERS

... showed that the amended estimators

\[
\tilde{\theta} = \frac{(n_{11} + 0.5)(n_{22} + 0.5)}{(n_{12} + 0.5)(n_{21} + 0.5)}
\]

and $\log \tilde{\theta}$ behave well (Problem 14.4).

The estimators $\hat{\theta}$ and $\tilde{\theta}$ have the same asymptotic normal distribution around $\theta$. Unless $n$ is quite large, however, their distributions are highly skewed. When $\theta = 1$, for instance, $\hat{\theta}$ cannot be much smaller than $\theta$ (since $\hat{\theta} \ge 0$), but it could be much larger with nonnegligible probability. The log transform, having an additive rather than multiplicative structure, converges more rapidly to normality. An estimated standard error for $\log \hat{\theta}$ is

\[
\hat{\sigma}(\log \hat{\theta}) = \left( \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}} \right)^{1/2}. \tag{3.1}
\]

We derive this formula in Section 3.1.7.

By the large-sample normality of $\log \hat{\theta}$,

\[
\log \hat{\theta} \pm z_{\alpha/2}\, \hat{\sigma}(\log \hat{\theta}) \tag{3.2}
\]

is a Wald confidence interval for $\log \theta$. Exponentiating (taking antilogs of) its endpoints provides a confidence interval for $\theta$. Woolf (1955) proposed this interval. It works quite well, usually being a bit conservative (i.e., actual coverage probability higher than the nominal level).

When $\hat{\theta} = 0$ or $\infty$, Woolf's interval does not exist. When $\hat{\theta} = 0$, one should take 0 as the lower limit, and when $\hat{\theta} = \infty$, one should take $\infty$ as the upper limit. The other bound can use the Woolf formula following some adjustment, such as Gart's (1966), which replaces $\{n_{ij}\}$ by $\{n_{ij} + 0.5\}$ in the estimator and standard error. A less ad hoc approach forms the interval by inverting score tests (Cornfield 1956) or likelihood-ratio tests for $\theta$, as we discuss in Section 3.1.8.
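The rule for zero cells described above can be sketched in Python as follows. This is a minimal illustration rather than a library routine: the function name `woolf_interval_with_zero_fix`, its argument order, and the choice to apply Gart's adjustment only when the sample odds ratio is 0 or infinity are assumptions made for this sketch. Tables in which the sample odds ratio is undefined (0/0) are rejected.

```python
import math

def woolf_interval_with_zero_fix(n11, n12, n21, n22, z=1.96):
    """Woolf interval for the odds ratio, with the 0/infinity rule from the text.

    Hypothetical helper: when the sample odds ratio is 0 or infinity, the
    degenerate limit is set to 0 or infinity and the other limit comes from
    the Woolf formula applied to the counts n_ij + 0.5 (Gart's adjustment).
    """
    def woolf(a, b, c, d):
        log_theta = math.log(a * d / (b * c))
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        return math.exp(log_theta - z * se), math.exp(log_theta + z * se)

    numerator, denominator = n11 * n22, n12 * n21
    if numerator == 0 and denominator == 0:
        raise ValueError("sample odds ratio is undefined for this table")
    if numerator > 0 and denominator > 0:
        return woolf(n11, n12, n21, n22)          # ordinary Woolf interval

    lower, upper = woolf(n11 + 0.5, n12 + 0.5, n21 + 0.5, n22 + 0.5)
    if numerator == 0:                            # sample odds ratio is 0
        return 0.0, upper
    return lower, math.inf                        # sample odds ratio is infinity
```

For a table with all counts positive the function returns the ordinary Woolf interval, so the adjustment affects only the degenerate cases.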

3.1.2 Aspirin and Myocardial Infarction Example

We illustrate inference for the odds ratio with Table 3.1, based on a Swedish study of the association between aspirin use and myocardial infarction similar to that described in Section 2.2.5. The study randomly assigned 1360 patients who had already suffered a stroke to an aspirin treatment (one low-dose tablet a day) or to a placebo treatment. Table 3.1 reports the number of deaths due to myocardial infarction during a follow-up period of about 3 years.

The sample odds ratio $\hat{\theta} = 1.56$ is close to $\tilde{\theta} = 1.55$, since no cell count is especially small. The standard error (3.1) of $\log \hat{\theta} = 0.445$ is $\hat{\sigma}(\log \hat{\theta}) = 0.307$.


TABLE 3.1 Swedish Study on Aspirin Use and Myocardial Infarction

            Myocardial Infarction
            Yes     No     Total
Placebo      28    656       684
Aspirin      18    658       676

Source: Based on results described in Lancet 338: 1345–1349 (1991).

A 95% confidence interval for $\log \theta$ in the population this sample represents is $0.445 \pm 1.96(0.307)$, or $(-0.157, 1.047)$. The corresponding interval for $\theta$ is $[\exp(-0.157), \exp(1.047)]$, or $(0.85, 2.85)$. The estimate of the true odds ratio is rather imprecise.
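The arithmetic of this example can be retraced with a few lines of Python. The sketch below simply evaluates the sample odds ratio, the amended estimator, formula (3.1), and interval (3.2) for the counts in Table 3.1; the variable names are illustrative choices, not notation from the text.

```python
import math

# Table 3.1 counts: rows are placebo and aspirin, columns are MI yes and no.
n11, n12 = 28, 656
n21, n22 = 18, 658

theta_hat = (n11 * n22) / (n12 * n21)                      # sample odds ratio
theta_tilde = ((n11 + 0.5) * (n22 + 0.5)) / ((n12 + 0.5) * (n21 + 0.5))

log_theta_hat = math.log(theta_hat)
se_log = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)  # formula (3.1)

z = 1.96                                                   # z_{alpha/2} for 95%
log_lower = log_theta_hat - z * se_log                     # interval (3.2)
log_upper = log_theta_hat + z * se_log
ci_theta = (math.exp(log_lower), math.exp(log_upper))      # back to the odds ratio scale

print(f"theta_hat = {theta_hat:.2f}, theta_tilde = {theta_tilde:.2f}")
print(f"log(theta_hat) = {log_theta_hat:.3f}, SE = {se_log:.3f}")
print(f"95% CI for theta: ({ci_theta[0]:.2f}, {ci_theta[1]:.2f})")
```

Rounded to the precision used in the text, these quantities agree with the reported values 1.56, 1.55, 0.445, 0.307, and (0.85, 2.85).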

Since the confidence interval for $\theta$ contains 1.0, it is plausible that the true odds of death due to myocardial infarction are equal for aspirin and placebo. If there truly is a beneficial effect of aspirin but the odds ratio is not large, it may require a large sample size to show that benefit because of the relatively small number of myocardial infarction cases (Problem 3.21).

3.1.3 Interval Estimation of Difference of Proportions

The difference of proportions and the relative risk compare conditional distributions of a response variable for two groups. For these measures, we treat the samples as independent binomials. For group $i$, $y_i$ has a binomial distribution with sample size $n_i$ and a probability $\pi_i$ of a "success" response.

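The sample proportion $\hat{\pi}_i = y_i/n_i$ has expectation $\pi_i$ and variance $\pi_i(1 - \pi_i)/n_i$. Since $\hat{\pi}_1$ and $\hat{\pi}_2$ are independent, their difference has

\[
E(\hat{\pi}_1 - \hat{\pi}_2) = \pi_1 - \pi_2
\]

and standard error

\[
\sigma(\hat{\pi}_1 - \hat{\pi}_2) = \left[ \frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2} \right]^{1/2}. \tag{3.3}
\]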

The estimate $\hat{\sigma}(\hat{\pi}_1 - \hat{\pi}_2)$ uses formula (3.3) with $\pi_i$ replaced by $\hat{\pi}_i$. Then

\[
(\hat{\pi}_1 - \hat{\pi}_2) \pm z_{\alpha/2}\, \hat{\sigma}(\hat{\pi}_1 - \hat{\pi}_2) \tag{3.4}
\]

is a Wald confidence interval for $\pi_1 - \pi_2$. Like the Wald interval (1.13) for a single proportion, it usually has true coverage probability less than the nominal confidence coefficient, especially when $\pi_1$ and $\pi_2$ are near 0 or 1.

More complex but better methods are cited in Section 3.1.8, Note 3.2, and Problem 3.23.
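As a concrete illustration of (3.3) and (3.4), the following Python sketch computes the Wald interval for a difference of proportions from two independent binomial samples. The function name `wald_diff_interval` and its return layout are assumptions of this sketch; the counts in the usage line are those of Table 3.1 (placebo first, aspirin second).

```python
import math

def wald_diff_interval(y1, n1, y2, n2, z=1.96):
    """Wald interval (3.4) for pi_1 - pi_2 from independent binomial samples."""
    p1, p2 = y1 / n1, y2 / n2
    # Formula (3.3) with each pi_i replaced by its sample proportion.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, se, (diff - z * se, diff + z * se)

diff, se, (lower, upper) = wald_diff_interval(28, 684, 18, 676)
print(f"difference = {diff:.3f}, SE = {se:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the interval is symmetric about the sample difference, nothing prevents its endpoints from falling outside [-1, 1] in small samples, which is one symptom of the coverage problems noted above.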

3.1.4 Interval Estimation of Relative Risk

The sample relative risk is $r = \hat{\pi}_1/\hat{\pi}_2$. Like the odds ratio, it converges to normality faster on the log scale. The asymptotic standard error of $\log r$ is

\[
\sigma(\log r) = \left( \frac{1 - \pi_1}{\pi_1 n_1} + \frac{1 - \pi_2}{\pi_2 n_2} \right)^{1/2}. \tag{3.5}
\]

The Wald interval exponentiates endpoints of $\log r \pm z_{\alpha/2}\, \hat{\sigma}(\log r)$. It works well but can be somewhat conservative. We discuss an alternative method in Section 3.1.8.

For Table 3.1, the sample proportion of myocardial infarction deaths was 0.0409 for subjects taking placebo and 0.0266 for subjects taking aspirin. The sample relative risk is $0.0409/0.0266 = 1.54$. The 95% confidence interval for the log relative risk of $\log(1.54) \pm 1.96(0.297)$ translates to $(0.86, 2.75)$ for the relative risk. We infer that the death rate for those taking placebo was between 0.86 and 2.75 times that for those taking aspirin. The Wald 95% confidence interval for $\pi_1 - \pi_2$ is $0.014 \pm 1.96(0.0098)$, or $(-0.005, 0.033)$. According to either measure, substantial public health benefits could result from taking aspirin, but no effect or a slight negative effect is also plausible.

Results for the larger study described in Section 2.2.5 do show a benefit.
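The relative risk calculation for Table 3.1 can be sketched in the same way. The helper below evaluates (3.5) at the sample proportions and exponentiates the endpoints of the log-scale interval; the name `wald_relative_risk_interval` is a hypothetical choice for this illustration.

```python
import math

def wald_relative_risk_interval(y1, n1, y2, n2, z=1.96):
    """Wald interval for the relative risk, built on the log scale."""
    p1, p2 = y1 / n1, y2 / n2
    r = p1 / p2                                            # sample relative risk
    # Formula (3.5) evaluated at the sample proportions.
    se_log_r = math.sqrt((1 - p1) / (p1 * n1) + (1 - p2) / (p2 * n2))
    log_r = math.log(r)
    interval = (math.exp(log_r - z * se_log_r), math.exp(log_r + z * se_log_r))
    return r, se_log_r, interval

r, se_log_r, (lower, upper) = wald_relative_risk_interval(28, 684, 18, 676)
print(f"r = {r:.2f}, SE(log r) = {se_log_r:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Working on the log scale keeps both endpoints positive, and the interval for the reciprocal ratio is obtained by inverting the endpoints.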

3.1.5 Deriving Standard Errors with the Delta Method*

A simple and useful method exists for deriving standard errors for large-sample inferences. Let $T_n$ denote a statistic that is asymptotically normally distributed about a parameter $\theta$, the subscript $n$ expressing its dependence on sample size. Suppose that an estimator is a function $g(T_n)$ of $T_n$. Then, under mild conditions, $g(T_n)$ itself has a large-sample normal distribution. The standard error depends on how fast $g(t)$ changes for $t$ near $\theta$.

Specifically, for large $n$, suppose that $T_n$ is normally distributed about $\theta$ with standard error $\sigma/\sqrt{n}$. That is, as $n \to \infty$, the cdf of $\sqrt{n}(T_n - \theta)$ converges to the cdf of a normal random variable with mean 0 and variance $\sigma^2$. This limiting behavior is an example of convergence in distribution, denoted by

\[
\sqrt{n}(T_n - \theta) \xrightarrow{d} N(0, \sigma^2).
\]

Let $g$ be a function that is at least twice differentiable at $\theta$. Using the Taylor series expansion for $g(t)$ in a neighborhood of $t = \theta$, in Section 14.1.2 we show

\[
\sqrt{n}\,[g(T_n) - g(\theta)] \approx \sqrt{n}(T_n - \theta)\, g'(\theta)
\]

for large $n$, where $g'(\theta) = \partial g/\partial t$ evaluated at $t = \theta$. Recall that if a variate $Y \sim N(0, \sigma^2)$, then $cY \sim N(0, c^2\sigma^2)$. Thus,

\[
\sqrt{n}\,[g(T_n) - g(\theta)] \xrightarrow{d} N\left(0, [g'(\theta)]^2 \sigma^2\right). \tag{3.6}
\]

In other words, $g(T_n)$ is approximately normal around $g(\theta)$ with variance $[g'(\theta)]^2 \sigma^2/n$.

FIGURE 3.1 Depiction of delta method.

Figure 3.1 portrays this result. Locally around $\theta$, $g(t)$ is approximately linear, with slope $g'(\theta)$. Then $g(T_n)$ is approximately normal, since linear transformations of normal random variables are themselves normal. The dispersion of $g(T_n)$ values about $g(\theta)$ is about $|g'(\theta)|$ times the dispersion of $T_n$ values about $\theta$. If the slope of $g$ at $\theta$ is $\frac{1}{2}$, then $g$ maps a region of $T_n$ values into a region of $g(T_n)$ values only about half as wide.
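The scaling by $|g'(\theta)|$ in result (3.6) can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the original development: it defines a hypothetical helper `delta_method_se`, applies it with $g(t) = \log t$ to each sample proportion in Table 3.1, and combines the two independent groups by adding variances, which should recover the standard error of the log relative risk used in Section 3.1.4.

```python
import math

def delta_method_se(g_prime, t, se_t):
    """Approximate standard error of g(T) from result (3.6): |g'(t)| * SE(T)."""
    return abs(g_prime(t)) * se_t

def se_log_proportion(y, n):
    """Delta-method standard error of log(y/n) for a binomial count y."""
    p = y / n
    se_p = math.sqrt(p * (1 - p) / n)          # binomial standard error of p
    return delta_method_se(lambda t: 1 / t, p, se_p)

se1 = se_log_proportion(28, 684)   # placebo row of Table 3.1
se2 = se_log_proportion(18, 676)   # aspirin row of Table 3.1

# log r = log(p1) - log(p2) with independent groups, so the variances add.
se_log_r = math.sqrt(se1 ** 2 + se2 ** 2)
print(f"{se1:.3f} {se2:.3f} {se_log_r:.3f}")
```

The combined value rounds to 0.297, the standard error quoted for the log relative risk, which shows how formula (3.5) arises from one-parameter applications of the delta method.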

Result (3.6) is called the delta method. Since $g'(\theta)$ and $\sigma^2 = \sigma^2(\theta)$ usually depend on the unknown parameter $\theta$, the asymptotic variance is unknown. Confidence intervals and tests substitute $T_n$ for $\theta$ and use the result that $\sqrt{n}\,[g(T_n) - g(\theta)] / [\,|g'(T_n)|\, \sigma(T_n)]$ is asymptotically standard normal. For instance,

\[
g(T_n) \pm 1.96\, |g'(T_n)|\, \sigma(T_n)/\sqrt{n}
\]

is a large-sample Wald 95% confidence interval for $g(\theta)$.

3.1.6 Delta Method Applied to Sample Logit*

We illustrate the delta method for a function of the ML estimator $T_n = \hat{\pi} = y/n$ of the binomial parameter $\pi$, for $y$ successes in $n$ trials. Since $E(Y) = n\pi$ and $\mathrm{var}(Y) = n\pi(1 - \pi)$, $E(\hat{\pi}) = \pi$ and $\mathrm{var}(\hat{\pi}) = \pi(1 - \pi)/n$. Also, $\hat{\pi}$ has a large-sample normal distribution by the central limit theorem. So do many functions of $\hat{\pi}$.

The log odds function of $\hat{\pi}$,

\[
g(\hat{\pi}) = \log[\hat{\pi}/(1 - \hat{\pi})],
\]

is called the sample logit. Evaluated at $\pi$, its derivative equals $1/[\pi(1 - \pi)]$. By the delta method, the asymptotic variance of the sample logit is $\pi(1 - \pi)/n$ (the variance of $\hat{\pi}$) multiplied by the square of $1/[\pi(1 - \pi)]$. That is,

\[
\sqrt{n} \left( \log \frac{\hat{\pi}}{1 - \hat{\pi}} - \log \frac{\pi}{1 - \pi} \right) \xrightarrow{d} N\left(0, \frac{1}{\pi(1 - \pi)}\right).
\]

The asymptotic normality of $\hat{\pi}$ propagates to asymptotic normality of $\log[\hat{\pi}/(1 - \hat{\pi})]$.

The asymptotic variance is the variance of the normal distribution that approximates the true distribution, for large $n$. It is not an approximation for the variance of the true distribution. For $0 < \pi < 1$, the asymptotic variance $[n\pi(1 - \pi)]^{-1}$ of the sample logit is finite. By contrast, the true variance does not exist: Since $\hat{\pi} = 0$ or 1 with positive probability, the logit can equal $-\infty$ or $\infty$ with positive probability. The probability of an infinite logit converges to zero rapidly as $n$ increases. For large $n$, the distribution of the sample logit looks essentially normal with mean $\log[\pi/(1 - \pi)]$ and standard deviation $[n\pi(1 - \pi)]^{-1/2}$. Thus, for the logit, the asymptotic variance actually has greater use than the true variance. Incidentally, related to this, the bootstrap is not helpful for approximating standard errors for many discrete measures, because it mimics the true rather than the more relevant asymptotic standard error.
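To see how quickly the probability of an infinite logit vanishes, the following sketch evaluates it exactly alongside the asymptotic standard deviation. The function names and the value $\pi = 0.3$ are illustrative assumptions; the probability is simply $P(Y = 0) + P(Y = n)$ for a binomial count $Y$.

```python
import math

def prob_infinite_logit(n, pi):
    """P(sample logit is -inf or +inf) = P(Y = 0) + P(Y = n), Y ~ binomial(n, pi)."""
    return (1 - pi) ** n + pi ** n

def asymptotic_sd_logit(n, pi):
    """Asymptotic standard deviation [n * pi * (1 - pi)]^(-1/2) of the sample logit."""
    return 1 / math.sqrt(n * pi * (1 - pi))

pi = 0.3  # illustrative value, not taken from the text
for n in (10, 50, 200):
    print(n, prob_infinite_logit(n, pi), asymptotic_sd_logit(n, pi))
```

The first quantity decays geometrically in $n$, whereas the asymptotic standard deviation shrinks only at rate $1/\sqrt{n}$, which is why the normal approximation becomes the practically relevant description.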

3.1.7 Delta Method for Log Odds Ratio*

Standard errors for the log odds ratio and the log relative risk result from a multiparameter version of the delta method. Suppose that $\{n_i, i = 1, \ldots, c\}$ have a multinomial $(n, \{\pi_i\})$ distribution. The sample proportion $\hat{\pi}_i = n_i/n$ has mean and variance

\[
E(\hat{\pi}_i) = \pi_i \quad \text{and} \quad \mathrm{var}(\hat{\pi}_i) = \pi_i(1 - \pi_i)/n. \tag{3.7}
\]

In Section 14.1.4 we show that for $i \ne j$, $\hat{\pi}_i$ and $\hat{\pi}_j$ have covariance

\[
\mathrm{cov}(\hat{\pi}_i, \hat{\pi}_j) = -\pi_i \pi_j / n. \tag{3.8}
\]

The sample proportions $(\hat{\pi}_1, \hat{\pi}_2, \ldots, \hat{\pi}_{c-1})$ have a large-sample multivariate normal distribution. For functions of them, the delta method implies the following result, proved in Section 14.1.4:

Let $g(\pi)$ denote a differentiable function of $\{\pi_i\}$, with sample value $g(\hat{\pi})$ for a multinomial sample. Let

\[
\phi_i = \frac{\partial g(\pi)}{\partial \pi_i}, \quad i = 1, \ldots, c.
\]

Then as $n \to \infty$, the distribution of $\sqrt{n}\,[g(\hat{\pi}) - g(\pi)]/\sigma$ converges to standard normal, where

\[
\sigma^2 = \sum_i \pi_i \phi_i^2 - \left( \sum_i \pi_i \phi_i \right)^2. \tag{3.9}
\]

The asymptotic variance depends on $\{\pi_i\}$ and the partial derivatives of the measure with respect to $\{\pi_i\}$. In practice, replacing $\{\pi_i\}$ and $\{\phi_i\}$ in (3.9) by their sample values yields an ML estimate $\hat{\sigma}^2$ of $\sigma^2$. Then $\hat{\sigma}/\sqrt{n}$ is an estimated standard error for $g(\hat{\pi})$. A large-sample Wald confidence interval for $g(\pi)$ is

\[
g(\hat{\pi}) \pm z_{\alpha/2}\, \hat{\sigma}/\sqrt{n}.
\]

With the substitution of $\hat{\sigma}$ for $\sigma$ in (3.9), the limiting distribution is still standard normal, but convergence is slower. The equivalence in the large-sample distribution is justified as follows: The sample proportions converge in probability to $\{\pi_i\}$, by the weak law of large numbers. Since $\hat{\sigma}$ is a continuous function of the sample proportions, it converges in probability to $\sigma$, and $\sigma/\hat{\sigma}$ converges in probability to 1. Now

\[
\sqrt{n}\, \frac{g(\hat{\pi}) - g(\pi)}{\hat{\sigma}} = \sqrt{n}\, \frac{g(\hat{\pi}) - g(\pi)}{\sigma} \cdot \frac{\sigma}{\hat{\sigma}}.
\]

The first term on the right-hand side converges in distribution to standard normal, by (3.9), and the second term converges in probability to 1. Thus, their product also has a limiting standard normal distribution.

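We now apply the delta method to the log odds ratio, taking $g(\pi) = \log \theta = \log \pi_{11} + \log \pi_{22} - \log \pi_{12} - \log \pi_{21}$. Since

\[
\phi_{11} = \partial(\log \theta)/\partial \pi_{11} = 1/\pi_{11}, \quad \phi_{12} = -1/\pi_{12}, \quad \phi_{21} = -1/\pi_{21}, \quad \phi_{22} = 1/\pi_{22},
\]

$\sum_i \sum_j \pi_{ij} \phi_{ij} = 0$ and $\sigma^2 = \sum_i \sum_j \pi_{ij} \phi_{ij}^2 = \sum_i \sum_j (1/\pi_{ij})$. The asymptotic standard error of $\log \hat{\theta}$ for a multinomial sample $\{n_{ij}\}$ is

\[
\sigma(\log \hat{\theta}) = \sigma/\sqrt{n} = \left( \sum_i \sum_j \frac{1}{n \pi_{ij}} \right)^{1/2}.
\]

Since $n\hat{\pi}_{ij} = n_{ij}$, the estimated standard error is (3.1).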
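The derivation above can be checked numerically. The sketch below evaluates (3.9) at the sample proportions of Table 3.1, using the partial derivatives of the log odds ratio, and compares the resulting standard error with formula (3.1). The function name `multinomial_delta_sigma2` and the flattened cell ordering are assumptions of this illustration.

```python
import math

def multinomial_delta_sigma2(pi, phi):
    """sigma^2 from (3.9): sum(pi_i * phi_i^2) - (sum(pi_i * phi_i))^2."""
    first = sum(p * f * f for p, f in zip(pi, phi))
    second = sum(p * f for p, f in zip(pi, phi)) ** 2
    return first - second

counts = [28, 656, 18, 658]              # n11, n12, n21, n22 from Table 3.1
n = sum(counts)
pi_hat = [c / n for c in counts]
signs = [1, -1, -1, 1]                   # log theta = log p11 - log p12 - log p21 + log p22
phi_hat = [s / p for s, p in zip(signs, pi_hat)]

sigma2_hat = multinomial_delta_sigma2(pi_hat, phi_hat)
se_delta = math.sqrt(sigma2_hat / n)                 # sigma_hat / sqrt(n)
se_formula = math.sqrt(sum(1 / c for c in counts))   # formula (3.1)
print(f"{se_delta:.3f} {se_formula:.3f}")
```

Both routes give 0.307 to three decimals, as the algebra predicts: the second sum in (3.9) vanishes and the first reduces to the sum of reciprocal cell probabilities.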

The delta method also applies directly with $\hat{\theta}$ to obtain $\hat{\sigma}(\hat{\theta})$ and a Wald confidence interval $\hat{\theta} \pm z_{\alpha/2}\, \hat{\sigma}(\hat{\theta})$. This is not recommended; $\hat{\theta}$ converges more slowly than $\log \hat{\theta}$ to normality, this interval could contain negative values, and it does not give results equivalent to those obtained with the Wald interval using $1/\hat{\theta}$ and its standard error.
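The lack of invariance can be seen with the Table 3.1 counts. The sketch below assumes the delta-method approximation $\hat{\sigma}(\hat{\theta}) \approx \hat{\theta}\, \hat{\sigma}(\log \hat{\theta})$, obtained by applying (3.6) with $g = \exp$ to $\log \hat{\theta}$; this expression is not given in the text and is introduced here only for illustration, as are the variable names.

```python
import math

n11, n12, n21, n22 = 28, 656, 18, 658       # Table 3.1
z = 1.96

theta_hat = (n11 * n22) / (n12 * n21)
se_log = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)

# Assumed delta-method step with g = exp: SE(theta_hat) ~ theta_hat * SE(log theta_hat).
se_theta = theta_hat * se_log
wald_theta = (theta_hat - z * se_theta, theta_hat + z * se_theta)

# Same construction for 1/theta_hat, then mapped back to the theta scale.
inv_hat = 1 / theta_hat
se_inv = inv_hat * se_log
wald_inv = (inv_hat - z * se_inv, inv_hat + z * se_inv)
wald_from_inverse = (1 / wald_inv[1], 1 / wald_inv[0])

# Log-scale (Woolf) interval for comparison.
woolf = (math.exp(math.log(theta_hat) - z * se_log),
         math.exp(math.log(theta_hat) + z * se_log))

print(wald_theta, wald_from_inverse, woolf)
```

The two direct Wald intervals disagree markedly once expressed on the same scale, whereas the log-scale interval for $1/\theta$ is just the reciprocal of the one for $\theta$. Under the same approximation, the lower limit $\hat{\theta}(1 - z\,\hat{\sigma}(\log \hat{\theta}))$ becomes negative whenever $\hat{\sigma}(\log \hat{\theta}) > 1/z$.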

3.1.8 Score and Profile Likelihood Confidence Intervals*

Standard errors obtained with the delta method appear in Wald confidence intervals. However, intervals based on inverting Wald tests sometimes work poorly for small to moderate n. Alternative intervals result from inverting likelihood-ratio or score tests. Although computationally more complex, these methods often perform better.

We illustrate first with the score method for the difference of proportions. The score test (Mee 1984; Miettinen and Nurminen 1985) of $H_0$: $\pi_1 - \pi_2 = \Delta$ has the test statistic

\[
z(\Delta) = \frac{(\hat{\pi}_1 - \hat{\pi}_2) - \Delta}{\sqrt{\hat{\pi}_1(\Delta)[1 - \hat{\pi}_1(\Delta)]/n_1 + \hat{\pi}_2(\Delta)[1 - \hat{\pi}_2(\Delta)]/n_2}},
\]

where $\hat{\pi}_i(\Delta)$ denotes the ML estimate of $\pi_i$ subject to the constraint $\pi_1 - \pi_2 = \Delta$. That is, $\hat{\pi}_1(\Delta)$ and $\hat{\pi}_2(\Delta)$ are the values of $\pi_1$ and $\pi_2$ satisfying $\pi_1 - \pi_2 = \Delta$ that maximize the product of the two binomial probability mass functions. These values do not have closed-form expressions and are determined using numerical methods. The score confidence interval is the set of $\Delta$ such that $|z(\Delta)| < z_{\alpha/2}$. Computations for such intervals require iteration (Nurminen 1986).
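The iteration involved can be sketched as follows. This is a minimal illustration, not the algorithm of the cited papers: it finds the constrained ML estimates by ternary search (relying on concavity of the log likelihood in $\pi_2$) and the interval endpoints by bisection (assuming $z(\Delta)$ decreases in $\Delta$). All function names, tolerances, and search limits are assumptions of this sketch.

```python
import math

def _xlog(count, p):
    """count * log(p), with the convention 0 * log(0) = 0."""
    return 0.0 if count == 0 else count * math.log(p)

def constrained_mle(y1, n1, y2, n2, delta, iters=200):
    """ML estimates of (pi1, pi2) subject to pi1 - pi2 = delta (ternary search)."""
    def loglik(p2):
        p1 = p2 + delta
        return (_xlog(y1, p1) + _xlog(n1 - y1, 1 - p1)
                + _xlog(y2, p2) + _xlog(n2 - y2, 1 - p2))

    tiny = 1e-12
    lo, hi = max(0.0, -delta) + tiny, min(1.0, 1.0 - delta) - tiny
    for _ in range(iters):                 # log likelihood is concave in p2
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if loglik(m1) < loglik(m2):
            lo = m1
        else:
            hi = m2
    p2 = (lo + hi) / 2
    return p2 + delta, p2

def score_z(y1, n1, y2, n2, delta):
    """Score statistic z(delta) with the variance evaluated at the constrained MLEs."""
    p1, p2 = constrained_mle(y1, n1, y2, n2, delta)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (y1 / n1 - y2 / n2 - delta) / se

def score_interval(y1, n1, y2, n2, z=1.96, tol=1e-8):
    """Set of delta with |z(delta)| < z, located by bisection on each side."""
    d_hat = y1 / n1 - y2 / n2

    def solve(a, b, target):
        # Assumes z(delta) decreases in delta; finds delta with z(delta) = target.
        while b - a > tol:
            mid = (a + b) / 2
            if score_z(y1, n1, y2, n2, mid) > target:
                a = mid
            else:
                b = mid
        return (a + b) / 2

    return solve(-1 + 1e-6, d_hat, z), solve(d_hat, 1 - 1e-6, -z)

lower, upper = score_interval(28, 684, 18, 676)   # Table 3.1
print(lower, upper)
```

For a sample as large as Table 3.1 the result lies close to the Wald interval; the two methods differ more when counts are small or proportions are near 0 or 1.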

For the relative risk also, slightly better performance results with an interval using the score method (Bedrick 1987; Gart and Nam 1988; Koopman 1984; Miettinen and Nurminen 1985; Nurminen 1986). Cornfield (1956) and Miettinen and Nurminen (1985) showed the score interval for the odds ratio. We prefer not to use a continuity or finite-sampling correction with these intervals, as then performance is too conservative. The fact that the score intervals are computationally more complex than Wald intervals should not be an impediment to their use in this modern era of computing, as the principle behind them is simple. However, currently they are not available in standard software.

For a confidence interval based on the likelihood-ratio test, we illustrate with the odds ratio. The multinomial likelihood for a 2×2 table is a function of $\{\pi_{11}, \pi_{12}, \pi_{21}\}$. Equivalently, it can be expressed in terms of $\{\theta, \pi_{1+}, \pi_{+1}\}$ (recall Section 2.4.1). Thus, in inverting a likelihood-ratio test of $H_0$: $\theta = \theta_0$ to check whether $\theta_0$ belongs in the confidence interval, there are two nuisance parameters. Their null ML estimates $\hat{\pi}_{1+}(\theta_0)$ and $\hat{\pi}_{+1}(\theta_0)$ that maximize the likelihood under the null vary as $\theta_0$ does.


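The profile log-likelihood function is $L(\theta_0, \hat{\pi}_{1+}(\theta_0), \hat{\pi}_{+1}(\theta_0))$, viewed as a function of $\theta_0$. For each $\theta_0$ this function gives the maximum of the ordinary log likelihood subject to the constraint $\theta = \theta_0$. Evaluated at $\theta_0 = \hat{\theta}$, this is the maximized log likelihood $L(\hat{\theta}, \hat{\pi}_{1+}, \hat{\pi}_{+1})$, which occurs at the sample proportions $\hat{\pi}_{1+} = n_{1+}/n$ and $\hat{\pi}_{+1} = n_{+1}/n$. The profile likelihood confidence interval for $\theta$ is the set of $\theta_0$ for which

\[
-2\left[ L(\theta_0, \hat{\pi}_{1+}(\theta_0), \hat{\pi}_{+1}(\theta_0)) - L(\hat{\theta}, \hat{\pi}_{1+}, \hat{\pi}_{+1}) \right] < \chi^2_1(\alpha).
\]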

This contains all $\theta_0$ not rejected in likelihood-ratio tests of nominal size $\alpha$. The profile likelihood approach is available with some software (e.g., for SAS, see Table A.2 in Appendix A). A related approach, discussed in Section 6.7.1, uses a conditional likelihood function that eliminates the nuisance parameters by conditioning on their sufficient statistics. This is beneficial when there are many nuisance parameters. An advantage of score and likelihood-based intervals is that unlike the Wald, they are not adversely affected when the sample relative risk or odds ratio is 0 or $\infty$.
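A profile likelihood interval for the odds ratio can be sketched as below. The sketch rests on stated assumptions rather than on the text's parametrization: it treats the two rows as independent binomials with $\mathrm{logit}(\pi_1) = a + \log \theta_0$ and $\mathrm{logit}(\pi_2) = a$, so that a single nuisance logit $a$ is profiled out (the row-total factor of the multinomial likelihood does not involve $\theta$ and cancels in the likelihood-ratio statistic). It also assumes all four counts are positive and that $\log \hat{\theta} \pm 10$ brackets the interval. Function names and tolerances are hypothetical; 3.841 is the upper 0.05 quantile of chi-squared with 1 degree of freedom.

```python
import math

def _log1pexp(x):
    """Numerically stable log(1 + exp(x))."""
    return x + math.log1p(math.exp(-x)) if x > 0 else math.log1p(math.exp(x))

def profile_loglik(y1, n1, y2, n2, beta, iters=200):
    """Maximum over the nuisance logit a of the two-binomial log likelihood,
    with logit(pi1) = a + beta and logit(pi2) = a, where beta = log(theta0).
    Binomial coefficients are omitted because they cancel in the LR statistic."""
    def loglik(a):
        return (y1 * (a + beta) - n1 * _log1pexp(a + beta)
                + y2 * a - n2 * _log1pexp(a))

    lo, hi = -30.0, 30.0
    for _ in range(iters):                  # concave in a, so ternary search works
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if loglik(m1) < loglik(m2):
            lo = m1
        else:
            hi = m2
    return loglik((lo + hi) / 2)

def profile_interval(y1, n1, y2, n2, chi2_crit=3.841, tol=1e-8):
    """Odds ratios theta0 whose likelihood-ratio statistic is below chi2_crit."""
    beta_hat = math.log(y1 * (n2 - y2) / ((n1 - y1) * y2))
    l_max = profile_loglik(y1, n1, y2, n2, beta_hat)

    def lr_stat(beta):
        return -2 * (profile_loglik(y1, n1, y2, n2, beta) - l_max)

    def solve(inside, outside):
        # Bisection between a point inside the interval and one outside it.
        while abs(outside - inside) > tol:
            mid = (inside + outside) / 2
            if lr_stat(mid) < chi2_crit:
                inside = mid
            else:
                outside = mid
        return (inside + outside) / 2

    return (math.exp(solve(beta_hat, beta_hat - 10)),
            math.exp(solve(beta_hat, beta_hat + 10)))

lower, upper = profile_interval(28, 684, 18, 676)   # Table 3.1, placebo row first
print(lower, upper)
```

Because the search is carried out on the log odds ratio scale, the interval for $1/\theta$ is the reciprocal of the one for $\theta$, and the endpoints need not be symmetric about $\log \hat{\theta}$.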

In this section we have discussed interval estimation. Significance tests normally refer to a null hypothesis value of 0.0 for the log odds ratio, log relative risk, and difference of proportions. These are special cases of independence applied to 2×2 tables. In the next section we present tests of independence for two-way contingency tables.
