
Example

For the leukaemia survival data of Table 6.1, the Cox model without treatment differences has an AIC of 255.8, whereas that with such differences has 242.4. Although these are data originally used by Cox (1972) to illustrate use of this model, these values are both larger than, and hence inferior to, those obtained for the exponential and Weibull models above. This is due to the large number of parameters that must be estimated in this “semi-parametric” model. The difference in log risk is now estimated to be 1.521, very similar to that for the exponential model.

Summary

Techniques for analyzing survival data are widely known and well documented. Because they are not generalized linear models, at least in the approaches described in this chapter, we shall not consider them further here. Many books on survival analysis are available; standard texts include those by Kalbfleisch and Prentice (1980), Lawless (1982), Cox and Oakes (1984), Fleming and Harrington (1991), and Andersen et al. (1993).

blood cell counts were actually greater than 100,000. Notice that none of the survival times are censored. Do either of these variables help to predict survival time? Try transformations of the white blood cell counts.

2. A total of 90 patients suffering from gastric cancer were randomly assigned to two groups. One group was treated with chemotherapy and radiation, whereas the other only received chemotherapy, giving the following survival times in days (Gamerman, 1991, from Stablein et al.) (asterisks indicate censoring):

Chemotherapy Chemotherapy

+ radiation

  17   167   315   1174*         1   356   524    977
  42   170   401   1214         63   358   535   1245
  44   183   445   1232*       105   380   562   1271
  48   185   464   1366        125   383   569   1420
  60   193   484   1455*       182   383   675   1460*
  72   195   528   1585*       216   388   676   1516*
  74   197   542   1622*       250   394   748   1551
  95   208   567   1626*       262   408   778   1690*
 103   234   577   1736*       301   460   786   1694
 108   235   580               301   489   797
 122   254   795               342   499   955
 144   307   855               354   523   968

Is there any evidence that radiation lengthens survival times?

3. The Eastern Cooperative Oncology Group in the United States of America conducted a study of lymphocytic non-Hodgkin’s lymphoma. Patients were judged either asymptomatic or symptomatic at the start of treatment, where symptoms included weight loss, fever, and night sweats. Survival times (weeks) of patients were classified by these symptoms (Dinse, 1982) (asterisks indicate censoring):

Asymptomatic             Symptomatic

  50    257    349*          49
  58    262    354*          58
  96    292    359           75
 139    294    360*         110
 152    300*   365*         112
 159    301    378*         132
 189    306*   381*         151
 225    329*   388*         276
 239    342*                281
 242    346*                362*

Patients with missing symptoms are not included. Do the symptoms provide us with a means of predicting differences in the survival time?

4. Fifty female black ducks, Anas rubripes, from two locations in New Jersey, USA, were captured by the U.S. Fish and Wildlife Service over a four-week period from 8 November to 14 December, 1983. The ducks were then fitted with radio emitters and released at the end of the year. Of these, 31 were born in the year (age 0) and 19 the previous year (age 1). Body weight (g) and wing length (mm) were recorded. Usually, these are used to calculate a condition index, the ratio of weight to wing length. The status of each bird was recorded every day until 15 February 1984 by means of roof-mounted antennae on trucks, strut-mounted antennae on fixed-wing airplanes, and hand-held antennae on foot and by boat. The recorded survival times were (Pollock et al., 1989) (asterisks indicate censoring):

Age Weight Wing Time Age Weight Wing Time

1 1160 277 2 0 1040 255 44

0 1140 266 6* 0 1130 268 49*

1 1260 280 6* 1 1320 285 54*

0 1160 264 7 0 1180 259 56*

1 1080 267 13 0 1070 267 56*

0 1120 262 14* 1 1260 269 57*

1 1140 277 16* 0 1270 276 57*

1 1200 283 16 0 1080 260 58*

1 1100 264 17* 1 1110 270 63*

1 1420 270 17 0 1150 271 63*

1 1120 272 20* 0 1030 265 63*

1 1110 271 21 0 1160 275 63*

0 1070 268 22 0 1180 263 63*

0 940 252 26 0 1050 271 63*

0 1240 271 26 1 1280 281 63*

0 1120 265 27 0 1050 275 63*

1 1340 275 28* 0 1160 266 63*

0 1010 272 29 0 1150 263 63*

0 1040 270 32 1 1270 270 63*

1 1250 276 32* 1 1370 275 63*

0 1200 276 34 1 1220 265 63*

0 1280 270 34 0 1220 268 63*

0 1250 272 37 0 1140 262 63*

0 1090 275 40 0 1140 270 63*

1 1050 275 41 0 1120 274 63*

What variables, including constructed ones, influence survival?
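As a starting point for these exercises, the asterisk-coded times can be turned into a pair of vectors: durations and event indicators. The sketch below is only one possible encoding, with hypothetical helper names (`parse_times`, `exponential_rate`); it uses the symptomatic group of Exercise 3 and, purely for illustration, the simplest model of Chapter 6, a constant intensity estimated as observed events divided by total time at risk.

```python
def parse_times(text):
    """Split whitespace-separated survival times; a trailing '*' marks censoring."""
    times, events = [], []
    for token in text.split():
        censored = token.endswith("*")
        times.append(float(token.rstrip("*")))
        events.append(0 if censored else 1)  # 1 = event observed, 0 = censored
    return times, events


def exponential_rate(times, events):
    """Constant-intensity estimate: observed events / total time at risk."""
    return sum(events) / sum(times)


# Symptomatic group of Exercise 3 (weeks), as read from the table above.
symptomatic = "49 58 75 110 112 132 151 276 281 362*"
t, d = parse_times(symptomatic)
rate = exponential_rate(t, d)
print(f"{sum(d)} events in {sum(t):.0f} weeks at risk, rate = {rate:.5f} per week")
```

The same two vectors are the input expected by most survival routines, so the encoding step does not commit us to the exponential model.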

7 Event Histories

An event history is observed when, in contrast to survival data, events are not absorbing but repeating, so that a series of events, and the corresponding durations between them, can be recorded for each individual.

Many simple event histories can be handled in the generalized linear model context. Some of these were covered in Chapter 4. If the intervals between events are independently and identically distributed, we have a renewal process that can be fitted in the same way as ordinary survival models.

This is generally only realistic in engineering settings, such as the study of times between breakdowns of machines or the replacement of burned out light bulbs. If the distribution of the intervals only depends on what has happened before an interval begins, the time series methods of Chapter 5 can be applied, by conditioning on the appropriate information.
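To illustrate how a renewal process reduces to ordinary survival data, the following minimal sketch converts each machine's breakdown times into intervals between events and pools them. The breakdown times, the 100-day follow-up, and the function name `renewal_intervals` are all hypothetical; the final stretch from the last breakdown to the end of observation is treated as a censored interval, and a constant intensity is assumed only to keep the example short.

```python
def renewal_intervals(event_times, end_of_observation):
    """Turn one individual's event times into (duration, observed) pairs.

    The stretch from the last event to end_of_observation is a censored interval.
    """
    intervals = []
    previous = 0.0
    for t in sorted(event_times):
        intervals.append((t - previous, 1))
        previous = t
    if end_of_observation > previous:
        intervals.append((end_of_observation - previous, 0))
    return intervals


# Hypothetical breakdown times (days) for two machines, each followed for 100 days.
machines = {"A": [12.0, 40.0, 71.0], "B": [25.0, 90.0]}
pooled = [iv for times in machines.values() for iv in renewal_intervals(times, 100.0)]

# Under a renewal process the pooled intervals are treated as independent
# survival times, so any survival model can be fitted to them.
n_events = sum(d for _, d in pooled)
time_at_risk = sum(u for u, _ in pooled)
rate = n_events / time_at_risk
print(f"{n_events} events over {time_at_risk:.0f} days, rate = {rate:.3f} per day")
```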

However, if there are variables that are changing within the intervals, that is, time-varying explanatory variables, the probability distribution can no longer easily be modelled directly. Instead, one must work with the intensity function, as for the nonhomogeneous Poisson processes of Chapter 5.

In even more complex situations, an event signals a change of state of the subject, so that there will be a number of different intensity functions, one for each possible change of state. This is known as a semi-Markov or Markov renewal process.
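The idea of one intensity per change of state can be sketched in its simplest special case, where each intensity is assumed constant, so that it is estimated by the number of transitions divided by the time spent in the origin state. A semi-Markov model would let each intensity depend on the time since entry to the state; that refinement is not shown. The states ("well", "ill"), the histories, and the function name are hypothetical.

```python
from collections import defaultdict


def transition_intensities(histories):
    """Estimate one constant intensity per observed change of state.

    histories: list of paths [(state, entry_time), ..., ("END", end_time)],
    where "END" marks the end of observation rather than a real transition.
    """
    counts = defaultdict(int)
    exposure = defaultdict(float)
    for path in histories:
        for (state, start), (next_state, stop) in zip(path, path[1:]):
            exposure[state] += stop - start
            if next_state != "END":
                counts[(state, next_state)] += 1
    return {pair: n / exposure[pair[0]] for pair, n in counts.items()}


# Two hypothetical subjects, each observed for 10 time units.
histories = [
    [("well", 0.0), ("ill", 4.0), ("well", 6.0), ("END", 10.0)],
    [("well", 0.0), ("ill", 8.0), ("END", 10.0)],
]
rates = transition_intensities(histories)
for (origin, destination), r in sorted(rates.items()):
    print(f"{origin} -> {destination}: {r:.3f}")
```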

7.1 Event Histories and Survival Distributions

An event history follows an individual over time, recording the times of occurrence of events. As we have seen, survival data are a special case, where the first event is absorbing so that the process stops. If the successive intervals in an event history are independent, they may simply be modelled as survival distributions. Then, we have a renewal process, so that things start over “as new” at each event. Usually, this will not be the case, because we are interested in modelling the evolution of each individual over time.

As we have seen in Chapter 6, survival data can be modelled equivalently by the probability density, the survival function, or the intensity function.
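The equivalence of the three descriptions can be made explicit. The symbols f(t) for the density and S(t) for the survival function are notation assumed here for illustration; h(t) is the intensity, as in the rest of the chapter.

```latex
\begin{align*}
S(t) &= \int_t^\infty f(u)\,du, &
h(t) &= \frac{f(t)}{S(t)} = -\frac{d}{dt}\log S(t), \\
S(t) &= \exp\!\left(-\int_0^t h(u)\,du\right), &
f(t) &= h(t)\,S(t).
\end{align*}
```

Any one of the three determines the other two. For example, a constant intensity h(t) = λ gives S(t) = exp(−λt) and f(t) = λ exp(−λt), the exponential distribution.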

Common survival distributions include the exponential, gamma, Weibull, extreme value, log normal, inverse Gaussian, and log logistic. Two important families of models are the proportional hazards, discussed in Chapter 6, with

$$h(t; \alpha, \beta) = h_0(t; \alpha)\, g^{-1}(\beta)$$

where g(·) is a link function, usually the log link, giving Equation (6.2), and the accelerated lifetime models with

$$h(t; \beta) = h_0(t e^{-\beta})\, e^{-\beta}$$

Both of these model the intensity, instead of the probability density, although they cover some of the densities mentioned above. The most famous example is the Cox proportional hazards model. In the present context, this is often called a multiplicative intensities model.

When the probability density is modelled directly, all conditions describing the subjects must be assumed constant within the complete duration until an event occurs, because that total duration is the response variable. Thus, even in the case of survival data, time-varying explanatory variables cannot easily be modelled by the density function. The solution is to allow the intensity function to vary over time and to model it directly: a nonhomogeneous Poisson process. This may, however, render certain effects of interest difficult to interpret. For example, what does a difference in treatment effect in a randomized clinical trial really mean if time-varying variables are changing in different ways in the treatment groups after randomization?
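One common way to model the intensity directly is to split each subject's follow-up into episodes within which the explanatory variable is constant. The sketch below assumes a log link and a constant baseline, so that the intensity in an episode is exp(beta0 + beta1 x); each episode then contributes (event × log intensity − intensity × length) to the log likelihood. The episode data, parameter names, and function name are hypothetical.

```python
import math


def log_likelihood(episodes, beta0, beta1):
    """Log likelihood for an intensity that is constant within each episode.

    episodes: tuples (start, stop, x, event), with x fixed on [start, stop)
    and event = 1 if the episode ends with an event, 0 otherwise.
    """
    total = 0.0
    for start, stop, x, event in episodes:
        log_h = beta0 + beta1 * x
        total += event * log_h - math.exp(log_h) * (stop - start)
    return total


episodes = [
    (0.0, 5.0, 0, 0),  # subject 1: covariate off, no event in this episode
    (5.0, 9.0, 1, 1),  # subject 1: covariate switched on, event at t = 9
    (0.0, 7.0, 0, 1),  # subject 2: covariate off throughout, event at t = 7
]

# With beta1 = 0 the maximum is at exp(beta0) = events / time at risk = 2 / 16.
best = log_likelihood(episodes, math.log(2 / 16), 0.0)
print(best, log_likelihood(episodes, 0.0, 0.0))
```

Because the covariate enters episode by episode, subject 1 contributes to both the x = 0 and x = 1 exposure, which is not possible when the total duration is the response.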

In fact, except for the exponential distribution, the intensity function does change, in any case, over time, but only as a strict function of time since the beginning of the period. Thus, for example, for the Weibull distribution, the intensity function can be written

$$h(t; \alpha, \beta) = t^{\alpha-1} g^{-1}(\beta)$$

a member of both families mentioned above. However, we now would like to introduce observed explanatory variables that may change over time, even in between events.
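That the Weibull intensity belongs to both families can be checked numerically. The sketch below assumes a log link and a baseline written as α t^(α−1), which includes a constant factor α that the expression above leaves absorbed in the other term; the function names are hypothetical. Under these assumptions, an accelerated lifetime effect β is the same as a proportional hazards effect of −αβ.

```python
import math


def baseline(t, alpha):
    """Weibull baseline intensity with shape alpha (assumed parameterization)."""
    return alpha * t ** (alpha - 1)


def accelerated(t, alpha, beta):
    """Accelerated lifetime form: h0(t * exp(-beta)) * exp(-beta)."""
    return baseline(t * math.exp(-beta), alpha) * math.exp(-beta)


def proportional(t, alpha, coef):
    """Proportional hazards form with log link: h0(t) * exp(coef)."""
    return baseline(t, alpha) * math.exp(coef)


alpha, beta = 1.5, 0.4
for t in (0.5, 1.0, 3.0):
    # Same intensity at every t, so the two formulations coincide.
    assert math.isclose(accelerated(t, alpha, beta),
                        proportional(t, alpha, -alpha * beta))
```

For α = 1 the intensity no longer depends on t, which recovers the exponential case mentioned above.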

From the document Applying Generalized Linear Models (pages 129-134)