Directional Dependence - Applying Generalized Linear Models


8.1.1 Directional Dependence

One of the simplest types of spatial data is the equivalent of a discrete time point process (Section 5.1.1): a record of presence or absence of some event at each point of a regular lattice. An important question that does not occur with time series data is whether dependence is directional. This can be studied by constructing a model whereby the probability of an event at each given point depends on the presence or absence of a neighbour in possibly different ways in each direction. This could generally involve a simple logistic model conditional on the nearest neighbours, a Markov process. If we are to construct a model for dependence on nearest neighbours, we lose the boundary cells because we do not have information on all of their neighbours (just as we lost the first observation of a time series). However, if all remaining observations are used, there will be a “circular” effect from the illegitimate decomposition of the multivariate distribution. Each point would be used too many times because of the reciprocal dependencies. One possible remedy, with substantial loss of information, is to use only every other point (Besag, 1974).

One common multivariate model with dependence on neighbours in a lattice is the Ising model:

$$\Pr(y;\alpha,\beta) = c(\alpha,\beta)\exp\left[\sum_i\left(y_i\alpha_i + \sum_{j\neq i}\beta_{ij}y_iy_j\right)\right] \tag{8.1}$$

where i indexes each position and j its neighbours and c(·) is the normalizing constant, generally an intractable function of the parameters. This distribution is a member of the exponential family. To be estimable, the model must have β_ij = 0 for j outside some neighbourhood of i.

For binary data, this is a rather peculiar model because the probability of all points with y_i = 0 is the same regardless of the configuration of their neighbours. Generally, the model is simplified by assuming stationarity, that is, that the parameters do not vary with i:

$$\Pr(y;\alpha,\beta) = c(\alpha,\beta)\exp\left[\sum_i\left(y_i\alpha + \sum_{j\neq i}\beta_{j}y_iy_j\right)\right]$$

Usually, either four or eight neighbours are used so that β has this dimension. The original Ising model held β_j constant so that the probability just depends on the total number of neighbours and not on where they are located.
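As a minimal sketch of how the exponent of the stationary model can be evaluated, the following function sums y_i(α + Σ β_j y_j) over a lattice. The function name, the representation of β as a mapping from direction offsets to parameters, the skipping of neighbours that fall outside the grid, and the toy grid are all assumptions made for illustration; the normalizing constant c(α, β) is not computed.

```python
def ising_exponent(grid, alpha, beta):
    """Exponent of the stationary Ising model: sum_i y_i (alpha + sum_j beta_j y_j).

    `beta` maps a direction offset (d_row, d_col) to its parameter.
    Neighbours outside the grid are skipped (an assumption of this sketch).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    total = 0.0
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] == 0:
                continue  # y_i = 0: this point contributes nothing
            total += alpha
            for (dr, dc), b in beta.items():
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    total += b * grid[rr][cc]
    return total


# Hypothetical parameters and grid, for illustration only.
beta = {(-1, 0): 0.5, (0, 1): 0.5, (1, 0): 0.5, (0, -1): 0.5}  # N, E, S, W
toy = [[1, 1],
       [0, 1]]
value = ising_exponent(toy, alpha=-1.0, beta=beta)
```

The early `continue` makes the peculiarity noted above visible: a point with y_i = 0 adds nothing to the exponent, whatever its neighbours are.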

In spite of the complex normalizing constant, this model can be estimated by the techniques of Chapter 3 (see also Lindsey, 1995b, pp. 129–132).

For example, the simplest model, for total number of neighbours, can be estimated from a cross-tabulation of an indicator of presence at each point by number of neighbours. Then, the Poisson regression has the model

C+C·NUMBER

where C is the binary indicator for each point. The one with four neighbours would require tabulation of the 2⁵ table of all possible combinations of presence at a point with these neighbours, giving the Poisson regression,

C+C·(N+E+S+W)

where the four letters are binary indicators for a neighbour in the four directions. Notice how the main effects for number or for the directions are not included in the model. If they were, we would have a multivariate “distribution” for each point and its neighbours whereby each point would be included five times.
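The cross-tabulation of presence by number of neighbours can be sketched as follows. This is only an illustration under stated assumptions: the grid is held as a list of rows of 0/1 values, only the four nearest neighbours are counted, boundary cells are dropped, and the small grid is hypothetical (it is not the data of any table in this section).

```python
from collections import Counter


def neighbour_table(grid):
    """Cross-tabulate presence (0/1) by number of occupied N, E, S, W neighbours.

    Only interior cells are used, because boundary cells lack a full
    set of neighbours.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    counts = Counter()
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            number = (grid[r - 1][c] + grid[r + 1][c]
                      + grid[r][c - 1] + grid[r][c + 1])
            counts[(grid[r][c], number)] += 1
    return counts


# Hypothetical 4x4 grid, for illustration only.
toy = [
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
table = neighbour_table(toy)  # keys are (presence, number of neighbours)
```

Replacing the key by the tuple of the point and its four individual neighbours would give the 2⁵ table needed for the directional version.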

TABLE 8.1. Grid indicating presence of Carex arenaria. (Strauss, 1992, from Bartlett)

0 1 1 1 0 1 1 1 1 1 0 0 0 1 0 0 1 0 0 1 1 0 1 0 0 1 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 1 1 1 0 0 0 1 1 1 1 1 0 1 1 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 0 0 0 0 1 1 1 0 1 1 0 0 0 0 0 0 1 0 0 0 1 0 1 0 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 0 1 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 1 0 1 0 1 1 1 1 1 1 1 0 0 0 0 1 1 0 1 0 0 1 1 0 0 0 0 0 0 0 1 1 0 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 0 0 1 1 0 0 0 0 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 1 1 0 1 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 1 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 1 1 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0

Example

Consider a 24×24 grid indicating the presence or absence of the plant, Carex arenaria, as shown in Table 8.1. The influence of neighbours can be summarized in a contingency table giving their numbers, as in Table 8.2. The plants obviously tend to have neighbours!

None of the Ising models described above fit well for the obvious reason, from Table 8.2, that the probability of zero is not constant for differing numbers of neighbours.

TABLE 8.2. Numbers of neighbours for the plant data of Table 8.1.

            Number of neighbours
Presence      0     1     2     3     4
0           128   105    77    23     6
1            20    49    46    25     5

Because of the intractable normalizing constant, this multivariate model has generally not been used. Instead, conditional models are applied, with the difficulties mentioned above. A simple conditional logistic (Ising) model has each lattice point depending on its four neighbours:

N+E+S+W

In terms of the Poisson regression above, this is

C∗(N+E+S+W)

so that, in comparison to the multivariate Ising model above, it has the main effects for the neighbours (illegitimately) included. It does not correspond to the correct likelihood because of the circular effect mentioned above.
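Written out, the conditional logistic model N+E+S+W can be read as the linear predictor below. The symbols y_N, y_E, y_S, y_W for the neighbour indicators and β_N, β_E, β_S, β_W for their coefficients are notation introduced here, and an intercept α is assumed, as is usual for such a model formula.

$$\operatorname{logit}\Pr(y_i = 1 \mid y_N, y_E, y_S, y_W) = \alpha + \beta_N y_N + \beta_E y_E + \beta_S y_S + \beta_W y_W$$

Equal values of the four coefficients would mean that only the total number of neighbours matters; unequal values indicate directional dependence.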

This approach, which has been called “pseudo-likelihood”, may be misleading because of the misspecification of the multivariate distribution and, hence, of the likelihood function. The analysis should be repeated with every other point, once using only odd points and again only even points. Each of these analyses thus eliminates the dependency but loses information.
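One way to form the two subsets is sketched below. The text does not say how odd and even points are defined; this sketch assumes a checkerboard split by the parity of row index plus column index, applied to interior cells only. The function name is hypothetical.

```python
def coding_sets(n_rows, n_cols):
    """Split interior cells into 'odd' and 'even' sets by parity of row + column.

    The checkerboard rule is an assumption of this sketch.
    """
    odd, even = [], []
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            if (r + c) % 2:
                odd.append((r, c))
            else:
                even.append((r, c))
    return odd, even


# A 24x24 lattice has 22 x 22 = 484 interior cells.
odd, even = coding_sets(24, 24)
```

Under this split, the four nearest neighbours of any point always lie in the other subset, so no point in a subset serves as both response and neighbour within the same analysis.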

Example

For our plant data, using all of the points, this model has an AIC of 569.4. The north and south parameters, of similar size, are slightly larger than the east and west ones, which are also of similar size. We can also add dependence on diagonal neighbours:

N+E+S+W+NE+SE+SW+NW

for an AIC of 566.2. Here, the north–west and south–east estimates are slightly smaller, although all are of similar size. Elimination of north–west reduces the AIC to 565.7, but no other parameter can be removed. The latter is most likely an artifact, because, if we remove two rows around the edge, its elimination is no longer possible. Adding dependence on points two steps away does not reduce the AIC.

We repeat the analysis only using odd or even points. The four and eight directional AICs are summarized in the first panel of Table 8.3. We see, indeed, that the two half-models are not independent: if they were, their deviances (not their AICs) would add to give that for the complete set of points.
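The arithmetic behind this remark can be sketched for the four-direction model, using the AICs reported in Table 8.3. It is assumed here that the AIC is the deviance D plus twice the number of parameters p, with the same p in each of the three fits; the definition is not restated in this section.

$$
\begin{aligned}
D_{\text{odd}} + D_{\text{even}} &= (276.3 - 2p) + (281.4 - 2p) = 557.7 - 4p,\\
D_{\text{all}} &= 569.4 - 2p,\\
D_{\text{all}} - (D_{\text{odd}} + D_{\text{even}}) &= 11.7 + 2p > 0.
\end{aligned}
$$

Under this assumption, the difference is positive for any number of parameters, so the two half-model deviances cannot add to that for all of the points.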

The parameter values are considerably different for the three analyses. Consider, for example, those for the four main directions, as shown in Table 8.4. (These values remain similar when the four diagonal directions are introduced.) Evidently, there is some east–west relationship that is not detected when all of the data are used. The negative value indicates some form of repulsion.

TABLE 8.3. AICs for spatial models for the plant data of Table 8.1.

Neighbours       All     Odd    Even
Directions
  Four         569.4   276.3   281.4
  Eight        566.2   279.0   277.1
Total
  Four         563.6   281.6   285.8
  Four + four  554.5   279.6   280.3
  Eight        552.7   278.4   278.3

TABLE 8.4. Directional parameters for the plant data of Table 8.1.

Direction    All      Odd     Even
North      0.564    0.545    0.540
South      0.606    0.919    0.525
East       0.481   −0.311    1.200
West       0.475    1.169   −0.264

The logistic model looks at the presence or absence of the event. However, more than one event may have occurred at a point. If the counts are available, the Poisson log linear model can be used. Instead, suppose that we only have the binary indicator. The Poisson probability of no event is exp(−µ), so that the probability of one or more events is 1 − exp(−µ). If µ in a Poisson regression model has a log link, this latter expression corresponds, for binary data, to a complementary log log link.
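The correspondence can be checked numerically with a short sketch. The function names and the value of the linear predictor are arbitrary choices for illustration: starting from a linear predictor η with µ = exp(η), applying log(−log(1 − π)) to π = 1 − exp(−µ) should return η.

```python
import math


def poisson_prob_positive(eta):
    """Pr(one or more events) when the Poisson mean is mu = exp(eta) (log link)."""
    mu = math.exp(eta)
    return 1.0 - math.exp(-mu)


def cloglog(p):
    """Complementary log-log transform: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))


eta = -0.3                      # arbitrary linear predictor value
p = poisson_prob_positive(eta)  # probability of presence
recovered = cloglog(p)          # equals eta up to rounding error
```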

Example

For our grid of plants, replacing the logit by the complementary log log link yields a slightly poorer model, with a marginally larger AIC. Apparently, presence of plants, and not their number at each point, is important in this plot.
