A Latent Approach to the Statistical Analysis of Space-time Data
Dani Gamerman
Instituto de Matemática
Universidade Federal do Rio de Janeiro Brasil
http://acd.ufrj.br/~dani
17th International Workshop on Statistical Modelling Chania, Crete, Greece, 8-12 July 2002
World Cup Algorithm
1990, 1974: Played in Europe
2006, 1958: Played in Europe (2006: ?)
2002, 1962: Played in exotic place
1970, 1994: Played in (L) America
1986, 1978: Played in L. America
1966, 1998: One-time only home win, Europe, 25 miles apart
1982: Center of the world football
Joint work with
Marina S. Paez (IM-UFRJ) Flavia Landim (IM-UFRJ) Victor de Oliveira (Arkansas)
Alan Gelfand (Connecticut) Sudipto Banerjee (Minnesota)
Introduction
Environmental science: data in the form of a collection of time series that are geographically referenced.
Examples:
1) measurements of pollutants in time over a set of monitoring stations
Some examples can be found in other areas:
2) selling price of properties around a neighborhood of interest
3) counts of morbidity/mortality events in time over a collection of geographic regions
Main Objective: spatial interpolation
Example: Pollution in Rio de Janeiro
Paez, M.S. and Gamerman, D. (2001). Technical report. Statistical Laboratory, UFRJ.
Example: Pollution in Rio de Janeiro
$\Pr(\mathrm{PM}_{10} > 100\ \mu g/m^3 \mid Y_{obs})$
Other features of interest can be obtained; the previous picture showed mean interpolated values.
Spatial Interpolation
m = number of observations
g = number of grid points
$s_1, \ldots, s_m$ = observed sites
$s_1^n, \ldots, s_g^n$ = grid points (to interpolate)
$Y_1^n, \ldots, Y_g^n$ = observations in the grid points
$p(Y^n \mid Y_{obs}) = \int p(Y^n \mid Y_{obs}, \theta)\, p(\theta \mid Y_{obs})\, d\theta$
$\theta$ - all model parameters
$Y_{mis}$ - missing data, treated as parameters
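Interpolation $p(Y^n \mid Y_{obs})$
• Obtain $p(Y^n \mid Y_{obs})$ by simulation.
Steps to generate from $Y^n \mid Y_{obs}$:
1. Frequentist inference: generate $Y^n$ from $p(Y^n \mid Y_{obs}, \theta^*)$, with $\theta^*$ a point estimate of $\theta$
If $\theta = \theta^*$ with probability 1 then $p(Y^n \mid Y_{obs}) = p(Y^n \mid Y_{obs}, \theta^*)$
2. Bayesian inference:
i) generate $\theta$ from $p(\theta \mid Y_{obs})$
ii) generate $Y^n$ from $p(Y^n \mid Y_{obs}, \theta)$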
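Gaussian Process (GP)
(or Gaussian Random Field)
S - region of $R^p$ (in general, p=2)
$\{ w(s) : s \in S \}$ is a GP if, $\forall\, n$ and $s_1, \ldots, s_n \in S$,
$( w(s_1), \ldots, w(s_n) ) \sim N_n(\mu, \Sigma)$
where $\mu = (\mu_1, \ldots, \mu_n)$ with $\mu_i = E[w(s_i)]$ and $\Sigma = ( \sigma_i \sigma_j\, \rho[w(s_i), w(s_j)] )_{i,j}$
Usual simplifications:
1) Isotropy: $\rho[w(s_i), w(s_j)] = \rho(h_{ij})$ with $h_{ij} = |s_i - s_j|$
2) Homoscedasticity: $\sigma_i = \sigma$, $\forall\, i$
Notation: $w(.) \sim GP(\mu(.), \sigma^2, \rho)$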
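The definition above can be made concrete with a small simulation. The following is a minimal sketch, assuming a constant mean, homoscedasticity and an isotropic exponential correlation $\rho(h) = \exp(-\lambda h)$; the function names, the 5 x 5 grid and the parameter values are illustrative choices, not taken from the slides.

```python
import numpy as np

def exp_corr_matrix(sites, lam):
    """Exponential correlation matrix R[i, j] = exp(-lam * |s_i - s_j|)."""
    diff = sites[:, None, :] - sites[None, :, :]
    h = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances h_ij
    return np.exp(-lam * h)

def simulate_gp(sites, mu, sigma2, lam, rng):
    """Draw (w(s_1), ..., w(s_n)) ~ N_n(mu * 1, sigma2 * R)."""
    n = sites.shape[0]
    cov = sigma2 * exp_corr_matrix(sites, lam)
    # a small jitter keeps the Cholesky factorisation numerically stable
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(n))
    return mu + chol @ rng.standard_normal(n)

rng = np.random.default_rng(0)
g = np.linspace(0.1, 0.9, 5)                   # illustrative regular grid
sites = np.array([(a, b) for a in g for b in g])
w = simulate_gp(sites, mu=0.0, sigma2=1.0, lam=0.8, rng=rng)
```

Because the process is only ever evaluated at finitely many sites, one multivariate normal draw is all that is needed; larger values of `lam` give rougher surfaces.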
Statistical Analysis
Starting point: regression models
$Y_t(s) = \mu_t(s) + e_t(s)$ where
$\mu_t(s) = \beta_0 + \beta_1 X_{t1}(s) + \ldots + \beta_p X_{tp}(s)$ and $e_t(s) \sim N(0, \sigma_e^2)$ independent
Suppose that the $X_{tj}(s)$ handle temporal autocorrelation. Otherwise, we can include a temporal component.
Usually $e_t(s)$ remains spatially correlated. In this case, $e_t(s) = e_0(s) + e_{t1}(s)$
$e_0(s)$: spatially correlated errors
$e_{t1}(s)$: pure residual (white noise)
$\beta_0(s) = \beta_0 + e_0(s)$
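Inference
• How to estimate $\beta_0(s)$?
Traditional approach: geostatistical
$\beta_0(.) \sim GP(\beta_0, \sigma^2, \rho)$ or $e_0(.) = \beta_0(.) - \beta_0 \sim GP(0, \sigma^2, \rho)$
then, $\beta_0^{obs} \sim N(\beta_0 \mathbf{1}, \sigma^2 R)$
$\beta_0^{obs} = (\beta_0(s_1), \ldots, \beta_0(s_m))$
Hyperparameters: $\sigma_e^2$, $\sigma^2$ and $\lambda_0$
1) At first (3 steps):
(a) $\beta_0, \beta_1, \ldots, \beta_p$ estimated in the regression model and the residuals $r_{t0}(s) = Y_t(s) - \hat\mu_t(s)$ are constructed
(b) $\sigma_e^2$, $\sigma^2$ and $\lambda_0$ estimated from $r_{t0}(s)$
(c) Inference based on $\hat\sigma_e$, $\hat\sigma$ and $\hat\lambda_0$
Problems:
(a) $r_{t0}(s) \neq e_t(s)$
(b) $(\hat\sigma_e, \hat\sigma, \hat\lambda_0) \neq (\sigma_e, \sigma, \lambda_0)$
2) Next step:
• $\beta_0, \beta_1, \ldots, \beta_p$ and $(\sigma_e, \sigma, \lambda_0)$ estimated jointly solves (a)
• but to incorporate uncertainty about $(\hat\sigma_e, \hat\sigma, \hat\lambda_0)$ is complicated
3) Natural solution (Kitanidis, 1986; Handcock & Stein, 1993):
• specify distribution for $(\sigma_e, \sigma, \lambda_0)$
• perform Bayesian inference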
Model generalization
Recall: $E[Y_t(s)] = \beta_0(s) + \beta_1 X_{t1}(s) + \ldots + \beta_p X_{tp}(s)$
Spatial heterogeneity doesn't have to be restricted to $\beta_0$
Example: site by site effect of temperature in the Rio pollution data
Extension of the previous model:
$E[Y_t(s)] = \beta_0(s) + \beta_1(s) X_{t1}(s) + \ldots + \beta_p(s) X_{tp}(s)$
previous model:
$E[Y_t(s)] = \beta_0(s) + \beta_1 X_{t1}(s) + \ldots + \beta_p X_{tp}(s)$
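We can accommodate spatial variation for other coefficients $\beta_j$, j=1, ... , p.
$\beta(.) = (\beta_0(.), \beta_1(.), \ldots, \beta_p(.))$
One possibility: $\beta(.) \sim GP(\mu, \Sigma, \rho)$
Hyperparameters: $\theta = (\mu, \sigma_e^2, \Sigma, \lambda)$ where $\mu = (\mu_0, \mu_1, \ldots, \mu_p)$
Special cases for the $\beta_j(.)$'s:
a) prior independence: $\Sigma = diag(\sigma_0^2, \ldots, \sigma_p^2)$ and $\rho(.) = (\rho_0(.), \ldots, \rho_p(.))$
b) same spatial structure and prior correlation between the $\beta_j(.)$'s: $\rho_j(.) = \rho(.)$, $\forall\, j$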
How to estimate $\beta_j(s)$, j=0,1,...,p ?
1) classical solution (Oehlert, 1993; Solna & Switzer, 1996):
(a) $\beta_0(s), \beta_1(s), \ldots, \beta_p(s)$ estimated by $b_0(s), b_1(s), \ldots, b_p(s)$ in the local regression model
(b) $(\mu, \sigma_e, \Sigma, \lambda)$ estimated from the $b_j(s)$
(c) inference based on $(\hat\mu, \hat\sigma_e, \hat\Sigma, \hat\lambda)$
Problems (the same as before):
(a) $b_j(s) \neq \beta_j(s)$
(b) $(\hat\mu, \hat\sigma_e, \hat\Sigma, \hat\lambda) \neq (\mu, \sigma_e, \Sigma, \lambda)$
2) natural solutions:
Specify prior distribution for $\theta$
In general, independent and non-informative priors are used
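Model Summary
Parameters: $\beta_{obs}$, $\theta$ where $\theta = (\mu, \sigma_e^2, \Sigma, \lambda)$
$\beta_j^{obs} = (\beta_j(s_1), \ldots, \beta_j(s_m))$, j=0, 1, ... , p
$\beta_{obs} = (\beta_0^{obs}, \ldots, \beta_p^{obs})$
$\mu = (\mu_0, \mu_1, \ldots, \mu_p)$
Data: $Y_{obs} = (Y_1(s_1), \ldots, Y_T(s_m))$
$X_{obs} = (X_1(s_1), \ldots, X_T(s_m))$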
Simulated data
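$Y_t(s) = \mu_t(s) + \epsilon_t(s)$, t=1,...,30
$\mu_t(s) = \beta_0(s) + \beta_1(s) X_t(s)$
$\epsilon_t(s) \sim N(0, \sigma_e^2)$ independent with $\sigma_e^2 = 1$
$\beta_0 \sim N(\mu_0, \sigma_0^2, \rho_0(.))$
$\beta_1 \sim N(\mu_1, \sigma_1^2, \rho_1(.))$
$X_t(s) \sim N(\mu_2, \sigma_2^2, \rho_2(.))$, for all time t
Exponential correlation functions: $\rho_j(x) = \exp\{-\lambda_j x\}$
$\mu_0 = 100$, $\mu_1 = 5$, $\mu_2 = 0$
$\lambda_0 = 0.4$, $\lambda_1 = 0.8$, $\lambda_2 = 1.5$
$\sigma_0^2 = 0.1$, $\sigma_1^2 = 1$, $\sigma_2^2 = 0.333$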
[Figure: Simulated Data. Contour maps with panels labelled $\beta_0$, $\beta_1 X$ and Y; axes: Coordinate 1, Coordinate 2]
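The simulation design above can be sketched in a few lines. This is a hedged illustration, not the code used for the talk: it assumes the listed values are read as means (100, 5, 0), correlation decay rates (0.4, 0.8, 1.5) and variances (0.1, 1, 0.333), and it assumes a regular 5 x 5 grid on the unit square for the m=25 sites.

```python
import numpy as np

rng = np.random.default_rng(1)

# m = 25 sites on a regular grid over the unit square (layout assumed)
g = np.linspace(0.1, 0.9, 5)
sites = np.array([(a, b) for a in g for b in g])
m, T = sites.shape[0], 30

# pairwise distances between sites
h = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)

def draw_field(mean, var, lam, size=None):
    """Draw from N(mean * 1, var * R) with R = exp(-lam * h)."""
    cov = var * np.exp(-lam * h)
    return rng.multivariate_normal(np.full(m, mean), cov, size=size)

beta0 = draw_field(100.0, 0.1, 0.4)        # spatially varying intercept
beta1 = draw_field(5.0, 1.0, 0.8)          # spatially varying slope
X = draw_field(0.0, 0.333, 1.5, size=T)    # covariate, one independent field per time
eps = rng.normal(0.0, 1.0, size=(T, m))    # white noise with sigma_e^2 = 1

Y = beta0 + beta1 * X + eps                # T x m matrix of Y_t(s_i)
```

Row `t` of `Y` holds the observations at all sites for one time point, so `Y[29]` corresponds to the surface $Y_{30}(.)$ at the observed sites.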
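Inference
Parameters: $(\beta_{obs}, \theta)$, $\theta = (\mu, \sigma_e^2, \Sigma, \lambda)$
Likelihood: $L(\beta_{obs}, \theta) = p(Y_{obs} \mid \beta_{obs}, \sigma_e^2)$
Prior: $p(\beta_{obs}, \theta) = p(\beta_{obs} \mid \theta)\, p(\mu)\, p(\sigma_e^2)\, p(\Sigma)\, p(\lambda)$
Posterior: $\pi(\beta_{obs}, \theta) \propto L(\beta_{obs}, \theta)\, p(\beta_{obs}, \theta)$
• Many parameters
• Complicated functional form
• Solution by MCMC
Full Conditionals
(a) $[\beta_{obs} \mid rest]$ ~ Normal
(b) $[\mu \mid rest]$ ~ Normal
(c) $[\sigma_e^2 \mid rest] \sim [\sigma_e^2 \mid Y_{obs}, \beta_{obs}]$ ~ Inverse Gamma
(d) $[\Sigma \mid rest] \sim [\Sigma \mid \beta_{obs}, \mu, \lambda]$ ~ Inverse Wishart; use $\beta_j^{obs}$ as if they were data
(e) $[\lambda \mid rest] \propto \prod_j p(\beta_j^{obs} \mid \mu_j, \Sigma, \lambda)\, p(\lambda)$; hard to sample: Metropolis-Hastings; again, use $\beta_j^{obs}$ as data (geostatistical analysis)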
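The Metropolis-Hastings step for the correlation parameter can be sketched as follows. This is a minimal illustration under stated assumptions: a single coefficient process with $\beta^{obs} \sim N(\mu 1, \sigma^2 R_\lambda)$, exponential correlation, a Gamma(1, 1) prior on $\lambda$ and a random-walk proposal on the log scale. The prior, the proposal and all function names are illustrative choices, not taken from the slides.

```python
import numpy as np

def log_density_beta(beta_obs, mu, sigma2, lam, h):
    """log N(beta_obs | mu * 1, sigma2 * R_lam) with R_lam = exp(-lam * h)."""
    m = beta_obs.size
    cov = sigma2 * np.exp(-lam * h)
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(m))
    z = np.linalg.solve(chol, beta_obs - mu)
    logdet = 2.0 * np.log(np.diag(chol)).sum()
    return -0.5 * (m * np.log(2.0 * np.pi) + logdet + z @ z)

def log_prior_lam(lam, a=1.0, b=1.0):
    """Illustrative Gamma(a, b) prior on lam, up to an additive constant."""
    return (a - 1.0) * np.log(lam) - b * lam

def mh_step_lam(lam, beta_obs, mu, sigma2, h, rng, step=0.3):
    """One random-walk Metropolis-Hastings update of lam on the log scale."""
    prop = lam * np.exp(step * rng.standard_normal())
    log_ratio = (log_density_beta(beta_obs, mu, sigma2, prop, h)
                 + log_prior_lam(prop)
                 - log_density_beta(beta_obs, mu, sigma2, lam, h)
                 - log_prior_lam(lam)
                 + np.log(prop) - np.log(lam))   # Jacobian of the log-scale proposal
    if np.log(rng.uniform()) < log_ratio:
        return prop, True
    return lam, False

# usage: beta_obs plays the role of data, as in a geostatistical analysis
rng = np.random.default_rng(2)
g = np.linspace(0.1, 0.9, 5)
sites = np.array([(a, b) for a in g for b in g])
h = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
beta_obs = rng.multivariate_normal(np.full(25, 5.0), np.exp(-0.8 * h))
lam, draws = 1.0, []
for _ in range(200):
    lam, _accepted = mh_step_lam(lam, beta_obs, 5.0, 1.0, h, rng)
    draws.append(lam)
```

Within a full sampler this update would be one block of the cycle, alternating with the conjugate draws in steps (a) to (d).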
Results
(based on a regular grid of m=25 sites)
[Figure: histograms of the parameters, where $\phi_i = \sigma_i^{-2}$]
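Spatial Interpolation
Interpolation grid: $s_1^n, \ldots, s_g^n$
$\beta_j^n = (\beta_j(s_1^n), \ldots, \beta_j(s_g^n))$, j=0, 1, ... , p
$\beta^n = (\beta_0^n, \ldots, \beta_p^n)$
We need to obtain the interpolation of the $\beta_j$'s to interpolate $Y^n$
Spatial Interpolation of $\beta$'s
$\pi(\beta^n, \beta_{obs}, \theta \mid Y_{obs}) = \pi(\beta^n \mid \beta_{obs}, \theta, Y_{obs})\, \pi(\beta_{obs}, \theta \mid Y_{obs}) = \pi(\beta^n \mid \beta_{obs}, \theta)\, \pi(\beta_{obs}, \theta \mid Y_{obs})$
Simulation of $[\beta^n \mid Y_{obs}]$ in 2 steps:
(a) $[\beta_{obs}, \theta \mid Y_{obs}]$ using MCMC
(b) $[\beta^n \mid \beta_{obs}, \theta]$ using Multivariate Normal
Interpolation of Y's
$\pi(Y^n, \beta^n, \theta \mid Y_{obs}) = \pi(Y^n \mid \beta^n, \theta, Y_{obs})\, \pi(\beta^n, \theta \mid Y_{obs}) = \pi(Y^n \mid \beta^n, \theta)\, \pi(\beta^n, \theta \mid Y_{obs})$
Simulation of $[Y^n \mid Y_{obs}]$ also in 2 steps:
(a) $[\beta^n, \theta \mid Y_{obs}]$: MCMC and Spatial Interpolation
(b) $[Y^n \mid \beta^n, \theta]$ using Multivariate Normal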
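Step (b) for the coefficients is a draw from a conditional multivariate normal. The sketch below assumes a single coefficient process with constant mean `mu`, variance `sigma2` and exponential correlation with decay `lam`; the function names and the interpolation grid are illustrative assumptions.

```python
import numpy as np

def exp_corr(a, b, lam):
    """Exponential correlation between two sets of sites."""
    h = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-lam * h)

def conditional_moments(beta_obs, obs_sites, grid_sites, mu, sigma2, lam):
    """Mean and covariance of beta^n | beta_obs, theta."""
    R_oo = exp_corr(obs_sites, obs_sites, lam)
    R_no = exp_corr(grid_sites, obs_sites, lam)
    R_nn = exp_corr(grid_sites, grid_sites, lam)
    A = np.linalg.solve(R_oo, R_no.T).T            # R_no R_oo^{-1}
    mean = mu + A @ (beta_obs - mu)
    cov = sigma2 * (R_nn - A @ R_no.T)
    return mean, 0.5 * (cov + cov.T)               # symmetrise against round-off

rng = np.random.default_rng(3)
g = np.linspace(0.1, 0.9, 5)
obs_sites = np.array([(a, b) for a in g for b in g])      # observed sites
gg = np.linspace(0.0, 1.0, 4)
grid_sites = np.array([(a, b) for a in gg for b in gg])   # interpolation grid

mu, sigma2, lam = 5.0, 1.0, 0.8
beta_obs = rng.multivariate_normal(
    np.full(25, mu), sigma2 * exp_corr(obs_sites, obs_sites, lam))

mean_n, cov_n = conditional_moments(beta_obs, obs_sites, grid_sites, mu, sigma2, lam)
beta_n = rng.multivariate_normal(mean_n, cov_n)    # one draw of beta^n
```

Repeating this draw for every MCMC sample of $(\beta_{obs}, \theta)$ yields a sample from $[\beta^n \mid Y_{obs}]$, which propagates parameter uncertainty into the interpolation.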
[Figure: Simulated data, interpolation of $\beta_1$. Panels: simulated values, interpolated values; axes: Coordinate 1, Coordinate 2]
[Figure: Simulated data, interpolation of $Y_{30}(.)$. Panels: simulated values, interpolated values; axes: Coordinate 1, Coordinate 2]
Interpolation of X's
These interpolations assume that the interpolated covariates $X_j$ are available for j=1, ... , p
Otherwise, we must interpolate them
Model may be completed with
$X(.) \mid \mu_x, \sigma_x^2, \lambda_x \sim GP(\mu_x, \sigma_x^2, \rho_x(.))$
$\pi(X^n, \theta_x \mid Y_{obs}, X_{obs}) = \pi(X^n, \theta_x \mid X_{obs}) = \pi(X^n \mid \theta_x, X_{obs})\, \pi(\theta_x \mid X_{obs})$
Simulation of $[X^n \mid Y_{obs}, X_{obs}]$ in 2 steps:
(a) $[\theta_x \mid X_{obs}]$: MCMC
(b) $[X^n \mid \theta_x, X_{obs}]$ using Multivariate Normal
Results obtained by interpolating X
[Figure: histograms of the parameters]
Precisions less sparse than when X is known
[Figure: Interpolation of $X_{30}(.)$. Panels: simulated values, interpolated values; axes: Coordinate 1, Coordinate 2]
[Figure: Interpolation of $Y_{30}(.)$. Panels: known X, unknown X; axes: Coordinate 1, Coordinate 2]
Application to the pollution data
$Y_t(s) = \beta_0(s) + \beta_1(s)\, TEMP_t + \gamma' X_t + \epsilon_t(s)$
$Y_t(s)$ = square root of PM10 at site s and time t
$X_t$ = (MON, TUE, WED, THU, FRI, SAT)
$\epsilon_t(s)$ independent $N(0, \sigma^2)$
$\beta_0 \sim N(\mu_0, \sigma_0^2, \rho_0(.))$
$\beta_1 \sim N(\mu_1, \sigma_1^2, \rho_1(.))$
$\rho_i(.)$, i=0,1 are exponential correlation functions
• Now, the temperature coefficient varies in space
Results for the pollution data in Rio
[Figure: histograms of the hyperparameter samples, where $\phi_i = \sigma_i^{-2}$]
Interpolation of the coefficient
Prior for $\phi_1$: G(10, 10c), SSDE = 0.0637
Prior for $\phi_1$: G($10^{-3}$, $10^{-3}$), SSDE = 0.1444
where SSDE = $\frac{1}{m} \sum_{i=1}^{m} \{ E[\beta_1(s_i) \mid Y_{obs}] - \hat\beta_i^{ols} \}^2$
c obtained by exploratory analysis site by site (OLS)
Same idea can be used for $\lambda$ (exploratory geostatistical analysis)
We can also accommodate temporal variation of the coefficients $\beta_j$, j=0,...,p.
Extension of the previous model:
$Y_t(s) = \mu_t(s) + e_t(s)$ where
$\mu_t(s) = \beta_{t0}(s) + \beta_{t1}(s) X_{t1}(s) + \ldots + \beta_{tp}(s) X_{tp}(s)$, $e_t(s) \sim N(0, \sigma_e^2)$ independent
previous model:
$Y_t(s) = \mu_t(s) + e_t(s)$ where
$\mu_t(s) = \beta_0(s) + \beta_1(s) X_{t1}(s) + \ldots + \beta_p(s) X_{tp}(s)$, $e_t(s) \sim N(0, \sigma_e^2)$ independent
Natural specification
$\beta_t(.) \mid \mu_t \sim GP(\mu_t, \Sigma, \rho)$, independent in time
The model must be completed with:
(a) prior for $(\sigma_e, \Sigma, \lambda)$ as before
(b) specification of the temporal evolution of the $\mu_t$'s
Suggestion - use dynamic models (SVP/TVM) (Landim & Gamerman, 2000)
$\mu_t \mid \mu_{t-1} \sim N(G_t \mu_{t-1}, W_t)$
Model parameters: $\beta_{obs}$, $\mu$, $\theta$
$\theta = (\mu_0, \sigma_e^2, \Sigma, \lambda, W)$, with W the unknown parameters of the evolution, where
$\mu = (\mu_1, \ldots, \mu_T)$ and
$\mu_t = (\mu_{t0}, \mu_{t1}, \ldots, \mu_{tp})$, t=1, ... , T
Simulation cycle has 2 changes:
I) additional step to sample $\mu$
II) modified step to sample $\theta$
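Application to simulated data
$Y_t(s) = \beta_{t0}(s) + \beta_{t1}(s) X_{t1}(s) + \epsilon_t(s)$
$\beta_t(.) \sim GP(\mu_t, \Sigma, \rho)$
$\mu_t = \mu_{t-1} + \omega_t$
same spatial correlation for $\beta_0$ and $\beta_1$
$\rho(.)$ exponential correlation function with $\lambda = 1$.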
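A forward simulation makes the structure of this model explicit. The sketch below is hedged: the slides fix only the exponential correlation with $\lambda = 1$, so the values of `Sigma`, `W`, the initial mean and `sigma_e`, the white-noise covariate and the reading of a shared spatial correlation as a Kronecker product are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
g = np.linspace(0.1, 0.9, 5)
sites = np.array([(a, b) for a in g for b in g])
m, T = sites.shape[0], 30
h = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
R = np.exp(-1.0 * h)                           # exponential correlation, lambda = 1

# illustrative values (assumptions, not from the slides)
Sigma = np.array([[0.1, 0.0], [0.0, 1.0]])     # covariance between beta_t0(.) and beta_t1(.)
W = 0.05 * np.eye(2)                           # evolution covariance of the means
mu = np.array([100.0, 5.0])                    # starting value mu_0
sigma_e = 1.0

# same spatial correlation for both coefficients: covariance Sigma (x) R
chol = np.linalg.cholesky(np.kron(Sigma, R) + 1e-10 * np.eye(2 * m))
chol_W = np.linalg.cholesky(W)

Y = np.empty((T, m))
mu_path = np.empty((T, 2))
for t in range(T):
    mu = mu + chol_W @ rng.standard_normal(2)              # mu_t = mu_{t-1} + omega_t
    e = (chol @ rng.standard_normal(2 * m)).reshape(2, m)  # spatial deviations, new at each t
    beta_t = mu[:, None] + e                               # beta_t(.) ~ GP(mu_t, Sigma, rho)
    X_t = rng.standard_normal(m)                           # covariate (assumed white noise)
    Y[t] = beta_t[0] + beta_t[1] * X_t + sigma_e * rng.standard_normal(m)
    mu_path[t] = mu
```

Only the means `mu_path` carry memory across time here; the spatial deviations are redrawn independently at each step, which is what distinguishes this specification from one where the whole surface evolves.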
Histogram of the posterior of
Multivariate observations: Yt (s) = (Yt1 (s), Yt2 (s))
[Figure: trajectory of $\mu_t$, mean and credibility limits]
Interpolation
Samples from $y_t^n \mid y_{obs}$ are obtained through the algorithm below:
1. Sample from $\beta_t^{obs}, \theta \mid y_{obs}$ - through MCMC
2. Sample from $\beta_t^n \mid \beta_t^{obs}$ - through Gaussian process
3. Sample from $y_t^n \mid \beta_t^n$ - independent Normal draws
Once again, the $X_t^n$ must be known; otherwise, they will have to be interpolated.
Spatially- and time-varying parameters (STVP)
Another possibility: temporal evolution applied directly to the $\beta_t$ processes rather than to their means
$Y_t(s) = \beta_{t0}(s) + \beta_{t1}(s) X_{t1}(s) + \epsilon_t(s)$
$\beta_t(.) = \beta_{t-1}(.) + w_t(.)$
$w_t(.) \sim GP(\omega_t, \Sigma, \rho)$ independent in time
Completed with: $\beta_0(.) = \mu_0 \sim N(g_0, R)$
Marginal Prior: $\beta_t(.) \mid \mu_t \sim GP(\mu_t, t\Sigma, \rho)$
Not separable at the latent level, unlike SVP/TVM
SVP/TVM:
$cov[\beta_{t_1}(s_1), \beta_{t_2}(s_2)] = R + \min\{t_1, t_2\} W + I(t_1 = t_2)\, \Sigma\, \rho(|s_1 - s_2|)$
STVP:
$cov[\beta_{t_1}(s_1), \beta_{t_2}(s_2)] = R + \min\{t_1, t_2\} [ W + \Sigma\, \rho(|s_1 - s_2|) ]$
Computations
MCMC algorithm must explore the correlation structures: parameters are visited in blocks
(Landim and Gamerman, 2000; Frühwirth-Schnatter, 1994)
Prediction
Based on the forecast distribution of $Y_{T+h} \mid Y_{obs}$,
for $Y_{T+h} = (Y_{T+h}(s_1^f), \ldots, Y_{T+h}(s_F^f))$, and any collection $(s_1^f, \ldots, s_F^f)$
1. Sample from $\beta_T^{obs}, \theta \mid Y_{obs}$ - through MCMC
2. Sample from $\beta_{T+h}^f \mid \beta_T^{obs}$ - obtained by introduction of $\beta_{T+h}^{obs}$:
$\beta_T^{obs} \to \beta_{T+h}^{obs}$ by successive evolution of the process
$\beta_{T+h}^{obs} \to \beta_{T+h}^f$ by interpolation with Gaussian process
3. Sample from $Y_{T+h} \mid \beta_{T+h}^f$ - independent normal draws
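Time-varying locations
Assume locations $s_t = (s_{t1}, \ldots, s_{t n_t})$ at time t
$\beta_t^{obs}$ is an $n_t$-dimensional vector, t = 1,...,T
SVP/TVM: Easily adapted
STVP: requires introduction of $\tilde\beta_{t-1}^{obs}$ for updated locations
$p(\beta_t^{obs} \mid \beta_{t-1}^{obs}, \theta) = \int p(\beta_t^{obs}, \tilde\beta_{t-1}^{obs} \mid \beta_{t-1}^{obs}, \theta)\, d\tilde\beta_{t-1}^{obs}$
$= \int p(\beta_t^{obs} \mid \tilde\beta_{t-1}^{obs}, \theta)\, p(\tilde\beta_{t-1}^{obs} \mid \beta_{t-1}^{obs}, \theta)\, d\tilde\beta_{t-1}^{obs}$
Both densities in the integrand are multivariate normal
The convolution of these two densities can be shown to be normal and the required evolution equation for $\beta_t^{obs}$ can be obtained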
Non-Gaussian Observations
Two distinct types of non-normal data:
• Continuous:
Can be normalized after a suitable transformation g(y)
Example: Rio pollution data ($g(Y) = \sqrt{Y}$)
The transformation is estimated jointly with other model parameters (de Oliveira, Kedem and Short, 1997)
• Count data:
For example, in the Bernoulli or Poisson form
standard approach: $y_t(s) \sim EF(\mu_t(s))$
spatio-temporal modeling issues: similar
computations: harder
Non-Gaussian Evolution
Abrupt changes in the process: normality is not suitable
Robust alternative:
$w_t(.) \sim GP(\omega_t, \Sigma, \rho)$ is replaced by
$w_t(.) \mid \psi_t \sim GP(\omega_t, \psi_t^{-1} \Sigma, \rho)$ and $\psi_t \sim G(\nu_t, \nu_t)$, independent for t=1,...,T
Therefore, $w_t(.) \sim tP(\omega_t, \Sigma, \rho)$
The $\psi_t$'s control the magnitude of the evolution
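The scale-mixture construction can be sketched as a two-stage draw. This illustration assumes the common Gamma(nu/2, rate nu/2) mixing distribution, which gives a Student-t marginal with nu degrees of freedom; that parameterisation, the function name and the covariance used below are assumptions for illustration only.

```python
import numpy as np

def draw_robust_innovation(mean, cov, nu, rng):
    """Draw w | psi ~ N(mean, cov / psi) with psi ~ Gamma(nu / 2, rate = nu / 2)."""
    psi = rng.gamma(shape=nu / 2.0, scale=2.0 / nu)    # numpy uses scale = 1 / rate
    chol = np.linalg.cholesky(cov)
    w = mean + (chol @ rng.standard_normal(mean.size)) / np.sqrt(psi)
    return w, psi

rng = np.random.default_rng(5)
g = np.linspace(0.1, 0.9, 5)
sites = np.array([(a, b) for a in g for b in g])
h = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
cov = 0.5 * np.exp(-1.0 * h)                           # illustrative innovation covariance

innovations = [draw_robust_innovation(np.zeros(25), cov, nu=4.0, rng=rng)
               for _ in range(100)]
psis = np.array([psi for _, psi in innovations])
```

A small value of `psi` at a given time inflates the whole innovation surface, which is how the mixture accommodates an abrupt change without affecting the other time points.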
Final Comments
• More flexibility to accommodate variations in time and space.
• Static coefficient models: samples from the posterior were generated in the software BUGS, with interpolation made in FORTRAN
• Extension to accommodate anisotropic processes to some components of the model.
• Extensions to observations in the exponential family and estimation of the normalizing transformation.