FUNDAÇÃO GETULIO VARGAS
EPGE
Escola de Pós-Graduação em Economia
"TO COINTEGRATE OR NOT TO COINTEGRATE? THAT'S A TOPOLOGICAL QUESTION."
RENATO GALVÃO FLÔRES JUNIOR
(EPGE-FGV / UFRJ)
VENUE
Fundação Getulio Vargas
Praia de Botafogo, 190 - 10th floor - Auditorium
DATE
16/10/97 (Thursday)
TIME
16:00
How often do you cointegrate?
or
To cointegrate or not to cointegrate? That's a topological question.*
Renato G. Flôres Jr.
EPGE/FGV, Rio de Janeiro and Ecole de Commerce Solvay/ULB, Bruxelles
(Very) Preliminary version; please do not circulate.
September, 1997.
Abstract
We show that for any multivariate I(1) process which does not cointegrate, it is possible to find another process sufficiently close to it where cointegration applies. Closeness is defined in terms of the spectral density matrices of the respective processes in differences, i.e., a metric which takes into account only the information in the (centred) second moments. The result may explain why in practice cointegration is found a bit "too often". Examples developing this point and simulations giving an insight into the metric used are also presented.
* I am indebted to Clive Granger for discussions and for providing the model in Example 1.
1. Introduction.
This paper presents a topological result for the phenomenon of cointegration. It is shown that for any multivariate I(1) process which does not cointegrate, it is possible to find another process sufficiently close to it where cointegration applies. Closeness is defined in terms of the spectral density matrices of the respective processes in differences, i.e., a metric which takes into account only the information in the second moments. In spite of this, the result may explain why in practice cointegration is found a bit "too often".
The structure of the paper is as follows. In the next section we motivate the result with a series of non-trivial examples in the time domain. The main theorem and its interpretations are dealt with in section 3. Section 4 discusses further consequences of the result and presents a few simulations that shed some light on the behaviour implicit in the metric used. The final section concludes.
2. A cointegrating machine.
Flôres and Szafarz (1994) presented a general condition for two I(1) processes whose differences are moving averages to cointegrate. Considering the MA's $\{\Delta x_t\}$ and $\{\Delta y_t\}$, with representations
$$\Delta x_t = p(L)\,e_t , \qquad \Delta y_t = q(L)\,u_t ,$$
if the original processes $\{x_t\}$ and $\{y_t\}$ cointegrate, with vector $(1, c)$, it follows that
$$\Delta (x_t + c\,y_t) = p(L)\,e_t + c\,q(L)\,u_t = r^*(L)\,v^*_t , \qquad (1)$$
where $r^*(L)\,v^*_t$ is a Wold representation of $p(L)\,e_t + c\,q(L)\,u_t$.
From (1), one has
$$x_t + c\,y_t = \frac{r^*(L)}{1-L}\,v^*_t , \qquad (2)$$
so that the polynomial $r^*(L)$ must have a unit root, i.e., the combined process must be a non-invertible moving average.
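The step behind (2) can be spelled out in a short worked equation. The factor $s(L)$ below is a hypothetical name, not used in the original, for what remains of $r^*(L)$ once the unit root is taken out; initial conditions are ignored.

```latex
r^*(1) = 0
\;\Longrightarrow\;
r^*(L) = (1 - L)\, s(L)
\;\Longrightarrow\;
x_t + c\, y_t = \frac{(1 - L)\, s(L)}{1 - L}\, v^*_t = s(L)\, v^*_t .
```

For instance, with $r^*(L) = 1 - L$ one gets $s(L) = 1$ and $x_t + c\,y_t = v^*_t$, a white noise. If instead $r^*(1) \neq 0$, the factor $1/(1-L)$ does not cancel and the combination remains I(1).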
Stated in this generality it is perhaps difficult to fully grasp the implications of (2). However, a few special cases may offer interesting insights.
Example 1: The random-walk case.
Suppose $p(L) = q(L) = 1$; then $r^*(L)\,v^*_t = e_t + c\,u_t$ and it might seem that cointegration is impossible. It is, indeed, if the innovation processes $\{e_t\}$ and $\{u_t\}$ are completely independent (including all leads and lags), but not necessarily if, in spite of having zero contemporaneous covariance, they are cross-correlated at some lag. It is not very easy to produce examples of two random walks whose innovations, though contemporaneously uncorrelated, have some cross-links. The model below is one of them:
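Let
$$x_t = w_t + e_{1t} , \qquad y_t = w_t + e_{2t} , \qquad (3)$$
with $\Delta w_t = u_t + a\,u_{t-1}$, $-1 < a < 1$, and $\{u_t\}$, $\{e_{1t}\}$, $\{e_{2t}\}$ white noises, all completely independent, with variances $V_u$, $V_{e1}$, $V_{e2}$, respectively, satisfying
$$V_{e1} = V_{e2} = V_e = a\,V_u$$
(so that $a$ must in fact be positive).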
It follows that both original processes are random walks. Taking $\{x_t\}$ for instance, it is obviously I(1) and one has
$$\Delta x_t = u_t + a\,u_{t-1} + e_{1t} - e_{1,t-1} ,$$
so that
$$\mathrm{cov}(\Delta x_t, \Delta x_{t-1}) = \mathrm{cov}(u_{t-1}, a\,u_{t-1}) - \mathrm{cov}(e_{1,t-1}, e_{1,t-1}) = a V_u - V_e = 0 ,$$
the covariances at higher lags being obviously zero.
But, also, both processes are cointegrated, as
$$x_t - y_t = e_{1t} - e_{2t}$$
is a stationary process.
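A small simulation can make the two properties of model (3) tangible. The sketch below is only an illustration under stated assumptions: Gaussian noises, $a = 0.5$ and $V_u = 1$ (hence $V_e = aV_u = 0.5$), and hypothetical helper names that are not part of the original paper. It checks that the differences of $x_t$ show no lag-1 autocorrelation, while the spread $x_t - y_t$ has a bounded variance and its difference behaves like a non-invertible MA(1).

```python
import random


def simulate_model3(n, a=0.5, var_u=1.0, seed=0):
    """Simulate x_t = w_t + e1_t and y_t = w_t + e2_t, with dw_t = u_t + a*u_{t-1}.

    Assumption (illustration only): Gaussian noises and V_e1 = V_e2 = a * V_u.
    """
    rng = random.Random(seed)
    sd_u = var_u ** 0.5
    sd_e = (a * var_u) ** 0.5          # the restriction V_e = a * V_u
    w = 0.0
    u_prev = rng.gauss(0.0, sd_u)
    xs, ys = [], []
    for _ in range(n):
        u = rng.gauss(0.0, sd_u)
        w += u + a * u_prev            # common integrated part w_t
        u_prev = u
        xs.append(w + rng.gauss(0.0, sd_e))
        ys.append(w + rng.gauss(0.0, sd_e))
    return xs, ys


def diff(series):
    """First difference of a series."""
    return [cur - prev for prev, cur in zip(series, series[1:])]


def autocorr(series, lag):
    """Sample autocorrelation of a series at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((v - mean) ** 2 for v in series) / n
    ck = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)) / n
    return ck / c0


xs, ys = simulate_model3(50_000)
spread = [x - y for x, y in zip(xs, ys)]
mean_spread = sum(spread) / len(spread)

rho_dx = autocorr(diff(xs), 1)           # near 0: dx_t looks like white noise
rho_dspread = autocorr(diff(spread), 1)  # near -0.5: MA(1) with b = -1
var_spread = sum((s - mean_spread) ** 2 for s in spread) / len(spread)  # near 2 * V_e

print(round(rho_dx, 3), round(rho_dspread, 3), round(var_spread, 3))
```

A lag-1 autocorrelation of $-1/2$ is the value implied by an MA(1) with coefficient $-1$, which is the non-invertible case.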
To acquire a deeper insight into the example, linking it to the result mentioned at the beginning of this section, it is interesting to compute the lag polynomial $r^*(L)$. As both differences are represented by combinations of independent white noises up to one lag, the MA representation of any linear combination of them will be of order one. Given that $c = -1$, putting $r^*(L) = 1 + bL$, (1) becomes
$$\Delta (x_t - y_t) = e_{1t} - e_{1,t-1} - e_{2t} + e_{2,t-1} = (1 + bL)\,v^*_t .$$
Well-known properties of an MA(1) give the system
$$(1 + b^2)\,\mathrm{var}\,v^*_t = 4 V_e , \qquad b\,\mathrm{var}\,v^*_t = -2 V_e , \qquad (4)$$
so that, as $b \neq 0$, it implies that $b = -1$ and $r^*(L)$ is non-invertible, as expected.
Example 2: The cointegrating machine.
The gist of the previous example lies in the special arrangement $V_e = aV_u$, which makes for the two key properties of the processes in (3): they are random walks but their innovations have a non-zero correlation at lag 1. This is not evident at a first look at (3), and it may seem that the example is a bit too artificial. Indeed, processes $\{x_t\}$ and $\{y_t\}$ are very similar: they share the same common part and their innovations bear the same characteristics. However, based on this simple idea, a whole "machine" for generating cointegrated random walks may be created.
The key point is to focus on the Wold representation of a bivariate
stationary stochastic process with no deterministic component
(5)
To simplify matters, we shall suppose that all lag polynomials have degree one and that $\{e_{1t}\}$ and $\{e_{2t}\}$ are (completely) independent white noises with identical variances. A little algebra translates the conditions for each component of the integrated bivariate processes $\{(x_t, y_t)'\}$ (i.e., all those processes for which $(\Delta x_t, \Delta y_t)' = (x^*_t, y^*_t)'$) to be a (weak) random walk into:
(6)
A condition for the two innovations to be contemporaneously uncorrelated can also be derived:
(7)
As known from the previous discussion, cointegration needs the existence of a non-invertible linear combination of the innovations. Such a combination will be an MA(1); applying to it the same properties that led to (4), with $b = -1$ already, one arrives at the restriction (8):
The system (6)+(7)+(8) provides a mechanism for generating several
representations that will satisfy the three requirements. It is easy to verify that
is a possible solution. This implies that, given the family of bivariate processes
(9)
where $\{e_{1t}\}$ and $\{e_{2t}\}$ are independent white noises with the same variance, any integral of $\{(x^*_t, y^*_t)'\}$ will be a pair of random walks whose innovations are contemporaneously uncorrelated and which nevertheless cointegrate. In this sense, (9) is even more startling than (3), as in that case the innovations were contemporaneously correlated.
It is not very difficult to accept that the number of possible families like the one above is quite large. By allowing polynomials of higher degrees in (5), or by working with different variances for the white noises, a variety of examples can be constructed, satisfying even more stringent conditions, but always yielding cointegrated processes.
3. A general result.
In the previous section we produced a class of processes, in a bivariate setting, which allow for cointegration. This may suggest that processes of this sort might be somehow dense, which seems to contradict the usual intuition that cointegration is a "rare" or "zero measure" event, which only takes place under strong behavioural linkages, explained by economic theory. Of course, a density property will depend on which set, and under which topology, it is being verified. In dealing with multivariate stochastic processes, a natural space in which to work would be the space of trajectories. However, it suffices to look at the trajectories of cointegrated and non-cointegrated processes to have the feeling that this is perhaps not the adequate place in which to search for such a result.
We shall instead work with the spectral density matrix associated with the differenced process. Use of this object in the context of cointegration was previously made by Phillips and Ouliaris (1988). A proximity result in this space must, however, be read with two qualifications. First, as the spectral density is the variance of the spectral measure of the process,1 there is not a one-to-one correspondence between spectral densities and stationary processes. Secondly, to recover the original process, this stationary process has to be integrated again, which is not a one-to-one operation. This means that two "disaggregations" are performed to arrive at the space of ultimate interest. This also explains why a (topological) density result in terms of cointegration can be obtained in a certain space, without contradicting its rareness in the space of all I(1) n-variate processes.
That some aggregation is needed is more evident if one thinks of the previously mentioned space of trajectories. Indeed, there, the space is too "fine" to give what is desired, unless a coarse and uninteresting topology is used. That the particular sequence of aggregations in our case is nevertheless meaningful is what we try to convince the reader of in the next section. We pass now to the formal setting.
Let S be the space of (n+1)-dimensional stationary processes $\{X^*_t\}$ such that each component admits an invertible MA representation. To each $\{X^*_t\}$ one can associate the set $\{\Gamma_\tau ;\ \tau = 0, 1, 2, \ldots\}$ of autocovariance matrices. We shall use as the norm of a matrix $A$, denoted $\|A\|$, the highest absolute value of its cell entries. With this, a distance is defined for each pair $\{X^*_t\}$, $\{Y^*_t\}$ in S as:
$$d(\{X^*_t\}, \{Y^*_t\}) = \sup_{\tau} \| \Gamma^X_\tau - \Gamma^Y_\tau \| .$$
1 Or spectral representation of the process; see Anderson (1971), chapter 7.
With the aid of the spectral representation theorem one can prove that d is really a distance. The metric space (S,d) is not a vector space, however, as it is not closed under addition. The following is true:
Proposition. Let $S' \subset S$ be the (metric) subspace of (S,d) formed by those processes whose components admit at least one linear combination which is not invertible; then S' is dense in S.
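To get a feeling for the metric, the distance between two simple bivariate moving averages can be computed directly from their autocovariance matrices. The sketch below is an illustration under explicit assumptions: the distance is taken to be the supremum, over lags, of the largest absolute entry of the difference between autocovariance matrices; the driving white noise has identity covariance; and all function names are hypothetical. The second process is a hand-built MA(1) with unit variances, zero contemporaneous cross-covariance and cross-covariance 1/2 at lead and lag 1.

```python
import math


def ma_autocov(thetas, lag):
    """Gamma_lag = sum_j Theta_{j+lag} Theta_j' for X_t = sum_j Theta_j eps_{t-j}.

    Assumes eps_t is white noise with identity covariance matrix.
    """
    dim = len(thetas[0])
    gamma = [[0.0] * dim for _ in range(dim)]
    for j in range(len(thetas) - lag):
        lead, base = thetas[j + lag], thetas[j]
        for r in range(dim):
            for c in range(dim):
                gamma[r][c] += sum(lead[r][k] * base[c][k] for k in range(dim))
    return gamma


def distance(thetas_x, thetas_y):
    """Sup over lags of the max-abs entry of Gamma^X_lag - Gamma^Y_lag (assumed metric)."""
    max_lag = max(len(thetas_x), len(thetas_y))
    worst = 0.0
    for lag in range(max_lag):
        gx = ma_autocov(thetas_x, lag)
        gy = ma_autocov(thetas_y, lag)
        gap = max(abs(gx[r][c] - gy[r][c])
                  for r in range(len(gx)) for c in range(len(gx)))
        worst = max(worst, gap)
    return worst


s = math.sqrt(2) / 2
independent = [[[1.0, 0.0], [0.0, 1.0]]]        # two independent white noises
linked = [[[s, 0.0], [0.0, s]],                 # Theta_0
          [[0.0, s], [s, 0.0]]]                 # Theta_1: each series loads on the other's lagged shock

d = distance(independent, linked)
print(d)
```

Negative lags need not be scanned separately, since the autocovariance matrix at lag $-\tau$ is the transpose of the one at lag $\tau$ and the norm used is invariant under transposition.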
To prove the above Proposition one needs the
Lemma. Let $\{X^*_t\}$ be a process in S whose components admit k, $1 \le k \le n$, linear combinations which are not invertible; then its spectral density matrix at frequency $\omega = 0$, $\Sigma(0)$, has k zero eigenvalues and its entries satisfy the following system of k equations:
$$\varphi_i(\Sigma(0)) = 0 , \qquad 1 \le i \le k ,$$
where, for each i, $\varphi_i$ is the sum of the principal minors of order n+1-i of $\Sigma(0)$, multiplied by $(-1)^{i-1}$.
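The content of the Lemma can be checked numerically in the bivariate case. The sketch below assumes a vector MA driven by white noise with identity covariance, for which the spectral density at frequency zero is $\Theta(1)\Theta(1)'/(2\pi)$, with $\Theta(1)$ the sum of the MA coefficient matrices. The function names and the example process are hypothetical illustrations, not taken from the paper.

```python
import math


def spectral_density_at_zero(thetas):
    """Sigma(0) = Theta(1) Theta(1)' / (2*pi) for X_t = sum_j Theta_j eps_{t-j}.

    Assumes eps_t is white noise with identity covariance matrix.
    """
    dim = len(thetas[0])
    theta1 = [[sum(th[r][c] for th in thetas) for c in range(dim)] for r in range(dim)]
    return [[sum(theta1[r][k] * theta1[c][k] for k in range(dim)) / (2 * math.pi)
             for c in range(dim)] for r in range(dim)]


def eigenvalues_2x2_symmetric(m):
    """Closed-form eigenvalues (smallest first) of a symmetric 2x2 matrix."""
    half_trace = (m[0][0] + m[1][1]) / 2
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(half_trace ** 2 - det, 0.0))
    return half_trace - root, half_trace + root


s = math.sqrt(2) / 2
independent = [[[1.0, 0.0], [0.0, 1.0]]]
linked = [[[s, 0.0], [0.0, s]],      # Theta_0
          [[0.0, s], [s, 0.0]]]      # Theta_1

low_ind, high_ind = eigenvalues_2x2_symmetric(spectral_density_at_zero(independent))
low_lnk, high_lnk = eigenvalues_2x2_symmetric(spectral_density_at_zero(linked))
print(low_ind, high_ind)   # both equal to 1/(2*pi): full rank
print(low_lnk, high_lnk)   # one eigenvalue is zero: rank one
```

In the second process the difference of the two components is $(\sqrt{2}/2)(1-L)$ times the difference of the shocks, a non-invertible combination, and the zero eigenvalue is associated with the vector $(1, -1)'$.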
Theorem: Let $\{X_t\}$ be an (n+1)-dimensional I(1) process; then exactly one of the following two cases holds:
i) the components of $\{\Delta X_t\}$ admit k, $1 \le k \le n$, linear combinations which are not invertible;
or
ii) for every $\varepsilon > 0$, and $1 \le k \le n$, there is a process $\{X^{*k}_t\}$ in S' such that
$$d(\{\Delta X_t\}, \{X^{*k}_t\}) < \varepsilon .$$
Moreover, for a given k, this process can be chosen in such a way that all its marginal long-run variances, and all the covariances between its components (for all leads and lags) but k, are identical to those of $\{\Delta X_t\}$.
Proof: See the Appendix.
The Theorem says then that, given an I(1) process which is not cointegrated, i.e., such that $\{\Delta X_t\}$ is not in S', it is possible to find another I(1) process, sufficiently "close" to the first one, that is cointegrated. Closeness is understood in the sense that both differenced processes have the same marginal spectra and, with one exception, the same co-spectra; for the unequal co-spectrum, the corresponding covariances can be made as close as desired. This means that, for this co-spectrum, one may have to "add" non-zero covariances between the two components at stake, at ever longer lags and leads for decreasing $\varepsilon$. This is indeed the rationale behind the examples in the previous section in which the contemporaneously uncorrelated random walks turn out to be cointegrated. By introducing these "infinitesimal" covariances, over longer time spans for each smaller $\varepsilon$, one forces the appearance of the non-invertible combined moving average and, consequently, the occurrence of cointegration. Any integral of this sequence of "close" stationary processes will be an I(1) process "close", now with the due qualifications already made, to $\{X_t\}$.
4. A deeper insight on the result.
What are the practical consequences of the Theorem in section 3? We try to put it into an applied perspective by discussing in Example 3 one of the most popular tests for cointegration, the Johansen procedure. The other example tries to develop the intuition on the "close" processes in I(1) space.
Example 3: Johansen's test.
Putting it rather briefly, the test of Johansen (1988, 1991) assumes that $\{X_t\}$ has a VAR(p) representation in levels, so that an error correction model (ECM) of order p can be derived:
(10)
The basic idea of the procedure is to "clean" $\Delta X_t$ and $X_{t-1}$ of the p-1 lags in the differences and then try to determine the rank of $A_0$, the coefficient matrix of $X_{t-1}$, by testing the significance of the canonical correlations (or rather, the multivariate regression) between the residuals from the two regressions. As known, the absence of correlation, i.e., a zero rank for $A_0$, means that there is no cointegration.
Suppose then that $\{X_t\}$ is not cointegrated, that the hypotheses of Johansen's test are verified and that the researcher underfits model (10), stopping at a lag p' < p. This means that after regressing $\Delta X_t$ and $X_{t-1}$ on the p' lags, a "common part" due to the p-p' missing (last) lags will be in the residuals. The Theorem points out that the introduction of cross correlations leads to cointegration and, in this case, the probability of falsely accepting cointegration will be higher. Now suppose the same setting with the researcher overfitting model (10); according to the same reasoning this seems less serious in terms of distorting the test.
The more interesting situation is perhaps when model (10) is correctly fitted but, due either to measurement errors or to the use of proxies, the cross-covariances between two components of $\{X_t\}$ become artificially higher at longer lags. They will then remain in both residuals and false cointegrations will be more likely. The same applies if, due to some policy mechanism, a longer (cross) memory is created between two series during a certain period. As the proof of the Proposition shows, even if the memory remains short but the lagged covariances increase sufficiently, cointegration may take place. Of course, in this case, deciding whether the cointegration found is economically meaningful or not is a matter for the researcher's personal judgement. Further developments of this last point are well posed in Harvey (1997).
Example 4: How close are the I(1) processes?
Consider at first the pair of random walks
$$x_t = x_{t-1} + e_{1t} , \qquad y_t = y_{t-1} + e_{2t} , \qquad x_0 = y_0 = 0 ,$$
$$\mathrm{var}\, e_{1t} = \mathrm{var}\, e_{2t} = 1 , \qquad \mathrm{cov}(e_{1t}, e_{2,t+k}) = 0 \ \text{ for all integer } k ,$$
which is naturally not cointegrated. Now, for $\varepsilon$ = 1/2, 1/4 and 1/6, we are going to generate new pairs of random walks, cointegrated, whose differences are not further than $\varepsilon$, in the metric proposed in section 3, from $\{(\Delta x_t, \Delta y_t)'\}$. Following the reasoning in the previous section, the way to do this is to introduce "extra" covariances of order 1/m (namely $1/(2m) = \varepsilon$) between $\{e_{1t}\}$ and $\{e_{2t}\}$ at all leads and lags from 1 to m. The respective values of m are 1, 2 and 3. The integrals in each case are generated as follows:
i) the new innovations $(e^*_{1t}, e^*_{2t})'$, with the due cross-correlations, are generated with the help of their Wold representation. For the case $\varepsilon$ = 1/2 and m = 1, the representation is:
$$\begin{pmatrix} e^*_{1t} \\ e^*_{2t} \end{pmatrix} = \begin{pmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} L \\ \frac{\sqrt{2}}{2} L & \frac{\sqrt{2}}{2} \end{pmatrix} \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix} .$$
These representations are found by solving the system which links the coefficients of a (bivariate) MA of order m to its m+1 covariance matrices (see, for instance, Hamilton (1994), chapter 10). The matrices of coefficients, at any lag, are symmetric as we impose that covariances are equal for leads and lags at the same distance from the origin, i.e., $\Gamma_\tau = \Gamma_{-\tau}$. Even so, the solution of this [...]
ii) with the assumption that $x^*_{10} = x^*_{20} = 0$, the integrated process $\{(x^*_{1t}, x^*_{2t})'\}$ is generated by:
$$x^*_{jt} = \sum_{i=1}^{t} e^*_{ji} , \qquad j = 1, 2 .$$
Exhibits 1, 2 and 3 show, respectively, each integrated process and the
original random walks.
[Insert Exhibits 1, 2 and 3 about here]
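The case $\varepsilon$ = 1/2, m = 1 can be mimicked with a short simulation. The sketch below is an assumption-laden illustration, not the author's code: it uses Gaussian shocks and one MA(1) representation that matches the stated second moments (unit variances, zero contemporaneous cross-covariance, cross-covariance 1/2 at lead and lag 1), in which each new innovation is $\sqrt{2}/2$ times the sum of its own current shock and the other series' lagged shock. All names are hypothetical.

```python
import math
import random


def simulate_close_pair(n, seed=0):
    """Generate cross-linked innovations and their integrals (both started at zero)."""
    rng = random.Random(seed)
    s = math.sqrt(2) / 2
    e1_prev, e2_prev = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x1 = x2 = 0.0
    innov1, innov2, spread = [], [], []
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        star1 = s * (e1 + e2_prev)     # new innovation of series 1
        star2 = s * (e2 + e1_prev)     # new innovation of series 2
        x1 += star1                    # integrate: x*_t = x*_{t-1} + e*_t
        x2 += star2
        innov1.append(star1)
        innov2.append(star2)
        spread.append(x1 - x2)
        e1_prev, e2_prev = e1, e2
    return innov1, innov2, spread


def cross_cov(a, b, lag):
    """Sample covariance between a_t and b_{t-lag}, assuming zero means."""
    n = len(a)
    return sum(a[t] * b[t - lag] for t in range(max(lag, 0), min(n, n + lag))) / n


innov1, innov2, spread = simulate_close_pair(50_000)
cov_lag0 = cross_cov(innov1, innov2, 0)     # near 0
cov_lag1 = cross_cov(innov1, innov2, 1)     # near 1/2
cov_lead1 = cross_cov(innov1, innov2, -1)   # near 1/2
mean_spread = sum(spread) / len(spread)
var_spread = sum((v - mean_spread) ** 2 for v in spread) / len(spread)  # stays near 1

print(round(cov_lag0, 3), round(cov_lag1, 3), round(cov_lead1, 3), round(var_spread, 3))
```

Under this representation the spread between the two integrated series telescopes to $\sqrt{2}/2$ times the current difference of the shocks minus its initial value, so its variance does not grow with the sample size, which is the sense in which the pair cointegrates with vector (1, -1).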
Appendix: Proofs
Lemma: Denoting by $\Sigma^*(\omega)$ the spectral density of process $\{X^*_t\}$, the density of a linear combination with coefficients v will be $v' \Sigma^*(\omega)\, v$. By hypothesis, there are k linearly independent v such that process $\{v'X^*_t\}$ has one real unit root. As known, this means that the equation
$$x' \Sigma^*(0)\, x = 0 , \qquad x \in \mathbb{R}^{n+1}, \ x \neq 0 ,$$
admits k non-trivial, linearly independent solutions. This implies that 0 is an eigenvalue of multiplicity k of $\Sigma^*(0)$ and that, consequently, the k terms of lowest degree of the characteristic polynomial of $\Sigma^*(0)$ are zero. The k equations shown are simply the coefficients of these terms.
Proposition: Let $\{X^*_t = (X^*_{1t}, X^*_{2t}, \ldots, X^*_{n+1,t})'\}$ in S - S', with cross-covariance matrices $\Gamma^*_\tau$, be given. For each $\varepsilon > 0$ we shall produce a rather special element of S', whose spectral density at the zero frequency has rank n, and which will be closer than $\varepsilon$ to $\{X^*_t\}$. Define the matrix $\Sigma^*(0; x_{12})$, which has all its entries equal to those of $\Sigma^*(0)$ but for the co-spectrum $c_{12}$, which is left as an unknown $x_{12}$, and consider the (n+1)-th degree algebraic equation:
$$\det \Sigma^*(0; x_{12}) = 0 . \qquad (A.1)$$
Let $c'_{12}$ be a real root of (A.1). Now set $\delta = c'_{12} - c_{12}$ and let m be the smallest positive integer such that $|\delta|\,\pi/m \le \varepsilon$. Consider a sequence of cross-covariance matrices $\{\Gamma'_\tau\}$ such that, for every $\tau$, they are equal to $\Gamma^*_\tau$ but for the following exceptions:
for $0 < |\tau| \le m$, the (1,2) and (2,1) entries of matrix $\Gamma'_\tau$ equal, respectively,
$$\mathrm{cov}(X^*_{1t}, X^*_{2,t-\tau}) + \delta\pi/m \quad \text{and} \quad \mathrm{cov}(X^*_{2t}, X^*_{1,t-\tau}) + \delta\pi/m .$$
By the spectral representation theorem there exists a stationary process which admits this sequence of covariance matrices. By construction, the distance between this process and $\{X^*_t\}$ is smaller than $\varepsilon$ and its spectral matrix at the zero frequency is equal to $\Sigma^*(0)$ but for the (1,2) and (2,1) entries, whose value is now:
$$c_{12} + \frac{1}{2\pi} \times \frac{\delta\pi}{m} \times 2m = c_{12} + \delta = c'_{12} .$$
This matrix has then at least one zero eigenvalue and the result is proved.
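The construction in this proof can be traced numerically in the bivariate case, where (A.1) reduces to $s_{11}s_{22} - x_{12}^2 = 0$. The sketch below is a hedged illustration: the function name, the choice of the root with the same sign as $c_{12}$, and the small tolerance guarding the ceiling against floating-point round-off are assumptions of this example, not part of the paper.

```python
import math


def perturbation_plan(s11, s22, c12, eps):
    """Return (m, shift, new_c12) for a 2x2 spectral matrix at frequency zero.

    s11, s22: marginal spectra at zero; c12: co-spectrum at zero.
    m is the smallest positive integer with |delta| * pi / m <= eps, and
    shift = delta * pi / m is added to the cross-covariances at lags 1..m and -1..-m.
    """
    root = math.copysign(math.sqrt(s11 * s22), c12)   # a real root of det = 0
    delta = root - c12
    # The 1e-12 tolerance avoids an extra lag caused only by rounding error.
    m = max(1, math.ceil(abs(delta) * math.pi / eps - 1e-12))
    shift = delta * math.pi / m
    new_c12 = c12 + shift * (2 * m) / (2 * math.pi)   # equals c12 + delta
    return m, shift, new_c12


# Two independent unit-variance white noises: Sigma(0) = I / (2*pi).
s = 1 / (2 * math.pi)
plans = {eps: perturbation_plan(s, s, 0.0, eps) for eps in (1 / 2, 1 / 4, 1 / 6)}
for eps, (m, shift, new_c12) in plans.items():
    print(eps, m, shift, s * s - new_c12 ** 2)   # last value: determinant, about zero
```

For this starting point the plan gives m = 1, 2 and 3 with shifts 1/2, 1/4 and 1/6, which agrees with the values of m and $\varepsilon$ quoted in Example 4.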
Theorem: Suppose that i) does not apply. Then $\{\Delta X_t\}$ is a member of S. Given $\varepsilon > 0$ and k = 1, by the Proposition one can find a process $\{X^{*1}_t\}$ satisfying ii). If k > 1 it is easy to adapt the technique in the proof of the Proposition to change exactly k co-spectra.
References
Anderson, T. W. 1971. The Statistical Analysis of Time Series. J. Wiley and Sons Inc., New York.
Flôres, R. G. Jr and A. Szafarz. 1994. Efficient markets do not cointegrate. Cahiers du CERO (Numéro en hommage à Simone Huyberechts), 36, 143-51.
Hamilton, J. D. 1994. Time Series Analysis. Princeton University Press, Princeton.
Harvey, A. 1997. Trends, cycles and autoregressions. The Economic Journal, 107, 192-201.
Johansen, S. 1988. Statistical analysis of cointegration vectors. J. of Economic Dynamics and Control, 12, 231-54.
Johansen, S. 1991. Estimation and hypothesis testing of cointegration vectors in Gaussian vector autoregressive models. Econometrica, 59, 1551-80.
Phillips, P. C. B. and S. Ouliaris. 1988. Testing for cointegration using principal components methods. J. of Economic Dynamics and Control, 12, 205-30.
Shilov, G. E. 1961. An Introduction to the Theory of Linear Spaces. Prentice-Hall Inc., New York.