Analyzing the Gaver–Lewis Pareto Process under an Extremal Perspective


Article


Marta Ferreira 1,2,3,* and Helena Ferreira 4

1 Centro de Matemática da Universidade do Minho, Campus de Gualtar, 4710-057 Braga, Portugal

2 Centro de Matemática Computacional e Estocástica, Departamento de Matemática-Instituto Superior Técnico, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal

3 Centro de Estatística e Aplicações, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal

4 Universidade da Beira Interior, Centro de Matemática e Aplicações (CMA-UBI), Avenida Marquês d’Avila e Bolama, Covilhã 6200-001, Portugal; helena.ferreira@ubi.pt

* Correspondence: msferreira@math.uminho.pt

Academic Editor: Mogens Steffensen

Received: 10 April 2017; Accepted: 16 June 2017; Published: 27 June 2017

Abstract: Pareto processes are suitable to model stationary heavy-tailed data. Here, we consider the auto-regressive Gaver–Lewis Pareto Process and study its tail behavior. We characterize its local and long-range dependence. We will see that consecutive observations are asymptotically tail independent, a feature that is often misevaluated by the most common extremal models and that is strongly relevant to tail inference. This also reveals clustering at “penultimate” levels. Linear correlation may not exist in a heavy-tailed context, and an alternative diagnostic tool will be presented. The derived properties relate to the auto-regressive parameter of the process and will provide estimators. A comparison of the proposals is conducted through simulation, and an application to a real dataset illustrates the procedure.

Keywords: extreme value theory; autoregressive processes; extremal index; asymptotic tail independence

MSC: 60G70; 60G10

1. Introduction

Increased exposure to catastrophic losses and the complexity of financial instruments require sophisticated risk assessment tools in areas such as (re)insurance, banking and finance, among others. Extreme value theory plays an important methodological role in risk management by providing appropriate instruments to deal with values as large as or even higher than those ever observed. These techniques include heavy-tailed models and measures to evaluate tail dependence, namely to infer to what extent the occurrence of a risk value in some variable influences an analogous occurrence in another variable.

Linear ARMA (autoregressive moving average) models with heavy-tailed noise may be suitable to model time series presenting peaks of observations. However, a maximum operator in place of a summation is more convenient for deriving extremal properties. Max-autoregressive and moving maximum models were developed within this spirit, such as MARMA (max-autoregressive moving average) processes, introduced in Davis and Resnick (1989), and M4 (multivariate maxima of moving maxima) processes, presented in Smith and Weissman (1996). The Pareto model, which is closed under geometric multiplication and minimization, also motivated the first-order Pareto processes presented in Arnold (2001). Further analysis may be found in Ferreira (2016) and references therein.

A random variable (rv) is modeled by Pareto($\sigma$, $\alpha$) if it has distribution function (df)

$$F_X(x) = 1 - (x/\sigma)^{-\alpha}, \quad x > \sigma,\ \sigma > 0,\ \alpha > 0. \tag{1}$$

This model is a particular case of Pareto-type tail models, the class of regularly varying tail distributions, that is,

$$1 - F(x) := \bar{F}(x) = x^{-\alpha} L(x), \tag{2}$$

where $L(x)$ is a slowly varying function (i.e., $L(x)$ is a real function of positive real values satisfying $L(tx)/L(x) \to 1$, as $x \to \infty$, $\forall t > 0$). The parameter $1/\alpha$ is usually called the tail index of the Pareto rv. Consider $\{X_n\}_{n\ge 1}$ a Gaver–Lewis Pareto (GLP) process, presented in Arnold (2001). More precisely, for each positive integer $n$, we have

$$X_n = \sigma^{p} X_{n-1}^{1-p}\, \varepsilon_n^{U_n}, \tag{3}$$

where $\{\varepsilon_n\}_{n\ge 1}$ is an independent and identically distributed (iid) sequence with common df Pareto($\sigma$, $\alpha$) given in Equation (1) and independent of the Bernoulli($p$) iid sequence $\{U_n\}_{n\ge 1}$ with $0 < p < 1$, and $\varepsilon_i$ is independent of $X_j$, $\forall i > j$. If $X_0 \sim$ Pareto($\sigma$, $\alpha$), then $\{X_n\}_{n\ge 1}$ is stationary, also with marginal df Pareto($\sigma$, $\alpha$). The GLP process corresponds to the exponentiated version of the Gaver–Lewis process introduced in Gaver and Lewis (1980), hence its name. Simulated samples from the GLP process with marginals Pareto(1, 1) and $p = 0.25, 0.5, 0.75$ are plotted in Figure 1.

Figure 1. Simulated sample paths of the GLP process with marginals Pareto(1, 1) for p = 0.25 (left), p = 0.5 (middle) and p = 0.75 (right).
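Sample paths such as those in Figure 1 can be generated directly from the recursion in Equation (3). The following is a minimal sketch, assuming σ = 1 and Pareto draws obtained by inversion of a uniform variate; the function name `simulate_glp` and its arguments are illustrative choices and are not taken from the paper.

```python
import random


def simulate_glp(n, p, alpha=1.0, seed=None):
    """Simulate n values of a GLP process with sigma = 1 (assumption of this sketch).

    Recursion: X_n = X_{n-1}^(1-p) * eps_n^(U_n), with eps_n ~ Pareto(1, alpha)
    and U_n ~ Bernoulli(p), started from the stationary Pareto(1, alpha) law.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    rng = random.Random(seed)

    def pareto():
        # Inversion: if V is uniform on (0, 1], then V^(-1/alpha) is Pareto(1, alpha).
        return (1.0 - rng.random()) ** (-1.0 / alpha)

    x = pareto()  # X_0 drawn from the stationary marginal
    path = []
    for _ in range(n):
        u = rng.random() < p          # U_n = 1 with probability p
        eps = pareto()                # innovation eps_n
        x = x ** (1.0 - p) * (eps if u else 1.0)
        path.append(x)
    return path


if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75):
        sample = simulate_glp(1000, p, alpha=1.0, seed=2017)
        print(p, "largest simulated value:", max(sample))
```

Drawing the innovation even when U_n = 0 keeps the random-number stream aligned across different values of p, which is a convenience of this sketch rather than a requirement of the model.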

This is a model within the heavy-tailed class, where moments (of different orders) may not exist. For instance, in this case, the mean exists only if $\alpha > 1$ and the variance/covariance exists whenever $\alpha > 2$. In Arnold (2001), the autocorrelation was derived as

$$\rho(X_{n-1}, X_n) = \frac{(1-p)(\alpha-2)}{\alpha+p-2}, \tag{4}$$

for $\alpha > 2$. Moreover, in heavy-tailed models, the extremal observations are important, and a dependence analysis based on central measures such as the autocorrelation may be misleading if the dependence in the tails has a different structure from that of the rest of the distribution.
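As a small worked illustration of Equation (4), the sketch below evaluates the lag-one autocorrelation and refuses to return a value when α ≤ 2, where the second moment does not exist. The function name is a hypothetical choice for this example.

```python
def glp_lag1_autocorrelation(p, alpha):
    """Lag-one autocorrelation of the GLP process, Equation (4); needs alpha > 2."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    if alpha <= 2.0:
        raise ValueError("the autocorrelation is only defined for alpha > 2")
    return (1.0 - p) * (alpha - 2.0) / (alpha + p - 2.0)


if __name__ == "__main__":
    # As alpha grows the value approaches 1 - p; as alpha decreases to 2 it goes to 0.
    for alpha in (2.1, 3.0, 10.0, 1000.0):
        print(alpha, glp_lag1_autocorrelation(0.5, alpha))
```

The formula shows why the acf is a fragile diagnostic here: it vanishes as α approaches 2 from above, whatever the value of p.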

Here, we focus on the extremal behavior of the GLP process, namely the tail dependence structure. Despite being a heavy-tailed model, it is practically unknown in the modeling of extreme values. We shall see that it has interesting properties, such as asymptotic tail independence, i.e., the probability that one observation exceeds an increasingly large value, given that the previous one has already exceeded it, approaches zero. The rate of convergence, usually called the coefficient of asymptotic tail independence η (Ledford and Tawn (1996); Wadsworth and Tawn (2012)), captures a residual tail dependence, revealing a kind of “penultimate” clustering, i.e., an aggregation of not so high values. This behavior is not unusual in real applications and can be observed in the well-known Gaussian processes (see Bortot and Tawn (1998), Ramos and Ledford (2013), and references therein). In practice, ignoring this phenomenon will result in misleading inference (see, e.g., Poon et al. (2003)). However, not all series associated with environmental, social or economic phenomena are susceptible to Gaussian modeling, especially when they have heavy tails. The most common extremal models MARMA and M4, as well as the Yeh–Arnold–Robertson Pareto (III) and the Lawrence–Lewis Pareto processes (see, respectively, Ferreira (2012) and Ferreira (2016) and references therein), are not tail independent, and new processes have been appearing (Heffernan et al. (2007); Ferreira and Canto e Castro (2010); Ferreira and Ferreira (2014) and Ferreira and Ferreira (2015)). The GLP is an additional contribution within this class. The coefficient η will also be extended to observations that are lag-m apart, providing an alternative to the autocorrelation function (acf).

This paper is organized as follows. The tail dependence measures and conditions to be analyzed are detailed in Section 2 and applied to the GLP process in Section 3. The tail characterization provides us with methods to estimate the autoregressive parameter p, which will be compared through simulation in Section 4. An illustration with a real dataset is addressed in Section 5.

2. Tail Dependence

A stationary sequence $\{X_n\}_{n\ge 1}$ has extremal index $\theta \in [0, 1]$ if, for each $\tau > 0$, there is a sequence of normalized levels $\{u_n \equiv u_n^{(\tau)}\}_{n\ge 1}$, i.e.,

$$n(1 - F(u_n)) \to \tau,$$

as $n \to \infty$, such that

$$P(\max(X_1, \dots, X_n) \le u_n) \to e^{-\theta\tau} \tag{5}$$

(Leadbetter et al. (1983)). The extremal index is a measure of the clustering propensity, being interpreted as the reciprocal of the mean number of exceedances of an increasing threshold per independent cluster. The null case is often ignored, corresponding to a degenerate limiting distribution for the maximum. The value $\theta = 1$ is associated with iid sequences, but not only with these. Below, we shall see that it is a form of asymptotic independence of extremes.

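Some dependence conditions allow us to derive the extremal index through the joint distribution of $s$ consecutive terms of $\{X_n\}_{n\ge 1}$.

The long-range condition $D(u_n)$ of Leadbetter (1974) states that $\alpha_{n,l_n} \to 0$, as $n \to \infty$, for some sequence $l_n = o(n)$, where

$$\alpha_{n,l} = \sup\Big\{ \big| P(M_{i_1,i_1+p} \le u_n,\ M_{j_1,j_1+q} \le u_n) - P(M_{i_1,i_1+p} \le u_n)\,P(M_{j_1,j_1+q} \le u_n) \big| : 1 \le i_1 < i_1+p+l \le j_1 < j_1+q \le n \Big\}, \tag{6}$$

with $M_{i,j} = \max_{s=i+1}^{j} X_s$, $M_{0,j} = M_j$ and $M_{i,j} = -\infty$ for $i \ge j$. Consider $\{k_n\}_{n\ge 1}$ such that

$$k_n \to \infty, \quad k_n \alpha_{n,l_n} \to 0, \quad k_n l_n/n \to 0, \quad \text{as } n \to \infty. \tag{7}$$

Observe that $D(u_n)$ is a milder condition than the usual mixing conditions, such as strong mixing.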

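Under condition $D(u_n)$, we say that the local dependence condition $D^{(s)}(u_n)$ of Chernick et al. (1991) holds for $\{X_n\}_{n\ge 1}$ if, for some $\{k_n\}_{n\ge 1}$ satisfying Equation (7), we have

$$n\,P(X_1 > u_n \ge M_{1,s},\ M_{s,r_n} > u_n) \to 0, \quad \text{as } n \to \infty,$$

with $\{r_n = [n/k_n]\}_{n\ge 1}$ ($[x]$ denoting the integer part of $x$). The validation of $D^{(s)}(u_n)$ implies that $D^{(s')}(u_n)$ holds for $s' > s$.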

If $D^{(s)}(u_n)$ holds, the extremal index exists and can be computed through (Chernick et al. (1991))

$$\theta = \lim_{n\to\infty} P(M_{1,s} \le u_n \mid X_1 > u_n). \tag{8}$$

Observe that, under $D^{(1)}(u_n)$, we have $\theta = 1$. Condition $D^{(s)}(u_n)$ is also implied by

$$n \sum_{j=s+1}^{r_n} P(X_1 > u_n,\ M_{1,s} \le u_n < X_j) \to 0, \quad \text{as } n \to \infty.$$

This corresponds to condition $D'(u_n)$ of Leadbetter et al. (1983) whenever $s = 1$, which locally restricts the occurrence of clusters of exceedances and thus leads to $\theta = 1$. Condition $D''(u_n)$ of Leadbetter and Nandagopalan (1989) is obtained with $s = 2$ and locally restricts upcrossing clustering.

Observe that, under $D''(u_n)$, we can write

$$\theta = \lim_{n\to\infty} P(X_2 \le u_n \mid X_1 > u_n)$$

and thus state

$$\theta(u_n) \sim 1 - P(X_2 > u_n \mid X_1 > u_n), \quad n \to \infty,$$

where $a(x) \sim b(x)$ means $a(x)/b(x) \to c$, as $x \to \infty$, for some constant $c$. Observe that, if $\theta < 1$, then $\lim_{n\to\infty} P(X_2 > u_n \mid X_1 > u_n) > 0$, meaning that consecutive observations are tail dependent. On the other hand, under a unit extremal index, we have $P(X_2 > u_n \mid X_1 > u_n) \to 0$, as $n \to \infty$, and thus consecutive observations are asymptotically tail independent. This feature has been observed in some real data and theoretical examples (Ledford and Tawn (1996); Bortot and Tawn (1998); Ramos and Ledford (2013)).

Ledford and Tawn (1996) introduced the asymptotic tail independence coefficient, $\eta$, in order to measure the rate of convergence of $P(X_2 > F^{-1}(1-t) \mid X_1 > F^{-1}(1-t))$ towards 0, where $F^{-1}$ is the quantile function, capturing a kind of pre-asymptotic dependence. More precisely, the asymptotic tail independence coefficient, $\eta \in (0, 1]$, exists whenever

$$P\big(X_1 > F^{-1}(1-t),\ X_2 > F^{-1}(1-t)\big) \sim t^{1/\eta} L(1/t), \quad t \downarrow 0. \tag{9}$$

Thus, under Equation (9), we can state

$$\theta(u_n) \sim 1 - P(X_2 > u_n \mid X_1 > u_n) \sim 1 - u_n^{1-1/\eta} L(u_n), \quad n \to \infty.$$

Observe that, if $\eta = 1$ and $L(u_n) \not\to 0$, we have $\theta < 1$ and thus an effect of clustering of high values. Under a unit extremal index, the coefficient $\eta < 1$ measures the rate of convergence of $\theta(u_n)$ towards 1, capturing a kind of pre-asymptotic clustering, even though the process resembles an iid sequence at increasingly high thresholds.

By analogy with the acf, we extend the coefficient $\eta$ and state the tail dependence within random pairs that are lag-$m$ apart, $(X_i, X_{i+m})$, $i \ge 1$, through the coefficient $\eta_m$, i.e.,

$$P\big(X_1 > F^{-1}(1-t),\ X_{1+m} > F^{-1}(1-t)\big) \sim t^{1/\eta_m} L(1/t), \quad t \downarrow 0,$$

where $\eta_1 \equiv \eta$.

The tail dependent class has been greatly enhanced within the methodology of extreme values. However, this approach results in the overestimation of extremal dependence if the series is actually asymptotically independent. An illustration with financial data may be seen in Poon et al. (2003). The most recent literature has been addressing this issue, namely with the introduction of new models comprising asymptotic tail independence (Heffernan et al. (2007); Ferreira and Canto e Castro (2010); Ferreira and Ferreira (2014, 2015)). In the next section, we will show that the GLP belongs to this latter class of models.

3. The Tail Dependence of the Gaver–Lewis Process

In the following, and without loss of generality, we will take σ=1.

“Mixing” conditions roughly state that two rvs become increasingly independent as they get further apart in time. One of their forms is the β-mixing condition, defined by

$$\beta(l) := \sup_{p \in \mathbb{N}} E\Big( \sup_{B \in \mathcal{F}(X_{p+l+1}, \dots)} \big| P(B \mid \mathcal{F}(X_1, \dots, X_p)) - P(B) \big| \Big) \to 0, \quad \text{as } l \to \infty,$$

with $\mathcal{F}(\cdot)$ denoting the σ-field generated by the indicated random variables (Bradley (2005)).

Proposition 1. The GLP process is β-mixing.

Proof. The β-mixing condition will be proved through the sufficient conditions of regeneration and aperiodicity (see, e.g., Bradley (2005), Corollary 3.6).

In the following, consider the notation $Q^m(x, ]1, y]) = P(X_{1+m} \le y \mid X_1 = x)$, with $Q(x, ]1, y]) \equiv Q^1(x, ]1, y])$. Observe that

$$Q(x, ]1, y]) = P(X_2 \le y \mid X_1 = x) = P\big(\varepsilon_2^{U_2} \le y/x^{1-p}\big) = F_{\varepsilon}\big(y/x^{1-p}\big)\,p + 1 - p, \quad y \ge x^{1-p}.$$

First, we show that GLP is regenerative, that is, it has a regeneration set, i.e., a recurrent set $R$ such that, for some $m \in \mathbb{N}$, a distribution $\phi$ and $\kappa \in (0, 1)$, we have

$$Q^m(x, B) \ge \kappa\,\phi(B), \quad x \in R,$$

for all Borel sets $B$ over $\mathbb{R}$. If, for any regeneration set $R$ and any Borel set $B$ over $\mathbb{R}$, we have

$$Q^{m+1}(x, B) \ge \kappa_1 \phi(B) \quad \text{and} \quad Q^m(x, B) \ge \kappa_2 \phi(B), \quad \forall x \in R, \tag{10}$$

for some $m \in \mathbb{N}$ and $\kappa_1, \kappa_2 \in (0, 1)$, then the process is said to be aperiodic (Asmussen (1987)). Consider $R = ]1, r[$ (and thus recurrent, since it is in the state space $]1, \infty[$ of the process) and $B$ a Borel set over $\mathbb{R}$. Let $x \in R$, $S = [r, r^{1/(1-p)}]$ and $V \sim$ Pareto($r^{1-p}$, $\alpha$). For all $x \in R$, we have

$$Q(x, B) \ge \int_{B \cap S} dQ(x, z) \ge P(V \in B \cap S)\,p,$$

and thus regeneration holds by considering $m = 1$, $\phi(B) = P(V \in B \mid V \in S)$ and $\kappa = P(V \in S)\,p$. Observe that $S$ is also regenerative since it is recurrent ($S \subset ]1, \infty[$) and, $\forall z \in S$, $z^{1-p} \le r < y$, for any $y \in S$, and thus $Q(z, B) \ge \kappa\,\phi(B)$, with $\phi(B)$ and $\kappa$ as above. Now, we have

$$Q^2(x, B) \ge \int_S Q(z, B)\,dQ(x, z) \ge \kappa\,\phi(B)\,Q(x, S) \ge \kappa\,\phi(B)\,p\,P(V \in S).$$

Therefore, the aperiodicity condition in Equation (10) is satisfied if we take $\kappa_1 = \kappa\,p\,P(V \in S)$ and $\kappa_2 = \kappa$.

Note that condition $D(u_n)$ given in Equation (6) is weaker than β-mixing and thus holds for GLP by the previous result.

Proposition 2. The GLP process satisfies condition $D'(u_n)$ for sequences $\{k_n\}_{n\ge 1}$ satisfying Equation (7) and such that $(2-p)^{n/k_n}/n^p \to 0$, as $n \to \infty$.

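Proof. We have successively, for $r_n = [n/k_n]$, $n \ge 1$,

$$\begin{aligned}
n \sum_{j=2}^{r_n} P(X_1 > u_n, X_j > u_n)
&= n \sum_{j=2}^{r_n} P(X_1 > u_n, X_j > u_n, U_2 = \dots = U_j = 0) \\
&\quad + n \sum_{j=2}^{r_n} \sum_{k=1}^{j-1} P\Big(X_1 > u_n, X_j > u_n, \sum_{i=2}^{j} \mathbf{1}_{\{U_i = 1\}} = k\Big) \\
&\le \sum_{j=2}^{r_n} \Bigg\{ \frac{\tau (1-p)^{j-1}}{(n/\tau)^{1/(1-p)^{j-1} - 1}} + \sum_{k=1}^{j-1} \sum_{2 \le s_1 < \dots < s_k \le j} \frac{\tau}{(n/\tau)^{1-(1-p)^{j-1}}} \prod_{\substack{i=0 \\ s_0 = 1}}^{k-1} \frac{p\,(1-p)^{j-1-k}}{1 - (1-p)^{s_k - s_i}} \Bigg\} \\
&\le \sum_{j=2}^{r_n} \Bigg\{ \frac{\tau (1-p)^{j-1}}{(n/\tau)^{1/(1-p) - 1}} + \sum_{k=1}^{j-1} \sum_{2 \le s_1 < \dots < s_k \le j} \frac{\tau (1-p)^{j-1-k}}{(n/\tau)^{p}} \Bigg\} \\
&\le \frac{\tau}{(n/\tau)^{p}} \sum_{j=2}^{r_n} \sum_{k=0}^{j-1} \binom{j-1}{k} (1-p)^{j-1-k}
\le \frac{\tau^{p+1}}{1-p}\,\frac{(2-p)^{r_n}}{n^{p}}.
\end{aligned} \tag{11}$$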
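To get a feel for the growth condition in Proposition 2, the sketch below evaluates the right-hand side of the bound (11) along one hypothetical choice of block length, $r_n = \lfloor c\,p \log n / \log(2-p) \rfloor$ with $0 < c < 1$, for which $(2-p)^{r_n}/n^p$ is at most $n^{-(1-c)p}$. This choice is an assumption made only for illustration; it says nothing about whether the remaining requirements in Equation (7) hold for a given mixing rate.

```python
import math


def d_prime_bound(n, r_n, p, tau=1.0):
    """Right-hand side of (11): tau^(p+1) / (1-p) * (2-p)^(r_n) / n^p."""
    return tau ** (p + 1.0) / (1.0 - p) * (2.0 - p) ** r_n / n ** p


def block_length(n, p, c=0.5):
    """Hypothetical block length r_n = floor(c * p * log(n) / log(2 - p)), 0 < c < 1."""
    return max(1, int(c * p * math.log(n) / math.log(2.0 - p)))


if __name__ == "__main__":
    p = 0.5
    for n in (10**3, 10**6, 10**9):
        r_n = block_length(n, p)
        print(n, r_n, d_prime_bound(n, r_n, p))
```

Block lengths that grow faster than logarithmically in n would make the factor $(2-p)^{r_n}$ dominate, which is why the proposition restricts the admissible sequences $\{k_n\}$.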

Corollary 1. The GLP process has θ=1.

This result reveals that high observations of the GLP process behave similarly to those of an iid sequence. However, there is a weak dependence that may be evaluated through the Ledford and Tawn coefficient η in Equation (9). Moreover, we will see that it is related to the parameter p of the process.

Proposition 3. The GLP process has η=1/(1+p).

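Proof. Consider $a_t = F_X^{-1}(1-t)$ and take $t \downarrow 0$. Observe that

$$\begin{aligned}
P(X_1 > a_t, X_2 > a_t)
&= p\,P\big(X_1 > a_t,\ X_1^{1-p} \varepsilon_2 > a_t\big) + (1-p)\,P\big(X_1 > a_t,\ X_1^{1-p} > a_t\big) \\
&= p \int_{a_t}^{\infty} \bar{F}_{\varepsilon_2}\big(a_t/x^{1-p}\big)\,dF_X(x) + (1-p)\,\bar{F}_X\big(a_t^{1/(1-p)}\big) \\
&= a_t^{-\alpha(1+p)} + (1-p)\,a_t^{-\alpha/(1-p)} \sim t^{p+1} + (1-p)\,t^{1/(1-p)}, \quad \text{as } t \downarrow 0.
\end{aligned}$$

Since $1 + p < 1/(1-p)$, the first term dominates and Equation (9) holds with $1/\eta = 1 + p$.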
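The rate in Proposition 3 can be checked by simulation: dividing the joint exceedance probability by $t$ suggests that $P(X_2 > a_t \mid X_1 > a_t)$ should behave like $t^{p}$ to leading order. The sketch below is an illustration under assumptions (σ = 1, α = 1, empirical quantiles in place of $a_t$); the function names are hypothetical.

```python
import random


def simulate_glp(n, p, alpha=1.0, seed=None):
    """GLP recursion X_n = X_{n-1}^(1-p) * eps_n^(U_n) with sigma = 1."""
    rng = random.Random(seed)

    def pareto():
        return (1.0 - rng.random()) ** (-1.0 / alpha)

    x = pareto()
    path = []
    for _ in range(n):
        u = rng.random() < p
        eps = pareto()
        x = x ** (1.0 - p) * (eps if u else 1.0)
        path.append(x)
    return path


def conditional_exceedance(path, t):
    """Empirical P(X_{i+1} > a | X_i > a), with a the empirical (1 - t)-quantile."""
    ordered = sorted(path)
    a = ordered[min(len(path) - 1, int((1.0 - t) * len(path)))]
    first = joint = 0
    for x1, x2 in zip(path, path[1:]):
        if x1 > a:
            first += 1
            if x2 > a:
                joint += 1
    return joint / first if first else float("nan")


if __name__ == "__main__":
    p = 0.5
    sample = simulate_glp(100000, p, seed=7)
    for t in (0.2, 0.05, 0.01):
        print(t, conditional_exceedance(sample, t), "leading-order value:", t ** p)
```

The conditional probability shrinks as the threshold rises, which is the asymptotic tail independence stated in Corollary 1, while its slow decay reflects the residual dependence measured by η.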

The fluctuation probability in the GLP process is given by

$$f := P(X_{n-1} < X_n) = p\,P\big(X_{n-1}^{p} < \varepsilon_n\big) = p \int_{1}^{\infty} F_X\big(x^{1/p}\big)\,dF(x) = \frac{p}{1+p}.$$

Together with Proposition 3, this yields the following relations.

Corollary 2. The GLP process verifies the following equalities:

(i) $p = 1/(1-f) - 1$;

(ii) $p = 1/\eta - 1$;

(iii) $\eta = 1 - f$.

This result states a characterizing feature that can be helpful in model specification. Moreover, in order to satisfy $0 < p < 1$, we must have $0 < f < 1/2$ and $1/2 < \eta < 1$.
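The relations in Corollary 2 amount to simple changes of parameter. A minimal sketch, with hypothetical helper names, maps p to (f, η) and back and confirms that the three equalities agree on a few values:

```python
def fluctuation_probability(p):
    """f = p / (1 + p)."""
    return p / (1.0 + p)


def eta_from_p(p):
    """eta = 1 / (1 + p), from Proposition 3."""
    return 1.0 / (1.0 + p)


def p_from_f(f):
    """Relation (i): p = 1 / (1 - f) - 1, meaningful for 0 < f < 1/2."""
    return 1.0 / (1.0 - f) - 1.0


def p_from_eta(eta):
    """Relation (ii): p = 1 / eta - 1, meaningful for 1/2 < eta < 1."""
    return 1.0 / eta - 1.0


if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75):
        f, eta = fluctuation_probability(p), eta_from_p(p)
        # Relation (iii) says eta and 1 - f coincide.
        print(p, f, eta, 1.0 - f, p_from_f(f), p_from_eta(eta))
```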

Another interesting property for model identification is based on the lag-m coefficient $\eta_m$, analogous to the acf for linear models. The plots in Figure 2 exhibit a power decay, like the acf of AR(1) processes. Observe also that the smaller the value of p, the larger the lag m must be in order to have “almost” independent observations, i.e., $\eta_m \approx 1/2$.

Figure 2. Plots of $\eta_m$ for the GLP process with p = 0.3 (left); p = 0.5 (middle) and p = 0.7 (right), for lags m = 1, . . . , 6.

Proposition 4. The GLP process has lag-m coefficient $\eta_m$ given by

$$\eta_m = \frac{1}{2 - (1-p)^m}. \tag{12}$$

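Proof. Consider $a_t = F_X^{-1}(1-t)$. The product of powered Pareto rvs is still Pareto-type tail distributed (see, e.g., Arnold (2001)) and thus, applying Equation (2) and the dominated convergence theorem, we have successively, as $t \downarrow 0$,

$$\begin{aligned}
P(X_1 > a_t, X_{1+m} > a_t)
&= \int_{a_t}^{\infty} P\Big( x^{(1-p)^m} \prod_{j=0}^{m-1} \varepsilon_{m+1-j}^{U_{m+1-j}(1-p)^j} > a_t \Big)\,dF_X(x) \\
&\sim \int_{a_t}^{a_t^{(1-p)^{-m}}} \big( a_t\,x^{-(1-p)^{m}} \big)^{-\alpha} L(x/t)\,dF_X(x) + \int_{a_t^{(1-p)^{-m}}}^{\infty} dF_X(x) \\
&\sim a_t^{-\alpha} L(1/t) \int_{a_t}^{a_t^{(1-p)^{-m}}} \alpha\,x^{-\alpha(1-(1-p)^{m})-1}\,dx + a_t^{-\alpha(1-p)^{-m}} \\
&= \frac{a_t^{-\alpha} L(1/t)}{1-(1-p)^{m}} \Big( a_t^{-\alpha(1-(1-p)^{m})} - a_t^{-\alpha((1-p)^{-m}-1)} \Big) + a_t^{-\alpha(1-p)^{-m}} \\
&\sim L^{*}(1/t)\,t^{2-(1-p)^{m}},
\end{aligned} \tag{13}$$

with $L^{*}$ a slowly varying function, which gives Equation (12).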
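The lag-m coefficient of Proposition 4, $\eta_m = 1/(2 - (1-p)^m)$, is easy to tabulate. The following sketch (the function name is an illustrative choice) prints the values for the three parameter settings and six lags shown in Figure 2.

```python
def eta_lag(m, p):
    """Lag-m coefficient eta_m = 1 / (2 - (1 - p)^m) of the GLP process."""
    if m < 1:
        raise ValueError("the lag m must be a positive integer")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    return 1.0 / (2.0 - (1.0 - p) ** m)


if __name__ == "__main__":
    for p in (0.3, 0.5, 0.7):
        values = [round(eta_lag(m, p), 4) for m in range(1, 7)]
        print("p =", p, values)
```

For m = 1 the expression reduces to 1/(1 + p), the value in Proposition 3, and it decreases towards 1/2 as m grows, more slowly for small p.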

4. Estimation

Relations (i) and (ii) stated in Corollary 2 provide us with estimators for the autoregressive parameter p. More precisely, from (i), we have

$$\hat{p}^{(F)} = \frac{1}{1 - \hat{f}} - 1, \tag{14}$$

with $\hat{f}$ corresponding to the empirical counterpart of $f$,

$$\hat{f} = \frac{1}{n-1} \sum_{j=2}^{n} \mathbf{1}_{\{X_{j-1} < X_j\}},$$

provided that $\hat{f} < 1/2$ (notation $\mathbf{1}_{\{\cdot\}}$ means the indicator function). From the iid property of the generating process $\varepsilon_t$ (and with $X_0$ as specified for stationarity), we have ergodicity and thus consistency of the proposed estimators. In addition, $\hat{f}$ corresponds to the mean of Bernoulli trials with Markov dependence. From Klotz (1973), we have that $\sqrt{n}(\hat{f} - f)$ converges in distribution to a centered Gaussian model, and thus so does $\sqrt{n}(\hat{p}^{(F)} - p)$ by the Delta Method, as $n \to \infty$. For more details, see Ferreira (2012).
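The estimator in Equation (14) only needs a count of the upward moves in the series. A minimal sketch follows, with hypothetical function names and a simulated GLP path (σ = 1, α = 1) standing in for data; it returns `None` in the failure case $\hat{f} \ge 1/2$ mentioned above.

```python
import random


def simulate_glp(n, p, alpha=1.0, seed=None):
    """GLP recursion X_n = X_{n-1}^(1-p) * eps_n^(U_n) with sigma = 1."""
    rng = random.Random(seed)

    def pareto():
        return (1.0 - rng.random()) ** (-1.0 / alpha)

    x = pareto()
    path = []
    for _ in range(n):
        u = rng.random() < p
        eps = pareto()
        x = x ** (1.0 - p) * (eps if u else 1.0)
        path.append(x)
    return path


def estimate_p_fluctuation(path):
    """Return p_hat^(F) = 1 / (1 - f_hat) - 1, or None when f_hat >= 1/2."""
    n = len(path)
    if n < 2:
        raise ValueError("at least two observations are needed")
    ups = sum(1 for j in range(1, n) if path[j - 1] < path[j])
    f_hat = ups / (n - 1)
    if f_hat >= 0.5:
        return None  # estimator undefined: would give p_hat >= 1
    return 1.0 / (1.0 - f_hat) - 1.0


if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75):
        print(p, estimate_p_fluctuation(simulate_glp(5000, p, seed=14)))
```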

From (ii), the estimation of p is based on the estimation of η through

$$\hat{p}^{(H)} = \frac{1}{\hat{\eta}^{(H)}} - 1, \tag{15}$$

as long as $\hat{\eta}^{(H)} > 1/2$. Observe that η corresponds to the tail index of $T = \min\big(1/(1 - F(X_1)),\ 1/(1 - F(X_2))\big)$. The most common method developed in the literature is the Hill estimator (Hill (1975)), hence the superscript “H”. More precisely, we have

$$\hat{\eta}^{(H)} = \frac{1}{k} \sum_{i=1}^{k} \log T_{n-i+1:n} - \log u,$$

where $T_{n-k+1:n}, \dots, T_{n:n}$ are the $k$ largest order statistics of $T$ that exceed $u$. It is usual to consider $u \in [T_{n-k:n}, T_{n-k+1:n}[$ and plot $\hat{\eta}^{(H)}$ as a function of $k$. In Figure 3, we can see the Hill trajectories of $\hat{\eta}$ for the respective GLP models considered in Figure 1. The paths are quite stable around the true value of η for a large range of values of $k$. Indeed, the variable $T$ corresponds to the minimum of unit Pareto rvs, where the Hill estimator behaves particularly well. Consistency and asymptotic normality of the Hill estimator $\hat{\eta}^{(H)}$ can be seen in Draisma et al. (2004).
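The Hill-based estimator can be sketched as follows. Since $F$ is unknown in practice, this example replaces it with the rank-based empirical df $\hat{F}(x) = \text{rank}/(n+1)$; that transformation, the threshold choice $u = T_{n-k:n}$ and all function names are assumptions of this sketch rather than details stated in the text.

```python
import math
import random


def simulate_glp(n, p, alpha=1.0, seed=None):
    """GLP recursion X_n = X_{n-1}^(1-p) * eps_n^(U_n) with sigma = 1."""
    rng = random.Random(seed)

    def pareto():
        return (1.0 - rng.random()) ** (-1.0 / alpha)

    x = pareto()
    path = []
    for _ in range(n):
        u = rng.random() < p
        eps = pareto()
        x = x ** (1.0 - p) * (eps if u else 1.0)
        path.append(x)
    return path


def unit_pareto_scores(path):
    """Rank transform 1 / (1 - F_hat(x)) with F_hat(x) = rank / (n + 1)."""
    n = len(path)
    order = sorted(range(n), key=path.__getitem__)
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = (n + 1.0) / (n + 1.0 - rank)
    return scores


def hill_eta(path, k, lag=1):
    """Hill estimate of eta_lag from T_i = min(score_i, score_{i+lag}), u = T_{n-k:n}."""
    scores = unit_pareto_scores(path)
    t = sorted(min(a, b) for a, b in zip(scores, scores[lag:]))
    if not 1 <= k < len(t):
        raise ValueError("k must satisfy 1 <= k < number of pairs")
    log_u = math.log(t[-k - 1])
    return sum(math.log(v) for v in t[-k:]) / k - log_u


def estimate_p_hill(path, k):
    """Return p_hat^(H) = 1 / eta_hat - 1, or None when eta_hat <= 1/2."""
    eta_hat = hill_eta(path, k)
    return None if eta_hat <= 0.5 else 1.0 / eta_hat - 1.0


if __name__ == "__main__":
    sample = simulate_glp(5000, 0.5, seed=15)
    for k in (500, 2500, len(sample) - 2):
        print(k, hill_eta(sample, k), estimate_p_hill(sample, k))
```

The `lag` argument makes the same function usable for the lag-m coefficient $\eta_m$; the largest admissible `k` corresponds to taking the sample minimum of T as threshold.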

Figure 3. Hill plots of $\hat{\eta}^{(H)}$ for the GLP process with marginals Pareto(1, 1) and p = 0.25 (left); p = 0.5 (middle) and p = 0.75 (right).

Expecting to observe time series data behaving exactly as the GLP functional Equation (3) is not realistic. At best, we might observe perturbed versions of the GLP process, for instance “noisy” processes of the form $X_n^{(\delta)} = X_n + \delta Z_n$, $n \ge 1$, where $\{Z_n\}_{n\ge 1}$ is an iid sequence of standard Gaussian rvs and $\delta > 0$. Thus, the simulations cover the GLP and “noisy” GLP sample paths for $\delta = 0, 0.1, 1$. We consider 1000 replicas of sizes $n = 100, 1000, 5000$ for $p = 0.25, 0.5, 0.75$, and marginals Pareto(1, 1). The computed estimates of the root mean squared error (rmse) and absolute bias (abias) are reported in Table 1, where the estimator $\hat{p}^{(H)}$ is based on thresholds $u$ corresponding to the sample minimum ($q_0$), the median ($q_{50}$) and the 80th percentile ($q_{80}$). We also register the number of fails resulting, respectively, from $\hat{f} \ge 1/2$ and $\hat{\eta}^{(H)} \le 1/2$. Not surprisingly, they are more associated with small sample sizes, where the case $p = 0.75$ seems particularly sensitive. Indeed, the results tend to be slightly worse under the large value $p = 0.75$, where the process approaches independence. In practice, the difficulty in deciding between tail dependence ($\eta = 1$) and asymptotic independence ($\eta < 1$) is well known. For a survey on this topic, see, e.g., Poon et al. (2003) and Beirlant et al. (2004). The results get better as the sample sizes increase. We observe that the estimator $\hat{p}^{(F)}$ is the best for the GLP process but is not so robust for the “noisy” GLP. As for the estimator $\hat{p}^{(H)}$, it seems to present an overall better performance under $u = q_0$. The estimation of the tail index parameter α may be conducted through the Hill estimator $\hat{\alpha}^{(H)}$ (Hill (1975)), which is consistent and asymptotically normal under strong mixing conditions (see Rootzén et al. (1990)).
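A scaled-down version of such a study can be sketched as below for the fluctuation-based estimator only. The replica count, seeds and function names are arbitrary choices for this illustration, so its output is not expected to reproduce the entries of Table 1.

```python
import math
import random


def simulate_glp(n, p, alpha, rng):
    """GLP recursion with sigma = 1, driven by the supplied random generator."""
    def pareto():
        return (1.0 - rng.random()) ** (-1.0 / alpha)

    x = pareto()
    path = []
    for _ in range(n):
        u = rng.random() < p
        eps = pareto()
        x = x ** (1.0 - p) * (eps if u else 1.0)
        path.append(x)
    return path


def estimate_p_fluctuation(path):
    """p_hat^(F) from Equation (14); None when f_hat >= 1/2."""
    ups = sum(1 for a, b in zip(path, path[1:]) if a < b)
    f_hat = ups / (len(path) - 1)
    return None if f_hat >= 0.5 else 1.0 / (1.0 - f_hat) - 1.0


def noisy_study(p, delta, n=1000, replicas=50, seed=0):
    """Return (rmse, abias, fails) of p_hat^(F) on X_n + delta * Z_n."""
    rng = random.Random(seed)
    errors, fails = [], 0
    for _ in range(replicas):
        clean = simulate_glp(n, p, 1.0, rng)
        observed = [x + delta * rng.gauss(0.0, 1.0) for x in clean]
        est = estimate_p_fluctuation(observed)
        if est is None:
            fails += 1
        else:
            errors.append(est - p)
    if not errors:
        return float("nan"), float("nan"), fails
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    abias = abs(sum(errors) / len(errors))
    return rmse, abias, fails


if __name__ == "__main__":
    for delta in (0.0, 0.1, 1.0):
        print(delta, noisy_study(0.5, delta))
```

Additive noise can flip the order of consecutive values when the process moves down by a small amount, which inflates $\hat{f}$ and therefore $\hat{p}^{(F)}$; this is one way to read the loss of robustness reported for the “noisy” GLP.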

Table 1. Simulation results of the root mean squared error (RMSE) and of the absolute bias (abias). The last three columns correspond to the number of fails (nf) in each case.

                        RMSE                      abias                     nf
               δ=0     δ=0.1   δ=1      δ=0     δ=0.1   δ=1      δ=0   δ=0.1  δ=1
n = 100
 p = 0.25
  p̂(F)        0.0548  0.1844  0.5050   0.0005  0.1693  0.4940   0     0      24
  p̂(H), q0    0.0775  0.0837  0.2302   0.0547  0.0070  0.2127   0     0      1
  p̂(H), q50   0.1225  0.1140  0.1095   0.0801  0.0130  0.0202   0     0      1
  p̂(H), q80   0.2280  0.2236  0.2098   0.1587  0.0500  0.0842   8     7      35
 p = 0.5
  p̂(F)        0.0707  0.1225  0.3317   0.0013  0.0943  0.3200   0     0      68
  p̂(H), q0    0.0894  0.0894  0.1789   0.0492  0.0557  0.1576   0     0      0
  p̂(H), q50   0.1342  0.1414  0.1342   0.0586  0.0633  0.0197   2     2      1
  p̂(H), q80   0.2191  0.2121  0.2049   0.0816  0.0783  0.0182   100   83     68
 p = 0.75
  p̂(F)        0.0894  0.0548  0.1483   0.0001  0.0265  0.1287   10    0      222
  p̂(H), q0    0.1000  0.0949  0.1265   0.0419  0.0422  0.0917   21    22     93
  p̂(H), q50   0.1265  0.1225  0.1342   0.0116  0.0120  0.0028   143   123    121
  p̂(H), q80   0.1975  0.1897  0.1975   0.0611  0.0526  0.0686   311   274    267
n = 1000
 p = 0.25
  p̂(F)        0.0000  0.1673  0.5010   0.0003  0.1656  0.4998   0     0      0
  p̂(H), q0    0.0775  0.0000  0.1449   0.0547  0.0169  0.1423   0     0      0
  p̂(H), q50   0.1225  0.0316  0.0775   0.0801  0.0080  0.0742   0     0      0
  p̂(H), q80   0.2280  0.0548  0.0775   0.1587  0.0209  0.0494   8     0      0
 p = 0.5
  p̂(F)        0.0000  0.0949  0.3302   0.0005  0.0943  0.3285   0     0      0
  p̂(H), q0    0.0316  0.0316  0.1095   0.0058  0.0122  0.1068   0     0      0
  p̂(H), q50   0.0447  0.0447  0.0632   0.0087  0.0072  0.0508   0     0      0
  p̂(H), q80   0.0775  0.0775  0.0949   0.0158  0.0143  0.0459   0     0      0
 p = 0.75
  p̂(F)        0.0316  0.0548  0.1732   0.0006  0.0459  0.1695   0     0      10
  p̂(H), q0    0.0316  0.0316  0.0707   0.0053  0.0075  0.0636   0     0      0
  p̂(H), q50   0.0548  0.0548  0.0548   0.0084  0.0062  0.0186   0     0      0
  p̂(H), q80   0.1000  0.0949  0.1000   0.0120  0.0047  0.0287   16    16     7
n = 5000
 p = 0.25
  p̂(F)        0.0000  0.1643  0.5000   0.0002  0.1649  0.4994   0     0      0
  p̂(H), q0    0.0000  0.0000  0.1378   0.0015  0.0107  0.1360   0     0      0
  p̂(H), q50   0.0000  0.0000  0.0837   0.0027  0.0004  0.0836   0     0      0
  p̂(H), q80   0.0316  0.0316  0.0707   0.0054  0.0044  0.0618   0     0      0
 p = 0.5
  p̂(F)        0.0000  0.0949  0.3302   0.0002  0.0939  0.3302   0     0      0
  p̂(H), q0    0.0000  0.0000  0.1049   0.0004  0.0061  0.1025   0     0      0
  p̂(H), q50   0.0000  0.0000  0.0632   0.0007  0.0004  0.0567   0     0      0
  p̂(H), q80   0.0316  0.0316  0.0632   0.0020  0.0021  0.0563   0     0      0
 p = 0.75
  p̂(F)        0.0000  0.0447  0.1703   0.0002  0.0451  0.1685   0     0      0
  p̂(H), q0    0.0000  0.0000  0.0632   0.0011  0.0034  0.0585   0     0      0
  p̂(H), q50   0.0316  0.0316  0.0316   0.0013  0.0000  0.0257   0     0      0
  p̂(H), q80   0.0447  0.0447  0.0548   0.0037  0.0026  0.0367   0     0      0

5. Application

Insurance loss data are typically well modeled by heavy-tailed processes. We consider the daily closing values for the Danish fire losses registered from January 1980 to December 1990, plotted in Figure 4 (left). Observe the high values that appear suddenly, similar to the GLP simulated sample paths (see Figure 1), as well as the close linearity of the Pareto quantile-quantile plot (right panel of Figure 4). In Figure 5 (left), the almost flat region of the Hill sample path led us to the estimate $\hat{\alpha}^{(H)} \approx 1.4$. Thus, we cannot assume the existence of the acf and should avoid an analysis based on this tool. We estimate the GLP parameter p through $\hat{p}^{(F)}$ and $\hat{p}^{(H)}$. More precisely, based on Equation (14), we obtain $\hat{p}^{(F)} = 0.9945$ and, using Equation (15), we derive $\hat{p}^{(H)}_{q_0} = 0.9610$, $\hat{p}^{(H)}_{q_{50}} = 0.9715$ and $\hat{p}^{(H)}_{q_{80}} = 0.9996$, where the quantiles $q_0$, $q_{50}$ and $q_{80}$ were considered according to the simulation study (see also the sample path estimates of Hill in the right panel of Figure 5). Formula (12) of the lag-m coefficient $\eta_m$ in Proposition 4 plays a role similar to that of the acf in identifying linear models. Table 2 presents the estimates of $\eta_m$, for m = 1, 2, 3, obtained from the estimator $\hat{\eta}^{(H)}_m$, which consists of the Hill estimator applied to the lag-m apart random pairs $(X_i, X_{i+m})$, as well as the estimates $\hat{\eta}_m(\hat{p}^{(F)})$ and $\hat{\eta}_m(\hat{p}^{(H)})$ derived from Equation (12) by replacing p, respectively, by $\hat{p}^{(F)}$ and $\hat{p}^{(H)}$. The closeness between the two types of estimates, $\hat{\eta}^{(H)}_m$ and $\hat{\eta}_m(\hat{p}^{(\cdot)})$, is further evidence in favor of the model.

The GLP process thus seems to have potential in the modeling of this type of data. More tools regarding goodness-of-fit analysis will be addressed in a future work.

Table 2. Danish fire losses: estimates of the lag-m coefficient ηm.

η̂1(p̂(F))              0.5014
η̂2(p̂(F))              0.5000
η̂3(p̂(F))              0.5000

η̂m                     q0       q50      q80
η̂1(H) ≡ η̂1(p̂(H))     0.5099   0.5072   0.5001
η̂2(H)                  0.5094   0.4995   0.5209
η̂2(p̂(H))              0.5004   0.5002   0.5000
η̂3(H)                  0.5081   0.4985   0.4820
η̂3(p̂(H))              0.5000   0.5000   0.5000

Figure 4. Danish fire losses: daily closing values from January 1980 to December 1990 (left); Pareto quantile-quantile plot (right).

Figure 5. Danish fire losses: trajectory of Hill estimates of $\hat{\alpha}^{(H)}$ (left) and $\hat{\eta}^{(H)}$ (right).

Acknowledgments: The authors wish to thank the reviewers for their important comments that have improved this work. The first author was financed by Portuguese Funds through FCT, Fundação para a Ciência e a Tecnologia, within the Project UID/MAT/00013/2013 and by the research center CEMAT (Instituto Superior Técnico, Universidade de Lisboa) through the Project UID/Multi/04621/2013. The second author’s research was partially supported by the research unit UID/MAT/00212/2013.

Author Contributions: The authors contributed equally to the paper.

Conflicts of Interest: The authors declare no conflict of interest.

References

Arnold, Barry C. 2001. Pareto Processes. In Handbook of Statistics. Edited by D. N. Shanbhag and C. R. Rao. Vol. 19. Amsterdam: Elsevier Science B.V.


Beirlant, Jan, Yuri Goegebeur, Johan Segers, and Jozef Teugels. 2004. Statistics of Extremes: Theory and Applications. Hoboken: John Wiley & Sons.

Bortot, Paola, and Jonathan A. Tawn. 1998. Models for the extremes of Markov chains. Biometrika 85: 851–67.

Bradley, Richard C. 2005. Basic Properties of Strong Mixing Conditions. A Survey and Some Open Questions. Probability Surveys 2: 107–44.

Chernick, Michael R., Tailen Hsing, and William P. McCormick. 1991. Calculating the extremal index for a class of stationary sequences. Advances in Applied Probability 23: 835–50.

Davis, Richard A., and Sidney I. Resnick. 1989. Basic properties and prediction of max-ARMA processes. Advances in Applied Probability 21: 781–803.

Draisma, Gerrit, Holger Drees, Ana Ferreira, and Laurens de Haan. 2004. Bivariate tail estimation: dependence in asymptotic independence. Bernoulli 10: 251–80.

Ferreira, Marta. 2016. The Lawrence-Lewis Pareto process: an extremal approach. Electronic Journal of Applied Statistical Analysis 9: 68–82.

Ferreira, Marta. 2012. On the extremal behavior of a Pareto process: An alternative for armax modeling. Kybernetika 48: 31–49.

Ferreira, Marta, and Luísa Canto e Castro. 2010. Modeling rare events through a pRARMAX process. Journal of Statistical Planning and Inference 140: 3552–66.

Ferreira, Helena, and Marta Ferreira. 2015. Extremes of scale mixtures of multivariate time series. Journal of Multivariate Analysis 137: 82–99.

Ferreira, Helena, and Marta Ferreira. 2014. Extremal behavior of pMAX processes. Statistics & Probability Letters 93: 46–57.

Gaver, Donald, and P. A. W. Lewis. 1980. First-Order Autoregressive Gamma Sequences and Point Processes. Advances in Applied Probability 12: 727–45.

Heffernan, Janet E., Jonathan A. Tawn, and Zhengjun Zhang. 2007. Asymptotically (in)dependent multivariate maxima of moving maxima processes. Extremes 10: 57–82.

Hill, Bruce M. 1975. A Simple General Approach to Inference About the Tail of a Distribution. The Annals of Statistics 3: 1163–74.

Klotz, Jerome. 1973. Statistical inference in Bernoulli trials with dependence. The Annals of Statistics 1: 373–79.

Leadbetter, M. Ross. 1974. On extreme values in stationary sequences. Zeitschrift Für Wahrscheinlichkeitstheorie Und Verwandte Gebiete 28: 289–303.

Leadbetter, Malcolm R., Georg Lindgren, and Holger Rootzén. 1983. Extremes and Related Properties of Random Sequences and Processes. New York: Springer.

Leadbetter, M. Ross, and S. Nandagopalan. 1989. On exceedance point processes for stationary sequences under mild oscillation restrictions. Lecture Notes in Statistics 51: 69–80.

Ledford, Anthony W., and Jonathan A. Tawn. 1996. Statistics for near independence in multivariate extreme values. Biometrika 83: 169–87.

Poon, Ser-Huang, Michael Rockinger, and Jonathan Tawn. 2003. Modelling Extreme-Value Dependence in International Stock Markets. Statistica Sinica 13: 929–53.

Ramos, Alexandra, and Anthony Ledford. 2013. Estimation of the extremal index function in case of asymptotically independent Markov chains and its application to stock market indexes. In Studies in Theoretical and Applied Statistics: Subseries B: Recent Developments in Modeling and Applications in Statistics (SPE2010). Edited by P. Oliveira, M. G. Temido, C. Henriques and M. Vichi. New York: Springer, pp. 89–96.

Rootzén, Holger, Malcolm R. Leadbetter, and Laurens De Haan. 1990. Tail and Quantile Estimation for Strongly Mixing Stationary Sequences. Series: Report 9024/A; Rotterdam: Erasmus University.

Smith, Richard L., and Ishay Weissman. 1996. Characterization and Estimation of the Multivariate Extremal Index. Technical Report; Chapel Hill: University of North Carolina.

Wadsworth, Jennifer L., and Jonathan A. Tawn. 2012. Dependence modelling for spatial extremes. Biometrika 99: 253–72.

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).