Instituto de Ciências Matemáticas e de Computação

UNIVERSIDADE DE SÃO PAULO

ISSN 0103-2569

LOG-BURR XII REGRESSION MODELS WITH CENSORED DATA

GIOVANA OLIVEIRA SILVA, VICENTE G. CANCHO, EDWIN M. M. ORTEGA, MAURICIO LIMA BARRETO

Nº 297

RELATÓRIOS TÉCNICOS

São Carlos - SP, Mai./2007


Log-Burr XII Regression Models with Censored Data

Giovana Oliveira Silva*
Universidade de São Paulo

Edwin M. M. Ortega†
Universidade de São Paulo

Vicente G. Cancho‡
Universidade de São Paulo

Mauricio Lima Barreto§
Universidade Federal da Bahia

Abstract

In survival analysis applications, the failure rate function frequently has a unimodal shape, and in that case the log-normal or log-logistic distributions are commonly used. In this paper, we propose a regression model based on the Burr XII distribution for modeling data with a unimodal failure rate function. The Burr XII distribution has an advantage over the log-normal in that its survival function can be written in closed form, and the log-logistic distribution is a special case of the Burr XII distribution. Assuming censored data, we consider a classical analysis, a Bayesian analysis with noninformative priors, and the jackknife estimator for the parameters of the model. The Bayesian approach uses Markov chain Monte Carlo methods with Metropolis-Hastings steps to obtain the posterior summaries of interest. In addition, we use sensitivity analysis to detect influential or outlying observations, and residual analysis to check model assumptions, such as departures from the error assumptions. The relevance of the approach is illustrated with a real data set.

Keywords: Burr distribution; regression models; censored data; local influence; generalized leverage; residual analysis.

*Address: ESALQ, Universidade de São Paulo, Piracicaba, Brasil. E-mail: giosilva@esalq.usp.br
†Address: ESALQ, University of São Paulo, Piracicaba, São Paulo, Brasil. E-mail: edwin@esalq.usp.br
‡Address: ICMC, Universidade de São Paulo, São Carlos, Brasil. E-mail: garibay@icmc.usp.br
§Address: ISC, Universidade Federal da Bahia, Salvador, Brasil. E-mail: mauricio@ufba.br
†Address for correspondence: Departamento de Ciências Exatas, USP, Av. Pádua Dias 11, Caixa Postal 9, 13418-900 Piracicaba, São Paulo, Brazil. E-mail: edwin@esalq.usp.br

1 Introduction

In this paper we consider a data set provided by the Instituto de Saúde Coletiva, Universidade Federal da Bahia. The study was designed to evaluate the effect of vitamin A supplementation on recurrent diarrheal episodes in small children (see Barreto et al., 1994). We aim to model the treatment effect on the time to the occurrence of diarrhea. Moreover, the censoring times are random.

In many applications there is qualitative information about the shape of the failure rate function, which can help in selecting a particular model. In this context, a device called the total time on test (TTT) plot (Aarset, 1987) is useful. The TTT plot is obtained by plotting

$$G(r/n) = \frac{\sum_{i=1}^{r} T_{i:n} + (n-r)\,T_{r:n}}{\sum_{i=1}^{n} T_{i:n}}$$

against $r/n$, where $r = 1, \ldots, n$ and $T_{i:n}$, $i = 1, \ldots, n$, are the order statistics of the sample (Mudholkar et al., 1996).

For these data, the TTT plot indicates a unimodal failure rate function.
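The TTT transform defined above can be computed directly from the ordered sample. The following is a minimal sketch, assuming a complete (uncensored) sample; the function name `ttt_transform` is illustrative and does not come from the report.

```python
def ttt_transform(times):
    """Return the points (r/n, G(r/n)) of the scaled TTT transform.

    Assumes a complete sample of positive lifetimes (no censoring).
    """
    ordered = sorted(times)          # order statistics T_{1:n} <= ... <= T_{n:n}
    n = len(ordered)
    total = sum(ordered)             # denominator: sum of all order statistics
    points = []
    partial = 0.0                    # running sum of the first r order statistics
    for r, t_r in enumerate(ordered, start=1):
        partial += t_r
        g = (partial + (n - r) * t_r) / total
        points.append((r / n, g))
    return points
```

Plotting the second coordinate against the first gives the TTT plot; the shape of the resulting curve is then inspected visually.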

It is known that the log-normal distribution is a popular model for survival times when the failure rate function is unimodal, and the log-logistic distribution is often used as an alternative to the log-normal. The main purpose of this paper is to present another distribution that can be viewed as a more useful and flexible alternative.

We propose to use the Burr XII distribution for modelling survival times as a viable alternative to the log-normal. The Burr XII distribution has the advantage that its survival function can be written in closed form. Besides, the log-logistic distribution is a special case of the Burr XII distribution. The Burr XII distribution was used in reliability analysis by Zimmer et al. (1998), but for data sets without covariates.

We consider a classical analysis for the log-Burr XII regression model. Inference is carried out using the asymptotic distribution of the maximum likelihood estimators, which may give results that are difficult to justify when the sample is small. As an alternative to the classical analysis, we explore Markov chain Monte Carlo (MCMC) methods to develop Bayesian inference for the log-Burr XII regression model, and we also use the jackknife estimator. Neither the Bayesian nor the jackknife approach requires the asymptotic distribution of the maximum likelihood estimators.

After modelling, it is important to check the assumptions of the model and to conduct a robustness study to detect influential or extreme observations that can distort the results of the analysis.

The examination of residuals was used to check the assumptions of the model. Numerous approaches have been proposed in the literature to detect influential or outlying observations. An efficient way to detect influential observations is diagnostic analysis. Cook (1986) uses this idea to motivate his assessment of local influence. He suggests that more confidence can be put in a model that is relatively stable under small modifications.

The best known perturbation schemes are based on case deletion (Cook and Weisberg, 1982; Xie and Wei, 2007), in which the effect of completely removing cases from the analysis is studied. This reasoning forms the basis for the global influence measures introduced in Section 4.1, which make it possible to determine which subjects might be influential for the analysis. On the other hand, with case deletion all information from a single subject is deleted at once, and it is therefore hard to tell whether that subject has some influence on a specific aspect of the model. A solution to this problem can be found in a quite different paradigm, the local influence approach, where one again investigates how the results of an analysis change under small perturbations of the model, but where these perturbations can be given specific interpretations. Several authors have investigated the assessment of local influence in survival analysis models: for instance, Pettit and Bin Daud (1989) investigate local influence in proportional hazards regression models; Escobar and Meeker (1992) adapt local influence methods to regression analysis with censoring; Ortega et al. (2003) consider the problem of assessing local influence in generalized log-gamma regression models with censored observations; Leiva-Sánchez et al. (2006) investigate local influence in log-Birnbaum-Saunders regression models with censored data; and, more recently, Ortega et al. (2006) derive curvature calculations under various perturbation schemes in log-exponentiated-Weibull regression models with censored data. We develop a similar methodology to detect influential subjects in log-Burr XII regression models with censored data, which is presented in Section 4.2.

Finally, we apply the generalized leverage methodology developed by Wei et al. (1998). In Section 2 we present a brief study of the Burr XII distribution together with inference for this model. In Section 3 we propose a log-Burr XII regression model, along with maximum likelihood estimation, Bayesian inference and the jackknife estimator. In Section 4 we use several diagnostic measures, considering three perturbation schemes, case deletion and the generalized leverage in log-Burr XII regression models with censored observations. In Section 5 we present residuals from a fitted model using the martingale residual proposed by Therneau et al. (1990) and propose a modified deviance residual for the log-Burr XII regression model. Finally, in Section 6 a real data set is analyzed, and conclusions appear in Section 7.

2 The Burr XII distribution

The Burr XII distribution, used in Zimmer et al. (1998), with parameters $s$, $k$ and $c$, assumes that the lifetime $T$ has density function

$$f(t; s, k, c) = c\,k \left(1 + \left(\frac{t}{s}\right)^{c}\right)^{-(k+1)} \frac{t^{c-1}}{s^{c}}, \qquad t > 0, \tag{1}$$

where $k > 0$ and $c > 0$ are shape parameters and $s > 0$ is a scale parameter. The survival function corresponding to the random variable $T$ with Burr XII density is given by

$$S(t; s, k, c) = P(T \ge t) = \left(1 + \left(\frac{t}{s}\right)^{c}\right)^{-k}.$$

The corresponding failure rate function has the form

$$h(t; s, k, c) = \frac{f(t; s, k, c)}{S(t; s, k, c)} = \frac{c\,k\,t^{c-1}}{s^{c}\left[1 + (t/s)^{c}\right]}.$$
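The three functions of this section translate directly into code. The sketch below is illustrative only; the function names are not from the report, and the parameter order `(t, s, k, c)` simply mirrors the notation $f(t; s, k, c)$.

```python
def burr12_pdf(t, s, k, c):
    """Burr XII density f(t; s, k, c) for t > 0."""
    u = (t / s) ** c
    return c * k * t ** (c - 1) / s ** c * (1.0 + u) ** (-(k + 1))


def burr12_sf(t, s, k, c):
    """Burr XII survival function S(t; s, k, c) = P(T >= t)."""
    return (1.0 + (t / s) ** c) ** (-k)


def burr12_hazard(t, s, k, c):
    """Failure rate h(t) = f(t) / S(t)."""
    return burr12_pdf(t, s, k, c) / burr12_sf(t, s, k, c)
```

Because the survival function is available in closed form, the failure rate can be evaluated without numerical integration, which is the advantage over the log-normal mentioned in the Introduction.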

2.1 Characterizing the failure rate function

According to Zimmer et al. (1998), the failure rate function of the Burr XII distribution is decreasing when $c \le 1$, and when $c > 2$ the failure rate function reaches a maximum and then decreases, where the range of values over which the failure rate function is increasing can be manipulated using $s$. When $c$ takes values between 1 and 2, the failure rate function can be made essentially constant over much of the range of the distribution, depending on the value of $s$. To study the shape of the failure rate function, we computed its derivative, which can be written as

$$h'(t; c, k, s) = \frac{c\,k\,t^{c-2}}{s^{c}\left(1 + (t/s)^{c}\right)^{2}}\left[c - 1 - \left(\frac{t}{s}\right)^{c}\right].$$

To study this function, two situations must be considered:

• $c \le 1$

For any $t > 0$, $h'(t) < 0$ and therefore $h(t)$ is a decreasing function.

• $c > 1$

When $h'(t^*) = 0$ we have $c - 1 - (t^*/s)^{c} = 0$, hence the critical point is given by $t^* = s\,(c-1)^{1/c}$.

When $t < t^*$, $h'(t) > 0$ and the failure rate function is increasing; when $t > t^*$, $h'(t) < 0$ and the failure rate function is decreasing. Hence, $t^*$ is a maximum point and the failure rate function is unimodal. Besides, $h(t) \to 0$ for $t \to 0$ or $t \to \infty$.
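The critical point of the failure rate function can be checked numerically. This is a small sketch under the definitions of this section; `hazard` and `hazard_mode` are illustrative names, not part of the report.

```python
def hazard(t, s, k, c):
    """Burr XII failure rate h(t; s, k, c)."""
    return c * k * t ** (c - 1) / (s ** c * (1.0 + (t / s) ** c))


def hazard_mode(s, c):
    """Return t* = s (c - 1)^(1/c) when c > 1, otherwise None.

    For c <= 1 the failure rate is decreasing and has no interior maximum.
    """
    if c <= 1:
        return None
    return s * (c - 1) ** (1.0 / c)
```

For example, with $s = 2$ and $c = 3$ the failure rate increases up to $t^* = 2 \cdot 2^{1/3}$ and decreases afterwards, whatever the value of $k$.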

Figure 1 shows plots of the failure rate function for several parameter combinations.

Figure 1: Plots of the failure rate function for the Burr XII distribution.

From Figure 1, it can be seen that the failure rate function is decreasing when $c \le 1$ and that $h(t)$ is unimodal when $c > 1$.

2.2 Moments for the failure time

The $q$th moment of the failure time is given by

$$E(T^{q}) = s^{q}\,k\,B\!\left[\frac{q}{c} + 1,\; k - \frac{q}{c}\right], \qquad \text{if } ck > q,$$

where $B(a, b)$ is the complete beta function (see Lawless (2003)).
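The moment formula is easy to evaluate with the gamma function, since $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$. The sketch below is illustrative; the names `beta_fn` and `burr12_moment` are assumptions made for this example.

```python
import math


def beta_fn(a, b):
    """Complete beta function B(a, b) via the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def burr12_moment(q, s, k, c):
    """q-th moment E(T^q) of the Burr XII distribution; requires c*k > q."""
    if c * k <= q:
        raise ValueError("the q-th moment exists only if c*k > q")
    return s ** q * k * beta_fn(q / c + 1.0, k - q / c)
```

As a sanity check, with $k = 1$ the distribution is log-logistic and the formula reduces to the familiar mean $s\,(\pi/c)/\sin(\pi/c)$ for $c > 1$.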


2.3 Relation to other distributions

The log-logistic distribution is a special case of the Burr XII distribution. When $1/s = m$ and $k = 1$, the Burr XII distribution reduces to the log-logistic distribution, whose survival function can be written as

$$S(t; k, s, c) = \frac{1}{1 + (mt)^{c}}.$$

Besides, Rodriguez (1977) shows that the Burr coverage area in a specific plane is occupied by various well-known and useful distributions, including the normal, log-normal, gamma, logistic and extreme-value type I distributions.

2.4 Maximum likelihood estimation

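We assume that the lifetimes are independently distributed and also independent of the censoring mechanism. Considering right-censored lifetime data, we observe $t_i = \min(T_i, C_i)$, where $T_i$ is the lifetime and $C_i$ is the censoring time, both for the $i$th individual, $i = 1, \ldots, n$. Assuming that $t_1, t_2, \ldots, t_n$ is a random sample of the random variable $T$ with Burr XII distribution (1), the likelihood function of $c$, $k$ and $s$ corresponding to the observed sample is given by

$$L(c, k, s) = (kc)^{r} \prod_{i \in F}\left[\left(1 + \left(\frac{t_i}{s}\right)^{c}\right)^{-(k+1)} \frac{t_i^{c-1}}{s^{c}}\right] \prod_{i \in C}\left[\left(1 + \left(\frac{t_i}{s}\right)^{c}\right)^{-k}\right], \tag{2}$$

where $r$ is the observed number of failures, $F$ denotes the set of uncensored observations and $C$ denotes the set of censored observations. The log-likelihood function is given by

$$l(c, k, s) = r\log(k) + r\log(c) - (k+1)\sum_{i \in F}\log\left(1 + \left(\frac{t_i}{s}\right)^{c}\right) + \sum_{i \in F}\log\left(\frac{t_i^{c-1}}{s^{c}}\right) - k\sum_{i \in C}\log\left(1 + \left(\frac{t_i}{s}\right)^{c}\right).$$

The maximum likelihood estimators $\hat c$, $\hat k$ and $\hat s$ of $c$, $k$ and $s$ are obtained by maximizing the log-likelihood, which results in solving the equations

$$\frac{\partial l(c,k,s)}{\partial c} = \frac{r}{c} - (k+1)\sum_{i \in F}\frac{(t_i/s)^{c}\log(t_i/s)}{1 + (t_i/s)^{c}} + \sum_{i \in F}\log\left(\frac{t_i}{s}\right) - k\sum_{i \in C}\frac{(t_i/s)^{c}\log(t_i/s)}{1 + (t_i/s)^{c}} = 0,$$

$$\frac{\partial l(c,k,s)}{\partial k} = \frac{r}{k} - \sum_{i \in F}\log\left(1 + \left(\frac{t_i}{s}\right)^{c}\right) - \sum_{i \in C}\log\left(1 + \left(\frac{t_i}{s}\right)^{c}\right) = 0,$$

$$\frac{\partial l(c,k,s)}{\partial s} = -\frac{rc}{s} + (k+1)\sum_{i \in F}\frac{c\,t_i^{c}\,s^{-(c+1)}}{1 + (t_i/s)^{c}} + k\sum_{i \in C}\frac{c\,t_i^{c}\,s^{-(c+1)}}{1 + (t_i/s)^{c}} = 0.$$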

These equations cannot be solved analytically, so statistical software such as Ox or R can be used to solve them. In this paper, the software Ox (MAXBFGS subroutine) is used to compute the maximum likelihood estimator (MLE), but a reparametrization is necessary. The transformations $c = 1/\sigma$ and $s = \exp(\mu)$ can be used.
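The censored log-likelihood of this section can be written as a short function that a numerical optimizer could maximize. The sketch below is illustrative and is not the authors' Ox code; the function names and the coding `delta[i] = 1` for a failure and `0` for a censored time are assumptions. It also uses the fact that, for fixed $c$ and $s$, the score equation for $k$ has the explicit solution $k = r / \sum_{i} \log(1 + (t_i/s)^c)$, with the sum taken over all observations.

```python
import math


def burr12_loglik(c, k, s, times, delta):
    """Log-likelihood l(c, k, s) for right-censored Burr XII data."""
    r = sum(delta)                                   # number of failures
    ll = r * math.log(k) + r * math.log(c)
    for t, d in zip(times, delta):
        w = math.log1p((t / s) ** c)                 # log(1 + (t/s)^c)
        if d == 1:                                   # uncensored observation
            ll += -(k + 1) * w + (c - 1) * math.log(t) - c * math.log(s)
        else:                                        # right-censored observation
            ll += -k * w
    return ll


def score_k(c, k, s, times, delta):
    """Partial derivative of the log-likelihood with respect to k."""
    return sum(delta) / k - sum(math.log1p((t / s) ** c) for t in times)


def k_hat_given(c, s, times, delta):
    """Root of score_k for fixed c and s (closed form)."""
    return sum(delta) / sum(math.log1p((t / s) ** c) for t in times)


def burr12_loglik_reparam(sigma, mu, k, times, delta):
    """Same log-likelihood under c = 1/sigma and s = exp(mu)."""
    return burr12_loglik(1.0 / sigma, k, math.exp(mu), times, delta)
```

Working with $(\sigma, \mu)$, and possibly $\log k$ and $\log \sigma$, gives the optimizer an unconstrained or less constrained search space, which is one plausible reason for the reparametrization mentioned above.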


3 Log-Burr XII regression models

3.1 Log-location-scale regression model

In many practical applications, lifetimes are affected by variables referred to as explanatory variables or covariates, such as cholesterol level, blood pressure and many others. It is therefore important to explore the relationship between the lifetime and the explanatory variables, and an approach based on a regression model can be used.

The covariate vector is denoted by $x = (x_1, x_2, \ldots, x_p)^T$, which is related to the response $Y = \log(T)$ through a regression model.

Consider the transformations $c = 1/\sigma$ and $s = \exp(\mu)$. It follows that the density function of $Y$ can be written as

$$f(y; k, \sigma, \mu) = \frac{k}{\sigma}\left(1 + \exp\left(\frac{y - \mu}{\sigma}\right)\right)^{-(k+1)}\exp\left(\frac{y - \mu}{\sigma}\right), \qquad -\infty < y < \infty, \tag{3}$$

where $k > 0$, $\sigma > 0$ and $-\infty < \mu < \infty$. The survival function is given by

$$S(y) = \left[1 + \exp\left(\frac{y - \mu}{\sigma}\right)\right]^{-k}.$$

Besides, we have the following important theorem.

Theorem 1: For the variable $Y$ the moment generating function (mgf) is given by

$$M_Y(t) = k\,s^{t}\,B\!\left[\frac{t}{c} + 1,\; k - \frac{t}{c}\right], \qquad \text{if } kc > t,$$

where $B[a, b]$ is the complete beta function (proof given in Appendix B).

Hence the mean of $Y$ is given by

$$E(Y) = \mu + \sigma\left[\psi(1) - \psi(k)\right],$$

where $\psi(\cdot)$ is the digamma function (see Lawless (2003)).

We can write the above model as a log-linear model

$$Y = \mu + \sigma Z,$$

where the variable $Z$ follows the density

$$f(z) = k\left(1 + \exp(z)\right)^{-(k+1)}\exp(z), \qquad -\infty < z < \infty \text{ and } k > 0. \tag{4}$$

Now it is also considered that the location parameter $\mu$ of the log-Burr XII model depends on the matrix of explanatory variables $X$, that is, $\mu_i = x_i^T\beta$. We also consider the regression model based on the log-Burr XII distribution given in (3), relating the response $Y$ and the covariate vector $x$, so that the distribution of $Y_i \mid x_i$ can be represented as

$$Y_i = x_i^T\beta + \sigma Z_i, \qquad i = 1, \ldots, n, \tag{5}$$

where $\beta = (\beta_1, \ldots, \beta_p)^T$, $\sigma > 0$ and $k > 0$ are unknown parameters, $x_i^T = (x_{i1}, x_{i2}, \ldots, x_{ip})$ is the explanatory vector and $Z$ follows the distribution in (4).

In this case, the survival function of $Y \mid x$ is given by

$$S(y \mid x) = \left[1 + \exp\left(\frac{y - x^T\beta}{\sigma}\right)\right]^{-k}.$$
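Because the survival function of $Z$ is $S_Z(z) = (1 + e^{z})^{-k}$, responses from model (5) can be simulated by inverse transform: if $U$ is uniform on $(0, 1)$, then $Z = \log(U^{-1/k} - 1)$ has density (4). The sketch below is illustrative; the function names are assumptions, and censoring is deliberately left out.

```python
import math
import random


def sample_z(k, rng):
    """Draw Z with survival function (1 + exp(z))^(-k) by inverse transform."""
    u = rng.random()
    while u == 0.0:                      # avoid log(0); random() lies in [0, 1)
        u = rng.random()
    # exp(z) = u^(-1/k) - 1, computed stably as expm1(-log(u) / k)
    return math.log(math.expm1(-math.log(u) / k))


def simulate_log_burr(X, beta, sigma, k, rng):
    """Simulate uncensored responses Y_i = x_i' beta + sigma * Z_i."""
    ys = []
    for x in X:
        mu = sum(b * xj for b, xj in zip(beta, x))
        ys.append(mu + sigma * sample_z(k, rng))
    return ys
```

Lifetimes on the original scale are recovered as `exp(y)`; a censoring mechanism would have to be added separately, for instance by taking the minimum with independently simulated censoring times.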

3.2 Estimation by maximum likelihood

Given a sample $(y_1, x_1), (y_2, x_2), \ldots, (y_n, x_n)$ of $n$ observations from model (5), where $y_i$ represents the logarithm of the survival time and $x_i$ the covariate vector associated with the $i$-th individual, the log-likelihood function can be written as

$$l(\theta) = r\log(k) - r\log(\sigma) + \sum_{i \in F} z_i - (k+1)\sum_{i \in F}\log\left(1 + \exp(z_i)\right) - k\sum_{i \in C}\log\left(1 + \exp(z_i)\right), \tag{6}$$

where $r$ is the number of uncensored observations (failures) and $z_i = (y_i - x_i^T\beta)/\sigma$. Maximum likelihood estimates for the parameter vector $\theta = (k, \sigma, \beta^T)^T$ can be obtained by maximizing the likelihood function. In this paper, the software Ox (MAXBFGS subroutine) (see Doornik, 1996) was used to compute the maximum likelihood estimates (MLE). Covariance estimates for the maximum likelihood estimators $\hat\theta$ can be obtained using the Hessian matrix. Confidence intervals and hypothesis tests can be conducted using the large-sample distribution of the MLE, which is a normal distribution with covariance matrix given by the inverse of the Fisher information, since regularity conditions are satisfied. More specifically, the asymptotic covariance matrix is given by $I^{-1}(\theta)$ with $I(\theta) = -E[\ddot{L}(\theta)]$ such that

$$\ddot{L}(\theta) = \left\{\frac{\partial^{2} l(\theta)}{\partial\theta\,\partial\theta^{T}}\right\}.$$

It is not possible to compute the Fisher information matrix $I(\theta)$ because of the censored observations (censoring is random and noninformative), but it is possible to use the matrix of second derivatives of the log-likelihood, $-\ddot{L}(\theta)$, evaluated at the MLE $\theta = \hat\theta$, which is consistent. The asymptotic normal approximation for $\hat\theta$ may be expressed as $\hat\theta^{T} \sim N_{(p+2)}\{\theta^{T};\, -\ddot{L}(\theta)^{-1}\}$, where $-\ddot{L}(\theta)$ is the $(p+2) \times (p+2)$ observed information matrix, obtained from

$$-\ddot{L}(\theta) = \begin{pmatrix} L_{kk} & L_{k\sigma} & L_{k\beta} \\ \cdot & L_{\sigma\sigma} & L_{\sigma\beta} \\ \cdot & \cdot & L_{\beta\beta} \end{pmatrix},$$

with the submatrices given in Appendix A.
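The log-likelihood (6) can be coded in a few lines and handed to any general-purpose optimizer. The following is a sketch, not the Ox implementation used in the paper; the function names and the 0/1 coding of `delta` (1 for a failure) are assumptions.

```python
import math


def log1pexp(z):
    """Numerically stable log(1 + exp(z))."""
    if z > 0:
        return z + math.log1p(math.exp(-z))
    return math.log1p(math.exp(z))


def logburr_loglik(k, sigma, beta, y, X, delta):
    """Log-likelihood (6) of the log-Burr XII regression model."""
    r = sum(delta)                                   # number of failures
    ll = r * (math.log(k) - math.log(sigma))
    for yi, xi, di in zip(y, X, delta):
        zi = (yi - sum(b * x for b, x in zip(beta, xi))) / sigma
        if di == 1:                                  # uncensored: z_i - (k+1) log(1+e^z)
            ll += zi - (k + 1) * log1pexp(zi)
        else:                                        # censored: -k log(1+e^z)
            ll -= k * log1pexp(zi)
    return ll
```

An optimizer would typically work on $(\log k, \log\sigma, \beta)$ so that the positivity constraints on $k$ and $\sigma$ are satisfied automatically; the observed information can then be approximated by numerical second derivatives at the optimum.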

3.3 A Bayesian analysis

The Bayesian method, besides being an alternative analysis, allows the incorporation of previous knowledge about the parameters through informative prior densities. When this information is not available, one considers noninformative priors. In the Bayesian approach, the information about the model parameters is obtained through the posterior marginal distributions. Two difficulties arise. The first refers to obtaining the marginal posterior distributions, and the second to the calculation of the moments of interest. Both require solving integrals that often do not have analytical solutions. In this paper we use Markov chain Monte Carlo simulation methods, such as the Gibbs sampler and the Metropolis-Hastings algorithm.

Consider the Burr XII distribution (1), censored data and the likelihood function (2) for $k$, $c$ and $s$. For a Bayesian analysis, we assume the following prior densities for $k$, $s$ and $c$:

• $k \sim \Gamma(a_1, b_1)$, $a_1$ and $b_1$ known;

• $s \sim \Gamma(a_2, b_2)$, $a_2$ and $b_2$ known;

• $c \sim \Gamma(a_3, b_3)$, $a_3$ and $b_3$ known;

where $\Gamma(a_i, b_i)$ denotes a gamma distribution with mean $a_i/b_i$, variance $a_i/b_i^{2}$ and density function given by

$$f(v; a_i, b_i) = \frac{b_i^{a_i}\,v^{a_i - 1}\exp\{-v b_i\}}{\Gamma(a_i)},$$

where $v > 0$, $a_i > 0$ and $b_i > 0$.

In the special case where $a_1 = b_1 = a_2 = b_2 = a_3 = b_3 = 0$, the noninformative case follows and, assuming independence among the parameters, the prior density for $k$, $s$ and $c$ is written as

$$\pi(k, s, c) \propto \frac{1}{k\,s\,c}.$$

We further assume independence among the parameters $k$, $s$ and $c$. The joint posterior distribution for $k$, $s$ and $c$ is given by

$$\pi(k, s, c \mid D) \propto k^{a_1 - 1}\exp\{-k b_1\}\,s^{a_2 - 1}\exp\{-s b_2\}\,c^{a_3 - 1}\exp\{-c b_3\}\left(\frac{kc}{s^{c}}\right)^{r}\prod_{i \in F}\left[\left(1 + \left(\frac{t_i}{s}\right)^{c}\right)^{-(k+1)}\right]\prod_{i \in F} t_i^{c-1}\prod_{i \in C}\left[\left(1 + \left(\frac{t_i}{s}\right)^{c}\right)^{-k}\right],$$

where $D$ denotes the data set.

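It can be shown that the conditional posterior densities are given by

$$\pi(k \mid s, c, D) \propto k^{a_1 + r - 1}\exp\{-k b_1\}\prod_{i \in F}\left[\left(1 + \left(\frac{t_i}{s}\right)^{c}\right)^{-(k+1)}\right]\prod_{i \in C}\left[\left(1 + \left(\frac{t_i}{s}\right)^{c}\right)^{-k}\right],$$

$$\pi(s \mid k, c, D) \propto s^{a_2 - cr - 1}\exp\{-s b_2\}\prod_{i \in F}\left[\left(1 + \left(\frac{t_i}{s}\right)^{c}\right)^{-(k+1)}\right]\prod_{i \in C}\left[\left(1 + \left(\frac{t_i}{s}\right)^{c}\right)^{-k}\right],$$

$$\pi(c \mid k, s, D) \propto c^{a_3 + r - 1}\exp\{-c b_3\}\,s^{-cr}\prod_{i \in F}\left[\left(1 + \left(\frac{t_i}{s}\right)^{c}\right)^{-(k+1)}\right]\prod_{i \in F} t_i^{c-1}\prod_{i \in C}\left[\left(1 + \left(\frac{t_i}{s}\right)^{c}\right)^{-k}\right].$$

Observe that we need to use the Metropolis-Hastings algorithm to generate the variables $k$, $s$ and $c$ from the respective conditional posterior densities, since their forms are somewhat complex.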

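For Bayesian inference under model (5), assume the following prior densities for $\sigma$, $k$ and $\beta_j$:

• $k \sim \Gamma(c_1, d_1)$, $c_1$ and $d_1$ known;

• $\sigma \sim \text{Inverse-}\Gamma(c_2, d_2)$, $c_2$ and $d_2$ known;

• $\beta_j \sim N(\mu_{0j}, \sigma_{0j}^{2})$, $\mu_{0j}$ and $\sigma_{0j}^{2}$ known, $j = 0, \ldots, p$.

Noninformative priors, assuming independence among the parameters, follow by considering $c_1 = c_2 = d_1 = d_2 = 0$ and $\sigma_{0j}$ large.

We again assume independence among the parameters. The joint posterior distribution for $\sigma$, $k$ and $\beta$ is given by

$$\pi(\sigma, k, \beta^{T} \mid D) \propto k^{c_1 - 1}\exp\{-k d_1\}\,\sigma^{-(c_2 + 1)}\exp\left\{-\frac{d_2}{\sigma}\right\}\exp\left\{-\frac{1}{2}\sum_{j=0}^{p}\left(\frac{\beta_j - \mu_{0j}}{\sigma_{0j}}\right)^{2}\right\}\left(\frac{k}{\sigma}\right)^{r}\exp\left\{\sum_{i \in F} z_i\right\}\prod_{i \in F}\left[(1 + \exp\{z_i\})^{-(k+1)}\right]\prod_{i \in C}\left[(1 + \exp\{z_i\})^{-k}\right],$$

where $z_i = (y_i - x_i^{T}\beta)/\sigma$.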

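It can be shown that the conditional posterior distributions are given by

$$\pi(k \mid \sigma, \beta^{T}, D) \propto k^{c_1 + r - 1}\exp\{-k d_1\}\prod_{i \in F}\left[(1 + \exp\{z_i\})^{-(k+1)}\right]\prod_{i \in C}\left[(1 + \exp\{z_i\})^{-k}\right],$$

$$\pi(\sigma \mid k, \beta^{T}, D) \propto \sigma^{-(c_2 + r + 1)}\exp\left\{-\frac{d_2}{\sigma}\right\}\exp\left\{\sum_{i \in F} z_i\right\}\prod_{i \in F}\left[(1 + \exp\{z_i\})^{-(k+1)}\right]\prod_{i \in C}\left[(1 + \exp\{z_i\})^{-k}\right],$$

$$\pi(\beta_j \mid k, \sigma, \beta_{-j}, D) \propto \exp\left\{-\frac{1}{2}\left(\frac{\beta_j - \mu_{0j}}{\sigma_{0j}}\right)^{2}\right\}\exp\left\{\sum_{i \in F} z_i\right\}\prod_{i \in F}\left[(1 + \exp\{z_i\})^{-(k+1)}\right]\prod_{i \in C}\left[(1 + \exp\{z_i\})^{-k}\right].$$

Observe that we need to use the Metropolis-Hastings algorithm to generate from the posterior conditional distributions of $k$, $\sigma$ and $\beta_j$ ($j = 0, \ldots, p$).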
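A single Metropolis-Hastings update of the kind referred to above can be sketched as follows. This is an illustration, not the authors' sampler: the random-walk proposal on the log scale, the step size and all function names are assumptions. The target used here is the conditional kernel of $k$ in the Burr XII model without covariates, $\pi(k \mid s, c, D)$, with a $\Gamma(a_1, b_1)$ prior.

```python
import math
import random


def log_cond_k(k, times, delta, s, c, a1, b1):
    """Log of the conditional posterior kernel of k, up to an additive constant."""
    r = sum(delta)
    total = 0.0
    for t, d in zip(times, delta):
        w = math.log1p((t / s) ** c)                 # log(1 + (t/s)^c)
        total -= (k + 1) * w if d == 1 else k * w
    return (a1 + r - 1) * math.log(k) - b1 * k + total


def metropolis_hastings(log_target, x0, n_iter, step, rng):
    """Random-walk Metropolis-Hastings on log(x) for a positive parameter x."""
    x, lp = x0, log_target(x0)
    draws = []
    for _ in range(n_iter):
        prop = x * math.exp(step * rng.gauss(0.0, 1.0))   # symmetric move on log scale
        lp_prop = log_target(prop)
        # the extra log terms are the Jacobian of the log transformation
        log_ratio = (lp_prop + math.log(prop)) - (lp + math.log(x))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x, lp = prop, lp_prop
        draws.append(x)
    return draws
```

In a full sampler, updates of this form would be cycled over all parameters in turn. As a check on the sketch, note that as a function of $k$ alone the kernel above has the form of a gamma density with shape $a_1 + r$ and rate $b_1 + \sum_i \log(1 + (t_i/s)^c)$, so the sampled mean can be compared with the exact one.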

3.4 The Jackknife Estimator for the model

The idea of jackknifing is to transform the problem of estimating any population parameter into the problem of estimating a population mean. So, what is done when estimating a mean value is carried out in this method, but from an unusual point of view. In this paper, we used this method as an alternative way to estimate the population parameters.

Suppose that $T_1, T_2, \ldots, T_n$ is a random sample of $n$ values and the sample mean is given by

$$\bar{T} = \frac{\sum_{i=1}^{n} T_i}{n}$$

and is used to estimate the mean of the population.

Now, the sample mean is calculated with the $l$th observation omitted,

$$\bar{T}_{-l} = \frac{\sum_{i=1}^{n} T_i - T_l}{n - 1}.$$

From the two expressions above we obtain

$$T_l = n\bar{T} - (n-1)\bar{T}_{-l}. \tag{7}$$

In a general situation, consider that $\theta$ is a parameter estimated by $\hat{E}(T_1, T_2, \ldots, T_n)$ and, for ease of notation, drop $(T_1, T_2, \ldots, T_n)$. Then $\hat{E}_{-l}$ is calculated, which is the estimate obtained with the observation $T_l$ omitted. It follows, by equation (7), that pseudo-values can be calculated as

$$\hat{E}_l^{*} = n\hat{E} - (n-1)\hat{E}_{-l}, \qquad l = 1, \ldots, n.$$

The average of the pseudo-values is given by

$$\hat{E}^{*} = \frac{\sum_{l=1}^{n}\hat{E}_l^{*}}{n},$$

which is the jackknife estimate of $\theta$.

Manly (1997) suggests that an approximate $100(1-\alpha)\%$ confidence interval for $\theta$ is given by $\hat{E}^{*} \pm t_{\alpha/2, n-1}\,s/\sqrt{n}$, where $s$ is the standard deviation of the pseudo-values and $t_{\alpha/2, n-1}$ is the value that is exceeded with probability $\alpha/2$ for the $t$ distribution with $(n-1)$ degrees of freedom. The jackknife estimator has the effect of removing bias of order $1/n$.

The jackknife calculations for the log-Burr XII regression model are carried out for $k$, $\sigma$ and $\beta_j$ ($j = 0, \ldots, p$), and confidence intervals are calculated separately for each parameter.
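The pseudo-value construction above works for any estimator that maps a sample to a number. The following generic sketch is illustrative; the function name `jackknife` and its return convention are assumptions. For the regression model, `estimator` would refit the model on the reduced sample and return one parameter estimate, which is the costly step.

```python
import math


def jackknife(sample, estimator):
    """Return (jackknife estimate, standard error, pseudo-values)."""
    n = len(sample)
    full = estimator(sample)
    pseudo = []
    for l in range(n):
        reduced = sample[:l] + sample[l + 1:]        # sample without observation l
        pseudo.append(n * full - (n - 1) * estimator(reduced))
    estimate = sum(pseudo) / n
    # standard deviation of the pseudo-values, with divisor n - 1
    s = math.sqrt(sum((p - estimate) ** 2 for p in pseudo) / (n - 1))
    return estimate, s / math.sqrt(n), pseudo
```

The confidence interval then follows as `estimate ± t * se`, with `t` the appropriate quantile of the $t$ distribution with $n - 1$ degrees of freedom, taken from tables or a statistics library. When `estimator` is the sample mean, the pseudo-values reproduce the observations themselves, as equation (7) states.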

4 Sensitivity analysis

4.1 Global influence

A first tool to perform sensitivity analysis, as stated before, is global influence based on case deletion. Case deletion is a common approach to study the effect of dropping the $i$th case from the data set. The case-deletion model for model (5) is given by

$$Y_l = x_l^{T}\beta + \sigma Z_l, \qquad l = 1, 2, \ldots, n, \quad l \ne i. \tag{8}$$

In the following, a quantity with subscript "$(i)$" means the original quantity with the $i$th case deleted. For model (8), the log-likelihood function of $\theta$ is denoted by $l_{(i)}(\theta)$. Let $\hat\theta_{(i)} = (\hat k_{(i)}, \hat\sigma_{(i)}, \hat\beta_{(i)}^{T})^{T}$ be the ML estimate of $\theta$ from $l_{(i)}(\theta)$. To assess the influence of the $i$th case on the ML estimate $\hat\theta = (\hat k, \hat\sigma, \hat\beta^{T})^{T}$, the basic idea is to compare the difference between $\hat\theta_{(i)}$ and $\hat\theta$. If deletion of a case seriously influences the estimates, more attention should be paid to that case. Hence, if $\hat\theta_{(i)}$ is far from $\hat\theta$, then the $i$th case is regarded as an influential observation. A first measure of global influence is defined as the standardized norm of $\hat\theta_{(i)} - \hat\theta$ (generalized Cook distance)

$$GD_i(\theta) = (\hat\theta_{(i)} - \hat\theta)^{T}\left[-\ddot{L}(\hat\theta)\right](\hat\theta_{(i)} - \hat\theta).$$

Another alternative is to assess the values $GD_i(\beta)$ and $GD_i(k, \sigma)$; such values reveal the impact of the $i$th case on the estimates of $\beta$ and $(k, \sigma)$, respectively. Another popular measure of the difference between $\hat\theta_{(i)}$ and $\hat\theta$ is the likelihood distance

$$LD_i(\theta) = 2\left\{l(\hat\theta) - l(\hat\theta_{(i)})\right\}.$$

Besides, we can also compute $\hat\beta_j - \hat\beta_{j(i)}$ ($j = 1, 2, \ldots, p$) to see the difference between $\hat\beta$ and $\hat\beta_{(i)}$. Alternative global influence measures are possible. One could think of the behavior of a test statistic, such as a Wald test for a covariate or censoring effect, under a case-deletion scheme.

4.2 Local influence

As a second tool for sensitivity analysis, the local influence method will now be described for log-Burr XII regression models with censored data. Local influence calculation can be carried out in model (5). If the likelihood displacement $LD(\omega) = 2\{l(\hat\theta) - l(\hat\theta_{\omega})\}$ is used, where $\hat\theta_{\omega}$ denotes the MLE under the perturbed model, the normal curvature for $\theta$ at the direction $d$, $\|d\| = 1$, is given by $C_d(\theta) = 2|d^{T}\Delta^{T}\ddot{L}(\theta)^{-1}\Delta d|$, where $\Delta$ is a $(p+2) \times n$ matrix that depends on the perturbation scheme and whose elements are given by $\Delta_{ji} = \partial^{2} l(\theta|\omega)/\partial\theta_j\,\partial\omega_i$, $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, p+2$, evaluated at $\hat\theta$ and $\omega_0$, where $\omega_0$ is the no-perturbation vector (see Cook, 1986). For the log-Burr XII model the elements of $-\ddot{L}(\theta)$ are given in Appendix A. We can calculate the normal curvatures $C_d(\theta)$, $C_d(k)$, $C_d(\sigma)$ and $C_d(\beta)$ to produce various index plots, for instance, the index plot of $d_{max}$, the eigenvector corresponding to $C_{d_{max}}$, the largest eigenvalue of the matrix $B = \Delta^{T}\ddot{L}(\theta)^{-1}\Delta$, and the index plots of $C_{d_i}(\theta)$, $C_{d_i}(k)$, $C_{d_i}(\sigma)$ and $C_{d_i}(\beta)$, named total local influence (see, for example, Lesaffre & Verbeke, 1998), where $d_i$ denotes an $n \times 1$ vector of zeros with one at the $i$th position. Thus, the curvature at the direction $d_i$ assumes the form $C_i = 2|\Delta_i^{T}\ddot{L}(\theta)^{-1}\Delta_i|$, where $\Delta_i^{T}$ denotes the $i$th row of $\Delta^{T}$. It is usual to point out those cases such that

$$C_i \ge 2\bar{C}, \qquad \bar{C} = \frac{1}{n}\sum_{i=1}^{n} C_i.$$

4.3 Curvature calculations

Next, we calculate, for three perturbation schemes, the matrix

$$\Delta = (\Delta_{ji})_{(p+2) \times n} = \left(\frac{\partial^{2} l(\theta|\omega)}{\partial\theta_j\,\partial\omega_i}\right)_{(p+2) \times n}, \qquad j = 1, 2, \ldots, p+2 \text{ and } i = 1, 2, \ldots, n,$$

considering the model defined in (5) and its log-likelihood function given by (6).

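4.3.1 Case-weights perturbation

Consider the vector of weights $\omega = (\omega_1, \omega_2, \ldots, \omega_n)^{T}$.

In this case the log-likelihood function takes the form

$$l(\theta|\omega) = \left[\log(k) - \log(\sigma)\right]\sum_{i \in F}\omega_i + \sum_{i \in F}\omega_i z_i - (k+1)\sum_{i \in F}\omega_i\log\left[1 + \exp\{z_i\}\right] - k\sum_{i \in C}\omega_i\log\left[1 + \exp\{z_i\}\right],$$

where $0 \le \omega_i \le 1$ and $\omega_0 = (1, \ldots, 1)^{T}$. Let us denote $\Delta = (\Delta_1, \ldots, \Delta_{p+2})^{T}$.

Then the elements of the vector $\Delta_1$ take the form

$$\Delta_{1i} = \begin{cases} \hat k^{-1} - \log\left[1 + \exp\{\hat z_i\}\right] & \text{if } i \in F \\ -\log\left[1 + \exp\{\hat z_i\}\right] & \text{if } i \in C. \end{cases}$$

On the other hand, the elements of the vector $\Delta_2$ can be shown to be given by

$$\Delta_{2i} = \begin{cases} -\hat\sigma^{-1}\left\{1 + \hat z_i - (\hat k + 1)\,\hat z_i\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1}\right\} & \text{if } i \in F \\ \hat k\,\hat\sigma^{-1}\hat z_i\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1} & \text{if } i \in C. \end{cases}$$

The elements of the vector $\Delta_j$, for $j = 3, \ldots, p+2$, may be expressed as

$$\Delta_{ji} = \begin{cases} -x_{ij}\,\hat\sigma^{-1}\left\{1 - (\hat k + 1)\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1}\right\} & \text{if } i \in F \\ x_{ij}\,\hat k\,\hat\sigma^{-1}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1} & \text{if } i \in C. \end{cases}$$

4.3.2 Response perturbation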

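We will consider here that each $y_i$ is perturbed as $y_{i\omega} = y_i + \omega_i S_y$, where $S_y$ is a scale factor that may be the estimated standard deviation of $Y$ and $\omega_i \in \mathbb{R}$.

Here the perturbed log-likelihood function is expressed as

$$l(\theta|\omega) = r\left[\log(k) - \log(\sigma)\right] + \sum_{i \in F} z_i^{*} - (k+1)\sum_{i \in F}\log\left[1 + \exp\{z_i^{*}\}\right] - k\sum_{i \in C}\log\left[1 + \exp\{z_i^{*}\}\right],$$

where $z_i^{*} = \left((y_i + \omega_i S_y) - x_i^{T}\beta\right)/\sigma$. In addition, the elements of the vector $\Delta_1$ take the form

$$\Delta_{1i} = \begin{cases} -S_y\,\hat\sigma^{-1}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1} & \text{if } i \in F \\ -S_y\,\hat\sigma^{-1}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1} & \text{if } i \in C. \end{cases}$$

On the other hand, the elements of the vector $\Delta_2$ can be shown to be given by

$$\Delta_{2i} = \begin{cases} -S_y\,\hat\sigma^{-2}\left\{1 - (\hat k + 1)\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1}\left(\hat z_i\left[1 + \exp\{\hat z_i\}\right]^{-1} + 1\right)\right\} & \text{if } i \in F \\ S_y\,\hat k\,\hat\sigma^{-2}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1}\left\{\hat z_i\left[1 + \exp\{\hat z_i\}\right]^{-1} + 1\right\} & \text{if } i \in C. \end{cases}$$

The elements of the vector $\Delta_j$, for $j = 3, \ldots, p+2$, may be expressed as

$$\Delta_{ji} = \begin{cases} x_{ij}\,S_y\,(\hat k + 1)\,\hat\sigma^{-2}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-2} & \text{if } i \in F \\ x_{ij}\,S_y\,\hat k\,\hat\sigma^{-2}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-2} & \text{if } i \in C. \end{cases}$$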

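4.3.3 Explanatory variable perturbation

Consider now an additive perturbation on a particular continuous explanatory variable, namely $X_t$, by making $x_{it\omega} = x_{it} + \omega_i S_x$, where $S_x$ is a scale factor and $\omega_i \in \mathbb{R}$. This perturbation scheme leads to the following expressions for the log-likelihood function and for the elements of the matrix $\Delta$.

In this case the log-likelihood function takes the form

$$l(\theta|\omega) = r\left[\log(k) - \log(\sigma)\right] + \sum_{i \in F} z_i^{*} - (k+1)\sum_{i \in F}\log\left[1 + \exp\{z_i^{*}\}\right] - k\sum_{i \in C}\log\left[1 + \exp\{z_i^{*}\}\right],$$

where $z_i^{*} = (y_i - x_i^{*T}\beta)/\sigma$ and $x_i^{*T}\beta = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_t(x_{it} + \omega_i S_x) + \cdots + \beta_p x_{ip}$.

In addition, the elements of the vector $\Delta_1$ are expressed as

$$\Delta_{1i} = \begin{cases} S_x\,\hat\beta_t\,\hat\sigma^{-1}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1} & \text{if } i \in F \\ S_x\,\hat\beta_t\,\hat\sigma^{-1}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1} & \text{if } i \in C, \end{cases}$$

the elements of the vector $\Delta_2$ are expressed as

$$\Delta_{2i} = \begin{cases} \hat\beta_t\,S_x\,\hat\sigma^{-2}\left\{1 - (\hat k + 1)\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1}\left(1 + \hat z_i\left[1 + \exp\{\hat z_i\}\right]^{-1}\right)\right\} & \text{if } i \in F \\ -\hat\beta_t\,S_x\,\hat k\,\hat\sigma^{-2}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1}\left(1 + \hat z_i\left[1 + \exp\{\hat z_i\}\right]^{-1}\right) & \text{if } i \in C, \end{cases}$$

the elements of the vector $\Delta_j$, for $j = 3, \ldots, p+2$ and $j \ne t$, take the forms

$$\Delta_{ji} = \begin{cases} -x_{ij}\,S_x\,\hat\beta_t\,(\hat k + 1)\,\hat\sigma^{-2}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-2} & \text{if } i \in F \\ -x_{ij}\,S_x\,\hat\beta_t\,\hat k\,\hat\sigma^{-2}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-2} & \text{if } i \in C, \end{cases}$$

and the elements of the vector $\Delta_t$ are given by

$$\Delta_{ti} = \begin{cases} -S_x\,\hat\sigma^{-1} - (\hat k + 1)\,S_x\,\hat\sigma^{-1}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1}\left\{x_{it}\,\hat\beta_t\,\hat\sigma^{-1}\left[1 + \exp\{\hat z_i\}\right]^{-1} - 1\right\} & \text{if } i \in F \\ -\hat k\,S_x\,\hat\sigma^{-1}\exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1}\left\{x_{it}\,\hat\beta_t\,\hat\sigma^{-1}\left[1 + \exp\{\hat z_i\}\right]^{-1} - 1\right\} & \text{if } i \in C. \end{cases}$$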

4.4 Generalized Leverage

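Let $l(\theta)$ denote the log-likelihood function from the postulated model in equation (5), $\hat\theta$ the MLE of $\theta$ and $\mu$ the expectation of $Y = (Y_1, Y_2, \ldots, Y_n)^{T}$; then $\hat y = \mu(\hat\theta)$ will be the predicted response vector.

The main idea behind the concept of leverage (see, for instance, Cook and Weisberg, 1982; Wei et al., 1998) is that of evaluating the influence of $y_i$ on its own predicted value. This influence may well be represented by the derivative $\partial\hat y_i/\partial y_i$, which in normal linear regression equals $h_{ii}$, the $i$-th principal diagonal element of the projection matrix $H = X(X^{T}X)^{-1}X^{T}$, where $X$ is the model matrix. Extensions to more general regression models have been given, for instance, by St. Laurent and Cook (1992), Wei et al. (1998) and Paula (1999), the latter when $\theta$ is restricted by inequalities. Hence, it follows from Wei et al. (1998) that the $n \times n$ matrix $(\partial\hat y/\partial y^{T})$ of generalized leverage may be expressed as

$$GL(\hat\theta) = \left\{D_{\theta}\left[-\ddot{L}(\theta)\right]^{-1}\ddot{L}_{\theta y}\right\}$$

evaluated at $\theta = \hat\theta$, where $D_{\theta} = \left(\partial E[Y_i]/\partial k,\ \partial E[Y_i]/\partial\sigma,\ x_{ij}\right)$ and

$$\ddot{L}_{\theta y} = \frac{\partial^{2} l(\theta)}{\partial\theta\,\partial y^{T}} = \left(\ddot{L}_{k y_i},\ \ddot{L}_{\sigma y_i},\ \ddot{L}_{\beta_j y_i}\right)^{T}$$

with

$$\ddot{L}_{k y_i} = \begin{cases} -\hat\sigma^{-1}\hat h_i & \text{if } i \in F \\ -\hat\sigma^{-1}\hat h_i & \text{if } i \in C, \end{cases}$$

$$\ddot{L}_{\sigma y_i} = \begin{cases} \hat\sigma^{-2}\left\{-1 + (\hat k + 1)\,\hat h_i\left[1 + \hat z_i + \exp\{\hat z_i\}\right]\left[1 + \exp\{\hat z_i\}\right]^{-1}\right\} & \text{if } i \in F \\ \hat\sigma^{-2}\,\hat k\,\hat h_i\left[1 + \hat z_i + \exp\{\hat z_i\}\right]\left[1 + \exp\{\hat z_i\}\right]^{-1} & \text{if } i \in C, \end{cases}$$

$$\ddot{L}_{\beta_j y_i} = \begin{cases} x_{ij}\,\hat\sigma^{-2}(\hat k + 1)\,\hat h_i\left[1 + \exp\{\hat z_i\}\right]^{-1} & \text{if } i \in F \\ x_{ij}\,\hat\sigma^{-2}\,\hat k\,\hat h_i\left[1 + \exp\{\hat z_i\}\right]^{-1} & \text{if } i \in C, \end{cases}$$

where $\hat h_i = \exp\{\hat z_i\}\left[1 + \exp\{\hat z_i\}\right]^{-1}$.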

5 Residual analysis

In order to study departures from the error assumption as well as the presence of outliers, we will consider the deviance residual proposed by Barlow and Prentice (1988) (see also Therneau et al., 1990) and the martingale-type residual.

5.1 Martingale-type residual

This residual was introduced in the counting process framework and can be written in log-Burr XII regression models as

$$r_{M_i} = \begin{cases} 1 - \hat k\log\left(1 + \exp\{\hat z_i\}\right) & \text{if } i \in F \\ -\hat k\log\left(1 + \exp\{\hat z_i\}\right) & \text{if } i \in C, \end{cases}$$

where $\hat z_i = (y_i - x_i^{T}\hat\beta)/\hat\sigma$. Because the distribution of $r_{M_i}$ is skewed, with maximum value $+1$ and minimum value $-\infty$, transformations to achieve a more normal-shaped form would be more appropriate for residual analysis.
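The martingale-type residual is simple to compute once the model has been fitted. The sketch below is illustrative; the function names are assumptions, `delta` is coded as 1 for a failure and 0 for a censored observation, and `z` is the standardized residual $(y_i - x_i^T\hat\beta)/\hat\sigma$ evaluated at the fitted values.

```python
import math


def log1pexp(z):
    """Numerically stable log(1 + exp(z))."""
    if z > 0:
        return z + math.log1p(math.exp(-z))
    return math.log1p(math.exp(z))


def martingale_residual(z, delta, k):
    """Martingale-type residual: delta - k * log(1 + exp(z))."""
    return delta - k * log1pexp(z)
```

Since $k\log(1 + e^{z}) > 0$, the residual is always below 1 for failures and below 0 for censored observations, which reflects the skewness noted above.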

5.2 Deviance residual

Another possibility is to use the deviance residual (see, for instance, the definition in McCullagh and Nelder, 1989, Section 2.4), which has been widely applied in generalized linear models (GLMs). Various authors have investigated the use of deviance residuals in GLMs (see, for instance, Williams, 1987; Hinkley et al., 1991; Paula, 1995) as well as in other regression models (see, for example, Fahrmeir and Tutz, 1994). In log-Burr XII regression models the deviance residual is expressed here as

$$r_{D_i} = \begin{cases} -\left\{-2\left[-\hat k\log\left(1 + \exp\{\hat z_i\}\right) + \log\left(1 + \hat k\log\left(1 + \exp\{\hat z_i\}\right)\right)\right]\right\}^{1/2} & \text{if } i \in F \\ \operatorname{sign}\left[1 - \hat k\log\left(1 + \exp\{\hat z_i\}\right)\right]\left\{-2 + 2\hat k\log\left(1 + \exp\{\hat z_i\}\right)\right\}^{1/2} & \text{if } i \in C. \end{cases}$$

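5.3 Modified Deviance Residual

We propose a change in the deviance residual, which can be written as

$$r_{MD_i} = \delta_i + r_{D_i},$$

where $\delta_i = 0$ denotes a censored observation, $\delta_i = 1$ an uncensored observation, and $r_{D_i}$ is the deviance residual defined in Section 5.2.

In log-Burr XII regression models the modified deviance residual is given by

$$r_{MD_i} = \begin{cases} 1 - \left\{-2\left[-\hat k\log\left(1 + \exp\{\hat z_i\}\right) + \log\left(1 + \hat k\log\left(1 + \exp\{\hat z_i\}\right)\right)\right]\right\}^{1/2} & \text{if } i \in F \\ \operatorname{sign}\left[1 - \hat k\log\left(1 + \exp\{\hat z_i\}\right)\right]\left\{-2 + 2\hat k\log\left(1 + \exp\{\hat z_i\}\right)\right\}^{1/2} & \text{if } i \in C. \end{cases}$$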

5.4 Impact of the detected influential observations

To reveal the impact of the detected influential observations, we estimate the parameters again without the influential observations. Let $\hat\theta$ and $\hat\theta_0$ be the maximum likelihood estimates of the models obtained from the data sets with and without the influential observations, respectively. Lee, Lu and Song (2006) define the following two quantities to measure the difference between $\hat\theta$ and $\hat\theta_0$:

$$TRC = \sum_{i=1}^{n_p}\left|\frac{\hat\theta_i - \hat\theta_{0i}}{\hat\theta_i}\right| \qquad \text{and} \qquad MRC = \max_i\left|\frac{\hat\theta_i - \hat\theta_{0i}}{\hat\theta_i}\right|,$$

where TRC is the total relative change, MRC the maximum relative change and $n_p = 6$ is the number of parameters, and the likelihood displacement $LD_I(\theta) = 2\{l(\hat\theta) - l(\hat\theta_I)\}$, where $\hat\theta_I$ denotes the MLE of $\theta$ after the set $I$ of influential observations has been removed (see Cook, Peña and Weisberg, 1988).

Now, the same number of observations is randomly selected from the non-influential observations, and TRC, MRC and $LD_I$ are calculated again. The results can then be compared: if there is a difference between them, the observations are influential.
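The two relative-change measures can be computed from the two vectors of estimates. The sketch below is illustrative, and the function name is an assumption; it presumes that no component of the full-data estimate is zero.

```python
def relative_changes(theta_full, theta_reduced):
    """Return (TRC, MRC) comparing estimates with and without a set of cases.

    theta_full:    estimates from the complete data set
    theta_reduced: estimates after removing the selected observations
    """
    rel = [abs((a - b) / a) for a, b in zip(theta_full, theta_reduced)]
    return sum(rel), max(rel)
```

Calling the function once for the detected influential cases and once for a randomly chosen set of the same size gives the two pairs of values that the comparison described above relies on.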

6 Application

We provide an application of the results derived in the previous sections using real data.

The required numerical evaluations were implemented using the program Ox (see Doornik, 1996).
