UNIVERSIDADE DE SÃO PAULO
Instituto de Ciências Matemáticas e de Computação
ISSN 0103-2569
LOG-BURR XII REGRESSION MODELS WITH CENSORED DATA
GIOVANA OLIVEIRA SILVA, VICENTE G. CANCHO, EDWIN M. M. ORTEGA, MAURICIO LIMA BARRETO
Nº 297
RELATÓRIOS TÉCNICOS
São Carlos - SP Mai./2007
Log-Burr XII Regression Models with Censored Data
Giovana Oliveira Silva* (Universidade de São Paulo)
Edwin M. M. Ortega† (Universidade de São Paulo)
Vicente G. Cancho‡ (Universidade de São Paulo)
Mauricio Lima Barreto§ (Universidade Federal da Bahia)
Abstract
In survival analysis applications, the failure rate function frequently has a unimodal shape, and in this situation the log-normal or log-logistic distributions are commonly used. In this paper, a regression model based on the Burr XII distribution is proposed for modeling data that have a unimodal failure rate function. The Burr XII distribution has the advantage over the log-normal that its survival function can be written in closed form, and the log-logistic distribution is a special case of the Burr XII distribution. Assuming censored data, we consider a classical analysis, a Bayesian analysis assuming noninformative priors, and a jackknife estimator for the parameters of the model. The Bayesian approach uses Markov chain Monte Carlo methods with Metropolis-Hastings steps to obtain the posterior summaries of interest. In addition, we use sensitivity analysis to detect influential or outlying observations, and residual analysis to check assumptions in the model, such as departures from the error assumptions. The relevance of the approach is illustrated with a real data set.
Keywords: Burr distribution; regression models; censored data; local influence; generalized leverage; residual analysis.
1 Introduction
We consider in this paper a data set provided by the Instituto de Saúde Coletiva, Universidade Federal da Bahia. The study was designed to evaluate the effect of vitamin A supplementation on recurrent diarrheal episodes in small children (see Barreto et al., 1994). We aim to model the treatment effect on the time to the occurrence of diarrhea. Moreover, the censoring times are random.
*Address: ESALQ, Universidade de São Paulo, Piracicaba, Brasil. E-mail: giosilva@esalq.usp.br
†Address: ESALQ, Universidade de São Paulo, Piracicaba, São Paulo, Brasil. E-mail: edwin@esalq.usp.br
‡Address: ICMC, Universidade de São Paulo, São Carlos, Brasil. E-mail: garibay@icmc.usp.br
§Address: ISC, Universidade Federal da Bahia, Salvador, Brasil. E-mail: mauricio@ufba.br
Address for correspondence: Departamento de Ciências Exatas, USP, Av. Pádua Dias 11, Caixa Postal 9, 13418-900 Piracicaba, São Paulo, Brazil. E-mail: edwin@esalq.usp.br
In many applications there is qualitative information about the shape of the failure rate function, which can help in selecting a particular model. In this context, a device called the total time on test (TTT) plot (Aarset, 1987) is useful. The TTT plot is obtained by plotting
$$G(r/n) = \frac{\left(\sum_{i=1}^{r} T_{i:n}\right) + (n-r)\,T_{r:n}}{\sum_{i=1}^{n} T_{i:n}},$$
where $r = 1,\ldots,n$ and $T_{i:n}$, $i = 1,\ldots,n$, are the order statistics of the sample, against $r/n$ (Mudholkar et al., 1996). For these data, the TTT plot indicates a unimodal failure rate function.
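As an illustration of the TTT transform, the following minimal Python sketch computes the points $(r/n, G(r/n))$ for a sample without censoring. The function name `ttt_transform` and the toy data are assumptions made for this example; they are not the data analyzed in the paper.

```python
def ttt_transform(times):
    """Return the points (r/n, G(r/n)) of the scaled TTT plot."""
    ordered = sorted(times)          # order statistics T_{1:n} <= ... <= T_{n:n}
    n = len(ordered)
    total = sum(ordered)             # denominator: sum of all order statistics
    points = []
    partial = 0.0
    for r, t in enumerate(ordered, start=1):
        partial += t                 # running sum of T_{i:n} for i <= r
        g = (partial + (n - r) * t) / total
        points.append((r / n, g))
    return points


# Toy sample (hypothetical values, for illustration only).
sample = [1.0, 2.0, 3.0, 4.0]
points = ttt_transform(sample)
```

Plotting the second coordinate against the first gives the TTT curve, whose shape is then inspected to judge the form of the failure rate function.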
It is known that the log-normal distribution is a popular model for survival times when the failure rate function is unimodal, and the log-logistic distribution is often used as an alternative to the log-normal. The main purpose of this paper is to present another distribution that can be viewed as a more useful and flexible alternative.
We propose to use the Burr XII distribution for modelling survival times as a viable alternative to the log-normal. The Burr XII distribution has the advantage that its survival function can be written in closed form. Besides, the log-logistic distribution is a special case of the Burr XII distribution. The Burr XII distribution was used in reliability analysis by Zimmer et al. (1998), but for data sets without covariates.
We consider a classical analysis for the log-Burr XII regression model. The inferential part is carried out using the asymptotic distribution of the maximum likelihood estimators, which may be difficult to justify when the sample is small. As an alternative to the classical analysis, we explore Markov chain Monte Carlo (MCMC) methods to develop a Bayesian inference for the log-Burr XII regression model, and we also use the jackknife estimator. Neither the Bayesian nor the jackknife approach relies on the asymptotic distribution of the maximum likelihood estimators.
After modelling, it is important to check the assumptions of the model and to conduct a robustness study to detect influential or extreme observations that can distort the results of the analysis.
The examination of residuals is used to check the assumptions of the model. Numerous approaches have been proposed in the literature to detect influential or outlying observations. An efficient way to detect influential observations is diagnostic analysis. Cook (1986) uses this idea to motivate his assessment of local influence. He suggests that more confidence can be placed in a model that is relatively stable under small modifications.
The best-known perturbation schemes are based on case deletion (Cook and Weisberg, 1982; Xie and Wei, 2007), in which the effect of completely removing cases from the analysis is studied. This reasoning forms the basis for the global influence measures introduced in Section 4.1, and with it one can determine which subjects might be influential for the analysis. On the other hand, with case deletion all the information from a single subject is deleted at once, so it is hard to tell whether that subject influences a specific aspect of the model. A solution to this problem can be found in a quite different paradigm, the local influence approach, in which one again investigates how the results of an analysis change under small perturbations of the model, but where these perturbations can be given specific interpretations. Several authors have investigated the assessment of local influence in survival analysis models: for instance, Pettit and Bin Daud (1989) investigate local influence in proportional hazards regression models, Escobar and Meeker (1992) adapt local influence methods to regression analysis with censoring, Ortega et al. (2003) consider the problem of assessing local influence in generalized log-gamma regression models with censored observations, Leiva-Sanchez et al. (2006) investigate local influence in log-Birnbaum-Saunders regression models with censored data and, more recently, Ortega et al. (2006) derive curvature calculations under various perturbation schemes in log-exponentiated-Weibull regression models with censored data. We develop a similar methodology to detect influential subjects in log-Burr XII regression models with censored data, which is presented in Section 4.2.
Finally, we apply the generalized leverage methodology developed by Wei et al. (1998).
Section 2 gives a brief study of the Burr XII distribution together with the inferential part of this model. In Section 3 we propose a log-Burr XII regression model, together with the maximum likelihood estimators, Bayesian inference and the jackknife estimator. In Section 4 we use several diagnostic measures, considering three perturbation schemes, case deletion and the generalized leverage in log-Burr XII regression models with censored observations. In Section 5 we present residuals from a fitted model using the martingale residual proposed by Therneau et al. (1990), and we propose a modified deviance residual for the log-Burr XII regression model. Finally, in Section 6 the real data set is analyzed, and the conclusions appear in Section 7.
2 The Burr XII distribution
The Burr XII distribution, used in Zimmer et al. (1998), with parameters $s$, $c$ and $k$, assumes that the lifetime $T$ has density function
$$f(t; s, k, c) = c\,k\,\frac{t^{c-1}}{s^{c}}\left[1 + \left(\frac{t}{s}\right)^{c}\right]^{-(k+1)}, \quad t > 0, \tag{1}$$
where $k > 0$ and $c > 0$ are shape parameters and $s > 0$ is a scale parameter. The survival function corresponding to the random variable $T$ with Burr XII density is given by
$$S(t; s, k, c) = P(T \ge t) = \left[1 + \left(\frac{t}{s}\right)^{c}\right]^{-k}.$$
The corresponding failure rate function has the following form:
$$h(t; s, k, c) = \frac{f(t; s, k, c)}{S(t; s, k, c)} = \frac{c\,k\,(t/s)^{c-1}}{s\left[1 + (t/s)^{c}\right]}.$$
2.1 Characterizing the failure rate function
According to Zimmer et al. (1998), the failure rate function of the Burr XII distribution is decreasing when $c \le 1$; when $c > 2$ it reaches a maximum and then decreases, and the range of values over which it is increasing can be manipulated using $s$. When $c$ is between 1 and 2, the failure rate function can be made essentially constant over much of the range of the distribution, depending on the value of $s$. To study the shape of the failure rate function, we compute its derivative, which can be written as
$$h'(t; c, k, s) = \frac{c\,k\,t^{c-2}}{s^{c}\left[1 + (t/s)^{c}\right]^{2}}\left[c - 1 - \left(\frac{t}{s}\right)^{c}\right].$$
To study this function, two situations must be considered:
• $c \le 1$
For any $t > 0$, $h'(t) < 0$ and therefore $h(t)$ is a decreasing function.
• $c > 1$
Setting $h'(t^*) = 0$ gives $c - 1 - (t^*/s)^{c} = 0$; hence the critical point is $t^* = s(c-1)^{1/c}$.
When $t < t^*$, $h'(t) > 0$ and the failure rate function is increasing; when $t > t^*$, $h'(t) < 0$ and the failure rate function is decreasing. Hence $t^*$ is a maximum point and the failure rate function has a unimodal shape. Besides, $h(t) \to 0$ as $t \to 0$ or $t \to \infty$.
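The two cases above can be checked numerically. The sketch below is only an illustration: the function names `burr_hazard` and `hazard_mode` and the parameter values are assumptions chosen for the example, not values used in the paper.

```python
def burr_hazard(t, s, k, c):
    """Failure rate h(t; s, k, c) = c k (t/s)^(c-1) / (s [1 + (t/s)^c])."""
    u = (t / s) ** c
    return c * k * (t / s) ** (c - 1) / (s * (1.0 + u))


def hazard_mode(s, c):
    """Critical point t* = s (c - 1)^(1/c); it exists only when c > 1."""
    if c <= 1:
        raise ValueError("the failure rate is decreasing when c <= 1")
    return s * (c - 1.0) ** (1.0 / c)


# Illustrative (hypothetical) parameter values with c > 1.
s, k, c = 2.0, 1.5, 3.0
t_star = hazard_mode(s, c)
h_star = burr_hazard(t_star, s, k, c)
```

Evaluating `burr_hazard` slightly to the left and to the right of `t_star` should give smaller values than `h_star`, in agreement with the sign analysis of $h'(t)$.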
Figure 1 shows plots of the failure rate function for several parameter combinations.
Figure 1: Plots of the failure rate function for the Burr XII distribution.
From Figure 1, it can be seen that the failure rate function is decreasing when $c \le 1$ and that $h(t)$ is unimodal when $c > 1$.
2.2 Moments for the failure time
The $q$th moment of the failure time is given by
$$E(T^{q}) = s^{q}\,k\,B\!\left[\frac{q}{c} + 1,\ k - \frac{q}{c}\right], \quad \text{if } ck > q,$$
where $B(a, b)$ is the complete beta function (see Lawless, 2003).
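A quick numerical check of the moment formula can be made by comparing it with a direct integration of $t^{q} f(t)$. The sketch below uses only the Python standard library; the helper names and the parameter values ($s = 1$, $k = 3$, $c = 2$, $q = 1$) are assumptions for illustration.

```python
import math


def beta_fn(a, b):
    """Complete beta function B(a, b), computed through log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def burr_moment(q, s, k, c):
    """Closed-form q-th moment s^q k B(q/c + 1, k - q/c), valid if c*k > q."""
    if c * k <= q:
        raise ValueError("the moment exists only if c*k > q")
    return s ** q * k * beta_fn(q / c + 1.0, k - q / c)


def burr_density(t, s, k, c):
    """Burr XII density f(t; s, k, c)."""
    return c * k * t ** (c - 1) / s ** c * (1.0 + (t / s) ** c) ** (-(k + 1.0))


def numeric_moment(q, s, k, c, upper=40.0, steps=100000):
    """Midpoint-rule approximation of the integral of t^q f(t) over (0, upper)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** q * burr_density(t, s, k, c)
    return total * h


closed = burr_moment(1, 1.0, 3.0, 2.0)
approx = numeric_moment(1, 1.0, 3.0, 2.0)
```

For these values the closed form reduces to $3B(3/2, 5/2) = 3\pi/16$, which the numerical integral should reproduce to several decimal places.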
2.3 Relation to other distributions
The log-logistic distribution is a special case of the Burr XII distribution. When $k = 1$, the Burr XII distribution reduces to the log-logistic distribution, whose survival function can be written as $S(t; s, c) = \left[1 + (t/s)^{c}\right]^{-1}$. Besides, Rodriguez (1977) shows that the Burr coverage area in a specific plane is occupied by various well-known and useful distributions, including the normal, log-normal, gamma, logistic and extreme value type I distributions.
2.4 Maximum likelihood estimation
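We assume that the lifetimes are independently distributed and also independent of the censoring mechanism. Considering right-censored lifetime data, we observe $t_i = \min(T_i, C_i)$, where $T_i$ is the lifetime and $C_i$ is the censoring time of the $i$th individual, $i = 1,\ldots,n$. Assuming that $t_1, t_2, \ldots, t_n$ is a random sample of the random variable $T$ with Burr XII distribution (1), the likelihood function of $c$, $k$ and $s$ corresponding to the observed sample is given by
$$L(c, k, s) = (kc)^{r} s^{-rc} \prod_{i \in F}\left\{\left[1 + \left(\frac{t_i}{s}\right)^{c}\right]^{-(k+1)} t_i^{c-1}\right\} \prod_{i \in C}\left[1 + \left(\frac{t_i}{s}\right)^{c}\right]^{-k}, \tag{2}$$
where $r$ is the observed number of failures, $F$ denotes the set of uncensored observations and $C$ denotes the set of censored observations. The log-likelihood function is given by
$$l(c, k, s) = r\log(k) + r\log(c) - rc\log(s) + (c-1)\sum_{i \in F}\log(t_i) - (k+1)\sum_{i \in F}\log\left[1 + \left(\frac{t_i}{s}\right)^{c}\right] - k\sum_{i \in C}\log\left[1 + \left(\frac{t_i}{s}\right)^{c}\right].$$
The maximum likelihood estimators $\hat c$, $\hat k$ and $\hat s$ of $c$, $k$ and $s$ are obtained by maximizing the log-likelihood, which amounts to solving the equations
$$\frac{\partial l(c, k, s)}{\partial c} = \frac{r}{c} + \sum_{i \in F}\log\left(\frac{t_i}{s}\right) - (k+1)\sum_{i \in F}\frac{(t_i/s)^{c}\log(t_i/s)}{1 + (t_i/s)^{c}} - k\sum_{i \in C}\frac{(t_i/s)^{c}\log(t_i/s)}{1 + (t_i/s)^{c}} = 0,$$
$$\frac{\partial l(c, k, s)}{\partial k} = \frac{r}{k} - \sum_{i \in F}\log\left[1 + \left(\frac{t_i}{s}\right)^{c}\right] - \sum_{i \in C}\log\left[1 + \left(\frac{t_i}{s}\right)^{c}\right] = 0,$$
$$\frac{\partial l(c, k, s)}{\partial s} = -\frac{rc}{s} + (k+1)\sum_{i \in F}\frac{c\,t_i^{c} s^{-(c+1)}}{1 + (t_i/s)^{c}} + k\sum_{i \in C}\frac{c\,t_i^{c} s^{-(c+1)}}{1 + (t_i/s)^{c}} = 0.$$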
These equations cannot be solved analytically, so statistical software such as Ox or R can be used to solve them numerically. In this paper, the Ox software (MAXBFGS subroutine) is used to compute the maximum likelihood estimates (MLE), but a reparametrization is necessary. The transformations $\sigma = 1/c$ and $s = \exp(\mu)$ can be used.
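To make the estimation step concrete, the sketch below codes the censored Burr XII log-likelihood of this section and its reparametrized version with $\sigma = 1/c$ and $s = \exp(\mu)$. It is not the Ox code used by the authors; the function names and the toy data are assumptions. The helper `k_given_c_s` uses the fact that the score equation for $k$ can be solved in closed form once $c$ and $s$ are fixed, which follows directly from setting $\partial l/\partial k = 0$.

```python
import math


def burr_loglik(c, k, s, times, failed):
    """Log-likelihood l(c, k, s) for right-censored Burr XII data.

    failed[i] is True for an observed failure (set F) and False for a
    censored time (set C).
    """
    ll = 0.0
    for t, d in zip(times, failed):
        w = math.log1p((t / s) ** c)          # log[1 + (t/s)^c]
        if d:
            ll += (math.log(k) + math.log(c) - c * math.log(s)
                   + (c - 1.0) * math.log(t) - (k + 1.0) * w)
        else:
            ll -= k * w
    return ll


def burr_loglik_reparam(sigma, k, mu, times, failed):
    """Same log-likelihood under sigma = 1/c and s = exp(mu)."""
    return burr_loglik(1.0 / sigma, k, math.exp(mu), times, failed)


def k_given_c_s(c, s, times, failed):
    """Root of dl/dk = 0 for fixed c and s: k = r / sum_i log[1 + (t_i/s)^c]."""
    r = sum(failed)
    return r / sum(math.log1p((t / s) ** c) for t in times)


# Toy data (hypothetical): three failures and one censored time.
times = [1.0, 2.0, 0.5, 3.0]
failed = [True, True, False, True]
k_hat = k_given_c_s(2.0, 1.5, times, failed)
```

In practice the remaining two parameters would be handed to a numerical optimizer; the unconstrained pair $(\log\sigma, \mu)$ is usually more convenient for that purpose than $(c, s)$.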
3 Log-Burr XII regression models
3.1 Log-location-scale regression model
In many practical applications, lifetimes are affected by variables referred to as explanatory variables or covariates, such as cholesterol level, blood pressure and many others. It is therefore important to explore the relationship between the lifetime and the explanatory variables, and an approach based on a regression model can be used.
The covariate vector is denoted by $x = (x_1, x_2, \ldots, x_p)^T$, which is related to the response $Y = \log(T)$ through a regression model. Consider the transformations $c = 1/\sigma$ and $s = \exp(\mu)$. It follows that the density function of $Y$ can be written as
$$f(y; k, \sigma, \mu) = \frac{k}{\sigma}\left[1 + \exp\left(\frac{y-\mu}{\sigma}\right)\right]^{-(k+1)}\exp\left(\frac{y-\mu}{\sigma}\right), \quad -\infty < y < \infty, \tag{3}$$
where $k > 0$, $\sigma > 0$ and $-\infty < \mu < \infty$. The survival function is given by
$$S(y) = \left[1 + \exp\left(\frac{y-\mu}{\sigma}\right)\right]^{-k}.$$
Besides, we have the following important theorem.
Theorem 1: For the variable $Y$, the moment generating function (mgf) is given by
$$M_Y(t) = k\,s^{t}\,B\!\left[\frac{t}{c} + 1,\ k - \frac{t}{c}\right], \quad \text{if } kc > t,$$
where $B[a, b]$ is the complete beta function (proof given in Appendix B).
Hence the mean of $Y$ is given by
$$E(Y) = \mu + \sigma\left[\psi(1) - \psi(k)\right],$$
where $\psi(\cdot)$ is the digamma function (see Lawless, 2003).
We can write the above model as a log-linear model
$$Y = \mu + \sigma Z,$$
where the variable $Z$ follows the density
$$f(z) = k\left[1 + \exp(z)\right]^{-(k+1)}\exp(z), \quad -\infty < z < \infty,\ k > 0. \tag{4}$$
Now suppose that the location parameter $\mu$ of the log-Burr XII model depends on the matrix of explanatory variables $X$, that is, $\mu_i = x_i^T\beta$. We then consider the regression model based on the log-Burr XII density given in (3), relating the response $Y$ and the covariate vector $x$, so that the distribution of $Y_i \mid x_i$ can be represented as
$$Y_i = x_i^T\beta + \sigma Z_i, \quad i = 1,\ldots,n, \tag{5}$$
where $\beta = (\beta_1, \ldots, \beta_p)^T$, $\sigma > 0$ and $k > 0$ are unknown parameters, $x_i^T = (x_{i1}, x_{i2}, \ldots, x_{ip})$ is the explanatory vector and $Z_i$ follows the distribution in (4). In this case, the survival function of $Y \mid x$ is given by
$$S(y \mid x) = \left[1 + \exp\left(\frac{y - x^T\beta}{\sigma}\right)\right]^{-k}.$$
3.2 Estimation by maximum likelihood
Consider a sample $(y_1, x_1), (y_2, x_2), \ldots, (y_n, x_n)$ of $n$ observations from model (5), where $y_i$ represents the logarithm of the survival time and $x_i$ the covariate vector associated with the $i$-th individual. The log-likelihood function can be written as
$$l(\theta) = r\log(k) - r\log(\sigma) + \sum_{i \in F} z_i - (k+1)\sum_{i \in F}\log(1 + \exp(z_i)) - k\sum_{i \in C}\log(1 + \exp(z_i)), \tag{6}$$
where $r$ is the number of uncensored observations (failures) and $z_i = (y_i - x_i^T\beta)/\sigma$.
Maximum likelihood estimates for the parameter vector $\theta = (k, \sigma, \beta^T)^T$ can be obtained by maximizing the likelihood function. In this paper, the software Ox (MAXBFGS subroutine) (see Doornik, 1996) was used to compute the maximum likelihood estimates (MLE). Covariance estimates for the maximum likelihood estimators $\hat\theta$ can be obtained using the Hessian matrix. Confidence intervals and hypothesis tests can be based on the large-sample distribution of the MLE, which is normal with covariance matrix given by the inverse of the Fisher information, since regularity conditions are satisfied. More specifically, the asymptotic covariance matrix is given by $I^{-1}(\theta)$ with $I(\theta) = E[\ddot L(\theta)]$, where
$$\ddot L(\theta) = -\left\{\frac{\partial^2 l(\theta)}{\partial\theta\,\partial\theta^T}\right\}.$$
It is not possible to compute the Fisher information matrix $I(\theta)$ because of the censored observations (censoring is random and noninformative). However, it is possible to use the matrix of second derivatives of the log-likelihood, $-\ddot L(\theta)$, evaluated at the MLE $\theta = \hat\theta$, which yields a consistent estimate. The asymptotic normal approximation for $\hat\theta$ may be expressed as $\hat\theta^T \sim N_{(p+2)}\{\theta^T; \ddot L(\theta)^{-1}\}$, where $\ddot L(\theta)$ is the $(p+2)\times(p+2)$ observed information matrix, obtained from
$$-\ddot L(\theta) = \begin{pmatrix} L_{kk} & L_{k\sigma} & L_{k\beta} \\ \cdot & L_{\sigma\sigma} & L_{\sigma\beta} \\ \cdot & \cdot & L_{\beta\beta} \end{pmatrix},$$
with the submatrices given in Appendix A.
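The log-likelihood of the log-Burr XII regression model is simple to code once $z_i = (y_i - x_i^T\beta)/\sigma$ is formed. The sketch below is only illustrative (the paper's computations were done in Ox); the function name and argument layout are assumptions, and no protection against overflow of `exp(z)` for very large `z` is included.

```python
import math


def log_burr_reg_loglik(k, sigma, beta, y, X, failed):
    """Log-likelihood of the log-Burr XII regression model.

    y[i] is the log survival time, X[i] the covariate row, and failed[i]
    is True for a failure (set F) and False for a censored case (set C).
    """
    ll = 0.0
    for yi, xi, d in zip(y, X, failed):
        z = (yi - sum(b * x for b, x in zip(beta, xi))) / sigma
        w = math.log1p(math.exp(z))            # log(1 + exp(z_i))
        if d:
            ll += math.log(k) - math.log(sigma) + z - (k + 1.0) * w
        else:
            ll -= k * w
    return ll


# One-observation checks with z = 0 (hypothetical values).
ll_failure = log_burr_reg_loglik(1.0, 1.0, [1.0], [1.0], [[1.0]], [True])
ll_censored = log_burr_reg_loglik(1.0, 1.0, [1.0], [1.0], [[1.0]], [False])
```

A function of this form can be passed, with its sign reversed, to any general-purpose minimizer, and a numerical Hessian at the optimum then approximates the observed information matrix described above.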
3.3 A Bayesian analysis
Besides being an alternative analysis, the Bayesian method allows the incorporation of previous knowledge about the parameters through informative prior densities. When this information is not available, one considers noninformative priors. In the Bayesian approach, the information about the model parameters is obtained through the posterior marginal distributions. Two difficulties arise. The first is obtaining the marginal posterior distributions, and the second is calculating the moments of interest. Both require integrals that often have no analytical solution. In this paper we use Markov chain Monte Carlo simulation methods, such as the Gibbs sampler and the Metropolis-Hastings algorithm.
Consider the Burr XII distribution (1), censored data and the likelihood function (2) for $k$, $c$ and $s$. For a Bayesian analysis, we assume the following prior densities for $k$, $s$ and $c$:
• $k \sim \Gamma(a_1, b_1)$, $a_1$ and $b_1$ known;
• $s \sim \Gamma(a_2, b_2)$, $a_2$ and $b_2$ known;
• $c \sim \Gamma(a_3, b_3)$, $a_3$ and $b_3$ known;
where $\Gamma(a_i, b_i)$ denotes a gamma distribution with mean $a_i/b_i$, variance $a_i/b_i^2$ and density function given by
$$f(v; a_i, b_i) = \frac{b_i^{a_i} v^{a_i - 1}\exp\{-v b_i\}}{\Gamma(a_i)},$$
where $v > 0$, $a_i > 0$ and $b_i > 0$.
In the special case where $a_1 = b_1 = a_2 = b_2 = a_3 = b_3 = 0$, the noninformative case follows and, assuming independence among the parameters, the prior density for $k$, $s$ and $c$ is written as
$$\pi(k, s, c) \propto \frac{1}{k\,s\,c}.$$
We further assume independence among the parameters $k$, $s$ and $c$. The joint posterior distribution for $k$, $s$ and $c$ is given by
$$\pi(k, s, c \mid D) \propto k^{a_1 - 1}\exp\{-k b_1\}\, s^{a_2 - 1}\exp\{-s b_2\}\, c^{a_3 - 1}\exp\{-c b_3\}\,(kc)^{r} s^{-rc} \prod_{i \in F}\left\{\left[1 + \left(\frac{t_i}{s}\right)^{c}\right]^{-(k+1)} t_i^{c-1}\right\} \prod_{i \in C}\left[1 + \left(\frac{t_i}{s}\right)^{c}\right]^{-k},$$
where $D$ denotes the data set.
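It can be shown that the conditional posterior densities are given by
$$\pi(k \mid s, c, D) \propto k^{a_1 + r - 1}\exp\{-k b_1\} \prod_{i \in F}\left[1 + \left(\frac{t_i}{s}\right)^{c}\right]^{-(k+1)} \prod_{i \in C}\left[1 + \left(\frac{t_i}{s}\right)^{c}\right]^{-k},$$
$$\pi(s \mid k, c, D) \propto s^{a_2 - rc - 1}\exp\{-s b_2\} \prod_{i \in F}\left[1 + \left(\frac{t_i}{s}\right)^{c}\right]^{-(k+1)} \prod_{i \in C}\left[1 + \left(\frac{t_i}{s}\right)^{c}\right]^{-k},$$
$$\pi(c \mid k, s, D) \propto c^{a_3 - 1}\exp\{-c b_3\}\, c^{r} s^{-cr} \prod_{i \in F}\left\{\left[1 + \left(\frac{t_i}{s}\right)^{c}\right]^{-(k+1)} t_i^{c-1}\right\} \prod_{i \in C}\left[1 + \left(\frac{t_i}{s}\right)^{c}\right]^{-k}.$$
Observe that we need to use the Metropolis-Hastings algorithm to generate the variables $k$, $s$ and $c$ from the respective conditional posterior densities, since their forms are somewhat complex.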
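For Bayesian inference in model (5), we assume the following prior densities for $\sigma$, $k$ and $\beta^T$:
• $k \sim \Gamma(c_1, d_1)$, $c_1$ and $d_1$ known;
• $\sigma \sim \text{Inverse-}\Gamma(c_2, d_2)$, $c_2$ and $d_2$ known;
• $\beta_j \sim N(\mu_{0j}, \sigma_{0j}^2)$, $\mu_{0j}$ and $\sigma_{0j}^2$ known, $j = 0, \ldots, p$.
Noninformative priors, assuming independence among the parameters, follow by considering $c_1 = c_2 = d_1 = d_2 = 0$ and $\sigma_{0j}$ large.
We again assume independence among the parameters. The joint posterior distribution for $\sigma$, $k$ and $\beta$ is given by
$$\pi(\sigma, k, \beta^T \mid D) \propto k^{c_1 - 1}\exp\{-k d_1\}\,\sigma^{-(c_2 + 1)}\exp\left\{-\frac{d_2}{\sigma}\right\}\exp\left\{-\frac{1}{2}\sum_{j=0}^{p}\left(\frac{\beta_j - \mu_{0j}}{\sigma_{0j}}\right)^2\right\}\left(\frac{k}{\sigma}\right)^{r}\exp\left\{\sum_{i \in F} z_i\right\}\prod_{i \in F}\left[1 + \exp\{z_i\}\right]^{-(k+1)}\prod_{i \in C}\left[1 + \exp\{z_i\}\right]^{-k},$$
where $z_i = (y_i - x_i^T\beta)/\sigma$.
It can be shown that the conditional marginal distributions are given by
$$\pi(k \mid \sigma, \beta^T, D) \propto k^{c_1 + r - 1}\exp\{-k d_1\}\prod_{i \in F}\left[1 + \exp\{z_i\}\right]^{-(k+1)}\prod_{i \in C}\left[1 + \exp\{z_i\}\right]^{-k},$$
$$\pi(\sigma \mid k, \beta^T, D) \propto \sigma^{-(c_2 + r + 1)}\exp\left\{-\frac{d_2}{\sigma}\right\}\exp\left\{\sum_{i \in F} z_i\right\}\prod_{i \in F}\left[1 + \exp\{z_i\}\right]^{-(k+1)}\prod_{i \in C}\left[1 + \exp\{z_i\}\right]^{-k},$$
$$\pi(\beta_j \mid k, \sigma, \beta_{-j}, D) \propto \exp\left\{-\frac{1}{2}\left(\frac{\beta_j - \mu_{0j}}{\sigma_{0j}}\right)^2\right\}\exp\left\{\sum_{i \in F} z_i\right\}\prod_{i \in F}\left[1 + \exp\{z_i\}\right]^{-(k+1)}\prod_{i \in C}\left[1 + \exp\{z_i\}\right]^{-k}.$$
Observe that we need to use the Metropolis-Hastings algorithm to generate from the conditional posterior distributions of $k$, $\sigma$ and $\beta_j$ ($j = 0, \ldots, p$).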
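A single Metropolis-Hastings step can be sketched as follows. The example targets the conditional density $\pi(k \mid s, c, D)$ of the Burr XII model without covariates, using a random-walk proposal on $\eta = \log k$; this proposal, the tuning constant `step`, the seed, the hyperparameter values and the toy data are all assumptions made for illustration and are not taken from the paper. For this particular conditional, collecting the terms in $k$ shows that it is proportional to $k^{a_1 + r - 1}\exp\{-k[b_1 + \sum_i \log(1 + (t_i/s)^c)]\}$, so its mean is available in closed form and serves as a sanity check on the sampler.

```python
import math
import random


def log_cond_k(k, a1, b1, s, c, times, failed):
    """Log of pi(k | s, c, D), up to an additive constant."""
    r = sum(failed)
    total = 0.0
    for t, d in zip(times, failed):
        w = math.log1p((t / s) ** c)
        total += (k + 1.0) * w if d else k * w
    return (a1 + r - 1.0) * math.log(k) - k * b1 - total


def metropolis_k(n_iter, k0, step, a1, b1, s, c, times, failed, rng):
    """Random-walk Metropolis on eta = log(k).

    Working on the log scale keeps k positive; the Jacobian of the
    transformation adds +eta to the log target.
    """
    eta = math.log(k0)
    log_post = log_cond_k(k0, a1, b1, s, c, times, failed) + eta
    draws = []
    for _ in range(n_iter):
        prop = eta + rng.gauss(0.0, step)
        log_post_prop = log_cond_k(math.exp(prop), a1, b1, s, c, times, failed) + prop
        # Accept with probability min(1, target ratio); 1 - U lies in (0, 1].
        if math.log(1.0 - rng.random()) < log_post_prop - log_post:
            eta, log_post = prop, log_post_prop
        draws.append(math.exp(eta))
    return draws


# Toy data and hyperparameters (hypothetical).
times = [0.5, 1.0, 1.5, 2.0, 3.0]
failed = [True, True, False, True, True]
draws = metropolis_k(20000, 1.0, 0.8, 0.01, 0.01, 1.0, 1.0, times, failed,
                     random.Random(1))
kept = draws[2000:]                      # discard a burn-in period
mean_k = sum(kept) / len(kept)
```

The same accept/reject step applies to the other conditionals ($s$, $c$, $\sigma$, $\beta_j$) after replacing `log_cond_k` with the logarithm of the corresponding conditional density.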
3.4 The Jackknife Estimator for the model
The idea of jackknifing is to transform the problem of estimating any population parameter into the problem of estimating a population mean. What is done when estimating a mean value is therefore carried out in this method, but from an unusual point of view. In this paper, we use this method as an alternative way of estimating the population parameters.
Suppose that $T_1, T_2, \ldots, T_n$ is a random sample of $n$ values and the sample mean, given by
$$\bar T = \frac{1}{n}\sum_{j=1}^{n} T_j,$$
is used to estimate the mean of the population.
Now the sample mean is calculated with the $l$th observation left out,
$$\bar T_{-l} = \frac{\sum_{j=1}^{n} T_j - T_l}{n - 1}.$$
From the two expressions above we obtain
$$T_l = n\bar T - (n-1)\bar T_{-l}. \tag{7}$$
In a general situation, consider that $\theta$ is a parameter estimated by $\hat E(T_1, T_2, \ldots, T_n)$ and, for ease of notation, drop $(T_1, T_2, \ldots, T_n)$. Let $\hat E_{-l}$ be the estimate obtained with the observation $T_l$ left out. It follows, by analogy with equation (7), that pseudo-values can be calculated as
$$\hat E_l^{*} = n\hat E - (n-1)\hat E_{-l}, \quad l = 1, \ldots, n.$$
The average of the pseudo-values,
$$\hat E^{*} = \frac{1}{n}\sum_{l=1}^{n}\hat E_l^{*},$$
is the jackknife estimate of $\theta$.
Manly (1997) suggests that an approximate $100(1-\alpha)\%$ confidence interval for $\theta$ is given by $\hat E^{*} \pm t_{\alpha/2,\,n-1}\, s/\sqrt{n}$, where $t_{\alpha/2,\,n-1}$ is the value that is exceeded with probability $\alpha/2$ for the $t$ distribution with $(n-1)$ degrees of freedom and $s$ is the standard deviation of the pseudo-values. The jackknife estimator has the effect of removing bias of order $1/n$.
The jackknife calculations for the log-Burr XII regression model are carried out for $k$, $\sigma$ and $\beta_j$ ($j = 0, \ldots, p$), and confidence intervals are calculated separately for each parameter.
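The pseudo-value construction can be sketched in a few lines of Python. The function name `jackknife`, the plug-in variance used as a test estimator and the toy sample are assumptions for illustration; for the regression model the `estimator` argument would instead refit the model on the reduced sample and return one parameter estimate.

```python
def jackknife(estimator, sample):
    """Return the jackknife estimate, the pseudo-values and the standard error."""
    n = len(sample)
    full = estimator(sample)
    pseudo = []
    for l in range(n):
        reduced = sample[:l] + sample[l + 1:]      # leave the l-th value out
        pseudo.append(n * full - (n - 1) * estimator(reduced))
    estimate = sum(pseudo) / n
    var = sum((p - estimate) ** 2 for p in pseudo) / (n - 1)
    return estimate, pseudo, (var / n) ** 0.5      # s / sqrt(n)


def mean(values):
    return sum(values) / len(values)


def plug_in_variance(values):
    """Variance with divisor n, which is biased of order 1/n."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


sample = [2.0, 4.0, 9.0]                           # toy data (hypothetical)
mean_est, mean_pseudo, mean_se = jackknife(mean, sample)
var_est, var_pseudo, var_se = jackknife(plug_in_variance, sample)
```

For the sample mean the pseudo-values reproduce the observations, as equation (7) states, and for the plug-in variance the jackknife estimate coincides with the usual unbiased sample variance, which illustrates the bias removal. The returned standard error is the quantity multiplied by the $t$ quantile in the confidence interval.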
4 Sensitivity analysis
4.1 Global influence
A first tool for sensitivity analysis, as stated before, is global influence based on case deletion. Case deletion is a common approach for studying the effect of dropping the $i$th case from the data set. The case-deletion model for model (5) is given by
$$Y_l = x_l^T\beta + \sigma Z_l, \quad l = 1, 2, \ldots, n,\ l \ne i. \tag{8}$$
In the following, a quantity with subscript "$(i)$" means the original quantity with the $i$th case deleted. For model (8), the log-likelihood function of $\theta$ is denoted by $l_{(i)}(\theta)$. Let $\hat\theta_{(i)} = (\hat k_{(i)}, \hat\sigma_{(i)}, \hat\beta_{(i)}^T)^T$ be the ML estimate of $\theta$ from $l_{(i)}(\theta)$. To assess the influence of the $i$th case on the ML estimate $\hat\theta = (\hat k, \hat\sigma, \hat\beta^T)^T$, the basic idea is to compare $\hat\theta_{(i)}$ with $\hat\theta$. If deletion of a case seriously influences the estimates, more attention should be paid to that case. Hence, if $\hat\theta_{(i)}$ is far from $\hat\theta$, then the $i$th case is regarded as an influential observation. A first measure of global influence is defined as the standardized norm of $\hat\theta_{(i)} - \hat\theta$ (generalized Cook distance),
$$GD_i(\theta) = (\hat\theta_{(i)} - \hat\theta)^T\left\{\ddot L(\hat\theta)\right\}(\hat\theta_{(i)} - \hat\theta).$$
Another alternative is to assess the values $GD_i(\beta)$ and $GD_i(k, \sigma)$, which reveal the impact of the $i$th case on the estimates of $\beta$ and $(k, \sigma)$, respectively. Another popular measure of the difference between $\hat\theta_{(i)}$ and $\hat\theta$ is the likelihood distance
$$LD_i(\theta) = 2\left\{l(\hat\theta) - l(\hat\theta_{(i)})\right\}.$$
Besides, we can also compute $\hat\beta_j - \hat\beta_{j(i)}$ ($j = 1, 2, \ldots, p$) to see the difference between $\hat\beta$ and $\hat\beta_{(i)}$. Alternative global influence measures are possible. One could consider the behavior of a test statistic, such as a Wald test for a covariate or censoring effect, under a case-deletion scheme.
4.2 Local influence
As a second tool for sensitivity analysis, the local influence method is now described for log-Burr XII regression models with censored data. Local influence calculations can be carried out in model (5). If the likelihood displacement $LD(\omega) = 2\{l(\hat\theta) - l(\hat\theta_\omega)\}$ is used, where $\hat\theta_\omega$ denotes the MLE under the perturbed model, the normal curvature for $\theta$ in the direction $d$, $\|d\| = 1$, is given by $C_d(\theta) = 2\,|d^T\Delta^T\ddot L(\theta)^{-1}\Delta d|$, where $\Delta$ is a $(p+2) \times n$ matrix that depends on the perturbation scheme and whose elements are given by $\Delta_{ji} = \partial^2 l(\theta \mid \omega)/\partial\theta_j\,\partial\omega_i$, $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, p+2$, evaluated at $\hat\theta$ and $\omega_0$, where $\omega_0$ is the no-perturbation vector (see Cook, 1986). For the log-Burr XII model the elements of $-\ddot L(\hat\theta)$ are given in Appendix A. We can calculate the normal curvatures $C_d(\theta)$, $C_d(k)$, $C_d(\sigma)$ and $C_d(\beta)$ to produce various index plots, for instance, the index plot of $d_{max}$, the eigenvector corresponding to $C_{d_{max}}$, the largest eigenvalue of the matrix $B = \Delta^T\ddot L(\theta)^{-1}\Delta$, and the index plots of $C_{d_i}(\theta)$, $C_{d_i}(k)$, $C_{d_i}(\sigma)$ and $C_{d_i}(\beta)$, named total local influence (see, for example, Lesaffre and Verbeke, 1998), where $d_i$ denotes an $n \times 1$ vector of zeros with a one at the $i$th position. Thus, the curvature in the direction $d_i$ assumes the form $C_i = 2\,|\Delta_i^T\ddot L(\theta)^{-1}\Delta_i|$, where $\Delta_i$ denotes the $i$th column of $\Delta$. It is usual to point out those cases such that
$$C_i \ge 2\bar C, \quad \bar C = \frac{1}{n}\sum_{i=1}^{n} C_i.$$
4.3 Curvature calculations
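Next, we calculate, for three perturbation schemes, the matrix
$$\Delta = (\Delta_{ji})_{(p+2)\times n} = \left(\frac{\partial^2 l(\theta \mid \omega)}{\partial\theta_j\,\partial\omega_i}\right)_{(p+2)\times n}, \quad j = 1, 2, \ldots, p+2 \ \text{and}\ i = 1, 2, \ldots, n,$$
considering the model defined in (5) and its log-likelihood function given by (6).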
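4.3.1 Case-weights perturbation
Consider the vector of weights $\omega = (\omega_1, \omega_2, \ldots, \omega_n)^T$.
In this case the log-likelihood function takes the form
$$l(\theta \mid \omega) = [\log(k) - \log(\sigma)]\sum_{i \in F}\omega_i + \sum_{i \in F}\omega_i z_i - (k+1)\sum_{i \in F}\omega_i\log[1 + \exp\{z_i\}] - k\sum_{i \in C}\omega_i\log[1 + \exp\{z_i\}],$$
where $0 \le \omega_i \le 1$ and $\omega_0 = (1, \ldots, 1)^T$. Let us denote $\Delta = (\Delta_1, \ldots, \Delta_{p+2})^T$.
Then the elements of the vector $\Delta_1$ take the form
$$\Delta_{1i} = \begin{cases} \hat k^{-1} - \log[1 + \exp\{\hat z_i\}] & \text{if } i \in F, \\ -\log[1 + \exp\{\hat z_i\}] & \text{if } i \in C. \end{cases}$$
On the other hand, the elements of the vector $\Delta_2$ can be shown to be given by
$$\Delta_{2i} = \begin{cases} -\hat\sigma^{-1}\left\{1 + \hat z_i - (\hat k + 1)\hat z_i\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1}\right\} & \text{if } i \in F, \\ \hat k\hat\sigma^{-1}\hat z_i\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1} & \text{if } i \in C. \end{cases}$$
The elements of the vector $\Delta_j$, for $j = 3, \ldots, p+2$, may be expressed as
$$\Delta_{ji} = \begin{cases} -x_{ij}\hat\sigma^{-1}\left\{1 - (\hat k + 1)\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1}\right\} & \text{if } i \in F, \\ x_{ij}\hat k\hat\sigma^{-1}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1} & \text{if } i \in C. \end{cases}$$
4.3.2 Response perturbation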
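We consider here that each $y_i$ is perturbed as $y_{i\omega} = y_i + \omega_i S_y$, where $S_y$ is a scale factor that may be the estimated standard deviation of $Y$, and $\omega_i \in \mathbb{R}$.
The perturbed log-likelihood function is then expressed as
$$l(\theta \mid \omega) = r[\log(k) - \log(\sigma)] + \sum_{i \in F} z_i^{*} - (k+1)\sum_{i \in F}\log[1 + \exp\{z_i^{*}\}] - k\sum_{i \in C}\log[1 + \exp\{z_i^{*}\}],$$
where $z_i^{*} = [(y_i + \omega_i S_y) - x_i^T\beta]/\sigma$.
In addition, the elements of the vector $\Delta_1$ take the form
$$\Delta_{1i} = \begin{cases} -S_y\hat\sigma^{-1}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1} & \text{if } i \in F, \\ -S_y\hat\sigma^{-1}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1} & \text{if } i \in C. \end{cases}$$
On the other hand, the elements of the vector $\Delta_2$ can be shown to be given by
$$\Delta_{2i} = \begin{cases} -S_y\hat\sigma^{-2}\left\{1 - (\hat k + 1)\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1}\left(\hat z_i[1 + \exp\{\hat z_i\}]^{-1} + 1\right)\right\} & \text{if } i \in F, \\ S_y\hat k\hat\sigma^{-2}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1}\left\{\hat z_i[1 + \exp\{\hat z_i\}]^{-1} + 1\right\} & \text{if } i \in C. \end{cases}$$
The elements of the vector $\Delta_j$, for $j = 3, \ldots, p+2$, may be expressed as
$$\Delta_{ji} = \begin{cases} x_{ij}S_y(\hat k + 1)\hat\sigma^{-2}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-2} & \text{if } i \in F, \\ x_{ij}S_y\hat k\hat\sigma^{-2}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-2} & \text{if } i \in C. \end{cases}$$
4.3.3 Explanatory variable perturbation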
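Consider now an additive perturbation on a particular continuous explanatory variable, namely $X_t$, by setting $x_{it\omega} = x_{it} + \omega_i S_x$, where $S_x$ is a scale factor and $\omega_i \in \mathbb{R}$. This perturbation scheme leads to the following expressions for the log-likelihood function and for the elements of the matrix $\Delta$.
In this case the log-likelihood function takes the form
$$l(\theta \mid \omega) = r[\log(k) - \log(\sigma)] + \sum_{i \in F} z_i^{*} - (k+1)\sum_{i \in F}\log[1 + \exp\{z_i^{*}\}] - k\sum_{i \in C}\log[1 + \exp\{z_i^{*}\}],$$
where $z_i^{*} = (y_i - x_i^{*T}\beta)/\sigma$ and $x_i^{*T}\beta = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_t(x_{it} + \omega_i S_x) + \cdots + \beta_p x_{ip}$.
In addition, the elements of the vector $\Delta_1$ are expressed as
$$\Delta_{1i} = \begin{cases} S_x\hat\beta_t\hat\sigma^{-1}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1} & \text{if } i \in F, \\ S_x\hat\beta_t\hat\sigma^{-1}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1} & \text{if } i \in C, \end{cases}$$
the elements of the vector $\Delta_2$ are expressed as
$$\Delta_{2i} = \begin{cases} \hat\beta_t S_x\hat\sigma^{-2}\left\{1 - (\hat k + 1)\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1}\left(1 + \hat z_i[1 + \exp\{\hat z_i\}]^{-1}\right)\right\} & \text{if } i \in F, \\ -\hat\beta_t S_x\hat k\hat\sigma^{-2}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1}\left(1 + \hat z_i[1 + \exp\{\hat z_i\}]^{-1}\right) & \text{if } i \in C, \end{cases}$$
the elements of the vector $\Delta_j$, for $j = 3, \ldots, p+2$ and $j \ne t$, take the forms
$$\Delta_{ji} = \begin{cases} -x_{ij}S_x\hat\beta_t(\hat k + 1)\hat\sigma^{-2}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-2} & \text{if } i \in F, \\ -x_{ij}S_x\hat\beta_t\hat k\hat\sigma^{-2}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-2} & \text{if } i \in C, \end{cases}$$
and the elements of the vector $\Delta_t$ are given by
$$\Delta_{ti} = \begin{cases} -S_x\hat\sigma^{-1} - (\hat k + 1)S_x\hat\sigma^{-1}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1}\left[x_{it}\hat\beta_t\hat\sigma^{-1}[1 + \exp\{\hat z_i\}]^{-1} - 1\right] & \text{if } i \in F, \\ -\hat k S_x\hat\sigma^{-1}\exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1}\left[x_{it}\hat\beta_t\hat\sigma^{-1}[1 + \exp\{\hat z_i\}]^{-1} - 1\right] & \text{if } i \in C. \end{cases}$$
4.4 Generalized Leverage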
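Let $l(\theta)$ denote the log-likelihood function from the postulated model in equation (5), $\hat\theta$ the MLE of $\theta$ and $\mu$ the expectation of $Y = (Y_1, Y_2, \ldots, Y_n)^T$; then $\hat y = \mu(\hat\theta)$ is the predicted response vector.
The main idea behind the concept of leverage (see, for instance, Cook and Weisberg, 1982; Wei et al., 1998) is that of evaluating the influence of $y_i$ on its own predicted value. This influence may well be represented by the derivative $\partial\hat y_i/\partial y_i$, which in the normal linear model equals $h_{ii}$, the $i$-th principal diagonal element of the projection matrix $H = X(X^TX)^{-1}X^T$, where $X$ is the model matrix. Extensions to more general regression models have been given, for instance, by St. Laurent and Cook (1992), Wei et al. (1998) and Paula (1999), the latter when $\theta$ is restricted by inequalities. Hence, it follows from Wei et al. (1998) that the $n \times n$ matrix $(\partial\hat y/\partial y^T)$ of generalized leverage may be expressed as
$$GL(\hat\theta) = \left\{D_\theta[\ddot L(\theta)]^{-1}\ddot L_{\theta y}\right\},$$
evaluated at $\theta = \hat\theta$, where $D_\theta = \partial\mu/\partial\theta^T$ has $i$-th row $(\partial E[Y_i]/\partial k,\ \partial E[Y_i]/\partial\sigma,\ x_{i1}, \ldots, x_{ip})$ and
$$\ddot L_{\theta y} = \frac{\partial^2 l(\theta)}{\partial\theta\,\partial y^T} = \left(\ddot L_{k y_i},\ \ddot L_{\sigma y_i},\ \ddot L_{\beta_j y_i}\right)^T,$$
with
$$\ddot L_{k y_i} = \begin{cases} -\hat\sigma^{-1}\hat h_i & \text{if } i \in F, \\ -\hat\sigma^{-1}\hat h_i & \text{if } i \in C, \end{cases}$$
$$\ddot L_{\sigma y_i} = \begin{cases} \hat\sigma^{-2}\left\{-1 + (\hat k + 1)\hat h_i[1 + \hat z_i + \exp\{\hat z_i\}][1 + \exp\{\hat z_i\}]^{-1}\right\} & \text{if } i \in F, \\ \hat\sigma^{-2}\hat k\hat h_i[1 + \hat z_i + \exp\{\hat z_i\}][1 + \exp\{\hat z_i\}]^{-1} & \text{if } i \in C, \end{cases}$$
$$\ddot L_{\beta_j y_i} = \begin{cases} x_{ij}\hat\sigma^{-2}(\hat k + 1)\hat h_i[1 + \exp\{\hat z_i\}]^{-1} & \text{if } i \in F, \\ x_{ij}\hat\sigma^{-2}\hat k\hat h_i[1 + \exp\{\hat z_i\}]^{-1} & \text{if } i \in C, \end{cases}$$
where $\hat h_i = \exp\{\hat z_i\}[1 + \exp\{\hat z_i\}]^{-1}$.
5 Residual analysis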
In order to study departures from the error assumption as well as the presence of outliers, we consider the deviance residual proposed by Barlow and Prentice (1988) (see also Therneau et al., 1990) and the martingale-type residual.
5.1 Martingale-type residual
This residual was introduced in the counting process setting and can be written in log-Burr XII regression models as
$$r_{M_i} = \begin{cases} 1 - \hat k\log(1 + \exp\{\hat z_i\}) & \text{if } i \in F, \\ -\hat k\log(1 + \exp\{\hat z_i\}) & \text{if } i \in C, \end{cases}$$
where $\hat z_i = (y_i - x_i^T\hat\beta)/\hat\sigma$. Because of the skewed distributional form of $r_{M_i}$ (it has maximum value $+1$ and minimum value $-\infty$), transformations to achieve a more normal-shaped form are more appropriate for residual analysis.
5.2 Deviance residual
Another possibility is to use the deviance residual (see, for instance, the definition in McCullagh and Nelder, 1989, Section 2.4), which has been widely applied in generalized linear models (GLMs). Various authors have investigated the use of deviance residuals in GLMs (see, for instance, Williams, 1987; Hinkley et al., 1991; Paula, 1995) as well as in other regression models (see, for example, Fahrmeir and Tutz, 1994). In log-Burr XII regression models the deviance residual is expressed here as
$$r_{D_i} = \begin{cases} \text{sign}\left[1 - \hat k\log(1 + \exp\{\hat z_i\})\right]\left\{-2\left[1 - \hat k\log(1 + \exp\{\hat z_i\}) + \log\left(\hat k\log(1 + \exp\{\hat z_i\})\right)\right]\right\}^{1/2} & \text{if } i \in F, \\ \text{sign}\left[-\hat k\log(1 + \exp\{\hat z_i\})\right]\left\{2\hat k\log(1 + \exp\{\hat z_i\})\right\}^{1/2} & \text{if } i \in C. \end{cases}$$
5.3 Modified Deviance Residual
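We propose a change in the deviance residual, which can be written as
$$r_{MD_i} = \delta_i + r_{D_i},$$
where $\delta_i = 0$ denotes a censored observation, $\delta_i = 1$ an uncensored observation and $r_{D_i}$ is the deviance residual defined in Section 5.2. In log-Burr XII regression models the modified deviance residual is given by
$$r_{MD_i} = \begin{cases} 1 + \text{sign}\left[1 - \hat k\log(1 + \exp\{\hat z_i\})\right]\left\{-2\left[1 - \hat k\log(1 + \exp\{\hat z_i\}) + \log\left(\hat k\log(1 + \exp\{\hat z_i\})\right)\right]\right\}^{1/2} & \text{if } i \in F, \\ \text{sign}\left[-\hat k\log(1 + \exp\{\hat z_i\})\right]\left\{2\hat k\log(1 + \exp\{\hat z_i\})\right\}^{1/2} & \text{if } i \in C. \end{cases}$$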
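The three residuals can be computed from the standardized value $\hat z_i$, the estimate $\hat k$ and the censoring indicator. The sketch below assumes the usual definitions, namely the martingale residual $r_M = \delta + \log \hat S(y \mid x) = \delta - \hat k\log(1 + e^{\hat z})$ and the deviance transform $r_D = \text{sign}(r_M)\{-2[r_M + \delta\log(\delta - r_M)]\}^{1/2}$ of Therneau et al. (1990); the function names are hypothetical.

```python
import math


def martingale_residual(z, k, failed):
    """r_M = delta - k log(1 + exp(z)), with delta = 1 for a failure."""
    delta = 1.0 if failed else 0.0
    return delta - k * math.log1p(math.exp(z))


def deviance_residual(z, k, failed):
    """r_D = sign(r_M) * sqrt(-2 [r_M + delta log(delta - r_M)])."""
    r_m = martingale_residual(z, k, failed)
    inner = r_m
    if failed:
        inner += math.log(1.0 - r_m)       # delta - r_M = k log(1 + exp(z)) > 0
    sign = (r_m > 0) - (r_m < 0)
    # max(..., 0) guards against tiny negative values caused by rounding.
    return sign * math.sqrt(max(-2.0 * inner, 0.0))


def modified_deviance_residual(z, k, failed):
    """r_MD = delta + r_D."""
    return (1.0 if failed else 0.0) + deviance_residual(z, k, failed)


# Values at z = 0 and k = 1 (hypothetical inputs).
rm_fail = martingale_residual(0.0, 1.0, True)
rd_fail = deviance_residual(0.0, 1.0, True)
rd_cens = deviance_residual(0.0, 1.0, False)
```

At $\hat z = 0$ and $\hat k = 1$ the martingale residual of a failure is $1 - \log 2$ and the deviance residual of a censored case is $-\sqrt{2\log 2}$, which provides a small hand check of the code.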
5.4 Impact of the detected influential observations
To reveal the impact of the detected influential observations, we estimate the parameters again without the influential observations. Let $\hat\theta$ and $\hat\theta_0$ be the maximum likelihood estimates of the models obtained from the data sets with and without the influential observations, respectively. Lee, Lu and Song (2006) define the following two quantities to measure the difference between $\hat\theta$ and $\hat\theta_0$:
$$TRC = \sum_{i=1}^{n_p}\left|\frac{\hat\theta_i - \hat\theta_{0i}}{\hat\theta_i}\right| \quad \text{and} \quad MRC = \max_i\left|\frac{\hat\theta_i - \hat\theta_{0i}}{\hat\theta_i}\right|,$$
where TRC is the total relative change, MRC the maximum relative change and $n_p = 6$ is the number of parameters. We also use the likelihood displacement
$$LD_I(\theta) = 2\left\{l(\hat\theta) - l(\hat\theta_{(I)})\right\},$$
where $\hat\theta_{(I)}$ denotes the MLE of $\theta$ after the set $(I)$ of influential observations has been removed (see Cook, Peña and Weisberg, 1988).
Next, the same number of observations is randomly selected from the non-influential observations, and TRC, MRC and $LD_I$ are calculated again after removing them. The two sets of results are then compared: if they differ, the detected observations are indeed influential.
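The two relative-change measures are straightforward to compute from the two vectors of estimates. The sketch below assumes that TRC and MRC are, respectively, the sum and the maximum of the absolute relative changes $|(\hat\theta_i - \hat\theta_{0i})/\hat\theta_i|$; the function name and the numbers are hypothetical.

```python
def relative_changes(theta_full, theta_reduced):
    """Return (TRC, MRC) for estimates with and without the selected cases."""
    rel = [abs((a - b) / a) for a, b in zip(theta_full, theta_reduced)]
    return sum(rel), max(rel)


# Hypothetical estimates, for illustration only.
trc, mrc = relative_changes([2.0, -4.0], [1.0, -5.0])
```

Running the same function on estimates obtained after deleting randomly chosen non-influential cases gives the reference values against which the TRC and MRC of the detected cases are compared.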
6 Application
We provide an application of the results derived in the previous sections using real data.
The required numerical evaluations were implemented using the program Ox (see Doornik, 1996).