Contents lists available at ScienceDirect
Applied Mathematical Modelling
journal homepage: www.elsevier.com/locate/apm
Improved likelihood-based inference in Birnbaum–Saunders nonlinear regression models
Artur J. Lemonte a,∗, Gauss M. Cordeiro b, Germán Moreno-Arenas b
a Departamento de Estatística, Universidade Federal de Pernambuco, Recife, PE, Brazil
b Escuela de Matemáticas, Universidad Industrial de Santander, Bucaramanga, Colombia
article info
Article history:
Received 4 July 2015
Revised 4 February 2016
Accepted 13 April 2016
Available online 27 April 2016
Keywords:
Bartlett-type correction
Bootstrap
Likelihood ratio statistic
Score statistic
Wald statistic
abstract
We address the issue of performing testing inference in Birnbaum–Saunders nonlinear regression models when the sample size is small. The likelihood ratio, Wald and score statistics provide the basis for testing inference on the parameters in this class of models. We focus on the small-sample case, where the reference chi-squared distribution gives a poor approximation to the true null distribution of these test statistics. We derive a general Bartlett-type correction in matrix notation for the score test, which reduces the size distortion of the test, and numerically compare the proposed test with the usual likelihood ratio, Wald and score tests, and with the Bartlett-corrected likelihood ratio test, and bootstrap-corrected tests. Our simulation results suggest that the proposed corrected test can be an interesting alternative to other tests since it leads to very accurate inference even for very small samples. We also present an empirical application for illustrative purposes.
© 2016 Elsevier Inc. All rights reserved.
1. Introduction
Fatigue is structural damage that occurs when a material is exposed to stress and tension fluctuations. When the effect of vibrations on material specimens and structures is studied, the first point to be considered is the mechanism that could cause fatigue of these materials. The fatigue process (fatigue life) begins with an imperceptible fissure, the initiation, growth, and propagation of which produces a dominant crack in the specimen due to cyclic patterns of stress, whose ultimate extension causes the rupture or failure of this specimen. The failure occurs when the total extension of the crack exceeds a critical threshold for the first time. The partial extension of a crack produced by fatigue in each cycle is modeled by a random variable, which depends on the type of material, the magnitude of the stress, and the number of previous cycles, among other factors.
Motivated by problems of vibration in commercial aircraft that caused fatigue in the materials, [1] pioneered a two-parameter distribution to model failure time due to fatigue under cyclic loading and the assumption that failure follows from the development and growth of a dominant crack. This distribution is known as the two-parameter Birnbaum–Saunders (BS) distribution or as the fatigue life distribution. The BS distribution is an attractive alternative to the Weibull, gamma, and log-normal models, since its derivation considers the basic characteristics of the fatigue process. It has received significant attention from many researchers over the last few years, and there have been many theoretical developments with respect to this distribution. Although the BS distribution has its genesis in engineering, it has also received wide-ranging
∗ Corresponding author. Tel.: +55 27998990247. E-mail address: arturlemonte@gmail.com (A.J. Lemonte). http://dx.doi.org/10.1016/j.apm.2016.04.007
[Figure 1, two panels. Left panel, "BS distribution": $f_T(t)$ against $t$ for $\alpha$ = 0.3, 0.5, 0.7, 0.9, 1.2, 1.5. Right panel, "SN distribution": $f_Y(y)$ against $y$ for $\alpha$ = 1, 1.5, 2, 3.]
Fig. 1. Plots of the BS and SN pdfs: $\eta = 1$, $\mu = 0$ and $\sigma = 2$.
applications in other fields that include business, environment, informatics, and medicine. For book treatments of the BS distribution, the readers are referred to [2–4].
The cumulative distribution function (cdf) of $T$ having the BS distribution, say $T \sim \mathrm{BS}(\alpha, \eta)$, is defined by $F_T(t) = \Phi(v_t)$, for $t > 0$, where $\Phi(\cdot)$ is the standard normal cdf, $v_t = \rho(t/\eta)/\alpha$, $\rho(z) = z^{1/2} - z^{-1/2}$, and $\alpha > 0$ and $\eta > 0$ are the shape and scale parameters, respectively. The scale parameter is also the median of the distribution. For any $k > 0$, it follows that $kT \sim \mathrm{BS}(\alpha, k\eta)$. It is noteworthy that the reciprocal property holds for the BS distribution: $T^{-1} \sim \mathrm{BS}(\alpha, \eta^{-1})$. The probability density function (pdf) of $T$ is
\[ f_T(t) = \kappa(\alpha, \eta)\, t^{-3/2} (t + \eta) \exp\{-\tau(t/\eta)/(2\alpha^2)\}, \quad t > 0, \]
where $\kappa(\alpha, \eta) = \exp(\alpha^{-2})/(2\alpha\sqrt{2\pi\eta})$ and $\tau(z) = z + z^{-1}$. A general expression for the moments of $T$ is
\[ E(T^k) = \eta^k\, \frac{I_{k+1/2}(\alpha^{-2}) + I_{k-1/2}(\alpha^{-2})}{2 I_{1/2}(\alpha^{-2})}, \]
where $I_\nu(\cdot)$ denotes the modified Bessel function of the third kind and order $\nu$. Fig. 1 displays some plots of the BS pdf for selected values of $\alpha$ with $\eta = 1$. Notice that the BS pdf is positively skewed and that, as $\alpha$ decreases, the distribution becomes more symmetric around $\eta$, the median.

Rieck and Nedelman [5] introduced the sinh-normal (SN) distribution with shape, location and scale parameters given by
$\phi > 0$, $\mu \in \mathbb{R}$ and $\sigma > 0$, respectively, say $Y \sim \mathrm{SN}(\phi, \mu, \sigma)$. The cdf of $Y$ is $F_Y(y) = \Phi(2\phi^{-1}\sinh[(y - \mu)/\sigma])$, for $y \in \mathbb{R}$. The SN pdf takes the form:
\[ f_Y(y) = \frac{2}{\phi\sigma\sqrt{2\pi}} \cosh\left(\frac{y - \mu}{\sigma}\right) \exp\left\{-\frac{2}{\phi^2} \sinh^2\left(\frac{y - \mu}{\sigma}\right)\right\}, \quad y \in \mathbb{R}. \]
The moments of $Y$ can be obtained from the moment generating function:
\[ M(k) = \exp(k\mu)\, \frac{I_{(k\sigma+1)/2}(\phi^{-2}) + I_{(k\sigma-1)/2}(\phi^{-2})}{2 I_{1/2}(\phi^{-2})}. \]
The reliability and hazard rate functions of $Y$ are given, respectively, by:
\[ R_Y(y) = \Phi\left(-\frac{2}{\phi} \sinh\left(\frac{y - \mu}{\sigma}\right)\right), \quad y \in \mathbb{R}, \]
\[ h_Y(y) = \frac{2\cosh[(y - \mu)/\sigma] \exp\{-2\phi^{-2}\sinh^2[(y - \mu)/\sigma]\}}{\phi\sigma\sqrt{2\pi}\, \Phi(-2\phi^{-1}\sinh[(y - \mu)/\sigma])}, \quad y \in \mathbb{R}. \]
The SN distribution is symmetrical, presents greater and smaller degrees of kurtosis than the normal model and also has bi-modality. It is symmetric around the mean $E(Y) = \mu$; it is unimodal for $\phi \le 2$ and its kurtosis is smaller than that of the normal case; it is bimodal for $\phi > 2$ and its kurtosis is greater than that of the normal case; and if $Y_\phi \sim \mathrm{SN}(\phi, \mu, \sigma)$, then $Z_\phi = 2(Y_\phi - \mu)/(\phi\sigma)$ converges in distribution to the standard normal distribution when $\phi \to 0$.

Rieck and Nedelman [5] proved that if $T \sim \mathrm{BS}(\alpha, \eta)$, then $Y = \log(T)$ is SN distributed with shape, location and scale parameters given by $\phi = \alpha$, $\mu = \log(\eta)$ and $\sigma = 2$, respectively; that is, if $Y \sim \mathrm{SN}(\alpha, \mu, 2)$, then $T = \exp(Y)$ follows the BS distribution with shape parameter $\alpha$ and scale parameter $\eta = \exp(\mu)$. For this reason, the SN distribution is also called the log-BS distribution. Additionally, the SN and BS models correspond to a logarithmic distribution and its associated distribution, respectively [3, Chapter 12]. Fig. 1 displays some plots of the SN pdf for selected values of $\alpha$ with $\mu = 0$ and $\sigma = 2$.

The BS nonlinear regression model (BSNLM), first defined by Lemonte and Cordeiro [6], is a class of statistical models for relating responses to nonlinear combinations of predictor variables. The authors have discussed the maximum likelihood
estimation for the model parameters and derived closed-form expressions for the second-order biases of these estimates. It generalizes the BS linear regression model described by Rieck and Nedelman [5], allowing nonlinearity in the systematic component of the model. The BSNLM can be applied in areas such as economics, engineering, environmental studies and medicine, among others. Also, this class of models can be used in applications on the characterization of materials, in accelerated life testing, or to compare the median lives of several populations. The BSNLM has been the subject of some research papers. In particular, Lemonte and Cordeiro [7] presented a simple matrix formula for the skewness of the distribution of the maximum likelihood estimators (MLEs) of the parameters in this class of models, Lemonte and Patriota [8] developed diagnostic analysis to detect influential observations, and Vanegas et al. [9] proposed a general definition of residuals and investigated their statistical properties analytically and empirically. A Bayesian analysis under a normal-gamma prior for the class of BSNLMs was developed by Farias and Lemonte [10].
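The distributional facts recalled above (the BS cdf and pdf, the reciprocal property, and the link $Y = \log T$ between the BS and SN models) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not taken from the paper.

```python
import math


def std_normal_cdf(x):
    """Standard normal cdf Phi(x), written through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_cdf(t, alpha, eta):
    """BS cdf F_T(t) = Phi(rho(t/eta)/alpha), with rho(z) = z^(1/2) - z^(-1/2)."""
    z = t / eta
    return std_normal_cdf((math.sqrt(z) - 1.0 / math.sqrt(z)) / alpha)


def bs_pdf(t, alpha, eta):
    """BS pdf kappa(alpha, eta) t^(-3/2) (t + eta) exp{-tau(t/eta)/(2 alpha^2)}."""
    kappa = math.exp(alpha ** -2) / (2.0 * alpha * math.sqrt(2.0 * math.pi * eta))
    tau = t / eta + eta / t
    return kappa * t ** -1.5 * (t + eta) * math.exp(-tau / (2.0 * alpha ** 2))


def sn_cdf(y, phi, mu, sigma):
    """SN cdf F_Y(y) = Phi((2/phi) sinh((y - mu)/sigma))."""
    return std_normal_cdf((2.0 / phi) * math.sinh((y - mu) / sigma))
```

For instance, `bs_cdf(eta, alpha, eta)` equals 0.5 for any admissible `alpha`, reflecting that the scale parameter is the median, and `sn_cdf(math.log(t), alpha, math.log(eta), 2.0)` agrees with `bs_cdf(t, alpha, eta)`, which is the log-BS relationship.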
The likelihood ratio (LR), Wald and Rao score tests are the large-sample tests usually employed for testing hypotheses in parametric models. These statistics for testing a composite or simple null hypothesis $H_0$ against an alternative hypothesis $H_a$, in regular problems, have a $\chi^2_k$ null distribution asymptotically, where $k$ is the difference between the dimensions of the parameter spaces under the two hypotheses being tested. However, in small samples, the use of these statistics coupled with their asymptotic properties becomes less justifiable. One way of improving the $\chi^2$ approximation for the exact distribution of the LR statistic is by multiplying it by a correction factor known as the Bartlett correction [11]. This idea was later put into a general framework by Lawley [12]. The $\chi^2$ approximation for the exact distribution of the score statistic can be improved by multiplying it by a correction factor known as the Bartlett-type correction. This result was demonstrated in a general framework by Cordeiro and Ferrari [13]. There is no Bartlett-type correction to improve the $\chi^2$ approximation of the exact distribution of the Wald statistic in a general setting.
The Bartlett and Bartlett-type corrections became widely used for improving the large-sample $\chi^2$ approximation to the null distribution of the LR and score statistics in several special parametric models. In recent years there has been a renewed interest in Bartlett and Bartlett-type factors and several papers have been published giving expressions for computing these corrections for special models. Some references are [14–25], among others. The reader is referred to [26] for a detailed survey on Bartlett and Bartlett-type corrections.
The asymptotic $\chi^2$ distribution of the LR, Wald and score statistics is used to test hypotheses on the model parameters in BSNLMs, since their exact distributions are difficult to obtain in finite samples. However, for small sample sizes, the $\chi^2$ distribution may not be a trustworthy approximation to the exact null distributions of these statistics in this class of models. Higher order asymptotic methods, such as the Bartlett and Bartlett-type corrections, can be used to improve the LR and score tests. The first step in this direction was provided by Lemonte [25], who derived an improved LR statistic in the class of BSNLMs. (This result will be revised in this paper.) Although the algebraic forms of the Bartlett and Bartlett-type corrections are somewhat complicated, they can be easily incorporated into a computer program. This might be a worthwhile practice, since the Bartlett and Bartlett-type corrections act always in the right direction and, in general, give a substantial improvement.
This paper is concerned with small-sample likelihood-based inference in BSNLMs. First, we derive a general Bartlett-type correction in matrix notation to improve the inference based on the score statistic in the class of BSNLMs when the number of observations available to the practitioner is small. Further, in order to evaluate and compare the finite-sample performance of the improved score test in BSNLMs with the usual LR, Wald and score tests, and with the improved LR test, we also perform some Monte Carlo simulations. Bootstrap-based tests are also included in the Monte Carlo experiments. The simulation study on the size properties of these tests provides evidence that the improved score test proposed in this paper can be an appealing alternative to the classic asymptotic tests in this class of models when the number of observations is small. We emphasize that we have not found any comprehensive simulation study in the statistical literature comparing the classical uncorrected and corrected large-sample tests in BSNLMs. This paper fills this gap, and includes the score test and its Bartlett-type corrected version derived here in the simulation study.
The paper is organized as follows. In Section 2, we define the class of BSNLMs and discuss estimation and hypothesis tests on the regression parameters. Improved likelihood-based inference is studied in Section 3. We present the Bartlett-corrected LR statistic, derive a Bartlett-type correction for the score statistic, and consider bootstrap-based tests. Monte Carlo simulation results on the regression parameters are presented and discussed in Section 4. Tests on the parameter $\alpha$ are provided in Section 5, as well as some simulations. An application to real data is performed in Section 6. The paper ends with some concluding remarks in Section 7.

2. The model, estimation and testing
The BSNLM is defined as:
\[ y_i = \mu_i(x_i; \beta) + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1} \]
where $y_i$ is the logarithm of the $i$th observed lifetime, $x_i = (x_{i1}, \ldots, x_{im})^\top$ is an $m \times 1$ vector of known explanatory variables associated with the $i$th observable response $y_i$, and $\beta = (\beta_1, \ldots, \beta_p)^\top$ is a vector of unknown nonlinear parameters ($m \le p < n$) to be estimated from the data. The random variables $\varepsilon_i$ are mutually independent errors with SN distribution, that is, $\varepsilon_i \sim \mathrm{SN}(\alpha, 0, 2)$ for $i = 1, \ldots, n$. We consider a nonlinear structure for the location parameter $\mu_i(x_i; \beta)$ in the model (1), where $\mu_i(x_i; \beta)$ is assumed to be a known and twice continuously differentiable function with respect to $\beta$. If $\mu_i(x_i; \beta) = x_i^\top \beta$, then model (1) reduces to the linear model in [5]. It is also evident that the model (1) opens new possibilities to relate the response $y_i$ to the location parameter $\mu_i(x_i; \beta)$ in a nonlinear manner, rather than restricting the relation between the response and the covariates to the linear form.

Notice that the main difference between the linear [5] and nonlinear [6] BS regression models is in the mean of the response variable. The linear model in [5] relates the mean response to the covariates by a linear function, whereas the nonlinear model in [6] relates the mean response through a nonlinear structure, thus generalizing the linear model, since it contains the linear function as a special case. As an example, consider the partially nonlinear regression model defined by:
\[ \mu = Z\lambda + \eta\, g(\gamma), \tag{2} \]
where $\mu = (\mu_1, \ldots, \mu_n)^\top$ is the mean response vector, $Z$ is a known $n \times (p - 2)$ matrix of full rank, $g(\gamma)$ is an $n \times 1$ vector, $\beta = (\lambda^\top, \eta, \gamma)^\top$, $\lambda = (\lambda_1, \ldots, \lambda_{p-2})^\top$, and $\eta$ and $\gamma$ are scalar parameters. This class of models occurs very often in statistical modeling. For example, $\mu = \lambda_1 z_1 + \lambda_2 z_2 + \eta \exp(\gamma x)$ [27], $\mu = \lambda - \eta \log(x_1 + \gamma x_2)$ [28], $\mu = \lambda + \eta \log(x_1/(\gamma + x_2))$ [29], and $\mu = \lambda + \eta x/(\gamma + x)$ [30]. Ratkowsky [30, Chapter 5] discussed several models of the form (2), which include the asymptotic regression and Weibull-type models given by $\mu = \lambda - \eta\gamma^x$ and $\mu = \lambda - \eta \exp(-\gamma x)$, respectively. Note that the linear model of Rieck and Nedelman [5] is not adequate for the above situations, while the nonlinear model of Lemonte and Cordeiro [6] is. So, the linear model [5] is of limited use, whereas the nonlinear model [6] can be applied for modeling several situations, including the linear case.

The log-likelihood function for the parameter vector
$\theta = (\beta^\top, \alpha)^\top$ from a random sample $y = (y_1, \ldots, y_n)^\top$ obtained from model (1), except for constants, can be expressed as:
\[ \ell(\theta) = \sum_{i=1}^{n} \log(\xi_{i1}) - \frac{1}{2} \sum_{i=1}^{n} \xi_{i2}^2, \]
where
\[ \xi_{i1} = \xi_{i1}(\theta) = \frac{2}{\alpha} \cosh\left(\frac{y_i - \mu_i}{2}\right), \quad \xi_{i2} = \xi_{i2}(\theta) = \frac{2}{\alpha} \sinh\left(\frac{y_i - \mu_i}{2}\right), \]
and $\mu_i = \mu_i(x_i; \beta)$. We assume that some standard regularity conditions on $\ell(\theta)$ and its first four derivatives hold as $n$ goes to infinity; see, for example, [31, Chapter 9]. These conditions are the same regularity conditions required for Edgeworth expansions and are indeed fulfilled in this context. The nonlinear predictors $x_1, \ldots, x_n$ are embedded in an infinite sequence of $m \times 1$ vectors that must satisfy these regularity conditions for the asymptotics to be valid. Under these assumptions, the MLEs have good asymptotic properties such as consistency, sufficiency and normality. Further, the $n \times p$ local matrix $X = X(\beta) = \partial\mu/\partial\beta^\top$ of partial derivatives of $\mu = (\mu_1, \ldots, \mu_n)^\top$ with respect to $\beta$ is assumed to be of full rank, i.e. $\mathrm{rank}(X) = p$ for all $\beta$. The local model matrix $X$ has elements that are, in general, functions of the unknown parameter vector $\beta$.

Let
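$\hat\theta = (\hat\beta^\top, \hat\alpha)^\top$ be the MLE of $\theta = (\beta^\top, \alpha)^\top$. A joint iterative algorithm to obtain $\hat\theta = (\hat\beta^\top, \hat\alpha)^\top$ is:
\[ X^{(m)\top} X^{(m)} \beta^{(m+1)} = X^{(m)\top} \zeta^{(m)}, \quad \alpha^{(m+1)} = \frac{\alpha^{(m)}}{2} \left(1 + \bar\xi_2^{(m)}\right), \]
where $m = 0, 1, \ldots$ (the iteration counter), $\zeta^{(m)} = X^{(m)}\beta^{(m)} + [4/\psi_1(\alpha^{(m)})]\, s^{(m)}$, $s = s(\theta) = (s_1, \ldots, s_n)^\top$ with $s_i = (\xi_{i1}\xi_{i2} - \xi_{i2}/\xi_{i1})/2$, and $\bar\xi_2^{(m)} = n^{-1} \sum_{i=1}^{n} \xi_{i2}^{2(m)}$. Also,
\[ \psi_0(\alpha) = \left\{1 - \mathrm{erf}\left(\frac{\sqrt{2}}{\alpha}\right)\right\} \exp\left(\frac{2}{\alpha^2}\right), \quad \psi_1(\alpha) = 2 + \frac{4}{\alpha^2} - \frac{\sqrt{2\pi}}{\alpha}\, \psi_0(\alpha), \]
where $\mathrm{erf}(\cdot)$ is the error function given by $\mathrm{erf}(x) = (2/\sqrt{\pi}) \int_0^x e^{-t^2}\, dt$. Details on $\mathrm{erf}(\cdot)$ can be found in [32]. We can write $\hat\theta \overset{a}{\sim} N_{p+1}(\theta, K(\theta)^{-1})$ for $n$ large, where $\overset{a}{\sim}$ denotes approximately distributed, $K(\theta)$ is the block-diagonal expected information matrix given by $K(\theta) = \mathrm{diag}\{K_\beta, K_\alpha\}$, $K(\theta)^{-1}$ is its inverse, $K_\beta = \psi_1(\alpha)(X^\top X)/4$ is the expected information matrix for $\beta$, and $K_\alpha = 2n/\alpha^2$ is the information for $\alpha$. Since $K(\theta)$ is block-diagonal, the vector $\beta$ and the scalar $\alpha$ are globally orthogonal [33], and $\hat\beta$ and $\hat\alpha$ are asymptotically independent.

In the following, we shall consider the tests based on the LR ($S_{LR}$), Wald ($S_W$) and Rao score ($S_R$) statistics in the class of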
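BSNLMs for testing a composite null hypothesis. The hypothesis of interest is $H_0: \beta_1 = \beta_{10}$, which will be tested against the alternative hypothesis $H_a: \beta_1 \ne \beta_{10}$, where $\beta$ is partitioned as $\beta = (\beta_1^\top, \beta_2^\top)^\top$, $\beta_1 = (\beta_1, \ldots, \beta_q)^\top$ and $\beta_2 = (\beta_{q+1}, \ldots, \beta_p)^\top$. Here, $\beta_{10}$ is a fixed column vector of dimension $q$, and $\beta_2$ and $\alpha$ act as nuisance parameters. The partition of the parameter vector $\beta$ induces the corresponding partitions:
\[ U_\beta(\theta) = \frac{\partial\ell(\theta)}{\partial\beta} = \begin{pmatrix} U_1(\theta) \\ U_2(\theta) \end{pmatrix}, \quad U_1(\theta) = \frac{\partial\ell(\theta)}{\partial\beta_1} = X_1^\top s, \quad U_2(\theta) = \frac{\partial\ell(\theta)}{\partial\beta_2} = X_2^\top s, \]
\[ K_\beta = \frac{\psi_1(\alpha)}{4} \begin{pmatrix} X_1^\top X_1 & X_1^\top X_2 \\ X_2^\top X_1 & X_2^\top X_2 \end{pmatrix}, \]
where the matrix $X$ is partitioned as $X = [X_1 \; X_2]$, $X_1$ being $n \times q$ and $X_2$ being $n \times (p - q)$. Let $\hat\theta = (\hat\beta^\top, \hat\alpha)^\top$, with $\hat\beta = (\hat\beta_1^\top, \hat\beta_2^\top)^\top$, and $\tilde\theta = (\beta_{10}^\top, \tilde\beta_2^\top, \tilde\alpha)^\top$ denote the unrestricted and restricted MLEs of $\theta$, respectively. The LR, Wald and score statistics can be expressed, respectively, as:
\[ S_{LR} = 2\{\ell(\hat\theta) - \ell(\tilde\theta)\}, \]
\[ S_W = \frac{\psi_1(\hat\alpha)}{4} (\hat\beta_1 - \beta_{10})^\top \hat{R}^\top \hat{R} (\hat\beta_1 - \beta_{10}), \]
\[ S_R = \frac{4}{\psi_1(\tilde\alpha)}\, \tilde{s}^\top \tilde{X}_1 (\tilde{R}^\top \tilde{R})^{-1} \tilde{X}_1^\top \tilde{s}, \]
where $R = X_1 - X_2 C$, and $C = (X_2^\top X_2)^{-1} X_2^\top X_1$ represents a $(p - q) \times q$ matrix whose columns are the vectors of regression coefficients obtained in the normal linear regression of the columns of $X_1$ on the model matrix $X_2$. Here, tildes and hats indicate quantities evaluated at the restricted and unrestricted MLEs, respectively. The limiting distribution of the three statistics under $H_0$ is $\chi^2_q$; see, for example, [34]. The null hypothesis is rejected for a given nominal level, $\gamma$ say, if the observed value of the test statistic exceeds the upper $100(1 - \gamma)\%$ quantile of the $\chi^2_q$ distribution.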
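The estimation scheme of this section (the log-likelihood, the score-type vector $s$, the function $\psi_1$ and the joint iterative algorithm) can be sketched in a few lines. The block below is only an illustration under a strong simplifying assumption that is not in the paper: the location is a single intercept, $\mu_i = \beta$, so that $X = 1_n$ and the update for $\beta$ becomes scalar. Function names and the fixed iteration count are hypothetical choices.

```python
import math
import random


def psi0(alpha):
    """psi_0(alpha) = {1 - erf(sqrt(2)/alpha)} exp(2/alpha^2)."""
    return math.erfc(math.sqrt(2.0) / alpha) * math.exp(2.0 / alpha ** 2)


def psi1(alpha):
    """psi_1(alpha) = 2 + 4/alpha^2 - (sqrt(2 pi)/alpha) psi_0(alpha)."""
    return 2.0 + 4.0 / alpha ** 2 - math.sqrt(2.0 * math.pi) / alpha * psi0(alpha)


def loglik(y, mu, alpha):
    """Log-likelihood (up to constants): sum log(xi_i1) - 0.5 * sum xi_i2^2."""
    total = 0.0
    for yi, mi in zip(y, mu):
        u = (yi - mi) / 2.0
        xi1 = 2.0 / alpha * math.cosh(u)
        xi2 = 2.0 / alpha * math.sinh(u)
        total += math.log(xi1) - 0.5 * xi2 ** 2
    return total


def fit_intercept_only(y, n_iter=200):
    """Joint iterative scheme for the assumed special case mu_i = beta (X = 1_n)."""
    n = len(y)
    beta = sum(y) / n  # starting value: sample mean
    alpha = math.sqrt(4.0 / n * sum(math.sinh((v - beta) / 2.0) ** 2 for v in y))
    for _ in range(n_iter):
        xi1 = [2.0 / alpha * math.cosh((v - beta) / 2.0) for v in y]
        xi2 = [2.0 / alpha * math.sinh((v - beta) / 2.0) for v in y]
        s = [(a * b - b / a) / 2.0 for a, b in zip(xi1, xi2)]
        xi2_bar = sum(b * b for b in xi2) / n
        # With X = 1_n, X'X beta_new = X'zeta reduces to a scalar update.
        beta_new = beta + 4.0 / psi1(alpha) * sum(s) / n
        alpha_new = alpha / 2.0 * (1.0 + xi2_bar)
        beta, alpha = beta_new, alpha_new
    return beta, alpha
```

As a usage example, data can be simulated as `y_i = mu + 2 * math.asinh(alpha * z_i / 2)` with standard normal `z_i`, which follows from the SN cdf with $\sigma = 2$; the fitted values should then be close to the generating `mu` and `alpha` for moderately large samples. For a general local matrix $X$ the scalar update would be replaced by solving the $p \times p$ linear system stated in the text.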
3. Improved inference in BSNLMs
The chi-squared distribution may be a poor approximation to the null distribution of the statistics discussed in Section 2 when the sample size is not sufficiently large. It is thus important to obtain refinements for inference based on these tests from second-order asymptotic theory. For BSNLMs, Bartlett corrections for LR statistics were obtained by Lemonte [25]. In addition to the corrected LR statistic to test hypotheses on the model parameters in the new class of regression models, we derive Bartlett-type corrections for the score statistics on the basis of the general results in [13,35]. These results are new and represent additional contributions to improve likelihood-based inference in BSNLMs. Further, we also consider an alternative testing procedure based on the bootstrap.
To define the corrected LR statistic as well as to derive Bartlett-type corrections for the score statistic in BSNLMs, some additional notation is in order. We define the matrices:
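\[ Z = X(X^\top X)^{-1} X^\top = ((z_{lc})), \quad Z_2 = X_2(X_2^\top X_2)^{-1} X_2^\top = ((z_{2lc})), \]
\[ Z_d = \mathrm{diag}\{z_{11}, \ldots, z_{nn}\}, \quad Z_{2d} = \mathrm{diag}\{z_{211}, \ldots, z_{2nn}\}, \]
\[ D_d = \mathrm{diag}\{d_1, \ldots, d_n\}, \quad B = ((b_{ij})), \quad B_d = \mathrm{diag}\{b_{11}, \ldots, b_{nn}\}, \]
\[ D_{2d} = \mathrm{diag}\{d_{21}, \ldots, d_{2n}\}, \quad B_2 = ((b_{2ij})), \quad B_{2d} = \mathrm{diag}\{b_{211}, \ldots, b_{2nn}\}, \]
where
\[ d_i = \mathrm{tr}\{X_i (X^\top X)^{-1}\}, \quad b_{ij} = \mathrm{tr}\{X_i (X^\top X)^{-1} X_j (X^\top X)^{-1}\}, \]
\[ d_{2i} = \mathrm{tr}\{X_{22i} (X_2^\top X_2)^{-1}\}, \quad b_{2ij} = \mathrm{tr}\{X_{22i} (X_2^\top X_2)^{-1} X_{22j} (X_2^\top X_2)^{-1}\}, \]
$\mathrm{tr}\{\cdot\}$ means the trace operator, $X_i$ denotes a $p \times p$ matrix whose elements are $\partial^2\mu_i/\partial\beta_r\partial\beta_s$ (for $r, s = 1, \ldots, p$ and $i = 1, \ldots, n$), and $X_{22i}$ is a $(p - q) \times (p - q)$ matrix obtained from the $p \times p$ partitioned matrix following the same partition of $\beta$,
\[ X_i = \frac{\partial^2\mu_i}{\partial\beta_r\partial\beta_s} = \begin{pmatrix} X_{11i} & X_{12i} \\ X_{21i} & X_{22i} \end{pmatrix}. \]
Additionally, we define the quantities:
\[ \delta_0(\alpha) = \frac{2 + \alpha^2}{\psi_1(\alpha)\alpha^2}, \quad \delta_1(\alpha) = 4\delta_0(\alpha) \left\{\frac{2}{2 + \alpha^2} + \delta_0(\alpha) - \frac{2\alpha\psi_3(\alpha)}{\psi_1(\alpha)}\right\}, \quad \delta_2(\alpha) = 2\delta_0(\alpha)^2, \quad \delta_3(\alpha) = \frac{4\psi_2(\alpha)}{\psi_1(\alpha)^2}, \]
\[ g_0(\alpha) = \frac{12}{\psi_1(\alpha)}, \quad g_1(\alpha) = \frac{192}{\psi_1(\alpha)^2} \left\{\psi_2(\alpha) - \frac{1}{16}\left[\psi_5(\alpha) - \psi_1(\alpha)^2\right]\right\}, \]
\[ g_2(\alpha) = -\frac{144}{\psi_1(\alpha)^2} \left\{\psi_2(\alpha) - \frac{1}{16}\left[\psi_5(\alpha) - \psi_1(\alpha)^2\right]\right\}, \quad g_3(\alpha) = -\frac{24\alpha^2}{\psi_1(\alpha)^2} \left\{\frac{2(2 + \alpha^2)}{\alpha^3} - \psi_3(\alpha)\right\}^2, \]
\[ g_4(\alpha) = \frac{4(2 + \alpha^2)}{\alpha\psi_1(\alpha)^2} \left\{\frac{2(2 + \alpha^2)}{\alpha^3} - \psi_3(\alpha)\right\}, \quad g_5(\alpha) = \frac{\alpha}{\psi_1(\alpha)} \left\{\frac{2(2 + \alpha^2)}{\alpha^3} - \psi_3(\alpha)\right\}, \]
\[ g_6(\alpha) = \frac{\alpha^2}{\psi_1(\alpha)} \left\{\frac{4(1 - 2\alpha^2)}{\alpha^4} + \psi_4(\alpha)\right\}, \]
\[ \psi_2(\alpha) = -\frac{1}{4} \left\{2 + \frac{7}{\alpha^2} - \sqrt{\frac{\pi}{2}} \left(\frac{1}{2\alpha} + \frac{6}{\alpha^3}\right) \psi_0(\alpha)\right\}, \quad \psi_3(\alpha) = \frac{3}{\alpha^3} - \frac{\sqrt{2\pi}}{4\alpha^2} \left(1 + \frac{4}{\alpha^2}\right) \psi_0(\alpha), \]
\[ \psi_4(\alpha) = -\frac{10}{\alpha^4} - \frac{4}{\alpha^6} + \frac{1}{\alpha^7} \sqrt{\frac{\pi}{2}} \left(\alpha^4 + 10\alpha^2 + 8\right) \psi_0(\alpha), \]
\[ \psi_5(\alpha) = 12 + \frac{2}{\alpha^2} + \frac{16}{\alpha^4} + \frac{1}{\alpha} \sqrt{\frac{\pi}{2}} \left(1 + \frac{12}{\alpha^2}\right) \psi_0(\alpha). \]
We also define $Z^{(2)} = Z \odot Z$, $Z_d^{(2)} = Z_d \odot Z_d$, $Z_2^{(2)} = Z_2 \odot Z_2$, etc., where "$\odot$" denotes the Hadamard (elementwise) product of matrices. Let $1_n = (1, \ldots, 1)^\top$ be the $n$-vector of ones.

From the general result of Lawley [12], Lemonte et al. [25] defined the Bartlett-corrected LR statistic for testing $H_0$: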
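$\beta_1 = \beta_{10}$ in BSNLMs as:
\[ S^*_{LR} = \frac{S_{LR}}{1 + a_{LR}}, \tag{3} \]
where $a_{LR} = (A_{LR} + A_{LR,\beta\alpha})/q$, and
\[ A_{LR} = -\frac{1}{\psi_1(\alpha)}\, \mathrm{tr}\left\{D_d^{(2)} - D_{2d}^{(2)} - 2\left(B_d - B_{2d} + ZB - Z_2 B_2\right)\right\} + \delta_3(\alpha)\, \mathrm{tr}\left\{Z_d^{(2)} - Z_{2d}^{(2)}\right\}, \]
\[ A_{LR,\beta\alpha} = \frac{1}{n} \left\{\delta_1(\alpha)\, q + \delta_2(\alpha)(2p - q)\, q\right\}. \]
The factor $1 + a_{LR}$ is commonly referred to as the 'Bartlett correction'.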
In the following, we shall derive an improved score statistic for testing $H_0: \beta_1 = \beta_{10}$ in BSNLMs. All the results regarding the score test in BSNLMs are new. The basic idea of transforming the score test statistic in such a way that it becomes better approximated by the reference chi-squared distribution is due to [13]. The corrected score statistic proposed by these authors is obtained by multiplying the original score statistic by a second-degree polynomial in the original score statistic itself, producing a modified score test statistic whose null distribution has its asymptotic chi-squared approximation error reduced from $O(n^{-1})$ to $O(n^{-2})$. Thus, improved score tests may be based on the corrected score statistic, which are expected to deliver more accurate inferences with samples of typical sizes encountered by applied practitioners.

The Bartlett-type correction for the score statistic derived by Cordeiro and Ferrari [13] is very general in the sense that it is not tied to a particular parametric model, and hence needs to be tailored for each application of interest. The general expression can be very difficult to particularize for specific regression models because it involves complicated functions of moments of log-likelihood derivatives up to the fourth order. As we shall see below, we have been able to apply their results for BSNLMs; that is, we derive closed-form expressions for the Bartlett-type correction that defines the corrected score statistic in this class of models, allowing for the computation of this factor with minimal effort. All moments of log-likelihood derivatives for Bartlett-type corrections in BSNLMs are provided in the Appendix.

The Bartlett-corrected score statistic is given by $S^*_R = S_R[1 - (c_R + b_R S_R + a_R S_R^2)]$, where $a_R = A_{R3}/[12q(q + 2)(q + 4)]$, $b_R = (A_{R2} + A_{R2,\beta\alpha} - 2A_{R3})/[12q(q + 2)]$, $c_R = (A_{R1} + A_{R1,\beta\alpha} - A_{R2} - A_{R2,\beta\alpha} + A_{R3})/(12q)$,
\[ A_{R1} = g_0(\alpha)\, 1_n^\top \left[D_{2d}(Z - Z_2)D_{2d} - 2B_2 \odot (Z - Z_2)\right] 1_n + g_1(\alpha)\, \mathrm{tr}\{(Z_d - Z_{2d})Z_{2d}\}, \]
\[ A_{R2} = g_2(\alpha)\, \mathrm{tr}\{(Z_d - Z_{2d})^{(2)}\}, \quad A_{R3} = 0, \]
\[ A_{R1,\beta\alpha} = \frac{12q}{n} \left[(p - q)\, g_4(\alpha) + g_5(\alpha) + g_6(\alpha)\right], \quad A_{R2,\beta\alpha} = \frac{q(q + 2)}{n}\, g_3(\alpha); \]
when $\alpha$ is known $A_{R1,\beta\alpha}$ and $A_{R2,\beta\alpha}$ are zero. Since $A_{R3} = 0$, we have that $a_R = 0$ and hence the Bartlett-corrected score statistic has the form:
\[ S^*_R = S_R\left[1 - (c_R + b_R S_R)\right], \tag{4} \]
where
\[ b_R = \frac{A_{R2} + A_{R2,\beta\alpha}}{12q(q + 2)}, \quad c_R = \frac{A_{R1} - A_{R2} + A_{R1,\beta\alpha} - A_{R2,\beta\alpha}}{12q}. \]
The factor $[1 - (c_R + b_R S_R)]$ in (4), which is a first-degree polynomial in the original score statistic itself, is regarded as a Bartlett-type correction for the score statistic in such a way that the null distribution of $S^*_R$ is better approximated by the reference $\chi^2$ distribution than the distribution of the uncorrected score statistic. The null distribution of $S^*_R$ is chi-square with approximation error reduced from order $O(n^{-1})$ to $O(n^{-2})$.

A brief commentary on the quantities that define the improved score statistic is in order. Comments on the quantities that define the improved LR statistic are given in the corresponding article in which it was obtained. Note that $A_{R1}$ and $A_{R2}$ depend heavily on the particular local matrix $X$ in question. They involve the shape parameter
$\alpha$. Unfortunately, they are not easy to interpret in generality and provide no indication as to what structural aspects of the model contribute significantly to their magnitude. The quantities $A_{R1,\beta\alpha}$ and $A_{R2,\beta\alpha}$ can be regarded as the contribution due to the fact that $\alpha$ is considered unknown and has to be estimated from the data. Notice that $A_{R1,\beta\alpha}$ depends on the local matrix $X$ only through its rank, i.e. the number of regression parameters ($p$), and it also involves the number of parameters of interest ($q$) in the null hypothesis. Additionally, $A_{R2,\beta\alpha}$ involves the number of parameters of interest. Therefore, these quantities can be non-negligible if the dimension of $\beta$ and/or the number of tested parameters in the null hypothesis are not considerably smaller than the sample size. Finally, the matrices $D_{2d}$ and $B_2$ may be regarded as the amount of nonlinearity in the score statistic induced by the nonlinear parameters in $\mu_i(x_i; \beta)$. In particular, if $\mu_i(x_i; \beta) = x_i^\top\beta$ (i.e. linear model), then these matrices become zero.

Notice that the general expressions which define the improved LR and score statistics only involve simple operations on matrices and vectors, and can be easily implemented in any mathematical or statistical/econometric programming environment, such as R [36], Ox [37] and MAPLE [38]. Also, all the unknown parameters in the quantities that define the improved statistics are replaced by their restricted MLEs. The improved LR and score tests that employ (3) and (4), respectively, as test statistics, follow from the comparison of $S^*_{LR}$ and $S^*_R$ with the critical value obtained as the appropriate $\chi^2_q$ quantile.

Another way of improving the size properties of tests is achieved by the bootstrap technique. The bootstrap corrected tests are performed as follows. Let $S$ denote any of the uncorrected statistics. First, one generates $B$ bootstrap samples from the assumed model with the parameters replaced by restricted estimates computed using the original sample, under $H_0$, i.e. imposing the restrictions stated in the null hypothesis. Second, for each pseudo-sample, one computes the statistic; let $S_b$ denote the statistic for the $b$-th sample, $b = 1, \ldots, B$. Third, the $1 - \gamma$ percentile of $S$ is estimated by $q_{1-\gamma}$, such that $\#\{S_b \le q_{1-\gamma}\}/B = 1 - \gamma$. Finally, one rejects the null hypothesis if $S > q_{1-\gamma}$. Alternatively, the test may be based on the bootstrap p-value given by $p = \#\{S_b \ge S\}/B$. We denote the bootstrap corrected test statistics by $S^{boot}_{LR}$, $S^{boot}_W$ and $S^{boot}_R$. For a good discussion of bootstrap tests, see [39, Chapter 16]. Also, it is possible to prove second order asymptotic results on the bootstrap procedure. In fact, Carpenter [40] proved that the coverage error of LR confidence intervals constructed using the bootstrap estimate of the critical point is $O(n^{-2})$.

We have that, up to an error of order $O(n^{-2})$, the null distribution of the improved statistics $S^*_{LR}$, $S^*_R$, $S^{boot}_{LR}$, $S^{boot}_W$ and $S^{boot}_R$ is $\chi^2_q$. Hence, if the sample size is large, all improved tests could be recommended, since their type I error probabilities do not significantly deviate from the true nominal level. The natural question is how these tests perform when the sample size is small or of moderate size, and which one is the most reliable to test hypotheses in BSNLMs. In the next section, we shall use Monte Carlo simulations to shed some light on this issue. In addition to the improved tests, for the sake of comparison we also consider the original LR, Wald and score tests in the simulation study.
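The two routes to improved tests described in this section can be illustrated with a short sketch. The first function applies the corrected score statistic in the form $S_R[1 - (c_R + b_R S_R)]$ to user-supplied values of the $A$ quantities; the second implements the parametric bootstrap steps (restricted fit, $B$ pseudo-samples, empirical critical value and p-value). The toy model used to exercise the bootstrap (a normal mean test with an LR statistic) is purely an assumption chosen for brevity and is not the BSNLM; all function names are hypothetical.

```python
import math
import random


def corrected_score(s_r, q, a_r1, a_r2, a_r1_ba=0.0, a_r2_ba=0.0):
    """Corrected score statistic S_R * [1 - (c_R + b_R * S_R)] (A_R3 = 0 case)."""
    b_r = (a_r2 + a_r2_ba) / (12.0 * q * (q + 2.0))
    c_r = (a_r1 - a_r2 + a_r1_ba - a_r2_ba) / (12.0 * q)
    return s_r * (1.0 - (c_r + b_r * s_r))


def bootstrap_test(y, statistic, fit_null, simulate, B=600, gamma=0.10, seed=0):
    """Parametric bootstrap test: returns (S, critical value, bootstrap p-value)."""
    rng = random.Random(seed)
    s_obs = statistic(y)
    theta0 = fit_null(y)  # restricted estimates computed under H0
    s_boot = sorted(statistic(simulate(theta0, len(y), rng)) for _ in range(B))
    # Empirical (1 - gamma) quantile: smallest value with #{S_b <= q}/B >= 1 - gamma.
    k = math.ceil((1.0 - gamma) * B - 1e-9) - 1
    crit = s_boot[min(B - 1, max(0, k))]
    p_value = sum(1 for s in s_boot if s >= s_obs) / B
    return s_obs, crit, p_value


# Toy model (an assumption for illustration): y_i ~ N(mean, var), H0: mean = 0.
def toy_lr_statistic(y):
    n = len(y)
    ybar = sum(y) / n
    s2_hat = sum((v - ybar) ** 2 for v in y) / n  # unrestricted variance MLE
    s2_tilde = sum(v ** 2 for v in y) / n         # restricted variance MLE (mean = 0)
    return n * math.log(s2_tilde / s2_hat)


def toy_fit_null(y):
    return sum(v ** 2 for v in y) / len(y)


def toy_simulate(var0, n, rng):
    sd = math.sqrt(var0)
    return [rng.gauss(0.0, sd) for _ in range(n)]
```

A call such as `bootstrap_test(y, toy_lr_statistic, toy_fit_null, toy_simulate)` returns the observed statistic together with the bootstrap critical value and p-value; for a BSNLM the three callables would be replaced by the chosen statistic, the restricted fit and a sampler from the fitted model under $H_0$.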
4. Numerical evidence
In this section, we report the results from Monte Carlo simulations in order to compare the performance of the following tests in small- and moderate-sized samples in the class of BSNLMs: the usual LR ($S_{LR}$), Wald ($S_W$) and score ($S_R$) tests; the improved LR ($S^*_{LR}$) and score ($S^*_R$) tests; and the bootstrap-based tests ($S^{boot}_{LR}$, $S^{boot}_W$ and $S^{boot}_R$). We consider the model:
\[ y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_{p-1} x_{i,p-1} + \exp(\beta_p x_{ip}) + \varepsilon_i, \]
where $x_{i1} = 1$ and $\varepsilon_i \sim \mathrm{SN}(\alpha, 0, 2)$ for $i = 1, \ldots, n$, and $\alpha > 0$ is assumed unknown and is the same for all observations. The number of Monte Carlo replications is 15,000, and the nominal levels of the tests are $\gamma = 10\%$ and 5%. We consider 600 bootstrap replications (i.e. $B = 600$). The simulations were carried out using the Ox matrix programming language, which is freely distributed for academic purposes and available at http://www.doornik.com. All log-likelihood maximizations with respect to the model parameters were carried out using the Broyden–Fletcher–Goldfarb–Shanno (BFGS) quasi-Newton method with analytic first derivatives through the MaxBFGS subroutine. All regression parameters, except those fixed at the null hypothesis, were set equal to one. The covariate values were selected as random draws from the $U(0, 1)$ distribution. For each fixed $n$, the covariate values were kept constant throughout the experiment.

We report the null rejection rates of $H_0$:
β
1 =· · · =β
q =0 for all tests at the 10% and 5% nominal significance levels,i.e. the percentage of times that the corresponding statistics exceed the 10% and 5% upper points of the reference
χ
2distri-bution. The results are presented in Tables1, 2and 3. Entries are percentages. We consider different values for p (number of regression parameters), q (number of tested parameters in the null hypothesis), n (sample size), and
α
(shape parameter).Table 1
Null rejection rates (in %) for H 0 : β1 = β2 = 0 with α= 0 . 4 .
n Test p = 5 p = 6 p = 7 p = 8 10% 5% 10% 5% 10% 5% 10% 5% 20 SW 20 .24 14 .06 23 .44 15 .62 26 .82 18 .36 28 .88 21 .04 SLR 17 .28 11 .22 19 .30 11 .66 22 .64 14 .44 25 .06 16 .26 SR 13 .78 6 .68 15 .10 8 .02 18 .00 9 .48 20 .38 11 .38 S∗ LR 12 .08 6 .04 11 .74 6 .68 13 .52 7 .10 14 .40 8 .42 S∗ R 11 .48 5 .92 11 .16 6 .50 12 .70 6 .72 13 .90 8 .68 Sboot W 10 .66 5 .48 9 .94 5 .48 10 .24 4 .76 10 .12 5 .40 Sboot LR 10 .74 5 .42 9 .86 5 .58 10 .26 4 .74 10 .12 5 .36 Sboot R 10 .54 5 .42 9 .74 5 .66 10 .32 4 .80 10 .30 5 .32 30 SW 16 .72 10 .20 18 .90 11 .60 19 .04 11 .94 21 .34 14 .18 SLR 14 .36 8 .12 16 .46 9 .34 16 .70 9 .60 18 .76 11 .46 SR 11 .90 6 .00 13 .76 6 .74 14 .02 7 .32 16 .34 8 .78 S∗ LR 10 .38 5 .30 11 .22 5 .72 10 .86 6 .08 12 .18 6 .68 S∗ R 10 .08 5 .04 10 .28 5 .26 10 .16 5 .56 11 .38 5 .94 Sboot W 9 .98 5 .02 10 .28 5 .30 9 .86 5 .22 10 .76 5 .60 Sboot LR 9 .86 4 .96 10 .28 5 .24 9 .96 5 .22 10 .80 5 .54 Sboot R 9 .86 5 .08 10 .24 5 .32 9 .82 5 .20 10 .72 5 .64 50 SW 13 .92 7 .50 13 .86 7 .60 14 .92 8 .20 14 .32 8 .28 SLR 12 .44 6 .36 12 .74 6 .66 13 .54 7 .02 13 .02 7 .08 SR 10 .82 5 .08 11 .50 5 .58 11 .92 5 .96 11 .74 5 .96 S∗ LR 9 .82 4 .64 9 .74 4 .92 10 .02 4 .98 9 .62 4 .66 S∗ R 9 .82 4 .50 9 .56 4 .50 9 .62 4 .76 9 .02 4 .44 Sboot W 9 .76 4 .72 9 .74 4 .70 9 .66 4 .80 9 .36 4 .26 Sboot LR 9 .96 4 .74 9 .74 4 .80 9 .64 4 .80 9 .42 4 .30 Sboot R 9 .82 4 .78 9 .78 4 .78 9 .68 4 .84 9 .44 4 .30 Table 2
Null rejection rates (in %) for H 0 : β1 = β2 = 0 with α= 0 . 8 .
n Test p = 5 p = 6 p = 7 p = 8 10% 5% 10% 5% 10% 5% 10% 5% 20 SW 20 .74 13 .68 23 .86 16 .34 27 .48 19 .18 29 .64 22 .42 SLR 16 .44 9 .50 19 .68 11 .62 22 .68 13 .96 25 .58 17 .54 SR 11 .70 5 .26 14 .38 6 .96 16 .96 8 .90 20 .44 11 .42 S∗ LR 10 .76 5 .32 12 .14 6 .44 13 .26 7 .66 15 .70 8 .70 S∗ R 9 .48 4 .54 10 .38 5 .00 11 .16 5 .80 11 .98 5 .96 Sboot W 9 .10 4 .48 9 .70 5 .22 10 .04 5 .46 11 .00 5 .02 Sboot LR 9 .20 4 .40 9 .58 5 .20 9 .90 5 .42 10 .84 4 .88 Sboot R 9 .08 4 .30 9 .42 4 .92 9 .80 5 .18 10 .50 5 .10 30 SW 17 .90 11 .36 18 .28 11 .46 20 .20 12 .32 21 .94 14 .38 SLR 14 .94 8 .28 15 .42 8 .90 16 .98 9 .42 18 .86 11 .08 SR 11 .82 5 .60 12 .32 6 .38 13 .24 6 .30 15 .08 7 .88 S∗LR 11 .24 5 .62 10 .90 5 .86 10 .90 5 .60 12 .08 6 .92 S∗ R 10 .68 5 .42 10 .16 5 .30 10 .16 4 .96 10 .94 5 .82 Sboot W 10 .92 5 .18 10 .04 5 .26 9 .74 4 .82 10 .64 5 .84 Sboot LR 10 .78 5 .32 10 .12 5 .22 9 .64 4 .82 10 .58 5 .56 Sboot R 10 .46 5 .44 9 .90 5 .24 9 .48 4 .70 10 .38 5 .36 50 SW 13 .04 7 .84 14 .70 8 .62 16 .16 9 .26 16 .16 9 .56 SLR 11 .72 6 .74 12 .88 7 .32 14 .44 7 .74 14 .50 8 .06 SR 10 .38 5 .44 11 .18 5 .68 11 .94 6 .08 12 .44 6 .34 S∗ LR 9 .68 5 .16 10 .40 5 .42 10 .66 5 .62 10 .94 5 .68 S∗R 9 .70 5 .24 10 .28 5 .36 10 .44 5 .24 10 .70 5 .34 Sboot W 9 .76 5 .06 10 .26 5 .46 10 .58 5 .72 10 .40 5 .20 Sboot LR 9 .80 5 .12 10 .12 5 .42 10 .34 5 .64 10 .26 5 .14 Sboot R 9 .82 5 .26 10 .00 5 .26 10 .26 5 .50 10 .24 5 .04
Table 3
Null rejection rates (in %) for $H_0: \beta_1 = \cdots = \beta_q$ with $\alpha = 0.6$, $p = 7$ and $n = 25$.

Test           q = 2         q = 3         q = 4
             10%     5%    10%     5%    10%     5%
S_W         22.72  14.74  25.38  17.42  27.46  19.72
S_LR        19.66  11.48  20.02  12.76  20.70  12.38
S_R         15.38   8.14  14.38   6.96  12.32   5.12
S*_LR       12.08   6.56  12.64   6.76  11.90   6.36
S*_R        10.60   5.40  10.84   5.24  10.14   4.50
S^boot_W    10.26   5.02  10.46   5.12  10.14   5.10
S^boot_LR   10.36   5.06  10.48   5.08  10.04   5.22
S^boot_R    10.20   5.10  10.50   5.06  10.04   4.82

Table 4
Nonnull rejection rates (in %) of the tests; $\gamma = 10\%$, $\alpha = 0.5$ and $p = 5$.

n   δ        S*_LR     S*_R   S^boot_W  S^boot_LR   S^boot_R
25  0.15     16.50    16.10      15.70      15.90      15.80
    0.30     26.90    25.20      25.30      25.70      25.70
    0.50     42.30    40.70      40.30      40.20      40.50
    0.70     78.40    78.00      78.40      78.60      78.20
40  0.15     17.00    17.00      17.80      16.80      16.80
    0.30     42.80    42.20      43.20      43.40      43.80
    0.50     70.00    70.00      71.80      71.00      71.40
    0.70     89.00    88.20      87.80      88.00      87.60

The figures in Tables 1 to 3 reveal important information. The test that uses the Wald statistic ($S_W$) is markedly liberal (rejecting the null hypothesis more frequently than expected based on the selected nominal level), more so as the number of tested parameters in the null hypothesis ($q$) and the number of regression parameters ($p$) increase. For example, if $p = 6$, $\gamma = 10\%$ and $n = 20$, the null rejection rates are 23.44% for $\alpha = 0.4$ (Table 1), whereas we have 23.86% for $\alpha = 0.8$ (Table 2); that is, more than twice the nominal level. Also, if $q = 3$ and $\gamma = 5\%$, the null rejection rate is 17.42% (Table 3); that is, more than thrice the nominal level. Notice that the test which uses the original LR statistic ($S_{LR}$) is also liberal, but less size distorted than the Wald test. In the above examples, the null rejection rates are 19.30% (Table 1), 19.68% (Table 2) and 12.76% (Table 3). The original score ($S_R$) test is also liberal, but less size distorted than the original LR and Wald tests in all cases. It is noticeable that the original score test is much less liberal than the original LR and Wald tests.
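To make the size comparisons above concrete, the following Python sketch computes the ratio of the empirical null rejection rate to the nominal level for the Wald and LR figures quoted in this paragraph. The dictionary layout and the helper name `distortion_ratio` are illustrative assumptions; the numbers themselves are copied from Tables 1 to 3.

```python
# Empirical null rejection rates (in %) quoted in the text, stored as
# (empirical rate, nominal level). Table 1 and Table 2 entries use
# n = 20, p = 6 and a 10% nominal level; Table 3 entries use q = 3 and 5%.
quoted = {
    ("S_W", "Table 1"): (23.44, 10.0),
    ("S_W", "Table 2"): (23.86, 10.0),
    ("S_W", "Table 3"): (17.42, 5.0),
    ("S_LR", "Table 1"): (19.30, 10.0),
    ("S_LR", "Table 2"): (19.68, 10.0),
    ("S_LR", "Table 3"): (12.76, 5.0),
}


def distortion_ratio(empirical, nominal):
    """Ratio of empirical rejection rate to nominal level (1.0 means no distortion)."""
    return empirical / nominal


ratios = {key: round(distortion_ratio(*val), 2) for key, val in quoted.items()}
for (stat, table), r in ratios.items():
    print(f"{stat:5s} {table}: {r:.2f} x nominal level")
```

A ratio above 1 indicates a liberal test; in every quoted setting the Wald ratios exceed the corresponding LR ratios.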
As pointed out above, the usual score test is less size distorted than the original LR and Wald tests. However, its null rejection rates can also deviate considerably from the significance levels of the test. For example, if $p = 7$, $\gamma = 10\%$ and $n = 20$, the null rejection rates are 18.00% for $\alpha = 0.4$ (see Table 1), and 16.96% for $\alpha = 0.8$ (see Table 2). Also, if $q = 3$ and $\gamma = 10\%$, the null rejection rate is 14.38%; see Table 3. On the other hand, the Bartlett-corrected LR and score tests that employ $S^*_{LR}$ and $S^*_R$ as test statistics, respectively, and the bootstrap-based tests ($S^{\mathrm{boot}}_{LR}$, $S^{\mathrm{boot}}_W$ and $S^{\mathrm{boot}}_R$) are less size distorted than the usual LR, Wald and score tests for testing hypotheses in BSNLMs; that is, the impact of the number of regressors and the number of tested parameters in the null hypothesis is much less important for the improved tests. Among the Bartlett-corrected tests, the test that uses the statistic $S^*_{LR}$ presents the worst performance, displaying null rejection rates more size distorted than those of the Bartlett-corrected score test. For example, if $p = 8$, $\gamma = 5\%$ and $n = 30$, the null rejection rates of $S^*_{LR}$ and $S^*_R$ are 6.68% and 5.94%, respectively, for $\alpha = 0.4$ (see Table 1), and 6.92% and 5.82%, respectively, for $\alpha = 0.8$ (see Table 2). Also, if $q = 3$ and $\gamma = 10\%$, the null rejection rates of $S^*_{LR}$ and $S^*_R$ are 12.64% and 10.84%, respectively; see Table 3. The Bartlett-corrected score test produces null rejection rates that are close to the nominal levels in all cases. It is also interesting to note that the tests that use the bootstrap-corrected critical values ($S^{\mathrm{boot}}_{LR}$, $S^{\mathrm{boot}}_W$ and $S^{\mathrm{boot}}_R$) are less size distorted than the corresponding uncorrected tests; that is, they present null rejection rates that are always close to the nominal levels. This reveals the good performance of the bootstrap-based tests in testing hypotheses in the class of BSNLMs. It also reveals that the bootstrap tests and the Bartlett-corrected score test are similarly efficient in correcting the size distortions of the tests. Finally, the figures in Tables 1 and 2 reveal that the null rejection rates of all tests approach the corresponding nominal levels as the sample size grows, as expected.
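The bootstrap-based tests replace the chi-squared critical value with a quantile of the simulated null distribution of the statistic. The Python sketch below illustrates only that idea and is not the authors' implementation. The callable `simulate_null_stat` is a hypothetical stand-in for "generate a sample from the model fitted under the null hypothesis and recompute the test statistic", and the toy statistic (a chi-squared variate with 2 degrees of freedom inflated by an assumed factor of 1.3) merely mimics a liberal test.

```python
import math

import numpy as np


def bootstrap_critical_value(simulate_null_stat, B, level, rng):
    """Return the (1 - level) quantile of B statistics simulated under the null."""
    stats = np.array([simulate_null_stat(rng) for _ in range(B)])
    return float(np.quantile(stats, 1.0 - level))


def toy_null_stat(rng, inflation=1.3):
    # Hypothetical stand-in: a real application would refit the model on a
    # bootstrap sample and return the LR, Wald or score statistic.
    return inflation * rng.chisquare(df=2)


rng = np.random.default_rng(12345)
level = 0.10
crit_boot = bootstrap_critical_value(toy_null_stat, B=20000, level=level, rng=rng)
# The chi-squared quantile with 2 degrees of freedom has the closed form -2*log(level).
crit_chi2 = -2.0 * math.log(level)

print(f"chi-squared critical value: {crit_chi2:.3f}")
print(f"bootstrap critical value:   {crit_boot:.3f}")
```

Because the toy statistic is stochastically larger than its chi-squared reference, comparing it with the chi-squared quantile would over-reject, whereas the simulated quantile tracks the actual null distribution. The price is that every test requires `B` additional evaluations of the statistic.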
We now turn to the finite-sample power properties of the Bartlett-corrected tests and the bootstrap-based tests. (We have only considered the corrected tests since the original LR, Wald and score tests are considerably size distorted, as noted earlier.) Here, $p = 5$, $\alpha = 0.5$, $\gamma = 10\%$, $\beta_3 = \beta_4 = \beta_5 = 1$, and $n = 25$ and 40. For the power simulations, we evaluate the rejection rates under the alternative hypothesis $H_a: \beta_1 = \beta_2 = \delta$ for different values of $\delta$ ($\delta > 0$). Table 4 lists the nonnull rejection rates of the tests. The powers of the tests increase with $n$ and also with $\delta$, as expected. Power simulations carried out for other values of $n$, $p$, $\alpha$ and $\gamma$ showed a similar pattern.

The main findings from these simulation results can be summarized as follows. The usual LR, Wald and score tests can be considerably oversized (liberal) when testing hypotheses on the regression parameters in BSNLMs, rejecting the null hypothesis much more frequently than expected based on the selected nominal levels. The analytically improved LR and score tests tend to overcome these problems, producing null rejection rates which are close to the nominal levels. Overall, in small to moderate-sized samples, the best performing test is the Bartlett-corrected score test. This improved test performs very well and hence should be recommended to test hypotheses in BSNLMs. Additionally, the Wald test should not be recommended to test hypotheses on the regression parameters in this class of models when the sample size is not large, since it is much more liberal than the other tests. We emphasize that an alternative to the analytical correction is the bootstrap technique. Both approaches lead to similarly performing tests, but the bootstrap approach is quite computationally intensive.
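As a quick check of the statement that the analytical and bootstrap corrections lead to similarly performing tests, the Python sketch below computes, for each row of Table 4, the spread (maximum minus minimum) of the nonnull rejection rates across the five corrected tests. The array layout and variable names are illustrative assumptions; the figures are copied from Table 4.

```python
import numpy as np

# Rows follow Table 4: (n, delta).
# Columns: S*_LR, S*_R, S^boot_W, S^boot_LR, S^boot_R.
settings = [(25, 0.15), (25, 0.30), (25, 0.50), (25, 0.70),
            (40, 0.15), (40, 0.30), (40, 0.50), (40, 0.70)]
power = np.array([
    [16.50, 16.10, 15.70, 15.90, 15.80],
    [26.90, 25.20, 25.30, 25.70, 25.70],
    [42.30, 40.70, 40.30, 40.20, 40.50],
    [78.40, 78.00, 78.40, 78.60, 78.20],
    [17.00, 17.00, 17.80, 16.80, 16.80],
    [42.80, 42.20, 43.20, 43.40, 43.80],
    [70.00, 70.00, 71.80, 71.00, 71.40],
    [89.00, 88.20, 87.80, 88.00, 87.60],
])

# Spread of the rejection rates across the five tests, per setting.
spread = power.max(axis=1) - power.min(axis=1)
for (n, delta), s in zip(settings, spread):
    print(f"n = {n}, delta = {delta:.2f}: spread = {s:.2f} percentage points")
print(f"largest spread: {spread.max():.2f} percentage points")
```

In these figures the five tests never differ by more than 2.10 percentage points in any setting, which is small relative to the change in power across values of $\delta$.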
5. Tests on the parameter $\alpha$
In this section, the problem under consideration is that of testing a composite null hypothesis $H_0: \alpha = \alpha_0$ against $H_a: \alpha \neq \alpha_0$, where $\alpha_0$ is a positive specified value for $\alpha$. Here, $\beta$ acts as a vector of nuisance parameters. The likelihood ratio ($S_{LR}$), Wald ($S_W$) and Rao score ($S_R$) statistics for testing $H_0: \alpha = \alpha_0$ can be expressed, respectively, as:

$$S_{LR} = 2\{\ell(\hat{\theta}) - \ell(\tilde{\theta})\}, \qquad S_W = 2n\left(\frac{\hat{\alpha} - \alpha_0}{\hat{\alpha}}\right)^2, \qquad S_R = \frac{n(\bar{\xi}_2 - 1)^2}{2},$$

where $\bar{\xi}_2 = \bar{\xi}_2(\tilde{\theta}) = n^{-1}\sum_{i=1}^{n} \xi_{i2}^2(\tilde{\theta})$, with $\hat{\theta} = (\hat{\beta}^\top, \hat{\alpha})^\top$ and $\tilde{\theta} = (\tilde{\beta}^\top, \alpha_0)^\top$ representing the unrestricted and restricted MLEs of $\theta = (\beta^\top, \alpha)^\top$, respectively. Under $H_0$, these statistics have a $\chi^2_1$ distribution up to an error of order $O(n^{-1})$.

From [25], the Bartlett-corrected LR statistic for testing $H_0: \alpha = \alpha_0$ is given by $S^*_{LR} = S_{LR}/[1 + \epsilon(\alpha_0, p)]$, where

$$\epsilon(\alpha, p) = \frac{1}{n}\left\{\frac{1}{3} + \delta_1(\alpha)\,p + \delta_2(\alpha)\,p^2\right\}.$$

Note that $\epsilon(\alpha, p)$ depends on the local model matrix only through its rank, i.e. $p$. More specifically, it is a second degree polynomial in $p$ divided by $n$. Hence, $\epsilon(\alpha, p)$ can be non-negligible if the dimension of $\beta$ is not considerably smaller than the sample size. It is also noteworthy that $\epsilon(\alpha, p)$ depends on $\alpha$ but not on $\beta$.

After some algebraic manipulations, we define the improved score statistic as

$$S^*_R = S_R\left[1 - (c_R + b_R S_R + a_R S_R^2)\right],$$

where $a_R = A_{R3}/180$, $b_R = (A_{R2} - 2A_{R3})/36$, $c_R = (A_{R1} - A_{R2} + A_{R3})/12$, and

$$A_{R1} = \frac{24}{n}\left\{p(p+6)\,\delta_0(\alpha)^2 - \frac{4p\,\alpha\,\delta_0(\alpha)\,\psi_3(\alpha)}{\psi_1(\alpha)} - \frac{p(4 + 5\alpha^2)}{\alpha^2\psi_1(\alpha)}\right\},$$

$$A_{R2} = \frac{12}{n}\left\{3 - \frac{4(2 + \alpha^2)\,p}{\alpha^2\psi_1(\alpha)}\right\}, \qquad A_{R3} = \frac{40}{n}.$$

It should be emphasized that these expressions are quite simple and depend on the model only through the rank of $X$ and $\alpha$. They do not involve the unknown $\beta$. The $A$'s are all evaluated at $\alpha_0$. Under the null hypothesis, the adjusted statistics $S^*_{LR}$ and $S^*_R$ have a $\chi^2_1$ distribution up to an error of order $O(n^{-2})$. The improved LR and score tests follow from the comparison of $S^*_{LR}$ and $S^*_R$ with the critical value obtained from the appropriate $\chi^2_1$ quantile. Finally, it is evident that bootstrap-based tests can also be considered for testing the null hypothesis $H_0: \alpha = \alpha_0$.

Next, we report Monte Carlo evidence on the finite sample performance of the above tests for testing the parameter $\alpha$. Here, $n = 35$, $p = 4$ and 5, and $\beta$ is a $p$-vector of ones. The null hypothesis under test is $H_0: \alpha = 0.5$. The number of Monte Carlo replications is 15,000, and the nominal levels of the tests are $\gamma = 10\%$ and 5%. We consider 600 bootstrap replications (i.e. $B = 600$). Table 5 presents the null rejection rates for $H_0: \alpha = 0.5$. Note that the corrected tests (Bartlett and bootstrap) are much less size distorted than the uncorrected tests, indicating a very good performance of the improved tests in testing inference on $\alpha$ in the class of BSNLMs.

6. Empirical illustration
In this section, we illustrate an application of the usual LR, Wald and score statistics, and the improved LR and score statistics, for testing hypotheses in the class of BSNLMs with a real data set. We consider an application to a biaxial fatigue data set reported by Rieck and Nedelman [5] on the life of a metal piece in cycles to failure. Let $N$ be the number of cycles to failure and let the explanatory variable ($x$) be the work per cycle (mJ/m$^3$). Due to the genesis of the BS distribution, the fatigue