
Asymptotic linearity and limit distributions, approximations

João T. Mexia (a), Manuela M. Oliveira (b)

(a) FCT Nova University of Lisbon, Department of Mathematics, Quinta da Torre, 2825 Monte da Caparica, Portugal

(b) University of Évora, Department of Mathematics and CIMA – Center for Research on Mathematics and its Applications, Colégio Luís António Verney, Rua Romão Ramalho 59, 7000-671 Évora, Portugal

Article info

Article history: Received 20 January 2009. Received in revised form 13 September 2009. Accepted 14 September 2009. Available online 19 September 2009.

Keywords: Asymptotic linearity; Linear and quadratic forms; Polynomials; Normal distributions; Central limit theorems

Abstract

Linear and quadratic forms as well as other low degree polynomials play an important role in statistical inference. Asymptotic results and limit distributions are obtained for a class of statistics depending on $\boldsymbol{\mu} + X$, with $X$ any random vector and $\boldsymbol{\mu}$ a non-random vector with $\|\boldsymbol{\mu}\| \to +\infty$. This class contains the polynomials in $\boldsymbol{\mu} + X$. An application to the case of normal $X$ is presented. This application includes a new central limit theorem which is connected with the increase of non-centrality for samples of fixed size. Moreover, upper bounds for the suprema of the differences between exact and approximate distributions and their quantiles are obtained.

Contents

1. Introduction

2. Notations and concepts

3. Asymptotic results

4. Application: the normal case

5. Final remarks

References

1. Introduction

Linear and quadratic forms as well as other low degree polynomials play an important role in statistics (e.g. Ito, 1980; Lazraq and Cleroux, 2001; Oliveira and Mexia, 2007a, 2007b; Sabatier and Traissac, 1994; Scheffé, 1959). In Areia et al. (2008) it is pointed out that low degree polynomials in normal independent variables, with sufficiently small variation coefficients, are approximately normal. Moreover, Areia et al. (2008) also present simulations suggesting that, when $\|\boldsymbol{\mu}\| \to +\infty$, a polynomial $P(\boldsymbol{\mu} + X)$ has a leading component linear in $X$. This linear component will be normal.

We will obtain asymptotic approximations and limit distributions for a class of statistics containing $P(\boldsymbol{\mu} + X)$.

In the next section we will present the concept of asymptotically linear functions, which will play a central role in our study. We will also introduce the relevant notations and point out that polynomials are asymptotically linear functions. Next we use a variant of the $\delta$ method (e.g. Lehmann and Casella, 2001) to obtain asymptotic and limit results. This variant will display a good control of the approximation errors both for distribution functions and their quantiles. In the last section we will consider the case when $\boldsymbol{\mu} + X$ is normal with mean vector $\boldsymbol{\mu}$.

Journal of Statistical Planning and Inference. Journal homepage: www.elsevier.com/locate/jspi

Corresponding author. Tel.: +351 266 745300; fax: +351 266 744546.

In the normal case, if $V^{+}$ is the Moore–Penrose inverse of the variance–covariance matrix $V$ of $\boldsymbol{\mu} + X$, the quadratic form $(\boldsymbol{\mu} + X)^t V^{+} (\boldsymbol{\mu} + X)$ will have a chi-square distribution with $r = \operatorname{rank}(V)$ degrees of freedom and non-centrality parameter

$$\delta = \boldsymbol{\mu}^t V^{+} \boldsymbol{\mu}$$

(e.g. Mexia, 1990). Thus we may use $\delta$ to measure the non-centrality of the sample. In our study of the normal case we will obtain a central limit theorem for fixed size samples whose non-centrality parameter diverges to $+\infty$. Moreover, $\delta$ may be used to measure sample non-centrality whenever $V$, the variance–covariance matrix of $X$, is defined, and $\delta \to +\infty$ will imply $\|\boldsymbol{\mu}\| \to +\infty$. Since our asymptotic and limit results are derived assuming that $\|\boldsymbol{\mu}\| \to +\infty$, they are connected with the increase of non-centrality in samples, normal or not, with fixed size. When the components $X_1, \ldots, X_m$ of the sample are i.i.d. with mean value $\mu$ and variance $\sigma^2$ we have $\boldsymbol{\mu} = \mu \mathbf{1}_m$ and $V = \sigma^2 I_m$, so $V^{+} = (1/\sigma^2) I_m$ and

$$\delta = m\,\frac{\mu^2}{\sigma^2} = \frac{m}{VC^2},$$

with $VC = \sigma/\mu$ the variation coefficient. Thus, when $X_1, \ldots, X_m$ are observations with low variation coefficients, we have a sample with large $\delta$. Low variation coefficients arise when high precision observations are used.
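The relation between the non-centrality parameter and the variation coefficient in the i.i.d. case can be checked numerically. The following is only a minimal sketch: the function names `noncentrality_iid` and `variation_coefficient` and the numerical values are illustrative assumptions, not part of the paper.

```python
def noncentrality_iid(mean, sigma, m):
    """delta = mu^t V^+ mu for m i.i.d. components (illustrative helper).

    The mean vector is mean * 1_m and V^+ = (1 / sigma^2) I_m, so only the
    diagonal of V^+ is needed.
    """
    mu_vec = [mean] * m
    v_plus_diag = [1.0 / sigma**2] * m
    return sum(mu_i * w * mu_i for mu_i, w in zip(mu_vec, v_plus_diag))


def variation_coefficient(mean, sigma):
    """VC = sigma / mean."""
    return sigma / mean


# Hypothetical sample: 5 observations with mean 10 and standard deviation 0.5.
m, mean, sigma = 5, 10.0, 0.5
delta = noncentrality_iid(mean, sigma, m)
vc = variation_coefficient(mean, sigma)
print(delta, m / vc**2)  # the two values should agree up to rounding
```

Halving `sigma` in this sketch multiplies `delta` by four, which is the sense in which high precision observations give highly non-central samples.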

2. Notations and concepts

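Given a sufficiently regular function $g : \mathbb{R}^k \to \mathbb{R}$, let $\nabla g$ and $\nabla^2 g$ be the gradient and the Hessian matrix of $g$. Let $r_d(x)$ be the supremum of the spectral radius $r(y)$ of $\nabla^2 g(y)$ when $\|y - x\| \le d$. The function $g$ is asymptotically linear if, whatever $d > 0$, we have

$$k_d(u) \xrightarrow[u \to +\infty]{} 0, \quad \text{with} \quad k_d(u) = \sup\left\{ \frac{r_d(x)}{\|\nabla g(x)\|} ;\ \|x\| \ge u \right\}.$$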

Since the first and second order partial derivatives of polynomials are themselves polynomials, with degree decreasing at each differentiation, polynomials will be asymptotically linear.
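As an illustration of the definition, consider the univariate polynomial $g(x) = x^3$, for which the spectral radius of the Hessian is simply $|g''(y)| = 6|y|$. For $x > d$ the ratio $r_d(x)/|g'(x)|$ equals $2(x + d)/x^2$, which decreases to zero. The sketch below approximates $r_d(x)$ on a grid and compares it with this closed form; the helper names and the grid approximation are assumptions made for the example only.

```python
def r_d(x, d, hessian, grid=2001):
    """Approximate sup of |g''(y)| over |y - x| <= d on an equally spaced grid."""
    points = (x - d + 2.0 * d * i / (grid - 1) for i in range(grid))
    return max(abs(hessian(y)) for y in points)


def ratio(x, d, gradient, hessian):
    """r_d(x) / |g'(x)|, the quantity whose supremum over |x| >= u defines k_d(u)."""
    return r_d(x, d, hessian) / abs(gradient(x))


def gradient(x):
    return 3.0 * x**2   # g'(x) for g(x) = x^3


def hessian(x):
    return 6.0 * x      # g''(x) for g(x) = x^3


d = 1.0
for u in (10.0, 100.0, 1000.0):
    # numerical ratio next to the closed form 2 (u + d) / u^2
    print(u, ratio(u, d, gradient, hessian), 2.0 * (u + d) / u**2)
```

Because the ratio is decreasing in $|x|$ for this polynomial, its value at $x = u$ coincides with $k_d(u)$, so the printed values show $k_d(u)$ shrinking roughly like $2/u$.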

Given a random vector $X$ and an asymptotically linear function $g(\cdot)$, we will show that, when $\|\boldsymbol{\mu}\| \to \infty$, the distribution $F_Y$ of

$$Y = g(\boldsymbol{\mu} + X)$$

approaches the distribution of

$$Z = g(\boldsymbol{\mu}) + \nabla g(\boldsymbol{\mu})^t X.$$

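With $l_p$ the $p$-th quantile of the random variable $L$, given

$$Z^{\circ} = \frac{Z - g(\boldsymbol{\mu})}{\|\nabla g(\boldsymbol{\mu})\|} \quad \text{and} \quad Y^{\circ} = \frac{Y - g(\boldsymbol{\mu})}{\|\nabla g(\boldsymbol{\mu})\|}$$

we will obtain upper bounds for $|y^{\circ}_p - z^{\circ}_p|$. Moreover, when

$$\frac{1}{\|\nabla g(\boldsymbol{\mu})\|}\,\nabla g(\boldsymbol{\mu}) \xrightarrow[\|\boldsymbol{\mu}\| \to \infty]{} b$$

we will establish that $F_{Z^{\circ}}$ converges to $F_{Z^{\circ\circ}}$, with $Z^{\circ\circ} = b^t X$.

As we shall see, these results are easy to apply when $X$ is normal. Then $Z^{\circ\circ}$ will also be normal and a central limit theorem will apply.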

3. Asymptotic results

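If $g : \mathbb{R}^k \to \mathbb{R}$ is sufficiently regular we have, see Khuri (2003),

$$Y = g(\boldsymbol{\mu} + X) = g(\boldsymbol{\mu}) + \nabla g(\boldsymbol{\mu})^t X + \int_0^1 (1 - h)\left(X^t\, \nabla^2 g(\boldsymbol{\mu} + hX)\, X\right) dh.$$

Thus, when $\|X\| \le d$,

$$|Y - Z| = \left|\int_0^1 (1 - h)\left(X^t\, \nabla^2 g(\boldsymbol{\mu} + hX)\, X\right) dh\right| \le r_d(\boldsymbol{\mu})\, d^2$$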

J.T. Mexia, M.M. Oliveira / Journal of Statistical Planning and Inference 140 (2010) 353–357 354


as well as

$$|Y^{\circ} - Z^{\circ}| \le k_d(\|\boldsymbol{\mu}\|)\, d^2.$$

We now establish

Proposition 3.1. If $g(\cdot)$ is asymptotically linear, then

$$|Y^{\circ} - Z^{\circ}| \xrightarrow[\|\boldsymbol{\mu}\| \to \infty]{\text{a.s.}} 0,$$

where a.s. indicates almost sure convergence.

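Proof. If $g(\cdot)$ is asymptotically linear, whatever $\varepsilon > 0$ and $d > 0$, there is $u(\varepsilon)$ such that, for $u \ge u(\varepsilon)$, $k_d(u)\, d^2 \le \varepsilon$. Thus, when $\|\boldsymbol{\mu}\| \ge u(\varepsilon)$, $\|X\| < d$ implies $|Y^{\circ} - Z^{\circ}| \le \varepsilon$. Whatever $\delta > 0$, we may choose $d > 0$ to ensure that $\Pr(\|X\| < d) \ge 1 - \delta$; thus, when $\|\boldsymbol{\mu}\| \ge u(\varepsilon)$, we will also have $\Pr(|Y^{\circ} - Z^{\circ}| \le \varepsilon) \ge 1 - \delta$, which establishes the thesis. $\square$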
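The shrinking gap between the standardized statistic and its linear approximation can be seen in a small simulation. This is only a sketch under simple assumptions chosen for the example: $g(t) = t^2$ in one dimension and standard normal $X$, for which $|Y^{\circ} - Z^{\circ}| = X^2/(2|\mu|)$. The function name `standardized_gap` is hypothetical.

```python
import random


def standardized_gap(mu, x):
    """|Y° - Z°| for g(t) = t^2 in one dimension (illustrative choice of g)."""
    grad = 2.0 * mu                      # g'(mu)
    y = (mu + x) ** 2                    # Y = g(mu + X)
    z = mu**2 + grad * x                 # Z = g(mu) + g'(mu) X
    return abs(y - z) / abs(grad)        # the common centring g(mu) cancels


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Largest gap over the simulated sample, for increasing values of mu.
worst_gaps = {mu: max(standardized_gap(mu, x) for x in xs)
              for mu in (1.0, 10.0, 100.0, 1000.0)}
for mu, gap in worst_gaps.items():
    print(mu, gap)
```

With the sample held fixed, the largest gap is inversely proportional to `mu` in this example, in line with the statement of Proposition 3.1.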

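Since $k_d(u)$ decreases with $u$ we may define

$$u(d, \varepsilon) = \operatorname{Min}\{u ;\ k_d(u)\, d^2 \le \varepsilon\}.$$

Thus, with $l_q$ the $q$-th quantile of the distribution of $L = \|X\|^2$, and since $L \le l_q$ means $\|X\| \le \sqrt{l_q}$, whenever $L \le l_q$, $\|\boldsymbol{\mu}\| \ge u_q(\varepsilon) = u(\sqrt{l_q}, \varepsilon)$ implies

$$|Y^{\circ} - Z^{\circ}| \le \varepsilon.$$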

So, if $q(\varepsilon)$ is the probability of the event $A(\varepsilon) = \{|Y^{\circ} - Z^{\circ}| \le \varepsilon\}$, we see that $q(\varepsilon) \ge q$ when $\|\boldsymbol{\mu}\| > u_q(\varepsilon)$.

Let $A^c(\varepsilon)$ be the complement of $A(\varepsilon)$ and $q^c(\varepsilon) = 1 - q(\varepsilon)$. We now have

Proposition 3.2. Whatever $z$ and $\varepsilon > 0$,

$$F_{Z^{\circ}}(z - \varepsilon) - q^c(\varepsilon) \le F_{Y^{\circ}}(z) \le F_{Z^{\circ}}(z + \varepsilon) + q^c(\varepsilon).$$

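Proof. The thesis follows from

$$\begin{aligned} F_{Y^{\circ}}(z) = \Pr(Y^{\circ} \le z) &= q(\varepsilon)\Pr(Y^{\circ} \le z \mid A(\varepsilon)) + q^c(\varepsilon)\Pr(Y^{\circ} \le z \mid A^c(\varepsilon)) \\ &\le q(\varepsilon)\Pr(Z^{\circ} \le z + \varepsilon \mid A(\varepsilon)) + q^c(\varepsilon) \le F_{Z^{\circ}}(z + \varepsilon) + q^c(\varepsilon) \end{aligned}$$

and from

$$F_{Z^{\circ}}(z - \varepsilon) - q^c(\varepsilon) \le q(\varepsilon)\Pr(Z^{\circ} \le z - \varepsilon \mid A(\varepsilon)) \le q(\varepsilon)\Pr(Y^{\circ} \le z \mid A(\varepsilon)) \le \Pr(Y^{\circ} \le z) = F_{Y^{\circ}}(z). \quad \square$$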
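The sandwich bound of Proposition 3.2 can be checked on simulated pairs $(Y^{\circ}, Z^{\circ})$, replacing the distributions by empirical frequencies. The sketch below again assumes $g(t) = t^2$ in one dimension with $\mu = 5$ and standard normal $X$, so that $Z^{\circ} = X$ and $Y^{\circ} = X + X^2/(2\mu)$; these choices and the name `sandwich_counts` are illustrative assumptions, not taken from the paper.

```python
import random


def sandwich_counts(y_std, z_std, z, eps):
    """Return n times the lower bound, the middle term and the upper bound at z.

    Integer counts are compared instead of frequencies, so that no rounding
    enters the comparison.
    """
    n_bad = sum(1 for y, w in zip(y_std, z_std) if abs(y - w) > eps)   # n * q^c(eps)
    lower = sum(1 for w in z_std if w <= z - eps) - n_bad              # n * (F_Z°(z - eps) - q^c)
    middle = sum(1 for y in y_std if y <= z)                           # n * F_Y°(z)
    upper = sum(1 for w in z_std if w <= z + eps) + n_bad              # n * (F_Z°(z + eps) + q^c)
    return lower, middle, upper


# Hypothetical setting: g(t) = t^2, mu = 5, X standard normal.
rng = random.Random(1)
mu = 5.0
xs = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
z_std = xs                                      # Z° = X because g'(mu) = 2 mu > 0
y_std = [x + x * x / (2.0 * mu) for x in xs]    # Y° = X + X^2 / (2 mu)

for z in (-1.5, 0.0, 1.5):
    lower, middle, upper = sandwich_counts(y_std, z_std, z, eps=0.1)
    print(z, lower, middle, upper, lower <= middle <= upper)
```

The argument in the proof applies to any joint distribution of the pair, including the empirical one, so the inequality is expected to hold at every `z` and not only on average.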

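Corollary 3.1. When $F_{Z^{\circ}}(z)$ has density $f_{Z^{\circ}}(z)$ bounded by $C$, we have

$$\operatorname{Sup}\{|F_Y - F_Z|\} = \operatorname{Sup}\{|F_{Y^{\circ}} - F_{Z^{\circ}}|\} \le d(\varepsilon)$$

with $d(\varepsilon) = 2(C\varepsilon + q^c(\varepsilon))$.

Proof. Since

$$F_Y(x) = F_{Y^{\circ}}\left(\frac{x - g(\boldsymbol{\mu})}{\|\nabla g(\boldsymbol{\mu})\|}\right) \quad \text{and} \quad F_Z(x) = F_{Z^{\circ}}\left(\frac{x - g(\boldsymbol{\mu})}{\|\nabla g(\boldsymbol{\mu})\|}\right)$$

we have $\operatorname{Sup}\{|F_Y - F_Z|\} = \operatorname{Sup}\{|F_{Y^{\circ}} - F_{Z^{\circ}}|\}$. To complete the proof we have only to apply Proposition 3.2, remembering that, since $f_{Z^{\circ}}$ is bounded by $C$, $F_{Z^{\circ}}(z + \varepsilon) - F_{Z^{\circ}}(z - \varepsilon) \le 2C\varepsilon$. $\square$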

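Corollary 3.2. When $f_{Z^{\circ}}$ is bounded by $C$ and $g(\cdot)$ is asymptotically linear,

$$\operatorname{Sup}\{|F_Y - F_Z|\} \xrightarrow[\|\boldsymbol{\mu}\| \to +\infty]{} 0.$$

This last corollary gives conditions for $F_Z$ to approach $F_Y$ uniformly when $\|\boldsymbol{\mu}\| \to \infty$.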

Next we have

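Proposition 3.3. If $f_{Z^{\circ}}$ is bounded by $C$ and if $f_{Z^{\circ}}(z) \ge m(\delta) > 0$ whenever $z^{\circ}_{\delta - d(\varepsilon)} \le z \le z^{\circ}_{1 - \delta + d(\varepsilon)}$, we have

$$\operatorname{Sup}\{|y^{\circ}_p - z^{\circ}_p| ;\ \delta \le p \le 1 - \delta\} \le 2\left(\varepsilon + \frac{d(\varepsilon)}{m(\delta)}\right).$$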

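Proof. According to Corollary 3.1 of Proposition 3.2, we have

$$F_{Z^{\circ}}(y^{\circ}_p - \varepsilon) - d(\varepsilon) \le F_{Y^{\circ}}(y^{\circ}_p) = p \le F_{Z^{\circ}}(y^{\circ}_p + \varepsilon) + d(\varepsilon),$$

so $F_{Z^{\circ}}(y^{\circ}_p - \varepsilon) \le p + d(\varepsilon)$ and $p - d(\varepsilon) \le F_{Z^{\circ}}(y^{\circ}_p + \varepsilon)$. Thus $y^{\circ}_p - \varepsilon \le z^{\circ}_{p + d(\varepsilon)}$ and $z^{\circ}_{p - d(\varepsilon)} \le y^{\circ}_p + \varepsilon$, so

$$z^{\circ}_{p - d(\varepsilon)} - \varepsilon \le y^{\circ}_p \le z^{\circ}_{p + d(\varepsilon)} + \varepsilon.$$

To complete the proof we have only to point out that, since $f_{Z^{\circ}} \ge m(\delta)$ between these two quantiles,

$$z^{\circ}_{p + d(\varepsilon)} - z^{\circ}_{p - d(\varepsilon)} \le \frac{2 d(\varepsilon)}{m(\delta)}. \quad \square$$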

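Corollary 3.3. If $\|\boldsymbol{\mu}\| \ge u_q(\varepsilon)$, $f_{Z^{\circ}}$ is bounded by $C$ and $f_{Z^{\circ}}(z) \ge m(\delta) > 0$ whenever $z^{\circ}_{\delta - d(\varepsilon)} \le z \le z^{\circ}_{1 - \delta + d(\varepsilon)}$, then

$$\operatorname{Sup}\{|y^{\circ}_p - z^{\circ}_p| ;\ \delta \le p \le 1 - \delta\} \le 2\left(\varepsilon + \frac{d_q(\varepsilon)}{m(\delta)}\right),$$

with $d_q(\varepsilon) = 2(C\varepsilon + q^c)$, where $q^c = 1 - q$.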


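Proof. The thesis follows from Proposition 3.3 since $q(\varepsilon) \ge q$, and so $q^c(\varepsilon) \le q^c$, when $\|\boldsymbol{\mu}\| > u_q(\varepsilon)$. $\square$

We also have

Proposition 3.4. If

$$\frac{1}{\|\nabla g(\boldsymbol{\mu})\|}\,\nabla g(\boldsymbol{\mu}) \xrightarrow[\|\boldsymbol{\mu}\| \to \infty]{} b, \quad \text{then} \quad F_{Z^{\circ}} \xrightarrow[\|\boldsymbol{\mu}\| \to \infty]{} F_{Z^{\circ\circ}},$$

this convergence being uniform when $f_{Z^{\circ}}$ is bounded by $C$ and $g(\cdot)$ is asymptotically linear.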

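Proof. The first part of the thesis follows from

$$\left(\frac{1}{\|\nabla g(\boldsymbol{\mu})\|}\,\nabla g(\boldsymbol{\mu}) - b\right)^t X \xrightarrow[\|\boldsymbol{\mu}\| \to \infty]{\text{a.s.}} 0 \quad \text{whenever} \quad \frac{1}{\|\nabla g(\boldsymbol{\mu})\|}\,\nabla g(\boldsymbol{\mu}) \xrightarrow[\|\boldsymbol{\mu}\| \to \infty]{} b.$$

The second part of the thesis follows from Proposition 3.3 since $q(\varepsilon) \ge q$ when $\|\boldsymbol{\mu}\| > u_q(\varepsilon)$. $\square$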

4. Application: the normal case

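Let $X$ be normal with null mean vector and regular variance–covariance matrix $M$. Then $Z^{\circ} = \nabla g(\boldsymbol{\mu})^t X / \|\nabla g(\boldsymbol{\mu})\|$ is normal with null mean value and variance $\nabla g(\boldsymbol{\mu})^t M \nabla g(\boldsymbol{\mu}) / \|\nabla g(\boldsymbol{\mu})\|^2$, so that

$$f_{Z^{\circ}}(z) = \frac{\exp\left(-\dfrac{\|\nabla g(\boldsymbol{\mu})\|^2 z^2}{2\,\nabla g(\boldsymbol{\mu})^t M \nabla g(\boldsymbol{\mu})}\right)}{\sqrt{2\pi\,\nabla g(\boldsymbol{\mu})^t M \nabla g(\boldsymbol{\mu})}\,\big/\,\|\nabla g(\boldsymbol{\mu})\|}$$

is bounded by

$$C(\boldsymbol{\mu}) = \frac{\|\nabla g(\boldsymbol{\mu})\|}{\sqrt{2\pi\,\nabla g(\boldsymbol{\mu})^t M \nabla g(\boldsymbol{\mu})}}.$$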

If $\theta_k$ is the smallest eigenvalue of $M$ we have

$$C(\boldsymbol{\mu}) \le C = \frac{1}{\sqrt{2\pi\theta_k}}.$$
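The uniform bound on the density can be verified numerically for a diagonal $M$, whose eigenvalues are its diagonal entries. The sketch below is only illustrative: the function `density_bound`, the eigenvalues and the gradient vectors are assumptions chosen for the example.

```python
import math


def density_bound(grad, theta):
    """C(mu) = ||grad|| / sqrt(2 pi grad^t M grad) for M = diag(theta)."""
    norm = math.sqrt(sum(a * a for a in grad))
    quad = sum(t * a * a for t, a in zip(theta, grad))   # grad^t M grad
    return norm / math.sqrt(2.0 * math.pi * quad)


theta = [4.0, 1.0, 0.25]                                  # hypothetical eigenvalues of M
uniform_bound = 1.0 / math.sqrt(2.0 * math.pi * min(theta))

# Three hypothetical gradient directions; the last one is aligned with the
# eigenvector of the smallest eigenvalue, where the bound is attained.
for grad in ([1.0, 0.0, 0.0], [1.0, 2.0, 3.0], [0.0, 0.0, 5.0]):
    print(grad, density_bound(grad, theta), uniform_bound)
```

The bound is the same for every gradient direction and every gradient length, which is why it does not depend on $\boldsymbol{\mu}$.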

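With $x_p$ the quantile for probability $p$ of the standardized normal density, the quantiles for $f_{Z^{\circ}}$ will be

$$z^{\circ}_p(\boldsymbol{\mu}) = x_p\,\frac{\sqrt{\nabla g(\boldsymbol{\mu})^t M \nabla g(\boldsymbol{\mu})}}{\|\nabla g(\boldsymbol{\mu})\|},$$

while the minimum of $f_{Z^{\circ}}$ in $[z^{\circ}_{\delta - d(\varepsilon)}(\boldsymbol{\mu}),\ z^{\circ}_{1 - \delta + d(\varepsilon)}(\boldsymbol{\mu})]$ will be

$$m(\delta \mid \boldsymbol{\mu}) = \frac{e^{-x^2_{\delta - d(\varepsilon)}/2}}{\sqrt{2\pi\,\nabla g(\boldsymbol{\mu})^t M \nabla g(\boldsymbol{\mu})}\,\big/\,\|\nabla g(\boldsymbol{\mu})\|}$$

so that

$$\frac{C(\boldsymbol{\mu})}{m(\delta \mid \boldsymbol{\mu})} = e^{x^2_{\delta - d(\varepsilon)}/2}$$

and

$$\frac{d(\varepsilon \mid \boldsymbol{\mu})}{m(\delta \mid \boldsymbol{\mu})} = \frac{2\left(C(\boldsymbol{\mu})\varepsilon + q^c(\varepsilon)\right)}{m(\delta \mid \boldsymbol{\mu})} = 2 e^{x^2_{\delta - d(\varepsilon)}/2}\left(\varepsilon + q^c(\varepsilon)\,\frac{\sqrt{2\pi\,\nabla g(\boldsymbol{\mu})^t M \nabla g(\boldsymbol{\mu})}}{\|\nabla g(\boldsymbol{\mu})\|}\right) \le 2 e^{x^2_{\delta - d(\varepsilon)}/2}\left(\varepsilon + q^c(\varepsilon)\sqrt{2\pi\theta_1}\right),$$

with $\theta_1$ the largest eigenvalue of $M$.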

We thus obtain bounds that do not depend on $\boldsymbol{\mu}$, both for $f_{Z^{\circ}}(z)$ and for $d(\varepsilon \mid \boldsymbol{\mu})/m(\delta \mid \boldsymbol{\mu})$. It is now easy to apply the previous results to this case.

Moreover, if $X$ is normal with null mean vector and variance–covariance matrix $M$, and if $\theta_1 \ge \cdots \ge \theta_h$ are the eigenvalues of $M$ with multiplicities $g_1, \ldots, g_h$,

$$L = \|X\|^2$$

may be written (see Imhof, 1961) as a linear combination $\sum_{j=1}^{h} \theta_j \chi^2_{g_j}$ of independent chi-square variates. When $M$ is known it is easy to compute the quantiles $l_p$ (Fonseca et al., 2007).
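For a quick numerical impression of the quantiles $l_p$, the linear combination of independent chi-square variates can be simulated directly. The following Monte Carlo sketch is a simple stand-in and is not the method of Fonseca et al. (2007) cited above; the function name `quantile_of_L`, the sample size and the seed are assumptions made for the example.

```python
import math
import random


def quantile_of_L(theta, mult, p, n=100_000, seed=0):
    """Monte Carlo estimate of the p-th quantile of sum_j theta_j * chi2(mult_j)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # A chi-square variate with g degrees of freedom is Gamma(shape g/2, scale 2).
        draws.append(sum(t * rng.gammavariate(g / 2.0, 2.0)
                         for t, g in zip(theta, mult)))
    draws.sort()
    return draws[min(n - 1, int(p * n))]


# Sanity check: a single eigenvalue 1 with multiplicity 2 gives a chi-square
# with 2 degrees of freedom, whose median is 2 ln 2.
print(quantile_of_L([1.0], [2], 0.5), 2.0 * math.log(2.0))

# Hypothetical spectrum: eigenvalues 2 and 0.5 with multiplicities 1 and 3.
print(quantile_of_L([2.0, 0.5], [1, 3], 0.95))
```

The estimated $l_q$ can then be used in $u_q(\varepsilon)$ of Section 3; an exact or numerically inverted distribution would be preferable when high accuracy in the tails is needed.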


5. Final remarks

In this work we derived asymptotic and limit results for samples, with fixed size, whose non-centrality diverges to $+\infty$. In the case of normal samples a central limit theorem of a new type is established. Our approach is a variant of the $\delta$ method. This variant applies to asymptotically linear statistics $g(\boldsymbol{\mu} + X)$, such as the polynomials $P(\boldsymbol{\mu} + X)$, and leads to a good control of the approximation involved (see Corollary 3.1, Proposition 3.2 and Corollary 3.3).

References

Areia, A., Oliveira, M.M., Mexia, J.T., 2008. Models for series of studies based on geometrical representation. Statistical Methodology 1, 277–288.

Fonseca, M., Mexia, J.T., Zmyślony, R., 2007. Jordan algebras. Generating pivot variables and orthogonal normal models. Journal of Interdisciplinary Mathematics 1, 305–326.

Imhof, J.P., 1961. Computing the distribution of quadratic forms in normal variables. Biometrika 48 (3–4), 419–426.

Ito, K., 1980. Robustness of ANOVA and MANOVA test procedures. In: Krishnaiah, P.R. (Ed.), Handbook of Statistics, vol. I. North Holland, Amsterdam.

Khuri, A.I., 2003. Advanced Calculus with Applications in Statistics, second ed. Wiley, New York.

Lazraq, A., Cleroux, R., 2001. The PLS multivariate regression model: testing the significance of successive PLS components. Journal of Chemometrics 15, 523–536.

Lehmann, E.L., Casella, G., 2001. Theory of Point Estimation. Springer, Berlin.

Mexia, J.T., 1990. Best linear unbiased estimates, duality of F tests and the Scheffé multiple comparison method in presence of controlled heteroscedasticity. Computational Statistics & Data Analysis 10, 3.

Oliveira, M.M., Mexia, J.T., 2007a. ANOVA like analysis of matched series of studies with a common structure. Journal of Statistical Planning and Inference 137, 1862–1870.

Oliveira, M.M., Mexia, J.T., 2007b. Modelling series of studies with a common structure. Computational Statistics & Data Analysis 51, 5876–5885.

Sabatier, R., Traissac, P., 1994. The ACT (STATIS method). Computational Statistics & Data Analysis 18, 97–119.