Estimability of variance components when all model matrices commute

(1)

Contents lists available atScienceDirect

Linear

Algebra

and

its

Applications

www.elsevier.com/locate/laa

Estimability

of

variance

components

when

all model

matrices

commute

R.A. Baileya,b,∗_,_Sandra _{S. Ferreira}c_, _{Dário Ferreira}c_,

Célia Nunesc

a _School_of_Mathematics_and_Statistics,_University_of _St_Andrews,_North_Haugh,

St Andrews,FifeKY169SS,UK

b _School_of _Mathematical_Sciences,_Queen_Mary,_University_of_London,_Mile_End

Road, LondonEC14NS,UK1

c

DepartmentofMathematicsandCenterofMathematicsandApplications, Universityof BeiraInterior,AvenidaMarquêsD’ÁvilaeBolama,6200Covilhã, Portugal

a r t i c l e i n f o a bs t r a c t

Article history:

Received22February2013 Accepted3November2015 Availableonline17December2015 SubmittedbyR.Brualdi MSC: 62J10 62K99 Keywords: Analysisofvariance Commutativity Mixedmodel

Orthogonalblockstructure Segregation

This paper deals with estimability of variance components inmixedmodelswhenall modelmatricescommute. Inthis situation, it is well known that the best linear unbiased estimators of ﬁxed eﬀects are the ordinary least squares estimators. If, in addition, thefamily of possible variance– covariancematricesformsanorthogonalblockstructure,then therearethesamenumberofvariancecomponentsasstrata, andthevariancecomponentsareallestimable ifandonlyif therearenon-zeroresidualdegreesoffreedomineachstratum. Weinvestigatethecasewherethefamilyofpossiblevariance– covariancematrices,whilestillcommutative,nolongerforms anorthogonalblockstructure.Nowthevariancecomponents may or may notall be estimable,but thereisno clear link withresidualdegreesoffreedom.Whetherornottheyareall estimable,theremayormaynotbeuniformlybestunbiased

* Correspondingauthorat:SchoolofMathematicsandStatistics,UniversityofStAndrews,NorthHaugh, St Andrews,FifeKY169SS,UK.

E-mailaddress:[email protected](R.A. Bailey).

1

Emerita.

http://dx.doi.org/10.1016/j.laa.2015.11.002

(2)

quadraticestimatorsof thosethat are estimable. Examples aregiventodemonstrateallfourpossibilities.

1. Someassumptionsaboutthelinear model

Let Y be a column vector of N random variables Y1,. . . ,YN. Write E(Y) for the

expectationofthevector Y,andV foritsvariance–covariancematrix.Inthissectionwe presentsomeof theassumptionsthatare commonlymadeabout E(Y) andV inorder tohavealinearmodelwithgoodproperties.

Assumption 1 (Linear expectation). There is a known integer n, a known N × n real

matrix X andanunknowncolumnvector τ oflength n suchthatE(Y)= Xτ .

UnderAssumption 1, letT be theN × N matrixoforthogonal projection ontothe column-space of X. Then T = X(XX)+X, where + denotes the Moore–Penrose generalizedinverse:see textssuchas [11,23].Also, letGN be thematrixof orthogonal projectionontothespaceW spannedbytheall-1 vector1,sothatGN= N−1JN,where JN istheN× N matrixwhoseentriesareallequalto 1.Itisoftenthecasethat1 isin thecolumn-spaceofX. Thishappensifand onlyifTGN= GNT= GN:see[11,39]. Assumption 2(Spectral formof variance–covariance matrix).There areknown orthog-onalsymmetricidempotentmatricesQ0,. . . ,Qm summingtotheidentitymatrixIN of order N ,andnon-negativescalarsγ0,. . . ,γmsuchthat

V = m i=0

γiQi. (1)

Thisassumptionsaysthatthescalarsγ0,. . . ,γmaretheeigenvaluesof V;the corre-spondingeigenspaces are thecolumnspaces ofQ0,. . . ,Qm,and these are known.It is

oftenthecasethatW isoneoftheeigenspaces:inthatcase,welabelthespacessothat Qm= GN.

Assumption3(Norelationsamongtheeigenvalues).Assumption 2istrueandthereare nofurtherconstraintsonthevaluesofγ0,. . . ,γm.

A weaker form of this, given in [37], is that there are no linear constraints on

γ0,. . . ,γm.Hereis analternativeweakerform.

Assumption4(Coneoffulldimension).Assumption 2istrueandthefamilyofpossible matricesV formsapositiveconeofdimension m+1 inthespacespannedbyQ0,. . . ,Qm.

(3)

There aretwocommonwaysofjustifying Assumption 2.Theﬁrststartswithknown factorswithrandomeﬀects.Afactorissimplyafunctionassigningvariousdiscretelevels to eachoftheunits1,. . . ,N .Forj = 0,. . . ,w,letMj betheN× N relationmatrixfor the j-thfactor:its(α,β)-entry isequalto 1 ifthis factorhasthesamelevelonunitsα

and β; otherwiseit isequalto 0: see [4,5]. Wealwaysincludethe trivialfactorwith N

diﬀerent levelsandlabelitas the0-thfactor,so thatM0= IN.

Assumption5(Mixedmodel).Therearew + 1 factorswithrandomeﬀects.Therelation matricesM0,. . . ,Mwofthesefactorsareknown,andM0= IN.Therearealsounknown

non-negative numbersσ2 0,. . . ,σw2 suchthat V = w j=0 σ_j2Mj. (2)

Norelationshipsareassumedamong σ2

0,. . . ,σw2.

Assumption 6 (Mixed model with linear independence). Assumption 5 is true and M0,. . . ,Mw arelinearlyindependent.

When Assumption 6 is true, there is a uniqueexpression for the right-handside of Equation (2). For an examplesatisfying Assumption 5 but notAssumption 6, use the ﬁvefactorsina2× 2 Latinsquare: seeTjur[42,§7.3].

Assumption7(Commutativityof relationmatrices).Assumption 6istrueandMiMj = MjMi for0 i< j w.

Proposition 1. If Assumption 7 is true, then M0,. . . ,Mw generate a commutative

al-gebra A of symmetric matrices. If the dimension of A is m+ 1, then m w and Equation (2) can be written as (1), where Q0,. . . ,Qm are the primitive idempotents

of A.Thenthere areknown scalarsbij such thatγi= w

j=0bijσj2 fori= 0,. . . ,m. Under Assumptions 1 and 7, Fonseca et al. [17–19] call the matrices XX and M0,. . . ,Mw model matrices. In view of the diﬀerent roles played by expectation and variance,andtoallowfortreatmentswithunequalreplication,weprefertousetheterm ‘modelmatrices’ forT and M0,. . . ,Mw.

TheothercommonwayofjustifyingAssumption 2comesfromconsideringthepattern of theentriesin V,whichmaywellbejustiﬁed byrandomization, asin[1,2,27]. Assumption 8(Patterns of covariance).There are knownnon-zerosymmetric matrices A0,. . . ,Aw summing to JN, suchthatA0 = IN and allentries inAi are in{0,1} for

(4)

i= 1,. . . ,w,andalsoanon-negativenumberσ2,andnumbersρ1,. . . ,ρwin[−1,1] such that V = σ2A0+ σ2 w i=1 ρiAi.

Theonlyfurtherconditionsassumedonρ1,. . . ,ρw arethatV isnon-negativedeﬁnite. Assumption9(Commutativity of pattern matrices).Assumption 8 istrueand AiAj = AjAi for0 i< j w.

Assumption 9gives a result similar to Proposition 1, with Mi replaced by Ai. For simplicity, from now on we use thenotation Mi inboth cases. Furthermore, let B be the(m+ 1)× (w + 1) matrixwithentriesbij.

Theﬁnalassumption inthis sectionrelatestheexpectationpartofthemodeltothe variance–covariancepart.

Assumption 10 (Commutativity between expectation and variance matrices). Assump-tions 1 and2aretrue,andthematrixT commutes withQi fori= 0,. . . ,m.

UnderAssumption 10, putTi = TQi = QiT andPi = Qi− Ti fori= 0,. . . ,m. If thecolumn-spacesofX andQi areorthogonalto eachotherthenTi iszero;otherwise, Ti isthematrixof orthogonalprojectionontotheintersection Ui ofthese spaces.IfUi isthe wholeofthe column-spaceWi of Qi then Ti = Qi andPi = 0; otherwise,Pi is the matrix of orthogonal projection onto the space Wi∩ Ui⊥, which is the orthogonal complement of Ui in Wi. Let di be the rank of Pi, so that di = 0 if Pi = 0 and

di= dim(Wi)− dim(Ui) otherwise.

UnderAssumption 1, theordinaryleastsquares(OLS) estimatorτˆOLS of τ is given

byτˆOLS= (XX)+XY = (XX)+XTY.IfV isknown,thenthegeneralized

least-squares (GLS) estimator τˆGLS of τ is given by τˆGLS = (XV−1X)+XV−1Y. The

importanceofAssumption 10isshownbythefollowingtheorem,whichcanbe foundin

[20,22,24,26,33,44],forexample.

Theorem 1.Under Assumptions 1 and2, Assumption 10is equivalent to the condition that theOLSestimatorof τ isthesame astheGLS estimatorof τ no matterwhat the valuesofγ0,. . . ,γm are.

Morerecently,Assumption 10hasbeencalled‘equivalentestimation’;see,forexample,

[25,30,43]. In [8], Brown says that ‘ANOVA exists’ if and only if Assumption 10 is satisﬁed.

(5)

2. Orthogonalblockstructure

Many authors havegiven conditions on V whichensure thatAssumption 2 istrue. A factor issaidtobebalancedifall itslevels occuronthesamenumberofunits. Assumption11(Tjurblock structure).Assumption 7istrue,allthefactorswithrandom eﬀects arebalanced,andM0,. . . ,MwspanA.

In [42], Tjur discussed variance–covariance models satisfying Assumption 11. He showed that this implies that m = w. Thus the matrix B is square and invertible, and there is no linear relationship speciﬁed among γ0,. . . ,γm, but the non-negativity

of σ2₀,. . . ,σ2_w implies that Assumption 3 is not satisﬁed if m ≥ 1. He further showed thatitispossibleto labeltherelationmatricesandtheprimitiveidempotentsinsucha way thatthematrix B islowertriangular:weadoptthisconventioninExamples 1–3,5 and 6.

Nelder [27] and Bailey [4,6] deﬁned an orthogonal block structure (OBS) to be a collection of factors satisfying some extra conditions in addition to Tjur’s, but with

Assumption 7replacedbyAssumption 9.

Houtman and Speed [21] generalized the definition of OBS to mean any family of variance–covariancematricessatisfyingAssumption 3.Theyexplicitlyspecifiedthatthe family must consist of allpositive semi-definitematrices of theform (1), and gave the following twoexamples,whichconsideronlythestructure ofV,toclarifythispoint. Example 1.Here N = bk, and the unitsaregrouped into b blocks of size k. Theusual mixedmodelgivesV = σ₀2M0+ σ21M1,where M0= IN andM1 istherelation matrix

forblocks.Then

V = γ0Q0+ γ1Q1, (3)

where Q1= k−1M1,Q0= IN− Q1,γ0= σ02andγ1= σ02+ kσ12.ThisisanOBS,apart

from thepositivityconstraintγ1 γ0.

On the other hand,therandomization model of[1,2] gives V = σ2_A

0+ σ2(ρ1A1+

ρ2A2),whereA0= IN,A1= M1− IN andA2= JN− M1.Then

V = γ2Q2+ γ3Q3+ γ4Q4, (4)

where we startthe numberingat 2 to avoidconfusion with(3). Here Q4 = GN, Q3 =

Q1−Q4,Q2= Q0,γ2= σ2(1−ρ1),γ3= σ2[1+(k−1)ρ1−kρ2] andγ4= σ2[1+(k−1)ρ1+

(N − k)ρ2]. This is adiﬀerent OBS. The usual mixed modelgives the constraintthat

γ3 = γ4, andso expression (4)shouldnotbe usedto show thatits variance–covariance

structure isanOBS.

Example 2. Here N = rc, and the units are arranged in a rectangle with r rows and

(6)

eﬀectsarerandom:M0= IN,M1andM2aretherelationmatricesforrowsandcolumns

respectively,andM3= JN.This deﬁnesanOBSwith Q3= GN, Q2= r−1M2− GN, Q1= c−1M1− GN andQ0= IN− Q1− Q2− Q3. Moreover,γ0= σ20,γ1 = σ02+ cσ12,

γ2= σ20+ rσ22andγ3= σ20+ cσ12+ rσ22+ N σ32.Nelderexplicitlyallowsσ2i tobenegative solongasγ0,. . . ,γ3are allnon-negative.

AnalternativemixedmodelomitsM3,so thatσ32= 0.However,M1M2= M2M1=

M3,and so thespectraldecompositionof V isstill V = γ0Q0+ γ1Q1+ γ2Q2+ γ3Q3.

Now γ3 = γ2+ γ1− γ0, and this linear constraint implies that the family of possible

matricesV doesnotformanOBS.

Unfortunately, some later authors, such as Bailey [3] and Caliński and Kageyama

[9],deﬁnedOBSwithoutincluding theconditionof nolinearrelationshipamong theγ

parameters.ThereisnowsomeconfusionaboutwhatanOBSis.Ferreiraetal.[16]tryto clarifythediﬀerencebycallingAssumption 3 OBSandAssumption 2generalizedOBS, while Bailey and Brien [7] call them orthogonal variance structure and commutative variancestructure respectively.FortheremainderofthispaperweuseAssumption 4as ourdeﬁnitionof OBS, so thatitincludes classeslike (3) withthe positivityconstraint

γ1 γ0.

The combination of the Nelder–Bailey type of OBS with Assumption 10 is called simply‘orthogonality’in[6].ThecombinationofAssumption 7(sometimeswithoutlinear independence of M0,. . . ,Mw) with Assumption 10 is called ‘commutative orthogonal blockstructure’(COBS)in[10,15,19,29].

EstimationunderAssumption 10is straightforwardandwell-known. ByTheorem 1, ˆ

τOLS = ˆτGLS.The column-spaceof Qi is called thei-th stratum in[2,6,27,28], andin thestatistical softwareGenStat [31,32], anddi is calledthenumberof residualdegrees offreedominthei-th stratum.

Consideravalueofi suchthatTi= 0.Thestandarderrorofanyscalarlinearfunction ofTiXτ isproportionalto√γi.Thusweusuallywanttoestimateγ0,. . . ,γmaswellas τ . Proposition2.Supposethat Assumptions 3 and 10 hold.

(a) Ifdi= 0 thenPiY2/di isanunbiasedestimatorfor γi.

(b) Ifdi= 0,thedistributionofY ismultivariatenormal,ti= rank(Ti) andTiXτ = 0,

thendiTiY2/tiPiY2 hasan F-distributionon ti anddi degreesoffreedom. (c) Ifdi= 0 thenthere isnounbiasedestimatorforγi.

Proof. (a)and (b)See[36]. (c)See[37]. 2

From (b), when di = 0 then the quantity diTiY2/tiPiY2 is used as the test statisticfor testing thenull hypothesisthatTiXτ = 0. If di = 0 then this hypothesis cannotbetested.

(7)

Thefollowingtwoverysimpleexamplesdemonstratetwofeaturesofthisprocessthat are oftenoverlooked. Theﬁrstfeatureisdiscussed byTjurin[42, §7.3].

Example 3. Consider an n× n Latin square, with expectation corresponding to the

n letters.ThenN = n2.TheapproachofNelder[27,28]givesamixedmodelwithrandom factorslikethoseinExample 2:M0= IN,M1 andM2correspondtorowsandcolumns

respectively,and M3= JN. ThenQ3= GN,Q2 = n−1M2− GN, Q1 = n−1M1− GN and Q0 = IN− Q1− Q2− Q3, while γ0 = σ20, γ1 = σ20+ nσ12, γ2 = σ02+ nσ22 and

γ3= σ20+ nσ21+ nσ22+ n2σ23.

Because thetreatmentsareappliedinaLatinsquare,T1= T2= 0 andT3= GN = Q3.Henced3= 0,andso itisimpossibletoestimateγ3 orσ23.

Put τ = n¯ −1(τ1+· · · + τn),so thatT3Xτ = ¯τ 1.It isimpossible to giveastandard

error fortheestimateof ¯τ ,and thereisno testofthehypothesis thatτ = 0.¯ This may bewhythelinefortheoverallmeanisoftenomittedfromtheanalysis-of-variancetable. Example 4.Consider the two-sample t-test. Then n = 2. Suppose thattreatment 1 is applied to the ﬁrst r units and that treatment 2 is applied to the remaining s units,

where r + s= N ,r 2 ands 2.LetQ0bethediagonalmatrixwhoseﬁrstr diagonal

entries areequalto 1,theremainderbeing 0,andputQ1= IN− Q0.

Assumethat

V = γ0Q0+ γ1Q1. (5)

(Usually wewouldwrite γi asσ2i,butwewantto be consistentwith(1).)If webelieve that γ0 and γ1 are diﬀerent then we haveOBS with Assumption 10, and we estimate

thestandarderrorofthediﬀerenceτˆ1− ˆτ2 as

ˆ γ0 r + ˆ γ1 s, where ˆγ0=P0Y2/(r− 1) andγˆ1=P1Y2/(s− 1).

Ontheotherhand,ifwebelievethatγ0= γ1= γ thenexpression(5)isnotOBS.In

thiscase,P0Y2/(r− 1) andP1Y2/(s− 1) arebothunbiasedestimatorsofγ.They

are usuallycombinedtogivetheestimator

P0Y2+P1Y2

N− 2 =

(IN− T)Y2

N− 2 ,

whichcomes directlyfrom writingV = γIN,whichisindeedanOBS. 3. Linearrelationshipsamongthe eigenvalues

Fromnowon,weassumeAssumption 10andeitherAssumption 7orAssumption 9,so thatthematrix B isdeﬁnedanditscolumnsarelinearlyindependent.Forsimplicity,we

(8)

discussonlyAssumption 7,buttheresultsholdwheneverthematricesMiaresymmetric and commute with each other. In particular, writing Mi as Ai gives Assumption 9. Finally,we assumethatm> w,so thatthere is atleast onelinearrelationshipamong

γ0,. . . ,γm.

Theorem 2. Suppose that there are exactly t+ 1 values of i for which di = 0. Label

the primitive idempotents in such a way that di = 0 if and only if t < i m. For

i= 0,. . . ,t, putγˆi =PiY|2/di,so that γˆi is anunbiased estimatorfor γi.Let B be˜

the(t+ 1)× (w + 1) matrixobtainedfromtheﬁrst(t+ 1) rowsof B.Letθ =t_i=0λiγi,

whereλ0,. . . ,λtare known scalars.

(i) If therows of B are˜ linearly independent andthe columnsof B are˜ linearly inde-pendent,thent= w andB is˜ invertible. ThusB˜−1(ˆγ0,. . . ,γˆw) gives anunbiased

estimateof(σ2

0,. . . ,σ2w) andsoB ˜B−1(ˆγ0,. . . ,ˆγw) gives anunbiasedestimateof (γ0,. . . ,γm).

(ii) Thequantityt_i=0λiγˆi isanunbiasedestimatorforθ.IftherowsofB are˜ linearly

independent,thennootherlinearcombinationofγˆ0,. . . ,ˆγtisanunbiasedestimator

for θ.

(iii) If the rows of B are˜ not linearly independent, then there is at least one i with

0 i t such that there isa linearcombination of γˆ0,. . . ,γˆt otherthanγˆi which

isanunbiasedestimatorfor γi.Thisnon-uniquenessof estimationthenpropagates

tosomeof σ2

0,. . . ,σw2 andhence may propagate tosomeof γt+1,. . . ,γm. (iv) If the columns of B are˜ linearly independent, then each of σ2

0,. . . ,σ2w, and hence

each of γ0,. . . ,γm,can be estimated unbiasedly by at leastone linear combination

ofγˆ0,. . . ,γˆt.

(v) If the columns of B are˜ not linearly independent, then there is atleast one value of i suchthat nolinearcombination of ˆγ0,. . . ,ˆγt gives anunbiasedestimate of γi. Proof. (i),(ii)and(iii)areimmediate.

(iv) SeeFerreira[12].

(v) IfthecolumnsofB are˜ notlinearlyindependent,thenthereisatleastonevalueof j suchthatσ2

j isnotalinearcombinationofγ0,. . . ,γt.Hencenolinearcombination of γˆ0,. . . ,ˆγt gives anunbiased estimate of σj2. There is at least onevalue of i for whichbij= 0:foranysuch i,thereisnolinearcombinationofγˆ0,. . . ,ˆγtwhichgives anunbiasedestimateofγi. 2

WhenB is˜ invertible,the only diﬀerence from Section 2is that, when w < i m,

the estimate of γi may be a linearcombination of two or moremean squares, and so Satterthwaite’sapproximation[35]mustbeusedforperforminganF-test.Sincem> w,

(9)

WhenthecolumnsofB are˜ linearlyindependent,theneachparameterσ_i2hasatleast oneunbiasedestimatorwhichisalinearcombinationoftheresidualmeansquares.In[8], Brown saysthat(σ₀2,. . . ,σw2) is ‘estimable’ifand onlyifthis happens.This property is called‘segregation’in[13–16].

WhenthecolumnsofB are˜ notlinearlyindependentthenthereisoneormorevalues of i for whichthere is no unbiased estimatorof γi. In suchcases, clearlyi > t,and so Ti = Qi = 0.Thusthereisnoestimatorofastandarderrorforanyscalarlinearfunction of TiXτ ,and there isnotest ofthehypothesis thatTiXτ = 0.If thisoccurs for only onevalueofi,andthecorrespondingprimitiveidempotentisGN,asinExample 3,then this maynotberegardedas problematic.

When the rows of B are˜ not linearly independent, which estimator shouldwe use? Should we combine the diﬀerent estimators in some way? In Example 4, there was a simpleanswer,givenbyreplacingmodel(5)byanOBS,butthissolutionisnotavailable ingeneral.

The set of residual mean squares forms a minimal suﬃcient set of statistics for

γ0,. . . ,γm when the columns of B are˜ linearly independent: see [8]. However, Seely showed in[38] thatitis notcomplete ifthe rowsof B are˜ linearly dependent,because nounbiased linearcombination isuniformlybetterthananyother.

Onepossibility isto maximizethelikelihood of(IN− T)Y under theassumptionof multivariate normality.However,Szatrowski[40]andSzatrowskiandMiller[41]showed thatthereisnoclosed-formsolutionwhentherowsofB are˜ linearlydependent.

Whenneithertherowsnorthecolumns ofB are˜ linearlyindependent,wehaveboth problems at the same time: inestimability of some variance components but multiple estimatesof others.

There arefour possiblecombinations oflinearindependence/dependenceoftherows of B with˜ linearindependence/dependenceof thecolumns of B.˜ Weshow inSection4

thatallfourpossibilities canoccur. 4. Examples

This section contains two examples to demonstrate all the behaviour described in Section 3. Each considers avariance modelsatisfying Assumption 7 with m > w,and comparesitwiththemodelsatisfyingAssumption 3withthesameidempotents.Several different modelsforexpectationareusedineachcase,allsatisfyingAssumption 10and alldefinedbythecombinationsoflevelsofoneormorefactors.Inordertomaintainthe samenotationfortheprimitiveidempotentswhileusingdifferentmodelsforexpectation, itis notalwayspossible forthestrata forwhichdi= 0 tobe labelled0,. . . ,t.

Example 5. Suppose that N = 96, and that the units are partitioned into six blocks, each of which is a4× 4 array, so thatthe 24 rows and 24 columns are nested within blocks:seeFig. 1.

One practical instance of this occurs in consumer testing. An experiment uses 24 volunteer consumers during 24 weeks. The weeks are divided into six groups of four

(10)

Fig. 1. Structure of the 96 units inExample 5.

weeks each, so that each group is approximately one month. The consumers are also partitionedintosixgroupsoffour.Fori= 1,. . . ,6,theconsumersingroup i participate duringalltheweeksinmonth i, andinnoothers. Eachvolunteeris givenapacketof a typeofcoﬀeeduringeachweekinwhichheorsheparticipates.Heorsheusesthatcoﬀee allweek,andgivesitascoreattheendoftheweek.Herethemonthsaretheblocks,the weeksaretherowsandtheconsumersarethecolumns.

A second instance of this structure is an experiment on feeds for lactating cows. Six diﬀerent pensare used, oneineach four-week‘month’. Twenty-four cows are used, four per pen. Each cowis given onetype of feed throughout eachweek, and her milk production for that week is recorded. Of course, if cows inthe same pen get diﬀerent feeds in the same week then they must be fed individually. Now the pens, weeks and cowsaretheblocks,rowsandcolumnsrespectively.

Inmoststatisticalsoftware,thisstructure iscreatedbyﬁrstdeclaringfactorsblocks, rowandcolumnswithsix,fourandfourlevelsrespectively.Thenoneunitiscreatedfor eachcombinationof levels.Finally,theexperimentalstructureis declaredbyaformula suchas

blocks/(rows∗ columns), whichisusedinGenStat[31,32]and R[34].

PutM0= I96.LetM1,M2 andM3be therelationmatricescorresponding torows,

columns andblocksrespectively.ThenM1M2= M2M1= M3.Theprimitive

idempo-tentsofthealgebragenerated byM0,M1,M2and M3areQ0,Q1,Q2and Q3,where

Q3= 16−1M3,Q2= 4−1M2− Q3,Q1= 4−1M1− Q3 andQ0= M0− Q1− Q2− Q3.

The Appendix shows the R code which generates these matrices, for one systematic orderingoftheunits.

Firstweconsider theorthogonalblockstructure deﬁnedby V = σ₀2M0+ σ21M1+ σ22M2+ σ32M3

(11)

Table 1

SkeletonanalysisofvariancetablesinExample 5.

Stratum df Source df Source df Source df Source df

Q3 6 mean 1 mean 1 mean 1 mean 1

A 2 H 5 H 5

residual 5 residual 3

Q2 18 G 3 G 3

G.H 15

residual 15 residual 18 residual 18

Q1 18 F 3

residual 15 residual 18 residual 18 residual 18

Q0 54 F.G 9

residual 45 residual 54 residual 54 residual 54 Expectation model (a) model (b) model (c) model (d)

where γ0= σ20,γ1= σ02+ 4σ21,γ2= σ02+ 4σ22 andγ3= σ20+ 4σ12+ 4σ22+ 16σ23.Thereis

noassumed linearrelationshipamongγ0,. . . ,γ3,noramongσ02,. . . ,σ23.

Weconsider fourdiﬀerentmodels forexpectation.

(a) Treatment factorsF and G eachhavefourlevels. Within eachblock,levelsof F

are randomlyappliedto thefourrowsandlevelsofG arerandomlyappliedtothefour columns. Each combination of levels of F and G givesan entry inτ , so that n = 16. ThenT0correspondstotheinteractionF.G,T1tothemaineﬀectofF ,T2tothemain

eﬀectofG,andT3 totheoverallmean.

Part (a)ofTable 1givestheskeletonanalysisofvariance,includingtheoverallmean. It showsthatd0= 45, d1= d2= 15 andd3 = 5.Thereis oneresidualmeansquare for

eachγ-parameter.

(b) TreatmentfactorA hasthree levels,randomly appliedto twowhole blockseach. Each level of A givesan entry inτ , so thatn = 3.Now T= T3, which includesboth

the main eﬀect of A and the overall mean. As part (b) of Table 1 shows, d0 = 54,

d1= d2= 18,d3= 3,andthere isstilloneresidualmeansquare foreachγ-parameter.

(c) Treatment factor H is like treatment factor A, except that it has six levels. As part (c)ofTable 1shows,t= 2 andthereisnoestimatorforγ3.

(d) The treatment factors are G, as in (a), and H, as in (c). Expectation eﬀects correspondtotheoverallmean,themaineﬀectsofG andH,andtheirinteractionG.H.

Part (d) ofTable 1showsthatthereisnoestimatorforγ3or γ2.

Now wesuppose thatσ2

3= 0.Theprimitiveidempotentsofthealgebragenerated by

M0,M1 andM2 arestillQ0,. . . ,Q3.Now

V = σ2₀M0+ σ21M1+ σ22M2

= γ0Q0+ γ1Q1+ γ2Q2+ γ3Q3,

where γ0= σ20,γ1= σ02+ 4σ12, γ2= σ20+ 4σ22 andγ3= σ20+ 4σ12+ 4σ22= γ1+ γ2− γ0.

Iftheexpectationfollowsmodel (a),thenpart (a)ofTable 1showsthattherearefour independentresidualmeansquareswhoseexpectationsareγ0,. . . ,γ3.Thereistherefore

(12)

Fig. 2. Structure of the 96 units inExample 6.

theproblem of deciding, for example,whetherto estimate γ3 by ˆγ3 or ˆγ1+ ˆγ2− ˆγ0 or

someweightedaverageofthese.Model (b)hasthesameproblem.

Ifthe expectationfollows model (c) then part (c) of Table 1 showsthat t= w = 2. Moreover, ˜ B = ⎡ ⎢ ⎣ 1 0 0 1 4 0 1 0 4 ⎤ ⎥ ⎦ ,

whichisinvertible.Theuniqueestimatorofγ3isˆγ1+ˆγ2−ˆγ0.Thiscanbeusedtoestimate

standarderrorsofdiﬀerencesbetweenlevelsofH,andforanapproximateF-testforH,

eventhoughd3= 0.

Ifthe expectationfollows model (d)then part (d) ofTable 1 shows thatt= 1. The parametersγ0 andγ1 canbeestimated.Henceσ20 and σ12 canbeestimated,butσ22,γ2

and γ3 cannot. So there are noF-tests for any treatment eﬀects,and no estimatorsof

standarderrorsofdiﬀerences.

Example6.Inamodiﬁedversionofthecow-feedingexperiment,allsixpensareusedat thesametimeforasingle‘month’offourweeks.Thusrowsarenowcrossedwithblocks, andtheformulafortheexperimentalstructure is

rows∗ (blocks/columns) : seeFig. 2.

PutM0= I96.LetM1betherelationmatrixforrow-blockintersections.LetM2,M3

andM4 be therelationmatrices forcolumns,blocksand wholerows, respectively,and

letM5 = J96. Theprimitive idempotents ofthe algebragenerated by M0,. . . ,M5 are

Q0,. . . ,Q5,whereQ5= 96−1M5= G96,Q4= 24−1M4−Q5,Q3= 16−1M3−Q5,Q2=

4−1M2−16−1M3,Q1= 4−1M1−Q3−Q4−Q5,andQ0= M0−Q1−Q2−Q3−Q4−Q5.

Thecorrespondingorthogonalblockstructurehas

V = σ₀2M0+ σ12M1+ σ22M2+ σ32M3+ σ42M4+ σ25M5

= γ0Q0+ γ1Q1+ γ2Q2+ γ3Q3+ γ4Q4+ γ5Q5,

where γ0 = σ20, γ1 = σ20 + 4σ21, γ2 = σ20 + 4σ22, γ3 = σ20 + 4σ12 + 4σ22 + 16σ32, γ4 =

σ02+ 4σ12+ 24σ42 andγ5= σ20+ 4σ12+ 4σ22+ 16σ32+ 24σ42+ 96σ25.

(13)

Table 2

SkeletonanalysisofvariancetablesinExample 6.

Stratum df Source df Source df Source df

Q5 1 mean 1 mean 1 mean 1

Q4 3 F 3

Q3 5 A 2 H 5

Q2 18 G 3

residual 18 residual 15 residual 18

Q1 15 residual 15 residual 15 residual 18

Q0 54 F.G 9

residual 54 residual 45 residual 54 Expectation model (a) model (b) model (c)

(a) Treatment factor A has three levels, each applied to two whole blocks, so that

n= 3.Part (a)ofTable 2givestheskeletonanalysisof variance.Itshowsthatthereis oneresidualmean squareforeachof γ0,. . . ,γ4.As inExample 3, thereis noestimator

forγ5 andnotestofthehypothesisthat¯τ = 0.

(b)TreatmentfactorsF andG eachhavefourlevels.EachlevelofF isappliedtoone wholerow.Levels ofG areappliedtothefour columns withineachblock.There isone entryin τ for eachcombinationoflevelsofF and G,sothatn= 16.Part (b)ofTable 2

shows thatd4= d5= 0, sothatthere is noestimateof γ4 orγ5,no testfor τ = 0 and¯

notest forthemain eﬀectof F .

(c) Treatment factor H has six levels, each of which is appliedto onewhole block. Now part (c)ofTable 2showsthatthereisnoestimatorforγ3or γ5.

Amixedmodelforthisstructuremighthaveσ2

3= σ52= 0.(Thelabellingofthe

matri-ces M0,. . . ,M5 ischosensothatM0,. . . ,M3 denotethesamematricesinExamples 5

and 6.)However,M1M2 = M2M1 = M3 and M2M4 = M4M2 = M5,so the algebra

generated byM0,M1,M2and M4 stillhasprimitiveidempotentsQ0,. . . ,Q5.Now

V = σ₀2M0+ σ12M1+ σ22M2+ σ24M4

= γ0Q0+ γ1Q1+ γ2Q2+ γ3Q3+ γ4Q4+ γ5Q5,

whereγ0= σ02,γ1= σ02+ 4σ21,γ2= σ20+ 4σ22,γ3= σ02+ 4σ21+ 4σ22,γ4= σ02+ 4σ12+ 24σ42

and γ5= σ20+ 4σ12+ 4σ22+ 24σ42.

Iftheexpectationfollowsmodel (a)thenpart (a)ofTable 2showsthatthereareﬁve independentresidualmeansquareswhoseexpectationsareγ0,. . . ,γ4.However,γ3= γ1+

γ2− γ0,soeachofγ0,. . . ,γ3canbeunbiasedlyestimatedbyseverallinearcombinations

ofthese meansquares.Sinceγ5= γ3+ γ4− γ1,itisalsopossibletoestimateγ5,andso

(14)

Undermodel (b),part (b)of Table 2showsthatt= w = 3, withdi= 0 wheni= 0, 1,2 and3.Thus ˜ B = ⎡ ⎢ ⎢ ⎢ ⎣ 1 0 0 0 1 4 0 0 1 0 4 0 1 4 4 0 ⎤ ⎥ ⎥ ⎥ ⎦.

Neithertherowsnorthe columnsof B are˜ linearlyindependent. Thereis noestimator forσ₄2, γ4 orγ5,andhencenotestforthemaineﬀectofF .However,therearemultiple

estimatorsforγ0,γ1,γ2,γ3,σ20,σ21 andσ22.

Ontheotherhand,iftheexpectationfollowsmodel(c)thenpart (c)ofTable 2shows thatagaint = w = 3, with di = 0 wheni= 0, 1, 2 and4. Labellingtherowsof B by˜

γ0,γ1,γ2 andγ4gives ˜ B = ⎡ ⎢ ⎢ ⎢ ⎣ 1 0 0 0 1 4 0 0 1 0 4 0 1 4 0 24 ⎤ ⎥ ⎥ ⎥ ⎦,

which is invertible. Hence there are unique linear combinations of the residual mean squareswhichprovideunbiased estimatorsforeachofγ0,. . . ,γ5,σ20,σ21,σ22 andσ24.All

treatmenteﬀects,includingtheoverallmean,canbetestedandassignedstandarderrors. 5. Conclusion

Itiswidelybelievedthatestimationandinferencearestraightforwardformixed mod-els in which all possible variance–covariance matrices commute with each other and with thematrix of orthogonal projection ontothe space of allpossible ﬁtted values of the expectation vector. As summarized in Section 2, this is indeed true when the di-mension m+ 1 ofthecommutativealgebraspanned byallpossiblevariance–covariance matricesisequaltothenumber w + 1 oflinearlyindependentunknownvariance compo-nents.However,whenm> w thentherearefourpossibilitiesforestimabilityofvariance components. We recommend summarizing the data in an analysis-of-variance tablein every case. However, the criterion for unique estimability of all variance components is nolonger a functionof how many residual degreesof freedom are non-zero: whatis neededisthatthematrixB introduced˜ inSection3beinvertible.

The examples discussed in Section 4 show realistic experimental situations where diﬀerent assumptions about the variance–covariance matrix, combined with diﬀerent simplemethodsofassigningtreatmentstoexperimentalunits,canleadtoallfourtypes of behaviour: variance components may be all estimable or not, and their estimators mayormaynotbeuniquelinearcombinationsoftheresidualmeansquares.Thisshows that, when an experiment is being planned, it is advisable to construct the skeleton

(15)

analysis-of-variancetableandﬁndthepropertiesofthematrixB,˜ beforetheexperiment is carriedout.

Acknowledgements

ThisworkwaspartiallysupportedbynationalfundsofFCT–FoundationforScience and TechnologyunderUID/MAT/00212/2013.

Appendix A. Rcode a=4 b=4 c=6 Ia= as.matrix(diag(a)) Ib= as.matrix(diag(b)) Ic= as.matrix(diag(c)) Ja= rep(1,a) Jb= rep(1,b) Jc= rep(1,c) X0=Ic%x%Ib%x%Ia X1=Ic%x%Jb%x%Ia X2=Ic%x%Ib%x%Ja X3=Ic%x%Jb%x%Ja M0=X0%*%t(X0) M1=X1%*%t(X1) M2=X2%*%t(X2) M3=X3%*%t(X3) Q3=(1/b)*(1/a)*M3 Q2=(1/a)*M2-Q3 Q1=(1/b)*M1-Q3 Q0=M0-Q1-Q2-Q3 References

[1]R.A.Bailey,Auniﬁed approachto designofexperiments,J.Roy.Statist.Soc.Ser. A144(1981) 214–223.

[2]R.A.Bailey,Strataforrandomized experiments(withdiscussion),J.Roy.Statist.Soc.Ser. B53 (1991)27–78.

(16)

[3]R.A.Bailey,Generalbalance:artiﬁcialtheoryorpracticalrelevance?,in:T.Caliński,R.Kala(Eds.), ProceedingsoftheInternationalConferenceonLinearStatisticalInference,LINSTAT’93,Kluwer, Dordrecht,Netherlands,1994,pp. 171–184.

[4]R.A.Bailey,Orthogonalpartitionsindesignedexperiments,Des.CodesCryptogr.8(1996)45–77. [5]R.A. Bailey, Association Schemes, CambridgeStud. Adv.Math., vol. 84, Cambridge University

Press,Cambridge,2004.

[6]R.A. Bailey, Design of Comparative Experiments, Camb. Ser. Stat. Probab. Math., Cambridge UniversityPress,Cambridge,2008.

[7]R.A. Bailey, C.J.Brien,Randomization-based modelsfor multitieredexperiments: I.Achain of randomizations,Ann.Statist.(2016),inpress.

[8]K.G.Brown,Onanalysisofvarianceinthemixedmodel,Ann.Statist.12(1984)1488–1499. [9]T.Caliński,S.Kageyama,Block Designs:ARandomization Approach. Vol.I: Analysis, Lecture

NotesinStatist.,vol. 150,Springer-Verlag,NewYork,2000.

[10]F.Carvalho,J.T.Mexia,C.Santos,C.Nunes,Inferencefortypesandstructuredfamiliesof com-mutativeorthogonalblockstructures,Metrika78(2015)337–372.

[11]R.Christensen,PlaneAnswerstoComplexQuestions,SpringerTextsStatist.,Springer-Verlag,New York,1987.

[12]S.S.Ferreira,Inferênciaparamodelosortogonaiscomsegregação,UnpublishedPhDthesis, Univer-sidadedaBeiraInterior,2006.

[13]S.S.Ferreira,D.Ferreira,E.Moreira,J.T.Mexia,InferenceforLorthogonalmodels,J.Interdiscip. Math.12(2009)815–824.

[14]S.S.Ferreira,D.Ferreira,E.Moreira,J.T.Mexia,Exactestimatorsfornormallinearmixedmodels, FarEastJ.Theor.Stat.31(2010)33–48.

[15]S.S.Ferreira,D.Ferreira,C.Nunes,J.T.Mexia,Crossingsegregatedmodelswithcommutative or-thogonalblockstructure,in:Proceedingsofthe8th_{International}_Conference_on_Numerical_Analysis

andAppliedMathematics,ICNAAM2010,in:AIPConf.Proc.,vol. 1281,2010,pp. 1229–1232. [16]S.S.Ferreira,D.Ferreira,C.Nunes,J.T.Mexia,Estimationofvariancecomponentsinlinearmixed

modelswithcommutativeorthogonalblockstructure,Rev.ColombianaEstadíst.36(2013)261–271. [17]M.Fonseca,J.T.Mexia,R.Zmyślony,BinaryoperationsonJordanalgebrasandorthogonalnormal

models,LinearAlgebraAppl.417(2006)75–86.

[18]M.Fonseca,J.T.Mexia,R.Zmyślony,Jordanalgebras,generatingpivotvariablesandorthogonal normalmodels,J.Interdiscip.Math.10(2007)305–326.

[19]M. Fonseca,J.T.Mexia, R.Zmyślony,Inference innormal modelswithcommutativeorthogonal blockstructure,ActaComment.Univ.Tartu.Math.12(2008)3–16.

[20]M.Fonseca,J.T.Mexia,R.Zmyślony,Leastsquaresandgeneralizedleastsquaresinmodelswith orthogonalblockstructure,J.Statist.Plann.Inference140(2010)1346–1352.

[21]A.M.Houtman,T.P.Speed,Balanceindesignedexperimentswithorthogonalblockstructure,Ann. Statist.11(1983)1069–1085.

[22]J.Isotalo,S.Puntanen,G.P.H.Styan,TheBLUE’scovariancematrixrevisited:areview,J.Statist. Plann.Inference138(2008)2722–2737.

[23]T.Kariya,H.Kurata,GeneralizedLeastSquares,Wiley,NewYork,2004.

[24]W. Kruskal,When are Gauss–Markov andleastsquares estimatorsidentical? A coordinate-free approach,Ann.Math.Statist.39(1968)70–75.

[25]H.Macharia,P.Goos,D-optimalandD-eﬃcientequivalent-estimationsecond-ordersplit-plot de-signs,J.Qual.Technol.42(2010)358–372.

[26]A.Markiewicz,S.Puntanen,G.P.H.Styan,AnoteontheinterpretationoftheequalityofOLSE andBLUE,PakistanJ.Statist.26(2010)127–134.

[27]J.A. Nelder, The analysis of randomized experiments withorthogonal block structure. I. Block structureandthenullanalysisofvariance,Proc.Roy.Soc.Lond.Ser.A273(1965)147–162. [28]J.A.Nelder,Theanalysisofrandomizedexperimentswithorthogonalblockstructure.II.Treatment

structureandthegeneralanalysisofvariance,Proc.Roy.Soc.Lond.Ser.A273(1965)163–178. [29]C.Nunes,C.Santos,J.T.Mexia,Relevantstatisticsformodelswithcommutativeorthogonalblock

structureandunbiasedestimatorforvariancecomponents,J.Interdiscip.Math.11(2008)553–564. [30]P.Parker,S.M.Kowalski,G.G.Vining,Constructionofbalancedequivalentestimationsecond-order

split-plotdesigns,Technometrics49(2007)56–65.

[31] R.W.Payne,GenStat.Wileyinterdisciplinaryreviews,Comput.Statist.1(2009)255–258,http:// dx.doi.org/10.1002/wics.32.

[32]R.W.Payne,D.A.Murray,S.A.Harding, D.B.Baird, D.M.Soutar,GenStatforWindows: Intro-duction,12thedition,VSNInternational,HemelHempstead,2009.

(17)

[33]S.Puntanen,G.P.Styan,Theequalityoftheordinaryleastsquaresestimatorandthebestlinear unbiasedestimator(withdiscussion),Amer.Statist.43(1989)153–164.

[34]RCoreTeam,R:ALanguageandEnvironmentforStatisticalComputing,RFoundationfor Sta-tisticalComputing,Vienna,Austria,2013.

[35]F.E.Satterthwaite,Anapproximatedistributionofestimatesofvariancecomponents,Biometrics Bull.2(1946)110–114.

[36]H.Scheﬀé,TheAnalysisofVariance,JohnWiley&SonsInc.,NewYork,1959.

[37]J.Seely,Linearspacesandunbiasedestimators.Applicationstoamixedmodel,Ann.Math.Statist. 41(1970)1735–1748.

[38]J.Seely,Quadraticsubspacesandcompleteness,Ann.Math.Statist.42(1971)710–721.

[39]J.Seely,G.Zyskind,Linearspacesandminimumvarianceunbiasedestimators,Ann.Math.Statist. 42(1971)691–703.

[40]T.H.Szatrowski,Necessaryandsuﬃcientconditionsforexplicitsolutionsinthemultivariatenormal estimationproblemforpatternedmeansandcovariances,Ann.Statist.8(1980)802–810.

[41]T.H. Szatrowski, J.J. Miller, Explicit maximum likelihood estimatesfrom balanced data inthe mixedmodeloftheanalysisofvariance,Ann.Statist.8(1980)811–819.

[42]T.Tjur,Analysis of variancemodelsinorthogonal designs (withdiscussion), Int.Stat. Rev.52 (1984)33–81.

[43]G.G.Vining,S.M.Kowalski,D.C.Montgomery,Responsesurfacedesignswithinasplit-plot struc-ture,J.Qual.Technol.37(2005)115–129.

[44]R.Zmyślony,Acharacterizationofbestlinearunbiasedestimatorsinthegenerallinearmodel,in: MathematicalStatisticsandProbabilityTheory,ProceedingsoftheSixthInternationalConference, Wisla1978,in:LectureNotesinStatist.,vol. 2,Springer,NewYork,Berlin,1980,pp. 365–373.