Contents lists available atScienceDirect
Linear
Algebra
and
its
Applications
www.elsevier.com/locate/laaEstimability
of
variance
components
when
all model
matrices
commute
R.A. Baileya,b,∗,Sandra S. Ferreirac, Dário Ferreirac,
Célia Nunesc
a SchoolofMathematicsandStatistics,Universityof StAndrews,NorthHaugh,
St Andrews,FifeKY169SS,UK
b Schoolof MathematicalSciences,QueenMary,UniversityofLondon,MileEnd
Road, LondonEC14NS,UK1
c
DepartmentofMathematicsandCenterofMathematicsandApplications, Universityof BeiraInterior,AvenidaMarquêsD’ÁvilaeBolama,6200Covilhã, Portugal
a r t i c l e i n f o a bs t r a c t
Article history:
Received22February2013 Accepted3November2015 Availableonline17December2015 SubmittedbyR.Brualdi MSC: 62J10 62K99 Keywords: Analysisofvariance Commutativity Mixedmodel
Orthogonalblockstructure Segregation
This paper deals with estimability of variance components inmixedmodelswhenall modelmatricescommute. Inthis situation, it is well known that the best linear unbiased estimators of fixed effects are the ordinary least squares estimators. If, in addition, thefamily of possible variance– covariancematricesformsanorthogonalblockstructure,then therearethesamenumberofvariancecomponentsasstrata, andthevariancecomponentsareallestimable ifandonlyif therearenon-zeroresidualdegreesoffreedomineachstratum. Weinvestigatethecasewherethefamilyofpossiblevariance– covariancematrices,whilestillcommutative,nolongerforms anorthogonalblockstructure.Nowthevariancecomponents may or may notall be estimable,but thereisno clear link withresidualdegreesoffreedom.Whetherornottheyareall estimable,theremayormaynotbeuniformlybestunbiased
* Correspondingauthorat:SchoolofMathematicsandStatistics,UniversityofStAndrews,NorthHaugh, St Andrews,FifeKY169SS,UK.
E-mailaddress:[email protected](R.A. Bailey).
1
Emerita.
http://dx.doi.org/10.1016/j.laa.2015.11.002
quadraticestimatorsof thosethat are estimable. Examples aregiventodemonstrateallfourpossibilities.
© 2015ElsevierInc.All rights reserved.
1. Someassumptionsaboutthelinear model
Let Y be a column vector of N random variables Y1,. . . ,YN. Write E(Y) for the
expectationofthevector Y,andV foritsvariance–covariancematrix.Inthissectionwe presentsomeof theassumptionsthatare commonlymadeabout E(Y) andV inorder tohavealinearmodelwithgoodproperties.
Assumption 1 (Linear expectation). There is a known integer n, a known N × n real
matrix X andanunknowncolumnvector τ oflength n suchthatE(Y)= Xτ .
UnderAssumption 1, letT be theN × N matrixoforthogonal projection ontothe column-space of X. Then T = X(XX)+X, where + denotes the Moore–Penrose generalizedinverse:see textssuchas [11,23].Also, letGN be thematrixof orthogonal projectionontothespaceW spannedbytheall-1 vector1,sothatGN= N−1JN,where JN istheN× N matrixwhoseentriesareallequalto 1.Itisoftenthecasethat1 isin thecolumn-spaceofX. Thishappensifand onlyifTGN= GNT= GN:see[11,39]. Assumption 2(Spectral formof variance–covariance matrix).There areknown orthog-onalsymmetricidempotentmatricesQ0,. . . ,Qm summingtotheidentitymatrixIN of order N ,andnon-negativescalarsγ0,. . . ,γmsuchthat
V = m i=0
γiQi. (1)
Thisassumptionsaysthatthescalarsγ0,. . . ,γmaretheeigenvaluesof V;the corre-spondingeigenspaces are thecolumnspaces ofQ0,. . . ,Qm,and these are known.It is
oftenthecasethatW isoneoftheeigenspaces:inthatcase,welabelthespacessothat Qm= GN.
Assumption3(Norelationsamongtheeigenvalues).Assumption 2istrueandthereare nofurtherconstraintsonthevaluesofγ0,. . . ,γm.
A weaker form of this, given in [37], is that there are no linear constraints on
γ0,. . . ,γm.Hereis analternativeweakerform.
Assumption4(Coneoffulldimension).Assumption 2istrueandthefamilyofpossible matricesV formsapositiveconeofdimension m+1 inthespacespannedbyQ0,. . . ,Qm.
There aretwocommonwaysofjustifying Assumption 2.Thefirststartswithknown factorswithrandomeffects.Afactorissimplyafunctionassigningvariousdiscretelevels to eachoftheunits1,. . . ,N .Forj = 0,. . . ,w,letMj betheN× N relationmatrixfor the j-thfactor:its(α,β)-entry isequalto 1 ifthis factorhasthesamelevelonunitsα
and β; otherwiseit isequalto 0: see [4,5]. Wealwaysincludethe trivialfactorwith N
different levelsandlabelitas the0-thfactor,so thatM0= IN.
Assumption5(Mixedmodel).Therearew + 1 factorswithrandomeffects.Therelation matricesM0,. . . ,Mwofthesefactorsareknown,andM0= IN.Therearealsounknown
non-negative numbersσ2 0,. . . ,σw2 suchthat V = w j=0 σj2Mj. (2)
Norelationshipsareassumedamong σ2
0,. . . ,σw2.
Assumption 6 (Mixed model with linear independence). Assumption 5 is true and M0,. . . ,Mw arelinearlyindependent.
When Assumption 6 is true, there is a uniqueexpression for the right-handside of Equation (2). For an examplesatisfying Assumption 5 but notAssumption 6, use the fivefactorsina2× 2 Latinsquare: seeTjur[42,§7.3].
Assumption7(Commutativityof relationmatrices).Assumption 6istrueandMiMj = MjMi for0 i< j w.
Proposition 1. If Assumption 7 is true, then M0,. . . ,Mw generate a commutative
al-gebra A of symmetric matrices. If the dimension of A is m+ 1, then m w and Equation (2) can be written as (1), where Q0,. . . ,Qm are the primitive idempotents
of A.Thenthere areknown scalarsbij such thatγi= w
j=0bijσj2 fori= 0,. . . ,m. Under Assumptions 1 and 7, Fonseca et al. [17–19] call the matrices XX and M0,. . . ,Mw model matrices. In view of the different roles played by expectation and variance,andtoallowfortreatmentswithunequalreplication,weprefertousetheterm ‘modelmatrices’ forT and M0,. . . ,Mw.
TheothercommonwayofjustifyingAssumption 2comesfromconsideringthepattern of theentriesin V,whichmaywellbejustified byrandomization, asin[1,2,27]. Assumption 8(Patterns of covariance).There are knownnon-zerosymmetric matrices A0,. . . ,Aw summing to JN, suchthatA0 = IN and allentries inAi are in{0,1} for
i= 1,. . . ,w,andalsoanon-negativenumberσ2,andnumbersρ1,. . . ,ρwin[−1,1] such that V = σ2A0+ σ2 w i=1 ρiAi.
Theonlyfurtherconditionsassumedonρ1,. . . ,ρw arethatV isnon-negativedefinite. Assumption9(Commutativity of pattern matrices).Assumption 8 istrueand AiAj = AjAi for0 i< j w.
Assumption 9gives a result similar to Proposition 1, with Mi replaced by Ai. For simplicity, from now on we use thenotation Mi inboth cases. Furthermore, let B be the(m+ 1)× (w + 1) matrixwithentriesbij.
Thefinalassumption inthis sectionrelatestheexpectationpartofthemodeltothe variance–covariancepart.
Assumption 10 (Commutativity between expectation and variance matrices). Assump-tions 1 and2aretrue,andthematrixT commutes withQi fori= 0,. . . ,m.
UnderAssumption 10, putTi = TQi = QiT andPi = Qi− Ti fori= 0,. . . ,m. If thecolumn-spacesofX andQi areorthogonalto eachotherthenTi iszero;otherwise, Ti isthematrixof orthogonalprojectionontotheintersection Ui ofthese spaces.IfUi isthe wholeofthe column-spaceWi of Qi then Ti = Qi andPi = 0; otherwise,Pi is the matrix of orthogonal projection onto the space Wi∩ Ui⊥, which is the orthogonal complement of Ui in Wi. Let di be the rank of Pi, so that di = 0 if Pi = 0 and
di= dim(Wi)− dim(Ui) otherwise.
UnderAssumption 1, theordinaryleastsquares(OLS) estimatorτˆOLS of τ is given
byτˆOLS= (XX)+XY = (XX)+XTY.IfV isknown,thenthegeneralized
least-squares (GLS) estimator τˆGLS of τ is given by τˆGLS = (XV−1X)+XV−1Y. The
importanceofAssumption 10isshownbythefollowingtheorem,whichcanbe foundin
[20,22,24,26,33,44],forexample.
Theorem 1.Under Assumptions 1 and2, Assumption 10is equivalent to the condition that theOLSestimatorof τ isthesame astheGLS estimatorof τ no matterwhat the valuesofγ0,. . . ,γm are.
Morerecently,Assumption 10hasbeencalled‘equivalentestimation’;see,forexample,
[25,30,43]. In [8], Brown says that ‘ANOVA exists’ if and only if Assumption 10 is satisfied.
2. Orthogonalblockstructure
Many authors havegiven conditions on V whichensure thatAssumption 2 istrue. A factor issaidtobebalancedifall itslevels occuronthesamenumberofunits. Assumption11(Tjurblock structure).Assumption 7istrue,allthefactorswithrandom effects arebalanced,andM0,. . . ,MwspanA.
In [42], Tjur discussed variance–covariance models satisfying Assumption 11. He showed that this implies that m = w. Thus the matrix B is square and invertible, and there is no linear relationship specified among γ0,. . . ,γm, but the non-negativity
of σ20,. . . ,σ2w implies that Assumption 3 is not satisfied if m ≥ 1. He further showed thatitispossibleto labeltherelationmatricesandtheprimitiveidempotentsinsucha way thatthematrix B islowertriangular:weadoptthisconventioninExamples 1–3,5 and 6.
Nelder [27] and Bailey [4,6] defined an orthogonal block structure (OBS) to be a collection of factors satisfying some extra conditions in addition to Tjur’s, but with
Assumption 7replacedbyAssumption 9.
Houtman and Speed [21] generalized the definition of OBS to mean any family of variance–covariancematricessatisfyingAssumption 3.Theyexplicitlyspecifiedthatthe family must consist of allpositive semi-definitematrices of theform (1), and gave the following twoexamples,whichconsideronlythestructure ofV,toclarifythispoint. Example 1.Here N = bk, and the unitsaregrouped into b blocks of size k. Theusual mixedmodelgivesV = σ02M0+ σ21M1,where M0= IN andM1 istherelation matrix
forblocks.Then
V = γ0Q0+ γ1Q1, (3)
where Q1= k−1M1,Q0= IN− Q1,γ0= σ02andγ1= σ02+ kσ12.ThisisanOBS,apart
from thepositivityconstraintγ1 γ0.
On the other hand,therandomization model of[1,2] gives V = σ2A
0+ σ2(ρ1A1+
ρ2A2),whereA0= IN,A1= M1− IN andA2= JN− M1.Then
V = γ2Q2+ γ3Q3+ γ4Q4, (4)
where we startthe numberingat 2 to avoidconfusion with(3). Here Q4 = GN, Q3 =
Q1−Q4,Q2= Q0,γ2= σ2(1−ρ1),γ3= σ2[1+(k−1)ρ1−kρ2] andγ4= σ2[1+(k−1)ρ1+
(N − k)ρ2]. This is adifferent OBS. The usual mixed modelgives the constraintthat
γ3 = γ4, andso expression (4)shouldnotbe usedto show thatits variance–covariance
structure isanOBS.
Example 2. Here N = rc, and the units are arranged in a rectangle with r rows and
effectsarerandom:M0= IN,M1andM2aretherelationmatricesforrowsandcolumns
respectively,andM3= JN.This definesanOBSwith Q3= GN, Q2= r−1M2− GN, Q1= c−1M1− GN andQ0= IN− Q1− Q2− Q3. Moreover,γ0= σ20,γ1 = σ02+ cσ12,
γ2= σ20+ rσ22andγ3= σ20+ cσ12+ rσ22+ N σ32.Nelderexplicitlyallowsσ2i tobenegative solongasγ0,. . . ,γ3are allnon-negative.
AnalternativemixedmodelomitsM3,so thatσ32= 0.However,M1M2= M2M1=
M3,and so thespectraldecompositionof V isstill V = γ0Q0+ γ1Q1+ γ2Q2+ γ3Q3.
Now γ3 = γ2+ γ1− γ0, and this linear constraint implies that the family of possible
matricesV doesnotformanOBS.
Unfortunately, some later authors, such as Bailey [3] and Caliński and Kageyama
[9],definedOBSwithoutincluding theconditionof nolinearrelationshipamong theγ
parameters.ThereisnowsomeconfusionaboutwhatanOBSis.Ferreiraetal.[16]tryto clarifythedifferencebycallingAssumption 3 OBSandAssumption 2generalizedOBS, while Bailey and Brien [7] call them orthogonal variance structure and commutative variancestructure respectively.FortheremainderofthispaperweuseAssumption 4as ourdefinitionof OBS, so thatitincludes classeslike (3) withthe positivityconstraint
γ1 γ0.
The combination of the Nelder–Bailey type of OBS with Assumption 10 is called simply‘orthogonality’in[6].ThecombinationofAssumption 7(sometimeswithoutlinear independence of M0,. . . ,Mw) with Assumption 10 is called ‘commutative orthogonal blockstructure’(COBS)in[10,15,19,29].
EstimationunderAssumption 10is straightforwardandwell-known. ByTheorem 1, ˆ
τOLS = ˆτGLS.The column-spaceof Qi is called thei-th stratum in[2,6,27,28], andin thestatistical softwareGenStat [31,32], anddi is calledthenumberof residualdegrees offreedominthei-th stratum.
Consideravalueofi suchthatTi= 0.Thestandarderrorofanyscalarlinearfunction ofTiXτ isproportionalto√γi.Thusweusuallywanttoestimateγ0,. . . ,γmaswellas τ . Proposition2.Supposethat Assumptions 3 and 10 hold.
(a) Ifdi= 0 thenPiY2/di isanunbiasedestimatorfor γi.
(b) Ifdi= 0,thedistributionofY ismultivariatenormal,ti= rank(Ti) andTiXτ = 0,
thendiTiY2/tiPiY2 hasan F-distributionon ti anddi degreesoffreedom. (c) Ifdi= 0 thenthere isnounbiasedestimatorforγi.
Proof. (a)and (b)See[36]. (c)See[37]. 2
From (b), when di = 0 then the quantity diTiY2/tiPiY2 is used as the test statisticfor testing thenull hypothesisthatTiXτ = 0. If di = 0 then this hypothesis cannotbetested.
Thefollowingtwoverysimpleexamplesdemonstratetwofeaturesofthisprocessthat are oftenoverlooked. Thefirstfeatureisdiscussed byTjurin[42, §7.3].
Example 3. Consider an n× n Latin square, with expectation corresponding to the
n letters.ThenN = n2.TheapproachofNelder[27,28]givesamixedmodelwithrandom factorslikethoseinExample 2:M0= IN,M1 andM2correspondtorowsandcolumns
respectively,and M3= JN. ThenQ3= GN,Q2 = n−1M2− GN, Q1 = n−1M1− GN and Q0 = IN− Q1− Q2− Q3, while γ0 = σ20, γ1 = σ20+ nσ12, γ2 = σ02+ nσ22 and
γ3= σ20+ nσ21+ nσ22+ n2σ23.
Because thetreatmentsareappliedinaLatinsquare,T1= T2= 0 andT3= GN = Q3.Henced3= 0,andso itisimpossibletoestimateγ3 orσ23.
Put τ = n¯ −1(τ1+· · · + τn),so thatT3Xτ = ¯τ 1.It isimpossible to giveastandard
error fortheestimateof ¯τ ,and thereisno testofthehypothesis thatτ = 0.¯ This may bewhythelinefortheoverallmeanisoftenomittedfromtheanalysis-of-variancetable. Example 4.Consider the two-sample t-test. Then n = 2. Suppose thattreatment 1 is applied to the first r units and that treatment 2 is applied to the remaining s units,
where r + s= N ,r 2 ands 2.LetQ0bethediagonalmatrixwhosefirstr diagonal
entries areequalto 1,theremainderbeing 0,andputQ1= IN− Q0.
Assumethat
V = γ0Q0+ γ1Q1. (5)
(Usually wewouldwrite γi asσ2i,butwewantto be consistentwith(1).)If webelieve that γ0 and γ1 are different then we haveOBS with Assumption 10, and we estimate
thestandarderrorofthedifferenceτˆ1− ˆτ2 as
ˆ γ0 r + ˆ γ1 s, where ˆγ0=P0Y2/(r− 1) andγˆ1=P1Y2/(s− 1).
Ontheotherhand,ifwebelievethatγ0= γ1= γ thenexpression(5)isnotOBS.In
thiscase,P0Y2/(r− 1) andP1Y2/(s− 1) arebothunbiasedestimatorsofγ.They
are usuallycombinedtogivetheestimator
P0Y2+P1Y2
N− 2 =
(IN− T)Y2
N− 2 ,
whichcomes directlyfrom writingV = γIN,whichisindeedanOBS. 3. Linearrelationshipsamongthe eigenvalues
Fromnowon,weassumeAssumption 10andeitherAssumption 7orAssumption 9,so thatthematrix B isdefinedanditscolumnsarelinearlyindependent.Forsimplicity,we
discussonlyAssumption 7,buttheresultsholdwheneverthematricesMiaresymmetric and commute with each other. In particular, writing Mi as Ai gives Assumption 9. Finally,we assumethatm> w,so thatthere is atleast onelinearrelationshipamong
γ0,. . . ,γm.
Theorem 2. Suppose that there are exactly t+ 1 values of i for which di = 0. Label
the primitive idempotents in such a way that di = 0 if and only if t < i m. For
i= 0,. . . ,t, putγˆi =PiY|2/di,so that γˆi is anunbiased estimatorfor γi.Let B be˜
the(t+ 1)× (w + 1) matrixobtainedfromthefirst(t+ 1) rowsof B.Letθ =ti=0λiγi,
whereλ0,. . . ,λtare known scalars.
(i) If therows of B are˜ linearly independent andthe columnsof B are˜ linearly inde-pendent,thent= w andB is˜ invertible. ThusB˜−1(ˆγ0,. . . ,γˆw) gives anunbiased
estimateof(σ2
0,. . . ,σ2w) andsoB ˜B−1(ˆγ0,. . . ,ˆγw) gives anunbiasedestimateof (γ0,. . . ,γm).
(ii) Thequantityti=0λiγˆi isanunbiasedestimatorforθ.IftherowsofB are˜ linearly
independent,thennootherlinearcombinationofγˆ0,. . . ,ˆγtisanunbiasedestimator
for θ.
(iii) If the rows of B are˜ not linearly independent, then there is at least one i with
0 i t such that there isa linearcombination of γˆ0,. . . ,γˆt otherthanγˆi which
isanunbiasedestimatorfor γi.Thisnon-uniquenessof estimationthenpropagates
tosomeof σ2
0,. . . ,σw2 andhence may propagate tosomeof γt+1,. . . ,γm. (iv) If the columns of B are˜ linearly independent, then each of σ2
0,. . . ,σ2w, and hence
each of γ0,. . . ,γm,can be estimated unbiasedly by at leastone linear combination
ofγˆ0,. . . ,γˆt.
(v) If the columns of B are˜ not linearly independent, then there is atleast one value of i suchthat nolinearcombination of ˆγ0,. . . ,ˆγt gives anunbiasedestimate of γi. Proof. (i),(ii)and(iii)areimmediate.
(iv) SeeFerreira[12].
(v) IfthecolumnsofB are˜ notlinearlyindependent,thenthereisatleastonevalueof j suchthatσ2
j isnotalinearcombinationofγ0,. . . ,γt.Hencenolinearcombination of γˆ0,. . . ,ˆγt gives anunbiased estimate of σj2. There is at least onevalue of i for whichbij= 0:foranysuch i,thereisnolinearcombinationofγˆ0,. . . ,ˆγtwhichgives anunbiasedestimateofγi. 2
WhenB is˜ invertible,the only difference from Section 2is that, when w < i m,
the estimate of γi may be a linearcombination of two or moremean squares, and so Satterthwaite’sapproximation[35]mustbeusedforperforminganF-test.Sincem> w,
WhenthecolumnsofB are˜ linearlyindependent,theneachparameterσi2hasatleast oneunbiasedestimatorwhichisalinearcombinationoftheresidualmeansquares.In[8], Brown saysthat(σ02,. . . ,σw2) is ‘estimable’ifand onlyifthis happens.This property is called‘segregation’in[13–16].
WhenthecolumnsofB are˜ notlinearlyindependentthenthereisoneormorevalues of i for whichthere is no unbiased estimatorof γi. In suchcases, clearlyi > t,and so Ti = Qi = 0.Thusthereisnoestimatorofastandarderrorforanyscalarlinearfunction of TiXτ ,and there isnotest ofthehypothesis thatTiXτ = 0.If thisoccurs for only onevalueofi,andthecorrespondingprimitiveidempotentisGN,asinExample 3,then this maynotberegardedas problematic.
When the rows of B are˜ not linearly independent, which estimator shouldwe use? Should we combine the different estimators in some way? In Example 4, there was a simpleanswer,givenbyreplacingmodel(5)byanOBS,butthissolutionisnotavailable ingeneral.
The set of residual mean squares forms a minimal sufficient set of statistics for
γ0,. . . ,γm when the columns of B are˜ linearly independent: see [8]. However, Seely showed in[38] thatitis notcomplete ifthe rowsof B are˜ linearly dependent,because nounbiased linearcombination isuniformlybetterthananyother.
Onepossibility isto maximizethelikelihood of(IN− T)Y under theassumptionof multivariate normality.However,Szatrowski[40]andSzatrowskiandMiller[41]showed thatthereisnoclosed-formsolutionwhentherowsofB are˜ linearlydependent.
Whenneithertherowsnorthecolumns ofB are˜ linearlyindependent,wehaveboth problems at the same time: inestimability of some variance components but multiple estimatesof others.
There arefour possiblecombinations oflinearindependence/dependenceoftherows of B with˜ linearindependence/dependenceof thecolumns of B.˜ Weshow inSection4
thatallfourpossibilities canoccur. 4. Examples
This section contains two examples to demonstrate all the behaviour described in Section 3. Each considers avariance modelsatisfying Assumption 7 with m > w,and comparesitwiththemodelsatisfyingAssumption 3withthesameidempotents.Several different modelsforexpectationareusedineachcase,allsatisfyingAssumption 10and alldefinedbythecombinationsoflevelsofoneormorefactors.Inordertomaintainthe samenotationfortheprimitiveidempotentswhileusingdifferentmodelsforexpectation, itis notalwayspossible forthestrata forwhichdi= 0 tobe labelled0,. . . ,t.
Example 5. Suppose that N = 96, and that the units are partitioned into six blocks, each of which is a4× 4 array, so thatthe 24 rows and 24 columns are nested within blocks:seeFig. 1.
One practical instance of this occurs in consumer testing. An experiment uses 24 volunteer consumers during 24 weeks. The weeks are divided into six groups of four
Fig. 1. Structure of the 96 units inExample 5.
weeks each, so that each group is approximately one month. The consumers are also partitionedintosixgroupsoffour.Fori= 1,. . . ,6,theconsumersingroup i participate duringalltheweeksinmonth i, andinnoothers. Eachvolunteeris givenapacketof a typeofcoffeeduringeachweekinwhichheorsheparticipates.Heorsheusesthatcoffee allweek,andgivesitascoreattheendoftheweek.Herethemonthsaretheblocks,the weeksaretherowsandtheconsumersarethecolumns.
A second instance of this structure is an experiment on feeds for lactating cows. Six different pensare used, oneineach four-week‘month’. Twenty-four cows are used, four per pen. Each cowis given onetype of feed throughout eachweek, and her milk production for that week is recorded. Of course, if cows inthe same pen get different feeds in the same week then they must be fed individually. Now the pens, weeks and cowsaretheblocks,rowsandcolumnsrespectively.
Inmoststatisticalsoftware,thisstructure iscreatedbyfirstdeclaringfactorsblocks, rowandcolumnswithsix,fourandfourlevelsrespectively.Thenoneunitiscreatedfor eachcombinationof levels.Finally,theexperimentalstructureis declaredbyaformula suchas
blocks/(rows∗ columns), whichisusedinGenStat[31,32]and R[34].
PutM0= I96.LetM1,M2 andM3be therelationmatricescorresponding torows,
columns andblocksrespectively.ThenM1M2= M2M1= M3.Theprimitive
idempo-tentsofthealgebragenerated byM0,M1,M2and M3areQ0,Q1,Q2and Q3,where
Q3= 16−1M3,Q2= 4−1M2− Q3,Q1= 4−1M1− Q3 andQ0= M0− Q1− Q2− Q3.
The Appendix shows the R code which generates these matrices, for one systematic orderingoftheunits.
Firstweconsider theorthogonalblockstructure definedby V = σ02M0+ σ21M1+ σ22M2+ σ32M3
Table 1
SkeletonanalysisofvariancetablesinExample 5.
Stratum df Source df Source df Source df Source df
Q3 6 mean 1 mean 1 mean 1 mean 1
A 2 H 5 H 5
residual 5 residual 3
Q2 18 G 3 G 3
G.H 15
residual 15 residual 18 residual 18
Q1 18 F 3
residual 15 residual 18 residual 18 residual 18
Q0 54 F.G 9
residual 45 residual 54 residual 54 residual 54 Expectation model (a) model (b) model (c) model (d)
where γ0= σ20,γ1= σ02+ 4σ21,γ2= σ02+ 4σ22 andγ3= σ20+ 4σ12+ 4σ22+ 16σ23.Thereis
noassumed linearrelationshipamongγ0,. . . ,γ3,noramongσ02,. . . ,σ23.
Weconsider fourdifferentmodels forexpectation.
(a) Treatment factorsF and G eachhavefourlevels. Within eachblock,levelsof F
are randomlyappliedto thefourrowsandlevelsofG arerandomlyappliedtothefour columns. Each combination of levels of F and G givesan entry inτ , so that n = 16. ThenT0correspondstotheinteractionF.G,T1tothemaineffectofF ,T2tothemain
effectofG,andT3 totheoverallmean.
Part (a)ofTable 1givestheskeletonanalysisofvariance,includingtheoverallmean. It showsthatd0= 45, d1= d2= 15 andd3 = 5.Thereis oneresidualmeansquare for
eachγ-parameter.
(b) TreatmentfactorA hasthree levels,randomly appliedto twowhole blockseach. Each level of A givesan entry inτ , so thatn = 3.Now T= T3, which includesboth
the main effect of A and the overall mean. As part (b) of Table 1 shows, d0 = 54,
d1= d2= 18,d3= 3,andthere isstilloneresidualmeansquare foreachγ-parameter.
(c) Treatment factor H is like treatment factor A, except that it has six levels. As part (c)ofTable 1shows,t= 2 andthereisnoestimatorforγ3.
(d) The treatment factors are G, as in (a), and H, as in (c). Expectation effects correspondtotheoverallmean,themaineffectsofG andH,andtheirinteractionG.H.
Part (d) ofTable 1showsthatthereisnoestimatorforγ3or γ2.
Now wesuppose thatσ2
3= 0.Theprimitiveidempotentsofthealgebragenerated by
M0,M1 andM2 arestillQ0,. . . ,Q3.Now
V = σ20M0+ σ21M1+ σ22M2
= γ0Q0+ γ1Q1+ γ2Q2+ γ3Q3,
where γ0= σ20,γ1= σ02+ 4σ12, γ2= σ20+ 4σ22 andγ3= σ20+ 4σ12+ 4σ22= γ1+ γ2− γ0.
Iftheexpectationfollowsmodel (a),thenpart (a)ofTable 1showsthattherearefour independentresidualmeansquareswhoseexpectationsareγ0,. . . ,γ3.Thereistherefore
Fig. 2. Structure of the 96 units inExample 6.
theproblem of deciding, for example,whetherto estimate γ3 by ˆγ3 or ˆγ1+ ˆγ2− ˆγ0 or
someweightedaverageofthese.Model (b)hasthesameproblem.
Ifthe expectationfollows model (c) then part (c) of Table 1 showsthat t= w = 2. Moreover, ˜ B = ⎡ ⎢ ⎣ 1 0 0 1 4 0 1 0 4 ⎤ ⎥ ⎦ ,
whichisinvertible.Theuniqueestimatorofγ3isˆγ1+ˆγ2−ˆγ0.Thiscanbeusedtoestimate
standarderrorsofdifferencesbetweenlevelsofH,andforanapproximateF-testforH,
eventhoughd3= 0.
Ifthe expectationfollows model (d)then part (d) ofTable 1 shows thatt= 1. The parametersγ0 andγ1 canbeestimated.Henceσ20 and σ12 canbeestimated,butσ22,γ2
and γ3 cannot. So there are noF-tests for any treatment effects,and no estimatorsof
standarderrorsofdifferences.
Example6.Inamodifiedversionofthecow-feedingexperiment,allsixpensareusedat thesametimeforasingle‘month’offourweeks.Thusrowsarenowcrossedwithblocks, andtheformulafortheexperimentalstructure is
rows∗ (blocks/columns) : seeFig. 2.
PutM0= I96.LetM1betherelationmatrixforrow-blockintersections.LetM2,M3
andM4 be therelationmatrices forcolumns,blocksand wholerows, respectively,and
letM5 = J96. Theprimitive idempotents ofthe algebragenerated by M0,. . . ,M5 are
Q0,. . . ,Q5,whereQ5= 96−1M5= G96,Q4= 24−1M4−Q5,Q3= 16−1M3−Q5,Q2=
4−1M2−16−1M3,Q1= 4−1M1−Q3−Q4−Q5,andQ0= M0−Q1−Q2−Q3−Q4−Q5.
Thecorrespondingorthogonalblockstructurehas
V = σ02M0+ σ12M1+ σ22M2+ σ32M3+ σ42M4+ σ25M5
= γ0Q0+ γ1Q1+ γ2Q2+ γ3Q3+ γ4Q4+ γ5Q5,
where γ0 = σ20, γ1 = σ20 + 4σ21, γ2 = σ20 + 4σ22, γ3 = σ20 + 4σ12 + 4σ22 + 16σ32, γ4 =
σ02+ 4σ12+ 24σ42 andγ5= σ20+ 4σ12+ 4σ22+ 16σ32+ 24σ42+ 96σ25.
Table 2
SkeletonanalysisofvariancetablesinExample 6.
Stratum df Source df Source df Source df
Q5 1 mean 1 mean 1 mean 1
Q4 3 F 3
residual 3 residual 3
Q3 5 A 2 H 5
residual 3 residual 5
Q2 18 G 3
residual 18 residual 15 residual 18
Q1 15 residual 15 residual 15 residual 18
Q0 54 F.G 9
residual 54 residual 45 residual 54 Expectation model (a) model (b) model (c)
(a) Treatment factor A has three levels, each applied to two whole blocks, so that
n= 3.Part (a)ofTable 2givestheskeletonanalysisof variance.Itshowsthatthereis oneresidualmean squareforeachof γ0,. . . ,γ4.As inExample 3, thereis noestimator
forγ5 andnotestofthehypothesisthat¯τ = 0.
(b)TreatmentfactorsF andG eachhavefourlevels.EachlevelofF isappliedtoone wholerow.Levels ofG areappliedtothefour columns withineachblock.There isone entryin τ for eachcombinationoflevelsofF and G,sothatn= 16.Part (b)ofTable 2
shows thatd4= d5= 0, sothatthere is noestimateof γ4 orγ5,no testfor τ = 0 and¯
notest forthemain effectof F .
(c) Treatment factor H has six levels, each of which is appliedto onewhole block. Now part (c)ofTable 2showsthatthereisnoestimatorforγ3or γ5.
Amixedmodelforthisstructuremighthaveσ2
3= σ52= 0.(Thelabellingofthe
matri-ces M0,. . . ,M5 ischosensothatM0,. . . ,M3 denotethesamematricesinExamples 5
and 6.)However,M1M2 = M2M1 = M3 and M2M4 = M4M2 = M5,so the algebra
generated byM0,M1,M2and M4 stillhasprimitiveidempotentsQ0,. . . ,Q5.Now
V = σ02M0+ σ12M1+ σ22M2+ σ24M4
= γ0Q0+ γ1Q1+ γ2Q2+ γ3Q3+ γ4Q4+ γ5Q5,
whereγ0= σ02,γ1= σ02+ 4σ21,γ2= σ20+ 4σ22,γ3= σ02+ 4σ21+ 4σ22,γ4= σ02+ 4σ12+ 24σ42
and γ5= σ20+ 4σ12+ 4σ22+ 24σ42.
Iftheexpectationfollowsmodel (a)thenpart (a)ofTable 2showsthattherearefive independentresidualmeansquareswhoseexpectationsareγ0,. . . ,γ4.However,γ3= γ1+
γ2− γ0,soeachofγ0,. . . ,γ3canbeunbiasedlyestimatedbyseverallinearcombinations
ofthese meansquares.Sinceγ5= γ3+ γ4− γ1,itisalsopossibletoestimateγ5,andso
Undermodel (b),part (b)of Table 2showsthatt= w = 3, withdi= 0 wheni= 0, 1,2 and3.Thus ˜ B = ⎡ ⎢ ⎢ ⎢ ⎣ 1 0 0 0 1 4 0 0 1 0 4 0 1 4 4 0 ⎤ ⎥ ⎥ ⎥ ⎦.
Neithertherowsnorthe columnsof B are˜ linearlyindependent. Thereis noestimator forσ42, γ4 orγ5,andhencenotestforthemaineffectofF .However,therearemultiple
estimatorsforγ0,γ1,γ2,γ3,σ20,σ21 andσ22.
Ontheotherhand,iftheexpectationfollowsmodel(c)thenpart (c)ofTable 2shows thatagaint = w = 3, with di = 0 wheni= 0, 1, 2 and4. Labellingtherowsof B by˜
γ0,γ1,γ2 andγ4gives ˜ B = ⎡ ⎢ ⎢ ⎢ ⎣ 1 0 0 0 1 4 0 0 1 0 4 0 1 4 0 24 ⎤ ⎥ ⎥ ⎥ ⎦,
which is invertible. Hence there are unique linear combinations of the residual mean squareswhichprovideunbiased estimatorsforeachofγ0,. . . ,γ5,σ20,σ21,σ22 andσ24.All
treatmenteffects,includingtheoverallmean,canbetestedandassignedstandarderrors. 5. Conclusion
Itiswidelybelievedthatestimationandinferencearestraightforwardformixed mod-els in which all possible variance–covariance matrices commute with each other and with thematrix of orthogonal projection ontothe space of allpossible fitted values of the expectation vector. As summarized in Section 2, this is indeed true when the di-mension m+ 1 ofthecommutativealgebraspanned byallpossiblevariance–covariance matricesisequaltothenumber w + 1 oflinearlyindependentunknownvariance compo-nents.However,whenm> w thentherearefourpossibilitiesforestimabilityofvariance components. We recommend summarizing the data in an analysis-of-variance tablein every case. However, the criterion for unique estimability of all variance components is nolonger a functionof how many residual degreesof freedom are non-zero: whatis neededisthatthematrixB introduced˜ inSection3beinvertible.
The examples discussed in Section 4 show realistic experimental situations where different assumptions about the variance–covariance matrix, combined with different simplemethodsofassigningtreatmentstoexperimentalunits,canleadtoallfourtypes of behaviour: variance components may be all estimable or not, and their estimators mayormaynotbeuniquelinearcombinationsoftheresidualmeansquares.Thisshows that, when an experiment is being planned, it is advisable to construct the skeleton
analysis-of-variancetableandfindthepropertiesofthematrixB,˜ beforetheexperiment is carriedout.
Acknowledgements
ThisworkwaspartiallysupportedbynationalfundsofFCT–FoundationforScience and TechnologyunderUID/MAT/00212/2013.
Appendix A. Rcode a=4 b=4 c=6 Ia= as.matrix(diag(a)) Ib= as.matrix(diag(b)) Ic= as.matrix(diag(c)) Ja= rep(1,a) Jb= rep(1,b) Jc= rep(1,c) X0=Ic%x%Ib%x%Ia X1=Ic%x%Jb%x%Ia X2=Ic%x%Ib%x%Ja X3=Ic%x%Jb%x%Ja M0=X0%*%t(X0) M1=X1%*%t(X1) M2=X2%*%t(X2) M3=X3%*%t(X3) Q3=(1/b)*(1/a)*M3 Q2=(1/a)*M2-Q3 Q1=(1/b)*M1-Q3 Q0=M0-Q1-Q2-Q3 References
[1]R.A.Bailey,Aunified approachto designofexperiments,J.Roy.Statist.Soc.Ser. A144(1981) 214–223.
[2]R.A.Bailey,Strataforrandomized experiments(withdiscussion),J.Roy.Statist.Soc.Ser. B53 (1991)27–78.
[3]R.A.Bailey,Generalbalance:artificialtheoryorpracticalrelevance?,in:T.Caliński,R.Kala(Eds.), ProceedingsoftheInternationalConferenceonLinearStatisticalInference,LINSTAT’93,Kluwer, Dordrecht,Netherlands,1994,pp. 171–184.
[4]R.A.Bailey,Orthogonalpartitionsindesignedexperiments,Des.CodesCryptogr.8(1996)45–77. [5]R.A. Bailey, Association Schemes, CambridgeStud. Adv.Math., vol. 84, Cambridge University
Press,Cambridge,2004.
[6]R.A. Bailey, Design of Comparative Experiments, Camb. Ser. Stat. Probab. Math., Cambridge UniversityPress,Cambridge,2008.
[7]R.A. Bailey, C.J.Brien,Randomization-based modelsfor multitieredexperiments: I.Achain of randomizations,Ann.Statist.(2016),inpress.
[8]K.G.Brown,Onanalysisofvarianceinthemixedmodel,Ann.Statist.12(1984)1488–1499. [9]T.Caliński,S.Kageyama,Block Designs:ARandomization Approach. Vol.I: Analysis, Lecture
NotesinStatist.,vol. 150,Springer-Verlag,NewYork,2000.
[10]F.Carvalho,J.T.Mexia,C.Santos,C.Nunes,Inferencefortypesandstructuredfamiliesof com-mutativeorthogonalblockstructures,Metrika78(2015)337–372.
[11]R.Christensen,PlaneAnswerstoComplexQuestions,SpringerTextsStatist.,Springer-Verlag,New York,1987.
[12]S.S.Ferreira,Inferênciaparamodelosortogonaiscomsegregação,UnpublishedPhDthesis, Univer-sidadedaBeiraInterior,2006.
[13]S.S.Ferreira,D.Ferreira,E.Moreira,J.T.Mexia,InferenceforLorthogonalmodels,J.Interdiscip. Math.12(2009)815–824.
[14]S.S.Ferreira,D.Ferreira,E.Moreira,J.T.Mexia,Exactestimatorsfornormallinearmixedmodels, FarEastJ.Theor.Stat.31(2010)33–48.
[15]S.S.Ferreira,D.Ferreira,C.Nunes,J.T.Mexia,Crossingsegregatedmodelswithcommutative or-thogonalblockstructure,in:Proceedingsofthe8thInternationalConferenceonNumericalAnalysis
andAppliedMathematics,ICNAAM2010,in:AIPConf.Proc.,vol. 1281,2010,pp. 1229–1232. [16]S.S.Ferreira,D.Ferreira,C.Nunes,J.T.Mexia,Estimationofvariancecomponentsinlinearmixed
modelswithcommutativeorthogonalblockstructure,Rev.ColombianaEstadíst.36(2013)261–271. [17]M.Fonseca,J.T.Mexia,R.Zmyślony,BinaryoperationsonJordanalgebrasandorthogonalnormal
models,LinearAlgebraAppl.417(2006)75–86.
[18]M.Fonseca,J.T.Mexia,R.Zmyślony,Jordanalgebras,generatingpivotvariablesandorthogonal normalmodels,J.Interdiscip.Math.10(2007)305–326.
[19]M. Fonseca,J.T.Mexia, R.Zmyślony,Inference innormal modelswithcommutativeorthogonal blockstructure,ActaComment.Univ.Tartu.Math.12(2008)3–16.
[20]M.Fonseca,J.T.Mexia,R.Zmyślony,Leastsquaresandgeneralizedleastsquaresinmodelswith orthogonalblockstructure,J.Statist.Plann.Inference140(2010)1346–1352.
[21]A.M.Houtman,T.P.Speed,Balanceindesignedexperimentswithorthogonalblockstructure,Ann. Statist.11(1983)1069–1085.
[22]J.Isotalo,S.Puntanen,G.P.H.Styan,TheBLUE’scovariancematrixrevisited:areview,J.Statist. Plann.Inference138(2008)2722–2737.
[23]T.Kariya,H.Kurata,GeneralizedLeastSquares,Wiley,NewYork,2004.
[24]W. Kruskal,When are Gauss–Markov andleastsquares estimatorsidentical? A coordinate-free approach,Ann.Math.Statist.39(1968)70–75.
[25]H.Macharia,P.Goos,D-optimalandD-efficientequivalent-estimationsecond-ordersplit-plot de-signs,J.Qual.Technol.42(2010)358–372.
[26]A.Markiewicz,S.Puntanen,G.P.H.Styan,AnoteontheinterpretationoftheequalityofOLSE andBLUE,PakistanJ.Statist.26(2010)127–134.
[27]J.A. Nelder, The analysis of randomized experiments withorthogonal block structure. I. Block structureandthenullanalysisofvariance,Proc.Roy.Soc.Lond.Ser.A273(1965)147–162. [28]J.A.Nelder,Theanalysisofrandomizedexperimentswithorthogonalblockstructure.II.Treatment
structureandthegeneralanalysisofvariance,Proc.Roy.Soc.Lond.Ser.A273(1965)163–178. [29]C.Nunes,C.Santos,J.T.Mexia,Relevantstatisticsformodelswithcommutativeorthogonalblock
structureandunbiasedestimatorforvariancecomponents,J.Interdiscip.Math.11(2008)553–564. [30]P.Parker,S.M.Kowalski,G.G.Vining,Constructionofbalancedequivalentestimationsecond-order
split-plotdesigns,Technometrics49(2007)56–65.
[31] R.W.Payne,GenStat.Wileyinterdisciplinaryreviews,Comput.Statist.1(2009)255–258,http:// dx.doi.org/10.1002/wics.32.
[32]R.W.Payne,D.A.Murray,S.A.Harding, D.B.Baird, D.M.Soutar,GenStatforWindows: Intro-duction,12thedition,VSNInternational,HemelHempstead,2009.
[33]S.Puntanen,G.P.Styan,Theequalityoftheordinaryleastsquaresestimatorandthebestlinear unbiasedestimator(withdiscussion),Amer.Statist.43(1989)153–164.
[34]RCoreTeam,R:ALanguageandEnvironmentforStatisticalComputing,RFoundationfor Sta-tisticalComputing,Vienna,Austria,2013.
[35]F.E.Satterthwaite,Anapproximatedistributionofestimatesofvariancecomponents,Biometrics Bull.2(1946)110–114.
[36]H.Scheffé,TheAnalysisofVariance,JohnWiley&SonsInc.,NewYork,1959.
[37]J.Seely,Linearspacesandunbiasedestimators.Applicationstoamixedmodel,Ann.Math.Statist. 41(1970)1735–1748.
[38]J.Seely,Quadraticsubspacesandcompleteness,Ann.Math.Statist.42(1971)710–721.
[39]J.Seely,G.Zyskind,Linearspacesandminimumvarianceunbiasedestimators,Ann.Math.Statist. 42(1971)691–703.
[40]T.H.Szatrowski,Necessaryandsufficientconditionsforexplicitsolutionsinthemultivariatenormal estimationproblemforpatternedmeansandcovariances,Ann.Statist.8(1980)802–810.
[41]T.H. Szatrowski, J.J. Miller, Explicit maximum likelihood estimatesfrom balanced data inthe mixedmodeloftheanalysisofvariance,Ann.Statist.8(1980)811–819.
[42]T.Tjur,Analysis of variancemodelsinorthogonal designs (withdiscussion), Int.Stat. Rev.52 (1984)33–81.
[43]G.G.Vining,S.M.Kowalski,D.C.Montgomery,Responsesurfacedesignswithinasplit-plot struc-ture,J.Qual.Technol.37(2005)115–129.
[44]R.Zmyślony,Acharacterizationofbestlinearunbiasedestimatorsinthegenerallinearmodel,in: MathematicalStatisticsandProbabilityTheory,ProceedingsoftheSixthInternationalConference, Wisla1978,in:LectureNotesinStatist.,vol. 2,Springer,NewYork,Berlin,1980,pp. 365–373.