
Competing Regression Models for Longitudinal Data

Airlane P. Alencar∗,¹, Julio M. Singer¹, and Francisco M. M. Rocha²

¹ Universidade de São Paulo (Departamento de Estatística, Instituto de Matemática e Estatística, Caixa Postal 66281, São Paulo, SP, 05314-970, Brazil)

² Universidade Federal de São Paulo (Departamento de Ciência e Tecnologia - São José dos Campos, SP, Brazil)

Received zzz, revised zzz, accepted zzz

The choice of an appropriate family of linear models for the analysis of longitudinal data is often a matter of concern for practitioners. To attenuate such difficulties, we discuss some issues that emerge when analyzing this type of data via a practical example involving pretest-posttest longitudinal data. In particular, we consider log-normal linear mixed models (LNLMM), generalized linear mixed models (GLMM) and models based on generalized estimating equations (GEE). We show how some special features of the data, like a non-constant coefficient of variation, may be handled in the three approaches and evaluate their performance with respect to the magnitude of standard errors of interpretable and comparable parameters. We also show how different diagnostic tools may be employed to identify outliers and comment on available software. We conclude by noting that the results are similar, but that GEE-based models may be preferable when the goal is to compare the marginal expected responses.

Supplemental materials for this article are available on line.

Key words: Mixed models; Generalized linear models; Estimating equations method; Longitudinal data; Pretest/posttest measures

1 Introduction

Mixed models are very useful to analyze data with a hierarchical structure, where measures from distinct units are independent and those from within units are correlated. In particular, they are appropriate to analyze repeated measures in longitudinal studies. The classical linear mixed model considered by Laird & Ware (1982) is expressed as

$$y_i = X_i\beta + Z_i b_i + e_i, \quad i = 1, \ldots, N, \tag{1.1}$$

where $y_i$ is the $(n_i \times 1)$ vector of responses for the $i$-th unit, $X_i$ is the $(n_i \times p)$ fixed effects specification matrix, $\beta$ is the corresponding $(p \times 1)$ vector of parameters, $b_i$ is a $(q \times 1)$ vector of random effects with $E(b_i) = 0$ and $\mathrm{Var}(b_i) = G$, $Z_i$ is a $(n_i \times q)$ specification matrix for the random effects, and $e_i$ is a $(n_i \times 1)$ vector of random errors independent of $b_i$, with $E(e_i) = 0$, $\mathrm{Var}(e_i) = R_i$, and $G$ $(q \times q)$ and $R_i$ $(n_i \times n_i)$ are symmetric positive definite matrices. The marginal expected response is $E(y_i) = X_i\beta$ and the conditional (on the individual effect $b_i$) expected response is $E(y_i \mid b_i) = X_i\beta + Z_i b_i$. The inclusion of random effects may account for heterogeneity of regression coefficients across sample units, possibly due to several unmeasured factors that affect the response variable (Fitzmaurice et al. 2004).

It follows that

$$\Sigma = \mathrm{Var}(y_i) = Z_i G Z_i^\top + R_i, \tag{1.2}$$

where the first and second summands correspond, respectively, to the inter- and intra-individual covariance structures. When $R_i = \sigma^2 I$, the model is termed a homoskedastic conditional independence model, and the correlation among observations on the same unit $i$ arises from their sharing the unobservable (latent) variable $b_i$ (Diggle et al. 2002, p. 128).

∗ Corresponding author: e-mail: [email protected], Phone: +55-11-30916218
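The structure in (1.2) can be illustrated for the simplest case of a single random intercept ($Z_i = 1_{n_i}$, $G = \sigma_b^2$) with $R_i = \sigma^2 I$. The sketch below is only an illustration: the function names and the numerical variance components are hypothetical and are not estimates from this study. Every pair of observations on the same unit then shares the covariance $\sigma_b^2$, which gives the common correlation $\sigma_b^2/(\sigma_b^2 + \sigma^2)$.

```python
def random_intercept_covariance(n_obs, var_b, var_e):
    """Return Z G Z^T + R for Z = 1_n, G = var_b and R = var_e * I."""
    return [
        [var_b + (var_e if r == c else 0.0) for c in range(n_obs)]
        for r in range(n_obs)
    ]


def intraclass_correlation(var_b, var_e):
    """Correlation between two observations on the same unit."""
    return var_b / (var_b + var_e)


# Hypothetical variance components (not estimates from the paper).
sigma = random_intercept_covariance(n_obs=4, var_b=0.03, var_e=0.01)
rho = intraclass_correlation(var_b=0.03, var_e=0.01)
```

Here the off-diagonal entries of `sigma` equal the random-intercept variance, so the within-unit correlation is induced entirely by the shared latent effect.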

In the original formulation, Laird & Ware (1982) assumed that $b_i$ and $e_i$ in (1.1) follow normal distributions. Unbiased and consistent estimators of the variance components may be obtained by maximizing the restricted likelihood function and estimators of the fixed effects may be obtained via maximum likelihood (Jiang 2007). In practice, however, it is often not plausible to assume that the response is normally distributed. To bypass this problem, a possible strategy is to consider a log-normal linear mixed model (LNLMM), where the response is assumed to follow a multiplicative model with log-normal errors; the corresponding linearized model is additive and may be expressed as (1.1). For a broader choice of response distributions, two alternatives are generalized linear mixed models (GLMM) and models in which the parameters are estimated via generalized estimating equations (GEE).

In the GLMM approach, a distribution of the exponential family is considered for the response. In the usual formulation, individual observations are assumed to be independent conditionally on the random effects. In a more general setting, it is possible to consider different covariance structures. Estimation may be based on maximum likelihood (ML), penalized quasi-likelihood (PQL) or pseudo-likelihood methods (see Fitzmaurice et al. (2008), pp. 90-94, for details).

In the GEE-based approach, the form of the distribution of the vectors of responses ($y_i$) is not specified; the only assumption is that the marginal model depends only on the mean vector and on the covariance matrix of $y_i$. Estimators are obtained as solutions to generalized estimating equations and are consistent even when the covariance structure is misspecified. In this case, the model-based standard errors may not be correct, but valid standard errors can be obtained via the sandwich estimator (Liang & Zeger 1986). Because GEE-based models do not belong to the class of mixed models, they do not allow the evaluation of individual effects.
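To make the role of the sandwich estimator concrete, the following sketch treats the simplest marginal model: a common mean with identity link and an independence working structure, fitted to clustered data. It is a minimal illustration under these assumptions, not the estimator used in the analyses reported here; the data values and the function name are hypothetical. The naive variance treats all observations as independent, whereas the sandwich form sums the residuals within each subject before squaring, so that within-subject correlation is reflected in the standard error.

```python
def mean_with_sandwich_variance(clusters):
    """Estimate a common mean and its naive and sandwich variances.

    `clusters` is a list of lists, one inner list per subject.
    """
    n_total = sum(len(c) for c in clusters)
    mu = sum(sum(c) for c in clusters) / n_total

    # Naive (model-based) variance: treats all observations as independent.
    rss = sum((y - mu) ** 2 for c in clusters for y in c)
    naive = rss / (n_total - 1) / n_total

    # Sandwich variance: bread^-1 * meat * bread^-1, with bread = n_total
    # and meat = sum over clusters of (sum of residuals within cluster)^2.
    meat = sum(sum(y - mu for y in c) ** 2 for c in clusters)
    sandwich = meat / n_total ** 2
    return mu, naive, sandwich


# Hypothetical data: three subjects with strongly correlated measurements.
data = [[1.0, 1.1, 0.9], [2.0, 2.1, 1.9], [3.0, 3.1, 2.9]]
mu_hat, var_naive, var_sandwich = mean_with_sandwich_variance(data)
```

With these illustrative data the sandwich variance exceeds the naive one, as expected when observations within a subject are positively correlated.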

Excellent surveys on linear and generalized linear mixed models are presented in Demidenko (2004) and Jiang (2007). Marginal models analyzed via GEE are carefully addressed in Hardin & Hilbe (2002). Song (2007) and Fitzmaurice et al. (2008) present recent reviews on all these models. Good references on longitudinal data analysis using mixed and marginal models are Diggle et al. (2002), Fitzmaurice et al. (2004) and Molenberghs & Verbeke (2005).

We compare some features of the three approaches by fitting LNLMM, GLMM and GEE-based models to data obtained from a pretest-posttest longitudinal study. We show how some important characteristics of the data may be incorporated in the three models, discuss the robustness of parameter estimators in the presence of outliers and comment on their computational implementation. In Section 2 we introduce the data along with a preliminary exploratory analysis with the objective of identifying models for the analysis. In Sections 3, 4 and 5, we present the results of fitting, respectively, LNLMM, GLMM and GEE-based models, detailing their specification, the estimation method and discussing diagnostic tools. In Section 6 we discuss some advantages, disadvantages, and properties of each family of models and present some concluding remarks.

2 Preliminary analysis of the longitudinal pretest-posttest data

We present a pretest-posttest longitudinal dataset, highlight its distinctive characteristics and conduct a preliminary descriptive analysis. In particular, we evaluate coefficients of variation and fit a naive model using least squares in order to identify possible competing models.

A study conducted at the School of Dentistry of the University of São Paulo, Brazil, was designed to compare the efficiency in the removal of bacterial plaque under daily use of a low cost experimental (monoblock) toothbrush to that of a conventional toothbrush. In the study, 32 children, 4 to 6 years old, were randomly divided into 2 groups. Sixteen children received the monoblock toothbrush and the remaining children received the conventional toothbrush at the beginning of the study and used it for 45 days. During this period, data on a bacterial plaque index were collected before (pretest) and after (posttest) toothbrushing in sessions spaced by 15 days. The data are available in Nobre & Singer (2007).

Scatter plots of the pretest and posttest bacterial plaque indices for each session and type of toothbrush are presented in Figure 1.

Figure 1: Scatter plots of pretest versus posttest bacterial plaque indices.

Insert Figure 1 approximately here.

The bacterial plaque indices measured before ($x$) and after ($y$) toothbrushing are assumed to have the following characteristics, suggested in Singer & Andrade (1997), which we include for the sake of self-containment:

a) A pretest plaque index equal to zero implies an expected posttest plaque index also equal to zero.

b) Pretest and posttest plaque indices are non-negative.

c) The data are possibly heteroskedastic (because the response is non-negative and follows the relation $E(y) \leq x$); i.e., the expected posttest bacterial plaque index ($y$) cannot exceed the pretest index ($x$).

d) The relation between pretest and posttest plaque indices may be non-linear.

e) The observations carried out on the same child are possibly correlated.

For simplicity, we analyze the data conditionally on the values of the pretest observations. For an error-in-variables approach to a similar problem, the reader is referred to Aoki et al. (2003).

Following the suggestions of Singer & Andrade (1997) and Singer et al. (2002), a model that satisfies (a)-(e) is a multiplicative model of the form

$$y_{ijd} = \beta_{jd}\, x_{ijd}^{\gamma_{jd}}\, \varepsilon_{ijd}, \tag{2.1}$$

where $y_{ijd}$ is the posttest bacterial plaque index for the $i$-th individual using the $j$-th toothbrush in the $d$-th session, $x_{ijd}$ is the corresponding pretest index, $i = 1, \ldots, 16$, $j = 0$ (conventional), $1$ (monoblock), $d = 1, 2, 3, 4$, $\beta_{jd}$ and $\gamma_{jd}$ are parameters to be estimated and $\varepsilon_{ijd}$ are non-negative random errors.

The efficiency of a toothbrush may be measured by the relative residual bacterial plaque index, defined as the expected ratio between the posttest and pretest bacterial plaque indices, say $E(y_{ijd}/x_{ijd}) = \beta_{jd}\, x_{ijd}^{\gamma_{jd} - 1} E(\varepsilon_{ijd})$. With this definition, the smaller the relative residual bacterial plaque index, the more efficient is the toothbrush. Note that the measure of efficiency is proportional to $\beta_{jd}$ and it is equal to $\beta_{jd}$ when $\gamma_{jd} = 1$ and $E(\varepsilon_{ijd}) = 1$. In this multiplicative model, if $\gamma_{jd} = 1$, the efficiency does not depend on the pretest index, but when $\gamma_{jd} > 1$ ($\gamma_{jd} < 1$), the efficiency decreases (increases) as the pretest index increases.

The characteristics (a)-(d) are clearly satisfied by model (2.1); in particular, the heteroskedasticity of the response stems from $\mathrm{Var}(y_{ijd}) = (\beta_{jd}\, x_{ijd}^{\gamma_{jd}})^2\, \mathrm{Var}(\varepsilon_{ijd})$, even for homoskedastic errors $\varepsilon_{ijd}$. Requirement (e) may be incorporated in the specification of the error distribution. Model (2.1) may be linearized by taking logarithms, i.e.,

$$\ln(y_{ijd}) = \lambda_{jd} + \gamma_{jd} \ln(x_{ijd}) + e_{ijd}, \tag{2.2}$$

where $\lambda_{jd} = \ln(\beta_{jd})$ and $e_{ijd} = \ln(\varepsilon_{ijd})$.
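The linearization can be checked numerically with a small sketch for a single toothbrush and session. The code below fits the log-log regression by ordinary least squares, ignoring any within-child correlation; the function name, the pretest values and the parameter values ($\beta = 0.8$, $\gamma = 1.2$) are hypothetical and error-free posttest values are used so that the fit recovers the parameters exactly.

```python
import math


def fit_log_linear(x, y):
    """Least-squares fit of ln(y) = lam + gamma * ln(x); returns (lam, gamma)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    gamma = sxy / sxx
    return my - gamma * mx, gamma


# Hypothetical pretest values and error-free posttest values with
# beta = 0.8 and gamma = 1.2 (illustrative numbers only).
pretest = [0.5, 0.8, 1.0, 1.3, 1.7]
posttest = [0.8 * v ** 1.2 for v in pretest]
lam_hat, gamma_hat = fit_log_linear(pretest, posttest)
beta_hat = math.exp(lam_hat)
```

Exponentiating the fitted intercept returns the multiplicative coefficient of the original scale, which is why the back-transformation discussed later matters once random errors are present.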

To identify whether random intercepts should be included, we started by fitting model (2.2) assuming uncorrelated errors. In Figure 2a, we plot the residuals versus the rank of the mean residual (mean of the four residuals for each child); some children have all four residuals larger than those of others, suggesting that there are subject-specific components affecting the response. This variability may be modelled by random intercepts. The individual profiles of $\ln(y)$ versus $\ln(x)$ plotted in Figure 2b do not suggest that individual slopes are different, so that random slopes may not be necessary. Standardized residuals are plotted versus fitted values in Figure 2c. Three outliers may be identified; they correspond to the second evaluation session of the 12-th child in the conventional toothbrush group and to the third and fourth evaluation sessions of the 13-th child in the monoblock toothbrush group.

Figure 2: (a) Residuals versus ranks of subject mean residuals. (b) Individual profiles of ln(y) versus ln(x). (c) Residuals versus fitted values.

Insert Figure 2.

The analysis of the parameter estimates of model (2.2) also suggests that $\gamma_{jd} = 1$. Under this restriction, the efficiency of a toothbrush in each session can be measured by the expected posttest/pretest bacterial plaque index ratio, $E(y_{ijd}/x_{ijd}) = \beta_{jd} E(\varepsilon_{ijd})$. Since the corresponding standard deviation is $\mathrm{SD}(y_{ijd}/x_{ijd}) = \beta_{jd}\, \mathrm{SD}(\varepsilon_{ijd})$, the coefficient of variation $\mathrm{CV}(y_{ijd}/x_{ijd}) = \mathrm{SD}(\varepsilon_{ijd})/E(\varepsilon_{ijd})$ should be constant for all sessions and toothbrush groups if the errors $\varepsilon_{ijd}$ were identically distributed. Figure 3 suggests that the coefficients of variation of the ratios $y_{ijd}/x_{ijd}$ are different, and consequently that the standard deviations of the errors $\varepsilon_{ijd}$ in (2.1) may be larger for the conventional toothbrush in sessions 1 and 2 and for the monoblock toothbrush in sessions 3 and 4. Even after removing the 3 outliers suggested in Figure 2c, the coefficients of variation still exhibit a similar behaviour.
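The sample coefficient of variation used in this comparison can be sketched as follows. The ratios below are hypothetical values for one toothbrush and session, and the function name is an assumption introduced for illustration. The second computation shows why the coefficient of variation does not depend on $\beta_{jd}$ when $\gamma_{jd} = 1$: multiplying the ratios by a constant rescales the mean and the standard deviation by the same factor.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical posttest/pretest ratios for one toothbrush-session cell.
ratios = [0.70, 0.82, 0.76, 0.91, 0.65]
cv = coefficient_of_variation(ratios)

# Rescaling by a constant (the role played by beta_jd when gamma_jd = 1)
# leaves the coefficient of variation unchanged.
cv_rescaled = coefficient_of_variation([0.5 * r for r in ratios])
```

Differences between cells in this quantity therefore point to differences in the error distribution rather than in the efficiency parameter itself.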

Figure 3: Sample coefficients of variation of ratios between the posttest and pretest indices for each toothbrush and session for the complete and reduced (excluding 3 outliers) data sets.

Insert Figure 3.

The preliminary analysis suggests that the candidate models should include random intercepts, heteroskedasticity, and a larger coefficient of variation for the ratio posttest/pretest bacterial plaque indices for the conventional toothbrush in sessions 1 and 2 and for the monoblock toothbrush in sessions 3 and 4.

3 Analysis via log-normal linear mixed models

Here we consider an LNLMM of the form (2.2) that includes random effects to account for possible positive within-subject correlations and also considers an error covariance matrix that incorporates the non-constant coefficients of variation identified in Section 2. Such models may be fitted via well established linear mixed model methodology, where the location and the covariance structure may be modeled separately.

The log-normal linear mixed model may be expressed as

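$$y^*_{ij} = X_{ij}\beta + 1_4 b_{ij} + e_{ij}, \quad i = 1, \ldots, 16, \; j = 0, 1, \tag{3.1}$$

where $y^*_{ij} = (\ln(y_{ij1}), \ln(y_{ij2}), \ln(y_{ij3}), \ln(y_{ij4}))^\top$, and the specification matrices for the conventional and monoblock toothbrush groups ($j = 0, 1$) are, respectively,

$$X_{i0} = \left( I_4,\; 0_4,\; \bigoplus_{d=1}^{4} \ln(x_{ijd}),\; 0_4 \right), \qquad X_{i1} = \left( 0_4,\; I_4,\; 0_4,\; \bigoplus_{d=1}^{4} \ln(x_{ijd}) \right),$$

where $\bigoplus_{d=1}^{4} a_d$ denotes a diagonal matrix with the elements $a_d$ along the main diagonal, $\beta = [\lambda^\top \gamma^\top]^\top$ is a $(16 \times 1)$ vector with $\lambda^\top = [\lambda_{01}, \lambda_{02}, \lambda_{03}, \lambda_{04}, \lambda_{11}, \lambda_{12}, \lambda_{13}, \lambda_{14}]$, $\gamma^\top = [\gamma_{01}, \gamma_{02}, \gamma_{03}, \gamma_{04}, \gamma_{11}, \gamma_{12}, \gamma_{13}, \gamma_{14}]$, $1_4 = (1, 1, 1, 1)^\top$, and $0_4$ denotes a $4 \times 4$ matrix with all elements equal to zero. The error vectors $e_{ij}$ are independent and follow $N(0, R_j)$ distributions, where, as suggested by the preliminary analysis,

$$R_j = \bigoplus_{d=1}^{4} r^2_{jd} = \begin{cases} \mathrm{diag}\{\tau_1^2, \tau_1^2, \tau_2^2, \tau_2^2\}, & \text{if } j = 0, \\ \mathrm{diag}\{\tau_2^2, \tau_2^2, \tau_1^2, \tau_1^2\}, & \text{if } j = 1. \end{cases} \tag{3.2}$$

Also, we assume that the random effects $b_{ij}$ are independent and follow $N(0, \sigma_b^2)$ distributions.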

We present estimates of $\beta^*_{jd} = E(y_{ijd} \mid x_{ijd} = 1)$ instead of the estimates of $\beta_{jd} = \exp(\lambda_{jd})$ in (2.1), as suggested in Singer et al. (2002). To compute these estimates, note that $\exp(b_{ij} + e_{ijd})$ has a log-normal distribution, so that $E[\exp(b_{ij} + e_{ijd})] = \exp[(\sigma_b^2 + r^2_{jd})/2]$. This approach is more convenient for interpretation and comparison with other models. Expressions for the estimators $\hat{\beta}^*_{jd}$ and their corresponding asymptotic variances (obtained by the delta method) are presented in Table 1. Model (3.1)-(3.2) was fitted to the data using restricted maximum likelihood methods, available in SAS PROC MIXED (Littell et al. 2006). The code and the data to fit the proposed models may be found online in the Supplemental Materials link. The estimates are presented in Table 2.
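The log-normal correction factor used to obtain $\beta^*_{jd}$ can be verified numerically. The sketch below compares the closed form $\exp(\lambda_{jd})\exp[(\sigma_b^2 + r^2_{jd})/2]$ with a direct numerical integration of $\exp(z)$ against a normal density. The function names and the parameter values are hypothetical, chosen only for illustration, and the trapezoidal rule is simply one convenient way of approximating the integral.

```python
import math


def lognormal_factor_numeric(variance, half_width=10.0, steps=20000):
    """Approximate E[exp(Z)] for Z ~ N(0, variance) by the trapezoidal rule."""
    sd = math.sqrt(variance)
    lo, hi = -half_width * sd, half_width * sd
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        z = lo + k * h
        density = math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(z) * density
    return total * h


def marginal_beta(lam, var_b, var_e):
    """beta* = exp(lam) * exp((var_b + var_e) / 2), the expected response at x = 1."""
    return math.exp(lam + (var_b + var_e) / 2.0)


# Hypothetical values of lambda_jd, sigma_b^2 and r_jd^2.
lam, var_b, var_e = math.log(0.8), 0.03, 0.01
closed_form = marginal_beta(lam, var_b, var_e)
numeric = math.exp(lam) * lognormal_factor_numeric(var_b + var_e)
```

Because the factor is larger than one whenever the variance components are positive, $\beta^*_{jd}$ exceeds $\exp(\lambda_{jd})$, which is one reason the two quantities should not be compared directly across models.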

Table 1: Expressions for the estimators of $\beta^*_{jd} = E(y_{ijd} \mid x_{ijd} = 1)$ and corresponding asymptotic variances.

Insert Table 1.

Table 2: Estimates (Est.) and standard errors (SE) for saturated LNLMM, GLMM and GEE fitted to the complete and reduced data sets.

Insert Table 2.

Results obtained under this model were compared to those based on two other models of the form (3.1): the first with $R_j = \mathrm{diag}(\tau^2_{j1}, \ldots, \tau^2_{j4})$, $j = 0, 1$, and the second with $R_j = \tau^2 I_4$. The latter was the model adopted in Nobre & Singer (2007) and implies a constant coefficient of variation. The AIC and BIC obtained via restricted maximum likelihood for the three models were, respectively, (AIC = -96.8, BIC = -92.4), (AIC = -91.8, BIC = -78.6) and (AIC = -74.9, BIC = -72.0); model (3.1)-(3.2) has the smallest value of both criteria, suggesting that it is acceptable. The reader is referred to Jiang & Rao (2003) for a detailed analysis of generalized information criteria and to Guerin & Stroup (2000) or Gurka (2006) for a discussion on AIC and BIC for model comparison.

In order to evaluate the robustness of the estimators, the model was refitted to the reduced data obtained by omitting the 3 outliers suggested in Figure 2c. The results are also presented in Table 2. The major discrepancies between the estimates for the complete and reduced data occur for the parameters $\beta^*_{jd}$ and $\gamma_{jd}$ for $(j, d) = (0, 2), (1, 3), (1, 4)$, which correspond to the treatments with identified outliers. As expected, the estimated error variance ($\hat{\sigma}_b^2$) is smaller for the reduced data set.

Following the suggestions of Nobre & Singer (2007), studentized conditional residual plots are presented in Figure 4 for both the complete and reduced data; the two outlying observations in Figure 4a correspond to large decreases in the posttest index relative to the pretest index in the second session for the 12-th child in the conventional toothbrush group and in the fourth session for the 13-th child in the monoblock toothbrush group. It is worth noting that when the model is fitted with $R_j = \tau^2 I_4$, as in Nobre & Singer (2007), these 2 outliers are more apparent in the residual analysis; furthermore, an extra outlier is detected in the third session for a child in the monoblock group (as in Figure 2c). The model that includes (3.2) accommodates this observation and seems to be more suitable. The QQ plots in Figures 5a (complete data) and 5b (omitting possible outliers) suggest mild deviations from the normality assumption, but we do not believe that this could jeopardize the results.

Figure 4: Studentized conditional residuals versus subject indices for LNLMM. (a) Complete data. (b) Reduced data.

Insert Figure 4.

Figure 5: QQ plot of studentized conditional residuals for LNLMM. (a) Complete data. (b) Reduced data.

Insert Figure 5.

We did not identify evidence against the hypothesis that $\gamma_{jd} = 1$, $j = 0, 1$, $d = 1, 2, 3, 4$, under model (3.1)-(3.2) ($p = 0.310$). The estimates of the parameters for this model are displayed in Table 3. Note that Nobre & Singer (2007) did not verify whether $\gamma_{jd} = 1$.

Table 3: Estimates (Est.) and standard errors (SE) for saturated LNLMM, GLMM and GEE fitted to the complete and reduced data sets assuming $\gamma_{jd} = 1$.

Insert Table 3.

The conventional toothbrush seems more efficient than the monoblock toothbrush as suggested in Figure 6. P-values for interaction and main effects tests are presented in Table 4. The difference in the expected efficiencies of the two types of toothbrush is not the same for all sessions (p=0.010): for the monoblock toothbrush it is constant along the sessions (p=0.387) but this is not so for the conventional toothbrush (p=0.006). The expected efficiencies are different for the monoblock and conventional toothbrushes in sessions 1 and 2 (p=0.021 and p<0.001, respectively) and are similar in sessions 3 and 4 (p=0.266 and p=0.840). This suggests that the monoblock toothbrush is less efficient only in sessions 1 and 2 and is as efficient as the conventional toothbrush in sessions 3 and 4. In Nobre & Singer (2007), under the constant coefficient of variation assumption, the interaction effects were not significant.

Figure 6: Estimated relative residual bacterial plaque index and corresponding 95% confidence intervals based on the LNLMM (complete data set).

Insert Figure 6.

Table 4: P-values for interaction and main effects tests for LNLMM, GLMM and GEE.

Insert Table 4.

4 Analysis via generalized linear mixed models

In this section, we consider a GLMM based on a gamma distribution. Like the LNLMM, this model is appropriate for a non-negative variable like the posttest bacterial plaque index and meets all the requirements set forth in Section 2. A positive correlation among the repeated measures is induced by the inclusion of random effects.

The typical gamma generalized linear mixed model with a logarithmic link function for the posttest bacterial plaque indices, $y_{ij} = (y_{ij1}, \ldots, y_{ij4})^\top$, may be specified as

$$y_{ijd} \mid b_{ij} \sim \mathrm{Gamma}(\mu_{ijd}, \phi), \tag{4.1}$$

$$\mu_{ij} = E(y_{ij} \mid b_{ij}), \quad \ln(\mu_{ij}) = X_{ij}\beta + 1_4 b_{ij}, \quad b_{ij} \sim N(0, \sigma_b^2), \tag{4.2}$$

with $\mu_{ij} = (\mu_{ij1}, \ldots, \mu_{ij4})^\top$ and $\mathrm{Var}(y_{ij} \mid b_{ij}) = \phi A_{\mu_{ij}}$, where $\phi$ is a dispersion parameter, $A_{\mu_{ij}} = \mathrm{diag}(\mu_{ij1}^2, \mu_{ij2}^2, \mu_{ij3}^2, \mu_{ij4}^2)$ and $X_{ij}$ and $\beta$ are defined as in (3.1). In this context, the coefficient of variation of the ratio of the posttest bacterial plaque index to the pretest bacterial plaque index is constant for all treatments. Since this disagrees with our previous conclusions, we consider an alternative GLMM, maintaining (4.2) and replacing (4.1) with

$$\mathrm{Var}(y_{ij} \mid b_{ij}) = A^{1/2}_{\mu_{ij}}\, W_{ij}\, A^{1/2}_{\mu_{ij}}, \tag{4.3}$$

where, by taking $W_{ij} = R_j$ as in (3.2), we include the heteroskedastic pattern identified in Figure 3. If we let $W_{ij} = \phi I$, it follows that (4.3) is less restrictive than (4.1), since it does not require $y_{ijd} \mid b_{ij}$ to follow a gamma distribution, as pointed out by Jiang (2007).
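For a diagonal $W_{ij}$, the variance structure in (4.3) implies that the conditional standard deviation in session $d$ is $\mu_{ijd}\sqrt{w_d}$, so the conditional coefficient of variation equals $\sqrt{w_d}$ and may differ between sessions. The sketch below illustrates this; the function name, the conditional means and the diagonal of $W$ are hypothetical values that only mimic the two-level pattern described for the conventional toothbrush.

```python
import math


def conditional_sd(mu, w_diag):
    """Standard deviations implied by Var = A^(1/2) W A^(1/2), with
    A = diag(mu_d^2) and a diagonal W = diag(w_d)."""
    return [m * math.sqrt(w) for m, w in zip(mu, w_diag)]


# Hypothetical conditional means for the four sessions of one child and a
# hypothetical diagonal of W with a larger value in sessions 1 and 2.
mu = [0.9, 0.8, 0.85, 0.7]
w = [0.04, 0.04, 0.01, 0.01]
cv_by_session = [s / m for s, m in zip(conditional_sd(mu, w), mu)]
```

Setting all elements of `w` to a common value recovers the constant coefficient of variation implied by (4.1).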


We estimated the parameters in (4.2)-(4.3) using the restricted pseudo-likelihood method presented in Wolfinger & O’Connell (1993) and implemented in PROC GLIMMIX of SAS (Littell et al. 2006). This method is based on a first-order Taylor approximation to $\ln(y_{ij})$ around $\mu_{ij}$. In order to maximize the restricted pseudo-likelihood function, a Newton-Raphson ridge optimization technique is considered.

Estimates of $\beta^*_{jd} = E(y_{ijd} \mid x_{ijd} = 1) = E[E(y_{ijd} \mid b_{ij}, x_{ijd} = 1)]$ as well as of the other parameters in the GLMM are presented in Table 2. Also, there is no evidence against the hypothesis $\gamma_{jd} = 1$ ($p = 0.441$) and estimates of the parameters of model (4.2)-(4.3) with $\gamma_{jd} = 1$ are presented in Table 3.
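The iterated expectation defining $\beta^*_{jd}$ has a closed form under (4.2). The short derivation below is a sketch that relies only on the log link, on $\ln(x_{ijd}) = 0$ when $x_{ijd} = 1$ and on the normality of the random intercept:

$$
\begin{aligned}
E(y_{ijd} \mid b_{ij}, x_{ijd} = 1) &= \exp(\lambda_{jd} + b_{ij}), \\
\beta^*_{jd} = E[\exp(\lambda_{jd} + b_{ij})] &= \exp(\lambda_{jd})\, E[\exp(b_{ij})] = \exp(\lambda_{jd} + \sigma_b^2/2),
\end{aligned}
$$

where the last equality uses the moment generating function of the $N(0, \sigma_b^2)$ distribution. The marginal quantity therefore differs from the conditional one, $\exp(\lambda_{jd})$, by the factor $\exp(\sigma_b^2/2)$.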

The studentized conditional residuals in Figure 7 present the same pattern as the corresponding residuals obtained for the LNLMM displayed in Figure 4.

Figure 7: Studentized conditional residuals for the GLMM model versus subject indices. (a) Complete data. (b) Reduced data.

Insert Figure 7.

5 Analysis via generalized estimating equations based models

Here we analyze the pretest-posttest data via models where the marginal distribution is not completely specified; such models depend only on a dispersion parameter and on a vector of parameters ($\theta$) that indexes the mean $\mu_i(\theta)$ and the variance $V_i(\theta)$. Estimates of the parameters are obtained as solutions to the generalized estimating equations

$$S(\beta) = \sum_{i=1}^{K} \frac{\partial \mu_i^\top}{\partial \beta}\, \Sigma_i^{-1}\, (y_i - \mu_i(\beta)) = 0, \tag{5.1}$$

where $\Sigma_i = \phi\, \mathrm{diag}(V_i)^{1/2}\, W(\alpha)\, \mathrm{diag}(V_i)^{1/2}$ and $W$ is a working correlation matrix that incorporates the within-subject correlation. If we set $V_i(\mu_i) = \mu_i^2$ and $W = I$, the solution is equivalent to that of a GLM with a gamma distribution and a log link function.
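For the special case $V_i(\mu_i) = \mu_i^2$, $W = I$ and a log link, the estimating equations (5.1) reduce to $\sum x_k (y_k - \mu_k)/\mu_k = 0$ (the dispersion parameter cancels), and they can be solved by Fisher scoring. The sketch below does this for a model with an intercept and a single covariate. It is a minimal illustration of this special case only, not the procedure used to fit model (5.2)-(5.3); the function names and the data are hypothetical.

```python
import math


def gee_gamma_log_independence(z, y, iterations=50):
    """Solve the estimating equations for ln(mu) = b0 + b1 * z with
    variance function V(mu) = mu^2 and W = I, by Fisher scoring."""
    n = len(z)
    # X^T X for the design matrix with columns (1, z); constant across steps.
    s0, s1, s2 = float(n), sum(z), sum(v * v for v in z)
    det = s0 * s2 - s1 * s1
    b0, b1 = math.log(sum(y) / n), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * v) for v in z]
        # Score (up to the factor 1/phi): X^T ((y - mu) / mu).
        u = [(yi - mi) / mi for yi, mi in zip(y, mu)]
        g0, g1 = sum(u), sum(ui * v for ui, v in zip(u, z))
        # Fisher scoring step: beta += (X^T X)^{-1} * score.
        b0 += (s2 * g0 - s1 * g1) / det
        b1 += (s0 * g1 - s1 * g0) / det
    return b0, b1


def score(b0, b1, z, y):
    """Value of the (unscaled) estimating function at (b0, b1)."""
    mu = [math.exp(b0 + b1 * v) for v in z]
    u = [(yi - mi) / mi for yi, mi in zip(y, mu)]
    return sum(u), sum(ui * v for ui, v in zip(u, z))


# Hypothetical data: z = ln(pretest), y = posttest.
z = [math.log(v) for v in [0.5, 0.8, 1.0, 1.3, 1.7, 2.0]]
y = [0.42, 0.60, 0.85, 1.00, 1.45, 1.50]
b0_hat, b1_hat = gee_gamma_log_independence(z, y)
```

Evaluating `score(b0_hat, b1_hat, z, y)` returns values that are numerically zero, confirming that the iteration has reached a root of the estimating function. With a non-identity $W(\alpha)$, the same idea applies, but the weights involve the inverse of the working matrix for each subject.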

For the sake of comparison, we consider a GEE-based model for which

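$$\mu_{ij} = E(y_{ij}), \quad \ln(\mu_{ij}) = X_{ij}\beta, \quad \mathrm{Corr}(y_{ijd}, y_{ijd'}) = \alpha, \; d \neq d', \tag{5.2}$$

where $X$ and $\beta$ are defined in (3.1) and $\phi_{jd}$ are dispersion parameters. To capture the behaviour of the coefficients of variation identified in Figure 3, the dispersion parameter may depend on the type of toothbrush ($j$) and session ($d$), i.e.,

$$\phi_{jd} = \begin{cases} \phi, & \text{if } (j = 0 \text{ and } d = 3 \text{ or } 4) \text{ or } (j = 1 \text{ and } d = 1 \text{ or } 2), \\ \phi + \phi_1, & \text{otherwise.} \end{cases} \tag{5.3}$$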
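The dispersion pattern in (5.3) can be written as a small lookup function, which makes explicit which toothbrush-session combinations receive the extra dispersion. This is a sketch only: the function name and the numerical values of $\phi$ and $\phi_1$ are hypothetical.

```python
def dispersion(j, d, phi, phi1):
    """Dispersion parameter phi_jd following (5.3).

    j: 0 = conventional, 1 = monoblock; d: session 1..4.
    """
    baseline = (j == 0 and d in (3, 4)) or (j == 1 and d in (1, 2))
    return phi if baseline else phi + phi1


# Hypothetical values of phi and phi1.
table = {(j, d): dispersion(j, d, phi=0.01, phi1=0.03)
         for j in (0, 1) for d in (1, 2, 3, 4)}
```

The combinations with the larger dispersion are the conventional toothbrush in sessions 1 and 2 and the monoblock toothbrush in sessions 3 and 4, matching the pattern of the coefficients of variation described in Section 2.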

This model was fitted via the R package geepack (Halekoh et al. 2006), which uses GEE and allows for different covariates and even different link functions and estimating equations for the mean, scale and correlations, as considered in Yan & Fine (2004). The estimates of $\beta^*_{jd}$ and of the other parameters of model (5.2)-(5.3) are presented in Table 2. Again, no evidence against the hypothesis $\gamma_{jd} = 1$ ($p = 0.481$) was detected and the estimates under this restriction are presented in Table 3.

Plots of the studentized residuals are discussed in Venezuela et al. (2007, 2011) as well as Vens & Ziegler (2011). These residuals are presented in Figure 8 and their pattern is similar to that in Figure 4. These authors also show how to compute Cook’s distance and a leverage measure in this setting. These quantities are also presented in Figure 8. The outlying observations are the same as those identified previously. Although these observations do not present high leverage (Figure 8b), the corresponding Cook’s distances are large for these outliers. Moreover, excluding them, no other observation is identified as an outlier.

Figure 8: Residual analysis for the GEE-based model with complete data. (a) Studentized conditional residuals. (b) Cook’s distance. (c) Leverage measure for each observation. (d) Leverage measure for each subject.


Insert Figure 8.

6 Discussion

We considered three families of linear models that are sufficiently flexible to accommodate different characteristics of longitudinal data. In particular, they may account for non-constant coefficients of variation, a feature that may justify differences generated by underlying log-normal and gamma distributions, as discussed in Wiens (1999).

Differences between estimates obtained under marginal and conditional models may occur due to non-comparable parameters, as indicated by Lee & Nelder (2004) and Fitzmaurice et al. (2008). In our example, the main goal was to evaluate the efficiency of two types of toothbrushes over 45 days. This may not be carried out by a direct comparison of the estimates of $\beta_{jd}$ in models (3.1)-(3.2), (4.2)-(4.3) and (5.2)-(5.3), because such parameters have different interpretations. Assuming the correct specification of all models, the estimates of the original parameters $\beta_{jd}$ were smaller for LNLMM than for GLMM or GEE-based models. To bypass this problem, we compared the population-averaged effects that correspond to the relative residual bacterial plaque indices, $\beta^*_{jd} = E(y_{jd} \mid x_{jd} = 1)$. For both LNLMM and GLMM, computation of this marginal expectation requires integration over the random effects in $E[Y_i \mid b_i]$, while GEE-based models allow a direct estimation, an advantage reported by Serroyen et al. (2009), especially for non-linear mixed models.

In general, the estimates of the fixed parameters and their corresponding standard errors obtained under LNLMM, GLMM or GEE-based models are very close to each other, both with and without the exclusion of the outliers. The estimates of the variance components are also similar for LNLMM and GLMM. The estimate of $\tau_1^2$ is the most affected by the outliers because it is related to the extra variance for the treatments where the outliers occurred. For the marginal model (GEE-based), the estimated correlation of repeated measures is smaller for the data without outliers and so is its standard error. The conclusions of all tests are the same for the three approaches, although the p-values are in general smaller for GEE-based models (see Table 4).

For random-effects models, like LNLMM and GLMM, another important issue is the robustness of estimation of fixed effects with respect to misspecification of the distribution of the random effects, as discussed in Fitzmaurice et al. (2008). Verbeke & Lesaffre (1997) showed that maximum likelihood estimators of fixed effects obtained for a linear mixed model under Gaussian distributed random effects are consistent even when the random effects are non-Gaussian. Also, restricted maximum likelihood estimators of the variance components are consistent; this is important because they are used to obtain estimates of both $\beta^*_{jd}$ and its variance. Moreover, Verbeke & Lesaffre (1996) also comment that it is not possible to assess the distribution of the random effects by analyzing the predicted $\hat{b}_i$, which makes estimation robustness even more important. For GLMM, Litière et al. (2008) conclude that misspecification of the distribution of random effects may produce biased maximum likelihood estimators of fixed effects but that this bias is small when the variability of random effects is small; however, as the estimators of this variability are severely biased, it is not possible to evaluate whether the biases of fixed effect estimators are negligible or not.

An additional advantage of GEE-based models is that GEE provide consistent estimators of fixed effects if the corresponding model is correctly specified regardless of the correlation and scale structures (Liang & Zeger 1986). Also, when the scale parameter depends on covariates, the corresponding estimators are consistent if the fixed effects and scale structures are correctly specified (Yan & Fine 2004). Therefore, estimates of the fixed effects are more robust under GEE-based models and LNLMM than under GLMM if the covariance structure is misspecified.

Diagnostic tools are well developed for LNLMM (see Nobre & Singer (2007), for example). For GLMM, Vonesh et al. (1996) propose a goodness-of-fit statistic to evaluate the adequacy of an assumed mean and covariance structure and an approximate pseudo-likelihood test for the adequacy of the covariance structure; Xiang et al. (2002) use Cook’s distance for clustered data to identify influential clusters and Zhu & Lee (2003) propose a different method to measure influence for GLMM. Also, Waagepetersen (2006) assesses the goodness-of-fit of the random effects distribution using simulation. Nevertheless, more research is needed for residual analysis for GLMM, as already indicated in Dean & Nielsen (2007). Diagnostic tools for the GEE are presented, for example, in Venezuela et al. (2007), Venezuela et al. (2011) and Vens & Ziegler (2011) to detect influential and outlying observations using the projection (hat) matrix, Cook’s distance, standardized residuals and half-normal plots of absolute residuals with simulated envelopes, the latter only for constant scale parameter as proposed in Park & Shin (1998). Some diagnostic tools and references are presented in Table 5.

Table 5: Diagnostic Tools and References.

Insert Table 5.

Other estimation issues involving the GEE method are related to missing data problems and the estimation of the correlation matrix. For example, Lu et al. (2009) show that fixed-effect estimators of LMM are biased under departures from normality in the presence of missing data. Also, based on simulation, they conclude that other (weighted and augmented weighted) robust estimators based on GEE provide valid inference for skewed non-Gaussian data when missing data follows a missing at random pattern. Sun et al. (2009) compare estimation methods for the correlation in the GEE framework and conclude that the degree of imbalance and variability in the temporal spacing of measurements, the value of the correlation and the type of outcome affect the choice of the best method.

A practical issue to be addressed is the availability of computational software to fit the families of models under consideration. Although we used R only to fit GEE-based models, there are libraries to fit GLMM (MASS, with the command glmmPQL using penalized quasi-likelihood) and LNLMM (lme4 and nlme). Table 6 presents SAS procedures and R libraries to estimate the studied models.

Table 6: Available statistical packages using SAS and R and estimation methods used in this study.

Insert Table 6.

If, on the one hand, estimates obtained under LNLMM are robust to misspecification of random effects distributions, on the other hand, GEE-based models may be preferable for computational reasons when the main goal is only to estimate and compare marginal expected responses. LNLMM may be preferable to estimate subject-specific expected responses, since a variety of diagnostic analyses are available for this family. To help users with respect to the choice of appropriate families of models, estimation procedures, software and diagnostic tools, we summarize different characteristics for the three approaches in Table 5. Other estimation methods are available, for example, in Fitzmaurice et al. (2008).

Finally, we mention that because of the relatively small (but not uncommon in studies of this type) sample size (32 children with 4 observations/child), the conclusions based on asymptotic results should be viewed with caution. A good practice in such cases is to fit different models to the data, use different diagnostic tools to detect possible outliers and verify whether the results are coherent. In our case, the three models generated similar results even when the possible outliers were included, suggesting that the conclusions seem pertinent.

Figure 1 Scatter plots of pretest versus posttest bacterial plaque indices. [Figure not reproduced: one panel for each toothbrush (conventional, monoblock) and session; horizontal axis: pretest; vertical axis: posttest; both axes range from 0.0 to 2.0.]

Figure 2 (a) Residuals versus ranks of subject mean residuals. (b) Individual profiles of ln(y) versus ln(x). (c) Residuals versus fitted values. [Figure not reproduced: panel (a) horizontal axis: subject, vertical axis: resid, with labelled points monoblock, subj=13, session=4, obs=116; conventional, subj=12, session=2, obs=46; monoblock, subj=13, session=3, obs=115. Panel (b) horizontal axis: log x, vertical axis: log y, with separate panels for conventional and monoblock. Panel (c) horizontal axis: fitted, vertical axis: standardized OLS resid, with labelled observations 46, 115 and 116.]

Figure 3 Sample coefficients of variation of ratios between the posttest and pretest indices for each toothbrush and session for the complete and reduced (excluding 3 outliers) data sets. [Figure not reproduced: horizontal axis: session (1 to 4); vertical axis: CV (0.00 to 0.20); series: conventional, monoblock, conventional without outliers, monoblock without outliers.]

Figure 4 Studentized conditional residuals versus subject indices for LNLMM. (a) Complete data. (b) Reduced data. [Figure not reproduced: horizontal axis: subject; vertical axis: studentized conditional residuals; observations 46 and 116 are labelled in panel (a).]

Figure 5 QQ plot of studentized conditional residuals for LNLMM. (a) Complete data. (b) Reduced data. [Figure not reproduced: horizontal axis: Gaussian quantiles.]

Figure 6 Estimated relative residual bacterial plaque index and corresponding 95% confidence intervals based on the LNLMM (complete data set). [Figure not reproduced: horizontal axis: session (1 to 4); vertical axis: estimated relative residual bacterial plaque index.]