Tucker3 method in biometrical research: Analysis of experiments with three factors, using R

Lúcio B. de Araújo, Manuela M. Oliveira, Carlos Tadeu dos S. Dias, Amílcar Oliveira, and Teresa A. Oliveira

Citation: AIP Conf. Proc. 1479, 1728 (2012); doi: 10.1063/1.4756506

View online: http://dx.doi.org/10.1063/1.4756506

Published by the American Institute of Physics.


Lúcio B. de Araújo (a), Manuela M. Oliveira (b), Carlos Tadeu dos S. Dias (c), Amílcar Oliveira (d), Teresa A. Oliveira (d)

(a) Faculdade Matemática, Universidade Federal de Uberlândia, 38408-100, Uberlândia-MG, Brasil

(b) Department of Mathematics, University of Évora, Portugal

(c) Departamento de Ciências Exatas, Universidade de São Paulo, Piracicaba-SP, Brasil

(d) CEAUL and DCeT, Universidade Aberta, Rua da Escola Politécnica, 147, 1269-001 Lisboa, Portugal

Abstract. This work proposes a systematic study and interpretation of a response variable in relation to three factors, using a model for the joint analysis of tables, the Tucker3 model, together with the joint plot (joint biplot). The proposed method appears efficient and suitable for separating the systematic response pattern from the noise contained in a three-way table, and for interpreting that pattern. The joint plot facilitates the study and interpretation of the data structure and provides additional information about it. In our application, the aim is to identify the combinations of genotypes, locations and years that do or do not contribute to a high yield of bean cultivars.

Keywords: Multi-way, principal components, joint biplot.

PACS: 62-06, 62H25, 62K10, 62K15

INTRODUCTION

Multi-environment trials (MET) are conducted over several years for the major agricultural crops of the world. They are expensive but essential for the release of new genotypes and for the recommendation of cultivars, so appropriate methods for analyzing their data should be explored and developed. When the MET are evaluated over several years (i.e., genotype × location × year), the data can be organized in a three-way table whose entries correspond to genotype, location and year. In some cases, the researcher may want to know whether there is a common structure underlying the locations with respect to the years, and how the various genotypes respond across the structure formed by environments and years. Some genotypes may respond strongly in some locations but not in others, and some locations may be more closely associated with some genotypes than with others in some years. Thus, one way to analyze and interpret a three-way table is to determine a lower-dimensional structure, expressed as principal components, and then study the genotype × location × year relationship. Dedicated models exist for data organized in three-way tables, such as the models proposed in [1], which provide a trilinear decomposition of the data in the array. This paper therefore proposes a study and interpretation of bean yield in relation to genotype × location × year through the Tucker3 model, exploring the usefulness of the joint plot [2].

METHODOLOGY

Principal component analysis by singular value decomposition (SVD) seeks an approximation based on P components for each element of a two-way array X (I × J), so that an element of this array can be expressed as [3]:

$$x_{ij} = \sum_{p=1}^{P} a_{ip}\, g_{pp}\, b_{jp} + e_{ij}$$

where $a_{ip}$ is an element of the matrix A of component eigenvectors; $b_{jp}$ is an element of the matrix B of eigenvectors; $g_{pp}$ is an element of the diagonal matrix G of singular values; and $e_{ij}$ is the part of the information not explained by the P components.

A possible generalization of the principal components model from two-way data to a three-way array X, with elements $x_{ijk}$, can be written as:

$$x_{ijk} = \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R} a_{ip}\, b_{jq}\, c_{kr}\, g_{pqr} + e_{ijk}$$

where $e_{ijk}$ is an element of the residual array E (I×J×K); $a_{ip}$, $b_{jq}$ and $c_{kr}$ are typical elements of the matrices A (I×P), B (J×Q) and C (K×R); and $g_{pqr}$ is a typical element of the core array G (P×Q×R). This model is known as the Tucker3 model of X and is represented by X (P,Q,R), where P, Q and R indicate the number of components in each mode of X.
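The structural part of the model can be sketched in a few lines of code. The following is a minimal illustration in plain Python, not the authors' R implementation; the function name `tucker3_reconstruct` and the toy values are hypothetical, and the core array is assumed to be stored as nested lists indexed `G[p][q][r]`.

```python
from itertools import product


def tucker3_reconstruct(A, B, C, G):
    """Return x_hat[i][j][k] = sum over p, q, r of a_ip * b_jq * c_kr * g_pqr."""
    I, J, K = len(A), len(B), len(C)
    P, Q, R = len(G), len(G[0]), len(G[0][0])
    x_hat = [[[0.0] * K for _ in range(J)] for _ in range(I)]
    for i, j, k in product(range(I), range(J), range(K)):
        x_hat[i][j][k] = sum(
            A[i][p] * B[j][q] * C[k][r] * G[p][q][r]
            for p, q, r in product(range(P), range(Q), range(R))
        )
    return x_hat


# Hypothetical toy values (not from the paper): I = J = K = 2 and P = Q = R = 1.
A = [[1.0], [2.0]]
B = [[1.0], [3.0]]
C = [[1.0], [-1.0]]
G = [[[2.0]]]

x_hat = tucker3_reconstruct(A, B, C, G)
print(x_hat[1][1][0])  # a_21 * b_21 * c_11 * g_111 = 2 * 3 * 1 * 2
```

The residual $e_{ijk}$ is then the difference between the observed $x_{ijk}$ and the corresponding entry of `x_hat`.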

Once the number of components in the matrices A, B and C is fixed, the parameters $a_{ip}$, $b_{jq}$ and $c_{kr}$ of the Tucker3 model are estimated by an iterative alternating least squares method, in which each set of parameters is estimated conditional on the remaining parameters. The estimation is repeated until no significant changes are found in the parameter values; the initial solution uses the default values suggested by [1]. To determine the best Tucker3 solution, the method proposed in [4] was used.
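The least squares criterion behind this procedure can be written out explicitly. The block below is a sketch under one assumption that the text does not state, although it is conventional for this algorithm: A, B and C are constrained to be columnwise orthonormal. The loss symbol $L$ and the fit percentage are notation introduced here.

```latex
% Loss minimized over A, B, C and G
L(A,B,C,G) = \sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{k=1}^{K}
  \Bigl( x_{ijk} - \sum_{p=1}^{P}\sum_{q=1}^{Q}\sum_{r=1}^{R}
  a_{ip}\, b_{jq}\, c_{kr}\, g_{pqr} \Bigr)^{2}

% For fixed columnwise orthonormal A, B, C, the minimizing core is
g_{pqr} = \sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{k=1}^{K} a_{ip}\, b_{jq}\, c_{kr}\, x_{ijk}

% and the fitted sum of squares equals the sum of squared core elements, so
\text{fit}\,(\%) = 100 \cdot
  \frac{\sum_{p,q,r} g_{pqr}^{2}}{\sum_{i,j,k} x_{ijk}^{2}}
```

Each step updates one of A, B or C with the other two held fixed, so $L$ cannot increase from one step to the next. That is why iterating until the parameter values stabilize is a sensible stopping rule.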

For a two-way array, one can obtain a biplot, in which the rows and columns are displayed in a graph with two or three dimensions; its construction is described in [5]. For data in a three-way array, one can obtain a joint plot [2], which is used to represent Tucker3 models graphically. It is similar to a biplot, and all the principles of biplot interpretation apply. The two graphs differ in their construction: the joint plot is built as a biplot for two factors, given a component of the third factor (third mode) in the Tucker3 model, so each joint plot is built from a different slice of G. To construct a joint plot after fitting a Tucker3 model, one first obtains the matrix $\Delta_r = A G_r B' = A^{*}_r B^{*\prime}_r$ of dimension I×J, with r = 1, 2, ..., R. Then, using the SVD, $\Delta_r$ can be represented by a biplot, which gives the joint plot [6]. A joint plot is needed for each slice $G_r$, given the component matrices A* (I×P) and B* (J×Q).
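The construction can be made explicit as follows. This is a sketch that assumes the singular values are split symmetrically between the row and column markers. The symbols $U_r$, $\Lambda_r$ and $V_r$ are introduced here and do not appear in the text, and some presentations add further scaling constants to the two sets of coordinates.

```latex
\Delta_r = A\, G_r\, B' = U_r\, \Lambda_r\, V_r', \qquad
A^{*}_r = U_r\, \Lambda_r^{1/2}, \qquad
B^{*}_r = V_r\, \Lambda_r^{1/2}, \qquad
\Delta_r = A^{*}_r\, B^{*\prime}_r
```

With this choice, the inner product between the marker of row i and the marker of column j reproduces the (i, j) element of $\Delta_r$. Since $G_r$ is a P×Q matrix, $\Delta_r$ has rank at most min(P, Q), so a two-dimensional plot loses no information when P = Q = 2.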

To interpret a joint plot [7], suppose the plot is projected on the r-th principal component of the third mode, so that all levels of the first two modes appear in it. Then select, from matrix C (the matrix of principal components of the third mode), the levels of this factor with the greatest weight, positive or negative, on the r-th component. Suppose that C has a high positive value associated with the k-th level of the third mode. Then proximity between the i-th level of the first mode and the j-th level of the second mode indicates that the triple interaction between the i-th level of the first mode, the j-th level of the second mode and the k-th level of the third mode is positive. In contrast, if the i-th level of the first factor is far from the j-th level of the second factor (vectors in opposite directions), the triple interaction associated with these three levels is negative. If C has a high negative value associated with the k-th level of the third factor, the signs of the triple interactions are reversed. In general, levels located near the center of the joint plot are regarded as having average performance across all other modes.
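The sign rule follows from the model itself: the structural part of $x_{ijk}$ is the sum over r of $c_{kr}$ times the (i, j) element of $A G_r B'$, so component r contributes positively to cell (i, j, k) when the two factors have the same sign. The sketch below illustrates this in plain Python. The helper names and all numeric values are hypothetical, not the paper's estimates, and the core is assumed to be stored as `G[p][q][r]`.

```python
def joint_plot_matrix(A, B, G, r):
    """Return A G_r B', where G_r is the r-th frontal slice of the core G[p][q][r]."""
    P, Q = len(G), len(G[0])
    return [
        [
            sum(A[i][p] * G[p][q][r] * B[j][q] for p in range(P) for q in range(Q))
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


def interaction_sign(delta, C, i, j, k, r):
    """Sign (+1, 0 or -1) of c_kr * delta_ij, the contribution of component r to cell (i, j, k)."""
    value = C[k][r] * delta[i][j]
    return (value > 0) - (value < 0)


# Hypothetical toy values: 2 levels per mode and a (2, 2, 2) core.
A = [[0.6, 0.8], [0.8, -0.6]]
B = [[1.0, 0.0], [0.0, 1.0]]
C = [[0.6, 0.8], [0.8, -0.6]]
G = [[[5.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [2.0, 0.0]]]

delta_first = joint_plot_matrix(A, B, G, r=0)   # first frontal slice
delta_second = joint_plot_matrix(A, B, G, r=1)  # second frontal slice

# Positive loading c_00 and positive delta_00: positive contribution.
sign_same = interaction_sign(delta_first, C, i=0, j=0, k=0, r=0)
# Negative loading c_11 with positive delta_00: the sign is reversed.
sign_flipped = interaction_sign(delta_second, C, i=0, j=0, k=1, r=1)
print(sign_same, sign_flipped)
```

In the plot, the entries of `delta_first` and `delta_second` are read from the inner products between row and column markers rather than computed directly, but the sign logic is the same.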

RESULTS

The data refer to 13 common bean genotypes, evaluated in nine different experimental conditions and observed in the years 2000/2001, 2001/2002 and 2005/2006, in the cities of Golden and Aquidauana, Brazil. The experiments were installed in the rainy season (Golden) and in the dry season (Golden and Aquidauana). Each location consists of a city and a season of installation. In each experimental condition, a randomized block design with three blocks was used. To evaluate the results and select the best cultivars, the variable grain yield, in t ha$^{-1}$, was considered.

The Tucker3 model was applied using the R software. The great advantage of this method over multiplicative models for two-way data is that several factors can be studied simultaneously. For example, yield can be decomposed across genotypes × locations × years, which makes the findings more accurate and realistic than those obtained with multiplicative models for two-way data. It is thus possible to construct a three-way array of dimension 13×3×3 holding the mean yield of each genotype × location × year combination, in which the rows are genotypes, the columns are locations and the tubes are years.
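Building such an array from plot-level records amounts to averaging over blocks within each genotype × location × year cell. The sketch below shows one way to do this in plain Python. The record layout, the function name and the yield values are hypothetical, and a complete design (every cell observed at least once) is assumed.

```python
from collections import defaultdict


def build_three_way_array(records, genotypes, locations, years):
    """Average yields over blocks into X[i][j][k] (genotype x location x year).

    Each record is (genotype, location, year, block, yield_t_ha).
    Assumes every genotype x location x year cell has at least one record.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for genotype, location, year, _block, yield_t_ha in records:
        key = (genotype, location, year)
        sums[key] += yield_t_ha
        counts[key] += 1
    return [
        [[sums[(g, loc, y)] / counts[(g, loc, y)] for y in years] for loc in locations]
        for g in genotypes
    ]


# Hypothetical records: 2 genotypes, 1 location, 1 year, 3 blocks each.
records = [
    ("G1", "L1", "Y1", 1, 1.0), ("G1", "L1", "Y1", 2, 2.0), ("G1", "L1", "Y1", 3, 3.0),
    ("G2", "L1", "Y1", 1, 0.5), ("G2", "L1", "Y1", 2, 1.0), ("G2", "L1", "Y1", 3, 1.5),
]

X = build_three_way_array(records, ["G1", "G2"], ["L1"], ["Y1"])
print(X)  # rows are genotypes, columns are locations, tubes are years
```

With the full data set, the same call with 13 genotype labels, 3 location labels and 3 year labels would return the 13×3×3 array described above.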

To select the best Tucker3 model, we used the Timmerman-Kiers procedure; for the data set in question, the procedure suggests the Tucker3 model (2, 2, 2). This model explains 97.45% of the total variation in yield, and the components p1 and p2 of matrix A explain 95.37% and 2.09%, respectively. The two components q1 and q2 of matrix B explain 93.18% and 4.27%, respectively, and in matrix C both components r1

A (genotypes × components):

Genotype     p1       p2
G1         -0.324    0.155
G2         -0.341   -0.209
G3         -0.299   -0.186
G4         -0.252   -0.058
G5         -0.303   -0.126
G6         -0.251   -0.070
G7         -0.244   -0.062
G8         -0.345   -0.406
G9         -0.241   -0.240
G10        -0.217    0.508
G11        -0.250    0.148
G12        -0.208    0.415
G13        -0.286    0.442

B (locations × components):

Location     q1       q2
L1         -0.435    0.868
L2         -0.524   -0.029
L3         -0.732   -0.495

C (years × components):

Year         r1       r2
Y1          0.574    0.320
Y2          0.634    0.409
Y3          0.519   -0.854

G (core array, 2×2×2, rows p1 and p2, columns q1 and q2 within the frontal slices r1 and r2): entries garbled in the source (28 46 59 1920 55 2315 25 29 21 3628 94 6 25 24 92 20013).

It can be seen that the first component of matrix C is characterized by the first (0.574) and second (0.634) years. The second component is characterized by the third year (-0.854). Thus, when a joint plot is built by projecting the genotypes and locations on the first component of the years, the conclusions are restricted to years 1 and 2 (Figure 1a); when the genotypes and locations are projected on the second component of the years, the conclusions are valid for year 3 (Figure 1b).

The joint plots in Figures 1a and 1b correspond, respectively, to the biplots of the matrices $\Delta_1 = A G_1 B'$ and $\Delta_2 = A G_2 B'$, where $G_1$ is the first frontal slice and $G_2$ is the second frontal slice of G, obtained by fitting the Tucker3 (2,2,2) model.

From Figure 1a, considering years 1 and 2 ($c_{11}$ and $c_{21}$ are positive), we observe the following relations:

1- At location L3, yields are high for G1 and G13 and low for G9;

2- At location L2, yields are high for G1, G2 and G8 and low for G9, G10 and G11;

3- At location L1, yields are high for genotypes G2 and G8 and low for G10 and G12.

The other genotypes, G3, G4, G5, G6, G7 and G11, present intermediate yields.

In the same way, from Figure 1b we observe the characteristics of year 3 (with $c_{32}$ negative):

5- At location L1, yields are low for G2 and G8;

6- At location L2, yield is low for G8;

7- At location L3, yield is low for G9.

In year 3, the high yields are attributed to G10, G12 and G13: G10 and G12 are related to L1, G13 is related to L3, and L2 is highly correlated with G10, G12 and G13.

FIGURE 1. Joint plots projected on the first (a) and second (b) components of the third mode, for bean yield (t ha$^{-1}$): 13 genotypes, 3 locations, 3 years.

From the analysis of Figures 1a and 1b, it can be observed that some genotypes are not related to certain environments. This occurs when the angle between the genotype and environment vectors is near 90°. For example, in Figure 1a, genotypes G1 and G9 are not related to L1, and G13 is not related to L2. In Figure 1b, G2 is not related to L3, G9 is not related to L1, and G1 is not correlated with L2.

CONCLUSION

The proposed systematic analysis using Tucker3 models proved efficient and adequate for separating the systematic response pattern from the noise contained in three-way tables, and for interpreting that pattern. The joint plot facilitates the understanding of the data structure and provides additional information about it, making it possible to identify which combinations of genotypes, locations and years do or do not contribute to a high yield.

ACKNOWLEDGMENTS

The research leading to these results has received funding from the ForEAdapt project, funded by the European Union Seventh Framework Programme (FP7-PEOPLE-2010-IRSES) under grant agreement n° PIRSES-GA-2010-269257. The research was also partially sponsored by national funds through the Fundação Nacional para a Ciência e Tecnologia, Portugal - FCT, under the project PEst-OE/MAT/UI0006/2011, and by FAPEMIG. The authors also thank CNPQ for granting a research scholarship.

REFERENCES

1. L. Tucker, Some mathematical notes on three-mode factor analysis. Psychometrika, 31, 279-311 (1966).

2. P.M. Kroonenberg, Three-mode principal component analysis: theory and applications. Leiden: DSWO, 1983, 398p.

3. P.M. Kroonenberg, Applied Multiway Data Analysis. New Jersey: Wiley-Interscience, 2008, 579p.

4. M.E. Timmerman and H.A.L. Kiers, Three-mode principal components analysis: Choosing the numbers of components and sensitivity to local optima. British Journal of Mathematical and Statistical Psychology, 53, 1-16 (2000).

5. K.R. Gabriel, The biplot graphic display of matrices with applications to principal components analysis. Biometrika, 58, 453-467 (1971).

6. P.M. Kroonenberg, The TUCKALS line: A suite of programs for three-way data analysis. Computational Statistics and Data Analysis, 18, 73-96 (1994).

7. M. Varela, J. Crossa, J. Rane, A.K. Joshi and R. Trethowan, Analysis of a three-way interaction including multi-attributes.
