The least squares method with high school mathematics
O método dos mínimos quadrados com a matemática do Ensino Médio

Antonio Carlos Baptista Antunes* and Weuber da Silva Carvalho**

* Universidade Federal do Rio de Janeiro. ** Instituto Brasileiro de Geografia e Estatística.

Abstract
Teaching the least squares method for fitting straight lines to a data set can be difficult if students have not learned elementary calculus. In this paper we present the least squares method using high school mathematics. We also derive uncertainty expressions for the straight-line parameters.
Key words: Fitting straight lines. Elementary adjusting methods. Uncertainty expressions.

1 Introduction

In this paper we present the Least Squares Method (LSM) for fitting a straight line to a set of data using high school mathematics. This derivation of the LSM can be studied and applied by high school students, technical school students, and first-year undergraduate students, who in general have not yet acquired the necessary skill in calculus. Our experience with students in their first term of experimental physics, who do not have sufficient training in the use of partial derivatives to minimize functions of several variables, shows a tendency to accept passively the results of the LSM, which give the parameters of the straight line to be fitted. In most applications, students use a computer program that returns the parameters of the straight line once the data set is entered as input. In general these students have no clear idea of the meaning of the LSM. They have an intuitive perception that the best fitted straight line should correspond to the minimum of the sum of squared deviations between the straight line and the data points, but the minimization process itself remains obscure. Our aim in this paper is to shed some light on the visualization of the LSM as a statistical procedure for fitting a given function to a data set.

The name least squares comes from the fact that the procedure consists in minimizing the squares of the differences between the measured results and the values predicted by the mathematical function to be fitted. This function is often the mathematical expression of a physical model, which contains free parameters. Fitting this function to a data set means determining the values of these parameters that minimize the sum of the squared deviations of the model predictions from the experimental results. By comparing the graph of the best-fit function given by the least squares method with the plot of the data set, including the error bars, the validity of the physical model can be tested. The first publication on the LSM was in 1806, in a book on the determination of the orbits of comets by A. M. Legendre. However, C. F. Gauss was the first to develop the probability foundations of the method and its relation to the normal distribution of errors.
In the present approach to the LSM, the geometrical aspect of the minimization process is emphasized. The sum of the squared deviations between the data and the function to be fitted, sometimes called the chi-square function, is a function defined on the space of the parameters. In the particular case of fitting a straight line, which has two parameters to be determined, the chi-square function is a quadratic function of the straight-line parameters. This quadratic function defines a paraboloid over the plane of the straight-line parameters, and the minimization process consists in determining the minimum of this paraboloid. A previous version of this paper was shared with some students who showed special interest in the least squares method. Our conclusion from this test is that the present derivation of the LSM, although more laborious than the straightforward derivation using calculus, gives better insight into the meaning of the LSM.

In section 2 we apply the LSM to a set of data with unknown errors, and in section 3 to a set of data with known errors. In section 4 we evaluate the uncertainties of the parameters of the straight line fitted in section 3. A similar evaluation is used in section 5 to obtain the uncertainties in the parameters of the straight line fitted in section 2 to a data set with unknown errors. An example is given in section 6 to illustrate our formulation of the LSM.

2 Derivation of the LSM

Consider the problem of fitting a straight line

y = ax + b

to a set of data {x_i, y_i; i = 1, ..., N} with unknown errors. First we define the deviations of the data points from the curve to be fitted. These deviations are given by Lyons (1989, 1990):

d_i = y_i - (a x_i + b).   (1)

In order to determine the two free parameters, a and b, we must minimize the mean of the squared deviations,

S(a, b) = \frac{1}{N} \sum_{i=1}^{N} [y_i - (a x_i + b)]^2.   (2)

We introduce the symbol

\langle A \rangle = \frac{1}{N} \sum_{i=1}^{N} A_i   (3)

to denote the mean over the N values of A. Then equation (2) can be rewritten as

S(a, b) = \langle y^2 \rangle - 2a \langle xy \rangle - 2b \langle y \rangle + a^2 \langle x^2 \rangle + 2ab \langle x \rangle + b^2,   (4)

which is the equation of a paraboloid defined over the plane (a, b). For a chosen value of a, equation (4) describes a parabola in the plane (b, S). This becomes clear by writing

S = b^2 + \beta b + \gamma,   (5)

where

\beta = 2(a \langle x \rangle - \langle y \rangle)   (6)

and

\gamma = \langle y^2 \rangle - 2a \langle xy \rangle + a^2 \langle x^2 \rangle.   (7)

The coefficient of the b^2 term in equation (5) is positive, so this parabola has a minimum. In order to determine the position b_0(a) of this minimum, we seek the roots of the equation

b^2 + \beta b + \gamma = 0,   (8)

which are given by

b_\pm = \frac{-\beta \pm \sqrt{\beta^2 - 4\gamma}}{2}.   (9)

The two roots are the intersections of the parabola with the abscissa axis. Due to the symmetry of the parabola with respect to an axis parallel to S that passes through its minimum, the abscissa of the minimum lies halfway between the two roots; see Figure 1.

[Figure 1. The parabola that appears in sections 2 and 3.]

The coordinate of the minimum is therefore the average of the two roots,

b_0(a) = \frac{b_+ + b_-}{2} = -\frac{\beta}{2},   (10)

and then

b_0(a) = \langle y \rangle - a \langle x \rangle.   (11)

If the result of equation (11) is reintroduced in equation (4), we get a new parabola in the plane (a, S):

S(a) = S_{x^2} a^2 - 2 S_{xy} a + S_{y^2},   (12)

where

S_{x^2} = \langle x^2 \rangle - \langle x \rangle^2, \quad S_{xy} = \langle xy \rangle - \langle x \rangle \langle y \rangle, \quad S_{y^2} = \langle y^2 \rangle - \langle y \rangle^2.   (13)

As in the preceding case, we seek the roots of the equation

S_{x^2} a^2 - 2 S_{xy} a + S_{y^2} = 0,   (14)

which are

a_\pm = \frac{S_{xy} \pm \sqrt{S_{xy}^2 - S_{x^2} S_{y^2}}}{S_{x^2}}.   (15)
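The vertex argument above can be checked numerically without any calculus. The following Python sketch is an illustration added here (it is not part of the original article, and the data values are hypothetical): for a fixed slope a it builds the coefficients β and γ of equations (6) and (7) and confirms that b_0 = -β/2 = ⟨y⟩ - a⟨x⟩ gives a smaller S(a, b) than neighbouring values of b.

```python
# Numerical check of the parabola argument of section 2 (hypothetical data).
import numpy as np

# Hypothetical data set {x_i, y_i} with unknown errors.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

def S(a, b):
    """Mean of the squared deviations, equation (2)."""
    return np.mean((y - (a * x + b)) ** 2)

a = 2.0                                    # a fixed, arbitrarily chosen slope
beta = 2.0 * (a * x.mean() - y.mean())                            # equation (6)
gamma = np.mean(y**2) - 2*a*np.mean(x*y) + a**2 * np.mean(x**2)   # equation (7)

b0 = -beta / 2.0                           # vertex of the parabola, equation (10)
print("b0         =", b0)
print("<y> - a<x> =", y.mean() - a * x.mean())     # equation (11), same number
print("parabola form b0^2 + beta*b0 + gamma:", b0**2 + beta*b0 + gamma)
print("S(a, b0) evaluated directly:         ", S(a, b0))

# The vertex is indeed lower than neighbouring values of b.
for b in (b0 - 0.5, b0, b0 + 0.5):
    print(f"S({a}, {b:+.3f}) = {S(a, b):.4f}")
```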

The abscissa of the minimum is given by the mean of the two roots,

a_0 = \frac{a_+ + a_-}{2},   (16)

which is

a_0 = \frac{S_{xy}}{S_{x^2}}.   (17)

We observe that a_0 is the abscissa of a minimum because the coefficient of a^2 in equation (12) is positive,

S_{x^2} = \langle x^2 \rangle - \langle x \rangle^2 = \langle (x - \langle x \rangle)^2 \rangle > 0,   (18)

and a_0 is the angular coefficient of the fitted straight line. The linear coefficient is

b_0 = \langle y \rangle - a_0 \langle x \rangle.   (19)

Thus, the fitted straight line is

y(x) = a_0 x + b_0,   (20)

which can be rewritten as

y(x) = \langle y \rangle + a_0 (x - \langle x \rangle),   (21)

showing clearly that the mean point (⟨x⟩, ⟨y⟩) is a point of the fitted straight line.

3 Fitting a Straight Line to a Data Set with Uncertainties

Let {x_j, y_j ± s_j; j = 1, ..., N} be a set of data where the s_j are the uncertainties in the y_j. In order to fit a straight line y = ax + b to this set of data using the LSM, we must first define the weight corresponding to each datum. By convention the weight is chosen to be the inverse of the squared uncertainty, see Lyons (1990) and Bulmer (1979); that means p_j = 1/s_j^2. With this choice the weighted mean of the squared deviations is

S(a, b) = \frac{\sum_{j=1}^{N} p_j [y_j - (a x_j + b)]^2}{\sum_{j=1}^{N} p_j}.   (22)

In this way we give more weight to the data with small uncertainty and less weight to the data with large uncertainty. The expression of equation (22) can be rewritten as

S = b^2 + \beta b + \gamma,   (23)

where

\beta = 2(a \langle x \rangle - \langle y \rangle)   (24)

and

\gamma = \langle y^2 \rangle - 2a \langle xy \rangle + a^2 \langle x^2 \rangle,   (25)

with the mean values now given by the weighted means

\langle A \rangle = \frac{\sum_{j=1}^{N} p_j A_j}{\sum_{j=1}^{N} p_j}.   (26)

As in the preceding case, the minimum value of S for fixed a corresponds to

b_0(a) = \langle y \rangle - a \langle x \rangle.   (27)

Putting this result in S we obtain

S(a) = S_{x^2} a^2 - 2 S_{xy} a + S_{y^2},   (28)

with

S_{x^2} = \langle x^2 \rangle - \langle x \rangle^2, \quad S_{xy} = \langle xy \rangle - \langle x \rangle \langle y \rangle   (29)

and

S_{y^2} = \langle y^2 \rangle - \langle y \rangle^2.   (30)

The minimum of S(a) corresponds to a_0 = S_{xy}/S_{x^2}, now computed with the weighted means. The fitted straight line is y(x) = a_0 x + b_0, where b_0 = ⟨y⟩ - a_0 ⟨x⟩. Another expression, y(x) = ⟨y⟩ + a_0 (x - ⟨x⟩), shows that the weighted mean point (⟨x⟩, ⟨y⟩) is a point of the fitted straight line.

4 Uncertainties in the Parameters of the Fitted Straight Line

The parameters a_0 and b_0 of the fitted straight line contain uncertainties which come from the errors s_j in the measurements of the y_j. These uncertainties can be evaluated using the methods of error propagation, Lyons (1989, 1990) and Bulmer (1979). For our purposes we need to consider only a linear function of independent variables,

f = \sum_{j=1}^{N} C_j y_j,   (31)

where the C_j are constants. Let s_j (j = 1, ..., N) denote the errors in the variables y_j. The uncertainty in f is given by

\sigma_f^2 = \sum_{j=1}^{N} C_j^2 s_j^2.   (32)

Using an expression for ⟨y⟩ like that defined in equation (26), we can calculate the uncertainty in ⟨y⟩, which is given by

\sigma_{\langle y \rangle}^2 = \frac{\sum_{j=1}^{N} p_j^2 s_j^2}{\left( \sum_{j=1}^{N} p_j \right)^2}.   (33)
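Before completing the error propagation, it may help to see the fit itself written out as code. The following Python sketch is an illustration added here, not the authors' program, and the data values are hypothetical: it evaluates the weighted means of equation (26) with p_j = 1/s_j^2 and returns a_0 = S_xy/S_x^2 and b_0 = ⟨y⟩ - a_0⟨x⟩; omitting the uncertainties recovers the unweighted fit of section 2.

```python
# Least-squares fit of a straight line, following sections 2 and 3.
import numpy as np

def fit_line(x, y, s=None):
    """Return (a0, b0) for y = a0*x + b0.

    s are the uncertainties in y; if omitted, all points get equal weight
    (the unweighted case of section 2).
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    p = np.ones_like(y) if s is None else 1.0 / np.asarray(s, float) ** 2

    def wmean(A):                       # weighted mean, equation (26)
        return np.sum(p * A) / np.sum(p)

    Sx2 = wmean(x**2) - wmean(x)**2     # S_{x^2}
    Sxy = wmean(x*y) - wmean(x)*wmean(y)            # S_{xy}
    a0 = Sxy / Sx2                      # angular coefficient, equations (17)/(27)
    b0 = wmean(y) - a0 * wmean(x)       # linear coefficient, equation (19)
    return a0, b0

# Hypothetical data with uncertainties in y.
x = [0.5, 1.0, 1.5, 2.0, 2.5]
y = [1.1, 2.0, 2.8, 4.1, 5.2]
s = [0.1, 0.1, 0.2, 0.2, 0.4]

print("unweighted:", fit_line(x, y))
print("weighted:  ", fit_line(x, y, s))
```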

With the weights p_j = 1/s_j^2, equation (33) reduces to

\sigma_{\langle y \rangle}^2 = \frac{1}{\sum_{j=1}^{N} p_j}.   (34)

The angular coefficient of the fitted straight line is a_0 = S_{xy}/S_{x^2}. S_{x^2} is independent of the y_j, whereas S_{xy} can be written as

S_{xy} = \frac{\sum_{j=1}^{N} p_j (x_j - \langle x \rangle) y_j}{\sum_{j=1}^{N} p_j},   (35)

which is linear in the y_j. The uncertainty in a_0 is given by

\sigma_{a_0}^2 = \sum_{j=1}^{N} C_j^2 s_j^2 = \frac{\sigma_{\langle y \rangle}^2}{S_{x^2}},   (36)

where

C_j = \frac{p_j (x_j - \langle x \rangle)}{S_{x^2} \sum_{k=1}^{N} p_k}.   (37)

The fitted straight line can be written as y(x) = ⟨y⟩ + a_0 (x - ⟨x⟩), and the uncertainty in the fitting is given by

\sigma_{y(x)}^2 = \sigma_{\langle y \rangle}^2 + (x - \langle x \rangle)^2 \sigma_{a_0}^2.   (38)

This result shows that the minimum of the uncertainty of the fitted straight line corresponds to the position of the mean point, x = ⟨x⟩, y = ⟨y⟩, where \sigma_{y(x)} = \sigma_{\langle y \rangle}. For x = 0 we have y(0) = b_0, and equation (38) gives

\sigma_{b_0}^2 = \sigma_{\langle y \rangle}^2 + \langle x \rangle^2 \sigma_{a_0}^2,   (39)

that is,

\sigma_{b_0}^2 = \frac{\langle x^2 \rangle}{S_{x^2} \sum_{j=1}^{N} p_j}.   (40)

Therefore, equations (36) and (40) give the uncertainties in the parameters of the fitted straight line.

5 Uncertainties in the Fitting to Data with Unknown Errors

If the errors in the values of the y_j are unknown we need to estimate them. First we perform the fitting according to the prescription of section 2, which gives the straight line y(x) = a_0 x + b_0. The variance s_y^2 of this fitting is given by Lyons (1990):

s_y^2 = \frac{1}{N - 2} \sum_{j=1}^{N} [y_j - (a_0 x_j + b_0)]^2.   (41)

We assume that the errors in the y_j are all equal, s_j = s_y (j = 1, ..., N). Using the same procedure applied in section 4 to calculate the uncertainties in the fitting, we obtain

\sigma_{\langle y \rangle}^2 = \frac{s_y^2}{N},   (42)

\sigma_{a_0}^2 = \frac{s_y^2}{N S_{x^2}},   (43)

\sigma_{y(x)}^2 = \sigma_{\langle y \rangle}^2 + (x - \langle x \rangle)^2 \sigma_{a_0}^2.   (44)

Choosing x = 0 in the fitted straight line we obtain y(0) = b_0, and

\sigma_{b_0}^2 = \sigma_{\langle y \rangle}^2 + \langle x \rangle^2 \sigma_{a_0}^2.   (45)

6 Example - The Hubble Law

In order to illustrate the present formulation of the LSM we consider the problem of fitting a straight line to a set of data points obtained from measurements of the redshift (z) of the light coming from some galaxies and of their distances (D). In 1929 Edwin Hubble announced a linear relation between the redshifts and the distances,

cz = H_0 D,   (46)

where c is the speed of light and H_0 is the Hubble constant, valid for galaxies with small redshifts (z < 0.1). Recently a deviation from this linearity was observed in the light coming from galaxies with larger redshifts (z ≈ 1), Hogan (2003). To describe this effect we rewrite the above relation as a quadratic function of the redshift,

H D = cz(1 + \alpha z),   (47)

where H and α are free parameters to be fitted. This relation can be linearized by writing

\frac{H_0 D}{cz} = \frac{H_0}{H}(1 + \alpha z) = az + b,   (48)

where b = H_0/H. Comparing equation (48) with the straight-line equation y = az + b, we obtain y = H_0 D/(cz) and a = αb. Fitting this straight-line equation to a set of values of z and y, obtained from measurements of z and D, we determine the values of a and b, and from them we can calculate H and α.
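Before applying these formulations to the data, the recipes of sections 2 to 5 can be collected into a single routine. The sketch below is an illustration with hypothetical data, not the authors' implementation: for data with known uncertainties it uses equations (34), (36) and (39); for data with unknown uncertainties it estimates a common s_y from the residuals as in equation (41) and then applies equations (42), (43) and (45).

```python
# Parameters and uncertainties of the fitted line, following sections 2-5.
import numpy as np

def fit_line_with_errors(x, y, s=None):
    """Return (a0, b0, sigma_a0, sigma_b0) for y = a0*x + b0.

    If s (uncertainties in y) is given, weights p_j = 1/s_j^2 are used and
    the formulas of section 4 apply.  If s is None, the errors are assumed
    equal and are estimated from the residuals as in section 5.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    N = len(x)
    p = np.ones(N) if s is None else 1.0 / np.asarray(s, float) ** 2

    def wmean(A):                                # weighted mean, equation (26)
        return np.sum(p * A) / np.sum(p)

    Sx2 = wmean(x**2) - wmean(x)**2
    a0 = (wmean(x*y) - wmean(x)*wmean(y)) / Sx2
    b0 = wmean(y) - a0 * wmean(x)

    if s is None:
        # Unknown errors: estimate a common s_y from the scatter about the
        # line, equation (41); N-2 accounts for the two fitted parameters.
        sy2 = np.sum((y - (a0 * x + b0)) ** 2) / (N - 2)
        var_ymean = sy2 / N                      # sigma_<y>^2, equation (42)
    else:
        var_ymean = 1.0 / np.sum(p)              # sigma_<y>^2, equation (34)

    var_a0 = var_ymean / Sx2                     # equations (36)/(43)
    var_b0 = var_ymean + wmean(x) ** 2 * var_a0  # equations (39)/(45)
    return a0, b0, np.sqrt(var_a0), np.sqrt(var_b0)

# Hypothetical data.
x = [0.5, 1.0, 1.5, 2.0, 2.5]
y = [1.1, 2.0, 2.8, 4.1, 5.2]
s = [0.1, 0.1, 0.2, 0.2, 0.4]
print(fit_line_with_errors(x, y, s))   # known uncertainties (sections 3-4)
print(fit_line_with_errors(x, y))      # unknown uncertainties (sections 2 and 5)
```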

In Tonry (2003) there are several values of z, x = log(cz), h = log(H_0 D), and the uncertainties s_h. The values of y and their uncertainties are given by y = 10^{h-x} and s_y = (ln 10) y s_h, respectively. The nominal value of H_0 is 65 km s^-1 Mpc^-1. Applying the formulation of sections 3 and 4 to these data with known uncertainties, we obtain the straight-line parameters a = 0.80 ± 0.02 and b = 0.873 ± 0.002, which give H = 74.5 ± 0.2 km s^-1 Mpc^-1 and α = 0.92 ± 0.02. Using the formulation of sections 2 and 5 for data with unknown uncertainties, we obtain a = 0.58 ± 0.02 and b = 1.01 ± 0.05. These parameters give H = 64.7 ± 3.2 km s^-1 Mpc^-1 and α = 0.58 ± 0.02.

In Figure 2 we can observe the two straight lines fitted according to the two procedures. The continuous line is the straight line fitted with known uncertainties; the dashed line is the straight line fitted assuming unknown uncertainties. The significant difference between these two fittings shows the importance of the uncertainties in weighing the data to be fitted. In this figure the error bars are omitted in order to make the fitted lines more visible. An enlargement of Figure 2, including the error bars of the y variable for the interval 0 < z < 0.2, is shown in Figure 3. The non-vanishing value of α = 0.92 ± 0.02 clearly shows the deviation from linearity of the original Hubble relation. Physically, this effect shows that the nearest galaxies are accelerated in comparison with the farthest ones, which means that the expansion of the universe is accelerating, Hogan (2003).

[Figure 2. The vertical axis is the dimensionless parameter y and the horizontal axis is the redshift (z). The continuous line is the fit with known uncertainties; the dashed line is the fit assuming unknown uncertainties.]

[Figure 3. An enlargement of Figure 2 for 0 < z < 0.2, including the error bars of the y variable.]
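To close the example, the conversion from the fitted straight-line parameters back to the physical quantities is a one-line calculation, since equation (48) gives b = H_0/H and a = αb. The short sketch below (an illustration added here, using the parameter values quoted above for the weighted fit) reproduces the quoted central values of H and α.

```python
# Recovering H and alpha from the fitted parameters of equation (48).
H0 = 65.0                 # nominal Hubble constant, km s^-1 Mpc^-1
a, b = 0.80, 0.873        # weighted fit of sections 3-4, as quoted above

H = H0 / b                # since b = H0/H
alpha = a / b             # since a = alpha * b
print(f"H     = {H:.1f} km s^-1 Mpc^-1")   # ~74.5, as quoted
print(f"alpha = {alpha:.2f}")              # ~0.92, as quoted
```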

References

BULMER, M. G. Principles of statistics. New York: Dover Publications, 1979.

HOGAN, C. J. et al. The once and future cosmos. Scientific American, v. 12, n. 2, p. 26-33, Oct. 2003. Special ed.

LYONS, L. Statistics for nuclear and particle physics. Cambridge: Cambridge University Press, 1989.

LYONS, L. A practical guide to data analysis for physical science students. Cambridge: Cambridge University Press, 1990.

TONRY, J. L. et al. Cosmological results from high-z supernovae. Astrophysical Journal, v. 594, pt. 1, p. 1-24, Sept. 2003.

Antonio Carlos Baptista Antunes*
Instituto de Física, Universidade Federal do Rio de Janeiro (UFRJ). e-mail: <antunes@if.ufrj.br>

Weuber da Silva Carvalho
Departamento de Pesquisas, Instituto Brasileiro de Geografia e Estatística (IBGE). e-mail: <weuber@ibge.gov.br>

* Mailing address: Rua Almirante Tamandaré, 59 apto 601 - Bairro Flamengo - CEP 22.210-060 - Rio de Janeiro - RJ.
