
IMSL STAT/LIBRARY Chapter 2: Regression

      DATA (X(1,J),J=1,NIND)/1.0, -2.0, 0.0/, Y(1)/-3.0/
      DATA (X(2,J),J=1,NIND)/1.0, -1.0, 2.0/, Y(2)/ 1.0/
      DATA (X(3,J),J=1,NIND)/1.0, 2.0, 5.0/, Y(3)/ 2.0/
      DATA (X(4,J),J=1,NIND)/1.0, 7.0, 3.0/, Y(4)/ 6.0/
!
      DO 10 I=1, NOBS
!                                 Assign weights
         W(I) = 1.0/I**2
!                                 Store square roots of weights
         W(I) = SQRT(W(I))
   10 CONTINUE
!                                 Transform regressors
      DO 20 J=1, NIND
         CALL SHPROD (NOBS, W, 1, X(:,J), 1, X(:,J), 1)
   20 CONTINUE
!                                 Transform response
      CALL SHPROD (NOBS, W, 1, Y, 1, Y, 1)
!
      CALL RLSE (Y, X, B, INTCEP=INTCEP, SST=SST, SSE=SSE)
!
      CALL WRRRN ('B', B)
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*)
      WRITE (NOUT,99999) 'SST = ', SST, ' SSE = ', SSE
99999 FORMAT (A7, F7.2, A7, F7.2)
      END

Output

       B
 1  -1.431
 2   0.658
 3   0.748

SST = 11.94 SSE = 1.01
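The transformation in the example above rests on a standard identity, sketched here for reference. The symbols $w_i$ (weight for observation $i$), $x_i$ (the $i$-th row of X as a column vector), and $\beta$ (coefficient vector) are notation introduced for this note and do not appear in the program.

$$\sum_{i=1}^{n} w_i \left( y_i - x_i^T \beta \right)^2 = \sum_{i=1}^{n} \left( \sqrt{w_i}\, y_i - \left( \sqrt{w_i}\, x_i \right)^T \beta \right)^2$$

Minimizing the weighted criterion on the left is therefore the same as an ordinary least-squares fit to the rows of X and the elements of Y after each has been multiplied by $\sqrt{w_i}$, which is what the two SHPROD steps do. Every column of X is scaled, including the first column, which the DATA statements set to 1.0.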

XYMEAN — Vector of length NIND + NDEP containing variable means. (Input, if INTCEP = 1)

The first NIND elements of XYMEAN are for the independent variables in the same order in which they appear in COV. The last NDEP elements of XYMEAN are for the dependent variables in the same order in which they appear in COV. If weighting is desired, XYMEAN contains weighted means. If INTCEP = 0, XYMEAN is not referenced and can be a vector of length one.

SUMWTF — Sum of products of weights with frequencies. (Input, if INTCEP = 1) In the ordinary case when weights and frequencies are all one, SUMWTF equals the number of observations.

B — INTCEP + NIND by NDEP matrix containing a least-squares solution B̂ for the regression coefficients. (Output)

Column j is for the j-th dependent variable. If INTCEP = 1, row 1 is for the intercept.

Row INTCEP + i is for the i-th independent variable. Elements of the appropriate row(s) of B̂ are set to 0.0 if linear dependence of the regressors is declared.

Optional Arguments

INTCEP — Intercept option. (Input)

Default: INTCEP = 1.

INTCEP   Action
0        An intercept is not in the model.
1        An intercept is in the model.

NIND — Number of independent (explanatory) variables. (Input)

Default: NIND = size (B,1) - INTCEP.

NDEP — Number of dependent (response) variables. (Input) Default: NDEP = size (B,2).

LDCOV — Leading dimension of COV exactly as specified in the dimension statement in the calling program. (Input)

Default: LDCOV = size (COV,1).

TOL — Tolerance used in determining linear dependence. (Input)

For RCOV, TOL = 100 * AMACH(4) is a common choice. See documentation for routine AMACH (Reference Material).

Default: TOL = 1.e-5 for single precision and 2.d-14 for double precision.

LDB — Leading dimension of B exactly as specified in the dimension statement in the calling program. (Input)

Default: LDB = size (B,1).

R — INTCEP + NIND by INTCEP + NIND upper triangular matrix containing the R matrix from a Cholesky factorization $R^T R$ of the matrix of sums of squares and crossproducts of the regressors. (Output)

Elements of the appropriate row(s) of R are set to 0.0 if linear dependence of the regressors is declared.

LDR — Leading dimension of R exactly as specified in the dimension statement in the calling program. (Input)

Default: LDR = size (R,1).

IRANK — Rank of R. (Output)

IRANK less than INTCEP + NIND indicates that linear dependence of the regressors was declared. In this case, some rows of B̂ are set to zero.

SCPE — NDEP by NDEP matrix containing the error (residual) sums of squares and crossproducts. (Output)

LDSCPE — Leading dimension of SCPE exactly as specified in the dimension statement in the calling program. (Input)

Default: LDSCPE = size (SCPE,1).

FORTRAN 90 Interface

Generic: CALL RCOV (COV, XYMEAN, SUMWTF, B [,…])

Specific: The specific interface names are S_RCOV and D_RCOV.

FORTRAN 77 Interface

Single: CALL RCOV (INTCEP, NIND, NDEP, COV, LDCOV, XYMEAN, SUMWTF, TOL, B, LDB, R, LDR, IRANK, SCPE, LDSCPE)

Double: The double precision name is DRCOV.

Example

This example uses a data set from Draper and Smith (1981, pages 629-630). Routine GDATA (see Chapter 19, “Utilities”) puts this data set into the matrix X. The first four columns are for the independent variables, and the last column is for the dependent variable. Routine CORVC in Chapter 3, “Correlation,” is invoked to compute the corrected sum of squares and crossproducts matrix. Then, RCOV is invoked to compute the regression coefficient estimates, the R matrix, and the sum of squares for error.

      USE RCOV_INT
      USE GDATA_INT
      USE CORVC_INT
      USE UMACH_INT
      USE WRRRN_INT
      PARAMETER (LDX=13, NDX=5, NIND=4, NDEP=1, LDCOV=NIND+NDEP, &
                 LDSCPE=NDEP)
      PARAMETER (INTCEP=1, LDB=INTCEP+NIND, LDR=INTCEP+NIND)
      REAL XYMEAN(NIND+NDEP)
      REAL X(LDX,NDX), B(LDB,NDEP), R(LDR,INTCEP+NIND)
      REAL COV(LDCOV,NIND+NDEP), SCPE(LDSCPE,NDEP), SUMWTF
      INTEGER INCD(1,1), ICOPT
!                                 Get the data set
      CALL GDATA (5, X, NROW, NVAR)
!                                 Corrected sums of squares and
!                                 crossproducts, means, sum of weights
      ICOPT = 1
      CALL CORVC (NVAR, X, COV, ICOPT=ICOPT, XMEAN=XYMEAN, SUMWT=SUMWTF)
!                                 Fit the regression from COV
      CALL RCOV (COV, XYMEAN, SUMWTF, B, R=R, IRANK=IRANK, &
                 SCPE=SCPE)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) 'IRANK = ', IRANK, ' SCPE(1,1) = ', SCPE(1,1)
      CALL WRRRN ('B', B, 1, INTCEP+NIND, 1)
      CALL WRRRN ('R', R)
      END

Output

IRANK = 5 SCPE(1,1) = 47.8638

                    B
      1       2       3       4       5
  62.40    1.55    0.51    0.10   -0.14

                     R
        1       2       3       4       5
1     3.6    26.9   173.6    42.4   108.2
2     0.0    20.4    12.3   -18.3   -14.2
3     0.0     0.0    52.5     1.1   -54.6
4     0.0     0.0     0.0    12.5   -12.9
5     0.0     0.0     0.0     0.0     3.4

Comments

1. Informational error

   Type  Code
   3     1     COV is not a variance-covariance matrix within the tolerance defined by TOL.

Description

Routine RCOV fits a multivariate linear regression model given the variance-covariance matrix (or sum of squares and crossproducts matrix) for the independent and dependent variables.

Typically, an intercept is to be in the model, and the corrected sum of squares and crossproducts matrix is input for COV. Routine CORVC in Chapter 3, “Correlation,” can be invoked to compute the corrected sum of squares and crossproducts matrix. Routine RORDM in Chapter 19, “Utilities,” can reorder this matrix, if required. If an intercept is not to be included in the model, a raw (uncorrected) sum of squares and crossproducts matrix must be input for COV, and SUMWTF and XYMEAN are not used in the computations. Routine MXTXF (IMSL MATH/LIBRARY) can be used to compute the raw sum of squares and crossproducts matrix.

Routine RCOV is based on a Cholesky factorization of COV. Let k (input in NIND) be the number of independent variables, and d (input in SUMWTF) the denominator used in computing the x means (input in the first k locations of XYMEAN). The matrix R is formed by computing a Cholesky factorization of the first k rows and columns of COV. If INTCEP equals one, the k rows from this factorization are appended to the initial row

$$\sqrt{d},\ \sqrt{d}\,\bar{x}_1,\ \ldots,\ \sqrt{d}\,\bar{x}_k$$

The resulting R matrix is the Cholesky factor of the $X^T X$ matrix where X contains a column of ones as its first column and the independent variable settings as its remaining k columns.

Maindonald (1984, Chapter 3) discusses the Cholesky factorization as it applies to regression computations.
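The claim that the augmented matrix is the Cholesky factor of $X^T X$ can be checked with a short block-matrix calculation. This is a sketch under stated assumptions: COV holds the corrected sums of squares and crossproducts, $C$ denotes its leading k by k block with Cholesky factor $R_C$ (so $C = R_C^T R_C$), and $\bar{x}$ is the vector of the k regressor means. The symbols $C$, $R_C$, and $\bar{x}$ are notation introduced here, not argument names.

$$X^T X = \begin{pmatrix} d & d\,\bar{x}^T \\ d\,\bar{x} & C + d\,\bar{x}\bar{x}^T \end{pmatrix}, \qquad R = \begin{pmatrix} \sqrt{d} & \sqrt{d}\,\bar{x}^T \\ 0 & R_C \end{pmatrix}$$

$$R^T R = \begin{pmatrix} d & d\,\bar{x}^T \\ d\,\bar{x} & d\,\bar{x}\bar{x}^T + R_C^T R_C \end{pmatrix} = X^T X$$

In the example output, R(1,1) = 3.6, which is consistent with $\sqrt{13} \approx 3.61$ if the data set has 13 observations with unit weights and frequencies (LDX = 13 in the example program).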

Routine RCOV checks sequentially for linearly dependent regressors. Linear dependence of the regressors is declared if

$$1 - R^2_{i \cdot 1,2,\ldots,i-1}$$

is less than or equal to TOL. Here, $R_{i \cdot 1,2,\ldots,i-1}$ is the multiple correlation coefficient of the i-th independent variable with the first i - 1 independent variables. If no intercept is in the model (INTCEP = 0), the “multiple correlation” coefficient is computed without adjusting for the mean.

When a dependence is declared, elements of the corresponding rows of R and B are set to zero.

Maindonald (1984, Sections 3.3, 3.4, and 3.9) discusses these implementation details of the Cholesky factorization in regression problems.
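The sequential check described above can be illustrated with a small, self-contained Fortran module. This is a minimal sketch and not the IMSL implementation of RCOV. It assumes a symmetric k by k matrix of sums of squares and crossproducts for the regressors only, and it relies on the standard identity that the squared i-th diagonal element of the Cholesky factor, divided by the i-th diagonal element of the input matrix, equals one minus the squared multiple correlation of variable i with variables 1 through i - 1. The names chol_dep_sketch and chol_dep, and the argument names, are hypothetical.

```fortran
! Illustrative sketch only: this is not the IMSL implementation of RCOV.
module chol_dep_sketch
  implicit none
contains
  ! Upper triangular factor R (C = R^T R) of a symmetric, nonnegative
  ! definite k-by-k matrix C, with a sequential dependence check.
  ! Row i of R is left at zero when R(i,i)**2 / C(i,i) <= tol.
  subroutine chol_dep(c, tol, r, irank)
    real, intent(in)     :: c(:,:)   ! sums of squares and crossproducts
    real, intent(in)     :: tol      ! dependence tolerance
    real, intent(out)    :: r(:,:)   ! upper triangular factor
    integer, intent(out) :: irank    ! rows not declared dependent
    integer :: i, j, k
    real    :: s

    k = size(c, 1)
    r = 0.0
    irank = 0
    do i = 1, k
      ! Residual sum of squares of variable i given variables 1..i-1
      s = c(i,i) - sum(r(1:i-1,i)**2)
      ! Same test as 1 - R**2 <= tol, written without a division
      if (s <= tol*c(i,i)) then
        r(i,:) = 0.0                 ! dependence declared: zero row
        cycle
      end if
      irank = irank + 1
      r(i,i) = sqrt(s)
      do j = i + 1, k
        r(i,j) = (c(i,j) - sum(r(1:i-1,i)*r(1:i-1,j))) / r(i,i)
      end do
    end do
  end subroutine chol_dep
end module chol_dep_sketch
```

As a usage illustration, take a 3 by 3 input whose third variable is the sum of the first two: rows (2, 1, 3), (1, 2, 3), and (3, 3, 6). With tol = 1.0e-5, the sketch should return irank = 2 and a zero third row in r, mirroring how RCOV sets the rows of R for dependent regressors to zero.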
