
(1)

Quantifying The Roles of Uncertainty and Preferences In Understanding Schooling Choices and Earnings Inequality

Flavio Cunha, James J. Heckman and Salvador Navarro

Escola de Pós-Graduação em Economia, Fundação Getulio Vargas

(2)

Goal of this research

(1) For decades, economists and others have attempted to distinguish "luck" or chance from other factors in determining earnings.

(2) A crude description of the decomposition: look at the Mincer earnings equation

(∗)  ln y_it = x_it β + θ_i + ε_it

y_it = earnings of person i at time t
x_it = observables
θ_i = person-specific fixed effect
ε_it = G(L) ν_t, ν_t iid, mean zero
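To fix ideas, here is a minimal simulation sketch of (∗) (Python; all parameter values are illustrative assumptions, not estimates): a person fixed effect plus an MA(1) transitory term, and the purely statistical split of the residual variance that such decompositions deliver.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 50_000, 10            # persons, years (illustrative sizes)
beta, g1 = 0.08, 0.4         # return to observables; MA(1) coefficient in G(L) = 1 + g1*L

x = rng.normal(12, 2, size=(N, T))         # observables x_it
theta = rng.normal(0, 0.30, size=(N, 1))   # person-specific fixed effect theta_i
nu = rng.normal(0, 0.25, size=(N, T + 1))  # iid mean-zero innovations nu_t
eps = nu[:, 1:] + g1 * nu[:, :-1]          # eps_it = G(L) nu_t, here an MA(1)

log_y = x * beta + theta + eps             # (*): ln y_it = x_it*beta + theta_i + eps_it

resid = log_y - x * beta                   # strip out observables
print("Var(theta) :", theta.var())         # 'permanent' component
print("Var(eps)   :", eps.var())           # 'transitory' component
print("Var(resid) :", resid.var())         # approximately the sum of the two
```

Nothing in this decomposition says which part agents could forecast; that is the point of the slides that follow.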

(3)

(7) Example: MaCurdy, Moffitt and Gottschalk, and Abowd and Card all estimate earnings models in which versions of (∗) are fit.

(8) These statistical decompositions are interesting, but tell us little about which components are unforecastable by agents.

(9) Thus Gottschalk and Moffitt show rising variances in (∗) over the 1980s. But this says nothing about increasing uncertainty, although some (e.g., Ljungqvist and Sargent) interpret their evidence this way.

(10) The goal of this paper is to identify these components. We need a theory and a link of (∗) to behavioral theory, to extract forecastable from unforecastable components.

(11) Precedent: work by Flavin using the permanent income hypothesis, which links innovations in consumption to innovations in income.

(12) This paper uses schooling.

(a) What future components of income are forecastable at the date schooling decisions are being made?

(4)

(c) Can uncertainty, and increasing uncertainty, explain sluggish responses to the increase in the returns to schooling over the past 30 years?

(d) What is the role of uncertainty, and what is the role of nonpecuniary factors?

Definitions:

(a) Heterogeneity: all sources of variability among persons,

(i) observable, (ii) unobservable.

(b) Uncertainty: unforecastable components of heterogeneity.

Goal: Estimate components in agent information sets at age t. See the evolution over time.

(5)

By-products: Unite and Extend Three Literatures

1. Treatment Effect and Policy Evaluation Literature (Derive Treatment Effects from a Well Developed Economic Model)

2. Inequality Measurement Literature (Lift anonymity postulates)

3. Earnings Dynamics Literature

(6)

1. Treatment Effect Literature

a. Traditional focus is on means (different means emphasized in different subfields)

b. Our focus is on distributions of treatment effects

c. We link the treatment effect literature to the choice-theoretic structural econometric literature using robust semiparametric methods

d. We identify and estimate both "objective" outcomes of policies and "subjective" evaluations of policies as perceived by the persons who choose (self-select into) treatments.

(7)

2. Inequality Measurement Literature

a. Traditional focus is on overall measures of inequality (e.g., Gini coefficients; variance) and subgroup decompositions (emphasis on statistical accounting).

b. Link to policy analysis is obscure. Decompositions do not correspond to well-defined policies.

c. Anonymity postulate: makes a virtue of a cross-sectional necessity by comparing distributions using second order stochastic dominance.

(8)

e. Our approach allows us to make such comparisons and informs us about which groups are affected and which are not, and how policy shifts people within an aggregate distribution.

f. It allows us to examine how specific policies affect choices (e.g., schooling) and how this affects distributions and movements across distributions.

(9)

3. Earnings Dynamics Literature

(a) Does not correct for self-selection into schooling.

(b) Silent on components of uncertainty.

(10)

Basic Framework Used in Our Analysis: Roy (1951) Economy

S = 1 college; S = 0 high school.

Two potential outcomes: (Y0, Y1) (present values of earnings).

Decision rule governing sectoral choices:

S = 1 if I = Y1 − Y0 − C ≥ 0; S = 0 otherwise.
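A minimal simulation sketch of this economy (Python; the normal distributions and parameter values are illustrative assumptions) shows the decision rule in action and why self-selection matters: mean gains among those who choose college differ from mean gains in the population.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# Potential earnings (present values) and cost, with correlated unobservables
mu0, mu1, muC = 10.0, 11.0, 0.5
U = rng.multivariate_normal([0, 0, 0],
                            [[1.0, 0.5, 0.0],
                             [0.5, 1.0, 0.0],
                             [0.0, 0.0, 0.5]], size=N)
Y0, Y1, C = mu0 + U[:, 0], mu1 + U[:, 1], muC + U[:, 2]

I = Y1 - Y0 - C                 # net gain from college
S = (I >= 0).astype(int)        # decision rule: S = 1 iff I >= 0
Y = S * Y1 + (1 - S) * Y0       # observed earnings

print("Pr(S=1):            ", S.mean())
print("ATE E[Y1-Y0]:       ", (Y1 - Y0).mean())
print("TT  E[Y1-Y0 | S=1]: ", (Y1 - Y0)[S == 1].mean())   # selection on gains
print("Pr(Y1 >= Y0):       ", (Y1 >= Y0).mean())
```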

(11)

Decompose into means:

Y1 = µ1 + U1, E(U1) = 0
Y0 = µ0 + U0, E(U0) = 0
C = µC + UC.

Implicitly condition on X variables.

(12)

Two Kinds of Policies

(a) Those that affect (Y0, Y1, C) (GE policies)
(b) Those that affect only C, and hence S (treatment effect policies)

General Case

Under policy A: (Y0^A, Y1^A, C^A)

(13)

Treatment Effect Case:

Policy A: (Y0, Y1, C^A) and Policy B: (Y0, Y1, C^B)

(14)

Case I: Y1 − Y0 is the same for everyone.

Case II: Y1 − Y0 differs among people with the same X, but Pr(S = 1 | Y1 − Y0) = Pr(S = 1).

Case III: Y1 − Y0 differs among people, and people act on these differences.

MTE: E(Y1 − Y0 | I = 0)

(15)

Why Is It Useful to Estimate Joint Distributions of Counterfactuals?

1. The proportion of people who benefit (in terms of gross gains) from participation in the program:

Pr(Y1 ≥ Y0 | X).

2. Gains to participants at selected levels of the no-treatment state,

F(Y1 − Y0 | Y0 = y0, X),

or of the treatment state,

F(Y1 − Y0 | Y1 = y1, X).

3. The option value of social programs.

(16)

E(Yi − Yj | X) for all i, j and E(Yi − Yj | X, D = 1) for all i, j

(17)

1 Estimating Distributions of Counterfactual Outcomes

Two counterfactual states (Y0, Y1), with (Y0, Y1) not independent of X.

S = 1 if Y1 is observed; S = 0 otherwise (so Y0 is observed).

Y = S Y1 + (1 − S) Y0.

Allow for the possibility of an instrument Z:

(Y0, Y1) ⊥⊥ Z | X, and Z shifts Pr(S = 1 | Z, X). But an instrument is not essential.

(18)

Data Available to the Analyst:

We observe Y0 if S = 0 and Y1 if S = 1, but we never observe (Y0, Y1) for anyone.

(19)

Three separate problems.

Selection Problem:

We know F(Y1 | S = 1, X) and F(Y0 | S = 0, X).

If solved, we get F(Y1 | X) and F(Y0 | X), but not the joint distribution F(Y0, Y1 | X).

(20)

Second Problem: Recovering Joint Distributions

Traditional approach: conditional on X, Y1 and Y0 are deterministically related:

Y1 ≡ Y0 + ∆(X)    (1)

("Common Effect" model). ∆(X) is obtained from the means of F(Y0 | S = 0, X) and F(Y1 | S = 1, X).

(21)

Generalization

Quantiles perfectly ranked:

Y1 = F_{1,X}^{-1}(F_{0,X}(Y0)),  where F_{1,X} = F(y1 | X) and F_{0,X} = F(y0 | X).

Perfectly inversely ranked:

Y1 = F_{1,X}^{-1}(1 − F_{0,X}(Y0)).
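The perfect-ranking construction is easy to implement. A sketch (Python; the lognormal marginals are an illustrative assumption) imputes each person's Y1 from their rank in the Y0 distribution, showing that the same marginals support very different joint distributions:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100_000

# Illustrative marginals (conditioning on X suppressed)
Y0 = rng.lognormal(mean=10.0, sigma=0.4, size=N)
Y1_marginal = rng.lognormal(mean=10.2, sigma=0.5, size=N)  # draws pinning down F_1

# Empirical CDF rank of each Y0, then read off the same quantile of F_1
ranks = Y0.argsort().argsort() / (N - 1)            # F_0(Y0) in [0, 1]
Y1_ranked = np.quantile(Y1_marginal, ranks)         # Y1 = F_1^{-1}(F_0(Y0))
Y1_inverse = np.quantile(Y1_marginal, 1 - ranks)    # Y1 = F_1^{-1}(1 - F_0(Y0))

# Same marginals, very different joint distributions of gains:
print("corr under perfect ranking:", np.corrcoef(Y0, Y1_ranked)[0, 1])
print("corr under inverse ranking:", np.corrcoef(Y0, Y1_inverse)[0, 1])
print("Var(Y1-Y0), ranked vs inverse:",
      (Y1_ranked - Y0).var(), (Y1_inverse - Y0).var())
```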

(22)

To see why, consider the following Markov kernels M(y1, y0 | X) and M̃(y0, y1 | X):

F(y1 | X) = ∫ M(y1, y0 | X) dF_0(y0 | X)

F(y0 | X) = ∫ M̃(y0, y1 | X) dF_1(y1 | X)

so that

F(y1 | X) = ∫ M(y1, y0 | X) M̃(y0, y1 | X) dF(y1 | X).

(23)

Approach Using Choice Theory

S = 1(µ_s(Z) ≥ e_s),  Z ⊥⊥ e_s    (2)

Y1 = µ1(X) + U1, E(U1) = 0
Y0 = µ0(X) + U0, E(U0) = 0
(U1, U0) ⊥⊥ (X, Z),

where S = 1(Y1 > Y0) in the Roy model.

(24)

For more general decision rules the method breaks down. Heckman (1990) and Heckman and Smith (1998) assume:

(Z, X) ⊥⊥ (U0, U1, e_s);

(i) µ_s(Z) is a nontrivial function of Z conditional on X; (ii) full support assumptions on µ1(X), µ0(X) and µ_s(Z).

This gives identification of F(U0, e_s), F(U1, e_s), and µ1(X), µ0(X), µ_s(Z).

(25)

An Alternative: Factor Models

Aakvik, Heckman and Vytlacil (1999, 2003):

U0 = α0 θ + ε0, U1 = α1 θ + ε1, e_s = α_s θ + ε_s,
θ ⊥⊥ (ε0, ε1, ε_s), ε0 ⊥⊥ ε1 ⊥⊥ ε_s.

Can identify α0, α1, α_s, F(U0, e_s) and F(U1, e_s):

COV(U0, e_s) = α0 α_s σ²_θ, COV(U1, e_s) = α1 α_s σ²_θ.
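A quick numerical check of these covariance restrictions (Python; the loadings and variances are arbitrary illustrative values, not estimates):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 500_000
a0, a1, a_s, var_theta = 0.8, 1.2, -0.6, 1.5   # illustrative loadings, factor variance

theta = rng.normal(0, np.sqrt(var_theta), N)
e0, e1, es = rng.normal(0, 1, (3, N))          # mutually independent uniquenesses

U0 = a0 * theta + e0
U1 = a1 * theta + e1
eS = a_s * theta + es

print("COV(U0,eS):", np.cov(U0, eS)[0, 1], " vs a0*a_s*var:", a0 * a_s * var_theta)
print("COV(U1,eS):", np.cov(U1, eS)[0, 1], " vs a1*a_s*var:", a1 * a_s * var_theta)
```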

(26)

Third Problem: What Is in the Information Set of Agents at the Date Schooling Decisions Are Being Made?

Let I_s be the information set at the date of schooling decisions. What is Pr(S = 1 | I_s)? Is I_s = (Y1, …)?

(27)

2 Counterfactuals for the Multiple Outcome Case

State s, age a, for person ω ∈ Ω:

Y_{s,a}(ω), s = 1, ..., S̄, a = 1, ..., Ā    (3)

(S̄ states and Ā ages).

The ceteris paribus effect (or individual treatment effect) of state s at age a versus state s′ at age a′′ is:

∆((s, a), (s′, a′′)) = Y_{s,a}(ω) − Y_{s′,a′′}(ω).    (4)

Average Treatment Effect:

ATE((s, a), (s′, a), x) = E(Y_{s,a} − Y_{s′,a} | X = x)

(28)

Lifetime utility: V_s(ω), s = 1, ..., S̄.

s̃ = argmax_s {V_s(ω)}_{s=1}^{S̄}.    (5)

D_s = 1 if V_s = max{V_1, ..., V_S̄}, and ∑_{s=1}^{S̄} D_s = 1.

(29)

Marginal treatment effect:

limMTE(a, v̄_{s,s′}) = E(Y_{s,a} − Y_{s′,a} | V_s = V_{s′} = v̄_{s,s′} ≥ V_j, j ≠ s, s′),    (6)

s′ = 1, ..., S̄; s′ ≠ s.

MTE_s(a, {v̄_{s,s′}}_{s′=1, s′≠s}^{S̄}) = ∑_{s′=1, s′≠s}^{S̄} limMTE(a, v̄_{s,s′}) [ f(V_s, V_{s′} | V_s = V_{s′} = v̄_{s,s′} ≥ V_j, j ≠ s, s′) / ψ(a, {v̄_{s,s′}}_{s′=1, s′≠s}^{S̄}) ]    (7)

where

ψ(a, {v̄_{s,s′}}_{s′=1, s′≠s}^{S̄}) = ∑_{s′=1, s′≠s}^{S̄} f(V_s, V_{s′} | V_s = V_{s′} = v̄_{s,s′} ≥ V_j, j ≠ s, s′).

(30)

3 Factor Structure Models

V_s = µ_s(Z) − e_s, s = 1, ..., S̄.    (8)

Example: V_s = Z′β_s − e_s, s = 1, ..., S̄.

e_s = α_s′ θ + ε_s    (9)

θ is an L × 1 vector with θ_ℓ ⊥⊥ θ_{ℓ′}, ℓ ≠ ℓ′;

θ ⊥⊥ ε_s, s = 1, ..., S̄;

ε_s ⊥⊥ ε_{s′}, ∀ s, s′ = 1, ..., S̄, s ≠ s′.

(31)

Potential outcomes Y*_{s,a}:

Y*_{s,a} = µ_{s,a}(X) + α_{s,a}′ θ + ε_{s,a},  E(ε_{s,a}) = 0    (11)

θ ⊥⊥ ε_{s,a}, s = 1, ..., S̄, a = 1, ..., Ā;    (12)

ε_{s,a} ⊥⊥ ε_{s′,a′′}, ∀ s ≠ s′, ∀ a, a′′;    (13)

ε_{s,a} ⊥⊥ ε_{s′}, ∀ s′, s = 1, ..., S̄; a = 1, ..., Ā.    (14)

(32)

X_{s,a} ⊥⊥ (θ, ε_{s′,a′′}), ∀ s, a, s′, a′′; s = 1, ..., S̄; a = 1, ..., Ā.    (15)

Y*_{s,a} may be vector valued. We treat Y*_{s,a} in (11) as a latent variable.

(33)

Relationship to Matching

Matching assumes:

Y_{s,a} ⊥⊥ D_s | X = x, Z = z, Θ = θ, for all s;

V_s ⊥⊥ D_s | X = x, Z = z, Θ = θ, for all s.

ATE:

E(Y_{s,a} − Y_{s′,a} | X, θ) = µ_{s,a}(X) − µ_{s′,a}(X) + (α_{s,a} − α_{s′,a})′ θ.

(34)

Measurement System

M_1 = µ_1(X) + β_{11} θ_1 + ... + β_{1K} θ_K + ε_1
⋮    (16)
M_L = µ_L(X) + β_{L1} θ_1 + ... + β_{LK} θ_K + ε_L

ε_M = (ε_1, ..., ε_L), E(ε_M) = 0    (17)

θ = (θ_1, ..., θ_K) ⊥⊥ (ε_1, ..., ε_L), θ ⊥⊥ ε_s, θ ⊥⊥ ε_{s,a}, ε_{s,a} ⊥⊥ ε_M ⊥⊥ ε_s, ε_i ⊥⊥ ε_j, i ≠ j.

(35)
(36)

Choice Equations

V(s) = µ_s(Z, X) + γ_s′ θ + ε_s    (18)

V = {V(s)}_{s=1}^{S̄}

(37)

4 Identification of Semiparametric Factor Models

G = µ + Λθ + ε    (19)

G: L × 1, µ: L × 1, θ: K × 1, ε: L × 1, Λ: L × K; θ ⊥⊥ ε; ε_i ⊥⊥ ε_j, i ≠ j, i, j = 1, ..., L.

Even if θ_i ⊥⊥ θ_j, the model is underidentified:

COV(G) = Λ Σ_θ Λ′ + D_ε    (20)

(38)

We require that

L(L − 1)/2 ≥ (L × K − K) + K = LK,

where L(L − 1)/2 is the number of off-diagonal covariance elements, (L × K − K) the number of unrestricted elements of Λ, and K the number of variances of θ. This is a necessary condition for identification.

(39)

Λ =
[ 1      0      0      0    ...  0    ]
[ λ_21   0      0      0    ...  0    ]
[ λ_31   1      0      0    ...  0    ]
[ λ_41   λ_42   0      0    ...  0    ]
[ λ_51   λ_52   1      0    ...  0    ]
[ λ_61   λ_62   λ_63   0    ...  0    ]
[ λ_71   λ_72   λ_73   1    ...  0    ]
[ λ_81   λ_82   λ_83   λ_84 ...  0    ]
[ ...    ...    ...    ...  ...  ...  ]
[ λ_L,1  λ_L,2  λ_L,3  ...  ...  λ_L,K ]    (21)

(40)

COV(g_j, g_l) = λ_{j1} λ_{l1} σ²_{θ_1}, l = 1, 2; j = 1, ..., L; j ≠ l.

In particular (with λ_{11} = 1),

COV(g_1, g_ℓ) = λ_{ℓ1} σ²_{θ_1}

COV(g_2, g_ℓ) = λ_{ℓ1} λ_{21} σ²_{θ_1}.

Assuming λ_{ℓ1} ≠ 0, we obtain

COV(g_2, g_ℓ) / COV(g_1, g_ℓ) = λ_{21}.

Hence, from COV(g_1, g_2) = λ_{21} σ²_{θ_1}, we obtain σ²_{θ_1}, and hence λ_{ℓ1}, ℓ = 1, ..., L. We can proceed to the next set of two measurements and identify the second factor from

COV(g_l, g_j) = λ_{l1} λ_{j1} σ²_{θ_1} + λ_{l2} λ_{j2} σ²_{θ_2}.
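The ratio argument maps directly into a method-of-moments computation. A sketch (Python; a two-factor, six-measurement system with illustrative loadings) recovers λ_21, σ²_{θ_1} and the first column of Λ from sample covariances exactly as above:

```python
import numpy as np

rng = np.random.default_rng(4)
N, L = 1_000_000, 6
var_t1, var_t2 = 1.3, 0.7

# True loadings: rows 1-2 load only on theta1 (lambda_11 = 1 normalization);
# rows 3+ load on both factors.
Lam = np.array([[1.0, 0.0],
                [0.9, 0.0],
                [0.7, 1.0],
                [1.1, 0.8],
                [0.5, 1.4],
                [1.3, 0.6]])
theta = rng.normal(0, [np.sqrt(var_t1), np.sqrt(var_t2)], (N, 2))
G = theta @ Lam.T + rng.normal(0, 0.5, (N, L))   # G = Lam theta + eps

C = np.cov(G, rowvar=False)

lam21_hat = C[1, 2] / C[0, 2]        # COV(g2,g3)/COV(g1,g3) = lambda_21
var_t1_hat = C[0, 1] / lam21_hat     # COV(g1,g2) = lambda_21 * var(theta1)
lam_1 = C[0, :] / var_t1_hat         # first-column loadings from COV(g1, g_l)
lam_1[0] = 1.0                       # diagonal includes uniqueness; impose norm.

print("lambda_21:", lam21_hat, " var(theta1):", var_t1_hat)
print("first-column loadings:", lam_1.round(3), " truth:", Lam[:, 0])
```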

(41)

Our System:

G_s = (M, Y_s, D_s). The components of G_s are of two types: (a) continuous variables and (b) discrete or censored random variables, including binary strings. Write M^c, Y^c_s, M^d, Y^d_s:

Table 1: Components of G_s

                                 Continuous   Discrete
Variables defined for all s      M^c          M^d
Variables defined for state s    Y^c_s        Y^d_s
Indicator of state               −            D_s

Continuous variables:

M^c = µ^c_m(X) + U^c_m,  Y^c_s = µ^c_s(X) + U^c_s.

Discrete variables are generated by latent indices:

M*^d = µ^d_m(X) + U^d_m,  Y*^d_s = µ^d_s(X) + U^d_s.

(42)

Data used for factor analysis:

M^c, M*^d, Y^c_s, Y*^d_s and V(s), s = 1, ..., S̄.

Y = ∑_{s=1}^{S̄} D_s Y_s.

M, Y, and D_s, s = 1, ..., S̄, all contain information on θ.

(43)

(A-1) (U^c_m, U^d_m, U^c_s, U^d_s, ε_W) have distribution functions that are absolutely continuous with respect to Lebesgue measure, with means zero, and with support U^c_m × U^d_m × U^c_s × U^d_s × E_W, whose upper and lower limits may be bounded or infinite. Thus the joint system is measurably separable (variation free). We assume finite variances. The cumulative distribution function of ε_W is assumed to be strictly increasing over its full support.

(A-2) (X, Z, Q) ⊥⊥ (U, ε_W), where U = (U^c_m, U^d_m, U^c_s, U^d_s).

(44)

Theorem 1: From data on F(M | X), one can identify the joint distribution of (U^c_m, U^d_m) (the latter component only up to scale); the function µ^c_m(X) is identified and µ^d_m(X) is identified up to scale over the support of X, provided that the following assumptions, in addition to the relevant components of (A-1) and (A-2), are invoked.

(A-3) Order the discrete measurement components to be first. Suppose that there are N_{m,d} discrete components, followed by N_{m,c} continuous components. Assume

Support(µ^d_{1,m}(X), ..., µ^d_{N_{m,d},m}(X)) ⊇ Support(U^d_{1,m}, ..., U^d_{N_{m,d},m}).

(A-4) For each l = 1, ..., N_{m,d}, µ^d_{l,m}(X) = X β^d_{l,m}.

(A-5) X lives in a subset of R^{N_X}. There exists no proper linear subspace of R^{N_X} having probability 1 under F_X.

(45)

Proof of Theorem 1: The case where M consists of purely continuous components is trivial. We observe M^c for each X and can recover the marginal distribution of each component. Recall that M is not state dependent.

For the purely discrete case, we encounter the usual problem that there is no direct observable counterpart for µ^d_m(X). Under (A-1)-(A-5), we can use the analysis of Manski (1988) to identify the slope coefficients β^d_{l,m} up to scale, and the marginal distribution of U^d_{l,m}. From the assumption that the mean (or median) of U^d_{l,m} is zero, we can identify the intercept in β^d_{l,m}. We can repeat this for all discrete components. Thus, coordinate by coordinate, we can identify the marginal distributions of U^c_m, Ũ^d_m, µ^c_m(X) and µ̃^d_m(X), the latter up to scale ("~" means identified up to scale).

To recover the joint distribution, write:

Pr(M^c ≤ m^c, M^d = (0, ..., 0) | X) = F_{U^c_m, Ũ^d_m}(m^c − µ^c_m(X), −µ̃^d_m(X))

by assumption (A-2). To identify F_{U^c_m, Ũ^d_m}(t_1, t_2) at any given evaluation points in the support of (U^c_m, Ũ^d_m), we know the function µ̃^d_m(X), and using (A-3) we can find an X = x_b where −µ̃^d_m(x_b) = t_2.

(46)

In this proof, t_1, t_2 may be vectors. Thus

Pr(M^c ≤ m^c, M^d = (0, ..., 0) | X = x_b) = F_{U^c_m, Ũ^d_m}(m^c − µ^c_m(x_b), t_2).

Let m̂^c = t_1 + µ^c_m(x_b) to obtain

Pr(M^c ≤ m̂^c, M^d = (0, ..., 0) | X = x_b) = F_{U^c_m, Ũ^d_m}(t_1, t_2).

We know the left-hand side and thus identify F_{U^c_m, Ũ^d_m} at the point (t_1, t_2). ∎

(47)

Theorem 2: For the relevant subsets of the conditions (A-1) and (A-2) (specifically, assuming absolute continuity of the distribution of ε_W with respect to Lebesgue measure and ε_W ⊥⊥ (Z, Q)), and under the additional assumptions:

(A-6) c_s(Q_s) = Q_s η_s, s = 1, ..., S̄; ϕ(Z) = Z′β.

(A-7) (Q_1, Z) is full rank (there is no proper linear subspace of the support of (Q_1, Z) with probability 1). Z contains no intercept.

(A-8) Q_s, for s = 2, ..., S̄, is full rank (there is no proper linear subspace of R^{Q_s} with probability 1).

(A-9) Support(c_1(Q_1) − ϕ(Z)) ⊇ Support(ε_W).

Then the distribution function F_{ε_W} is known up to a scale normalization on ε_W, and c_s(Q_s), s = 1, ..., S̄, and ϕ(Z) are identified up to the same scale.

(48)

Proof of Theorem 2:

Pr(D_1 = 1 | Z, Q_1) = Pr((c_1(Q_1) − ϕ(Z)) / σ_W > ε_W / σ_W).

Under (A-1), (A-2), (A-6), (A-7) and (A-9), it follows that (c_1(Q_1) − ϕ(Z)) / σ_W and F_{ε̃_W} (where ε̃_W = ε_W / σ_W) are identified (see Manski, 1988, or Matzkin, 1992, 1993). Under rank condition (A-7), identification of (c_1(Q_1) − ϕ(Z)) / σ_W implies identification of c_1(Q_1) / σ_W and ϕ(Z) / σ_W separately. Write

Pr(D_2 = 1 | Z, Q_1, Q_2) = F_{ε̃_W}((c_2(Q_2) − ϕ(Z)) / σ_W) − F_{ε̃_W}((c_1(Q_1) − ϕ(Z)) / σ_W).

From the absolute continuity of ε̃_W and the assumption that its distribution function is strictly increasing, we can write

c_2(Q_2) / σ_W = F_{ε̃_W}^{-1}[ Pr(D_2 = 1 | Z, Q_1, Q_2) + F_{ε̃_W}((c_1(Q_1) − ϕ(Z)) / σ_W) ] + ϕ(Z) / σ_W.

Thus we can identify c_2(Q_2) / σ_W and, proceeding in this fashion, c_s(Q_s) / σ_W for all s. ∎
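The inversion in this proof can be mimicked numerically. A minimal sketch (Python with scipy; purely for illustration, F_{ε̃_W} is taken to be standard normal and σ_W = 1, whereas the theorem identifies the distribution nonparametrically):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
N = 1_000_000
c1, c2, phi = 0.2, 1.1, 0.5        # thresholds c_s(Q_s) and phi(Z), illustrative

eps = rng.normal(0, 1, N)          # eps_W / sigma_W, with cdf F assumed known here
D1 = eps < (c1 - phi)
D2 = (~D1) & (eps < (c2 - phi))

# Invert the choice probabilities as in the proof:
p1, p2 = D1.mean(), D2.mean()
c1_hat = norm.ppf(p1) + phi                            # from Pr(D1 = 1)
c2_hat = norm.ppf(p2 + norm.cdf(c1_hat - phi)) + phi   # F^{-1}[p2 + F(c1 - phi)] + phi
print("c1, c2 recovered:", c1_hat, c2_hat)
```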

(49)

Pr(M^c ≤ m^c, M*^d ≤ 0, Y^c_s ≤ y^c_s, Y*^d_s ≤ 0 | D_s = 1, X, Z, Q_s, Q_{s−1}) × Pr(D_s = 1 | Z, Q_s, Q_{s−1})    (22)

= ∫∫∫∫∫ f(U^c_m, Ũ^d_m, U^c_s, Ũ^d_s, ε̃_W) dU^c_m dŨ^d_m dU^c_s dŨ^d_s dε̃_W,

with limits of integration: U^c_m from its lower support limit to m^c − µ^c_m(X); Ũ^d_m up to −µ̃^d_m(X); U^c_s up to y^c_s − µ^c_s(X); Ũ^d_s up to −µ̃^d_s(X); and ε̃_W from (c_{s−1}(Q_{s−1}) − ϕ(Z)) / σ_W to (c_s(Q_s) − ϕ(Z)) / σ_W.

(A-10) Support(−µ̃^d_m(X), −µ̃^d_s(X), ((c_s(Q_s) − ϕ(Z)) / σ_W, (c_{s−1}(Q_{s−1}) − ϕ(Z)) / σ_W)) ⊇ Support(Ũ^d_m, Ũ^d_s, ε̃_W) = Ũ^d_m × Ũ^d_s × Ẽ_W.

(A-11) There is no proper linear subspace of (X, Z, Q_s, Q_{s−1}) with probability one, so the model is full rank.

As a consequence of (A-6) and (A-10), we can find limiting values Q̄_s and Q̲_{s−1} of Q_s and Q_{s−1}, respectively, so that

lim_{Q_s → Q̄_s, Q_{s−1} → Q̲_{s−1}} Pr(D_s = 1 | Z, Q_s, Q_{s−1}) = 1.

(50)

Theorem 3: Under assumptions (A-1), (A-2), (A-4), (A-6), (A-7), (A-8), (A-9), (A-10) and (A-11), µ^c_m(X), µ^c_s(X), µ̃^d_m(X), µ̃^d_s(X), ϕ̃(Z) and c_s(Q_s), s = 1, ..., S − 1, are identified, as is the joint distribution of (U^c_m, Ũ^d_m, U^c_s, Ũ^d_s, ε̃_W).

(51)

Proof of Theorem 3: From (A-2), the unobservables are jointly independent of (X, Z, Q). For fixed values of (Z, Q_s, Q_{s−1}), we may vary the points of evaluation for the continuous coordinates (y^c_s) and pick alternative values of X = x_b to trace out the vector µ^c_s(X) up to intercept terms. Thus we can identify µ^c_{s,l}(X) up to a constant for all l = 1, ..., N_{c,s} (Heckman and Honoré, 1990). Under (A-2), we recover the same functions for whatever values of (Z, Q_s, Q_{s−1}) are prespecified, as long as c_s(Q_s) > c_{s−1}(Q_{s−1}), so that there is an interval of ε_W bounded above and below with positive probability. This identification result does not require any passage-to-a-limit argument.

For values of (Z, Q_s, Q_{s−1}) such that

lim_{Q_s → Q̄_s(Z), Q_{s−1} → Q̲_{s−1}(Z)} Pr(D_s = 1 | Z, Q_s, Q_{s−1}) = 1,

where Q̄_s(Z) is an upper limit and Q̲_{s−1}(Z) is a lower limit (we allow the limits to depend on Z), we essentially integrate out ε̃_W and obtain

Pr(U^c_m ≤ m^c − µ^c_m(X), Ũ^d_m ≤ −µ̃^d_m(X), U^c_s ≤ y^c_s − µ^c_s(X), Ũ^d_s ≤ −µ̃^d_s(X)).

(52)

Then, proceeding as in the proof of Theorem 1, we can identify µ̃^d_s(X) coordinate by coordinate and, from the assumption of mean or median zero of the unobservables, we obtain the constants in µ^c_{s,l}(X), l = 1, ..., N_{c,s}, as well as the constants in µ̃^d_m(X). In this exercise we use the full rank condition on X, which is part of assumption (A-11).

With these functions in hand, under the full conditions of assumption (A-10), we can fix y^c_s, m^c, µ̃^d_s, µ̃^d_m, (c_s(Q_s) − ϕ(Z)) / σ_W and (c_{s−1}(Q_{s−1}) − ϕ(Z)) / σ_W at different values to trace out the joint distribution of (U^c_m, Ũ^d_m, U^c_s, Ũ^d_s, ε̃_W). ∎

(53)

The S̄ identified distributions: we factor analyze them.

U_s = α_s′ θ + ε_s, s = 1, ..., S̄

U_m = α_m′ θ + ε_m, m = 1, ..., N_m

(54)

Theorem 4: Under normalizations on the factor loadings of the type in (21) for one system s, under the conditions of Theorems 1-3, given the normalizations for the unobservables for the discrete components, and given at least 2K + 1 measurements (Y, M, V), the unrestricted factor loadings and the variances of the factors (σ²_{θ_i}, i = 1, ..., K) are identified over all systems.

(55)

Nonparametric Identification of Distributions

T^s = ({U_m}_{m=1}^{N_m}, {U_{s,a}}_{a=1}^{Ā}, ε̃_V)

Order the elements so that the first B_1 elements depend only on θ_1; the next B_2 − B_1 elements depend on (θ_1, θ_2); and so forth.

(56)

Theorem 5: If

T^s_1 = θ_1 + v_1  and  T^s_2 = θ_1 + v_2,

and θ_1 ⊥⊥ v_1 ⊥⊥ v_2, the means of all three generating random variables are finite, E(v_1) = E(v_2) = 0, the conditions of Fubini's theorem are satisfied for each random variable, and the random variables possess nonvanishing (a.e.) characteristic functions, then the densities of (θ_1, v_1, v_2), namely g(θ_1), g_1(v_1), g_2(v_2), respectively, are identified.

(57)

Apply Theorem 5 to the first two measurements:

T^s_1 = λ^s_{11} θ_1 + ε^s_1, where λ^s_{11} = 1
T^s_2 = λ^s_{21} θ_1 + ε^s_2, where λ^s_{21} ≠ 0.

Then

T^s_1 = θ_1 + ε^s_1,  T^s_2 / λ^s_{21} = θ_1 + ε^{*,s}_2.

Proceeding to equations B_1 + 1 and B_1 + 2:

T^s_{B_1+1} = λ^s_{B_1+1,1} θ_1 + θ_2 + ε^s_{B_1+1}
T^s_{B_1+2} = λ^s_{B_1+2,1} θ_1 + λ^s_{B_1+2,2} θ_2 + ε^s_{B_1+2}.

T^s_{B_1+1} − λ^s_{B_1+1,1} θ_1 = θ_2 + ε^s_{B_1+1}
(T^s_{B_1+2} − λ^s_{B_1+2,1} θ_1) / λ^s_{B_1+2,2} = θ_2 + ε^{*,s}_{B_1+2},

where ε^{*,s}_{B_1+2} = ε^s_{B_1+2} / λ^s_{B_1+2,2}.
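Theorem 5 (a Kotlarski-type argument) is constructive: the log characteristic function of θ_1 is the integral of E[i T_2 e^{isT_1}] / E[e^{isT_1}]. A simulation sketch (Python; the gamma and normal distributions are illustrative assumptions) recovers the characteristic function of a non-normal factor from two noisy measurements:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 200_000

# Non-normal factor plus two independent noises (illustrative distributions)
theta = rng.gamma(2.0, 1.0, N) - 2.0      # mean-zero, skewed "theta_1"
v1 = rng.normal(0, 0.7, N)
v2 = rng.normal(0, 0.9, N)
T1, T2 = theta + v1, theta + v2           # T1 = theta1 + v1, T2 = theta1 + v2

# Kotlarski: d/ds log phi_theta(s) = E[i T2 e^{i s T1}] / E[e^{i s T1}]
s_grid = np.linspace(0.0, 2.0, 201)
ratio = np.array([(1j * T2 * np.exp(1j * s * T1)).mean()
                  / np.exp(1j * s * T1).mean() for s in s_grid])
log_phi = np.concatenate(([0.0], np.cumsum(
    0.5 * (ratio[1:] + ratio[:-1]) * np.diff(s_grid))))   # trapezoid rule
phi_hat = np.exp(log_phi)

# Compare with the (empirical) characteristic function of theta itself
phi_true = np.array([np.exp(1j * s * theta).mean() for s in s_grid])
print("max |phi_hat - phi_true|:", np.abs(phi_hat - phi_true).max())
```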

(58)

Example:

Two potential outcomes at each age a: (Y_0a, Y_1a). Set Z big, so D_s = 1.

Y_0a = µ_0a + α_0a θ + ε_0a, a = 1, ..., A
Y_1a = µ_1a + α_1a θ + ε_1a, a = 1, ..., A,

with (ε_0a ⊥⊥ ε_1a′) ∀ a, a′ and (ε_0a, ε_1a) ⊥⊥ θ ∀ a.

Suppose that Pr(D_s = 1 | Z) = 1 for Z ∈ Z_1 and Pr(D_s = 1 | Z) = 0 for Z ∈ Z_0.

(59)

Let A = 3, so if we have three panel observations, we obtain

COV(Y_01, Y_02) = α_01 α_02 σ²_θ
COV(Y_01, Y_03) = α_01 α_03 σ²_θ    (for Z ∈ Z_0)
COV(Y_02, Y_03) = α_02 α_03 σ²_θ

COV(Y_11, Y_12) = α_11 α_12 σ²_θ
COV(Y_11, Y_13) = α_11 α_13 σ²_θ    (for Z ∈ Z_1)
COV(Y_12, Y_13) = α_12 α_13 σ²_θ.

Assuming α_01 = 1 or σ²_θ = 1, and α_03 ≠ 0,

COV(Y_01, Y_02) / COV(Y_01, Y_03) = α_02 / α_03.

Given σ²_θ = 1 and COV(Y_02, Y_03) = α_02 α_03, we obtain

(α_03)² = COV(Y_02, Y_03) COV(Y_01, Y_03) / COV(Y_01, Y_02).

(60)

The variances of the uniquenesses:

Var(ε_0a) = Var(Y_0a) − α²_0a σ²_θ, a = 1, ..., A
Var(ε_1a) = Var(Y_1a) − α²_1a σ²_θ, a = 1, ..., A.

For a measurement T with loading β on θ (e.g., a test score):

COV(T, Y_0a) = β α_0a σ²_θ, a = 1, ..., A
COV(T, Y_1a′) = β α_1a′ σ²_θ, a′ = 1, ..., A.

Assume β ≠ 0 and α_0a, α_1a′ ≠ 0:

COV(T, Y_1a′) / COV(T, Y_0a) = α_1a′ / α_0a,  a = 1, ..., A, a′ = 1, ..., A, a ≠ a′.
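A simulation sketch of this example (Python; distributions, loadings and the instrument are illustrative assumptions) recovers the loadings from exactly these covariance ratios, using only covariances observable in each instrument-defined subsample:

```python
import numpy as np

rng = np.random.default_rng(6)
N, A = 1_000_000, 3
a0 = np.array([1.0, 0.8, 1.4])        # alpha_{0a}, with alpha_{01} = 1 normalization
a1 = np.array([1.1, 0.6, 0.9])        # alpha_{1a}
beta = 0.7                            # loading of the measurement T on theta

theta = rng.normal(0, 1, N)           # sigma^2_theta = 1 normalization
Y0 = a0 * theta[:, None] + rng.normal(0, 0.5, (N, A))
Y1 = a1 * theta[:, None] + rng.normal(0, 0.5, (N, A))
T = beta * theta + rng.normal(0, 0.5, N)

Z1 = rng.random(N) < 0.5              # Z in Z_1: observe Y1; else observe Y0
C0 = np.cov(Y0[~Z1], rowvar=False)    # covariances observable for Z in Z_0
C1 = np.cov(Y1[Z1], rowvar=False)

a03 = np.sqrt(C0[1, 2] * C0[0, 2] / C0[0, 1])       # (alpha_03)^2 formula
a02 = C0[0, 1] / C0[0, 2] * a03                     # ratio times alpha_03
print("alpha_02, alpha_03:", a02, a03, " truth:", a0[1], a0[2])

# Cross-system ratio via the measurement T (observed for everyone):
ratio = np.cov(T[Z1], Y1[Z1, 1])[0, 1] / np.cov(T[~Z1], Y0[~Z1, 0])[0, 1]
print("alpha_12/alpha_01:", ratio, " truth:", a1[1] / a0[0])
```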

(61)

Factor Representation of Stationary AR(1) Processes

Theorem: Let {y_t}_{t=1}^T be a stationary AR(1) process, y_t = ρ y_{t−1} + ν_t, with T (an odd integer) observations, |ρ| < 1 and ν_t iid (0, σ²_ν). Then we can always find M = (T − 1)/2 orthogonal factors, corresponding loadings (which may be normalized to zero or one), and uniquenesses u_t (mean zero, independent of the factors and of each other) such that the process can be represented by

y_t = λ_t^1 θ_1 + ... + λ_t^M θ_M + u_t.

Proof: By induction. The theorem obviously holds for T = 1 observation of the AR(1) process. Assume it holds for any odd integer T of observations. We have to show that it also holds for T + 2. Indeed, let

y_t = λ_t^1 θ_1 + ... + λ_t^M θ_M + u_t,  t = 1, ..., T

be the factor representation of the stationary AR(1) process {y_t}_{t=1}^T. If we add two more observations, one cannot represent {y_t}_{t=1}^{T+2} with the original M factors, because the extra number of parameters in the correlation matrix to be reproduced far exceeds the number of loadings that we get from the two extra observations. Thus we need more factors. We show that only one more factor is necessary to represent the stationary AR(1) process {y_t}_{t=1}^{T+2} after the addition of the two extra observations. Indeed, let

y_t = λ_t^1 θ_1 + ... + λ_t^M θ_M + u_t,  t = 1, ..., T − 1
y_t = λ_t^1 θ_1 + ... + λ_t^M θ_M + λ_t^{M+1} θ_{M+1} + u_t,  t = T, T + 1, T + 2

be its factor representation. Notice that

Corr(y_t, y_{T+1}) = ∑_{i=1}^M λ_t^i λ_{T+1}^i σ_i²,  t = 1, ..., T − 1
Corr(y_t, y_{T+2}) = ∑_{i=1}^M λ_t^i λ_{T+2}^i σ_i²,  t = 1, ..., T − 1.

But this is a system of 2(T − 1) equations in 2M = T − 1 unknowns. Thus we can solve the system above to recover {λ_{T+1}^i, λ_{T+2}^i}_{i=1}^M. Now,

Corr(y_T, y_{T+1}) = ∑_{i=1}^M λ_T^i λ_{T+1}^i σ_i² + λ_T^{M+1} λ_{T+1}^{M+1} σ_{M+1}²
Corr(y_T, y_{T+2}) = ∑_{i=1}^M λ_T^i λ_{T+2}^i σ_i² + λ_T^{M+1} λ_{T+2}^{M+1} σ_{M+1}²
Corr(y_{T+1}, y_{T+2}) = ∑_{i=1}^M λ_{T+1}^i λ_{T+2}^i σ_i² + λ_{T+1}^{M+1} λ_{T+2}^{M+1} σ_{M+1}²

(62)

Normalize λ_T^{M+1} = 1 and define

A_1 = Corr(y_T, y_{T+1}) − ∑_{i=1}^M λ_T^i λ_{T+1}^i σ_i²
A_2 = Corr(y_T, y_{T+2}) − ∑_{i=1}^M λ_T^i λ_{T+2}^i σ_i²
A_3 = Corr(y_{T+1}, y_{T+2}) − ∑_{i=1}^M λ_{T+1}^i λ_{T+2}^i σ_i².

Then the following system must be solved for λ_{T+1}^{M+1}, λ_{T+2}^{M+1} and σ_{M+1}²:

λ_{T+1}^{M+1} σ_{M+1}² = A_1
λ_{T+2}^{M+1} σ_{M+1}² = A_2
λ_{T+1}^{M+1} λ_{T+2}^{M+1} σ_{M+1}² = A_3

and the solution is:

λ_{T+1}^{M+1} = A_3 / A_2
λ_{T+2}^{M+1} = A_3 / A_1
σ_{M+1}² = A_1 A_2 / A_3.

This completes the induction. ∎
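The induction step can be checked numerically. A sketch (Python; ρ is an illustrative value) builds the single factor for the T = 3 case, where the base case T = 1 has zero factors, from the A-formulas above, and verifies that the implied correlation matrix reproduces the AR(1) one:

```python
import numpy as np

rho = 0.6                     # illustrative AR(1) coefficient, |rho| < 1
T = 3                         # M = (T-1)/2 = 1 factor

# Stationary AR(1) correlation matrix: Corr(y_t, y_s) = rho^|t-s|
R = rho ** np.abs(np.subtract.outer(np.arange(T), np.arange(T)))

# T = 1 base case has zero factors; adding y_2, y_3 requires one factor,
# loading on t = 1, 2, 3 with lambda_1 = 1 normalized.
A1, A2, A3 = R[0, 1], R[0, 2], R[1, 2]
lam = np.array([1.0, A3 / A2, A3 / A1])     # (1, 1/rho, 1)
var_f = A1 * A2 / A3                        # rho^2

# Implied correlation matrix of the one-factor representation
R_factor = var_f * np.outer(lam, lam)
np.fill_diagonal(R_factor, 1.0)             # uniquenesses absorb the diagonal
print("uniqueness variances:", 1 - var_f * lam**2)   # (1-rho^2, 0, 1-rho^2)
print("max |R - R_factor|:", np.abs(R - R_factor).max())
```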
