Three essays on the estimation of asset pricing models


FUNDAÇÃO GETULIO VARGAS

ESCOLA de PÓS-GRADUAÇÃO em

ECONOMIA

Diego Gusmão Brandão

Three Essays on the Estimation of

Asset Pricing Models

Rio de Janeiro

2017


Thesis submitted to the Escola de Pós-Graduação em Economia in partial fulfillment of the requirements for the degree of Doctor in Economics

Field of concentration: Finance

Advisor: Caio Almeida


Catalog record prepared by the Biblioteca Mario Henrique Simonsen/FGV

Brandão, Diego Gusmão

Three essays on the estimation of asset pricing models / Diego Gusmão Brandão. 2017.

86 leaves.

Thesis (doctorate) - Fundação Getulio Vargas, Escola de Pós-Graduação em Economia. Advisor: Caio Ibsen Rodrigues de Almeida. Includes bibliography.

1. Asset pricing - Model (CAPM). 2. Risk (Economics). 3. Natural disasters - Economic aspects. I. Almeida, Caio Ibsen Rodrigues de. II. Fundação Getulio Vargas. Escola de Pós-Graduação em Economia. III. Title.


Abstract

The thesis consists of three articles on the estimation of asset pricing models. The first paper analyzes small sample properties of Generalized Empirical Likelihood estimators for the risk aversion parameter in CRRA preferences when the economy is characterized by rare disasters. In the second article, we develop and test a methodology to assess misspecified asset pricing models by taking into account the smallest probability distortion necessary to assign correct prices. In the final paper, we estimate an approximate long run risks model using Brazilian data.


List of Figures

3.1 Small sample distributions - 2 assets

3.2 Small sample δ distributions - 2 assets

3.3 Asymptotic distribution of π - 2 assets

3.4 Small sample π distributions

3.5 Small sample $\hat{\alpha}$ distribution

4.1 Quarterly Aggregate Dividends

4.2 Consumption and GDP

4.3 Impulse Response Function for Consumption

4.4 Impulse Response Functions for Consumption and Dividends


List of Tables

2.1 Baseline Parameter Values

2.2 Estimating α

2.3 Optimal weights ζ = λ/λ(1)

2.4 Estimating α

2.5 Optimal weights ζ = λ/λ(1)

2.6 Estimating α

3.1 Parameters

3.2 Parameters for Dividends Processes

3.3 Mean Discrepancies

4.1 Sample Moments

4.2 Testing for Unit Roots

4.3 Cointegration Tests

4.4 Long Run Consumption Claim Prices


Chapter 1 Introduction

The thesis consists of three articles on the estimation of asset pricing models. The first paper analyzes small sample properties of Generalized Empirical Likelihood estimators for the risk aversion parameter in CRRA preferences when the economy is characterized by rare disasters. In the second article, we develop and test a methodology to assess misspecified asset pricing models by taking into account the smallest probability distortion necessary to assign correct prices. In the final paper, we estimate an approximate long run risks model using Brazilian data.

GMM estimators for relative risk aversion in a simple disaster asset pricing model, in which aggregate consumption growth may suffer large drops with small probabilities, are heavily biased. In the first chapter, we analyze the ability of some members of the family of Generalized Empirical Likelihood (GEL) estimators to estimate parameters in Barro-type disaster models. Simulating from an economy in which large consumption drops might happen with a small probability, we show that the performance of all GEL estimators depends strongly on whether a disaster occurs in the sample. When disasters do not occur, the estimators perform poorly, delivering biased parameters, in particular the coefficient of risk aversion in CRRA preferences. This allows for the conclusion that in relatively young economies in which there is interest in testing for the existence of disasters via estimation of Barro-type models, if only small samples of data on aggregate returns and consumption are available in the estimation process, all GEL estimators will estimate risk aversion coefficients with large positive biases, therefore inducing equity-premium-like puzzles.

The second chapter develops an alternative method to compute the degree of misspecification of asset pricing models by taking into account the smallest multiplicative correction to the stochastic discount factor necessary to assign correct asset prices. These multiplicative corrections can also be described as probability distortions to the real measure associated with the model. Estimators obtained by minimizing our misspecification measure belong to the Generalized Empirical Likelihood class. We obtain consistency and asymptotic normality results for GEL estimators assuming the model is misspecified. To test our methodology, we conduct Monte Carlo experiments using a habit model, where a misspecified stochastic discount factor is estimated. Results indicate that, although different distance measures have a significant impact on the estimation of probability distortions with minimum discrepancy, they have a small impact on the estimation of misspecification degrees and of the relative risk aversion for this model.

In our final chapter, we study the temporal structure of risk prices and returns for Brazil assuming the economy follows a long run risk model. We use an approximation of the stochastic discount factor around a log-linear process, where the dynamic properties of risk prices can be obtained using VAR methods. We apply those methods using aggregate consumption, dividends and the gross domestic product. Consumption and dividends are decomposed using Beveridge and Nelson (1981) methods, where the martingale component that dominates asymptotic risk premia is extracted. We also identify a temporary and a long run shock using the Blanchard and Quah (1989) identification scheme. Results indicate that consumption and dividends respond in opposite directions to temporary shocks in the short run, generating a negative risk premium for temporary shocks. When the investment horizon is longer, risk prices for temporary shocks are zero and the risk premium is dominated by the long run shock.


Chapter 2 Estimating Disaster Models with GEL Estimators

Abstract

Recently, Martin (2013) showed the inability of GMM to estimate a simple disaster asset pricing model in which aggregate consumption growth may suffer large drops with small probabilities, as in Barro (2006). In this paper, building on Martin (2013), we analyze the ability of some members of the family of Generalized Empirical Likelihood (GEL) estimators to estimate parameters in Barro-type disaster models. Simulating from an economy in which large consumption drops might happen with a small probability, we show that the performance of all GEL estimators depends strongly on whether a disaster occurs in the sample. When disasters do not occur, the estimators perform poorly, delivering biased parameters, in particular the coefficient of risk aversion in CRRA preferences. This allows for the conclusion that in relatively young economies in which there is interest in testing for the existence of disasters via estimation of Barro-type models, if only small samples of data on aggregate returns and consumption are available in the estimation process, all GEL estimators will estimate risk aversion coefficients with large positive biases, therefore inducing equity-premium-like puzzles.


2.1 Introduction

Standard consumption-based models with a representative agent fail to explain the equity premium found in US data, requiring an unrealistic coefficient of risk aversion to compensate for the low observed consumption risk. This result appears in many forms. Mehra and Prescott (1985) calibrate a Lucas (1978) tree model using a simple specification for consumption dynamics and find that high excess returns are inconsistent with small risk aversion. Hansen and Jagannathan (1991) show that a modest risk aversion in the basic consumption model is insufficient to generate the volatility in the stochastic discount factor needed to correctly price market returns. GMM methods are extensively used to estimate and test these models, see e.g. Hansen and Singleton (1982), Campbell (2003), Lettau and Ludvigson (2009), either rejecting the canonical model or finding implausible estimates of risk aversion.

As first observed by Rietz (1988) and later developed by Barro (2006), a high equity premium can be explained as the compensation for risks of rare disasters. If disasters are not observed in data but are reflected in asset prices, a high risk aversion is necessary to rationalize the equity premium. In this case, estimating the coefficient of relative risk aversion can be problematic, as the presence or absence of disasters in the sample substantially affects results. Martin (2013) shows that if the disaster model is the true representation of the economy, then GMM generates highly biased and inaccurate estimates of risk aversion in small samples. Julliard and Ghosh (2012) conduct a similar exercise using the empirical likelihood estimator, and conclude that it is unlikely to produce estimates of the relative risk aversion coefficient as high as those obtained in historical samples. This experiment suggests that empirical likelihood methods may present better small sample properties for estimating risk aversion if the economy is subject to rare disasters.

In this paper we use Monte Carlo simulations to explore the performance of Generalized Empirical Likelihood (GEL) estimators in economies characterized by rare disasters. Our main objective is to investigate whether the distortion to the empirical measure associated with GEL helps to reduce the upward bias of estimated relative risk aversion, and how these distortions are affected by the choice of different members of the GEL family. Our simulations suggest that if there are no disaster events in a sample, results are strongly affected by the choice of the member of the GEL family. If disasters appear in the sample, however, all estimators present similar behavior.

The GEL family includes important semiparametric estimators such as the Empirical Likelihood (EL, Owen (1988)), Exponential Tilting (ET, Kitamura and Stutzer (1997)) and Continuous Updating (CU, Hansen, Heaton, and Yaron (1996)) estimators, created to improve the small sample properties of the efficient GMM. Although Smith (1997) shows that all GEL estimators share the same asymptotic properties as the efficient two step GMM estimator, their small sample properties differ. We examine some of these properties using the disaster model environment, exploring how those estimators behave when the sample distribution diverges severely from its population counterpart.

Every member of the GEL family has a minimum distance dual representing the smallest distortion, under some specific criterion, to the empirical measure so that moment conditions are satisfied. Instead of minimizing a combination of moment condition errors, as in GMM, these estimators look for the minimum discrepancy between measures. We show in our simulations that the choice of distance criterion has a major impact on the associated distortion. This is important to understand how these estimators adjust the empirical measure to compensate for missing observations of disasters in a sample.

The baseline parametrization used in our simulations follows Barro (2006). The parameters characterizing the size and frequency of disasters are especially important as many extensions of the basic model, e.g. Gabaix (2008), Barro and Jin (2011), Gabaix (2012) and Martin (2013), use them to calibrate the disaster component of consumption. In this literature, the coefficient of relative risk aversion and the subjective discount factor are usually not estimated, but chosen to match the equity premium. We fix preference parameters as in Barro (2006) and explore how the properties of GEL estimators change when we use alternative specifications for dividend growth.


2.2 Disaster Model

The economy is a version of the model in Barro (2006), a representative agent Lucas tree model where the exogenous stochastic consumption is characterized by a small probability of disaster. The representative agent's preferences are standard, given by

$$E\left[\sum_{t=0}^{\infty} \beta^t \, \frac{C_t^{1-\alpha} - 1}{1-\alpha}\right] \tag{2.1}$$

where $\beta$ is the subjective discount factor and $\alpha \geq 0$ is the relative risk aversion.

The main ingredient of the model is the stochastic process describing the dynamics of consumption. We follow Backus, Chernov, and Martin (2011) and model log consumption growth $\Delta c_{t+1}$ as an independent and identically distributed random variable composed of two independent components:

$$\Delta c_{t+1} = w_{t+1} + z_{t+1} \tag{2.2}$$

The first component $w_{t+1}$ has a normal distribution $N(\mu, \sigma^2)$ and represents consumption growth when no disasters hit the economy. The second component $z_{t+1}$, a Poisson mixture of normals, controls the size and frequency of disasters. To determine $z_{t+1}$, a random variable $j$ representing the number of jumps is drawn from a Poisson distribution $P(\omega)$, where $\omega$ is the mean and variance of $j$. Conditional on the number of jumps, $z_{t+1}$ has the normal distribution $z_{t+1} \mid j \sim N(j\theta, j\delta^2)$. When $\omega$ is small, as is the case under the standard parametrization used in this paper, the number of jumps will generally be at most one and $j$ will behave approximately as a Binomial variable.
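The consumption process in 2.2 can be simulated directly. The following is a minimal sketch, assuming the baseline values quoted later in Table 2.1; the function names, the seed, and the use of Knuth's multiplication method for the Poisson draw are illustrative choices, not part of the original text.

```python
import math
import random


def draw_poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_consumption_growth(n, mu=0.025, sigma=0.02, omega=0.017,
                                theta=-0.39, delta=0.25, seed=0):
    """Simulate n i.i.d. draws of log consumption growth w + z, as in equation 2.2."""
    rng = random.Random(seed)
    growth, jumps = [], []
    for _ in range(n):
        w = rng.gauss(mu, sigma)               # normal-times component
        j = draw_poisson(rng, omega)           # number of jumps in the period
        # Conditional on j jumps, z is N(j * theta, j * delta^2); z = 0 when j = 0.
        z = rng.gauss(j * theta, math.sqrt(j) * delta) if j > 0 else 0.0
        growth.append(w + z)
        jumps.append(j)
    return growth, jumps


growth, jumps = simulate_consumption_growth(100_000)
disaster_share = sum(1 for j in jumps if j > 0) / len(jumps)
multi_jump_share = sum(1 for j in jumps if j > 1) / len(jumps)
print(f"share of periods with a disaster:   {disaster_share:.4f}")
print(f"share of periods with 2+ jumps:     {multi_jump_share:.5f}")
```

With a jump intensity this small, periods with more than one jump should be very rare in the simulated sample, which is the sense in which the number of jumps is close to a zero-one variable.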

To understand the consequences of rare disasters for asset pricing in this economy, suppose there is one asset paying $D_{t+1}$ as dividends. Throughout this paper, we assume that every asset traded in $t$ pays dividends solely in $t+1$, as there are no interesting dynamic effects for consumption growth, and consequently for the stochastic discount factor, in this i.i.d. environment. Optimization of the representative consumer problem requires that the price of any asset satisfy the Euler equation

$$P_t = E_t\left[\beta\left(\frac{C_{t+1}}{C_t}\right)^{-\alpha} D_{t+1}\right] \tag{2.3}$$

Prices are the discounted mean of future dividends and are determined by the stochastic discount factor $\beta\,(C_{t+1}/C_t)^{-\alpha}$, whose behavior is controlled by the representative agent's preference parameters and the distribution of consumption growth. To gain intuition on the pricing properties of the rare disaster environment, we focus on a consumption claim asset that delivers $C_{t+1}$ as dividends. The price of this asset is given by

$$P_t = E_t\left[\beta C_t\left(\frac{C_{t+1}}{C_t}\right)^{1-\alpha}\right] = E_t\left[\beta C_t\, e^{(1-\alpha)w_{t+1}} e^{(1-\alpha)z_{t+1}}\right] = \beta C_t\, E\left[e^{(1-\alpha)w_{t+1}}\right] E\left[e^{(1-\alpha)z_{t+1}}\right] \tag{2.4}$$

The first expectation in the above formula is easily computed:

$$E\left[\exp\{(1-\alpha)w_{t+1}\}\right] = \exp\left\{(1-\alpha)\mu + \tfrac{1}{2}(1-\alpha)^2\sigma^2\right\} \tag{2.5}$$

The second expectation, concerning the jump component of consumption growth, can be obtained by exploiting the conditional normality of $z_{t+1}$:

$$E\left[\exp\{(1-\alpha)z_{t+1}\}\right] = \sum_{j=0}^{\infty} E\left[\exp\{(1-\alpha)z_{t+1}\} \mid i = j\right]\mathrm{Prob}(i = j) = \sum_{j=0}^{\infty} \exp\left\{(1-\alpha)j\theta + \tfrac{1}{2}(1-\alpha)^2 j\delta^2\right\}\frac{e^{-\omega}\omega^j}{j!} \tag{2.6}$$

The rare disaster hypothesis means that $\omega$ should be very small. Under the standard parametrization to be described below, we set $\omega = 0.017$, so the probability of having more than five jumps, for example, is virtually zero. This means that equation 2.6 can be approximated well at small computational cost.
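The truncation argument for equation 2.6 can be checked numerically. The sketch below, assuming the baseline values $\omega = 0.017$, $\theta = -0.39$, $\delta = 0.25$ and $\alpha = 4$, compares the series cut off at five jumps with the closed form $\exp\{\omega(e^{s\theta + s^2\delta^2/2} - 1)\}$ that the same series sums to; the closed form is a standard compound Poisson result used here only as a check, and the function names are hypothetical.

```python
import math


def jump_mgf_truncated(s, omega=0.017, theta=-0.39, delta=0.25, max_jumps=5):
    """Series in equation 2.6 with exponent s, truncated at max_jumps terms."""
    total = 0.0
    for j in range(max_jumps + 1):
        poisson_weight = math.exp(-omega) * omega ** j / math.factorial(j)
        conditional_mean = math.exp(s * j * theta + 0.5 * s ** 2 * j * delta ** 2)
        total += poisson_weight * conditional_mean
    return total


def jump_mgf_closed_form(s, omega=0.017, theta=-0.39, delta=0.25):
    """Sum of the full series: exp(omega * (exp(s*theta + s^2*delta^2/2) - 1))."""
    return math.exp(omega * (math.exp(s * theta + 0.5 * s ** 2 * delta ** 2) - 1.0))


alpha = 4.0
approx = jump_mgf_truncated(1.0 - alpha)
exact = jump_mgf_closed_form(1.0 - alpha)
print(f"truncated series: {approx:.10f}")
print(f"closed form:      {exact:.10f}")
print(f"absolute error:   {abs(approx - exact):.2e}")
```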


The return of the consumption claim asset, interpreted as the market return in a Lucas tree model, is given by $R_{t+1} = C_{t+1}/P_t$. To compute the expected return, we use the pricing equation 2.4 to get

$$E[R_{t+1}] = E\left[\frac{C_{t+1}}{P_t}\right] = E\left[\frac{C_{t+1}}{\beta C_t\, E[\exp((1-\alpha)w_{t+1})]\, E[\exp((1-\alpha)z_{t+1})]}\right] = \frac{1}{\beta}\,\frac{E[\exp(w_{t+1})]}{E[\exp((1-\alpha)w_{t+1})]}\,\frac{E[\exp(z_{t+1})]}{E[\exp((1-\alpha)z_{t+1})]} \tag{2.7}$$

The return on the consumption claim asset is strongly affected by the disaster component $E[\exp(z_{t+1})]/E[\exp((1-\alpha)z_{t+1})]$. Although expected dividend growth decreases when disasters are allowed to happen, as $z_{t+1}$ will tend to be negative, the greater uncertainty forces investors to hold more assets, increasing their prices. Under the baseline parametrization, the expected return of the consumption claim asset is 5.8% lower when we take into account the disaster component.

We are interested in risk premiums, the expected difference between the return on a risky asset and the risk free rate. We denote the risk free rate by $R^f$ and drop the time subscript as it will be a constant in this i.i.d. setting. The riskless asset pays one unit of consumption in $t+1$ in every state of nature, so its price is $E\left[\beta\,(C_{t+1}/C_t)^{-\alpha}\right]$ and $R^f$ must satisfy

$$R^f = \frac{1}{E\left[\beta\left(\frac{C_{t+1}}{C_t}\right)^{-\alpha}\right]} \tag{2.8}$$

$$\phantom{R^f} = \frac{1}{\beta\, E[\exp(-\alpha w_{t+1})]\, E[\exp(-\alpha z_{t+1})]} \tag{2.9}$$

The return on the riskless asset falls when disasters can hit the economy, as investors look for safety to compensate for the increased uncertainty. Under the standard parametrization, $R^f$ is 11% lower when compared to the riskless rate without the disaster component.

This means that, although the expected return of the risky asset decreases with the possibility of disasters, the equity premium is higher. The result is expected, as the covariance between the stochastic discount factor and the consumption claim asset will be higher when there are large jumps in the consumption process.
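The disaster components of equations 2.7 and 2.9 can be evaluated in a few lines. This is a sketch under the baseline values of Table 2.1 ($\alpha = 4$, $\mu = 0.025$, $\sigma = 0.02$, $\omega = 0.017$, $\theta = -0.39$, $\delta = 0.25$), using the compound Poisson closed form for $E[\exp(s z_{t+1})]$ in place of the series; the helper names are hypothetical. The subjective discount factor cancels in both ratios and in the log premium, so it does not appear. The two printed ratios can be set against the reductions in the expected market return and in the risk free rate quoted in the text.

```python
import math


def jump_mgf(s, omega=0.017, theta=-0.39, delta=0.25):
    """E[exp(s * z)] for the Poisson mixture of normals (closed form of the series)."""
    return math.exp(omega * (math.exp(s * theta + 0.5 * s ** 2 * delta ** 2) - 1.0))


def normal_mgf(s, mu=0.025, sigma=0.02):
    """E[exp(s * w)] for w ~ N(mu, sigma^2)."""
    return math.exp(s * mu + 0.5 * s ** 2 * sigma ** 2)


alpha = 4.0

# Disaster factor multiplying the expected market return in equation 2.7.
market_ratio = jump_mgf(1.0) / jump_mgf(1.0 - alpha)
# Disaster factor multiplying the risk free rate in equation 2.9.
riskfree_ratio = 1.0 / jump_mgf(-alpha)

# log(E[R] / R^f) without and with the disaster component; beta cancels.
log_premium_without = (math.log(normal_mgf(1.0)) - math.log(normal_mgf(1.0 - alpha))
                       + math.log(normal_mgf(-alpha)))
log_premium_with = log_premium_without + math.log(market_ratio) - math.log(riskfree_ratio)

print(f"market return factor:   {market_ratio:.4f}")
print(f"risk free rate factor:  {riskfree_ratio:.4f}")
print(f"log premium, no disasters:   {log_premium_without:.4f}")
print(f"log premium, with disasters: {log_premium_with:.4f}")
```

Both factors are below one, and the risk free factor is the smaller of the two, which is why the premium rises even though the expected market return falls.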

To conduct the Monte Carlo experiments in this paper, we require more than one Euler equation to be used as moment conditions for estimation, so that the model is overidentified and the estimators differ. For this reason, we consider different risky assets by specifying dividend growth processes characterized by distinct exposures to consumption risks. Specifically, we assume that log dividend growth $\Delta d_{t+1}$ satisfies

$$\Delta d_{t+1} = \rho\phi\, w_{t+1} + \sqrt{1-\rho^2}\, v_{t+1} + \eta\, z_{t+1} \tag{2.10}$$

where $w_{t+1}$ and $z_{t+1}$ are the normal and disaster components of consumption growth, respectively, $v_{t+1} \sim N(\mu, \sigma_v^2)$ is an independent shock to dividends, $\rho$ is the correlation, conditional on no disasters, between dividend and consumption growth, $\eta$ measures the exposure of dividends to disaster risk and $\phi$ determines dividend growth exposure to $w_{t+1}$. Differences between assets are fully characterized by the parameters $(\rho, \phi, \eta, \sigma_v)$. Note that the consumption claim asset is a special case of 2.10 for $(\rho, \phi, \eta, \sigma_v) = (1, 1, 1, 0)$.

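We abstract from general equilibrium considerations regarding market-clearing mechanisms and require only that Euler equations are satisfied. Once again it is assumed, considering the i.i.d. structure of dividend growth, that each asset marketed in $t$ pays dividends solely in $t+1$. The price $P_t$ of an asset whose dividends are $D_{t+1} = \exp(d_{t+1})$ satisfies

$$P_t = E_t\left[\beta\left(\frac{C_{t+1}}{C_t}\right)^{-\alpha} D_{t+1}\right] \tag{2.11}$$

$$\phantom{P_t} = D_t\, E_t\left[\beta\left(\frac{C_{t+1}}{C_t}\right)^{-\alpha} \frac{D_{t+1}}{D_t}\right] \tag{2.12}$$

$$\phantom{P_t} = \beta D_t\, E_t\left[\exp\left\{(\rho\phi-\alpha)w_{t+1} + (\eta-\alpha)z_{t+1} + \sqrt{1-\rho^2}\, v_{t+1}\right\}\right] \tag{2.13}$$

$$\phantom{P_t} = \beta D_t \exp\left\{(\rho\phi-\alpha)\mu + \tfrac{1}{2}(\rho\phi-\alpha)^2\sigma^2 + \sqrt{1-\rho^2}\,\mu + \tfrac{1}{2}(1-\rho^2)\sigma_v^2\right\} E\left[\exp\{(\eta-\alpha)z_{t+1}\}\right] \tag{2.14}$$

To estimate relative risk aversion, we will make use of the restrictions on excess returns $R^e_{t+1} = R_{t+1} - R^f$ derived from Euler equations:

$$0 = E\left[\beta\left(\frac{C_{t+1}}{C_t}\right)^{-\alpha} R^e_{t+1}\right] = \frac{E\left[\frac{D_{t+1}}{D_t}\right]}{E\left[\beta\left(\frac{C_{t+1}}{C_t}\right)^{-\alpha}\frac{D_{t+1}}{D_t}\right]} - \frac{1}{E\left[\beta\left(\frac{C_{t+1}}{C_t}\right)^{-\alpha}\right]} \tag{2.15}$$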

Our main interest in this paper is to study the interplay between consumption dynamics and the estimation of the risk aversion coefficient $\alpha$ through the moment conditions determined by 2.15. The parametrization of the disaster component, in particular the probability of rare disasters, dramatically changes the asset pricing implications of the standard consumption model. We describe the parametrization used in Barro (2006), slightly modified by Backus, Chernov, and Martin (2011) to adapt to the new distribution of the disaster component, and explore its consequences for the Euler equations 2.15.

To calibrate the utility function, we use $\beta = \exp(-0.3)$ for the subjective discount factor and $\alpha = 4$ for relative risk aversion, inside the range commonly used in the literature. The calibration of the power utility is standard, as the objective of disaster models is to explain the equity premium puzzle without relying on unconventional specifications of preferences.

Consumption growth in no-disaster times is calibrated using international data on real per capita GDP growth from 1954 to 2008. More specifically, the countries belong to the G7, a group not affected by disasters during this period. The mean of log consumption growth $\mu$ is calibrated as 0.025 and its standard deviation $\sigma$ as 0.02.

Disasters are defined as a decline of 15% or more in real per capita GDP. Using international data, the sample used to compute the frequency and size of disasters spans 100 years, from 1900 to 2000. A set of twenty OECD countries, including the United States and many Western European countries, is used. If a disaster happens in consecutive years, the total drop in GDP is condensed into one event to match the model, where i.i.d. jumps only affect current consumption growth. In this sample, the frequency of disasters is 1.7%, so the expected number of jumps in each period is set as $\omega = 0.017$. In the case of one jump being drawn from the Poisson distribution, the mean consumption drop is 35% and its standard deviation 25%.

Table 2.1: Baseline Parameter Values

Parameter                                                Variable   Value
Subjective discount factor                               β          exp(−0.3)
Relative risk aversion                                   α          4
Mean consumption growth when j = 0                       µ          2.5%
Standard deviation of consumption growth when j = 0      σ          2%
Jump intensity                                           ω          1.7%
Mean jump size                                           θ          −39%
Standard deviation of jumps                              δ          25%

Using the baseline parametrization, we highlight some issues of the disaster model that will be important in our Monte Carlo experiments. The model is capable of rationalizing the equity premium puzzle assuming a small probability of an event not observed in US data. We may ask what is the relative risk aversion $\alpha^*$ necessary to price the excess

this case, disasters are reflected in prices but not in observed consumption growth. The coefficient $\alpha^*$ satisfies

$$0 = E\left[\beta\left(\frac{C_{t+1}}{C_t}\right)^{-\alpha^*}\left(R_{t+1} - R^f\right) \,\Big|\, j = 0\right] \tag{2.16}$$

Under the standard parametrization we find $\alpha^* = 156$, as the high market excess return can only be justified by a large risk aversion if we do not observe disasters. As emphasized in the literature, although it is possible to price excess market returns using the basic consumption model, it cannot be done with acceptable risk aversion coefficients. In the disaster model, $\alpha^*$ will be high because the equity premium is the compensation for a risk that exists, and therefore affects asset prices, but is not observed.
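The value of $\alpha^*$ can be reproduced for the consumption claim. Conditional on $j = 0$ the jump term vanishes, so 2.16 reduces to $E[e^{(1-\alpha^*)w}]/E[e^{-\alpha^* w}] = (P_t/C_t)\,R^f$, with the right-hand side priced under the true $\alpha$. Taking logs gives $\alpha^* = \alpha + \omega\,(B - A)/\sigma^2$, where $A = e^{(1-\alpha)\theta + (1-\alpha)^2\delta^2/2}$ and $B = e^{-\alpha\theta + \alpha^2\delta^2/2}$. This closed form is our own rearrangement under the i.i.d. lognormal assumptions above, not a formula stated in the text; the function names are hypothetical, and $\beta$ and $\mu$ drop out.

```python
import math

# Baseline values from Table 2.1 (beta and mu cancel in the calculation).
ALPHA, MU, SIGMA = 4.0, 0.025, 0.02
OMEGA, THETA, DELTA = 0.017, -0.39, 0.25


def jump_term(s):
    """exp(s*theta + s^2*delta^2/2): the one-jump moment generating function."""
    return math.exp(s * THETA + 0.5 * s ** 2 * DELTA ** 2)


def jump_mgf(s):
    """E[exp(s*z)] for the Poisson mixture of normals."""
    return math.exp(OMEGA * (jump_term(s) - 1.0))


def normal_mgf(s):
    """E[exp(s*w)] for w ~ N(MU, SIGMA^2)."""
    return math.exp(s * MU + 0.5 * s ** 2 * SIGMA ** 2)


def conditional_pricing_error(a):
    """Pricing error of the excess market return given j = 0, scaled by beta."""
    market_leg = normal_mgf(1.0 - a) / (normal_mgf(1.0 - ALPHA) * jump_mgf(1.0 - ALPHA))
    riskfree_leg = normal_mgf(-a) / (normal_mgf(-ALPHA) * jump_mgf(-ALPHA))
    return market_leg - riskfree_leg


# Closed-form root of the conditional Euler equation.
alpha_star = ALPHA + OMEGA * (jump_term(-ALPHA) - jump_term(1.0 - ALPHA)) / SIGMA ** 2

print(f"alpha* = {alpha_star:.2f}")
print(f"pricing error at alpha*:    {conditional_pricing_error(alpha_star):.2e}")
print(f"pricing error at alpha = 4: {conditional_pricing_error(ALPHA):.4f}")
```

The pricing error at the true $\alpha = 4$ is positive, in line with the observation that conditioning on no disasters overrepresents good returns.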

[Figure: Squared pricing errors - Market return. Horizontal axis: α; vertical axis in units of 10^-3.]

Figure 1 plots the squared pricing error for the excess market return as a function of relative risk aversion $\alpha$ conditional on no disasters. Prices are set taking disasters into account, so excess returns satisfy $E\left[\beta\,(C_{t+1}/C_t)^{-\alpha} R^e_{t+1}\right] = 0$. When we take expectations conditional on no disasters, we change the distribution of consumption growth, decreasing the probability of bad events and overrepresenting good returns, so lower values of $\alpha$ generate positive pricing errors. If the risk aversion is increased, though, it is possible to offset this effect as the value of consumption in bad times will be higher, assigning more weight to negative excess returns in the conditional Euler equation. Figure 2 illustrates how $\alpha$ affects the stochastic discount factor.

[Figure: Stochastic discount factor. Horizontal axis: $C_{t+1}/C_t$; vertical axis: $\beta\,(C_{t+1}/C_t)^{-\alpha}$; curves for α = 5, 20, 50, 100.]

A main concern of this paper is how to reweight events so that Euler equations are always satisfied. As we have seen, it can be done by changing $\alpha$ in the stochastic discount factor to control state prices. Another way of thinking about the problem is to fix risk aversion and use distortions of the probability measure that generate correct prices for excess returns, that is, that induce the Euler equation to hold when expectations are taken under the new measure. This idea motivates the estimators described in the next section.

2.3 Generalized Empirical Likelihood

We are interested in the small sample properties of the estimation of risk aversion in an economy with rare disasters using Generalized Empirical Likelihood (GEL) estimators. In this section, we describe the GEL class and compare it to an equivalent Minimum Discrepancy family of estimators.

Suppose the data random vector $z$ has $T$ independent and identically distributed observations $\{z_t\}_{t=1}^{T}$. The parameter to be estimated is $\theta^*$, a $p$-dimensional vector in

$$E[g(z,\theta^*)] = 0 \tag{2.17}$$

where $g(z,\theta)$ is a possibly nonlinear function of data and parameters whose values are in $\mathbb{R}^m$. In our framework, the parameter of interest is the relative risk aversion $\alpha$, data is given by consumption growth and returns, and the moment conditions are the Euler equations $E\left[\beta\,(C_{t+1}/C_t)^{-\alpha} R^e_{t+1}\right] = 0$, one for each asset.

We first describe the estimators used in this paper in a Minimum Discrepancy (MD) setting, which is easier to interpret, and show they are equivalent to the computationally simpler GEL class. For a fixed convex function $\phi(\cdot)$, an MD estimator is the solution to

$$\hat\theta_{MD} = \arg\min_{\theta\in\Theta}\; \min_{\{\pi_t\}}\; \frac{\sum_t \phi(\pi_t)}{T} \tag{2.18}$$

$$\text{s.t.}\quad T^{-1}\sum_t \pi_t = 1 \tag{2.19}$$

$$T^{-1}\sum_t \pi_t\, g(z_t,\theta) = 0 \tag{2.20}$$

For each $\theta \in \Theta$, if the sequence $\{\pi_t(\theta)\}$ solving the inner minimization problem is nonnegative, it can be interpreted as the smallest distortion to the empirical measure so that the moment condition $T^{-1}\sum_t \pi_t\, g(z_t,\theta) = 0$ is satisfied. The MD estimator finds the $\theta$ that generates the lowest discrepancy necessary to satisfy the Euler equation.

The function $\phi(\cdot)$ is restricted to be convex, so any sequence $\{\pi_t\}_{t=1}^{T}$ with unitary mean must satisfy

$$\frac{\sum_t \phi(\pi_t)}{T} \geq \phi\left(\frac{\sum_t \pi_t}{T}\right) = \phi(1) \tag{2.21}$$

The equation above means that if the empirical measure satisfies the Euler equation for some $\bar\theta \in \Theta$, then 2.18 is solved with $\hat\theta_{MD} = \bar\theta$ and $\pi_t = 1$ for all $t$, that is, if no distortion

idea behind the estimator is that if the sample moment condition converges in probability to $E[g(z,\theta)]$ for every $\theta \in \Theta$, then under certain conditions the MD estimator $\hat\theta_{MD}$ should converge in probability to $\theta^*$ because solutions near the no distortions case become feasible as $T \to \infty$.

Solving the MD problem can be computationally demanding as minimization is performed in the space $\mathbb{R}^{p+T}$. Newey and Smith (2004) show that if $\phi(\cdot)$ belongs to the Cressie and Read (1984) family of discrepancies, the solution for the dual of the optimization problem 2.18 will be a Generalized Empirical Likelihood (GEL) estimator, which is easier to compute and also has some of the most commonly used MD estimators as special cases. A function $\phi(\cdot)$ is in the Cressie-Read family if

$$\phi(\pi) = \frac{\pi^{1+\gamma} - 1}{\gamma(1+\gamma)} \tag{2.22}$$

where $\gamma \in \mathbb{R}$. For $\gamma = -1$ or $\gamma = 0$, the values of $\phi(\pi)$ are given by their limits $-\log(\pi)$ and $\pi\log(\pi)$, respectively, the latter holding for distortions with unitary mean.
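The family in 2.22 and its two limiting members can be evaluated numerically. The sketch below computes the average discrepancy $T^{-1}\sum_t \phi(\pi_t)$ for an illustrative, made-up set of weights with unitary mean; the function name is hypothetical. It treats $\gamma = -1$ as $-\log(\pi)$ and $\gamma = 0$ as $\pi\log(\pi)$, and compares each with the generic formula evaluated at a nearby $\gamma$. The $\gamma \to 0$ limit relies on the weights averaging to one, since the term $(\pi - 1)/\gamma$ only cancels in the sum.

```python
import math


def mean_discrepancy(weights, gamma):
    """Average Cressie-Read discrepancy (equation 2.22) of weights with mean one."""
    count = len(weights)
    if gamma == 0.0:
        # Limit as gamma -> 0 for mean-one weights: pi * log(pi).
        return sum(p * math.log(p) for p in weights) / count
    if gamma == -1.0:
        # Limit as gamma -> -1: -log(pi).
        return sum(-math.log(p) for p in weights) / count
    return sum((p ** (1.0 + gamma) - 1.0) / (gamma * (1.0 + gamma))
               for p in weights) / count


weights = [0.5, 0.8, 1.0, 1.2, 1.5]   # illustrative distortions, mean equal to one
uniform = [1.0] * 5                   # the undistorted empirical measure

for gamma in (-1.0, -1.0 + 1e-6, 0.0, 1e-6, 1.0):
    print(f"gamma = {gamma:+.6f}: discrepancy = {mean_discrepancy(weights, gamma):.6f}")
print("undistorted measure:", [mean_discrepancy(uniform, g) for g in (-1.0, 0.0, 1.0)])
```

Every member assigns zero discrepancy to the undistorted measure and a positive value to any other mean-one sequence, which is the content of inequality 2.21; for $\gamma = 1$ the function is the quadratic $(\pi^2 - 1)/2$.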

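The MD solution $(\hat\theta, \hat\pi_t)$, when specialized for Cressie-Read functions, can be recovered by finding the solution to the saddle point problem:

$$\hat\theta_{GEL} = \arg\min_{\theta\in\Theta}\; \max_{\lambda\in\Lambda(\theta)}\; -\frac{1}{T}\sum_t \frac{\left(1 + \gamma\lambda' g(z_t,\theta)\right)^{\frac{\gamma+1}{\gamma}}}{\gamma+1} \tag{2.23}$$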

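To show that the problems are equivalent, write the Lagrangian of the MD problem 2.18, where $\alpha$ and $\mu$ now denote the Lagrange multipliers on the moment and unit-mean constraints:

$$L(\theta,\pi,\alpha,\mu) = \sum_t \frac{\pi_t^{\gamma+1} - 1}{T\gamma(\gamma+1)} + \mu\left(1 - \frac{\sum_t \pi_t}{T}\right) - \alpha'\,\frac{\sum_t \pi_t\, g(z_t,\theta)}{T} \tag{2.24}$$

The solution $(\hat\theta, \hat\pi_t, \hat\alpha, \hat\mu)$ must satisfy the first order conditions:

$$[\pi]:\quad \frac{(\hat\pi_t)^{\gamma}}{\gamma} - \hat\mu - \hat\alpha' g(z_t,\hat\theta) = 0 \tag{2.25}$$

$$[\theta]:\quad \hat\alpha' \sum_t \hat\pi_t\, \frac{\partial}{\partial\theta} g(z_t,\hat\theta) = 0 \tag{2.26}$$

$$[\alpha]:\quad \sum_t \hat\pi_t\, g(z_t,\hat\theta) = 0 \tag{2.27}$$

$$[\mu]:\quad \frac{\sum_t \hat\pi_t}{T} = 1 \tag{2.28}$$

Solving for $\hat\pi_t$, we find:

$$\hat\pi_t = \left(\gamma\hat\mu + \gamma\hat\alpha' g(z_t,\hat\theta)\right)^{1/\gamma} \tag{2.29}$$

Distortions are restricted to have unitary mean, so

$$\frac{\sum_t \hat\pi_t}{T} = \frac{\sum_t (\gamma\hat\mu)^{1/\gamma}\left(1 + \gamma\left(\frac{\hat\alpha}{\gamma\hat\mu}\right)' g(z_t,\hat\theta)\right)^{1/\gamma}}{T} \tag{2.30}$$

$$\phantom{\frac{\sum_t \hat\pi_t}{T}} = 1 \tag{2.31}$$

We find $\hat\mu$ implicitly from

$$(\gamma\hat\mu)^{-1/\gamma} = \frac{\sum_t \left(1 + \gamma\left(\frac{\hat\alpha}{\gamma\hat\mu}\right)' g(z_t,\hat\theta)\right)^{1/\gamma}}{T} \tag{2.32}$$

so distortions can now be written as

$$\hat\pi_t = \frac{T\left(1 + \gamma\left(\frac{\hat\alpha}{\gamma\hat\mu}\right)' g(z_t,\hat\theta)\right)^{1/\gamma}}{\sum_s \left(1 + \gamma\left(\frac{\hat\alpha}{\gamma\hat\mu}\right)' g(z_s,\hat\theta)\right)^{1/\gamma}} \tag{2.33}$$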

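The first order conditions for the GEL problem, where the optimum is denoted by $(\bar\theta, \bar\lambda)$, are

$$\sum_t \left(1 + \gamma\bar\lambda' g(z_t,\bar\theta)\right)^{1/\gamma} \bar\lambda'\, \frac{\partial}{\partial\theta} g(z_t,\bar\theta) = 0 \tag{2.34}$$

$$\sum_t \left(1 + \gamma\bar\lambda' g(z_t,\bar\theta)\right)^{1/\gamma} g(z_t,\bar\theta) = 0 \tag{2.35}$$

To recover the MD solution $(\hat\theta, \hat\pi_t)$ from the GEL problem, we find $(\bar\theta, \bar\lambda)$ satisfying the two equations above and note that for $\hat\theta = \bar\theta$ and

$$\hat\pi_t = \frac{T\left(1 + \gamma\bar\lambda' g(z_t,\bar\theta)\right)^{1/\gamma}}{\sum_s \left(1 + \gamma\bar\lambda' g(z_s,\bar\theta)\right)^{1/\gamma}} \tag{2.36}$$

the MD problem's first order conditions are also satisfied.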
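For a single moment condition and a fixed parameter value, the implied distortions can be computed in a few lines. The sketch below takes the exponential tilting case, the $\gamma \to 0$ limit in which $(1 + \gamma\lambda g)^{1/\gamma}$ becomes $\exp(\lambda g)$, and finds $\lambda$ by Newton's method on the first order condition $\sum_t \exp(\lambda g_t)\, g_t = 0$. The moment values are made up for illustration, the function name is hypothetical, and Newton's method is our choice of solver rather than the one used in the paper.

```python
import math


def exponential_tilting_weights(moments, tol=1e-12, max_iter=100):
    """Solve sum_t exp(lam * g_t) * g_t = 0 and return (lam, mean-one weights)."""
    if min(moments) >= 0.0 or max(moments) <= 0.0:
        # Strictly positive weights cannot average same-signed moments to zero.
        raise ValueError("moments must take both signs for a solution to exist")
    lam = 0.0
    for _ in range(max_iter):
        tilts = [math.exp(lam * g) for g in moments]
        gradient = sum(t * g for t, g in zip(tilts, moments))
        hessian = sum(t * g * g for t, g in zip(tilts, moments))
        step = gradient / hessian      # Newton step on the convex sum of exp(lam * g_t)
        lam -= step
        if abs(step) < tol:
            break
    tilts = [math.exp(lam * g) for g in moments]
    scale = len(moments) / sum(tilts)  # normalize so the weights average to one
    return lam, [scale * t for t in tilts]


# Illustrative values of the moment function: mostly positive, one large negative.
moments = [0.05, 0.08, 0.03, 0.06, -0.20, 0.04]
lam, weights = exponential_tilting_weights(moments)

print(f"lambda = {lam:.4f}")
print("weights:", [round(w, 4) for w in weights])
print(f"reweighted moment: {sum(w * g for w, g in zip(weights, moments)):.2e}")
```

The single negative observation receives a weight above one and the positive ones below one, which is how the distortion restores the moment condition without changing the parameter.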

GEL estimators share some of the asymptotic properties of the efficient GMM estimator. In fact, all the estimators are asymptotically equivalent and they differ only in their higher order properties. For reference, we present conditions for consistency and asymptotic normality of GEL estimators, stated and proved in Newey and Smith (2004).

If the following assumptions are satisfied, then the GEL estimator $\hat\theta_{GEL}$ defined in 2.23 converges in probability to $\theta^*$:

Assumption 1. (a) $\theta^*$ is the unique solution to $E[g(z,\theta)] = 0$; (b) $\Theta$ is compact; (c) $g(z,\theta)$ is continuous at each $\theta \in \Theta$ with probability approaching one; (d) $E\left[\sup_{\theta\in\Theta}\|g(z,\theta)\|^{\alpha}\right] < \infty$ for some $\alpha > 2$; (e) $E\left[g(z,\theta^*)g(z,\theta^*)'\right]$ is nonsingular.

Assumption (a) is a standard identification restriction ensuring that for any $\theta \neq \theta^*$ in $\Theta$, the moment conditions can only be matched by distorting the real measure. Compactness of $\Theta$ is a strong assumption in our framework as the relative risk aversion, although bounded below by zero, can take any nonnegative value. Although compactness is unattractive, alternatives such as the concavity of the objective function with respect to $\theta$ do not apply either, so we keep assumption (b), assuming that $\Theta$ is bounded by a large enough constant. Condition (c) is common in the literature for extremum estimators, easily applicable in practice. Finally, assumption (d) is a regularity condition regarding the boundedness of the moment function and condition (e) requires that the moment conditions are not redundant.

Asymptotic normality requires a new set of assumptions, besides Assumption 1, that we present below:

Assumption 2. (a) $\theta^* \in \mathrm{int}(\Theta)$; (b) $g(z,\theta)$ is continuously differentiable in a neighborhood $N$ of $\theta^*$ and $E\left[\sup_{\theta\in N}\|\partial g(z,\theta)/\partial\theta\|\right] < \infty$; (c) $\mathrm{rank}\left(E[\partial g(z,\theta^*)/\partial\theta]\right) = p$.

Assumptions (a) and (b) are required so that $\hat\theta$ satisfies the first order conditions and we are able to perform a Taylor expansion around the true parameter $\theta^*$, along with a technical condition allowing us to interchange the order of integration and differentiation. Condition (c) guarantees that the asymptotic variance is well defined.

Define $G = E\left[\partial g(z,\theta^*)/\partial\theta\right]$ and $\Omega = E\left[g(z,\theta^*)g(z,\theta^*)'\right]$. Under assumptions 1 and 2, the asymptotic distribution of $\hat\theta$ is

$$\sqrt{T}\,(\hat\theta - \theta^*) \xrightarrow{d} N\left(0, (G'\Omega^{-1}G)^{-1}\right) \tag{2.37}$$

2.4 Simulations and Results

We explore small sample properties of GEL estimators when applied to disaster models. Specifically, we focus on the estimation of risk aversion and how the implied probabilities derived from the GEL procedure may help to uncover the true parameter $\alpha$. Estimation is problematic when disasters are not observed in a sample, but are reflected in asset prices. As we have seen, the risk aversion necessary to price excess returns when we do not observe disasters is too high, so we ask whether distorting the empirical measure can help alleviate this bias problem.

For each test to be described below, we simulate the disaster economy outlined in this paper 1000 times and discard samples with disaster events. Each simulation contains 100 observations, roughly reproducing the sample size for yearly US data used in the literature. We estimate risk aversion by applying GEL estimators to the Euler equations for excess returns, specializing in three members of the GEL family: empirical likelihood (EL, $\gamma = -1$), exponential tilting (ET, $\gamma = 0$) and the continuous updating estimator (CUE, $\gamma = 1$).

The estimators chosen for our exercise represent the most commonly used members of the GEL family and cover interesting values of γ. As shown in Almeida and Garcia (forthcoming), the contribution of different moment conditions of π to the measurement of discrepancies is affected by the choice of γ. We explore the consequences of this observation for the distribution of induced distortions in the disaster model environment.

To perform the optimization required to estimate the model, we split the GEL procedure 2.23 into two parts. We first create a function that computes, for each risk aversion coefficient, the distortion {πt(α)} with minimum discrepancy. This function solves what will be called the inner problem, a standard concave optimization problem where we search for a vector λ satisfying the first order conditions 2.34. To find the risk aversion minimizing the overall discrepancy, we solve what we call the outer problem, generally more complicated as the objective function is not necessarily concave.
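The split into an inner and an outer problem can be sketched for the exponential tilting case with a single excess return. This is a minimal illustration under stated assumptions: the function names and the data are hypothetical, and Newton's method and a grid search are our own choices here, not necessarily the numerical routines used for the results in the text.

```python
import math


def et_inner(g, iters=100):
    """Inner problem for exponential tilting with one moment condition.

    Minimizes mean(exp(lam * g_t)) over lam by Newton's method and returns
    lam together with the implied probabilities p_t (they sum to one).
    """
    lam = 0.0
    for _ in range(iters):
        w = [math.exp(lam * x) for x in g]
        grad = sum(wi * x for wi, x in zip(w, g))
        hess = sum(wi * x * x for wi, x in zip(w, g))
        lam -= grad / hess
    w = [math.exp(lam * x) for x in g]
    total = sum(w)
    return lam, [wi / total for wi in w]


def et_outer(growth, excess, grid):
    """Outer problem: grid search for the alpha with minimum discrepancy."""
    best = None
    for alpha in grid:
        g = [c ** (-alpha) * r for c, r in zip(growth, excess)]
        if min(g) >= 0 or max(g) <= 0:
            continue  # no positive probabilities can set the moment to zero
        _, p = et_inner(g)
        # Kullback-Leibler discrepancy between p and the uniform weights 1/T.
        disc = sum(pi * math.log(len(p) * pi) for pi in p)
        if best is None or disc < best[1]:
            best = (alpha, disc)
    return best


# Hypothetical data: gross consumption growth and excess returns.
growth = [1.03, 0.98, 1.05, 1.01]
excess = [0.10, -0.06, 0.15, 0.02]
best_alpha, best_disc = et_outer(growth, excess, grid=range(0, 11))
```

The `continue` branch covers samples in which the moment function has the same sign in every period, where no positive set of probabilities can satisfy the Euler equation.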

It is common, under the baseline parametrization outlined in section 2.2, that no solution to the GEL problem exists. A solution for the inner problem with fixed risk aversion $\bar\alpha$ must satisfy the Euler equation

$$\sum_t \pi_t(\bar\alpha)\left(\frac{C_{t+1}}{C_t}\right)^{-\bar\alpha} R^e_t = 0 \tag{2.38}$$

If a sample does not contain at least one disaster, the model frequently generates positive excess returns in every period, as asset dividends are high with respect to their prices. Consumption growth is always positive because it is a lognormal variable, so it is not possible for a nonnegative $\pi_t(\bar\alpha)$ to satisfy equation 2.38. For EL and ET, the inner problem cannot be solved in this case because any solution $\pi_t(\bar\alpha)$ must be nonnegative. The CUE problem, though, will have a solution for the inner problem, as negative distortions are permitted, but commonly does not have a solution for the outer problem. If events are allowed to have negative probabilities, then two events of the type $\beta(C_{t+1}/C_t)^{-\alpha}$ can diverge to infinity but compensate each other and still satisfy the Euler equation. This happens in most of our simulations when there are no disasters in the sample, as the objective function will always decrease as α goes to infinity.

In our first experiment, we use two assets differing only in their volatility and estimate α using EL, ET and the efficient GMM. As explained above, the CUE does not behave well in this environment, so we chose not to include it in our results. The parameters defining the dividend processes are η = (1,1), ρ = (.6,.6), φ = (1,1) and σv = (2/3 σ, 3/2 σ), so their only difference is in their volatility. Table 2.2 shows the median and standard deviation of the estimated relative risk aversion $\hat\alpha$. As we see, the upward bias problem is less severe for the GEL estimators used in this exercise than for the efficient GMM. The standard deviation of the GMM estimator is lower, but it is centered at a very high coefficient of risk aversion.

Table 2.2: Estimating α

                       Median   Standard deviation
$\hat\alpha_{EL}$      15.03    30.95
$\hat\alpha_{ET}$      26.11    37.32
$\hat\alpha_{GMM}$     76.32    15.93

To understand how EL and ET estimators distort the empirical probabilities in this setting, we examine the optimal λ in each simulation by normalizing its first component to one. We take the median weights λ/λ(1) that the optimal distortion gives to each asset and present them in Table 2.3. As we can see, both estimators give more weight to the first, less volatile asset.

Table 2.3: Optimal weights ζ = λ/λ(1)

                        ζ(1)   ζ(2)
Empirical Likelihood    1      0.67

Our next exercise shares the same structure as the above, but now we increase the volatility of the normal component of consumption growth for one asset. Using this setting, we can investigate how the optimal weight for each asset behaves when the extra volatility comes from a shock correlated with consumption. The parameters used are η = (1,1), ρ = (.6,.6), σv = (σ,σ) and φ = (1,1.5). Table 2.4 shows that results are similar to the above, but estimates of risk aversion tend to be higher. This is natural because we are increasing not only the volatility of dividend growth, but also the covariance with consumption growth, so the equity premium is higher.

Table 2.4

                       Median   Standard deviation
$\hat\alpha_{EL}$      17.93    39.98
$\hat\alpha_{ET}$      29.33    44.47
$\hat\alpha_{GMM}$     86.45    20.19

Table 2.5 shows that the average weight for each asset in the GEL formulation inverts in this new parametrization. Now both estimators assign a larger weight to the asset with higher covariance with consumption growth. In the previous parametrization, the volatility was different for each asset, but the covariance with consumption growth was the same and the source of extra volatility was an idiosyncratic shock.

Table 2.5: Optimal weights ζ = λ/λ(1)

                        ζ(1)   ζ(2)
Empirical Likelihood    0.25   1
Exponential Tilting     0.46   1

Our final exercise explores how these estimators may change when we add an extra asset. Is the new information relevant for the estimation of relative risk aversion? We chose a diverse set of assets, parametrized by η = (1,1,.7), ρ = (1,.4,.8), σv = (0,σ,σ) and φ = (1,1,1). The first asset corresponds to the consumption claim asset. As we can see in Table 2.6, the additional asset does not substantially change the median estimated risk aversion, but it has a large impact on the standard deviation of the estimators.

Table 2.6

                       Median   Standard deviation
$\hat\alpha_{EL}$      19.31    16.87
$\hat\alpha_{ET}$      27.33    21.85
$\hat\alpha_{GMM}$     84.32    12.49

2.5 Conclusion

Disaster models are important contributions to the equity premium puzzle literature. Although they can explain the equity premium in calibration exercises, they are a challenge for the estimation not only of the full disaster model, but also of some of its key parameters by semiparametric methods. If we want to use Euler equations to estimate the basic representative agent consumption model, which estimator has the best small sample properties if we accept that the rare disaster hypothesis is true?

Our simulations suggest that the empirical likelihood estimator has smaller bias compared to exponential tilting and the efficient GMM estimator. We also argue that the continuous updating estimator often has no solution for its associated minimization program, which becomes a potential problem if we want to compare results from a large set of samples, as in comparisons of international data and estimates of relative risk aversion. Finally, our results suggest that GEL estimators assign more weight to assets with a higher covariance with consumption.


Bibliography

Caio Almeida and René Garcia. Economic implications of nonlinear pricing kernels. Management Science, forthcoming.

David Backus, Mikhail Chernov, and Ian Martin. Disasters implied by equity index options. Journal of Finance, 66(6):1969-2012, 2011.

Robert Barro and Tao Jin. On the size distribution of macroeconomic disasters. Econometrica, 79(5):1567-1589, 2011.

Robert J. Barro. Rare disasters and asset markets in the twentieth century. The Quarterly Journal of Economics, 121(3):823-866, 2006.

John Campbell. Consumption-based asset pricing. Handbook of the Economics of Finance, pages 803-887, 2003.

Noel Cressie and Timothy Read. Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society, 46(3):440-464, 1984.

Xavier Gabaix. Variable rare disasters: A tractable theory of ten puzzles in macro-finance. American Economic Review, 98(2):64-67, 2008.

Xavier Gabaix. Variable rare disasters: An exactly solved framework for ten puzzles in macro-finance. Quarterly Journal of Economics, 127(2):645-700, 2012.

Lars Hansen and Ravi Jagannathan. Implications of security market data for models of dynamic economies. Journal of Political Economy, 99(2):225-262, 1991.

Lars Hansen and Kenneth Singleton. Generalized instrumental variables estimation of nonlinear rational expectations models. Econometrica, 50(5):1269-1286, 1982.

Lars Hansen, John Heaton, and Amir Yaron. Finite-sample properties of some alternative GMM estimators. Journal of Business and Economic Statistics, 14(3):262-280, 1996.

Christian Julliard and Anisha Ghosh. Can rare events explain the equity premium puzzle? The Review of Financial Studies, 25(10):3037-3076, 2012.

Yuichi Kitamura and Michael Stutzer. An information-theoretic alternative to generalized method of moments estimation. Econometrica, 65(4):861-874, 1997.

Martin Lettau and Sydney Ludvigson. Euler equation errors. Review of Economic Dynamics, 12:255-283, 2009.

Robert Lucas. Asset prices in an exchange economy. Econometrica, 46(6):1429-1445, 1978.

Ian Martin. Consumption-based asset pricing with higher cumulants. Review of Economic Studies, 80:745-773, 2013.

Rajnish Mehra and Edward Prescott. The equity premium: A puzzle. Journal of Monetary Economics, 15:145-161, 1985.

Whitney Newey and Richard Smith. Higher order properties of GMM and generalized empirical likelihood estimators. Econometrica, 72(1):219-255, 2004.

Art Owen. Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75:237-249, 1988.

Thomas Rietz. The equity risk premium: A solution. Journal of Monetary Economics, 22:117-131, 1988.

Richard Smith. Alternative semi-parametric likelihood approaches to generalised method of moments. The Economic Journal, 107(441):503-519, 1997.


Chapter 3 Estimation of Misspecified Asset Pricing Models with Multiple Entropic Estimators

Abstract

We develop an alternative method to compute the misspecification degree of an asset pricing model by taking into account the smallest multiplicative correction to the stochastic discount factor necessary to assign correct asset prices. These multiplicative corrections can also be described as probability distortions to the real measure associated with the model. Estimators obtained by minimizing our misspecification measure belong to the Generalized Empirical Likelihood class. We obtain consistency and asymptotic normality results for GEL estimators assuming the model is misspecified. To test our methodology, we conduct Monte Carlo experiments using the habit model of Campbell and Cochrane (1999), where a misspecified stochastic discount factor is estimated. Results indicate that, although different distance measures have a significant impact on the estimation of probability distortions with minimum discrepancy, they have a small impact on the estimation of misspecification degrees and the relative risk aversion for this model.


3.1 Introduction

Asset pricing models are designed to capture features of reality to understand how asset prices are related to their future payoffs. However, these models are only approximations and in general we cannot price assets without errors. If we assume that pricing errors are inherent to asset pricing models, we need measures of misspecification either to improve the model or to compare it to competing ones. Hansen and Jagannathan (1997) took advantage of the stochastic discount factor framework in asset pricing models to develop a measure of misspecification. The idea is to compute the least squares distance from the proxy for a stochastic discount factor associated with a model to the set of all stochastic discount factors that price assets without errors. Almeida and Garcia (2012) extend this measure using a general class of convex functions to determine the distance between the model and the set of stochastic discount factors, taking into account different moments of the model to determine its misspecification. The extension is justified by growing evidence that higher moments of the stochastic discount factor and payoffs, such as skewness and kurtosis, are important in explaining asset prices, see e.g. Harvey and Siddique (2000) and Dittmar (2002).

This paper develops an alternative method to compute misspecification measures by taking into account the smallest multiplicative correction to the stochastic discount factor necessary to price assets correctly. These measures are related to Almeida and Garcia (2012), whose misspecification assessments are equivalently defined by the size of the smallest additive correction to the model. In our framework, we are able to give an alternative probabilistic interpretation of the misspecification measure by calculating the distance between the probability measure of the model and alternative probability measures under which the model is correctly specified. The distance between these two measures is robustly computed using different moments of the Radon-Nikodym derivative that connects them.

Misspecification measures can be used to estimate parameters of the asset pricing model. If parameters are estimated using the Hansen-Jagannathan distance, then the associated estimators belong to the Generalized Method of Moments (GMM) class, Hansen (1982), as the distance measure turns out to be a quadratic function of pricing errors determined by the matrix of second moments of payoffs. Estimators obtained by minimizing our misspecification measure are in the Generalized Empirical Likelihood (GEL) class, Smith (1997), a family of semiparametric estimators identified by moment conditions. This family has many important special cases, such as the Continuous Updating Estimator (CUE), Hansen, Heaton, and Yaron (1996), the Empirical Likelihood estimator (EL), Owen (1988), and the Exponential Tilting estimator (ET), Kitamura and Stutzer (1997). Although these estimators share the asymptotic properties of the efficient GMM, they have better small sample properties, see e.g. Hansen, Heaton, and Yaron (1996), Imbens (1997). In this paper, we obtain consistency and asymptotic normality results for GEL estimators assuming the model is misspecified. The estimator converges to a pseudo true parameter that minimizes a misspecification measure in our family of discrepancies.

To test our methodology, we conduct Monte Carlo experiments using the habit model of Campbell and Cochrane (1999). This model is characterized by a representative consumer concerned with the difference between its current consumption level and a function of past consumption levels interpreted as habits. The stochastic discount factor obtained from this model is the product of a standard pricing kernel obtained from CRRA preferences and a habit component. We apply the estimation methodology associated with our discrepancy measures to a misspecified stochastic discount factor obtained by not taking into account the habit component. Results indicate that, although different distance measures have a significant impact on the estimation of probability distortions with minimum discrepancy, they have a small impact on the estimation of misspecification degrees and the relative risk aversion of the model.


3.2 Environment

In this section, we describe an abstract setting for an economy with asset markets by characterizing asset payoffs and prices. It is well known, e.g. see Duffie (2001), that a simple structure can guarantee, under some restrictions on prices and payoffs, the existence of a stochastic discount factor. Asset pricing models generate proxies for stochastic discount factors that will be used as testable restrictions to assess misspecification.

Suppose an economy where $K$ assets are traded at time $t$ and payoffs are received at time $t+1$. We fix a probability space $(\Omega, \mathcal{F}, \mu)$ and denote by $L^2$ the space of all random variables with finite second moments. Payoffs for the basis assets are denoted by the $K$-dimensional random vector $x$, where each coordinate is an element of $L^2$, and we assume that $E(xx')$ is a nonsingular matrix. Feasible payoffs in this economy are represented by the set of linear combinations of basis assets $P = \{\alpha'x \mid \alpha \in \mathbb{R}^K\}$. As a consequence of the nonsingularity of $E(xx')$, each payoff in $P$ is characterized by a unique linear combination of basis assets.

Assume the law of one price is valid and there is a vector $q$ of prices for the basis assets, so we can form a pricing functional $\varrho : P \to \mathbb{R}$ by setting

$$\varrho(\alpha'x) = \alpha'q \tag{3.1}$$

A stochastic discount factor is a random variable $m \in L^2$ such that

$$q = E(mx) \tag{3.2}$$

so $m$ prices any asset in this economy by taking the mean of discounted future payoffs. The existence of a stochastic discount factor in this environment is guaranteed by the Riesz representation theorem, using the fact that $\varrho(\cdot)$ is a linear and continuous functional.
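To make the pricing equation concrete, the sketch below builds one stochastic discount factor in a small discrete economy. It uses the standard construction $m = x'E(xx')^{-1}q$, which lies in the payoff space $P$; this particular construction and all numbers are illustrative assumptions and are not taken from the text.

```python
# Hypothetical three-state economy with two basis assets.
probs = [0.3, 0.4, 0.3]
x = [[1.0, 1.0, 1.0],   # payoff of a risk-free asset in each state
     [0.8, 1.0, 1.3]]   # payoff of a risky asset in each state
q = [0.95, 0.98]        # prices of the two basis assets


def expect(values):
    """Expectation of a state-contingent variable under probs."""
    return sum(p * v for p, v in zip(probs, values))


# Second-moment matrix E(xx'), assumed nonsingular as in the text.
M = [[expect([x[i][s] * x[j][s] for s in range(3)]) for j in range(2)]
     for i in range(2)]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
M_inv = [[M[1][1] / det, -M[0][1] / det],
         [-M[1][0] / det, M[0][0] / det]]

# Coefficients beta = E(xx')^{-1} q, so that m = beta'x lies in P.
beta = [sum(M_inv[i][j] * q[j] for j in range(2)) for i in range(2)]
m = [beta[0] * x[0][s] + beta[1] * x[1][s] for s in range(3)]

# Check the pricing equation q = E(mx) asset by asset.
prices = [expect([m[s] * x[i][s] for s in range(3)]) for i in range(2)]
```

By construction $E(mx) = E(xx')E(xx')^{-1}q = q$, so `prices` reproduces `q` up to rounding error.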

If a stochastic discount factor satisfies equation 3.2, then any feasible payoff $\alpha'x \in P$ satisfies

$$\varrho(\alpha'x) = \alpha'q = E(m\alpha'x) \tag{3.3}$$

Asset pricing models generate a proxy $y$ for a stochastic discount factor used to compute prices. For our purposes, all relevant information about an asset pricing model is encoded in $y$ and its associated pricing functional $\varrho_y(\cdot)$, defined by

$$\varrho_y(\alpha'x) = E(y\alpha'x) \tag{3.4}$$

Economists build models to explain some features of reality, offering approximations for the object of study. These models, however, cannot incorporate all complexities of the real world and are expected to be misspecified. Even if we accept models as approximations, we need measures of the degree of their misspecification, either to improve a particular model or to compare it to another one.

We take advantage of the stochastic discount factor framework in asset pricing models to develop measures of misspecification. To be concrete, a model $y \in L^2$ is said to be misspecified if it does not correctly price payoffs in $P$, so the associated pricing functional $\varrho_y(\cdot)$ is not equal to $\varrho(\cdot)$. This is equivalent to saying that $y$ is not a stochastic discount factor.

3.2.1 Measuring misspecification

To measure misspecification of asset pricing models, we define the concept of multiplicative corrections. For a given proxy $y$ for a stochastic discount factor, we say that the random variable $\pi$ satisfying $E[\pi] = 1$ is a multiplicative correction if

$$E[\pi y x] = q \tag{3.5}$$

for every $x \in P$, so $\pi y$ is a stochastic discount factor.

By restricting our multiplicative corrections to have unit mean, we are able to provide a probability-based interpretation for them. A positive multiplicative correction $\pi$ induces a new probability measure under which the model $y$ is correctly specified. Expectations taken under the new probability measure, denoted by $E^\pi[\cdot]$, satisfy

$$E^\pi[yx] = E[\pi y x] = q \tag{3.6}$$

so a multiplicative correction can be seen either as a change in the discount rates of future events or in the probabilities of those events happening.
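A small discrete example may help fix ideas. With three states and two basis assets, the two pricing restrictions plus the unit-mean restriction form three linear equations in the three distorted probabilities, so the correction is unique. The economy, the proxy `y` and the helper functions below are hypothetical and chosen only for illustration.

```python
def det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a w = b by Cramer's rule."""
    d = det3(a)
    solution = []
    for k in range(3):
        # Replace column k of a by b.
        ak = [[b[i] if j == k else a[i][j] for j in range(3)] for i in range(3)]
        solution.append(det3(ak) / d)
    return solution


# Hypothetical three-state economy (three equations, three unknowns).
probs = [0.3, 0.4, 0.3]
x = [[1.0, 1.0, 1.0], [0.8, 1.0, 1.3]]   # basis payoffs by state
q = [0.95, 0.98]                          # basis prices
y = [1.0, 0.95, 0.9]                      # proxy SDF; it misprices the second asset

# Unknowns are the distorted probabilities w_s = probs_s * pi_s.
rows = [[y[s] * x[0][s] for s in range(3)],   # E[pi y x_1] = q_1
        [y[s] * x[1][s] for s in range(3)],   # E[pi y x_2] = q_2
        [1.0, 1.0, 1.0]]                      # E[pi] = 1
w = solve3(rows, q + [1.0])
pi = [w[s] / probs[s] for s in range(3)]
```

Here `w` is the distorted probability measure and `pi` the multiplicative correction; both are positive in this example, so the correction admits the probabilistic interpretation described above.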

There is potentially an infinite number of multiplicative corrections to any asset pricing model, even if the model is correctly specified, when markets are incomplete. A convenient way to measure misspecification in this framework is by calculating the smallest entropy between the probability measure determined by the model and the distorted measures that price the basis assets correctly using $y$. There are multiple ways to calculate entropy, so we follow Almeida and Garcia (2012) and adopt a family of generalized entropy measures constructed from Cressie and Read (1984) discrepancies.

The Cressie-Read discrepancies are indexed by $\gamma \in \mathbb{R}$ and defined by

$$\phi(\pi) = \frac{\pi^{1+\gamma} - 1}{\gamma(\gamma+1)} \tag{3.7}$$

The entropy of a multiplicative correction $\pi$ is given by $E[\phi(\pi)]$. When restricted to positive multiplicative corrections, all functions of the Cressie-Read family are convex and satisfy

$$E[\phi(\pi)] \ge \phi(E[\pi]) = 0 \tag{3.8}$$

so for any $\phi(\cdot)$ the minimum discrepancy is achieved by multiplicative corrections $\pi$ that are equal to one almost everywhere. When the multiplicative correction is not a constant, the choice of $\gamma$ will affect the amount of entropy, providing different results depending on the chosen discrepancy.

The family of discrepancies described above is a generalization of frequently used measures of entropy. For $\gamma = -1$ and $\gamma = 0$, $E[\phi(\pi)]$ is given by its limits, the familiar measures of entropy:

$$\lim_{\gamma\to-1} E\left[\frac{\pi^{\gamma+1} - 1}{\gamma(\gamma+1)}\right] = E[-\log(\pi)] \tag{3.9}$$

$$\lim_{\gamma\to 0} E\left[\frac{\pi^{\gamma+1} - 1}{\gamma(\gamma+1)}\right] = E[\pi\log(\pi)] \tag{3.10}$$
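The entropy of a given correction can be evaluated directly for any member of the family. The sketch below, with a hypothetical function name and a hypothetical two-state correction with unit mean, computes $E[\phi(\pi)]$ and uses the two limiting expressions at $\gamma = -1$ and $\gamma = 0$.

```python
import math


def cressie_read_entropy(pi, probs, gamma):
    """E[phi(pi)] for the Cressie-Read family, with the limits at gamma = -1 and 0."""
    if gamma == -1:
        return sum(p * -math.log(v) for p, v in zip(probs, pi))
    if gamma == 0:
        return sum(p * v * math.log(v) for p, v in zip(probs, pi))
    moment = sum(p * v ** (gamma + 1) for p, v in zip(probs, pi))
    return (moment - 1.0) / (gamma * (gamma + 1))


# Hypothetical two-state correction with unit mean.
probs = [0.5, 0.5]
pi = [0.5, 1.5]
entropies = {g: cressie_read_entropy(pi, probs, g) for g in (-1, 0, 1)}
```

The three values in `entropies` are all positive but differ from each other, which illustrates that the measured size of a given non-constant correction depends on the choice of $\gamma$.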

We define $\delta_\gamma$ as the misspecification degree of a model, given by the minimum discrepancy necessary so that $y$, the proxy for a stochastic discount factor, can correctly price the basis assets. In mathematical terms, $\delta_\gamma$ is the solution to

$$\delta_\gamma = \min_{\pi \ge 0} \frac{E[\pi^{\gamma+1}] - 1}{\gamma(\gamma+1)} \quad \text{s.t.} \quad E[\pi y x] = q, \quad E[\pi] = 1 \tag{3.11}$$

When the model is correctly specified, a feasible solution to problem 3.11, for any Cressie-Read function, is given by $\pi = 1$, so the misspecification degree $\delta_\gamma$ will be equal to zero. If this is not the case, then $\delta_\gamma > 0$ and the selection of $\gamma$ matters for computing the size of the model misspecification.

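It is usually the case that the model $y(\theta)$ depends on a vector of parameters $\theta \in \Theta$, where $\Theta \subset \mathbb{R}^p$ is the parameter set. A natural extension for the misspecification measure in 3.11 is given by

$$\delta_\gamma = \min_{\theta\in\Theta}\,\min_{\pi \ge 0} \frac{E[\pi^{\gamma+1}] - 1}{\gamma(\gamma+1)} \quad \text{s.t.} \quad E[\pi y(\theta) x] = q, \quad E[\pi] = 1 \tag{3.12}$$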

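To characterize the solutions $\pi^*$ and $\theta^*$ in problem 3.12, assume that $\pi^* > 0$ and $\theta^* \in \operatorname{int}\Theta$. Form the Lagrangian:

$$L(\theta, \pi, \alpha, \mu) = E[\phi(\pi)] - \alpha'E[\pi(y(\theta)x - q)] + \mu(1 - E[\pi])$$

The solution $(\theta^*, \pi^*, \alpha^*, \mu^*)$ must satisfy

$$[\pi]: \quad \frac{(\pi^*)^{\gamma}}{\gamma} - \alpha^{*\prime}(y(\theta^*)x - q) - \mu^* = 0 \tag{3.13}$$

$$[\theta]: \quad \alpha^{*\prime} E\left[\pi^* x \frac{\partial y}{\partial\theta}(\theta^*)\right] = 0 \tag{3.14}$$

$$[\alpha]: \quad E[\pi^*(y(\theta^*)x - q)] = 0 \tag{3.15}$$

$$[\mu]: \quad E[\pi^*] = 1 \tag{3.16}$$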

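Solving equation 3.13 for $\pi^*$ we find that

$$\pi^* = \left(\gamma\alpha^{*\prime}(y(\theta^*)x - q) + \gamma\mu^*\right)^{1/\gamma}$$

Using the fact that a multiplicative correction must satisfy $E[\pi^*] = 1$, we have

$$1 = E[\pi^*] = (\gamma\mu^*)^{1/\gamma}\, E\left[\left(\gamma\left(\frac{\alpha^*}{\gamma\mu^*}\right)'(y(\theta^*)x - q) + 1\right)^{1/\gamma}\right]$$

and finally:

$$\pi^* = \frac{\left(\gamma\left(\frac{\alpha^*}{\gamma\mu^*}\right)'(y(\theta^*)x - q) + 1\right)^{1/\gamma}}{E\left[\left(\gamma\left(\frac{\alpha^*}{\gamma\mu^*}\right)'(y(\theta^*)x - q) + 1\right)^{1/\gamma}\right]} \tag{3.17}$$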

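Finding $(\theta^*, \pi^*, \alpha^*, \mu^*)$ that solves equations 3.13-3.16 can be computationally demanding, as the problem has infinite dimension in general. We follow the approach of Newey and Smith (2004) to show that $(\theta^*, \pi^*)$ can be recovered from the solution of a simpler problem,

$$\min_{\theta\in\Theta}\,\sup_{\lambda\in\mathbb{R}^K} -E\left[\frac{(\gamma\lambda'(y(\theta)x - q) + 1)^{\frac{\gamma+1}{\gamma}} - 1}{\gamma+1}\right] \tag{3.18}$$

The first order conditions of a solution $(\bar\theta, \bar\lambda)$ for problem 3.18 are given by

$$E\left[\left(\gamma\bar\lambda'(y(\bar\theta)x - q) + 1\right)^{1/\gamma}\, \bar\lambda' x \frac{\partial y}{\partial\theta}(\bar\theta)\right] = 0 \tag{3.19}$$

$$E\left[\left(\gamma\bar\lambda'(y(\bar\theta)x - q) + 1\right)^{1/\gamma}\, (y(\bar\theta)x - q)\right] = 0 \tag{3.20}$$

Comparing the above equations with 3.14 and 3.15, along with $\pi^*$, we see that $\bar\lambda = \frac{\alpha^*}{\gamma\mu^*}$ and $\bar\theta = \theta^*$ are solutions to the first order conditions above, so we can use the simpler finite dimensional program 3.18 to recover the solution $(\theta^*, \pi^*)$ from the original problem.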
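As a check on the finite dimensional program, the first order condition 3.20 can be obtained by differentiating the integrand of 3.18 with respect to $\lambda$. The shorthand $g = y(\theta)x - q$ is introduced here only for this derivation, and we assume that differentiation and expectation can be interchanged:

```latex
\frac{\partial}{\partial\lambda}
\left[\frac{(\gamma\lambda'g + 1)^{\frac{\gamma+1}{\gamma}} - 1}{\gamma+1}\right]
= \frac{1}{\gamma+1}\cdot\frac{\gamma+1}{\gamma}\,
  (\gamma\lambda'g + 1)^{\frac{1}{\gamma}}\cdot\gamma g
= (\gamma\lambda'g + 1)^{\frac{1}{\gamma}}\, g
```

Taking expectations and setting the result equal to zero gives equation 3.20.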

3.3 Estimation

In general, we do not know the exact probability distribution underlying the stochastic discount factor and payoffs, so we rely on estimation to compute the misspecification degree $\delta_\gamma$ and the parameter vector $\theta^*$ that minimizes it. In this section we propose an estimator for $(\theta^*, \lambda^*)$ and establish its asymptotic properties assuming the model is possibly misspecified. The estimator is computed by replacing population moments in 3.18 by their sample counterparts.

For this section, we represent the proxy for the stochastic discount factor as $y(\theta, z_t)$, where $y(\cdot,\cdot)$ is its time invariant functional form and $z_t$ is a vector containing explanatory variables and payoffs. To simplify notation, we define the $K$-dimensional vector $g(z_t, \theta)$ that will be used to represent moment conditions for estimation by

$$g(z_t, \theta) = y(\theta, z_t)x_t - q \tag{3.21}$$

3.3.1 Generalized Empirical Likelihood

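The estimation problem can be written as

$$\min_{\theta\in\Theta}\,\sup_{\lambda\in\mathbb{R}^K} -\frac{1}{T}\sum_{t=1}^{T}\left[\frac{(\gamma\lambda'g(z_t,\theta) + 1)^{\frac{\gamma+1}{\gamma}} - 1}{\gamma+1}\right] \tag{3.22}$$

To find a solution for 3.22, we apply a two-step procedure. The first step is to find a function $\hat\lambda(\theta)$ such that for each $\theta \in \Theta$:

$$\hat\lambda(\theta) = \operatorname*{argmax}_{\lambda\in\mathbb{R}^K} -\frac{1}{T}\sum_{t=1}^{T}\frac{(\gamma\lambda'g(z_t,\theta) + 1)^{\frac{\gamma+1}{\gamma}} - 1}{\gamma+1} \tag{3.23}$$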

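Problem 3.23 is a standard concave maximization problem, so to obtain $\hat\lambda(\theta)$ it suffices to find the solution to the first order conditions

$$\sum_{t=1}^{T}\left(\gamma\hat\lambda(\theta)'g(z_t,\theta) + 1\right)^{1/\gamma} g(z_t,\theta) = 0_K \tag{3.24}$$

In the second step, we solve the outer minimization in 3.22 given the estimated function $\hat\lambda(\cdot)$:

$$\hat\theta = \operatorname*{argmin}_{\theta\in\Theta} -\frac{1}{T}\sum_{t=1}^{T}\frac{(\gamma\hat\lambda(\theta)'g(z_t,\theta) + 1)^{\frac{\gamma+1}{\gamma}} - 1}{\gamma+1} \tag{3.25}$$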

The minimization problem 3.25 is more delicate, as there are no guarantees that we are optimizing a convex function, so first order conditions can be satisfied by potentially many parameters $\theta \in \Theta$. After finding $\hat\theta$, the estimator for $\lambda^*$ will be given by $\hat\lambda(\hat\theta)$ and we are able to recover estimated distorted probabilities $\{\hat\pi_t\}$ using sample counterparts of equation 3.17:

$$\hat\pi_t = \frac{T\left(\gamma\hat\lambda(\hat\theta)'g(z_t,\hat\theta) + 1\right)^{1/\gamma}}{\sum_{s=1}^{T}\left(\gamma\hat\lambda(\hat\theta)'g(z_s,\hat\theta) + 1\right)^{1/\gamma}} \tag{3.26}$$

The estimator defined above belongs to the class of Generalized Empirical Likelihood (Smith (1997)), a family of semiparametric estimators that generalizes many frequently used estimators in the literature. We review some important special cases of this family that will be used in the Monte Carlo section of this paper.

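When $\gamma = 1$, $\hat\theta$ is equivalent to the continuous updating estimator (CUE) of Hansen, Heaton, and Yaron (1996). To understand the equivalence between the estimators, set $\gamma = 1$ and write the first order condition 3.24 for $\hat\lambda(\theta)$:

$$\sum_{t=1}^{T}\left(\hat\lambda(\theta)'g(z_t,\theta) + 1\right) g(z_t,\theta) = 0_K \tag{3.27}$$

We can obtain an analytical solution for $\hat\lambda(\theta)$ using the equation above:

$$\hat\lambda(\theta) = -\left(\sum_{t=1}^{T} g(z_t,\theta)g(z_t,\theta)'\right)^{-1}\sum_{t=1}^{T} g(z_t,\theta) \tag{3.28}$$

Plugging the solution $\hat\lambda(\theta)$ into the outer minimization problem 3.25, we have

$$\hat\theta = \operatorname*{argmin}_{\theta\in\Theta} -\frac{1}{T}\sum_{t=1}^{T}\frac{(\hat\lambda(\theta)'g(z_t,\theta) + 1)^2 - 1}{2} \tag{3.29}$$

$$= \operatorname*{argmin}_{\theta\in\Theta} -\frac{1}{T}\sum_{t=1}^{T}\frac{1 + 2\hat\lambda(\theta)'g(z_t,\theta) + \hat\lambda(\theta)'g(z_t,\theta)g(z_t,\theta)'\hat\lambda(\theta) - 1}{2} \tag{3.30}$$

$$= \operatorname*{argmin}_{\theta\in\Theta} \frac{1}{T}\left(\sum_{t=1}^{T} g(z_t,\theta)\right)'\left(\sum_{t=1}^{T} g(z_t,\theta)g(z_t,\theta)'\right)^{-1}\left(\sum_{t=1}^{T} g(z_t,\theta)\right) \tag{3.31}$$

where in the last line we eliminated terms that are unnecessary for the minimization, so equation 3.31 becomes the objective function for CUE. Using the GEL formulation to write the CUE problem, we are able to recover the empirical probability distortion that minimizes the associated misspecification measure:

$$\hat\pi_t = \frac{T\left(1 - \left(\sum_{s=1}^{T} g(z_s,\theta)\right)'\left(\sum_{s=1}^{T} g(z_s,\theta)g(z_s,\theta)'\right)^{-1} g(z_t,\theta)\right)}{\sum_{r=1}^{T}\left(1 - \left(\sum_{s=1}^{T} g(z_s,\theta)\right)'\left(\sum_{s=1}^{T} g(z_s,\theta)g(z_s,\theta)'\right)^{-1} g(z_r,\theta)\right)} \tag{3.32}$$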
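For a single moment condition, the closed-form multiplier and the implied CUE distortions can be computed in a few lines. The sketch below uses a hypothetical function name and hypothetical moment values; it is meant to show that, because $\gamma = 1$ places no positivity restriction on the distortions, some of them can be negative.

```python
def cue_weights(g):
    """Implied distortions for CUE (gamma = 1) with a single moment condition."""
    T = len(g)
    # Closed-form multiplier: lambda = -(sum g^2)^{-1} sum g.
    lam = -sum(g) / sum(v * v for v in g)
    raw = [1.0 + lam * v for v in g]
    total = sum(raw)
    # Normalize so that the distortions have sample mean one.
    return [T * r / total for r in raw]


# Hypothetical moment values that are all positive.
g = [1.0, 1.0, 1.0, 1.0, 3.0]
pi = cue_weights(g)
```

In this example the moment function is positive in every period, so the distorted moment condition can only hold if the largest observation receives a negative weight.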

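The exponential tilting (ET) estimator, Kitamura and Stutzer (1997), can also be obtained as a special case of GEL for $\gamma = 0$. A simple application of L'Hôpital's rule shows that

$$\lim_{\gamma\to 0}\frac{\gamma+1}{\gamma}\log\left(1 + \gamma\lambda'g(z_t,\theta)\right) = \lambda'g(z_t,\theta) \tag{3.33}$$

so the limit when $\gamma \to 0$ of the term $(\gamma\lambda'g(z_t,\theta) + 1)^{(\gamma+1)/\gamma}$ in the objective function 3.25 is $\exp(\lambda'g(z_t,\theta))$ and, dropping constants, the minimization problem becomes

$$\min_{\theta\in\Theta}\,\sup_{\lambda\in\mathbb{R}^K} -\frac{1}{T}\sum_{t=1}^{T}\exp\left(\lambda'g(z_t,\theta)\right) \tag{3.34}$$

The ET estimator does not have an explicit formula for $\hat\lambda(\theta)$; it must only satisfy

$$\frac{1}{T}\sum_{t=1}^{T}\exp\left(\hat\lambda(\theta)'g(z_t,\theta)\right) g(z_t,\theta) = 0 \tag{3.35}$$

The probability distortions in the ET case are given by

$$\hat\pi_t = \frac{T\exp\left(\hat\lambda(\hat\theta)'g(z_t,\hat\theta)\right)}{\sum_{s=1}^{T}\exp\left(\hat\lambda(\hat\theta)'g(z_s,\hat\theta)\right)} \tag{3.36}$$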

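Finally, the Empirical Likelihood (EL) estimator is a special case of GEL when $\gamma \to -1$. Taking this limit in 3.22, the objective function of the EL estimator is given by

$$\min_{\theta\in\Theta}\,\sup_{\lambda\in\mathbb{R}^K} \frac{1}{T}\sum_{t=1}^{T}\log\left(1 - \lambda'g(z_t,\theta)\right) \tag{3.37}$$

The first order condition for $\hat\lambda(\theta)$ in the above problem is given by

$$\frac{1}{T}\sum_{t=1}^{T}\frac{g(z_t,\theta)}{1 - \hat\lambda(\theta)'g(z_t,\theta)} = 0 \tag{3.38}$$

and the probability distortions are given by

$$\hat\pi_t = \frac{T\left(1 - \hat\lambda(\hat\theta)'g(z_t,\hat\theta)\right)^{-1}}{\sum_{s=1}^{T}\left(1 - \hat\lambda(\hat\theta)'g(z_s,\hat\theta)\right)^{-1}} \tag{3.39}$$

The probability distortion for the ET estimator will always be greater than zero because of its exponential form. For EL, we can see in the objective function that distortions tend to be away from zero, as the logarithm function approaches minus infinity when $\pi_t$ is near zero.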
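The EL distortions can be computed numerically when there is a single moment condition. The sketch below is an illustration under stated assumptions: it uses the $\gamma = -1$ case of the general formula 3.26, so the weights are proportional to $1/(1 - \lambda g_t)$, and it solves the first order condition by bisection, which is our choice of method. The function name and the data are hypothetical.

```python
def el_weights(g, iters=200):
    """Implied distortions for EL with a single moment condition.

    Solves sum_t g_t / (1 - lam * g_t) = 0 by bisection on the interval
    where every 1 - lam * g_t is positive. Requires g to change sign.
    """
    if min(g) >= 0 or max(g) <= 0:
        raise ValueError("g must take both signs for positive weights to exist")
    lo, hi = 1.0 / min(g), 1.0 / max(g)

    def foc(lam):
        return sum(v / (1.0 - lam * v) for v in g)

    # foc is increasing in lam: negative near lo, positive near hi.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if foc(mid) < 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    raw = [1.0 / (1.0 - lam * v) for v in g]
    total = sum(raw)
    # Normalize so that the distortions have sample mean one.
    return [len(g) * r / total for r in raw]


# Hypothetical moment values.
g = [-1.0, 1.0, 2.0]
pi = el_weights(g)
```

In contrast with the CUE case, the function raises an error when the moment function has the same sign in every period, since no positive distortion can then satisfy the moment condition.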

3.3.2 Asymptotic Properties

We derive some asymptotic properties of the GEL estimators when the model is misspecified, following the proof strategy in Almeida and Garcia (2012). More specifically, we show that under some regularity conditions the estimator $\hat\theta$ converges to the parameter $\theta^*$ that minimizes discrepancy and that $\sqrt{T}(\hat\theta - \theta^*)$ is asymptotically normal. Define $f(\theta,\lambda,z_t)$ as

$$f(\theta,\lambda,z_t) = \frac{(\gamma\lambda'g(z_t,\theta) + 1)^{\frac{\gamma+1}{\gamma}} - 1}{\gamma+1} \tag{3.40}$$

The Lagrange multipliers can be written as

$$\lambda(\theta) = \operatorname*{argmax}_{\lambda\in\mathbb{R}^K} -E[f(\theta,\lambda,z_t)] \tag{3.41}$$

$$\hat\lambda(\theta) = \operatorname*{argmax}_{\lambda\in\mathbb{R}^K} -\frac{1}{T}\sum_{t=1}^{T} f(\theta,\lambda,z_t) \tag{3.42}$$

Define $\lambda^* = \lambda(\theta^*)$ and denote by $B(a,r)$ the open ball with center $a$ and radius $r$.

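Consistency can be proved if the following assumptions are satisfied.

Assumption 1.

1. $z_t$ is stationary and ergodic.

2. $\Theta$ is compact.

3. $(\theta^*, \lambda^*)$ is unique.

4. For sufficiently small $\Delta$, $E[\sup_{\theta\in B(\bar\theta,\Delta)} f(\theta,\lambda,z_t)] < \infty$ for any vector $\lambda$ in a neighborhood of $\lambda^*$ and every $\bar\theta \in \Theta$.

5. $E[f_{\lambda\lambda}(\theta,\lambda,z_t)]$ is nonsingular $\forall\theta \in \Theta$.

6. $\theta_j \to \theta \in \Theta \Rightarrow y(\theta_j, z) \to y(\theta, z)$ for almost every $z$.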

Assumption 1.1 is a standard distributional assumption in time series, ensuring that the law of large numbers can be applied to continuous functions of $z_t$. Compactness of $\Theta$ is common in extremum estimator settings that do not rely on the concavity of $f$ with respect to $\theta$, as in our general case. Uniqueness of $(\theta^*, \lambda^*)$ is an identification condition implying that the pseudo true value $\theta^*$ is the unique minimizer of the discrepancy measure $\delta_\gamma$. Assumption 1.4 is a technical condition, stronger than a similar assumption in the GMM literature, used to apply the dominated convergence theorem. Assumption 1.5 is necessary so that $\lambda(\theta)$ is well defined and continuous. Finally, assumption 1.6 implies that the stochastic discount factor is a continuous function of the explanatory variables of the model almost everywhere.

Theorem 1. Under assumption 1, $\hat\theta$ converges in probability to the pseudo-true value $\theta^*$.

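Proof. The second derivative of $f(\theta,\lambda,z_t)$ with respect to $\lambda$ is positive semidefinite, so $-\frac{1}{T}\sum_{t=1}^{T} f(\theta,\lambda,z_t)$ is concave and $-E[f(\theta,\lambda,z_t)]$ is strictly concave by assumption 1.5, so $\lambda(\theta)$ in 3.41 is well defined and continuous. Lemma 1 in Hong, Preston, and Shum (2003) guarantees that, under assumption 1, $\frac{1}{T}\sum_{t=1}^{T} f(\theta,\lambda,z_t) \xrightarrow{p} E[f(\theta,\lambda,z_t)]$ uniformly, so we can apply Theorem 2.1 in Newey and McFadden (1994), also valid for stationary and ergodic processes, to show that $\hat\lambda(\theta) \xrightarrow{p} \lambda(\theta)$ for every $\theta \in \Theta$. For every $\bar\theta \in \Theta$, it must be true, by the continuity of $\lambda(\theta)$ and $y(\theta)$ almost everywhere, that $\lim_{\delta\downarrow 0}\sup_{\theta\in B(\bar\theta,\delta)} f(\theta,\lambda(\theta),z_t) = f(\bar\theta,\lambda(\bar\theta),z_t)$. Assumption 1.4 allows us to use the dominated convergence theorem to show that

$$\lim_{\delta\downarrow 0} E\left[\sup_{\theta\in B(\bar\theta,\delta)} f(\theta,\lambda(\theta),z_t)\right] = E[f(\bar\theta,\lambda(\bar\theta),z_t)] < E[f(\theta^*,\lambda^*,z_t)] \quad \forall\bar\theta \ne \theta^* \tag{3.43}$$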


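For a sufficiently small $\delta > 0$, we cover the compact set $\Theta - B(\theta^*,\delta)$ using $H$ balls $B(\theta_j,\delta_j)$, where $\delta_j$ is small enough so that

$$E\left[\sup_{\theta\in B(\theta_j,\delta_j)} f(\theta,\lambda(\theta),z_t)\right] < E[f(\theta^*,\lambda^*,z_t)] \tag{3.44}$$

Define $h_j > 0$ by

$$2h_j = E[f(\theta^*,\lambda^*,z_t)] - E\left[\sup_{\theta\in B(\theta_j,\delta_j)} f(\theta,\lambda(\theta),z_t)\right] \tag{3.45}$$

The law of large numbers implies that for any $\varepsilon > 0$ there exists a sufficiently large $T_j$ such that for $T > T_j$:

$$P\left(\frac{1}{T}\sum_{t=1}^{T}\sup_{\theta\in B(\theta_j,\delta_j)} f(\theta,\hat\lambda_T(\theta),z_t) > E[f(\theta^*,\lambda(\theta^*),z_t)] - h_j\right) < \frac{\varepsilon}{2H} \tag{3.46}$$

We have $\Theta - B(\theta^*,\delta) \subseteq \cup_{j=1}^{H} B(\theta_j,\delta_j)$, so for $T > \max_j T_j$ and $h = \min_j h_j$ it must be true that

$$P\left(\sup_{\theta\in\Theta - B(\theta^*,\delta)}\frac{1}{T}\sum_{t=1}^{T} f(\theta,\hat\lambda_T(\theta),z_t) > E[f(\theta^*,\lambda(\theta^*),z_t)] - h\right) < \frac{\varepsilon}{2} \tag{3.47}$$

The consistency of $\hat\lambda(\theta^*)$ and the law of large numbers imply that for a sufficiently large $T$:

$$P\left(\frac{1}{T}\sum_{t=1}^{T} f(\theta^*,\hat\lambda_T(\theta^*),z_t) - E[f(\theta^*,\lambda(\theta^*),z_t)] < -\frac{h}{2}\right) < \frac{\varepsilon}{2} \tag{3.48}$$

If equations 3.47 and 3.48 are valid, then

$$\lim_{T\to\infty} P\left(\sup_{\theta\in\Theta - B(\theta^*,\delta)}\frac{1}{T}\sum_{t=1}^{T} f(\theta,\hat\lambda_T(\theta),z_t) - \frac{1}{T}\sum_{t=1}^{T} f(\theta^*,\lambda(\theta^*),z_t) > -\frac{h}{2}\right) = 0 \tag{3.49}$$

Therefore,

$$P(\hat\theta \notin B(\theta^*,\delta)) \le P\left(\sup_{\theta\in\Theta - B(\theta^*,\delta)}\frac{1}{T}\sum_{t=1}^{T} f(\theta,\hat\lambda_T(\theta),z_t) - \frac{1}{T}\sum_{t=1}^{T} f(\theta^*,\lambda(\theta^*),z_t) > 0\right) \to 0 \tag{3.50}$$

so $\hat\theta$ is a consistent estimator for $\theta^*$.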

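For results about asymptotic normality, define $\Phi = (\theta,\lambda)$ and the functions:

$$G(\Phi) = \frac{\partial}{\partial\Phi}\frac{1}{T}\sum_{t=1}^{T} f(\theta,\lambda,z_t) \tag{3.51}$$

$$H_T(\Phi) = \frac{\partial^2}{\partial\Phi^2}\frac{1}{T}\sum_{t=1}^{T} f(\theta,\lambda,z_t) \tag{3.52}$$

$$H(\Phi) = \frac{\partial^2}{\partial\Phi^2} E[f(\theta,\lambda,z)] \tag{3.53}$$

A new set of assumptions is necessary to prove asymptotic normality:

Assumption 2.

1. $E[\sup_{\Phi\in N}\|H(\Phi)\|] < \infty$ for some neighborhood $N$ of $\Phi^*$

2. $H(\Phi^*)$ is nonsingular

3. $Var\left(\sqrt{T}\,G(\Phi^*)\right) \xrightarrow{p} S_\Phi > 0$

The assumptions in 2 are used in the Taylor expansion of the score function $G(\Phi)$ around $\Phi^*$:

$$0 = G(\hat\Phi) = G(\Phi^*) + H(\bar\Phi)(\hat\Phi - \Phi^*) \tag{3.54}$$

Assumption 2.1 is necessary to guarantee that:

$$H_T(\hat\Phi) \xrightarrow{p} H(\Phi^*) \tag{3.55}$$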

The nonsingularity of the Hessian matrix implies that Hn(Φ) is invertible with

the Hessian. Finally, assumption 2.3 is a general assumption regarding the asymptotic distribution of the score function. If assumptions 1 and 2 are satisfied, we are able to apply proposition 7.8 in Hayashi (2000) to show that

Theorem 2.

$$\sqrt{T}(\hat\Phi - \Phi^*) \xrightarrow{d} N(0, V_\Phi) \tag{3.56}$$

where $V_\Phi = H(\Phi^*)^{-1} S_\Phi H(\Phi^*)^{-1}$.

3.4 Monte Carlo Experiments

In this section, we test our methodology using Monte Carlo simulations to estimate a misspecified stochastic discount factor, assuming that the economy follows the habit model of Campbell and Cochrane (1999). The model explains the equity premium puzzle by adding a new component to the stochastic discount factor determined by past aggregate consumption levels. The new element is justified by the idea that agents care not only about the level of current consumption, but also about how it is related to a function of past consumption levels interpreted as habit. The representative agent becomes more risk averse when consuming less than his habit level, decreasing asset prices and making it easier to match the Euler equation with an acceptable risk aversion parameter.

The stochastic discount factor of the model depends on a habit component determined by a specific stochastic process designed to match several data moments. Although this specification for the habit dynamics is successful in explaining the equity premium and some characteristics of financial data in calibration exercises, it is a challenge for estimation. The central question is what functional form can be used to relate past levels of consumption to habits.

If the full specification of the model is accepted to be correct, simulation methods as in Bansal, Gallant, and Tauchen (2007) can be used to estimate the underlying parameters. Another approach for estimation is to treat the functional form of habits as unknown and to estimate it jointly with the other parameters of the model, as in Chen and