A risk analysis of the Brazilian stock market using value-at-risk and GARCH models
I dedicate this dissertation to my mother, Ana, and my girlfriend, Isabela.


Acknowledgements

First and foremost, I would like to express my deepest gratitude to my co-advisor, Prof. Wilton. With his enthusiasm and eagerness to help, I can safely say that this work would not have been possible without Prof. Wilton's assistance. I hope that this work makes the countless late-night Sunday meetings over the Internet worthwhile.

I must also thank my mother, Prof. Ana Rosa, for being my role model of academic integrity and excellence for as long as I can remember, and for her enduring love and support.

I am also thankful to my loving girlfriend Isabela, who patiently endured all these months of constant absence, weekend dissertation writing, weariness and overall crankiness.


Everything in this life passes away — only God remains, only He is worth struggling towards. We have a choice: to follow the way of this world, of the society that surrounds us, and thereby find ourselves outside of God; or to choose the way of life, to choose God Who calls us and for Whom our heart is searching. —FR. SERAPHIM ROSE


Resumo

The purpose of this dissertation is to study a set of Value-at-Risk (VaR) methodologies that perform well in the literature and to evaluate how they can be used to estimate the risk of different sectors of the Brazilian economy from an investor's perspective.

VaR is the most widely used risk measure in the financial industry, employed by private banks and governments all over the world. There is a vast literature on VaR, yet few studies investigate its use as a tool for small investments, and few analyze VaR estimates for stocks of Brazilian companies.

This work begins with a review of several VaR calculation methodologies and the identification of the best-performing ones. We then conduct two experiments. The first consists of a statistical analysis of data from several stocks and sectoral indices of the Brazilian stock exchange at various points in time, in order to identify which VaR methodologies are potentially most adequate for each asset. The second evaluates the performance of a selection of VaR methodologies using data from the same assets and periods as the previous experiment. In the last part of this work, we optimize a selection of VaR methodologies to operate on recent stock exchange data and analyze the estimated VaRs from the standpoint of a potential investor.

The results of our experiments indicate that VaR can be an efficient tool for minimizing risk exposure and can potentially reduce or avoid losses when trading on the Brazilian stock exchange. The experiments also show that different sectors of the Brazilian economy have risk properties that differ significantly from one another.

Keywords: Value-at-Risk. Time series. Volatility. GARCH models.


Abstract

The purpose of this dissertation is to study several leading Value-at-Risk (VaR) methodologies and evaluate how they can be used to assess the risk of different sectors of the Brazilian economy from the perspective of a potential investor.

VaR is the financial industry's most widely used risk measure, commonly adopted by banks and governments around the world. There is a great amount of ongoing research on VaR; however, few studies use VaR as a potential tool for small investments, and very few analyze VaR estimation for Brazilian companies' stocks.

This dissertation first reviews VaR methodologies and selects a few of the best performing according to the current literature. In a second stage, two experiments are conducted. The first experiment consists of a statistical evaluation of data from the Brazilian stock market during different time ranges, so that adequate VaR methodologies may be chosen according to the data. The second experiment benchmarks the chosen VaR methodologies over the same time ranges. In a third and final stage, the chosen VaR methodologies are backtested using recent data from sectoral indices of the Brazilian stock market.

The results of the experiments suggest that VaR may be an effective tool in minimizing risk exposure and potentially reducing or avoiding losses when trading in the Brazilian stock market. The experiments also show that different sectors of the economy have significantly different risk properties.


List of Figures

2.1 Non-stationarity: multiple random walk processes.
2.2 (Strict) stationarity: multiple iid processes.
3.1 Detail of the left tail of a Normal distribution. Shaded area represents losses greater than VaR.
3.2 KDE for VALE5 sample data.
3.3 PETR4 returns in 2004 (first row) and 2008 (second row).
4.1 Standard Normal (a) and Non-standard Normal (b) plotted (solid curve) over IBOV returns during Range 1. A fitted Student-t distribution (dotted curve) is also present.
4.2 Normal, Student-t and Azzalini's Skew-Student-t VaR estimates. IBOV data (range 1) in Figure 4.2a, GOLL4 (range 4) data in Figure 4.2b, CRUZ3 (range 4) data in Figure 4.2c and NATU3 (range 4) data in Figure 4.2d.
4.3 IBOV data during range 1. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.4 IEEX data during range 1. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.5 IBOV data during range 2. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.6 IEEX data during range 2. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.7 ICON data during range 2. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.8 INDX data during range 2. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.9 IBOV data during range 5. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.10 IEEX data during range 5. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.11 ICON data during range 5. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.12 INDX data during range 5. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.13 IMOB data during range 5. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.14 IFNC data during range 5. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.15 IMAT data during range 5. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.16 ENBR3 data during range 1. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.17 EMBR3 data during range 1. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.18 NATU3 data during range 1. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.19 CYRE3 data during range 1. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.20 ITSA4 data during range 1. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.21 USIM5 data during range 1. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.22 EMBR3 and INDX returns ((a) and (b), respectively) with fitted Normal, …
4.23 ENBR3 data during range 2. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.24 EMBR3 data during range 2. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.25 NATU3 data during range 2. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.26 CYRE3 data during range 2. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.27 ITSA4 data during range 2. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.28 USIM5 data during range 2. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.29 ENBR3 data during range 3. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.30 EMBR3 data during range 3. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.31 NATU3 data during range 3. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.32 CYRE3 data during range 3. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.33 ITSA4 data during range 3. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
4.34 USIM5 data during range 3. Histogram of returns with fitted Normal, Student-t and AST PDF curves respectively in (a), (b) and (c). Quantile-Quantile plots using the same distributions in (d), (e) and (f).
5.1 IBOV index violation ratios using Normal VaR during ranges 1 and 2.
5.2 IBOV index violation ratios using Student-t VaR during ranges 1 and 2.
5.4 Normal VaR using IBOV data during ranges 1, 2 and 3 (top to bottom rows) with WE lengths of 50, 100 and 200 observations (left to right columns).
5.5 Normal VaR using IEEX data during ranges 1, 2 and 3 (top to bottom rows) with WE lengths of 50, 100 and 200 observations (left to right columns).
5.6 Student-t VaR using IBOV data during ranges 1, 2 and 3 (top to bottom rows) with WE lengths of 50, 100 and 200 observations (left to right columns).
5.7 Student-t VaR using IEEX data during ranges 1, 2 and 3 (top to bottom rows) with WE lengths of 50, 100 and 200 observations (left to right columns).
5.8 AST VaR using IBOV data during ranges 1, 2 and 3 (top to bottom rows) with WE lengths of 50, 100 and 200 observations (left to right columns).
5.9 Normal-GARCH(1,1) VaR using IBOV data during ranges 1, 2 and 3 (top to bottom rows) with WE lengths of 50, 100 and 200 observations (left to right columns).
5.10 Normal-GARCH(1,1) VaR using IEEX data during ranges 1, 2 and 3 (top to bottom rows) with WE lengths of 50, 100 and 200 observations (left to right columns).
5.11 Student-t-GARCH(1,1) VaR using IBOV data during ranges 1, 2 and 3 (top to bottom rows) with WE lengths of 50, 100 and 200 observations (left to right columns).
5.12 On the left column, frequency distribution of violation ratios. On the right column, observed violation ratios sorted in ascending order. The area shaded in gray corresponds to the acceptable VR, 0.85 ≤ VR ≤ 1.15. Upper, middle and lower rows correspond to time ranges 1, 2 and 3, respectively.
6.1 IBOV value-at-risk backtest, 01/06/2011 - 20/07/2015. Returns are shown in light gray and estimated VaR in black. The dotted line represents the average VaR over the entire period.
6.2 IEEX value-at-risk backtest, 01/06/2011 - 20/07/2015. Returns are shown in light gray and estimated VaR in black. The dotted line represents the average VaR over the entire period.
6.3 ICON value-at-risk backtest, 01/06/2011 - 20/07/2015. Returns are shown in light gray and estimated VaR in black. The dotted line represents the average VaR over the entire period.
6.4 INDX value-at-risk backtest, 01/06/2011 - 20/07/2015. Returns are shown in light gray and estimated VaR in black. The dotted line represents the average VaR over the entire period.
6.5 IMOB value-at-risk backtest, 01/06/2011 - 20/07/2015. Returns are shown in light gray and estimated VaR in black. The dotted line represents the average VaR over the entire period.
6.6 IFNC value-at-risk backtest, 01/06/2011 - 20/07/2015. Returns are shown in light gray and estimated VaR in black. The dotted line represents the average VaR over the entire period.
6.7 IMAT value-at-risk backtest, 01/06/2011 - 20/07/2015. Returns are shown in light gray and estimated VaR in black. The dotted line represents the average VaR over the entire period.
6.8 ABEV3 value-at-risk backtest, 01/06/2011 - 20/07/2015.
6.9 BRFS3 value-at-risk backtest, 01/06/2011 - 20/07/2015.
6.10 JBSS3 value-at-risk backtest, 01/06/2011 - 20/07/2015.
6.11 EMBR3 value-at-risk backtest, 01/06/2011 - 20/07/2015.
6.12 FIBR3 value-at-risk backtest, 01/06/2011 - 20/07/2015.
6.13 WEGE3 value-at-risk backtest, 01/06/2011 - 20/07/2015.
E.1 Violation ratios for different WE lengths using Normal VaR during ranges 1, 2 and 3.
E.2 Violation ratios for different WE lengths using Student-t VaR during ranges 1, 2 and 3.
E.3 Violation ratios for different WE lengths using AST VaR during ranges 1, 2 and 3.
E.4 Violation ratios for different WE lengths using Normal-GARCH VaR during ranges 1, 2 and 3.


List of Tables

3.1 Value-at-Risk methodology classification according to scedasticity and distributional assumption.
4.2 Indices and stocks analyzed in the experiments.
4.4 Date ranges used in the experiments. Standard deviation is used to measure volatility.
4.5 Range 1 summary descriptive statistics.
4.6 Range 1 descriptive statistics for each index (IBOV and IEEX, in (a) and (b) respectively).
4.7 Range 2 summary descriptive statistics (mean values of all the indices considered during this time range, i.e. IBOV, IEEX, INDX and ICON).
4.8 Range 2 descriptive statistics.
4.9 Range 5 summary descriptive statistics.
4.10 Range 5 descriptive statistics.
4.11 Descriptive statistics for Bovespa indices during ranges 1, 2 and 5.
4.12 Range 1 descriptive statistics for USIM5, EMBR3, ITSA4, CYRE3, NATU3 and ENBR3.
4.13 Range 2 descriptive statistics for USIM5, EMBR3, ITSA4, CYRE3, NATU3 and ENBR3.
4.14 Range 3 descriptive statistics for USIM5, EMBR3, ITSA4, CYRE3, NATU3 and ENBR3.
4.15 Stock portfolio descriptive statistics.
5.1 Selection of Value-at-Risk methodologies to be used in our experiment.
5.3 IBOV Normal VaR performance metrics during ranges 1, 2 and 3 (upper, middle and lower rows, respectively) with estimation window lengths of 50, 100 and 200 observations (left, middle and right columns, respectively).
5.4 IEEX Normal VaR performance metrics during ranges 1, 2 and 3 (upper, middle and lower rows, respectively) with estimation window lengths of 50, 100 and 200 observations (left, middle and right columns, respectively).
5.5 IBOV Student-t VaR performance metrics during ranges 1, 2 and 3 (upper, middle and lower rows, respectively) with estimation window lengths of 50, 100 and 200 observations (left, middle and right columns, respectively).
5.6 IEEX Student-t VaR performance metrics during ranges 1, 2 and 3 (upper, middle and lower rows, respectively) with estimation window lengths of 50, 100 and 200 observations (left, middle and right columns, respectively).
5.7 IBOV AST VaR performance metrics during ranges 1, 2 and 3 (upper, middle and lower rows, respectively) with estimation window lengths of 50, 100 and 200 observations (left, middle and right columns, respectively).
5.8 IBOV Normal-GARCH(1,1) VaR performance metrics during ranges 1, 2 and 3 (upper, middle and lower rows, respectively) with estimation window lengths of 50, 100 and 200 observations (left, middle and right columns, respectively).
5.9 IEEX Normal-GARCH(1,1) VaR performance metrics during ranges 1, 2 and 3 (upper, middle and lower rows, respectively) with estimation window lengths of 50, 100 and 200 observations (left, middle and right columns, respectively).
5.10 Summary of results for four different optimized Value-at-Risk methodologies using the condensed data set listed in Appendix C. Columns with an asterisk represent average values.
6.1 Summarized statistics for the sectoral indices' backtesting.
6.2 Summarized statistics for the backtesting of the selection of stocks from ICON and INDX.

Contents

1 Introduction

2 Math review: how to model financial data
  2.1 Stochastic processes
    2.1.1 Definition
    2.1.2 Stationarity
  2.2 Time series
    2.2.1 Trend and seasonality
    2.2.2 Modeling
    2.2.3 ARMA model
  2.3 Volatility models
    2.3.1 Conditional and unconditional volatility
    2.3.2 EWMA
    2.3.3 ARCH and GARCH

3 Related work
  3.1 Value-at-Risk
    3.1.1 Definition
    3.1.2 Choosing a probability distribution
    3.1.3 Non-parametric simulations
    3.1.4 Parametric simulations
    3.1.5 The role of volatility
    3.1.6 Expected Shortfall
    3.1.7 Limitations and criticism
  3.2 How should we classify Value-at-Risk?
  3.3 A brief survey of Value-at-Risk methodologies
    3.3.1 Normality assumption
    3.3.2 Modeling conditional volatility
    3.3.3 Skewness
    3.3.4 Non-parametric models
    3.3.5 Semi-parametric models
  3.4 Evaluating Value-at-Risk performance
    3.4.1 Caveats
  3.5 Conclusion

  4.1 Computational tools
  4.2 Data definition
    4.2.1 BM&F BOVESPA sectoral indices
    4.2.2 Time ranges
  4.3 Probability distribution candidates
  4.4 Measuring goodness of fit
  4.5 A note on skewness
  4.6 Investigating Bovespa sectoral indices
    4.6.1 Range 1 (2004 - 2007)
      4.6.1.1 IBOV
      4.6.1.2 IEEX
      4.6.1.3 Descriptive statistics
    4.6.2 Range 2 (2008 - 2009)
      4.6.2.1 IBOV
      4.6.2.2 IEEX
      4.6.2.3 ICON
      4.6.2.4 INDX
      4.6.2.5 Descriptive statistics
    4.6.3 Range 5 (2014)
      4.6.3.1 IBOV
      4.6.3.2 IEEX
      4.6.3.3 ICON
      4.6.3.4 INDX
      4.6.3.5 IMOB
      4.6.3.6 IFNC
      4.6.3.7 IMAT
      4.6.3.8 Descriptive statistics
    4.6.4 Comparing time ranges
  4.7 Investigating stocks
    4.7.1 Range 1 (2004 - 2007)
      4.7.1.1 ENBR3 (IEEX)
      4.7.1.2 EMBR3 (INDX)
      4.7.1.3 NATU3 (ICON)
      4.7.1.4 CYRE3 (IMOB)
      4.7.1.5 ITSA4 (IFNC)
      4.7.1.6 USIM5 (IMAT)
      4.7.1.7 Descriptive statistics
    4.7.2 Range 2 (2008 - 2009)
      4.7.2.1 ENBR3 (IEEX)
      4.7.2.2 EMBR3 (INDX)
      4.7.2.3 NATU3 (ICON)
      4.7.2.4 CYRE3 (IMOB)
      4.7.2.5 ITSA4 (IFNC)
      4.7.2.6 USIM5 (IMAT)
      4.7.2.7 Descriptive statistics
    4.7.3 Range 3 (2003 - 2014)
      4.7.3.1 ENBR3 (IEEX)
      4.7.3.2 EMBR3 (INDX)
      4.7.3.3 NATU3 (ICON)
      4.7.3.4 CYRE3 (IMOB)
      4.7.3.5 ITSA4 (IFNC)
      4.7.3.6 USIM5 (IMAT)
      4.7.3.7 Descriptive statistics
    4.7.4 Comparing time ranges
  4.8 Conclusion

5 VaR methodology comparison
  5.1 Introduction
    5.1.1 Ranking metrics
  5.2 Estimation window length
  5.3 VaR methodology comparison
    5.3.1 Normal (Homoscedastic)
    5.3.2 Student-t (Homoscedastic)
    5.3.3 AST
    5.3.4 Normal GARCH(1,1)
    5.3.5 Student-t GARCH(1,1)
    5.3.6 Optimizing WE
  5.4 Conclusion

6 Backtesting
  6.1 Sectoral index analysis
    6.1.1 IBOV
    6.1.2 IEEX
    6.1.3 ICON
    6.1.4 INDX
    6.1.5 IMOB
    6.1.6 IFNC
    6.1.7 IMAT
  6.2 Stock analysis
    6.2.1 ICON
      6.2.1.1 ABEV3
      6.2.1.2 BRFS3
      6.2.1.3 JBSS3
    6.2.2 INDX
      6.2.2.1 EMBR3
      6.2.2.2 FIBR3
      6.2.2.3 WEGE3
    6.2.3 Summarized results
  6.3 Conclusion

7 Conclusion
  7.1 Suggestions for future work

References

A VaR Python implementation
  A.1 Generic VaR calculation
  A.2 VaR methodologies

B Bovespa indices composition
  B.1 IBOV
  B.2 ICON
  B.3 IEEX
  B.4 IFNC
  B.5 IMAT
  B.6 IMOB
  B.7 INDX

C Portfolio

D Descriptive statistics
  D.0.1 Range 1 (2004 - 2007)
  D.0.2 Range 2 (2008 - 2009)
  D.0.3 Range 3 (2003 - 2014)

E Optimal WE lengths
  E.0.4 Normal VaR
  E.0.5 Student-t VaR
  E.0.6 AST VaR
  E.0.7 Normal-GARCH VaR

1 Introduction

Value-at-risk (VaR) is extensively used by financial institutions (BASEL, 2011) and governments (BRAZIL, 2009) throughout the world, and is a popular research topic amongst statisticians, mathematicians, computer scientists, econometricians and actuaries. VaR is the most widely used risk measure in the finance industry and can also be applied to any problem that can be described as a time series. It has been successfully used in research in many different fields of study such as agronomy (ASFAHA et al., 2014), engineering (WEBBY et al., 2007) and the social sciences (BI; GILES, 2009).

Despite its flexibility and the variety of possible implementations, VaR has seen little application in the field of equity trading. Currently, VaR is mostly used by large financial institutions and is virtually unknown to the general trading public - at least as a mainstream technical analysis tool.

As an empirical research proposal, we intend to use VaR as a tool to maximize stock trading profitability. We hope to achieve this goal by investigating, evaluating and ranking several different VaR implementations, selecting the best-performing methods and backtesting them against recent data from selected Brazilian equities. Finally, we attempt to use VaR as a risk measure able to help investors make a portfolio selection on the Brazilian stock market.

Our work is structured as follows:

• In Chapter 2 we present a brief review of some of the mathematical concepts usually used to model financial assets. The chapter focuses mostly on the basic stochastic process theory needed to understand the concept of Value-at-Risk.

• In Chapter 3 we examine financial risk and present the formal definition of Value-at-Risk, discussing a few classic VaR calculation methodologies. Furthermore, we briefly summarize the concept of volatility and present a few different ways of measuring conditional and unconditional volatility. We also review the history of Value-at-Risk research and study recent work.

• In Chapter 4 we study the data that will be used throughout our work. We chose to use equities from the Brazilian stock market, working with data from 2004 to 2015. In this chapter we classify the data sets and study their statistical properties, evaluating which distributional assumptions should best fit each data sample.

• In Chapter 5 we calculate Value-at-Risk using several different methodologies based on Chapters 3 and 4. An optimization framework aimed at improving Value-at-Risk estimates is defined. We devise comparison procedures based on the literature and evaluate the Value-at-Risk results accordingly.

• In Chapter 6, aided by the Value-at-Risk methodology elected in Chapter 5, we analyze current data from the Brazilian stock market from the perspective of a potential investor and examine what additional knowledge is gained by analyzing risk using VaR.

• In Chapter 7 we synthesize the results found and discuss future work.

2 Math review: how to model financial data

In Chapter 3, several mentions are made of "financial assets" and "returns series" without a formal definition of what exactly those are. To actually calculate VaR (or anything else) we must, of course, have some kind of mathematical structure to represent financial data. To that end, we must first go through some basic concepts in mathematics and statistics.

Fortunately, the concepts needed to understand how to properly model financial data are very simple. It is assumed that the reader is familiar with basic concepts of statistics, but no further knowledge of mathematics or programming is required.

2.1 Stochastic processes

As a field, stochastic processes have been studied since the 1950s. Simply put, a stochastic process is a sequence of random variables. Stochastic processes are widely used to model financial asset prices and statistics, amongst many other applications in several different fields, such as signal processing (BROCKWELL; RICHARD, 2002, p. 15). For us, the most important use of stochastic processes will be in the form of time series, which are nothing more than realizations of stochastic processes.

2.1.1 Definition

A sequence of random variables $\{X_t \mid t \in \mathbb{R}\}$ is a stochastic process. Usually, $t$ will be an integer representing or indexing time, which leads us to the following definition: let $T \subseteq \mathbb{N}$. A sequence of random variables denoted by $\{X_t \mid t \in T\}$ is a discrete-time stochastic process. Hitherto we have considered both continuous-time and discrete-time stochastic processes; from now on, whenever we mention stochastic processes, we will be referring to this discrete-time definition. For convenience, instead of referring to the set $T$ as a generic subset of $\mathbb{N}$, we will simply treat it as a set of certain points in time; we may, for example, write "varying in time" instead of "varying $t \in T$". While $t$ is interpreted as a point in time (e.g. January 10th, 2010), it is usually convenient to represent it as a natural number identifying that specific point in time.



Definition 2 (Strict stationarity). A stochastic process is called strictly (or strongly) stationary if and only if its joint probability distribution is the same when varying time, i.e., given two sets of random variables, subsets of a stochastic process $X_t$, namely $\zeta = (X_1, \ldots, X_n)$ and $\eta = (X_{1+h}, \ldots, X_{n+h})$, the process $X_t$ is strictly stationary if and only if, for all $h, n \in \mathbb{N}$, $\zeta$ and $\eta$ have the same joint distribution.

The differences between strict stationarity and non-stationarity can be clearly seen in Figures 2.1 and 2.2. Furthermore, a stochastic process is called weakly stationary if its covariance and mean do not vary over time. Before giving a formal definition, we must first define the covariance and autocovariance functions of a stochastic process:

Definition 3 (Covariance function). Given a stochastic process $X_t$, its covariance function is defined as
\[ \gamma_X(r, s) = \mathrm{Cov}(X_r, X_s) = E[(X_r - \mu_X(r))(X_s - \mu_X(s))], \tag{2.1} \]
where $\mu_X(r)$ and $\mu_X(s)$ are, respectively, the means of $X_r$ and $X_s$.

From this we can define the following concepts:

Definition 4 (Autocovariance function). Given a stochastic process $X_t$ and its covariance function $\gamma_X(t+h, t)$, the autocovariance function of the process is
\[ \gamma_X(h) = \mathrm{Cov}(X_{t+h}, X_t), \tag{2.2} \]
where $h \in T$.

Definition 5 (Autocorrelation function). Given a stochastic process $X_t$ and its autocovariance function $\gamma_X(h)$, the autocorrelation function of $X_t$ is
\[ \rho_X(h) = \frac{\gamma_X(h)}{\gamma_X(0)} = \mathrm{Cor}(X_{t+h}, X_t). \tag{2.3} \]

We can now formally define a weakly stationary stochastic process.

Definition 6 (Weak stationarity). A stochastic process $X_t$ is said to be weakly stationary iff:


1. $\mu_X(t) = E(X_t)$ does not depend on $t$ (i.e., it is a constant);
2. $\gamma_X(t+h, t)$, for all $h \in \mathbb{N}$, does not depend on $t$.

Because we are more interested in the statistical properties of processes than in details about their paths, and for the sake of convenience, from now on we will refer to weakly stationary stochastic processes simply as stationary processes. In Figure 2.2, the iid (independent and identically distributed, i.e. a collection of mutually independent random variables sharing the same probability distribution) processes are always (at least) weakly stationary, since all of their random variables have, by definition, the same probability distribution. Iid processes are a specific case of white noise processes (a process $X_t$ formed by uncorrelated random variables with mean 0 and variance $\sigma^2$ is a white noise process). Note the difference: while in an iid process all the random variables are independent and identically distributed, in a white noise process they are merely required to be uncorrelated. Therefore a white noise process may or may not be strictly stationary, depending on the dependence structure among its random variables.
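To make the distinction concrete, the sketch below (Python with NumPy; the path counts, time points and seed are arbitrary choices, not taken from this work's data) simulates the two situations depicted in Figures 2.1 and 2.2: an iid process, whose cross-path variance is stable over time, and a random walk, whose variance grows with $t$.

```python
import numpy as np

rng = np.random.default_rng(42)
n_steps, n_paths = 500, 2000

# iid process: X_t ~ N(0, 1), mutually independent -- (strictly) stationary
iid = rng.standard_normal((n_paths, n_steps))

# Random walk: X_t = X_{t-1} + eps_t -- non-stationary, Var(X_t) grows with t
walk = np.cumsum(rng.standard_normal((n_paths, n_steps)), axis=1)

# Cross-path sample variance at two points in time
print("iid:  t=10 ->", iid[:, 10].var(), "  t=400 ->", iid[:, 400].var())
print("walk: t=10 ->", walk[:, 10].var(), "  t=400 ->", walk[:, 400].var())
```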

2.2 Time series

Stochastic process paths (or trajectories) are also called time series. A simple analogy can be made with random variables: a stochastic process is to a time series what a random variable is to a number. Just as a number can be seen as a possible output of some random variable, a time series can be seen as a possible output of some stochastic process. Conversely, just as random variables are collections of numbers associated with a collection of probabilities, stochastic processes are collections of time series associated with a collection of events.

When reviewing stochastic processes in the previous section, events $\omega$ were not specified, which meant that we had a broad view of all possible paths of a process. By fixing $\omega$ we are able to focus solely on a single path, i.e. a specific time series among all the (infinitely many) possible paths of the process. This enables us to make more specific assumptions about that path's behavior than we were able to make about the process, which is the final goal of some financial forecasting analyses. There is a vast literature dedicated solely to time series; publications vary in scope, depth and approach, so the reader has many options available to study the area more closely. This work closely follows Brockwell and Richard (2002), which presents the topics clearly and with good mathematical depth.

A time series is a specific path from a stochastic process, that is, given a probabilistic event ω, a time series is a single realization of the stochastic process through time.

Definition 7 (Time series). Given a sequence of random variables $\{X_t \mid t \in T\}$ on a probability space $\Omega$, where each $X_t : \Omega \to \mathbb{R}$, $\omega \mapsto X_t(\omega)$, and given a certain event $\omega$, the resulting path of $X_t$ is a time series.

2.2.1 Trend and seasonality

It is common practice to mathematically describe time series according to aspects that are very likely to occur in nature, namely trend and seasonality. These two aspects provide essential determinism to our models: after all, if a time series were completely unpredictable, there would be no point in forecasting at all.

In our specific case of interest, financial forecasting, the real degree of randomness has been a hot debate topic for quite a long time: the French mathematicians Jules Regnault and Louis Bachelier both published work in the late nineteenth century modeling stock returns as random walk processes (KARLIN; TAYLOR, 1998, p. 474). The Efficient Market Hypothesis, first formulated by Eugene Fama (FAMA, 1965), further defends the concept of market randomness.

When it is present, a time series trend is readily noticeable by looking at the series chart. Trends can be found in many natural observations: population counts, world GDP, consumer goods sales, inflation, mortality rates, etc. Usually a trend can be modeled by a linear, quadratic or exponential component, among other forms. Simple trends can be estimated with moving averages (COWPERTWAIT; METCALFE, 2009, p. 20).

Seasonality is the cyclic component of a series. For example, Christmas sales have a 12-month seasonality. One must be careful when defining the time range of the series, for if it is too short, seasonality may not be noticeable: if we only had 23 months of data with a single Christmas sales peak, we would not be able to detect the seasonality. As with trends, there is a pleiad of naturally occurring time series with seasonality: weather seasons, climate, airplane ticket sales, etc.

2.2.2 Modeling

There are several ways to model time series. For simple and synthetic data, for example, it may be enough to model the series as an addition of trend, seasonality and noise (COWPERTWAIT; METCALFE, 2009, p. 19):
\[ X_t = m_t + s_t + z_t, \tag{2.4} \]
where $X_t$ is the observed value at time $t$, $m_t$ is the trend, $s_t$ the seasonality and $z_t$ the residuals. One may refine this model according to the needs of the series: the seasonality might be multiplied by some constant, the trend might be exponentiated, and so on.
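As a toy illustration of equation (2.4), the sketch below (Python with NumPy; all coefficients are arbitrary choices, not estimates from this work's data) synthesizes a series from a linear trend, a 12-period seasonal cycle and Gaussian noise, then recovers a rough trend estimate with a moving average, as suggested above.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(120)                               # e.g. 120 monthly observations

m_t = 0.05 * t                                   # trend
s_t = 2.0 * np.sin(2 * np.pi * t / 12)           # 12-period seasonality
z_t = rng.normal(0.0, 0.5, size=t.size)          # noise / residuals

x_t = m_t + s_t + z_t                            # equation (2.4)

# Simple trend estimate via a 12-point moving average
m_hat = np.convolve(x_t, np.ones(12) / 12, mode="valid")
```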

A full treatment of time series modeling is beyond the scope of this work; there are many high-quality publications covering the subject in great detail (COWPERTWAIT; METCALFE, 2009; BROCKWELL; RICHARD, 2002). We will present the essential modeling framework needed to understand the Autoregressive Conditional Heteroskedasticity (ARCH) and Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model families, which will be used later on.

2.2.3 ARMA model

ARMA stands for autoregressive moving-average. As the name suggests, it is a combination of autoregressive and moving-average models. In the autoregressive (AR) model, future values depend on previous values. The number of previous values used to compute the current value is defined by a constant $p$ called the process order (also called the lag); hence the abbreviation AR($p$). The general form of AR models is described below.

Definition 8 (Autoregressive model AR(p)). An autoregressive model of order $p$, AR($p$), is defined as
\[ X_t = \mu + \sum_{i=1}^{p} \beta_i X_{t-i} + \varepsilon_t, \tag{2.5} \]
where $(\beta_1, \ldots, \beta_p)^\top$ are the parameters of the model, $\varepsilon_t$, called the shock or noise, is a white noise process, and $\mu$ is the mean when there is no autoregressive structure (the intercept). Expanding the summation, we obtain:
\[ X_t = \mu + \beta_1 X_{t-1} + \cdots + \beta_p X_{t-p} + \varepsilon_t. \tag{2.6} \]

When using an autoregressive model, a very common choice for lag is p = 1. The first-order autoregressive model, AR(1), is defined as follows:

Definition 9 (Autoregressive model AR(1)). An autoregressive model of order 1, AR(1), is defined as
\[ X_t = \mu + \beta X_{t-1} + \varepsilon_t, \tag{2.7} \]
where $\beta$ is the model parameter and $\varepsilon_t$ is a white noise process ($\varepsilon_t \sim WN(0, \sigma^2)$).

A random walk, for example, is an AR(1) process with $\beta = 1$ and $\mu = 0$. The autocovariance and autocorrelation functions of an AR(1) model can be obtained from the following expressions (DANIÉLSSON, 2011, p. 211):

Autocovariance function:
\[ \gamma_X = \frac{\sigma^2}{1 - \beta^2}. \tag{2.8} \]

Autocorrelation function:
\[ \rho_i = \beta^i. \tag{2.9} \]
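A quick numerical check of equation (2.9) - a sketch assuming NumPy, with arbitrary parameter values - simulates an AR(1) path and compares the empirical autocorrelations with the theoretical $\beta^i$:

```python
import numpy as np

rng = np.random.default_rng(1)
beta, n = 0.7, 100_000

# Simulate X_t = beta * X_{t-1} + eps_t (mu = 0), eps_t ~ N(0, 1)
eps = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = beta * x[t - 1] + eps[t]

# Empirical vs theoretical autocorrelation, rho_i = beta**i
for lag in (1, 2, 3):
    rho_hat = np.corrcoef(x[:-lag], x[lag:])[0, 1]
    print(f"lag {lag}: empirical {rho_hat:.3f} vs theoretical {beta**lag:.3f}")
```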

The Moving Average (MA) process is the second component of an ARMA process. Values are calculated as the average of previous shocks. The number of previous shocks used is defined by the process order, or lag, $q$.

Definition 10 (Moving-average process MA(q)). A moving-average process of order $q$, MA($q$), is defined as
\[ X_t = \mu + \sum_{i=0}^{q} \alpha_i \varepsilon_{t-i}, \tag{2.10} \]
where $\alpha = (1, \alpha_1, \alpha_2, \ldots, \alpha_q)^\top$ is the vector of weights, $\mu$ is the mean of the process and $\varepsilon_t$ is a white noise process ($\varepsilon_t \sim WN(0, \sigma^2)$). Expanding the summation,
\[ X_t = \mu + \varepsilon_t + \alpha_1 \varepsilon_{t-1} + \cdots + \alpha_q \varepsilon_{t-q}, \tag{2.11} \]
and using the lag operator we can rewrite this as
\[ X_t = \mu + (1 + \alpha_1 \omega^1 + \cdots + \alpha_q \omega^q)\varepsilon_t = \mu + \alpha(\omega)\varepsilon_t, \tag{2.12} \]
where $\omega^i \varepsilon_t = \varepsilon_{t-i}$ and $\alpha(\omega) = 1 + \alpha_1 \omega^1 + \cdots + \alpha_q \omega^q$.

We are now able to define the ARMA(p,q) process. It is simply the addition of AR(p) and MA(q):

Definition 11 (Autoregressive moving-average model ARMA(p,q)). An autoregressive moving-average model of order $p, q$, ARMA($p, q$), is defined as
\[ X_t = \mu + \sum_{i=1}^{p} \beta_i X_{t-i} + \sum_{i=0}^{q} \alpha_i \varepsilon_{t-i}, \tag{2.13} \]
where $\beta = (\beta_1, \ldots, \beta_p)^\top$ and $\alpha = (1, \alpha_1, \alpha_2, \ldots, \alpha_q)^\top$ are linearly independent vectors and $\varepsilon_t$, called the shock or noise, is a white noise process ($\varepsilon_t \sim WN(0, \sigma^2)$). Here, $\mu$ can be seen as an intercept of the model equation. Expanding the summations:
\[ X_t = \mu + \beta_1 X_{t-1} + \cdots + \beta_p X_{t-p} + \varepsilon_t + \alpha_1 \varepsilon_{t-1} + \cdots + \alpha_q \varepsilon_{t-q}. \tag{2.14} \]

Sometimes we can ignore the mean component ($\mu = 0$). This is both a notational convenience and mathematically sensible, since we will mostly be using time series to deal with volatility, for which $\mu$ is usually insignificant. It is simple enough to add the component back later if needed. With that said, we can conveniently represent ARMA($p,q$) as any stationary process where
\[ (1 - \beta_1 \omega^1 - \cdots - \beta_p \omega^p) X_t = (1 + \alpha_1 \omega^1 + \cdots + \alpha_q \omega^q) \varepsilon_t. \tag{2.15} \]
We can rewrite the equation in an even more concise fashion using lag operators:
\[ \beta(\omega) X_t = \alpha(\omega) \varepsilon_t, \tag{2.16} \]
where $\beta(\omega) = 1 - \beta_1 \omega^1 - \cdots - \beta_p \omega^p$ and $\alpha(\omega) = 1 + \alpha_1 \omega^1 + \cdots + \alpha_q \omega^q$.

2.3 Volatility models

Volatility is informally defined as a measure of fluctuation in a given time series. As with risk in general, market volatility cannot be measured directly: it is estimated by observing how prices oscillate. There are several ways to estimate volatility, the most basic being Historical Volatility (also known as Statistical Volatility), which is simply the standard deviation of the sequence of stock returns. Historical Volatility is efficient only when the returns are normally distributed. In general, however, returns are not normally distributed and present common statistical properties ("stylized facts") such as volatility clustering, fat tails and nonlinear dependence (DANIÉLSSON, 2011; CONT, 2001; ENGLE; PATTON, 2001).

Volatility models are usually described as time series; for example, an MA($q$) process with a single weight $\alpha = 1$ might be used - a simple but very ineffective choice (ALEXANDER, 2008). In general, the most important aspect of modeling volatility is taking the stylized facts into account, in particular volatility clustering and fat tails. If a chosen model fails to do so, as is the case with a simple equal-weighted MA($q$) process, it will be largely ineffective as a volatility model.

While the previous sections introduced basic concepts of stochastic processes and time series that apply to several different fields of study, in this section we can be more specific. Hitherto time series were treated as generic; from now on they will mostly be series of asset returns. Returns can be calculated directly as the difference between prices $P_1$ and $P_0$ divided by the initial price $P_0$, or using a logarithmic approximation (the log-returns). For notational and computational reasons, we will use the log-returns definition:

Definition 12 (Financial asset log-returns). Given an asset $A$, the asset's log-return series $(R_t)$ for period $t$ is given by
\[ R_t = \log(A_t) - \log(A_{t-1}), \tag{2.17} \]
where $A_t$ is the asset price at period $t$.
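Computing equation (2.17) takes one line; the sketch below (Python with NumPy; the prices are made-up numbers, not data from this work) also produces the de-meaned returns $R^{dm}_t$ used by the volatility models in the following sections.

```python
import numpy as np

# Hypothetical daily closing prices of some asset A
prices = np.array([10.00, 10.20, 10.05, 9.80, 10.10])

# Equation (2.17): R_t = log(A_t) - log(A_{t-1})
log_returns = np.diff(np.log(prices))

# De-meaned returns: R_t^dm = R_t - mean(R)
log_returns_dm = log_returns - log_returns.mean()
```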

The equation above defines the log-return series and permits computing the log-returns directly from the data. On the other hand, the log-returns may be modeled by some other equation. For example, it is common practice to de-mean the return series (i.e., subtract its mean) in order to simplify mathematical definitions and algorithm implementation. Thus one way to model the log-return series $(R_t)$ is via the equivalence

\[ R_t = \mu + \sigma_t \varepsilon_t, \tag{2.18} \]
where $\varepsilon_t$ denotes the shock (or residual) random variable at period $t$, which is usually assumed iid, and $\mu$ is the mean, usually close enough to zero that one can assume $\mu = 0$.

2.3.1 Conditional and unconditional volatility

Volatility models provide both conditional and unconditional volatility estimates. Unconditional volatility is simply the model's standard deviation $\sigma$ over the entire time range; this parameter can be estimated using the corresponding sample measures. Conditional volatility is the standard deviation estimate for a specific point in time given previous values, where the number of previous values may vary. In this case, it is common to use parametric statistical modeling to estimate the standard deviation at period $t$ ($\sigma_t$). The next subsections give a brief summary of some common conditional volatility models.

2.3.2 EWMA

As we mentioned above, the single most important aspect of a volatility model is capturing the stylized facts. We can improve the MA($q$) process by making the weights decay over time (SINHA; CHAMU, 2000). The exponentially weighted moving average (EWMA) does just that.

EWMA was introduced by J.P. Morgan's RiskMetrics™ software (MORGAN, 1996) in the 1990s. RiskMetrics assigns exponentially decaying weights, governed by a decay factor $\lambda$, to the observations in the sample, so that more recent returns have greater weight than older ones. The EWMA model is described as follows:

Definition 13 (Exponentially weighted moving average (EWMA)). Considering the de-meaned returns series $\{R^{dm}_t\}$ defined by $R^{dm}_t = R_t - \bar{R}$, EWMA models conditional volatility $\sigma_t^2$ as
\[ \sigma_{t,T}^2 = \sum_{i=1}^{T} \theta_i (R^{dm}_{t-i})^2, \tag{2.19} \]
where the weights $\theta_i$, $i = 1, \ldots, T$, are given by
\[ \theta_i = \frac{(1 - \lambda)\lambda^i}{\lambda(1 - \lambda^T)}, \tag{2.20} \]
where $\lambda$ is the decay factor with $\sum_{i=1}^{T} \theta_i = 1$, $\bar{R}$ is the sample mean of the returns, and $T$ the estimation window. After some algebra we can write the EWMA general equation as
\[ \sigma_t^2 = (1 - \lambda)(R^{dm}_t)^2 + \lambda \hat{\sigma}_{t-1}^2. \tag{2.21} \]

Simple volatility measures such as the standard deviation, moving average and EWMA are non-parametric: no input besides the observation window is needed. This makes implementation easy and facilitates comprehension, but limits their power to deal with the challenges imposed by real-world volatility, such as non-normality and clustering (DANIÉLSSON, 2011). The next subsection introduces two well-known parametric models, namely ARCH and GARCH.
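Before moving on to those, a minimal sketch of the EWMA recursion in equation (2.21) is given below (Python with NumPy; the function name is ours, $\lambda = 0.94$ is the decay factor popularized by RiskMetrics for daily data, and the initialization with the first squared return is an arbitrary choice).

```python
import numpy as np

def ewma_variance(returns_dm: np.ndarray, lam: float = 0.94) -> np.ndarray:
    """EWMA conditional variance, equation (2.21):
    sigma_t^2 = (1 - lam) * (R_t^dm)^2 + lam * sigma_{t-1}^2."""
    sigma2 = np.empty(len(returns_dm))
    sigma2[0] = returns_dm[0] ** 2            # arbitrary initialization
    for t in range(1, len(returns_dm)):
        sigma2[t] = (1 - lam) * returns_dm[t] ** 2 + lam * sigma2[t - 1]
    return sigma2
```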

2.3.3 ARCH and GARCH

The basis of most volatility models currently in use is the Autoregressive Conditional Heteroskedasticity (ARCH) model proposed by Engle (1982) and the subsequent Generalized ARCH (GARCH) model proposed by Bollerslev (1986). As the name implies, ARCH is autoregressive, i.e., the time series values (variances) depend on previous values according to some amount of lag. This was novel among volatility models: non-autoregressive volatility models assume only one-period asset volatility, which, as we discussed above, is not a realistic assumption because of non-normality, volatility clustering and other stylized facts (FRANCQ; ZAKOIAN, 1937, p. 10). The definition below describes the general equation of the ARCH class of models. For numerical simplicity, in the descriptions of the following models we will use the de-meaned returns, defined by $R^{dm}_t = R_t - \bar{R}$, where $\bar{R}$ denotes the sample mean of the returns, so that the expected return is $E(R^{dm}_t) = 0$.

Definition 14 (ARCH(p)). Consider the returns as a time series modeled as an addition of mean $\mu$ and volatility $\sigma$. ARCH defines the volatility $\sigma_t^2$ for day $t$ as the summation
\[ \sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i (R^{dm}_{t-i})^2, \tag{2.22} \]
where $\alpha_0, \ldots, \alpha_p > 0$ are the model parameters. The first-order ARCH process, ARCH(1), is defined as
\[ \sigma_t^2 = \alpha_0 + \alpha_1 (R^{dm}_{t-1})^2. \tag{2.23} \]
The ARCH($p$) unconditional volatility is given by
\[ \sigma^2 = \frac{\alpha_0}{1 - \sum_{i=1}^{p} \alpha_i}. \tag{2.24} \]

ARCH models require long lags to capture volatility clustering, which translates into impractical computational costs when modeling volatility within reasonable time (DANIÉLSSON, 2011). The Generalized ARCH (GARCH) model solves this problem by including previously calculated volatility in equation (2.22):

Definition 15 (GARCH(p,q)). A process $\{R^{dm}_t\}$ is a GARCH($p,q$) process if and only if (FRANCQ; ZAKOIAN, 1937):
\[ E(R^{dm}_t \mid R^{dm}_u,\ u < t) = 0, \quad t \in \mathbb{Z}, \tag{2.25} \]
and the conditional variance $\sigma_t^2 = \mathrm{Var}(R^{dm}_t \mid R^{dm}_u,\ u < t)$ is given by
\[ \sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i (R^{dm}_{t-i})^2 + \sum_{i=1}^{q} \beta_i \sigma_{t-i}^2, \tag{2.26} \]
where $\alpha = (\alpha_0, \ldots, \alpha_p)^\top$ and $\beta = (\beta_1, \ldots, \beta_q)^\top$ are the model parameter vectors, and $p, q$ are the lag orders in $R^{dm}_t$ and $\sigma_t^2$, respectively. Furthermore, we must have $\alpha_0 > 0$, $\alpha_i \geq 0$, $i = 1, \ldots, p$, and $\beta_j \geq 0$, $j = 1, \ldots, q$. If $\beta = 0$, the GARCH($p,q$) process reduces to the ARCH($p$) process as defined in Definition 14. Using the lag operator ($\omega$) we can write the model as
\[ \sigma_t^2 = \alpha_0 + A(\omega)(R^{dm}_t)^2 + B(\omega)\sigma_t^2, \tag{2.27} \]
with $A(\omega) = \alpha_1 \omega^1 + \cdots + \alpha_p \omega^p$ and $B(\omega) = \beta_1 \omega^1 + \cdots + \beta_q \omega^q$ being polynomials of order $p$ and $q$, respectively.

The covariance stationarity condition is $\sum_{i=1}^{p} \alpha_i + \sum_{j=1}^{q} \beta_j < 1$ (BOLLERSLEV, 1986, p. 310). The unconditional volatility is given by
\[ \sigma^2 = \frac{\alpha_0}{1 - \left( \sum_{i=1}^{p} \alpha_i + \sum_{j=1}^{q} \beta_j \right)}. \tag{2.28} \]

As a particular case we have the first-order GARCH model, GARCH(1,1), which is defined as
\[ \sigma_t^2 = \alpha_0 + \alpha (R^{dm}_{t-1})^2 + \beta \sigma_{t-1}^2. \tag{2.29} \]
There are hundreds of variations on ARCH models (ANGELIDIS; BENOS; DEGIANNAKIS, 2004); however, GARCH(1,1) is consistently found to outperform other, more complex models (HANSEN; LUNDE, 2001). The model is also the one most widely used by specialists to model the volatility of daily returns, perhaps excessively so (FRANCQ; ZAKOIAN, 1937, p. 206). The fact that GARCH enables the amplitude of $R^{dm}_t$ to depend upon previous values is of paramount importance, since it effectively models volatility clustering (FRANCQ; ZAKOIAN, 1937, p. 21). Owing to its robust performance, superior estimation, overall simplicity and flexibility, the GARCH family is by far the most popular class of conditional volatility models in use.
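For concreteness, a sketch of the GARCH(1,1) filter of equation (2.29) closes this chapter (Python with NumPy; the function name is ours, the parameters would normally be estimated by maximum likelihood - e.g. with a library such as arch - and initializing at the unconditional variance of equation (2.28) is one common choice, valid when $\alpha_1 + \beta_1 < 1$).

```python
import numpy as np

def garch11_variance(returns_dm: np.ndarray, alpha0: float,
                     alpha1: float, beta1: float) -> np.ndarray:
    """GARCH(1,1) conditional variance, equation (2.29):
    sigma_t^2 = alpha0 + alpha1 * (R_{t-1}^dm)^2 + beta1 * sigma_{t-1}^2."""
    sigma2 = np.empty(len(returns_dm))
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)   # unconditional variance (2.28)
    for t in range(1, len(returns_dm)):
        sigma2[t] = (alpha0
                     + alpha1 * returns_dm[t - 1] ** 2
                     + beta1 * sigma2[t - 1])
    return sigma2
```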

3 Related work

Value-at-Risk is a thriving topic, with new literature constantly being published. We begin this chapter by formally defining VaR and its most usual implementations. In the second half of the chapter, we explore some of the many possible VaR methodologies and the corresponding literature. We conclude the chapter by summarizing our findings from the literature.

3.1 Value-at-Risk

Financial risk can be defined as the probability of losing part or all of an investment. When working with financial assets, we can access the daily stock returns over a specified time frame (among other time resolutions, such as monthly stock returns). In this sense, risk can be considered a latent variable: it cannot be measured directly from the data we have, so it must instead be inferred from the data, which in our case are the stock return series.

Value-at-Risk (VaR) is arguably the most widely used risk metric in commercial applications (DANIÉLSSON, 2011). The 1988 Basel Accord, signed by the G-10 countries, imposed risk management on banks, most of which have since used some form of VaR methodology to comply with regulations (JORION, 2001). Several United States government agencies, such as the Federal Reserve and the Securities and Exchange Commission (SEC), use or advocate the use of VaR (KHINDANOVA; RACHEV, 2000). In the Brazilian context, Banco Central, the Central Bank of Brazil, requires banks to manage risks based on VaR; although not mentioned directly, its requirements are based on the Basel Committee's recommendations (BRAZIL, 2009).

VaR has a non-constructive definition, meaning there are dozens of different ways of forecasting it. Basel regulations allow banks to develop their own methods to calculate VaR, which stimulated research and the development of new VaR methodologies. The use of VaR as a risk metric by banks spread quickly during the 1990s; one of the pioneers, J.P. Morgan, made its estimation software RiskMetrics™ available to the public in 1994, and it quickly became the industry benchmark. RiskMetrics had been developed during the 1980s, starting as an aggregation of hundreds of risk factors and several VaR estimates calculated daily (HOLTON, 2002).

Other banks such as Credit Suisse, Chase Manhattan and Deutsche Bank followed J.P. Morgan. By 1999, over 80 different commercial vendors were offering VaR software (CHRISTOFFERSEN; HAHN; INOUE, 2001). Although used mostly by finance and actuarial-related clients, several studies have applied VaR to many different risk modelling scenarios, such as the US movie industry (BI; GILES, 2009), agriculture (ASFAHA et al., 2014) and geology (WEBBY et al., 2007).

3.1.1 Definition

VaR is a single number representing how much loss may occur with a certain probability within a specific time window of analysis. This non-constructive definition of VaR results in a vast number of different implementations - there is no consensus yet on which method is superior, and the large number of different implementations makes it difficult to choose the best one. In other words, VaR presents the worst best-case scenario - however, risk measures are expected to say something about "worst worst" case scenarios, especially in the case of financial assets, where fat tails are expected.

The commonly used Normal VaR, which assumes that the returns of a stock price series follow a Gaussian process, has well-known deficiencies, among which the most serious is the lack of information provided about extreme events, which demand fat tails in the return distribution - something a Normal PDF cannot model. However, other methods of VaR estimation can be used, for example the GARCH-VaR class (HUNG; LEE; LIU, 2008). Another measure used to complement VaR is Tail VaR (TVaR), also known as Conditional VaR (CVaR), which we will mention later in this chapter.

From a parametric point of view, once a certain distribution is assumed for the returns, VaR is simply the $(1 - \alpha)$-quantile of that distribution, where $\alpha$ is the confidence level. If there is no postulated distribution, one can use the empirical distribution, which is extracted from the data. Thus, three basic inputs are needed to calculate VaR: the confidence level ($\alpha$), the time window and the return distribution assumption. Large banks and financial institutions are often obliged to use $\alpha = 99\%$ (BRAZIL, 2009, p. 5; KOU; PENG; HEYDE, 2013, p. 7), while traders and small businesses have no such restrictions; in the literature, $\alpha = 95\%$ and $\alpha = 99\%$ are the most common choices, as can be seen in the studies referenced in this chapter. Time windows are often chosen depending on the confidence level and time frame: considering daily time frames, $\alpha = 99\%$ may perform best with a window of 250 periods or more, while with $\alpha = 95\%$ it is best to use smaller windows of 100 periods or less (BEST, 1998). In other words, VaR is implicitly defined by the following equation:

Definition 16 (Value-at-risk (implicit)).
\[ \Pr[R^{dm}_t \leq -\mathrm{VaR}_{1-\alpha}] = F(-\mathrm{VaR}) = \int_{-\infty}^{-\mathrm{VaR}} P(x)\, dx = 1 - \alpha, \tag{3.1} \]

where $F$ is the cumulative distribution function (CDF) and $P$ is the probability density function (PDF) of the de-meaned returns. Definition 16 is an implicit definition of VaR. If we assume the mean return is zero, we may also define VaR explicitly, as in Definition 17:

Definition 17 (Value-at-risk (explicit)). From the definition, we have:
\[ \Pr[R_t \leq \mathrm{VaR}_{1-\alpha}] = 1 - \alpha. \tag{3.2} \]
Returns are defined as
\[ R_t = \frac{P_t - P_{t-1}}{P_{t-1}}, \tag{3.3} \]
where $P_t$ is the price at day $t$. Dividing both sides of the inequality by the volatility ($\sigma$), we obtain:
\[ \Pr\left[\frac{R_t}{\sigma} \leq \frac{\mathrm{VaR}_{1-\alpha}}{\sigma}\right] = 1 - \alpha. \tag{3.4} \]
If we denote the distribution of $(R_t/\sigma)$ by $F_R$, it follows that
\[ \mathrm{VaR}_{1-\alpha} = -\sigma F_R^{-1}(1 - \alpha). \tag{3.5} \]
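Equation (3.5) translates directly into code. The sketch below (Python with SciPy; the returns are simulated stand-ins, the function name is ours, and sign conventions for VaR vary across the literature) computes a parametric Normal VaR, reporting the loss magnitude as a positive number:

```python
import numpy as np
from scipy.stats import norm

def normal_var(returns: np.ndarray, alpha: float = 0.99) -> float:
    """Parametric Normal VaR, equation (3.5): VaR = -sigma * F^{-1}(1 - alpha)."""
    sigma = returns.std(ddof=1)          # sample volatility; mean assumed zero
    return -sigma * norm.ppf(1 - alpha)  # norm.ppf(0.01) ~ -2.33 for alpha=0.99

rng = np.random.default_rng(7)
fake_returns = rng.normal(0.0, 0.02, size=1000)
print(normal_var(fake_returns))          # roughly 0.02 * 2.33 = 0.047
```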

3.1.2 Choosing a probability distribution

VaR is just a quantile. To be able to calculate $\mathrm{VaR}_{1-\alpha}$ for any $\alpha \in (0,1)$, given a sufficiently large data sample, one needs only to choose a probability distribution for the sample. However, no matter how large the sample is, we can never be certain of its distribution; a choice must be made based on some measure of adequacy.

Choosing an adequate distribution for the returns is of paramount importance when calculating VaR, since a bad model can severely underestimate or overestimate risks. The model must take into account the peculiarities of the data, but without becoming biased by the use of samples that are too small.

We can either assume that the distribution of the data is based on some probability distribution or on the observed values themselves. In the latter case, we can use the empirical distribution of returns to simulate several possible risk scenarios using the Historical Simulation approach (KHINDANOVA; RACHEV, 2000). In the former case, we must pick one or more models (distributions with certain parameters, e.g. Normal, Student-t, etc.) and test their goodness of fit. There are several goodness-of-fit tests that can be used for this purpose, such as the Shapiro-Wilk and Jarque-Bera tests for normality analysis (GHASEMI; ZAHEDIASL, 2012, p. 487); for more general cases, the Kolmogorov-Smirnov test can be used (GHASEMI; ZAHEDIASL, 2012).
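All three tests are available in SciPy. The sketch below (with simulated, heavy-tailed returns as a stand-in for real data) applies Shapiro-Wilk, Jarque-Bera and a Kolmogorov-Smirnov test against a fitted Student-t:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
returns = 0.01 * rng.standard_t(df=4, size=1500)   # fat-tailed fake returns

# Normality tests: small p-values reject the Normal assumption
print("Shapiro-Wilk p-value:", stats.shapiro(returns).pvalue)
print("Jarque-Bera p-value:", stats.jarque_bera(returns).pvalue)

# Kolmogorov-Smirnov against a fitted Student-t distribution
df, loc, scale = stats.t.fit(returns)
print("KS vs Student-t p-value:",
      stats.kstest(returns, "t", args=(df, loc, scale)).pvalue)
```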

3.1.3 Non-parametric simulations

In non-parametric simulations, the returns PDF is defined empirically by the returns themselves; that is, the frequency of observed values within a certain time horizon forms the PDF. The simplest non-parametric VaR calculation method is historical simulation. It is still widely used and very simple to implement. The historical method uses only the empirical distribution of the returns. The returns are split up into several subsamples of the same length: considering $N$ and $n$ as the original sample size and the common subsample length, we can get $N - n + 1$ subsamples. For each subsample, we pick the $(1 - \alpha)$-quantile (the $\mathrm{VaR}_{1-\alpha}$ of that subsample). The $\mathrm{VaR}_{1-\alpha}$ estimate for the next period will then be the mean of these partial VaRs.
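One direct reading of the subsample procedure just described is sketched below (Python with NumPy; the function name is ours, and the subsample length n and confidence level are free choices):

```python
import numpy as np

def historical_var(returns: np.ndarray, alpha: float = 0.95,
                   n: int = 100) -> float:
    """Historical-simulation VaR: take the (1 - alpha) quantile of each of
    the N - n + 1 overlapping subsamples of length n, then average them."""
    windows = np.lib.stride_tricks.sliding_window_view(returns, n)
    partial_vars = np.quantile(windows, 1 - alpha, axis=1)
    return partial_vars.mean()
```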

There are many advantages to historical method calculations: simplicity, robustness, decent performance when estimating high-tolerance VaR, and non-dependence on parametric assumptions (BEDER, 1996; ABAD; BENITO; LÓPEZ, 2014). However, the fact that historical simulation uses only the observed data values is also a weakness: as a corollary, possible return values that were not in the sample are not considered. In fact, the further from the mean we get, the fewer observations are available for the simulation analysis (DANIELSSON; VRIES, 2000). Therefore, historical simulation is a poor choice when one has a small data sample or is calculating a VaR value far into the tail (e.g., $\mathrm{VaR}_{1\%}$).

Another problem with the historical method comes from the fact that it attributes equal weight to every period, which makes it react very slowly to abrupt changes in asset volatility, or change abruptly when an old observation exits the estimation window (GOORBERGH; VLAAR, 1999, p. 22). This problem is especially severe when using longer time windows, which is the case for most banks and large financial institutions (a vast majority of which use historical simulation).

There are several proposed improvements that seek to solve or mitigate this issue: using decreasing weights to value recent data over old data, using volatility-adjusted weights, and so on (DOWD, 2005, ch. 4); however, most (85%) banks use equal weights in their historical simulations (MEHTA et al., 2012). Conditional volatility models have also been used together with historical simulation with good performance (BARONE-ADESI; GIANNOPOULOS; VOSPER, 1999).
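As an illustration of the decreasing-weights idea, the sketch below implements an exponentially age-weighted historical simulation; the decay parameter `lam` and the helper name are assumptions for illustration only.

```python
import numpy as np

def age_weighted_var(returns, alpha=0.95, lam=0.98):
    """Sketch of an age-weighted historical simulation: an observation
    i days old gets weight proportional to lam**i, and VaR is the return
    at which the cumulative weight of the sorted returns reaches 1 - alpha.
    The decay parameter lam is an assumed value, not a canonical one."""
    returns = np.asarray(returns)
    ages = np.arange(len(returns))[::-1]      # 0 = most recent observation
    weights = lam ** ages
    weights /= weights.sum()

    order = np.argsort(returns)               # worst returns first
    cum_weight = np.cumsum(weights[order])
    idx = np.searchsorted(cum_weight, 1 - alpha)
    return returns[order][idx]

rng = np.random.default_rng(1)
sample = rng.normal(0.0, 0.02, size=500)
print(age_weighted_var(sample))               # a 5% VaR tilted toward recent data
```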

Another approach to historical simulation is to use a kernel density estimator (KDE) to generalize the returns frequency histogram. In simple terms, the KDE fills empty […]


[…] capable of dealing with skew and kurtosis. For that reason, the t-Student and skewed t-Student distributions are often used to model VaR (LIN; SHEN, 2006; AZZALINI; CAPITANIO, 2003). Another way to model VaR is based on GARCH models (GOORBERGH; VLAAR, 1999; MARIMOUTOU; RAGGAD; TRABELSI, 2009), which give volatility a central role in the random behavior of the returns. The next section focuses on this feature of GARCH modeling.
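Before moving on, here is a quick illustration of the unconditional t-Student approach: a sketch that fits a t-Student distribution to synthetic fat-tailed returns by maximum likelihood and reads the 5% VaR directly off its quantile function.

```python
import numpy as np
from scipy import stats

# Synthetic fat-tailed returns stand in for real data in this sketch.
rng = np.random.default_rng(2)
returns = rng.standard_t(df=4, size=1500) * 0.015

# Fit a t-Student distribution by maximum likelihood and read the
# 5% VaR directly off its quantile function.
df, loc, scale = stats.t.fit(returns)
var_5pct = stats.t.ppf(0.05, df, loc=loc, scale=scale)
print(f"fitted df = {df:.1f}, 5% VaR = {var_5pct:.4f}")
```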

3.1.5 The role of volatility

Volatility models allow us to estimate conditional volatility at a given time point, but we need more information to calculate VaR. Namely, we must define the probability distribution of the returns series. Refer to Equation (2.18), which models the returns series as the sum:

$$R_t = \mu + \sigma_t \varepsilon_t.$$

By de-meaning $R_t$, the equation becomes

$$R^{dm}_t = \sigma_t \varepsilon_t. \tag{3.6}$$

Given a conditional volatility model, we are able to estimate the conditional volatility $\sigma_t$ for every $t$. Return values $R^{dm}_t$ are readily available using Definition 12. Thus, the residuals $\hat{\varepsilon}_t$ are fully defined.

Definition 18 (Realized residuals). The residuals $\hat{\varepsilon}_t$ for a returns series $R^{dm}_t$ with conditional volatility estimate $\hat{\sigma}_t$ assume realized values defined by

$$\hat{\varepsilon}_t = \frac{R^{dm}_t}{\hat{\sigma}_t}. \tag{3.7}$$

Recall that $\hat{\varepsilon}_t$ are random variables. As we can see in the definition, the realized residual values depend directly on the returns, which means the model for the residuals must take into account the peculiarities of the returns series. In particular, as we have seen, returns almost always have fat tails and commonly present skewness. Under the GARCH approach, two distributions are usually used for $\hat{\varepsilon}_t$: Normal and t-Student (HARTZ; MITTNIK; PAOLELLA, 2006).
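Putting the pieces together, the sketch below filters a GARCH(1,1) variance path through de-meaned returns and converts the one-step-ahead volatility forecast into a Normal VaR, as in Equation (3.5); the GARCH parameters are assumed to be already estimated, whereas a real application would fit them by maximum likelihood.

```python
import numpy as np
from scipy.stats import norm

def garch_normal_var(returns, omega, a, b, alpha=0.95):
    """Sketch: filter a GARCH(1,1) variance path through de-meaned returns
    and turn the one-step-ahead volatility forecast into a Normal VaR, as
    in Equation (3.5). omega, a and b are assumed, already-estimated
    parameters; a real application would fit them by maximum likelihood."""
    r = np.asarray(returns) - np.mean(returns)      # de-meaned returns
    var_path = np.empty(len(r) + 1)
    var_path[0] = r.var()                           # initialize at the sample variance
    for t in range(len(r)):
        # sigma^2_{t+1} = omega + a * r_t^2 + b * sigma^2_t
        var_path[t + 1] = omega + a * r[t] ** 2 + b * var_path[t]
    sigma_next = np.sqrt(var_path[-1])              # one-step-ahead volatility forecast
    return -sigma_next * norm.ppf(1 - alpha)

rng = np.random.default_rng(3)
sample = rng.normal(0.0, 0.02, size=750)
print(garch_normal_var(sample, omega=1e-6, a=0.08, b=0.90))
```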

3.1.6 Expected Shortfall

Expected Shortfall (ES), also known as Tail VaR (TVaR) and Conditional VaR (CVaR), is both coherent and aware of the tail distribution’s shape (BHATTACHARYYA; RITOLIA, 2008). ES is defined as the expected loss once VaR is violated. In mathematical notation, the ES can be defined as


$$\text{ES}_{1-\alpha}(X) = E[X \mid X \leq \text{VaR}_{1-\alpha}], \tag{3.8}$$

where $X$ denotes a random variable representing some loss. It is important to note that here we are assuming $\text{VaR}_{1-\alpha}$ is negative, i.e., $\text{VaR}_{1-\alpha} < 0$. The ES can also be calculated by adding VaR to the mean excess loss function (KLUGMAN; PANJER; WILLMOT, 2012, p. 47):

$$\text{ES}_{1-\alpha}(X) = \text{VaR}_{1-\alpha}(X) + e(\text{VaR}_{1-\alpha}), \tag{3.9}$$

where $e(\cdot)$ is the mean excess loss function, defined as (KLUGMAN; PANJER; WILLMOT, 2012, p. 47):

$$e(d) = E[X - d \mid X < d]. \tag{3.10}$$

Thus, when $d = \text{VaR}_{1-\alpha}$, we have $e(d) = E[X - \text{VaR}_{1-\alpha} \mid X < \text{VaR}_{1-\alpha}]$.
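A minimal empirical sketch of Equation (3.8), using synthetic data in place of real returns: estimate VaR as a sample quantile, then take ES as the average of the returns at or beyond it.

```python
import numpy as np

# Synthetic daily returns stand in for real data.
rng = np.random.default_rng(4)
returns = rng.normal(0.0, 0.02, size=2000)

alpha = 0.95
var_5pct = np.quantile(returns, 1 - alpha)      # empirical 5% VaR (negative)
es_5pct = returns[returns <= var_5pct].mean()   # ES = E[X | X <= VaR], Eq. (3.8)

print(f"VaR = {var_5pct:.4f}")
print(f"ES  = {es_5pct:.4f}")                   # ES lies deeper in the tail than VaR
```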

3.1.7 Limitations and criticism

Despite its undeniable popularity, VaR has received a wide array of criticism from industry and academia. The most common academic complaint is that VaR is not a coherent risk measure. A risk measure is considered coherent if it satisfies four properties: monotonicity, subadditivity, positive homogeneity and translation invariance (ARTZNER et al., 1999). Of these four properties, VaR does not always satisfy subadditivity (KLUGMAN; PANJER; WILLMOT, 2012). However, under normality or the GARCH approach, VaR satisfies the subadditivity property.

Statistician and risk analyst Nassim Taleb, in his famous 1997 debate with VaR scholar Philippe Jorion, argued that VaR implementations at the time were a "potentially dangerous malpractice". The gist of the critique is that the very definition of VaR was misleading and nearly useless, because eventual VaR failures always overshadow its successful forecasts. Taleb concluded his argument by stating that he would "accept VaR if volatility were easy to forecast with a low standard error" (JORION; TALEB, 2015). Although GARCH models were introduced by Bollerslev one decade prior to the debate, only later, in the 2000s, would research emerge finding GARCH to be a good conditional volatility forecaster (SO; YU, 2006). Many new models were proposed in the decade following the debate (ANGELIDIS; BENOS; DEGIANNAKIS, 2004). Other studies argue that VaR methodologies fail to measure risk because they do not consider psychological aspects of financial decision-making (NWOGUGU, 2006). The most common criticism VaR receives, however, regards incorrect usage or interpretation. In his seminal book on risk forecasting, Jon Danielsson warns about the dangers of VaR misinterpretation and misuse, highlighting that VaR is very easily manipulated, for instance by banks keen on meeting regulatory demands (DANIÉLSSON, 2011, p. 80-84). Financial institutions have also published research challenging famous VaR case studies and warning against VaR misuse (CULP; MILLER; NEVES, 1998).

It is important to note that, whichever calculation method is chosen, nothing is affirmed about values that exceed VaR. Beyond the VaR threshold, no distributional assumptions can be made. Failure to acknowledge this limitation may lead to the underestimation of extreme events, with possibly catastrophic results.

Most criticisms convey valid remarks about value-at-risk and point out major pitfalls. They do not, however, provide empirical evidence that VaR is an inferior risk measure. On the contrary, most objections are purely theoretical. In Chapter 3, we will investigate abundant empirical evidence that VaR is in fact a useful tool when used wisely.

3.2 How should we classify Value-at-Risk?

There is no canonical way to classify VaR methodologies and, as a consequence, each author defines method classes on their own. In fact, there is no consensus even on the terms being used, and some notational abuse is to be expected. For example, GARCH(1,1), strictly speaking, is a volatility model, but many authors use it as a metonymy for "parametric VaR estimation assuming a Normal returns distribution, with conditional volatility forecast by a GARCH(1,1) process" (SO; YU, 2006, p. 192), (FÜSS; KAISER; ADAMS, 2007, p. 21). Abbreviated names for VaR methodologies are clearly needed; one must be careful, though, to be certain of which method they represent.

There are two basic characteristics of the simulation that are present in all VaR methodologies: parametricity and volatility. Parametricity refers to whether the methodology uses a parametric or non-parametric simulation, as seen in Section 3.1.2, while volatility may be either unconditional (e.g. variance) or conditional (e.g. ARCH-family processes).

Both concepts intermingle. To avoid confusion, we will establish four classes of VaR methodologies according to the characteristics of their simulations. In Table 3.1 we list these classes and provide some examples.

Typically, VaR methodologies are classified first according to whether or not they use a parametric simulation; hence most studies classify VaR methodologies as either parametric or non-parametric. Both are widely popular and can be found in research old and new. Historical simulation (non-parametric) and Normal VaR (parametric) are often used in research as benchmarks for other models (DOWD, 2005, pp. 39, 40).

In the beginning, VaR assumed stationary (unconditional) volatility. As we have seen, this is not a realistic assumption, due to volatility clustering. GARCH-family models forecast conditional volatility, i.e., volatility at period t given previous returns, and are the most commonly used conditional volatility models (GOORBERGH; VLAAR, 1999, p. 15), (DOWD, 2005, p. 316).


Table 3.1: Value-at-risk methodology classification according to scedasticity and distributional assumption.

Unconditional (homoscedastic) volatility:
- Parametric: parametric distribution simulation, e.g. (µ, σ) Normal VaR.
- Non-parametric: empirical distribution simulation, e.g. historical simulation.

Conditional (heteroscedastic) volatility:
- Parametric: parametric volatility-adjusted simulation, e.g. VaR with returns modeled as a Student-t distribution and GARCH(1,1)-forecasted volatility.
- Non-parametric: filtered historical simulation (FHS), e.g. historical simulation with GARCH(1,1)-forecasted volatility (BARONE-ADESI; GIANNOPOULOS, 2001).
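To illustrate the FHS class in Table 3.1, the sketch below standardizes returns by a conditional volatility path and rescales the empirical residual quantile by the volatility forecast; the volatility inputs are assumed given (e.g. from a GARCH(1,1) fit), and the helper name is illustrative only.

```python
import numpy as np

def filtered_historical_var(returns, sigma, sigma_next, alpha=0.95):
    """Sketch of filtered historical simulation (FHS): standardize the
    returns by their conditional volatilities, take the empirical quantile
    of the residuals, then rescale by the volatility forecast. The inputs
    sigma and sigma_next are assumed given (e.g. from a GARCH(1,1) fit)."""
    residuals = np.asarray(returns) / np.asarray(sigma)   # epsilon_t = R_t / sigma_t
    return sigma_next * np.quantile(residuals, 1 - alpha)

# Placeholder volatility path and synthetic returns for illustration.
rng = np.random.default_rng(5)
sigma_path = np.full(500, 0.02)
sample = rng.standard_normal(500) * sigma_path
print(filtered_historical_var(sample, sigma_path, sigma_next=0.03))
```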

3.3 A brief survey of Value-at-Risk methodologies

A vast amount of literature comparing different VaR estimation methodologies has been published over the last 20 years. In 1996, Hendricks published a comprehensive report evaluating 12 different VaR methodologies (HENDRICKS, 1996). The author concluded that most methods produced adequate results, with small differences among themselves. The author emphasized that virtually all approaches produced accurate VaR5% forecasts, but VaR1% forecasts were less reliable. It was also found that losses beyond VaR were typically 30% to 40% larger than the expected VaR, although this remark has more empirical than mathematical value. The study also confirmed the well-known stylized facts of financial data: fat tails and volatility clustering.

There are so many publications comparing VaR that it is not hard to find papers surveying that kind of literature. Abad; Benito; López (2014), in their paper A comprehensive review of Value at Risk methodologies, present a broad overview of VaR techniques and of papers that compare different methodologies. The work presents over a dozen GARCH-family models, seven different probability density functions and a comparative table with the advantages and disadvantages of the different types of methodologies we discussed in Table 3.1. The highlight of the paper is its listing of 24 studies focused on comparing VaR methods, presented in a table with the respective methodologies each paper benchmarks. The paper also presents different benchmarking measures, which we will discuss in the next section. The authors concluded that extreme value theory (EVT) and filtered historical simulation (FHS) present the best performance.
