3. UNDERSTANDING ABMs

3.3. Agent-Based Models

3.3.6. Estimation of ABMs

3.3.6.2. Bayesian Estimation

(iii) Change the structural parameters 𝜃 of the original model until the distance between the estimates of the auxiliary model using real and artificial data is minimized, as follows:

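$$\hat{\theta} = \arg\min_{\theta} \; [\hat{\beta}(\theta) - \hat{\beta}_R]' \, W^{-1} \, [\hat{\beta}(\theta) - \hat{\beta}_R] \tag{124}$$

where 𝑊 is a positive definite matrix of weights; 𝛽̂(𝜃) is the estimate of the auxiliary model parameters obtained on the artificial data simulated with 𝜃; and 𝛽̂𝑅 is the estimate of the same auxiliary model parameters obtained on the real data.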

The logic is the same as that of the method of simulated moments. If the number of parameters of the auxiliary model is equal to the number of parameters of the original model, the original model is just-identified, and the distance between the coefficients estimated on the real and on the simulated data (if the model is correctly specified) goes to zero in the limit. If the number of parameters of the auxiliary model is larger than the number of parameters of the original model, the original model is over-identified, and the distance between the estimated coefficients remains positive. Finally, if the number of parameters of the auxiliary model is smaller than the number of parameters of the original model, the original model is under-identified.
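The three steps of indirect inference can be illustrated with a minimal sketch. It is not an ABM: as an assumption made purely for illustration, the "original" model is an MA(1) process with one structural parameter, the auxiliary model is an AR(1) regression with one coefficient (so the problem is just-identified), 𝑊 is set to 1, and a plain grid search stands in for a proper optimizer. All function names are hypothetical.

```python
import random

def simulate_ma1(theta, n, seed):
    """Toy 'original' model: y_t = e_t + theta * e_{t-1}, standard normal shocks."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [shocks[t] + theta * shocks[t - 1] for t in range(1, n + 1)]

def auxiliary_estimate(y):
    """Auxiliary model: OLS slope of y_t on y_{t-1} (no intercept)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(v * v for v in y[:-1])
    return num / den

def indirect_inference(y_real, grid, n_sim, seed=123):
    """Grid search for the theta minimizing the distance in equation (124), W = 1."""
    beta_real = auxiliary_estimate(y_real)
    best_theta, best_dist = None, float("inf")
    for theta in grid:
        # The same seed is reused for every theta (common random numbers),
        # so the objective changes smoothly with theta.
        beta_sim = auxiliary_estimate(simulate_ma1(theta, n_sim, seed))
        dist = (beta_sim - beta_real) ** 2
        if dist < best_dist:
            best_theta, best_dist = theta, dist
    return best_theta, best_dist

# Pseudo "real" data generated with theta = 0.5 (an assumption for the demo).
y_real = simulate_ma1(0.5, 5000, seed=1)
grid = [i / 100 for i in range(0, 96)]
theta_hat, distance = indirect_inference(y_real, grid, n_sim=5000)
print(theta_hat, distance)
```

Because the toy problem is just-identified, the minimized distance is close to zero, in line with the discussion above; with more auxiliary coefficients than structural parameters it would generally stay positive.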

parameters is often used as a way to introduce uninformative priors. All in all, the prior is a distribution which, through the application of Bayes' theorem, yields another distribution as output.

We can point out three differences between the Bayesian approach and the SMD methods. First, in the Bayesian approach, there is no maximization involved. Second, rather than obtaining a point estimate for the parameters, we get a distribution. Third, prior knowledge may be incorporated.

Sampling the posterior distribution 𝑝(𝜃|𝑌𝑅) involves two computationally intensive processes. The first obtains an estimate of the likelihood ℒ, given the values of 𝜃. The second iterates over different values of 𝜃.

The likelihood, that is, the probability of observing the data given the current values of the parameters, can be estimated even when it is not feasible to derive it analytically. This is done by repeatedly sampling from the model output.51
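One common way to turn repeated model runs into a likelihood estimate is kernel density estimation. The sketch below assumes a scalar observed outcome, a Gaussian kernel with a fixed bandwidth, and a hypothetical `simulate_outcome` function standing in for one run of the model; none of these choices comes from the sources cited here.

```python
import math
import random

def simulate_outcome(theta, rng):
    """Hypothetical stand-in for one model run: returns a scalar outcome."""
    return theta + rng.gauss(0.0, 1.0)

def simulated_likelihood(y_obs, theta, n_runs=2000, bandwidth=0.3, seed=0):
    """Estimate the density of y_obs given theta from repeated model runs."""
    rng = random.Random(seed)
    draws = [simulate_outcome(theta, rng) for _ in range(n_runs)]
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    # Average of Gaussian kernels centred on the simulated outcomes.
    total = sum(norm * math.exp(-0.5 * ((y_obs - d) / bandwidth) ** 2) for d in draws)
    return total / n_runs

lik_near = simulated_likelihood(y_obs=1.0, theta=1.0)
lik_far = simulated_likelihood(y_obs=1.0, theta=3.0)
print(lik_near, lik_far)
```

Each likelihood evaluation costs `n_runs` simulations, which is why this step is computationally intensive when a single model run is expensive.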

Once the likelihood is known, the application of Bayes' theorem gives the value of the posterior probability density at one given value of 𝜃. However, to recover the whole shape of the posterior distribution, many values need to be sampled.
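For reference, Bayes' theorem can be written in the notation of this section as follows, where 𝑝(𝜃) denotes the prior and ℒ(𝜃; 𝑌𝑅) the likelihood (the symbol ℒ(𝜃; 𝑌𝑅) is our shorthand for the likelihood of the real data at 𝜃):

$$p(\theta \mid Y_R) = \frac{\mathcal{L}(\theta; Y_R)\, p(\theta)}{\int \mathcal{L}(\theta'; Y_R)\, p(\theta')\, d\theta'} \;\propto\; \mathcal{L}(\theta; Y_R)\, p(\theta)$$

The denominator does not depend on 𝜃, so sampling schemes typically work with the unnormalized product of likelihood and prior.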

There are four main classes of efficient sampling schemes to obtain samples from a function of 𝜃: rejection sampling, importance sampling, Markov chain Monte Carlo, and sequential Monte Carlo methods.52

In the last fifteen years, a new set of methods has appeared that approximates the posterior distribution without relying on the likelihood. These methods are labelled likelihood-free methods. The best-known class is Approximate Bayesian Computation (ABC). In what follows, we give an overview of this method.

3.3.6.2.1. Approximate Bayesian Computation (ABC)

In standard Bayesian methods, the likelihood function provides the fit of the model with the data. However, the likelihood is often computationally impractical to evaluate. The basic idea of the ABC is to replace the evaluation of the likelihood with a 0-1 indicator, describing whether the outcome of the model is close enough to the observed data.

51 See Richiardi (2018c, pp. 211-214) for a detailed explanation.

52 It is beyond the scope of this work to detail how they work. For an excellent survey on this subject, see Hartig et al. (2011).

To perform such a task, a few steps are required. First, the model outcome and the data must be summarized. Then, a distance between the simulated and the real data is computed. The model is considered close enough to the data if the distance falls within the admitted tolerance.

More formally, the basic ABC algorithm works as follows:

(i) A candidate vector 𝜃𝑖 is drawn from a prior distribution;

(ii) A simulation is run with parameter vector 𝜃𝑖, obtaining simulated data from the model density 𝑝(𝑦|𝜃𝑖);

(iii) The candidate vector is either retained or dismissed depending on whether the distance between the summary statistics computed on the artificial data 𝑆(𝑦(𝜃𝑖)) and the summary statistics computed on the real data 𝑆(𝑦𝑅) is within or outside the admitted tolerance ℎ, that is, whether 𝑑(𝑆, 𝑆𝑅) ≤ ℎ.

This procedure is repeated 𝑁 times. The retained values of the parameters define an empirical approximation of the posterior distribution.
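Steps (i) to (iii) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the model is a toy Gaussian with unknown mean rather than an ABM, the prior is uniform on [-5, 5], the summary statistic 𝑆 is the sample mean, the distance 𝑑 is the absolute difference, and the tolerance ℎ is 0.2. All function names are hypothetical.

```python
import random
import statistics

def simulate(theta, n, rng):
    """Toy model standing in for an ABM run: n draws from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def summary(y):
    """Summary statistic S: here simply the sample mean."""
    return statistics.fmean(y)

def abc_rejection(y_real, n_draws, tolerance, rng, prior=(-5.0, 5.0)):
    """Basic ABC rejection sampler; returns the retained candidate values."""
    s_real = summary(y_real)
    accepted = []
    for _ in range(n_draws):
        theta_i = rng.uniform(*prior)                    # (i) draw from the prior
        y_sim = simulate(theta_i, len(y_real), rng)      # (ii) simulate the model
        if abs(summary(y_sim) - s_real) <= tolerance:    # (iii) d(S, S_R) <= h
            accepted.append(theta_i)
    return accepted

rng = random.Random(42)
y_real = simulate(2.0, 30, rng)  # pseudo "real" data, true theta = 2 (assumption)
posterior_draws = abc_rejection(y_real, n_draws=5000, tolerance=0.2, rng=rng)
print(len(posterior_draws), statistics.fmean(posterior_draws))
```

Only a small share of the candidates is retained because most prior draws lie far from the region supported by the data. A smaller tolerance improves the approximation but lowers the acceptance rate further.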

As we can notice, there are three main ingredients in ABC: (i) the selection of summary statistics; (ii) the definition of a distance measure; and (iii) the definition of a tolerance threshold. The most challenging choice concerns the first. The standard sampling scheme for ABC is rejection sampling (i.e., candidates are drawn from the prior distribution, and only those that perform well are retained).

However, as Richiardi (2018c) argues, this is not very efficient, especially if the prior distribution differs significantly from the posterior.

In fact, this topic is an active area of research. In recent years, we have seen the development of techniques to provide guidance in the selection of the summary statistics (see Fearnhead and Prangle, 2012), as well as the use of ABC with more efficient sampling schemes (see Sisson et al., 2016).

In summary, the main difference between ABM estimation and more standard methods lies in the higher computational complexity of ABMs. Often, likelihood-based methods are impractical, unless very few parameters are involved. The challenging task of empirically validating ABMs has so far been restricted to a few relatively simple cases. Surely, this is going to change as the field of agent-based modelling matures. Likelihood-free methods (e.g., ABC) therefore seem promising, especially when coupled with efficient Monte Carlo sampling (Richiardi, 2018c; Sisson et al., 2016).