
• Initialize the parameters µ and θ, set a random partition, x, and initialize the auxiliary variables W, q, c, r, s, and the cost and penalty functions, f and h;

• For each proposed move, x → y, compute the cost differentials δ0 = f(y) − f(x) and δµ = f(y, µ) − f(x, µ);

• Accept the move with the Metropolis probability, M(δµ, θ). If the move is accepted, update x, W, q, c, r, s, f and h;

• After each batch of Metropolis sampling steps, perform a cooling step update, θ ← (1 + ε1)θ , µ ← (1 + ε2)µ , 0 < ε1 < ε2 ≪ 1 .
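The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the cited experiments: it assumes that the Metropolis probability has the form M(δ, θ) = min(1, exp(−θδ)), with θ acting as an inverse temperature, and that the penalized cost is f(x, µ) = f(x) + h(x)/µ, so that the penalty fades as µ grows. The toy two-block cost f, the penalty h, and the names `metropolis` and `hsa` are hypothetical, and the incremental updates of the auxiliary variables W, q, c, r, s are not modeled.

```python
import math
import random


def metropolis(delta, theta):
    # Assumed form: improving moves are always accepted, worsening moves
    # with probability exp(-theta * delta); theta is an inverse temperature.
    return 1.0 if delta <= 0 else math.exp(-theta * delta)


def hsa(weights, theta=0.1, mu=1.0, eps1=0.01, eps2=0.02,
        batches=200, batch_size=50, seed=0):
    rng = random.Random(seed)
    total = sum(weights)

    def f(x):
        # Toy cost: weight imbalance between the two blocks of the partition.
        s1 = sum(w for w, b in zip(weights, x) if b)
        return abs(total - 2 * s1)

    def h(x):
        # Hypothetical heuristic penalty: imbalance in block sizes.
        return abs(len(x) - 2 * sum(x))

    x = [rng.randint(0, 1) for _ in weights]   # random initial partition
    initial_cost = f(x)
    best_cost = initial_cost
    for _ in range(batches):
        for _ in range(batch_size):
            y = x[:]
            y[rng.randrange(len(x))] ^= 1      # proposed move x -> y
            delta0 = f(y) - f(x)               # differential of the plain cost
            delta_mu = delta0 + (h(y) - h(x)) / mu   # differential of f(., mu)
            if rng.random() < metropolis(delta_mu, theta):
                x = y
                best_cost = min(best_cost, f(x))
        theta *= 1 + eps1                      # cooling step
        mu *= 1 + eps2                         # penalty relaxation, eps1 < eps2
    return initial_cost, best_cost, theta, mu
```

Calling `hsa([7, 3, 9, 4, 12, 5, 8, 6])` returns the initial cost, the best cost visited, and the final values of θ and µ. Because ε1 < ε2, the penalty weight 1/µ shrinks faster than the temperature 1/θ.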

Computational experiments show that the HSA successfully overcomes the difficulties encountered by the SSA, as shown in Stern (1991). As far as we know, this was the first time this kind of perturbative heuristic was considered for SA. Pflug (1996) gives a detailed analysis of the convergence of such perturbed processes. These results are briefly reviewed in section H.1.

In the next section we are going to extend the idea of stochastic optimization to that of evolution of populations, following insights from biology. In zoology, there are many examples of heuristic merit or penalty functions, often called fitness or viability indicators, that are used as auxiliary objective functions in mate selection; see Miller (2000, 2001) and Zahavi (1975). The most famous example of such an indicator, the peacock’s tail, was given by Charles Darwin himself, who stated: “The sight of a feather in a peacock’s tail, whenever I gaze at it, makes me feel sick!” For Darwin, this case was an apparent counterexample to natural selection, since the large and beautiful feathers have no adaptive value for survival but are, quite on the contrary, a handicap to the peacock’s camouflage and flying abilities. However, the theory presented in this section gives us a key to unlock this mystery and understand the tale of the peacock’s tail.

5.3 The Way of Sex: All for One

From the interpretation of the cooling constant given in the last section, it is clear that we would have a lower constant, resulting in a faster cooling schedule, if we used a richer set of single moves, especially if the additional moves could provide short-cuts in the configuration space, like the moves indicated by the dashed line in Figure 3a. This is one of the arguments that can be used to motivate another important class of stochastic evolution algorithms, namely, Genetic Programming, the subject of the following sections. We will focus on a special class of problems known as functional trees. The general conclusions, however, remain valid in many other applications.

5.3.1 Functional Trees

In this section, we deal with methods of finding the correct specification of a complex function. This complex function must be composed recursively from a finite set, OP = {op1, op2, . . . opp}, of primitive functions or operators, and from a set, A = {a1, a2, . . .}, of atoms. The k-th operator, opk, takes a specific number, r(k), of arguments, also known as the arity of opk. We use three representations for (the value returned by) the operator opk computed on the arguments x1, x2, . . . xr(k):

opk(x1, . . . xr(k)) ;

       opk
      /   \
    x1 . . . xr(k)    ;

opk x1 . . . xr(k) .

The first is the usual form of representing a function in mathematics; the second is the tree representation, which displays the operator and its arguments as a tree; and the third is the prefix, preorder or LISP style representation, which is a compact form of the tree representation.
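To make the correspondence between the three representations concrete, the following minimal sketch stores a functional tree as a nested Python tuple, (operator, argument, ..., argument), with atoms stored as strings, and prints it in function-call and prefix form. The tuple encoding and the names `to_call` and `to_prefix` are assumptions made for illustration only.

```python
def to_call(tree):
    """Usual mathematical form: op(x1, ..., xr)."""
    if isinstance(tree, str):          # an atom
        return tree
    op, *args = tree
    return op + "(" + ", ".join(to_call(a) for a in args) + ")"


def to_prefix(tree):
    """Prefix (LISP style) form: (op x1 ... xr)."""
    if isinstance(tree, str):          # an atom
        return tree
    op, *args = tree
    return "(" + " ".join([op] + [to_prefix(a) for a in args]) + ")"


# The nested tuple itself plays the role of the tree representation.
example = ("or", ("not", "a"), "c")
print(to_call(example))     # or(not(a), c)
print(to_prefix(example))   # (or (not a) c)
```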

As a first problem, let us consider the specification of a Boolean function of q variables, f(x1, . . . xq), to match a target table, g(x1, . . . xq), see Angeline (1996) and Banzhaf et al. (1998). The primitive sets of operators and atoms for this problem are:

OP = {∼, ∧, ∨, →, ⊼, ⊗} and A = {x1, . . . xq, 0, 1} .

Notice that while the first operator (not) is unary, the last five (and, or, imply, nand, xor) are binary.

x   y   ∼x   x∧y   x∨y   x→y   x⊼y   x⊗y

0   0    1    0     0     1     1     0

0   1    1    0     1     1     1     1

1   0    0    0     1     0     1     1

1   1    0    1     1     1     0     0

The set, OP, of Boolean operators defined above is clearly redundant. Notice, for example, that

x1 → x2 = ∼(x1 ∧ ∼x2) , ∼x1 = x1 ⊼ x1 and x1 ∧ x2 = ∼(x1 ⊼ x2) .

This redundancy may, nevertheless, facilitate the search for the best configuration in the problem’s functional space.
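The redundancy identities can be checked mechanically by enumerating all truth assignments. The sketch below does this for the three identities involving imply, not and nand; the function names (`NOT`, `AND`, `IMP`, `NAND`, `equivalent`) are illustrative choices, with truth values encoded as the integers 0 and 1.

```python
from itertools import product


def NOT(x):
    return 1 - x


def AND(x, y):
    return x & y


def IMP(x, y):
    return (1 - x) | y


def NAND(x, y):
    return 1 - (x & y)


def equivalent(f, g, nvars):
    """True if f and g agree on every 0/1 assignment of nvars variables."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=nvars))


# x1 -> x2  =  not(x1 and not x2)
print(equivalent(IMP, lambda x, y: NOT(AND(x, NOT(y))), 2))   # True
# not x1  =  x1 nand x1
print(equivalent(NOT, lambda x: NAND(x, x), 1))               # True
# x1 and x2  =  not(x1 nand x2)
print(equivalent(AND, lambda x, y: NOT(NAND(x, y)), 2))       # True
```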

Example 1a shows a target table, g(a, b, c). As is usual when the target function is an experimentally observed variable, the target function is not completely specified. Unspecified values in the target table are indicated by the don’t-care symbol ∗. The two solutions, f1 and f2, match the table in all specified cases. Solution f1, however, is simpler and for that reason may be preferred, see section 4 for further comments.

a   b   c   g   f1   f2

0   0   0   1   1    1

0   0   1   1   1    1

0   1   0   ∗   1    0

0   1   1   ∗   1    0

1   0   0   0   0    0

1   0   1   1   1    1

1   1   0   0   0    0

1   1   1   1   1    1

f1:
      ∨
     / \
    ∼   c
    |
    a

f2:
         ∨
       /   \
      ∧     ∧
     / \   / \
    ∼   ∼ a   c
    |   |
    a   b

f1 = (∼a) ∨ c , f2 = (∼a ∧ ∼b) ∨ (a ∧ c) .

f1 = (∨ (∼ a) c) , f2 = (∨ (∧ (∼ a) (∼ b)) (∧ a c)) .

Example 1a: Two Boolean functional trees for the target g(a, b, c).
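As an illustration of how a candidate tree can be scored against a partially specified target, the sketch below evaluates f1 and f2 on all eight rows of the table, skipping the don't-care rows, and counts the nodes of each tree as a crude measure of complexity. The nested-tuple encoding, the use of `None` for the symbol ∗, and the names `evaluate`, `matches` and `size` are assumptions made for this example.

```python
from itertools import product

OPS = {
    "not": lambda x: 1 - x,
    "and": lambda x, y: x & y,
    "or": lambda x, y: x | y,
}


def evaluate(tree, env):
    """Evaluate a nested-tuple Boolean tree; atoms are looked up in env."""
    if isinstance(tree, str):
        return env[tree]
    op, *args = tree
    return OPS[op](*(evaluate(a, env) for a in args))


def size(tree):
    """Number of nodes (operators plus atoms) in the tree."""
    return 1 if isinstance(tree, str) else 1 + sum(size(a) for a in tree[1:])


# Target g(a, b, c) for rows 000, 001, ..., 111; None marks a don't-care.
TARGET = [1, 1, None, None, 0, 1, 0, 1]


def matches(tree):
    """True if the tree agrees with the target on every specified row."""
    rows = product((0, 1), repeat=3)
    return all(g is None or evaluate(tree, dict(zip("abc", row))) == g
               for row, g in zip(rows, TARGET))


F1 = ("or", ("not", "a"), "c")
F2 = ("or", ("and", ("not", "a"), ("not", "b")), ("and", "a", "c"))

print(matches(F1), size(F1))   # True 4
print(matches(F2), size(F2))   # True 9
```

Both trees match every specified row, and the node counts (4 against 9) give one simple way to express the sense in which f1 is simpler than f2.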

As a second problem, let us consider the specification of a function for an integer numerical sequence, such as the Fibonacci sequence, presented in Koza (1983).

g(j) ≡ j , if j = 0 ∨ j = 1 ;

g(j) ≡ g(j−1) + g(j−2) , if j ≥ 2 .

The following array, gj, 0≤j ≤20, lists the first 21 elements of the Fibonacci sequence.

g = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765] .

In this problem, the primitive sets of operators and atoms are:

OP = {+, −, ×, σ} , A = {j, 0, 1} ,

where j is an integer number, and the first three operators are the usual arithmetic operators. The specified function is used to compute the first n + 1 elements of the array fj, seeking to match the target array gj, 0 ≤ j ≤ n. The last primitive function is the recursive operator, σ(i, d), that behaves as follows: When computing the j-th element, f(j), σ(i, d) returns the already computed element fi, if i is in the range, 0 ≤ i < j, or a default value, d, if i is out of the range.

In the functional space of this problem, possible specifications for the Fibonacci function, in prefix representation, are

(+ (σ (− j 1) 1) (σ (− j (+ 1 1)) 0)) , (+ (σ (− j 1) 1) (+ 0 (σ (− j (+ 1 1)) 0))) .

Example 2a: Two functional trees for the Fibonacci sequence.

Since the two expressions above are functionally equivalent, the first one may be preferable for being simpler; see section 4 for further comments.
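The behavior of the recursive operator σ can be made concrete with a small interpreter. The following sketch evaluates a prefix tree for j = 0, 1, . . . n, appending each value to the array f, so that σ(i, d) can look up previously computed elements. The nested-tuple encoding, the operator name `"sigma"`, and the function names `evaluate` and `run` are assumptions made for this example; the default argument d is evaluated eagerly, which is harmless here.

```python
def evaluate(tree, j, f):
    """Evaluate a tree at index j, given the already computed elements f[0..j-1]."""
    if tree == "j":
        return j
    if isinstance(tree, int):
        return tree
    op, a, b = tree
    x, y = evaluate(a, j, f), evaluate(b, j, f)
    if op == "+":
        return x + y
    if op == "-":
        return x - y
    if op == "*":
        return x * y
    if op == "sigma":
        # sigma(i, d): element f[i] if 0 <= i < j, otherwise the default d.
        return f[x] if 0 <= x < j else y
    raise ValueError("unknown operator: " + op)


def run(tree, n):
    """Compute the first n + 1 elements f[0], ..., f[n] specified by the tree."""
    f = []
    for j in range(n + 1):
        f.append(evaluate(tree, j, f))
    return f


SHIFT = ("sigma", ("-", "j", ("+", 1, 1)), 0)          # sigma(j - 2, 0)
T1 = ("+", ("sigma", ("-", "j", 1), 1), SHIFT)
T2 = ("+", ("sigma", ("-", "j", 1), 1), ("+", 0, SHIFT))

print(run(T1, 9))                 # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
print(run(T1, 9) == run(T2, 9))   # True
```

Under this reading of σ, the two trees produce identical arrays, and the array starts at 1, 1, 2, 3, . . . , that is, it reproduces the target array g shifted by one position (fj = gj+1). An exact match at j = 0 would therefore depend on the index convention adopted for the target.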

As a third problem, we mention Polynomial Network models. These functional trees use as primitive operators linear, quadratic or cubic polynomials in one, two or three variables. For several examples and algorithmic details, see Farlow (1984), Madala and Ivakhnenko (1994) and Nikolaev and Iba (2006). Figure 4 shows a simple network used for sales forecasting; a detailed report is given in Lauretto et al. (1995). Variable x5 is a magazine’s sales forecast obtained by a VARMA time series model using historic sales, econometric and calendric data. Variables x1 to x4 are qualitative variables (in the scale: Bad, Weak, Average, Good, Excellent) used to assess the appeal or attractiveness of an individual issue of the magazine, namely: (1) cover impact; (2) editorial content; (3) promotional items; and (4) point of sale marketing.

[Diagram: polynomial network with input leaves x1, x2, x3, x5, x3, x4 and polynomial nodes of ring types 1, 2 and 3.]

Figure 4: Polynomial Network.

Rings on a node: 1- Linear; 2- (incomplete) Quadratic; 3- (incomplete) Cubic.

Of course, the optimization of a Polynomial Network is far more complex than the optimization of Boolean or algebraic networks, since not only does the topology have to be optimized (identification problem), but also, given a topology, the parameters of the polynomial functions have to be optimized (estimation problem). Parameter optimization of sub-trees can be based on Tikhonov regularization, ridge regression, steepest descent or Partan gradient rules. For several examples and algorithmic details, see Farlow (1984), Madala and Ivakhnenko (1994), Nikolaev and Iba (2001, 2003, 2006), and Stern (2008).