Universidade de Trás-os-Montes e Alto Douro

Many-objective Optimization

with Particle Swarm Optimization Algorithm

A thesis submitted for the

Doctorate degree in Electrical and Computer Engineering

Hélio Alves Freire

Supervisor: José Paulo Barroso de Moura Oliveira
Co-Supervisor: Eduardo José Solteiro Pires
Co-Supervisor: Maximino Esteves Correia Bessa


Universidade de Trás-os-Montes e Alto Douro

Many-objective Optimization

with Particle Swarm Optimization Algorithm

A thesis submitted for the

Doctorate degree in Electrical Engineering

Hélio Alves Freire

Supervisor: José Paulo Barroso de Moura Oliveira
Co-Supervisor: Eduardo José Solteiro Pires
Co-Supervisor: Maximino Esteves Correia Bessa

Evaluation Panel:

——————————————————————————————————————–
——————————————————————————————————————–
——————————————————————————————————————–
——————————————————————————————————————–


This thesis was funded by FCT - Fundação para a Ciência e a Tecnologia (QREN - POPH - Tipologia 4.1 - Formação Avançada, co-funded by the European Social Fund and by MCTES funds).


Statement of Originality

Unless otherwise stated in the text, the work described in this thesis was carried out solely by the candidate. None of this work has already been accepted for any other degree, nor is it being concurrently submitted in candidature for any degree.


Acknowledgements

First of all, I would like to thank my supervisor Paulo Moura Oliveira for his strong interest in my work and his constant support, encouragement, and guidance. To my co-supervisor Eduardo Pires, for his availability to discuss technical aspects related to the algorithms and for his support. To my co-supervisor Maximino Bessa, for his encouragement. I thank the institutions UTAD and INESC-TEC for providing the conditions for the realization of this thesis. I thank Tatiana for agreeing to review the thesis and for her helpful comments. Thanks to the co-authors of my publications for the fruitful cooperation and for contributing to this thesis. A special thanks to my friends Manuel, Bruno, Sandra, Frank, Jorge, Deolinda and Vagaroso. Without you this thesis could have been finished earlier. I would also like to thank my parents, brother and sister for their endless love and support.


Abstract

Many optimization problems involve several objectives, subject to certain restrictions, that should be considered simultaneously. Unlike single-objective problems, which seek the global optimal solution, problems with multiple objectives give rise to a set of solutions called the Pareto front. In the last two decades, evolutionary algorithms in conjunction with Pareto dominance principles have demonstrated a large capacity to find a set of solutions at or near the optimal Pareto front. This approach has been explored essentially in problems with two or three objectives. In recent years, evolutionary algorithms have been applied to solve problems with more than three criteria, called many-objective problems. Evolutionary algorithms and methods based on the Pareto dominance principle, developed to solve problems with two or three objectives, have proven unsuitable for many-objective problems. One of the main issues with many-objective optimization problems, using the Pareto dominance concept, is that the entire population becomes practically non-dominated at a stage in which the population is still far from the optimal Pareto front. The selection of “good” solutions for recombination with the remaining population becomes virtually random, making the population convergence to the optimal Pareto front difficult. The main principle developed in this doctoral thesis is to seed the multi-objective algorithm population with some near-optimal solutions, called corner solutions. To find the corner solutions, a Corner MOPSO algorithm is developed, inspired by evolutionary algorithms and the Pareto dominance principle for problems with two objectives. In this algorithm, a many-objective problem is transformed
into several bi-objective problems. Three evolutionary algorithms from different paradigms were used: a genetic algorithm, NSGA-II; a particle swarm algorithm, SMPSO; and a differential evolution algorithm, GDE3; into which the corner solutions are introduced. The algorithm is tested on the DTLZ family of benchmark problems, more specifically the DTLZ1, DTLZ2, DTLZ3, DTLZ4, and DTLZ5 problems, with the number of objectives for each problem ranging between 4 and 10.

The developed method is applied to design control structures with a PID controller, considering a many-objective perspective. The aim is to tune the PID controller parameters for different systems.

Key Words: Evolutionary algorithms, corner solutions, multi-objective optimization, many-objective optimization, PID controller.


Resumo

Many optimization problems involve several objectives, subject to some restrictions, that must be considered simultaneously. Unlike single-objective problems, in which the global optimal solution is sought, solving problems with multiple objectives gives rise to a set of solutions, called the Pareto front. In the last two decades, evolutionary algorithms together with the Pareto dominance principle have demonstrated a great capacity to obtain a set of solutions close to the optimal Pareto front. This has been verified mainly for problems with two or three objectives. In recent years, evolutionary algorithms have been brought to solve problems with more than three objectives, called many-objective problems. Evolutionary algorithms and methods based on the Pareto dominance principle, developed to solve problems with two or three objectives, have proven inadequate for many-objective problems. One of the central issues with many-objective optimization problems using Pareto dominance is that practically the whole population becomes non-dominated at a stage in which the population is still very far from the optimal Pareto front. The selection of the “good” solutions for recombination with the remaining population becomes practically random, making the convergence of the population to the optimal Pareto front difficult. The basic principle developed in this doctoral thesis is to seed the multi-objective algorithm population with some solutions close to the optimal Pareto front, called corner solutions.


To find the corner solutions, a Corner MOPSO algorithm is developed, inspired by evolutionary algorithms and by the Pareto dominance principle for problems with two objectives. In this algorithm, a many-objective problem is transformed into several bi-objective problems. Three evolutionary algorithms from different paradigms were used: a genetic algorithm, NSGA-II; a particle swarm algorithm, SMPSO; and a differential evolution algorithm, GDE3; into which the corner solutions are introduced. The algorithm is tested on the DTLZ family of benchmark problems, more specifically the DTLZ1, DTLZ2, DTLZ3, DTLZ4 and DTLZ5 problems, with the number of objectives for each problem varying between 4 and 10. The developed methods are applied to design control structures with a PID controller, considering a many-objective perspective. The objective is to tune the PID controller parameters for different systems.

Key Words: Evolutionary algorithms, corner solutions, multi-objective optimization, many-objective optimization, PID controller.


Contents

Acknowledgements ix

Abstract xi

Resumo xiii

List of Tables xix

List of Figures xxi

Acronyms xxv

1 Introduction 1

1.1 Problem Statement . . . 2

1.2 Research Hypothesis and Methodology . . . 3

1.3 Key Contributions . . . 4

1.4 Outline of the Thesis . . . 4

2 Multi-Objective Optimization: Introductory and Background Concepts 7

2.1 Multi-objective Optimization . . . 8

2.1.1 Solution and search space . . . 10

2.1.2 Multi-objective problem and objective space . . . 11

2.1.3 Definition of a conflicting multi-objective problem . . . 13

2.2 Pareto Terminology and Other Definitions . . . 13


2.5 Performance metrics . . . 23

2.6 Concluding Remarks . . . 26

3 Multi-objective Evolutionary Optimization 27

3.1 Evolutionary Algorithms Paradigms . . . 28

3.1.1 Evolution Strategies . . . 29

3.1.2 Evolutionary Programming . . . 29

3.1.3 Genetic Algorithms . . . 30

3.1.4 Genetic Programming . . . 30

3.2 Particle Swarm Optimization . . . 31

3.2.1 Swarm initialization . . . 32

3.2.2 Velocity . . . 33

3.2.3 Position . . . 37

3.3 PSO for Multiple Objectives . . . 39

3.4 Evolutionary Algorithms . . . 40

3.4.1 Elitist Nondominated Sorting Genetic Algorithm (NSGA-II) . . . 40

3.4.2 Third Evolution Step of Generalized Differential Evolution (GDE3) . . . 41

3.4.3 Speed-constrained Multi-objective PSO (SMPSO) . . . 42

3.5 Bio-inspired Many-objective Optimization . . . 43

3.5.1 Influence of problem dimension . . . 43

3.5.2 Visualization . . . 45

3.6 EA approaches to deal with Many-Objective Problems . . . 46

3.6.1 Preference Relations . . . 46

3.6.2 Objective reduction . . . 48

3.6.3 Preference information . . . 49

3.7 Bi-Objective optimization . . . 50

3.8 Concluding remarks . . . 51

4 Many-Objective Corner Based Particle Swarm Optimization Algorithm 53

4.1 Corner Solutions Search . . . 55

4.2 Corner MOPSO Based Search Technique . . . 57

4.2.1 Effects of transforming m-objective into bi-objective . . . 58

4.2.2 Corner MOPSO algorithm . . . 68


4.4 Corner MOPSO, results and discussion . . . 71

4.5 Corner Based Many-objective optimization . . . 73

4.6 Concluding Remarks . . . 78

5 Corner based Many-Objective PID Controller Design 83

5.1 PI-PID Controllers: Overview of Fundamental Aspects . . . 84

5.2 Case Study I - Many-Objective PSO PID Controller Tuning . . . 90

5.2.1 Problem statement . . . 91

5.2.2 Results and Discussion . . . 93

5.2.3 Concluding remarks for case study I. . . 97

5.3 Case Study II - From Single to Many-objective PID Controller Design Using Particle Swarm Optimization . . . 98

5.3.1 Single-objective PI-PID controller design using PSO algorithm . . . 99

5.3.2 The MaPSO algorithm for 2DOF PID controller design . . . 101

5.3.3 Simulation Results and Discussion. . . 103

5.3.4 PID Results Comparison . . . 116

5.3.5 Concluding remarks for case study II . . . 120

6 Conclusion 123

6.1 Research Directions . . . 125

6.2 Results Dissemination . . . 126


List of Tables

3.1 Bound for the number of points required to represent a Pareto front with resolution r = 10. . . 45

4.1 Solutions coordinates in a 3D linear geometry. . . 59

4.2 Solutions P1 transformation in linear geometry. . . 60

4.3 Solutions P2 transformation in linear geometry. . . 60

4.4 Solutions P3 transformation in linear geometry. . . 61

4.5 Solutions coordinates in a 3D convex geometry. . . 62

4.6 Solutions P1 transformation in convex geometry. . . 62

4.7 Solutions P2 transformation in convex geometry. . . 63

4.8 Solutions P3 transformation in convex geometry. . . 63

4.9 Solutions coordinates in a 3D concave geometry. . . 64

4.10 Solutions P1 transformation in concave geometry. . . 64

4.11 Solutions P2 transformation in concave geometry. . . 65

4.12 Solutions P3 transformation in concave geometry. . . 65

4.13 Solutions coordinates in a 3D linear geometry with corners out of axes. . . 66

4.14 Solutions P1 transformation in linear geometry with corners out of axes. . . 66

4.15 Solutions P2 transformation in linear geometry with corners out of axes. 67


4.20 NSGAII Average and Standard Deviation GD. . . 76

4.21 NSGAII Average and Standard Deviation Spacing. . . 78

4.22 GDE3 Average and Standard Deviation GD. . . 79

4.23 GDE3 Average and Standard Deviation Spacing. . . 80

4.24 Statistical Test GD. . . 80

4.25 Statistical Test Spacing. . . 81

5.1 Relation among Ms, vector margin, gain and phase margins. Relation among Mt and circle center (ct) and radius (rt). . . 90

5.2 Objectives maximum and minimum values for the 4 systems. . . 95

5.3 Gp1 - Two PID solutions values. . . 96

5.4 PI gains obtained using the PSO for system Gp1. . . 105

5.5 PID gains obtained using the PSO for system Gp2. . . 105

5.6 Non-dominated front obtained by Gp1. . . 112

5.7 Non-dominated front obtained by Gp2. . . 116

5.8 Comparison between MaPSO solution #7, Cohen-Coon and Murrill ITAE for Gp2. . . 119


List of Figures

2.1 Example where solutions converged and are well distributed along the Pareto front. . . 8

2.2 Pareto dominance relation illustrative example. . . 9

2.3 Example of dominated and non-dominated spaces. . . 10

2.4 Search spaces in multi-objective optimization problems. . . 12

2.5 Ideal, Nadir, and Corner solutions examples. . . 15

2.6 Example of application of the maximin method in a non-dominated population. . . 19

2.7 Optimal Pareto front for DTLZ1 . . . 21

2.8 Optimal Pareto front for DTLZ2, DTLZ3, DTLZ4 . . . 23

2.9 Optimal Pareto front for DTLZ5 . . . 24

3.1 Full connected topology . . . 37

3.2 Ring topology . . . 37

3.3 Crowding Distance . . . 41

3.4 ε-dominance example . . . 47

4.1 Corner solutions: example for a minimization problem. . . 56

4.2 Search spaces in Corner MOPSO algorithm. . . 58


4.7 Solutions in a 3D convex geometry. . . 62

4.8 Solutions P1 in a convex geometry. . . 62

4.9 Solutions P2 in a convex geometry. . . 63

4.10 Solutions P3 in a convex geometry. . . 63

4.11 Solutions in a 3D concave geometry. . . 64

4.12 Solutions P1 in a concave geometry. . . 64

4.13 Solutions P2 in a concave geometry. . . 65

4.14 Solutions P3 in a concave geometry. . . 65

4.15 Solutions in a 3D linear geometry with corners out of the axes. . . 66

4.16 Solutions P1 in a linear geometry with corners out of axes. . . 66

4.17 Solutions P2 in a linear geometry with corners out of axes. . . 67

4.18 Solutions P3 in a linear geometry with corners out of axes. . . 67

4.19 Coordinates of corner particles obtained with Corner MOPSO for DTLZ3 with 10 objectives. . . 72

4.20 Coordinates of corner particles obtained with Corner MOPSO for DTLZ3 with 7 objectives. . . 72

4.21 GD Average for DTLZ4, problem with 9 objectives in NSGAII algorithm. . . 75

4.22 Average number of solutions in Pareto front for DTLZ4, problem with 9 objectives in NSGAII algorithm. . . 75

4.23 GD Average for DTLZ3, problem with 7 objectives in GDE3 algorithm. . . 77

4.24 Average number of solutions in Pareto front for DTLZ3, problem with 7 objectives in GDE3 algorithm. . . 77

4.25 GD Average for DTLZ1, problem with 5 objectives in SMPSO algorithm. . . 77

5.1 General single input single output feedback loop. . . 84

5.2 PID control with set-point weighting. . . 87

5.3 PI/PID control with a feedforward pre-filter. . . 88


5.4 Loop Nyquist plots with circles constraints. . . 89

5.5 Gp1 - System output . . . 93

5.6 Gp1 - PID controller outputs. . . 93

5.7 Gp2 - System output. . . 94

5.8 Gp2 - PID controller output. . . 94

5.9 Gp3 - PID decision space variables. . . 94

5.10 Gp3 - Objective space normalized plot. . . 94

5.11 Gp3 - System output. . . 95

5.12 Gp3 - Control signal. . . 95

5.13 Gp1 - Two PID selected solutions. . . 96

5.14 Gp1 - Nyquist plot for two PID solutions. . . 96

5.15 Gp4 - System output. . . 96

5.16 Gp4 - Control signal . . . 96

5.17 Nyquist plots for the 4 processes Gp1 - Gp4, corresponding to the final 20 non-dominated solutions. . . 97

5.18 PI control results for system Gp1. Load disturbance step responses. . . 104

5.19 PI control results for system Gp1. Set-point tracking responses. . . . 104

5.20 PI control results for system Gp1. Nyquist plots with circles constraints. . . 106

5.21 PI control results for system Gp1. Comparison with the results obtained in Hast et al. (2013). . . 106

5.22 PI control results for system Gp1 with set-point weighting. Set-point tracking responses. . . 107

5.23 PID control results for system Gp2. Load-step responses. . . 108

5.24 PID control results for system Gp2. Set-point tracking responses. . . 108

5.25 PID control results for system Gp2. Nyquist plots with circles constraints. . . 109

5.26 PID control results for system Gp2 with set-point filtering and lead-lag pre-filtering. . . 110

5.27 Gp1 results. SPT and DR response and controller output signals for the 20 solutions. . . 111

5.28 Gp1 results. Nyquist plots for the 20 final solutions. . . 112


5.31 Gp2 results. SPT and DR response and controller output signals for the set of 20 selected solutions. . . 114

5.32 Gp2 results. Nyquist plots for the 20 solutions. . . 115

5.33 MaPSO versus MODR method for Gp2. Load disturbance rejection (LDR) responses when a unit step d = 1 is applied at t = 0 s. . . 118

5.34 MaPSO versus Cohen-Coon and Murrill rules for Gp2. Load disturbance rejection (LDR) responses when a unit step d = 1 is applied at t = 0 s. . . 120

5.35 MaPSO versus IAE minimization for LDR by Hast et al. (2013) for Gp2. Load disturbance rejection (LDR) responses when a unit step


Acronyms

ACO Ant Colony Optimization

ABC Artificial Bee Colony


BOP Bi-objective Optimization Problem

CDAS Control the Dominance Area of Solutions

CE Control Effort

DE Differential Evolution

DM Decision Maker

DRS Dominance Resistant Solutions

EA Evolutionary Algorithm

EP Evolutionary Programming

ES Evolution Strategies

FOPTD First Order Plus Time Delay

GA Genetic Algorithms

GD Generational Distance metric

GDE3 Third Evolution Step of Generalized Differential Evolution

GM Gain Margin

GP Genetic Programming

IAE Integral of Absolute Error

ITAE Integral of Time-weighted Absolute Error


MOP Multi-objective Optimization problem

MOPSO Multi-objective Particle Swarm Optimization

NSGA-II Non-dominated Sorting Genetic Algorithm II

ONVG Overall Non-dominated Vector generation

ONVGR Overall Non-dominated Vector Generation Ratio

PCA Principal Component Analysis

PI Proportional and Integral controller

PID Proportional, Integral, and Derivative controller

PM Phase Margin

PSO Particle Swarm Optimization

SOP Single-objective Optimization Problem

SP Spacing metric

SPEA2 Strength Pareto Evolutionary Algorithm 2

SPT Set-Point Tracking


CHAPTER 1

Introduction

Optimization and optimization algorithms can be applied to solve problems in several domains. Evolutionary algorithm (EA) applications have been playing an important role in real-world design and optimization tasks. They can provide a solution in a single algorithm run and are insensitive to the type of objective functions, e.g. concave, convex, linear, discontinuous, multi-modal, etc. (Deb, 2001). This does not mean that EAs are always the best algorithms for every problem, but they should be considered when solving difficult ones.

Coello et al. (2002) identified over four hundred publications with practical applications of multi-objective evolutionary algorithms (MOEA). Since then, many other applications have been reported in fields as diverse as biology (Shehu and De Jong, 2015) and economy (Hu et al., 2015). See Coello (2015) for a more recent survey of real-world MOEA applications. Multi-objective decision problems are present even in everyday life. For example, when buying a smartphone most customers pay attention to multiple aspects, such as price, performance, battery life, and size, with some of these criteria in conflict with each other. This means that a trade-off must be achieved to decide on the best solution. As a matter of fact, most real-world problems are multi-objective. The objectives have to be minimized or maximized simultaneously, taking into account a set of constraints (Deb, 2001; Coello et al., 2007). Thus, rather than obtaining a single optimal solution for all criteria, the result becomes a set of
solutions representing different incomparable designs. The solutions found by the MOEA should have good convergence to the optimal Pareto front. They should also be uniformly distributed along the front, and the extension of the front should be maximal. The final aim is for the decision maker (DM) to be able to select the preferred solution to a problem from a set of Pareto optimal solutions.

1.1 Problem Statement

In the evolutionary computation field, while single-objective optimization is a clearly delineated field, multi-objective optimization is still an open research field, particularly when the number of objectives is high (Ishibuchi et al., 2008; Chand and Wagner, 2015). MOEA gained high relevance with the capacity to solve real-world problems with two or three objectives. A common way to compare different solutions in terms of optimality is by using the Pareto dominance relation. However, when these Pareto-based comparison techniques are applied to problems with four or more objectives, called many-objective problems (Purshouse and Fleming, 2003), they often have a negative effect on the performance (Garza-Fabre et al., 2011). Pareto-based MOEA seem to be the technique most affected by high dimensionality. The three main difficulties are:

1. Search ability deterioration: The population solutions have the tendency to become non-dominated at early search stages (Corne and Knowles, 2007). Thus most of the solutions are considered good solutions, and Pareto-based algorithms have difficulty applying an appropriate selection pressure without compromising the population convergence.

2. Pareto front dimensionality: The higher the problem dimensionality, the more solutions are needed to represent the whole Pareto front. Indeed, the number of solutions increases exponentially with the number of objectives. Moreover, the DM will have a difficult task selecting one solution from among a huge number of solutions.


3. Pareto front visualization: For a problem with 2 or 3 dimensions it is easy to plot the Pareto front using a 2- or 3-dimensional graph. However, when the number of dimensions increases, it becomes more difficult to represent the population characteristics in a 2D plane. It is also harder for the DM to get a good perception of the best solution for a specific problem.

The global aim of this thesis is to present techniques that can be incorporated into most MOEA, originally developed for two- and three-objective problems, to solve many-objective problems, more specifically by attenuating the search ability deterioration. Research on this issue is the core of this thesis.

1.2 Research Hypothesis and Methodology

Considering that most real engineering applications require optimizing more than three design criteria, the aim of this thesis is to develop nature- and biologically-inspired computational techniques which may contribute to better solving this type of many-objective optimization problems. The bio-inspired techniques referred to can be classified under the Computational Intelligence umbrella, namely intelligent swarm optimization and evolutionary computation.

The research hypothesis stated in this thesis is that: Multi-objective algorithms, based on Pareto dominance, can be improved to solve many-objective problems with the appropriate insertion of corner informed solutions into the evolutionary search.

The following research questions are stated in order to test and validate the research hypothesis:

1. Does the insertion of informed solutions, particularly corner points, improve the many-objective search?

2. Does the evolutionary stage in which the informed solutions are inserted significantly determine the outcome of the many-objective search?


3. How to improve many-objective particle swarm optimization algorithms to find corner Pareto solutions?

4. Can the design of Proportional, Integral, and Derivative (PID) controllers be improved by using many-objective corner-based particle swarm optimization?

To provide the answers to these questions, the action-research methodology is used, based on experimentation. In canonical form, action-research is conducted over multiple cycles of a five-step process (Kock et al., 2008): 1) Diagnosing the problem; 2) Planning the action; 3) Taking the action; 4) Evaluating the results; and 5) Specifying lessons learned for the next cycle. This methodology is part of a continuous improvement logic, wherein each run of the cycle tends towards the problem solution. From the several experiments executed, the results are observed and analysed. The promising results are explored in further experiments until a solution is achieved that fulfills the research hypothesis.

1.3 Key Contributions

This work offers the following contributions to the field of many-objective optimization:

• A method, called Corner MOPSO, to solve many-objective problems;
• A method that is independent of the evolutionary algorithm used;
• An application of the proposed method to PID control design.

1.4 Outline of the Thesis

This thesis is organized in six chapters. Following this introduction chapter, the next two chapters describe background concepts. The last three chapters present
the techniques developed, as well as the application to design PID control structures, corresponding results, and conclusions. The chapter list is the following:

• Chapter 1 - Introduction – corresponds to the current chapter.

• Chapter 2 - Multi-Objective Optimization: Introductory and Background Concepts – introduces multi-objective optimization concepts, defines the Pareto dominance relation and the main elements required to design a MOEA.

• Chapter 3 - Multi-objective Evolutionary Optimization – presents an overview of some evolutionary algorithms. Describes the Particle Swarm Optimization (PSO) algorithm from a single- to a many-objective perspective, discussing some intrinsic aspects of this algorithm. Explores problem characteristics which a many-objective optimizer may encounter during a given algorithm run. It reviews different proposals to address the previously outlined problem characteristics.

• Chapter 4 - Many-Objective Corner Based Particle Swarm Optimization Algorithm – presents the main contribution of this thesis. In the first part, the proposed algorithm is described, based on the PSO algorithm. In the second part, some results and a comparative study are presented.

• Chapter 5 - Corner based Many-Objective PID Controller Design – the algorithm proposed in chapter 4 is applied to many-objective optimization problems in the field of control systems, and the corresponding results are presented and analysed.

• Finally, chapter 6 - Conclusion – draws the main conclusions and provides some suggestions for future research.


CHAPTER 2

Multi-Objective Optimization: Introductory and Background Concepts

Optimization is a process of finding the best possible solution for a problem, often with a lack of information and limited time to find the optimal solution. Optimization problems are present in many of our everyday activities. Different algorithms have been developed to solve these optimization problems, such as linear programming, or gradient-based methods for non-linear problems. When gradients cannot be evaluated exactly, other techniques have to be applied. Some of these methods are inspired by Darwin's theory of natural selection, and they have been applied with success in the industry and services sectors (Coello, 2015). This chapter reviews concepts useful for the development of the work presented in this thesis. Sections 2.1 and 2.2 present important terminology used in this thesis, namely multi-objective optimization concepts and Pareto principles definitions, respectively. Thereafter, in section 2.3, the principal steps of a standard multi-objective evolutionary algorithm are described. While the first three sections address general information, the next two comprise information specific to this thesis. Section 2.4 describes the benchmark problems used in this work to test the proposed algorithms and, finally, section 2.5 describes the metrics used to compare and validate these algorithms.


2.1 Multi-objective Optimization

The term optimization refers to the process of finding a feasible solution that has the minimum (or maximum) possible fitness value in one or more objective functions. Optimization is a task common to many engineering and scientific disciplines. For instance, in an engineering application, one could wish to obtain the minimum possible cost of production and/or to maximize the profit for a given product. Optimization problems that involve only one objective function are known as single-objective optimization problems (SOP), whereas those involving two or more objectives are known as multi-criteria or multi-objective optimization problems (MOP).

In single-objective approaches the aim is to find the problem's unique optimal solution, unless the problem is multi-modal with more than one optimal solution. In multi-objective optimization, on the other hand, the aim is to find a well distributed set of solutions that represent the best possible compromise among the objectives. Indeed, it is expected that the objectives have some degree of conflict among them, i.e., when one objective improves at least another must worsen. Besides the diversity of solutions representing different trade-offs among objectives, it is desired that good convergence to the global optimal front is achieved.


Figure 2.1 – Example where solutions converged and are well distributed along the Pareto front.


Figure 2.1 presents an example of a minimization bi-objective problem where the solution set is well distributed along the Pareto optimal front. To compare solutions, Pareto optimality principles are used. It is important to state that this set is formed by samples of possible elements, since the real optimal solution set in continuous problems is infinite.

By using the notion of Pareto dominance it is possible to compare two solutions. A solution strictly dominates another one if it is better in all of the problem's objectives. In weak dominance, a solution must be better than another solution in at least one objective and equal in the remaining ones. If between two solutions there are objectives where one solution is better and others where the second solution is better, then the two solutions are incomparable.
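
These dominance relations can be sketched as a small comparison routine. The following is an illustrative sketch assuming minimization in all objectives; the function names and the sample objective vectors are assumptions, not code from the thesis.

```python
def strictly_dominates(a, b):
    # a is better (smaller) than b in every objective
    return all(ai < bi for ai, bi in zip(a, b))

def weakly_dominates(a, b):
    # a is no worse than b in every objective and strictly better in at least one
    return all(ai <= bi for ai, bi in zip(a, b)) and any(ai < bi for ai, bi in zip(a, b))

def incomparable(a, b):
    # neither solution weakly dominates the other
    return not weakly_dominates(a, b) and not weakly_dominates(b, a)

# Objective vectors consistent with the relations in Figure 2.2 (values illustrative):
a, b, c, d = (0.25, 0.50), (0.50, 0.50), (0.50, 0.75), (0.10, 0.75)
print(weakly_dominates(a, b))    # True: a is better in f1 and equal in f2
print(strictly_dominates(a, c))  # True: a is better in both objectives
print(incomparable(a, d))        # True: each is better in one objective
```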


Figure 2.2 – Pareto dominance relation illustrative example.

Figure 2.2 shows the possible dominance relations between four solutions. Solution a weakly dominates solution b because it is better only in objective f1 and they have the same value in objective f2. On the other hand, solution a strictly dominates solution c because it is better in both objectives. Solutions a and d are incomparable because in objective f1 solution d is better than solution a, but in the second objective solution a is better than solution d. Therefore, the set formed by solutions a and d is called the non-dominated solution set, and solutions b and c
belong to the dominated solution set. Figure 2.3 represents the dominated space, shaded, and the non-dominated space, in white, where the black circles represent the best compromise proposals in the population and are members of the Pareto front. The grey solutions are dominated by the black ones. In many cases the best solution set obtained is only an approximation to the optimal Pareto front, but it can be close enough for its members to be considered good solutions to the problem.
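
The split into dominated and non-dominated sets can be computed with a simple pairwise scan. This is an illustrative O(n²) sketch assuming minimization, not the implementation used in the thesis:

```python
def non_dominated(population):
    """Keep the solutions not dominated by any other member (minimization)."""
    def dominates(p, q):
        # p dominates q: no worse in every objective, strictly better in one
        return all(pi <= qi for pi, qi in zip(p, q)) and any(pi < qi for pi, qi in zip(p, q))
    return [p for p in population
            if not any(dominates(q, p) for q in population if q is not p)]

# Illustrative population of bi-objective scores
pop = [(0.25, 0.50), (0.50, 0.50), (0.50, 0.75), (0.10, 0.75)]
print(non_dominated(pop))  # [(0.25, 0.5), (0.1, 0.75)]
```

Here the two incomparable solutions survive, while the two solutions dominated by the first one are filtered out.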


Figure 2.3 – Example of dominated and non-dominated spaces.

2.1.1 Solution and search space

A solution is a vector (2.1) formed by n decision variables, each one with a value that, when applied to an objective function, gives a single score representing the performance on that objective.

x = (x1, x2, . . . , xn) (2.1)

In this work, all decision variables are constrained real-values (2.2). However, any alphabet can be adopted for the decision variables.



x_i ∈ R : (x_i^L ≤ x_i ≤ x_i^H) ∧ (−∞ < x_i^L < x_i^H < ∞), ∀ i ∈ {1, 2, . . . , n} (2.2)

In (2.2), x_i^L and x_i^H represent the minimum and maximum range values for the i-th decision variable, respectively. The n decision variables with the constrained boundaries result in the decision space or search space S, represented in (2.3).

S = {(x_1, x_2, . . . , x_n) : ∀ i = 1, . . . , n, x_i ∈ R ∧ x_i^L ≤ x_i ≤ x_i^H} (2.3)

2.1.2 Multi-objective problem and objective space

A multi-objective problem (MOP) is a problem with at least two optimization objectives with some degree of conflict among them, i.e., for one objective to be improved at least another one suffers a deterioration. The aim of optimizing a problem from a multi-objective perspective is to obtain a set of optimal solutions rather than a single optimal solution. For instance, in a feedback control system it is usual to minimize both the overshoot and the rise time when a step input is applied to the reference input. However, if the rise time is minimal the overshoot tends to be high, and if the system has no overshoot the rise time may not be fast enough. The same happens if the objective is to maximize the input sensitivity while minimizing the disturbance sensitivity: maximum input sensitivity makes the system too sensitive to noise, and minimum disturbance sensitivity makes it insufficiently responsive to control inputs. It is possible to have a control system where the aim is to optimize these four objectives simultaneously. The result is a set of solutions representing a trade-off among all the objectives.

This set of optimal solutions is called the Pareto optimal set. Usually, the problems to be addressed have some constraints which have to be considered. These constraints can be boundary constraints, where the value of a decision variable is limited to some range, and constraint functions, which are expressed as inequalities over the decision variables.


Mathematically, in a minimization context, a MOP can be defined as (Deb, 2001):

minimize f(x) = (f_1(x), f_2(x), . . . , f_M(x))

subject to g_k(x) ≥ 0, k = 1, . . . , K

x_i^L ≤ x_i ≤ x_i^H, i = 1, . . . , n (2.4)

where the vector x is a solution with n decision variables and the vector f(x) represents the objective functions to be optimized, with M being the total number of objective functions. When the decision variables fulfil the function constraints (g_k), with K being the number of function constraints, and the boundary constraints (x_i^L ≤ x_i ≤ x_i^H), the feasible set is obtained in the decision variable space S, where S ⊆ R^n.

The fitness space or objective space Z (2.5) of a problem is the mapping of all possible solutions in S to the M dimensional set of results formed by applying each solution to each of M objectives:

Z = {(f_1(x), f_2(x), . . . , f_M(x)), ∀ x ∈ S} (2.5)


Figure 2.4– Search spaces in multi-objective optimization problems.

In MOP two spaces are considered, the n-dimensional decision space and the M-dimensional objective space, as represented in Figure 2.4. Each point in the decision space represents a solution and its values represent the quantities to be chosen in the optimization problem. In the objective space each point also represents a solution, and its values are the respective fitness values, which enable the solution quality to be determined. It is important to note that there is not necessarily a strict correspondence between trends seen in the decision space and those in the fitness space. Indeed, entirely continuous search spaces can give rise to discontinuous objective spaces; therefore, solutions which are closely mapped in objective space may be widely separated in search space.

2.1.3 Definition of a conflicting multi-objective problem

For the purposes of this work, the fundamental focus is on the subset of multi-objective problems that feature conflict, that is to say, those problems where there is at least some portion of the objective space where progress towards the optimum of one objective leads to movement away from the optimum of another objective:

∃ a, b ∈ S, ∃ i, j ∈ {1, 2, . . . , M}, i ≠ j : (f_i(a) < f_i(b)) ∧ (f_j(a) > f_j(b)) (2.6)

Problems that do not meet this criterion can be more readily solved using single-objective optimization, via a simple reduction of objectives (since all solutions will tend to have approximately mutual utility across objectives) or through the merging of objectives.
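Condition (2.6) can also be checked empirically over a finite sample of solutions. The sketch below tests whether two objectives conflict on a sample; the objective functions and all names are illustrative, not taken from the thesis:

```python
import random

def objectives_conflict(f_i, f_j, sample):
    """True if some pair of sampled solutions satisfies condition (2.6):
    one solution is better on f_i but worse on f_j (minimization)."""
    return any(f_i(a) < f_i(b) and f_j(a) > f_j(b)
               for a in sample for b in sample)

random.seed(4)
sample = [(random.random(),) for _ in range(50)]  # 50 one-variable solutions
f1 = lambda s: s[0]          # illustrative objective: minimize x
f2 = lambda s: 1.0 - s[0]    # conflicts with f1
f3 = lambda s: 2.0 * s[0]    # harmonious with f1
```

Here `objectives_conflict(f1, f2, sample)` holds, while `f1` and `f3` never conflict, so that pair could be reduced to a single objective.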

2.2 Pareto Terminology and Other Definitions

Pareto terminology is commonly used to compare solutions in multi-objective optimization, where the goal is to seek a set of mutually non-dominated solutions. The following definitions present some relevant concepts used in the work developed in this thesis, considering a minimization problem:


Definition 1 (Pareto Dominance relation). Given two vectors, y1 and y2, y1 dominates y2 (denoted by y1 ≺ y2) if and only if: ∀ i ∈ {1, . . . , n}, y1_i ≤ y2_i and ∃ i ∈ {1, . . . , n} : y1_i < y2_i.

Definition 2 (Pareto Optimality). A solution x∗ ∈ S is said to be Pareto optimal if there is no other solution x ∈ S such that f(x) < f(x∗).

Definition 3 (Pareto optimal set). The Pareto optimal set, P∗, is defined as: P∗ = {x ∈ S | ∄ y ∈ S : f(y) < f(x)}.

Definition 4 (Pareto front). For a Pareto optimal set, P∗, the Pareto front, PF∗, is defined as: PF∗ = {f(x) = (f_1(x), . . . , f_M(x)) | x ∈ P∗}.

Definition 5 (Strict Pareto dominance). Given two vectors, y1 and y2, y1 strictly dominates y2 (denoted by y1 ≺ y2) if and only if: ∀ i ∈ {1, . . . , n} : y1_i < y2_i.

Definition 6 (Weak Pareto Optimality). A solution x∗ ∈ S is said to be weak Pareto optimal if there does not exist another solution x ∈ S such that f(x) ≤ f(x∗).

Definition 7 (Weak Pareto optimal set). The weak Pareto optimal set, P−, is defined as: P− = {x ∈ S | ∄ y ∈ S : f(y) ≤ f(x)}.

Definition 8 (Pareto front approximation). A Pareto front approximation, denoted by PF_approx, is a subset of the objective space Z composed of mutually non-dominated vectors, i.e., for any two vectors y1, y2 ∈ PF_approx, y1 ⊀ y2 ∧ y2 ⊀ y1.

Definition 9 (Optimal solution set). The optimal solution set is the set obtained by the optimizers and is denoted by S+.

Definition 10 (Reference set). The reference set is a predefined set of solutions, normally belonging to P∗, and is denoted by R+.
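A mutually non-dominated set in the sense of Definition 8 can be extracted from any finite set of objective vectors by discarding the dominated ones. A minimal sketch (all names and data illustrative):

```python
def non_dominated(points):
    """Filter a finite set of objective vectors down to the mutually
    non-dominated ones (minimization)."""
    def dominates(p, q):
        return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))
    return [p for p in points if not any(dominates(q, p) for q in points)]

pts = [(1, 4), (2, 2), (4, 1), (3, 3), (2, 5)]   # illustrative objective vectors
front = non_dominated(pts)                       # keeps (1, 4), (2, 2), (4, 1)
```

This quadratic-cost filter is fine for small sets; efficient archives in real MOEA use incremental updates instead.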

In the objective space and in the optimal Pareto set, some points have a remarkable importance. Figure 2.5 represents some of those points in the objective space.

Definition 11 (Ideal point). The ideal point y^I = (y^I_1, . . . , y^I_M) is the vector composed of the best objective values over the objective space Z. Analytically, the ideal objective vector is expressed by: y^I_m = min_{x∈S} f_m(x), m ∈ 1, . . . , M.

Definition 12 (Nadir point). The nadir point y^N = (y^N_1, . . . , y^N_M) is the vector composed of the worst objective values over the entire optimal Pareto set. Analytically, the nadir objective vector is expressed by: y^N_m = max_{x∈P∗} f_m(x), m ∈ 1, . . . , M.

Definition 13 (Corner solutions). Corner solutions are the set of solutions of the optimal Pareto set with the worst objective value for one objective.


Figure 2.5– Ideal, Nadir, and Corner solutions examples.

2.3 Standard Multi-objective Evolutionary Algorithm

Any EA must be independent of the objective functions, which should act as a black box for the algorithm. The black-box inputs are the n decision variables of each solution and the output is the vector of objective values f(x). The vast majority of EA are guided by the same principles; a common structure is given by Algorithm 2.1. In the first step, the population initialization, the decision variable values are usually obtained from a uniform random process, with values limited by the boundary constraints. If, prior to the EA execution, there is knowledge available about a possible location of the optimum solutions, the population can be initialized in the


Algorithm 2.1: Basic structure of an EA
1: Generate the initial population
2: while stopping condition is not reached do
3:   Evaluate the population
4:   Select the best elements
5:   Change population elements
6:   Obtain new population
7: end while

neighbourhood of that location. After the population initialization, the algorithm starts an iterative process that runs for a set number of iterations or until another termination criterion is met. The third step corresponds to the population evaluation, where all the objective functions are evaluated for each member. The next step selects the best elements. In a single-objective optimization process it is easy to sort the elements by their fitness or objective value. On the other hand, since in a multi-objective optimization process there exists a set of incomparable solutions, this process is not straightforward. For example, all solutions in a Pareto front can be considered optimal solutions. However, do all of them have the same importance? Are the selected solutions the ones which, in some way, will influence the decision variable values in the next iteration?
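The steps of Algorithm 2.1 can be sketched in a minimal single-objective form; the selection and variation operators chosen here (truncation selection, Gaussian perturbation) and all parameter values are illustrative, not prescribed by the algorithm:

```python
import random

def evolutionary_algorithm(fitness, bounds, pop_size=20, iterations=100):
    """Minimal single-objective EA following Algorithm 2.1 (minimization)."""
    lo, hi = bounds
    pop = [random.uniform(lo, hi) for _ in range(pop_size)]   # step 1: initialize
    for _ in range(iterations):                               # step 2: stop condition
        pop.sort(key=fitness)                                 # step 3: evaluate
        parents = pop[:pop_size // 2]                         # step 4: select best
        children = [min(max(p + random.gauss(0.0, 0.1), lo), hi)
                    for p in parents]                         # step 5: change elements
        pop = parents + children                              # step 6: new population
    return min(pop, key=fitness)

random.seed(0)
best = evolutionary_algorithm(lambda x: (x - 1.0) ** 2, (-5.0, 5.0))
```

In a multi-objective setting, step 4 is where this sketch breaks down: sorting by a single fitness value must be replaced by dominance-based ranking and diversity maintenance.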

2.3.1 Population Initialization

Random initialization is the easiest and most common method to initialize a population. Usually, for a population of N elements, N solutions are initialized. However, initializing more than N elements and keeping the N best solutions is another possibility. In the method proposed by Bhattacharya (2008), 5N elements are initialized and just the N best solutions are used in the initial population.
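The oversampling strategy described above can be sketched as follows; the function name and parameter values are illustrative, with the factor of 5 mirroring the 5N proposal:

```python
import random

def oversampled_init(fitness, bounds, n, factor=5):
    """Draw factor*n uniform random candidates and keep the n fittest
    (minimization), mirroring the 5N initialization strategy."""
    lo, hi = bounds
    candidates = [random.uniform(lo, hi) for _ in range(factor * n)]
    return sorted(candidates, key=fitness)[:n]

random.seed(1)
pop = oversampled_init(lambda x: abs(x), (-10.0, 10.0), n=10)
```

The extra (factor − 1)·n fitness evaluations spent here count against the total evaluation budget, which is a relevant trade-off when comparing algorithms by number of evaluations.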

For some problems it is possible to seed the initial population with some expert solutions. These expert solutions can be obtained by other algorithms or using available knowledge or heuristics about the problem. For example, in a control



system design problem it is possible to estimate, by other methods, elements that are expected to be reasonable solutions.

The initialization process can play an important role in the final results obtained. It is known that, for problems with large dimensions and a low number of elements, a random initial population can produce a non-uniform distribution of the solutions in the search space. This can produce an inefficient exploration of the search space (Pant et al., 2007). For example, in Gutierrez et al. (2011) four initialization strategies are explored to solve an antenna array problem using the Particle Swarm Optimization algorithm: random, orthogonal array, chaotic, and opposition-based. The chaotic method presented the best results. Other methods can be found in a survey paper about initialization procedures in EA (Kazimipour et al., 2014).

2.3.2 Algorithm Termination Criteria

At the end of the algorithm execution it is expected that the population solutions are sufficiently close to the optimal Pareto front to be considered possible solutions of the problem. The termination criterion in Algorithm 2.1 is a critical aspect in MOEA and has not received much attention from researchers.

It is common to use a predetermined number of objective function evaluations as the termination criterion, so that different MOEA can be compared. This criterion can cause the algorithm to spend unnecessary computational resources, because the best solutions may have been found well before the termination criterion is reached. Moreover, it is also possible that the last iteration solution set is not the best ever found (Laumanns et al., 2002).

Another criterion consists in evaluating the fitness of the obtained solutions and checking whether there are significant improvements over a specific number of iterations. What may happen, besides the computational costs and time spent in the successive comparisons, is the solutions becoming 'stuck' in a local front. Whatever the criterion used, algorithms cannot run indefinitely: there is always the possibility that in the next iteration the solution quality will improve substantially.


2.3.3 Elitism and Elite Archiving

Elitism is the name given to the process of saving the best population solutions for the next iteration. Without this process, the best solutions found during an iteration can often fail to be part of the next generation. The use of elitism enables a substantial improvement of the algorithm results (Deb, 2001).

In MOEA, elitism is implemented with an external archive where the best solutions are kept. After the termination criterion has been reached, the archive solutions are presented as the final result of the algorithm. This archive only stores a limited number of solutions, usually the number of solutions to be presented as the final result. The non-dominated solutions that are candidates to the archive are usually in greater number than those the archive may contain, and unbounded or even very large archives are prohibitive in terms of search and maintenance costs. Therefore, archive truncation can result in performance degradation. The most common case is front oscillation (Laumanns et al., 2002), where good solutions that have no place in the archive may lead to the insertion, in the next iteration, of elements that would be dominated by those solutions. This front oscillation has some drawbacks. For example, when the termination condition is reached, the archive may not be filled with the best solution set found so far.

2.3.3.1 Maximin Algorithm

In this section the maximin method (Solteiro Pires et al., 2005) is presented. This method is used to select the non-dominated solutions for the external archive when there are more candidate solutions than free slots in the archive. The idea behind the algorithm is to reduce the larger areas of the Pareto front left without solutions.

Consider a population of nd_P non-dominated solutions and an archive with space for n_A solutions, where nd_P > n_A, as Figure 2.6 exemplifies.

From the solution set R = {a, b, c, d, e, f, g, h, i}, with nd_P = 9, the aim is to obtain a set of n_A = 5 solutions, represented by the set S, containing the solutions which yield the



Figure 2.6 – Example of application of the maximin method in a non-dominated population.

better solution distribution along the Pareto front. First, the corner solutions are chosen from the set R; in this example, solutions a and b. These solutions are inserted into the set S and deleted from the set R. At the end of step 1 the sets R = {c, d, e, f, g, h, i} and S = {a, b} are obtained. Then the Euclidean distance between the solutions in R and the solutions in S is calculated, and for each solution in R the smallest distance to any solution in S is kept. The solution in R with the largest such distance is then moved to S; in this example, solution c. This way, the sets R = {d, e, f, g, h, i} and S = {a, b, c} are obtained at the end of step 2. As |S| = 3 is less than 5, the process is repeated. In step 3, the distances between the solutions in R and solution c are calculated and compared with the distances calculated before; the smallest distance for each solution in R is kept, and the solution with the largest distance, here solution d, is moved from R to S. In the next step, solution e is selected and inserted into S. At this point, R = {f, g, h, i}, S = {a, b, c, d, e}, and |S| = n_A = 5. Therefore, the algorithm stops since there are no slots available in the archive.
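The maximin walk-through above can be sketched in code. The corner-solution step is simplified here to taking the extreme point in each objective, and all names and data are illustrative:

```python
import math

def maximin_select(candidates, n_archive):
    """Select n_archive well-spread points from a non-dominated set:
    seed with the extreme point in each objective, then repeatedly move
    the remaining point farthest from its nearest selected neighbour."""
    remaining = [tuple(p) for p in candidates]
    selected = []
    for m in range(len(remaining[0])):          # corner solutions
        corner = max(remaining, key=lambda p: p[m])
        selected.append(corner)
        remaining.remove(corner)
    # distance from each remaining point to its nearest selected point
    dist = {p: min(math.dist(p, s) for s in selected) for p in remaining}
    while len(selected) < n_archive and remaining:
        chosen = max(remaining, key=dist.__getitem__)
        selected.append(chosen)
        remaining.remove(chosen)
        del dist[chosen]
        for p in remaining:                     # update nearest distances
            dist[p] = min(dist[p], math.dist(p, chosen))
    return selected

pts = [(0, 10), (10, 0), (5, 5), (2, 8), (8, 2),
       (1, 9), (9, 1), (4, 6), (6, 4)]          # points on an illustrative front
archive = maximin_select(pts, 5)
```

On this example the archive keeps the two extremes, the midpoint, and two intermediate points, evenly covering the front.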


2.4 DTLZ benchmark problem functions

Over the years several algorithms have been proposed where authors claim the superiority of their methods over the others. In order to clarify these comparisons, some benchmark problems and metrics were proposed. This section presents some of the most popular ones.

The set of DTLZ benchmark problems, proposed by Deb et al. (2005b), is scalable in the number of objectives and in the number of decision variables, which facilitates the investigation of many-objective problems. In this thesis, problems DTLZ1 to DTLZ5 are used to test the efficiency of the proposed algorithms. The problems are scaled from 4 to 10 objectives, and all consider function minimization. The parameter M is the total number of objectives, m is the current objective, and n is the total number of decision variables, which can be calculated by n = M + k − 1, where k is a parameter defined according to the problem. The values of k adopted are the ones proposed by the authors.

The DTLZ problems allow investigating the properties of many-objective problems in a controlled manner, with known problem characteristics and knowledge of the Pareto optimal front (Huband et al., 2006).

DTLZ1 problem

This is a problem with k = 5. The optimal Pareto front is linear and continuous.

f_1(x) = 0.5 (1 + g(x)) ∏_{i=1}^{M−1} x_i,
f_m(x) = 0.5 (1 + g(x)) (∏_{i=1}^{M−m} x_i) (1 − x_{M−m+1}), m = 2, . . . , M−1,
f_M(x) = 0.5 (1 + g(x)) (1 − x_1),
g(x) = 100 [k + ∑_{i=M}^{n} ((x_i − 0.5)² − cos(20π(x_i − 0.5)))]. (2.7)

x_i ∈ [0, 1], for i = 1, 2, 3, . . . , n.

The optimal Pareto front is obtained when g(x) = 0, achieved with x_i = 0.5 for i = M, . . . , n; the Pareto optimal objective values then satisfy ∑_{m=1}^{M} f_m = 0.5.


Figure 2.7 illustrates the optimal Pareto front of this problem for three objectives.


Figure 2.7 – Optimal Pareto front for DTLZ1 problem for three objectives.
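A sketch of the DTLZ1 evaluation follows; the indexing is adapted to 0-based lists and the test point is illustrative, with the distance variables set to 0.5 so that it lies on the optimal front (g = 0) and the objectives sum to 0.5:

```python
import math

def dtlz1(x, n_obj):
    """DTLZ1 objective vector; x has n_obj + k - 1 variables in [0, 1],
    the last k of them forming the distance function g."""
    tail = x[n_obj - 1:]
    g = 100.0 * (len(tail) + sum((xi - 0.5) ** 2
                 - math.cos(20.0 * math.pi * (xi - 0.5)) for xi in tail))
    f = []
    for m in range(1, n_obj + 1):
        val = 0.5 * (1.0 + g)
        for xi in x[:n_obj - m]:          # product of position variables
            val *= xi
        if m > 1:
            val *= 1.0 - x[n_obj - m]     # the (1 - x_{M-m+1}) factor
        f.append(val)
    return f

# M = 3, k = 5, so n = 7; distance variables at 0.5 give g = 0
f = dtlz1([0.3, 0.7] + [0.5] * 5, n_obj=3)
```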

DTLZ2 problem

This is a problem with k = 10. The optimal Pareto front is concave and continuous.

f_1(x) = (1 + g(x)) ∏_{i=1}^{M−1} cos(x_i π/2),
f_m(x) = (1 + g(x)) (∏_{i=1}^{M−m} cos(x_i π/2)) sin(x_{M−m+1} π/2), m = 2, . . . , M−1,
f_M(x) = (1 + g(x)) sin(x_1 π/2),
g(x) = ∑_{i=M}^{n} (x_i − 0.5)². (2.8)

x_i ∈ [0, 1], for i = 1, 2, 3, . . . , n.

The optimal Pareto front is obtained when x∗_M = 0.5. Figure 2.8 shows the optimal Pareto front for three objectives in the objective space.
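DTLZ2 can be sketched analogously; on the optimal front (g = 0) the objective vector lies on the unit sphere, so the squared objectives sum to 1. The test point below is illustrative:

```python
import math

def dtlz2(x, n_obj):
    """DTLZ2 objective vector; x has n_obj + k - 1 variables in [0, 1]."""
    g = sum((xi - 0.5) ** 2 for xi in x[n_obj - 1:])
    f = []
    for m in range(1, n_obj + 1):
        val = 1.0 + g
        for xi in x[:n_obj - m]:                      # cosine chain
            val *= math.cos(xi * math.pi / 2.0)
        if m > 1:
            val *= math.sin(x[n_obj - m] * math.pi / 2.0)
        f.append(val)
    return f

# M = 3, k = 10, so n = 12; distance variables at 0.5 give g = 0
f = dtlz2([0.3, 0.7] + [0.5] * 10, n_obj=3)
```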

DTLZ3 problem


f_1(x) = (1 + g(x)) ∏_{i=1}^{M−1} cos(x_i π/2),
f_m(x) = (1 + g(x)) (∏_{i=1}^{M−m} cos(x_i π/2)) sin(x_{M−m+1} π/2), m = 2, . . . , M−1,
f_M(x) = (1 + g(x)) sin(x_1 π/2),
g(x) = 100 [k + ∑_{i=M}^{n} ((x_i − 0.5)² − cos(20π(x_i − 0.5)))]. (2.9)

x_i ∈ [0, 1], for i = 1, 2, 3, . . . , n.

The optimal Pareto front is obtained when g(x) = 0. In this problem 3^k − 1 local fronts are produced, parallel to the optimal Pareto front. Figure 2.8 shows the optimal Pareto front for three objectives.

DTLZ4 problem

This is a problem with k = 10. The parameter α = 100 is used. The optimal Pareto front is concave and continuous.

f_1(x) = (1 + g(x)) ∏_{i=1}^{M−1} cos(x_i^α π/2),
f_m(x) = (1 + g(x)) (∏_{i=1}^{M−m} cos(x_i^α π/2)) sin(x_{M−m+1}^α π/2), m = 2, . . . , M−1,
f_M(x) = (1 + g(x)) sin(x_1^α π/2),
g(x) = ∑_{i=M}^{n} (x_i − 0.5)². (2.10)

x_i ∈ [0, 1], for i = 1, 2, 3, . . . , n.

The optimal Pareto front is obtained when g(x) = 0, i.e., when x_i = 0.5 for i = M, . . . , n. Figure 2.8 shows the optimal Pareto front for three objectives.

DTLZ5 problem

This is a problem with k = 10. The optimal Pareto front is degenerate: it is meant to be an arc embedded in the M-objective space.



Figure 2.8 – Optimal Pareto front for the DTLZ2, DTLZ3, and DTLZ4 problems for three objectives.

f_1(x) = (1 + g(x)) ∏_{i=1}^{M−1} cos(θ_i π/2),
f_m(x) = (1 + g(x)) (∏_{i=1}^{M−m} cos(θ_i π/2)) sin(θ_{M−m+1} π/2), m = 2, . . . , M−1,
f_M(x) = (1 + g(x)) sin(θ_1 π/2),
g(x) = ∑_{i=M}^{n} (x_i − 0.5)²,
θ_1 = x_1,
θ_i = (1 + 2 g(x) x_i) / (4 (1 + g(x))), i = 2, 3, . . . , (M − 1). (2.11)

x_i ∈ [0, 1], for i = 1, 2, 3, . . . , n.

The optimal Pareto front is obtained when g(x) = 0. Figure 2.9 shows the optimal Pareto front for three objectives.

2.5 Performance metrics

Measuring the results quality in multi-objective optimization is much more complex than in a single-objective problem where often it is enough to compare two solutions



Figure 2.9 – Optimal Pareto front for DTLZ5.

and consider the one with the best fitness value as superior. As a multi-objective algorithm is concerned with both convergence and diversity, several metrics to measure the quality of the results have been proposed. First of all, so as not to bias the comparisons, it is important that the number or ratio of non-dominated solutions in S+, the achieved optimal solution set, is similar among the different sets under comparison. Examples of metrics that measure the capacity of S+ are the Overall Non-dominated Vector Generation (ONVG), which gives the number of solutions, |S+|, and the Overall Non-dominated Vector Generation Ratio (ONVGR), which gives the ratio between the number of solutions in the optimal solution set S+ and in the Pareto front PF∗², |S+|/|PF∗| (Veldhuizen and Lamont,

2000).

The convergence metrics measure the proximity of the optimal solution set S+ to PF∗. For an effective measure it is necessary to have previous knowledge of the optimal Pareto front, which is only possible if the problem is known a priori, as is the case of benchmark problems. For other problems, a common strategy is to build a reference set R+ with the results of previous optimizations of the problem. One example of a convergence metric is the Generational Distance GD (Veldhuizen and Lamont, 1998), with the formulation in (2.12). Another example is the additive ε-indicator (I_ε+) (Zitzler

²In continuous problems the number of solutions in PF∗ is infinite, but normally a finite number of representative solutions is used.


et al., 2003).

In this thesis the next two metrics are used, the first one to measure the convergence to the front and the second to measure the diversity of front solutions:

Generational Distance (GD): The concept of generational distance (Veldhuizen and Lamont, 1998) is used to estimate how far the elements in the non-dominated solution set provided by the algorithm are from those in the optimal Pareto set. To use this metric it is necessary to have previous knowledge of the global Pareto optimal front, which normally happens in the benchmark test problems used to measure the performance of the algorithms.

GD = √( ∑_{i=1}^{|S+|} d_i² ) / |S+| (2.12)

where d_i = min_{p∈P∗} ‖F(s_i) − F(p)‖, s_i ∈ S+. Thus, d_i is the smallest Euclidean distance from s_i ∈ S+ to the closest solution in PF∗.
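A direct sketch of (2.12), using Euclidean distances to a sampled reference front (all inputs illustrative):

```python
import math

def generational_distance(approx, front):
    """GD (2.12): sqrt of the summed squared nearest-point distances,
    divided by the number of obtained solutions."""
    d2 = sum(min(math.dist(s, p) for p in front) ** 2 for s in approx)
    return math.sqrt(d2) / len(approx)

reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]   # sampled reference front
```

For example, an approximation equal to the reference gives GD = 0, and the single point (0, 2), at Euclidean distance 1 from its nearest reference point, gives GD = 1.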

Spacing (SP): Spacing (Schott, 1995) is used to measure the variance of the distances between neighbouring solutions in the known Pareto front. It is defined as:

SP = √( (1/(b−1)) ∑_{i=1}^{b} (d̄ − d_i)² ) (2.13)

where d_i = min_{j≠i} ( ∑_{k=1}^{M} |f_k^i − f_k^j| ), i, j = 1, . . . , b, and b is the number of non-dominated solutions generated by the algorithm; M is the number of objectives; d̄ is the mean of all d_i. According to this metric, values of SP near zero are preferable because they indicate that the solutions are uniformly spaced along the obtained front.
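A sketch of (2.13); note that d_i uses the Manhattan (sum of absolute differences) distance between objective vectors, while d̄ is the mean of the d_i. The fronts below are illustrative:

```python
import math

def spacing(front):
    """SP (2.13): sample standard deviation of each solution's
    Manhattan distance to its nearest neighbour in the front."""
    b = len(front)
    d = [min(sum(abs(u - w) for u, w in zip(fi, fj))
             for j, fj in enumerate(front) if j != i)
         for i, fi in enumerate(front)]
    mean = sum(d) / b
    return math.sqrt(sum((mean - di) ** 2 for di in d) / (b - 1))

even = [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]
uneven = [(0.0, 1.0), (0.1, 0.9), (1.0, 0.0)]
```

On the evenly spaced front every nearest-neighbour distance equals 0.5, so SP = 0; the uneven front yields a positive SP.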

2.6 Concluding Remarks

In this chapter a set of background information that is used and explored throughout the thesis was given. The chapter started with a distinction between multi-objective and single-objective optimization and introduced the concepts of dominance, as well as the definitions of solution and search space, multi-objective problem and objective space. The Pareto concept and terminology are very important to this thesis. As most evolutionary algorithms follow the same main tasks, a standard evolutionary algorithm was presented with a description of some common steps. The DTLZ family of benchmark problems and the performance metrics GD and SP were described because they are used in chapter 4 to test and validate the proposed method and algorithm. In the next chapter, evolutionary algorithms which became popular to solve multi-objective problems are presented, and it is discussed why these algorithms have difficulties in solving problems with more than three objectives, outlining some of the algorithms.


CHAPTER 3

Multi-objective Evolutionary Optimization

A large variety of MOEA have been proposed since the first implementation by Schaffer (1985), with the goal of improving the efficiency of solving MOP (Coello et al., 2007). However, most of these algorithms were developed to solve problems with 2 or 3 objectives, and when applied to problems with 4 or more objectives they scale poorly (Hughes, 2005; Knowles and Corne, 2006; Praditwong and Yao, 2007; Teytaud, 2007; Wagner et al., 2006); therefore, they fail to converge to the Pareto front, because the deterioration problem becomes more prevalent with the increasing number of objectives (Li et al., 2014). Problems with 4 or more objectives are known as many-objective problems (MaOP) (Farina and Amato, 2002) and constitute a subset of multi-objective problems. Until 2007 the subject of many-objective optimization did not receive much attention from researchers. Some of the studies until that date are (Khare et al., 2003; Purshouse and Fleming, 2003; Fleming et al., 2005; Hughes, 2005; Deb et al., 2006), but the relevance and research interest of MaOP have been demonstrated in recent years, including in real problem solving, as in the following examples: nurse rostering (Sülflow et al., 2006), car controller optimization (Narukawa and Rodemann, 2012), and water supply (Kasprzyk et al., 2012). For more applications of many-objective optimization see Li et al. (2015).

This chapter starts by presenting a brief review of the evolutionary algorithms paradigms. Then the particle swarm optimization algorithm is discussed, as it is


the main base for the algorithm developed in this thesis. The final part of this chapter presents the main difficulties in solving many-objective problems, as well as the current proposals to deal with these difficulties. Another subset of multi-objective problems are bi-objective problems (BOP); a section is dedicated to exploring some characteristics of bi-objective problems due to their relevance in a technique proposed in this thesis (see section 3.7).

3.1 Evolutionary Algorithms Paradigms

Evolutionary Algorithm (EA) (Bäck, 1996) is a generic concept designating a set of stochastic optimization methods. These methods are inspired by natural evolution and selection theory. The characteristic common to many of these methods is a population of individuals in an environment with limited resources, where the competition for these resources implies a natural selection (promoting the survival of the individuals who adapt best). Generally, EA use an initial population where each element represents a possible solution to a given problem. In every iteration, the elements or solutions are evaluated to determine their fitness. Then, solutions with higher fitness have a higher probability of being recombined among themselves to produce a new offspring generation, until a stop criterion is reached. It is expected that some elements of the new generation are better than the elements in the previous generation.

The algorithms reviewed here are considered the cornerstones of EA: Evolutionary Programming (EP) (Fogel, 1962), Evolution Strategies (ES) (Rechenberg, 1965), Genetic Algorithms (GA) (Holland, 1975), and Genetic Programming (GP) (Koza, 1992); they will be briefly described next.

In the last decades, with the importance that EA have gained, new algorithms were proposed, such as: Ant Colony Optimization (ACO) (Colorni et al., 1991), Particle Swarm Optimization (PSO) (Kennedy and Eberhart, 1995), Differential Evolution (DE) (Storn and Price, 1997), and Artificial Bee Colony (ABC) (Karaboga, 2005), among many others. The description of some of these algorithms is presented in



section 3.4 where the PSO algorithm will have a more detailed description.

3.1.1 Evolution Strategies

Evolution Strategy (ES) was introduced by Rechenberg (1965). In the early versions, mutation is the only operator and a parent produces a single child by a Gaussian-distributed mutation. In the (1 + 1)−ES version, the child substitutes the parent if it is fitter; in the (1, 1)−ES version, the child always substitutes the parent. Later, Rechenberg proposed the multimembered (µ + 1)−ES, where more than one parent, µ > 1, produce a single child. Parents are chosen randomly and recombined; if the child is better than any parent, it is selected for the next generation and the worst parent is discarded from the population (Beyer and Schwefel, 2002). Schwefel introduced in 1981 two other versions of multimembered ES, (µ + λ)−ES and (µ, λ)−ES, where more than one parent produce more than one child. In the first case, the µ fittest elements among parents and children are selected for the next generation. In the second case, the µ fittest children, with λ > µ, are chosen for the next generation even if they are worse than their respective parents.

For the generation of a new individual, a Gaussian operator is used in the mutation. The child vector y_i is created from the parent vector x_i by

y_i = x_i + N(0, σ) (3.1)

where σ is the mutation strength and the vector N(0, σ) is generated using a zero-mean normal distribution with standard deviation σ. Adaptive strategies can be adopted by varying the mutation strength (Deb, 2001).
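A (1 + 1)−ES step with the Gaussian mutation (3.1) and greedy replacement can be sketched as follows; the fixed mutation strength, iteration budget, and test function are illustrative (no strength adaptation is implemented):

```python
import random

def one_plus_one_es(fitness, x0, sigma=0.3, iterations=200):
    """(1 + 1)-ES: Gaussian mutation (3.1); the child replaces the
    parent only when it is fitter (minimization)."""
    x = list(x0)
    for _ in range(iterations):
        child = [xi + random.gauss(0.0, sigma) for xi in x]
        if fitness(child) < fitness(x):   # greedy survivor selection
            x = child
    return x

random.seed(2)
best = one_plus_one_es(lambda v: sum(vi ** 2 for vi in v), [3.0, -2.0])
```

With a fixed σ the search stagnates once ‖x‖ approaches σ, which is precisely what adaptive strategies such as the 1/5 success rule address.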

3.1.2 Evolutionary Programming

Evolutionary Programming (EP) was proposed by Fogel (1962) as an approach to artificial intelligence. While in the first EP systems the individuals were represented


by finite state machines; nowadays EP uses real-valued representations (Deb, 2001). EP algorithms use only the mutation operator. Each parent produces just one child, similarly to a (µ + λ)−ES with µ = λ. The selection for the next generation is made on the merged population of parents and children, where each individual competes for survival in a stochastic tournament selection.

3.1.3 Genetic Algorithms

Genetic Algorithms (GA) were proposed by Holland (1975) and their popularity was boosted by the work of Goldberg (1989). The population is composed of individuals representing chromosomes with several genes. In GA the chromosome can be represented by a binary vector where each bit corresponds to a gene with two possible states, 0 or 1. Problems with real-valued parameters can have these encoded by binary strings. GA use a recombination or crossover operator, where parts of a chromosome are recombined with parts of another chromosome. GA also use a mutation operator, which flips the state of a gene with a given probability. In this way the offspring generation is created. The new population for the next generation is obtained from the parent population and the offspring population; different kinds of selection over the two populations can be applied to form the final new population.

3.1.4 Genetic Programming

Genetic Programming (GP) was proposed by Koza (1992); the main difference relative to GA is the data structure, with potential solutions commonly represented by trees instead of binary-coded or real-coded parameters. It uses the recombination and mutation operators: recombination exchanges sub-trees between solutions and mutation performs random changes in trees. Each population element, in contrast with GA, represents a computer program with functions and variables.


3.2 Particle Swarm Optimization

Particle Swarm Optimization (PSO) is by now a popular and well established meta-heuristic inspired by the collective intelligence of many animal swarms. This collective intelligence can be observed in flocks of birds as they look for food, avoid predators and seek to travel more quickly, among other behaviours. The intelligence lies in the group, in how the individuals cooperate and support each other, and not in each individual per se. PSO was developed by Kennedy and Eberhart (1995) and simulates swarm group behaviour, where groups of individuals work together to improve both collective and individual performance. Like other population-based meta-heuristics, PSO uses a population, called a swarm, of solutions, called particles. Each particle changes its behaviour by interacting with the other particles.

The goal of the PSO algorithm is to find the optimum of an objective function $f : S \subset \mathbb{R}^n \to \mathbb{R}$. A minimization problem will be considered. Let the search space be $S \subset \mathbb{R}^n$ and $f : S \to Y \subseteq \mathbb{R}$ the objective function, where $n$ is the search space dimension and $S$ represents the admissible space of the problem. The swarm is defined by $A = \{x_1, x_2, \ldots, x_N\}$, where $N$ is the pre-set number of particles, each defined by $x_i = (x_{i1}, x_{i2}, \ldots, x_{in}) \in S$, $i = 1, 2, \ldots, N$. Thus $x_i$ represents the position of particle $i$ in the search space and $f_i = f(x_i) \in Y$ its position (or fitness value) in the objective space. The position of each particle is adjusted by velocity values, represented by $v_i = (v_{i1}, v_{i2}, \ldots, v_{in})$, $i = 1, 2, \ldots, N$.

PSO is ruled by two main equations, the velocity equation (3.2) and the position equation (3.3):

$v_i^{t+1} = v_i^t + r_1 c_1 (pbest_i^t - x_i^t) + r_2 c_2 (gbest^t - x_i^t)$  (3.2)

$x_i^{t+1} = x_i^t + v_i^{t+1}$  (3.3)

where $r_1$ and $r_2$ are random values sampled independently of each other, and $c_1$ and $c_2$ are positive acceleration coefficients. The velocity is updated taking into account past information gathered by the algorithm: each particle memorizes the best location it has reached. Therefore, besides the swarm $A$, which keeps the particles' positions, the PSO algorithm has a memory set $P_b = \{pbest_1, pbest_2, \ldots, pbest_N\}$ with the best position reached by each individual particle. The particles are drawn towards the global minimum through the best position that any particle has visited before. This information is shared among all particles (assuming that the fully connected neighbourhood topology is adopted, see Section 3.2.2.3). The best position among all swarm particles is $gbest = \arg\min_i f(pbest_i)$.

Algorithm 3.1 represents the basic PSO algorithm structure. The next subsections discuss the different steps in the construction of a PSO algorithm.

Algorithm 3.1: Basic structure of a PSO
 1  Create and initialize the swarm
 2  While the termination criterion is not met
 3      Evaluate the swarm
 4      For each particle in the swarm do
 5          Determine pbest
 6      Determine gbest
 7      For each particle in the swarm do
 8          Evaluate the new velocity
 9          Evaluate the new position
10  Return the gbest solution
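The structure above, together with the velocity and position equations (3.2) and (3.3), can be sketched in a few lines of Python. This is an illustrative minimization of the sphere function $f(x) = \sum_j x_j^2$; the parameter values ($c_1 = c_2 = 2.0$, $N = 20$ particles, 100 iterations, bounds $[-10, 10]$) are assumptions made for the demo, not values taken from the thesis.

```python
import random

random.seed(3)

def f(x):
    return sum(xj * xj for xj in x)  # sphere function to minimize

n, N, c1, c2 = 5, 20, 2.0, 2.0

# Create and initialize the swarm (positions random, velocities zero).
x = [[random.uniform(-10, 10) for _ in range(n)] for _ in range(N)]
v = [[0.0] * n for _ in range(N)]
pbest = [xi[:] for xi in x]        # best position found by each particle
gbest = min(pbest, key=f)          # best position found by the swarm

for _ in range(100):               # termination criterion: iteration budget
    # Determine pbest and gbest.
    for i in range(N):
        if f(x[i]) < f(pbest[i]):
            pbest[i] = x[i][:]
    gbest = min(pbest, key=f)
    # Apply the velocity equation (3.2) and the position equation (3.3).
    for i in range(N):
        for j in range(n):
            r1, r2 = random.random(), random.random()
            v[i][j] += r1 * c1 * (pbest[i][j] - x[i][j]) \
                     + r2 * c2 * (gbest[j] - x[i][j])
            x[i][j] += v[i][j]

print(f(gbest))  # best fitness found
```

In practice a velocity limit or an inertia weight is usually added to keep the swarm stable; the plain update above follows the original 1995 formulation.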

3.2.1 Swarm initialization

The first step of the PSO algorithm is the initialization of the swarm particles. The initialization of the particles' positions is usually performed randomly, although other techniques can be applied, as referenced in Section 2.3.1. The initialization should be done in such a way that the particles cover the search space uniformly (Engelbrecht, 2007). In high-dimensional problems, when the initialization is made randomly, a non-uniform distribution of the particles can occur in some cases (Gutierrez et al., 2011). Assuming

