A simple and effective genetic algorithm for the two-stage capacitated facility location problem


Diogo R.M. Fernandes a, Caroline Rocha b, Daniel Aloise c,⇑, Glaydston M. Ribeiro d, Enilson M. Santos e, Allyson Silva

a Department of Industrial Engineering, Federal University of Rio Grande do Norte, Brazil

b School of Sciences and Technology, Federal University of Rio Grande do Norte, Brazil

c Department of Computer Engineering and Automation, Federal University of Rio Grande do Norte, Brazil

d Department of Transport Engineering, COPPE-Graduate School and Research in Engineering, Brazil

e Department of Civil Engineering, Federal University of Rio Grande do Norte, Brazil

Article info

Article history:
Received 17 October 2013
Received in revised form 9 April 2014
Accepted 31 May 2014
Available online 5 July 2014

Keywords:
Two-stage facility location
Genetic algorithm
Multi-stage transportation systems

Abstract

This paper presents a simple and effective Genetic Algorithm (GA) for the two-stage capacitated facility location problem (TSCFLP). The TSCFLP is a typical location problem which arises in freight transportation. In this problem, a single product must be transported from a set of plants to meet customer demands, passing through intermediate depots. The objective is to minimize the operation costs of the underlying two-stage transportation system while satisfying the demand and capacity constraints of its agents. For this purpose, a GA is proposed, and computational results are reported comparing the heuristic results with those obtained by two state-of-the-art Lagrangian heuristics proposed in the literature for the problem.

1. Introduction

Location Analysis is one of the most active fields in Operations Research and Industrial Engineering. It deals with the decision of optimally placing facilities in order to minimize operational costs (Nickel & Puerto, 2005; Daskin, 1995; Drezner & Hamacher, 2004). Although solved in several practical situations by intuitive methods, optimal facility location decisions usually demand more in-depth studies. Regardless of the type of business in which the company/industry is involved, location decisions are strategic and belong to the core of any planning and/or management process. Furthermore, these decisions lead to long-term commitments due to the high costs involved in installing facilities. The choice of location may also influence the relations between the company and its clients: if the client must be physically present in the process, a location is unlikely to be acceptable when the travel time or distance between the provider and the client is relatively large.

Many different facility location models have been proposed in order to cope with the large spectrum of real-world applications (see Klose & Drexl (2005) for a survey). Multi-stage modeling consists in finding the optimal placement for facilities on several hierarchically layered levels. Such problems are traditionally solved sequentially, which in some cases provides infeasible or less profitable solutions (Guyonnet, Grant, & Bagajewicz, 2009). In particular, the two-stage capacitated facility location problem (TSCFLP) is a typical problem which arises in freight transportation. In this problem, a single product must be transported from a set of plants to meet customer demands. However, the transportation is not performed directly; the plants send the product to a set of depots, which perform the delivery to customers. Thus, there are two transportation flows: between plants and depots, and between depots and customers. The TSCFLP includes capacities for plants and depots, as well as fixed and variable costs. The fixed costs are associated with the opening of plants and depots, and the variable costs are associated with the transportation flows. We therefore need to identify which plants and depots should be opened to serve all customers at a minimal total cost.

The TSCFLP has some variants. For instance, Tragantalerngsak, Holt, and Ronnqvist (2000) approached a TSCFLP in which depots are loaded with goods from a single plant and each customer is served by just one depot. Klose (2000) and Keskin and Üster (2007) approach the TSCFLP by locating only the depots, i.e., the location of the plants in the distribution system is fixed. Moreover, the latter work assumes that the number of depots to locate is known beforehand. The TSCFLP version considered in this paper is the same as that of Litvinchev and Ozuna (2012), who proposed several Lagrangian relaxations for the problem. Feasible solutions are constructed from those of the Lagrangian subproblems by applying simple heuristic procedures.

Computers & Industrial Engineering. http://dx.doi.org/10.1016/j.cie.2014.05.023

This manuscript was processed by Area Editor Qiuhong Zhao.

⇑ Corresponding author. Tel.: +55 8499069268.

E-mail addresses: diogorobson@hotmail.com (D.R.M. Fernandes), caroline.rocha@ect.ufrn.br (C. Rocha), aloise@dca.ufrn.br (D. Aloise), glaydston@pet.coppe.ufrj.br (G.M. Ribeiro), enilson@interjato.com.br (E.M. Santos), allysonfcs@gmail.com (A. Silva).

Let $x_{ij}$ and $s_{jk}$ be continuous variables which indicate, respectively, the amount of product transported from plant $i \in I$ to depot (or satellite) $j \in J$, and from satellite $j \in J$ to customer $k \in K$. Let $y_i$ be a binary variable which is equal to one if and only if plant $i \in I$ is chosen to be opened, otherwise it is equal to zero. Similarly, let $z_j$ be a binary variable which is equal to one if and only if satellite $j \in J$ is opened, otherwise it is equal to zero. Thus, the mathematical model of the TSCFLP can be expressed as:

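$$\text{Minimize } z = \sum_{i \in I} f_i y_i + \sum_{j \in J} g_j z_j + \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} + \sum_{j \in J} \sum_{k \in K} d_{jk} s_{jk} \tag{1}$$

subject to:

$$\sum_{j \in J} s_{jk} \ge q_k, \quad \forall k \in K \tag{2}$$

$$\sum_{i \in I} x_{ij} \ge \sum_{k \in K} s_{jk}, \quad \forall j \in J \tag{3}$$

$$\sum_{j \in J} x_{ij} \le b_i y_i, \quad \forall i \in I \tag{4}$$

$$\sum_{k \in K} s_{jk} \le p_j z_j, \quad \forall j \in J \tag{5}$$

$$x_{ij} \in \mathbb{R}_{+}, \quad \forall i \in I,\ j \in J \tag{6}$$

$$s_{jk} \in \mathbb{R}_{+}, \quad \forall j \in J,\ k \in K \tag{7}$$

$$y_i \in \{0, 1\}, \quad \forall i \in I \tag{8}$$

$$z_j \in \{0, 1\}, \quad \forall j \in J \tag{9}$$

where: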

$f_i$ is the fixed cost associated with plant $i \in I$;

$g_j$ is the fixed cost associated with satellite $j \in J$;

$c_{ij}$ is the cost of transporting one unit of the product between plant $i \in I$ and satellite $j \in J$;

$d_{jk}$ is the cost of transporting one unit of the product between satellite $j \in J$ and customer $k \in K$;

$q_k$ is the demand of customer $k \in K$;

$b_i$ is the capacity of plant $i \in I$; and

$p_j$ is the capacity of satellite $j \in J$.

The objective function (1) represents the total cost of the transportation system. Constraints (2) ensure that each customer is served. Constraints (3) are conservation constraints, i.e., the total amount of product transported from a depot must be at most the total transported to it from the plants. Constraints (4) and (5) are capacity constraints assigned to plants and depots, respectively. Finally, constraints (6) and (7) are assigned to the flow variables, and constraints (8) and (9) impose binary values on the respective variables.
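To make the model concrete, the following minimal Python sketch evaluates the objective (1) and checks constraints (2)–(9) for a given candidate solution. The function names, the list-of-lists data layout and the numerical tolerance are assumptions made for illustration; they are not part of the original work.

```python
def total_cost(f, g, c, d, x, s, y, z):
    """Objective (1): fixed costs of open facilities plus both transport stages."""
    n_i, n_j, n_k = len(f), len(g), len(d[0])
    fixed = sum(f[i] * y[i] for i in range(n_i)) + sum(g[j] * z[j] for j in range(n_j))
    stage1 = sum(c[i][j] * x[i][j] for i in range(n_i) for j in range(n_j))
    stage2 = sum(d[j][k] * s[j][k] for j in range(n_j) for k in range(n_k))
    return fixed + stage1 + stage2


def is_feasible(q, b, p, x, s, y, z, tol=1e-9):
    """Check constraints (2)-(9); tol is a hypothetical numerical tolerance."""
    n_i, n_j, n_k = len(b), len(p), len(q)
    demand_met = all(sum(s[j][k] for j in range(n_j)) >= q[k] - tol
                     for k in range(n_k))                                  # (2)
    conserved = all(sum(x[i][j] for i in range(n_i)) >= sum(s[j]) - tol
                    for j in range(n_j))                                   # (3)
    plant_cap = all(sum(x[i]) <= b[i] * y[i] + tol for i in range(n_i))    # (4)
    depot_cap = all(sum(s[j]) <= p[j] * z[j] + tol for j in range(n_j))    # (5)
    nonneg = all(v >= -tol for row in list(x) + list(s) for v in row)      # (6)-(7)
    binary = all(v in (0, 1) for v in list(y) + list(z))                   # (8)-(9)
    return demand_met and conserved and plant_cap and depot_cap and nonneg and binary


# Toy instance with made-up data: 2 plants, 2 satellites, 2 customers.
f, g = [10, 20], [5, 5]
c = [[1, 2], [3, 1]]          # plant -> satellite unit costs
d = [[1, 3], [2, 1]]          # satellite -> customer unit costs
q, b, p = [3, 4], [10, 10], [10, 10]
y, z = [1, 0], [1, 1]         # plant 0 and both satellites are open
x = [[3, 4], [0, 0]]
s = [[3, 0], [0, 4]]
print(total_cost(f, g, c, d, x, s, y, z), is_feasible(q, b, p, x, s, y, z))
```

In this toy example, closing plant 0 while keeping the same flows would violate constraint (4), which the check reports as infeasible.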

Marín and Pelegrin (1999) worked with an equivalent problem with fractional variables $x_{ij}, s_{jk} \in [0, 1]$, $\forall i \in I, j \in J, k \in K$, which yields a different formulation with a different polyhedral structure from model (1)–(9). Indeed, the relaxation of the same constraints in both formulations can result in different lower bounds. The reader is referred to Fisher (1981), Geoffrion and McBride (1978) and Guignard (1997) for details about Lagrangian relaxations for mixed-integer mathematical programs. For the best Lagrangian relaxation obtained in Marín and Pelegrin (1999), the relative difference between the best lower and upper bounds was 1.6% on average for instances with $|I| = |J| = |K| = 60$.

In this paper, we present a simple but effective Genetic Algorithm (GA) for the TSCFLP. Computational results show that this GA is competitive with state-of-the-art Lagrangian heuristics proposed in the literature, providing optimal solutions for several instances proposed in Litvinchev and Ozuna (2012).

This paper is organized as follows. Section 2 describes the proposed GA, including the chromosome, population, crossover and mutation operators. Section 3 presents an extensive computational experimentation, followed by our final remarks in Section 4.

2. Genetic algorithm

Metaheuristics are general frameworks to design heuristic algorithms which are able to escape from local optima. Genetic algorithms are nature-inspired metaheuristics based on an evolutionary principle which have been successfully applied to several NP-hard combinatorial optimization problems (see Leu, Matheson, & Rees (1996), Onwubolu & Mutingi (2001), Watanabe, Ida, & Gen (2005)). They work with a set of solutions, called a population, whose elements represent individuals, each one represented by a chromosome. According to Reeves (2003), the role of a GA is to recombine chromosomes with simulated genetic operators such as crossover and mutation. Algorithm 1 presents the pseudocode of a typical GA (Reeves, 2003). It starts with the selection of an initial population of chromosomes (line 1) that evolves by iteratively applying crossover (lines 4–8) and mutation (lines 9–12) operators, yielding a new population (line 15) each time. Each iteration of the loop of lines 2–16 is called a generation.

Algorithm 1. Genetic algorithm (Reeves, 2003)

 1: Create an initial population of chromosomes
 2: while Stopping condition is not satisfied do
 3:     repeat
 4:         if Crossover condition satisfied then
 5:             Select parent chromosomes
 6:             Define parameters to crossover
 7:             Apply the crossover operator
 8:         end if
 9:         if Mutation condition satisfied then
10:             Choose mutation points
11:             Apply the mutation operator
12:         end if
13:         Evaluate fitness of offspring
14:     until sufficient offspring created
15:     Select the new population
16: end while
17: return The best individual found

The stochastic component of GAs lies in the mechanism of parental selection for crossover (line 5) and in the mutation probability (line 9). The population size used in a genetic algorithm is an important factor that has a direct influence on its efficiency and effectiveness. With a small population, there is little chance of obtaining an individual of good quality (close to the optimal solution), and thus the algorithm is not effective. Conversely, if the population is composed of a large number of individuals, the probability of obtaining a good-quality solution during the evolutionary process increases, but so does the computational time, making the algorithm less efficient (Reeves, 2003).

In the sequel, we present the main components of our genetic algorithm for the TSCFLP. Many of them, e.g. chromosome codification, have been used successfully for other location problems (see for instance Alcaraz, Landete, & Monge (2012), Hosage & Goodchild (1986), Jaramillo, Bhadury, & Batta (2002), Stanimirović, Kratica, & Dugošija (2007)).


2.1. Chromosome codification

We note from the mathematical formulation (1)–(9) that if variables $y$ and $z$ are fixed, it is possible to obtain the values of the transportation variables $x$ and $s$ by solving a minimum cost flow problem (Ahuja, Magnanti, & Orlin, 1993). Our genetic algorithm uses this fact to codify a solution in a chromosome consisting of a binary vector of $|I| + |J|$ positions. If a position is equal to 1, the corresponding plant (or satellite) is opened, otherwise it is closed.

Fig. 1 shows the representation of the chromosome used. Specifically, all plants are represented before the satellites in the chromosome. Each individual in the population is evaluated by the fitness of its chromosome, which is equivalent to the cost of the solution according to the objective function (1).

2.2. Initial population

A GA uses an initial population at the beginning of the evolutionary process, which takes place through a sequence of crossover and mutation operators. In this work, two heuristics are used to generate the initial population.

The first heuristic adopts a cost-benefit criterion as its construction strategy to decide which plants and/or satellites to select. The fixed and transportation costs are considered together, thus enabling the integration between the two distribution levels. The cost-benefit of opening a facility (plant or satellite) is defined as the ratio of the facility's fixed cost plus its transport costs to its total capacity. Considering a plant $i \in I$, its cost-benefit is expressed by $\frac{f_i + \sum_{j \in J} c_{ij}}{b_i}$, while the cost-benefit of opening a satellite $j \in J$ is calculated as $\frac{\sum_{i \in I} c_{ij} y_i + g_j + \sum_{k \in K} d_{jk}}{p_j}$ (note that the cost-benefit of the satellites is given in terms of the open plants). The pseudo-code of constructive heuristic 1, called CH1, is shown in Algorithm 2.

Algorithm 2. Constructive Heuristic – CH1

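 1: $y_i \leftarrow 0, \forall i \in I$
 2: $z_j \leftarrow 0, \forall j \in J$
 3: for $i \in I$ do
 4:     $BCP_i \leftarrow \frac{f_i + \sum_{j \in J} c_{ij}}{b_i}$
 5: end for
 6: while $\sum_{i \in I} b_i y_i \le \sum_{k \in K} q_k$ do
 7:     Select an index $i' \in \{\ell \mid y_\ell = 0\}$ with probability given by $\frac{BCP_{i'}}{\sum_{\ell : y_\ell = 0} BCP_\ell}$
 8:     $y_{i'} \leftarrow 1$
 9: end while
10: for $j \in J$ do
11:     $BCS_j \leftarrow \frac{\sum_{i \in I} c_{ij} y_i + g_j + \sum_{k \in K} d_{jk}}{p_j}$
12: end for
13: while $\sum_{j \in J} p_j z_j \le \sum_{k \in K} q_k$ do
14:     Select an index $j' \in \{\ell \mid z_\ell = 0\}$ with probability given by $\frac{BCS_{j'}}{\sum_{\ell : z_\ell = 0} BCS_\ell}$
15:     $z_{j'} \leftarrow 1$
16: end while
17: return $(y, z)$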

We observe that Algorithm 2 is divided into two stages: the selection of plants and the selection of satellites. The algorithm begins with all plants and satellites closed (lines 1 and 2). In the loop of lines 3–5, the cost-benefit of each plant, denoted $BCP_i$ for $i = 1, \ldots, |I|$, is calculated. Next, in the loop of lines 6–9, plants are opened until the demand of the customers can be satisfied. They are randomly selected at each iteration in line 7 according to a probability based on the cost-benefit criterion. Similarly, satellites are opened in lines 10–16 until the total demand of the customers can be satisfied. Finally, the chromosome represented by the pair $(y, z)$ is returned in line 17.

To open plants, the probability of opening each closed plant is calculated, and one of them is randomly selected to be opened (lines 7 and 8). In lines 10–16 this process is repeated for the satellites. In the loop of lines 10–12, the cost-benefit of each satellite, denoted $BCS_j$ for $j = 1, \ldots, |J|$, is calculated. Then, in line 14, one satellite is chosen randomly to be opened. This procedure is repeated until all demands can be satisfied. Finally, the plants and satellites opened by heuristic CH1 are returned in line 17. At the end, the solution of the algorithm is presented as a binary vector with components equal to 1 if the corresponding facility is open, and 0 otherwise, as shown in Fig. 1.
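A minimal Python sketch of CH1 follows, written directly from Algorithm 2. The function name `ch1`, the `rng` argument and the guard that raises an error when every facility is already open are assumptions added for illustration. As in the pseudocode, the selection probability is taken proportional to the cost-benefit value itself.

```python
import random


def ch1(f, g, c, d, q, b, p, rng=random):
    """Randomized constructive heuristic CH1 (sketch of Algorithm 2)."""
    n_i, n_j = len(f), len(g)
    demand = sum(q)
    y, z = [0] * n_i, [0] * n_j                                    # lines 1-2

    # Lines 3-5: cost-benefit of each plant.
    bcp = [(f[i] + sum(c[i])) / b[i] for i in range(n_i)]
    # Lines 6-9: open plants until their capacity exceeds the total demand.
    while sum(b[i] * y[i] for i in range(n_i)) <= demand:
        closed = [i for i in range(n_i) if y[i] == 0]
        if not closed:  # guard added in this sketch
            raise ValueError("total plant capacity does not exceed total demand")
        y[rng.choices(closed, weights=[bcp[i] for i in closed])[0]] = 1

    # Lines 10-12: cost-benefit of each satellite, given the open plants.
    bcs = [(sum(c[i][j] * y[i] for i in range(n_i)) + g[j] + sum(d[j])) / p[j]
           for j in range(n_j)]
    # Lines 13-16: open satellites until their capacity exceeds the total demand.
    while sum(p[j] * z[j] for j in range(n_j)) <= demand:
        closed = [j for j in range(n_j) if z[j] == 0]
        if not closed:  # guard added in this sketch
            raise ValueError("total satellite capacity does not exceed total demand")
        z[rng.choices(closed, weights=[bcs[j] for j in closed])[0]] = 1

    return y, z                                                    # line 17


# Toy data (made up); a seeded generator makes the run reproducible.
y, z = ch1([10, 20], [5, 5], [[1, 2], [3, 1]], [[1, 3], [2, 1]],
           [3, 4], [10, 10], [10, 10], rng=random.Random(0))
print(y + z)  # chromosome: plants first, then satellites
```

Calling the function repeatedly with different seeds yields the diverse set of chromosomes needed to fill the initial population.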

The second constructive heuristic, called CH2, starts from the solution of the linear programming relaxation of model (1)–(9). Then, it iteratively sets to zero the binary variables with values near zero, and to one the binary variables with values near one (a tolerance of $10^{-6}$ was used). Initially, the linear relaxation of model (1)–(9) is solved and the variables $y_i$, with $i \in \{1, \ldots, |I|\}$, and $z_j$, with $j \in \{1, \ldots, |J|\}$, which are closest to 1 are fixed to that value (i.e., plant $i$ and satellite $j$ are opened). Then, the relaxed problem is repeatedly solved and the procedure of fixing a pair of variables $y_i$ and $z_j$ is performed at each iteration until the total capacity of the opened plants and the total capacity of the opened satellites each exceed the total demand of the customers. Heuristic CH2 is used only once due to its computational time, while CH1 is used to form the rest of the population. In general, the individual constructed by CH2 has a better fitness than those generated by CH1, and it is used as a guide to the search process inside the solution space.

Solutions generated by constructive heuristics are not necessarily optimal, and may often be improved by local search heuristics. Two local searches developed for the TSCFLP are described below. The first local search (LS1) performs isolated moves of opening and closing facilities (plants and satellites). These moves are made by taking a facility, whether plant or satellite, and complementing its current state. Thus, a closed plant/satellite is opened, and vice versa. A solution's neighborhood encompasses all neighbors obtained by state complementation of the facilities. To obtain the total cost of a neighbor solution, including the transportation costs, the minimum cost flow algorithm proposed by Goldberg (1997) is used considering the set of opened plants and satellites (the algorithm is not run for an infeasible solution). If an improving neighbor solution is found, it becomes the new current solution. The neighborhood search moves are performed iteratively for each plant and each satellite in the current solution until it is impossible to find a better solution in the neighborhood.

The second local search (LS2) is performed after LS1. The use of two local searches in sequence is meant to broadly explore the search space (Hansen & Mladenović, 2001). LS2 differs from LS1 because it works with complementary pairs of facilities. A complementary pair is composed of an opened facility (bit 1) and a closed facility (bit 0). LS2 starts with a search for complementary pairs. Whenever a complementary pair is found, the facilities' values are complemented by opening the closed facility and closing the opened one. Then, the minimum cost flow algorithm is run to obtain the total cost of the new solution, which is compared to that of the current solution. If the new solution is better, it becomes the new current solution. The local search ends when the evaluation of all complementary pairs does not result in a better solution than the current one.
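The two neighborhoods can be sketched in Python as below. The sketch assumes a first-improvement strategy, allows a complementary pair to mix a plant and a satellite, and takes the cost evaluation as a user-supplied function `cost(chromosome)` that returns `None` for infeasible chromosomes; all three are assumptions, since the text does not fix these details, and the function names are hypothetical.

```python
import math


def _evaluate(cost, chromosome):
    """Map infeasible chromosomes (cost None) to +infinity."""
    value = cost(chromosome)
    return math.inf if value is None else value


def ls1(chromosome, cost):
    """LS1: flip the state of one facility at a time, keeping improving moves."""
    current, best = list(chromosome), _evaluate(cost, chromosome)
    improved = True
    while improved:
        improved = False
        for pos in range(len(current)):
            current[pos] ^= 1                      # complement the facility state
            value = _evaluate(cost, current)
            if value < best:
                best, improved = value, True
            else:
                current[pos] ^= 1                  # undo a non-improving move
    return current, best


def ls2(chromosome, cost):
    """LS2: swap the states of an open/closed (complementary) pair of facilities."""
    current, best = list(chromosome), _evaluate(cost, chromosome)
    improved = True
    while improved:
        improved = False
        for a in range(len(current)):
            for b in range(a + 1, len(current)):
                if current[a] == current[b]:
                    continue                       # not a complementary pair
                current[a], current[b] = current[b], current[a]
                value = _evaluate(cost, current)
                if value < best:
                    best, improved = value, True
                else:
                    current[a], current[b] = current[b], current[a]
    return current, best


# Toy cost (made up): four facilities of capacity 5, at least 10 units needed.
fixed_costs = [9, 4, 7, 1]

def toy_cost(chromosome):
    if 5 * sum(chromosome) < 10:
        return None
    return sum(fc * bit for fc, bit in zip(fixed_costs, chromosome))

after_ls1, _ = ls1([1, 1, 1, 1], toy_cost)
print(ls2(after_ls1, toy_cost))
```

In the toy run, LS1 stops at a solution that no single flip improves, and LS2 then reaches a cheaper one by exchanging an open facility for a closed one, which illustrates why the two searches are applied in sequence.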

2.3. Crossover and mutation operators

As stated before, the GA is a metaheuristic based on biological selection mechanisms. In computational terms, the operator responsible for the crossover of the parental chromosomes arbitrarily selects the genes of the two parents with an equal probability of selection. In Fig. 2, the crossing of two parental chromosomes generates a child chromosome with some genes from parent 1 and some from parent 2. This procedure diversifies the population because the children generated replace the parents in the next generation.

As in biology, during crossing, there is a small possibility that abnormalities occur without a logical reason. The mutation operator is applied to the generated offspring so that every time a mutation occurs, the mutant gene bit is inverted. In other words, if the gene is an opened facility, it will close, and vice versa (Fig. 3). The mutation operator is a strategy for diversifying the population expected to generate different solutions, increasing the area of the search space explored by the algorithm.

The crossover and mutation operators may generate chromosomes that are infeasible for the TSCFLP. It is easy to verify the feasibility of a chromosome by running a maximum flow algorithm (Ahuja et al., 1993). In particular, for instances in which there are arcs between every pair of plants and satellites and every pair of satellites and customers, it is sufficient to verify that: (i) $\sum_{i \in I} b_i y_i \ge \sum_{k \in K} q_k$, and (ii) $\sum_{j \in J} p_j z_j \ge \sum_{k \in K} q_k$. In this work, whenever the crossover and mutation operators yield an infeasible chromosome, it is discarded from the population.
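The operators and the capacity test (i)–(ii) can be sketched as follows. The per-gene application of the mutation rate is an assumption (the text does not say whether the rate applies per gene or per offspring), and the function names and the `rng` argument are hypothetical.

```python
import random


def uniform_crossover(parent1, parent2, rng=random):
    """Each gene of the child comes from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent1, parent2)]


def mutate(chromosome, rate, rng=random):
    """Invert each bit independently with probability `rate` (assumed per gene)."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]


def has_enough_capacity(chromosome, b, p, q):
    """Conditions (i) and (ii), valid when all plant-satellite and
    satellite-customer arcs exist."""
    y, z = chromosome[:len(b)], chromosome[len(b):]
    demand = sum(q)
    plants_ok = sum(bi * yi for bi, yi in zip(b, y)) >= demand       # (i)
    satellites_ok = sum(pj * zj for pj, zj in zip(p, z)) >= demand   # (ii)
    return plants_ok and satellites_ok


# Toy usage with made-up data: two plants followed by two satellites.
rng = random.Random(42)
child = mutate(uniform_crossover([1, 0, 1, 1], [0, 1, 0, 1], rng), 0.05, rng)
if has_enough_capacity(child, b=[10, 10], p=[10, 10], q=[3, 4]):
    print("offspring kept:", child)
else:
    print("offspring discarded:", child)
```

The capacity test is much cheaper than a maximum flow computation, so it is a natural filter to apply before evaluating the fitness of an offspring.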

Algorithm 3. Genetic algorithm with elitism for the TSCFLP

 1: Create an initial population of chromosomes
 2: while Stopping condition is not satisfied do
 3:     repeat
 4:         if Crossover condition satisfied then
 5:             Select parent chromosomes
 6:             Define parameters to crossover
 7:             Apply the crossover operator
 8:         end if
 9:         if Mutation condition satisfied then
10:             Choose mutation points
11:             Apply the mutation operator
12:         end if
13:         Evaluate fitness of offspring
14:     until sufficient offspring created
15:     Select the new population
16:     if the best individual found by the algorithm is updated then
17:         Perform local searches LS1 followed by LS2 in the best individual
18:     end if
19:     Perform local searches LS1 followed by LS2 in the whole population every N generations
20: end while
21: return The best individual found

2.4. Elitism

The proposed GA considers within its evolutionary process an extra intensification process. Whenever a generated individual is the best found so far, the local search heuristics LS1 and LS2 are applied sequentially in order to improve this elite individual. Moreover, these local search heuristics are applied at intervals of N generations to each individual of the current population. Algorithm 3 shows how Algorithm 1 is modified by applying the elitism procedure.
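The overall loop of Algorithm 3 might be organized as in the following Python sketch. It is a skeleton under several assumptions that the text leaves open: parents are drawn uniformly at random, the number of attempts to build feasible offspring is bounded, empty slots are filled with copies of the incumbent, and the operators are passed in as functions (`fitness` returning `math.inf` for infeasible chromosomes, `local_search` standing for LS1 followed by LS2). All names are hypothetical.

```python
import math
import random


def ga_with_elitism(population, fitness, crossover, mutate, local_search,
                    generations, interval, rng=random):
    """Skeleton of Algorithm 3 with user-supplied operators."""
    size = len(population)
    best = min(population, key=fitness)
    for generation in range(1, generations + 1):
        offspring = []
        for _ in range(20 * size):  # bounded attempts (assumption of this sketch)
            if len(offspring) == size:
                break
            parent1, parent2 = rng.sample(population, 2)
            child = mutate(crossover(parent1, parent2, rng), rng)
            if fitness(child) < math.inf:  # infeasible offspring are discarded
                offspring.append(child)
        # Children replace the parents; missing slots are filled with the incumbent.
        population = offspring + [list(best) for _ in range(size - len(offspring))]
        candidate = min(population, key=fitness)
        if fitness(candidate) < fitness(best):
            # Lines 16-18: intensify around a newly found best individual.
            best = min(candidate, local_search(candidate), key=fitness)
        if generation % interval == 0:
            # Line 19: local search on the whole population every N generations.
            population = [local_search(ind) for ind in population]
            best = min([best] + population, key=fitness)
    return best


# Toy usage (made-up data): four facilities of capacity 5, at least 10 units needed.
costs = [9, 4, 7, 1]

def toy_fitness(ch):
    return math.inf if 5 * sum(ch) < 10 else sum(w * bit for w, bit in zip(costs, ch))

def toy_crossover(a, b, rng):
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def toy_mutate(ch, rng):
    return [bit ^ 1 if rng.random() < 0.05 else bit for bit in ch]

start = [[1, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
print(ga_with_elitism(start, toy_fitness, toy_crossover, toy_mutate,
                      lambda ch: ch, generations=30, interval=10,
                      rng=random.Random(1)))
```

Passing the operators as arguments keeps the skeleton independent of how the fitness is computed, so a minimum cost flow evaluation can be plugged in without changing the loop.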

3. Computational experiments

Computational experiments were performed on an Intel Pentium machine with a 2.3 GHz clock and 24 GB of RAM. The genetic algorithm was implemented in C++ and compiled with gcc 4.2. Five classes of instances were artificially generated by varying seven parameters of the TSCFLP: $b_i, f_i, c_{ij}, p_j, g_j, d_{jk}$ and $q_k$. Table 1 shows, for each class, the range of values assumed for each parameter, from which a specific value for an instance is drawn according to a uniform random distribution, where $B = \frac{\sum_{k \in K} q_k}{|I|}$ and $P = \frac{\sum_{k \in K} q_k}{|J|}$. All instances have twice as many satellites as plants and twice as many customers as satellites, so that $|K| = 2|J| = 4|I|$. For simplicity, we will use the number of plants, i.e., $|I|$, to indicate the size of the considered instances.

The instances can be found at http://www.gerad.ca/aloise/publications.html.
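As an illustration of the generation scheme, the sketch below draws a Class 1 instance using the ranges of Table 1 ($b_i \in [2B, 5B]$, $f_i \in [2 \times 10^4, 3 \times 10^4]$, $c_{ij} \in [35, 45]$, $p_j \in [2P, 5P]$, $g_j \in [8 \times 10^3, 1.2 \times 10^4]$, $d_{jk} \in [55, 65]$, $q_k \in [10, 20]$). It is not the authors' generator: drawing real values rather than integers, the function name and the dictionary layout are all assumptions.

```python
import random


def generate_class1(n_plants, rng=random):
    """Draw a Class 1 instance with |K| = 2|J| = 4|I| (hypothetical generator)."""
    n_sat, n_cust = 2 * n_plants, 4 * n_plants
    q = [rng.uniform(10, 20) for _ in range(n_cust)]
    big_b = sum(q) / n_plants   # B: total demand divided by the number of plants
    big_p = sum(q) / n_sat      # P: total demand divided by the number of satellites
    return {
        "q": q,
        "b": [rng.uniform(2 * big_b, 5 * big_b) for _ in range(n_plants)],
        "p": [rng.uniform(2 * big_p, 5 * big_p) for _ in range(n_sat)],
        "f": [rng.uniform(2e4, 3e4) for _ in range(n_plants)],
        "g": [rng.uniform(8e3, 1.2e4) for _ in range(n_sat)],
        "c": [[rng.uniform(35, 45) for _ in range(n_sat)] for _ in range(n_plants)],
        "d": [[rng.uniform(55, 65) for _ in range(n_cust)] for _ in range(n_sat)],
    }


instance = generate_class1(50, rng=random.Random(2014))
print(len(instance["b"]), len(instance["p"]), len(instance["q"]))
```

With these ranges every single plant has at least $2B$ units of capacity, so the combined capacity of all plants is at least twice the total demand, and the same holds for the satellites.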

3.1. Constructive heuristics and local searches

Our first set of experiments aims to evaluate the performance of the proposed constructive heuristics and local searches in generating the initial population of the GA. Table 2 reports the heuristic results for instances with $|I| = 50$. The first column refers to the class of the instance, while the second shows its identifier within the class. Five problem instances were generated for each class, totaling 25 instances. The other columns refer to the relative gaps between the average cost of the heuristic solutions and the best solution obtained by the exact solver CPLEX 12.0, which was run until the instance was optimally solved or until it was halted due to lack of memory. For example, let $z_{CH1}$ be the cost of the solution obtained by the constructive heuristic CH1; then the value of $Gap_{CH1}$ is calculated as $\frac{z_{CH1} - z}{z}$, in which $z$ is the cost of the solution obtained by CPLEX. Likewise, the fourth column, $Gap_{BL1}$, shows the relative gap between CH1 followed by the first local search (LS1) and the solution obtained by CPLEX. The fifth column, $Gap_{BL2}$, refers to the CH1 heuristic followed by the second local search (LS2), and the sixth column, $Gap_{CH1BL}$, refers to the CH1 heuristic followed by the application of both local searches.

Fig. 2. Crossover operator.
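The gap computation is a one-liner; the sketch below expresses it as a percentage, which is how the tables report it. The function name is hypothetical, and the sample values are taken from Table 7 (class 1, instance 2: GA average 733350.4, CPLEX 732194).

```python
def relative_gap(z_heuristic, z_reference):
    """Relative gap, in percent, of a heuristic cost with respect to a reference cost."""
    return 100.0 * (z_heuristic - z_reference) / z_reference


# Values from Table 7, class 1, instance 2.
print(round(relative_gap(733350.4, 732194), 2))
```

A gap of zero means the heuristic matched the reference value, and a negative gap means the heuristic solution is cheaper than the reference, which can happen when CPLEX is halted before proving optimality.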

We note from Table 2 that CH1 is not effective when applied alone, as it did not find the optimal solution for any instance. Moreover, its solutions can be far from optimal, as for instance #3 of class 4 (22.19%). By means of local search 1, CH1 results were improved by 37% on average whereas, by means of local search 2, CH1 results were improved by 53% on average. When both local searches were applied, CH1 results improved by 64% on average.

Table 3 presents the computational results obtained by constructive heuristic 2. According to this table, we observe that:

• CH2 did not find the optimal solution for the tested instances, even though it obtains near-optimal solutions for the instances in class 5, in particular for instance #5 (0.26%);

• LS1 improved the solutions obtained by CH2 in almost half of the tested instances;

• LS2 improved the solutions obtained by CH2 by 52% on average; and

• The application of both local searches improved the results obtained by CH2 by 80% on average. In particular, the obtained results are better than those obtained by CPLEX for instances #1 and #3 of class 3.

3.2. Parameter setting

The GA proposed in this paper requires the definition of four parameters: population size, number of generations, interval of generations for application of the elitism procedure (N), and the mutation rate.

Computational tests were carried out to identify the most appropriate set of parameter values. They were conducted with the following combinations of values: (i) application of the elitism every $N = 20$ and $N = 50$ generations; (ii) 500, 1000 and 2000 generations; and (iii) population sizes of 50, 100 and 200 individuals. Table 4 reports average results of 50 distinct runs of the GA over the 25 generated instances with $|I| = 50$ plants. The tests were performed using a fixed mutation rate of 5%. The first column in the table refers to the number of generations N between elitism applications. The second column indicates the number of generations of the algorithm, and the third one refers to the population size. Each line in the fourth column presents the relative gap (in %) of the GA average results with respect to the best solution obtained by the CPLEX solver.

Table 1. Range values for the TSCFLP in the five classes of generated instances.

| Parameter | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 |
|---|---|---|---|---|---|
| $b_i$ | $[2B, 5B]$ | $[5B, 10B]$ | $[15B, 25B]$ | $[5B, 10B]$ | $[5B, 10B]$ |
| $f_i$ | $[2 \times 10^4, 3 \times 10^4]$ | $[2 \times 10^4, 3 \times 10^4]$ | $[2 \times 10^4, 3 \times 10^4]$ | $[2 \times 10^4, 3 \times 10^4]$ | $[2 \times 10^4, 3 \times 10^4]$ |
| $c_{ij}$ | $[35, 45]$ | $[35, 45]$ | $[35, 45]$ | $[50, 1 \times 10^2]$ | $[35, 45]$ |
| $p_j$ | $[2P, 5P]$ | $[5P, 10P]$ | $[15P, 25P]$ | $[5P, 10P]$ | $[5P, 10P]$ |
| $g_j$ | $[8 \times 10^3, 1.2 \times 10^4]$ | $[8 \times 10^3, 1.2 \times 10^3]$ | $[8 \times 10^3, 1.2 \times 10^3]$ | $[8 \times 10^3, 1.2 \times 10^4]$ | $[8 \times 10^3, 1.2 \times 10^4]$ |
| $d_{jk}$ | $[55, 65]$ | $[55, 65]$ | $[8 \times 10^2, 1 \times 10^3]$ | $[50, 1 \times 10^2]$ | $[8 \times 10^2, 1 \times 10^3]$ |
| $q_k$ | $[10, 20]$ | $[10, 20]$ | $[10, 20]$ | $[10, 20]$ | $[10, 20]$ |

Table 2. CH1 results for instances with 50 plants.

| Class | Instance | GapCH1 (%) | GapBL1 (%) | GapBL2 (%) | GapCH1BL (%) |
|---|---|---|---|---|---|
| 1 | 1 | 11.61 | 10.07 | 6.23 | 5.65 |
| 1 | 2 | 10.03 | 8.44 | 4.67 | 3.71 |
| 1 | 3 | 13.08 | 11.94 | 7.44 | 6.66 |
| 1 | 4 | 14.3 | 6.89 | 5.71 | 5.71 |
| 1 | 5 | 9.14 | 5.21 | 5.21 | 5.21 |
| 2 | 1 | 14.79 | 10.89 | 9.1 | 5.26 |
| 2 | 2 | 12.48 | 10.24 | 4.77 | 3.29 |
| 2 | 3 | 12.83 | 12.83 | 7.73 | 7.73 |
| 2 | 4 | 15.03 | 5.65 | 5.65 | 5.65 |
| 2 | 5 | 8.04 | 3.98 | 3.98 | 3.98 |
| 3 | 1 | 1.00 | 0.63 | 0.78 | 0.25 |
| 3 | 2 | 1.34 | 0.58 | 0.63 | 0.16 |
| 3 | 3 | 1.31 | 0.81 | 0.00 | 0.00 |
| 3 | 4 | 1.25 | 0.24 | 0.03 | 0.03 |
| 3 | 5 | 1.27 | 0.4 | 0.01 | 0.01 |
| 4 | 1 | 15.65 | 13.95 | 9.27 | 7.95 |
| 4 | 2 | 18.81 | 15.66 | 8.85 | 3.74 |
| 4 | 3 | 22.19 | 16.53 | 7.6 | 7.6 |
| 4 | 4 | 19.43 | 11.42 | 6.44 | 6.44 |
| 4 | 5 | 8.97 | 4.07 | 4.07 | 4.07 |
| 5 | 1 | 2.76 | 2.07 | 1.61 | 0.88 |
| 5 | 2 | 2.54 | 2.36 | 1.06 | 0.9 |
| 5 | 3 | 2.54 | 1.54 | 1.47 | 1.47 |
| 5 | 4 | 3.18 | 1.28 | 1.06 | 1.06 |
| 5 | 5 | 1.67 | 0.81 | 0.59 | 0.59 |

Table 3. CH2 results for instances with 50 plants.

| Class | Instance | GapCH2 (%) | GapBL1 (%) | GapBL2 (%) | GapCH2BL (%) |
|---|---|---|---|---|---|
| 1 | 1 | 3.87 | 1.10 | 1.80 | 0.00 |
| 1 | 2 | 3.64 | 0.33 | 1.19 | 0.13 |
| 1 | 3 | 3.22 | 0.52 | 1.36 | 0.28 |
| 1 | 4 | 2.65 | 2.65 | 1.26 | 1.26 |
| 1 | 5 | 1.08 | 1.08 | 0.11 | 0.11 |
| 2 | 1 | 0.70 | 0.70 | 0.39 | 0.39 |
| 2 | 2 | 1.60 | 1.60 | 0.33 | 0.33 |
| 2 | 3 | 2.12 | 2.12 | 0.78 | 0.78 |
| 2 | 4 | 0.81 | 0.81 | 0.1 | 0.10 |
| 2 | 5 | 0.85 | 0.85 | 0.18 | 0.85 |
| 3 | 1 | 0.81 | 0.37 | 0.47 | 0.02 |
| 3 | 2 | 0.90 | 0.21 | 0.54 | 0.19 |
| 3 | 3 | 0.78 | 0.21 | 0.57 | 0.05 |
| 3 | 4 | 0.73 | 0.17 | 0.51 | 0.04 |
| 3 | 5 | 0.74 | 0.09 | 0.52 | 0.09 |
| 4 | 1 | 1.62 | 0.45 | 1.25 | 0.2 |
| 4 | 2 | 1.18 | 1.18 | 0.62 | 0.62 |
| 4 | 3 | 1.56 | 1.56 | 0.92 | 0.92 |
| 4 | 4 | 1.96 | 0.67 | 1.14 | 0.61 |
| 4 | 5 | 2.00 | 0.99 | 1.06 | 0.99 |
| 5 | 1 | 0.32 | 0.26 | 0.24 | 0.18 |
| 5 | 2 | 0.44 | 0.44 | 0.14 | 0.14 |
| 5 | 3 | 0.47 | 0.47 | 0.09 | 0.09 |
| 5 | 4 | 0.45 | 0.45 | 0.20 | 0.20 |
| 5 | 5 | 0.26 | 0.26 | 0.10 | 0.26 |

We note from Table 4 that the best results for all tested instances are obtained using 2000 generations. Indeed, better results can be obtained by running the GA for more generations. However, according to our experiments, these improvements appeared to be too marginal compared to the additional CPU time. Two parameter configurations yield gaps equal to 0.06%. Of these, the one that requires the least CPU time has a population of 200 individuals, uses 2000 generations, and applies elitism every 50 generations. However, a fair choice of parameters should also consider efficiency. Note that with a different set of parameter values, composed of 50 individuals, 2000 generations and $N = 50$, the algorithm runs in less computation time almost without compromising its effectiveness.

The last parameter observed was the mutation rate. The results reported in Table 4 considered a fixed mutation rate of 5%. However, in order to better investigate an appropriate value for this rate, additional computational tests were performed with mutation rates of 3% and 7%. The tests were performed with the best set of parameters defined previously.

We note from Table 5 that the best performance was obtained with a mutation rate of 3%. Thus, the final set of values for our parameters was defined as:

• Total of 2000 generations;

• Elitism is applied every $N = 50$ generations;

• Population composed of 50 individuals; and

• Mutation rate of 3%.

3.3. Comparison with other heuristics from the literature

We compared our results to the ones provided by two Lagrangian heuristics proposed by Litvinchev, Pérez, and Espinosa (2012) and Litvinchev and Ozuna (2012). The first one, denoted RB3, relaxes constraints (5), whereas the second heuristic, denoted RB4, relaxes constraints (2) and (4). Both Lagrangian relaxations are then solved by the subgradient method (Guignard, 1997), and feasibility is recovered through a very simple routine (cf. Litvinchev et al. (2012) or Litvinchev & Ozuna (2012)). Our results are presented for the same instances used by those authors for the case in which the fixed costs are made proportional to the number of depots/clients.

Our evaluation is based on the parameter BF, defined by Litvinchev and Ozuna (2012) as the relative gap between the best feasible solution ($z_{BF}$) and the optimal one ($z$), i.e., $BF = \frac{z_{BF} - z}{z_{BF}} \times 100$.

Table 6 presents the results. The first column shows the name of the instance set, which is followed by the number of plants, depots and clients, i.e., $|I|$, $|J|$ and $|K|$, respectively. The fifth and sixth columns present the average relative gaps provided by RB3¹ and RB4. The seventh and eighth columns show, respectively, the relative gap for the best solution found over ten runs of the GA and the average relative gap.

It is easy to see that GA provides tighter gaps than the previous heuristics proposed in the literature, even when average gap values are considered. For the first six instance classes, GA always finds the optimal solution. Unfortunately, a comparison in terms of efficiency was not possible since the computing times of the Lagrangian heuristics are not reported in Litvinchev et al. (2012) and Litvinchev and Ozuna (2012). In the next section, we report the CPU times of our GA for instances generated in this work.

3.4. Comparison with exact solver

The objective of these experiments is to compare the efficiency of the GA proposed in this paper with a state-of-the-art exact method for solving mixed-integer programs (CPLEX), thus verifying the advantages of the heuristic approach versus the exact/optimal approach. Heuristics are usually better from the practitioner's viewpoint for problems whose optimal solutions are hard to find and/or demand huge computational effort. Moreover, in many real situations, the problem data are inaccurate or contain noise, which makes exact methods of little use. Heuristics allow the manager to explore several different scenarios in a short period of time by testing different values for the model parameters.

Table 4

Test of parameters for instances with $|I| = 50$.

| N | Generation | Population | Average gap (%) |
|---|---|---|---|
| 20 | 500 | 50 | 0.16 |
| 20 | 500 | 100 | 0.14 |
| 20 | 500 | 200 | 0.14 |
| 20 | 1000 | 50 | 0.10 |
| 20 | 1000 | 100 | 0.09 |
| 20 | 1000 | 200 | 0.09 |
| 20 | 2000 | 50 | 0.07 |
| 20 | 2000 | 100 | 0.07 |
| 20 | 2000 | 200 | 0.06 |
| 50 | 500 | 50 | 0.16 |
| 50 | 500 | 100 | 0.14 |
| 50 | 500 | 200 | 0.13 |
| 50 | 1000 | 50 | 0.10 |
| 50 | 1000 | 100 | 0.10 |
| 50 | 1000 | 200 | 0.10 |
| 50 | 2000 | 50 | 0.07 |
| 50 | 2000 | 100 | 0.07 |
| 50 | 2000 | 200 | 0.06 |

Table 5

Computational results for setting the mutation rate.

| Local search | Generation | Population | Mutation (%) | Average gap (%) |
|---|---|---|---|---|
| 50 | 2000 | 50 | 3.00 | 0.04 |
| 50 | 2000 | 50 | 5.00 | 0.07 |
| 50 | 2000 | 50 | 7.00 | 0.21 |

Table 6

Comparison with Lagrangian heuristics of Litvinchev et al. (2012) and Litvinchev and Ozuna (2012).

| Class | $\lvert I\rvert$ | $\lvert J\rvert$ | $\lvert K\rvert$ | RB3 (BF) | RB4 (BF) | GA (BF^a) | GA (BF^b) |
|---|---|---|---|---|---|---|---|
| A | 3 | 5 | 9 | 0.00 | 0.19 | 0.00 | 0.00 |
| B | 5 | 7 | 30 | 0.00 | 0.336 | 0.00 | 0.00 |
| C | 7 | 10 | 50 | 0.27 | 0.16 | 0.00 | 0.00 |
| D | 10 | 10 | 100 | 0.06 | 0.54 | 0.00 | 0.00 |
| E | 10 | 16 | 30 | | 0.58 | 0.00 | 0.00 |
| F | 30 | 30 | 30 | | 0.58 | 0.00 | 0.00 |
| G | 30 | 60 | 120 | | 0.51 | 0.00 | 0.01 |
| H | 30 | 30 | 100 | | 0.56 | 0.00 | 0.05 |
| I | 50 | 50 | 200 | | 0.88 | 0.12 | 0.13 |

a Computed using the best solution value found in 10 runs of GA.

b Computed using the average solution values found in 10 runs of GA.

RB3 solutions are not reported by Litvinchev et al. (2012) and Litvinchev and Ozuna (2012) for classes E, F, G, H and I.

Tables 7 and 8 show the average results obtained in 10 distinct runs of the GA for instances with $|I| = 50$, $|J| = 100$, $|K| = 200$ and $|I| = 100$, $|J| = 200$, $|K| = 400$, respectively. The first column refers to the class of problems and the second one identifies the instance (#1 to #5). The third column shows the average solution value of the best individual (solution) obtained by the GA in ten runs, and the fourth column shows the average execution time (in seconds) of the algorithm. The fifth and sixth columns refer to the optimal solution values and run times, respectively, provided by CPLEX. The symbol (*) in the sixth column marks the total running time of CPLEX before it was aborted due to an out-of-memory condition. In this case, the solution reported in the fifth column is the best one obtained up to that point. Finally, the seventh column (Gap) gives the relative gap between the average solution value obtained by the GA and the solution value found by CPLEX.
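The gap formula itself is not written out in this section. The short Python sketch below assumes that the gap is the absolute difference between the two values divided by the CPLEX value, which appears to reproduce the tabulated figures (the sample rows are copied from Tables 7 and 9). The function name `relative_gap` is introduced here for illustration only.

```python
def relative_gap(ga_value: float, cplex_value: float) -> float:
    """Relative gap in percent, assuming |GA - CPLEX| / CPLEX * 100."""
    return abs(ga_value - cplex_value) / cplex_value * 100.0

# (GA average value, CPLEX value) pairs copied from the tables:
rows = [
    (733350.4, 732194.0),    # Table 7, class 1, instance 2
    (727325.0, 725147.0),    # Table 7, class 1, instance 4
    (2692662.0, 3373986.0),  # Table 9, class 3, instance 4
]
gaps = [round(relative_gap(ga, cplex), 2) for ga, cplex in rows]
print(gaps)
```

Under this assumption the three rows give 0.16, 0.30 and 20.19, matching the Gap (%) entries reported for those instances.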

Table 7

Average results provided by GA and CPLEX solutions for instances with $|I| = 50$ plants.

| Class | Instance | GA | Time (s) | CPLEX | Time CPLEX (s) | Gap (%) |
|---|---|---|---|---|---|---|
| 1 | 1 | 722178.0 | 581.41 | 722178 | 5.80 | 0.00 |
| 1 | 2 | 733350.4 | 564.71 | 732194 | 46.06 | 0.16 |
| 1 | 3 | 733664.2 | 578.28 | 733473 | 246.45 | 0.03 |
| 1 | 4 | 727325.0 | 533.44 | 725147 | 58.65 | 0.30 |
| 1 | 5 | 719512.6 | 552.90 | 719431 | 133.39 | 0.01 |
| 2 | 1 | 492747.0 | 317.05 | 492747 | 6.28 | 0.00 |
| 2 | 2 | 494205.0 | 316.55 | 494203 | 22.31 | 0.00 |
| 2 | 3 | 496435.4 | 330.70 | 495089 | 142.00 | 0.27 |
| 2 | 4 | 492215.8 | 312.35 | 492107 | 17.30 | 0.02 |
| 2 | 5 | 489711.0 | 276.85 | 489625 | 11.61 | 0.02 |
| 3 | 1 | 2688951.0 | 276.45 | 2689629 | 7323.36* | 0.03 |
| 3 | 2 | 2697803.8 | 285.95 | 2698204 | 9461.61* | 0.01 |
| 3 | 3 | 2679038.0 | 271.31 | 2679964 | 10012.95* | 0.03 |
| 3 | 4 | 2692662.0 | 236.55 | 2693384 | 16160.17* | 0.03 |
| 3 | 5 | 2646182.0 | 242.34 | 2646182 | 14523.96 | 0.00 |
| 4 | 1 | 541803.0 | 303.57 | 541803 | 146.13 | 0.00 |
| 4 | 2 | 539178.0 | 307.04 | 539178 | 190.42 | 0.00 |
| 4 | 3 | 546738.4 | 318.14 | 544684 | 485.45 | 0.38 |
| 4 | 4 | 542750.0 | 279.54 | 541849 | 7531.17 | 0.17 |
| 4 | 5 | 537806.4 | 264.10 | 537782 | 129.89 | 0.00 |
| 5 | 1 | 2776346.8 | 361.04 | 2775499 | 517.72 | 0.03 |
| 5 | 2 | 2781496.0 | 344.12 | 2781496 | 509.90 | 0.00 |
| 5 | 3 | 2767842.2 | 420.73 | 2767634 | 13420.70 | 0.01 |
| 5 | 4 | 2777619.0 | 300.55 | 2777307 | 1004.31 | 0.01 |
| 5 | 5 | 2736077.8 | 318.55 | 2735567 | 520.35 | 0.02 |

* The total running time of CPLEX before it is aborted due to out of memory condition.

Table 8

Average results provided by GA and CPLEX solutions for instances with $|I| = 100$ plants.

| Class | Instance | GA | Time (s) | CPLEX | Time CPLEX (s) | Gap (%) |
|---|---|---|---|---|---|---|
| 1 | 1 | 1484057.4 | 2784.59 | 1477398 | 10019.98 | 0.45 |
| 1 | 2 | 1477503.4 | 2745.01 | 1464441 | 916.43 | 0.89 |
| 1 | 3 | 1497213.0 | 3001.82 | 1494399 | 7357.76 | 0.19 |
| 1 | 4 | 1466182.2 | 2823.38 | 1462309 | 5175.84 | 0.26 |
| 1 | 5 | 1500689.2 | 2863.24 | 1492462 | 967.47 | 0.55 |
| 2 | 1 | 979526.6 | 1483.25 | 973482 | 9620.04 | 0.62 |
| 2 | 2 | 973034.4 | 1455.77 | 968617 | 3558.10 | 0.46 |
| 2 | 3 | 989362.0 | 1427.72 | 976887 | 2731.95 | 1.28 |
| 2 | 4 | 978502.0 | 1444.06 | 975770 | 28975.46 | 0.28 |
| 2 | 5 | 952067.8 | 1419.01 | 947219 | 121366.18 | 0.51 |
| 3 | 1 | 5298518.4 | 1355.38 | 5299973 | 89547.24* | 0.03 |
| 3 | 2 | 5278225.4 | 1320.82 | 5279599 | 91032.79* | 0.03 |
| 3 | 3 | 5227517.0 | 1311.85 | 5227517 | 63668.25* | 0.00 |
| 3 | 4 | 5316646.0 | 1365.90 | 5320811 | 58087.83* | 0.08 |
| 3 | 5 | 5251934.4 | 1383.04 | 5251871 | 69316.46* | 0.00 |
| 4 | 1 | 1060799.0 | 1269.00 | 1058360 | 32763.62 | 0.23 |
| 4 | 2 | 1053288.6 | 1230.08 | 1050553 | 54782.24 | 0.26 |
| 4 | 3 | 1070421.0 | 1283.25 | 1057271 | 99944.50 | 1.24 |
| 4 | 4 | 1054638.0 | 1301.31 | 1052324 | 128018.98 | 0.22 |
| 4 | 5 | 1060621.0 | 1334.96 | 1059397 | 3739.27 | 0.12 |
| 5 | 1 | 5512662.0 | 1551.16 | 5506970 | 153249.98 | 0.10 |
| 5 | 2 | 5487604.2 | 1499.94 | 5482791 | 70784.97 | 0.09 |
| 5 | 3 | 5458935.8 | 1477.00 | 5446763 | 133319.11* | 0.22 |
| 5 | 4 | 5523513.8 | 1513.77 | 5517165 | 185827.19* | 0.12 |
| 5 | 5 | 5468022.8 | 1472.05 | 5463544 | 172795.94* | 0.08 |

* The total running time of CPLEX before it is aborted due to out of memory condition.

We conclude from Tables 7 and 8 that:

- The relative gaps between the solutions obtained by GA and those of CPLEX are always less than 0.4% for the instances with 50 plants and less than 1.3% for those with 100 plants;

- In 8 instances with 50 plants, that is, in 32% of these instances, GA obtained the optimal solution in all runs;

- In most of the instances of class 3, GA obtained better solutions than CPLEX. The main feature of the instances in class 3 is the large capacity of their plants ($b_i$ parameters) and satellites ($p_j$ parameters), which results in fewer open facilities in the optimal solution;

- Computational times spent by CPLEX are particularly small when solving instances of classes 1 and 2. These classes have in common the fact that their transportation costs are low at both levels of the distribution system; and

- Given the parameters set in Section 3.2, the GA proposed in this paper demands more time to complete as we increase the size of the instances. However, this increase occurs more slowly than that observed for CPLEX. Moreover, the computing times of the GA can be reduced if the algorithm runs with a smaller population and/or for fewer generations, and/or applies elitism less often.
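To make the scaling remark concrete, the following Python sketch compares how the mean run times grow when moving from the 50-plant to the 100-plant instances. It uses only the class 2 run times copied from Tables 7 and 8; restricting the comparison to one class and summarizing by the arithmetic mean are choices made here for illustration, and the CPLEX mean is dominated by its slowest instance.

```python
from statistics import mean

# Run times in seconds for class 2, copied from Tables 7 and 8.
ga_50 = [317.05, 316.55, 330.70, 312.35, 276.85]
ga_100 = [1483.25, 1455.77, 1427.72, 1444.06, 1419.01]
cplex_50 = [6.28, 22.31, 142.00, 17.30, 11.61]
cplex_100 = [9620.04, 3558.10, 2731.95, 28975.46, 121366.18]

# Growth factor of the mean run time when the instance size doubles.
ga_growth = mean(ga_100) / mean(ga_50)
cplex_growth = mean(cplex_100) / mean(cplex_50)
print(f"GA growth: x{ga_growth:.1f}, CPLEX growth: x{cplex_growth:.1f}")
```

For this class the GA's mean time grows by a factor below 5, while the CPLEX mean grows by a factor of several hundred.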

The performance comparison between GA and CPLEX is complemented by the analysis of the results presented in Tables 9 and 10. These tables report in their third and fourth columns, respectively, the average solution values obtained by GA and the average CPU times required to obtain those solutions (instead of the CPU times relative to the execution of 2000 generations as in Tables 7 and 8). The fifth column then presents, for each instance, the result obtained by CPLEX using as stopping criterion the time limit given in the fourth column. Finally, the last column refers to the relative gaps between the average solution values obtained by GA and the solution values provided by CPLEX.

Tables 9 and 10 show that when the same amount of CPU time is given to both methods, GA obtains better results than CPLEX in 28 out of 50 instances, i.e., in 56% of them. Moreover, for some instances, GA outperforms CPLEX substantially: for the fourth instance in class 3 of Table 9, GA was better than CPLEX by approximately 20%. In contrast, the relative gap is not large when CPLEX finds better results than GA within its time limit; the maximum gap of 1.26% was obtained for the third instance of class 2 in Table 10.
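The count of 28 out of 50 can be checked directly from the tabulated values. The Python sketch below copies the (GA, CPLEX) pairs of Tables 9 and 10 and counts the instances in which the GA average is strictly lower; treating equal values as ties rather than wins is an assumption made here.

```python
# (GA average value, CPLEX value under the same time limit), classes 1 to 5.
table9 = [
    (722178.00, 722178.00), (733350.40, 732194.00), (733664.20, 733701.80),
    (727325.00, 725147.00), (719512.60, 719431.00),
    (492747.00, 492747.00), (494205.00, 494203.00), (496435.40, 495276.80),
    (492215.80, 492107.00), (489711.00, 489625.00),
    (2688951.00, 2738105.40), (2697803.80, 2737923.40), (2679038.00, 2953326.80),
    (2692662.00, 3373986.00), (2646182.00, 2793351.60),
    (541803.00, 542110.60), (539178.00, 539915.00), (546738.40, 546195.40),
    (542750.00, 542185.00), (537806.40, 537782.00),
    (2776346.80, 2776887.60), (2781496.00, 2789023.20), (2767842.20, 2769523.00),
    (2777619.00, 2782260.80), (2736077.80, 2738287.80),
]
table10 = [
    (1484057.40, 1481439.20), (1477503.40, 1473807.00), (1497213.00, 1518522.00),
    (1466182.20, 1465893.40), (1500689.20, 1495958.80),
    (979526.60, 973556.00), (973034.40, 970557.20), (989362.00, 977075.00),
    (978502.00, 976983.60), (952067.80, 948366.20),
    (5298518.40, 5326738.60), (5278225.40, 5332402.00), (5227517.00, 5269097.00),
    (5316646.00, 5343714.00), (5251934.40, 5508707.20),
    (1060799.00, 1316534.00), (1053288.60, 1194566.40), (1070421.00, 1070758.00),
    (1054638.00, 1312458.00), (1060621.00, 1310700.00),
    (5512662.00, 5842381.60), (5487604.20, 5488277.60), (5458935.80, 5466255.00),
    (5523513.80, 5523169.00), (5468022.80, 5891929.80),
]

pairs = table9 + table10
# The problem is a minimization, so a lower value is a better result.
ga_wins = sum(1 for ga, cplex in pairs if ga < cplex)
ties = sum(1 for ga, cplex in pairs if ga == cplex)
share = 100.0 * ga_wins / len(pairs)
print(ga_wins, len(pairs), ties, share)
```

With these values the GA is strictly better in 13 instances of Table 9 and 15 instances of Table 10, which gives the 56% quoted in the text.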

4. Conclusion

This paper presented a Genetic Algorithm (GA) for the two-stage capacitated facility location problem (TSCFLP), which is a hard problem in freight transportation. Our GA is composed of:

(i) two different constructive heuristics, one of them based on a pure greedy criterion and the other one on rounding linear relaxations of the problem formulation after iteratively fixing some of its variables;

(ii) straightforward crossover and mutation operators for which the resulting individuals are evaluated by a minimum cost flow algorithm; and

(iii) a periodical elitism procedure consisting of two local searches on two different solution neighborhoods.
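As a rough illustration of how these three components fit together, the Python sketch below runs a generic GA over binary open/closed facility vectors. It is not the implementation evaluated in this paper, and every name in it is hypothetical: a random initial population stands in for the two constructive heuristics, one-point crossover and bit-flip mutation are assumed operators, the toy `cost` function (uncapacitated, single stage) replaces the minimum cost flow evaluation, and `flip_local_search` is a single-neighborhood stand-in for the two local searches, applied every `period` generations.

```python
import random

def cost(bits, fixed, assign):
    """Toy stand-in for the min-cost-flow evaluation: fixed costs of open
    facilities plus each customer's cheapest open facility (no capacities)."""
    open_idx = [j for j, b in enumerate(bits) if b]
    if not open_idx:
        return float("inf")  # infeasible: nothing is open
    return (sum(fixed[j] for j in open_idx)
            + sum(min(row[j] for j in open_idx) for row in assign))

def flip_local_search(bits, fixed, assign):
    """First-improvement search over single open/close flips (hypothetical)."""
    best, best_cost = bits[:], cost(bits, fixed, assign)
    improved = True
    while improved:
        improved = False
        for j in range(len(best)):
            cand = best[:]
            cand[j] ^= 1
            c = cost(cand, fixed, assign)
            if c < best_cost:
                best, best_cost, improved = cand, c, True
    return best

def run_ga(fixed, assign, pop_size=20, generations=50, mutation=0.05,
           period=10, seed=0):
    rng = random.Random(seed)
    n = len(fixed)
    key = lambda ind: cost(ind, fixed, assign)
    # Random start (the paper uses constructive heuristics instead).
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for gen in range(1, generations + 1):
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.sample(pop, 2)
            cut = rng.randrange(1, n)                  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ (rng.random() < mutation) for b in child]  # bit flips
            children.append(child)
        pop = sorted(pop + children, key=key)[:pop_size]  # keep the best
        if gen % period == 0:                          # periodic elitism step
            pop[0] = flip_local_search(pop[0], fixed, assign)
    return min(pop, key=key)

# Tiny made-up instance: 4 candidate facilities, 4 customers.
fixed = [10, 12, 9, 15]
assign = [[2, 9, 7, 4], [8, 1, 6, 5], [7, 6, 1, 9], [3, 8, 7, 2]]
best = run_ga(fixed, assign)
print(best, cost(best, fixed, assign))
```

In the actual algorithm the evaluation step would solve a minimum cost flow problem over plants, depots and customers for each individual, which is where most of the computing time discussed in Section 3.4 is likely to be spent.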

Extensive computational experiments were performed in order to set the best GA parameters to ensure the quality of the solutions obtained. The proposed GA outperformed two state-of-the-art Lagrangian heuristics for the TSCFLP. Furthermore, for large-scale instances, it found good solutions in most of the tested cases. In the worst scenario, our GA presented a relative gap of less than 1.3% from the optimum solution.

Acknowledgments

We are thankful to Prof. Edith Ozuna Espinosa for providing us with the test instances. DRMF is grateful to CAPES-Brazil. CR, DA, GMR and EMS were partially supported by CNPq-Brazil grants 482110/2011-2, 305070/2011-8, 307002/2011-0, 309746/2012-4. GMR also acknowledges Fundação de Amparo à Pesquisa do Espírito Santo.

Table 9

Comparison between GA average best results and CPLEX stopped within the computational time elapsed by GA – Instances with $|I| = 50$ plants.

| Class | Instance | GA | Time (s) | CPLEX | Gap (%) |
|---|---|---|---|---|---|
| 1 | 1 | 722178.00 | 145.75 | 722178.00 | 0.00 |
| 1 | 2 | 733350.40 | 361.27 | 732194.00 | 0.16 |
| 1 | 3 | 733664.20 | 152.79 | 733701.80 | 0.01 |
| 1 | 4 | 727325.00 | 359.95 | 725147.00 | 0.30 |
| 1 | 5 | 719512.60 | 209.45 | 719431.00 | 0.01 |
| 2 | 1 | 492747.00 | 73.37 | 492747.00 | 0.00 |
| 2 | 2 | 494205.00 | 166.26 | 494203.00 | 0.00 |
| 2 | 3 | 496435.40 | 128.04 | 495276.80 | 0.23 |
| 2 | 4 | 492215.80 | 73.83 | 492107.00 | 0.02 |
| 2 | 5 | 489711.00 | 52.44 | 489625.00 | 0.02 |
| 3 | 1 | 2688951.00 | 20.35 | 2738105.40 | 1.80 |
| 3 | 2 | 2697803.80 | 62.44 | 2737923.40 | 1.47 |
| 3 | 3 | 2679038.00 | 29.27 | 2953326.80 | 9.29 |
| 3 | 4 | 2692662.00 | 19.84 | 3373986.00 | 20.19 |
| 3 | 5 | 2646182.00 | 30.03 | 2793351.60 | 5.27 |
| 4 | 1 | 541803.00 | 126.38 | 542110.60 | 0.06 |
| 4 | 2 | 539178.00 | 89.51 | 539915.00 | 0.13 |
| 4 | 3 | 546738.40 | 114.47 | 546195.40 | 0.10 |
| 4 | 4 | 542750.00 | 169.42 | 542185.00 | 0.10 |
| 4 | 5 | 537806.40 | 173.50 | 537782.00 | 0.00 |
| 5 | 1 | 2776346.80 | 196.62 | 2776887.60 | 0.02 |
| 5 | 2 | 2781496.00 | 99.08 | 2789023.20 | 0.27 |
| 5 | 3 | 2767842.20 | 101.80 | 2769523.00 | 0.06 |
| 5 | 4 | 2777619.00 | 133.02 | 2782260.80 | 0.17 |
| 5 | 5 | 2736077.80 | 144.19 | 2738287.80 | 0.08 |

Table 10

Comparison between GA average best results and CPLEX stopped within the computational time elapsed by GA – Instances with $|I| = 100$ plants.

| Class | Instance | GA | Time (s) | CPLEX | Gap (%) |
|---|---|---|---|---|---|
| 1 | 1 | 1484057.40 | 1130.75 | 1481439.20 | 0.18 |
| 1 | 2 | 1477503.40 | 1066.21 | 1473807.00 | 0.25 |
| 1 | 3 | 1497213.00 | 92.96 | 1518522.00 | 1.40 |
| 1 | 4 | 1466182.20 | 600.73 | 1465893.40 | 0.02 |
| 1 | 5 | 1500689.20 | 1456.09 | 1495958.80 | 0.32 |
| 2 | 1 | 979526.60 | 1235.12 | 973556.00 | 0.61 |
| 2 | 2 | 973034.40 | 1002.97 | 970557.20 | 0.26 |
| 2 | 3 | 989362.00 | 1099.94 | 977075.00 | 1.26 |
| 2 | 4 | 978502.00 | 769.11 | 976983.60 | 0.16 |
| 2 | 5 | 952067.80 | 1166.97 | 948366.20 | 0.39 |
| 3 | 1 | 5298518.40 | 915.29 | 5326738.60 | 0.53 |
| 3 | 2 | 5278225.40 | 1181.43 | 5332402.00 | 1.02 |
| 3 | 3 | 5227517.00 | 808.74 | 5269097.00 | 0.79 |
| 3 | 4 | 5316646.00 | 1180.41 | 5343714.00 | 0.51 |
| 3 | 5 | 5251934.40 | 755.37 | 5508707.20 | 4.66 |
| 4 | 1 | 1060799.00 | 12.01 | 1316534.00 | 19.42 |
| 4 | 2 | 1053288.60 | 130.62 | 1194566.40 | 11.83 |
| 4 | 3 | 1070421.00 | 1098.22 | 1070758.00 | 0.03 |
| 4 | 4 | 1054638.00 | 12.09 | 1312458.00 | 19.64 |
| 4 | 5 | 1060621.00 | 12.25 | 1310700.00 | 19.08 |
| 5 | 1 | 5512662.00 | 564.55 | 5842381.60 | 5.64 |
| 5 | 2 | 5487604.20 | 1323.59 | 5488277.60 | 0.01 |
| 5 | 3 | 5458935.80 | 1349.22 | 5466255.00 | 0.13 |
| 5 | 4 | 5523513.80 | 1159.15 | 5523169.00 | 0.01 |
| 5 | 5 | 5468022.80 | 246.17 | 5891929.80 | 7.19 |

References

Ahuja, R. K., Magnanti, T. L., & Orlin, J. B. (1993). Network flows: Theory, algorithms, and applications. Prentice-Hall.

Alcaraz, J., Landete, M., & Monge, J. F. (2012). Design and analysis of hybrid metaheuristics for the reliability p-median problem. European Journal of Operational Research, 222, 54–64. http://dx.doi.org/10.1016/j.ejor.2012.04.016.

Daskin, M. S. (1995). Network and discrete location: Models, algorithms and applications. Wiley.

Drezner, Z., & Hamacher, H. W. (2004). Facility location: Applications and theory. Springer.

Fisher, M. L. (1981). The lagrangian relaxation method for solving integer programming problems. Management Science, 27(1), 1–18. http://dx.doi.org/10.1287/mnsc.1040.0263.

Geoffrion, A., & McBride, E. (1978). Lagrangian relaxation applied to capacitated facility location problems. AIIE Transactions, 10(1), 40–47. http://dx.doi.org/10.1080/05695557808975181.

Goldberg, A. V. (1997). An efficient implementation of a scaling minimum-cost flow algorithm. Journal of Algorithms, 22(1), 1–29. http://dx.doi.org/10.1006/jagm.1995.0805.

Guignard, M. (1997). Lagrangian relaxation. TOP, 11(1), 151–200. http://dx.doi.org/10.1007/BF02579036.

Guyonnet, P., Grant, F. H., & Bagajewicz, M. L. (2009). Integrated model for refinery planning, oil procuring, and product distribution. Industrial and Engineering Chemistry Research, 48(1), 463–482. http://dx.doi.org/10.1021/ie701712z.

Hansen, P., & Mladenović, N. (2001). Variable neighborhood search: Principles and applications. European Journal of Operational Research, 130(1), 449–467. doi: 10.1016/S0377-2217(00)00100-4.

Hosage, C. M., & Goodchild, M. F. (1986). Discrete space location–allocation solutions from genetic algorithms. Annals of Operations Research, 6, 35–46. http://dx.doi.org/10.1007/BF02027381.

Jaramillo, J. H., Bhadury, J., & Batta, R. (2002). On the use of genetic algorithms to solve location problems. Computers and Operations Research, 29, 761–779. doi: 10.1016/S0305-0548(01)00021-1.

Keskin, B. B., & Üster, H. (2007). A scatter search-based heuristic to locate capacitated transshipment points. Computers and Operations Research, 34(10), 3112–3125. http://dx.doi.org/10.1016/j.cor.2005.11.020.

Klose, A. (2000). A lagrangian relax-and-cut approach for the two-stage capacitated facility location problem. European Journal of Operational Research, 126, 185–198. doi: 10.1016/S0377-2217(99)00300-8.

Klose, A., & Drexl, A. (2005). Facility location models for distribution system design. European Journal of Operational Research, 162, 4–29. http://dx.doi.org/10.1016/j.ejor.2003.10.031.

Leu, Y.-Y., Matheson, L. A., & Rees, L. P. (1996). Sequencing mixed-model assembly lines with genetic algorithms. Computers and Industrial Engineering, 30(4), 1027–1036. http://dx.doi.org/10.1016/0360-8352(96)00050-2.

Litvinchev, I., & Ozuna, E. L. (2012). Lagrangian bounds and a heuristic for the two-stage capacitated facility location problem. International Journal of Energy Optimization and Engineering, 1(1), 59–71. http://dx.doi.org/10.4018/ijeoe.2012010104.

Litvinchev, I., Pérez, M. M., & Espinosa, E. L. O. (2012). Two stage facility location problem: Lagrangian based heuristics. Brazilian Symposium of Operational Research, 1(1), 1–12.

Marín, A., & Pelegrin, B. (1999). Applying lagrangian relaxation to the resolution of two-stage location problems. Annals of Operations Research, 86, 179–198. http://dx.doi.org/10.1023/A%3A1018998500803.

Nickel, S., & Puerto, J. (2005). Location theory: A unified approach (1st ed.). Springer.

Onwubolu, G. C., & Mutingi, M. (2001). A genetic algorithm approach to cellular manufacturing systems. Computers and Industrial Engineering, 39(1-2), 125–144. http://dx.doi.org/10.1016/S0360-8352(00)00074-7.

Reeves, C. (2003). Genetic Algorithms. In F. Glover & G. A. Kochenberger (Eds.), Handbook of metaheuristics. Kluwer Academic Publishers.

Stanimirović, Z., Kratica, J., & Dugošija, D. (2007). Genetic algorithms for solving the discrete ordered median problem. European Journal of Operational Research, 182, 983–1001. http://dx.doi.org/10.1016/j.ejor.2006.09.069.

Tragantalerngsak, S., Holt, J., & Ronnqvist, M. (2000). An exact method for the two-echelon, single-source, capacitated facility location problem. European Journal of Operational Research, 123(3), 473–489. http://dx.doi.org/10.1016/S0377-2217(99)00105-8.

Watanabe, M., Ida, K., & Gen, M. (2005). A genetic algorithm with modified crossover operator and search area adaptation for the job-shop scheduling problem. Computers and Industrial Engineering, 48(4), 743–752. http://dx.doi.org/10.1016/j.cie.2004.12.008.