An efficient weighted bi-objective scheduling algorithm for heterogeneous systems

(1)

An efﬁcient weighted bi-objective scheduling algorithm

for heterogeneous systems

q

Cristina Boeres

⇑

, Idalmis Milián Sardiña, Lúcia M.A. Drummond

⇑

Instituto de Computação, Universidade Federal Fluminense, 24210-240 Niteroi, Rio de Janeiro, Brazil

a r t i c l e

i n f o

Article history:

Available online 23 October 2010 Keywords:

Static scheduling Reliability

Heterogeneous systems Direct acyclic tasks

a b s t r a c t

This paper proposes the Makespan and Reliability Cost Driven (MRCD) heuristic, a static scheduling strategy for heterogeneous distributed systems that not only minimizes the makespan, but also maximizes the reliability of the application. The MRCD scheduling deci-sions are guided by a weighted function that considers both objectives simultaneously, instead of prioritizing one of them. This work also introduces a classification of the solu-tions produced by weighted bi-objective schedulers to aid users to tune the weighting function such that an appropriate solution can be selected in accordance with their needs. In comparison with the related work, MRCD produced schedules with makespans that were significantly better then those produced by the other strategies at expense of an insignificant deterioration in reliability.

1. Introduction

Resources in large parallel and distributed systems may not be available for a long period of time. Therefore, when exe-cuting an application on such systems it is important to tackle reliability and fault-tolerance issues. Aiming an efﬁcient and robust execution of parallel applications on distributed heterogeneous systems, where the speed of the processors and their failure rates varies, this work focus on a schedule strategy that considers not only the minimization of the running time or makespan, but also maximizes the reliability of the application execution. The target application is represented by a directed acyclic graph (DAG), whose vertices are the tasks and edges are dependencies between them. A weighted bi-objective cost function that integrates both objectives guides the decisions to schedule the tasks of a parallel application on heterogeneous systems susceptible to failures.

In the problem tackled in this work, while a processor can finish the execution of an application task quickly, a high failure rate can be assigned to it depending on its reliability. Therefore, there is no single optimum solution to be found and actually, a variety of solutions can be optimal in some sense. This is defined as a multi criteria linear programming problem and it can be seen as an extension of the classical model of linear programming where there exists more than one objective function. In the case in which these functions are conflicting, a unique optimal solution for the problem does not make sense. Thus, there is a paradigm disruption, as linear programming deals with the optimality paradigm and multi criteria linear programming does not have an optimal solution with the usual meaning. There are, however, a set of efficient solutions, called in the lit-erature as non-dominated or Pareto optimal[18].

The makespan and reliability cost driven (MRCD) algorithm proposed here is a list scheduling heuristic that takes into account the reliability and makespan. A weighted bi-objective function to guide the scheduling process is deﬁned. The speciﬁcation of proper weights to the objectives can be however, troublesome. On an attempt to provide information to

q

This work is supported by research grants from CNPq and FAPERJ.

⇑Corresponding authors. Tel.: +55 21 26295680; fax: +55 21 26295670 (L.M.A. Drummond). E-mail addresses:boeres@ic.uff.br(C. Boeres),lucia@ic.uff.br(L.M.A. Drummond).

Contents lists available atScienceDirect

Parallel Computing

(2)

aid this speciﬁcation, a metric is also generated by MRCD. In fact, the main proposal of the work carried out here, i.e. both the weighted bi-objective scheduling algorithm and the information to guide on the weights choice, is based on a detailed exper-imental analysis taking also under consideration a variety of strategies found in the literature.

MRCD was compared with other scheduling strategies from the recent literature. In order to have a fair comparison with MRCD, some of these heuristics were implemented with small changes. In the experiments, three architectural scenarios were set: the worst case scenario, where the slowest processors are the ones with smallest failure rates, the best case scenario, where the fastest processors are those with the smallest failure rate, and the homogeneous processors scenario, where the pro-cessors, although have distinct failures rates, have the same speed. The experimental analysis in all of these scenarios shows that MRCD produced schedules with very efﬁcient performance when compared with other works implemented here. Be-sides, the proposed metric allowed a classiﬁcation of the solutions generated by weighted bi-objective schedules, which showed to be very helpful to aid the choice of a solution that better achieves the users’ needs.

The remaining of this paper is organized as follows. The next section presents relevant papers published in recent years that deal with the bi-objective scheduling problem. Section3introduces the employed application and architectural models for distributed systems with faults. The proposed weighted bi-objective scheduling is presented in Section4. Section5 illus-trates the application of the proposed strategy using a Gauss Elimination graph as example. The experimental results are depicted in Section6and then, some conclusions are drawn in Section7.

2. Related work

In the literature, there is a variety of works that deal with the bi-objective scheduling problem. The main contributions and approaches with respect to the bi-objective scheduling constructive heuristics published in the last decade are summa-rized inTable 1. Note that many of them consider real-time systems, however, this environment is not the target system considered here.

The authors in[2]present a bi-criteria scheduling heuristic for DAG of operations according to two criteria: ﬁrst the min-imization of the schedule length, and second the maxmin-imization of the system reliability. It also uses the active replication of operations to improve the reliability. If the system reliability or the schedule length requirements are not met, then a param-eter of the compromise function can be changed and the algorithm is re-executed, until both requirements are met.

In[1], an objective function integrates several factors, like real-time deadlines, reliability, quantitative measures of the communication and degree of parallelism. The overall acceptance rate of the applications and the effect of reliability on the overall acceptance rate are investigated.

The work in[14]presents a dynamic scheduling heuristic for parallel real-time jobs on heterogeneous clusters. It is as-sumed that jobs are modelled by DAGs and that they arrive at the system following a Poisson process. The algorithm takes the reliability measure into account and there is an admission control so that a parallel real-time job whose deadline cannot be guaranteed is rejected.

The work proposed by Qin and Jiang[15]is a real-time system oriented scheduling algorithm which orders the tasks of a parallel application by their deadlines. The cost function is a hierarchical one, since for each task, the subset of processors that maximizes the reliability is identiﬁed, and then the processor from this subset that minimizes the earliest start time is selected. As a consequence, the results favour the reliability rather than the execution time of the application.

Dongarra et al.[7]derived the importance of the product failure rate task execution time for the case of scheduling inde-pendent tasks onto processors. An extension of the popular list scheduling heuristic HEFT[17]was proposed, a processor that minimizes the product of the earliest ﬁnishing time of a task by the reliability cost of a processor to schedule the task is chosen.

In[10], a weighted bi-objective list scheduling algorithm (BSA) is proposed. After sorting the tasks in non-increasing order of their priority, the chosen processor is the one that minimizes a weighted integrated cost function, which will be discussed later. The objectives makespan and reliability are normalized by their maximum values and combined in the function. Table 1

Multi-objective scheduling related works.

Work Year Contributions Approach

Assayad et al.[2] 2004 Static scheduling, real-time systems, heterogeneous systems

Reliability and scheduling ﬁxed goals, iterative adaptation Amin et al.[1] 2005 Static scheduling, real-time systems,

homogeneous systems

Tandem and fork-join graphs, distinct criteria cost function Qin and Jiang[14] 2005 Dynamic scheduling, real-time systems,

heterogeneous system

Minimizes reliability costs, task rejection, deadline task ordering Qin and Jiang[15] 2006 Static scheduling, real-time systems,

heterogeneous systems

Hierarchical cost function

Dongarra et al.[7] 2007 Static scheduling, heterogeneous systems Extension of list scheduling, failure rate task ﬁnishing time Hakem et al.[10] 2007 Static scheduling, heterogeneous systems Weighted bi-objective function

Jeannot et al.[11] 2008 Static scheduling, heterogeneous systems Independent tasks, (2 + d, 1)-approximation of the Pareto set Girault et al.[9] 2009 Static scheduling heterogeneous systems Two step algorithm: randomized followed by a List scheduling phase

(3)

Nonetheless,[10]presents the use of only three solutions: the one that minimizes only the makespan; the one that weights equally both objectives; and the one that maximizes the reliability.

An approach for simultaneous optimization of the makespan and the reliability for a set of independent tasks to be sched-uled on a set of heterogeneous processors is presented in[11]. The proposed heuristic finds, for fixed makespanM value, a solution with makespan at most 2M and also finds solutions with optimal reliability among the schedules that are shorter M. This solution is used to derive a (2 + d,1)-approximation of the Pareto set of the problem, for any d > 0.

A two step algorithm is presented in[9]. During the ﬁrst step, the reliability is maximized by using a randomized algo-rithm, which considers only a spatial allocation. Then in the second step, the makespan is minimized by applying a list scheduling algorithm.

It must be also pointed out that there are works such as[3,6]which employ metaheuristcs for solving this problem. They usually require at least one initial solution that can be provided by the heuristics previously presented. In any case, this class of strategies is out of the scope of this paper and therefore, they are not presented in the table.

2.1. Contributions of this work

The proposed cost function of the Makespan and Reliability Cost Driven (MRCD) algorithm integrates both objectives simul-taneously. The way in which the strategy was designed allows the maximization of the reliability without compromising the minimization of the makespan. The work in[10]is similar to MRCD in the sense that it also considers a weighted cost func-tions that aggregates both objective, but actually the function itself is distinct as well as the normalization applied to the objectives, as clarified in the remaining of the paper. Moreover, our work proposed a classification of the solutions produced by MRCD when considering various values of the weight applied to the cost function. This classification guides the choice of the best solution in accordance with the user needs. This contribution was also validated when comparing MRCD with some recent papers from the literature. Note that some of the strategies analyzed here produce a unique solution[7,15]. From the experimental analysis it can be seen that usually these results are not satisfactory concerning both objectives simulta-neously. Actually, the experimental results show that MRCD outperforms those works.

3. Scheduling model for distributed systems with faults

In this work, the application model is based on the class of parallel applications that can be represented by directed acyclic graphs (DAGs), denoted by G = (V, E,

e

,

x

), where: the set of vertices V represents tasks; E, the precedence relation among them;

e

(

v

) is the amount of work associated with task

v

2 V; and

x

(u,

v

) is the weight associated with the edge (u,

v

) 2 E, rep-resenting the amount of data units transmitted from task u to

v

. The set of immediate predecessors of a task

v

2 G is deﬁned as pred(

v

) = {u 2 Vj(u,

v

) 2 E} while the set of immediate successors of

v

is given by succ(

v

) = {u 2 Vj(

v

, u) 2 E}.

The architectural model speciﬁes the main features of the target architecture. Given the set P = {p0, p1, . . . , pm1} of available

heterogeneous processors, the computational slowdown index csi(pj) is associated to each pj2 P, which is inversely

propor-tional to the computapropor-tional power of pj, as identiﬁed by Nascimento et al.[12]. The communication delay index cdi(pi, pj)

estimates the latency cost associated with each communication link (pi, pj). Therefore, the execution time of a task

v

on a

processor pjis given by et(

v

, pj) =

e

(

v

) csi(pj) and the communication time to send the amount

x

(u,

v

) of data between

the adjacent tasks u and

v

allocated to distinct processors puand pv is ct(u,

v

) =

x

(u,

v

) cdi(pu, pv). Note that cdi(pu, pv) = 0

if pu= pv, and also, communication and computation can be overlapped.

Upon these deﬁnitions, a schedule algorithm may need to calculate the earliest time that a task

v

can start its execution on processor pj, which is EST(

v

, pj) = maxu2pred(v){EST(u, pu) + et(u, pu) + ct(u,

v

)} that depends on the schedule of v’s predecessor

tasks and the availability of the processor. The earliest ﬁnish time of task

v

on processor pjis then EFT(

v

, pj) = EST(

v

, pj) + et(

v

, pj).

A schedule Sch of a DAG G speciﬁes, for each

v

2 V, the processor and the time in which

v

must be completed. The schedule length or makespan of Sch is deﬁned asMðSchÞ ¼ max8v2VfEFTð

v

;pvÞg.

The reliability of a system is based on the probability in which resources of the system execute tasks without any failure [15]. In this work, only processor failures are considered, which are statistically independent and follow a Poisson Law with a failure rate of FR(pj), "pj2 P[9]. This rate represents the number of processor failures per unit of time that can occur. It is

assumed here that the communication links between processors are reliable, based on the assumption that communication protocols are able to tolerate their failures [9].The task reliability cost associated with the execution of

v

in pjis then

RC(

v

, pj) = FR(pj) et(

v

, pj), which should be minimized. Given the set of tasks allocated to pj, Vpj V, the processor reliability

cost is then RCpðpjÞ ¼

P

8v2V_pjRCð

v

;pjÞ. As a consequence, the system reliability cost associated with the execution of G on P in

accordance with a schedule Sch is RCsðG; P; SchÞ ¼P8pj2PRCpðpjÞ. Finally, as in[15], the total reliability of the scheduled G in P

is RT¼ eRCsðG;P;SchÞ, which is the probability that the application G can run successfully on the set P under the schedule Sch

during its execution. As a matter of fact, RTshould be maximized and also, by minimizing the task reliability cost RC(

v

, pj),

the total reliability RTbecomes close to one.

4. A Weighted bi-objective scheduling strategy

The MRCD scheduling algorithm follows the traditional list scheduling framework: ﬁrstly, during the ordering phase, a pri-ority is associated with the tasks. Then, during the scheduling phase, each task with the highest pripri-ority is allocated to the

(4)

processor that minimizes a cost function. This work defines a weighted bi-objective function to guide the scheduling process, such that not only the makespan of the application is minimized but also the total reliability of the application scheduled on a set P of processors is maximized. However, this specification of proper weights to the objectives can be difficult and in or-der to provide further information to help the choice of weights, a metric is also generated by MRCD, which is the average difference between the two objectives.

In the bottom level blevel(

v

) is the priority assigned to the tasks in V, which is the rank of tasks deﬁned in[17]. Studies have shown that in the list scheduling framework, the use of blevel(

v

) leads to efﬁcient schedules on a heterogeneous pro-cessor system[17]. During the ordering phase, a list of tasks Vordis constructed in accordance with the non-decreasing order

of blevel(

v

), deﬁned as

ble

v

elð

v

Þ ¼

etð

v

Þ if succð

v

Þ ¼ ;;

max

u2succðvÞ etð

v

Þ þ ctð

v

;uÞ þ ble

v

elðuÞ

n o

otherwise; 8

< :

where etð

v

Þ is the average execution time of

v

considering all the processors and ctð

v

;uÞ is the average communication time, considering all the links.

During the scheduling phase, the algorithm seeks for the processor that optimizes the cost function. Let

v

be the next task to be scheduled with the highest blevel(

v

). MRCD also implements the tasks insertion procedure from[17], which tries to in-sert a task

v

in an earliest idle time slot between two already scheduled tasks on a processor. The length of an idle time slot should be at least long enough to compute

v

on that processor considering its precedence constraints. The processor pv is the one that minimizes the cost function (1 w)EFT(

v

, pj) + w RC(

v

, pj), where pj2 P and 0 6 w 6 1 is a constant value. When

w = 0, MRCD becomes similar to HEFT[17], in which the objective is only the minimization of the makespan. When w = 1, MRCD only considers the reliability.

4.1. The weighted bi-objective cost function

A number of scheduling bi-objective cost functions where the objectives have different priorities have been proposed to schedule an application on heterogeneous systems which are failure prone[1,6,7,10,13]. In the cost function proposed in this work, w reveals a priority given to the objectives, which are simultaneously considered in the cost function, differently from [15], where ﬁrstly all the processors that maximizes the reliability when scheduling a task

v

are chosen and then, from this subset of processors, the one that minimizes the start time of

v

is selected.

A closer look to f(

v

, pj) can take one to note that the values of the objectives are not comparable and aggregating them is

not straightforward. So, in this work, both objectives EFT(

v

, pj) and RC(

v

, pj) were normalized for each task

v

in V over the set

of processors P. The normalization procedure transforms all the objectives so that they share the same minimum and max-imum values 0 and 100, respectively. In this work, the normalized value associated with the earliest ﬁnishing time is

EFTnð

v

;pjÞ ¼ 100

EFTð

v

;pjÞ min8p2PEFTð

v

;pÞ max8p2PEFTð

v

;pÞ min8p2PEFTð

v

;pÞ

;

(5)

and the one associated with the reliability cost is

RCnð

v

;pjÞ ¼ 100

RCð

v

;pjÞ min8p2PRCð

v

;pÞ max8p2PRCð

v

;pÞ min8p2PRCð

v

;pÞ

:

Note that each one of the objectives RCnð

v

i;pviÞ and EFTnð

v

i;pviÞ were normalized over the same scale [0, 100], and since both

objectives are minimized in the cost function, the values of RCnð

v

i;pviÞ and EFTnð

v

i;pviÞ chosen are close to zero.

4.1.1. The MRCD algorithm

As seen in the list scheduling framework[16,17], two main phases characterize the proposed algorithm: the ordering and scheduling phases. The main steps of the Makespan and Reliability Cost Driven (MRCD) algorithm are shown in Algorithm 1, for a given DAG G = (V, E,

,

x

) and set of processors P.

Algorithm 1: MRCD (G, P, w)

1 Vord= h

v

0,

v

1, . . . ,

v

n1i/blevel(

v

i) 6 blevel(

v

i + 1), i = 0, . . . , n 2;

2 fori = 0, . . . , n 1

3 F = 1;

4 "pj2 P

5 Calculate EST(

v

i, pj) using taskInsertion(

v

i, pj);

6 EFT(

v

i, pj) = EST(

v

i, pj) + et(

v

i, pj);

7 RC(

v

i, pj) = FR(pj) et(

v

i, pj);

8 "pj2 P

9 EFTnð

v

i;pjÞ ¼ 100

EFTðvi;pjÞmin8p2PEFTðvi;pÞ

max8p2PEFTðvi;pÞmin8p2PEFTðvi;pÞ;

10 RCnð

v

i;pjÞ ¼ 100 RCðvi;pjÞmin8p2PRCðvi;pÞ max8p2PRCðvi;pÞmin8p2PRCðvi;pÞ; 11 f(

v

i, pj) = (1 w) EFTn(

v

i, pj) + w RCn(

v

i, pj); 12 if(f(

v

i, pj) < F) 13 F ¼ f ð

v

i;pjÞ; pvi¼ pj; 14 ifðMðSchÞ < EFTð

v

i;pviÞÞMðSchÞ ¼ EFTð

v

i;pviÞ; 15 RCs¼ RCsþ RCð

v

i;pviÞ; 16 Sch ¼ Sch [ h

v

i;pvi;ESTð

v

i;pviÞi; 17 RT ¼ eRCs;DðSchÞ ¼ P vi 2VRCnðvi;pviÞEFTnðvi;pviÞ n ;

The ordering phase of MRCD generates the list Vordof tasks in non-decreasing order of blevel() (line 1). The scheduling

phase is executed for each

v

i2 Vordfrom lines 2 to 16, where in lines 4 to 7, the earliest ﬁnishing time EFT(

v

i, pj) and the

min-imal task reliability cost RC(

v

i, pj) are calculated, for each pj2 P. In lines 8 to 10, both the minimal and maximal values for

EFT(

v

i, pj) and RC(

v

i, pj) over all pj2 P are collected and then applied to the normalization procedure. The normalized values

of EFTn(

v

i, pj) and RCn(

v

i, pj) are subjected to the cost function f(

v

i, pj) where the processor that minimizes such function is

identiﬁed (lines 12 and 13). MRCD generates the ﬁnal schedule Sch, its makespanMðSchÞ and the total reliability cost RT,

as seen in lines 14–17. Furthermore, with the normalized values EFTnð

v

i;pviÞ and RCnð

v

i;pviÞ, for each

v

iin its best processor

p_v

i, their difference portrays the imbalance degree between both objectives. The average difference or imbalance degree

be-tween the objectives D(Sch) of the resulting schedule Sch is given by

DðSchÞ ¼ P vi2VRCn

v

i;pvi EFTn

v

i;p_v_i n ; ð1Þ

as seen in line 17, where p_v_iis the processor allocated to

v

i.

In order to calculate D(Sch), the partial difference of each task RCnð

v

i;pviÞ EFTnð

v

i;pviÞ is considered, given that task

v

iis

scheduled on processor p_v

i. If RCnð

v

i;pviÞ > EFTnð

v

i;pviÞ, the chosen processor pviwas such that more priority was given to

the minimization of the ﬁnishing time of

v

than to the minimization of the reliability cost of

v

i. Obviously, if

RCnð

v

i;pviÞ < EFTnð

v

i;pviÞ, the opposite is held. Finally, if RCnð

v

i;pviÞ ¼ EFTnð

v

i;pviÞ, the chosen processor pviwas such that

both objectives were equally minimized. Calculating the average of the partial differences, MRCD provides an information about which objective was prioritized for a given weight w.

4.2. A classiﬁcation of the weights applied to the cost function

The approximate methods to generate Pareto solutions can be divided in three main approaches: weighted, constrained and non-dominated solutions estimation methods[4]. In the weighted method, the multi criteria linear programming prob-lem is transformed into a unique objective probprob-lem, by using a weighted summation of the original probprob-lem objectives. The

(6)

constrained method consists in reducing the multi objective problem to a unique objective one, considering the remaining objectives as constraints. The non-dominated solutions estimation method is similar to the weighted method, but it also cal-culates an upper bound to an error, i.e. it is possible to obtain a set of non-dominated solutions within an error below a pre-deﬁned value. Usually, the smaller the error is, the larger the number of iterations of the algorithm to achieve a solution. This method can be applied only to bi-objective problems. In all these methods, there is no knowledge of the solution to be achieved.

The problem tackled in this paper, the weighted method was chosen because there is no hint to estimate the error in the case of the non-dominated solutions estimation method and in the case of the constraint method, it is also difﬁcult to prop-erly establish a restriction. This work also provides the users the option to assess the trade-offs between the objectives with additional information about this set of solutions.

Let S be the set of all feasible solutions for the bi-objective problem tackled. Let Sk¼ hSchk;MðSchkÞ; RTðSchkÞ; DðSchkÞi be

one solution of the bi-objective problem. A solution Sk2 S dominates another solution Sqif the following conditions are

satisﬁed:

(i) Sqis not better than Skin both objectives, i.e.MðSchkÞ 6MðSchqÞ and RT(Schk) P RT(Schq).

(ii) Skis absolutely better than Sqin at least one of the objectives, i.e.MðSchkÞ <MðSchqÞ or RT(Schk) > RT(Schq).

The solution Skis dominant and Sqis dominated by Sk. If Skdoes not dominate Sqand vice versa, the solutions are

incom-parable[8]. A variety of feasible solutions can be produced by MRCD for the same input DAG G and target system if different w values are given to the algorithm. Let W be the set of values assigned to w of the cost function in MRCD. In accordance with the dominance condition (i) and (ii) described above, the set of all solution in S that are dominants and incomparable is de-noted as S0_{(non-dominated). Note that the solutions in S}0_{are not dominated by any other solution in S. The concept of}

dom-inance can be sometimes too weak for applications, in cases where one objective can be improved signiﬁcantly at a cost of a small deterioration of the other. This concept does not necessarily tell which solutions to chose, but rather which solutions to avoid[8].

A classiﬁcation is then proposed, having the knowledge of the solutions in S0_{. The solutions S}

e2 S0that offers an

equilib-rium between the objectives are those with the smallest value of jD(Sche)j. The set of solutions SRT is compounded of those

Sk2 S0with the smallest D(Schk), and consequently are the ones with the highest total reliability RT(Schk). On the other hand,

the set of solutions denoted by SMcontains the Sk2 S0with the highest D(Schk), i.e. contain the solutions with the minimum

makespan_MðSchkÞ.

5. Applying the concepts

In order to illustrate the application of the weights on the cost function and the classiﬁcation of the solutions which can be generated by MRCD for different w values, a Gauss Elimination application with 9 tasks was scheduled on a set of three processors with the characteristics shown inFig. 1. The computation and communication weights associated with the tasks and messages are shown besides the vertices and edges, respectively.

The results produced by MRCD can be seen inTable 2, where: w is the weight applied to the cost function f ð

v

;pjÞ;M is the

makespan of the produced schedule; RTis the total reliability; and D, the associated average difference. The columns in bold

show S0_{, the non-dominated solutions. Among them, three solutions are selected in accordance with the classiﬁcation}

pro-posed in Section4.2: SM;Seand SRTproduced by MRCD for w = 0.3, w = 0.4 and w = 0.9, respectively. Note that, only one

solu-tion from each subset SM;Seand SRT was selected and highlighted in bold.

Fig. 2shows the schedules produced by MRCD for the Gauss Elimination DAG with nine tasks onto the three processors, for the weights (a) w = 0.3 (SMsolution); (b) w = 0.4 (Sesolution); and (c) w = 0.9 (SRT solution), respectively. Observing

Fig. 2(a), MRCD schedules the tasks of the critical path of the input DAG on the fastest processor, since the weight w = 0.3 prioritizes the choice of a faster processor rather than a more reliable one. On the other hand, the solution SRT inFig. 2(c)

shows that MRCD produced a schedule with an expected pattern, that is, it allocates all the tasks on the most reliable pro-cessor, since the weight w is practically equal to 1.0. Finally, in the Sesolution showed inFig. 2(b), the critical path of the

respective DAG is allocated to p1, which is the processor with the average speed and failure rate. In this solution, the average

difference between the two objectives was the closest one to zero, corresponding to a schedule in which the minimization of the makespan and maximization of the reliability were equally prioritized.

6. Experimental results

The proposed scheduling algorithm MRCD was compared with other scheduling strategies from the literature, which were divided in two classes: the ﬁrst one produces a unique solution, denoted as class 1, and encompasses the heuristics HEFT[17], RCDMod (based on[15]) and RHEFT (based on[7]); and class 2, containing BSAMod based on[10]that can also produce several solutions. BSAMod employs a cost function that although seems similar to MRCD, it produces different re-sults, added to the fact that a distinct normalization procedure is used. In order to have a fair comparison with MRCD, some of these heuristics were implemented with small changes, as will be describe next.

(7)

6.1. Scheduling strategies under comparison

In all the heuristics analyzed here, including MRCD (Algorithm 1), the common list scheduling steps (based on[17]) implemented were:

1. The ordering of tasks by their blevel().

2. Each task

v

iwith the highest blevel() is scheduled on the processor pjthat minimizes its cost function f(

v

i, pj) also using the

taskInsertion procedure.

Note that, if the algorithm is guided by a weighted cost function, a normalization procedure is also implemented. In class 1, since HEFT minimizes only the makespan of the resulting schedule, i.e.

f

v

i;pj

HEFT¼ min_p2PfEFTð

v

i;pÞg ð2Þ

this work derived RTbased on the ﬁnal schedule SchHEFTproduced by HEFT.

The work[15]also gives a solution for the bi-objective scheduling problem, but uses a hierarchical cost function. Differ-ently from the original proposal, RCDMod was implemented with no time constraints (deadlines) and the employed cost function used the tasks completion times instead of their start times. Therefore, during the scheduling phase, RCDMod ﬁrstly identiﬁes the subset P0_{of processors from P that minimizes the reliability cost of}

_v

i, that is

P0¼ fp 2 Pjminp2PfRCð

v

i;pÞg;

and then, the processor in P0_{that minimizes the cost function}

f

v

i;pj

RCDMod¼ min_p2P0 fEFTð

v

i;pÞg; ð3Þ

is selected, in accordance with the taskInsertion procedure.

RHEFT also implements the list scheduling approach of HEFT and is based on the product et(

v

, j) FR(j) (the product of the execution time and failure rate) used for scheduling each task on a reliable processor. Dongarra et al.[7]proposed an adap-tation of HEFT into Reliable-HEFT (RHEFT), emphasizing the importance of this product by scheduling a task

v

ion the

pro-cessor pjthat minimizes the cost function:

f

v

i;pj

RHEFT¼ min_p2PfEFTð

v

i;pÞ FR pð Þg ð4Þ

Table 2

MRCD solutions for Gauss Elimination DAG with 9 tasks.

w 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 M 49.5 49.5 49.5 79.5 84.9 84.9 84.9 109.5 167.8 RT 0.9489 0.9489 0.9489 0.9819 0.9915 0.9915 0.9915 0.9946 0.9983 D 69.8 69.8 69.8 21.2 41.8 41.8 41.8 70.2 100 2 V 6 V 3 p 1 V 1 p 6 V 1 V 4 V 1 p p2 p3 s 0 V 1 V 6 V 7 V 8 V 2 p 3 p 1 p 0 V 5 V 7 V 8 V V7 8 V 0 V 3 V V3 4 V 5 V 2 V 3 V 2 V 5 V 4 V 2 p (c) w = 0.9 (a) w = 0.3 (b) w = 0.4

(8)

In class 2, BSAMod also tackles the bi-objective problem by using the function: fð

v

i;pÞBSA¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi w EFTn

v

i;pj 2 þ 1 wð Þ RCn

v

i;pj 2 q ; ð5Þ

as proposed in[10], but sorts tasks by blevel() during the ordering phase, and then executes the taskInsertion procedure guided by f(

v

i, pj)BSA. The normalized values for ﬁnishing time and reliability costs are:

EFTn

v

i;pj ¼ EFT

v

i;pj max p2P fEFTð

v

i;pÞg ; and RCn

v

i;pj ¼ RC

v

i;pj max p2P fRCð

v

i;pÞg ; respectively.

All of heuristics were executed over synthetic applications represented by three classes of DAGs: one that represents the Gaussian Elimination, denoted by Gn; the diamond DAG Din, which represents, for example, matrix multiplication; and

ran-dom generated DAGs Rn, graphs with irregular topologies. In all the cases, n denotes the number of tasks of the DAG. In all

graphs, a unit cost was associated with the communications between tasks. The same does not hold for the computing tasks weights. In Gn, each

v

2 V has a distinct computing weight, which is a function of the number of lines of the input matrix for

the Gaussian Elimination process, as deﬁned in[5], while in both Rnand Din, the weight

(

v

) = 50, "v 2 V.

The simulated distributed system was composed of m processors divided into three groups, denoted as P0, P1and P2, each

one with m/3 processors. The processor failure rate, FR(pi), was uniformly generated in [105, 3.3 105], "pi2 P0;

[3.4 105_{, 6.6 10}5_{], for p}

i2 P1; [6.7 105, 104], for pi2 P2, as proposed in[15].

6.2. Comparing MRCD with other heuristics

Three architectural scenarios were set, concerning the computational slowdown index. In the worst case scenario (WCS), the adopted values were csi(pi) = 73, for pi2 P0; csi(pi) = 53, for pi2 P1; csi(pi) = 33, for pi2 P2, which were obtained from[12]

in order to model real machines available in our institution. WCS is considered the worst case scenario because the slowest processors are the ones with smallest failure rates. On the other hand, in the best case scenario (BCS), the fastest processors are those with the smallest failure rate, where csi(pi) = 33 for pi2 P0, csi(pi) = 53 for pi2 P1and csi(pi) = 73 for pi2 P2. A third

scenario, denoted as homogeneous processors scenario (HPS), was set with csi(pi) = 53 for all pi2 P0[ P1[ P2. In all of the

sce-narios, the failure rate is per second and the makespan value in seconds.

Tables 3, 5 and 7present the pair makespan and the system reliability (M; RT) for each heuristic andTables 4, 6 and 8, the

average difference D and weight w for each DAG, considering the three following comparisons. The ﬁrst comparison C1 gives the results of HEFT and MRCD, C2 shows the results of RCDMod and MRCD, and C3 compares RHEFT and MRCD. Also, the pair in the columns (%) represents the percentage improvement of the makespan produced by MRCD over RCDMod and RHEFT, respectively, and the percentage improvement on the reliability produced by the two heuristics over MRCD. In each case, the MRCD solutions selected were:

SMfrom S0in C1 for a fair comparison with HEFT.

SRT2 S

0_{in the case of C2, since RCDMod gives higher priority to the processors that maximizes reliability.}

The dominant solution or in case of MRCD solutions do not dominate the RHEFT solution, the MRCD solution closest to the RHEFT one is shown. A solution Siis the closest one to another solution Sjif it presents the smallest Euclidean distance

from it considering normalized values.

Note that the values in bold are the dominant solutions for each comparison C1, C2 and C3. 6.2.1. Heuristics in class 1 in the worst case scenario

The ﬁrst set of experiments is carried out in the worst case scenario and compares the results of MRCD with the heuristics in class 1.

It can be observed that with the increase of the number of tasks n, the makespan also grows and the reliability decreases. As the number of tasks executed on a ﬁxed number of processors grows, the processors loads also increase and consequently, their reliability costs.Table 3also shows that the solutions in C1 have worse reliability than the results in C2 and C3, but the makespan are amazingly smaller. It happened because in the employed environment, the fastest processors were also the most susceptible to failures, and the chosen values for w gave priority to the minimization of the makespan. As a matter of fact, MRCD presented better solutions than HEFT in all graphs. By choosing MRCD solutions with the greatest D values (the SMsolution), the algorithm showed to be capable of producing solutions with makespans as good as (or better than)

(9)

In C2, RCDMod presented slightly better reliability than the results SRT for each DAG, as seen inTable 3. Recall that such

heuristic employs a hierarchical cost function that prioritizes the reliability only. Although the MRCD solutions also consider reliability, the makespan was not neglected since the cost function integrates both objectives. However, it is important to note that the makespan of the RCDMod are almost the double of those presented by MRCD.

Finally in C3, MRCD solutions dominated RHEFT solutions in many graphs. RHEFT employs a cost function that also inte-grates both objectives but in a distinct manner than from MRCD. In C3, MRCD presented makespans which are almost half of those produced by RHEFT, while the reliabilities were very close.

6.2.2. Heuristics in class 1 in the homogeneous processors scenario

The results produced by the heuristics in class 1 for the homogeneous processors scenario (HPS) are shown inTables 5 and 6, considering the same input DAGs, as in the previous analysis. Similar conclusions derived for WCS can also be obtained for Table 3

Results from class 1 heuristics for Gauss, random and diamond DAGs on m = 24 processors in the worst case scenario.

DAG C1 C2 C3

HEFT MRCD RCDMod MRCD (%) RHEFT MRCD (%)

G152 190.0, 190.0, 2406.0, 1283.3, 87.4, 694.9, 449.4, 54.6, 0.9207 0.9329 0.9576 0.9554 0.22 0.9457 0.9428 0.30 G252 318.7, 318.7, 5201.9, 2715.6, 91.5, 1309.6, 856.3, 52.9, 0.8351 0.8587 0.9106 0.9060 0.49 0.8813 0.8784 0.32 G377 480.4, 480.4, 9603.8, 4981.5, 92.7, 2204.6, 1494.8, 47.4, 0.7184 0.7514 0.8412 0.8335 0.92 0.7869 0.7856 0.16 G527 675.1, 675.1, 15976.7, 8211.0, 94.5, 3560.9, 2273.0, 56.6, 0.5853 0.6122 0.7500 0.7385 1.56 0.6679 0.6657 0.33 G702 902.8, 902.8, 24685.6, 12687.4, 94.5, 5258.9, 3463.3, 51.8, 0.4406 0.4553 0.6412 0.6260 2.42 0.5286 0.5318 0.59 R80 330.0, 330.0, 5840.0, 3066.0, 90.4, 1022.0, 949.0, 7.69, 0.8264 0.8415 0.9002 0.8952 0.55 0.8527 0.8647 1.39 R98 330.0, 330.0, 7154.0, 3650.0, 96.0, 1460.0, 1314.0, 11.1, 0.7905 0.8076 0.8791 0.8730 0.70 0.8306 0.8403 1.16 R152 396.0, 396.0, 11096.0, 5694.0, 94.8, 1679.0, 1533.0, 9.52, 0.6959 0.6990 0.8189 0.8101 1.08 0.7334 0.7533 2.64 R256 528.0, 528.0, 18688.0, 10512.0, 77.7, 1606.01, 1314.0, 22.2, 0.5421 0.5421 0.7143 0.7027 1.64 0.5665 0.5830 2.82 R364 759.0, 759.0, 26572.0, 13505.0, 96.7, 2993.0, 1825.0, 64.0, 0.4159 0.4180 0.6198 0.6038 2.64 0.4568 0.4674 2.26 Di81 580.9, 580.9, 5913.0, 3066.0, 92.8, 3285.0, 3066.0, 7.17, 0.8300 0.8321 0.8990 0.8939 0.57 0.8924 0.8939 0.16 Di100 660.0, 660.0, 7300.0, 3723.0, 96.0, 4599.0, 3723.0, 23.5, 0.7949 0.7966 0.8768 0.8706 0.71 0.8721 0.8706 0.17 Di144 825.0, 825.0, 10512.0, 5329.0, 97.2, 6935.0, 5329.0 30.1, 0.7153 0.7190 0.8276 0.8190 1.04 0.8217 0.8190 0.32 Di256 1155.0, 1155.0, 18688.0, 9417.0, 98.4, 12994.0, 9417.0, 37.9, 0.5407 0.5494 0.7143 0.7012 1.87 0.7062 0.7012 0.71 Di361 1452.0, 1452.0, 26353.0, 13286.0, 98.3, 19126.0, 13286.0, 43.9, 0.4158 0.4209 0.6222 0.6062 2.64 0.6133 0.6062 1.17 Table 4

D and w values produced by MRCD for Gauss, random and diamond DAGs on m = 24 processors for the worst case scenario for the comparison C1, C2 and C3.

n Gn Rn Din 152 252 377 527 702 80 98 152 256 364 81 100 144 256 361 C1 D 30.3 31.0 32.7 35.6 41.0 27.7 29.9 42.0 38.2 44.1 43.6 43.9 44.7 46.6 49.5 w 0.5 0.5 0.5 0.5 0.4 0.6 0.6 0.4 0.4 0.3 0.6 0.6 0.4 0.3 0.2 C2 D 58.7 58.1 57.1 57.1 57.3 58.7 57.9 56.9 60.0 54.5 59.7 60.5 61.0 62.5 62.7 w 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 C3 D 13.2 7.6 6.3 4.4 4.8 10.6 18.1 5.95 12.8 15.3 59.7 60.5 61.0 62.5 62.7 w 0.7 0.7 0.7 0.7 0.7 0.7 0.7 0.7 0.6 0.6 0.9 0.9 0.9 0.9 0.9

(10)

the homogeneous scenario. Added to that, it is interesting to note that in the majority of the DAGs, the D values in C1 shown inTable 6are closer to the equilibrium value, since, in relation to the speed of the processors, choosing a processor that min-imizes the reliability cost does not harm the minimization the makespan. However, the topology of the DAGs also affects the minimization of the makespan, but not the minimization of the reliability cost, due to the cost function of MRCD. This is the reason for SMnot being the same as Sesolutions.

In the case of the results in C2, the D values are slightly higher than those correspondents in the previous scenario (WCS). The chosen solution SRT prioritizes the minimization of the reliability costs. In this case, although the processors have the

same speed, adjacent tasks might have been allocated to distinct processors, increasing the ﬁnal makespan. Still, comparing with RCDMod, the makespans are amazingly smaller, as seen inTable 5.

In C3, practically all of the D values are relatively higher then those produced in the WCS, although still negative. Each MRCD solution is the closest one to that provided by RHEFT. Although in this case RHEFT does not neglect the minimization of the makespan, since it considers the ﬁnishing time in its cost function, even in this scenario, MRCD provides solutions with much better makespans, but with slightly higher reliability, as seen in the last column ofTable 5.

6.2.3. Heuristics in class 1 in the best case scenario

In the best case scenario (BCS), the objectives can be achieved by choosing a processor that both minimizes the makespan and reliability costs, for each task of the application. InTables 7 and 8, one can see that the D values are closer to the equi-librium solution, mainly when compared with the worst case scenario results. The way that MRCD aggregates the objectives does not compromise the minimization of the makespan. In the BCS, the makespans are much smaller than those produced by both RCDMod and RHEFT. Also, a higher number of dominant solutions are met by MRCD over RHEFT than those met in the other scenarios. Again, there is no MRCD dominated solutions.

In summary, from all the experiments carried out for the heuristics in class 1, it can be concluded that the solutions pro-vided by MRCD is either dominant or incomparable, but never dominated. Considering the previous tables for each scenario,

Table 5

Results from heuristics in class 1 for Gauss, random and diamond DAGs on m = 24 processors in the homogeneous processors scenario.

DAG C1 C2 C3

G152 305.2, 305.2, 1746.8, 666.7, 162.0, 504.5, 343.4, 46.9, 0.9402 0.9569, 0.9690 0.9652 0.39 0.9595 0.9591 0.04 G252 511.9, 511.9, 3776.7, 1409.8, 167.8, 983.6, 722.9, 36.0, 0.8594 0.8982 0.9342 0.9261 0.87 0.9086 0.9134 0.52 G377 771.6, 771.6, 6972.6, 2546.1, 173.8, 1657.8, 1263.5, 31.4, 0.7346 0.7958 0.8820 0.8675 1.66 0.8246 0.8447 2.38 G527 1084.3, 1084.3, 11599.5, 4249.5, 172.9, 2558.8, 1958.8, 30.5, 0.5803 0.6584 0.8115 0.7897 2.76 0.7143 0.7517 4.97 G702 1450.0, 1450.0, 17922.4, 6434.2, 178.5, 3713.1, 2988.1, 24.2, 0.4139 0.4988 0.7242 0.6935 4.43 0.5867 0.5872 8.73 R80 530.0, 530.0, 4240.0, 1643.0, 158.0, 848.0, 795.0, 6.66, 0.8486 0.8869 0.9265 0.9178 0.94 0.8743 0.9033 3.21 R98 530.0, 530.0, 5194.0, 2014.0, 157.8, 1113.0, 901.0, 23.5, 0.8240 0.8543 0.9107 0.9005 1.13 0.8576 0.8816, 2.7 R152 636.0, 636.0, 8056.0, 3127.0, 157.6, 1325.0, 1060.0, 25.0, 0.7052 0.7411 0.8650 0.8500 1.76 0.7667 0.8118 5.54 R256 583.0, 583.0, 13568.0, 5724.0, 137.0, 1484.0, 901.0, 64.7, 0.5065 0.5089 0.7833 0.7631 2.64 0.5834 0.5656 3.14 R377 848.0, 848.0, 19292.0, 7155.0, 169.6, 2438.0, 2014.0, 21.0, 0.3832 0.3840 0.7066 0.6757 4.56 0.4810 0.5582 13.8 Di81 900.9, 900.9, 4293.0, 1536.9, 179.3, 2385.0, 1536.9, 55.1, 0.8888 0.9006 0.9256 0.9155 1.09 0.9207 0.9155 0.56 Di100 1007.1, 1007.1, 5300.0, 1908.0, 177.7 3339.0, 1908.0, 75.0, 0.8444 0.8749 0.9090 0.8969 1.34 0.9054 0.8969 0.94 Di144 1219.1, 1219.1, 7632.0, 2703.0, 182.3, 5035.0, 2703.0 86.2, 0.7525 0.8111 0.8716 0.8550 1.93 0.8671 0.8550 1.40 Di256 1643.1, 1643.1, 13568.0, 4717.0, 187.6, 9434.0, 4717.0 100, 0.5752 0.6486 0.7833 0.7566 3.52 0.7768 0.7566 2.67 Di361 1961.1, 1961.2, 19133.0, 6625.0, 188.8, 13886.0, 6625.0, 109.6, 0.4223 0.5106 0.7086 0.6747 5.01 0.7012 0.6747 3.92

(11)

it is clear that MRCD highly improves execution performance in all cases at a small deterioration of the reliability in most of the cases.

6.2.4. Varying the number of processors for the worst case scenario

The following set of experiments was performed with a varying number of processors on the DAGs: G1034, R546and Di529

in the worst case scenario, as seen inTable 9. With an increasing number of processors, the makespan diminishes while reli-ability increases, since the scheduling strategies can allocate tasks on more appropriate processors concerning failure rates and computational slowdown indexes. In relation with the comparisons C1, C2 and C3, the same behaviour previously de-scribed in the worst case scenario is also detected.

Table 6

D and w values produced by MRCD for Gauss, random and diamond DAGs on m = 24 processors for the homogeneous processors scenario for the comparison C1, C2 and C3. n Gn Rn Din 152 252 377 527 702 80 98 152 256 364 81 100 144 256 361 C1 D 6.18 10.6 18.8 24.2 29.2 1.39 3.61 25.6 54.4 55.5 11.6 13.1 17.1 25.2 31.1 w 0.7 0.6 0.5 0.5 0.5 0.6 0.6 0.5 0.4 0.4 0.6 0.6 0.5 0.5 0.5 C2 D 38.5 43.3 43.6 46.2 46.6 50.0 50.8 51.1 58.1 51.4 35.7 39.0 42.6 45.1 46.1 w 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 C3 D 0.37 8.63 9.03 8.82 11.8 15.8 17.7 17.3 12.8 21.1 35.7 39.0 42.6 45.1 46.1 w 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.5 0.6 0.9 0.9 0.9 0.9 0.9 Table 7

Results from heuristics in class 1 for Gauss, random and diamond DAGs on m = 24 processors in the best case scenario.

DAG C1 C2 C3

G152 190.0, 190.0, 1087.6, 190.0, 472.4, 314.1, 190.0, 65.3, 0.9516 0.9733 0.9806 0.9733 0.74 0.9747 0.9733 0.18 G252 318.7, 318.7, 2351.5, 368.2, 538.6 627.0, 368.2, 91.9, 0.8820 0.9341 0.9585 0.9423 1.72 0.9405 0.9423 0.05 G377 480.4, 480.4, 4341.4, 652.7, 565.1, 1030.9, 652.7, 82.2, 0.8036 0.8474 0.9248 0.8953 3.28 0.8803 0.8953 1.03 G527 675.1, 675.1, 7222.3, 1062.6, 579.6, 1652.6, 1062.6, 81.7, 0.6403 0.7190 0.8780 0.8316 5.58 0.7922 0.8316 3.11 G702 902.8, 902.8, 11159.2, 1620.9, 588.4, 2508.0, 1620.9, 54.7, 0.4949 0.5267 0.8180 0.7523 8.73 0.6937 0.7523 7.78 R80 330.0, 330.0, 2640.0, 462.0, 471,4 528.0, 462.0, 14.2, 0.8669 0.9293 0.9535 0.9375 1.70 0.9141 0.9375 2.50 R98 330.0, 330.0, 3234.0, 495.0, 553.3 726.0, 495.0, 46.6, 0.8434 0.8841 0.9434 0.9219 2.33 0.9006 0.9219 2.31 R152 396.0, 396.0, 5016.0, 726.0, 590.9, 924.0, 726.0, 27.2, 0.7296 0.7528 0.9136 0.8798 3.83 0.8287 0.8798 5.8 R256 528.0, 528.0, 8448.0, 1551.0, 444.6, 1122.0, 759.0, 47.8, 0.5453 0.5453 0.8589 0.8123 5.73 0.6653 0.6848 2.84 R364 759.0, 759.0, 12012.0, 1947.0, 516.9, 1914.0, 1023.0, 87.1, 0.4257 0.4298 0.8055 0.7412 8.67 0.5751 0.5759 0.13 Di81 580.8, 580.8, 2673.0, 594.0, 350.0 1485.0, 594.0, 150.0, 0.9298 0.9364 0.9530 0.9390 1.48 0.9498 0.9390 1.15 Di100 660.0, 660.0, 3300.0, 693.0, 376.1, 2079.0, 693.0, 200.0, 0.9120 0.9232 0.9423 0.9242 1.95 0.9400 0.9242 0.94, Di144 825.0, 825.0, 4752.0, 891.0, 433.3, 3135.0, 891.0, 251.8, 0.8704 0.8883 0.9180 0.8905 3.08 0.9150 0.8190 2.75 Di256 1155.0, 1155.0, 8448.0, 1353.0, 524.3, 5874.0, 1353.0, 334.1, 0.7404 0.7528 0.8589 0.8085 6.23 0.8545 0.8085 5.68 Di361 1452.0, 1452.0, 11913.0, 1782.0, 568.5, 8679.0, 1782.0, 387.0, 0.6287 0.6454 0.8069 0.7384 9.28 0.8017 0.7384 8.58

(12)

6.3. Analysis of the heuristics from class 2

The heuristic proposed in[10]belongs to the same class of MRCD. In order to be fair, BSAMod follows the HEFT frame-work, but implements different cost functions and normalization procedures as seen in Eq.(5). The comparison of MRCD with BSAMod is an attempt to draw the advantages and pitfalls of the MRCD cost function and normalized values. Next, re-sults produced by MRCD using only the normalization of BSA (but the MRCD cost function) are shown and compared with the original implementation of MRCD.

6.3.1. The normalization method of BSA

As already said, the normalization of the objectives tackled in this work is necessary due to their distinct range values. On the study of the normalization method, let MRCDBSA be MRCD version with the BSA normalized values EFTBSAn ð

v

;pjÞ ¼

EFTðv;pjÞ

maxp2PfEFTðv;pÞgand RC

BSA n ð

v

;pjÞ ¼

RCðv;pjÞ

maxp2PfRCðv;pÞgdeﬁned in[10].Table 10shows the advantages of using the normalization

proce-dure that considers both the maximum and minimum values of the ﬁnishing times and reliability costs (MRCD), instead of only the maximum values as in BSA. The solutions were produced for all of the DAGs Gn, Rnand Dinanalyzed so far. The

values for w are also in the set W. The line labeled as ‘‘MRCD dominant’’ shows the number of solutions that were Table 8

D and w values produced by MRCD for Gauss, Random and Diamond DAGs on m = 24 processors for the best case scenario for the comparison C1, C2 and C3.

n Gn Rn Din 152 252 377 527 702 80 98 152 256 364 81 100 144 256 361 C1 D 1.91 4.0 9.75 14.6 21.5 16.2 4.17 20.6 37.2 26.6 4.56 2.37 0.35 10.8 11.7 w 0.9 0.6 0.6 0.5 0.4 0.7 0.5 0.4 0.1 0.3 0.6 0.8 0.8 0.3 0.3 C2 D 1.91 3.70 7.59 11.0 13.1 18.2 20.7 20.1 35.4 28.2 1.19 1.84 7.27 13.0 16.8 w 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 C3 D 1.91 3.79 7.59 11.0 13.1 18.2 20.7 20.1 31.5 26.0 1.19 1.84 7.27 13.0 16.8 w 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.6 0.6 0.9 0.9 0.9 0.9 0.9 Table 9

Comparison in Class 1 for G1034, R546and Di529with a varying number m of processors.

DAG m C1 C2 C3

HEFT MRCD D w RCDMod MRCD D w RHEFT MRCD D w

G1034 24 1504.1, 1499.8, 41.3 44389.8, 22758.4, 55.8 8743.9, 5309.0, 2.1 0.8625 0.8662 0.4 0.9232 0.9192 0.9 0.8897 0.8901 0.7 45 1335.8, 1335.8, 35.2 44389.8, 23805.3, 51.5 9291.4, 4623.8, 11.0 0.8466 0.8803 0.4 0.9355 0.9298 0.9 0.9173 0.9181 0.8 66 1335.8, 1335.8, 31.5 22229.9, 9640.3, 49.8 5673.5, 4940.6, 29.3 0.8516 0.8939 0.4 0.9523 0.9469 0.9 0.9305 0.9392 0.8 87 1335.8, 1335.8, 28.4 22229.9, 7797.8, 54.9 4885.1, 4137.6, 25.5 0.8461 0.9004 0.4 0.9523 0.9484 0.9 0.9340 0.9387 0.8 108 1335.8, 1335.8, 28.4 44389.8, 11617.2, 54.4 6556.8, 4108.4, 15.6 0.8450 0.9025 0.4 0.9523 0.9473 0.9 0.9344 0.9334 0.8 R546 24 1122.0, 1122.0, 36.4 39858.0, 20513.0, 54.8 4453.0, 2628.0, 16.5 0.8770 0.8773 0.4 0.9307 0.9271 0.9 0.8886 0.8919 0.6 45 629.0, 627.0, 57.7 39858.0, 20805.0, 49.8 2920.0, 2263.0, 1.44 0.8724 0.8735 0.2 0.9419 0.9365 0.9 0.8970 0.9097 0.6 66 627.0, 627.0, 46.5 19929.0, 8541.0, 51.3 2847.0, 1533.0, 5.8 0.8756 0.8819 0.4 0.9571 0.9521 0.9 0.9095 0.9142 0.6 87 627.0, 627.0, 49.3 19929.0, 6935.0, 56.0 2409.0, 1347.0, 2.91 0.8703 0.8832 0.3 0.9571 0.9535 0.9 0.9139 0.9183 0.6 108 627.0, 627.0, 39.7 39858.0, 10804.0, 47.2 2628.0, 1274.0, 7.3 0.8682 0.8908 0.4 0.9571 0.9527 0.9 0.9130 0.9181 0.6 Di529 24 1881.0, 1881.0, 51.2 38617.0, 19418.0, 63.2 29492.0, 19418.0, 63.2 0.8779 0.8791 0.2 0.9328 0.9292 0.9 0.9311 0.9292 0.9 45 1571.1, 1571.1, 66.7 38617.0, 20367.0, 52.1 31828.0, 20367.0, 52.1 0.8655 0.8702 0.3 0.9437 0.9385 0.9 0.9413 0.9385 0.9 66 1505.1, 1505.2, 65.3 19418.0, 8833.0, 54.6 16133.0, 8833.0, 54.6 0.8676 0.8747 0.3 0.9584 0.9539 0.9 0.9577 0.9539 0.9 87 1485.1, 1485.2, 69.7 19418.0, 7081.0, 59.6 13067.0, 7081.0, 59.6 0.8603 0.8711 0.5 0.9584 0.9551 0.9 0.9571 0.9551 0.9 108 1485.1, 1485.2, 66.1 38617.0, 10293.0, 57.8 22630.0, 10293.0, 57.8 0.8634 0.8754 0.5 0.9584 0.9541 0.9 0.9568 0.9541 0.9

(13)

dominant over MRCDBSAfor all Gns, Rns and Dins, for each one of the scenarios. The subsequent line shows the number of

dominated solutions produced by MRCDBSA. It is important to emphasize that MRCD results are never dominated and

therefore, the normalization procedure must consider both minimal and maximal values of the respective objective. Actually, it was observed that, since the normalization procedure of BSA only divides by the maximum value of the given objective, it leaded to poor choices most of the times.

6.3.2. Comparison between MRCD and BSAMod

The next sets of experiments were performed considering BSAMod of class 2. The set S was generated by executing both MRCD and BSAMod with W = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1} for 24 processors. Firstly, the number of dominant and dominated solutions for both heuristics were accounted, inTable 11. This table shows clearly the advantage of using the MRCD cost function and normalization procedure on the speciﬁcation of a proper schedule, since MRCD dominates more than it is dominated. Actually, the case in which MRCD dominates less frequently is for the diamond DAG, due to its topology, a greater number of tasks (there is a high degree of adjacent tasks) is allocated to the same processor, increasing therefore, Table 10

Dominance relation between MRCD over MRCDBSAin the three scenarios for Gn, Rnand Dinon 24 processors.

Solutions WCS BCS HPS Total

Gn Rn Din Gn Rn Din Gn Rn Din

MRCD dominant 12 11 8 15 14 2 18 11 9 100

MRCDBSAdominated 11 15 8 18 14 2 21 13 9 111

Table 11

Dominance relation between MRCD and BSAMod in the three scenarios for Gn, Rnand Dinon 24 processors.

DAGs Dominants Dominated

BSAMod MRCD BSAMod MRCD WCS Gn 0 14 15 0 Rn 1 8 10 1 Din 1 4 4 1 BCS Gn 2 5 5 2 Rn 3 4 4 3 Din 0 3 3 0 HPS Gn 0 15 19 0 Rn 2 10 11 2 Din 2 9 7 2 Total 11 72 78 11 Table 12

Percentage improvements onM produced by MRCD over BSAMod and percentage improvements on RTproduced by BSAMod over MRCD for G702.

w 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 Average

M(%) 0.0 18.5 49.0 75.0 106.0 142.0 113.0 10.0 17.0 13.0 0.0 49.4

RT(%) 0.0 2.8 3.9 5.6 7.8 8.2 5.2 0.8 0.3 0.2 0.0 3.01

D – 49.1 45.7 42.8 41.0 38.7 27.8 4.80 36.3 57.3 – –

Table 13

Percentual Average improvements onM produced by MRCD over BSAMod and percentage improvements on RTproduced by BSAMod over MRCD for Gn, Rnand Dinon 24 processors on the three scenarios.

n Gn Rn Din 152 252 377 527 702 80 98 152 256 364 81 100 144 256 361 WCS M (%) 31.5 38.6 44.4 48.0 49.6 9.77 8.01 3.66 10.0 4.26 35.6 75.8 53.0 138.6 96.3 RT(%) 0.28 0.60 1.06 1.91 3.24 0.15 0.0 0.11 4.65 0.92 0.99 5.92 2.23 4.94 8.07 BCS M (%) 0.07 1.2 1.99 2.72 4.58 2.76 2.06 0.82 5.52 1.08 1.83 1.73 3.06 15.1 13.2 RT(%) 0.07 0.18 0.32 1.1 2.65 0.05 0.2 0.19 0.81 1.78 0.09 0.21 0.48 2.32 4.79 HPS M (%) 1.41 3.8 38.1 9.19 40.7 2.13 1.24 1.32 12.4 2.80 13.1 17.5 24.8 68.5 51.4 RT(%) 0.1 0.36 0.82 0.58 1.94 0.20 0.11 0.22 0.62 1.66 0.33 0.52 1.32 4.77 9.66

(14)

the reliability cost of that processor. Although, in these cases, MRCD does not dominate, it is hardly dominated by BSAMod, and some of the cases the solutions are incomparable.

Table 12shows the percentage difference of both makespan and reliability for each w value for an example DAG G702on

24 processors in the Worst Case Scenario. The lineM presents the percentage improvement on the makespan produced by MRCD over BSAMod and the line RT, the percentage improvements on the reliability of BSAMod over MRCD. Note that an

outstanding improvement on the makespan is produced by MRCD while the reliability remains almost the same as the re-sults of BSAMod. Bear in mind that the solution in this table was not classiﬁed under the dominance concept.

Fig. 3. The average makespan and the reliability produced by both MRCD and BSAMod on 24 processors on the three scenarios.

Fig. 4. The average difference D produced by MRCD on 24 processors on the three scenarios.

Table 14

Percentual average improvements onM produced by MRCD over BSAMod and percentage improvements on RTproduced by BSAMod over MRCD for a varying number of processors.

m G1034 R546 Di529

24 87 213 24 87 213 24 87 213

AverageM(%) 47.2 41.1 41.1 3.34 5.55 8.25 105 71.5 68.3

(15)

An analogous comparison between MRCD and BSAMod carried on 24 processors was held, considering the DAGs shown in Table 13for all the three scenarios. In all the scenarios for both gauss and diamond DAGs, MRCD provided solutions with better makespan and slightly smaller reliability. Actually, in the case of WCS, the improvements on the makespan are supe-rior in the other cases, with a small detesupe-rioration on the reliability. Actually, MRCD looses to BSAMod only for some of the random DAGs in the BCS and HPS. In summary, one can conclude the beneﬁts of applying the implemented list scheduling algorithm together with the cost function from MRCD.

The corresponding graphics for makespan and reliability with the variation of w are shown inFig. 3.Fig. 4shows the aver-age difference D on the three scenarios produced by MRCD. As expected, the higher w is, the larger the values for makespan are, since it is not prioritized, also the reliability increases, since in this case this objective is prioritized. On the other hand, the average difference is closer to zero when w is equal to 0.7, and not when w = 0.5, as could seem more plausible in a naive analysis of the weighted cost function.

A set of similar experiments was carried out considering a varying number of processors, for the DAGs G1034, R546and

Di529in the Worst Case Scenario only. The main conclusion is that the improvements on the makespan provided by MRCD

were signiﬁcant at a cost of a small deterioration of the reliability for any number of processors, as can be seen inTable 14. 7. Concluding remarks

This work proposes a weighted cost function and a classification of the solutions provided by MRCD. An experimental analysis shows that MRCD can find efficient solutions when compared with the other heuristics under evaluation. The com-parison with HEFT shows the importance of using a bi-objective function that considers both minimization of the makespan and maximization of the reliability. From the evaluation of RCDMod and MRCD, one can conclude the importance of using an integrated bi-objective function, while the relevance of the weights and the cost function provided by MRCD are shown on the comparisons with RHEFT. Finally, when considering BSA, which is also a weighted bi-objective scheduling heuristics, MRCD provides outstanding improvements on the makespan while keeping similar reliability results. As future work, MRCD will be used in conjunction with a fault-tolerance approach that is primary-backup based, with the objective to recover real MPI applications from faults and execute them efficiently on distributed heterogeneous systems.

Appendix A. Glossary Heuristics:

MRCD – makespan and reliability cost driven HEFT – heterogeneous earliest ﬁnish time RCDMod – reliability cost-driven modiﬁed

RHEFT – reliable heterogeneous earliest ﬁnish time BSAMod – bi-criteria scheduling algorithm modiﬁed

Application model: G – task graph V – set of vertices E – set of arcs et – execution time ct – communication time Architectural model: P – set of processors

csi – computational slowdown index cdi – computational delay index FR – failure rate

Scheduling model:

M – makespan

EST – earliest start time EFT – earlieste ﬁnish time RC – reliability cost RT– total reliability

(16)

Scheduling strategy:

D – average difference between the objectives

S – set of feasible solutions for the bi-objective problem Se– equilibrium solution

SRT– solution with the highest total reliability

SM– solution with the minimum makespan

F – weighted function w – weight

References

[1] A. Amin, Maximizing reliability while scheduling real-time task-graphs on a cluster of computers, in: Proceedings of the 10th IEEE Symposium on Computers and Communications, IEEE Computer Society, Washington, DC, USA, 2005, pp. 1001–1006.

[2] I. Assayad, A. Girault, H. Kalla, A bi-criteria scheduling heuristic for distributed embedded systems under reliability and real-time constraints, in: Proceedings of the 2004 International Conference on Dependable Systems and Networks, IEEE Computer Society, Washington, DC, USA, 2004, p. 347. [3] L.-C. Canon, E. Jeannot, Scheduling strategies for the bicriteria optimization of the robustness and makespan, in: Proceedings of the 24th IEEE

International Parallel and Distributed Processing Symposium (IPDPS’08), 2008, pp. 1–8. [4] J.L. Cohon, Multiobjective Programming and Planning, Academic Press, 1978.

[5] M. Cosnard, D. Trystram, Parallel Algorithms and Architectures, International Thomson Computer Press, 1995.

[6] A. Dogan, F. Ozguner, Biobjective scheduling algorithms for execution time-reliability trade-off in heterogeneous computing systems, Journal of Computing 48 (3) (2005) 300–314.

[7] J. Dongarra, E. Jeannot, E. Saule, Z. Shi, Bi-objective scheduling algorithms for optimizing makespan and reliability on heterogeneous systems, in: Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA’07), ACM Press, 2007.

[8] T. Gal, T. Hanne, T. Stewart (Eds.), Multicriteria Decision Making: Advances in MCDM Models, Algorithms, Theory, and Applications, Kluwer Academic, 1999.

[9] A. Girault, I. Saule, D. Trystram, Reliability versus performance for critical applications, Journal of Parallel and Distributed Computing 69 (3) (2009) 326–336.

[10] M. Hakem, F. Butelle, Reliability and scheduling on systems subject to failures, in: Proceedings of the 2007 International Conference on Parallel Processing (ICPP’07), 2007, pp. 38.

[11] E. Jeannot, E. Saule, D. Trystram, Bi-objective approximation scheme for makespan and reliability optimization on uniform parallel machines, in: Proceedings of the 14th International Euro-Par Conference on Parallel Processing, Springer-Verlag, Berlin, Heidelberg, 2008, pp. 877–886. [12] A.P. Nascimento, A.C. Sena, C. Boeres, V.E.F. Rebello, Distributed and dynamic self-scheduling of parallel MPI grid applications, Concurrency and

Computation: Practice and Experience 19 (14) (2007) 1955–1974.

[13] L.N. Nassif, J.M. Nogueira, F.V. Andrade, Distributed resource selection in grid using decision theory, in: Proceedings of the 7th IEEE International Symposium on Cluster Computing and the Grid (CCGRID’07), 2007, pp. 327–334.

[14] X. Qin, H. Jiang, A dynamic and reliability-driven scheduling algorithm for parallel real-time jobs executing on heterogeneous clusters, Journal of Parallel Distributed Computing 65 (8) (2005) 885–900.

[15] X. Qin, H. Jiang, A novel fault-tolerant scheduling algorithm for precedence constrained tasks in real-time heterogeneous systems, Parallel Computing 32 (5) (2006) 331–356.

[16] O. Sinnen, Task Scheduling for Parallel Systems, Wiley, 2007.

[17] H. Topcuouglu, S. Hariri, M. Wu, Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems 13 (3) (2002) 260–274.