Publicações do PESC


A Graph-Theoretic Model of Shared-Memory Legality

Ayru L. Oliveira Filho†

Valmir C. Barbosa‡

† Centro de Pesquisas de Energia Elétrica, CEPEL

Caixa Postal 68007

21944-970 Rio de Janeiro - RJ, Brazil [email protected]

‡ Programa de Engenharia de Sistemas e Computação, COPPE/UFRJ

Caixa Postal 68511

21945-970 Rio de Janeiro - RJ, Brazil [email protected]

Abstract

The concept of legality is a crucial one for the definition of many shared-memory consistency conditions. As originally defined, a sequence of operations on a given object is legal if it is in the set of valid sequences specified for that object. Being thus defined on totally ordered sets of operations, the notion of legality is not fully realistic, because parallel executions on shared-memory multiprocessing systems are best represented by partially ordered sets of operations. It is conceivable, therefore, that the eventual derivation of practical systems based on consistency models that rely on some form of legality will tend to be excessively heavy on the required synchronization. In this paper, we introduce an alternative definition of legality that is based on partially ordered sets of operations. Our treatment starts with a system model that makes very few assumptions on machine architecture, and proceeds to employ a graph-theoretic formalism to handle the normally troublesome issue of write multiplicity and to characterize legality. Using this same formalism, we argue that the novel definition of legality is consonant with the one on totally ordered sets of operations, so it carries the same intuitive appeal intended by the original definition.


1 Introduction

The design of a scalable, efficient multiprocessing system, that is, a system comprising several processors that share a common address space, is dependent upon a variety of important factors. Of these factors, one that seems to be especially critical is what is known as the system’s consistency model, which, in essence, is a set of constraints (consistency conditions) that regulate the “intertwining” of operations on shared objects. Ultimately, the choice of a consistency model poses a tradeoff between how general the program can be at the user level and how time-consuming guaranteeing correctness may be at lower levels. The sequential consistency model [1], for example, allows ample generality at the user level, though at the expense of efficiency [2]. Even so, various other consistency models have appeared in the wake of sequential consistency, and many of them retain the notion of sequential consistency, albeit in a considerably restricted manner (e.g., [3, 4, 5, 6, 7]). In all of these consistency models, and after a fashion in some others as well [8, 9], one of the key concepts is that of the legality of a sequence of operations on shared objects. Starting from the specification of what the valid sequences of operations on each shared object are, a given sequence is said to be legal if the subsequence of operations corresponding to each shared object is one of those valid sequences. When the shared objects are memory locations, then the operations at hand are the writing of a value into a location and the reading of its content. In this case, a sequence of operations is said to be legal if every read operation returns the value that was last written into the corresponding location according to the sequence [2].
Given this notion of legality for read/write operations, an “execution,” broadly understood as a partially ordered set of reads and writes, is said to be sequentially consistent if a legal sequence of those operations exists in which every read returns the same value as it does in the execution, and furthermore the order in which operations appear in the program is maintained in the sequence [2].

Because legality is defined on totally ordered sets of operations, the actual derivation of algorithms and protocols to implement the consistency models that rely on some form of legality is not based on executions, which are partially ordered sets of operations, but rather on sequences that are related to those executions by the definition of legality. A question that comes naturally, then, is whether such approaches tend to “exaggerate” the amount of synchronization that they eventually require, and, if so, by how much. We believe that providing characterizations of legality on the partially ordered sets of operations that constitute executions is a fundamental first step towards answering these questions. This, to our knowledge, has not yet been attempted satisfactorily, and is the main theme of this paper.

We start in Section 2 with the description of a system model that is general enough to represent a great variety of architectures. The model is built at two levels, of which the bottommost aims at hiding the details of specific architectures. At this level, the model comprises a partially ordered set of events, these being the start and end of read and write operations (the end of a write is to be understood merely as a signal that the processor may proceed, and bears no relation to the actual writing of a value into a memory location, however that may be carried out). Such events are related to one another to yield the partial order in a manner that depends entirely on architectural details and is as such not pursued further.

The model’s second level comprises a partially ordered set of read and write operations. At this level, it is worth noting that multiple writes of a same value into the same memory location are allowed. As one may readily recognize, this amounts to additional complexity in the treatment of legality on partially ordered sets. When legality is examined on a sequence, all that matters is the most recent value written into a memory location, regardless of whether multiple writes are allowed or not. If, on the other hand, operations are not totally ordered, then the multiplicity of writes must be handled explicitly.

Following this introduction of the system model in Section 2, the remainder of the paper proceeds as follows. Section 3 contains a treatment of legality on partially ordered sets when multiple writes are disallowed. This sets the stage for the remaining sections, which incorporate multiple writes into the execution (Section 4), then regard such an execution as what is known as an AND-OR graph (Section 5), and finally come to a graph-theoretic characterization of legality on executions (Section 6). This characterization requires the existence of at least one legal sequence of operations that, in a manner to be made precise later, extends the execution, and is in this sense in consonance with the usual definition of legality on sequences. Concluding remarks follow in Section 7.

2 System Model

In this section, we present the basic assumptions and definitions related to the system model. We consider a multiprocessing system in which shared memory is the only means of communication among processors, and assume that each processor runs one single process during the execution of a parallel program. The shared-memory abstraction is supported by a memory system with which processes interact by requesting that operations on shared objects be carried out. The objects we consider in this paper are memory locations, and the operations that processes may request are reads and writes. We assume, for the sake of simplicity, that the initial values of all memory locations result from write operations requested by the processes.

A read of the value v from location x is denoted by r(x)v. This operation comprises two events, namely a start event denoted by s[r(x)v] and an end event denoted by e[r(x)v]. Likewise, w(x)v denotes the write of the value v to location x, with s[w(x)v] and e[w(x)v] denoting the operation’s start and end events, respectively. Start events correspond to the requests sent by processes to the memory system, while the end events correspond to the respective responses sent by the memory system back to the processes. Note that while the occurrence of e[r(x)v] at a process indicates completion of the operation r(x)v, meaning that the value v has been read, nothing of the sort can be said of e[w(x)v]. What this event indicates is merely that the command to perform the write has been passed on to the memory system.

Because we want to stay clear of architectural details and be able to make statements that hold across a wide variety of shared-memory architectures, we assume that all events occurring during a computation are partially ordered by a relation that we denote by ≺. This relation reflects the internals of the memory system, and can be thought of as being akin to the usual “happened-before” relation of distributed computing [10, 11]. To judge by most current implementations of shared memory, what determines ≺ is the precedence that the message passing used to implement the memory system imposes on the start and end events of operations, as well as the temporal precedence of events occurring at the same process. Despite this purposeful generality of ≺, all events occurring at a same process are totally ordered by it, although it is conceivable that this total order does not alternate start events with end events. In other words, the model supports computations with more than one outstanding request by the same process.

On top of ≺, several relations of interest at the level of operations can be defined. The following two are crucial for the subsequent developments in the paper.

May influence ($\xrightarrow{mi}$): This relation is akin to the one introduced in [12], and indicates, for a read, the writes that may have been responsible for the value returned by the read. Obviously, such writes and the read must involve the same value, so $\xrightarrow{mi}$ is intended to allow the handling of multiple writes of the same value to the same memory location. Thus

$w(x)v \xrightarrow{mi} r(x)v$ if and only if $s[w(x)v] \prec e[r(x)v]$.

Execution order ($\xrightarrow{xo}$): This relation is meant to capture the partial order that exists among operations issued by the same process. If $op$ and $op'$ are two such operations, then there are two cases to be considered. The first case is that of two reads, say $op = r(x)v$ and $op' = r(y)u$, and the second case is that of a read and a write, say $op = r(x)v$ and $op' = w(y)u$. In either case, $op \xrightarrow{xo} op'$ if and only if $e[op] \prec e[op']$.

Note that cases in which $op$ is a write are deliberately left aside, because the completion of a write is, as we remarked earlier, unrelated to the occurrence of the write’s end event. What is related to the completion of a $w(x)v$ operation, instead, is the end event of a $r(x)v$ operation such that $w(x)v \xrightarrow{mi} r(x)v$. More specifically, the end event of a $r(x)v$ issued by a certain process indicates completion, as far as that process is concerned, of at least one of the $w(x)v$ operations that precede $r(x)v$ in $\xrightarrow{mi}$.
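As an illustration of the two definitions above, the following minimal Python sketch derives the may-influence and execution-order pairs from an event order. The `Op` record, its `proc` field, and the encoding of ≺ as a transitively closed set of event pairs are our assumptions for the example and are not part of the model.

```python
from typing import NamedTuple

class Op(NamedTuple):
    kind: str   # "r" for a read, "w" for a write
    loc: str    # memory location
    val: int    # value read or written
    proc: int   # issuing process (hypothetical field, for illustration only)

def may_influence(ops, prec):
    """Pairs (w, r) with w = w(x)v, r = r(x)v and s[w] preceding e[r]."""
    return {(w, r) for w in ops for r in ops
            if w.kind == "w" and r.kind == "r"
            and (w.loc, w.val) == (r.loc, r.val)
            and (("s", w), ("e", r)) in prec}

def execution_order(ops, prec):
    """Pairs (op, op2) of one process, op a read, with e[op] preceding e[op2]."""
    return {(a, b) for a in ops for b in ops
            if a != b and a.proc == b.proc and a.kind == "r"
            and (("e", a), ("e", b)) in prec}

# Process 1 writes 1 to x; process 2 reads it and then writes 2 to y.
w = Op("w", "x", 1, 1)
r = Op("r", "x", 1, 2)
w2 = Op("w", "y", 2, 2)
# Assumed event order, given here already transitively closed.
prec = {(("s", w), ("e", r)), (("e", r), ("e", w2)), (("s", w), ("e", w2))}

mi = may_influence([w, r, w2], prec)
xo = execution_order([w, r, w2], prec)
```

Pairs whose first operation is a write never enter `xo`, which mirrors the remark that such cases are deliberately left aside.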

Orders $\xrightarrow{mi}$ and $\xrightarrow{xo}$ are the ones that interest us now, but others may be of interest when addressing specific consistency models on top of our generic system model. One example is the following.

Program order ($\xrightarrow{po}$): Like $\xrightarrow{xo}$, this relation is intended to capture the ordering of operations issued by the same process. Unlike $\xrightarrow{xo}$, however, $\xrightarrow{po}$ is a total order given the process. Specifically, for $op$ and $op'$ we have $op \xrightarrow{po} op'$ if and only if $op$ precedes $op'$ in the code of the process where they appear.

In comparison with other system models (e.g., [13]), the two-level model we have introduced makes virtually no assumptions on architectural details. It has therefore the potential for greater generality, so the properties that we show to hold for it will tend to be applicable to a wider variety of architectures.

We are then in position to define what will be meant throughout the paper as an execution. An execution is a set of operations (reads and writes) partially ordered by the relation $[\xrightarrow{xo} \cup \xrightarrow{mi}]^+$, that is, by the transitive closure of all pairs of operations related by $\xrightarrow{xo}$ or $\xrightarrow{mi}$. In the sequel, we denote such a set of operations by $\Omega$, and let $\xrightarrow{\sigma}$ denote the partial order $[\xrightarrow{xo} \cup \xrightarrow{mi}]^+$. The execution is then the partially ordered set $(\Omega, \xrightarrow{\sigma})$. Any total order of the operations in $\Omega$ is a serialization of those operations. A serialization $\xrightarrow{\rho}$ is a linear extension (or linearization) of $\xrightarrow{\sigma}$ if every pair of operations in $\xrightarrow{\sigma}$ also appears in $\xrightarrow{\rho}$.
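The definitions of an execution and of a linear extension can be sketched in a few lines of Python. The string encoding of operations and the function names are ours, chosen only for illustration; the closure routine is a naive one meant for small examples.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def is_linear_extension(sequence, sigma):
    """True if the total order given by `sequence` contains every pair of sigma."""
    pos = {op: i for i, op in enumerate(sequence)}
    return all(pos[a] < pos[b] for (a, b) in sigma)

# A three-operation execution: w(x)1 may influence r(x)1, which precedes w(y)2.
xo = {("r(x)1", "w(y)2")}
mi = {("w(x)1", "r(x)1")}
sigma = transitive_closure(xo | mi)

good = is_linear_extension(["w(x)1", "r(x)1", "w(y)2"], sigma)
bad = is_linear_extension(["r(x)1", "w(x)1", "w(y)2"], sigma)
```

Here `sigma` gains the pair relating w(x)1 to w(y)2 through the closure, so only serializations that place the write to x first can be linear extensions.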


In sections to come it will be useful to regard an execution as an acyclic directed graph, call it $G$, of vertex set $\Omega$ and edge set given by the pairs of operations in $\xrightarrow{mi}$ and in the transitive reduction of $\xrightarrow{xo}$. Consider, for example, Figure 1. In the first graph, no pair of operations is related by $\xrightarrow{mi}$, since no read returns the value written by the only write operation in the execution. In the second graph, although $r(b) \xrightarrow{po} w(b)5$, the execution shows no relation between $r(b)5$ and $w(b)5$ by $\xrightarrow{xo}$. Supposing that $s[w(b)5] \prec e[r(b)]$ and that $r(b)$ returns the value 5, $w(b)5 \xrightarrow{mi} r(b)5$.

[Figure 1 image not recovered. Panel P1: program r(x), w(y)1, r(z); vertices r(x)0, r(z)0, w(y)1; edges labeled xo. Panel P2: program r(a), r(b), w(b)5; vertices r(a)0, w(b)5, r(b)5; edges labeled xo and mi.]

Figure 1: Executions as directed graphs.

3 Legality under Single Writes

As we indicated in Section 1, the original definition of legality is given for totally ordered sets of operations [2]. That definition assumes that every object has a sequential specification of the set of allowed operations and of the set of valid sequences of operations. Based on the sequential specification of each object, legality is defined as given next, and is henceforth referred to as sequential legality.¹

Definition 1 (Sequential legality) A sequence of operations $\xrightarrow{\rho}$ is legal if and only if, for every object $x$, the subsequence of $\xrightarrow{\rho}$ comprising operations on $x$ only is in the sequential specification of $x$.

In this paper, objects are shared-memory locations and operations are reads and writes. Operation sequences in the objects’ sequential specifications are all those in which read operations do not return “overwritten” values. In order to be more specific, let $\xrightarrow{\rho}$ be a sequence of operations. A read operation $r(x)v$ is said to be legal if there exists a write operation $w(x)v$ such that $w(x)v \xrightarrow{\rho} r(x)v$ and there exists no write operation $w(x)v'$ such that $v' \neq v$ and $w(x)v \xrightarrow{\rho} w(x)v' \xrightarrow{\rho} r(x)v$. Based on this notion of the legality of a read operation, we can specialize the definition of sequential legality as follows.

¹ For the sake of notational simplicity, in the remainder of the paper we use a total order, instead of a totally ordered set, to denote a sequence of operations. The corresponding set of operations is then left to be inferred from the context.


Definition 2 (Sequential legality, revised) A sequence of operations $\xrightarrow{\rho}$ is legal if and only if all read operations in $\xrightarrow{\rho}$ are legal.
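A minimal Python sketch of Definition 2 follows. It relies on the observation that a read $r(x)v$ is legal exactly when the most recent write to $x$ before it in the sequence carries the value $v$, so a single pass that tracks the last value written per location suffices. The tuple encoding `(kind, loc, val)` is our assumption for the example.

```python
def is_legal_sequence(seq):
    """seq: list of (kind, loc, val) tuples in total order; kind is "r" or "w"."""
    last = {}                      # location -> value of the most recent write
    for kind, loc, val in seq:
        if kind == "w":
            last[loc] = val
        elif loc not in last or last[loc] != val:
            return False           # the read returns a missing or overwritten value
    return True

alternating = [("w", "x", 0), ("r", "x", 0), ("w", "x", 1), ("r", "x", 1)]
overwritten = [("w", "x", 0), ("w", "x", 1), ("r", "x", 0)]
repeated = [("w", "x", 0), ("w", "x", 0), ("r", "x", 0)]
```

Note that repeated writes of the same value, as in `repeated`, cause no difficulty on a sequence, since only the most recent value matters.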

Definition 2 and the notion of legal reads on which it is based are often used in the definition of memory consistency conditions over sets of read/write operations (partially or totally ordered) [5, 7]. The central issue addressed in this paper is the establishment of legality over a partially ordered set of operations. By doing so, we hope to be opening up the possibility that algorithms and protocols aimed at implementing legality-based consistency models require just the right amount of synchronization that the models dictate.

Before we delve more deeply into the details of a definition of legality over partially ordered sets of operations, we point out that such a definition must necessarily be somehow tied to the usual definition on sequences, so that a common semantics of avoiding value “overwriting” can be shared by the two. The way we approach this is by requiring the existence of a legal sequence of operations that extends, in a sense to be made precise further in this section and more generally in Section 6, the partially ordered set we define to be legal. We also note that requiring this semantic equivalence is what makes our and previous related approaches (e.g., [14, 6]) totally distinct.

Our first definition of legality over a partially ordered set is based on the simplifying assumption that, for every read in an execution, there exists exactly one write that appears along with it in a $\xrightarrow{mi}$ pair. Note that this does not preclude the existence of multiple writes of a same value to the same memory location, but rather implies that the sets of reads that those various writes “may influence” are all pairwise disjoint. For this reason, such writes can be suitably tagged and the execution treated thereafter as if every value were uniquely identified, that is, as if no value were written to the same memory location more than once. As a consequence, the semantics associated with the $\xrightarrow{mi}$ relation is one of “certainly influences” as opposed to “may influence.” This assumption has been used in order to simplify some formal definitions of consistency models [15, 14, 16, 6]. In Section 6, this assumption is relaxed and a second definition for legality is introduced.

The first step towards defining legality over an execution is to extend the execution’s partial order through the addition of pairs of operations aimed at reflecting the need for a legal linear extension (the existence of such an extension is, in the case of single writes, the way the aforementioned semantic equivalence will turn out to be secured). The following definition formalizes this notion as the relation $\xrightarrow{lc}$, called a legality constraint.

Definition 3 (Legality constraint for single writes) For $v \neq v'$, let $r(x)v$ and $w(x)v'$ be such that $r(x)v$ does not precede $w(x)v'$ in $\xrightarrow{\sigma}$. Then $r(x)v \xrightarrow{lc} w(x)v'$ if and only if there exists $r(x)v'$ such that $w(x)v \xrightarrow{\sigma} r(x)v'$.

The intuition behind this legality constraint is the following. If $r(x)v$, $w(x)v'$, and $r(x)v'$ are all as in Definition 3, then, in any linear extension of $\xrightarrow{\sigma}$, $w(x)v$ must precede $w(x)v'$. Thus, in order not to violate sequential legality, $r(x)v$ must precede $w(x)v'$.
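Definition 3 can be turned into a short procedure. The sketch below is one possible reading of it, under our assumptions that operations are tuples `(kind, loc, val, tag)`, with `tag` a hypothetical label that distinguishes reads of the same value, and that `sigma` is supplied already transitively closed.

```python
def legality_constraint(ops, sigma):
    """Pairs (r(x)v, w(x)v') required by Definition 3 under single writes."""
    lc = set()
    for r in ops:
        for w2 in ops:
            # r = r(x)v and w2 = w(x)v' on the same location with v != v'
            if r[0] != "r" or w2[0] != "w" or r[1] != w2[1] or r[2] == w2[2]:
                continue
            if (r, w2) in sigma:          # r already precedes w2 in sigma
                continue
            # look for w(x)v preceding some r(x)v' in sigma
            if any(w[0] == "w" and r2[0] == "r"
                   and w[1:3] == r[1:3] and r2[1:3] == w2[1:3]
                   for (w, r2) in sigma):
                lc.add((r, w2))
    return lc

w0, w1 = ("w", "x", 0, ""), ("w", "x", 1, "")
ra0, r1 = ("r", "x", 0, "A"), ("r", "x", 1, "A")   # process A reads 0, then 1
rb0 = ("r", "x", 0, "B")                           # process B reads 0
ops = [w0, w1, ra0, r1, rb0]
# sigma: w0 influences both reads of 0, w1 influences r1, ra0 precedes r1.
sigma = {(w0, ra0), (w0, rb0), (w1, r1), (ra0, r1), (w0, r1)}
lc = legality_constraint(ops, sigma)
```

Because the write of 0 precedes the read of 1 in `sigma`, both reads of 0 are constrained to precede the write of 1, including the one issued by process B, which `sigma` alone leaves unordered with respect to that write.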

Letting $\xrightarrow{\hat\sigma} = [\xrightarrow{xo} \cup \xrightarrow{mi} \cup \xrightarrow{lc}]^+$, we can define the legality of an execution based on relation $\xrightarrow{\hat\sigma}$.

Definition 4 (Legality of an execution) An execution $(\Omega, \xrightarrow{\sigma})$ is legal if and only if $\xrightarrow{\hat\sigma}$ is acyclic.


[Figure 2 image not recovered. Vertices: w(x)0, r(x)0, w(x)1, r(x)1, r(y)0, w(y)0, r(y)0, w(y)1, r(y)1, r(x)0. Dashed edges represent the legality constraint.]

Figure 2: An execution that is not legal.

We note before proceeding that this definition is closely related to the condition for a “sequentializable history” as defined in [14]. However, this similarity is not without important differences, as for example the fact that, in [14], the execution is subject to a write order constraint (or an object order constraint). Also, the “exclusive edges,” although defined very similarly to our legality constraint, do not contemplate the case in which a write precedes a read of another value, as a consequence of the write order constraint. We thus believe that our approach is more general, in addition to being amenable to a graceful extension to the case of multiple writes, as we see in sections to come. Such a case is not addressed in [14].

Figure 2 depicts an execution that is not legal. In the figure, graph $G$ is shown with its edge set expanded by the addition of edges to represent the legality constraint (dashed edges). Note, for example, that $w(x)0 \xrightarrow{\sigma} r(x)1$, so the $r(x)0$ that does not precede $w(x)1$ in $\xrightarrow{\sigma}$ must do so in the legality constraint. The execution represented by $G$ is in this case not legal because there exists a directed cycle in $\xrightarrow{\hat\sigma}$ (this cycle can be seen clearly in the expanded graph). This is also, incidentally, an execution that the criterion of [14] classifies as legal, which further stresses the difference between that criterion and ours.

The following theorem relates executions that are legal by Definition 4 to sequential legality.

Theorem 1 $(\Omega, \xrightarrow{\sigma})$ is legal if and only if $\xrightarrow{\sigma}$ has a legal linear extension.

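Proof. If $\xrightarrow{\sigma}$ has no legal linear extension, then there must exist locations $x$ and $y$, and values $v_x$, $v'_x$, $v_y$, and $v'_y$, such that $w(x)v_x \xrightarrow{\sigma} r(x)v'_x \xrightarrow{\sigma} r(y)v_y$ and $w(y)v_y \xrightarrow{\sigma} r(y)v'_y \xrightarrow{\sigma} r(x)v_x$. In this case, $\xrightarrow{lc}$ is such that $r(x)v_x \xrightarrow{lc} w(x)v'_x$ and $r(y)v_y \xrightarrow{lc} w(y)v'_y$, and therefore $\xrightarrow{\hat\sigma}$ contains a directed cycle. By Definition 4, $(\Omega, \xrightarrow{\sigma})$ is not legal.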

Conversely, if $\xrightarrow{\sigma}$ has a legal linear extension, call it $\xrightarrow{\rho}$, then, by Definition 2, fixing a location $x$ yields an alternating pattern in $\xrightarrow{\rho}$: First a write to $x$ appears, then the reads of that value, then the write of another value to $x$, then its reads, and so on. By Definition 3, such a $\xrightarrow{\rho}$ contains $\xrightarrow{lc}$, so $\xrightarrow{\rho}$ is also a linear extension of $\xrightarrow{\hat\sigma}$, which must then be acyclic. By Definition 4, $(\Omega, \xrightarrow{\sigma})$ is legal. □
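The equivalence stated by Theorem 1 can be checked on small instances. The following Python sketch compares the cycle test on the extended relation with an exhaustive backtracking search for a legal linear extension. The tuple encoding `(kind, loc, val, tag)`, the function names, and both example executions are our assumptions. The second example uses the ten operations that appear in Figure 2, with xo and mi pairs chosen by us so that the pattern from the proof arises.

```python
def closure(pairs):
    """Transitive closure of a set of pairs (naive, for small examples)."""
    c = set(pairs)
    while True:
        new = {(a, d) for (a, b) in c for (b2, d) in c if b == b2} - c
        if not new:
            return c
        c |= new

def legality_constraint(ops, sigma):
    """Definition 3: r(x)v must precede w(x)v' if w(x)v precedes some r(x)v'."""
    return {(r, w2) for r in ops for w2 in ops
            if r[0] == "r" and w2[0] == "w" and r[1] == w2[1] and r[2] != w2[2]
            and (r, w2) not in sigma
            and any(w[0] == "w" and r2[0] == "r"
                    and w[1:3] == r[1:3] and r2[1:3] == w2[1:3]
                    for (w, r2) in sigma)}

def is_legal_execution(ops, xo, mi):
    """Legal when the extended relation has no directed cycle."""
    sigma = closure(xo | mi)
    sigma_hat = closure(sigma | legality_constraint(ops, sigma))
    return all(a != b for (a, b) in sigma_hat)

def has_legal_linear_extension(ops, xo, mi):
    """Backtracking search over linear extensions whose reads are all legal."""
    sigma = closure(xo | mi)
    preds = {op: {a for (a, b) in sigma if b == op} for op in ops}

    def extend(placed, last):
        if len(placed) == len(ops):
            return True
        for op in ops:
            if op in placed or not preds[op] <= placed:
                continue
            kind, loc, val, _ = op
            if kind == "r" and last.get(loc) != val:
                continue                      # this read would be illegal here
            new_last = {**last, loc: val} if kind == "w" else last
            if extend(placed | {op}, new_last):
                return True
        return False

    return extend(frozenset(), {})

# Example 1: one location, reads of 0 by two processes, then a read of 1.
w0, w1 = ("w", "x", 0, ""), ("w", "x", 1, "")
ra0, r1, rb0 = ("r", "x", 0, "A"), ("r", "x", 1, "A"), ("r", "x", 0, "B")
ops1 = [w0, w1, ra0, r1, rb0]
xo1 = {(ra0, r1)}
mi1 = {(w0, ra0), (w0, rb0), (w1, r1)}

# Example 2: P reads x=0, x=1, y=0 and Q reads y=0, y=1, x=0 (assumed edges).
wx0, wx1 = ("w", "x", 0, ""), ("w", "x", 1, "")
wy0, wy1 = ("w", "y", 0, ""), ("w", "y", 1, "")
px0, px1, py0 = ("r", "x", 0, "P"), ("r", "x", 1, "P"), ("r", "y", 0, "P")
qy0, qy1, qx0 = ("r", "y", 0, "Q"), ("r", "y", 1, "Q"), ("r", "x", 0, "Q")
ops2 = [wx0, wx1, wy0, wy1, px0, px1, py0, qy0, qy1, qx0]
xo2 = {(px0, px1), (px1, py0), (qy0, qy1), (qy1, qx0)}
mi2 = {(wx0, px0), (wx0, qx0), (wx1, px1), (wy0, py0), (wy0, qy0), (wy1, qy1)}
```

On these two inputs the two tests agree, as the theorem predicts: the first execution passes both, and the second fails both because the legality-constraint pairs close a directed cycle through the reads of P and Q.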

4 Modeling Multiple Writes

In this section, the simplifying assumption that all values are uniquely identified is dropped. As will become apparent shortly, our approach will be to consider all writes that “may influence” a read concomitantly. The reader should note that this is in contrast with other models that also allow multiple writes. The approach of [5], for example, is to tackle one such write-read pair at a time.

In order to generalize the legality constraint of Definition 3, a little elaboration is needed first. Consider an execution $(\Omega, \xrightarrow{\sigma})$, and let a relation $\xrightarrow{\sigma'}$ be called a single-write reduction from $\xrightarrow{\sigma}$ if it is defined like $\xrightarrow{\sigma}$ but replacing $\xrightarrow{mi}$ with a subset of $\xrightarrow{mi}$ that includes exactly one pair for each read. In other words, the $\xrightarrow{mi}$ pairs that participate in the definition of $\xrightarrow{\sigma'}$ are pairs according to which exactly one write “may influence” each read. Note that, following our discussion in Section 3, such a subset of $\xrightarrow{mi}$ can be regarded as characterizing single writes of a same value to the same memory location.

Definition 5 (Legality constraint for multiple writes) For $v \neq v'$, let $r(x)v$ and $w(x)v'$ be such that $r(x)v$ does not precede $w(x)v'$ in $\xrightarrow{\sigma}$. Then $r(x)v \xrightarrow{lc} w(x)v'$ if and only if there exists a single-write reduction from $\xrightarrow{\sigma}$, call it $\xrightarrow{\sigma'}$, for which $r(x)v \xrightarrow{lc'} w(x)v'$, where $\xrightarrow{lc'}$ is the legality constraint that results from the application of Definition 3 to $\xrightarrow{\sigma'}$ in place of $\xrightarrow{\sigma}$.

According to Definition 5, the legality constraint $\xrightarrow{lc}$ for multiple writes is the union of the legality constraints that result from applying Definition 3 to all single-write reductions from $\xrightarrow{\sigma}$. Admittedly, such a definition of $\xrightarrow{lc}$ is somewhat too abstract if we contemplate eventual practical uses of it. However, as we demonstrate next, it is possible to display exactly which read-write pairs result from Definition 5.

Proposition 1 For $v \neq v'$, let $r(x)v$ and $w(x)v'$ be such that $r(x)v$ does not precede $w(x)v'$ in $\xrightarrow{\sigma}$. Then $r(x)v \xrightarrow{lc} w(x)v'$ if and only if there exist $w(x)v$ and $r(x)v'$ such that (a) $w(x)v \xrightarrow{mi} r(x)v$, (b) $w(x)v \xrightarrow{\sigma} r(x)v'$, and (c) $w(x)v' \xrightarrow{mi} r(x)v'$.

Proof. If $r(x)v \xrightarrow{lc} w(x)v'$, then by Definition 5 there exists $\xrightarrow{\sigma'}$, a single-write reduction from $\xrightarrow{\sigma}$, such that $r(x)v \xrightarrow{lc'} w(x)v'$, where $\xrightarrow{lc'}$ results from applying Definition 3 to $\xrightarrow{\sigma'}$ in place of $\xrightarrow{\sigma}$. By Definition 3, it then follows that there exists $r(x)v'$ such that $w(x)v \xrightarrow{\sigma'} r(x)v'$, where $w(x)v$ is the single write that precedes $r(x)v$ by $\xrightarrow{mi}$ in $\xrightarrow{\sigma'}$, thus implying (a). Likewise, $w(x)v'$ precedes $r(x)v'$ by $\xrightarrow{mi}$ in $\xrightarrow{\sigma'}$, thence (c). Item (b) follows directly from the fact that every pair in $\xrightarrow{\sigma'}$ is also a pair in $\xrightarrow{\sigma}$.

Conversely, assume that there exist $w(x)v$ and $r(x)v'$ such that (a) through (c) hold. If we let $\xrightarrow{\sigma'}$ be the single-write reduction from $\xrightarrow{\sigma}$ in which $w(x)v$ precedes $r(x)v$ by $\xrightarrow{mi}$ and $w(x)v'$ precedes $r(x)v'$ by $\xrightarrow{mi}$, then by Definition 3 $r(x)v \xrightarrow{lc'} w(x)v'$, where $\xrightarrow{lc'}$ results from applying that definition to $\xrightarrow{\sigma'}$. By Definition 5, it follows that $r(x)v \xrightarrow{lc} w(x)v'$. □
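Proposition 1 gives a direct way to enumerate the legality-constraint pairs without listing single-write reductions. The Python sketch below tests conditions (a) through (c) literally. The tuple encoding `(kind, loc, val, tag)`, the function names, and the example execution are our assumptions.

```python
def closure(pairs):
    """Transitive closure of a set of pairs (naive, for small examples)."""
    c = set(pairs)
    while True:
        new = {(a, d) for (a, b) in c for (b2, d) in c if b == b2} - c
        if not new:
            return c
        c |= new

def legality_constraint_multi(ops, mi, sigma):
    """Pairs (r(x)v, w(x)v') satisfying conditions (a)-(c) of Proposition 1."""
    lc = set()
    for r in ops:
        for w2 in ops:
            if r[0] != "r" or w2[0] != "w" or r[1] != w2[1] or r[2] == w2[2]:
                continue
            if (r, w2) in sigma:                             # r already precedes w2
                continue
            writes_v = [w for (w, rd) in mi if rd == r]      # (a) w(x)v mi r(x)v
            reads_v2 = [rd for (w, rd) in mi if w == w2]     # (c) w(x)v' mi r(x)v'
            if any((w, rd) in sigma                          # (b) w(x)v sigma r(x)v'
                   for w in writes_v for rd in reads_v2):
                lc.add((r, w2))
    return lc

# Two writes of 0 to x (tags "a" and "b") and one write of 1.
w0a, w0b, w1 = ("w", "x", 0, "a"), ("w", "x", 0, "b"), ("w", "x", 1, "")
ra0, r1, rb0 = ("r", "x", 0, "A"), ("r", "x", 1, "A"), ("r", "x", 0, "B")
ops = [w0a, w0b, w1, ra0, r1, rb0]
xo = {(ra0, r1)}

# Case 1: rb0 may be influenced by w0b only.
mi_one = {(w0a, ra0), (w0b, rb0), (w1, r1)}
lc_one = legality_constraint_multi(ops, mi_one, closure(xo | mi_one))

# Case 2: rb0 may be influenced by either write of 0.
mi_two = mi_one | {(w0a, rb0)}
lc_two = legality_constraint_multi(ops, mi_two, closure(xo | mi_two))
```

The difference between the two cases shows the effect of treating all candidate writes concomitantly: once `w0a` may also influence `rb0`, that read inherits the constraint to precede the write of 1.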

By Proposition 1, a more intuitive rendering of the notion in Definition 5 is the following. If $r(x)v$ is as in Definition 5, and if at least one of the write operations that “may influence” $r(x)v$ precedes in $\xrightarrow{\sigma}$ a read operation also involving location $x$ but a different value $v'$, then $r(x)v$ must also precede the writes that “may influence” that $r(x)v'$, as we intend to tackle all multiple writes of a same value to the same memory location concomitantly. This is illustrated in Figure 3, where a fragment of an execution is shown along with dashed edges to represent the pairs in $\xrightarrow{lc}$. Solid edges represent pairs in $\xrightarrow{\sigma}$. The operations on $x$ in the figure relate to Proposition 1 in such a way that $v = 0$ and $v' = 1, 2$.

[Figure 3 image not recovered. Labeled vertices: w(x)0 (twice), w(x)2 (twice), r(x)2, r(x)0, w(x)1 (three times), r(x)1, plus one unlabeled vertex.]

Figure 3: Legality-constraint pairs.

5 Executions and AND-OR Graphs

In this section, we extend the definition of graph $G$ to yield another directed graph, to be denoted by $\hat{G}$, and examine the structure of this resulting graph closely. As we recall from Section 2, $G$ has a vertex for each operation in $\Omega$ and a directed edge for each pair in $\xrightarrow{mi}$ or in the transitive reduction of $\xrightarrow{xo}$. If to this edge set we add the pairs in $\xrightarrow{lc}$ as well, then the resulting graph is $\hat{G}$. Note that graph $\hat{G}$ has already been informally used in our depiction of legal executions under single writes (cf. Definition 4) in Figure 2, as well as in Figure 3.

In $G$ and in $\hat{G}$, a node may have one immediate ancestor connected to it by an $\xrightarrow{xo}$ edge, and must have one or more immediate ancestors connected to it by $\xrightarrow{mi}$ edges (if it is a read). In $\hat{G}$, it may also have immediate ancestors that connect to it by $\xrightarrow{lc}$ edges (if it is a write). This is depicted in Figure 4. If we recall that these graphs embody the dependencies that constitute an execution, then clearly only one (any one) of the immediate ancestors (writes) connecting to a read by $\xrightarrow{mi}$ edges is “indispensable,” in the sense that the read would “behave” likewise if none but that one ancestor actually performed a write. What this means is that both $G$ and $\hat{G}$ carry an AND-OR semantics, and should as such be viewed as what is known as AND-OR graphs.

In order to reason about the properties of an execution as it relates to an AND-OR graph, we need additional definitions and results. Let $H = (N, E)$ be an AND-OR graph. For each node $n_i \in N$, let $D_i$ and $A_i$ be, respectively, the set of descendants and ancestors of $n_i$ in $H$. Let $O_i \subseteq D_i$ be the set of immediate descendants of $n_i$ and $I_i \subseteq A_i$ be the set of immediate ancestors of $n_i$. Also, for $t_i \geq 0$, let $P_i^1, \ldots, P_i^{t_i}$ be the subsets of $I_i$ representing the AND groups of immediate ancestors of $n_i$. In the case of $G$ and $\hat{G}$, we have the following. If $n_i$ is a read, then $t_i$ is the number of writes that “may influence” that read, and each set in $P_i^1, \ldots, P_i^{t_i}$ has cardinality 1 or 2, connecting forward to $n_i$ by at most one $\xrightarrow{xo}$ edge and exactly one $\xrightarrow{mi}$ edge. If $n_i$ is a write, then in $G$ $t_i \leq 1$ and $P_i^1$, if it exists, connects forward to $n_i$ by one $\xrightarrow{xo}$ edge.

The AND groups of immediate ancestors of $n_i$ in $\hat{G}$ when $n_i$ is a write are determined in great closeness to Definition 5 of Section 4, as follows. Let $R \subset \Omega$ be a set of reads. This set is an AND group of immediate ancestors of $n_i$ in $\hat{G}$ if and only if there exists a single-write reduction from $\xrightarrow{\sigma}$, call it $\xrightarrow{\sigma'}$, with the following property. When Definition 3 is applied to $\xrightarrow{\sigma'}$ in place of $\xrightarrow{\sigma}$, the reads that precede $n_i$ in the resulting legality constraint are exactly those in $R$. Naturally, the number $t_i$ of such AND groups depends on how many distinct $R$ sets can be produced, via Definition 3, from single-write reductions from $\xrightarrow{\sigma}$.

[Figure 4 image not recovered. One fragment shows a read r(x)v with an incoming xo edge and mi edges from two w(x)v vertices; the other shows a write w(x)v′ with an incoming xo edge and incoming lc edges.]

Figure 4: Fragments of $\hat{G}$.

Now let $H' = (N', E')$ be a subgraph of $H$. $H'$ is a b-subgraph [17] of $H$ if every node $n_i \in N'$ has at most $t_i$ immediate ancestors in $H'$, of which at least one is in each of $P_i^1, \ldots, P_i^{t_i}$. Analogously, $H'$ is a c-subgraph of $H$ if every node $n_i \in N'$ has as its immediate ancestors in $H'$ all the elements of exactly one of $P_i^1, \ldots, P_i^{t_i}$.
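To make the notion of a c-subgraph concrete, the Python sketch below enumerates the spanning c-subgraphs of a small AND-OR graph and tests each one for directed cycles. The representation, a dictionary mapping every node to the list of its AND groups, and all function names are our assumptions. A node with an empty list is treated as having no immediate ancestors.

```python
from itertools import product

def spanning_c_subgraphs(groups):
    """Yield the edge set of every spanning c-subgraph.

    groups: dict node -> list of AND groups (sets of immediate ancestors).
    Each node keeps all the ancestors of exactly one of its groups.
    """
    nodes = list(groups)
    choices = [groups[n] if groups[n] else [frozenset()] for n in nodes]
    for pick in product(*choices):
        yield {(a, n) for n, grp in zip(nodes, pick) for a in grp}

def is_acyclic(nodes, edges):
    """Kahn-style test: repeatedly remove nodes with no incoming edges."""
    indeg = {n: 0 for n in nodes}
    for _, b in edges:
        indeg[b] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for a, b in edges:
            if a == n:
                indeg[b] -= 1
                if indeg[b] == 0:
                    ready.append(b)
    return removed == len(nodes)

def has_acyclic_spanning_c_subgraph(groups):
    return any(is_acyclic(groups, e) for e in spanning_c_subgraphs(groups))

# Node b has two AND groups, {a} and {c}; node c has the single group {b}.
h1 = {"a": [], "b": [{"a"}, {"c"}], "c": [{"b"}]}
# Without node a, b and c depend on each other through their only groups.
h2 = {"b": [{"c"}], "c": [{"b"}]}
```

In `h1`, choosing the group {a} for node b yields an acyclic spanning c-subgraph, whereas in `h2` the only spanning c-subgraph is a two-node cycle. The enumeration is exponential in the number of nodes with several groups, so it is meant for small illustrations only.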

We will also be using a graph structure known as a knot. A knot is a subset $K \subseteq N$ with the property that, for each node $n_i \in K$, $A_i = K$. A subset $K \subseteq N$ is said to be a b-knot [17] in $H$ if $K$ is a knot in some b-subgraph of $H$. As we will see in the next section, the existence of structures akin to b-knots in $\hat{G}$ is strongly related to the legality of the execution that $\hat{G}$ stands for. First, though, we introduce two supporting lemmas.

Lemma 1 If $H$ contains no b-knots, then every strongly connected component of $H$ has at least one node $n_i$ such that at least one of $P_i^1, \ldots, P_i^{t_i}$ does not intersect the component.

Proof. The lemma is trivial for single-node components. If this is not the case, then let $C$ be a strongly connected component of $H$, and let $L$ be its node set. Suppose that every $n_i \in L$ is such that all of $P_i^1, \ldots, P_i^{t_i}$ intersect $L$. We show that $H$ contains a b-knot, and to this end we first build a b-subgraph $H'$ of $H$. This b-subgraph has node set $L$, and in it each $n_i \in L$ has at least one immediate ancestor in each of $P_i^1 \cap L, \ldots, P_i^{t_i} \cap L$. Note that, by assumption, this construction is always possible. Also, because $C$ is strongly connected, $H'$ has no sources. Now consider the sequence of sets $A_i^1, A_i^2, A_i^3, \ldots$, for some $n_i \in L$, such that $A_i^1$ is the set of ancestors of node $n_i$ in $H'$ and, for $k > 1$, $A_i^k$ is the set of ancestors in $H'$ of all nodes in $A_i^{k-1}$. The absence of sources in $H'$ ensures that all sets in this sequence are nonempty. In addition, because $L$ is finite, the sequence has a fixed point, which is by definition a knot in $H'$, therefore a b-knot in $H$. □

Lemma 2 $H$ contains a b-knot if and only if every spanning c-subgraph of $H$ contains a directed cycle.

Proof. Let $K$ be a b-knot in $H$. By definition, every spanning c-subgraph of $H$ includes $K$ as part of its node set. Let $H'$ be one such c-subgraph, and consider a traversal of $H'$ that starts anywhere in $K$ and proceeds as follows. When at node $n_i$, the traversal moves on to another node that is an immediate ancestor of $n_i$ in both $H'$ and the b-subgraph of $H$ where $K$ is a knot (note that such an ancestor must always exist, as a consequence of the very definitions of b-subgraphs and c-subgraphs). This traversal is confined to $K$ and, because $K$ is finite, must eventually return to a node already encountered, thereby characterizing a directed cycle in $H'$.

If $H$ contains no b-knots, then we display an acyclic spanning c-subgraph of $H$. In order to construct such a c-subgraph, we first split $H$’s nodes into maximal strongly connected components $C_1, \ldots, C_m$. If all of $C_1, \ldots, C_m$ are singletons, then $H$ is acyclic by the maximality of the components, and so is every one of its c-subgraphs. Otherwise, by Lemma 1, and for $1 \le k \le m$, let $F_k$ be the nonempty set of nodes of $C_k$ such that, if $n_i \in F_k$, then at least one of $P_i^1, \ldots, P_i^{t_i}$ does not intersect $C_k$. If we regard each of $C_1, \ldots, C_m$ as a “supernode” and let the only edges coming into “supernode” $C_k$ be those coming from all the nodes in one of the sets $P_i^1, \ldots, P_i^{t_i}$ that does not intersect $C_k$ for $n_i \in F_k$, then what we have is an acyclic c-subgraph on “supernodes” (acyclic, as before, by the maximality of the strongly connected components). Next we shrink “supernode” $C_k$ by removing $F_k$ from it, and recursively repeat the entire process from the splitting into maximal strongly connected components. The recursion ends when no such components can any longer be found that are not singletons, at which time an acyclic spanning c-subgraph of $H$ has been found. □
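
Lemma 2 suggests a direct, if exponential, test for b-knots on small graphs: enumerate the spanning c-subgraphs and look for an acyclic one. The Python sketch below does exactly that and is not the recursive construction used in the proof. It assumes a hypothetical map `groups` from every node of $H$ to its list of AND groups (every node appears as a key) and treats a node with no AND group as a source; all function names are illustrative.

```python
from itertools import product
from typing import Dict, FrozenSet, Hashable, Iterator, List

Node = Hashable
Groups = Dict[Node, List[FrozenSet[Node]]]  # n_i -> [P_i^1, ..., P_i^{t_i}]
Choice = Dict[Node, FrozenSet[Node]]        # n_i -> the AND group kept for n_i


def spanning_c_subgraphs(groups: Groups) -> Iterator[Choice]:
    """Yield every spanning c-subgraph as a map node -> chosen AND group."""
    nodes = list(groups)
    # Assumption: a node without AND groups keeps an empty ancestor set.
    options = [groups[n] or [frozenset()] for n in nodes]
    for pick in product(*options):
        yield dict(zip(nodes, pick))


def is_acyclic(parents: Choice) -> bool:
    """Depth-first search along ancestor edges; meeting a node that is still
    on the current path closes a directed cycle."""
    state: Dict[Node, int] = {}  # 1 = on the current path, 2 = finished

    def visit(node: Node) -> bool:
        if state.get(node) == 1:
            return False
        if state.get(node) == 2:
            return True
        state[node] = 1
        ok = all(visit(p) for p in parents[node])
        state[node] = 2
        return ok

    return all(visit(n) for n in parents)


def has_b_knot(groups: Groups) -> bool:
    """By Lemma 2, a b-knot exists iff no spanning c-subgraph is acyclic."""
    return not any(is_acyclic(sub) for sub in spanning_c_subgraphs(groups))


# a and b wait on each other with no alternative: every choice is cyclic.
blocked = {"a": [frozenset({"b"})], "b": [frozenset({"a"})]}
# a may instead rely on the source c, which breaks the cycle.
escapable = {"a": [frozenset({"b"}), frozenset({"c"})],
             "b": [frozenset({"a"})],
             "c": []}
```

The number of spanning c-subgraphs is the product of the $t_i$, so this enumeration is only practical for very small graphs.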

6 Legality under Multiple Writes

The characterization of b-knots provided by Lemma 2 holds for AND-OR graphs in general. In the case of $\hat{G}$, however, the c-subgraphs of interest are those that are consistent with respect to Definition 5. For such a consistent c-subgraph of $\hat{G}$, there exists a single-write reduction from $\xrightarrow{\sigma}$ that yields both the immediate ancestors of reads in the c-subgraph and the immediate ancestors of writes (via an application of Definition 3 to that single-write reduction).

We say that $\hat{G}$ contains a quasi-b-knot if, for every single-write reduction $\xrightarrow{\sigma'}$ from $\xrightarrow{\sigma}$, there exists a strongly connected component $C$ in $\hat{G}$ such that, for every node $n_i$ in $C$, the subset of $I_i$ that connects forward to $n_i$ by $\xrightarrow{\sigma'}$ intersects $C$. The proof of the following lemma is entirely analogous to that of Lemma 2.

Lemma 3 $\hat{G}$ contains a quasi-b-knot if and only if every spanning c-subgraph of $\hat{G}$ that is consistent contains a directed cycle.

We are now in position to define the legality of an execution when multiple writes of a same value to the same memory location are allowed.

Definition 6 (Legality under multiple writes) $(\Omega, \xrightarrow{\sigma})$ is legal if and only if $\hat{G}$ does not contain a quasi-b-knot.

This definition can be thought of as the counterpart of Definition 4 in the presence of multiple writes. Likewise, Theorem 2, given next, is the counterpart of Theorem 1. The meaning this

theorem carries is that the legality of an execution as given in Definition 6 is in consonance with the usual notion of sequential legality.

Theorem 2 $(\Omega, \xrightarrow{\sigma})$ is legal if and only if there exists a single-write reduction from $\xrightarrow{\sigma}$ that has a legal linear extension.

Proof. The proof is based on the observation that to every spanning, consistent c-subgraph of $\hat{G}$ there corresponds a relation that is a single-write reduction from $\xrightarrow{\sigma}$, and conversely. Such a relation is the transitive closure of all pairs in $\xrightarrow{xo}$ or in the set of $\xrightarrow{mi}$ pairs that appear in the c-subgraph. The c-subgraph, therefore, is subject to the results of Section 3.

If $(\Omega, \xrightarrow{\sigma})$ is legal, then by Definition 6 $\hat{G}$ has no quasi-b-knots. By Lemma 3, at least one spanning c-subgraph of $\hat{G}$ that is consistent is acyclic. The corresponding single-write reduction from $\xrightarrow{\sigma}$ has, by Theorem 1 under Definition 4, a legal linear extension.

Conversely, if $(\Omega, \xrightarrow{\sigma})$ is not legal, then it follows from Definition 6 that $\hat{G}$ has a quasi-b-knot, and from Lemma 3 that all of $\hat{G}$’s spanning c-subgraphs that are consistent contain directed cycles. Once again by Definition 4 and Theorem 1, no single-write reduction from $\xrightarrow{\sigma}$ has a legal linear extension. □
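
The notion of a legal linear extension used in Theorem 2 can be illustrated with a brute-force Python sketch. It takes a partial order that has already been fixed and does not construct single-write reductions, which depend on definitions given earlier in the paper. Legality of a sequence follows the usual sequential notion, in which every read returns the value last written to its location. The encoding of operations as `(kind, location, value)` tuples, the `initial` parameter for reads that precede every write, and the function names are assumptions made for this example.

```python
from itertools import permutations
from typing import List, Optional, Sequence, Set, Tuple

Op = Tuple[str, str, int]     # (kind, location, value), kind is "w" or "r"
Order = Set[Tuple[int, int]]  # (i, j): operation i precedes operation j


def is_legal_sequence(ops: Sequence[Op], initial: Optional[int] = None) -> bool:
    """Every read returns the value last written to its location."""
    last = {}
    for kind, location, value in ops:
        if kind == "w":
            last[location] = value
        # Assumption: a read with no earlier write must return `initial`.
        elif last.get(location, initial) != value:
            return False
    return True


def legal_linear_extension(ops: Sequence[Op], order: Order,
                           initial: Optional[int] = None) -> Optional[List[int]]:
    """Return the indices of one legal linear extension, or None if none exists."""
    for perm in permutations(range(len(ops))):
        position = {op: k for k, op in enumerate(perm)}
        respects_order = all(position[i] < position[j] for i, j in order)
        if respects_order and is_legal_sequence([ops[i] for i in perm], initial):
            return list(perm)
    return None


# Two writes to x and one read of the first value written.
ops = [("w", "x", 1), ("w", "x", 2), ("r", "x", 1)]
chain = {(0, 1), (1, 2)}  # total order w1, w2, r1: the read is stale
loose = {(0, 2)}          # only w1 before r1: the second write may come last
```

With `chain` the only linear extension places the read after the second write, so no legal extension exists. With `loose` the second write can be ordered after the read, and the search returns the index sequence `[0, 2, 1]`.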

Definition 6 can now be used to specify shared-memory consistency conditions that depend on legality directly on executions. For example, $(\Omega, \xrightarrow{\sigma})$ is sequentially consistent if a legal execution $(\Omega', \xrightarrow{\sigma'})$ exists such that: (i) $\Omega \subseteq \Omega'$; (ii) $\Omega' \setminus \Omega$ comprises read operations exclusively;$^2$ (iii) every read in $\Omega \cap \Omega'$ returns the same value in both executions; and (iv) $\xrightarrow{\sigma'}$ is a total order consistent with $\xrightarrow{po}$ when restricted to any single process.

Conditions (i) through (iv) amount to extending the set of operations $\Omega$ by the inclusion of “artificial” reads meant to ensure that the resulting $\xrightarrow{\sigma'}$ is a total order, given a process, that is consistent with $\xrightarrow{po}$. Clearly, in such a total order, write operations alternate with groups of reads, of which the one that immediately follows a $w(x)v$ is an $r(x)v$. This $r(x)v$ may or may not be one of the “artificially” included reads, depending on whether an appropriate read already exists in $\Omega$.

7 Concluding Remarks

We have in this paper addressed the issue of legality in multiprocessing systems. Starting with the usual definition of legality on sequences of operations, we argued for the possible need for a definition given directly on partially ordered sets of operations, and proceeded to provide such a definition by employing graph-theoretic properties of the so-called AND-OR type of directed graphs. We concluded by showing that this new definition of legality is consonant with the traditional definition, in the sense that a partially ordered set is legal if and only if another partial order closely related to it via the AND-OR graph-theoretic formalism admits a linear extension that is legal by the original definition.

Our treatment has been based on a system model that makes very few assumptions on machine architecture, and in addition tackles the issue of multiple writes of a same value to the same memory location directly, without the recourse, used previously by other authors, of breaking up the multiple writes and treating them separately. The work we have reported on has been extended by addressing, based on this system model, the issue of how to express the most relevant memory consistency models that have been proposed to date [18]. In the wake of this investigation, it is now possible to come to the question of how the design of algorithms and protocols to support those consistency models can be conducted so as to benefit from the new notion of legality we have described.

Acknowledgments: The authors acknowledge partial support from CNPq and CAPES, from the PRONEX initiative of Brazil’s MCT under contract 41.96.0857.00, and from a FAPERJ BBP grant.

References

[1] L. Lamport. How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Trans. on Computers, C-28(9): 690–691, September 1979.

[2] M. P. Herlihy and J. M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Trans. on Programming Languages and Systems, 12(3): 463–492, July 1990.

[3] K. Gharachorloo, D. E. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J. L. Hennessy. Memory consistency and event ordering in scalable shared-memory multiprocessors. In Proc. of the 17th Annual Int. Symp. on Computer Architecture (ISCA’90), pages 15–26, May 1990.

[4] S. V. Adve and M. D. Hill. A unified formalization of four shared-memory models. IEEE Trans. on Parallel and Distributed Systems, 4(6): 613–624, June 1993.

[5] M. Ahamad, R. A. Bazzi, R. John, P. Kohli, and G. Neiger. The power of processor consistency (extended abstract). In Proc. of the 5th ACM Annual Symp. on Parallel Algorithms and Architectures (SPAA’93), pages 251–260, June 1993.

[6] M. Raynal and A. Schiper. A suite of formal definitions for consistency criteria in shared memories. In Proc. of the ISCA Int. Conf. on Parallel and Distributed Computing Systems (ICPDS’96), pages 125–130, September 1996.

[7] H. Attiya, S. Chaudhuri, R. Friedman, and J. Welch. Shared memory consistency conditions for non-sequential execution: Definitions and programming strategies. SIAM J. on Computing, 27(1): 65–89, February 1998.

[8] G. R. Gao and V. Sarkar. Location consistency: Stepping beyond the memory coherence barrier. In Proc. of the 1995 Int. Conf. on Parallel Processing (ICPP’95), volume II, pages 73–76, August 1995.

[9] M. Frigo. The weakest reasonable memory. Master’s thesis, Department of Electrical Engineering and Computer Science, MIT, 1998.

[10] L. Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7): 558–565, July 1978.

[11] V. C. Barbosa. An Introduction to Distributed Algorithms. The MIT Press, Cambridge, MA, 1996.

[12] L. Lamport. On interprocess communication—Part I: Basic formalism. Distributed Computing, 1: 77–85, 1986.

[13] K. Gharachorloo. Memory Consistency Models for Shared-Memory Multiprocessors. PhD thesis, Stanford University, 1995.

[14] M. Mizuno, M. Raynal, and J. Z. Zhou. Sequential consistency in distributed systems: Theory and implementation. In K. P. Birman, F. Mattern, and A. Schiper, editors, Lecture Notes in Computer Science, pages 224–241. Springer-Verlag, 1995.

[15] J. Misra. Axioms for memory access in asynchronous hardware systems. ACM Trans. on Programming Languages and Systems, 8(1): 142–153, January 1986.

[16] M. Raynal and A. Schiper. From causal consistency to sequential consistency in shared memory systems. In Proc. of the 15th Int. Conf. on Foundations of Software Technology and Theoretical Computer Science, volume 1026 of Lecture Notes in Computer Science, pages 180–194, December 1995.

[17] V. C. Barbosa and M. R. F. Benevides. A graph-theoretic characterization of AND-OR deadlocks. Technical Report ES-472/98, COPPE/UFRJ, Rio de Janeiro, Brazil, July 1998.

[18] A. L. Oliveira Filho. The Legality of Shared-Memory Computations. PhD thesis, Federal University of Rio de Janeiro, 2000. In Portuguese.