Node-wide Dot-based Clocks Without Fill - Multi-Value Distributed Key-Value Stores

Figure19: Average network traffic used in very-fast anti-entropy rounds.

Table 4: The average, the95^th and the99^th latencies for clientUpdaterequests.

The best result per line is in bold.

DottedDB MerkleDB

HH HL LH LL HHH HHL HLH HLL LHH LHL LLH LLL Avg. 19.8 16.4 4.8 5.1 7.0 9.3 6.7 8.1 6.4 8.9 7.0 8.0

95^th 64 39 8 7 8 11 8 10 8 11 8 10

99^th 460 394 66 79 137 173 128 155 123 163 143 149

4.5 node-wide dot-based clocks without fill

There is an alternative approach to the NDC framework that never fills the object context with the node clock, in exchange for a slightly delayed garbage collection. We call this variantNDC Without Fill(NDC-NF) and the main idea is that if all current versions of the object are in all replica nodes, then the old object versions represented in the context have been removed everywhere. Thus, the context has served its purpose of elim-inating older versions when exchanging and merging objects, and can be removed entirely.

All data structures are the same as NDC, except for NSK that is no longer needed. Objects will be stripped when the dots of their current versions are removed from the DKM.

(a) HH: High number of objects missing per anti-entropy (10%);

Highreplication loss (100%).

(b) HL: High number of objects missing per anti-entropy (10%);

Lowreplication loss (10%).

(d) LL:Lownumber of objects miss-ing per anti-entropy (1%); Low replication loss (10%).

Figure20: CDFs for the latency of clientUpdaterequests.

4.5.1 Algorithms

Algorithm6shows the client API for NDC-NF. The put and delete oper-ations are the same, but the get operation must explicitly add the dots of the object versions to the causal context returned to the client. With-out this, a stripped object with no context would send an empty context to the client, that if used in a subsequent update to the same key, would not dominate the versions previously read by the client.

Algorithm 7 defines the auxiliary operations for NDC-NF. Notably, the fill operation is removed, as it is no longer used in this version of NDC. Also, the mergeoperation remains the same.

The strip operation now only strips the causal context of an object if the version dots are known by all replica nodes of that key. When this

4.5 node-wide dot-based clocks without fill 77

Algorithm6: GET Operation at Nodei with NDC-NF.

1 procedureGET(k:K,quorum[=1] :N):

2 O:=;

3 parallel forj2replica_nodes(k)do

4 O:=O[rpc(j,fetch,hki)

5 await(size(O)>quorum)

6 foro2Odo

7 (vers,cc) :=merge((vers,cc),o)

// add current dots to causal context for the client 8 return(ran(vers),cc[dom(vers))

10 procedurePUT(. . .):

// the same

11 procedureDELETE(. . .):

// the same

happens, it means that the causal context has reached all replica nodes, and therefore removed all possible older versions present in the causal context.

The fetch operations now returns the object directly from storage, without the need to fill any causal context.

update The update operation now checks if all versions from the received object are already known by the node clock. If there is nothing new, then this object serves no purpose and it is ignored, returning the local object.

This new test in the update operation is not necessary in NDC but it is in NDC-NF, because nodes now are vulnerable to older objects that for some reason reach a node after the newer object is in all nodes; this should be extremely rare, but can happen with network reordering/de-lays.

The problem with older objects being replicated after removing the context of the current object is due to the assumption that every older object is now deleted. This is not relevant with NDC, since the context is always restored whenever the object is read from storage.

Versions of the incoming object that are already known by the local node are discarded before merging to the local object to avoid the reap-pearence of obsolete versions.

Algorithm7: Auxiliary Operations at Node iwith NDC-NF.

1 functionfetch(k:K):

// return the object as-is, without filling 2 returnSTi[k]

4 procedurestore(k:K,o:Object):

5 for(j,c)2dom(vers)do

6 NCi[j] :=NCi[j][{c}

7 DKM_i[(j,c)] :=k

8 STi[k] :=o

10 functionmerge(. . .):

// the same 11

12 functionupdate(k:K,(vers,cc) :Object):

13 ifdom(vers)✓NC_ithen returnfetch(k)

14 vers:={(d,v)2vers|d62NCi}

15 (vers,cc) :=merge((vers,cc),fetch(k))

16 for(j,c)2dom(vers)do cc[j] :=max(cc[j],c)

17 store(k,(vers,cc))

18 return(vers,cc)

20 functionstrip((vers,cc) :Object):

// test if all current dots are in all replica nodes 21 for(j,c)2dom(vers)do

22 ifmin({WMi[p][j]|p2peers(j)})< cthen

23 return(vers,cc)

24 return(vers,;)

26 function⇠⇠⇠fill(. . .⇠):

// not needed anymore

store Thestoreoperation is now simplified to only update the node clock and the dot-key map and finally save the new object to storage.

The process of stripping was delegated to another function that is called after anti-entropy rounds and theNSK is no longer needed.

4.5.1.1 Background Processes

Algorithm 8 shows the new definitions for NDC-NF background pro-cesses. Thestrip_causality process is removed, since the equivalent strip-ping process can be made by the gc_dkm procedure. When a dot is

4.5 node-wide dot-based clocks without fill 79

Algorithm8: Background Tasks at Nodei with NDC-NF.

1 process₍₍₍₍strip_causality():⁽⁽⁽⁽

// not needed anymore 2 processanti_entropy():

// the same

3 proceduresync_clock(. . .):

// the same 4

5 proceduresync_repair(p:I,NC:I,!P(N),O:K⇥Object):

// the same 6

7 procedureupdate_watermark(p:I,NC:I,!P(N)):

// the same 8

9 proceduregc_dkm():

// remove entries known by all peers 10 for((j,c),k)2DKMido

11 ifmin({WM_i[p][j]|p2peers(j)})>cthen

12 DKMi:={(j,c)}-DKMi

13 (vers,cc) :=strip(STi[k])

14 if{val2ran(vers)|val6=null}=;^cc=;then

15 STi:={k}-STi

16 else ifcc=;then

17 ST_i[k] := (vers,;)

known by all replica nodes, it is the perfect time to check if the corre-sponding object can be stripped, because if that dot corresponds to the only version of the object, then that object is in all replica nodes and the context will be removed. This also the reason that the NSK data structure is no longer needed.

4.5.2 NDC versus NDC-NF

There are two disadvantages with the NDC-NF approach:

1. Object stripping is slightly slower than NDC, because the con-text is only removed when all versions are known to be in all replica nodes; in NDC, entries are removed independently when included in the node clock, which happens before that information is known by other peers;

2. The stripping depends on having information aboutallpeer nodes, which makes the strip operation less fault-tolerant than the NDC version.

The main advantages of NDC-NF are the following:

1. The node clock can be safely garbage collected, because older en-tries from retired nodes are not used to fill objects;

2. The is no fill operation every time an object is read from storage, requiring less computation;

3. Objects in the local storage are all completely independent from the node clock, because they never require it for filling the causal context, which means that the node clock is not a single point of failure like in NDC and can be lost without affecting the context of the local objects;

4. The non-stripped keys are implicitly tracked by the dot-key map, and the stripping is only attempted when the chance of success is very high; thus, there is no need for an extra background process that constantly attempts to strip non-stripped keys.

No documento Multi-Value Distributed Key-Value Stores (páginas 99-104)