
5.6 Performance evaluation

5.6.2 Results with a single policy

In a first step, only homogeneous scenarios are considered: requests may have different resiliency schemes but use only one I/O policy, either RAIT, PARALLEL or VERTICAL.

Figure 5.5: Impact of RAIT block size B on average weighted response time (x-axis: RAIT block size in MB; y-axis: average weighted response time; curves: VERTICAL, RAIT, PARALLEL).

The first experiment aims at analyzing the impact of the block size B for the RAIT policy (contrary to PARALLEL and VERTICAL, RAIT requires a block size to be tuned, since it determines how data is written onto tape). The performance of RAIT is measured in terms of the average weighted response time. The request arrival rate is set to 180 requests per hour, file sizes range from 1 GB to 1 TB, and B varies between 1 MB and 8 GB.

Results presented in Figure 5.5 show that B has a significant impact on the average weighted response time of RAIT. For the smallest values, RAIT performs worse than VERTICAL, whereas it almost ties PARALLEL for larger values. With B = 1 MB, RAIT is about 80 times slower than with B = 192 MB, while the performance is constant between 192 MB and 8 GB. Altogether, this experiment outlines the importance of the value of B for the RAIT policy, which clearly benefits from large enough blocks in order to offer competitive performance. In all the following experiments, B is chosen according to these results.
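For concreteness, here is a minimal sketch of how such a metric could be computed from a simulated request log. It assumes that each response time is weighted by the inverse of the request size (as suggested by the axis label of Figure 5.10); the simulator's exact definition may differ, and the Request record is a hypothetical structure introduced only for this sketch.

```python
# Minimal sketch: average weighted response time over a simulated request log.
# Assumption (not taken from the text): each response time is weighted by the
# inverse of the request size, as hinted by the "Avg(Response time / size)"
# axis of Figure 5.10. The simulator's exact definition may differ.

from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    size_bytes: int          # file size of the archival request
    arrival_time: float      # seconds since the start of the simulation
    completion_time: float   # seconds since the start of the simulation


def avg_weighted_response_time(requests: List[Request]) -> float:
    """Mean of (response time / size) over all served requests."""
    weights = [
        (r.completion_time - r.arrival_time) / r.size_bytes
        for r in requests
    ]
    return sum(weights) / len(weights)


if __name__ == "__main__":
    log = [
        Request(size_bytes=10**9, arrival_time=0.0, completion_time=120.0),
        Request(size_bytes=10**11, arrival_time=30.0, completion_time=900.0),
    ]
    print(avg_weighted_response_time(log))
```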

The next experiment compares the performance of all I/O policies for various arrival rates. The objective is twofold: assessing the average performance of each policy and, in particular, determining the maximum arrival rate that can be handled by each policy. File sizes are chosen among three subsets: small file sizes range from 10 MB to 1 GB, medium file sizes from 1 GB to 100 GB, and large file sizes from 1 TB to 100 TB. Arrival rates are chosen with respect to these file sizes, between a few requests per hour and hundreds of requests per hour.
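As an illustration of this workload model, the sketch below draws request arrivals and file sizes for the three subsets. It assumes Poisson arrivals and a log-uniform size distribution within each range; only the size ranges themselves come from the text.

```python
# Sketch of a workload generator for this kind of experiment.
# Assumptions (not stated in the text): exponential inter-arrival times
# (Poisson arrivals) and log-uniform file sizes within each subset.

import math
import random

SIZE_RANGES = {                             # file-size subsets from the text
    "small":  (10 * 2**20, 2**30),          # 10 MB .. 1 GB
    "medium": (2**30, 100 * 2**30),         # 1 GB  .. 100 GB
    "large":  (2**40, 100 * 2**40),         # 1 TB  .. 100 TB
}


def generate_requests(subset, rate_per_hour, n_requests, seed=0):
    """Yield (arrival_time_in_seconds, file_size_in_bytes) tuples."""
    rng = random.Random(seed)
    lo, hi = SIZE_RANGES[subset]
    mean_interarrival = 3600.0 / rate_per_hour
    t = 0.0
    for _ in range(n_requests):
        t += rng.expovariate(1.0 / mean_interarrival)       # Poisson arrivals
        log_size = rng.uniform(math.log(lo), math.log(hi))  # log-uniform size
        yield t, int(math.exp(log_size))


if __name__ == "__main__":
    for arrival, size in generate_requests("small", rate_per_hour=420, n_requests=3):
        print(f"t = {arrival:8.1f} s   size = {size / 2**20:8.1f} MB")
```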

Figure 5.6: Impact of arrival rates on the average weighted response time for small files (B = 16 MB for RAIT); x-axis: λ (requests/hour); y-axis: average weighted response time; curves: VERTICAL, RAIT, PARALLEL.

For small files, as depicted in Figure 5.6, the best average weighted response time is offered by VERTICAL, which can sustain higher arrival rates than the other policies. RAIT is able to serve requests with a reasonable response time for rates lower than 420 requests per hour. PARALLEL performs better than RAIT, since it can keep up with rates lower than 600 requests per hour. The best policy in that case is VERTICAL, which can tolerate rates up to 800 requests per hour. This is due to the fact that, with small files, the extra latencies paid by the data-parallel policies (RAIT and PARALLEL) are not negligible. Also, recall that more files can be written concurrently throughout the entire system with VERTICAL. Contrary to its competitors, VERTICAL can pipeline a large number of files before having to write EC tapes, thereby increasing average performance.
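A back-of-the-envelope model helps to see why the fixed tape latencies matter for small files: striping over several drives shortens the transfer but multiplies the drive time spent on positioning. The latency and bandwidth constants below are illustrative assumptions, not values from the experiments.

```python
# Toy model of why striping hurts with small files: striping divides the
# transfer time but every participating drive still pays the fixed
# positioning/mount latency, so the aggregate drive time per request grows.
# The latency and bandwidth values below are illustrative assumptions only.

TAPE_LATENCY_S = 60.0        # assumed per-drive positioning/mount latency
DRIVE_BW_BPS = 300e6         # assumed per-drive bandwidth (300 MB/s)


def response_time(size_bytes, n_drives):
    """Time seen by the request (drives are loaded in parallel)."""
    return TAPE_LATENCY_S + size_bytes / (n_drives * DRIVE_BW_BPS)


def drive_seconds(size_bytes, n_drives):
    """Aggregate drive time consumed, which bounds the sustainable arrival rate."""
    return n_drives * TAPE_LATENCY_S + size_bytes / DRIVE_BW_BPS


for size in (100e6, 100e9):                  # a small and a medium file
    print(f"{size / 1e9:g} GB:"
          f"  resp 1 drive {response_time(size, 1):8.1f} s"
          f"  resp 4 drives {response_time(size, 4):8.1f} s"
          f"  busy 1 drive {drive_seconds(size, 1):8.1f} s"
          f"  busy 4 drives {drive_seconds(size, 4):8.1f} s")
```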

For medium-size files, results presented in Figure 5.7 show that, as expected, the system benefits more from data parallelism, latencies being now negligible. PARALLEL dominates the other policies in this case, being able to handle up to 150 requests per hour, while RAIT gets overloaded with arrival rates higher than 95 requests per hour. The extra tape drives used by RAIT, as well as the extra latencies when reading data, have a significant impact on the average weighted response time. In that case, VERTICAL is the worst policy, since it can only sustain up to 80 requests per hour.

Figure 5.7: Impact of arrival rates on the average weighted response time for medium-size files (B = 256 MB for RAIT); x-axis: λ (requests/hour); y-axis: average weighted response time; curves: VERTICAL, RAIT, PARALLEL.


Finally, for large files, results presented in Figure 5.8 outline a behavior similar to that observed for medium-size files. PARALLEL is the best policy in terms of average weighted response time. Using this policy, the system is able to cope with 1.3 requests per hour, while the second competitor, RAIT, can only serve up to 1 request per hour without being overloaded. Unsurprisingly, VERTICAL suffers from the lack of parallelism with these huge files, and can only bear 0.1 request per hour.
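A simple saturation argument gives an intuition for these maximum sustainable rates: a policy is overloaded once the arrival rate times the drive time consumed per request exceeds the number of available drives. The drive count, bandwidth and latency below are illustrative assumptions only; the resulting numbers are not meant to reproduce the figures.

```python
# Rough saturation estimate: a policy becomes overloaded when the arrival rate
# times the drive time consumed per request exceeds the number of drives.
# Drive count, bandwidth and latency are illustrative assumptions only.

N_DRIVES = 8
TAPE_LATENCY_S = 60.0
DRIVE_BW_BPS = 300e6


def max_rate_per_hour(mean_size_bytes, drives_per_request):
    """Requests/hour at 100% drive utilisation (an optimistic upper bound)."""
    busy_s = drives_per_request * TAPE_LATENCY_S + mean_size_bytes / DRIVE_BW_BPS
    return 3600.0 * N_DRIVES / busy_s


print("small (~0.5 GB):",
      f"{max_rate_per_hour(0.5e9, 1):.1f} req/h on 1 drive,",
      f"{max_rate_per_hour(0.5e9, 4):.1f} req/h striped over 4")
print("large (~50 TB): ",
      f"{max_rate_per_hour(50e12, 1):.2f} req/h on 1 drive,",
      f"{max_rate_per_hour(50e12, 4):.2f} req/h striped over 4")
```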

As a conclusion, these experiments show that PARALLEL offers the best results overall. RAIT outperforms VERTICAL whenever files are large enough. The fact that PARALLEL dominates the other solutions means not only that data parallelism is crucial, but also (and less expectedly) that balancing parity across tapes (as in RAIT) has a negligible impact compared to that of enforcing data sequentiality.

The next experiment aims at measuring how the difference in compressibility between data and EC blocks affects the overall performance of each I/O policy. Indeed, neither PARALLEL nor VERTICAL can balance the EC blocks across tapes, possibly leading to more tape loads/unloads for the tapes dedicated to EC blocks. The underlying objective of this experiment is therefore to assess the impact of this pitfall on the average weighted response time. To do so, the arrival rate is set to 65 requests per hour, the RAIT block size is B = 192 MB, and file sizes range between 1 GB and 1 TB. The compression rate of data is set to 3x. The EC compression rate varies between 1x (no compression) and 3x (the same compression rate as data).

As shown in Figure 5.9, the I/O policies do not display the same sensitivity to the EC compression rate. Both PARALLEL and RAIT are more affected by compression than VERTICAL. As a matter of fact, while the latter almost maintains its performance regardless of the EC compression rate, the average weighted response time displayed by RAIT and PARALLEL increases when EC blocks are less compressed. The lower sensitivity of VERTICAL to EC compression comes from the fact that several requests are served before writing EC blocks, because these are aggregated. This is not the case for PARALLEL and RAIT.

Figure 5.8: Impact of arrival rates on the average weighted response time for large files (B = 8 GB for RAIT); x-axis: λ (requests/hour); y-axis: average weighted response time; curves: VERTICAL, RAIT, PARALLEL.

Figure 5.9: Impact of EC block compression rates on performance; x-axis: EC compression rate; y-axis: average weighted response time; curves: VERTICAL, RAIT, PARALLEL.

Figure 5.10: Impact of the LOAD-BALANCER period on performance; x-axis: load balancing period (days); y-axis: average response time divided by size; curves: VERTICAL, RAIT, PARALLEL.


Moreover, with both PARALLEL and VERTICAL, data tapes are always entirely filled, and EC blocks are written onto dedicated tapes. When EC blocks are far less compressed than data, tapes dedicated to EC are filled faster and require more frequent EJECT/LOAD operations. This also explains the higher sensitivity of PARALLEL to EC compression. On the contrary, with RAIT, the difference in compression between data and EC blocks degrades the balance of tape occupancy, leading to extra EJECT/LOAD operations on every tape drive (with RAIT, when a tape is filled, all loaded tapes are ejected).
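A rough arithmetic sketch of this fill-rate effect: for a fixed physical tape capacity, the number of EC tapes filled (and thus of EJECT/LOAD rounds) per byte of user data grows as the EC compression rate decreases. The nominal tape capacity and the EC overhead ratio below are assumptions; only the 3x data compression rate comes from the experiment description.

```python
# Sketch: how many dedicated-EC tapes get filled (and ejected/loaded) per
# petabyte of user data, as a function of the EC compression rate.
# The nominal tape capacity and the EC overhead ratio are assumptions;
# the 3x data compression rate comes from the experiment description.

NOMINAL_TAPE_CAPACITY = 18e12      # assumed raw tape capacity (18 TB)
EC_OVERHEAD = 0.25                 # assumed EC bytes written per data byte
DATA_COMPRESSION = 3.0             # from the text


def tapes_filled_per_pb(ec_compression):
    """Return (data tapes, EC tapes) filled per petabyte of user data."""
    user_bytes = 1e15
    data_tapes = user_bytes / (DATA_COMPRESSION * NOMINAL_TAPE_CAPACITY)
    ec_tapes = (EC_OVERHEAD * user_bytes) / (ec_compression * NOMINAL_TAPE_CAPACITY)
    return data_tapes, ec_tapes


for rate in (1.0, 2.0, 3.0):
    d, e = tapes_filled_per_pb(rate)
    print(f"EC compression {rate:.0f}x: {d:.1f} data tapes, {e:.1f} EC tapes per PB")
```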

Altogether, this experiment shows that both RAIT and PARALLEL display higher sensitivity to compression than VERTICAL. Whenever data is highly compressible, VERTICAL ties PARALLEL.

The following experiment evaluates the impact of the load balancing period MINLBINTERVAL on the average weighted response time. Indeed, this setting might significantly influence the behavior of the system under intensive workloads. Therefore, MINLBINTERVAL needs to be precisely tuned in order to fully exploit the archival architecture. In this experiment, file sizes are chosen between 1 GB and 1 TB, the RAIT block size is B = 192 MB, and the mean arrival rate is set to λ = 65 requests per hour. Results depicted in Figure 5.10 show that the load balancing period indeed has a significant impact on performance: all three policies reach a minimum average weighted response time for a particular value of MINLBINTERVAL. Interestingly, the best value of MINLBINTERVAL is not the same for every policy: although both RAIT and PARALLEL perform better when LOAD-BALANCER is called at most every 6 days, VERTICAL benefits from a significantly higher value: 24 days. This is due to the fact that VERTICAL often has more pending requests than the other policies, and requires more time to serve each request. Remember that whenever an I/O process is destroyed by LOAD-BALANCER, the loaded tapes are ejected. Therefore, calling the load balancing process too frequently may cause (in the worst case) tapes to be unloaded after each request has been served, leading to higher response times. All in all, this experiment shows that the load balancing period MINLBINTERVAL needs to be precisely tuned in order to fully benefit from the parallel storage system.
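The trade-off behind this tuning can be captured by a toy cost model: calling LOAD-BALANCER more often corrects imbalance sooner, but each call that destroys an I/O process triggers an EJECT/LOAD round. The sketch below only illustrates the U-shaped trade-off; both cost terms and all constants are assumptions, not the simulator's model.

```python
# Toy cost model for the load-balancing period: frequent rebalancing reduces
# the imbalance penalty but adds eject/load overhead each time I/O processes
# are destroyed. All constants are illustrative assumptions.

EJECT_LOAD_COST_S = 180.0           # assumed cost of an eject/load round
IMBALANCE_PENALTY_S_PER_DAY = 40.0  # assumed extra wait accumulated per day of drift


def overhead_per_day(period_days):
    rebalance_cost = EJECT_LOAD_COST_S / period_days               # amortised eject/load
    imbalance_cost = IMBALANCE_PENALTY_S_PER_DAY * period_days / 2  # mean drift penalty
    return rebalance_cost + imbalance_cost


best = min(range(1, 31), key=overhead_per_day)
print("illustrative sweet spot:", best, "days")
```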
