
In this section, we have proposed ParaMiner, an algorithm adapted to the pattern mining problems defined in the previous chapter. Although ParaMiner is generic, it is made efficient by generalizing state-of-the-art algorithmic optimizations such as the dataset reduction technique and indexing techniques. In addition, ParaMiner is a parallel algorithm, hence it can benefit from modern parallel architectures. As a consequence, anyone with basic programming skills and a dataset to mine can benefit from state-of-the-art pattern mining techniques and from parallelism.

In addition, pattern mining is a challenging problem for the parallelism community. Indeed, naive parallelizations of pattern mining typically fail to scale even on machines with few cores, and even with an ad-hoc algorithm it is sometimes hard to reach the maximum performance of a given execution platform, due to the complex interactions between the algorithm behavior and hardware components. In the past years, researchers from both communities have worked together to build efficient ad-hoc parallel algorithms (see Section 5). Thanks to the Melinda execution engine used in ParaMiner, we can quickly benefit from these works by implementing the corresponding strategies in Melinda. In addition, ParaMiner together with Melinda is a convenient framework for quickly experimenting with a new pattern mining problem. It also makes the experiments comparable across execution strategies and pattern mining problems. Hence it is a valuable tool for better understanding parallel pattern mining algorithms.
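To give a concrete idea of what such a strategy looks like, the sketch below shows a Melinda-like tuple space in C++. This is our own illustration of the concept, not Melinda's actual API (which is a C library); the class and method names are assumptions. The strategy is embodied by the distribute and retrieve functions, which map tuples to internal queues; the trivial default shown here yields the first-in, first-out behavior used by default in ParaMiner.

```cpp
// Illustrative sketch of a Melinda-like tuple space (not Melinda's
// real C API). A strategy maps tuples to internal queues; worker
// threads put() newly generated tasks and get() tasks to expand.
#include <condition_variable>
#include <deque>
#include <mutex>
#include <vector>

template <typename Tuple>
class TupleSpace {
public:
    explicit TupleSpace(std::size_t n_internals) : queues_(n_internals) {}

    // Default FIFO strategy: a single internal queue for everything.
    // A custom strategy overrides these two functions.
    virtual std::size_t distribute(const Tuple&) { return 0; }
    virtual std::size_t retrieve(std::size_t /*thread_id*/) { return 0; }

    void put(const Tuple& t) {
        std::lock_guard<std::mutex> lk(m_);
        queues_[distribute(t)].push_back(t);
        cv_.notify_all();              // wake any waiting worker
    }

    // Blocks until a tuple is available or the space is closed.
    bool get(std::size_t thread_id, Tuple& out) {
        std::unique_lock<std::mutex> lk(m_);
        auto& q = queues_[retrieve(thread_id)];
        cv_.wait(lk, [&] { return !q.empty() || done_; });
        if (q.empty()) return false;   // space closed, no work left
        out = q.front();               // first-in, first-out order
        q.pop_front();
        return true;
    }

    void close() {                     // signal termination to workers
        std::lock_guard<std::mutex> lk(m_);
        done_ = true;
        cv_.notify_all();
    }

private:
    std::vector<std::deque<Tuple>> queues_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;
};
```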

In the next chapter, we validate the efficiency of ParaMiner by comparing it with other ad-hoc algorithms.

Chapter 4

Experiments

Contents

4.1 Experimental settings
4.2 Experimental evaluation of dataset reduction
    4.2.1 Reduction factors for the fim problem
    4.2.2 Reduction factors for crg and gri
    4.2.3 Performance impact of dataset reduction
4.3 Experimental evaluation of parallelism in ParaMiner
    4.3.1 Parallel performance evaluation of ParaMiner on Laptop
    4.3.2 Parallel performance evaluation of ParaMiner on Server
    4.3.3 Melinda strategies to improve the cache locality
4.4 Comparative experiments of ParaMiner with ad-hoc algorithms
    4.4.1 ParaMiner vs fim algorithms
    4.4.2 ParaMiner vs gri algorithms
4.5 Conclusion on the experimental evaluation of ParaMiner

In this chapter, we report on thorough experiments that we have conducted to evaluate ParaMiner's performance, in terms of execution times and of scalability on large computation platforms. We give the details of our experimental settings in Section 4.1.

In Section 4.2, we experimentally demonstrate the efficiency of dataset reduction by measuring the gain offered by this optimization presented in the previous chapter. The experiments show that dataset reduction drastically speeds up the computation of closed patterns and that it is a key optimization for tackling large datasets.

In Section 4.3, we demonstrate the benefits of parallelism on several types of computation platforms. With the rise of multi-core processors, parallelism is now available in most standard computers. We thus evaluate the gain offered by parallelism on a 4-core laptop computer. These experiments show that ParaMiner can fully exploit this new form of computational power.


We also demonstrate that ParaMiner can benefit from larger computation platforms by conducting experiments on a 32-core computation server with 64 GiB of memory. Thanks to this additional computational power, ParaMiner successfully tackles the problem of gradual itemset mining on real-world datasets.

Those larger platforms have complex architectures, and exploiting them efficiently has been an active research topic for decades. Designing efficient parallel pattern mining algorithms is a particularly challenging problem due to irregular and memory-intensive computations. Experiments reveal important parallelism issues, also identified in [TP09, GBP+05] or [NTMU10], that prevent ParaMiner from reaching the theoretical performance of the computing platforms. From those experiments, we propose solutions and demonstrate their feasibility with Melinda strategies.

In Section 4.4, we demonstrate experimentally that ParaMiner is generic and efficient, thanks to the above optimizations. To do so, we compare the performance of ParaMiner with several state-of-the-art algorithms designed for specific pattern mining problems. Although these algorithms are the fastest available, ParaMiner is competitive in all cases. For the problem of gradual itemset mining, it outperforms the state-of-the-art algorithm by several orders of magnitude.

4.1 Experimental settings

We have implemented ParaMiner in C++. The Select and Clo operators for the different problems proposed in this thesis are also implemented in C++, although it is technically possible to interface ParaMiner with other common programming languages such as Java or Python. The Melinda library is implemented in C. Unless otherwise mentioned, the Melinda strategy in use is the default strategy, where the tuples are pushed into and pulled from the tuple space in first-in, first-out order.
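To illustrate how a pattern mining problem plugs into ParaMiner, the following C++ sketch shows what the pair of problem-specific operators could look like for frequent itemset mining. It is a minimal illustration under our own assumptions: the type names and the PatternMiningProblem interface are ours, not ParaMiner's actual API.

```cpp
// Minimal sketch of a problem plug-in for a ParaMiner-like miner.
// All names here (PatternMiningProblem, select, clo, ...) are
// illustrative assumptions, not ParaMiner's actual interface.
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

using Element = int;                      // items encoded as integers
using Pattern = std::set<Element>;        // a pattern is a set of elements
using Transaction = std::vector<Element>; // assumed sorted ascending
using Dataset = std::vector<Transaction>;

// A pattern mining problem is specified by two operators:
//  - select: does a candidate pattern satisfy the selection criterion?
//  - clo:    return the closure of a pattern in the dataset.
struct PatternMiningProblem {
    virtual bool select(const Pattern& p, const Dataset& d) const = 0;
    virtual Pattern clo(const Pattern& p, const Dataset& d) const = 0;
    virtual ~PatternMiningProblem() = default;
};

// Frequent itemset mining (fim): a pattern is selected when its
// support reaches minsup; its closure is the intersection of all
// transactions that contain it.
struct FrequentItemsets : PatternMiningProblem {
    std::size_t minsup;
    explicit FrequentItemsets(std::size_t m) : minsup(m) {}

    bool select(const Pattern& p, const Dataset& d) const override {
        std::size_t support = 0;
        for (const Transaction& t : d)
            if (std::includes(t.begin(), t.end(), p.begin(), p.end()))
                ++support;
        return support >= minsup;
    }

    Pattern clo(const Pattern& p, const Dataset& d) const override {
        Pattern closure;
        bool first = true;
        for (const Transaction& t : d) {
            if (!std::includes(t.begin(), t.end(), p.begin(), p.end()))
                continue;                 // t does not support p
            if (first) {
                closure.insert(t.begin(), t.end());
                first = false;
            } else {
                Pattern kept;
                std::set_intersection(closure.begin(), closure.end(),
                                      t.begin(), t.end(),
                                      std::inserter(kept, kept.begin()));
                closure = std::move(kept);
            }
        }
        return closure;
    }
};
```

Under such an interface, supporting a new pattern mining problem amounts to providing these two operators, which is what makes the genericity claim above practical.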

All the algorithms, including those from other authors, are compiled with the gcc compiler with the same settings and with compiler optimizations fully enabled (the -O3 flag in gcc). They are executed on computers running the GNU/Linux operating system.

We execute the algorithms on two distinct computing platforms, namely Laptop and Server.

Their hardware configurations are presented in Table 4.1. Laptop is a high-end laptop computer with four cores. Its fairly standard configuration is similar to what most data miners use as their main computer. Experiments conducted on this platform thus represent what anyone can get by running ParaMiner on his/her own computer.

                              Laptop               Server
# cores                       4                    32
Memory (GiB)                  8                    64
Processor type                Intel Core i7 X900   4 × Intel Core i7 X7560
Processor frequency (GHz)     2                    2.27
Cache size (MiB)              8                    4 × 24
Memory bus bandwidth (GiB/s)  7.9                  9.7

Table 4.1: Specifications of the computation platforms

Server is a computation server with 32 cores and 64 GiB of memory. It has four times as many cores as Laptop and eight times as much memory.

However, it is important to note that other key characteristics are not scaled in the same proportions. For example, the memory bus bandwidth is roughly the same on both platforms: 7.9 GiB/s on Laptop vs 9.7 GiB/s on Server. This causes important issues discussed later in this chapter.
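As a back-of-the-envelope illustration (our own arithmetic, based on Table 4.1): each core of Laptop can draw up to 7.9/4 ≈ 2.0 GiB/s from memory, whereas each core of Server only gets 9.7/32 ≈ 0.3 GiB/s. For memory-intensive computations such as pattern mining, Server thus offers about 6.5 times less memory bandwidth per core, despite its larger aggregate resources.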