
An approach for profiling distributed applications through network traffic analysis


Graduate Program in Computer Science

An Approach for Profiling Distributed

Applications Through Network Traffic Analysis

By

THIAGO PEREIRA DE BRITO VIEIRA

Master's Dissertation

Universidade Federal de Pernambuco posgraduacao@cin.ufpe.br www.cin.ufpe.br/~posgraduacao


UNIVERSIDADE FEDERAL DE PERNAMBUCO

CENTRO DE INFORMÁTICA

PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO

THIAGO PEREIRA DE BRITO VIEIRA

“AN APPROACH FOR PROFILING DISTRIBUTED APPLICATIONS THROUGH NETWORK TRAFFIC ANALYSIS”

THIS WORK WAS PRESENTED TO THE GRADUATE PROGRAM IN COMPUTER SCIENCE OF THE CENTRO DE INFORMÁTICA OF THE UNIVERSIDADE FEDERAL DE PERNAMBUCO AS A PARTIAL REQUIREMENT FOR THE DEGREE OF MASTER IN COMPUTER SCIENCE.

ADVISOR: Vinicius Cardoso Garcia

CO-ADVISOR: Stenio Flavio de Lacerda Fernandes


Cataloging at source

Librarian Jane Souto Maior, CRB4-571

Vieira, Thiago Pereira de Brito

An approach for profiling distributed applications through network traffic analysis. / Thiago Pereira de Brito Vieira. - Recife: The Author, 2013.

xv, 71 pages: fig., tab.

Advisor: Vinicius Cardoso Garcia.

Dissertation (Master's) - Universidade Federal de Pernambuco. CIn, Ciência da Computação, 2013.

Includes bibliography.

1. Computer science. 2. Distributed systems. I. Garcia, Vinicius Cardoso (advisor). II. Title.


Master's dissertation presented by Thiago Pereira de Brito Vieira to the Graduate Program in Computer Science of the Centro de Informática of the Universidade Federal de Pernambuco, under the title “An Approach for Profiling Distributed Applications Through Network Traffic Analysis”, advised by Prof. Vinicius Cardoso Garcia and approved by the examining committee formed by the professors:

______________________________________________ Prof. José Augusto Suruagy Monteiro

Centro de Informática / UFPE

______________________________________________ Prof. Denio Mariz Timoteo de Souza

Instituto Federal da Paraíba

_______________________________________________ Prof. Vinicius Cardoso Garcia

Centro de Informática / UFPE

Approved for printing. Recife, March 5, 2013

___________________________________________________

Profa. Edna Natividade da Silva Barros

Coordinator of the Graduate Program in Computer Science of the Centro de Informática of the Universidade Federal de Pernambuco.


I dedicate this dissertation to my parents, for teaching me to always study and work in order to grow as a person and as a professional.


Acknowledgments

First, I would like to thank God for life, health, and all the opportunities created in my life.

I thank my parents, João and Ana, for all the love, care, and encouragement to always pursue personal and professional growth, and for always supporting my decisions and showing themselves concerned and committed to helping me reach my goals.

I thank Alynne, my future wife, for all the love and patience throughout our relationship, especially during these two intense years of the master's, in which her words of support in difficult moments and her light-heartedness were essential to give me more energy and will to carry on with ever greater dedication.

I thank the Agência Nacional de Telecomunicações - Anatel for allowing and providing one more learning experience in my life. I would especially like to thank Rodrigo Barbosa, Túlio Barbosa, and Jane Teixeira for understanding and supporting me in this challenge of pursuing a master's degree. I thank Marcio Formiga for the support before and during the master's, and for understanding the effort needed to overcome this challenge. I thank Wesley Paesano, Marcelo de Oliveira, Regis Novais, and Danilo Balby for the support that allowed me to dedicate myself to the master's during these two years. I also thank the friends at Anatel who, directly or indirectly, helped me face this challenge, especially Ricardo de Holanda, Rodrigo Curi, Esdras Hoche, Francisco Paulo, Cláudio Moonen, Otávio Barbosa, Hélio Silva, Bruno Preto, Luide Liude, and Alexandre Augusto.

I thank everyone who guided and taught me during this master's, especially Vinicius Garcia for the welcome, support, guidance, demands, and all the important lessons during these months. I thank Stenio Fernandes for all the lessons and guidance at important moments of my research. I thank Rodrigo Assad for the work done together on usto.re and for the guidance that steered the development of my research. I thank Marcelo D'Amorim for the initial welcome and for the work we did together, which was of great value for my entry into scientific research and for my development as a researcher.

I thank José Augusto Suruagy and Denio Mariz for agreeing to serve on my dissertation defense committee and for their valuable criticism and contributions to my work.

…contributed to making the days dedicated to this master's very pleasant. I would like to thank Paulo Fernando, Lenin Abadie, Marco Machado, Dhiego Abrantes, Rodolfo Arruda, Francisco Soares, Sabrina Souto, Adriano Tito, Hélio Rodrigues, Jamilson Batista, Bruno Felipe, and the other people I had the pleasure of meeting during this master's period.

I also thank all my old friends from João Pessoa, Geisel, UFPB, and CEFET-PB, who gave me so much support and encouragement to develop this work.

Finally, I thank everyone who collaborated directly or indirectly in carrying out this work.


Wherever you go, go with all your heart. —CONFUCIUS


Resumo

Distributed systems have been used to build modern Internet services and cloud computing infrastructures, in order to obtain services with high performance, scalability, and reliability. The service level agreements adopted in cloud computing require a short time to identify, diagnose, and solve problems in the infrastructure, so as to prevent problems from negatively impacting the quality of the services provided to clients. Hence, detecting the causes of errors, and diagnosing and reproducing errors in distributed systems, are challenges that motivate efforts towards less intrusive and more efficient mechanisms for monitoring and debugging distributed applications at runtime.

Network traffic analysis is one option for measuring distributed systems, although there are limitations in the capacity to process large amounts of network traffic in a short time, and in the scalability to process network traffic under varying resource demands.

The goal of this dissertation is to analyse the processing capacity problem of measuring distributed systems through network traffic analysis, in order to evaluate the performance of the distributed systems of a data center, using commodity hardware and cloud computing services, in a minimally intrusive way.

We propose a new MapReduce-based approach to deeply inspect distributed application network traffic, in order to evaluate the performance of distributed systems at runtime, using commodity hardware. In this dissertation we evaluate the effectiveness of MapReduce for a deep packet inspection algorithm, its processing capacity, the speedup in job completion time, the scalability of its processing capacity, and the behavior followed by the MapReduce phases when applied to deep packet inspection for extracting indicators of distributed applications.

Keywords: Distributed Application Measurement, Profiling, MapReduce, Network Traffic Analysis, Packet Level Analysis, Deep Packet Inspection


Abstract

Distributed systems have been adopted for building modern Internet services and cloud computing infrastructures, in order to obtain services with high performance, scalability, and reliability. Cloud computing SLAs require a short time to identify, diagnose, and solve problems in a cloud computing production infrastructure, in order to avoid negative impacts on the quality of service provided to clients. Thus, the detection of error causes, and the diagnosis and reproduction of errors, are challenges that motivate efforts towards less intrusive mechanisms for monitoring and debugging distributed applications at runtime.

Network traffic analysis is one option for distributed systems measurement, although there are limitations in the capacity to process large amounts of network traffic in a short time, and in the scalability to process network traffic under varying resource demands.

The goal of this dissertation is to analyse the processing capacity problem of measuring distributed systems through network traffic analysis, in order to evaluate the performance of distributed systems at a data center, using commodity hardware and cloud computing services, in a minimally intrusive way.

We propose a new approach based on MapReduce for deep inspection of distributed application traffic, in order to evaluate the performance of distributed systems at runtime, using commodity hardware. In this dissertation we evaluated the effectiveness of MapReduce for a deep packet inspection algorithm, its processing capacity, completion time speedup, processing capacity scalability, and the behavior followed by the MapReduce phases, when applied to deep packet inspection for extracting indicators of distributed applications.

Keywords: Distributed Application Measurement, Profiling, MapReduce, Network Traffic Analysis, Packet Level Analysis, Deep Packet Inspection


Contents

List of Figures xiii
List of Tables xiv
List of Acronyms xv

1 Introduction 1
1.1 Motivation 1
1.2 Problem Statement 4
1.3 Contributions 5
1.4 Dissertation Organization 6

2 Background and Related Work 7
2.1 Background 7
2.1.1 Network Traffic Analysis 7
2.1.2 JXTA 9
2.1.3 MapReduce 10
2.2 Related Work 13
2.2.1 Distributed Debugging 13
2.2.2 MapReduce for Network Traffic Analysis 14
2.3 Chapter Summary 15

3 Profiling Distributed Applications Through Deep Packet Inspection 17
3.1 Motivation 18
3.2 Architecture 20
3.3 Evaluation 28
3.3.1 Evaluation Methodology 28
3.3.2 Experiment Setup 30
3.4 Results 31
3.5 Discussion 34
3.5.1 Results Discussion 34
3.5.2 Possible Threats to Validity 35

4 Evaluating MapReduce for Network Traffic Analysis 37
4.1 Motivation 38
4.2 Evaluation 39
4.2.1 Evaluation Methodology 39
4.2.2 Experiment Setup 41
4.3 Results 42
4.4 Discussion 53
4.4.1 Results Discussion 53
4.4.2 Possible Threats to Validity 56
4.5 Chapter Summary 56

5 Conclusion and Future Work 58
5.1 Conclusion 59
5.2 Contributions 60
5.2.1 Lessons Learned 61
5.3 Future Work 62


List of Figures

2.1 Differences between packet level analysis and deep packet inspection 8
2.2 MapReduce input dataset splitting into blocks and into records 10
3.1 Architecture of the SnifferServer to capture and store network traffic 21
3.2 Architecture for network traffic analysis using MapReduce 23
3.3 JXTA Socket trace analysis 31
3.4 Completion time scalability of MapReduce for DPI 32
    (a) Scalability to process 16 GB 32
    (b) Scalability to process 34 GB 32
4.1 DPI Completion Time and Speed-up of MapReduce for 90Gb of a JXTA-application network traffic 43
4.2 DPI Processing Capacity for 90Gb 44
4.3 MapReduce Phases Behaviour for DPI of 90Gb 45
    (a) Phases Time for DPI 45
    (b) Phases Distribution for DPI 45
4.4 Completion time comparison of MapReduce for packet level analysis, evaluating the approach with and without splitting into packets 47
4.5 CountUp completion time and speed-up of 90Gb 48
    (a) P3 evaluation 48
    (b) CountUpDriver evaluation 48
4.6 CountUp processing capacity for 90Gb 49
    (a) P3 processing capacity 49
    (b) CountUpDriver processing capacity 49
4.7 MapReduce Phases time of CountUp for 90Gb 50
    (a) MapReduce Phases Times of P3 50
    (b) MapReduce Phases Times for CountUpDriver 50
4.8 MapReduce Phases Distribution for CountUp of 90Gb 51
    (a) Phases Distribution for P3 51
    (b) Phases Distribution for CountUpDriver 51
4.9 MapReduce Phases Distribution for CountUp of 90Gb 52
    (a) DPI Completion Time and Speed-up of MapReduce for 30Gb of a JXTA-application network traffic 52


List of Tables

3.1 Metrics to evaluate MapReduce effectiveness and completion time scalability for DPI of a JXTA-based network traffic 28
3.2 Factors and levels to evaluate the defined metrics 29
3.3 Hypotheses to evaluate the defined metrics 29
3.4 Hypothesis notation 29
3.5 Completion time to process 16 GB split into 35 files 33
3.6 Completion time to process 34 GB split into 79 files 33
4.1 Metrics for evaluating MapReduce for DPI and packet level analysis 40
4.2 Factors and Levels 40


List of Acronyms

DPI Deep Packet Inspection

EC2 Elastic Compute Cloud

GQM Goal Question Metric

HDFS Hadoop Distributed File System

IP Internet Protocol

I/O Input/Output

JVM Java Virtual Machine

MBFS Message Based Per Flow State

MBPS Message Based Per Protocol State

PBFS Packet Based Per Flow State

PBNS Packet Based No State

PCAP Packet Capture

PDU Protocol Data Unit

POSIX Portable Operating System Interface

RTT Round-Trip Time

SLA Service Level Agreement

TCP Transmission Control Protocol


1

Introduction

Though nobody can go back and make a new beginning, anyone can start over and make a new ending.

—CHICO XAVIER

1.1 Motivation

Distributed systems have been adopted for building high performance systems, due to the possibility of obtaining high fault tolerance, scalability, availability, and efficient use of resources (Cox et al., 2002; Antoniu et al., 2007). Modern Internet services and cloud computing infrastructures are commonly implemented as distributed systems, to provide services with high performance and reliability (Mi et al., 2012). Cloud computing SLAs require a short time to identify, diagnose, and solve problems in the production infrastructure, in order to avoid negative impacts on the quality of service provided to clients. Thus, monitoring and performance analysis of distributed systems in production environments became more necessary with the growth of cloud computing and the use of distributed systems to provide services and infrastructure as a service (Fox et al., 2009; Yu et al., 2011).

In developing, maintaining, and administering distributed systems, the detection of error causes, and the diagnosis and reproduction of errors, are challenges that motivate efforts towards less intrusive and more effective mechanisms for monitoring and debugging distributed applications at runtime (Armbrust et al., 2010). Distributed measurement systems (Massie et al., 2004) and log analysers (Oliner et al., 2012) provide relevant information regarding some aspects of a distributed system, but this information can be complemented by other sources, such as network traffic analysis, which can provide valuable information about a distributed application and its environment, and also increase the number of information sources, making the evaluation of complex distributed systems more effective. Simulators (Paul, 2010), emulators, and testbeds (Loiseau et al., 2009; Gupta et al., 2011) are also used to evaluate distributed systems, but these approaches fail to reproduce the production behavior of a distributed system and its relation with a complex environment, such as a cloud computing environment (Loiseau et al., 2009; Gupta et al., 2011).

Monitoring and diagnosing production failures of distributed systems require low intrusion, high accuracy, and fast results. Achieving these requirements is complex, because distributed systems usually involve asynchronous communication, unpredictability of network message issues, a large number of resources to be monitored in a short time, and black-box components (Yuan et al., 2011; Nagaraj et al., 2012). To measure distributed systems with less intrusion and less dependency on developers, approaches with low dependency on source code or instrumentation are necessary, such as log analysis or network traffic analysis (Aguilera et al., 2003).

It is possible to measure, evaluate, and diagnose distributed applications through the evaluation of information from communication protocols, flows, throughput, and load distribution (Mi et al., 2012; Nagaraj et al., 2012; Sambasivan et al., 2011; Aguilera et al., 2003; Yu et al., 2011). This information can be collected through network traffic analysis, but to retrieve it from distributed application traffic it is necessary to recognize application protocols and perform DPI, in order to retrieve details of application behaviors, sessions, and states.

Network traffic analysis is one option to evaluate distributed systems' performance (Yu et al., 2011), although there are limitations in the processing capacity to deal with large amounts of network traffic in a short time, in the scalability to process network traffic under varying resource demands, and in the complexity of obtaining information about a distributed application's behavior from network traffic (Loiseau et al., 2009; Callado et al., 2009). To evaluate application information from network traffic it is necessary to use DPI and extract information from application protocols, which requires additional effort compared with traditional DPI approaches, which usually do not evaluate the content of application protocols and application states.

In the production environment of a cloud computing provider, DPI can be used to evaluate and diagnose distributed applications, through the analysis of application traffic inside a data center. However, this kind of DPI differs from, and requires more effort than, common DPI approaches. DPI is usually applied to all network traffic that arrives at a data center, but this approach would not provide reasonable performance for inspecting application protocols and their states, due to the massive volume of network traffic to be evaluated online, and the computational cost of performing this kind of evaluation in a short time (Callado et al., 2009).

Packet level analysis can also be used to evaluate packet flows and the load distribution of network traffic inside a data center (Kandula et al., 2009), providing valuable information about the behavior of a distributed system and about the dimension, capacity, and usage of network resources. However, with packet level analysis it is not possible to evaluate application messages, protocols, and their states.

Although much work has been done to improve DPI performance (Fernandes et al., 2009; Antonello et al., 2012), the evaluation of application states through traffic analysis decreases the processing capacity of DPI for large amounts of network traffic. With the growth of link speeds, Internet traffic exchange, and the use of distributed systems to provide Internet services (Sigelman et al., 2010), approaches are needed that can deal with the analysis of the growing amount of network traffic, to permit the efficient evaluation of distributed systems through network traffic analysis.

MapReduce (Dean and Ghemawat, 2008), which was proposed for distributed processing of large datasets, can be an option to deal with large amounts of network traffic. MapReduce is a programming model and an associated implementation for processing and generating large datasets. It has become an important programming model and distribution platform for processing large amounts of data, with diverse use cases in academia and industry (Zaharia et al., 2008; Guo et al., 2012). MapReduce is a restricted programming model that easily and automatically parallelizes the execution of user functions and provides transparent fault tolerance (Dean and Ghemawat, 2008). Based on combinators from functional languages, it provides a simple programming paradigm for parallel processing that is increasingly being used for data-intensive applications in cloud computing environments.
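As a sketch of this execution model (a single-process toy, not the Hadoop implementation evaluated in this dissertation), the following reproduces the map/shuffle/reduce flow with the canonical word-count job:

```python
from collections import defaultdict

def run_mapreduce(records, map_fn, reduce_fn):
    # Map: each record is processed independently, emitting (key, value) pairs.
    intermediate = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            intermediate[key].append(value)  # shuffle: group values by key
    # Reduce: each key's grouped values are reduced independently.
    return {key: reduce_fn(key, values) for key, values in intermediate.items()}

# Word count, the canonical MapReduce example.
docs = ["deep packet inspection", "packet level analysis"]
counts = run_mapreduce(
    docs,
    map_fn=lambda line: [(w, 1) for w in line.split()],
    reduce_fn=lambda key, values: sum(values),
)
print(counts["packet"])  # → 2
```

Because Map and Reduce calls share no state, a real runtime can distribute them across a cluster, which is what makes the model attractive for large traces.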

MapReduce can be used for network packet level analysis (Lee et al., 2011), which evaluates each packet individually to obtain information from the network and transport layers. Lee et al. (2011) proposed an approach to perform network packet level analysis through MapReduce, using network traces split into packets, processing each one individually to extract indicators from IP, TCP, and UDP. However, profiling an application through network traffic analysis requires deep packet inspection, in order to evaluate the content of the application layer, evaluate application protocols, and reassemble application messages.


Because the approach proposed by Lee et al. (2011) is not able to evaluate more than one packet per MapReduce iteration or to analyse application messages, a new MapReduce approach is necessary to perform DPI algorithms for profiling applications through network traffic analysis.

The kind of workload submitted to MapReduce impacts its behaviour and performance (Tan et al., 2012; Groot, 2012), requiring specific configuration to obtain optimal performance. Information about the occupation of the MapReduce phases, about the processing characteristics (whether the job is I/O or CPU bound), and about the mean duration of Map and Reduce tasks can be used to tune MapReduce parameter configurations, in order to improve resource allocation and task scheduling.
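For illustration, with made-up task durations (the numbers below are invented, not measurements from this dissertation), the phase-occupation summary described above can be sketched as:

```python
# Given per-phase task durations extracted from a job log, compute each
# phase's share of total busy time to see whether the job is Map-heavy.
phase_durations = {                      # seconds, illustrative values only
    "map": [120, 115, 130, 125],
    "shuffle": [20, 25],
    "reduce": [30, 35],
}

totals = {phase: sum(ts) for phase, ts in phase_durations.items()}
busy = sum(totals.values())
distribution = {phase: round(100 * t / busy, 1) for phase, t in totals.items()}
print(distribution)  # → {'map': 81.7, 'shuffle': 7.5, 'reduce': 10.8}
```

A dominant Map share of this kind is the signal that tuning effort (block size, number of map slots) should concentrate on the Map phase.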

Although studies have been done to understand, analyse, and improve workload management decisions in MapReduce (Lu et al., 2012; Groot, 2012), there is no evaluation that characterizes MapReduce behaviour or identifies its optimal configuration to achieve the best performance for packet level analysis and DPI.

1.2 Problem Statement

MapReduce can express several kinds of problems, but not all. MapReduce does not efficiently express incremental, dependent, or recursive data processing (Bhatotia et al., 2011; Lin, 2012), because it adopts batch processing and functions executed independently, without shared state or data. Although MapReduce is restrictive, it is a good fit for many problems of processing large datasets. Its expressiveness limitations may be reduced by decomposing problems into multiple MapReduce iterations, or by combining MapReduce with other programming models for sub-problems (Lämmel, 2007; Lin, 2012), although the decomposition into iterations increases the completion time of MapReduce jobs (Lämmel, 2007).

DPI algorithms require the evaluation of one or more packets to retrieve information from application layer messages; this represents a data dependency when mounting an application message from network packets, and it is a restriction on using MapReduce for DPI. Because Lee et al. (2011)'s approach for packet level analysis with MapReduce processes each packet individually, it cannot be used to evaluate more than one packet per Map function or to efficiently reassemble an application message from network traces. Thus a new approach is necessary to perform DPI with MapReduce, evaluating the effectiveness of MapReduce to express DPI algorithms.
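To make the data dependency concrete, the following hypothetical sketch (the packet tuples and field names are illustrative, not the dissertation's actual trace format) shows what a Map function over a whole block can do that a per-packet Map cannot: it sees all fragments of a flow and can reassemble the application message.

```python
from collections import defaultdict

# Hypothetical packet tuples: (flow_id, sequence_number, payload_fragment).
# A per-packet Map sees one tuple at a time and cannot rebuild a message
# that spans several packets; a Map over a whole block can.
def map_whole_block(block):
    fragments = defaultdict(dict)
    for flow_id, seq, payload in block:
        fragments[flow_id][seq] = payload
    # Reassemble each flow's message in sequence order, emit (flow, message).
    for flow_id, parts in fragments.items():
        yield flow_id, b"".join(parts[s] for s in sorted(parts))

block = [("flow-1", 2, b"sage"), ("flow-1", 1, b"mes"), ("flow-2", 1, b"ok")]
print(dict(map_whole_block(block)))  # {'flow-1': b'message', 'flow-2': b'ok'}
```

Note that reassembly still only works if all packets of a message land in the same block, which is why block boundaries matter for this kind of job.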


In elastic environments, like cloud computing providers, where users can request or discard resources dynamically, it is important to know how to perform provisioning and resource allocation optimally. To run MapReduce jobs efficiently, the allocated resources need to match the workload characteristics, and they should be sufficient to meet a requested processing capacity or deadline (Lee, 2012).

The main performance evaluations of MapReduce concern text processing (Zaharia et al., 2008; Chen et al., 2011; Jiang et al., 2010; Wang et al., 2009), where the input data are split into blocks and into records, to be processed by parallel and independent Map functions. Although studies have been done to understand, analyse, and improve workload decisions in MapReduce (Lu et al., 2012; Groot, 2012), there is no evaluation that characterizes MapReduce behavior or identifies its optimal configuration to achieve the best performance for packet level analysis and DPI. Thus, a characterization of MapReduce jobs for packet level analysis and DPI is necessary, in order to permit their optimal configuration for best performance, and to obtain information that can be used to predict or simulate the completion time of a job with given resources, in order to determine whether the job will finish by the deadline with the allocated resources (Lee, 2012).

The goal of this dissertation is to analyse the processing capacity problem of measuring distributed systems through network traffic analysis, proposing a solution able to perform deep inspection of distributed application traffic, in order to evaluate distributed systems at a data center, using commodity hardware and cloud computing services, in a minimally intrusive way. Thus we developed an approach based on MapReduce to evaluate the behavior of distributed systems through DPI, and we evaluated the effectiveness of MapReduce for a DPI algorithm and its completion time scalability through node addition to the cluster, to measure a JXTA-based application, using virtual machines of a cloud computing provider. We also evaluated the MapReduce performance for packet level analysis and DPI, characterizing the behavior followed by the MapReduce phases, the processing capacity scalability, and the speed-up. In this evaluation we assessed the impact caused by variation of input size, block size, and cluster size.

1.3 Contributions

We analyse the processing capacity problem of distributed system measurement through network traffic analysis. The results of the work presented in this dissertation provide the following contributions:


1. We proposed an approach to implement DPI algorithms through MapReduce, using whole blocks as input for Map functions. We showed the effectiveness of MapReduce for a DPI algorithm to extract indicators from distributed application traffic, and we showed the MapReduce completion time scalability, through node addition to the cluster, for DPI on virtual machines of a cloud computing provider;

2. We characterized the behavior followed by the MapReduce phases for packet level analysis and DPI, showing that this kind of job is Map-phase intensive and highlighting points for improvement;

3. We described the processing capacity scalability of MapReduce for packet level analysis and DPI, evaluating the impact caused by variations in input, cluster, and block size;

4. We showed the speed-up obtained with MapReduce for DPI, with variations of input, cluster, and block size.

1.4 Dissertation Organization

The remainder of this dissertation is organized as follows.

In Chapter 2, we provide background information on network traffic analysis and MapReduce, and we review previous work related to the measurement of distributed applications at runtime and to the use of MapReduce for network traffic analysis.

In Chapter 3, we look at the problem of distributed application monitoring and the restrictions on using MapReduce for profiling application traffic. There are limitations in the capacity to process large amounts of network packets in a short time, and in the scalability to process network traffic under variations of throughput and resource demand. To address this problem, we present an approach for profiling application traffic using MapReduce. Experiments show the effectiveness of our approach for profiling applications through DPI and MapReduce, and show the completion time scalability achieved in a cloud computing provider.

In Chapter 4, we perform a performance evaluation of MapReduce for network traffic analysis. Due to the lack of evaluations of MapReduce for traffic analysis and the peculiarity of this kind of data, this chapter deeply evaluates the performance of MapReduce for packet level analysis and DPI of distributed application traffic, evaluating the MapReduce scalability, speed-up, and the behavior followed by the MapReduce phases. The experiments show the predominant phases in this kind of MapReduce job, and show the impact caused by the input size, block size, and number of nodes on the job completion time and on the scalability achieved through the use of MapReduce.

In Chapter 5, we conclude the work done, summarize our contributions, and present future work.


2

Background and Related Work

No one knows it all. No one is ignorant of everything. We all know something. We are all ignorant of something.

—PAULO FREIRE

In this chapter, we provide background information on network traffic analysis, JXTA, and MapReduce, and we review previous studies related to the measurement of distributed applications and to the use of MapReduce for network traffic analysis.

2.1 Background

2.1.1 Network Traffic Analysis

Network traffic measurement can be divided into active and passive measurement, and a measurement can be performed at the packet or flow level. In packet level analysis, measurements are performed on each packet transmitted across the measurement point. Common packet inspection only analyses content up to the transport layer, including the source address, destination address, source port, destination port, and protocol type, but packet inspection can also analyse the packet payload, performing deep packet inspection.
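As an illustration of the header fields common packet inspection reads, here is a minimal sketch assuming a raw IPv4+TCP header with no options (real traces also carry link-layer and PCAP framing, which this sketch omits):

```python
import socket
import struct

def five_tuple(ip_packet: bytes):
    # IPv4 header: version/IHL in byte 0, protocol at byte 9, addresses at 12-20.
    ihl = (ip_packet[0] & 0x0F) * 4
    proto = ip_packet[9]
    src = socket.inet_ntoa(ip_packet[12:16])
    dst = socket.inet_ntoa(ip_packet[16:20])
    # TCP and UDP both carry source/destination ports in their first 4 bytes.
    sport, dport = struct.unpack("!HH", ip_packet[ihl:ihl + 4])
    return src, dst, sport, dport, proto

# A hand-built minimal IPv4+TCP header for illustration (checksums zeroed).
hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                  socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"))
hdr += struct.pack("!HH", 43210, 80) + b"\x00" * 16
print(five_tuple(hdr))  # → ('10.0.0.1', '10.0.0.2', 43210, 80, 6)
```

The five values returned here are exactly the per-packet state that flow-level and packet-level methods key on.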

Risso et al. (2008) presented a taxonomy of the methods that can be used for network traffic analysis. According to Risso et al. (2008), Packet Based No State (PBNS) operates by checking the value of some fields present in each packet, such as the TCP or UDP ports, so this method is computationally very simple. Packet Based Per Flow State (PBFS) requires a session table to manage session identification (source/destination address, transport-layer protocol, source/destination port) and the corresponding application layer protocol, in order to scan the payload looking for a specific rule, usually an application-layer signature, which increases the processing complexity of this method. Message Based Per Flow State (MBFS) operates on messages instead of packets. This method requires a TCP/IP reassembler to handle IP fragments and TCP segments. In this case, memory requirements increase because of the additional state information that must be kept for each session, and because of the buffers required by the TCP/IP reassembler. Message Based Per Protocol State (MBPS) interprets exactly what each application sends and receives. An MBPS processor understands not only the semantics of the messages, but also the different phases of a message exchange, because it has a full understanding of the protocol state machine. Memory requirements become even larger, because this method needs to take into account not only the state of the transport session, but also the state of each application layer session. Processing requirements are also the highest, because protocol conformance analysis requires processing the entire application data, while the previous methods are limited to the first packets of each session.
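A minimal sketch of the PBFS idea described above, with invented signatures purely for illustration (real classifiers use far richer rule sets):

```python
# Session table maps a flow 5-tuple to the application protocol identified
# by the first payload that matches a signature; later packets of the same
# flow are classified from the stored per-flow state without re-scanning.
SIGNATURES = {b"GET ": "http", b"SSH-": "ssh"}  # illustrative only

session_table = {}

def classify(flow, payload):
    if flow in session_table:            # flow already identified
        return session_table[flow]
    for sig, proto in SIGNATURES.items():
        if payload.startswith(sig):
            session_table[flow] = proto  # remember per-flow state
            return proto
    return "unknown"

flow = ("10.0.0.1", 43210, "10.0.0.2", 80, "tcp")
print(classify(flow, b"GET /index.html HTTP/1.1\r\n"))  # → http
print(classify(flow, b"...continuation bytes..."))      # → http (from state)
```

The session table is what distinguishes PBFS from PBNS: the cost of a lookup per packet buys the ability to classify packets whose own payload carries no signature.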

Figure 2.1 illustrates the difference between packet level analysis and DPI over PCAP files: packet level analysis evaluates each packet individually, while DPI requires the evaluation of more than one packet, in order to reassemble packets and obtain an application message.

Figure 2.1 Differences between packet level analysis and deep packet inspection

DPI refers to examining both the packet header and the complete payload to look for predefined patterns or rules. A pattern or rule can be a particular TCP connection, defined by source and destination IP addresses and port numbers, or it can be a signature string. (2012) argues that many critical network services rely on the inspection of the packet payload, instead of looking only at the information in packet headers. Although DPI systems are essentially more accurate in identifying application protocols and application messages, they are also resource-intensive and may not scale well with growing link speeds. MBFS, MBPS and DPI evaluate the content of the application layer, and therefore need to recognize the content of the evaluated messages; encrypted messages can make this kind of evaluation infeasible.
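To make the reassembly step concrete, the sketch below (illustrative only; it ignores IP fragmentation and handles only exact-duplicate or prefix-overlapping retransmissions) rebuilds one direction of a TCP byte stream from out-of-order segments, which is the precondition for any message-based (MBFS/MBPS) inspection:

```python
def reassemble(segments):
    """Rebuild one direction of a TCP payload stream.

    segments: list of (relative_seq, payload) tuples for a single flow
    direction, possibly out of order and with retransmissions."""
    if not segments:
        return b""
    stream = bytearray()
    expected = min(seq for seq, _ in segments)
    for seq, payload in sorted(segments):
        if seq + len(payload) <= expected:
            continue                          # retransmitted data already copied
        stream += payload[expected - seq:]    # drop any overlapping prefix
        expected = seq + len(payload)
    return bytes(stream)
```

The per-flow buffers held by such a reassembler are exactly the extra memory cost that the taxonomy attributes to the message-based methods.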

2.1.2 JXTA

JXTA is a specification and set of protocols for peer-to-peer networking. It attempts to formulate standard peer-to-peer protocols, in order to provide an infrastructure for building peer-to-peer applications, with basic functionality for peer resource discovery, communication and organization. JXTA introduces an overlay on top of the existing physical network, with its own addressing and routing (Duigou, 2003; Halepovic and Deters, 2003).

According to the JXTA specification (Duigou, 2003), JXTA peers communicate through messages transmitted over pipes, an abstraction of virtual channels composed of input and output channels for peer-to-peer communication. Pipes are not bound to a physical location; each pipe has its own unique ID, so a peer keeps its pipes even when its physical network location changes. Pipes are asynchronous, unidirectional and unreliable, but bi-directional and reliable services are provided on top of them. JXTA uses source-based routing: each message carries its routing information as a sequence of peers, and peers along the path may update this information. The JXTA socket adds reliability and bi-directionality to JXTA communications through a layer of abstraction on top of the pipes (Antoniu et al., 2005), and provides an interface similar to the POSIX sockets specification. JXTA messages are XML documents composed of well-defined and ordered message elements.
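After TCP reassembly, locating JXTA traffic in a byte stream reduces to finding message boundaries. The sketch below is a simplified illustration (it assumes the `jxmg` binary message signature of the JXTA wire format, and does not parse namespaces or message elements):

```python
JXTA_MSG_SIG = b"jxmg"   # assumed binary message signature of the JXTA wire format

def count_jxta_messages(stream: bytes) -> int:
    """Rough count of JXTA messages in a reassembled TCP stream; a real
    parser would walk each message's element table instead of scanning."""
    return stream.count(JXTA_MSG_SIG)
```

A signature count like this is only a starting point: extracting per-message indicators requires decoding each element, which is the role of the AppParser component described in Chapter 3.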

Halepovic and Deters (2005) proposed a performance model, describing important metrics to evaluate JXTA throughput, scalability and services, and the behavior of JXTA across different versions. Halepovic et al. (2005) analysed JXTA performance to show the increasing cost and latency under higher workloads and concurrent requests, and suggested further evaluation of JXTA scalability with large peer groups in direct communication. Halepovic (2004) notes that network traffic analysis is a feasible approach for the performance evaluation of JXTA-based applications, but did not adopt it due to the lack of JXTA traffic characterization. Although there are performance models and evaluations of JXTA, there are no evaluations of its current versions, nor mechanisms to evaluate JXTA applications at runtime. Because JXTA is still used for building peer-to-peer systems, such as U-Store (Fonseca et al., 2012), which motivates our research, a solution is necessary to measure JXTA-based applications at runtime and provide information about their behavior and performance.

2.1.3 MapReduce

MapReduce (Dean and Ghemawat, 2008) is a programming model and a framework for processing large datasets through distributed computing, providing fault tolerance and high scalability for big data processing. The MapReduce model was designed for unstructured data processed by clusters of commodity hardware. Its functional style of Map and Reduce functions automatically parallelizes and executes large jobs in a cluster. MapReduce also handles failures, application deployment, task duplication and aggregation of results, thereby allowing programmers to focus on the core logic of their applications.

An application executed through MapReduce is called a job. The input data of a job, which is stored in a distributed file system, is split into even-sized blocks and replicated for fault tolerance. Figure 2.2 shows the input dataset splitting adopted by MapReduce.

Initially, the input dataset is split into blocks and stored in the adopted distributed file system. During the execution of a job, each split is assigned to a Mapper; thus the number of splits of the input determines the number of Map tasks of a MapReduce job. Each Mapper reads its split from the distributed file system and divides it into records, to be processed by the user-defined Map function. Each Map function generates intermediate data from the evaluated block, which will be fetched, ordered by key and processed by the Reducers to generate the output of the MapReduce job.
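The data flow just described can be sketched in a few lines of Python (a didactic, in-memory model of a MapReduce job, not Hadoop code):

```python
from collections import defaultdict

def run_job(splits, map_fn, reduce_fn):
    """Minimal in-memory sketch of the MapReduce data flow: one Map task
    per input split, shuffle/sort of intermediate data by key, then one
    Reduce invocation per key."""
    intermediate = defaultdict(list)
    for split in splits:                      # Map phase: one task per split
        for key, value in map_fn(split):
            intermediate[key].append(value)   # shuffle: group values by key
    return {k: reduce_fn(k, vs) for k, vs in sorted(intermediate.items())}

# Word count, the canonical MapReduce example:
splits = ["deep packet", "packet inspection"]
result = run_job(
    splits,
    lambda text: [(word, 1) for word in text.split()],
    lambda _key, values: sum(values),
)
# result == {"deep": 1, "inspection": 1, "packet": 2}
```

The same skeleton applies to traffic analysis once each split is a block of captured packets rather than a line of text.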

A MapReduce job is divided into Map and Reduce tasks, which are composed of the user-defined Map and Reduce functions. The execution of these tasks can be grouped into phases, representing the Map and Reduce phases, while Reduce tasks can be further divided into additional phases, the Shuffle and Sort phases. A job is submitted by a user to the master node, which selects worker nodes with idle slots and assigns them Map or Reduce tasks.

The execution of a Map task can be divided into two phases. In the first, the Map phase reads the task’s split from the distributed file system, parses it into records, and applies the user-defined Map function to each record. In the second, after the user-defined Map function has been applied to each input record, the commit phase registers the final output with the TaskTracker, which then informs the JobTracker that the task has finished executing. The output of the Map phase is consumed by the Reduce phase.

The execution of a Reduce task can be divided into three phases. The first, the Shuffle phase, fetches the Reduce task's input data, where each Reduce task is assigned a partition of the key space produced by the Map phase. The second, the Sort phase, groups records with the same key. The third, the Reduce phase, applies the user-defined Reduce function to each key and its values (Kavulya et al., 2010).

A Reduce task cannot fetch the output of a Map task until the Map has finished and committed its output to disk. Only after receiving its partition from all Map outputs does the Reduce task start the Sort phase; until then, the Reduce task remains in the Shuffle phase. After the Sort phase, the Reduce task enters the Reduce phase, in which it executes the user-defined Reduce function for each key and its values. Finally, the output of the Reduce function is written to a temporary location on the distributed file system (Condie et al., 2010).

MapReduce worker nodes are configured to concurrently execute up to a defined number of Map and Reduce tasks, according to their number of Map and Reduce slots. Each worker node of a MapReduce cluster is configured with a fixed number of Map slots and another fixed number of Reduce slots, which bound the number of Map or Reduce tasks that can be executed concurrently per node. During job execution, if all available slots are occupied, pending tasks must wait until some slots are freed. If the number of tasks in the job is larger than the number of available slots, Maps or Reduces are first scheduled to execute on all available slots; these tasks compose the first wave of tasks, which is followed by subsequent waves. If an input is broken into 200 blocks and there are 20 Map slots in a cluster, the job has 200 Map tasks, which are executed in 10 waves (Lee et al., 2012). The number and sizes of waves can aid the configuration of tasks for improved cluster utilization (Kavulya et al., 2010).
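The wave arithmetic above is a simple ceiling division, sketched here for clarity:

```python
import math

def waves(num_tasks: int, slots: int) -> int:
    """Number of execution waves when num_tasks are scheduled over a
    cluster offering `slots` concurrent task slots of the matching type."""
    return math.ceil(num_tasks / slots)

# The example from the text: 200 Map tasks over 20 Map slots -> 10 waves.
```

The same formula applies to Reduce waves, using the number of Reduce tasks and the total Reduce slots of the cluster.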

The Shuffle phase of the first Reduce wave may be significantly different from the Shuffle phases of the following Reduce waves. This happens because the Shuffle phase of the first Reduce wave overlaps with the entire Map phase, and hence its duration depends on the number of Map waves and their durations (Verma et al., 2012b).

Each Map task is independent of the other Map tasks, meaning that all Mappers can be executed in parallel on multiple machines. The number of concurrent Map tasks in a MapReduce system is limited by the number of slots and by the number of blocks into which the input data was divided. Reduce tasks can also be performed in parallel during the Reduce phase, and the number of Reduce tasks in a job is specified by the application, with its concurrency bounded by the number of Reduce slots per node.

MapReduce tries to achieve data locality in its job executions, which means that a Map task and the input data block it processes should be located as close to each other as possible, so that the Map task can read its input data block incurring as little network traffic as possible.

Hadoop is an open source implementation of MapReduce, which relies on HDFS for distributed data storage and replication. HDFS is an implementation of the Google File System (Ghemawat et al., 2003), which was designed to store large files and was adopted by the MapReduce system as the distributed file system for its files and intermediate data. The input data type and workload characteristics impact MapReduce performance, because each application has a different bottleneck resource and requires a specific configuration to achieve optimal resource utilization (Kambatla et al., 2009). Hadoop has a set of configuration parameters, whose default values are based on the typical configuration of machines in clusters and the requirements of a typical application, which usually processes text-like inputs, although optimal MapReduce resource utilization depends on the resource consumption profile of each application.


Because the input data type and workload characteristics of MapReduce jobs impact MapReduce performance, it is necessary to evaluate MapReduce behavior and performance for different purposes. Although much work has been done to understand and analyse MapReduce for different input data types and workloads (Lu et al., 2012; Groot, 2012), there is no evaluation that characterizes MapReduce behavior and identifies its optimal configuration for an application of packet level analysis and DPI.

2.2 Related Work

2.2.1 Distributed Debugging

Modern Internet services are often implemented as complex, large-scale distributed systems. Information about the behavior of complex distributed systems is necessary to evaluate and improve their performance, but understanding distributed system behavior requires observing related activities across many different components and machines (Sigelman et al., 2010).

The evaluation of distributed applications is a challenge, due to the cost of monitoring distributed systems and the lack of performance measurement of large scale distributed applications at runtime. To reproduce the behavior of a complex distributed system in a test environment, it is necessary to reproduce each relevant configuration parameter of the system (Gupta et al., 2011), which is a difficult effort, even more evident and complex in cases where faults only occur when the system is under high load (Loiseau et al., 2009).

Gupta et al. (2011) presented a methodology and framework for large scale tests, able to obtain resource configurations and scale close to a large scale system, through the use of an emulated scalable network, multiplexed virtual machines and resource dilation. Gupta et al. (2011) showed its accuracy, scalability and realism in network tests. However, it cannot achieve the same accuracy as the evaluation of a real system at runtime, nor can it diagnose, in a short time, a problem that occurred in a production environment.

According to Sambasivan et al. (2011), debugging tools are needed to help identify and understand the root causes of the diverse performance problems that can arise in distributed systems. A request flow can be seen as the path and timing of a request in a distributed system, representing the flow of individual requests within and across the components of a distributed system. There are many cases in which comparing request-flow traces is useful: it can help to diagnose performance changes resulting from modifications made during software development or from upgrades of a deployed system, and it can also help to diagnose behaviour changes resulting from component degradations, resource leakage or workload changes.

Sigelman et al. (2010) reported on Dapper, Google's large-scale production distributed system tracing framework, which states three concrete design goals: low overhead, application-level transparency and scalability. These goals were achieved by restricting Dapper's core tracing instrumentation to Google's ubiquitous threading, control flow and RPC library code. Dapper provides valuable insights for the evaluation of distributed systems through flows and procedure calls, but its implementation depends on instrumentation of the component responsible for message communication in the distributed system, which may not be available in a black box system.

Some techniques have been developed for the performance evaluation of distributed systems. Mi et al. (2012) proposed an approach, based on end-to-end request trace logs, to identify the primary causes of performance problems in cloud computing systems. Nagaraj et al. (2012) compared logs of distributed systems to diagnose performance problems, using machine learning techniques to analyse logs and to explore information on states and event times. Sambasivan et al. (2011) used request flows to find performance modifications in distributed systems, comparing request flows across periods and ranking them based on their impact on system performance. Although these approaches evaluate requests, flows and events of distributed systems, traffic analysis was not used as an approach to provide the desired information.

Aguilera et al. (2003) proposed an approach to isolate performance bottlenecks in distributed systems, based on message-level trace activity and algorithms for inferring the dominant paths of a distributed system. Although network traffic was considered as a source from which to extract the desired information, a distributed approach was not adopted for data processing.

Yu et al. (2011) presented SNAP, a scalable network-application profiler that evaluates the interactions between applications and the network. SNAP passively collects TCP statistics and socket logs, and correlates them with network resources to indicate problem locations. However, SNAP did not adopt application traffic evaluation, nor distributed computing to perform network traffic processing.

2.2.2 MapReduce for Network Traffic Analysis

Lee et al. (2010) proposed a network flow analysis method using MapReduce, where the network traffic was captured, converted to text and used as input to Map tasks. As a result, improvements in fault tolerance and computation time were shown, compared with flow-tools (www.splintered.net/sw/flow-tools/). The conversion time from binary network traces to text represents a relevant additional cost, which can be avoided by adopting binary data as the input for MapReduce jobs.

Lee et al. (2011) presented a Hadoop-based packet trace processing tool to process large amounts of binary network traffic. A new input type for Hadoop was developed, the PcapInputFormat, which encapsulates the complexity of processing captured binary PCAP traces and extracting the packets through the Libpcap library (Jacobson et al., 1994). Lee et al. (2011) compared their approach with CoralReef (http://www.caida.org/tools/measurement/coralreef), a network traffic analysis tool that also relies on Libpcap; the evaluation showed a speed-up in completion time for a case that processes packet traces of more than 100GB. This approach implemented a packet level evaluation, to extract indicators from IP, TCP and UDP, evaluating the job completion time achieved with different input sizes and two cluster configurations. The authors implemented their own component to save network traces into blocks, and the developed PcapInputFormat relies on a timestamp-based heuristic, using a sliding window, to find the first packet of each block. These implementations for iterating over the packets of a network trace can present a limitation in accuracy, compared with the accuracy obtained by Tcpdump (http://www.tcpdump.org/) and LibPCAP for the same functionalities.

The approach proposed by Lee et al. (2011) is not able to evaluate more than one packet per MapReduce iteration, because each block is divided into packets that are evaluated individually by the user-defined Map function. Therefore, a new MapReduce approach is necessary to perform DPI algorithms, which require reassembling more than one packet to mount an application message, in order to evaluate message contents, application states and application protocols.

2.3 Chapter Summary

In this chapter, we presented background information on network traffic analysis, JXTA and MapReduce, and we investigated previous studies related to the measurement of distributed applications and to the use of MapReduce for network traffic analysis.

According to the background and related work evaluated, the detection of error causes and the diagnosis and reproduction of errors in distributed systems are challenges that motivate efforts to develop less intrusive mechanisms for monitoring and debugging distributed applications at runtime. Network traffic analysis is one option for distributed systems measurement, although there are limitations in the capacity to process large amounts of network traffic in a short time, and in the scalability to process network traffic under varying resource demand.

Although MapReduce can be used for packet level analysis, an approach is necessary to use MapReduce for DPI, in order to evaluate distributed systems in a data center through network traffic analysis, using commodity hardware and cloud computing services, in a minimally intrusive way. Due to the lack of evaluations of MapReduce for traffic analysis and the peculiarity of this kind of data, it is necessary to evaluate the performance of MapReduce for packet-level analysis and DPI, characterizing the behavior of the MapReduce phases, its processing capacity, scalability and speed-up, under variations of the most important MapReduce configuration parameters.


3

Profiling Distributed Applications

Through Deep Packet Inspection

Life is really simple, but we insist on making it complicated. —CONFUCIUS

In this chapter, we first look at the problems in distributed application monitoring, in the processing capacity for network traffic, and in the restrictions on using MapReduce for profiling the network traffic of distributed applications.

Network traffic analysis can be used to extract performance indicators from communication protocols, flows, throughput and load distribution of a distributed system. In this context, network traffic analysis can enrich diagnoses and provide a mechanism for measuring distributed systems in a passive way, with low overhead and low dependency on developers.

However, there are limitations in the capacity to process large amounts of network traffic in a short time, and in the scalability of processing capacity under variations of throughput and resource demands. To address this problem, we present an approach for profiling application network traffic using MapReduce. Experiments show the effectiveness of our approach for profiling a JXTA-based distributed application through DPI, and the scalability of its completion time through node addition, in a cloud computing environment.

In Section 3.1 we begin this chapter by motivating the need for an approach using MapReduce for DPI; then we describe, in Section 3.2, the proposed architecture and the DPI algorithm to extract indicators from the network traffic of a JXTA-based distributed application. Section 3.3 presents the experimental setup used to evaluate our proposed approach. The obtained results are presented in Section 3.4 and discussed in Section 3.5. Finally, Section 3.6 concludes and summarizes this chapter.

3.1 Motivation

Modern Internet services and cloud computing infrastructures are commonly implemented as distributed systems, to provide services with high performance, scalability and reliability. Cloud computing SLAs require a short time to identify, diagnose and solve problems in the infrastructure, in order to avoid negative impacts on the provided quality of service.

Monitoring and performance analysis of distributed systems became more necessary with the growth of cloud computing and the use of distributed systems to provide services and infrastructure (Fox et al., 2009). In distributed systems development, maintenance and administration, the detection of error causes and the diagnosis and reproduction of errors are challenges that motivate efforts to develop less intrusive mechanisms for debugging and monitoring distributed applications at runtime (Armbrust et al., 2010). Distributed measurement systems (Massie et al., 2004) and log analyzers (Oliner et al., 2012) provide relevant information on some aspects of a distributed system. However, this information can be complemented by correlating it with information from network traffic analysis, making it more effective and broadening the information sources available to ubiquitously evaluate a distributed system.

Low overhead, transparency and scalability are common requirements for an efficient solution for the measurement of distributed systems. Many approaches have been proposed in this direction, using instrumentation or logging, which cause overhead and a dependency on developers. It is possible to diagnose and evaluate distributed applications' performance by evaluating information from communication protocols, flows, throughput and load distribution (Sambasivan et al., 2011; Mi et al., 2012). This information can be collected through network traffic analysis, enriching a diagnosis, and also providing an approach for the measurement of distributed systems in a passive way, with low overhead and low dependency on developers.

Network traffic analysis is one option to evaluate the performance of distributed systems (Yu et al., 2011), although there are limitations in the capacity to process a large number of network packets in a short time (Loiseau et al., 2009; Callado et al., 2009) and in the scalability to process network traffic under variations of throughput and resource demands.

To obtain information on the behaviour of distributed systems from network traffic, it is necessary to use DPI and to evaluate information from application states, which requires an additional effort in comparison with traditional DPI approaches, which usually do not evaluate application states.

Although much work has been done to improve DPI performance (Fernandes et al., 2009; Antonello et al., 2012), the evaluation of application states still decreases the processing capacity of DPI over large amounts of network traffic. With the growth of link speeds, Internet traffic exchange and the use of distributed systems to provide Internet services (Sigelman et al., 2010), new approaches are needed to deal with the analysis of the growing amount of network traffic, and to permit the efficient evaluation of distributed systems through network traffic analysis.

MapReduce (Dean and Ghemawat, 2008) has become an important programming model and distribution platform for processing large amounts of data, with diverse use cases in academia and industry (Zaharia et al., 2008; Guo et al., 2012). MapReduce can be used for packet level analysis: Lee et al. (2011) proposed an approach that evaluates each packet individually to obtain information from the network and transport layers, splitting network traces into packets, processing each one individually, and extracting indicators from IP, TCP and UDP.

However, for profiling distributed applications through network traffic analysis, it is necessary to analyse the content of more than one packet, up to the application layer, to evaluate application messages and their protocols. Due to TCP and message segmentation, the desired application message may be split across several packets. Therefore, it is necessary to evaluate more than one packet per MapReduce iteration to perform deep packet inspection, in order to be able to reassemble packets into application messages and retrieve information from application sessions, states and protocols.

DPI refers to examining both the packet header and the complete payload to look for predefined patterns or rules, which can be a signature string or an application message. According to the taxonomy presented by Risso et al. (2008), deep packet inspection can be classified as message based per flow state (MBFS), which analyses application messages and their flows, or as message based per protocol state (MBPS), which analyses application messages and their application protocol states; these are the classes of analysis needed to evaluate distributed applications through network traffic and extract application indicators.

MapReduce is a restricted programming model that parallelizes user functions automatically and provides transparent fault-tolerance (Dean and Ghemawat, 2008), based on combinators from functional languages. MapReduce does not efficiently express incremental, dependent or recursive data processing (Bhatotia et al., 2011; Lin, 2012), because it adopts batch processing and functions executed independently, without shared state.

Although restrictive, MapReduce provides a good fit for many problems of processing large datasets. Moreover, its expressiveness limitations may be reduced by decomposing a problem into multiple MapReduce iterations, or by combining MapReduce with other programming models for subproblems (Lämmel, 2007; Lin, 2012), although this approach may not be optimal in some cases. DPI algorithms require the evaluation of one or more packets to retrieve information from application messages; this represents a data dependence when mounting an application message, and a restriction on the use of MapReduce for DPI.

Because the Lee et al. (2011) approach processes each packet individually, it cannot be efficiently used to evaluate more than one packet and reassemble an application message from a network trace, which makes necessary a new approach for using MapReduce to perform DPI and evaluate application messages.

To be able to process large amounts of network traffic using commodity hardware, in order to evaluate the behaviour of distributed systems at runtime, and because there is no evaluation of the effectiveness and processing capacity of MapReduce for DPI, we developed a MapReduce-based approach to deeply inspect distributed application traffic and evaluate the behaviour of distributed systems, using Hadoop, an open source implementation of MapReduce.

In this chapter we evaluate the effectiveness of MapReduce for a DPI algorithm, and the scalability of its completion time through node addition, to measure a JXTA-based application, using virtual machines of Amazon EC2, a cloud computing provider. The main contributions of this chapter are:

1. To provide an approach to implement DPI algorithms using MapReduce;

2. To show the effectiveness of MapReduce for DPI;

3. To show the completion time scalability of MapReduce for DPI, using virtual machines of cloud computing providers.


3.2 Architecture

In this section we present the architecture of the proposed approach to capture and process network traffic of distributed applications.

To monitor distributed applications through network traffic analysis, specific points of a data center must be monitored to capture the desired application network traffic, and an approach is needed to process a large amount of network traffic in an acceptable time. According to Sigelman et al. (2010), fresh information enables a faster reaction to production problems, and thus the information must be obtained as soon as possible, although a trace analysis system operating on hours-old data is still valuable for monitoring distributed applications in a data center (Sigelman et al., 2010).

distributed applications in a data center (Sigelman et al.,2010).

In this direction, we propose a pipelined process to capture network traffic, store locally, transfer to a distributed file system, and evaluate the network trace to extract application indicators. We use MapReduce, implemented by Apache Hadoop, to pro-cess application network traffic, extract application indicators, and provide an efficient and scalable solution for DPI and profiling application network traffic in a production environment, using commodity hardware.

The architecture for network traffic capturing and processing is composed of four

main components: the SnifferServer (Shown in Figure3.1), that captures, splits and stores

network packets into the HDFS for batch processing through Hadoop; the Manager, that orchestrates the collected data, the job executions and stores the results generated; the AppParser, that converts network packets into application messages; and the AppAnalyzer, that implements Map and Reduce functions to extract the desired indicators.

Figure 3.1 shows the architecture of the SnifferServer and its placement at monitoring points of a datacenter. The SnifferServer captures network traffic from specific points and stores it in HDFS for batch processing through Hadoop. The Sniffer executes user-defined monitoring plans, guided by a specification of places, time, traffic filters and the amount of data to be captured. According to a user-defined monitoring plan, the Sniffer starts the capture of the desired network traffic through Tcpdump, which saves the network traffic in binary files, known as PCAP files. The collected traffic is split into files of a predefined size, saved in the local SnifferServer file system, and transferred to HDFS only when each file has been completely written to the local file system of the SnifferServer. The SnifferServer must be connected to the network where the monitored target nodes are connected, and must be able to establish communication with the other nodes that compose the HDFS cluster.

During the execution of a monitoring plan, the network traffic must first be captured, split into even-sized files and stored in HDFS. Through Tcpdump, a widely used LibPCAP network traffic capture tool, the packets are captured and split into PCAP files of 64MB, which is the default block size of HDFS, although this block size may be configured to different values.

HDFS is optimized to store large files, but internally each file is split into blocks of a predefined size. Files larger than the HDFS block size must be split into blocks of size equal to or smaller than the adopted block size, and are spread among the machines of the cluster.

Because LibPCAP, used by Tcpdump, stores network packets in binary PCAP files, HDFS has no native way to split a PCAP file at packet boundaries, and providing such a splitting algorithm is complex. There are thus two options: splitting can be avoided altogether by capturing into files no larger than the HDFS block size, or Hadoop can be supplied with an algorithm that splits PCAP files into packets, so that arbitrarily large PCAP files can be stored into the HDFS.

We adopted the first approach, saving the network trace into PCAP files sized to the adopted HDFS block size through the file rotation functionality provided by Tcpdump, because splitting PCAP files into packets demands additional computing time and increases the complexity of the system. Thus, the network traffic is captured by Tcpdump, split into even-sized PCAP files, stored in the local file system of the SnifferServer, and periodically transferred to the HDFS, which is responsible for replicating the files across the cluster.
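The reason a PCAP file cannot be split at an arbitrary byte offset is visible in the libpcap on-disk layout: a 24-byte global header followed by, for each packet, a 16-byte record header carrying the captured length and then the raw packet bytes. Record boundaries are only discoverable by walking the file sequentially from the start. A minimal sketch of such a walker (illustrative, not the AppParser code):

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # little-endian libpcap magic number

def count_packets(buf):
    """Walk a PCAP byte buffer record by record and count packets."""
    magic, = struct.unpack_from("<I", buf, 0)
    assert magic == PCAP_MAGIC, "not a little-endian PCAP file"
    off, n = 24, 0                       # skip the 24-byte global header
    while off + 16 <= len(buf):
        # record header: ts_sec, ts_usec, incl_len (captured bytes), orig_len
        _, _, incl_len, _ = struct.unpack_from("<IIII", buf, off)
        off += 16 + incl_len             # jump over header + packet payload
        n += 1
    return n

# Build a tiny two-packet capture in memory to exercise the walker.
hdr = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)  # global header
pkt = lambda p: struct.pack("<IIII", 0, 0, len(p), len(p)) + p   # one record
trace = hdr + pkt(b"\x01" * 60) + pkt(b"\x02" * 40)
print(count_packets(trace))  # -> 2
```

Cutting `trace` at any offset other than a record boundary would leave the second half unparseable, which is exactly why block-sized capture files are simpler than teaching HDFS to split PCAP data.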

In the MapReduce framework, the input data is split into blocks, which are in turn split into small pieces, called records, to be used as input for each Map function. We adopt entire blocks, with size defined by the HDFS block size, as input for each Map function, instead of blocks divided into records. With this approach, it is possible to evaluate more than one packet per MapReduce task and to reassemble an application message from the network traffic. It also gives each Map function more processing work per task than the approach in which each Map function receives only one packet as input.

Differently from the approach presented by Lee et al. (2011), which only permits evaluating one packet at a time per Map function, our approach makes it possible to evaluate many packets from a PCAP file per Map function and to reassemble application messages whose content was divided into many packets for transfer over TCP.
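The reassembly step described above can be sketched as follows. This is an illustrative stand-in for the AppParser logic, not its actual code: segments are grouped per flow, ordered by TCP sequence number, and messages are then cut from the resulting byte stream using a hypothetical 4-byte length prefix (real protocols would use their own framing).

```python
import struct
from collections import defaultdict

def reassemble(segments):
    """segments: list of (flow_id, tcp_seq, payload_bytes) tuples."""
    streams = defaultdict(list)
    for flow, seq, payload in segments:
        streams[flow].append((seq, payload))     # group segments per flow
    messages = []
    for flow, parts in streams.items():
        data = b"".join(p for _, p in sorted(parts))  # order by sequence number
        off = 0
        while off + 4 <= len(data):
            (length,) = struct.unpack_from(">I", data, off)
            if off + 4 + length > len(data):
                break                            # message still incomplete
            messages.append((flow, data[off + 4 : off + 4 + length]))
            off += 4 + length
    return messages

# One 6-byte message split across two out-of-order segments of the same flow.
segs = [("A", 107, b"lo!"), ("A", 100, b"\x00\x00\x00\x06hel")]
print(reassemble(segs))  # -> [('A', b'hello!')]
```

Per-packet processing, as in Lee et al. (2011), could never recover `b"hello!"` here, because no single packet contains the whole message.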

Figure 3.2 shows the architecture for processing distributed application traffic through Map and Reduce functions, implemented by the AppAnalyzer, which is deployed on the Hadoop nodes and managed by the Manager, with the generated results stored in a distributed database.

Figure 3.2 Architecture for network traffic analysis using MapReduce

The communication between components was characterized as blocking or non-blocking: blocking communication was adopted in cases that require high consistency, and non-blocking communication was adopted in cases where eventual consistency is acceptable, in order to obtain better response time and scalability.


The AppAnalyzer is composed of Mappers and Reducers for specific application protocols and indicators. It extends the AppParser, which provides protocol parsers that transform network traffic into programmable objects, offering a high-level abstraction for handling application messages extracted from network traffic.

The Manager provides functionalities for users to create monitoring plans, with specification of the places, time and amount of data to be captured. The amount of data to be processed and the number of Hadoop nodes available for processing are important factors for obtaining an optimal completion time of MapReduce jobs and for generating fresh information, enabling faster reaction to production problems of the monitored distributed system. Thus, after the network traffic is captured and the PCAP files are stored into the HDFS, the Manager permits the selection of the number of files to be processed, and then schedules a MapReduce job for this processing. After each MapReduce job execution, the Manager is also responsible for storing the generated results into a distributed database.
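The trade-off between the number of files selected and the job completion time can be reasoned about with a rough back-of-envelope model. The model and its constants below are an assumption for illustration, not a formula or measurement from this work: with one map task per block-sized PCAP file and a fixed number of concurrent task slots, the map tasks run in "waves".

```python
import math

# Hypothetical cost model for choosing how many PCAP files to include in one
# MapReduce job: one map task per 64 MB file, `slots` tasks run in parallel.
# secs_per_block and overhead are illustrative constants, not measurements.
def estimate_job_time(files, slots, secs_per_block=30.0, overhead=10.0):
    waves = math.ceil(files / slots)     # number of sequential waves of map tasks
    return waves * secs_per_block + overhead

print(estimate_job_time(files=20, slots=8))  # 3 waves -> 100.0 seconds
print(estimate_job_time(files=8, slots=8))   # 1 wave  -> 40.0 seconds
```

Under such a model, selecting a file count that is a multiple of the available slots avoids a mostly idle final wave, which is one way a Manager-like component could size jobs for fresher results.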

We adopted a distributed database with eventual consistency and high availability, based on Amazon's Dynamo (DeCandia et al., 2007) and implemented by Apache Cassandra, to store the indicator results generated by the AppAnalyzer. With eventual consistency, we expect fast write and read operations, reducing the blocking time of these operations.
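One natural layout for such a store is one row per (indicator, job) pair with timestamp-ordered columns, which keeps both writes and time-range reads cheap. The sketch below models that layout with plain Python containers; the class and key names are assumptions for illustration, not the actual Cassandra schema used by the system.

```python
from collections import defaultdict

class ResultStore:
    """In-memory stand-in for a Dynamo-style wide-row result store."""
    def __init__(self):
        self.rows = defaultdict(dict)            # row key -> {timestamp: value}

    def write(self, indicator, job_id, timestamp, value):
        # Writes touch a single row and never block on other rows.
        self.rows[(indicator, job_id)][timestamp] = value

    def read_series(self, indicator, job_id):
        # Columns come back ordered by timestamp, ready for plotting.
        return sorted(self.rows[(indicator, job_id)].items())

store = ResultStore()
store.write("latency_ms", "job-1", 1002, 15.0)
store.write("latency_ms", "job-1", 1001, 12.0)
print(store.read_series("latency_ms", "job-1"))  # -> [(1001, 12.0), (1002, 15.0)]
```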

The AppAnalyzer provides Map and Reduce functions to be used for evaluating specific protocols and desired indicators. Each Map function receives as input the path of a PCAP file stored in the HDFS; this path is defined by the data locality control of Hadoop, which tries to delegate each task to nodes that hold a local replica of the data or that are near a replica. The file is then opened and each network packet is processed, in order to reassemble messages and flows and to extract the desired indicators.
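The end-to-end Map/Reduce flow just described can be simulated locally, without Hadoop, as follows. This is a Python sketch of the pattern only (the actual AppAnalyzer is Java code running on Hadoop): each "map" call receives a whole file's worth of already-parsed application messages, mirroring the whole-block input, and emits timestamp-ordered pairs in the spirit of SortedMapWritable; the "reduce" then summarizes one indicator across files.

```python
def map_file(messages):
    """messages: list of (timestamp, indicator_value) pairs from one PCAP file.
    Returns a timestamp-ordered mapping, analogous to a SortedMapWritable."""
    return dict(sorted(messages))

def reduce_indicator(partials):
    """Merge the per-file partial results and summarize the indicator."""
    merged = {}
    for part in partials:
        merged.update(part)
    values = [merged[t] for t in sorted(merged)]
    return {"count": len(values), "avg": sum(values) / len(values)}

# Two "files" of parsed messages: (timestamp, response time in ms).
file1 = [(1003, 30.0), (1001, 10.0)]
file2 = [(1002, 20.0)]
print(reduce_indicator([map_file(file1), map_file(file2)]))
# -> {'count': 3, 'avg': 20.0}
```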

During the data processing, the indicators are extracted from application messages and saved in a SortedMapWritable object, ordered by timestamp. SortedMapWritable is a sorted collection of values which is used by the Reduce functions to summarize each evaluated indicator. In our approach, each evaluated indicator is extracted and saved into an individual Hadoop result file, which is stored into the HDFS. MapReduce usually splits blocks into records to be used as input for Map functions, but we adopt whole files as input for Map tasks, to be able to perform DPI and reassemble application messages whose content was divided among several TCP packets, due to TCP segmentation or to an implementation decision of the evaluated application. If an application message is less than the maximum segment size (MSS), one TCP packet
