Monitoring the coupling evolution of microservice-based architectures : a method for detecting indications of architectural erosion = Monitoramento da evolução do acoplamento de arquiteturas baseadas em microsserviços: um método para detectar indícios de


INSTITUTO DE COMPUTAÇÃO

Daniel Rodrigo de Freitas Apolinário

Monitoring the coupling evolution of microservice-based architectures: a method for detecting indications of architectural erosion

Monitoramento da evolução do acoplamento de arquiteturas baseadas em microsserviços: um método para detectar indícios de erosão arquitetural

CAMPINAS

2020


Dissertation presented to the Institute of Computing of the University of Campinas in partial fulfillment of the requirements for the degree of Master in Computer Science.

Supervisor: Prof. Dr. Breno Bernard Nicolau de França

This copy corresponds to the final version of the Dissertation defended by Daniel Rodrigo de Freitas Apolinário and supervised by Prof. Dr. Breno Bernard Nicolau de França.


Biblioteca do Instituto de Matemática, Estatística e Computação Científica Ana Regina Machado - CRB 8/5467

Apolinário, Daniel Rodrigo de Freitas,

Ap43m Monitoring the coupling evolution of microservice-based architectures : a method for detecting indications of architectural erosion / Daniel Rodrigo de Freitas Apolinário. – Campinas, SP : [s.n.], 2020.

Supervisor: Breno Bernard Nicolau de França.

Dissertation (master's) – Universidade Estadual de Campinas, Instituto de Computação.

1. Software architecture. 2. Software engineering. 3. Software - Maintenance. 4. Coupling metrics. 5. Software evolution. I. França, Breno Bernard Nicolau de, 1983-. II. Universidade Estadual de Campinas. Instituto de Computação. III. Title.

Information for the Digital Library

Title in another language: Monitoramento da evolução do acoplamento de arquiteturas baseadas em microsserviços : um método para detectar indícios de erosão arquitetural

Keywords: Software architecture; Software engineering; Software - Maintenance; Coupling metrics; Software evolution

Area of concentration: Computer Science

Degree: Master in Computer Science

Examination committee: Breno Bernard Nicolau de França [Supervisor]; Paulo Sérgio Medeiros dos Santos; Luiz Eduardo Buzato

Defense date: 30-11-2020

Graduate program: Computer Science

Author's academic identification and information - ORCID: https://orcid.org/0000-0002-7636-536X - Lattes CV: http://lattes.cnpq.br/5738191777522114


Examination Committee:

• Prof. Dr. Breno Bernard Nicolau de França IC/UNICAMP

• Prof. Dr. Paulo Sérgio Medeiros dos Santos EIA/UNIRIO

• Prof. Dr. Luiz Eduardo Buzato IC/UNICAMP

The minutes of the defense, signed by the members of the Examination Committee, are on file in SIGA/Sistema de Fluxo de Dissertação/Tese and at the Program Office of the Unit.


During this Master’s degree, the world was plagued by the covid-19 pandemic. The practice of social distancing for long months made me more certain that we depend a lot on each other, much more than I imagined. Reflecting on this makes me even more grateful to all those who will be mentioned here. My greatest gratitude is to my God, who has always supported me in everything. Without Him, I am nothing. Everything my wife Priscila does for me cannot be described here, so my gratitude is immense to this woman whom I love with all my strength. It was not easy to do all this with two children at home, but the joy they provide was an essential fuel. Thank you, Alice and Davi, for your patience with my frequent absences. I will always be grateful to my parents, because without their love and effort I would not be able to do anything alone. To my advisor, for the dedication in helping me with everything (and look, I needed a lot of help) and for the huge learning. My gratitude is also great to Embrapa, which gave me this great opportunity to pursue the Master’s degree, and to all the coworkers who supported me and always encouraged me. In particular, I am grateful to Stanley, who was my Academic Advisor and has done a lot for me since the moment I became interested in Embrapa’s graduate program. My personal wish was to write a lot of names here that come to mind, but I will stop here because, in the end, I would always forget someone important. Therefore, please receive my gratitude if you prayed, cheered me up, instructed me, guided me, or listened to me.

"But may all who seek you rejoice and be glad in you; may those who long for your saving help always say, 'The LORD is great!'"




Delivering software faster and with higher frequency has been imperative in a progressively digital world with increasingly demanding consumers. The industry has been increasingly adopting the microservice architecture due to claims that this architectural style satisfies ongoing software development demands, such as resilience, flexibility, and velocity. However, developing applications based on microservices also brings some drawbacks, such as increased operational complexity. Recent studies have also pointed out the lack of methods to prevent problems related to the maintainability of solutions based on this architectural style. Architectural issues are common causes of poor software maintainability. Disregarding established design principles during software evolution may lead to so-called architectural erosion, which can make maintenance unfeasible. Monitoring the internal quality of software is crucial in preventing architectural degradation. However, there are few initiatives to monitor the evolution of microservice-based architectures. This work aims to develop a method to monitor the evolution of microservice-based architectures and identify upward trends in coupling between services, allowing software architects to make maintenance decisions as early as possible. For that, we defined a coupling metrics suite based on service-oriented metrics from the literature. The SYMBIOTE method captures dependencies between the microservices of an application at run time (staging or production environment). From this information, it builds a dependency graph, in which nodes represent services and edges represent direct dependencies. Coupling metrics are calculated from this graph and monitored over time to identify significant upward trends that may be signs of architectural degradation.
We present the results of an experiment with artificially generated data that revealed the behavior of the metrics in different scenarios and contributed to the development of a metrics analysis method for identifying indications of architectural degradation. In the experiment, we observed that three out of four metrics showed a significant correlation with intentional changes in the architecture. SYMBIOTE uses four metrics in combination to indicate the occurrence of an architectural problem. We evaluated the SYMBIOTE method on a real-case application called Spinnaker, available on GitHub. The evaluation provided evidence on the feasibility of executing the method in microservice-based systems built with technologies commonly used today (such as Kubernetes and Docker) without the need for code instrumentation. The results obtained in the Spinnaker evaluation show the relationship between architectural changes and the upward trend in the values of the coupling metrics in most of the analyzed release intervals. Therefore, the first version of SYMBIOTE has shown potential to detect signs of architectural degradation during the evolution of microservice-based architectures.


4.1 Generated Barabasi-Albert Graph of small size
4.2 Generated Barabasi-Albert Graph of medium size
4.3 Evolution of one small size graph in the improvement scenario
4.4 Evolution of one small size graph in the degradation scenario
4.5 Mean trends for the SID metric
4.6 Mean trends for the SDD metric
4.7 Mean trends for the ADCS metric
4.8 Mean trends for the SCF metric
4.9 Trend Analysis of each experimental unit
5.1 Overview of the SYMBIOTE method
5.2 Scenario for collecting service metrics
5.3 Example of a Generated Dependency Graph
6.1 Spinnaker components
6.2 Spinnaker deployment environment overview
6.3 Graphical representation of a dependency graph in Kiali
6.4 Dependencies of release 1.22 emphasizing the difference (dashed blue edges) to the architecture specification
6.5 Coupling Metrics Evolution for Spinnaker’s 17-releases interval
6.6 SID Metric Evolution for Spinnaker
6.7 Trend analysis of the SID metric for all releases
6.8 Differences between releases 1.13 and 1.16
6.9 SDD Metric Evolution for Spinnaker in release intervals
6.10 Trend analysis of the SDD metric for all releases
6.11 ADCS Metric Evolution for Spinnaker
6.12 Trend analysis of the ADCS metric for all releases
6.13 SCF Metric Evolution for Spinnaker


4.1 Experimental Design
4.3 Contingency Tables for Medium Graphs Scenario
4.4 Experiment Results per Metric
5.1 ADS and AIS coupling metrics per microservices
5.2 Coupling metrics for the sample application
6.1 Spinnaker releases used
6.2 Spinnaker coupling metrics
6.3 Spinnaker coupling metrics analysis for release intervals


1 Introduction
1.1 Motivation
1.2 Problem
1.3 Research Goal and Questions
1.4 Research Method
1.5 Contribution
1.6 Organization

2 Theoretical Foundations
2.1 Continuous Delivery/Deployment
2.2 Microservices
2.3 Continuous Software Evolution
2.4 Architectural Degradation
2.5 Coupling Metrics

3 Related Work

4 Establishing a Metric Suite
4.1 Metric Suite
4.2 Goal and Hypotheses
4.3.1 Graph Structure
4.3.2 Application Size
4.3.3 Microservices-related Design Patterns
4.3.4 Architecture Smells
4.3.5 Evolution Scenarios
4.3.6 Number of Replications
4.4 Experimental Procedure
4.5 Experimental Results
4.5.1 General Behavior of Metrics
4.5.2 Trend Analysis and Hypothesis Testing
4.6 Threats to Validity

5 The SYMBIOTE Method
5.1 Approach for Collecting Metrics
5.2 Step 1: Distributed Tracing
5.3 Step 2: Application Tests Execution
5.4 Step 3: Dependency Graphs generation

6 Real Case Evaluation
6.1 Selecting a real case
6.2 Spinnaker
6.2.1 Architecture
6.2.2 Environment
6.2.3 Configuration
6.2.4 Integration Tests
6.2.5 Releases
6.3 Collected Metrics
6.4 Results
6.4.1 Overview
6.4.2 SID Evolution
6.4.3 SDD Evolution
6.4.4 ADCS Evolution
6.4.5 SCF Evolution
6.5 Discussion
6.6 Limitations

7 Conclusion
7.1 Main Results and Contribution
7.2 Method Limitations and Applicability
7.3 Future Work

Bibliography

A Orca Service source code: dependency to Kayenta


Chapter 1 Introduction

1.1 Motivation

In the age of digital transformation, software has been a facilitator for companies needing to adapt quickly to constant changes in the market [40]. In this context, Continuous Delivery/Deployment (CD)¹ practices have been adopted to allow quality software to be delivered more often, since their goal is to keep software always in a releasable state [39]. This practice follows the principles of the Agile Manifesto [1], enabling early and continuous delivery of valuable software to customers.

The microservices architectural style is claimed to fit CD well and to be an important factor for adopting this practice [72]. Its main characteristic is the breaking down of software into “small” and independently deployable services, which communicate through lightweight mechanisms [49]. Scaling agile processes and CD are among the primary motivations for the use of microservices [80]. This architecture is relatively new and, therefore, has gained attention both in industry and academia.

Successful cases of microservices adoption in large companies like Netflix [23] have been driving its use in software development by the industry. In academia, research on microservices is still in its early stages [27] [35]. The microservices architecture is believed to satisfy well-known design principles established in Software Engineering, such as low coupling and high cohesion. However, real conditions inherent to the software development process may influence the fulfillment of the premises of any architectural style. Therefore, we believe it is relevant to study approaches for monitoring the evolution of microservices architectures to keep them maintainable.

1.2 Problem

Companies have adopted the microservices architectural style in software development in pursuit of benefits such as strong module boundaries, independent deployment, and technology diversity [33].

Despite the expected benefits of adopting microservices along with CD in making the development and deployment of new changes more efficient, a microservice architecture is typically composed of a higher number of components when compared to traditional monolithic or bold services (components) architectures. This poses new challenges for system evolution because of their dynamism and complexity [71]. Bogner et al. [16] mention that some features of microservices, such as technological heterogeneity and decentralized control, may harm the maintainability of an application when not handled properly. Throughout its evolution, the system becomes more complex, hampering the understanding of its codebase and operational environment, which can lead developers to introduce modifications that damage the architectural integrity [26]. Over time, the accumulation of architectural issues can cause architectural degradation² [26].

¹ We recognize these terms have different meanings. However, in this work, we refer to both as CD.

Ideally, microservice-based applications (MSAs) should prevent architectural degradation based on their characteristics, such as the physical boundaries between services that inhibit development teams from making convenient changes that cause architecture violations. Other features of MSAs, such as flexibility, scalability, fewer dependency problems, and separation of concerns, also promote maintainability [16]. Those characteristics explain the claim that MSAs are more resilient to architectural degradation [24]. However, as the number of microservices increases, the interactions between independent services tend to increase significantly, causing difficulties in monitoring the system as a whole [27], and its evolution as well.

Due to the complexity involved in distributed systems, maintaining an overview of these applications is challenging [30]. As a result, monitoring the whole running (production) environment has become critical for development and operations teams. Existing monitoring technologies such as tracing, logs, operational metrics, and alerts [40] are useful for development teams, but there is a shortage of specific methods and tools to monitor architecture problems that can impact the maintainability of MSAs. Maintainability problems lead to increased maintenance cost and time, as well as shortening the life of the software.

Several studies focused on object-oriented (OO) systems have shown the relationship between coupling and maintainability [50] [15] [28] [8] [9]. The work of Perepletchikov and Ryan [62] indicates that the same relationship also occurs for MSAs. Therefore, it is reasonable to assume that increasing coupling between services may be a symptom of architectural degradation.

1.3 Research Goal and Questions

This research aims to monitor the coupling evolution between the services of an MSA to capture evidence of architectural degradation. Specifically, we seek to identify microservices coupling issues throughout the software evolution, answering the research question: “Is the continuous monitoring of coupling metrics from microservice-based applications able to indicate architectural degradation?”. To answer this, we understand it is crucial to distinguish regular increases in coupling from harmful degradation. Therefore, our focus is to investigate variations in the coupling over time, as occasional increases in coupling can happen even when adopting certain MSA design patterns.

² In this text, we use architectural degradation as a broader term to refer to different phenomena describing the loss of internal quality w.r.t. software architecture and design. Some examples include architectural erosion, debt, decay, and software aging.

To achieve this main goal, we concentrated on the objectives:

• Define a coupling metrics suite for (micro)service-oriented architectures. The metrics are selected from the literature.

• Establish an automated strategy for collecting the selected metrics from MSAs based on their concrete (deployed) architecture.

• Evaluate the behavior of the selected metrics over time, considering the structural characteristics of MSAs.

• Create a method to continuously analyze the evolution of coupling metrics between services. This includes developing an analysis method for coupling metrics evolution whose output indicates whether there are signs of architectural degradation.

As a result, we developed the SYMBIOTE method to monitor the architectural evolution of MSAs. It uses service coupling metrics to warn developers and architects when successive changes can negatively affect the system maintainability. The method presented in this work should support software engineers in decision-making targeting improvements to solve architectural problems and prevent the architectural degradation of MSAs.
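To illustrate what a service coupling measurement can look like, here is a minimal sketch that assumes the application is described by a list of direct dependencies between services. It counts, for each service, how many services it depends on and how many depend on it. The function name, the service names, and these two simple counts are illustrative assumptions; the metric suite actually used by SYMBIOTE is the one defined in Chapter 4.

```python
from collections import defaultdict


def degree_counts(edges):
    """Count direct dependencies per service.

    edges: iterable of (caller, callee) pairs, one per direct dependency.
    Returns {service: (outgoing, incoming)}, where `outgoing` is the number
    of distinct services the service depends on and `incoming` is the number
    of distinct services that depend on it.
    """
    outgoing = defaultdict(set)
    incoming = defaultdict(set)
    for caller, callee in edges:
        outgoing[caller].add(callee)
        incoming[callee].add(caller)
    services = set(outgoing) | set(incoming)
    return {s: (len(outgoing[s]), len(incoming[s])) for s in sorted(services)}


# Hypothetical four-service application (names invented for this example).
example_edges = [
    ("gateway", "orders"),
    ("gateway", "catalog"),
    ("orders", "catalog"),
    ("orders", "billing"),
]
counts = degree_counts(example_edges)
```

Recomputing such counts for each release gives one value per service per release, which is the kind of series a monitoring method can then follow over time.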

1.4 Research Method

In order to achieve the research goal, we defined a set of activities to be performed:

Literature review: first, we performed an ad-hoc literature review on software maintenance metrics for service and microservice-based architectures. As a result, we identified two recent studies compiling metrics for services and microservices [64] [18]. The latter is a comprehensive literature review on maintainability metrics for service and microservice-based architectures.

Selecting maintainability metrics: Bogner et al. [18] present a literature review on maintainability metrics for service-oriented and microservice-based systems, but the identified metrics lack empirical validation. From this review, we selected a subset of coupling metrics (see Section 2.5) to be used in the proposed method.

Defining a strategy for metrics collection: after understanding which software information (service dependencies) we needed to calculate the selected coupling metrics, we needed to decide whether to use static or dynamic analysis to gather the dependencies between services. Next, we defined the collection strategy and searched for existing tools to support it. Finally, we performed a proof-of-concept to verify the feasibility of the selected strategy.
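As a rough illustration of the dynamic-analysis option, the sketch below derives service dependencies from calls observed at run time. The record layout (`source` and `target` keys), the service names, and the choice to ignore self-calls are assumptions made for this example; they do not describe the tooling selected in this work.

```python
def edges_from_calls(call_records):
    """Reduce observed calls to a sorted list of unique direct dependencies.

    call_records: iterable of dicts with hypothetical keys "source" and
    "target", one dict per observed request between services.
    """
    edges = set()
    for record in call_records:
        source, target = record["source"], record["target"]
        # Assumption: a service calling itself is not a coupling dependency.
        if source != target:
            edges.add((source, target))
    return sorted(edges)


# Hypothetical observations: repeated calls collapse into a single edge.
observed = [
    {"source": "gateway", "target": "orders"},
    {"source": "gateway", "target": "orders"},
    {"source": "orders", "target": "billing"},
    {"source": "orders", "target": "orders"},
]
dependencies = edges_from_calls(observed)
```

A dependency found this way exists only if the corresponding call actually happened during observation, which is why the workload exercised against the running system matters for dynamic analysis.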

Developing the method for monitoring the coupling evolution of MSAs: we defined the steps of the method needed to obtain the desired result. Then, we designed an experiment (details in Section 4.2) that simulates the evolution of MSAs (represented as dependency graphs). The main goal of this experiment was to define a metrics analysis procedure to be incorporated into the method.
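One plausible form for such an analysis procedure is a monotonic-trend check over the series of metric values collected across releases. The sketch below computes the Mann-Kendall S statistic, a standard nonparametric indicator of trend direction. Choosing this statistic, and the sample values, are assumptions for illustration and are not taken from the experiment.

```python
def mann_kendall_s(values):
    """Return the Mann-Kendall S statistic for a sequence of numbers.

    S adds +1 for every pair (i, j) with i < j and values[j] > values[i],
    adds -1 when values[j] < values[i], and adds 0 for ties.
    """
    s = 0
    n = len(values)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    return s


# Hypothetical metric values measured on six consecutive releases.
metric_by_release = [3, 3, 4, 4, 5, 6]
s_statistic = mann_kendall_s(metric_by_release)
```

A positive S means later values tend to exceed earlier ones, and a negative S means the opposite. Deciding whether a trend is statistically significant would additionally require a hypothesis test on S, which is omitted here.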

Evaluating the method in a real case: we selected a real case of an application based on microservices. Then, we applied the method and evaluated the results obtained. The evaluation aims to verify the method’s effectiveness as well as to identify improvement opportunities.

1.5 Contribution

The microservices architectural style has been adopted as a relatively new practice in the software industry, so studies in this area are still incipient and lack empirical validation. Therefore, more generally, this work contributes by generating knowledge from existing works, such as metrics for service-oriented and microservice-based systems, and by serving as a starting point for the study of architectural degradation in the field of microservices.

More specifically, the main contributions of this work are: empirically validating coupling metrics proposed for (micro)service-based systems; proposing two new coupling metrics for (micro)service-based architectures; providing an approach to automatically collect coupling metrics for microservice-based applications; developing the SYMBIOTE method to support software engineers in monitoring coupling metrics and capturing indications of architectural degradation; and evaluating this method in a real application.

1.6 Organization

The remainder of this work is organized as follows. Chapter 2 describes the theoretical foundation related to the main Software Engineering topics involved in this research. In Chapter 3, we present the related work. Chapter 4 describes the metrics suite defined and reports an experiment whose results contributed to the development of the proposed method. In Chapter 5, we detail the SYMBIOTE method proposed in this work. A real case evaluation is presented in Chapter 6. Lastly, Chapter 7 presents a summary of the work, conclusions, and future work.


Chapter 2 Theoretical Foundations

2.1 Continuous Delivery/Deployment

Continuous Integration (CI) practices aim to make frequent the integration of the various pieces of work involved in building a software product [34]. Although it is not mandatory, the automation of these practices is a powerful approach that has been a hallmark of teams that adopt CI [29] [48]. Subsequently, Continuous Delivery (CD) practices emerged to allow software to be released whenever we want [56]. Continuous deployment practices go a step further by automating changes made directly to production environments. Continuous Software Engineering (CSE) encompasses all Continuous* [32] practices, but the idea behind this concept goes a bit beyond that. CSE is an approach that treats all software engineering activities as a continuous flow, following the Lean Thinking concept of flow [81]. In CSE, software releases have a higher frequency, making software evolution also continuous. In this context, software architecture acquires an evolutionary nature.

Some innovation factors such as cloud and container technologies as well as DevOps contributed to the emergence of the microservice architecture and also boosted CSE [32]. Continuous practices are possible due to the toolset of configuration management, build and test automation, and deployment tools [59]. Another key aspect for CSE is the microservice architecture itself [59].

Therefore, the microservice architecture is strongly related to the best known CI and CD practices, in such a way that it influences and is also affected by the characteristics of these processes.

2.2 Microservices

As software requirements gradually become more demanding, the software architecture must adapt to meet the new user needs. For that, new architectural styles have emerged over the years as a response to new quality and time-to-market demands, such as Service-Oriented Architecture (SOA) [31]. Service-oriented architecture evolves the concept of a component-based architecture by adding mainly the concept of distributed services. Similarly, the microservices architectural style arises as a specialization of SOA to meet new requirements of modern applications [18]. According to Lewis and Fowler [49], “the microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API.”

The main characteristics of microservices are [57]:

• Bounded context: Newman [57] states that microservices are "small and focused on doing one thing well". Microservices are built based on the Single Responsibility Principle and are separated by business capabilities.

• Autonomous: Microservices are implemented separately and any changes need to be isolated from other services. For this, a good API interface is essential.

• Technology Heterogeneity: Since the services are independent, we can use the technologies that are best suited for each type of task performed by a microservice.

• Resilience: The physical boundary between microservices helps to isolate failures that can occur in applications. However, the distributed services architecture has more sources of failures such as network, machines, etc.

• Scaling: In a monolithic architecture, it is necessary to scale the entire application; with microservices, however, we can scale only the most demanded services and better balance the use and cost of hardware.

• Ease of deployment: Microservices tend to make the task of deploying an application easier because they can be deployed in isolation. Thus, more frequent deployments become possible, and deployment problems become easier to deal with.

• Organizational Alignment: Development teams can work with smaller codebases (equivalent to one microservice), thus reducing problems related to the very large codebases of monolithic systems. Team organization can also benefit, since smaller teams can focus on certain services.

• Composability: The microservice architecture makes it possible to recompose our services according to new demands, as the small pieces can be changed more easily, combined for reuse, and used in different ways.

• Optimizing for Replaceability: Microservices make it much easier to replace or modernize services, as the risks and costs are lower due to the small size of the services.

Microservices architecture has been a trend not only for modern new applications but also in the evolution (re-engineering) of monolithic systems. Jamshidi et al. [40] analyze several factors that have contributed to the emergence and subsequent evolution of this architectural style. Among the main factors, the authors mention the growing demand for scalability and new practices for software development, such as CI and CD, containerization, and cloud technologies. Therefore, a good part of the concepts appropriated by the microservice architecture is not entirely new in Software Engineering. However, the current conditions of technologies and methods contributed to the emergence of this architectural style.

In the grey literature, there is a multiplicity of materials about the advantages and disadvantages of the microservice architecture. Soldani et al. [73] synthesized and classified the main pains and gains of the microservice architectural style, as summarized below.

Pains:

• Architecture: The size and complexity of services are difficult to dimension at design time.

• Storage: Data consistency is eventual and the management of distributed transactions is complex.

• Tests: The partitioning of services makes it challenging to measure the performance of the application as a whole.

• Management and Monitoring: The deployment environment consists of heterogeneous and distributed microservices.

Gains:

• Architecture: Bounded context is one of the main architectural gains since the microservice must be self-contained, that is, capable of being analyzed, understood, and updated independently of other services.

• Development: The microservices style promotes low coupling and allows technological heterogeneity among services.

• Deployment: Being able to deploy each microservice independently even allows new forms of development team organization.

• Operation: Microservices promote horizontal scaling capabilities.

Apart from the cost-benefit analysis of adopting microservices, software architecture concerns are present in both the advantages and the disadvantages. The microservice architecture is one of the most influential styles in current digital enterprise architectures [84], and this work intends to contribute to scientific research in this area, which is still in its early stages.

2.3 Continuous Software Evolution

Software evolution and its maintenance concerns date from the 1960s [20], and the field has matured since the definition of the Laws of Software Evolution [46]. Different from Lehman’s perspective, the view of software evolution as a phenomenon of successive transformations of software systems has changed nowadays, in the sense that software modifications are more frequent in the CSE perspective [32], which justifies the term Continuous Software Evolution. Therefore, in the scope of this work, the term software evolution will be used from the perspective of changes that occur during the life cycle of software, since most of the existing differences between software evolution and maintenance are in the nature of changes [37].

In the context of continuous software development practices, software maintainability should be taken as a priority since the software architecture and design are key factors for CD [32] [70]. Thus, it is an important attribute of the internal quality of software that can be defined as the ease with which a software system or component can be modified [4]. Unfortunately, software internal quality can be overlooked as it is not trivial to keep track of how it evolves. However, neglecting this attribute can directly impact the time and cost of managing software development.

As discussed in the previous sections, MSAs are often involved in an environment of continuous releases. Large software companies can perform several deployments into production environments in a single day. Therefore, from the point of view of changes, the evolution of microservices architectures occurs continually.

In principle, microservices architecture promotes maintainability. For instance, since a microservice has a single responsibility, it is easier to understand and, therefore, to maintain. However, some working conditions in software development, such as schedule pressure, costs, and even inadequate training of practitioners, can lead software engineers to neglect aspects of internal quality, thus compromising the maintainability of systems.

2.4 Architectural Degradation

The idea of a software architecture degrading starts with the discussions on internal software quality decay from the 1990s, with the metaphor of software aging [61]. In addition, Lehman’s Law [47] states that “the quality of E-type systems will appear to be declining unless they are rigorously maintained and adapted to operational environment changes”. Detecting this decay in quality has been a challenge, and studies on software maintainability have presented ways to identify quality problems.

The decay of architectural quality is among the most significant kinds of quality decay for system maintainability. The architectural decay of a software system is the phenomenon “1) when concrete (as-built) architecture of software system deviates [drifts] from its conceptual (as-planned) architecture where it no longer satisfies the key quality attributes that led to its construction or 2) when architecture of software system allows no more changes to it due to changes introduced in the system over time and renders it unmaintainable [erosion]” [69]. Other authors refer to the latter phenomenon with different terms, such as architectural erosion [26]. In this dissertation, we are interested mostly in the second part of this definition, since our intention is to identify indications that good design principles are not being followed.

Lindvall et al. [51] argue that architectural degradation occurs when constant changes in software negatively impact its structural complexity. Thereby, there is a risk that the continuous evolution of these systems may cause the so-called architectural erosion, which makes evolution monitoring an essential task for architects, since it enables timely corrections. This way, architects should observe relevant maintenance aspects to avoid the architectural decay of a system. Particularly in microservices application development, the systematic understanding of service maintainability and its metrics is important to quantify its degrees [19]. Systems built on this architecture have greater flexibility since their teams follow an evolutionary design, which makes it more challenging to manage the architectural constraints and dependencies between services [10].

2.5 Coupling Metrics

Several maintainability metrics have been proposed in the technical literature [68]. Mostly, they include object-oriented metrics for properties such as coupling [22], cohesion [21], complexity, and size. One of the best-known metrics is the maintainability index [60], which is a composition of metrics such as lines of code, Halstead Volume, and McCabe’s Cyclomatic Complexity. Although a very large set of metrics has been proposed, there is extensive discussion in the literature about the lack of adequate empirical validation and about the applicability of metrics in the real world.

Perepletchikov et al. [63] show that existing metrics for software using procedural and OO paradigms are not well-suited for service-oriented systems. Thereby, specific metrics and models have been introduced recently to measure the maintainability of service-oriented systems and microservices [64] [18] [19].

In general, software product metrics can be obtained through static analysis (analysis of software artifacts such as documents and source code) or dynamic analysis (verification of software behavior in the execution environment). These two approaches are not mutually exclusive and can complement each other to achieve better results. While the static analysis of the source code allows us to obtain structural information about the system, the dynamic analysis allows extracting aspects of the behavior of the application. The selection of a suitable approach depends on the metrics to measure. That is, for a given metric, an approach may or may not be feasible, may be easier or harder to apply, and may offer greater or lesser precision. Several studies have also used a hybrid approach.

Coupling is the manner and degree of interdependence between software modules [5]. In general, low coupling is a well-known guideline for software design. Some coupling is essential in software because, otherwise, there would be no system but only several isolated modules. However, good design seeks to reduce coupling as much as possible, for reasons already known decades ago [43]:

• There is less chance that a failure of one module/component will impact another;

• Changes made to a module/component should impact the least number of other modules/components;

• The less knowledge of other modules/components is required to evolve a module/component, the easier it will be to maintain it.

Over the years, a huge number of coupling metrics have been proposed in the literature. Several studies have examined the relationship between coupling and system maintainability. Li et al. [50] conclude that object-oriented metrics are good predictors of maintenance effort. Binkley and Schach [15] show that coupling is a good predictor of maintainability and failures. Dubey et al. [28] show that CK metrics (the classic set of metrics proposed by Chidamber and Kemerer, including coupling metrics) have a strong impact on maintainability in object-oriented systems. Al Dallal and Jehad [8] showed through an empirical assessment that maintainability is affected by changes in size, cohesion, and coupling metrics. The results of the study carried out by Almugrin et al. [9] on open-source systems indicate that indirect coupling metrics are highly correlated with aspects of maintainability and testability. These studies analyzed object-oriented systems, while Perepletchikov and Ryan [62] empirically highlight the relationship between coupling metrics and maintainability aspects such as analyzability and changeability for service-oriented systems. Therefore, we consider it a reasonable assumption that the temporal analysis of these metrics can provide us with evidence of architectural degradation, so that a significant variation in these values may indicate that changes have introduced architectural problems in the application.

Coupling can be characterized along several dimensions, such as direct or indirect, logical, dynamic, coupling of classes or methods (OO), etc. In this work, we will focus on the direct dependencies between the services of an application. That is, we will analyze the coupling at the service abstraction level.

Recently, specific metrics and models have been introduced to measure the maintainability of service-oriented systems and microservices [64] [18]. Our work focuses on coupling metrics as they are directly related to system architecture and represent an important aspect when managing maintainability [15] [74]. The recent literature review from Bogner et al. [18] identified several maintainability metrics for service and microservice-based applications. We present a subset of the coupling metrics belonging to the maintainability metrics for service-based systems cataloged in [18].

The following metrics should be collected per individual service:

• Absolute Importance of the Service (AIS): number of consumers invoking at least one operation from a service S1. The higher the AIS, the more important the service S1 is within the system. Average AIS can be useful for identifying and quantifying the most critical services.

• Absolute Dependence of the Service (ADS): number of services on which the S1 service depends. In other words, ADS is the number of services that S1 calls for its operation to be complete. The higher the ADS, the more this service depends on other services, i.e., it is more vulnerable to the side effects of failures in the services invoked.

The following metrics work for the entire application:

• Service Coupling Factor (SCF): this is a measure of the density of a graph’s connectivity. $SCF = SC / (N^2 - N)$, where SC is the sum of all calls between services, and N is the total number of services. The denominator of this equation represents the combination of all nodes together, that is, the maximum number of possible edges for a directed graph with N nodes.


• Average Number of Directly Connected Services (ADCS): the average of the ADS metric over all services. The choice of ADS as a basic metric intends to measure the density of the edges in the graph. We obtain it by adding the ADS values of all services and dividing by the total number of services. If the AIS metric were used instead, the final result would be the same, since for the same graph the sum of the AIS values has to be equal to the sum of the ADS values.
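The four definitions above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this work: the graph representation (a dictionary mapping each service to the set of services it calls), the service names, and the function names are all assumptions made for the example. It also assumes that SC counts each caller-callee pair once and that every service appears as a key of the dictionary.

```python
from typing import Dict, Set

# Hypothetical dependency graph: service name -> set of services it calls.
Graph = Dict[str, Set[str]]


def ads(graph: Graph) -> Dict[str, int]:
    """ADS: number of distinct other services each service calls."""
    return {svc: len(callees - {svc}) for svc, callees in graph.items()}


def ais(graph: Graph) -> Dict[str, int]:
    """AIS: number of distinct other services that call each service."""
    counts = {svc: 0 for svc in graph}
    for caller, callees in graph.items():
        for callee in callees - {caller}:
            counts[callee] = counts.get(callee, 0) + 1
    return counts


def scf(graph: Graph) -> float:
    """SCF = SC / (N^2 - N), with SC taken as the number of dependency edges."""
    n = len(graph)
    if n < 2:
        return 0.0  # assumption: density is defined as 0 for fewer than 2 services
    return sum(ads(graph).values()) / (n * n - n)


def adcs(graph: Graph) -> float:
    """ADCS: average ADS over all services."""
    n = len(graph)
    return sum(ads(graph).values()) / n if n else 0.0


# Invented four-service application used only for illustration.
example: Graph = {
    "gateway": {"orders", "users"},
    "orders": {"users", "payments"},
    "users": set(),
    "payments": set(),
}
```

For this invented graph, ADS is 2 for `gateway` and `orders` and 0 for the others, AIS is 2 for `users`, SCF is 4/12, and ADCS is 1.0. The sums of the AIS and ADS values coincide, as noted in the ADCS definition.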

Moreover, metrics presenting the following criteria were not considered for the analysis of coupling evolution in our work:

1. Derived, repeated, or similar metrics tend to present similar variability and trends to their primary metrics. For instance, Relative Coupling of Service (RCS) and Relative Importance of Service (RIS) metrics are not considered as they will present a similar behavior to ADS and AIS metrics. The System’s Service Coupling (SSC) metric measures edge density in the dependency graph (similar to the SCF metric). The SCF metric is repeated in two different research literature sources.

2. Metrics whose definitions are inconsistent were discarded. For instance, the Absolute Criticality of the Service (ACS) metric is defined as the product of two other metrics (ADS and AIS). In this case, if a service presents ADS equal to zero (no outgoing edges), then ACS will also be zero (no coupling), regardless of the AIS value. We consider that this gives a misleading picture of the coupling. Thus, we decided not to use this metric to avoid misinterpretation in the analysis.

3. Metrics of cyclical dependency like Services Interdependence in the System (SIY) are not used because the mere existence of a single dependency of this type already constitutes an architectural smell. Therefore, it makes no sense to analyze the variability or trend of a metric for which we know that the only acceptable value would be zero.

4. Metrics based on internal elements of a service (classes, packages, operations, interfaces), as well as those that use weights to differentiate types of connections between these elements, will not be considered. Our interest in this initial work is to capture dependencies only at the logical service level and treat them with the same weight. This is the case with the metrics Weighted Intra-Service Coupling between Elements (WISCE), Weighted Extra-Service Incoming Coupling of an Element (WESICE), Weighted Extra-Service Outgoing Coupling of an Element (WESOCE), Extra-Service Incoming Coupling of Service Interface (ESICSI), Element to Extra Service Interface Outgoing Coupling (EESIOC), Service Interface to Intra Element Coupling (SIIEC), System Partitioning Factor (SPARF), and System Purity Factor (SPURF) (metrics originally defined in [64]).


Chapter 3 Related Work

Different works have already investigated issues related to our research. Regarding the collection of architecture-related data, Sampaio [71] proposes a model to support the evolution of systems based on microservices. This model aggregates structural, deployment, and execution information of microservice-based architectures, but no further details are provided regarding the extraction of such information. Another limitation of this proposal is that only a straightforward demo application is used to assess the feasibility of the model. Similarly, Mayer et al. [54] elaborate an approach to extract architectural information (static, infrastructure, and runtime) from microservices applications. In this last work, both static and dynamic strategies are proposed to capture architectural information, and the authors create a small test scenario to test the viability of their proposal. Our method, SYMBIOTE, in addition to extracting data from applications based on microservices, also aims to find and point out possible architectural degradation throughout their evolution.

Kitajima et al. [44] propose a method to extract information from calls between services and then infer their chain of relationships. The authors did not use code instrumentation when working with a dynamic approach to gathering information from the running system. The evaluation of the method used a small test application (not a real system). The purpose of this method is to extract the entire service chain involved in calls, which is useful for root cause analysis of execution failures. Our work is similar to it only in the dynamic approach to extracting metrics.

The most closely related work is GMAT [52], which proposes a tool to obtain the dependency graph of a microservice architecture. This tool implements the static approach to gather information about dependencies. Its main concern is to offer a graphic visualization of the dependencies between the services to support the monitoring of the evolution of the software. This tool is not evaluated in a real case. Our work differs mainly in two aspects: (1) our goal is to detect indications of architectural deterioration, whereas GMAT only generates the dependency graphs; (2) as GMAT relies on static analysis, it is limited by several technological constraints, such as the use of the Java language, Spring Boot Actuator, Spring Feign, and springfox-swagger2. The capability to accurately identify calls between services using static declarations is also limited.

Other works address the conformance of the architecture with patterns or standards. ArchCI [65] is an integrated CI tool to monitor architectural drift (deviation between concrete and planned architecture) for each deployment in the application integration pipeline. The idea is to check architectural compliance through a Dependency Constraint Language. Although our work has some similarities with ArchCI (like the historical analysis through version control and pipeline executions to monitor the software evolution), our aim is not to verify compliance with a planned architecture. In a different approach, Ntentos et al. [58] propose an assessment of the compliance of microservice-based architectures with well-established coupling patterns. They propose metrics that represent adherence to patterns. The objective of that work was to evaluate the feasibility of building a method to measure compliance with standards. In our work, a method is proposed and validated to measure architecture deterioration based not on patterns but on the principle of low coupling.

Some works focus on coupling evolution. The work of Sousa et al. [74] is an exploratory study that observed the behavior of coupling through the evolution of open-source, object-oriented software. This work is similar to ours concerning the time series analysis of historical versions and the use of coupling metrics. However, our work is concerned with detecting architectural problems, while the latter is interested in establishing coupling evolution properties.

Jenkins and Kirk [41] used a software component instability metric to analyze the evolution of software architecture using complex network theory. The instability metric is based on coupling. This work is similar to ours in some aspects, but its focus is on showing similarities between the behavior of software graphs and complex networks, as well as on predicting the maintainability of the software. Our work, in contrast, intends to evaluate the current architecture to detect potential issues.

Magnavita et al. [53] propose a tool called EVOWAVE for visualizing any software historical data. The authors carried out an exploratory study using a time series of logical coupling from one real software system. This work aims to provide a different type of visualization and not to analyze the time series, which is the objective of SYMBIOTE.

Our work uses coupling metrics to evaluate the structure of the architecture over the continuous software evolution. It differs from the works above mainly in that it proposes a dynamic analysis approach to collect the dependencies between services, analyzes trends in the evolution data series, informs architects when the trends indicate signs of architectural degradation, and is validated in a real project.


Chapter 4 Establishing a Metric Suite

In order to understand how microservices coupling evolves, we need to establish a metric suite as an observation perspective. In this chapter, we present how we selected, defined, and assessed a metric suite to compose the proposed method.

4.1 Metric Suite

Analyzing metrics individually per microservice could lead to misunderstandings about the evolution of the complete system. Furthermore, if one considers only a single metric, the analysis may be biased as well. This way, we initially selected the SCF and ADCS metrics (described in Section 2.5) that were originally designed to be calculated for the entire application.

Then, we aggregated the values of individual microservice metrics (AIS and ADS) for the entire application. For this, we calculate the Gini coefficient of these metrics.

The statistician Corrado Gini proposed the Gini coefficient in 1912 and it is currently widely used to measure the distribution of wealth in the field of Economics. This index has the advantage of working with a [0;1] interval regardless of the statistical distribution of the data.

For the coupling evolution analysis, such a value reveals how unequal the values of the coupling metrics (ADS and AIS) are among microservices in the same application. Thus, it allows us to observe whether few microservices concentrate most of the dependencies. For example, the Single Responsibility Principle is an important design principle for microservices, so that each service has one single responsibility. A Gini coefficient with a higher value may indicate a possible violation of the Single Responsibility Principle, as a small number of services would be concentrating incoming or outgoing calls (logical coupling).

There are several applications of the Gini coefficient in the literature targeting software evolution analysis [36] [77] [7] [38]. However, none of these works apply it in the context of microservices.

We calculated the Gini coefficient G using the formula [3] [82]:

$$G = \frac{\sum_{i=1}^{n} (2i - n - 1)\, x_i}{n \sum_{i=1}^{n} x_i} \qquad (4.1)$$

where the values $x_1, x_2, \dots, x_n$ are ordered, n is the number of values to be computed, and i represents the rank of the value x. G assumes values between 0 and 1, in which the value 0 indicates perfect equality, whereas values closer to 1 indicate more inequality among the observations. That is, for each release and each metric (AIS and ADS), we have a set $M = M_1 \cup M_2 \cup M_3 \cup \dots \cup M_n$ of metric values, where $M_i$ represents the metric value for the i-th microservice. The Gini coefficient G is calculated over M. In the end, we have a $G_i$ (Gini coefficient of all AIS values) and a $G_d$ (Gini coefficient of all ADS values) for each release of the entire application.
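As a small worked illustration of Equation 4.1, consider four metric values $x = (0, 1, 1, 4)$, so that $n = 4$. These values are invented for the example and do not come from the study. The calculation assumes the values are sorted in ascending order, so that the rank $i$ grows with the value.

```latex
\begin{align*}
\sum_{i=1}^{4} (2i - n - 1)\,x_i &= (-3)(0) + (-1)(1) + (1)(1) + (3)(4) = 12 \\
n \sum_{i=1}^{4} x_i &= 4 \cdot (0 + 1 + 1 + 4) = 24 \\
G &= \frac{12}{24} = 0.5
\end{align*}
```

If all four values were equal, the weights $(2i - n - 1)$ would cancel in pairs and G would be 0, which corresponds to perfect equality.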

Therefore, the four metrics selected for this research are SCF and ADCS (both defined in Section 2.5), as well as the two derived metrics:

• Gini coefficient for AIS: calculated using the individual AIS measures for each microservice in a given release. That is, this coefficient indicates how the importance (incoming dependencies) of services is distributed among them. Values close to zero mean an even distribution of importance among the microservices. Otherwise, values close to one mean that the importance is very concentrated in a few services. To simplify, we call this metric Service Importance Distribution (SID).

• Gini coefficient for ADS : calculated using the individual ADS measures for each microservice in a given release. That is, this coefficient indicates how balanced are (outgoing) dependencies among services. When close to zero, it represents evenly distributed dependencies among the microservices. Otherwise, values close to one mean that few services concentrate many dependencies. To simplify, we call this metric Service Dependency Distribution (SDD).
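A minimal Python sketch of how SID and SDD could be computed from per-service AIS and ADS values is shown below. The function name, the input lists, and the choice to return 0 when there are no dependencies at all are assumptions made for this illustration; it is not the code used in the experiment.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient following Equation 4.1, with values ranked in ascending order."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        # Assumption: with no dependencies at all, treat the distribution as equal.
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Invented AIS and ADS values for a four-service release.
ais_values = [0, 1, 2, 1]
ads_values = [2, 2, 0, 0]

sid = gini(ais_values)  # Service Importance Distribution
sdd = gini(ads_values)  # Service Dependency Distribution
```

With these invented inputs, `sid` is 0.375 and `sdd` is 0.5, so the outgoing dependencies are somewhat more concentrated than the incoming ones. Note that for a finite sample this formula cannot exceed (n − 1)/n, so small applications never reach exactly 1.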

4.2 Goal and Hypotheses

After selecting the metrics, we conducted an experimental analysis to test the behavior of the selected metrics through the evolution of microservice-based applications (MSA). For this experiment, we need a sample with several microservice-based architecture scenarios. Due to the lack of available real and stable cases of microservices applications, we worked with synthetic data (artificially-generated dependency graphs) representing microservice architectures. As we are evaluating the coupling level among microservices in applications, their internal architecture design is irrelevant for this analysis. The generated data are directed graphs, in which nodes represent the microservices and edges represent the dependency (incoming and outgoing) between them. The experiment results also support the analysis method development to capture indications of architectural degradation in this kind of application. The simulated environment enabled us to test using different scenarios and gave us more data to reach a more reliable analysis method.

This study aims to analyze four coupling metrics, for the purpose of characterizing with respect to their behavior over time, in the context of artificially-generated dependency graphs representing MSA releases. For each metric, we test the following hypotheses:

H0: There is not a significant difference in trends during the evolution of the metric (footnote 1), considering the scenarios of introducing and removing an architecture smell throughout releases of an MSA.

H1: There is a significant difference in trends during the evolution of the metric, considering the scenarios of introducing and removing an architecture smell throughout releases of an MSA.

4.3 Experimental Design

In the absence of a reference model for a microservices architecture, we adopted some strategies to make the artificial data closer to reality. These strategies concern both the initial generation of the graph corresponding to the first release and its evolution in subsequent releases:

• To use the Barabasi-Albert (BA) model for generating graphs based on scale-free networks [14].

• To work with three different graph sizes: small, medium, and large. The size of a microservice is an object of frequent discussion in industry and academia. In the real world, the size of a microservice can vary widely and, in this case, this factor in the design of the experiment might not make sense. However, ideally, microservices should be similar in their small size [57], and this is the view we are adopting in this experiment.

• To include structures in the dependency graphs representing known microservices design patterns such as API Composition, Message Service Broker, Externalized Configuration, API Gateway, Service Registry, and Distributed Tracing.

• To introduce two architectural problems that are symptoms of known architecture smells related to coupling. The problems are 1) concentration of incoming dependencies around a single service; 2) concentration of outgoing dependencies around a single service.

• To define two possible architectural evolution scenarios for each application: architectural improvement or architectural degradation. In the architectural improvement scenario, the graph representing the first release contains an architectural problem, which shall be removed through the next releases. In the architectural degradation scenario, an architectural problem shall be introduced through the releases.

We adopted a full factorial design to generate our experimental units (graphs representing the MSA). We considered two factors that can cause variations in the coupling metrics: size and evolution scenario. Size has three levels: Small, Medium, and Large. The evolution scenario has two levels: Improvement or Degradation. Table 4.1 shows the scenarios resulting from the combinations between the two factors (Application Size and Evolution Scenario) and their levels.


Table 4.1: Experimental Design

Scenario   Application Size   Evolution Scenario
1          Small              Improvement
2          Small              Degradation
3          Medium             Improvement
4          Medium             Degradation
5          Large              Improvement
6          Large              Degradation
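The full factorial combination of the two factors can be enumerated mechanically. The short Python sketch below reproduces the six scenarios of Table 4.1; the variable and function names are illustrative assumptions and are not taken from the experiment's source code.

```python
from itertools import product
from typing import Dict, List, Union

SIZES = ["Small", "Medium", "Large"]
EVOLUTION_SCENARIOS = ["Improvement", "Degradation"]


def full_factorial() -> List[Dict[str, Union[int, str]]]:
    """Enumerate every combination of size and evolution scenario, numbered from 1."""
    return [
        {"scenario": number, "size": size, "evolution": evolution}
        for number, (size, evolution) in enumerate(
            product(SIZES, EVOLUTION_SCENARIOS), start=1
        )
    ]


design = full_factorial()
```

Because `product` varies the last factor fastest, the resulting order matches the table: scenario 1 is Small/Improvement and scenario 6 is Large/Degradation.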

4.3.1 Graph Structure

The dependency graphs generated by simulation need to be as realistic as possible. As previously mentioned, we adopted the Barabasi-Albert model [14], which is an algorithm for generating random scale-free networks (SFN), whose degree distribution follows a power law. Several works have found characteristics of scale-free networks [14] in software objects, mainly related to object-oriented systems. Wheeldon et al. [79] and Potanin et al. [66] verified a power-law distribution related to coupling in real Java programs, i.e., the vast majority of classes have few dependencies whilst few classes have many dependencies. Wen et al. [78] observed that dependencies between Java packages also follow scale-free properties. Many other studies [75] [42] observed that software objects have characteristics of complex networks, such as scale-free and power-law properties. We understand that, semantically, coupling metrics for (micro)services have the same meaning as OO coupling metrics. Therefore, we create all dependency graphs using the preferential attachment process, in which a new node is more likely to connect to pre-existing nodes that already have many connections. The resulting degree distribution follows the power law

$$p(k) = k^{-\gamma}, \qquad (4.2)$$

where k is the number of connections of a node and γ is the degree distribution exponent. All graphs are directed in order to differentiate the incoming and outgoing edges. The code that creates and evolves the dependency graphs was developed in Java. The source code is available on GitHub (see footnote 2). We used the graph algorithms contained in the JGraphT library. To generate the first release of any dependency graph, we utilized the BarabasiAlbertGraphGenerator class, which implements preferential attachment growth. In Figures 4.1 and 4.2, we show examples of graphs generated by the algorithm.
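To make the preferential attachment idea concrete, the following Python sketch grows a small directed graph in which each new node links to existing nodes with probability proportional to their current degree. It is only a stand-in written for illustration: the experiment itself relies on JGraphT's Java generator, and the function name, its parameters, and the choice that a new node depends on (points to) existing nodes are assumptions of this sketch.

```python
import random
from typing import Dict, List, Set


def barabasi_albert_digraph(n: int, m: int, seed: int = 0) -> Dict[str, Set[str]]:
    """Grow a directed graph with n nodes; each new node links to m existing nodes."""
    if not 1 <= m < n:
        raise ValueError("require 1 <= m < n")
    rng = random.Random(seed)
    graph: Dict[str, Set[str]] = {f"BA{i}": set() for i in range(n)}
    # One entry per edge endpoint, so uniform sampling from this list
    # picks a node with probability proportional to its degree.
    endpoints: List[int] = []
    targets = list(range(m))  # the first new node links to all m seed nodes
    for source in range(m, n):
        for target in targets:
            # Assumed direction: the new service depends on the existing one.
            graph[f"BA{source}"].add(f"BA{target}")
        endpoints.extend(targets)
        endpoints.extend([source] * m)
        chosen: Set[int] = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = sorted(chosen)  # sorted for reproducibility
    return graph


first_release = barabasi_albert_digraph(n=10, m=2, seed=1)
```

Nodes that already received many links appear more often in `endpoints`, so they are more likely to be chosen again, which is what produces the few highly connected services typical of scale-free graphs.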

In the examples, nodes whose names are prefixed with “BA” are nodes representing business services (core) of an application. The other nodes correspond to services that implement some kind of microservices design pattern. For instance, GTW represents the API Gateway of the application; MSB is the Message Service Broker; REG is the Service Registry; CFG is the Externalized Configuration; TRC is Distributed Tracing; CPS (not present in the examples) is the API Composition. Also, nodes representing design patterns have an associated probability of being included during the graph generation, and this explains the lack of CPS in the two example graphs and “MSB” being present in only one of them.

Footnote 2: https://github.com/daniel-apolinario/microservices-graph

Figure 4.1: Generated Barabasi-Albert Graph of small size.

4.3.2 Application Size

A microservice-based system may vary in size. Examples of microservice architectures from Netflix and Amazon contain hundreds of services. On the other hand, we can find several applications built with a small number of services in GitHub repositories. Therefore, dependency graphs of varied sizes can behave differently. We found no attempt in the literature to classify applications in terms of the number of services. Thus, we defined the sizes (number of services) as follows:

• Small: from 5 to 10 services;

• Medium: from 11 to 25 services;

• Large: more than 25 services.

Due to computational limitations, we decided to limit the number of services to 60 in this experiment, since we intend to evaluate the first results before scaling up the number of services.

All graphs are generated within this range of sizes. These sizes correspond to the initial graph representing the first release. In the end, they will have grown proportionally at a growth rate (see Table 4.2) calculated at the beginning of evolution.
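The size classes and the proportional growth can be expressed as a short Python sketch. The ranges and the 30% to 80% growth bounds come from the text and Table 4.2; the function names, the uniform sampling of the initial size and growth rate, and the rounding of the number of added services are assumptions made for illustration.

```python
import random
from typing import Tuple

# Size classes from Section 4.3.2; Large is capped at 60 services in this experiment.
SIZE_RANGES = {"Small": (5, 10), "Medium": (11, 25), "Large": (26, 60)}
MIN_GROWTH_RATE, MAX_GROWTH_RATE = 0.30, 0.80  # bounds listed in Table 4.2


def classify_size(n_services: int) -> str:
    """Return the size class for an initial number of services."""
    for label, (low, high) in SIZE_RANGES.items():
        if low <= n_services <= high:
            return label
    raise ValueError(f"{n_services} services is outside the experiment's range")


def plan_sizes(size_label: str, rng: random.Random) -> Tuple[int, int]:
    """Draw an initial size and the final size after growth (assumed uniform draws)."""
    low, high = SIZE_RANGES[size_label]
    initial = rng.randint(low, high)
    growth_rate = rng.uniform(MIN_GROWTH_RATE, MAX_GROWTH_RATE)
    final = initial + round(initial * growth_rate)
    return initial, final


initial_size, final_size = plan_sizes("Medium", random.Random(42))
```

A Medium application therefore starts with 11 to 25 services and, under these assumptions, ends its evolution with roughly 30% to 80% more services than it started with.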


Figure 4.2: Generated Barabasi-Albert Graph of medium size.

4.3.3 Microservices-related Design Patterns

Aiming at generating dependency graphs similar to real microservice-based applications, we apply six usual design patterns (footnote 3) found in MSAs that can also be expressed in a dependency graph. The selected patterns are API Composition, Message Service Broker, Externalized Configuration, API Gateway, Service Registry, and Distributed Tracing. These patterns are commonly adopted together, using for example the Netflix microservices components (footnote 4). We know that some design patterns can increase coupling and also concentrate incoming or outgoing edges on a few nodes.

Design patterns are applied to the graphs immediately after the initial release graph is generated. To diversify the creation of graphs, each design pattern has a probability of being introduced in the generated graph. All the probabilities used in our graph generation model are listed as parameters in Table 4.2. Some parameters represent the proportions of nodes that a pattern will affect. For instance, the Externalized Configuration Ratio parameter is the percentage of application nodes that will depend on the CFG node.
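As an illustration of how one pattern could be applied probabilistically, the Python sketch below adds a CFG node to a dependency graph. The default probability (50%) and ratio (40%) are the Externalized Configuration values from Table 4.2; the function name, the graph representation, the edge direction (services point to CFG), and the rounding of the number of affected nodes are assumptions of this sketch rather than details confirmed by the text.

```python
import random
from typing import Dict, Set


def add_externalized_configuration(
    graph: Dict[str, Set[str]],
    rng: random.Random,
    probability: float = 0.5,
    ratio: float = 0.4,
) -> bool:
    """With the given probability, add a CFG node that a share of services depend on.

    Returns True when the pattern was applied and False otherwise.
    """
    if rng.random() >= probability:
        return False
    services = sorted(graph)  # snapshot of the nodes present before adding CFG
    affected = rng.sample(services, round(len(services) * ratio))
    graph["CFG"] = set()
    for service in affected:
        graph[service].add("CFG")  # assumed direction: service depends on CFG
    return True


# Invented ten-service application with no dependencies yet.
app = {f"BA{i}": set() for i in range(10)}
applied = add_externalized_configuration(app, random.Random(7), probability=1.0)
```

With `probability=1.0` the pattern is always applied, so four of the ten invented services end up depending on `CFG`. The other patterns could follow the same shape with their own probabilities and ratios.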

4.3.4 Architecture Smells

For the improvement or degradation scenarios, we have chosen two coupling-related architecture problems:

1. Problem 1: the concentration of incoming dependencies around a single microservice.

2. Problem 2: the concentration of outgoing dependencies around a single microservice.

³ https://microservices.io/


Table 4.2: Experimental Design

Parameter | Value | Description
Minimum growth rate | 30% | Minimum percentage of nodes to be included in the evolution.
Maximum growth rate | 80% | Maximum percentage of nodes to be included in the evolution.
Service Registry Probability | 100% | Probability of a microservice application implementing the Service Registry pattern.
API Gateway Probability | 80% | Probability of a microservice application implementing the API Gateway pattern.
API Gateway Ratio | 20% | The ratio of nodes to be composed.
API Composition Probability | 50% | Probability of a microservice application having at least one composition service.
Distributed Tracing Probability | 50% | Probability of a microservice application having one service for Distributed Tracing.
Message Service Broker Probability | 50% | Probability of a microservice application implementing the Message Service Broker pattern.
Message Service Broker Ratio | 50% | Probability of one service using the message service broker.
Externalized Configuration Probability | 50% | Probability of a microservice application implementing the Externalized Configuration pattern.
Externalized Configuration Ratio | 40% | Ratio of nodes using the externalized configuration.
Connected nodes range for architectural problem | 30-80% | One node has an architectural problem (incoming or outgoing edges concentration) if it is connected to a percentage of all nodes of


These problems reflect symptoms of known architecture smells with evidence of their existence in the field, such as God Component [13] or Megaservice [76] [17], Hub-like Dependency [13], Bottleneck Service [17], Nanoservice [17], and The Knot [17]. Many architectural problems do not affect the coupling between microservices, so we cannot express them in a dependency graph. However, in the work of Bogner et al. [17], which summarizes architecture smells found in the literature, the Megaservice and Nanoservice problems have the most supporting evidence. The architectural problems we use in this work therefore include symptoms related to the architecture smells with the strongest evidence.

The number of edges to characterize a microservice with a high concentration of incoming or outgoing dependencies is defined by a percentage of the total number of services in the system, which is a parameter in this experiment (see Table 4.2).
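The threshold rule can be illustrated with a short Python sketch. It assumes a graph stored as a dictionary that maps each service to the set of services it depends on, and it assumes that a node qualifies when its edge count is at least the threshold share of all services; the function names and the inclusive comparison are illustrative choices, not taken from our tool.

```python
def in_degree(graph, node):
    """Number of services that depend on node (incoming edges)."""
    return sum(1 for deps in graph.values() if node in deps)


def concentrated_nodes(graph, threshold):
    """Return (incoming, outgoing) lists of nodes with concentrated edges.

    threshold is a fraction of the total number of services, for example
    a value drawn from the 30-80% range listed in Table 4.2.
    """
    limit = threshold * len(graph)
    incoming = [v for v in graph if in_degree(graph, v) >= limit]  # Problem 1
    outgoing = [v for v in graph if len(graph[v]) >= limit]        # Problem 2
    return incoming, outgoing
```

For a five-service graph and a threshold of 0.5, a node needs at least three incoming (or outgoing) edges to be flagged.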

4.3.5 Evolution Scenarios

We focus on implementing a simple evolution of a dependency graph through releases. For each experimental unit (MSA), we calculate the growth rate as a percentage of nodes randomly chosen from a pre-configured range (see Table 4.2). After the number of new nodes is calculated, we randomly distribute the creation of the nodes across the releases. This distribution is balanced to prevent all the nodes from being included in a single release, thus representing small increments as in continuous software development.
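One possible way to realize such a balanced distribution is sketched below in Python. The sketch assumes that the growth rate is applied to the size of the initial graph and that "balanced" means an even share per release with the remainder assigned to randomly chosen releases; both are assumptions for illustration, as are the function name and the default of 20 releases after the initial one.

```python
import random


def plan_growth(initial_size, num_releases=20, min_rate=0.30, max_rate=0.80, rng=None):
    """Draw a growth rate and spread the new nodes over the releases.

    Returns (rate, per_release), where per_release[i] is the number of
    nodes to create in release i + 1.
    """
    rng = rng or random.Random()
    rate = rng.uniform(min_rate, max_rate)  # range from Table 4.2
    new_nodes = round(initial_size * rate)

    # Even share for every release; leftover nodes go to random releases,
    # so no release receives more than one node above the others.
    base, extra = divmod(new_nodes, num_releases)
    per_release = [base] * num_releases
    for i in rng.sample(range(num_releases), extra):
        per_release[i] += 1
    return rate, per_release
```

With a small initial graph, most releases receive no new node and a few receive one, which matches the idea of small increments.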

We established 21 releases (including the initial release 0) for the whole evolution of one application. The number of releases can be configured in the tool responsible for evolving the graphs. This number was initially configured according to the researchers' experience. We assumed one release per sprint (small iteration) and a time interval shorter than six months of development. We know that both sprint length and development time can vary widely between teams, which is why we did not have a reference value in the literature. We also needed to consider the minimum interval for which we are able to calculate significant statistical trends.

The default changes during the evolution are limited to the inclusion of nodes. We only consider the scenario of an increasing number of services, although we know that re-engineering interventions may occasionally decrease the number of services. A scenario of continuous decrease in services was not considered because, by Lehman's law of Continuing Growth, software tends to keep growing to meet the demands of users. We create new edges to connect new nodes to the graph, and edges can be removed when removing the high concentration of outgoing edges in a single node (architecture improvement scenario).

Below, we list general assumptions about the evolution scenarios. The architectural smells are explained in Section 4.3.4.

Architecture improvement scenario:

1. Problem 1 is introduced in the graph representing the first release. In the next releases, the newly added nodes receive half of the input edges of the node with a high concentration of incoming edges until the concentration is lower than the minimum configured.


2. Problem 2 is introduced in the graph representing the first release. In the next releases, depending on the type of dependency, a node can be removed (simulating the merging of services) or only edges can be removed.
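The improvement step for Problem 1 can be sketched as follows in Python. The sketch assumes a graph stored as a dictionary of dependency sets and picks the half to be moved in alphabetical order, which is an arbitrary choice made only to keep the example deterministic. Whether the new node itself depends on the original hub is not modeled here.

```python
def offload_incoming(graph, hub, new_node):
    """Redirect half of the services that depend on hub to new_node.

    graph maps each service to the set of services it depends on.
    Returns the list of services whose dependency was redirected.
    """
    dependents = sorted(s for s, deps in graph.items() if hub in deps)
    moved = dependents[: len(dependents) // 2]

    graph[new_node] = set()
    for service in moved:
        graph[service].discard(hub)   # drop edge service -> hub
        graph[service].add(new_node)  # add edge service -> new_node
    return moved
```

Repeating this step in later releases, each time with a newly added node, lowers the concentration until it falls below the configured minimum.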

Architectural degradation scenario:

1. One node is selected to be the holder of problem 1. New nodes are introduced with an outgoing edge to this one.

2. One node is selected to be the holder of problem 2. New nodes are introduced with an incoming edge from this one.
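The two degradation rules differ only in the direction of the new edge, as the following Python sketch shows. The dictionary representation (service mapped to the set of services it depends on) and the function name are assumptions for illustration.

```python
def degrade(graph, holder, new_node, problem):
    """Add new_node and connect it to holder according to the problem type.

    problem == 1: new_node depends on holder (holder gains an incoming edge).
    problem == 2: holder depends on new_node (holder gains an outgoing edge).
    """
    if problem not in (1, 2):
        raise ValueError("problem must be 1 or 2")

    graph[new_node] = set()
    if problem == 1:
        graph[new_node].add(holder)  # edge new_node -> holder
    else:
        graph[holder].add(new_node)  # edge holder -> new_node
```

Calling the function once per release in which a node is added reproduces the gradual accumulation of edges on the holder.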

In both scenarios, after the problem is introduced or removed, new nodes can be added randomly until the growth rate is reached, simulating Lehman's law of Continuing Growth. In Figure 4.3, we present an illustration of an architectural improvement scenario. The I1 service is a node that concentrates four incoming edges, which is considered a high concentration since there are few services in this application. The only change across the 21 releases occurs in release 7, when a new MI1 service is created. Services I2 and I3 no longer depend on I1 and now depend on MI1.

Figure 4.3: Evolution of one small size graph in the improvement scenario

In Figure 4.4 we see an example of an architectural degradation scenario. The D3 service is chosen to concentrate outgoing edges. In releases 8, 16, and 20, nodes MD1, MD2, and MD3 are included, respectively. The D3 service then becomes dependent on these three new services, thus increasing its concentration of outgoing edges.

Figure 4.4: Evolution of one small size graph in the degradation scenario

4.3.6 Number of Replications

Because graph generation has stochastic components, we need multiple trials so that we can quantify the variation in the results.

Considering the confidence-interval half-length

\[
\delta(n, \alpha) = t_{n-1,\,1-\alpha/2} \sqrt{\frac{S^2(n)}{n}} \tag{4.3}
\]

To determine the minimum number of replications, we adopted the procedure in [45]:

1. We choose n = 10 as the initial number of replications, γ = 0.05 as the relative error, and α = 0.05 as the significance level (95% confidence).

2. We compute the sample mean X̄(n) and δ(n, α) for each of the metrics in each of the releases for all of the experiment scenarios.

3. We increment n until δ(n, α)/|X̄(n)| ≤ γ′ for all the means computed, where γ′ = γ/(1 + γ).
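The stopping rule in step 3 can be checked for a single metric with the Python sketch below. It relies only on the standard library, so the critical value of the t distribution is supplied by the caller (for example, approximately 2.262 for n = 10 and α = 0.05); the function names are illustrative and the sketch assumes a nonzero sample mean.

```python
import math
import statistics


def half_length(samples, t_crit):
    """Confidence-interval half-length: t_crit * sqrt(S^2(n) / n)."""
    n = len(samples)
    return t_crit * math.sqrt(statistics.variance(samples) / n)


def precise_enough(samples, t_crit, gamma=0.05):
    """True when the half-length relative to |mean| is at most gamma / (1 + gamma)."""
    gamma_adjusted = gamma / (1 + gamma)
    mean = statistics.fmean(samples)  # assumed to be nonzero
    return half_length(samples, t_crit) / abs(mean) <= gamma_adjusted
```

In the full procedure this check would be evaluated for every metric, release, and scenario, and n would be increased (with the matching critical value for n − 1 degrees of freedom) until all checks pass.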