FACULDADE DE ENGENHARIA DA UNIVERSIDADE DO PORTO

Refactoring Monoliths to Microservices

João Paiva da Costa Pinto

Mestrado em Engenharia de Software

Supervisor: Professor Filipe Correia
Second Supervisor: Professor André Restivo


Refactoring Monoliths to Microservices

João Paiva da Costa Pinto

Mestrado em Engenharia de Software


Abstract

The introduction of cloud computing has forever changed the way software is built. Due to innate characteristics of the cloud paradigm, such as high availability, resilience, or elasticity, traditional ways of building software (e.g., monoliths) have been replaced by more sophisticated cloud-native architectures that can fully leverage the power of cloud computing, such as microservices. Additionally, the adoption of the *aaS delivery models, more specifically SaaS, has propelled the industry to adopt this model in favour of approaches considered more traditional, such as deploying software on-premises or managing different deployment strategies for different clients. These new models, paired with the possibility of having a single platform that can serve multiple clients (multi-tenancy), have pushed the mass adoption of these new architectures. This has motivated companies and businesses to spend time and money transforming existing monolithic software into software that fits the paradigm of cloud computing.

Due to the inherent differences and complexities associated with both architectural paradigms, this process often takes a long time and can be very error-prone and awkward to perform, often devolving into a trial-and-error approach until the desired result is reached.

In order to address the difficulties associated with this complex problem, this thesis proposes the systematisation of existing knowledge about common refactorings used to migrate monoliths to microservices into a catalogue of refactorings, which can mitigate many of the difficulties that engineers face when carrying out this process. The catalogue has been built by analysing and performing a literature review on the current state of the art regarding the refactoring of monoliths to microservices. It contains information on how to apply the refactorings, when to apply them, examples that detail possible implementations, and additional knowledge that can aid in that process. Furthermore, this thesis also documents the development of a tool that is capable of automating the application of one of those refactorings, with the aim of facilitating the refactoring process and making it more efficient. The catalogue and the tool act as a base framework that can be expanded in the future to include other refactorings, refine existing ones, and remove refactorings that are no longer relevant.

These two main contributions have been validated with the aid of a survey and a case study, both of which yielded positive results. The majority of participants stated that the refactorings present in the catalogue represent useful and common activities that occur when transforming monoliths into microservices, and that the catalogue of refactorings and the refactoring tool would be a valuable help during that process.

Keywords: refactoring, monoliths, microservices


Resumo

A introdução da computação na cloud mudou para sempre a maneira como o software é desenvolvido. Devido a características inatas do paradigma da cloud, tais como alta disponibilidade, resiliência, elasticidade, entre outras, formas tradicionais de desenvolver software (ex., monólitos) têm vindo a ser substituídas por arquiteturas mais sofisticadas e nativas à cloud que são capazes de tirar maior partido do poder da computação na cloud, tais como os microserviços. Ademais, a adoção de modelos de entrega *aaS (como um serviço), mais concretamente SaaS (software como um serviço), propulsionou a indústria a adotar este modelo em detrimento de abordagens consideradas mais tradicionais, tais como a implantação de software nas próprias instalações ou a gestão de diferentes estratégias de implantação para diversos clientes. Estes fatores, aliados à possibilidade de existir apenas uma plataforma que seja capaz de servir múltiplos clientes (multi-tenancy), empurraram a adoção em massa destas novas arquiteturas. Isto motivou empresas e negócios a gastar tempo e dinheiro na transformação de software previamente desenvolvido, não adequado à cloud, em software que possa viver nativamente dentro do paradigma da computação na cloud.

Devido às diferenças inerentes e complexidades associadas a ambos os paradigmas arquiteturais, este processo pode muitas vezes levar bastante tempo e ser suscetível a inúmeros erros, muitas vezes acabando numa abordagem de tentativa e erro até se chegar ao resultado esperado.

De forma a abordar as dificuldades associadas a este problema complexo, esta tese propõe a sistematização de conhecimento existente acerca de refactorings comuns que são usados para migrar monólitos para microserviços, num catálogo de refactorings, com o intuito de mitigar muitas das dificuldades com as quais os engenheiros são confrontados aquando da aplicação deste processo. Este catálogo foi construído através da análise da literatura e da revisão do estado da arte referente ao processo de transformação de monólitos para microserviços. O catálogo contém informação sobre como aplicar os refactorings, quando os aplicar, exemplos que detalham possíveis implementações, e informação adicional que pode ajudar neste processo. Além disso, esta dissertação também documenta o desenvolvimento de uma ferramenta capaz de automatizar a aplicação de um desses refactorings, com o objetivo de facilitar o processo de refactoring e de o tornar mais eficiente. O catálogo e a ferramenta atuam como a base de uma framework que pode ser expandida no futuro para incluir outros refactorings, refinar refactorings existentes, e remover refactorings que não sejam mais relevantes.

Estas duas principais contribuições do trabalho que foi desenvolvido foram validadas com a ajuda de um inquérito e de um caso de estudo, sendo que ambos retornaram resultados positivos. A maior parte dos participantes afirmou que os refactorings presentes no catálogo representam atividades consideradas úteis e comuns que acontecem durante o processo de transformação de monólitos para microserviços, e que o catálogo de refactorings e a ferramenta seriam uma ajuda valiosa neste processo.

Palavras-chave: refactoring, monólitos, microserviços


Acknowledgements

I want to express my sincere thanks to everyone who has helped and supported me during the time it took to write this thesis.

I would like to thank Professor Filipe Correia for taking the time to advise and help me write this thesis, and for always being available to clear any doubts that I might have. I thoroughly enjoyed our discussions, and I firmly believe that the feedback generated from them contributed greatly to the quality of the work that is presented in this document.

To all my colleagues from MESW, who have put up with me for the past two years, who have taught me so many things and who have made my time at FEUP unforgettable for all the best reasons, an enormous thank you.

Lastly, I want to express my most sincere thanks to my parents, who have always supported and believed in me, and without whom I would not be the person that I am today.

João Paiva da Costa Pinto


“Nothing has such power to broaden the mind as the ability to investigate systematically and truly all that comes under thy observation in life.”

Marcus Aurelius


Contents

1 Introduction
   1.1 Context
   1.2 Motivation
   1.3 Thesis Structure

2 Background
   2.1 Service-Oriented Architecture
   2.2 The Emergence of the Cloud
   2.3 Monoliths
   2.4 Microservices
   2.5 Microservices vs SOA
   2.6 Microservices vs Monoliths
      2.6.1 Advantages of Monoliths
      2.6.2 Disadvantages of Monoliths
      2.6.3 Advantages of Microservices
      2.6.4 Disadvantages of Microservices
      2.6.5 Choosing an approach
   2.7 Software Refactorings
   2.8 Refactoring Challenges and Benefits
   2.9 Refactoring Monoliths to Microservices

3 Existing Approaches for Refactoring Monoliths to Microservices
   3.1 Decomposition Strategies and Approaches
   3.2 Migration Patterns
      3.2.1 Enable Continuous Integration
      3.2.2 Enable Continuous Deployment
      3.2.3 Decompose the Monolith
      3.2.4 Change Code Dependency to Service Call
      3.2.5 Introduce Service Discovery
      3.2.6 Introduce Load Balancing
      3.2.7 Introduce Circuit Breaker
      3.2.8 Introduce Configuration Server
      3.2.9 Introduce Edge Server
      3.2.10 Containerize the Services
      3.2.11 Deploy into a Cluster and Orchestrate Containers
      3.2.12 Monitor the System and Provide Feedback
   3.3 Splitting the Database
      3.3.1 Breaking Foreign Key Relationships
      3.3.2 Shared Static Data
      3.3.3 Shared Data
      3.3.4 Transactional Boundaries
      3.3.5 Monitoring and Reporting
   3.4 Automating Architectural Refactorings

4 Problem Statement
   4.1 Refactoring Monoliths to Microservices
   4.2 Systematic Approach to Refactoring
   4.3 Hypothesis
   4.4 Expected Contributions
   4.5 Methodology

5 Catalogue of Refactorings
   5.1 Assumptions
   5.2 Extract Microservice
      5.2.1 Motivation
      5.2.2 Mechanics
      5.2.3 Example
   5.3 Break Foreign Key
      5.3.1 Motivation
      5.3.2 Mechanics
      5.3.3 Example
   5.4 Local Method Call to Synchronous Remote Call
      5.4.1 Motivation
      5.4.2 Mechanics
      5.4.3 Example with HTTP
   5.5 Local Method Call to Asynchronous Remote Call
      5.5.1 Motivation
      5.5.2 Mechanics
      5.5.3 Example with RabbitMQ
   5.6 Configure Circuit Breaker
      5.6.1 Motivation
      5.6.2 Mechanics
      5.6.3 Example with Polly
   5.7 Configure Service Discovery
      5.7.1 Motivation
      5.7.2 Mechanics
      5.7.3 Example with Eureka and Steeltoe
   5.8 Configure Health Check
      5.8.1 Motivation
      5.8.2 Mechanics
      5.8.3 Simple Example
      5.8.4 Custom Validation Example
   5.9 Containerize Service
      5.9.1 Motivation
      5.9.2 Mechanics
      5.9.3 Example with Docker
   5.10
      5.10.1 Motivation
      5.10.2 Mechanics
      5.10.3 Example with Docker Compose
   5.11 Centralize Logging
      5.11.1 Motivation
      5.11.2 Mechanics
      5.11.3 Example with the Elastic stack
   5.12 Centralize Configuration
      5.12.1 Motivation
      5.12.2 Mechanics
      5.12.3 Example with Spring Cloud Config

6 Refactoring Monoliths to Microservices Tool
   6.1 Purpose
   6.2 Dependencies and Tooling
   6.3 Base Framework
   6.4 Implementing the Refactoring
   6.5 Testing and Code Quality
   6.6 User Interface

7 Results and Validation
   7.1 Catalogue of Refactorings Survey
      7.1.1 Validation Threats
      7.1.2 Respondents
      7.1.3 Survey Results
   7.2 Refactoring Tool Case Study
      7.2.1 Validation Threats
      7.2.2 Case Study Results
   7.3 Hypothesis and Research Questions

8 Conclusion and Future Work
   8.1 Contributions
   8.2 Future Work
      8.2.1 Catalogue of Refactorings
      8.2.2 Refactoring Tool

References

A Catalogue of Refactorings Survey

B Catalogue of Refactorings Survey Responses

C Refactoring Tool Case Study


List of Figures

2.1 An example microservices architecture using an API Gateway [47].
2.2 Comparison between scaling a monolith versus scaling a microservice [8].

5.1 The Items table has a foreign key to the Orders table, translating into a one-to-many relationship between the Orders and the Items table.
5.2 The default Eureka interface with no registered services.
5.3 General Info and Instance Info sections of the default Eureka interface with no registered services.
5.4 The Rooms and Pricing microservices register themselves on Eureka.
5.5 The Kibana interface showing logs related to the Orders and Items microservices.

6.1 Result after running the existing test suite.
6.2 Code quality report generated with Visual Studio’s default code analysis tool.
6.3 The command line interface used to interact with the refactoring tool.

7.1 "How long have you been developing software." survey question results.
7.2 "I am familiar with the following concepts" survey statement results.
7.3 "I am experienced in migrating monoliths to microservices." survey statement results.
7.4 "Migrating monoliths to microservices is a complex process." survey statement results.
7.5 "I am experienced in migrating monoliths to microservices." survey statement results.
7.6 "A catalogue with these refactorings would help me refactor a monolith into microservices" survey statement results.
7.7 Command line tool average vote by experience type queries’ result.
7.8 IDE plugin tool average vote by experience type queries’ result.


List of Tables

7.1 Manual refactoring application results comparison.
7.2 Refactoring tool results comparison with manual results.


Listings

5.1 The ImagesController class provides CRUD endpoints to manage images.
5.2 The VideosController class provides CRUD endpoints to manage videos.
5.3 The .csproj file of the monolith, which contains the project’s dependencies.
5.4 The Order and Item classes that are used by the ORM to map to tables.
5.5 The GetOrders endpoint used to retrieve orders and associated items.
5.6 SQL code generated by Entity Framework when querying the orders.
5.7 Example JSON of the invocation of the orders endpoint.
5.8 The GetOrders endpoint on the Orders microservice now calls the Items microservice.
5.9 The GetOrderItems endpoint on the Items microservice that retrieves the items of an order.
5.10 The RoomsService makes direct use of the PricingService by making a local call to one of its methods in a monolith architecture.
5.11 The RoomsService uses an HTTP client to make a request to the PricingService microservice.
5.12 The PricingService microservice exposes an endpoint that provides its previous functionality, acting as a remote proxy over HTTP.
5.13 The ItemsController class implementation.
5.14 The ItemsService class implementation.
5.15 The MediaGenerationPublisher class implementation.
5.16 The AddItem method now uses the MediaGenerationPublisher class instead of the MediaGenerationService class.
5.17 The MediaGenerationListener class implementation.
5.18 The MediaGenerationListener class must start listening for incoming messages on application startup.
5.19 Configuring a Circuit Breaker with Polly and .NET Core on the ConfigureServices method of the Startup class.
5.20 The Rooms microservice app.settings configuration so that it can register itself on the service registry.
5.21 Startup configuration in order to use Steeltoe.
5.22 Previous implementation of the invocation of the Pricing microservice.
5.23 Necessary configuration of the discovery client on the RoomsService.
5.24 Difference between the Pricing microservice URL before and after introducing service discovery.
5.25 Simple health check endpoint implementation.
5.26 Using the .NET Core framework to configure simple and custom validations for health checks.
5.27 Custom health check validation example that queries the database to make sure that it is actively accepting requests and that the connection is up.
5.28 Example Dockerfile for a .NET Core app targeting version 2.2.
5.29 Logging to a file with Serilog.
5.30 Example Docker Compose YAML configuration file for the e-shop application.
5.31 Running Docker Compose.
5.32 The OrdersController class.
5.33 Retrieving an order by its ID success log.
5.34 Retrieving an order by its ID failure log.
5.35 The OrdersController class after the split to microservices.
5.36 URL is retrieved through a configuration file.
5.37 Serilog configuration in the Orders microservice.
5.38 Serilog configuration in the Items microservice.
5.39 The configuration file stored on the remote repository.
5.40 The values in the app.settings.json file necessary to reach the configuration server.
5.41 Modifications needed on the Program.cs class in order to use the configuration server.
5.42 Command necessary to run the Docker image for the Spring Cloud Configuration Server.
6.1 The RefactoringStrategyFactory is responsible for deciding which concrete strategy to use at runtime.
6.2 The IRefactoringStrategy is implemented by the concrete refactorings.
6.3 Loading the specified solution into memory so that it can later be analysed.
6.4 The FindTypeDependenciesForType method is responsible for gathering the


Acronyms

*aaS As A Service

ACID Atomicity, Consistency, Isolation, Durability

API Application Programming Interface

BPM Business Process Management

CLI Command Line Interface

CRUD Create, Read, Update, Delete

CSV Comma-Separated Values

DDD Domain-Driven Design

DI Dependency Injection

ESB Enterprise Service Bus

HATEOAS Hypermedia As The Engine Of Application State

HTTP HyperText Transfer Protocol

IDE Integrated Development Environment

IT Information Technology

OS Operating System

REST Representational State Transfer

RQ Research Question

SOA Service-Oriented Architecture

SOAP Simple Object Access Protocol

SPOF Single Point of Failure

URL Uniform Resource Locator

WSDL Web Service Definition Language

XML eXtensible Markup Language

YAML YAML Ain’t Markup Language


Chapter 1

Introduction

This chapter provides an introduction to the work that was conducted during the writing of this thesis by outlining a brief context, motivation and problem definition, and by detailing the general thesis structure.

1.1 Context

In recent years, the software world has been the subject of numerous changes. Perhaps one of the most significant changes in the past two decades has been the introduction of the *aaS delivery model and cloud computing [24, 15]. Cloud computing has introduced an entirely new software paradigm which has forced engineers to come up with intelligent solutions to both adapt to it and make the most out of it. From an architectural standpoint, this has translated into the idealisation of new architectures, often referred to as cloud-native architectures due to their synergy with the cloud. The most common cloud-native architecture to date is the microservices architecture [45].

1.2 Motivation

One of the most important features in a cloud setting is the ability to adapt to change [14], by either scaling a given functionality up or down (commonly referred to as elasticity) [41] and by being adaptable and tolerant to failure (resilience) [19]. More conventional on-premises architectures, present before the adoption of the cloud, tend to be monolithic in nature, which makes them a poor candidate for the cloud [4], especially when there is a need to scale a particular behaviour of the monolith, to have an efficient continuous deployment strategy to launch updates for specific features, or to use small but highly specialised teams focused on specific functionalities, among others [31, 38]. Microservices attempt to fix these issues by proposing an opinionated distributed system architecture based on the decomposition of features into individual, self-contained services, which can be independent in nature (e.g., in technology or delivery pipeline) and which together provide the same functionality as a monolithic application [45].

The power of these new architectures has become evident with the demonstration of their successful adoption in large scale scenarios by big corporations such as Amazon [43] and Netflix [46]. This success has cemented the microservices architecture as one of the top solutions when looking for highly reliable and responsive large scale architectures [45]. With this in mind, it is understandable that other businesses would want to change their monolithic solutions to microservices. Depending on the monolith and how entangled or modular it was built, this transformation might not be straightforward to perform and is often a complex problem [20]. Engineers often tackle these problems with a trial-and-error approach, which can be a very long and error-prone process.

1.3 Thesis Structure

This thesis is organised in the following chapters:

• Chapter 1 Introduction - This chapter is dedicated to the introduction of the themes that are discussed in this thesis by providing a brief context, motivation and problem definition, and by outlining the general structure of the thesis.

• Chapter 2 Background - This chapter presents information that is considered a prerequisite in order to understand the contents of this thesis and also clarifies some key terms and concepts used throughout the document and their respective scopes.

• Chapter 3 Existing Approaches for Refactoring Monoliths to Microservices - This chapter describes the conducted literature review and state of the art analysis regarding the topic of refactoring monoliths to microservices.

• Chapter 4 Problem Statement - This chapter is dedicated to the exposition of a more formal definition of the problem by detailing it, enumerating a hypothesis and respective research questions.

• Chapter 5 Catalogue of Refactorings - This chapter contains a catalogue of refactorings that can be used when migrating monoliths to microservices, which is one of the main contributions of this thesis.

• Chapter 6 Refactoring Monoliths to Microservices Tool - This chapter describes the implementation and usage of a tool that automatically applies one of the refactorings present in the catalogue of refactorings.

• Chapter 7 Results and Validation - This chapter analyses the gathered results from the survey and case study that were performed in order to validate the major contributions of this thesis and its hypothesis.



• Chapter 8 Conclusion and Future Work - The final chapter addresses the main contributions of this thesis and their respective future work.


Chapter 2

Background

A background chapter is useful to present contextual or prerequisite information that is important or essential to understand the main body of a thesis. This chapter will be used to present historical developments that have set the stage for the conducted research and subsequent chapters, clarify ambiguities over critical terms and their respective scope, and bring together several subjects, explaining which aspects of each are being included or disregarded.

2.1 Service-Oriented Architecture

Service-Oriented Architecture (SOA) is an architectural style that is based on services. Services are the building blocks of this architecture and are characterized by representing a business activity that has a designated outcome (e.g., Orders service, Items service). A service is usually self-contained, but can also be composed of other services. A service acts as a “black box” to consumers of the service, which only have knowledge of a contract provided by it [36].

An architectural style is often described as the combination of distinctive features through which a given architecture is expressed. In that sense, the SOA architectural style possesses the following distinctive features [44]:

• It is based on the design of services (which mirror real-world business activities) comprising the enterprise (or inter-enterprise) business processes [44].

• Unique requirements are placed on the infrastructure - "it is recommended that implementations use open standards to realize interoperability and location transparency. Implementations are environment-specific – they are constrained or enabled by context and must be described within that context" [44].

• It requires strong governance of service representation and implementation.

• A good service is usually determined by a Litmus Test.


• SOA services can be implemented in a number of technologies due to the nature of their business requiring different functionalities, which means that there is usually a need for complex system integration. Service orchestration in this scenario is usually delegated to an enterprise service bus (ESB) that is responsible for mediating communication between all relevant parties and enhancing or adapting the payload between them as needed. In other words, an ESB solves the problem of complex system integration (usually stateless, short-lived transactions). For long-running, usually stateful, transactions or processes, business process management (BPM) [18] is often preferred, solving the problem of modelling and orchestrating business processes, integrating people and systems.

• A service provides a contract, which is an agreement between a consumer of that service and itself.

SOA is a broad term that defines only that business operations should be encapsulated inside services, saying nothing about how these are implemented. A typical implementation of SOA is to use web services, very much due to the necessities of distributed systems. Although web services are usually thought to use the web service definition language (WSDL) and the simple object access protocol (SOAP), due to how they have been commonly implemented historically, the reality is that “web service” is also a loose term: it only defines a service that operates over a network, usually serving clients in a distributed architecture, and that provides a set of contained functionality to a client (e.g., a business capability, domain operations).
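To make the notion of a contract more concrete, the sketch below shows what a SOA-style contract could look like in C# using WCF attributes. The IOrdersService interface and OrderDto type are hypothetical names invented for this illustration, not something discussed in this thesis; the point is only that consumers program against the contract, while the implementation behind it remains a black box.

```csharp
using System.ServiceModel;

// Hypothetical SOA-style contract: consumers only know this interface and the
// data it exchanges; the service implementation behind it is a "black box".
[ServiceContract]
public interface IOrdersService
{
    [OperationContract]
    OrderDto GetOrder(int orderId);

    [OperationContract]
    int PlaceOrder(OrderDto order);
}

// Data carried across the service boundary.
public class OrderDto
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}
```

From a contract like this one, a WSDL description could historically be generated for SOAP clients, while the same idea of a published, stable contract applies equally to REST-style web services.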

2.2 The Emergence of the Cloud

The appearance of the cloud has shifted the way software is built forever and has brought with it numerous changes. The big jump in cloud adoption is often attributed to companies such as Amazon, which began by providing on-demand information technology (IT) and business services over the internet [35]. This way of providing services is what later became generally known as "as a service" (*aaS). Due to its success, the cloud saw massive adoption and has become the de facto solution for providing applications worldwide for many companies.

This newly found paradigm meant that an application living in the cloud could be served to the entire world, making use of regions and multi-tenancy to provide the same functionality as an application installed on-premises. However, this new paradigm led to an increasing demand for application availability, which in turn meant that applications had to become more resilient to failure and adaptable to incoming traffic. Cloud providers (e.g., Amazon Web Services, Azure, Google Cloud Platform) started providing various ways of taking advantage of these new concepts with new types of offerings, such as infrastructure as a service (IaaS), network as a service (NaaS) and, perhaps more relevant in this context, software as a service (SaaS) [24]. Architectures considered more traditional, such as monoliths, were not prepared to meet these demands, which led to newer architectures being created. These architectures are commonly referred to as cloud-native architectures (e.g., microservices) and can leverage the full potential of the advantages offered by the cloud.

One of the main goals of cloud computing is to allow users to benefit from all of the technologies it provides without the need for in-depth knowledge about each one of them. The cloud aims to reduce costs and help users focus on their core business instead of being set back by IT obstacles [16].

2.3 Monoliths

A monolith is a tightly coupled unit with multiple components designed to work together, running in a single process or executable [8]. A monolithic architecture is usually used when starting a new project because it is fast to develop and easy to maintain.

A monolith is built as a single unit, meaning that the entire code base lives together and therefore any change, regardless of its magnitude, implies rebuilding and redeploying the whole application. These constraints force the monolith to use the same technology stack and programming language during the entire lifecycle of the system.

The nature of the monolithic architecture makes scaling monoliths reasonably straightforward: replicate the environment and redeploy the monolithic unit across multiple instances, while a load balancer handles and manages requests to stabilise the usage of each unit and the overall system performance [32]. This approach is known as horizontal scaling. This method of making an application scale is relatively easy to develop and apply, but not very robust, because when a system component is under severe pressure, all the other components in the unit will be affected, which can lead to a waste of resources. When one of those components fails, the entire system will most likely collapse, which makes fault isolation extremely challenging to achieve in a monolithic architecture.

Monoliths are widely used in traditional web applications, especially layered architectures, which tend to always start with a monolithic approach. This method of developing software is encouraged by IDEs and the vast number of web frameworks available, which favour quick productivity and fast prototyping [38].

The software community has realised that monoliths often solve the early problems of development, but when complexity starts to grow and continuous upgrades and development become a must, the monolith presents a set of drawbacks that often lead companies and developers to look to microservices as a solution [31].

2.4 Microservices

In the past decade, the software engineering community has observed a tendency towards the adoption of cloud computing [29]. The changing infrastructural circumstances, caused by this new paradigm, require architectural styles that need to leverage opportunities provided by cloud infrastructures as well as tackle the different challenges of building cloud-oriented applications.


An architectural style that has drawn a substantial amount of attention in the industry and that attempts to address these challenges is the microservices architecture.

The microservices architecture [22] is a subset of Service-Oriented Architecture (SOA) [28], essentially comprised of a set of microservices that work together to provide the functionality of an entire application. However, unlike the typical contract-first SOA-based services that are usually the norm in these environments, a microservice is an independent component highly focused either on one feature (business capability) or on representing a well-defined context of the entire application domain [8]. The microservices architecture is essentially defined by the following characteristics [32]:

• Technology Heterogeneity - since each microservice is independent of the others, there are no limits to the technologies or approach (e.g., number of layers, database type) that can be used when developing them. This allows one to pick the best tool for each job rather than having to select a one-size-fits-all approach, which often ends up being the lowest common denominator.

• Resilience - having microservices as separate units prevents failures from cascading to an entire application and facilitates problem isolation. If a microservice fails, the rest of the microservices are not affected and can keep working as normal. Thus, systems can be built to handle the total failure of services and degrade functionality accordingly.

• Scaling - dealing with separate services enables scaling each one of them separately as the need arises, instead of having to scale everything as a piece. Not every service will have the same amount of usage, and we can deal with this problem more efficiently.

• Ease of Deployment - in comparison with more traditional architectures such as the monolith, deploying a microservice is a reasonably straightforward task, because they can be deployed independently without needing the entire application to be deployed. This allows the code to be deployed faster and, if a problem occurs, it can be quickly isolated to an individual service, making it easy to roll back changes.

• Organizational Alignment - microservices allow for better alignment of an architecture to an organization, helping to minimize the number of people working on one codebase to hit the perfect spot of team size and productivity.

• Composability - microservices allow functionality to be consumed in different ways for different purposes. This can be especially important when thinking about the different ways consumers may want to use an application (e.g., different requirements for mobile and web applications).

• Optimizing for Replaceability - having self-contained services with smaller scopes makes it easier to replace them in the future with different implementations if the need arises. This helps prevent codebases from becoming deteriorated and unmaintainable since it is easier to manage their replacement with a better implementation or even delete them altogether.



Because microservices are independent of other microservices, they often need to communicate with each other to perform actions that span different domains of the application. How this communication is achieved is commonly decided on a case-by-case basis, but it is usually done through messaging patterns, remote procedure calls and, less commonly, REST/SOA interfaces [27].

Figure 2.1: An example microservices architecture using an API Gateway [47].

External communication is oftentimes delegated to an API gateway [27] that acts as an entry point into the application (Fig. 2.1) and usually also takes on other roles, such as orchestrating the underlying microservices, acting as a reverse proxy or a load balancer, or dealing with authentication and authorization.
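As a rough illustration of the gateway's role as a single entry point, the following sketch (a hypothetical example, not taken from this thesis) shows an ASP.NET Core controller that simply forwards order requests to a downstream Orders microservice; the orders-service host name is assumed to be resolvable (e.g., via Docker's internal DNS). A real gateway would typically also handle authentication, rate limiting and load balancing, or rely on a dedicated gateway product instead of hand-written forwarding.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

// Hypothetical gateway controller: exposes a public /orders endpoint and relays
// the call to the internal Orders service. Requires services.AddHttpClient()
// to be registered in Startup.ConfigureServices.
[ApiController]
[Route("orders")]
public class OrdersGatewayController : ControllerBase
{
    private readonly HttpClient _client;

    public OrdersGatewayController(IHttpClientFactory factory) =>
        _client = factory.CreateClient();

    [HttpGet("{id}")]
    public async Task<IActionResult> GetOrder(string id)
    {
        // Cross-cutting concerns (authentication, authorization, rate limiting)
        // could be applied here before the request is forwarded.
        var response = await _client.GetAsync($"http://orders-service/orders/{id}");
        var body = await response.Content.ReadAsStringAsync();
        return StatusCode((int)response.StatusCode, body);
    }
}
```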

Given their architectural complexity, microservices do not come without their difficulties and issues [9], more specifically:

• Distribution - distributed systems are harder to program, since remote calls are slow and are always at risk of failure, even if these risks can be mitigated with an architecture prepared for failure (e.g., the circuit breaker pattern; a small sketch follows this list).

• Eventual Consistency - maintaining strong consistency is often difficult for a distributed system, which means that usually eventual consistency is preferred since it is not possible to enforce consistency, availability and partition tolerance at the same time (CAP theorem). This often means using complex strategies such as shared transactions, event queues or other event propagation techniques when dealing with sharing state across multiple microservices.


• Operational Complexity - managing clusters of microservices is inherently more complicated than a single monolithic application due to their numbers and the orchestration that is usually necessary for the individual services to work together.

• Performance - having multiple calls and orchestration between microservices can cause latency that is hard to deal with, and that can create entropy in the whole system.

• Logging and Monitoring - it can become cumbersome and difficult to understand why a problem is happening due to the difficulty in replicating the problem conditions. Creating an efficient logging system for a distributed system composed of many different services is challenging, even if there are options that facilitate this process (e.g., the ELK stack).
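To make the circuit breaker mentioned in the first item above a little more concrete, the sketch below uses the Polly library (the same library used later in the catalogue) to protect calls to a downstream service. The ResilientCalls helper name and the concrete thresholds are assumptions made only for this example.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

// Hypothetical helper: after 3 consecutive HTTP failures the circuit opens for
// 30 seconds, so further calls fail fast instead of piling up on a failing service.
public static class ResilientCalls
{
    private static readonly IAsyncPolicy Breaker =
        Policy.Handle<HttpRequestException>()
              .CircuitBreakerAsync(exceptionsAllowedBeforeBreaking: 3,
                                   durationOfBreak: TimeSpan.FromSeconds(30));

    public static Task<string> GetStringAsync(HttpClient client, string url) =>
        // While the circuit is open, Polly throws BrokenCircuitException
        // immediately, which the caller can translate into degraded behaviour.
        Breaker.ExecuteAsync(() => client.GetStringAsync(url));
}
```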

Microservices are a powerful architectural option when presented with large scale solutions that live in a cloud environment. Even so, it is not advised to start a project with this architecture, but rather to migrate existing architectures (e.g., monoliths) as the need to scale arises.

2.5 Microservices vs SOA

Both microservices and SOA are architectures that are based on services. Although they share some similarities, they also have big differences [36], mainly:

• Service Granularity - In a microservices architecture, services tend to be highly focused on performing one functionality extremely well. In contrast, in a SOA, services can range in size from small applications to very large enterprise services. It is not unusual to have a service in a SOA that represents a significant product or a part of a subsystem of a larger offering.

• Component Sharing - Service sharing or reuse is one of the core principles of a SOA that enables services to be used under different contexts, often resulting in one operation making use of several services. In a microservices architecture, sharing is often reduced to a minimum since most microservices live in isolation and are commonly segregated through what is usually referred to as a bounded context. In essence, a bounded context represents an independent domain concept that can be split from the core domain, usually having complete ownership of its data as a single unit. Because SOA commonly relies on multiple services to answer a business request, systems built on SOA are usually slower than their equivalent in a microservices architecture.

• Middleware vs API layer - Microservices usually define what is commonly known as an API layer or API Gateway often acting as a facade of the entire system or an edge server. Contrary to this, SOA tends to use a messaging middleware capable of abstracting different communication protocols used by different services as well as enhancing messages and payloads across the entire system.



• Heterogeneous interoperability - SOA promotes the usage of several heterogeneous protocols by allowing them all to work together with its messaging middleware component, usually an enterprise service bus (ESB). Microservices attempt to simplify this by reducing the number of choices for integration. SOA should be considered if there is a necessity of integrating several systems using different protocols in a heterogeneous environment; otherwise, microservices tend to be a better candidate.

2.6 Microservices vs Monoliths

Regardless of the many advantages that microservices may have when compared to a more traditional monolithic architecture, microservices are not a silver bullet that can solve every architectural problem. Microservices are especially useful in large scale enterprise applications based on cloud environments.

The pros and cons that this architectural paradigm poses and the type of complexity that it brings to a project should always be considered with care before a decision is made. There are many situations where monoliths are better suited, and in order to understand where to use each, it is necessary to understand the advantages and disadvantages of one over the other [31].

2.6.1 Advantages of Monoliths

The overall development process of a monolith is relatively simple. Not having to deal with distributed systems makes monoliths simple to develop and test. Integration tests are also easier to write because the system is comprised of one behemoth component, which usually means that the only integration testing that is performed is against a database. Deploying, configuring continuous integration and continuous deployment is also straightforward because most monoliths are built as a simple package that can be easily deployed. Monoliths are also simple to scale horizontally by running multiple replicas behind a load balancer (Fig.2.2).

2.6.2 Disadvantages of Monoliths

The disadvantages of monoliths start to become more evident as the monolith begins to grow, sometimes to an unmanageable size. The application can become too large and complex to be fully understood, thus making it hard to apply changes fast and correctly. The ever-growing size of the application can have negative impacts on the start-up time of the application. It is usually necessary to redeploy the entire application on each update unless you have the application separated in different packages.

Although scaling horizontally is simple and monoliths can be easily replicated, this is an ineffective approach in a high scale scenario, given that different modules of the monolith may have conflicting resource requirements, and we may be assigning too much processing power to a monolith in which only one service is appropriately taking advantage of it. With the current requirements of high availability, one of the most sought-after capabilities of a system is its capacity to be elastic and respond to different situations appropriately by, for example, being able to scale a service while it is being highly requested and then elegantly going back to a more normal state when it stops being requested as much.

Another common problem with monolithic applications is their reliability. A bug or issue in any module can potentially bring down the entire process. Furthermore, since all instances of the application are identical, that bug will impact the availability of the entire application, since it can happen on any of the replicas. Monoliths also make it harder to apply changes, since modifications to frameworks or languages will affect the entire application, and it is extremely expensive in both time and cost to apply them.

Figure 2.2: Comparison between scaling a monolith versus scaling a microservice [8].

2.6.3 Advantages of Microservices

Microservices tackle the problem of complexity by decomposing an application into a set of manageable services which are much faster to develop and easier to understand and maintain. Microservices enable each service to be developed independently by a team that is focused on that service. Because every service is isolated, the barrier to adopting new technologies is practically non-existent, since developers are free to choose whatever technologies are more appropriate for a given microservice and not be bound to the choices made, for example, at the start of a project.

The microservices architecture enables each microservice to be deployed independently. This has the added benefit of also enabling each service to be scaled independently (Fig. 2.2).



2.6.4 Disadvantages of Microservices

The microservices architecture adds complexity to a project just by the fact that a microservices application is a distributed system. This often means having to choose and implement an inter-process communication mechanism based on either messaging or RPC, writing code to handle partial failure (e.g., implementing the circuit breaker pattern) and taking into account other fallacies of distributed computing (e.g., CAP theorem).

Microservices usually do not share databases between them. Transactions that update various business entities in a microservices-based application therefore also have to update databases belonging to different services. Using distributed transactions is usually the last choice, since it often means resorting to an eventual consistency approach, which is far more challenging and complex.

Testing microservices is also much more complex than a monolith. Testing a service usually means launching that service and any services that it depends upon (or at least configure stubs for those services).

It is more challenging to implement changes that span multiple services. In a monolithic application, one could simply change the corresponding modules, integrate the changes, and deploy them. In a microservice architecture, careful planning and coordination of the rollout of changes is an absolute necessity.

Deploying a microservices-based application is also more complicated. A monolithic application is often deployed and replicated on a set of identical servers, with a load balancer up front deciding which one to use. On the other hand, a microservices application usually consists of a large number of services, with the possibility of each service having multiple runtime instances. This means that each instance needs to be configured, deployed, scaled and monitored. Tools such as Docker, Docker Compose or Kubernetes attempt to attenuate this issue by providing a systematic way of dealing with these configurations as well as tooling that facilitates this process.

Additionally, due to the sheer number of services and their dynamism, it is often crucial to implement a service discovery mechanism. Manual approaches to operations at this level are impossible to scale due to the level of complexity. A successful deployment of a microservices application thus requires a high level of automation.

2.6.5 Choosing an approach

The choice between a monolith and a microservices architecture usually comes down to the complexity of the application that will be created. Monoliths are usually appropriate for small applications that are not part of an enterprise solution, while microservices usually shine in high scale scenarios [31]. Nevertheless, it is rarely advised to start an application from scratch with a microservices architecture; it is better to begin with a monolith [10] and then, as the necessity arises (e.g., scaling), start migrating to a microservices architecture.


2.7 Software Refactorings

Software code bases tend to decay in quality over time if not properly maintained. Adding feature upon feature can often increase the complexity of a software system and make it difficult to adapt to future changes [11].

Refactoring is a designation given to a disciplined technique comprised of one or more steps, in some cases even other refactorings, that alter the internal structure of an existing body of code while maintaining its external behaviour, with the aim of making a given code base more maintainable and adaptable to change [11]. The purpose of applying refactorings is tied to maintaining the health of a given software solution by preventing it from becoming an unmanageable mess. Refactorings aim at fixing code smells, which is the designation given to pieces of code that are considered bad practices, thus making the software more modular and maintainable.

Refactorings can act at different levels of a software system. Lower level refactorings can often deal with simpler things such as renaming a variable or extracting a method, while higher level refactorings can deal with changes to the architecture of a system (architectural refactorings).
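As a small, self-contained illustration of a lower-level refactoring (invented for this section, not taken from the thesis), the sketch below applies Extract Method: the external behaviour of the calculation is unchanged, but the discount rule gains a name of its own. The Order and Item types are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain types used only for this illustration.
public class Item { public decimal Price; public int Quantity; }
public class Order { public List<Item> Items = new List<Item>(); }

public static class InvoiceCalculator
{
    // Before: the discount rule is buried inside the total calculation.
    public static decimal TotalBefore(Order order)
    {
        decimal total = 0;
        foreach (var item in order.Items)
            total += item.Price * item.Quantity;
        if (total > 100) total *= 0.9m; // 10% volume discount
        return total;
    }

    // After Extract Method: same external behaviour, but the rule is now named and reusable.
    public static decimal TotalAfter(Order order) =>
        ApplyVolumeDiscount(order.Items.Sum(i => i.Price * i.Quantity));

    private static decimal ApplyVolumeDiscount(decimal total) =>
        total > 100 ? total * 0.9m : total;
}
```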

2.8 Refactoring Challenges and Benefits

Refactoring can be a simple process, such as renaming all occurrences of a method name, or a more complex one, such as performing architectural refactorings that change parts of the architecture of a software system. Refactorings are usually applied when teams detect that a given piece of software is suffering from poor readability or maintainability; difficulty adapting existing code to different scenarios and anticipated features; difficulty in testing code without a refactoring being applied; code duplication; slow performance; or old legacy code that must be worked on. Refactorings are many times applied in the context of bug fixes or feature additions and are more often driven by the need to implement immediate, visible changes or features in the short term than by the potentially uncertain benefits of long-term maintainability [21].

Given the intricacies of refactoring, which is a process that can become very complex depending on the situation and the refactoring itself, there are generally some challenges associated with the refactoring process, more specifically:

• Inherent challenges such as working on large code bases, a large number of dependencies between components, the need for coordination with other developers or teams, and ensuring program correctness after the refactoring process has been applied.

• Lack of tool support for refactoring change integration, code review tools targeting refactoring edits, and sophisticated refactoring engines in which a user can easily define new refactoring types.

• Difficulties related to merging and integration after refactoring often discourage people from doing refactoring. Most version control systems are not sensitive to refactorings that rename files or change their location in the file system, which makes it hard for developers to understand the code change history after the refactoring has been applied.

• If the refactoring is performed in the scope of another task, it typically increases the number of lines/files involved in a code check-in which burdens code reviewers and increases the possibility of the code change colliding with other code changes.

• If not done with caution, applying a refactoring can introduce regression bugs, code churn and merge conflicts; take time away from other tasks; increase the difficulty of doing code reviews after refactoring; and risk over-engineering solutions (from misunderstanding subtle corner cases in the original code and not accounting for them in the refactored code).

The major benefits generally attributed to refactoring can be summarized as improved maintainability, readability, performance, testability, extensibility and modularity; fewer bugs; code size reduction; duplicate code reduction; and reduced time to market [21, 11].

2.9 Refactoring Monoliths to Microservices

With the growing adoption of the *aaS business and delivery model, together with the emergence of cloud computing and an ever-growing need to scale applications, new architectures have been developed to address these ever demanding requirements. Applications need to scale efficiently and effortlessly, their downtime needs to be imperceptible to the end user, and they need to be managed in an efficient manner.

The microservices architectural style has been adopted and used with success by large companies to address most of these demanding requirements. Microservices have been proven to be useful in large scale scenarios, popularised by the success of platforms such as Netflix and Amazon, which have pushed the adoption of the microservices architecture immensely.

Microservices are not, however, a silver bullet that can fix every architectural software problem. Sometimes they can even cause more harm than good when employed poorly, without taking into account the scale of a given solution, resulting in an over-engineered project that could have been done in a much more straightforward fashion. Nevertheless, microservices excel in large scale SaaS and enterprise solutions, which is one of the reasons for their growing adoption.

Jumping on the microservices bandwagon is simple if a system is being built from scratch. There are no existing dependencies and no need to migrate legacy or existing codebases. However, one of the most common realities for enterprise systems and companies is the need to scale existing systems, which are commonly built as monoliths. This transformation is a complicated process that is directly related to the intricacies of how the existing system was built and is often a tedious, harsh and time-consuming process, almost always done without prior knowledge of the existence of such refactorings.


Chapter 3

Existing Approaches for Refactoring Monoliths to Microservices

Analysing the state of the art is important because it provides insights regarding the current trends and the current solutions to given problems. By analysing what others have come up with to deal with similar issues, it is easier to adopt a strategy or to implement a solution for a problem with comparable characteristics. Performing this analysis also provides a gap analysis of current challenges related to the migration of monoliths to microservices as well as an overview of all the topics that are worth investigating for the scope of this thesis.

In this chapter, the conducted analysis intends to give a current notion, as of the writing of this document, of the state of the art regarding the refactoring of monoliths to microservices.

3.1 Decomposition Strategies and Approaches

Currently, the refactoring transformation from a monolith to a microservices architecture is a process that is performed manually most of the time. This usually happens because the solution that is being refactored is typically a legacy application that is too entangled and poorly built, and needs a very custom and specific approach. Other times it is because the people doing the refactoring are not aware that there are alternatives to the manual process.

Some alternatives have been developed over time that follow different strategies and that have different outcomes. Some of them focus on ways of extracting services from a monolith [6]; others focus on determining the best way of separating the monolith and on identifying its break points [26]. From the performed research, it was not possible to ascertain the existence of alternatives focusing solely on automating the process of applying said refactorings.

From the reviewed literature [13], most of the existing formal decomposition approaches can be grouped into the following classifications:


• Approaches that use static code analysis tools, thus requiring the source code of an application. From that analysis, decompositions are generated, though possible intermediate stages may be required (a small sketch of this flavour of analysis follows this list).

• Approaches that use available metadata and require, for example, more abstract input data, such as diagrams, interfaces or version control history.

• Approaches focused on workload data that try to find suitable service decompositions by measuring usage and operational data such as performance or communication metrics, and use this data to determine an appropriate service decomposition and granularity.

• Dynamic microservice composition approaches attempt to solve the decomposition problem by using a microservices runtime environment in which the resulting set of services permanently changes at the end of each iteration through the re-calculation of appropriate compositions, which can be based on, for example, the workload.
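As a rough sketch of the static-analysis flavour referenced in the first item above, the hypothetical program below uses Roslyn to load a solution and count, for each project, how many referenced symbols are declared in other projects of the same solution, a crude proxy for the coupling information on which decomposition tools build. The CouplingProbe name, the metric itself, and the assumption that the path to a .sln file is passed as the first argument are all illustrative; a real tool would compute far richer dependency graphs.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.MSBuild;

// Hypothetical sketch, not the thesis's tool: estimate cross-project coupling
// by counting symbol references that resolve to other projects in the solution.
public static class CouplingProbe
{
    public static async Task Main(string[] args)
    {
        MSBuildLocator.RegisterDefaults();           // locate an installed MSBuild
        using var workspace = MSBuildWorkspace.Create();
        var solution = await workspace.OpenSolutionAsync(args[0]); // path to a .sln file

        var solutionAssemblies = solution.Projects.Select(p => p.AssemblyName).ToHashSet();

        foreach (var project in solution.Projects)
        {
            var compilation = await project.GetCompilationAsync();
            if (compilation == null) continue;

            var crossProjectRefs = 0;
            foreach (var tree in compilation.SyntaxTrees)
            {
                var model = compilation.GetSemanticModel(tree);
                var root = await tree.GetRootAsync();

                // Count nodes whose symbol is declared in another project of this solution.
                crossProjectRefs += root.DescendantNodes()
                    .Select(node => model.GetSymbolInfo(node).Symbol)
                    .Count(symbol => symbol != null
                                     && symbol.ContainingAssembly != null
                                     && symbol.ContainingAssembly.Name != compilation.AssemblyName
                                     && solutionAssemblies.Contains(symbol.ContainingAssembly.Name));
            }

            Console.WriteLine($"{project.Name}: {crossProjectRefs} references to other projects");
        }
    }
}
```

Metrics of this kind are what the more formal approaches typically feed into clustering or graph-partitioning steps to suggest candidate service boundaries.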

The aforementioned strategies are used to identify breaking points of the monolith in a more formal manner, allowing tools to suggest break points that would lead to service interfaces and possible microservice candidates [13]. Besides these more formal approaches found in the literature, the most common decomposition approaches used in the industry are decomposition based on business capabilities [39] and on bounded contexts (from Domain-Driven Design [7]).

The former uses a strategy to transform a monolith into a set of microservices by decomposing it by business capability. In essence, a business capability can be defined as something that is performed in the context of a business that generates value and usually captures what an organisation’s business is [39]. Contrary to how an organisation conducts its business, which is prone to change over time, a business capability tends to stay the same. For example, Netflix started as a movie rental company with a physical store and physical media that could be rented out, and now they are one of the biggest streaming platforms in the world. While their business capability (renting movies) was kept pretty much identical, the same cannot be said about how the business was conducted, which has become drastically different over time.

A business capability can be identified "by analysing the organisation’s purpose, structure, and business processes [39]". After the business capabilities have been identified, they are often promoted to services either by a one-to-one mapping or by having a service deal with a group of business capabilities. Due to the relative stability of business capabilities, creating services from them has the advantage that the services and their respective architecture should also be stable.

Another strategy that can help create services from existing monoliths relies on the concepts of subdomains and bounded contexts, both of which come from DDD [7]. Instead of using a traditional approach in which there is a single domain model for the entire system, DDD uses a separate, smaller-scoped domain model for each identified subdomain. This is useful because an entire organisation does not have to agree on the definition of a single model: it prevents the use of overly complex domain models in situations where a much simpler representation would suffice, and it provides flexibility in the sense that different subdomains can refer to different concepts by the same name without conflict. Subdomains are identified similarly to the business capabilities mentioned before: through the analysis of the business and the identification of its different core areas of expertise. The scope of a domain model is called a bounded context. A bounded context is an ideal candidate to become a microservice due to its limited scope and focused representation of the domain. Additionally, the microservices’ concept of autonomous teams working on different services is "completely aligned with DDD’s concept of each domain model being owned and developed by a single team" [7].
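As a small illustration of this idea, the following sketch (in Java, with assumed names) shows two bounded contexts that each define their own model called Product; the same term carries a different meaning, and different attributes, in each context, and neither model needs to agree with the other.

import java.math.BigDecimal;

// Minimal sketch: two bounded contexts each define their own "Product" model.
// The outer classes stand in for separate modules or services; all names and
// fields are illustrative.
class CatalogContext {
    static class Product {
        String sku;
        String title;
        String description;
        BigDecimal listPrice; // pricing and presentation matter in this context
    }
}

class ShippingContext {
    static class Product {
        String sku;           // only the identifier is shared between contexts
        double weightKg;      // shipping cares about physical attributes
        int widthCm, heightCm, depthCm;
    }
}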

3.2 Migration Patterns

Patterns are generally defined as solutions to common, recurring problems [34]. They are helpful for developers, architects, or anyone involved in the development of software, because they can be used as proven solutions to a given problem and facilitate development [42]. Migration patterns follow the same fundamental principle, but are geared towards problems related to migrating from one system to another, as is the case when refactoring monoliths to microservices.

Transforming a monolith into a cloud-native architecture such as microservices encompasses a series of steps that are often, and understandably, repeated every time there is a need to migrate such a system. The following sections describe migration strategies found in the literature that derive from real-life requirements and scenarios and that are common to most microservices solutions.

3.2.1 Enable Continuous Integration

The need for automated build and test pipelines that produce delivery artefacts increases as the complexity of a system grows. In a monolith, not having such tasks automated is somewhat manageable, but in a microservices architecture, the lack of continuous integration makes it very hard to manage builds and tests and to have confidence in the quality of the code being produced. Enabling continuous integration not only makes this more manageable but also enables adding continuous deployment in a future step [5,30].

3.2.2 Enable Continuous Deployment

One of the biggest certainties in software development is the delivery of new features or bug fixes, in what can generally be referred to as updates. Automating the delivery process is essential to the software lifecycle and gives us more confidence that the changes we want to push will be correctly shipped to where we want them to go (e.g., a live version of the application, a staging version, a regression environment). Manually performing the release of updates is error-prone and should be automated. This is even more evident in the case of microservices, where each individual microservice should have a separate continuous deployment pipeline in order to facilitate the release/update process [5,12].

3.2.3 Decompose the Monolith

As previously mentioned when discussing decomposition strategies, decomposing the monolith is one of the most significant steps of the migration. There are several strategies and techniques that can be taken into account, but the end result will always be splitting the monolith into a set of different services with the least possible entropy [13,26].

3.2.4 Change Code Dependency to Service Call

During the migration process and the splitting of the monolith, there will be situations where a code call must be changed to a service call. This will usually happen when the logic that supports a method call now lives in a different microservice. In order to preserve functionality, the method call must be changed to a remote procedure call (RPC). There are usually two schools of thought when implementing an RPC: either a synchronous style is used, with protocols such as HTTP and gRPC, or an asynchronous style, using messaging technologies such as RabbitMQ or Apache Kafka [5,39].
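As an illustration of the synchronous style, the following minimal sketch (in Java, with a hypothetical order microservice, endpoint, and class names) shows a local method call being replaced by an HTTP call to the service that now owns the logic.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch: the former in-process call orderService.getOrder(id) is
// replaced by a synchronous HTTP call to the service that now owns the logic.
public class OrderClient {

    private final HttpClient client = HttpClient.newHttpClient();
    private final String baseUrl; // e.g. resolved via service discovery

    public OrderClient(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    // Before the split: return orderService.getOrder(id);
    // After the split: the logic lives in another service, so it is called remotely.
    public String getOrderJson(long id) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/orders/" + id))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Order service returned " + response.statusCode());
        }
        return response.body(); // JSON payload, deserialised by the caller
    }
}

In practice, the base URL would be resolved through service discovery and load balancing, which are discussed in the following sections.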

3.2.5 Introduce Service Discovery

In a cloud environment, there is usually a high degree of dynamism regarding the exposed services. Services are not static: they can be replicated or shut down due to the elasticity that this type of architecture provides. In order to make this manageable for the entry points of our application (e.g., API gateway, edge server, load balancers), there needs to be a way of tracking the running service instances. Service discovery allows services to register themselves, as they start, with a registry available at a static, well-known location, which is then consumed by the interested parties. Removing a service from the registry can be triggered by the service not responding to a periodic heartbeat check or by the service terminating itself [5,39].
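The following minimal sketch (plain Java, with illustrative names and a simplified time-to-live policy) captures the core idea: instances register and periodically renew a heartbeat, and instances whose heartbeat has expired are no longer returned to consumers. A production system would typically use an existing solution such as Eureka or Consul rather than a hand-rolled registry.

import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Minimal sketch of a service registry: instances register themselves, renew a
// heartbeat periodically, and are considered gone once the heartbeat expires.
public class ServiceRegistry {

    private static final Duration TTL = Duration.ofSeconds(30);

    // serviceName -> (instanceUrl -> time of last heartbeat)
    private final Map<String, Map<String, Instant>> registry = new ConcurrentHashMap<>();

    public void register(String serviceName, String instanceUrl) {
        registry.computeIfAbsent(serviceName, k -> new ConcurrentHashMap<>())
                .put(instanceUrl, Instant.now());
    }

    // Called periodically by each running instance; equivalent to re-registering.
    public void heartbeat(String serviceName, String instanceUrl) {
        register(serviceName, instanceUrl);
    }

    // Consumers (e.g. an API gateway or load balancer) query the live instances.
    public List<String> lookup(String serviceName) {
        Instant cutoff = Instant.now().minus(TTL);
        return registry.getOrDefault(serviceName, Map.of()).entrySet().stream()
                .filter(entry -> entry.getValue().isAfter(cutoff))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}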

3.2.6 Introduce Load Balancing

Load balancing is often used to scale monolithic applications by replicating the monolith and then using a load balancer to decide which replica of the monolith should be used. In a distributed system such as microservices, this necessity is even more prominent due to the possibility of having replication on most services, which is a direct consequence of this type of elastic, cloud-native architecture. Because of that, the need to manage all replicas and to decide which one to use is even bigger in this scenario, which makes it almost mandatory to use a load balancer [5,39].
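A minimal sketch of the client-side variant of this idea is shown below, assuming the instance list comes from a registry lookup such as the one sketched in the previous section; the round-robin policy and the class name are illustrative choices.

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of client-side round-robin load balancing over the instances
// returned by a service registry lookup.
public class RoundRobinBalancer {

    private final AtomicInteger counter = new AtomicInteger();

    // Picks the next instance in round-robin order.
    public String choose(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("No healthy instances available");
        }
        int index = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}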


3.2.7 Introduce Circuit Breaker

The circuit breaker is a distributed-system architectural pattern that prevents failures from cascading to the entire system by containing the failure in the component that originated it and by making clients fail fast instead of being kept waiting for the failing service to respond. It takes inspiration from a device with the same name that produces the same effect on an electrical circuit. In a distributed system, there are many things that are prone to fail or to induce some level of failure, such as calls timing out. If this is not managed, it can easily escalate into a catastrophic system failure, hence the need to contain such a failure and gracefully recover from it [5,39].
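The following is a deliberately simplified sketch of the pattern in plain Java (class name, threshold, and cool-down values are illustrative): after a number of consecutive failures the circuit opens and calls fail fast to a fallback, and after a cool-down period calls are allowed through again. Libraries such as Resilience4j or Hystrix provide more complete implementations, including a proper half-open state.

import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Minimal circuit breaker sketch: too many consecutive failures open the
// circuit, and while it is open calls fail fast to the fallback.
public class CircuitBreaker {

    private final int failureThreshold;
    private final Duration openDuration;

    private int consecutiveFailures = 0;
    private Instant openedAt = null;

    public CircuitBreaker(int failureThreshold, Duration openDuration) {
        this.failureThreshold = failureThreshold;
        this.openDuration = openDuration;
    }

    public synchronized <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // fail fast, do not keep the client waiting
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0; // a success closes the circuit again
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = Instant.now(); // open the circuit
            }
            return fallback.get();
        }
    }

    private boolean isOpen() {
        if (openedAt == null) {
            return false;
        }
        if (Instant.now().isAfter(openedAt.plus(openDuration))) {
            // Cool-down elapsed: close the circuit and let calls through again
            // (a full implementation would use an explicit half-open state).
            openedAt = null;
            consecutiveFailures = 0;
            return false;
        }
        return true;
    }
}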

3.2.8 Introduce Configuration Server

In a microservices architecture, it is recommended that microservices get their property files from a centralised server, making it easier to configure and change those properties without having to locate each individual service and change its configuration manually. Additionally, with this flexibility, services should also be able to respond dynamically to changes to those configurations, allowing changes to take place without having to manually reboot a given service [5].
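A minimal sketch of the client side of this pattern is shown below, assuming a hypothetical configuration server that exposes properties in key=value format at /config/{serviceName}; the service pulls its configuration at start-up and then on a schedule, so changes can be picked up without a redeploy. In practice, solutions such as Spring Cloud Config or Consul provide this functionality out of the box.

import java.io.StringReader;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Properties;

// Minimal sketch: a service pulls its properties from a central configuration
// server instead of reading a local file. Endpoint and names are illustrative.
public class RemoteConfig {

    private final HttpClient client = HttpClient.newHttpClient();
    private final String configServerUrl;
    private final String serviceName;
    private volatile Properties current = new Properties();

    public RemoteConfig(String configServerUrl, String serviceName) {
        this.configServerUrl = configServerUrl;
        this.serviceName = serviceName;
    }

    // Called at start-up and then periodically (e.g. every minute).
    public void refresh() throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(configServerUrl + "/config/" + serviceName))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        Properties fresh = new Properties();
        fresh.load(new StringReader(response.body())); // key=value format
        current = fresh; // swap atomically so callers always see a full snapshot
    }

    public String get(String key, String defaultValue) {
        return current.getProperty(key, defaultValue);
    }
}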

3.2.9 Introduce Edge Server

An edge server is the entry point of a distributed system. It acts as a facade across all of the existing microservices and is also usually used to route traffic and to act as a proxy or reverse proxy. In the context of microservices, it is usually called an API gateway. The importance of having an edge server becomes evident in a microservices architecture due to the dynamism of the services (e.g., replicas, load balancing). Clients should not have to know about all of those services; they should only be aware of an entry point into the system from where they can operate and use it [5,40].
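The sketch below isolates only the routing concern of an edge server: public path prefixes are mapped to internal services, so clients need to know a single entry point. The routes and service names are illustrative, and a real gateway would also handle concerns such as authentication, rate limiting, and actually proxying the requests.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Minimal sketch of the routing table of an edge server / API gateway:
// incoming public paths are mapped to internal services by prefix.
public class EdgeRouter {

    // Ordered prefix -> internal service name (later resolved via discovery)
    private final Map<String, String> routes = new LinkedHashMap<>();

    public EdgeRouter() {
        routes.put("/api/orders", "order-service");
        routes.put("/api/customers", "customer-service");
        routes.put("/api/catalog", "catalog-service");
    }

    // Returns the internal service that should handle the given public path.
    public Optional<String> route(String path) {
        return routes.entrySet().stream()
                .filter(entry -> path.startsWith(entry.getKey()))
                .map(Map.Entry::getValue)
                .findFirst();
    }
}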

3.2.10 Containerize the Services

Containerising services is extremely useful due to the intrinsically heterogeneous nature of microservices. For example, each microservice can be built with a different framework or language, or have completely different dependencies. From a configuration and management standpoint, it is easy to see that this can become a nightmare rather quickly. Containerising the services allows them to be deployed in the same environment, with the guarantee that a container will behave the same way everywhere it is deployed, eliminating common issues such as applications behaving differently in different environments, as well as the management nightmare of having to deal with the services’ configurations manually. Additionally, by using a containerisation framework (e.g., Docker), we reduce the learning curve and ease the adaptation to a given project, because the configuration language is one that is more likely to be known by a new developer [5,30,12].


3.2.11 Deploy into a Cluster and Orchestrate Containers

Deploying into a cluster is advantageous because current cluster and orchestration frameworks provide the means to interconnect all the containers and scale them automatically, handling things like failures or high demand with replicas and automatic load balancing. Tools such as Kubernetes allow this orchestration to take place with just a configuration file. The benefit of using such a tool becomes evident as the microservices ecosystem begins to grow and managing every single service independently becomes impossible [5,17].

3.2.12 Monitor the System and Provide Feedback

In a decentralised architecture such as microservices, it is important to be able to check the functioning of the entire system in a simple way. This usually means using a centralised solution for logging and monitoring. Stacks such as the ELK stack or the CloudWatch infrastructure provided by AWS are specifically designed for this. Instead of each service writing its logs into a file, they write them to these tools, where it is then possible to check and monitor the entire system from a single place. This is particularly useful when debugging issues or trying to understand why a particular service is not behaving as expected [5].

3.3 Splitting the Database

One of the biggest challenges when performing the change from monoliths to microservices is deciding what should be done regarding the database. Commonly, monolithic systems use a shared-database strategy, which means that there is a strong coupling between the entire application and the database. A typical architecture in monolithic applications is to have a repository layer, which exposes methods that make it possible to perform controlled actions over the database (e.g., obtaining a list of orders, updating a client’s information), and then use an ORM such as Hibernate (http://hibernate.org/orm/what-is-an-orm/) or Entity Framework (https://docs.microsoft.com/en-us/ef/) to map OO entities to their respective representations in the database, usually a table.
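As a minimal sketch of the repository layer just described (entity, field, and method names are assumed for illustration, and the exact annotations depend on the JPA/Hibernate version in use):

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;

// Minimal sketch of a monolithic repository layer: a JPA/Hibernate entity
// mapped to a table, and a repository exposing controlled operations over it.
@Entity
class PurchaseOrder {
    @Id
    private Long id;
    private String customerId;
    private String status;
    // getters and setters omitted for brevity
}

class PurchaseOrderRepository {

    private final EntityManager em;

    PurchaseOrderRepository(EntityManager em) {
        this.em = em;
    }

    // "Obtaining a list of orders" for a given customer, as mentioned above.
    List<PurchaseOrder> findByCustomer(String customerId) {
        return em.createQuery(
                        "select o from PurchaseOrder o where o.customerId = :customer",
                        PurchaseOrder.class)
                .setParameter("customer", customerId)
                .getResultList();
    }
}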

When transforming a monolithic system to microservices, a decision must be made: either the database is kept shared by all the microservices, or it is split into different databases, each used by a single microservice.

Opting for the first approach has some advantages:

• The entire database can be accessed with SQL queries, which means that there is no problem accessing information that is spread far apart and that would live in another microservice if the database-per-microservice option were adopted instead.

• All transactions performed against the database are guaranteed to be ACID, which makes dealing with transactions a lot easier.


• A single database is simpler to operate.

However, it might not be the best approach in a microservices architecture due to the following disadvantages [39]:

• A developer working on a service will need to coordinate schema changes with developers of other services that have access to the same tables. This coupling and additional coordination will slow down development.

• Because all services access the same database, they can potentially interfere with one another. For example, if a long-running transaction holds a lock on a table, the microservice that uses that table can potentially be blocked.

• A single database might not satisfy the data storage and access requirements of all services.

The other approach is to split the database into microservice-specific databases. The most common options are to use an independent table, an independent schema, or an independent database server per microservice. Using a database per microservice goes hand in hand with the philosophy of developing microservices [39]: everything is contained within a microservice, with every microservice having full ownership of, and only knowing about, its own database. The most common advantages of this approach are:

• Enforcing loose coupling between different microservices.

• Each microservice can pick the most appropriate persistence solution for its specific business or domain needs (e.g., using Elasticsearch to provide advanced full-text search functionality or using Neo4j for a graph-intensive use case).

Having distributed databases, although advantageous, also poses issues that stem from the complexity introduced by this approach, more specifically:

• Implementing business transactions that span different microservices becomes more challenging because they cannot be implemented with a single ACID transaction. Instead, different mechanisms have to be considered, such as distributed transactions, API orchestration or composition, or a publish-subscribe architecture with event messaging (also known as the Saga pattern; a minimal sketch of the latter is shown after this list).

• The complexity of managing different databases increases due to the particularities of how each database works (e.g., NoSQL vs. relational databases).
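The sketch below illustrates the choreography-based Saga idea in plain Java, using an in-memory event bus as a stand-in for a real message broker such as RabbitMQ or Kafka; all service names, topics, and the happy-path logic are illustrative assumptions. Each service commits its own local transaction and then publishes an event that the next service reacts to, with a compensating action replacing the rollback that a single ACID transaction would otherwise provide.

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// In-memory event bus standing in for a real message broker.
class EventBus {
    private final Map<String, List<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(handler);
    }

    void publish(String topic, String payload) {
        subscribers.getOrDefault(topic, List.of()).forEach(handler -> handler.accept(payload));
    }
}

// Minimal choreography-based saga: each step is a local transaction followed
// by an event, and a rejection triggers a compensating action.
public class SagaSketch {
    public static void main(String[] args) {
        EventBus bus = new EventBus();

        // Payment service: reacts to OrderCreated, performs its local step,
        // then publishes either PaymentApproved or PaymentRejected.
        bus.subscribe("OrderCreated", orderId -> {
            boolean charged = true; // pretend the local payment transaction succeeded
            bus.publish(charged ? "PaymentApproved" : "PaymentRejected", orderId);
        });

        // Order service: completes or compensates based on the payment outcome.
        bus.subscribe("PaymentApproved", orderId ->
                System.out.println("Order " + orderId + " confirmed"));
        bus.subscribe("PaymentRejected", orderId ->
                System.out.println("Order " + orderId + " cancelled (compensating action)"));

        // Order service: local transaction (create the order) followed by the event.
        bus.publish("OrderCreated", "42");
    }
}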

From a refactoring standpoint, the first choice is the obvious and easiest one, since it requires no transformation, at the cost of the aforementioned disadvantages. In contrast, the database-per-service approach, although trickier, can be much more advantageous in the long run.
