
Program synthesis from denotational semantics


Heitor Paceli Maranhão

Program synthesis from denotational semantics

MSc. Thesis

Federal University of Pernambuco

<http://www.cin.ufpe.br/~posgraduacao>

Recife

2016


Heitor Paceli Maranhão

Program synthesis from denotational semantics

An MSc thesis presented to the Center for Informatics of the Federal University of Pernambuco in partial fulfillment of the requirements for the degree of Master of Science in Computer Science.

Federal University of Pernambuco

Center of Informatics

Graduate in Computer Science

Supervisor: Alexandre Cabral Mota

Co-supervisor: Juliano Manabu Iyoda

Recife

2016


Cataloging at source

Librarian Jefferson Luiz Alves Nazareno, CRB 4-1758

M311p Maranhão, Heitor Paceli.

Program synthesis from denotational semantics / Heitor Paceli Maranhão – 2016.

93 leaves: fig., tab.

Advisor: Alexandre Cabral Mota.

Dissertation (Master's) – Universidade Federal de Pernambuco. CIn. Computer Science, Recife, 2016.

Includes references.

1. Computer programs. 2. Program synthesis. 3. Programming languages (computers). I. Mota, Alexandre Cabral (Advisor). II. Title.


Heitor Paceli Maranhão

Program Synthesis from Denotational Semantics

Master's dissertation presented to the Graduate Program in Computer Science of the Federal University of Pernambuco, in partial fulfillment of the requirements for the degree of Master in Computer Science.

Approved on: 13/09/2016.

EXAMINATION COMMITTEE

__________________________________________ Prof. Dr. Leopoldo Motta Teixeira

Centro de Informática / UFPE

__________________________________________ Prof. Dr. Rohit Gheyi

Departamento de Sistemas e Computação / UFCG

__________________________________________ Prof. Dr. Alexandre Cabral Mota

Centro de Informática / UFPE


I dedicate this dissertation to my family, friends and professors who supported me and were always ready to help.


Acknowledgements

Firstly to God, who loved, guided and blessed me during my entire life.

To my mother, for always loving me and being a source of wisdom, confidence and safety.

To my advisor Alexandre Cabral Mota and my co-advisor Juliano Manabu Iyoda, for helping, advising and supporting me during all this journey, making it possible to get here.


“Technology is anything that wasn’t around when you were born.” (Alan Kay)


Abstract

Program synthesis aims to automate the task of programming. It frees the programmer to concentrate on the description (specification) of the problem that the program under development must solve, reducing human involvement in coding tasks. Automating the creation of new algorithms and transferring the responsibility for writing code to machines are some of the benefits offered by program synthesis. In this work, program synthesis is presented as an Alloy* specification for an imperative language. We synthesize programs described by pre- and post-conditions (contracts) written in a Domain Specific Language proposed in this work. We embed the syntax and the denotational semantics of Winskel’s imperative language in Alloy*. Alloy* has proven to be an easy and productive way of building program synthesizers. Our experiments show that synthesis based on Alloy* is competitive once contracts, scopes and, if needed, sketches are correctly chosen. As a consequence, our Alloy* program synthesizer can provide, in a single high-level framework, different features in comparison to other synthesizers: (i) synthesis based on scope; (ii) synthesis based on sketches; and (iii) verification. We introduce our Domain Specific Language for contracts and present a detailed description of the synthesis of the swap problem, the product of two numbers, the maximum of 2 and of 3 numbers, and the greatest common divisor. Another contribution of this work is a source code generator that emits, in the programming language C#, the algorithms created by our synthesizer.


Resumo

Program synthesis makes it possible to automate programming activities. Through this automation, the programmer is free to create the description (specification) of the problem that the program under development aims to solve, reducing human interaction with the code-writing stage. Automating the creation of new algorithms and transferring to machines the responsibility for writing program code are some of the benefits that program synthesis makes possible. In this work, program synthesis is presented through an Alloy* specification using an imperative language. Synthesis is performed from a pair of predicates, a pre- and a post-condition (contract), written in a domain-specific language proposed in this work. The denotational semantics of the imperative language used by Winskel was embedded in Alloy*. Using Alloy* proved to be an easy and productive way of building program synthesizers. The experiments show that synthesis based on Alloy* is competitive, provided that contracts, scopes and, if needed, sketches are correctly chosen. As a consequence, the Alloy* program synthesizer can provide, in a single high-level framework, different features in comparison to other synthesizers: (i) scope-based synthesis; (ii) sketch-based synthesis; and (iii) verification. To demonstrate the practical applicability of our work, we use our tool to synthesize classical problems in Computing, such as swapping the values of two variables, the product of two numbers, the maximum of 2 and of 3 numbers, and the greatest common divisor of two numbers. Another contribution of this work is a code generator, targeting the programming language C#, for the algorithms created by our synthesizer.


List of figures

Figure 1 – Steps from synthesis process

Figure 2 – While’s expansion

Figure 3 – Synthesizer’s translators

Figure 4 – Translation flow from DSL to Alloy*

Figure 5 – Alloy* representation of a program that computes the maximum of two numbers


List of tables

Table 1 – EBNF notation

Table 2 – Generated sentences using Production Coverage


Contents

1 INTRODUCTION
1.1 Proposal
1.2 Contributions
1.3 Dissertation structure

2 BACKGROUND
2.1 Alloy
2.2 Alloy*
2.3 Spoofax Language Workbench

3 IMP SPECIFICATION IN ALLOY*
3.1 Alloy* Specification for Program Synthesis
3.1.1 The syntax of IMP in Alloy*
3.1.2 The well-formedness rules
3.1.3 The semantics
3.1.4 The synthesis predicate

4 DSL FOR PROGRAM SYNTHESIS AND ALLOY* AND C# TRANSLATORS
4.1 Syntax
4.2 Translation Rules from DSL to Alloy*
4.2.1 Starting translation
4.2.2 Variable declarations
4.2.3 Sketch declarations
4.2.4 Precondition translation
4.2.5 Expressions and Values translation
4.2.6 Sketch translation
4.2.7 Postcondition translation
4.2.8 Generating the Run Command
4.2.9 Translating the Run Command
4.2.10 Applying the proposed rules to the example
4.2.11 C# Generator
4.2.12 Source Code

5 PRELIMINARY VERIFICATION OF THE DSL TO ALLOY* TRANSLATOR
5.1 Lua Language Generator
5.2 DSL Validation
5.2.1 Issues detected by validation

6 CASE STUDIES
6.1 Swap
6.2 Product of two numbers
6.3 Maximum of two numbers
6.4 Maximum of three numbers
6.5 The Greatest Common Divisor (GCD)
6.6 Synthesis analysis

7 CONCLUSION
7.1 Related Work
7.2 Future Work


1 Introduction

Software development requires the developer to know the requirements of the software being developed (and ideally its formal specification), the implementation logic, and the technical details of the programming language used. During coding, programmers can make mistakes that cause the system to deviate from its specification.

Programming is an activity usually performed by a human (a programmer). Thus, its outcome depends almost exclusively on the experience of the programmer because, historically, programming is viewed just as Art [1]. Despite such a tradition, Formal Methods view programming as a combination of Art and Math: a programmer proposes a new design (Art) and later proves that it is a refinement of a previous design (Math) [2].

In a more specific view, refinement calculus regards programming as almost pure calculation [3,4]. In this case, a program can be derived following logical rules from a formal specification, where an enriched language (a hybrid language composed of a specification and a programming language in the same notation) is used. Program synthesis aims to automate the programming task, creating an algorithm based on a given specification. It complements refinement-based approaches. Program synthesis performs an automatic search for possible program solutions and possibly verifies the correctness of the required contract automatically as well.

Program synthesis allows us to transfer to machines the responsibility of creating algorithms for given problems. Humans are then free to concentrate on the description of the problem, which serves as the input to the synthesizer. This reduces human involvement in implementation activities and, consequently, the risk of errors caused by human failure. Since the machine is responsible for creating algorithms, new solutions for many different problems may be discovered automatically during synthesis.

The full automation of the programming activity from a specification is a grand challenge. The automatic translation of a specification to a program can be done in infinitely many ways. The search space is huge. Therefore humans can also aid in pruning the state space by providing additional characteristics of the solution being searched, like scopes, sketches, and examples.

Program synthesis may benefit programming activities in many different ways, such as:


• Automating coding activities;

• Discovering different algorithms and solutions for given problems;

1.1 Proposal

Our main proposal in this work is to create a synthesizer, stated syntactically and semantically in a simple way, that is able to create algorithms based on a given specification. This synthesizer benefits the development process in the ways discussed earlier in this chapter.

At this point the literature follows different routes:

• The development of synthesizers that are specialized in a particular domain (database [5], reactive systems [6], etc);

• The development of synthesizers that translate a particular language into a SAT/SMT language [7, 8];

• And the development of synthesizers that take as input a general purpose high-level specification [9].

In this work, we follow the approach of a general purpose high-level specification [9], which allows us to synthesize programs based on a contract with pre- and post-conditions. The synthesizer can also receive program sketches: programs with some missing parts that the synthesizer must complete, which help it find a solution for a given contract. Another element necessary for the synthesis is the run command, which defines the cardinalities of the commands, variables, expressions, etc. of the program being synthesized, thereby limiting the search scope. We create a synthesizer for Winskel’s language IMP [10] (based on its denotational semantics) using Alloy* [9] and its constraint solver (the Alloy* Analyzer). Alloy* is an extension of Alloy [11] that allows a first-order model finder to handle quantification over higher-order structures. The Alloy* higher-order facility is built on Counterexample-Guided Inductive Synthesis (CEGIS) [12], an approach for solving higher-order synthesis problems.
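As a rough illustration of what a contract demands of a program, the following Python sketch checks a candidate against a pre- and a post-condition over a small bounded range of integers. It is not the synthesizer described in this work: the helper name `satisfies`, the dictionary-based state and the value range are assumptions made only for this example.

```python
from itertools import product


def satisfies(program, pre, post, variables, values):
    """Check a contract on every bounded initial state (hypothetical helper)."""
    for combo in product(values, repeat=len(variables)):
        before = dict(zip(variables, combo))
        if not pre(before):
            continue  # states outside the pre-condition impose no obligation
        after = program(dict(before))  # run on a copy so 'before' is preserved
        if not post(before, after):
            return False
    return True


def pre(before):
    # The swap contract places no restriction on the initial state.
    return True


def post(before, after):
    # After the program runs, x and y must have exchanged their values.
    return after["x"] == before["y"] and after["y"] == before["x"]


def swap(state):
    state["t"] = state["x"]  # auxiliary variable holding the old x
    state["x"] = state["y"]
    state["y"] = state["t"]
    return state


def broken_swap(state):
    state["x"] = state["y"]  # the old value of x is lost here
    state["y"] = state["x"]
    return state


SWAP_OK = satisfies(swap, pre, post, ["x", "y"], range(-2, 3))
BROKEN_OK = satisfies(broken_swap, pre, post, ["x", "y"], range(-2, 3))
```

A synthesizer searches for a program for which such a check succeeds; here the check is only applied to two hand-written candidates.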

To make it easier to write specifications of programs to be synthesized, we created a domain specific language (DSL). We made its syntax as similar as possible to Code Contracts [13]. Problems described in this DSL are translated into an Alloy* specification prior to invoking its analyzer. After the execution, the solution found is exported to an XML file. This XML can be converted to source code in some programming language; in this work we used C#. Finally, the answer found is presented to the user. These steps are illustrated in Figure 1.


Figure 1 – Steps from synthesis process

To make the tool more reliable, we validated the transformation rules responsible for rewriting the contract in Alloy* notation. We used a sentence generator tool to create inputs in DSL notation to be run through the transformation rules. After translating them, we verified that the resulting files are valid by checking whether they can be executed by the Alloy* Analyzer.

1.2 Contributions

This work has the following main contributions:

• A Domain Specific Language to state problems as formal contracts;

• The embedding of the syntax and the denotational semantics of the IMP language in Alloy*;

• An Alloy* predicate that characterizes program synthesis in terms of a contract (pre-and post-condition);

• A program synthesizer based on bounded scope and program sketches;

• The synthesis of programs with the notions of state, conditional commands, sequential compositions and loops;

• A translator from our proposed DSL to Alloy* program synthesis specification;

• A preliminary investigation about the relative completeness of our DSL to Alloy* translator by using sentences created by the grammar instance generator Lua Language Generator (LGen) to validate our transformation rules;

• A translator that presents Alloy* instances as C# source code functions;

• Four case studies, including the synthesis of Euclid’s algorithm for computing the greatest common divisor;

• An experimental evaluation of the effort involved in synthesizing the programs that swap the values of two variables, compute the maximum of 2 and 3 numbers, and compute the greatest common divisor. We present a discussion about the influence that the several input parameters of the Alloy* Analyzer have on the synthesizer’s behavior.


1.3 Dissertation structure

This work is organized as follows.

Chapter 2 briefly explains the languages used to create the synthesizer’s specification, Alloy [11] and Alloy* [14]. It also presents the Spoofax Language Workbench [15], which was used to define the DSL syntax and the transformation rules responsible for translating from DSL to Alloy* notation.

In Chapter 3 we show our Alloy* specification of IMP and how we embedded its syntax, well-formedness rules, and semantics. We also present the synthesis predicate used by the Alloy* Analyzer to find a solution for given contracts.

Chapter 4 describes the DSL, which was created to hide some Alloy* technicalities, by presenting its syntax and the transformation rules responsible for translating from DSL to Alloy* notation. It also shows how the tool automatically defines the cardinalities in the run command of the Alloy* specification. The chapter closes by presenting the C# Generator, which rewrites the solution found by the Alloy* Analyzer as a C# source code file.

Chapter 5 shows how the translation from DSL to Alloy* notation was validated. We present a tool that automatically creates a set of sentences and explain how we used it to create sentences that conform to our DSL grammar. We then translated those sentences to Alloy* notation using the transformation rules and checked whether all files resulting from this translation were executable by the Alloy* Analyzer. Finally, we present the issues found during this validation process.

Chapter 6 illustrates our proposal by synthesizing some classical algorithms, such as swapping the values of two variables and Euclid’s algorithm for computing the Greatest Common Divisor. We then briefly discuss the performance of our synthesizer on these case studies.

Chapter 7 presents our conclusions and discusses related work. Finally, we present the future evolutions intended for the synthesizer.


2 Background

In order to write the specification of the IMP language, we used Alloy*, a general purpose higher-order relational constraint solver based on the Alloy Analyzer.

This chapter introduces Alloy and Alloy*, explaining elements of those languages such as signatures, predicates and relations, and presenting two examples from Jackson’s book [11]. It also covers the Spoofax Language Workbench [15], which was used to create a DSL that provides a higher-level layer for writing contracts, abstracting some Alloy* elements. We show how to use Spoofax to define a DSL’s syntax, as well as the transformation rules responsible for translating input in DSL notation to a desired output.

2.1 Alloy

Alloy is a declarative specification language used to write models based on first-order logic. Alloy models are described by statements such as signatures, facts, predicates, functions and assertions. In what follows, we present two examples from Jackson’s book [11]. The first is a model of an address book that maps names to addresses, and the second models a traffic light (semaphore).

Alloy example 1.

sig Name, Addr {}
sig Book {
  addr: Name -> lone Addr
}

pred show [b: Book] {
  #b.addr > 1
}

run show for 3 but 1 Book

Example 1 declares three different signatures: Name, Addr, and Book. A signature introduces a set of atoms that represents a set of objects in the real world. The signature Book has a field called addr that maps a Name to at most one (lone) Addr. The predicate show takes a Book b as argument and constrains b to have more than one association between Name and Addr (the size of the relation addr must be larger than 1). The run command asks the analyzer to find an instance that satisfies the predicate show. The command allows at most 3 atoms of each signature, but at most 1 of Book. If we want to have at least one book containing exactly two Name to Addr associations, we can create a fact by adding the following constraint.


Alloy example 2.

fact {
  some b: Book | #b.addr = 2
}

This fact imposes that some book from the solution found must have exactly two addr relations.
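To make the idea of searching within a scope more concrete, the Python sketch below enumerates every possible addr relation of a single book by brute force and filters the ones allowed by the show predicate and by the fact. It only mimics the outcome of a bounded search and says nothing about how the Alloy Analyzer works internally. The atom names and the function `books` are hypothetical, and the sketch uses exactly three names and three addresses, whereas an Alloy scope of 3 means at most three.

```python
from itertools import product

NAMES = ["N0", "N1", "N2"]   # hypothetical atoms for sig Name
ADDRS = ["A0", "A1", "A2"]   # hypothetical atoms for sig Addr


def books(names, addrs):
    """Yield every relation Name -> lone Addr as a dict (a partial function)."""
    for image in product([None, *addrs], repeat=len(names)):
        # None means that the name is mapped to no address at all.
        yield {n: a for n, a in zip(names, image) if a is not None}


# Instances accepted by "pred show": more than one name/address pair.
SHOW = [b for b in books(NAMES, ADDRS) if len(b) > 1]

# Instances that also satisfy the fact: exactly two pairs.
WITH_FACT = [b for b in SHOW if len(b) == 2]
```

Each dictionary plays the role of one candidate instance; a model finder such as the Alloy Analyzer explores the same bounded space symbolically instead of enumerating it.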

Our next example is also from the book by Jackson [11]. It models traffic lights with the colors Red, Yellow and Green.

Alloy example 3.

abstract sig Color {}
one sig Red, Yellow, Green extends Color {}

fun colorSequence: Color -> Color {
  Color <: iden + Red->Green + Green->Yellow + Yellow->Red
}

sig Light {}
sig LightState {color: Light -> one Color}
sig Junction {lights: set Light}

fun redLights [s: LightState]: set Light {s.color.Red}

pred mostlyRed [s: LightState, j: Junction] {
  lone j.lights - redLights[s]
}

pred trans [s, s': LightState, j: Junction] {
  lone x: j.lights | s.color[x] != s'.color[x]
  all x: j.lights |
    let step = s.color[x] -> s'.color[x] {
      step in colorSequence
      step in Red->(Color-Red) => j.lights in redLights[s]
    }
}

assert Safe {
  all s, s': LightState, j: Junction |
    mostlyRed [s, j] and trans [s, s', j] => mostlyRed [s', j]
}

check Safe for 3 but 1 Junction

The Color signature is abstract (i.e. it has no elements except those belonging to its extensions). It is extended by three other signatures: Red, Yellow and Green. The colorSequence function defines the allowed color transitions of a traffic light. A transition is represented by the relational operator "->" on Color. The expression Color <: iden restricts the universal identity relation iden to the identity relation on Color (i.e. the transitions Red -> Red, Green -> Green and Yellow -> Yellow are all allowed). In addition, the transitions from Red to Green, Green to Yellow and Yellow to Red are also allowed.

The Light signature defines a set of lights. LightState has a field color that relates each Light to exactly one Color, and a Junction is a set of Light objects. The function redLights returns all lights that are Red in a LightState. The predicate mostlyRed constrains a junction to a state where at most one light is not Red. The predicate trans models a light change. The constraint "lone x: j.lights | s.color[x] != s'.color[x]" states that at most one light changes from state s to state s'. It also states that all changes must follow colorSequence.

Finally, if a light changes from Red to Green or to Yellow, then all lights must be Red in the initial state s. The assertion Safe states that, for all LightStates where at most one Light is not Red and a transition occurs, at most one Light is not Red afterwards. The command "check Safe for 3 but 1 Junction" makes Alloy check the assertion for at most 1 Junction and at most 3 objects of each of the other signatures.
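The meaning of the Safe assertion can be explored with a small brute-force check in Python. This is only a sketch under assumptions: it fixes three hypothetical lights, represents a LightState as a dictionary from lights to colors, and enumerates all states directly instead of bounding the number of LightState atoms as the Alloy scope does.

```python
from itertools import combinations, product

COLORS = ("Red", "Yellow", "Green")
LIGHTS = ("L0", "L1", "L2")  # three hypothetical Light atoms

# colorSequence: identity on Color plus the three allowed changes.
COLOR_SEQUENCE = {(c, c) for c in COLORS} | {
    ("Red", "Green"), ("Green", "Yellow"), ("Yellow", "Red")}


def mostly_red(s, j):
    """At most one light of junction j is not Red in state s."""
    return sum(1 for x in j if s[x] != "Red") <= 1


def trans(s, t, j):
    """State s may change into state t at junction j."""
    if sum(1 for x in j if s[x] != t[x]) > 1:
        return False  # at most one light may change
    for x in j:
        if (s[x], t[x]) not in COLOR_SEQUENCE:
            return False  # every step must follow colorSequence
        leaves_red = s[x] == "Red" and t[x] != "Red"
        if leaves_red and not all(s[y] == "Red" for y in j):
            return False  # a light leaves Red only if all lights are Red
    return True


def counterexamples_to_safe():
    """Collect every (s, t, j) that violates the Safe assertion."""
    states = [dict(zip(LIGHTS, cs))
              for cs in product(COLORS, repeat=len(LIGHTS))]
    junctions = [j for r in range(len(LIGHTS) + 1)
                 for j in combinations(LIGHTS, r)]
    return [(s, t, j) for s in states for t in states for j in junctions
            if mostly_red(s, j) and trans(s, t, j) and not mostly_red(t, j)]
```

An empty result from `counterexamples_to_safe()` corresponds to the check command finding no counterexample within the chosen bounds.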

2.2 Alloy*

In order to write the specification of the language used by the synthesizer, we need to use higher order quantification. Unfortunately, Alloy itself does not support higher order quantification, so we had to use Alloy* [9] [14], an extension of Alloy that keeps the traditional Alloy syntax and supports higher order quantification.

Alloy* is able to solve "∃∀" constraints by following the steps described below.

• Find a candidate solution

– Change the universal quantifier to an existential quantifier

– Solve as a first order formula

• Verify the candidate solution

– Try to find a counter-example for the original formula

– Use the counter example as constraint in order to find a valid candidate
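The steps above can be sketched as a small loop in Python. This is a simplified illustration rather than the Alloy* implementation: a plain enumeration of hand-written candidates stands in for the first-order solver, and the names `cegis`, `CANDIDATES` and `is_max` are hypothetical.

```python
from itertools import product


def cegis(candidates, inputs, spec):
    """Return (candidate, counterexamples); candidate is None if none works."""
    examples = []  # counterexamples collected so far
    for cand in candidates:
        # Candidate step: only the collected counterexamples must be satisfied.
        if not all(spec(cand, x) for x in examples):
            continue
        # Verification step: look for an input on which the candidate fails.
        cex = next((x for x in inputs if not spec(cand, x)), None)
        if cex is None:
            return cand, examples  # no counterexample within the bounds
        examples.append(cex)  # the counterexample constrains later candidates
    return None, examples


# Hypothetical candidate programs for the maximum of two numbers.
CANDIDATES = [
    ("x", lambda x, y: x),
    ("y", lambda x, y: y),
    ("x if x > y else y", lambda x, y: x if x > y else y),
]


def is_max(cand, point):
    """Post-condition: the result bounds both inputs and is one of them."""
    x, y = point
    out = cand[1](x, y)
    return out >= x and out >= y and out in (x, y)


INPUTS = list(product(range(-2, 3), repeat=2))
SOLUTION, COUNTEREXAMPLES = cegis(CANDIDATES, INPUTS, is_max)
```

In this run the first two candidates are each refuted by one counterexample, and the third passes verification on every bounded input.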

Example 4 shows a simple Alloy* predicate that requires higher-order logic and is therefore not supported by traditional Alloy.

Alloy example 4.

sig X { }

pred higherOrder {
  all x: set X, y: set Y | x in y
}

run higherOrder

Unfortunately, during this work we noticed some issues with Alloy*. The first issue is that Alloy* is not conservative with respect to traditional Alloy: running Example 1 in the traditional Alloy Analyzer works fine, but the Alloy* Analyzer does not find any solution for it. The second issue is that Alloy* only supports higher-order quantification over relations such as f: T1 -> T2 if T1 and T2 are basic types; otherwise it does not work as expected. Because of these issues we needed to create additional predicates to ensure that the synthesizer works as expected. Those additional predicates are explained in more detail later in this work.

2.3 Spoofax Language Workbench

To create the DSL and the translator to Alloy*, we used the Spoofax Language Workbench [15]. Spoofax provides features such as syntax highlighting and code folding, and it also makes it possible to create menus and buttons in the Eclipse GUI.

To create a DSL using Spoofax, it is necessary to define a syntax and some transformation rules using Stratego/XT [16]. The DSL is translated by these transformation rules inside the Eclipse environment.

The syntax must be defined in a file with the extension ".sdf3", placed inside the "syntax" directory of the DSL project in Eclipse. The structure of a simple syntax definition consists of the following items:

• Module definition

• Imports

• Context-free start symbols

• Lexical syntax

• Context-free syntax

The module is defined by the keyword "module" followed by the name of the module. Just after the module definition comes the imports section, which consists of the keyword "imports" followed by the names of the modules being imported.

After the imports section, the start symbol of the DSL must be defined. This symbol is the initial element of the DSL, so the first translation rule is applied to it at the beginning of the translation.

Having defined the start symbol, the next section is the syntax definition, which can be defined using "Lexical syntax" or "Context-free syntax".

The lexical syntax is usually used to define low-level elements of the language, such as integer numbers, strings and the LAYOUT element, which can represent structures such as comments, blank spaces and line breaks.

The context-free syntax is defined by a list of productions. Productions can be defined by templates. A template consists of a sort on the left side (or a sort followed by a "." and a constructor) and, on the right side, the definition of the template, which is delimited by "<" and ">" (or "[" and "]"). Inside the template there are zero or more symbols, which can be literal strings or other elements of the language being defined. These elements must also be delimited by the same characters as the template.

There are other elements that can be used to define syntax using the Spoofax Language Workbench, but they are not presented here. A complete description is available on the official Spoofax website [15].

For better understanding, we show an example of a simple syntax definition in which a user provides their name and age.

Example 1.

module SpoofaxExample

imports Common

context-free start-symbols Start

context-free syntax
  Start.Start = <
    name: <ID>
    age: <INT>
  >

Example 1 presents a syntax that only accepts the literal string "name:" followed by an ID, then the literal string "age:" followed by an integer number. The elements ID and INT are defined in the module "Common", which is imported in the imports section. Besides ID and INT, this module defines basic elements such as comments, strings and LAYOUT.
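As an informal reading of what this grammar accepts, the Python sketch below recognizes the same shape of input with a regular expression and returns a term resembling the Start constructor. It does not use Spoofax or SDF3. The patterns chosen for ID and INT are assumptions, since their actual definitions live in the Common module, and `parse_start` is a hypothetical name.

```python
import re

# Assumed lexical shapes: ID is a letter followed by letters, digits or
# underscores, and INT is a sequence of digits. Whitespace acts as LAYOUT.
START = re.compile(
    r"\s*name\s*:\s*([A-Za-z][A-Za-z0-9_]*)\s+age\s*:\s*([0-9]+)\s*")


def parse_start(text):
    """Parse 'name: <ID> age: <INT>' into a ("Start", name, age) tuple."""
    match = START.fullmatch(text)
    if match is None:
        raise ValueError("input does not match the Start production")
    return ("Start", match.group(1), match.group(2))


TERM = parse_start("name: John\nage: 30")
```

The resulting tuple plays the role of the abstract syntax term that a translation rule would later receive.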


In order to translate programs written in the created syntax, it is necessary to define translation rules responsible for rewriting the received input into the desired format. These rules must be defined in files with the ".str" extension placed inside the "trans" directory.

The structure of a basic file with translation rules consists of the following elements:

• Module definition

• Imports section

• Translation rules

The module definition has the same format as in syntax files. In the imports section we define all modules to be imported (including the syntax being translated). Finally, after the imports comes the rules section, in which the translation rules are defined.

The basic structure of a translation rule is a name that identifies the rule, the sort or constructor being translated, and the output of the rule. Example 2 presents a translation rule responsible for rewriting the input of Example 1 into a new format.

The name of the rule is "translate-example", and it translates "Start" elements, which are composed of two elements, name and age. The arrow after the symbol to be translated separates the left side, with the DSL elements, from the right side, with the result of the translation. The result of a translation can take different forms, such as a list, another symbol or even a mix of literal strings and other elements.

The output of the rule in Example 2 is defined as a mix of literal strings and other elements. To define this mix, the result of the translation must be written between the delimiters "$[" and "]". The text written inside these delimiters is interpreted as a literal string, except when it is written between the characters "[" and "]", which indicate another element. These elements can be the ones from the constructor being translated or can be defined in the "with" section.

Example 2.

module generate

imports include/SpoofaxExample

rules
  translate-example:
    Start(name, age) -> $[Hello [name]!
[message]]
    with
      message := $[You have [age] years old.]

Example 3 applies the translation rule introduced in Example 2 to a specific code fragment.

Example 3.

[[name: John
age: 30]]translate-example
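The effect of applying the rule can be mimicked in Python as follows. This is only an approximation of the Stratego rule: the function `translate_example` and the tuple representation of the Start term are hypothetical, and the exact line layout that Stratego string interpolation produces is an assumption.

```python
def translate_example(term):
    """Rewrite a ("Start", name, age) term into the greeting text."""
    constructor, name, age = term
    if constructor != "Start":
        raise ValueError("translate-example only applies to Start terms")
    # Corresponds to the "with" section, which binds the message first.
    message = f"You have {age} years old."
    # Corresponds to the $[ ... ] template mixing literals and elements.
    return f"Hello {name}!\n{message}"


OUTPUT = translate_example(("Start", "John", "30"))
```

The literal text of the message is kept exactly as it is written in the rule of Example 2.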


3 IMP specification in Alloy*

To create our synthesizer, we needed to embed a language in an Alloy* specification. The language we chose was the imperative language IMP, created by Winskel [10]. We chose Alloy* instead of Alloy because the IMP specification requires higher-order quantification, which Alloy does not support.

This chapter presents how the language was specified and explains in detail how we embedded the syntax, well-formedness rules and semantics of the IMP language in Alloy*. The chapter also presents the synthesis predicate. The IMP specification was created in a file named "imp.als", which must be imported by every program contract that is created.

3.1 Alloy* Specification for Program Synthesis

In what follows we present an embedding of the IMP language, its well-formedness rules, and its semantics in Alloy*. A previous version of this specification was presented by Mota [17]. The syntax defines the elements of the IMP language, such as commands and expressions. The well-formedness rules constrain where those elements may appear, guaranteeing well-formed programs. Finally, the semantics defines the behavior of the elements of the IMP language.

3.1.1 The syntax of IMP in Alloy*

The following is an example of a simple program written in the IMP language, composed of two variables, one conditional, one assignment and one skip:

if x > y then x := x + y else skip

In this subsection we present how this syntax was embedded in Alloy*. Winskel [10] defined Loc as a given set of variables (locations).

abstract sig Loc { }

A location is either a delta location DLoc (a regular variable), an auxiliary location ALoc (a variable used for storing temporary values and not allowed in conditional expressions), or a read-only location XLoc. For the syntax definition, each type just needs to extend Loc.

sig DLoc extends Loc { }
sig ALoc extends Loc { }
sig XLoc extends Loc { }

Partitioning locations into these subsets gives us better control over them, since we can explicitly state how many such variables the program can have and where they can occur. More details are shown below.

A command C is given by the following BNF

$$C ::= \mathbf{skip} \mid X := a \mid C_0 ; C_1 \mid \mathbf{if}\ b\ \mathbf{then}\ C_0\ \mathbf{else}\ C_1 \mid \mathbf{while}\ b\ \mathbf{do}\ C$$

which is defined in Alloy* as an abstract signature Cmd.

abstract sig Cmd { }

All concrete commands are defined as signatures that extend Cmd. First we define skip, which is the command that does nothing.

lone sig Skip extends Cmd { }

The keyword lone restricts a specification to at most one Skip atom. That is, the terminal Skip is a partition of the non-terminal Cmd.

An arithmetic expression a is defined as

$$a ::= n \mid X \mid a_0 + a_1 \mid a_0 - a_1 \mid a_0 \times a_1$$

and is characterized in Alloy* as the set AExp, which is either an integer constant (IntVal), an integer variable (IntVar), an addition (Add), a subtraction (Sub) or a multiplication (Mult).

abstract sig AExp { }
sig IntVal extends AExp { val: one Int }
sig IntVar extends AExp { name: one Loc }
sig Add extends AExp { op1: one AExp, op2: one AExp }
sig Sub extends AExp { op1: one AExp, op2: one AExp }
{ op1.name != op2.name }


Each type of expression has fields. For instance, an IntVar expression contains the field name, which belongs to Loc, while an Add expression contains the fields op1 and op2, which represent its operands. Subtraction and multiplication are defined in a similar way to addition. For optimization purposes, a constraint rules out subtractions with the same operand on both sides, since such a subtraction can be replaced by the value 0.

An assignment X := a is a command whose left-hand side is an integer variable and the right-hand side is an arithmetic expression.

sig Assign extends Cmd { lhs: one IntVar, rhs: one AExp }
{ (IntVar <: rhs) != lhs and lhs.name !in XLoc }

The constraint (IntVar <: rhs) != lhs makes sure a variable is not assigned to itself. Also, lhs.name !in XLoc forbids the left-hand side from being a read-only variable.

The sequential composition of commands C0; C1 becomes the entity SComp. The command C0 is named as cur (for the current command) and C1 as next.

sig SComp extends Cmd { cur, next: one Cmd }

Winskel [10] introduces a boolean expression by means of a non-terminal b.

b ::= true | false | a0 = a1 | a0 ≤ a1 | b0 ∧ b1 | b0 ∨ b1

To simplify our model and without loss of generality, we have not introduced the values true and false. We have modelled BExp as a general model which is divided into BExpSimple and BExpComp. BExpSimple represents an expression of the form X OP Y, where X and Y are distinct variables and OP ∈ {NEQ, LEQ, GEQ, GTH}. The operators NEQ, LEQ, GEQ and GTH denote ≠, ≤, ≥ and >, respectively. BExpComp represents expressions of the form (X OP Y), where X and Y are distinct BExp and OP ∈ {And, Or}. The operators And and Or denote ∧ and ∨, respectively.

abstract sig BExp { }
abstract sig BExpSimple extends BExp { lhs, rhs: one AExp }
{ lhs.name !in ALoc and rhs.name !in ALoc }
sig NEQ extends BExpSimple { }
...
abstract sig BExpComp extends BExp { lhs, rhs: one BExp }
{ lhs != rhs }
sig And extends BExpComp { }
...

Neither the left-hand side nor the right-hand side can be an auxiliary variable. The operators LEQ, GEQ and GTH are defined in a similar way to NEQ.


A conditional statement if b then C0 else C1 becomes the entity CondS, where else is called elsen as else is a reserved keyword in Alloy*.

sig CondS extends Cmd { cond: one BExp, then, elsen: one Cmd }
{ then != elsen }

We restrict then to be different from elsen to prevent the synthesis of commands of the form if b then C else C, in which the same command is executed regardless of how the condition evaluates.

The last statement to embed in Alloy* is the while statement. In addition to the usual boolean condition (cond), invariant (inv) and body (wbody), we introduce an auxiliary structure to unfold the body over the iterations, since the embedding of the semantics of while is not straightforward.

sig While extends Cmd { cond: one BExp, wbody: one (Cmd - While), unfold: set Expansion, inv: one BExp }
{ (#unfold >= 2 => (all e1, e2: unfold | e1 != e2 => e1.exp.first.curr.bind != e2.exp.first.curr.bind)) and cond != inv }

The body wbody is any command except another while (we do not handle nested whiles yet). An expansion is a sequence of pairs $[(s_0, s_1), (s_1, s_2), \ldots, (s_{n-1}, s_n)]$, where $s_i$ is a state. Each pair $(s_i, s_{i+1})$ denotes a state change from $s_i$ to $s_{i+1}$ produced by running the body of the loop once. The expansion denotes the minimum repetition of the body of the while before its condition becomes false. An unfold is a set of expansions. It is used to overcome the Alloy* limitation in handling general recursion. Thus, instead of finding the semantics of while by a fixed-point computation (a recursive computation), we ask Alloy* to find expansions that correspond to such a computation. The constraint { all e1,e2...} states that if e1 and e2 are distinct expansions, then their initial states $s_0$ (exp.first.curr) are distinct. The Alloy* definition of Expansion and StChg is presented below.

sig Expansion { exp: seq StChg }
{ #exp = #exp.elems and (all i: exp.inds | i != exp.lastIdx implies exp[i].next = exp[add[i, 1]].curr) }

sig StChg { curr, next: one State }

Finally, we define a program as a unique atom from which all other statements must be linked to (or reachable from).


one sig Prog { body: one Cmd }
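As an illustration only, the signatures above can be mirrored by a small abstract syntax tree. The following Python sketch is not part of the Alloy* model; the class names simply reuse the signature names from the text, only a subset of the signatures is covered, and the well-formedness constraints are not enforced. It builds the example program if x > y then x = x + y else skip.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical Python mirror of a subset of the Alloy* signatures.
@dataclass(frozen=True)
class IntVal:
    val: int

@dataclass(frozen=True)
class IntVar:
    name: str

@dataclass(frozen=True)
class Add:
    op1: "AExp"
    op2: "AExp"

AExp = Union[IntVal, IntVar, Add]

@dataclass(frozen=True)
class GTH:  # the relational operator >
    lhs: AExp
    rhs: AExp

@dataclass(frozen=True)
class Skip:
    pass

@dataclass(frozen=True)
class Assign:
    lhs: IntVar
    rhs: AExp

@dataclass(frozen=True)
class CondS:
    cond: GTH
    then: "Cmd"
    elsen: "Cmd"

Cmd = Union[Skip, Assign, CondS]

# if x > y then x = x + y else skip
x, y = IntVar("x"), IntVar("y")
example = CondS(GTH(x, y), Assign(x, Add(x, y)), Skip())
```

Frozen dataclasses compare structurally, so a check such as example.then != example.elsen plays the role of the constraint then != elsen.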

3.1.2 The well-formedness rules

Unfortunately, Alloy* signatures do not provide a datatype structure that fully captures a BNF. Although we have captured most of the IMP grammar using signatures, we need to add extra constraints to guarantee the synthesis of well-formed programs. The basic rules were defined in an initial version, and were refined and improved according to the synthesizer behavior.

A state is a set of bindings of a location into exactly one integer. The property bind is a relation of Loc and one integer number. This allows us to map the value of locations at specific states of the program.

sig State { bind: Loc -> one Int }

We require all variables to bind to a location in all states.

some IntVar => (all v: Loc, s: State | v in s.bind.univ) and (all v: Loc | v in IntVar.name)

Whenever IntVar is not empty, for all locations v and states s, v must be in the set s.bind.univ, which is the set of all locations of the state s (univ is the universal set and s.bind.univ is the domain of bind). And all locations v must be in the set of names of IntVar.

Two disjoint states have different bindings.

all disj s1, s2: State | s1.bind != s2.bind

All commands must belong to a program.

all c: Cmd | c in (Prog.body).*(cur + next + then + elsen + wbody)

All relational operators must be associated to the condition of an if-then-else, a while, to an invariant of a while, or to some other boolean expression.

all c: GEQ | (some cs: CondS | c = cs.cond) or (some wc: While | c = wc.cond)
all c: LEQ | (some cs: CondS | c = cs.cond) or (some wc: While | c = wc.cond)
all c: EQ | (some cs: CondS | c = cs.cond) or (some wc: While | c = wc.cond or c = wc.inv)
all c: NEQ | (some cs: CondS | c = cs.cond) or (some wc: While | c = wc.cond)
all c: GTH | (some cs: CondS | c = cs.cond) or (some wc: While | c = wc.cond)
all c: And | ((some cs: CondS | c = cs.cond) or (some wc: While | c = wc.cond)) and c.lhs != c and c.rhs != c
all c: Or | ((some cs: CondS | c = cs.cond) or (some wc: While | c = wc.cond)) and c.lhs != c and c.rhs != c

All integer values must be the operand of an arithmetic expression.

all i: IntVal | (some a: Add | i = (IntVal <: a.op1) or i = (IntVal <: a.op2)) or (some a: Sub | i = (IntVal <: a.op1) or i = (IntVal <: a.op2)) or (some a: Mult | i = (IntVal <: a.op1) or i = (IntVal <: a.op2))

The operator <: restricts the domain of the right side to the left side. For an integer i and an addition a, the notation i = (IntVal <: a.op1) states that i is the first operand of a (IntVal <: a.op1 restricts the set IntVal to the first operand of a).

All integer variables must belong to either an assignment command or to a boolean expression.

all i: IntVar | (some c: Assign | i = c.lhs or i in c.rhs.(*((Add <: op1) + (Add <: op2) + (Sub <: op1) + (Sub <: op2) + (Mult <: op1) + (Mult <: op2)))) or (some b: BExpSimple | b.lhs = i or b.rhs = i)

All arithmetic expressions must occur in the right-hand side of an assignment.

all o: Add | one c: Assign | o in c.rhs.*((Add <: op1) + (Add <: op2) + (Sub <: op1) + (Sub <: op2) + (Mult <: op1) + (Mult <: op2))

all o: Sub | one c: Assign | o in c.rhs.*((Add <: op1) + (Add <: op2) + (Sub <: op1) + (Sub <: op2) + (Mult <: op1) + (Mult <: op2))

all o: Mult | one c: Assign | o in c.rhs.*((Add <: op1) + (Add <: op2) + (Sub <: op1) + (Sub <: op2) + (Mult <: op1) + (Mult <: op2))

All sequential compositions must not have themselves as the current or the next command.

all c: SComp | some c_: Cmd - c | c.cur = c_ or c.next = c_

An illegal cycle must not be produced. An illegal cycle is a program that executes the command c and, without being in the scope of a while loop, returns to c. Note that cycles are possible if a language allows go-tos, which is not our case.

no c: Cmd | c in c.^(cur + next + then + elsen + wbody)

The notation c.^(cur + next + then + elsen + wbody) applies the transitive closure operator ^ to the union of these relations, starting from c. In other words, it represents the commands reachable from c.
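To make the reachability reading concrete, the following Python sketch computes the transitive closure of a finite relation and uses it to detect an illegal cycle. It is an illustration outside the Alloy* model: the relation is given as a set of pairs, and the atom names (prog_body, seq1, and so on) are hypothetical.

```python
def transitive_closure(edges):
    """Return the transitive closure of a binary relation given as a set of pairs."""
    closure = set(edges)
    while True:
        # Join the closure with itself to find pairs linked through one middle node.
        new_pairs = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

def has_illegal_cycle(edges):
    """True if some node reaches itself, i.e. `c in c.^r` holds for some c."""
    return any(a == b for (a, b) in transitive_closure(edges))

# Hypothetical command graphs: the union of cur, next, then, elsen and wbody.
well_formed = {("prog_body", "assign1"), ("prog_body", "skip1")}
cyclic = {("seq1", "seq2"), ("seq2", "seq1")}
```

The naive fixed-point loop is enough for the small scopes used in bounded analysis; it is not meant to be efficient.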

An arithmetic expression that contains other arithmetic expressions cannot contain itself.

no a: AExp | a in a.^((Add <: op1) + (Add <: op2) + (Sub <: op1) + (Sub <: op2) + (Mult <: op1) + (Mult <: op2))

3.1.3 The semantics

Our final Alloy* characterization of IMP is the model of its denotational semantics. Except for the while statement, the embedding of the semantics is straightforward. The semantics is defined in terms of the relations evalC, evalB, and evalA. The relation evalC relates a command to a relation between two States, its initial and final states. The relation evalB relates a boolean expression to a relation from a State to a Bit, which may assume two different values, BitTrue and BitFalse. Finally, evalA relates an arithmetic expression to a relation from a State to an integer number, which is the result of the arithmetic expression. The denotational definitions must be written in the area indicated by the comment (the line that starts with //) in the example below.

pred Semantics[evalC: Cmd -> (State -> State), evalB: BExp -> (State -> Bit), evalA: AExp -> (State -> Int)] {
  // The denotational definitions of IMP commands come here
}


Winskel [10] defines Skip as

$\mathcal{C}[\![\mathbf{skip}]\!] = \{(\sigma, \sigma) \mid \sigma \in \Sigma\}$

where $\mathcal{C}[\![\mathbf{skip}]\!]$ returns a function that maps every initial state $\sigma \in \Sigma$ to the final state $\sigma$ (i.e. skip does not change the state). In Alloy* this becomes a predicate inside the body of the predicate Semantics. As iSt in the expression iSt: evalC[c].univ represents any initial state of the command c, we simply have to state that the final state of Skip is equal to its initial state iSt.

evalC[Skip] in iden

An assignment overrides the binding of the initial state with the new binding established by the assignment. This is formally stated in

$\mathcal{C}[\![X := a]\!] = \{(\sigma, \sigma[n/X]) \mid \sigma \in \Sigma \wedge n = \mathcal{A}[\![a]\!]\sigma\}$.

The initial state $\sigma$ is mapped to the state $\sigma[n/X]$, where $n$ is the value of the expression $a$ (computed by $\mathcal{A}[\![a]\!]\sigma$) and $\sigma[n/X]$ binds the value of $X$ to $n$ in $\sigma$. In Alloy* we get a simpler definition by using the override operator ++.

all c: Assign, iSt: evalC[c].univ |
  evalC[c][iSt].bind = iSt.bind ++ { c.lhs.name -> evalA[c.rhs][iSt] }

The override operator ++ updates the initial binding with the mapping from c.lhs.name to the value evalA[c.rhs][iSt] of the expression in the right-hand side of the assignment.

The sequential composition of C0 and C1 is the function composition of their semantics, i.e. the final state of the first command is the initial state of the second command.

$\mathcal{C}[\![C_0; C_1]\!] = \mathcal{C}[\![C_1]\!] \circ \mathcal{C}[\![C_0]\!]$

The composition operator in Alloy* (denoted by .) combines relations in the reverse order of ◦.

all c: SComp | evalC[c] = (evalC[c.cur]).(evalC[c.next])
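The difference in argument order between the relational join and function composition can be seen in a small Python sketch. It is an illustration only: relations are modelled as sets of pairs, and the state names s0, s1 and s2 are hypothetical.

```python
def join(r, s):
    """Relational join r.s: pairs (a, c) such that (a, b) is in r and (b, c) is in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

# The first command takes s0 to s1 and the second command takes s1 to s2.
eval_cur = {("s0", "s1")}
eval_next = {("s1", "s2")}

# Joining in program order yields the meaning of the sequential composition.
eval_seq = join(eval_cur, eval_next)
```

Joining in the opposite order gives the empty relation here, which is why the Alloy* constraint lists cur before next, whereas the equation with ∘ lists C1 before C0.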

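The semantics of the conditional command is stated in Winskel [10] as

$\mathcal{C}[\![\mathbf{if}\ b\ \mathbf{then}\ C_0\ \mathbf{else}\ C_1]\!] = \{(\sigma, \sigma') \mid \mathcal{B}[\![b]\!]\sigma = \mathbf{true} \wedge (\sigma, \sigma') \in \mathcal{C}[\![C_0]\!]\} \cup \{(\sigma, \sigma') \mid \mathcal{B}[\![b]\!]\sigma = \mathbf{false} \wedge (\sigma, \sigma') \in \mathcal{C}[\![C_1]\!]\}$.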


The expression $b$ is evaluated by $\mathcal{B}[\![b]\!]\sigma$. For all initial states $\sigma$ that evaluate $b$ to true, the meaning of $C_0$ is computed. And, similarly, the meaning of $C_1$ is computed for all those states that evaluate $b$ to false. In Alloy* we do not have a direct way of representing the above definition but we can use a similar one that requires the initial state to be the same as that of the then and the elsen branches.

all c: CondS, iSt: evalC[c].univ |
  (evalB[c.cond][iSt] = BitTrue => evalC[c.then][iSt] = evalC[c][iSt]
  else
  evalC[c.elsen][iSt] = evalC[c][iSt])

If the condition evaluates to true, the final state is the final state of the then branch (evalC[c.then][iSt]). Otherwise, the final state is the final state of the elsen branch (evalC[c.elsen][iSt]).
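The loop-free part of the semantics can be sketched as a small evaluator. The Python code below is an illustration and not the Alloy* encoding: programs are hypothetical tagged tuples, a state is a dict from variable names to integers, and the dict update plays the role of the override operator ++.

```python
def eval_a(a, st):
    """Evaluate an arithmetic expression in state st (a dict from names to ints)."""
    tag = a[0]
    if tag == "val":
        return a[1]
    if tag == "var":
        return st[a[1]]
    left, right = eval_a(a[1], st), eval_a(a[2], st)
    return {"add": left + right, "sub": left - right, "mult": left * right}[tag]

def eval_b(b, st):
    """Evaluate a boolean expression; 'and'/'or' combine boolean sub-expressions."""
    tag = b[0]
    if tag in ("and", "or"):
        left, right = eval_b(b[1], st), eval_b(b[2], st)
        return (left and right) if tag == "and" else (left or right)
    left, right = eval_a(b[1], st), eval_a(b[2], st)
    return {"neq": left != right, "leq": left <= right,
            "geq": left >= right, "gth": left > right}[tag]

def eval_c(c, st):
    """Return the final state of command c started in state st."""
    tag = c[0]
    if tag == "skip":
        return st                                  # skip leaves the state unchanged
    if tag == "assign":
        return {**st, c[1]: eval_a(c[2], st)}      # override one binding
    if tag == "seq":
        return eval_c(c[2], eval_c(c[1], st))      # run cur, then next
    if tag == "cond":
        return eval_c(c[2] if eval_b(c[1], st) else c[3], st)
    raise ValueError(f"unknown command: {tag}")

# if x > y then x = x + y else skip
prog = ("cond", ("gth", ("var", "x"), ("var", "y")),
        ("assign", "x", ("add", ("var", "x"), ("var", "y"))),
        ("skip",))
```

Unlike the Alloy* model, which constrains the relations evalC, evalB and evalA declaratively, this sketch computes final states directly and therefore only covers deterministic, terminating programs.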

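For the semantics of a while loop, Winskel provides a classical definition $\mathcal{C}[\![\mathbf{while}\ b\ \mathbf{do}\ c]\!] = \mathit{fix}(\Gamma)$, where $\Gamma(\varphi) = \{(\sigma, \sigma') \mid \mathcal{B}[\![b]\!]\sigma = \mathbf{true} \wedge (\sigma, \sigma') \in \varphi \circ \mathcal{C}[\![c]\!]\} \cup \{(\sigma, \sigma) \mid \mathcal{B}[\![b]\!]\sigma = \mathbf{false}\}$ and $\varphi = \mathcal{C}[\![\mathbf{while}\ b\ \mathbf{do}\ c]\!]$.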

The while command is defined as the fixed point of the function $\Gamma$, where $\varphi$ is $\mathcal{C}[\![\mathbf{while}\ b\ \mathbf{do}\ c]\!]$. For all states that evaluate $b$ to true, the body of the loop $c$ followed by the computation of the next iteration of the loop is computed. Otherwise, there is no state change. As Alloy* does not have a general recursive feature, we used a more operational way of representing a loop: as a sequential composition of the same command, repeated as many times as necessary. Thus in addition to the condition and the body of a loop, we added a third element: a set of sequences of state changes. We need a set because the loop can have more than one evaluation (depending on its initial state), where each evaluation is captured by a sequence of state changes representing the iteration of the loop's body several times. The Alloy* predicate that captures loops says that if the initial state falsifies its condition, then the state does not change. Otherwise, we have to run the body of the while in a compositional way until some future state falsifies the condition.

all c: While, iSt: evalC[c].univ |
  evalB[c.inv][iSt] = BitTrue and evalB[c.inv][evalC[c][iSt]] = BitTrue and
  evalC[c][iSt] = iSt
  else
  evalB[c.inv][iSt] = BitTrue and
  (one st: c.unfold | st.exp.first.curr = iSt
  and evalC[c][iSt] = st.exp.last.next and
  (all ic: st.exp.elems | evalC[c.wbody][ic.curr] = ic.next and evalB[c.inv][ic.curr] = BitTrue and evalB[c.inv][ic.next] = BitTrue and evalB[c.cond][ic.curr] = BitTrue and (ic = st.exp.last => evalB[c.cond][ic.next] = BitFalse else evalB[c.cond][ic.next] = BitTrue)))))
  and (all st: c.unfold.exp.first.curr |
  evalB[c.cond][st] = BitTrue))

In more detail, the set of all initial states evalC[c].univ and the set of all final states univ.(evalC[c]) must be the set of all initial states and final states of its expansions, respectively. If the condition evalB[c.cond][iSt] evaluates to false, then nothing happens and the final state equals the initial state. Otherwise, the condition must be false only at the last state st.exp.last.next of an Expansion. In no other previous state the condition is false and, for all states of an expansion ic, the final state of a single iteration evalC[c.wbody][ic.curr] takes us to the next state in the expansion. Figure 2 presents a while's expansion.

Figure 2 – While’s expansion

The Expansion is composed of a sequence of StChg, each of which is in turn composed of two states, curr and next. The next state of the StChg located at position i is equal to the curr state of the StChg at position i + 1. The curr state of the first StChg of the Expansion is equal to the initial state of the while. The next state of the last StChg of the Expansion is equal to the final state of the while. Finally, the evaluation of wbody on each curr state is equal to the next state of the same StChg.
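The structure of an expansion can be illustrated by computing one operationally. In the Python sketch below, which is not the Alloy* encoding, the loop condition and body are hypothetical Python functions on dict states, and max_iters is an assumed safety bound for non-terminating loops.

```python
def expansion(cond, body, st, max_iters=1000):
    """Return the list of (curr, next) state changes of a loop started in st."""
    changes = []
    while cond(st):
        if len(changes) >= max_iters:
            raise RuntimeError("iteration bound exceeded")
        nxt = body(st)
        changes.append((st, nxt))   # one StChg: curr = st, next = nxt
        st = nxt
    return changes

def eval_while(cond, body, st):
    """Final state of the loop: the last 'next' state, or st if the body never runs."""
    exp = expansion(cond, body, st)
    return exp[-1][1] if exp else st

# Hypothetical loop: while x > 0 do (x = x - 1; acc = acc + 2)
cond = lambda s: s["x"] > 0
body = lambda s: {**s, "x": s["x"] - 1, "acc": s["acc"] + 2}
exp = expansion(cond, body, {"x": 3, "acc": 0})
```

In the resulting list, consecutive pairs chain (the next state of one pair is the curr state of the following pair), the condition holds at every curr state, and it fails only at the next state of the last pair. These are the properties that the Alloy* constraints impose on an Expansion, with the difference that Alloy* searches for such sequences instead of computing them.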

3.1.4 The synthesis predicate

The following predicate is an adaptation of the predicate reported by Milicevic et al. [9] in which we take into account states and state changes.

pred Synt[p: Prog] {
  all iSt: State, evalC: Cmd -> (State -> lone State), evalB: BExp -> State -> Bit, evalA: AExp -> State -> Int when {
    minInitialStates[p, iSt, evalC]
    PreC[iSt]
    Semantics[evalC, evalB, evalA]
  }{
    PosC[iSt, evalC[p.body][iSt]]
  }
}

The synthesis predicate states that when the pre-condition PreC[iSt] on the initial state and the Semantics[evalC,evalB,evalA] are true, then the post-condition, which is represented by the predicate PosC[iSt,evalC[p.body][iSt]], is true. The predicates PreC[iSt] and PosC[iSt,evalC[p.body][iSt]] must be defined in the contract of the program to be synthesized.
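The shape of this check, a universal quantification over initial states for a fixed candidate program, can be sketched in Python by brute force over a bounded integer range. This is an illustration under our own assumptions and not how Alloy* solves the problem: the contract (x > 0 and y > 0 as pre-condition, x' == y as post-condition), the two candidate programs and the range [-1, 3] are hypothetical.

```python
from itertools import product

def satisfies(program, pre, post, variables, lo, hi):
    """Check that pre(s) implies post(s, program(s)) for every state over [lo, hi]."""
    for values in product(range(lo, hi + 1), repeat=len(variables)):
        st = dict(zip(variables, values))
        if pre(st) and not post(st, program(dict(st))):
            return False
    return True

# Hypothetical contract: requires x > 0 and y > 0, ensures x' == y.
pre = lambda s: s["x"] > 0 and s["y"] > 0
post = lambda s0, s1: s1["x"] == s0["y"]

# Hypothetical candidate programs, each mapping an initial state to a final state.
candidates = {
    "x = x + y": lambda s: {**s, "x": s["x"] + s["y"]},
    "x = y": lambda s: {**s, "x": s["y"]},
}
found = [name for name, prog in candidates.items()
         if satisfies(prog, pre, post, ["x", "y"], -1, 3)]
```

Enumerating candidates and states explicitly only scales to toy examples; the point of the sketch is the quantifier structure, in which a program is accepted only if the post-condition holds for every initial state that satisfies the pre-condition.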

pred maxS[v: Loc, s: State] {
  all v_: (s.bind.univ) - v | s.bind[v] >= s.bind[v_]
}

pred minInitialStates[p: Prog, iSt: State, evalC: Cmd -> (State -> lone State)] {
  all v: ALoc | some s: evalC[p.body].univ | (maxS[v, s] and no s_: evalC[p.body].univ - s | maxS[v, s_])
  #(evalC[p.body].univ) = #Loc
  all d: evalC[p.body].univ | #univ.(d.bind) = #Loc
  iSt in evalC[p.body].univ
}

The predicate minInitialStates constrains the initial state iSt to be as varied as possible in order to force Alloy* to find a solution that is as general as possible (instead of finding a program that only works for a particular initial state). This predicate is necessary due to the issue mentioned in Chapter 2: the higher-order support of Alloy* only handles relations between basic types. If the relation is between signatures, it does not work as expected.


4 DSL for Program Synthesis and Alloy* and C# Translators

Program specifications must be written in the form of a contract with pre and post-conditions. In order to prevent the user from writing the contract in Alloy*, abstracting some concepts from its syntax, we propose a higher level Domain Specific Language (DSL) for contracts. Such a DSL has a similar syntax to Code Contracts [13] and hides from the user some technicalities of Alloy*. From this chapter on, a contract is a set of variable declarations, pre and post-conditions, an optional sketch of the program to be synthesized, and additional predicates in Alloy* notation, if needed.

To provide tool support, we end this chapter by presenting two translators: (i) DSL to Alloy* (This translator is our main concern and we implemented it by using the Spoofax workbench [15] using proposed translation rules); (ii) Alloy* instance to C# function (This translator is just a Java implementation that executes the Alloy* files using the Alloy* API, exports the solution found to an XML file and translates it to a C# function). Both translators and their input/output are presented in Figure 3.

Figure 3 – Synthesizer’s translators

This chapter explains how the syntax of our DSL was defined, presenting its BNF. Afterwards, the transformation rules that translate the contracts to Alloy* notation are explained in detail, including an example of translation for better understanding. Finally, we present the C# generator, which rewrites in C# notation the solutions found by the Alloy* Analyzer.

4.1 Syntax

This section presents the syntax of our DSL for contracts.

A contract begins at the element Start of the DSL. This is the top-level constructor that contains all the other elements of a contract.


The Start syntactical class is formed by exactly one Declarations, one Precondition, one Postcondition, zero or more Pred, and optionally one RunCommand.

Before stating the pre and post-conditions, the user must declare the variables that the synthesized program will use. The constructor that represents that is Declarations, which has the following syntax:

Declarations ::= "vars:" {Vars}

The label "vars:" defines the start of the variable declarations section. Vars defines how to declare a single variable.

Vars ::= Type ID

A variable is declared by introducing its Type followed by its ID (its name). There are three allowed types.

Type ::= "int" | "aux int" | "const int"

An int is the type of a regular integer variable. It can be used on both sides of an assignment command, which is presented later in this section. A const int is the type of a constant (read-only) integer variable. An aux int is the type of an auxiliary integer variable, which is mainly used on the left-hand side of an assignment. Auxiliary variables cannot be used in if-then-else and while conditions, as explained for the corresponding Loc subtypes (their equivalent in Alloy* notation) in Section 3.1.2.

A Precondition describes the condition over the variables that the program assumes to be true at the beginning of its execution, and can also contain a sketch of the desired program. Our DSL uses a syntax similar to that of Code Contracts for C# [13] to write a pre-condition, since we had intended to use its static analysis features. The use of those features was dropped because we decided to focus on other tasks, such as the synthesis itself and the transformation rules from the DSL to Alloy* notation, but we decided not to change the syntax. A pre-condition can also be empty, which makes it an optional element.

Precondition ::= [("Contract.Requires(" Quantifier ");")] [Sketch]

A Quantifier is used inside the Precondition, and can be of two distinct types, "For all" or "Exists". It is possible to use no quantifier and write the expressions directly.

Quantifier ::= "CodeContract.ForAll(" Logic ")"
             | Logic

Similar to Precondition, a Postcondition is written using the same syntax from Code Contracts as well, and can also be empty. A post-condition describes a condition over the variables that the program must satisfy at the end of its execution.

Postcondition ::= [("Contract.Ensures(" Logic ");")]

A Sketch is used to write a template of the program to be synthesized. It can be empty, or it can contain the label "sketch:" followed by a Program, which in turn can be followed by 0 or more commands Cmd. A sketch is used to help the synthesizer find a satisfiable solution for a given specification. Once a sketch is used, the other elements of the contract cannot conflict with it. For example, if some type of expression is used x times in the sketch, the cardinality of this type of expression must be at least equal to x in the run command (which is discussed in more detail below).

Sketch ::= [("sketch:" Program Cmd)]

A Program is simply the first command of the sketch. Its syntax is just a single Cmd.

Program ::= Cmd

A Cmd is any of the following commands.

• Skip - To create a skip command, simply use the word "Skip". This represents a command that does nothing.

Skip ::= "Skip"

• UnknownCmd - This element represents a command which the synthesiser should discover by itself. Its syntax in the DSL is the character "?", which indicates where this command should be placed in the code.

UnknownCmd ::= "?"

• Assign - This element represents an assignment. The left-hand side is a variable name and the right-hand side is the value the variable is assigned to.

Assign ::= ID "=" Value

• SComp - SComp is the sequential composition of two commands. Those two commands must be separated by a semicolon.


• CondIf - This constructor represents the conditional "if-then" command (i.e. with no "else" branch). This element comprises a conditional expression and a command to be executed if the condition evaluates to true.

CondIf ::= "if (" Logic "){" Cmd "}"

• CondIfElse - This element represents the "if-then-else" command. Like the element CondIf, this element contains a conditional expression to be evaluated, and a command that will be executed if the condition evaluates to true. It also contains a second command that should execute if the condition evaluates to false.

CondIfElse ::= "if (" Logic "){" Cmd "}else{" Cmd "}"

• While - The While element represents the while loop. Its syntax is similar to the one of a C-like language, the only difference is the invariant, that must be present just after the condition, separated by a semicolon.

While ::= "while(" Logic ";" Logic "){" Cmd "}"

Value represents each value in the program. The form of a value depends on its type. The value of a variable is identified by its ID. If it is the value of a variable after the execution of a command, the ID must be followed by an apostrophe. The value of an integer number is just the number itself. The values Sub, Add and Mult represent the arithmetic expressions subtraction, addition and multiplication, respectively.

Value ::= ValueIDPost
        | ValueID
        | ValueINT
        | Sub
        | Add
        | Mult
ValueID ::= ID
ValueIDPost ::= ID "'"
ValueINT ::= ["-"](0-9)+
Sub ::= "(" Value "-" Value ")"
Add ::= "(" Value "+" Value ")"
Mult ::= "(" Value "*" Value ")"
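To show how the Value productions fit together, the following Python sketch parses a Value into a tagged tuple by recursive descent. It is an illustration only and is unrelated to the Spoofax implementation: the tokenizer, the tuple representation and the identifier pattern (letters, digits and underscores) are our assumptions, since the text does not define ID.

```python
import re

# Integer literals, identifiers with an optional trailing apostrophe, and punctuation.
TOKEN = re.compile(r"\s*(-?\d+|[A-Za-z_]\w*'?|[()+\-*])")
OPERATORS = {"+": "Add", "-": "Sub", "*": "Mult"}

def tokenize(text):
    """Split a Value string into tokens, rejecting unexpected characters."""
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_value(tokens):
    """Parse one Value from the front of the token list (consumed in place)."""
    tok = tokens.pop(0)
    if tok == "(":
        # Sub, Add and Mult are always fully parenthesized: "(" Value op Value ")"
        left = parse_value(tokens)
        op = tokens.pop(0)
        right = parse_value(tokens)
        if op not in OPERATORS or tokens.pop(0) != ")":
            raise SyntaxError("expected '(' Value op Value ')'")
        return (OPERATORS[op], left, right)
    if re.fullmatch(r"-?\d+", tok):
        return ("ValueINT", int(tok))
    if not re.fullmatch(r"[A-Za-z_]\w*'?", tok):
        raise SyntaxError(f"unexpected token {tok!r}")
    if tok.endswith("'"):
        return ("ValueIDPost", tok[:-1])   # value after execution, e.g. x'
    return ("ValueID", tok)

tree = parse_value(tokenize("(x' + (y * -2))"))
```

Because every binary expression is parenthesized in the grammar, the parser needs no operator precedence. A minus sign directly followed by digits is read as part of an integer literal, so the subtraction operator should be surrounded by spaces in this sketch.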

Logic is a boolean expression composed of other boolean expressions. It is used to represent expressions that contain "and" and "or". A constructor composed of only one expression is also supported.

Logic ::= "(" Logic "||" Logic ")" | "(" Logic "&&" Logic ")" | Expr

An Expr is a boolean expression. Unknown expressions are written using the character "?" as we do with unknown commands. In particular, an Expr can be the application of a predicate. In this case, the predicate ID must be followed by its arguments. All the other expressions can be written in a similar way of a C-like language.

Expr ::= "?"
       | Value ">" Value
       | Value ">=" Value
       | Value "<=" Value
       | Value "==" Value
       | Value "!=" Value
       | ID "[" Parameters "]"

Parameters ::= Value ("," Parameter)?

A Pred has the same syntax as an Alloy* predicate and is used to describe more complex programs that cannot be specified using only the DSL notation. It must begin with the keyword pred followed by a predicate name, its parameters, and its body. The predicates are not modified by the transformation rules during the translation. The Pred must be defined after the post-condition. Since the predicate syntax is the same as in Alloy*, the constructor BODY was defined as a string, so any string is accepted at this place by our DSL, but it must be valid in Alloy*, otherwise it will cause an error during the synthesis process.

Pred ::= "pred" ID "[" Parameter "] {" BODY "}"

RunCommand and RunNumber are used to write our DSL equivalent of the Alloy* command Run and its parameters, such as the number of instances of expressions and commands. A RunCommand starts with the string "run command: run for" and is followed by an integer number and the parameters of the command, represented by a RunNumber. If no RunCommand is given by the user, the tool generates one automatically.


A RunNumber is composed of the name and the number of occurrences of each command, expression, variable or any other element of the program being synthesized, and also of the range of integer numbers. In the latter case the element is composed of the string "int range" followed by two integers with the string ".." between them.

RunNumber ::= "int range " INT ".." INT
            | "const int " INT
            | "int " INT
            | "aux int " INT
            | "variables " INT
            | "int variables " INT
            | "int values " INT
            | "conditionals " INT
            | "skips " INT
            | "seq comp " INT
            | "assignments " INT
            | "whiles " INT
            | "and " INT
            | "or " INT
            | "geq " INT
            | "neq " INT
            | "gth " INT
            | "leq " INT
            | "eq " INT
            | "expansions " INT
            | "state changes " INT
            | "additions " INT
            | "subtractions " INT
            | "multiplications " INT
            | "arithmetic expr " INT
            | "commands " INT
            | "states " INT


4.2 Translation Rules from DSL to Alloy*

To translate a program contract written in our DSL into an Alloy* specification, transformation rules were defined. The whole translation process, from the input contract to an Alloy* specification, is presented in Figure 4. To complete the translation, the flow presented in Figure 4 must be followed until the end. All of those rules were defined using the Spoofax Workbench [15], which was presented in Chapter 2. Each rule range on the arrows in Figure 4 represents the set of rules responsible for the translation at that specific point, and the rules from this set are applied according to the element being translated. For example, each type of command is translated by a different rule. All these rules are presented in this chapter.

Figure 4 – Translation flow from DSL to Alloy*

Example 4 presents a program specification written in DSL notation, on the left side, and in Alloy* notation, on the right side. The whole process of rewriting the contract in Alloy* notation is done by the rules presented in this chapter. The program used as an example is very simple and consists of two variables and one assignment.

Example 4.

vars:
int x
const int y
Contract.Requires((x > 0 && y > 0));
sketch:
x = ?
Contract.Ensures(x' == y);
run command: run for 3 int range -1..3, int 1, const int 1, assignments 1, commands 1

▶

module progFinder
open ../imp[PreC, PosC]
one sig x extends DLoc {}
one sig y extends XLoc {}
one sig Assign1 extends Assign {}
pred PreC[iSt1: one State] {
  (iSt1.bind[x] > 0 and iSt1.bind[y] > 0)
  Prog.body = Assign1
  Assign1.lhs.name = x
}
pred PosC[iSt1, fSt: one State] {
  fSt.bind[x] = iSt1.bind[y]
}
run Synt for 3 but -1..3 Int, exactly 1 DLoc, exactly 1 XLoc, exactly 1 Assign, exactly 1 Cmd

The main rules shown in the control flow in Figure 4 are introduced below. As in Example 4, the contract written in DSL notation is displayed on the left side and the output of the rule is presented on the right side. The right side may also contain elements that still need to be translated by other rules. The elements that each rule translates appear between the "[[" and "]]" symbols, followed by the rule's name. Since a given element in DSL notation may have more than one rule, the name identifies which rule is applied at that point of the translation.

4.2.1 Starting translation

Rule 1 matches the pattern defined by the element Start and produces the top-level structure in Alloy* by defining the module name, importing the file "imp.als", and invoking the translation of the remaining constructs (Declarations, Precondition, Postcondition, and RunCommand). As the Precondition contains the Sketch, and the Sketch has commands and expressions that must be declared, we need to apply two different rules to translate the Precondition: one for the declarations and another for rewriting it in the Alloy* syntax.


Rule 1.

[[Declarations Precondition Postcondition RunCommand]]to-alloy

▶

    module progFinder
    open ../imp["["]PreC, PosC["]"]
    [[Declarations]]declare-vars
    [[Precondition]]declare-sketch
    [[Precondition]]translate-precondition
    [[Postcondition]]translate-postcondition
    [[RunCommand]]translate-run-command
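The way Rule 1 composes the other rules can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Spoofax implementation used in the thesis: the contract is assumed to be a dictionary with one entry per top-level construct, each sub-rule is assumed to be a function returning a list of Alloy* lines, and the names `STEPS` and `to_alloy` are hypothetical.

```python
# Illustrative sketch only; the thesis defines Rule 1 in Spoofax.
# STEPS and to_alloy are hypothetical names.

# (rule name, element it is applied to), in the order of Rule 1's right side.
# Note that Precondition is visited twice: once to declare the sketch's
# commands and expressions, once to rewrite it in Alloy* syntax.
STEPS = [
    ("declare-vars", "Declarations"),
    ("declare-sketch", "Precondition"),
    ("translate-precondition", "Precondition"),
    ("translate-postcondition", "Postcondition"),
    ("translate-run-command", "RunCommand"),
]


def to_alloy(contract, rules):
    """Emit the fixed header, then delegate each element to its sub-rule.

    contract: dict mapping element names to their (parsed) content
    rules:    dict mapping rule names to functions returning Alloy* lines
    """
    lines = ["module progFinder", "open ../imp[PreC, PosC]"]
    for rule_name, element in STEPS:
        lines.extend(rules[rule_name](contract[element]))
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder sub-rules that only record which rule handled which element.
    def placeholder(name):
        return lambda element: [f"-- {name} applied to {element}"]

    rules = {name: placeholder(name) for name, _ in STEPS}
    contract = {
        "Declarations": "Declarations",
        "Precondition": "Precondition",
        "Postcondition": "Postcondition",
        "RunCommand": "RunCommand",
    }
    print(to_alloy(contract, rules))
```

Keeping the sub-rules in a table keyed by rule name mirrors the role that the rule's name plays in the notation: the same element (here, the Precondition) can be handled by different rules at different points of the translation.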

4.2.2 Variable declarations

All signatures used in an Alloy* specification must be declared. Since the variables are modeled by signatures in our IMP syntax, it is necessary to declare them all.

Rule 2 translates each variable declaration into a signature declaration. Recall that Vars represents the variables of the program, each with its type and its name.

Rule 2.

vars: [[(Type VarName)*]]declare-vars   ▶   one sig VarName extends [[Type]]translate-type {}

The types int, const int and aux int are translated to DLoc, XLoc and ALoc, respectively. These three translation rules (one for each type) are presented together as Rule 3.

Rule 3.

[[int]]translate-type         ▶   DLoc
[[const int]]translate-type   ▶   XLoc
[[aux int]]translate-type     ▶   ALoc
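Rules 2 and 3 can be sketched together in Python. This is only an illustration under the assumption that declarations are available as (type, name) pairs; the thesis defines these rules in Spoofax, and the names `TYPE_TO_SIG`, `translate_type` and `declare_vars` are hypothetical.

```python
# Illustrative sketch only; the thesis defines these rules in Spoofax.
# TYPE_TO_SIG, translate_type and declare_vars are hypothetical names.

# Rule 3: each DSL type maps to one Alloy* location signature.
TYPE_TO_SIG = {
    "int": "DLoc",
    "const int": "XLoc",
    "aux int": "ALoc",
}


def translate_type(dsl_type):
    """Rule 3: map a DSL type to the signature that the variable extends."""
    if dsl_type not in TYPE_TO_SIG:
        raise ValueError(f"unsupported DSL type: {dsl_type!r}")
    return TYPE_TO_SIG[dsl_type]


def declare_vars(declarations):
    """Rule 2: emit one singleton signature per (type, name) declaration."""
    return [
        f"one sig {name} extends {translate_type(dsl_type)} {{}}"
        for dsl_type, name in declarations
    ]


if __name__ == "__main__":
    # The declarations of Example 4: `int x` and `const int y`.
    for line in declare_vars([("int", "x"), ("const int", "y")]):
        print(line)
```

Applied to the declarations of Example 4, the sketch yields the two signature declarations for x and y that appear in the Alloy* specification.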

4.2.3 Sketch declarations

Commands and expressions of a sketch are also represented by signatures in Alloy*. Rule 4 is responsible for this translation. As the sketch is composed of 1 Program and 0
