Academic year: 2022

Deciding Kleene algebra terms equivalence in Coq

Nelma Moreira, David Pereira, Simão Melo de Sousa

Abstract

This paper presents a mechanically verified implementation of an algorithm for deciding the equivalence of Kleene algebra terms within the Coq proof assistant. The algorithm decides equivalence of two given regular expressions through an iterated process of testing the equivalence of their partial derivatives and does not require the construction of the corresponding automata. Recent theoretical and experimental research provides evidence that this method is, on average, more efficient than the classical methods based on automata. We present some performance tests, comparisons with similar approaches, and also introduce a generalization of the algorithm to decide the equivalence of terms of Kleene algebra with tests. The motivation for the work presented in this paper is that of using the libraries developed as trusted frameworks for carrying out certified program verification.

Keywords: Proof assistants; Regular expressions; Kleene algebra with tests; Program verification

1. Introduction

Formal languages are one of the pillars of computer science. Amongst the several computational models of formal languages, that of regular expressions is one of the most widely known and used. The notion of regular expressions has its origins in the seminal work of Kleene, where the author introduced them as a specification language for deterministic finite automata (DFA) [1]. Nowadays, regular expressions find applications in a wide variety of areas due to their capability of expressing patterns in a succinct and comprehensive way. They abound in technologies deriving from the World Wide Web, in text processors, in structured languages such as XML, and are a core element of programming languages like Perl [2] and Esterel [3]. More recently, regular expressions have been successfully applied in the runtime verification of programs [4,5].

In the past years, much attention has been given to the mechanization of Kleene algebra (KA), the algebra of regular expressions, within proof assistants. Formally, a KA is an idempotent semiring together with the Kleene star operator $\cdot^{*}$, that is characterized axiomatically. J.-C. Filliâtre [6] provided a first formalization of the Kleene theorem for regular languages [1] within the Coq proof assistant [7]. Höfner and Struth [8] investigated automated reasoning in variants of Kleene algebras with Prover9 and Mace4 [9]. Pereira and Moreira [10] implemented in Coq an abstract specification of Kleene algebra with tests (KAT) [11] and the proofs that propositional Hoare logic deduction rules are theorems of KAT. An obvious follow-up of that work was to implement a certified procedure for deciding equivalence of KA terms, i.e., regular expressions. A first step was the proof of the correctness of the partial derivative automaton construction from a regular expression [12]. In this paper we describe the mechanization of a decision procedure based on partial derivatives that was proposed by Almeida et al. [13], and that is a functional variant of the rewrite system introduced by Antimirov and Mosses in [14]. This procedure decides regular expression equivalence through an iterated process of testing the equivalence of their partial derivatives.

Similar approaches based on the computation of a bisimulation between the two regular expressions were used recently. In 1971, Hopcroft and Karp [15] presented an almost linear algorithm for equivalence of two DFAs. By transforming regular expressions into equivalent DFAs, Hopcroft and Karp’s method can be used for regular expression equivalence. A comparison of that method with the method proposed here is discussed by Almeida et al. [16,17]. There it is conjectured that a direct method should perform better on average, and that is corroborated by theoretical studies based on analytic combinatorics [18]. Hopcroft and Karp’s method was used by Braibant and Pous [19] to formally verify Kozen’s proof of the completeness of Kleene algebra [20] in Coq.

Independently of the work presented here, Coquand and Siles [21] mechanically verified an algorithm for deciding regular expression equivalence based on Brzozowski’s derivatives [22] and an inductive definition of finite sets called Kuratowski-finite sets. Based on the same notion of derivative, Krauss and Nipkow [23] provide an elegant and concise formalization of Rutten’s co-algebraic approach of regular expression equivalence [24] in the Isabelle proof assistant [25], but they do not address the termination of the decision procedure. Komendantsky provides a novel functional construction of the partial derivative automaton [26], and also made contributions [27] to the mechanization of concepts related to Mirkin’s construction [28] of that automaton. More recently, Andrea Asperti formalized a decision procedure for the equivalence of pointed regular expressions [29], that is both compact and efficient.

Besides avoiding the need for building DFAs, our use of partial derivatives also avoids the normalization of regular expressions modulo ACI (i.e., the normalization modulo associativity, commutativity and idempotence of the union of regular expressions) that is necessary to ensure the finiteness of Brzozowski’s derivatives. As in other approaches [19], our method also includes a refutation step that improves the detection of inequivalent regular expressions.

Although the algorithm we have chosen to verify seems straightforward, the process of its mechanical verification in a theorem prover based on a type theory raises several issues which are quite different from a usual implementation in standard programming languages. The Coq proof assistant allows users to specify and implement programs, and also to prove that the implemented programs are compliant with their specification. In this sense, the first task is the effort of formalizing the underlying algebraic theory. Afterwards, and in order to encode the decision procedure, we have to provide a formal proof of its termination since our procedure is a general recursive one, whereas Coq’s type system accepts only provably terminating functions. Finally, a formal proof must be provided in order to ensure that the functional behavior of the implemented procedure is correct w.r.t. regular expression equivalence. Moreover, the encoding effort must be conducted with care in order to obtain a solution that is able to compute inside Coq, or be extracted and compiled as an OCaml development, both with reasonable performance.

1.1. Paper organization

This paper is organized as follows. In Section 2 we provide a concise introduction to the Coq proof assistant. In Section 3 we review some of the concepts of formal languages that we need to formalize in order to implement the decision procedure; in Section 4 we describe the formalization of the decision procedure, its proofs of correctness and completeness, and comment on the procedure’s computational efficiency; in Section 5 we describe the generalization of the decision procedure to decide KAT terms equivalence, and show how this procedure is useful in program verification; finally, in Section 6 we present our conclusions about the work presented in this paper, and point to future research directions. The work presented here is an extended version of the work previously presented in [30,31], and the corresponding development in Coq is available at [32].

2. An overview of the Coq proof assistant

The Coq proof assistant [7] is an implementation of Paulin-Mohring’s Calculus of Inductive Constructions (CIC) [33]. The CIC is a rich typed λ-calculus that features polymorphism, dependent types, and that extends Coquand and Huet’s Calculus of Constructions (CC) [34] with very expressive (co-)inductive types.

The CIC is built upon the Curry–Howard Isomorphism (CHI) programs-as-proofs principle [35], where a typing relation $t : A$ is interpreted either as a term $t$ that has the type $A$, or as $t$ being a proof of the proposition $A$. Hence, the CIC is simultaneously a functional programming language with a very expressive type system and a higher-order logic, and so, users can define specifications of programs, and also build proofs concerning those specifications.

In the CIC there exists no distinction between terms and types. Therefore, all types also have their own type, called a sort, and each sort belongs to the well-formed set $\mathcal{S} = \{\mathsf{Prop}, \mathsf{Set}, \mathsf{Type}(i) \mid i \in \mathbb{N}\}$, where $\mathsf{Type}(i)$ is the type of the smaller sorts $\mathsf{Type}(j)$ with $j < i$, including the sorts $\mathsf{Prop}$ and $\mathsf{Set}$, which ensure a strict separation between logical types and informative types: the former is the type of propositions and proofs, whereas the latter accommodates data types and functions defined over those data types. An immediate effect of the non-existing distinction between types and terms in CIC is that computations occur both in programs and in proofs. A fundamental feature of Coq’s underlying type system is the support for dependent product types $\Pi x{:}A.\,B$, which extend functional types $A \to B$ in the sense that $\Pi x{:}A.\,B$ is the type of functions that map each $x$ of type $A$ to a term of type $B$, where $x$ may occur in $B$. If $x$ does not occur in $B$ then the dependent product corresponds to the function type $A \to B$.

Inductive definitions are a key ingredient of Coq. Inductive types are introduced by a collection of constructors, each with its own arity. A term of an inductive type is a composition of such constructors and if T is the type under consideration, then its constructors are functions whose final type is T, or an application of T to arguments. Using pattern matching, we can implement recursive functions by deconstructing the given term and producing new terms for each constructor. For instance, it is straightforward to define Peano natural numbers and a function plus that implements addition on these numbers:

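    (* Peano naturals: O is zero, S is the successor constructor *)
    Inductive nat : Set :=
      | O : nat
      | S : nat -> nat.

    (* Addition by structural recursion on the first argument *)
    Fixpoint plus (n m : nat) : nat :=
      match n with
      | O => m
      | S p => S (p + m)
      end
    where "n + m" := (plus n m).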

The definition of plus is accepted by Coq’s type-checker because it exhaustively pattern-matches over all the constructors of nat, and because the recursive calls are performed on terms that are structurally smaller than the recursive argument. This is a strong requirement of CIC that forces all functions to be terminating.
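The structural-recursion requirement can be observed directly. The following is a minimal sketch over Coq's standard nat; the names double and loop are hypothetical and are not part of the development described in this paper.

```coq
(* Accepted: the recursive call is on p, a strict subterm of n. *)
Fixpoint double (n : nat) : nat :=
  match n with
  | O => O
  | S p => S (S (double p))
  end.

(* Rejected: the recursive call is on n itself, which is not a subterm of n.
   The Fail prefix makes this command succeed exactly because the
   definition is refused by the termination check. *)
Fail Fixpoint loop (n : nat) : nat := loop n.

Compute double 3. (* = 6 : nat *)
```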

We can define inductive types that are more complex than nat, namely, inductive types that depend on values. A classic example is the family of vectors of length $n \in \mathbb{N}$, whose elements have a type $A$:

    Inductive vect (A : Type) : nat -> Type :=
      | vnil : vect A 0
      | vcons : forall n : nat, A -> vect A n -> vect A (S n).

Given the definition of vect, we can define the concatenation of vectors, as follows:

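    (* A is passed explicitly; constructor parameters are written _ in patterns *)
    Fixpoint app (A : Type) (n : nat) (l1 : vect A n) (n' : nat) (l2 : vect A n')
        {struct l1} : vect A (n + n') :=
      match l1 in (vect _ m') return (vect A (m' + n')) with
      | vnil _ => l2
      | vcons _ n0 v l1' => vcons A (n0 + n') v (app A n0 l1' n' l2)
      end.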

Note that there is a difference between the pattern-matching construction used in the definition of plus and the one used to implement app: in the latter, the returning type depends on the sizes of the vectors given as arguments; therefore, the extended match construction in app has to bind the dependent argument m' to ensure that the final return type is a vector whose size is n + n'.

In Coq’s environment, the primitive way to construct a proof is to explicitly build CIC terms. However, proofs can be built more conveniently, in an interactive and backward fashion through the usage of high-level commands called tactics. The CIC terms built by tactics are always verified by Coq’s type checker, which ensures that possible errors in the tactics do not interfere with the soundness of the proof construction process.
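To make the two styles concrete, the sketch below proves the same proposition once by writing the CIC term directly and once with tactics. The names k_term and k_tac are hypothetical and chosen only for illustration.

```coq
(* A proof given directly as a CIC term: under the Curry-Howard reading,
   this function is itself the proof of A -> B -> A. *)
Definition k_term (A B : Prop) : A -> B -> A :=
  fun (a : A) (_ : B) => a.

(* The same statement proved interactively, in backward fashion. *)
Lemma k_tac : forall A B : Prop, A -> B -> A.
Proof.
  intros A B a b.
  exact a.
Qed.

(* Print displays the CIC term that the tactics built and that the
   type checker verified when Qed was processed. *)
Print k_tac.
```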

We finish our brief introduction to Coq by addressing the development of non-structurally recursive functions. Above we have seen functions defined by pattern matching over (dependent) inductive types, whose decreasing criterion is structural recursion. However, this approach is not always possible, and the way to deal with this problem is via an encoding of the original formulation into an equivalent function that is structurally recursive. There are several techniques available to address the development of non-structurally decreasing functions in Coq, which are described in detail in [7]; here we will consider the method for defining well-founded recursive functions.

A given binary relation $R$ over a set $S$ is said to be well-founded if for all elements $x \in S$, there exists no infinite sequence $(x, x_0, x_1, x_2, \ldots)$ of elements of $S$ such that $(x_{i+1}, x_i) \in R$, for all $i \in \mathbb{N}$. Well-founded relations are available in Coq through the definition of the inductive predicate Acc and the predicate well_founded:

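    Inductive Acc (A : Type) (R : A -> A -> Prop) (x : A) : Prop :=
      | Acc_intro : (forall y : A, R y x -> Acc A R y) -> Acc A R x.

    Definition well_founded (A : Type) (R : A -> A -> Prop) : Prop :=
      forall a : A, Acc A R a.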

Since the type Acc is inductively defined, we can use it as the structurally recursive argument in the definition of a function. Thankfully, Coq provides a high-level command named Function [36] that eases the burden of manually constructing a recursive function over Acc predicates. The command Function allows users to explicitly state that the target function is going to be defined over a proof that asserts that the underlying recursive measure is well-founded.
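As an illustration of recursion over Acc, the following sketch defines a function whose recursive call is on n - 2, which is not a structural subterm of n. It deliberately uses the standard library combinator Fix together with lt_wf rather than the Function command, so that the accessibility argument stays visible; the names is_odd_body and is_odd are hypothetical.

```coq
Require Import Arith Lia.

(* One unfolding of the function: rec may only be called on an argument m
   that comes with a proof of m < n. *)
Definition is_odd_body (n : nat) (rec : forall m : nat, m < n -> bool) : bool.
Proof.
  destruct (le_lt_dec 2 n) as [Hge | Hlt].
  - (* 2 <= n: recurse on n - 2 and discharge n - 2 < n with lia *)
    apply (rec (n - 2)). lia.
  - (* n < 2: n is 0 or 1 *)
    exact (Nat.eqb n 1).
Defined.

(* Fix performs structural recursion on a proof of Acc lt n, which is
   obtained from lt_wf : well_founded lt. *)
Definition is_odd : nat -> bool :=
  @Fix nat lt lt_wf (fun _ => bool) is_odd_body.

Check is_odd.
```

The Function command automates this construction: the user supplies the measure or relation, and the proof obligations correspond to the `n - 2 < n` goal solved above.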

For further information about the details of the Coq proof assistant, we point the reader to the works of Bertot and Casterán [7], of Chlipala [37], and of Pierce et al. [38].

3. Preliminaries of formal languages

In this section we introduce some classic concepts of formal languages that we will need in the work we are about to describe. These concepts can be found in the introductory chapters of classical textbooks such as the one by Hopcroft and Ullman [39] or the one by Kozen [40]. The encoding in Coq of the several definitions that we are about to introduce can be seen in [31].

3.1. Alphabets, words and languages

An alphabet $\Sigma$ is a non-empty finite set of objects usually called symbols (or letters). A word (or string) over an alphabet $\Sigma$ is a finite sequence of symbols from $\Sigma$. A language is any finite or infinite set of words over an alphabet $\Sigma$. Given an alphabet $\Sigma$, the set of all words over $\Sigma$, denoted by $\Sigma^{*}$, is inductively defined as follows: the empty word $\epsilon$ is an element of $\Sigma^{*}$ and, if $w \in \Sigma^{*}$ and $a \in \Sigma$, then $aw$ is also a member of $\Sigma^{*}$. The constant languages are the empty language $\emptyset$, the language containing only $\epsilon$, and the language containing only a symbol $a \in \Sigma$. The operations over languages include the usual Boolean set operations (union, intersection, and complement), plus concatenation, power and Kleene star. The concatenation of two languages $L_1$ and $L_2$ is defined by $L_1 L_2 = \{wu \mid w \in L_1 \wedge u \in L_2\}$. The power of a language $L$, denoted by $L^{n}$, with $n \in \mathbb{N}$, is inductively defined by $L^{0} = \{\epsilon\}$, and $L^{n+1} = L L^{n}$, for $n \in \mathbb{N}$. The Kleene star of a language $L$ is the union of all the finite powers of $L$, that is,

\[ L^{*} = \bigcup_{i \ge 0} L^{i}. \tag{1} \]

We denote language equality by $L_1 = L_2$. Finally, we introduce the concept of the left-quotient of a language $L$ with respect to a word $w \in \Sigma^{*}$, which is defined as $D_w(L) = \{v \mid wv \in L\}$. In particular, if $w = a$, with $a \in \Sigma$, we say that $D_a(L)$ is the left-quotient of $L$ with respect to the symbol $a$.
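The definitions above translate almost literally into Coq if languages are modelled as predicates over words. The following is only a sketch under that assumption; the names word, language, conc, power, star and lquot are hypothetical and need not match the encoding given in [31].

```coq
Require Import List.
Import ListNotations.

Section Languages.
  Variable sym : Type.                 (* the alphabet *)

  Definition word := list sym.
  Definition language := word -> Prop.

  (* L1 L2 = { w u | w in L1 and u in L2 } *)
  Definition conc (L1 L2 : language) : language :=
    fun x => exists w u, x = w ++ u /\ L1 w /\ L2 u.

  (* L^0 = {epsilon} and L^(n+1) = L L^n *)
  Fixpoint power (L : language) (n : nat) : language :=
    match n with
    | O => fun x => x = []
    | S k => conc L (power L k)
    end.

  (* L* is the union of all the finite powers of L, as in (1) *)
  Definition star (L : language) : language :=
    fun x => exists n, power L n x.

  (* D_w(L) = { v | w v in L } *)
  Definition lquot (w : word) (L : language) : language :=
    fun v => L (w ++ v).

  (* The empty word belongs to every L*, because it belongs to L^0. *)
  Lemma star_nil (L : language) : star L [].
  Proof.
    unfold star. exists 0. simpl. reflexivity.
  Qed.
End Languages.
```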

3.2. Regular expressions

Regular expressions are inductively defined over an alphabet $\Sigma$, as follows: the constants $0$ and $1$ are regular expressions; all the symbols $a \in \Sigma$ are regular expressions; if $\alpha$ and $\beta$ are regular expressions, then their union $\alpha + \beta$ and their concatenation $\alpha\beta$ are regular expressions as well; finally, if $\alpha$ is a regular expression, then so is its Kleene star $\alpha^{*}$. The syntactic equality of two regular expressions $\alpha$ and $\beta$ is denoted by $\alpha \equiv \beta$. The set of all regular expressions over an alphabet $\Sigma$ is the set $RE_{\Sigma}$. The length of a regular expression $\alpha$ is the total number of constants, symbols and operators of $\alpha$; the alphabetic length of a regular expression $\alpha$ is the total number of occurrences of symbols of $\Sigma$ in $\alpha$. The previous two measures are denoted by $|\alpha|$ and by $|\alpha|_{\Sigma}$, respectively.

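Regular expressions denote regular languages. The language of a regular expression $\alpha$, denoted $\mathcal{L}(\alpha)$, is inductively defined in the expected way: the languages of the constants $0$ and $1$ are, respectively, the sets $\emptyset$ and $\{\epsilon\}$; the language of the regular expression $a$, with $a \in \Sigma$, is the set $\{a\}$; if $\alpha$ and $\beta$ are regular expressions, then the languages denoted by the expressions $\alpha + \beta$, $\alpha\beta$, and $\alpha^{*}$ are, respectively, the languages $\mathcal{L}(\alpha) \cup \mathcal{L}(\beta)$, $\mathcal{L}(\alpha)\mathcal{L}(\beta)$, and $\mathcal{L}(\alpha)^{*}$. The language of a finite set of regular expressions $S$ is defined by

\[ \mathcal{L}(S) = \bigcup_{\alpha_i \in S} \mathcal{L}(\alpha_i). \]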

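Two regular expressions $\alpha$ and $\beta$ are said to be equivalent if they denote the same language, and we write $\alpha \sim \beta$ whenever that is the case.$^{1}$ Naturally, two sets of regular expressions $S_1$ and $S_2$ are equivalent if $\mathcal{L}(S_1) = \mathcal{L}(S_2)$, and we write $S_1 \sim S_2$. Given a set of regular expressions $S = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ we define

\[ \sum S = \alpha_1 + \alpha_2 + \cdots + \alpha_n, \]

whose language is

\[ \mathcal{L}\Bigl(\sum S\Bigr) = \mathcal{L}(\alpha_1) \cup \mathcal{L}(\alpha_2) \cup \cdots \cup \mathcal{L}(\alpha_n). \]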

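We say that a regular expression $\alpha$ is nullable if $\epsilon \in \mathcal{L}(\alpha)$ and non-nullable otherwise. Moreover, we consider the Boolean function $\varepsilon(\cdot)$ such that $\varepsilon(\alpha) = \mathsf{true}$ if and only if $\epsilon \in \mathcal{L}(\alpha)$ holds. Nullability extends to sets of regular expressions in a straightforward way: a set $S$ is nullable if $\varepsilon(\alpha) = \mathsf{true}$ for at least one $\alpha \in S$. We denote the nullability of a set of regular expressions $S$ by $\varepsilon(S)$. Two sets of regular expressions $S_1$ and $S_2$ are equi-nullable if $\varepsilon(S_1) = \varepsilon(S_2)$. We also consider the right-concatenation $S \odot \alpha$ of a regular expression $\alpha$ with a set of regular expressions $S$, which is defined as follows: $S \odot \alpha = \emptyset$ if $\alpha \equiv 0$, $S \odot \alpha = S$ if $\alpha \equiv 1$, and $S \odot \alpha = \{\beta\alpha \mid \beta \in S\}$ otherwise. We usually omit the operator $\odot$ and write $S\alpha$ instead.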
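A possible Coq rendering of regular expressions, nullability and right-concatenation is sketched below. It makes two simplifying assumptions that are ours and not the paper's: the alphabet is fixed to two letters, and finite sets of regular expressions are represented by plain lists. All names (sym, re, nullable, nullable_set, dot, ab_star) are hypothetical.

```coq
Require Import List.
Import ListNotations.

(* A two-letter alphabet, chosen only for illustration. *)
Inductive sym : Type := SymA | SymB.

Inductive re : Type :=
  | Zero : re                 (* 0 *)
  | One  : re                 (* 1 *)
  | Sym  : sym -> re          (* a symbol of the alphabet *)
  | Plus : re -> re -> re     (* union *)
  | Cat  : re -> re -> re     (* concatenation *)
  | Star : re -> re.          (* Kleene star *)

(* nullable r = true iff the empty word belongs to the language of r *)
Fixpoint nullable (r : re) : bool :=
  match r with
  | Zero => false
  | One => true
  | Sym _ => false
  | Plus r1 r2 => orb (nullable r1) (nullable r2)
  | Cat r1 r2 => andb (nullable r1) (nullable r2)
  | Star _ => true
  end.

(* A set is nullable if at least one of its members is. *)
Definition nullable_set (s : list re) : bool := existsb nullable s.

(* Right-concatenation of a set s with a regular expression r *)
Definition dot (s : list re) (r : re) : list re :=
  match r with
  | Zero => []
  | One => s
  | _ => map (fun x => Cat x r) s
  end.

Definition ab_star : re := Cat (Sym SymA) (Star (Sym SymB)).

Compute nullable ab_star.                         (* = false : bool *)
Compute nullable_set [ab_star; Star (Sym SymB)].  (* = true : bool *)
Compute dot [Sym SymA] One.                       (* = [Sym SymA] : list re *)
```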

1 As the reader will notice, we overload the notation $\sim$ whenever equivalence by means of language equality is considered.

3.3. Derivatives of regular expressions

The notion of derivative of a regular expression $\alpha$ was introduced by Brzozowski in the 1960s [22], and was motivated by the construction of sequential circuits directly from regular expressions extended with intersection and complement. In the same decade, Mirkin introduced the notion of prebase and base of a regular expression as a method to construct non-deterministic finite automata (NFA) that recognize the corresponding languages [28]. Mirkin’s definition is a generalization of Brzozowski’s derivatives for NFA and was independently re-discovered almost thirty years later by Antimirov [41], who coined it as the partial derivatives of a regular expression.

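Let $\alpha$ be a regular expression and let $a \in \Sigma$. The set $\partial_a(\alpha)$ of partial derivatives of the regular expression $\alpha$ with respect to $a$ is inductively defined as follows:

\[
\begin{aligned}
\partial_a(0) &= \emptyset, \\
\partial_a(1) &= \emptyset, \\
\partial_a(b) &= \begin{cases} \{1\} & \text{if } a \equiv b, \\ \emptyset & \text{otherwise,} \end{cases} \\
\partial_a(\alpha + \beta) &= \partial_a(\alpha) \cup \partial_a(\beta), \\
\partial_a(\alpha\beta) &= \begin{cases} \partial_a(\alpha)\beta \cup \partial_a(\beta) & \text{if } \varepsilon(\alpha) = \mathsf{true}, \\ \partial_a(\alpha)\beta & \text{otherwise,} \end{cases} \\
\partial_a(\alpha^{*}) &= \partial_a(\alpha)\alpha^{*}.
\end{aligned}
\]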

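The operation of partial derivation naturally extends to a set of regular expressions $S$ as follows:

\[ \partial_a(S) = \bigcup_{\alpha \in S} \partial_a(\alpha). \]

The language of the set of partial derivatives $\partial_a(\alpha)$ is the left-quotient of $\mathcal{L}(\alpha)$, i.e., $\mathcal{L}(\partial_a(\alpha)) = D_a(\mathcal{L}(\alpha))$. The set of partial derivatives is extended to words in the following way: given a regular expression $\alpha$ and a word $w \in \Sigma^{*}$, the partial derivative $\partial_w(\alpha)$ of $\alpha$ with respect to $w$ is defined inductively by $\partial_{\epsilon}(\alpha) = \{\alpha\}$, and $\partial_{wa}(\alpha) = \partial_a(\partial_w(\alpha))$. We can use partial derivatives and nullability of regular expressions to determine if a word $w \in \Sigma^{*}$ is a member of some language $\mathcal{L}(\alpha)$. For that, it is enough to check the value computed by $\varepsilon(\partial_w(\alpha))$: if $\varepsilon(\partial_w(\alpha)) = \mathsf{true}$ then we have $w \in \mathcal{L}(\alpha)$; otherwise, $w \notin \mathcal{L}(\alpha)$ holds.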
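The membership test just described can be sketched in Coq as follows. This is an illustration under our own assumptions, not the verified development: the alphabet has two letters, sets of regular expressions are plain lists (so duplicates may appear), and no simplification of expressions such as 1β is performed. The names pdrv, pdrv_set, wpdrv and matches are hypothetical.

```coq
Require Import List.
Import ListNotations.

Inductive sym : Type := SymA | SymB.

Definition sym_eqb (x y : sym) : bool :=
  match x, y with
  | SymA, SymA => true
  | SymB, SymB => true
  | _, _ => false
  end.

Inductive re : Type :=
  | Zero : re | One : re | Sym : sym -> re
  | Plus : re -> re -> re | Cat : re -> re -> re | Star : re -> re.

Fixpoint nullable (r : re) : bool :=
  match r with
  | Zero => false
  | One => true
  | Sym _ => false
  | Plus r1 r2 => orb (nullable r1) (nullable r2)
  | Cat r1 r2 => andb (nullable r1) (nullable r2)
  | Star _ => true
  end.

(* Right-concatenation of a set (here: a list) with a regular expression *)
Definition dot (s : list re) (r : re) : list re :=
  match r with
  | Zero => []
  | One => s
  | _ => map (fun x => Cat x r) s
  end.

(* Partial derivatives of r with respect to the symbol x *)
Fixpoint pdrv (r : re) (x : sym) : list re :=
  match r with
  | Zero => []
  | One => []
  | Sym y => if sym_eqb x y then [One] else []
  | Plus r1 r2 => pdrv r1 x ++ pdrv r2 x
  | Cat r1 r2 =>
      if nullable r1
      then dot (pdrv r1 x) r2 ++ pdrv r2 x
      else dot (pdrv r1 x) r2
  | Star r1 => dot (pdrv r1 x) (Star r1)
  end.

(* Extension to sets: the union of the derivatives of the members *)
Definition pdrv_set (s : list re) (x : sym) : list re :=
  flat_map (fun r => pdrv r x) s.

(* Extension to words: start from {r} and derive symbol by symbol *)
Definition wpdrv (r : re) (w : list sym) : list re :=
  fold_left pdrv_set w [r].

(* w is in the language of r iff the word derivative is nullable *)
Definition matches (r : re) (w : list sym) : bool :=
  existsb nullable (wpdrv r w).

Definition ab_star : re := Cat (Sym SymA) (Star (Sym SymB)).

Compute matches ab_star [SymA; SymB; SymB].  (* = true : bool *)
Compute matches ab_star [SymB; SymA].        (* = false : bool *)
```

Using lists keeps the sketch short; a faithful encoding would use a finite-set structure so that the cardinality bounds discussed later apply to the representation itself.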

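Example 1. The word derivative of the regular expression $\alpha = ab^{*}$ with respect to $abb$ is given by the following computation:

\[
\begin{aligned}
\partial_{abb}(\alpha) &= \partial_b(\partial_b(\partial_a(ab^{*}))) \\
&= \partial_b(\partial_b(\partial_a(a)b^{*})) \\
&= \partial_b(\partial_b(\{b^{*}\})) \\
&= \partial_b(\partial_b(b)b^{*}) \\
&= \partial_b(\{b^{*}\}) \\
&= \{b^{*}\}.
\end{aligned}
\]

From the nullability of the resulting set of regular expressions $\{b^{*}\}$, we easily conclude that $abb \in \mathcal{L}(\alpha)$ since $\varepsilon(b^{*}) = \mathsf{true}$.

Finally, we present the set of partial derivatives of a given regular expression $\alpha$, which is defined by

\[ PD(\alpha) = \bigcup_{w \in \Sigma^{*}} \partial_w(\alpha). \]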

Antimirov proved in [41] that given a regular expression $\alpha$, the set $PD(\alpha)$ is always finite and its cardinality has an upper bound of $|\alpha|_{\Sigma} + 1$. Champarnaud and Ziadi [42] introduced an elegant recursive function for calculating the support of a given regular expression $\alpha$, and from which it is easy to calculate $PD(\alpha)$. The function, denoted by $\pi(\alpha)$, is recursively defined as follows:

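\[
\begin{aligned}
\pi(0) &= \emptyset, \\
\pi(1) &= \emptyset, \\
\pi(a) &= \{1\}, \\
\pi(\alpha + \beta) &= \pi(\alpha) \cup \pi(\beta), \\
\pi(\alpha\beta) &= \pi(\alpha)\beta \cup \pi(\beta), \\
\pi(\alpha^{*}) &= \pi(\alpha)\alpha^{*}.
\end{aligned}
\]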

Champarnaud and Ziadi proved that $PD(\alpha) = \{\alpha\} \cup \pi(\alpha)$ holds for all regular expressions $\alpha$, and once again we conclude that $|PD(\alpha)| \le |\alpha|_{\Sigma} + 1$.
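The support function and the resulting bound can be checked on a small instance. The sketch below again assumes a two-letter alphabet and represents sets as lists, so the computed list may contain duplicates and its length only over-approximates the cardinality of the set; the names support, alen and PD are hypothetical.

```coq
Require Import List.
Import ListNotations.

Inductive sym : Type := SymA | SymB.

Inductive re : Type :=
  | Zero : re | One : re | Sym : sym -> re
  | Plus : re -> re -> re | Cat : re -> re -> re | Star : re -> re.

(* Right-concatenation of a set (here: a list) with a regular expression *)
Definition dot (s : list re) (r : re) : list re :=
  match r with
  | Zero => []
  | One => s
  | _ => map (fun x => Cat x r) s
  end.

(* The support of r, following the recursive definition of pi *)
Fixpoint support (r : re) : list re :=
  match r with
  | Zero => []
  | One => []
  | Sym _ => [One]
  | Plus r1 r2 => support r1 ++ support r2
  | Cat r1 r2 => dot (support r1) r2 ++ support r2
  | Star r1 => dot (support r1) (Star r1)
  end.

(* Alphabetic length: number of occurrences of alphabet symbols in r *)
Fixpoint alen (r : re) : nat :=
  match r with
  | Zero => 0
  | One => 0
  | Sym _ => 1
  | Plus r1 r2 => alen r1 + alen r2
  | Cat r1 r2 => alen r1 + alen r2
  | Star r1 => alen r1
  end.

(* PD(r) = {r} united with the support of r *)
Definition PD (r : re) : list re := r :: support r.

Definition ab_star : re := Cat (Sym SymA) (Star (Sym SymB)).

(* Number of listed partial derivatives versus the bound alen r + 1 *)
Compute (length (PD ab_star), alen ab_star + 1).  (* = (3, 3) : nat * nat *)
```

Here the two entries of the support both stand for $1b^{*}$, so as a set $PD(ab^{*})$ has two elements, which is within the bound of three.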
