
Contents lists available at ScienceDirect

International Journal of Approximate Reasoning

www.elsevier.com/locate/ijar

Thirty years of credal networks: Specification, algorithms and complexity

Denis Deratani Mauá a,∗, Fabio Gagliardi Cozman b

a Institute of Mathematics and Statistics, Universidade de São Paulo, Brazil
b Escola Politécnica, Universidade de São Paulo, Brazil

Article info

Article history:
Received 20 December 2018
Received in revised form 10 August 2020
Accepted 12 August 2020
Available online 21 August 2020

Keywords:
Imprecise probabilities
Probabilistic graphical models

Abstract

Credal networks generalize Bayesian networks to allow for imprecision in probability values. This paper reviews the main results on credal networks under strong independence, as there has been significant progress in the literature during the last decade or so. We focus on computational aspects, summarizing the main algorithms and complexity results for inference and decision making. We address the question “What is really known about strong extensions of credal networks?” by looking at theoretical results and by presenting a short summary of real applications.

© 2020 Elsevier Inc. All rights reserved.

1. Introduction

The now unclassified United States National Intelligence Estimate report number 29-51 concluded that “an attack on Yugoslavia in 1951 should be considered a serious possibility”. When asked to quantify the expression “serious possibility”, the authors of the report provided odds that ranged from 20/80 to 80/20 in favor of an attack [1]. Translating to probabilities, the experts assessed that

$$0.2 \le P(\text{attack}) \le 0.8.$$

This is not an isolated example: imprecision in probability values pervades practical life. Often this imprecision appears when modeling scenarios with many random variables that interact in a non-trivial way.

According to the report, the possibility of an attack was largely dependent on a decision to invade the country. The latter event was influenced by broad Soviet objectives, but also by a possible regime change in Yugoslavia, perhaps due to Tito’s assassination, or due to a coup/revolt. A decision to invade should cause a military build-up and an increase in propaganda against supposed Soviet enemies. This intricate set of relations is captured by the directed acyclic graph (DAG) in Fig. 1, where each node is a (binary) random variable.¹ If we could associate a sharp conditional probability value with each value of a variable given each value of those variables that point to it in the graph, we would obtain a Bayesian network [2].

That is, we would have a “probabilistic graphical model” that could be used to calculate the probability of any possible scenario involving the variables in the model (e.g., the probability of an invasion given the observation of military build-up and propaganda) [3,4]. If we instead cannot associate a single joint probability distribution with the graph, we obtain a credal network.

[Figure: directed acyclic graph over the nodes assassination, coup/revolt, regime change, decision to invade, build-up, propaganda, attack and invasion.]

Fig. 1. A simplified graphical representation for the reasoning in the 29-51 report.

∗ Corresponding author.
E-mail address: denis.maua@usp.br (D.D. Mauá).

¹ Note that each one of these events (attack, propaganda, etc.) can be associated with a random variable that yields 1 when the event happens and 0 otherwise; to simplify matters, we often do not distinguish between an event and the corresponding random variable that indicates its happening.

https://doi.org/10.1016/j.ijar.2020.08.009
0888-613X/© 2020 Elsevier Inc. All rights reserved.

In short, credal networks are directed acyclic graphs whose nodes represent random variables associated with “imprecise/indeterminate” probabilistic assessments, and whose arcs (or their absence) represent irrelevance or independence. Credal networks keep the graphical appeal of Bayesian networks [2] while allowing for a more flexible quantification of values. This ability to represent and reason with imprecision in probability values allows credal networks to reach more conservative and robust conclusions than Bayesian networks.

Even though it is difficult to pinpoint when credal networks and their strong extensions first appeared in the literature, it seems fair to say that by 1990 some authors had already touched on their basic features. The literature evolved steadily until 2005, when the second author published a review of graph-based techniques that deal with imprecise probabilities [5]. Since then, research in the field has led to a considerable change in our understanding of credal networks. First, a rich toolbox of algorithms has been put together, containing (fast) approximate techniques and (slow) exact methods. Second, the complexity of various inferences has been studied in depth, so we can identify features that guarantee tractable inference. Third, there is now a significant number of real applications of credal networks in various domains where it is advantageous to encode imprecision in probability values. Finally, the last ten years have witnessed progress also on elicitation of credal networks and on decision making with credal networks.

Several of those more recent results have been discussed in an excellent tutorial on credal networks published in 2014 [6]. Also of note is a previous relevant tutorial focused on the elicitation of credal networks [7]. The present review should be valuable to the novice who has read some of that introductory material, and wants more scholarly detail; this review should also be useful to the researcher who is interested in an up-to-date technical introduction to credal networks. We will examine the main results on the theory of credal networks, including basic (but non-trivial) results about their characterization, algorithmic strategies for producing inferences and computational complexity results. In short, we try to provide some guidance regarding the fair question: “After some thirty years of research on credal networks, what exactly is known about their specification, algorithms and inferential complexity?”.

There are many ways to interpret independence relations in a credal network; in this paper we focus on what is known as strong independence and its related concept of strong extension. As we discuss later, there are other well-known extensions for credal networks in the literature [8], but strong extension seems to be the most popular one.

The paper is organized as follows. Section 2 summarizes the literature up until 2005, presenting an abridged version of the survey we have mentioned [5]. Section 3 introduces basic terminology and notation, while Section 4 presents some basic results on credal networks. In Section 5, we provide a fair sample of the algorithms for marginal inference and decision making, while in Section 6 we present results on computational complexity of those tasks. Section 7 contains an overview of some recent applications of credal networks. We finish in Section 8 with comments on recent developments and an optimistic view about the future of credal networks.

2. Credal networks up to 2004: a summary

Several early expert systems and knowledge representation formalisms were developed amidst intense debate concerning the best way to handle uncertainty [9–11]. Bayesian networks appeared by the middle eighties [2] and have since proved that probabilistic modeling can be used successfully in many circumstances. As soon as Bayesian networks started to attract interest, researchers felt the need to generalize them so as to allow the representation of imprecision in probability values. One motivation for this was the difficulty in eliciting the many probabilities required by a Bayesian network. Another motivation was the analysis of robustness to perturbations in Bayesian networks (a parallel line of research has investigated sensitivity to changes in parameters [12,13]). Probably an even stronger motivation, amongst researchers interested in artificial intelligence, was the desire to study many languages in which to represent uncertainty, from Dempster–Shafer theory to possibility theory, many of which could be reframed using interval probabilities, interval expectations, or credal sets.


The earliest appearance of Bayesian networks with set-valued parameters seems to be by Lamata and Moral [14]. There was then considerable debate on the meaning of independence for special kinds of assessments, such as interval probabilities, Choquet capacities and qualitative probabilities. The early efforts that adopted set-valued assessments took strong extensions as the sole semantics of any given credal network.

By the mid-nineties, many consequences of the Markov condition and inference algorithms had been developed for strong extensions. Here inference refers to a computation of tight probability bounds for a value of some variable given values of some other variables. Perhaps the most representative reference of that period is the work of Cano et al. [15]. At that point there was great emphasis on inference algorithms that operated by passing messages around nodes, thus mimicking the behavior of inference algorithms for Bayesian networks. Algorithms such as 2-Updating (2U), for polytrees with binary variables [16], or Tessem’s propagation, where interval arithmetic is exploited [17], represented the state of the art. References that reflect those developments, and that include other graph-theoretical structures, can be found in previous survey papers [5,18].

It became clear by the mid-nineties that, instead of focusing on special cases such as interval probabilities, one should adopt general credal sets from the outset. Given that the class of closed convex credal sets is closed under elementwise marginalization and conditioning under mild conditions [19], and that most axiomatizations dealing with imprecision in probability values generate convex credal sets [20], most of the literature just assumed closure and convexity throughout.

By the mid-nineties it also became clear that, instead of focusing on a single concept of independence, one must accept that several concepts of independence make sense when credal sets are present; this understanding was championed by the influential work of Walley [21]. The idea that credal networks should be broad enough to encompass several definitions of independence was then proposed by the end of the nineties [22].

Computational experience during the nineties led to the understanding that, instead of trying to adapt Bayesian network message passing algorithms to inference with strong extensions, the most profitable path was to look at inference as an optimization problem. Even though a number of successful message passing techniques have emerged, the main algorithms developed from the late nineties on have been inspired by optimization techniques, as described in Section 5.

3. Background: Bayesian networks and credal sets

Given that credal networks are extensions of Bayesian networks that replace conditional probability distributions with credal sets, in this section we review basic concepts for Bayesian networks and credal sets; this section also helps in fixing notation and terminology used later.

3.1. Bayesian networks

In a DAG, a node $X$ is a parent of a node $Y$ (conversely, $Y$ is a child of $X$) if there is an arc from $X$ to $Y$. We denote the parents of a node $X$ by $\mathrm{Pa}(X)$ and its children by $\mathrm{Ch}(X)$. The in-degree of a node $X$ is the cardinality of $\mathrm{Pa}(X)$. The ancestors of a node are its parents, the parents of its parents, and so on, recursively. Similarly, the descendants of a node are its children, the children of its children, and so on, recursively; the nondescendants are all nodes that are not descendants except the node itself.

Formally, a Bayesian network is a pair containing a DAG whose nodes are random variables in a set $\mathbf{X} = \{X_1, \dots, X_n\}$, and a probability distribution $P(\mathbf{X})$ over $\mathbf{X}$. The DAG and the distribution satisfy the Markov condition: for every node/variable $X$, the nondescendant nonparents of $X$ are independent of $X$ given the parents of $X$. To illustrate the Markov condition, consider again Fig. 1. The structure of the graph implies a number of independence relations. For example, the graph implies that Tito’s assassination and coup/revolt are independent. If such an independence relation is deemed unreasonable, then edges should be added accordingly. Another independence relation extracted from the graph is: given regime change, the decision to invade is independent of Tito’s assassination. This captures the essence of the Markov condition: once we know the parents of a variable, anything about other ancestors is irrelevant to the variable.

In this paper, we consider only random variables with finite support. When this is the case, the Markov condition implies a particular factorization of the joint probability distribution:

$$P(X_1 = x_1, \dots, X_n = x_n) = \prod_{i=1}^{n} P(X_i = x_i \mid \mathrm{Pa}(X_i) = \pi_i), \tag{1}$$

whenever all conditional probabilities on the right are well-defined. In the equation above, $\pi_i$ is the projection of $\{x_1, \dots, x_n\}$ over $\mathrm{Pa}(X_i)$, and if $\mathrm{Pa}(X_i)$ is empty, then we use $P(X_i = x_i)$ in lieu of $P(X_i = x_i \mid \mathrm{Pa}(X_i) = \pi_i)$.

Because “local” conditional probabilities $P(X = x \mid \mathrm{Pa}(X) = \pi)$ suffice to specify any Bayesian network, often a Bayesian network is specified through a DAG and a set of probability assessments $P(X = x \mid \mathrm{Pa}(X) = \pi)$ for each $X$, $x$ and $\pi$. This leads to a drastic reduction in the number of parameters one must specify and represent with respect to the joint distribution.
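As a rough illustration of the factorization in Expression (1) and of the parameter savings it brings, the following sketch evaluates the joint distribution of a small network from its local tables. The three-node chain `A -> B -> C`, the names `parents`, `cpt` and `joint`, and all probability values are assumptions made up for this example; they do not come from the report or from Fig. 1.

```python
from itertools import product

# Hypothetical chain A -> B -> C over binary variables (not the network of Fig. 1).
parents = {"A": (), "B": ("A",), "C": ("B",)}

# cpt[V][pi] = P(V = 1 | Pa(V) = pi); the numbers are illustrative only.
cpt = {
    "A": {(): 0.3},
    "B": {(0,): 0.2, (1,): 0.9},
    "C": {(0,): 0.5, (1,): 0.1},
}

def joint(assignment):
    """P(X_1 = x_1, ..., X_n = x_n) computed as the product in Expression (1)."""
    prob = 1.0
    for var, pa in parents.items():
        pi = tuple(assignment[p] for p in pa)  # projection of the assignment onto Pa(var)
        p_one = cpt[var][pi]
        prob *= p_one if assignment[var] == 1 else 1.0 - p_one
    return prob

order = list(parents)
# Summing the factorized joint over all assignments must give one.
total = sum(joint(dict(zip(order, xs))) for xs in product((0, 1), repeat=len(order)))

n_local = sum(len(table) for table in cpt.values())  # free parameters in the local tables
n_joint = 2 ** len(order) - 1                        # free parameters in a full joint table
```

Here the local tables need 5 numbers whereas an unrestricted joint table over three binary variables needs 7; the gap widens quickly as the number of variables grows while in-degrees stay small.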

The moral graph of a DAG is obtained by connecting nodes with a common child and then dropping arc directions.

One celebrated feature of Bayesian networks is that one can detect many independence relations implied by the Markov condition using d-separation [2]. The latter is a graph-theoretical criterion that we apply as follows to a given Bayesian network. Suppose we have three sets of variables $\mathbf{X}$, $\mathbf{Y}$, $\mathbf{Z}$. Delete all nodes that are not ancestors to at least one of the variables in those sets, and obtain the moral graph of the resulting DAG. Sets $\mathbf{X}$ and $\mathbf{Y}$ are d-separated by $\mathbf{Z}$ if and only if in the resulting undirected graph any path between $\mathbf{X}$ and $\mathbf{Y}$ is blocked by $\mathbf{Z}$ (i.e., it contains a node in $\mathbf{Z}$) [23]. The important consequence of this d-separation is that $\mathbf{X}$ and $\mathbf{Y}$ are stochastically independent given $\mathbf{Z}$. Because d-separation can be quickly computed [24], it is often employed in practice to discard unnecessary variables before any inference is carried out.

[Figure: tree with six nodes, labeled 1: {assassination, coup/revolt, regime change}; 2: {regime change, decision to invade}; 3: {decision to invade, propaganda}; 4: {attack, decision to invade, build-up}; 5: {attack, build-up, invasion}; 6: {attack}.]

Fig. 2. Rooted tree decomposition for the DAG in Fig. 1.
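The d-separation test just described (keep only ancestors, moralize, then look for an unblocked path) can be sketched in a few lines. The function names `ancestors` and `d_separated` are hypothetical, and the arc list `fig1_parents` is only an assumed, partial reading of Fig. 1 based on the relations the text spells out; the actual figure may contain further arcs.

```python
from itertools import combinations

def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors in the DAG."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen

def d_separated(parents, xs, ys, zs):
    """True if the disjoint sets xs and ys are d-separated by zs."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestors(parents, xs | ys | zs)      # step 1: ancestral sub-DAG
    adj = {v: set() for v in keep}               # step 2: moral graph
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:                             # drop arc directions
            adj[p].add(child)
            adj[child].add(p)
        for p, q in combinations(pa, 2):         # connect parents of a common child
            adj[p].add(q)
            adj[q].add(p)
    stack, seen = list(xs), set()                # step 3: search for a path avoiding zs
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        if v in ys:
            return False
        stack.extend(w for w in adj[v] if w not in zs and w not in seen)
    return True

# Assumed arcs (a partial reading of Fig. 1, for illustration only).
fig1_parents = {
    "regime_change": ["assassination", "coup"],
    "decision": ["regime_change"],
    "build_up": ["decision"],
    "propaganda": ["decision"],
    "attack": ["decision"],
}

marginally_independent = d_separated(fig1_parents, {"assassination"}, {"coup"}, set())
dependent_given_child = not d_separated(fig1_parents, {"assassination"}, {"coup"}, {"regime_change"})
screened_off = d_separated(fig1_parents, {"assassination"}, {"decision"}, {"regime_change"})
```

Under these assumed arcs the sketch reproduces the two independence statements quoted earlier for Fig. 1, and it also shows that conditioning on the common child regime change couples assassination and coup/revolt.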

The complexity of drawing inferences with Bayesian networks is tightly coupled with the topology of its DAG, and to graph-theoretical concepts such as treewidth.

A polytree is a DAG that contains no undirected cycles, that is, the underlying undirected graph we obtain by ignoring the arc directions has no cycles. A directed tree is a polytree with each node having at most one parent. Polytrees are also called singly connected directed graphs. A DAG is loopy or multiply connected if it is not a polytree. The DAG in Fig. 1 is loopy; the sub-DAG over the nodes assassination, coup/revolt and regime change is a polytree that is not a tree.

A cycle in an undirected graph contains a chord if there is an edge which connects two nodes of the cycle and is not in that cycle. An undirected graph is chordal if and only if every cycle of length four or more has a chord. Every graph can be turned into a chordal graph by inserting edges, a process called chordalization.

The treewidth of a chordal graph is the size of the largest clique minus one, that is, the maximum size of a complete subgraph minus one (a complete graph is a graph such that each pair of nodes is connected). The treewidth of a non-chordal graph is the minimum treewidth over all chordalizations of it. As its name suggests, the treewidth measures the resemblance of an undirected graph to a tree. The treewidth of a directed graph is the treewidth of its moral graph. In the case of polytrees, the treewidth is given by the maximum in-degree of a node (so the treewidth of directed trees is one). All known algorithms for drawing inference in Bayesian networks have a worst-case running time that is at least exponential in the network treewidth.

A tree-decomposition of an undirected graph $G = (V, E)$ is a pair $(C, T)$, where $C$ is a collection of subsets $C_i \subseteq V$ of nodes of $G$, and $T$ is an undirected tree whose nodes are identified with elements of $C$, and that satisfies the following properties: (1) for every edge $(u, v)$ in $E$ there is a set $C_i \in C$ with $u, v \in C_i$; (2) for every node $v$ in $V$ the subgraph obtained by deleting nodes $C_i$ not containing $v$ is a (connected) tree. The width of a tree decomposition is the maximum cardinality of a set $C_i$ minus one. The minimum width over all tree decompositions of a graph is the graph’s treewidth. A tree-decomposition can be rooted at a certain node $R \in C$ by directing edges away from $R$; this allows us to refer to parents and children of the nodes in $T$ (note that a rooted $T$ is a directed tree, thus every node has at most one parent). Fig. 2 shows a tree-decomposition for the DAG in Fig. 1 rooted at $R = \{\text{attack}\}$.
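Properties (1) and (2) of a tree-decomposition are easy to check mechanically, which may help when reading the definition. The sketch below is a minimal validator under our own conventions: bags are Python sets indexed from 0, `tree_edges` lists pairs of bag indices, and the 4-cycle used as input is a hypothetical example rather than the graph of Fig. 1 (whose tree edges are not recoverable here).

```python
def is_tree_decomposition(nodes, edges, bags, tree_edges):
    """Check properties (1) and (2) for bags 0..k-1 joined by tree_edges."""
    k = len(bags)
    adj = {i: set() for i in range(k)}
    for i, j in tree_edges:
        adj[i].add(j)
        adj[j].add(i)

    def connected(indices):
        """True if `indices` is non-empty and induces a connected subgraph of T."""
        indices = set(indices)
        if not indices:
            return False
        stack, seen = [next(iter(indices))], set()
        while stack:
            i = stack.pop()
            if i not in seen:
                seen.add(i)
                stack.extend(adj[i] & indices)
        return seen == indices

    # T itself must be a tree: connected, with exactly k - 1 edges.
    if len(tree_edges) != k - 1 or not connected(range(k)):
        return False
    # (1) every edge of G lies inside some bag.
    if not all(any(u in b and v in b for b in bags) for u, v in edges):
        return False
    # (2) the bags containing each node form a connected subtree of T.
    return all(connected(i for i, b in enumerate(bags) if v in b) for v in nodes)

def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(b) for b in bags) - 1

# Hypothetical input: the 4-cycle a - b - c - d - a.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

good_bags = [{"a", "b", "c"}, {"a", "c", "d"}]
good = is_tree_decomposition(nodes, edges, good_bags, [(0, 1)])

# One bag per edge arranged on a path: node "a" sits in bags 0 and 3 only,
# which are not adjacent, so property (2) fails.
bad_bags = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "a"}]
bad = is_tree_decomposition(nodes, edges, bad_bags, [(0, 1), (1, 2), (2, 3)])
```

The valid decomposition has width 2, which matches the treewidth of a 4-cycle: adding the chord a - c chordalizes it and produces cliques of size three.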

3.2. Credal sets

It is often difficult to specify sharp probability values for events due to insufficiency of resources (time, cost, data) or lack of consensus. An alternative is to specify a set of probability distributions that encodes the incompleteness and indeterminacy in the beliefs of the model builder, and arguably leads to more conservative and hence robust conclusions.

We start with some needed concepts of imprecisely specified probability models and their usual notation. A set of probability distributions for a variable $X$ is called a credal set and is denoted by $K(X)$ [19]. Similarly, $K(X_1, \dots, X_n)$ denotes a set of joint distributions for variables $X_1, \dots, X_n$. The convex hull of a set $K$ is denoted by $\mathrm{co}\,K$. A finitely generated set is the convex hull of a finite set of points (hence it is closed and convex). A set of conditional probability distributions for a variable $X$ given event $A$ is called a conditional credal set and denoted by $K(X|A)$. The lower probability $\underline{P}_{K(X)}(A)$ is equal to $\inf P(A)$, where the infimum is taken over all possible probability values for event $A$ induced by the distributions in $K(X)$; similarly we have the lower conditional probability $\underline{P}_{K(X|B)}(A|B)$. We also have the upper probability $\overline{P}_{K(X)}(A)$ that is $\sup P(A)$ and similarly upper conditional probabilities. In the operators above, we usually omit the dependence on the credal set when its choice is clear from context and write, for example, $\underline{P}(A)$ and $\overline{P}(A|B)$. By marginalizing every distribution in the credal set $K(X_1, \dots, X_n)$, we obtain a marginal credal set over a subset of the variables. Whenever $\overline{P}(Y=y) > 0$, we obtain a conditional credal set $K(X|Y=y)$ by applying Bayes rule to each distribution in $K(X,Y)$ such that $P(Y=y) > 0$. This prescription for conditioning is often called regular conditioning [25]. One can find prescriptions in the literature that handle conditioning events differently when they may have zero probability [26–28]. One particularly popular alternative strategy is to take $K(X|Y=y)$ to be vacuous² whenever $\underline{P}(Y=y) = 0$, and otherwise to apply Bayes rule elementwise as before; because this strategy is obtained with Walley’s natural extension [21], we refer to it as natural conditioning.³ In our definitions and results we adopt regular conditioning unless explicitly indicated, and we implicitly assume that the upper probability of the conditioning event is positive.
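For a finitely generated credal set, lower and upper probabilities, as well as their regularly conditioned versions, can be read off the extreme points, because a ratio of linear functions attains its extremes at vertices. The sketch below illustrates this under that assumption; the three joint distributions in `vertices` are invented for the example, and the helper names (`prob`, `bounds`, `regular_conditional_bounds`) are ours.

```python
from fractions import Fraction as F

# Hypothetical extreme points of a credal set K(X, Y) over two binary variables.
# Each distribution maps an outcome (x, y) to its probability.
vertices = [
    {(0, 0): F(1, 2), (0, 1): F(1, 4), (1, 0): F(1, 8), (1, 1): F(1, 8)},
    {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 0): F(1, 4), (1, 1): F(1, 4)},
    {(0, 0): F(1, 2), (0, 1): F(0), (1, 0): F(1, 2), (1, 1): F(0)},
]

def prob(p, event):
    """Probability of `event` (a predicate on outcomes) under distribution p."""
    return sum(mass for outcome, mass in p.items() if event(outcome))

def bounds(vertices, event):
    """Lower and upper probability of `event` over the convex hull of `vertices`."""
    values = [prob(p, event) for p in vertices]
    return min(values), max(values)

def regular_conditional_bounds(vertices, event, given):
    """Regular conditioning: apply Bayes rule only where P(given) > 0."""
    values = [
        prob(p, lambda o: event(o) and given(o)) / prob(p, given)
        for p in vertices
        if prob(p, given) > 0
    ]
    if not values:
        raise ValueError("the upper probability of the conditioning event is zero")
    return min(values), max(values)

x1 = lambda o: o[0] == 1  # event X = 1
y1 = lambda o: o[1] == 1  # event Y = 1

marginal = bounds(vertices, x1)
conditional = regular_conditional_bounds(vertices, x1, y1)
```

The third vertex assigns zero probability to Y = 1 and is simply skipped by regular conditioning; natural conditioning would instead return the vacuous interval [0, 1] here, since the lower probability of Y = 1 is zero.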

One basic result is that, if a credal set is convex, both its elementwise marginalization and its elementwise conditioning lead to convex sets [19].

One can specify a credal set by listing constraints on probability values. If a credal set is closed and convex, then it can be described by listing its extreme points (and taking the convex hull). Constraints and extreme points are connected by duality: one can transform one representation into the other using well known algorithms [29]. However, the size of the representation may explode through this transformation: a set of (even linear) constraints may generate an exponential number of extreme points and vice-versa. Some algorithms discussed later assume that the input credal sets are given as sets of constraints, while other algorithms assume the input is a list of extreme points; these are mathematically equivalent, but not computationally interchangeable. Matters are dramatically simpler if we deal with a single binary variable $X$: in this case, a closed convex credal set $K(X)$ is fully captured by a single closed interval $[\underline{P}(X=1), \overline{P}(X=1)]$.

There are several concepts of independence that apply to sets of probability distributions [28,30]. A conceptually simple concept is complete independence: variables $X$ and $Y$ are completely independent with respect to a credal set $K(X,Y)$ if they are stochastically independent with respect to every joint distribution in $K(X,Y)$. This definition extends to conditional independence in the trivial way: in essence, it requires elementwise stochastic independence.

If $X$ and $Y$ are completely independent with respect to some $K(X,Y)$, then in general $K(X,Y)$ cannot be a convex set [28]. Complete independence seems to require one to accommodate non-convex credal sets. As many principled ways to justify credal sets postulate that they ought to be convex [19,31], there has been steady interest in “convexifying” complete independence, a move that yields strong independence.

Variables $X$ and $Y$ are strongly independent with respect to a credal set $K(X,Y)$ if the latter can be written as the convex hull of a credal set for which $X$ and $Y$ are completely independent. When $K(X,Y)$ is closed and convex, strong independence is equivalent to assuming that $X$ and $Y$ are stochastically independent with respect to each extreme point of $K(X,Y)$.

Another popular concept is epistemic irrelevance [21]: $Y$ is epistemically irrelevant to $X$ given $Z$ when the convex hull of $K(X|Y=y,Z=z)$ is equal to the convex hull of $K(X|Z=z)$ for every possible $(y,z)$. Epistemic irrelevance is asymmetric: $X$ may be epistemically irrelevant to $Y$ while the opposite fails [21]. Epistemic independence is a symmetrized version of epistemic irrelevance: $X$ and $Y$ are epistemically independent when $X$ is epistemically irrelevant to $Y$ and $Y$ is epistemically irrelevant to $X$.

Yet another concept of independence is due to Kuznetsov, a generalization of the usual factorization of expectations that is implied by stochastic independence [32]. Little is known about Kuznetsov independence, and the few existing inference algorithms that can handle it rely on optimization to look for bounds on probabilities [33].

4. Credal networks and their strong extensions

A credal network consists of a pair containing a DAG where each node is a variable, and a credal set for those variables, such that, for every node/variable $X$, the nondescendant nonparents of $X$ are “independent” of $X$ given the parents of $X$. Obviously the Markov condition has a different effect depending on the concept of independence that is adopted.

Suppose for instance that “independence” in the Markov condition means “epistemic irrelevance”. As epistemic irrelevance is not symmetric, the direction of edges in the credal network then gets denser content. During the last decade a rich theory and a powerful set of algorithms have been developed around extensions of credal networks that satisfy the Markov condition with epistemic irrelevance; we do not dwell on that topic as a recent in-depth review paper has covered all necessary territory [8]. As for epistemic independence and Kuznetsov independence, little is known in connection with credal networks (even though credal networks under epistemic independence have received some attention [34]).

In the remainder of this paper we adopt the Markov condition with strong independence; that is, each variable is strongly independent of its nondescendants given its parents.

4.1. Specifying credal networks

The most common way to specify sets of probabilities associated with credal networks is to mimic the strategy used for Bayesian networks. That is, one specifies a closed convex credal set $K(X \mid \mathrm{Pa}(X) = \pi)$ for each variable $X$, for each value $\pi$ of parents $\mathrm{Pa}(X)$. As a very simple example, we might have three binary variables $X$, $Y$ and $Z$, the simple DAG

[Figure: DAG over the nodes X, Y and Z.]

and probabilistic constraints

$$P(X=1) \in [1/10, 1/3], \qquad P(Y=1) = 4/5, \tag{2}$$

$$P(Z=1 \mid Y=0) \in [2/5, 3/5], \qquad P(Z=1 \mid Y=1) \in [7/10, 9/10]. \tag{3}$$

Recall that an interval suffices to specify the credal set for a binary variable. In general a more complex language may be needed to specify credal sets for variables with more than two values (Section 4.4).

² A credal set is vacuous if it contains all distributions of that variable.

³ Yet another strategy is to replace usual probability measures by full conditional probabilities and their variants [27], where conditional probability is defined even when the conditioning event has probability zero; this is a move we do not consider in this paper.

When a credal network is specified through a DAG and a set of credal sets $K(X \mid \mathrm{Pa}(X) = \pi)$, one for each variable $X$ and each value $\pi$ of parents of $X$, we say the network is separately specified.
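A separately specified network over binary variables can be stored as one interval per variable and per parent configuration. The sketch below encodes the assessments of Expressions (2) and (3) in that way. The dictionary layout, the names `spec`, `parents` and `extreme_points`, and the reading of the DAG (X and Y without parents, Z with the single parent Y, as the constraints suggest) are assumptions of this example, not a format used by any particular package.

```python
from fractions import Fraction as F

# Assumed structure: X and Y have no parents, Z has the single parent Y.
parents = {"X": (), "Y": (), "Z": ("Y",)}

# spec[V][pi] = (lower, upper) bounds on P(V = 1 | Pa(V) = pi).
spec = {
    "X": {(): (F(1, 10), F(1, 3))},
    "Y": {(): (F(4, 5), F(4, 5))},  # a precise assessment is a degenerate interval
    "Z": {(0,): (F(2, 5), F(3, 5)), (1,): (F(7, 10), F(9, 10))},
}

def extreme_points(interval):
    """Extreme points of the credal set of a binary variable, as pairs (P(V=0), P(V=1))."""
    lo, hi = interval
    return [(1 - p, p) for p in sorted({lo, hi})]

# One list of extreme points per variable and per parent configuration.
local_vertices = {
    var: {pi: extreme_points(iv) for pi, iv in table.items()}
    for var, table in spec.items()
}
```

Each local credal set here has at most two extreme points; for variables with more than two values an interval per value would no longer pin down the set, which is why richer specification languages are needed.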

A different specification strategy is to explicitly describe a set of functions $P(X \mid \mathrm{Pa}(X))$ for each variable $X$ (note: here $P(X \mid \mathrm{Pa}(X))$ is a function both of $X$ and of $\mathrm{Pa}(X)$). We then say that the credal set for $X$ is extensively specified, or just extensive for short. For instance, one might replace assessments in Expression (3) by two functions $P_1(Z|Y)$ and $P_2(Z|Y)$, taking their convex hull as the set of possible functions $P(Z|Y)$.

A few kinds of non-separately specified credal networks have been explored in the literature. One example is the (large) family of qualitative probabilistic networks (QPNs) [35]. A QPN is usually interpreted as a partially specified Bayesian network, where in lieu of the conditional probability values one writes down qualitative constraints amongst these probabilities. The goal is to simplify elicitation efforts. For instance, the qualitative constraint referred to as positive additive synergy on a variable $X$ with parents $Y$ and $Z$ imposes

$$P(X=1 \mid Y=1, Z=1) + P(X=1 \mid Y=0, Z=0) \ge P(X=1 \mid Y=0, Z=1) + P(X=1 \mid Y=1, Z=0),$$

meaning, roughly, that the “concerted” influence of $Y$ and $Z$ on $X$ is greater than the sum of their “discordant” influence. Note that constraints are imposed across values of conditioning variables in a non-separately specified scheme. Many other constraints have been proposed to enlarge the expressivity of QPNs [36,37] and to apply them to practical scenarios [38]. Another example of non-separately specified credal networks can be found in the Bayesian logic of Andersen and Hooker [39], where a directed graph can be associated with probabilities over events defined with a flexible logical language.⁴
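To see how the positive additive synergy constraint couples different parent configurations, it can help to test it on concrete tables. The following sketch is only an illustration: the function name `positive_additive_synergy` and both conditional probability tables are invented for the example.

```python
def positive_additive_synergy(p):
    """p[(y, z)] = P(X = 1 | Y = y, Z = z) for binary parents Y and Z.

    Returns True when the concordant configurations (1, 1) and (0, 0) together
    carry at least as much probability as the discordant ones (0, 1) and (1, 0).
    """
    return p[(1, 1)] + p[(0, 0)] >= p[(0, 1)] + p[(1, 0)]

# Hypothetical tables: the first satisfies the constraint, the second does not.
satisfies = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.9}
violates = {(0, 0): 0.1, (0, 1): 0.6, (1, 0): 0.6, (1, 1): 0.8}

ok = positive_additive_synergy(satisfies)
not_ok = positive_additive_synergy(violates)
```

Because the inequality involves all four parent configurations at once, the set of tables that satisfy it cannot be written as a product of one credal set per configuration, which is exactly what makes the specification non-separate.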

Whatever specification strategy one adopts, one ends up with a set of constraints over probabilities. The largest set of joint probability distributions that satisfies those constraints and the Markov condition, where “independence” means “strong independence”, is the strong extension of the credal network. Of course one might consider sets of joint distributions that are smaller than the strong extension; the strong extension corresponds to a minimum commitment given the assessments.

4.2. The strong extension for closed convex local credal sets

Suppose that one has a separately specified credal network where each variable $X$ is associated with a closed and convex conditional credal set $K(X \mid \mathrm{Pa}(X) = \pi)$, for each configuration $\pi$ of the parents. This is the most common situation discussed in the literature.

Define the complete extension of the network, denoted by $K_C(X_1, \dots, X_n)$, to be the following (possibly non-convex) set:

$$\left\{ P : P(X_1 = x_1, \dots, X_n = x_n) = \prod_{i=1}^{n} P(X_i = x_i \mid \mathrm{Pa}(X_i) = \pi_i), \text{ with } P(X_i \mid \mathrm{Pa}(X_i) = \pi_i) \in K(X_i \mid \mathrm{Pa}(X_i) = \pi_i) \right\}, \tag{4}$$

where each $\pi_i$ is the projection of $\{x_1, \dots, x_n\}$ over $\mathrm{Pa}(X_i)$.

The set $K_C(X_1, \dots, X_n)$ and the input DAG satisfy the Markov condition with respect to complete independence: for every $X$, the nondescendants of $X$ are completely independent of $X$ given $\mathrm{Pa}(X)$.

We have that:

Theorem 1. The convex hull of the complete extension of a credal network is the strong extension of the credal network: it is the largest credal set that satisfies the Markov condition with respect to strong independence. This is also true when we replace each credal set $K(X_i \mid \mathrm{Pa}(X_i) = \pi_i)$ in Expression (4) by the set of its extreme points.

The proofs of all theorems are in the appendix.

⁴ Non-separately specified constraints also surface in connection with conservative updating on probabilistic graphical models, where missing data may lead to credal sets [40–42]; Antonucci et al. [6] present an interesting discussion of this phenomenon.


Hence every extreme point of the strong extension of a separately specified credal network with closed convex local credal sets is built by a combination of “local” extreme points, and thus it behaves like a Bayesian network. Importantly, d-separation in the DAG implies strong independence; in this sense strong extensions mimic the semantics of independence commonly associated with Bayesian networks.⁵ Interestingly:

Theorem 2. Any combination of extreme points from the local credal sets is an extreme point of the strong extension.

Thus the strong extension can actually be built directly from the extreme points of local credal sets, that is, $\mathrm{co}\,K_C(X_1, \dots, X_n)$ is equivalently produced if we replace each local credal set $K(X_i \mid \mathrm{Pa}(X_i) = \pi_i)$ in Expression (4) by the set of its extreme points, denoted by $\mathrm{ext}\,K(X_i \mid \mathrm{Pa}(X_i) = \pi_i)$. For instance, take the three-variable network with assessments in Expressions (2) and (3). The strong extension is the convex hull of eight distributions, generated by multiplying the extreme points of $K(X)$, $K(Z|Y=0)$ and $K(Z|Y=1)$, and the distribution $P(Y=1)$.

4.3. A word on extensive local credal sets

Expression (4) focuses on separately specified credal networks; if instead we have an extensive specification, the strong extension is the convex hull of all distributions $P(\mathbf{X}) = \prod_{X_i \in \mathbf{X}} P(X_i \mid \mathrm{Pa}(X_i))$ built so that each function $P(X_i \mid \mathrm{Pa}(X_i))$ is an extreme point of the extensive specification related to $X_i$.

Note that if we have a separately specified credal network we can always build a set of conditional probability functions $P(X_i \mid \mathrm{Pa}(X_i))$ by combining all extreme points of each $K(X_i \mid \mathrm{Pa}(X_i) = \pi)$ across all $\pi$. Once we do that, the strong extension of these latter extensively specified credal sets will be exactly the strong extension of the original separately specified credal network. Such a transformation is however not harmless from a computational point of view: the extensive specification related to a variable $X_i$ may be exponentially larger than the separately specified credal sets for $X_i$.
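The blow-up caused by turning a separate specification into an extensive one is easy to quantify. In the sketch below, `separate_spec` builds a hypothetical binary variable with `k` binary parents and two extreme points per parent configuration (the values 0.2 and 0.7 are arbitrary), and `extensive_vertices` lists every function from parent configurations to extreme points; both names are ours.

```python
from itertools import product

def extensive_vertices(local_vertices):
    """local_vertices[pi] lists the extreme points of K(X | Pa(X) = pi).

    Returns one dictionary pi -> extreme point per combination, i.e. every
    candidate function P(X | Pa(X)) of the equivalent extensive specification.
    """
    configs = sorted(local_vertices)
    return [
        dict(zip(configs, choice))
        for choice in product(*(local_vertices[pi] for pi in configs))
    ]

def separate_spec(k):
    """Hypothetical binary X with k binary parents, two extreme values of P(X=1 | pi) each."""
    return {pi: [0.2, 0.7] for pi in product((0, 1), repeat=k)}

# Numbers stored by the separate specification versus functions listed extensively.
sizes = {
    k: (2 * 2 ** k, len(extensive_vertices(separate_spec(k))))
    for k in (1, 2, 3)
}
```

With `k` parents the separate specification stores `2 * 2**k` numbers, while the extensive one lists `2**(2**k)` functions: 4, 16 and 256 for `k` equal to 1, 2 and 3.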

4.4. Specification languages for credal networks

There has been relatively little work on specification languages for credal networks. The contrast with research in Bayesian networks is striking. Since the appearance of the first inference engines for Bayesian networks, there has been active work in languages that encode probability distributions. One example is the BUGS package [47], where a declarative language is used to describe the structure and the probability values of any Bayesian network. There are other visual languages such as DAPER [48], textual languages such as BLOG [49], and languages that explore logic programming [50,51] among other proposals. Almost none of this activity can be found in connection with credal networks, despite the fact that there is arguably more structure to be explored: possible ways to list extreme points and exploit regularities amongst them; possible ways to encode qualitative and quantitative constraints, conceivably by recycling constructs from constraint programming. Existing packages for credal networks usually adopt simple strategies for specification where each local credal set is represented as a set of linear inequalities, or where each extreme point is a distribution represented as a table of probability values.

It has recently been noted that credal sets can be specified using probabilistic logic programming [52], but this insight has not been used to specify credal networks. Another relatively recent effort mixes Boolean operators and interval-valued probabilities in specifications such as [53]:

$$\text{regime change} = \text{assassination} \lor (\text{coup/revolt} \land \lnot\,\text{inhibitor}), \qquad P(\text{inhibitor}) \in [0.9, 1],$$

where the first sentence reads “regime change obtains if and only if there is an assassination or a coup/revolt (provided the latter is not inhibited)” and the second sentence reads “probability of inhibitor is no smaller than 0.9”. In this particular example one obtains a Noisy-OR gate with interval-valued inhibitory probabilities.
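The local credal sets induced by this specification can be worked out by hand: conditional on its parents, regime change is certain after an assassination, happens with probability $1 - P(\text{inhibitor})$ after a coup/revolt alone, and is impossible otherwise. The sketch below tabulates the resulting bounds. It assumes that the inhibitor is independent of the two parent events, and the function name `regime_change_bounds` is ours.

```python
from fractions import Fraction as F

def regime_change_bounds(assassination, coup, inhibitor=(F(9, 10), F(1))):
    """Bounds on P(regime change = 1 | assassination, coup/revolt) when
    regime change = assassination OR (coup/revolt AND NOT inhibitor)
    and P(inhibitor = 1) lies in the interval `inhibitor`.
    Assumes the inhibitor is independent of both parent events."""
    lo_inh, hi_inh = inhibitor
    if assassination:
        return (F(1), F(1))              # the disjunction is already true
    if coup:
        return (1 - hi_inh, 1 - lo_inh)  # true exactly when the inhibitor is off
    return (F(0), F(0))                  # neither cause is present

# Interval-valued conditional table of the resulting Noisy-OR gate.
table = {(a, c): regime_change_bounds(a, c) for a in (0, 1) for c in (0, 1)}
```

Only the configuration with a coup/revolt and no assassination yields a proper interval, namely [0, 1/10]; the other three rows of the table are precise.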

There seems to be significant room for refined and extended specification strategies in the future.

5. Algorithms for marginal inference and decision making

The most common computation applied to credal networks is marginal inference, where we must compute bounds for the marginal probability of a value of a query variable conditional on some evidence on some other variables. For instance, in the credal network representing a possible attack on Yugoslavia in 1951, one might be interested in the probability of an attack after observing an increase in propaganda.

Credal networks are also used to support decision making; there the problem is to select actions that are endorsed by some appropriate criteria (maximality, E-admissibility, minimality).

In this section we review the most effective known algorithms for marginal inference and decision making with credal networks.

⁵ Other concepts of independence, such as epistemic independence, do not necessarily interact well with d-separation [43], and alternatives to d-separation have been designed for them [44–46].


5.1. Marginalinference

Given a credalnetwork over variables X= {X1,. . . ,Xn}, an eventofinterest Z=z and some evidence Y=y, we are interestedinobtaining:

P (

Z

=

z

|

Y

=

y

)

and

P(

Z

=

z

|

Y

=

y

) ,

(5) where {Z} and Y are assumed to be disjoint subsets of X such that P(Y=y)>0, and we take the lower and upper probabilitieswithrespecttotheregularconditioningofthestrongextensionofthenetwork.

The following result is fundamental as it shows that marginal inference can be accomplished by operating only on distributionsthatsatisfytheMarkovproperty:

Theorem 3. The lower/upper probabilities in Expression (5) are obtained, respectively, by

$$\inf / \sup \; \frac{\sum_{x'} \prod_{i=1}^{n} P(X_i = x_i \mid \mathrm{Pa}(X_i) = \pi)}{\sum_{z} \sum_{x'} \prod_{i=1}^{n} P(X_i = x_i \mid \mathrm{Pa}(X_i) = \pi)}, \tag{6}$$

where the sum in the numerator and the inner sum in the denominator are over the values of the set of marginalized variables $X' = X \setminus (\{Z\} \cup Y)$, the outer sum in the denominator is over the values of the query variable $Z$, and the optimization is over the distributions from the complete extension such that $P(Y = y) > 0$.

The most popular type of specification for credal networks employs finitely generated local credal sets; for these cases we can further reduce the search space in Expression (6):⁶

Theorem 4. Suppose a credal network is separately specified with closed convex local credal sets. Then the solution to Equation (6) is attained at some extreme point $P$ of the strong extension such that $P(Y = y) > 0$; hence it is attained at some combination of extreme points of the local credal sets $K(X \mid \mathrm{Pa}(X) = \pi)$.

Solving the optimization in Equation (6) presents two challenges. First, the number of combinations of local extreme points is large (exponential), even assuming one has already obtained these extreme points. Second, the sums in the numerator and in the denominator contain a large (exponential) number of terms. There are essentially two approaches to the problem.

One approach is to cast the problem as a combinatorial search over the extreme distributions of the local credal sets. This approach assumes a vertex-based representation of credal sets. Typically these search schemes reproduce, at some level, the message-passing schemes that succeed with Bayesian networks, with the condition that now messages have to be set-valued (perhaps sets of real numbers, perhaps sets of functions). Message passing schemes received attention in particular during the nineties. Among the insights developed in that investigation, we must mention the 2U algorithm, a clever exact algorithm that solves a particular class of inference problems in polynomial time. Many other ideas in combinatorial search have been explored since then. Notable examples of this approach are the integer-programming formulation of de Campos and Cozman [54], the algebraic system of Mauá et al. [55], and the local search methods of Cano et al. [56].

The other approach is to cast the problem as a continuous optimization, and to adapt known techniques from the numerical optimization literature for the resulting (difficult) optimization problem. Notable methods in this category are the multilinear programming reformulation of de Campos and Cozman [57], and the linear programming approximation of Antonucci et al. [58]. An important feature of this approach is the ability to naturally handle both the constraint-based and vertex-based representations of credal sets.

We next present some of the most general and effective approaches to marginal inference, emphasizing the ones that have demonstrated leading performance in the literature. Note that we assume separately specified networks with finitely generated local credal sets (that is, each set is the convex hull of finitely many distributions), unless explicitly indicated otherwise.

5.1.1. Message-passing: basic ideas and the 2U algorithm

One simple (but intractable) idea is to generate all possible extreme points of the strong extension by combining the local extreme points, and for each one to run standard message passing algorithms for Bayesian networks. A slightly more efficient approach is to exploit the redundancies shared among these Bayesian network inferences, and to propagate messages that, instead of containing a single function, contain sets of functions that correspond to (some of) the extreme points.
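The exhaustive idea can be sketched on a toy example. The code below is only an illustration under stated assumptions: a two-node network A → B with binary variables, hypothetical vertex lists for the local credal sets, and a direct application of Bayes rule in place of a message passing routine. The names `K_A`, `K_B_GIVEN_A` and `enumerate_bounds` are introduced here for illustration.

```python
from itertools import product

# Vertex-based specification (hypothetical numbers). Each list holds the extreme
# values of P(variable = 1 | parent configuration) for a binary variable.
K_A = [0.2, 0.4]                  # extreme points of K(A)
K_B_GIVEN_A = {0: [0.1, 0.3],     # extreme points of K(B | A=0)
               1: [0.6, 0.9]}     # extreme points of K(B | A=1)


def posterior_a1_given_b1(p_a1, p_b1_a0, p_b1_a1):
    """Bayes rule in the single Bayesian network selected by one vertex combination."""
    joint_a1 = p_a1 * p_b1_a1
    joint_a0 = (1.0 - p_a1) * p_b1_a0
    return joint_a1 / (joint_a1 + joint_a0)


def enumerate_bounds():
    """Lower/upper P(A=1 | B=1) by visiting every combination of local extreme points."""
    values = [posterior_a1_given_b1(pa, pb0, pb1)
              for pa, pb0, pb1 in product(K_A, K_B_GIVEN_A[0], K_B_GIVEN_A[1])]
    return min(values), max(values)


lower, upper = enumerate_bounds()
print(lower, upper)
```

With two vertices per local credal set, the loop visits 2^3 = 8 Bayesian networks; the count grows exponentially with the number of local credal sets, which is why this scheme does not scale.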

We have already mentioned some important landmarks within this approach in Section 2. For instance, the whole exhaustive message passing scheme is captured by the algorithm of Cano et al. [15], while approximate inference is obtained by Tessem's algorithm, which propagates only bounds for each set [17].

There is a very distinguished case where message propagation can be performed both efficiently and accurately: this happens when the DAG is a polytree and all variables are binary. The well-known 2U algorithm exploits this pocket of tractability [16]; its main ideas can be explained with relative ease.⁷ A brief description of Pearl's propagation algorithm for inference with polytrees is needed first [2]. Say we have a Bayesian network whose underlying graph is a polytree and some evidence $Y = y$. To simplify the notation, assume that the evidence sets only the values of leaves of the graph with a single parent. This can always be arranged: if $X$ is a variable set by evidence that does not satisfy those conditions, then insert a fresh “dummy” node $X'$ with only $X$ as parent and such that $P(X' = 1 \mid X = x) = 1$ and $P(X' = 1 \mid X \neq x) = 0$, where $x$ is the original value set for $X$ by the evidence. Then set evidence $X' = 1$ (and remove $X$ from the evidence set). One can check that any marginal inference has its value unchanged by this transformation.

6 The proof of Theorem 4, presented in the appendix, also shows that regular conditioning on a finitely generated credal set produces a finitely generated conditional credal set; if one starts with the latter fact, then Theorem 4 is a direct corollary.

In Pearl's algorithm one focuses on one variable $X$ at a time, and passes messages to its neighbors as follows. Assume that $X$ has parents $U_1, \dots, U_m$ and children $Y_1, \dots, Y_n$, and no evidence is set for $X$.

From $X$ to each child $Y_j$, send the message:

$$f_{X \to Y_j}(x) = \prod_{k \neq j} f_{Y_k \to X}(x) \sum_{\pi} P(X = x \mid \mathrm{Pa}(X) = \pi) \prod_{i=1}^{m} f_{U_i \to X}(u_i),$$

where $\pi$ ranges over the values of $\mathrm{Pa}(X)$ and $u_i$ is the projection of $\pi$ on $U_i \in \mathrm{Pa}(X)$.

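From $X$ to each parent $U_i$, send the message:

$$f_{X \to U_i}(u_i) = \sum_{x} \prod_{j=1}^{n} f_{Y_j \to X}(x) \sum_{\pi \sim u_i} P(X = x \mid \mathrm{Pa}(X) = \pi) \prod_{k \neq i} f_{U_k \to X}(u_k),$$

where $\pi \sim u_i$ are the configurations whose projection on $U_i$ is $u_i$, and $u_k$ is the projection of $\pi$ on $U_k$.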

In the expressions above, the products are identical to one if the corresponding range is empty (e.g. if $X$ is a root then there are no messages $f_{U_i \to X}$). So for example, if $X$ is a leaf with a parent $U$ and no evidence is set for $X$, then $f_{X \to U} = 1$.

Consider now the case of a node $X$ which has value $x$ fixed by the evidence. Recall that we assumed that evidence was set only for leaf nodes with a single parent. Then $X$ sends message $f_{X \to U}(u) = P(X = x \mid U = u)$ to its single parent $U$.

Since the graph is a polytree, it is always possible to find a node that can send a message to a neighbor based on the messages it has received. For example, if $X$ is a leaf with a single parent $U$ then its message $f_{X \to U}$ does not depend on any other message. Another example is the case of a root node $X$ with a single child $Y$, which can send message $f_{X \to Y}(x) = P(X = x)$.

After all messages are sent, for each node $X$ in the network we have:

$$P(X = x \mid Y = y) \propto \prod_{j=1}^{n} f_{Y_j \to X}(x) \sum_{\pi} P(X = x \mid \mathrm{Pa}(X) = \pi) \prod_{i=1}^{m} f_{U_i \to X}(u_i).$$
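As a sanity check of the message definitions, the sketch below runs them on the smallest interesting polytree: a binary chain A → B → C with evidence on the leaf C. The conditional probability tables are hypothetical, and the function names are introduced here for illustration; the result is compared against brute-force summation of the joint distribution.

```python
# Hypothetical CPTs for a binary chain A -> B -> C.
P_A = [0.7, 0.3]                          # P_A[a] = P(A=a)
P_B_GIVEN_A = [[0.9, 0.1], [0.2, 0.8]]    # P_B_GIVEN_A[a][b] = P(B=b | A=a)
P_C_GIVEN_B = [[0.6, 0.4], [0.1, 0.9]]    # P_C_GIVEN_B[b][c] = P(C=c | B=b)
STATES = (0, 1)


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def pearl_chain(c_obs):
    """Posterior marginals of A and B given C=c_obs via the messages defined above."""
    # Evidence leaf C sends f_{C->B}(b) = P(C=c_obs | B=b) to its single parent.
    f_c_to_b = [P_C_GIVEN_B[b][c_obs] for b in STATES]
    # Root A with a single child sends f_{A->B}(a) = P(A=a).
    f_a_to_b = list(P_A)
    # B sends to its parent A: sum over b of (child messages) * P(b | a).
    f_b_to_a = [sum(f_c_to_b[b] * P_B_GIVEN_A[a][b] for b in STATES) for a in STATES]
    # Marginals: product of child messages times sum over parent configurations.
    post_b = normalize([f_c_to_b[b] * sum(P_B_GIVEN_A[a][b] * f_a_to_b[a] for a in STATES)
                        for b in STATES])
    post_a = normalize([f_b_to_a[a] * P_A[a] for a in STATES])
    return post_a, post_b


def brute_force(c_obs):
    """Same posteriors by summing the joint distribution directly."""
    joint = {(a, b): P_A[a] * P_B_GIVEN_A[a][b] * P_C_GIVEN_B[b][c_obs]
             for a in STATES for b in STATES}
    z = sum(joint.values())
    post_a = [sum(joint[(a, b)] for b in STATES) / z for a in STATES]
    post_b = [sum(joint[(a, b)] for a in STATES) / z for b in STATES]
    return post_a, post_b


print(pearl_chain(1))
print(brute_force(1))
```

The two printed pairs should agree up to floating-point rounding; the normalization step plays the role of the proportionality sign in the final expression.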

The message-passing algorithm above can be extended to credal networks by considering the whole set of messages $f_{A \to B}$ that go through any edge $A \to B$ as we vary the distributions in the complete extension. That fact by itself does not render the problem easy. A key insight behind the 2U algorithm is that these sets of messages can be computed efficiently with a very similar message passing scheme for the case of binary variables. The 2U algorithm operates as follows.

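From $X$ to each child $Y_j$, send the message:

$$\underline{f}_{X \to Y_j}(x) = \left( 1 + \frac{1 - \mu(x)}{\mu(x)} \cdot \frac{1}{\prod_{k \neq j} \underline{f}_{Y_k \to X}(x)} \right)^{-1},$$

where

$$\mu(x) = \min \sum_{\pi} \underline{P}(X = x \mid \mathrm{Pa}(X) = \pi) \prod_{i=1}^{m} f_{U_i \to X}(u_i),$$

with the minimization being performed over the extreme points of the intervals $[\underline{f}_{U_i \to X}(u_i), \overline{f}_{U_i \to X}(u_i)]$. Upper bounds are obtained by replacing minimization/lower bounds with maximizations/upper bounds in the equations above.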

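From $X$ to each parent $U_i$, send the message:

$$\underline{f}_{X \to U_i}(u_i) = \min \frac{\lambda(x)\,\rho(x \mid u_i) + \lambda(\bar{x})\,(1 - \rho(x \mid u_i))}{\lambda(x)\,\rho(x \mid \bar{u}_i) + \lambda(\bar{x})\,(1 - \rho(x \mid \bar{u}_i))},$$

where

$$\rho(x \mid u_i) = \sum_{\pi \sim u_i} P(X = x \mid \mathrm{Pa}(X) = \pi) \prod_{k \neq i} f_{U_k \to X}(u_k)$$

and the minimizations are performed over

$$\lambda(x) \in \left\{ \prod_{j=1}^{n} \underline{f}_{Y_j \to X}(x), \; \prod_{j=1}^{n} \overline{f}_{Y_j \to X}(x) \right\}, \quad P(X = x \mid \mathrm{Pa}(X) = \pi) \in \operatorname{ext} K(X \mid \mathrm{Pa}(X) = \pi), \quad f_{U_k \to X}(u_k) \in \left\{ \underline{f}_{U_k \to X}(u_k), \; 1 - \underline{f}_{U_k \to X}(\bar{u}_k) \right\}.$$

7 The fact that the 2U algorithm can be derived as a message passing algorithm has been mentioned before [59], but the relevant details have not appeared in the literature.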

As in Pearl's algorithm, the products in the above expressions are replaced with value 1 when their ranges are empty.

If $X$ is a leaf node with evidence $X = x$, then it sends message

$$\underline{f}_{X \to U}(u) = \min_{P \in \operatorname{ext} K(X \mid U = u)} \frac{P(X = x \mid U = u)}{P(X = x \mid U = \bar{u})}$$

to its single parent $U$. This is equivalent to setting $\lambda(x) = 1$ and $\lambda(\bar{x}) = 0$ in the equation for a node $X$ with no evidence.

At the end one can compute exact bounds on $P(X = x \mid Y = y) = \left(1 + (1/\mu(x) - 1)\,\lambda(\bar{x})/\lambda(x)\right)^{-1}$ for each variable $X$ in the network by optimizing over variables $\mu(x)$ and $\lambda(x)$ as in the equations above.
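The following sketch applies these interval computations to a deliberately small case: a binary chain U → X → Y with evidence Y = 1 and query X = 1. It is not a general 2U implementation. The interval bounds are hypothetical, `ratio` stands for the quotient λ(x)/λ(x̄) supplied by the evidence leaf, and the result is checked against exhaustive enumeration of the interval endpoints.

```python
from itertools import product

# Hypothetical interval-valued specification for a binary chain U -> X -> Y.
P_U1 = (0.3, 0.5)                                # bounds on P(U=1)
P_X1_GIVEN_U = {0: (0.1, 0.2), 1: (0.7, 0.8)}    # bounds on P(X=1 | U=u)
P_Y1_GIVEN_X = {0: (0.2, 0.4), 1: (0.6, 0.9)}    # bounds on P(Y=1 | X=x)


def posterior(mu, ratio):
    """(1 + (1/mu - 1) / ratio)^(-1), with ratio = lambda(x) / lambda(x_bar)."""
    return 1.0 / (1.0 + (1.0 / mu - 1.0) / ratio)


def two_u_chain():
    """Bounds on P(X=1 | Y=1) from interval-valued messages (chain-only sketch)."""
    # The root U sends the interval [lower P(U=1), upper P(U=1)] to X; mu is
    # optimized over the endpoints of that interval, using lower (resp. upper)
    # local probabilities of X for the lower (resp. upper) bound.
    mu_lo = min((1 - f) * P_X1_GIVEN_U[0][0] + f * P_X1_GIVEN_U[1][0] for f in P_U1)
    mu_hi = max((1 - f) * P_X1_GIVEN_U[0][1] + f * P_X1_GIVEN_U[1][1] for f in P_U1)
    # The evidence leaf Y sends bounds on the ratio P(Y=1 | X=1) / P(Y=1 | X=0).
    ratio_lo = P_Y1_GIVEN_X[1][0] / P_Y1_GIVEN_X[0][1]
    ratio_hi = P_Y1_GIVEN_X[1][1] / P_Y1_GIVEN_X[0][0]
    # The posterior is increasing in both mu and ratio.
    return posterior(mu_lo, ratio_lo), posterior(mu_hi, ratio_hi)


def brute_force():
    """Same bounds by enumerating every combination of interval endpoints."""
    values = []
    for pu, px0, px1, py0, py1 in product(P_U1, P_X1_GIVEN_U[0], P_X1_GIVEN_U[1],
                                          P_Y1_GIVEN_X[0], P_Y1_GIVEN_X[1]):
        p_x1 = (1 - pu) * px0 + pu * px1
        values.append(p_x1 * py1 / (p_x1 * py1 + (1 - p_x1) * py0))
    return min(values), max(values)


print(two_u_chain())
print(brute_force())
```

The enumeration visits 2^5 = 32 Bayesian networks, whereas the interval computation needs only a handful of arithmetic operations per node; this gap is what makes the binary polytree case tractable.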

The expressions above assume that probabilities are always positive. Regular conditioning with zero probabilities is obtained by extending the domain of the messages to include $\infty$, and by extending sums and products accordingly (e.g. using $1/0 = \infty$, $1/\infty = 0$, $1 + \infty = \infty$). This can be done without hurting computational performance [16].
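A minimal sketch of such extended arithmetic is given below. The helper names `ext_inv` and `posterior` are introduced here for illustration, and `ratio` again stands for λ(x)/λ(x̄). Only the three conventions quoted above are implemented; indeterminate combinations of the form 0 · ∞ are left unhandled.

```python
import math


def ext_inv(v):
    """Reciprocal on [0, inf], using the conventions 1/0 = inf and 1/inf = 0."""
    if v == 0:
        return math.inf
    if math.isinf(v):
        return 0.0
    return 1.0 / v


def posterior(mu, ratio):
    """(1 + (1/mu - 1) / ratio)^(-1) evaluated with extended arithmetic.

    Combinations that produce 0 * inf are not handled in this sketch
    and come out as NaN.
    """
    return ext_inv(1.0 + (ext_inv(mu) - 1.0) * ext_inv(ratio))


print(posterior(0.28, 1.5))        # ordinary case, all quantities positive
print(posterior(0.0, 1.5))         # x has zero probability before the evidence
print(posterior(0.5, math.inf))    # evidence impossible under x_bar
print(posterior(0.5, 0.0))         # evidence impossible under x
```

Python floats already satisfy `1 + inf == inf`, so only the reciprocal needs special treatment here.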

The 2U algorithm has inspired a few other approximate methods. The Loopy 2U (L2U) algorithm focuses on credal networks with binary variables where the underlying graph may fail to be a polytree: in this case one can always keep iterating the interval-valued messages, hoping to reach convergence [60]. Such a scheme corresponds to loopy belief propagation for Bayesian networks, where Pearl's propagation is run until messages converge or until some prescribed time limit is reached, an idea that may seem far-fetched but that often obtains reasonable accuracy [61]. This approach has been further extended in the Generalized L2U (GL2U) for credal networks that may contain non-binary variables; the idea is to turn non-binary variables into connected clusters of binary variables that are then processed by L2U [62]. A different possibility is to “cut” edges so as to produce approximate credal networks whose underlying graphs are polytrees that can be processed by 2U; the IPE algorithm explores this path [60]. These approximate schemes resemble variational methods in several ways, and indeed some preliminary efforts have been made to apply variational methods to credal networks [60].

5.2. Message-passing: algebraic variable elimination

As we discuss later in Section 6, the 2U algorithm is a truly isolated island of tractability for inference with strong extensions. To handle other situations one may resort, for instance, to stochastic search techniques. Cano and Moral [63] employed genetic programming to find approximate solutions. Cano et al. [64] implemented a local search with simulated annealing in order to avoid poor local optima. These approximate methods were later used in conjunction with outer approximation algorithms to develop a branch-and-bound algorithm [56].
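To illustrate what a search over combinations of local extreme points looks like, here is a plain coordinate-wise hill climbing with random restarts. It is a simpler stand-in chosen for brevity, not the genetic programming or simulated annealing procedures cited above, and the chain U → X → Y with its vertex lists is hypothetical.

```python
import random
from itertools import product

# Hypothetical vertex lists for a binary chain U -> X -> Y (illustrative numbers).
VERTEX_LISTS = [
    [0.3, 0.5],   # P(U=1)
    [0.1, 0.2],   # P(X=1 | U=0)
    [0.7, 0.8],   # P(X=1 | U=1)
    [0.2, 0.4],   # P(Y=1 | X=0)
    [0.6, 0.9],   # P(Y=1 | X=1)
]


def posterior(choice):
    """P(X=1 | Y=1) in the Bayesian network selected by one vertex index per set."""
    pu, px0, px1, py0, py1 = (VERTEX_LISTS[i][k] for i, k in enumerate(choice))
    p_x1 = (1.0 - pu) * px0 + pu * px1
    return p_x1 * py1 / (p_x1 * py1 + (1.0 - p_x1) * py0)


def hill_climb(rng, sign):
    """Change one local vertex at a time while the objective strictly improves.

    sign=+1 searches for a maximum, sign=-1 for a minimum.
    """
    current = [rng.randrange(len(v)) for v in VERTEX_LISTS]
    improved = True
    while improved:
        improved = False
        for i, vertices in enumerate(VERTEX_LISTS):
            for k in range(len(vertices)):
                candidate = current[:i] + [k] + current[i + 1:]
                if sign * posterior(candidate) > sign * posterior(current):
                    current, improved = candidate, True
    return posterior(current)


rng = random.Random(0)  # fixed seed so the run is reproducible
inner_lower = min(hill_climb(rng, -1) for _ in range(5))
inner_upper = max(hill_climb(rng, +1) for _ in range(5))

# Exhaustive enumeration, feasible only because the example is tiny.
exact = [posterior(c) for c in product(*(range(len(v)) for v in VERTEX_LISTS))]
print(inner_lower, inner_upper, min(exact), max(exact))
```

Every value visited by the search is attained by some distribution in the strong extension, so the search always returns inner approximations: the reported lower value is never below the exact lower probability and the reported upper value never above the exact upper probability. This is why such methods pair naturally with outer approximations in a branch-and-bound scheme.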

A more sophisticated approach, based on the manipulation of sets of pairs of functions, was developed by Mauá et al. [55]. The approach is able to produce both exact and approximate solutions, and has been shown to outperform the integer programming formulation (described later) on a large collection of synthetic problems. We focus on this algorithm in the remainder of this subsection; before describing it, we must review the operations behind variable elimination in the context of Bayesian networks.

In short, variable elimination extends Pearl's message passing algorithm to networks of arbitrary topology.⁸ So let $(C, T)$ be a tree-decomposition for the DAG of a Bayesian network and $R \in C$ be such that $Z \in R$. Assume with no loss of generality that $R = \{Z\}$ (otherwise we can add a new node to $T$ satisfying such property and connect it to $R$ and to no other node). Root $T$ at $R$, orienting the arcs away from $R$. The variable elimination procedure in Algorithm 1 computes $P(Z = z \mid Y = y)$ for a given rooted tree decomposition $(C, T)$ and set of conditional probability mass functions 0 = $\{p(X \mid \mathrm{Pa}(X)) : X \in X\}$.⁹ The functions $\delta_y(Y)$ in the algorithm denote the Kronecker delta function that returns one at $Y = y$ and zero elsewhere. The algorithm starts by creating and inserting these functions into 0. This somewhat uncommon step is only a convenience, as it allows us to dispense with the need of either fixing the values of evidences in the initial distributions in 0, or to treat evidence variables separately. The second “for loop” of the algorithm is the elimination step: at each iteration, the

8 The variable elimination we present differs marginally from other algorithms such as bucket elimination [65], junction tree propagation [66], the Shenoy-Shafer algorithm [67], and the collect algorithm [68].

9 Note that $p(X \mid \mathrm{Pa}(X))$ represents a function over the values of $X$ and $\mathrm{Pa}(X)$.
