Contents lists available at ScienceDirect

International Journal of Approximate Reasoning

www.elsevier.com/locate/ijar

The effect of combination functions on the complexity of relational Bayesian networks ✩

Denis Deratani Mauá a,∗, Fabio Gagliardi Cozman b

a Instituto de Matemática e Estatística, Universidade de São Paulo, Brazil
b Escola Politécnica, Universidade de São Paulo, Brazil

Article history: Received 23 December 2016; Received in revised form 29 March 2017; Accepted 30 March 2017; Available online 4 April 2017.

Keywords: Relational Bayesian networks; Complexity theory; Probabilistic inference.

Abstract. We study the complexity of inference with Relational Bayesian Networks as parameterized by their probability formulas. We show that without combination functions, inference is pp-complete, displaying the same complexity as standard Bayesian networks (this is so even when the domain is succinctly specified in binary notation). Using only maximization as combination function, we obtain inferential complexity that ranges from pp-complete to pspace-complete to pexp-complete. And by combining mean and threshold combination functions, we obtain complexity classes in all levels of the counting hierarchy. We also investigate the use of arbitrary combination functions and obtain that inference is exp-complete even under a seemingly strong restriction. Finally, we examine the query complexity of Relational Bayesian Networks (i.e., when the relational model is fixed), and we obtain that inference is complete for pp.

© 2017 Elsevier Inc. All rights reserved.

1. Introduction

Bayesian networks provide an intuitive language for the probabilistic description of concrete domains [22]. Jaeger's Relational Bayesian Networks, here referred to as rbns, extend Bayesian networks to abstract domains, and allow for the description of relational, context-specific, deterministic and temporal knowledge [20,21]. There are many languages that also extend Bayesian networks into relational representations [11,12,16,23,24,30]; rbns offer a particularly general and solid formalism.

rbns constitute a specification language containing a small number of constructs: relations, probability formulas, combination functions, and equality constraints. Combination functions are a particularly important modeling feature, as they provide a way of aggregating information from different elements of the domain.

It should not be surprising that the inferential complexity of Bayesian networks specified by rbns depends on the choice of constructs allowed. However, few results have been produced on the relation between the expressivity of such constructs and the complexity of inference.

In this paper, we examine the effect of combination functions on the complexity of inferences with (Bayesian networks specified by) rbns. We first argue that, without combination functions, rbns simply offer a language that is similar to plate models, a well-known formalism to describe models with simple repetitive structure [17,27]. We show that without

✩ This paper is part of the Virtual special issue on the Eighth International Conference on Probabilistic Graphical Models, edited by Giorgio Corani, Alessandro Antonucci, Cassio De Campos.

∗ Corresponding author.
E-mail address: denis.maua@usp.br (D.D. Mauá).

http://dx.doi.org/10.1016/j.ijar.2017.03.014
0888-613X/© 2017 Elsevier Inc. All rights reserved.


Table 1. Summary of complexity results.

Combination functions | Domain spec. | Bounded arity? | Bounded nesting? | Complexity of inference
none                  | unary        | yes            | --               | pp-complete
none                  | binary       | yes            | --               | pp-complete
max                   | binary       | yes            | yes              | pexp-complete
max                   | unary        | no             | yes              | pexp-complete
max                   | unary        | yes            | no               | pspace-complete
max                   | unary        | yes            | yes              | pp^{Σ_k^p}-complete
threshold, mean       | unary        | yes            | yes              | C_k^p-hard (a)
polynomial            | unary        | yes            | yes              | exp-complete

(a) Membership when threshold and mean are used depends on further constraints explained in Section 4.3.

combination functions, inference is pp-complete, irrespective of the encoding of the domain; this matches the complexity of inference in standard (propositional) Bayesian networks. When we allow combination functions and the associated equality constraints into the language, matters complicate considerably. When either the domain is specified in binary or the arity of relations is unbounded, inference is pexp-complete even when the only combination function is maximization. If we place a bound on arity, and specify the domain in unary notation (or equivalently, as an explicit list of elements), and allow only maximization as combination function, then inference is pspace-complete. This is mostly generated by the ability to nest an unbounded number of maximization expressions. In fact, by further restricting the number of nestings of combination functions, we obtain the same power as a probabilistic Turing machine with access to an oracle in the polynomial hierarchy.

We then look at a combination of mean and a threshold: the former allows probabilities to be defined as proportions; the latter allows the specification of piecewise functions. We argue that threshold and mean combined are as powerful as maximization, and thus all previous results hold. And by a suitable constraint on the use of threshold and mean, we show that we can obtain complexity in every class of the counting hierarchy. We also look at the complexity of inference when the combination function is given as part of the input. The challenge here is to constrain the language so as to obtain non-trivial complexity results. We show that requiring polynomial-time combination functions is too weak a condition in that it leads to exp-complete inference. On the other hand, requiring polynomially long probability formulas brings inference down to pp-completeness. These results are summarized in Table 1.

We also investigate the complexity of inference when the rbn is assumed fixed. This is equivalent to the idea of compiling a probabilistic model [6,9]. We show that complexity is either polynomial, if probability formulas can be computed in polynomial time (which includes the case of no combination expressions), or pp-complete, when the combination functions can be computed in polynomial time (which includes the cases of maximization, threshold and mean).

The paper begins with a brief review of rbns (Section 2) and key concepts from complexity theory (Section 3). Our contributions regarding inferential complexity appear in Section 4. The complexity of inference without combination functions appears in Section 4.1. Relational Bayesian networks allowing only combination by maximization are analyzed in Section 4.2, while networks allowing mean and threshold are analyzed in Section 4.3. General polynomial-time computable combination formulas are examined in Section 4.4. Query complexity is discussed in Section 5. We justify our use of decision problems (instead of functional problems) and discuss how our results can be adapted to provide completeness for classes in the functional counting hierarchy in Section 6. A summary of our contributions and open questions is presented in Section 7.

2. Relational Bayesian networks

2.1. Bayesian networks

A Bayesian network is a compact description of a probabilistic model over a propositional language [7,22]. It consists of two parts: an acyclic directed graph G = (V, A) over a finite set of random variables X1, ..., Xn, and a set of conditional probability distributions, one for each variable and each configuration of its parents. The parents of a variable X in V are denoted by pa(X). In this paper, we consider only 0/1-valued random variables, hence each conditional distribution P(X | pa(X)) can be represented as a table.

The semantics of a Bayesian network is obtained by the directed Markov property, which states that every variable is conditionally independent of its non-descendants given its parents. For categorical random variables, this assumption induces a single joint probability distribution by

P(X1 = x1, ..., Xn = xn) = ∏_{i=1}^{n} P(Xi = xi | pa(Xi) = π_i),

where π_i is the vector of values for pa(Xi) induced by the assignment {X1 = x1, ..., Xn = xn}. Bayesian networks can represent complex propositional domains, but lack the ability to represent relational knowledge.
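The factorization above is easy to mirror in code. The sketch below uses a hypothetical two-variable network A → B (not taken from the paper), with one conditional probability table per 0/1-valued variable:

```python
# Joint probability of a full assignment under the directed factorization
# P(X1=x1,...,Xn=xn) = prod_i P(Xi=xi | pa(Xi) = pi_i).
# Hypothetical two-node network A -> B over 0/1-valued variables.
parents = {"A": [], "B": ["A"]}
# cpt[X] maps a tuple of parent values to P(X = 1 | parents).
cpt = {"A": {(): 0.3}, "B": {(0,): 0.2, (1,): 0.9}}

def joint(assignment):
    p = 1.0
    for x, pa in parents.items():
        pi = tuple(assignment[u] for u in pa)
        p1 = cpt[x][pi]                       # P(x = 1 | parents = pi)
        p *= p1 if assignment[x] == 1 else 1.0 - p1
    return p

print(joint({"A": 1, "B": 0}))   # P(A=1, B=0) = 0.3 * (1 - 0.9)
```

Summing `joint` over all four assignments returns 1, as the factorization guarantees.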


2.2. Definition

rbns can specify Bayesian networks over repetitive scenarios where entities and their relationships are to be represented. To do so, rbns borrow several notions from function-free first-order logic with equality [14], as we now quickly review.

Consider a fixed vocabulary consisting of disjoint sets of names of relations and logical variables (note that we do not allow functions nor constants). We denote logical variables by capital letters (e.g., X, Y, Z), names of relations by lowercase Latin letters (e.g., r, s, t), and formulas by Greek letters (e.g., α, β). Each relation name r is associated with a nonnegative integer |r| describing its arity. An atom has the form r(X1, ..., X|r|), where r is a relation name, and each Xi is a logical variable.

An atomic equality is any expression X = Y, where X and Y are logical variables. An equality constraint is a Boolean formula containing only disjunctions, conjunctions and negations of atomic equalities. For example,

((X = Y) ∧ (X ≠ Z)) ∨ (Y ≠ Z),

where (X ≠ Y) is syntactic sugar for ¬(X = Y) (we will adopt this notation throughout the text).

The local conditional probability models in an rbn are specified using probability formulas; each probability formula is one of the following:

(PF1) A rational number q ∈ [0, 1]; or
(PF2) An atom r(X1, ..., X|r|); or
(PF3) A convex combination F1 · F2 + (1 − F1) · F3, where F1, F2, F3 are probability formulas; or
(PF4) A combination expression, that is, an expression of the form

comb{F1, ..., Fk | Y1, ..., Ym; α},

where comb is a word from a fixed vocabulary of names of combination functions (disjoint from the sets of relation symbols and logical variables), F1, ..., Fk are probability formulas, Y1, ..., Ym is a (possibly empty) list of logical variables, and α is an equality constraint containing only those logical variables and the logical variables appearing in the subformulas. We later define combination functions.

The logical variables Y1, ..., Ym in a combination expression are said to be bound by that expression; there might be no logical variables bound by a particular combination expression, in which case we write comb{F1, ..., Fk | ∅; α}.

We define free logical variables in a probability formula F as follows. A rational number has no (free) logical variables. All logical variables in an atom r(X1, ..., X|r|) are free. A variable is free in F1 · F2 + (1 − F1) · F3 if it is free in one of F1, F2 and F3. Finally, a logical variable is free in comb{F1, ..., Fk | Y1, ..., Ym; α} if it is free in any of F1, ..., Fk or α, and it is not bound by comb (i.e., it is not one of Y1, ..., Ym).

An example of a probability formula is

mean{0.6 · r(X) + 0.7 · max{1 − s(X, Y) | X; X = X} | Y, Z; Y ≠ X ∧ Z ≠ X}.

In this formula X is free (even though X is bound by the combination expression headed by max), and Y and Z are bound.

We often write F(X1, ..., Xn) to indicate that X1, ..., Xn are the free logical variables in F.
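The inductive definition of free logical variables translates into a short recursion. The tuple encoding below is an illustrative choice (not the paper's notation), and the last lines rebuild the free-variable structure of the example formula above (coefficients are immaterial here): X should come out free, Y and Z bound.

```python
# Free logical variables of a probability formula, following the inductive
# definition in the text.  Encoding (chosen for illustration):
#   ("const", q)                          rational number
#   ("atom", r, (X1, ..., Xk))            atom r(X1,...,Xk)
#   ("convex", F1, F2, F3)                F1*F2 + (1-F1)*F3
#   ("comb", name, [subformulas], [bound vars], [vars appearing in alpha])
def free_vars(f):
    kind = f[0]
    if kind == "const":
        return set()
    if kind == "atom":
        return set(f[2])
    if kind == "convex":
        return free_vars(f[1]) | free_vars(f[2]) | free_vars(f[3])
    if kind == "comb":
        _, _, subs, bound, alpha_vars = f
        inner = set(alpha_vars)
        for sub in subs:
            inner |= free_vars(sub)
        return inner - set(bound)
    raise ValueError(kind)

# Structure of the running example: mean{... r(X) ... max{s(X,Y)|X; X=X} | Y,Z; alpha}
inner = ("comb", "max", [("atom", "s", ("X", "Y"))], ["X"], ["X"])
outer = ("comb", "mean",
         [("convex", ("const", 0.6), ("atom", "r", ("X",)), inner)],
         ["Y", "Z"], ["X", "Y", "Z"])
print(free_vars(outer))  # {'X'}
```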

Similarly to Bayesian networks, an rbn consists of two parts: an acyclic directed graph where each node is a relation symbol, and a collection of probability formulas, one for each node. The probability formula Fr associated with node r contains exactly |r| free logical variables and mentions only the relation symbols s ∈ pa(r).

2.3. Semantics

Roughly speaking, one can interpret an rbn with graph G = (V, A) as a rule that takes a set D of elements, called a domain, and produces an auxiliary Bayesian network, as follows. First, generate all groundings r(a) for a ∈ D^|r| and r ∈ V. These groundings are the nodes of the Bayesian network. For each grounding r(a), consider the probability formula at a; that is, Fr(a). This formula depends on many groundings s(a, b) of parents s of r; those are the parents of r(a) in the Bayesian network. And for each configuration π of these (grounded) parents, take

P(r(a) = 1 | pa(r(a)) = π) = Fr(a),

where Fr(a) is evaluated at π. We formalize this procedure later, right after we present an example:

Example 1. Consider a simple model of student performance, inspired by the "University World" often employed in descriptions of Probabilistic Relational Models [16,22]. Suppose a student is taking courses; for each pair student/course, we have approval or not. We have relations difficult and committed, respectively indicating whether a course is difficult and whether a student is committed, and relation approved, indicating whether a student gets an approval grade in a course. The probabilistic relationships between these relations are captured by the graph in Fig. 1, where we used two additional relations student and course to "type" elements of the domain. The idea is that these relations are fully specified by the evidence during inference (that is, every element is typed), so their associated probabilities are not important; we can take for example:

F_student(Y) = 1/2 and F_course(X) = 1/2.


Fig. 1. Relational Bayesian network modeling student performance.

Fig. 2. Bayesian network obtained by grounding the rbn in Example 1 and removing nodes with evidence.

We need probability formulas for difficult and committed:

F_difficult(X) = 0.3 · course(X),    F_committed(Y) = 0.6 · student(Y).

Also, we have a probability formula for approved:

F_approved(X, Y) = 0.8 · [0.6 · (1 − difficult(X)) + 0.4 · committed(Y)] · course(X) · student(Y).

Now suppose we have domain D = {s1, s2, c1, c2}. Note that we have 32 groundings (four unary relations and one binary relation), so we obtain a Bayesian network with 32 nodes/random variables. In this network, we have

P(difficult(a) = 1 | course(a) = s) = 0 if s = 0 (a is not a course), and 0.3 otherwise;

and likewise for all other groundings of relations. Now suppose that the relation student is given; say we have s1 and s2 as students, and c1 and c2 as courses (hence not students). We denote this evidence E. Given E, we can focus on a smaller Bayesian network obtained by conditioning on this "typing" evidence; such a network is presented in Fig. 2. □

The probability formulas in the previous example did not use combination expressions. Combination expressions are slightly more difficult to interpret; their semantics are given by combination functions. A combination function is any function that maps a multiset of finitely many numbers in [0, 1] into a single rational number in [0, 1]. Some examples of combination functions are:

• Maximization: max{q1, ..., qk} = qi, where i is the smallest integer such that qi ≥ qj for j = 1, ..., k, and max{} = 0;
• Minimization: min{q1, ..., qk} = 1 − max{1 − q1, ..., 1 − qk};
• Arithmetic mean: mean{q1, ..., qk} = (∑_{i=1}^{k} qi)/k, and mean{} = 0;
• Noisy-or: noisy-or{q1, ..., qk} = 1 − ∏_{i=1}^{k} (1 − qi), and noisy-or{} = 0;
• Threshold: threshold{q1, ..., qk} = 1 if max{q1, ..., qk} ≥ 1/2, and threshold{q1, ..., qk} = 0 otherwise.
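For concreteness, the five combination functions can be written as ordinary functions taking a list that stands for the multiset (an illustrative sketch, not part of the paper):

```python
# Combination functions: multisets of numbers in [0, 1] -> a number in [0, 1].
from functools import reduce

def comb_max(qs):                 # max{} = 0
    return max(qs, default=0.0)

def comb_min(qs):                 # minimization, defined by duality with max
    return 1.0 - comb_max([1.0 - q for q in qs])

def comb_mean(qs):                # mean{} = 0
    return sum(qs) / len(qs) if qs else 0.0

def noisy_or(qs):                 # noisy-or{} = 0
    return 1.0 - reduce(lambda acc, q: acc * (1.0 - q), qs, 1.0)

def threshold(qs):                # 1 iff some element reaches 1/2
    return 1.0 if comb_max(qs) >= 0.5 else 0.0

print(comb_min([0.2, 0.7]), noisy_or([0.5, 0.5]), threshold([0.3, 0.6]))
```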

The combination function threshold, which does not appear in previous rbns, is useful to encode case-based functions such as piecewise linear functions. It may seem that the comparison with a fixed value 1/2 is too restrictive; however, with a small effort we can compare any probability formula with any constant using this combination function. To see this, suppose we want to compare a probability formula F with a number γ ∈ [0, 1]; that is, we want to determine whether F ≥ γ. If γ > 1/2, then use threshold{(1/(2γ)) · F}. And if γ < 1/2, then use threshold{γ′ · F + (1 − γ′)}, where γ′ = 1/(2(1 − γ)). Thus we can generate all sorts of piecewise and case-based functions with threshold. We exploit this versatility to encode decision problems related to counting in some of our proofs.

We can now explain the semantics of rbns, which is given by interpretations of a domain D. An interpretation is a function μ mapping each relation symbol r ∈ R into a relation r^μ ⊆ D^|r|, and each equality constraint α into its standard meaning α^μ. An interpretation μ induces a mapping from a probability formula F with n free logical variables into a function F^μ(a) from D^n to [0, 1], as follows. If F = q, then F^μ is the constant function q. If F = r(X1, ..., Xn), then F^μ(a) = 1 if a ∈ r^μ, and F^μ(a) = 0 otherwise. If F = F1 · F2 + (1 − F1) · F3, then F^μ(a) = F1^μ(a) · F2^μ(a) + (1 − F1^μ(a)) · F3^μ(a). Finally, if F = comb{F1, ..., Fk | Y1, ..., Ym; α}, then F^μ(a) = comb(Q), where this latter comb is the corresponding combination function and Q is the multiset containing a number Fi^μ(a, b) for every (a, b) ∈ α^μ (even if Fi does not depend on every coordinate).
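The four evaluation cases translate into a short recursive evaluator. Everything below — the tuple encoding, the predicate representation of the constraint α, the toy interpretation μ — is a hypothetical sketch, not the paper's notation:

```python
# Recursive evaluation of a probability formula under an interpretation mu,
# following the four cases above.
from itertools import product

D = ["a", "b"]                       # domain
mu = {"s": {("a", "b")}}             # interpretation: s holds only on (a, b)
COMBS = {"max": lambda qs: max(qs, default=0.0)}

def ev(f, env):                      # env maps logical variables to elements
    kind = f[0]
    if kind == "const":
        return f[1]
    if kind == "atom":               # 1 iff the grounded tuple is in r^mu
        _, r, xs = f
        return 1.0 if tuple(env[x] for x in xs) in mu.get(r, set()) else 0.0
    if kind == "convex":             # F1*F2 + (1 - F1)*F3
        f1, f2, f3 = (ev(g, env) for g in f[1:])
        return f1 * f2 + (1 - f1) * f3
    _, name, subs, bound, alpha = f  # combination expression
    qs = []
    for bs in product(D, repeat=len(bound)):
        e = dict(env, **dict(zip(bound, bs)))
        if alpha(e):                 # alpha selects the tuples in the multiset
            qs += [ev(g, e) for g in subs]
    return COMBS[name](qs)

# F(X, Y) = max{0.4, s(X, Y) | empty binding; X != Y}
F = ("comb", "max", [("const", 0.4), ("atom", "s", ("X", "Y"))],
     [], lambda e: e["X"] != e["Y"])
print(ev(F, {"X": "a", "Y": "a"}), ev(F, {"X": "a", "Y": "b"}))
```

With X = Y the constraint excludes everything, so the multiset is empty and max{} = 0, matching the example discussed next.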

For example, consider a domain D = {a, b}. The probability formula F = max{0.1, 0.2, 0.3 | Y; Y = Y} is interpreted as max{0.1, 0.2, 0.3, 0.1, 0.2, 0.3}, and the probability formula F(X, Y) = max{q1, q2 | ∅; X ≠ Y} is interpreted as F^μ(a, a) = F^μ(b, b) = max{} = 0, and F^μ(a, b) = F^μ(b, a) = max{q1, q2}.


Finally: Given a finite domain D, an rbn with graph G = (V, A) is associated with a probability distribution over interpretations μ defined by

P_D(μ) = ∏_{r ∈ V} P_D(r^μ | {s^μ : s ∈ pa(r)}),

where

P_D(r^μ | {s^μ : s ∈ pa(r)}) = ∏_{a ∈ r^μ} F_r^μ(a) · ∏_{a ∉ r^μ} (1 − F_r^μ(a)).

We can now rephrase, in precise terms, the more informal explanation given before Example 1. In essence, an rbn is a template model that for each domain D generates a Bayesian network where each node r is a multidimensional random variable taking values in the set of 2^{N^|r|} possible interpretations, where N = |D|. Alternatively, we can associate with every relation symbol r ∈ R and tuple a ∈ D^|r| a random variable r(a) that takes value 1 when a ∈ r^μ and 0 when a ∉ r^μ. An rbn and a domain D produce an auxiliary Bayesian network over these random variables, where the parents of a node r(a) are all nodes s(a, b) such that s is a parent of r in the rbn and (a, b) ∈ D^|s|, and the conditional distribution associated with r(a) is given by interpretations of the probability formula Fr at a. This Bayesian network induces a joint distribution

P({r(a) = ρ_{r(a)}}) = ∏_{r ∈ V} ∏_{a ∈ D^|r|} P(r(a) = ρ_{r(a)} | pa(r(a)) = π) = P_D(μ),

where a ∈ r^μ if ρ_{r(a)} = 1 and a ∉ r^μ if ρ_{r(a)} = 0, and π is a configuration consistent with those values. Following terminology from first-order logic, we often say that this Bayesian network is the "grounding" of the rbn on domain D. Often, it is possible to generate a smaller grounding, where the parents of r(a) are only those relevant to evaluate Fr(a) (which might not depend on all s(a, b), s ∈ pa(r), (a, b) ∈ D^|s|). Given this equivalence between an rbn endowed with a domain and an auxiliary Bayesian network, we refer to probabilities such as P(r(a) = 1) to denote P(a ∈ r^μ) = ∑_{μ : a ∈ r^μ} P_D(μ), and similarly for more complex events (note that the dependency of the distribution on the domain is left implicit in the first case).
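The distribution P_D(μ) can be brute-forced on a toy rbn by enumerating all interpretations. The two-relation network below — root c with F_c(X) = 1/2 and child d with F_d(X) = 0.3 · c(X) — is a made-up example in the spirit of Example 1, not taken from the paper:

```python
# Brute-force the distribution over interpretations: P_D(mu) is the product
# over relations r and tuples a of F_r^mu(a) or 1 - F_r^mu(a).
from itertools import chain, combinations

D = [1, 2]

def subsets(xs):                       # all candidate extensions of a relation
    return chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))

def F_c(mu, a):                        # F_c(X) = 1/2
    return 0.5

def F_d(mu, a):                        # F_d(X) = 0.3 * c(X)
    return 0.3 * (1.0 if a in mu["c"] else 0.0)

def P(mu):                             # P_D(mu), grounding both unary relations
    p = 1.0
    for name, F in (("c", F_c), ("d", F_d)):
        for a in D:
            q = F(mu, a)
            p *= q if a in mu[name] else 1.0 - q
    return p

mus = [{"c": set(c), "d": set(d)} for c in subsets(D) for d in subsets(D)]
assert abs(sum(P(m) for m in mus) - 1.0) < 1e-9        # it is a distribution
print(sum(P(m) for m in mus if 1 in m["d"]))           # P(d(1)=1); 0.5 * 0.3
```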

The next example illustrates the modeling power of different combination functions.

Example 2. Consider again the university domain in Example 1, and suppose that in addition to students and courses we also have teachers in the domain. At the end of the academic year students vote for the best teacher of the year. An award is then given to the most voted teacher. We use a binary relation taught to indicate whether a course was taught by a teacher in that year. This relation will usually be given as evidence, so its probability is not important; for simplicity we specify

F_taught(X, Z) = 1/2.

We use a unary relation awarded to indicate whether someone is the recipient of the award. The probability that a teacher is awarded is linearly related to the proportion p of students that failed in at least one course he/she taught, with two regimes:

P(awarded(Z) = 1 | p) = 0.425 − 0.4p, if p ≥ 0.7; and 0.25 − 0.15p, if p < 0.7.

We encode this as

F_awarded(Z) = T(Z) · [0.025 + 0.4 · (1 − F1(Z))] + (1 − T(Z)) · [0.1 + 0.15 · (1 − F1(Z))],

where T(Z) = threshold{(5/7) · F1(Z)} denotes whether the proportion of failed students is above 0.7, F1(Z) = mean{F2(Y, Z) | Y; Y = Y} is the proportion of students that failed some course taught by Z, and F2(Y, Z) is a formula indicating whether Y failed a course taught by Z:

F2(Y, Z) = max{(1 − approved(X, Y)) · taught(X, Z) | X; X = X}.
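Under these formulas, F_awarded should reproduce the two-regime specification with p standing for the value of F1(Z). A quick numerical check (illustrative only; the regimes join continuously at p = 0.7, so the boundary is safe):

```python
# Check that F_awarded reproduces the two-regime piecewise specification.
def T(p):                        # threshold{(5/7) * p}: fires iff p >= 0.7
    return 1.0 if (5.0 / 7.0) * p >= 0.5 else 0.0

def F_awarded(p):                # p stands for the value of F1(Z)
    return (T(p) * (0.025 + 0.4 * (1 - p))
            + (1 - T(p)) * (0.1 + 0.15 * (1 - p)))

def spec(p):                     # the piecewise definition in the text
    return 0.425 - 0.4 * p if p >= 0.7 else 0.25 - 0.15 * p

for k in range(101):
    p = k / 100.0
    assert abs(F_awarded(p) - spec(p)) < 1e-9
print("encodings agree on p = 0.00, 0.01, ..., 1.00")
```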

Since there can be exactly one winner of the award, we introduce a relation singleWinner of arity zero (i.e., a proposition), and specify:

F_singleWinner = max{awarded(Z) | Z; Z = Z} · threshold{γ · mean{1 − awarded(Z) | Z; Z = Z} | ∅},

where γ = N/(2(N − 1)) and N = |D| is the size of the domain (the number of courses, students and teachers). The graph of the corresponding rbn is shown in Fig. 3. □

This example uses the max combination function to encode an existential quantifier; we will later resort to this trick in our proofs.


Fig. 3. Relational Bayesian network modeling the university domain in Example 2.

3. The complexity of counting

We assume familiarity with standard complexity classes such as p, np, pspace and exp, and with the concept of oracle Turing machines [1,29]. As usual, we denote the class of languages decided by a Turing machine in complexity class c with oracle in complexity class o as c^o. We also assume that problems are reasonably encoded using a binary alphabet (say, 0 and 1). A decision problem consists in computing a 0/1-valued function on binary strings. We call an input of a decision problem a yes instance if its output is 1; otherwise we call it a no instance.

The polynomial hierarchy is the collection of complexity classes Σ_k^p, Δ_k^p and Π_k^p, for every nonnegative integer k. These classes are defined inductively by Σ_0^p = p, Σ_k^p = np^{Σ_{k−1}^p}, Δ_k^p = p^{Σ_{k−1}^p}, and Π_k^p = coΣ_k^p. The complexity class ph contains all decision problems in the polynomial hierarchy: ph = ∪_k Σ_k^p (over all integers k). We thus have the following inclusions:

p ⊆ np ∪ conp ⊆ p^np ⊆ Σ_2^p ∪ Π_2^p ⊆ Δ_3^p ⊆ ··· ⊆ Σ_k^p ∪ Π_k^p ⊆ Δ_{k+1}^p ⊆ ··· ⊆ ph.

A well-known result by Meyer and Stockmeyer [28] states that if Σ_k^p = Π_k^p for some k, then ph = Σ_k^p. In particular, if p = np then ph = p. This suggests that the levels Σ_k^p of the hierarchy are distinct, and that using an oracle in a higher level brings in more computational power.

A probabilistic Turing machine is a standard non-deterministic Turing machine that accepts if at least half of the computation paths accept, and rejects otherwise. The complexity class pp contains the decision problems which are accepted by a probabilistic Turing machine in polynomial time. The class pp is known to be closed under complement and union [4]. The class pexp is defined analogously, with polynomial-time replaced with exponential-time.
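As an illustration of the class pp (this aside is not part of the paper), the canonical pp-complete problem is MAJSAT: do at least half of all assignments satisfy a given Boolean formula? Each nondeterministic branch of the probabilistic machine guesses one assignment, so acceptance by majority is exactly the MAJSAT question. A brute-force sketch:

```python
# MAJSAT: is a Boolean formula satisfied by at least half of all assignments?
from itertools import product

def majsat(n_vars, formula):
    accepting = sum(1 for bits in product((0, 1), repeat=n_vars)
                    if formula(bits))
    return 2 * accepting >= 2 ** n_vars      # majority-or-tie of the branches

# (x1 or x2): 3 of 4 assignments satisfy it -> yes instance
print(majsat(2, lambda b: b[0] or b[1]))             # True
# (x1 and x2 and x3): 1 of 8 -> no instance
print(majsat(3, lambda b: b[0] and b[1] and b[2]))   # False
```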

The counting hierarchy augments the polynomial hierarchy with oracle machines based on pp. It can be defined as the smallest collection of classes containing p and such that if c is in the collection then np^c, conp^c, and pp^c are also in the collection [34,36]. We define the class ch as the union over all nonnegative integers k of the classes C_k^p, where C_k^p = pp^{C_{k−1}^p} and C_0^p = Σ_0^p = p [36]. Toda's celebrated result shows that any problem in ph can be solved in polynomial time by a deterministic Turing machine with an oracle pp [32]. Hence, the class ch contains the entire polynomial hierarchy. In fact, we have the following hierarchy:

ph ⊆ p^pp ⊆ pp^pp = C_2^p ⊆ ··· ⊆ ch ⊆ pspace ⊆ exp ⊆ pexp.

A many-one reduction from a problem A to a problem B is a polynomial-time function f that maps inputs of A into inputs of B, and such that x is a yes instance of A if and only if f(x) is a yes instance of B. For any complexity class c in the counting hierarchy, we say that a problem A is c-hard if for any problem B in c there is a many-one reduction from B to A. If A is also in c, then A is said to be c-complete.

4. Inferential complexity of relational Bayesian networks

Our goal in this paper is to examine the complexity of the decision variant of the following (single-query) inference problem:

Input: An rbn with graph (V, A), a domain D, a list of random variables r(a), r1(a1), ..., rn(an), and a rational γ ∈ [0, 1].
Decision: Is P(r(a) = 1 | r1(a1) = 1 ∧ ··· ∧ rn(an) = 1) ≥ γ?
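Operationally, once the joint distribution of the grounded network is in hand, the decision reduces to summing the probability mass consistent with the evidence, conditioning, and comparing with γ. The sketch below does this for a toy explicit joint over three 0/1 variables (hypothetical data, exact rationals):

```python
# Decision variant of inference, with the joint distribution of the grounded
# network given explicitly as a table (a toy uniform joint, for illustration).
from fractions import Fraction
from itertools import product

joint = {bits: Fraction(1, 8) for bits in product((0, 1), repeat=3)}

def decide(joint, query_idx, evidence, gamma):
    """Is P(query = 1 | evidence) >= gamma?  evidence maps index -> 0/1."""
    matches = [b for b in joint if all(b[i] == v for i, v in evidence.items())]
    pe = sum(joint[b] for b in matches)
    assert pe > 0              # evidence assumed to have positive probability
    pq = sum(joint[b] for b in matches if b[query_idx] == 1)
    return pq / pe >= gamma

print(decide(joint, 0, {1: 1, 2: 1}, Fraction(1, 2)))   # True: P = 1/2
```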

The conditioning event r1(a1) = 1 ∧ ··· ∧ rn(an) = 1 is called evidence, and is also denoted as {r1(a1) = 1, ..., rn(an) = 1} or {r1(a1) = 1}, ..., {rn(an) = 1}. We have assumed implicitly in the formulation of the problem above that P(r1(a1) = 1, ..., rn(an) = 1) > 0 (i.e., that evidence has positive probability). A different strategy would be to define instances where this condition is not observed as no instances (or as yes instances). In this case, to prove membership in some complexity class we would have to first show how to decide P(r1(a1) = 1, ..., rn(an) = 1) > 0 in that class. For simplicity, we assume that this is always true; it is possible however to show that the complexity of inference is not changed by this simplifying assumption.


If the domain is a singleton, then the rbn is actually a Bayesian network, since probability formulas can be efficiently converted into tables. Moreover, any Bayesian network can be polynomially encoded as an rbn with a singleton domain. Hence, inference in rbns is pp-hard, as this is the case for Bayesian networks [8, Theorems 11.3 and 11.5].

For domains with many elements, complexity might depend on how the domain D is encoded. One possibility is to assume that the domain is some set {1, 2, ..., N} specified by the integer N (but note that there is no ordering on the elements implied). The integer N can be encoded in different ways. One is this:

Assumption 3. The domain is {1, 2, ..., N}, where N is encoded in binary notation.

Under the previous assumption, the encoding of the domain contributes with O(log N) bits to the input size. Depending on the constructs allowed, this alternative can produce probabilities which require exponential effort to be written down, as they might need O(N) bits. To see this, consider an rbn with a proposition r such that Fr = noisy-or{1/2 | X; X = X}. Then

P(r = 0) = 1 − [1 − ∏_{i=1}^{N} 1/2] = 2^{−N}.
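The computation above can be reproduced with exact rationals; the point is that writing P(r = 0) in binary takes Θ(N) bits while the domain specification takes only O(log N):

```python
# With F_r = noisy-or{1/2 | X; X = X} over a domain of size N, the multiset
# holds N copies of 1/2, so P(r = 0) = 1 - F_r = prod_{i=1}^N (1/2) = 2^{-N}.
from fractions import Fraction

def p_r_zero(N):
    prod = Fraction(1)
    for _ in range(N):
        prod *= 1 - Fraction(1, 2)   # each factor is 1 - q_i with q_i = 1/2
    f_r = 1 - prod                   # noisy-or{1/2, ..., 1/2}
    return 1 - f_r

print(p_r_zero(64))   # the denominator alone needs 64 bits to write down
```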

Another possibility is to still assume that the domain is {1, ..., N}, but to encode N using unary notation. In this case, the encoding of the domain contributes with O(N) symbols to the input. Thus, this specification is equivalent (up to some constant factor) to encoding the domain by an explicit list with every element it contains. So we might adopt:

Assumption 4. The domain is specified as an explicit list of elements.

Another parameter that affects inferential complexity is the arity of relations. If we place no bound on the arity of relations, then an exponential effort might be required to write down probabilities. For example, consider an rbn with graph r → s, where r is a k-ary relation with probability formula Fr(X1, ..., Xk) = 1/2, and s is a proposition with probability formula Fs = noisy-or{r(X1, ..., Xk) | X1, ..., Xk; X1 = X1}; then the inference P(s = 0) requires an exponential number of bits to be written down. We will thus often adopt:

Assumption 5. The arity of relations is bounded.

Without further restrictions, the complexity of inference is dominated by the complexity of evaluating combination expressions. For instance, if the combination function in a combination expression is an arbitrary Turing machine (with no bounds on time and space) then inference is undecidable. Thus, we need to constrain the use of combination functions in probability formulas to make the study of inferential complexity more interesting.

4.1. rbns without combination functions

We start our analysis with the simplest case, where no combination function is available. That is, we adopt:

Assumption 6. Every probability formula is either a rational number, or an atom, or a convex combination of probability formulas.

Under the assumption above, every logical variable in a probability formula is free; also, in the acyclic directed graph of relations of an rbn, no node can have an arity larger than any of its children.

If the domain is given as an explicit list of elements (or by an integer in unary notation), then we can "ground" an rbn into an auxiliary Bayesian network, and run inference there (consider Example 1). This process generates a polynomially-large Bayesian network if the relations have bounded arity (if necessary, we can insert auxiliary nodes so as to bound the number of parents of any given node). Because inference in Bayesian networks is pp-complete, we have that:

Theorem 7. Inference is pp-complete when the domain is specified as a list, the relations have bounded arity and there is no combination function.

Interestingly, the complexity of inference remains the same even if the domain is specified by its size in binary notation (note that in this case the auxiliary Bayesian network has a size that is exponential in the input):

Theorem 8. Inference is pp-complete when the domain is specified by its size in binary, the relations have bounded arity and there is no combination function.

Proof. Hardness follows from the fact that an rbn can efficiently encode any propositional Bayesian network using only convex combinations of numbers and atoms (as in Example 1). To prove membership, consider the auxiliary Bayesian network that is generated by "grounding" a given rbn. This network may be exponentially large on the input, when the domain is given in binary notation. We show that we only need to generate a polynomially-large fragment of the auxiliary Bayesian network to decide whether P(r(a) = 1 | r1(a1) = 1 ∧ ··· ∧ rn(an) = 1) ≥ γ. Note that to compute that conditional probability, we need only the ancestors of the queried groundings. Suppose that Fr is a convex combination; then it may recursively refer to several atoms that are parents of r. These parents may in turn have several parents, and so on. Since there are no combination functions, we can always rename logical variables such that the probability formulas of every ancestor of r contain only logical variables in the probability formula Fr. Thus, if r has arity k, then there can be at most 2^k ancestors of r(a), as this is the number of subsets of the logical variables in Fr. Since we assumed k is bounded by a constant, this number of ancestors is also bounded by a constant. Thus, we can build a Bayesian network consisting of these nodes and the edges between them, together with the associated probability values. Inference must then be run in this sub-network, and this can be done by a polynomial-time probabilistic Turing machine [8, Theorems 11.3 and 11.5]. □

According to Theorem 8, the complexity of inference in combination-function-free rbns matches the complexity of inference in Bayesian networks (if relation arity is bounded). That is, even though an rbn may represent exponentially large Bayesian networks, only a polynomial number of nodes of this network is relevant for inference (and these nodes can be collected in polynomial time).

4.2. rbns with max combination functions

We now consider rbns where information is only aggregated by maximization. That is, we assume that:

Assumption 9. The only combination function used is max.

As we have seen, without combination functions, inference in rbns has the same computational power as inference in Bayesian networks, even when the domain is given by an integer in binary notation. This is not the case when we allow combination functions, even one as simple as maximization:

Theorem 10. Inference is pexp-complete when the domain is specified by its size in binary, the relations have bounded arity and the only combination function is max.

Proof. Membership follows since an rbn satisfying the above assumptions can be "grounded" into an exponentially large Bayesian network. To see this, note that if we have a domain of size 2^m, and a bound k on arity, then each relation produces at most (2^m)^k = 2^{km} (grounded) random variables. Inference can be decided in that network using a probabilistic Turing machine with an exponential bound on time, using the same scheme normally employed to decide an inference for a Bayesian network [8].

Toshowhardness,wesimulateanexponential-timeprobabilisticTuringmachineusinganrbnsuchthatwecan“count”

thenumberofacceptingpathsbyasingleinference.WedosobyresortingtoastandardTuringmachineencodingdescribed byGrädel[18,Theorem3.2.4].Thereareotherpossibleencodingswemightuse,butthisoneisquitecompactandrelatively easytograsp.

So, take a Turing machine that can decide a pexp-complete language. That is, take a pexp-complete language $L$ and a nondeterministic Turing machine such that an input $\ell$ is in $L$ if and only if the machine halts within time $2^p$ with half or more of the computation paths accepting, where $n$ is the length of $\ell$ and $p$ denotes a polynomial in $n$.¹ Denote by $\sigma$ a symbol in the alphabet of the machine (which includes blank and boundary symbols in addition to 0s and 1s), and by $q$ a state. As usual, we describe a configuration of the machine by a string $\sigma_1 \sigma_2 \cdots \sigma_{i-1} (q\sigma_i) \sigma_{i+1} \cdots \sigma_{2^p}$, where each $\sigma_j$ is a symbol in the tape, and $(q\sigma_i)$ indicates that the machine is at state $q$ and its head is at cell $i$. The initial configuration is $(q_0\sigma_{01})\sigma_{02}\cdots\sigma_{0n}$ followed by $2^p - n$ blanks, where $q_0$ is the initial state. Let $q_a$ and $q_r$ be, respectively, states that indicate acceptance or rejection of the input string $\sigma_{01}\cdots\sigma_{0n}$. The transition function $\delta$ of the machine takes a pair $(q,\sigma)$ consisting of a state and a symbol in the tape, and returns triples $(q',\sigma',m)$: the next state $q'$, the symbol $\sigma'$ to be written in the tape (we assume that a blank is never written by the machine), and an integer $m$ in $\{-1,0,1\}$. Here $-1$ means that the head is to move left, $0$ means that the head is to stay in the current cell, and $1$ means that the head is to move right. We make the important assumption that, if $q_a$ or $q_r$ appear in some configuration, then the configuration is not modified anymore (the transition function goes from this configuration to itself from that point on). This is necessary to guarantee that the number of accepting computations is equal to the number of ways in which the machine can run from the input.

Grädel's encoding uses quantified Boolean formulas to represent the Turing machine. As shown by Jaeger [20, Lemma 2.4], probability formulas can encode conjunction and negation, respectively, by multiplication and inversion (and then disjunction and implication are easily obtained). Logical equalities such as $X = Y$ are obtained by the formula

$$F(X,Y) = \max\{1 \mid \emptyset;\ X = Y\}, \qquad (9)$$

and quantified formulas such as $\exists Y\, \psi(X,Y)$ are obtained as $F(X) = \max\{F_\psi(X,Y) \mid Y;\ Y = Y\}$, where $F_\psi(X,Y)$ encodes $\psi(X,Y)$. Thus we can easily reproduce Grädel's logical encoding of the machine by writing logical formulas as probability formulas. For example, consider a relation $\mathit{less}$ (meaning "less than"), used to order the elements of the domain, and consequently represent the ordering on the computation steps and on the positions of the tape. Since the elements of the domain are ordered, we can interpret them as integers. Define $F_{\mathit{less}}(X,Y) = 1/2$. We must impose on $\mathit{less}$ the axioms of linear orderings. To constrain $\mathit{less}$ to be irreflexive, we introduce a proposition $\mathit{irref}$ with probability formula

$$F_{\mathit{irref}} = 1 - \max\{\mathit{less}(X,X) \mid X;\ X = X\}.$$

¹ In the usual definition, a probabilistic Turing machine accepts when the majority of computation paths accept; however, given such a machine we can always build a machine that decides the same language, and such that the machine accepts if and only if at least half of the computation paths accept.

We have that $\mathit{irref} = 1$ if and only if $\mathit{less}$ is irreflexive. Note that the above probability formula simply encodes the quantified Boolean formula $\neg\exists X\, \mathit{less}(X,X)$ in the language of rbns. Similarly, we enforce transitivity by a relation $\mathit{trans}$ with probability formula

$$F_{\mathit{trans}} = 1 - \max\{\mathit{less}(X,Y) \cdot \mathit{less}(Y,Z) \cdot (1 - \mathit{less}(X,Z)) \mid X,Y,Z;\ X = X\}.$$

To understand this expression, note that $\mathit{less}(a,b) \wedge \mathit{less}(b,c) \rightarrow \mathit{less}(a,c)$ holds if and only if $1 - [\mathit{less}(a,b) \cdot \mathit{less}(b,c) \cdot (1 - \mathit{less}(a,c))] = 1$. Finally, we impose trichotomy (that is, if $X \neq Y$ then $\mathit{less}(X,Y) \vee \mathit{less}(Y,X)$) by a relation $\mathit{trich}$ with probability formula

$$F_{\mathit{trich}} = 1 - \max\{(1 - \mathit{less}(X,Y)) \cdot (1 - \mathit{less}(Y,X)) \mid X,Y;\ X \neq Y\}.$$

Using the relation $\mathit{less}$ we can encode a relation $\mathit{first}$ indicating the smallest element of the domain:

$$F_{\mathit{first}}(X) = 1 - \max\{\mathit{less}(Y,X) \mid Y;\ Y = Y\}.$$

We can also encode the relation $\mathit{successor}(X,Y)$, which is true if and only if $Y$ is the smallest element larger than $X$:

$$F_{\mathit{successor}}(X,Y) = \mathit{less}(X,Y) \cdot \bigl(1 - \max\{\mathit{less}(X,Z) \cdot \mathit{less}(Z,Y) \mid Z;\ Z = Z\}\bigr).$$

While probability formulas can encode any quantified Boolean formula using only max as combination function, they are less readable than their logical counterparts. So for the rest of this proof, we will use logical expressions instead of probability formulas.
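The max-based probability formulas above can be sanity-checked by evaluating them (a sketch of ours, not part of the original text) on the usual strict order over a three-element domain, where products play conjunction and $1 - x$ plays negation:

```python
from itertools import product

D = range(3)
def less(x, y):                  # the intended model: ordinary "<" as a 0/1 relation
    return 1 if x < y else 0

# Direct transcriptions of F_irref, F_trans, F_trich, F_first, F_successor.
F_irref = 1 - max(less(x, x) for x in D)
F_trans = 1 - max(less(x, y) * less(y, z) * (1 - less(x, z))
                  for x, y, z in product(D, repeat=3))
F_trich = 1 - max((1 - less(x, y)) * (1 - less(y, x))
                  for x, y in product(D, repeat=2) if x != y)
def F_first(x):
    return 1 - max(less(y, x) for y in D)
def F_successor(x, y):
    return less(x, y) * (1 - max(less(x, z) * less(z, y) for z in D))

assert F_irref == F_trans == F_trich == 1   # "<" satisfies all three axioms
assert F_first(0) == 1 and F_first(1) == 0  # 0 is the smallest element
assert F_successor(0, 1) == 1 and F_successor(0, 2) == 0
```

Each formula evaluates to 1 exactly when the corresponding axiom holds of the interpretation of $\mathit{less}$, which is what conditioning on the auxiliary propositions exploits later in the proof.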

Introduce a unary relation $\mathit{state}_q$ for each state $q$ and define $F_{\mathit{state}_q}(X) = 1/2$. Atom $\mathit{state}_q(X)$ denotes that the machine is in state $q$ at computation step $X$. Also, introduce a binary relation $\mathit{symbol}_\sigma$ for each symbol $\sigma$ in the alphabet, and specify $F_{\mathit{symbol}_\sigma}(X,Y) = 1/2$. Atom $\mathit{symbol}_\sigma(X,Y)$ denotes that $\sigma$ is written in the $Y$th position of the tape in the $X$th computation step. Finally, introduce a binary relation $\mathit{head}$ with $F_{\mathit{head}}(X,Y) = 1/2$, meaning that the machine head is in the $Y$th position of the tape in the $X$th computation step. We must guarantee that at any computation step the machine is in a single state, each tape position has a single symbol, and the head is at a single tape position. We do so by introducing a proposition $r$ and a probability formula $F_r$ that encodes the logical expression:

$$\forall X\ \Bigl[\bigvee_{q} \Bigl(\mathit{state}_q(X) \wedge \bigwedge_{q' \neq q} \neg \mathit{state}_{q'}(X)\Bigr)\Bigr] \wedge \Bigl[\forall Y\ \bigvee_{\sigma} \Bigl(\mathit{symbol}_\sigma(X,Y) \wedge \bigwedge_{\sigma' \neq \sigma} \neg \mathit{symbol}_{\sigma'}(X,Y)\Bigr)\Bigr] \wedge \exists Y\ \Bigl[\mathit{head}(X,Y) \wedge \forall Z\ \bigl((Z \neq Y) \rightarrow \neg \mathit{head}(X,Z)\bigr)\Bigr].$$

We introduce a proposition $\mathit{start}$ with a probability formula $F_{\mathit{start}}$ imposing the initial configuration of the machine, and a proposition $\mathit{compute}$ with a probability formula $F_{\mathit{compute}}$ encoding the transitions of the machine. The formula $F_{\mathit{start}}$ is simple: it fixes the first element using the construction $\forall X\,[\mathit{first}(X) \rightarrow \mathit{state}_{q_0}(X) \wedge \mathit{head}(X,X)]$. The formula $F_{\mathit{compute}}$ is more elaborate; we only sketch the construction by Grädel. First, $F_{\mathit{compute}}$ encodes the conjunction of two logical formulas, one affecting the tape positions that do not change, and the other actually encoding the transitions. The first formula is

$$\forall X, Y, Z, W\ \bigwedge_{\sigma} \Bigl[\bigl(\mathit{symbol}_\sigma(X,Y) \wedge (Z \neq Y) \wedge \mathit{head}(X,Z) \wedge \mathit{successor}(X,W)\bigr) \rightarrow \mathit{symbol}_\sigma(W,Y)\Bigr].$$

The second formula is

$$\forall X, Y, W\ \bigwedge_{q,\sigma} \Bigl[\bigl(\mathit{state}_q(X) \wedge \mathit{head}(X,Y) \wedge \mathit{symbol}_\sigma(X,Y) \wedge \mathit{successor}(X,W)\bigr) \rightarrow \bigvee_{(q',\sigma',m) \in \delta(q,\sigma)} \bigl(\mathit{state}_{q'}(W) \wedge \mathit{symbol}_{\sigma'}(W,Y) \wedge \mathrm{HEADMOVE}\bigr)\Bigr],$$

where HEADMOVE is the formula $\exists Z\,[(Z = Y + m) \wedge \mathit{head}(W,Z)]$, and $Z = Y + m$ is syntactic sugar for a construction that can be accomplished with the successor relation.
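The syntactic sugar $Z = Y + m$ unfolds, in our reading, into one successor-based condition per value of $m$: $\mathit{successor}(Y,Z)$ for $m = 1$, the equality $Z = Y$ for $m = 0$, and $\mathit{successor}(Z,Y)$ for $m = -1$. A small sketch of ours checks this against plain arithmetic:

```python
# Unfold "Z = Y + m" using only the successor relation on an ordered domain.
def successor(x, y):
    return y == x + 1            # Y's interpretation as an integer, plus one

def head_move(m, y, z):
    if m == 1:
        return successor(y, z)   # move right: Z is the successor of Y
    if m == 0:
        return z == y            # stay in the current cell
    return successor(z, y)       # move left (m == -1): Y is the successor of Z

# The relational definition agrees with the arithmetic reading z == y + m.
for m in (-1, 0, 1):
    for y in range(1, 3):
        for z in range(4):
            assert head_move(m, y, z) == (z == y + m)
```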

We are still to impose that the constraints we included are in fact enforced by any interpretation. We do so by using evidence $\{\mathit{irref} = 1\}$, $\{\mathit{trans} = 1\}$, $\{\mathit{trich} = 1\}$. We also need to impose that the input is represented on the tape. So use
