International Journal of Approximate Reasoning
www.elsevier.com/locate/ijar
The effect of combination functions on the complexity of relational Bayesian networks ✩
Denis Deratani Mauá a,∗, Fabio Gagliardi Cozman b

a Instituto de Matemática e Estatística, Universidade de São Paulo, Brazil
b Escola Politécnica, Universidade de São Paulo, Brazil
a r t i c l e   i n f o

Article history:
Received 23 December 2016
Received in revised form 29 March 2017
Accepted 30 March 2017
Available online 4 April 2017

Keywords:
Relational Bayesian networks
Complexity theory
Probabilistic inference

a b s t r a c t

We study the complexity of inference with Relational Bayesian Networks as parameterized by their probability formulas. We show that without combination functions, inference is pp-complete, displaying the same complexity as standard Bayesian networks (this is so even when the domain is succinctly specified in binary notation). Using only maximization as combination function, we obtain inferential complexity that ranges from pp-complete to pspace-complete to pexp-complete. And by combining mean and threshold combination functions, we obtain complexity classes in all levels of the counting hierarchy. We also investigate the use of arbitrary combination functions and obtain that inference is exp-complete even under a seemingly strong restriction. Finally, we examine the query complexity of Relational Bayesian Networks (i.e., when the relational model is fixed), and we obtain that inference is complete for pp.
© 2017 Elsevier Inc. All rights reserved.
1. Introduction

Bayesian networks provide an intuitive language for the probabilistic description of concrete domains [22]. Jaeger's Relational Bayesian Networks, here referred to as rbns, extend Bayesian networks to abstract domains, and allow for the description of relational, context-specific, deterministic and temporal knowledge [20,21]. There are many languages that also extend Bayesian networks into relational representations [11,12,16,23,24,30]; rbns offer a particularly general and solid formalism.

rbns constitute a specification language containing a small number of constructs: relations, probability formulas, combination functions, and equality constraints. Combination functions are a particularly important modeling feature, as they provide a way of aggregating information from different elements of the domain.

It should not be surprising that the inferential complexity of Bayesian networks specified by rbns depends on the choice of constructs allowed. However, few results have been produced on the relation between the expressivity of such constructs and the complexity of inference.

In this paper, we examine the effect of combination functions on the complexity of inferences with (Bayesian networks specified by) rbns. We first argue that, without combination functions, rbns simply offer a language that is similar to plate models, a well-known formalism to describe models with simple repetitive structure [17,27]. We show that without combination functions inference is pp-complete, irrespective of the encoding of the domain; this matches the complexity of inference in standard (propositional) Bayesian networks. When we allow combination functions and the associated equality constraints into the language, matters complicate considerably. When either the domain is specified in binary or the arity of relations is unbounded, inference is pexp-complete even when the only combination function is maximization. If we place a bound on arity, and specify the domain in unary notation (or equivalently, as an explicit list of elements), and allow only maximization as combination function, then inference is pspace-complete. This is mostly generated by the ability to nest an unbounded number of maximization expressions. In fact, by further restricting the number of nestings of combination functions, we obtain the same power as a probabilistic Turing machine with access to an oracle in the polynomial hierarchy.

We then look at a combination of mean and a threshold: the former allows probabilities to be defined as proportions; the latter allows the specification of piecewise functions. We argue that threshold and mean combined are as powerful as maximization, and thus all previous results hold. And by a suitable constraint on the use of threshold and mean, we show that we can obtain complexity in every class of the counting hierarchy. We also look at the complexity of inference when the combination function is given as part of the input. The challenge here is to constrain the language so as to obtain non-trivial complexity results. We show that requiring polynomial-time combination functions is too weak a condition in that it leads to exp-complete inference. On the other hand, requiring polynomially long probability formulas brings inference down to pp-completeness. These results are summarized in Table 1.

Table 1

Combination functions  Domain spec.  Bounded arity?  Bounded nesting?  Complexity of inference
none                   unary         yes             –                 pp-complete
none                   binary        yes             –                 pp-complete
max                    binary        yes             yes               pexp-complete
max                    unary         no              yes               pexp-complete
max                    unary         yes             no                pspace-complete
max                    unary         yes             yes               pp^{Σ_k^p}-complete
threshold, mean        unary         yes             yes               C_k^p-hard^a
polynomial             unary         yes             yes               exp-complete

a Membership when threshold and mean are used depends on further constraints explained in Section 4.3.

We also investigate the complexity of inference when the rbn is assumed fixed. This is equivalent to the idea of compiling a probabilistic model [6,9]. We show that complexity is either polynomial, if probability formulas can be computed in polynomial time (which includes the case of no combination expressions), or pp-complete, when the combination functions can be computed in polynomial time (which includes the cases of maximization, threshold and mean).

The paper begins with a brief review of rbns (Section 2), and key concepts from complexity theory (Section 3). Our contributions regarding inferential complexity appear in Section 4. The complexity of inference without combination functions appears in Section 4.1. Relational Bayesian networks allowing only combination by maximization are analyzed in Section 4.2, while networks allowing mean and threshold are analyzed in Section 4.3. General polynomial-time computable combination formulas are examined in Section 4.4. Query complexity is discussed in Section 5. We justify our use of decision problems (instead of functional problems) and discuss how our results can be adapted to provide completeness for classes in the functional counting hierarchy in Section 6. A summary of our contributions and open questions is presented in Section 7.

✩ This paper is part of the Virtual special issue on the Eighth International Conference on Probabilistic Graphical Models, edited by Giorgio Corani, Alessandro Antonucci, Cassio De Campos.
* Corresponding author. E-mail address: denis.maua@usp.br (D.D. Mauá).
http://dx.doi.org/10.1016/j.ijar.2017.03.014
0888-613X/© 2017 Elsevier Inc. All rights reserved.
2. Relational Bayesian networks

2.1. Bayesian networks

A Bayesian network is a compact description of a probabilistic model over a propositional language [7,22]. It consists of two parts: an acyclic directed graph G = (V, A) over a finite set of random variables X1, . . . , Xn, and a set of conditional probability distributions, one for each variable and each configuration of its parents. The parents of a variable X in V are denoted by pa(X). In this paper, we consider only 0/1-valued random variables, hence each conditional distribution P(X | pa(X)) can be represented as a table. The semantics of a Bayesian network is obtained by the directed Markov property, which states that every variable is conditionally independent of its non-descendants given its parents. For categorical random variables, this assumption induces a single joint probability distribution by

P(X1 = x1, . . . , Xn = xn) = ∏_{i=1}^{n} P(Xi = xi | pa(Xi) = πi),

where πi is the vector of values for pa(Xi) induced by the assignments {X1 = x1, . . . , Xn = xn}. Bayesian networks can represent complex propositional domains, but lack the ability to represent relational knowledge.

2.2. Definition
rbns can specify Bayesian networks over repetitive scenarios where entities and their relationships are to be represented. To do so, rbns borrow several notions from function-free first-order logic with equality [14], as we now quickly review.

Consider a fixed vocabulary consisting of disjoint sets of names of relations and logical variables (note that we do not allow functions nor constants). We denote logical variables by capital letters (e.g., X, Y, Z), names of relations by lowercase Latin letters (e.g., r, s, t), and formulas by Greek letters (e.g., α, β). Each relation name r is associated with a nonnegative integer |r| describing its arity. An atom has the form r(X1, . . . , X|r|), where r is a relation name, and each Xi is a logical variable. An atomic equality is any expression X = Y, where X and Y are logical variables. An equality constraint is a Boolean formula containing only disjunctions, conjunctions and negations of atomic equalities. For example,

(X = Y) ∧ ((X ≠ Z) ∨ (Y ≠ Z)),

where (X ≠ Y) is syntactic sugar for ¬(X = Y) (we will adopt this notation throughout the text).

The local conditional probability models in an rbn are specified using probability formulas; each probability formula is one of the following:
(PF1) A rational number q ∈ [0, 1]; or
(PF2) An atom r(X1, . . . , X|r|); or
(PF3) A convex combination F1 · F2 + (1 − F1) · F3, where F1, F2, F3 are probability formulas; or
(PF4) A combination expression, that is, an expression of the form

comb{F1, . . . , Fk | Y1, . . . , Ym; α},

where comb is a word from a fixed vocabulary of names of combination functions (disjoint from the sets of relation symbols and logical variables), F1, . . . , Fk are probability formulas, Y1, . . . , Ym is a (possibly empty) list of logical variables, and α is an equality constraint containing only those logical variables and the logical variables appearing in the subformulas. We later define combination functions.

The logical variables Y1, . . . , Ym in a combination expression are said to be bound by that expression; there might be no logical variables bound by a particular combination expression, in which case we write comb{F1, . . . , Fk | ∅; α}.

We define free logical variables in a probability formula F as follows. A rational number has no (free) logical variables. All logical variables in atom r(X1, . . . , X|r|) are free. A variable is free in F1 · F2 + (1 − F1) · F3 if it is free in one of F1, F2 and F3. Finally, a logical variable is free in comb{F1, . . . , Fk | Y1, . . . , Ym; α} if it is free in any of F1, . . . , Fk or α, and it is not bound by comb (i.e., it is not one of Y1, . . . , Ym). An example of a probability formula is

mean{0.6 · r(X) + 0.7 · max{1 − s(X, Y) | X; X = X} | Y, Z; Y ≠ X ∧ Z ≠ X}.

In this formula X is free (even though X is bound by the combination expression headed by max), and Y and Z are bound.
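The inductive definition of free logical variables can be turned directly into a short recursive procedure. The sketch below is ours, not the paper's: probability formulas are encoded as nested tuples, one tag per case PF1–PF4, and free variables are computed exactly as defined above.

```python
# Encoding of probability formulas (our own, for illustration):
#   ('const', q)                     -- PF1: a rational number
#   ('atom', r, (X1, ..., Xk))       -- PF2: an atom r(X1, ..., Xk)
#   ('mix', F1, F2, F3)              -- PF3: F1*F2 + (1-F1)*F3
#   ('comb', name, [F1, ...], [Y1, ...], alpha_vars)
#                                    -- PF4: comb{F1,... | Y1,...; alpha}
def free_vars(f):
    kind = f[0]
    if kind == 'const':
        return set()                 # a rational has no logical variables
    if kind == 'atom':
        return set(f[2])             # all variables in an atom are free
    if kind == 'mix':
        return free_vars(f[1]) | free_vars(f[2]) | free_vars(f[3])
    if kind == 'comb':
        _, _, subs, bound, alpha_vars = f
        inner = set(alpha_vars)
        for g in subs:
            inner |= free_vars(g)
        return inner - set(bound)    # variables bound by comb are removed
    raise ValueError(kind)

# The example formula from the text; 1 - s(X,Y) is the PF3 s*0 + (1-s)*1,
# and the outer coefficients are arranged schematically (they do not affect
# the variable structure).
inner_max = ('comb', 'max',
             [('mix', ('atom', 's', ('X', 'Y')), ('const', 0), ('const', 1))],
             ['X'], ['X'])
F = ('comb', 'mean',
     [('mix', ('const', 0.6), ('atom', 'r', ('X',)), inner_max)],
     ['Y', 'Z'], ['X', 'Y', 'Z'])
```

Running `free_vars` on these terms agrees with the discussion above: X is free in F while Y and Z are bound, and only Y is free in the inner max expression.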
We often write F(X1, . . . , Xn) to indicate that X1, . . . , Xn are the free logical variables in F.

Similarly to Bayesian networks, an rbn consists of two parts: an acyclic directed graph where each node is a relation symbol, and a collection of probability formulas, one for each node. The probability formula Fr associated with node r contains exactly |r| free logical variables and mentions only the relation symbols s ∈ pa(r).

2.3. Semantics

Roughly speaking, one can interpret an rbn with graph G = (V, A) as a rule that takes a set D of elements, called a domain, and produces an auxiliary Bayesian network, as follows. First, generate all groundings r(a) for a ∈ D^|r| and r ∈ V. These groundings are the nodes of the Bayesian network. For each grounding r(a), consider the probability formula at a; that is, Fr(a). This formula depends on many groundings s(a, b) of parents of r; those are the parents of r(a) in the Bayesian network. And for each configuration π of these (grounded) parents, take P(r(a) = 1 | pa(r(a)) = π) = Fr(a), where Fr(a) is evaluated at π. We formalize this procedure later, right after we present an example:

Example 1. Consider a simple model of student performance, inspired by the "University World" often employed in descriptions of Probabilistic Relational Models [16,22]. Suppose a student is taking courses; for each pair student/course, we have approval or not. We have relations difficult and committed, respectively indicating whether a course is difficult and whether a student is committed, and relation approved, indicating whether a student gets an approval grade in a course. The probabilistic relationships between these relations are captured by the graph in Fig. 1, where we used two additional relations student and course to "type" elements of the domain. The idea is that these relations are fully specified by the evidence during inference (that is, every element is typed), so their associated probabilities are not important; we can take for example:

Fstudent(Y) = 1/2 and Fcourse(X) = 1/2.

Fig. 1. Relational Bayesian network modeling student performance.

Fig. 2. Bayesian network obtained by grounding the rbn in Example 1 and removing nodes with evidence.

We need probability formulas for difficult and committed:

Fdifficult(X) = 0.3 · course(X),  Fcommitted(Y) = 0.6 · student(Y).

Also, we have a probability formula for approved:

Fapproved(X, Y) = 0.8 · [0.6 · (1 − difficult(X)) + 0.4 · committed(Y)] · course(X) · student(Y).

Now suppose we have domain D = {s1, s2, c1, c2}. Note that we have 32 groundings (four unary relations and one binary relation), so we obtain a Bayesian network with 32 nodes/random variables. In this network, we have

P(difficult(a) = 1 | course(a) = s) = 0.0 if s = 0 (a is not a course), and 0.3 otherwise;

and likewise for all other groundings of relations. Now suppose that the relation student is given; say we have s1 and s2 as students, and c1 and c2 as courses (hence not students). We denote this evidence E. Given E, we can focus on a smaller Bayesian network obtained by conditioning on this "typing" evidence; such a network is presented in Fig. 2. □
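To make the grounding concrete, here is a small sketch (ours, not the paper's; the function names are invented) that evaluates the Example 1 formulas on the typed groundings and computes the marginal P(approved(c1, s1) = 1) by enumerating the parent groundings difficult(c1) and committed(s1).

```python
from itertools import product

def f_difficult(course):      # F_difficult(X) = 0.3 * course(X)
    return 0.3 * course

def f_committed(student):     # F_committed(Y) = 0.6 * student(Y)
    return 0.6 * student

def f_approved(difficult, committed, course, student):
    # F_approved(X,Y) = 0.8*[0.6*(1-difficult(X)) + 0.4*committed(Y)]
    #                   * course(X) * student(Y)
    return 0.8 * (0.6 * (1 - difficult) + 0.4 * committed) * course * student

# Typing evidence: c1 is a course, s1 is a student.
course_c1, student_s1 = 1, 1

# Marginalize over the two parent groundings of approved(c1, s1).
p = 0.0
for d, c in product([0, 1], repeat=2):
    p_d = f_difficult(course_c1) if d else 1 - f_difficult(course_c1)
    p_c = f_committed(student_s1) if c else 1 - f_committed(student_s1)
    p += p_d * p_c * f_approved(d, c, course_c1, student_s1)

print(p)   # approximately 0.528
```

The result, 0.8 · (0.6 · 0.7 + 0.4 · 0.6) = 0.528, is exactly what grounding the rbn and running inference in the auxiliary Bayesian network of Fig. 2 would produce.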
The probability formulas in the previous example did not use combination expressions. Combination expressions are slightly more difficult to interpret; their semantics are given by combination functions. A combination function is any function that maps a multiset of finitely many numbers in [0, 1] into a single rational number in [0, 1]. Some examples of combination functions are:

• Maximization: max{q1, . . . , qk} = qi, where i is the smallest integer such that qi ≥ qj for j = 1, . . . , k, and max{} = 0;
• Minimization: min{q1, . . . , qk} = 1 − max{1 − q1, . . . , 1 − qk};
• Arithmetic mean: mean{q1, . . . , qk} = (∑_{i=1}^{k} qi)/k, and mean{} = 0;
• Noisy-or: noisy-or{q1, . . . , qk} = 1 − ∏_{i=1}^{k} (1 − qi), and noisy-or{} = 0;
• Threshold: threshold{q1, . . . , qk} = 1 if max{q1, . . . , qk} ≥ 1/2, and threshold{q1, . . . , qk} = 0 otherwise.

The combination function threshold, which does not appear in previous rbns, is useful to encode case-based functions such as piecewise linear functions. It may seem that the comparison with a fixed value 1/2 is too restrictive; however, with a small effort we can compare any probability formula with any constant using this combination function. To see this, suppose we want to compare a probability formula F with a number γ ∈ [0, 1]; that is, we want to determine whether F ≥ γ. If γ > 1/2, then use threshold{(1/(2γ)) · F}. And if γ < 1/2, then use threshold{γ′F + (1 − γ′)}, where γ′ = 1/(2(1 − γ)). Thus we can generate all sorts of piecewise and case-based functions with threshold. We exploit this versatility to encode decision problems related to counting in some of our proofs.

We can now explain the semantics of rbns, which is given by interpretations of a domain D. An interpretation is a function μ mapping each relation symbol r ∈ R into a relation rμ ⊆ D^|r|, and each equality constraint α into its standard meaning αμ. An interpretation μ induces a mapping from a probability formula F with n free logical variables into a function Fμ(a) from D^n to [0, 1], as follows. If F = q, then Fμ is the constant function q. If F = r(X1, . . . , Xn), then Fμ(a) = 1 if a ∈ rμ and Fμ(a) = 0 otherwise. If F = F1 · F2 + (1 − F1) · F3, then Fμ(a) = F1μ(a)F2μ(a) + (1 − F1μ(a))F3μ(a). Finally, if F = comb{F1, . . . , Fk | Y1, . . . , Ym; α}, then Fμ(a) = comb(Q), where this latter comb is the corresponding combination function and Q is the multiset containing a number Fiμ(a, b) for every (a, b) ∈ αμ (even if Fi does not depend on every coordinate).

For example, consider a domain D = {a, b}. The probability formula F = max{0.1, 0.2, 0.3 | Y; Y = Y} is interpreted as max{0.1, 0.2, 0.3, 0.1, 0.2, 0.3}, and the probability formula F(X, Y) = max{q1, q2 | ∅; X ≠ Y} is interpreted as Fμ(a, a) = Fμ(b, b) = max{} = 0, and Fμ(a, b) = Fμ(b, a) = max{q1, q2}.

Finally: Given a finite domain D, an rbn with graph G = (V, A) is associated with a probability distribution over interpretations μ defined by

PD(μ) = ∏_{r∈V} PD(rμ | {sμ : s ∈ pa(r)}),

where

PD(rμ | {sμ : s ∈ pa(r)}) = ∏_{a∈rμ} Frμ(a) ∏_{a∉rμ} (1 − Frμ(a)).
We can now rephrase, in precise terms, the more informal explanation given before Example 1. In essence, an rbn is a template model that for each domain D generates a Bayesian network where each node r is a multidimensional random variable taking values rμ in the set of 2^{N^|r|} possible interpretations rμ, where N = |D|. Alternatively, we can associate with every relation symbol r ∈ R and tuple a ∈ D^|r| a random variable r(a) that takes value 1 when a ∈ rμ and 0 when a ∉ rμ. An rbn and a domain D produce an auxiliary Bayesian network over these random variables, where the parents of a node r(a) are all nodes s(a, b) such that s is a parent of r in the rbn and (a, b) ∈ D^|s|, and the conditional distribution associated with r(a) is given by interpretations of the probability formula Fr at a. This Bayesian network induces a joint distribution

P(∩ {r(a) = ρr(a)}) = ∏_{r∈V} ∏_{a∈D^|r|} P(r(a) = ρr(a) | pa(r(a)) = π) = PD(μ),

where a ∈ rμ if ρr(a) = 1 and a ∉ rμ if ρr(a) = 0, and π is a configuration consistent with those values. Following terminology from first-order logic, we often say that this Bayesian network is the "grounding" of the rbn on domain D. Often, it is possible to generate a smaller grounding, where the parents of r(a) are only those relevant to evaluate Fr(a) (which might not depend on all s(a, b), s ∈ pa(r), a, b ∈ D^|s|). Given this equivalence between an rbn endowed with a domain and an auxiliary Bayesian network, we refer to probabilities such as P(r(a) = 1) to denote P(a ∈ rμ) = ∑_{μ: a∈rμ} PD(μ), and similarly for more complex events (note that the dependency of the distribution on the domain is left implicit in the first case).

The next example illustrates the modeling power of different combination functions.
Example 2. Consider again the university domain in Example 1, and suppose that in addition to students and courses we also have teachers in the domain. At the end of the academic year students vote for the best teacher of the year. An award is then given to the most voted teacher. We use a binary relation taught to indicate whether a course was taught by a teacher in that year. This relation will usually be given as evidence, so its probability is not important; for simplicity we specify

Ftaught(X, Z) = 1/2.

We use a unary relation awarded to indicate whether someone is the recipient of the award. The probability that a teacher is awarded is linearly related to the proportion p of students that failed in at least one course he/she taught, with two regimes:

P(awarded(Z) = 1 | p) = 0.425 − 0.4p,  if p ≥ 0.7;
                        0.25 − 0.15p,  if p < 0.7.

We encode this as

Fawarded(Z) = T(Z) · [0.025 + 0.4 · (1 − F1(Z))] + (1 − T(Z)) · [0.1 + 0.15 · (1 − F1(Z))],

where T(Z) = threshold{5/7 · F1(Z)} denotes whether the proportion of failed students is at least 0.7, F1(Z) = mean{F2(Y, Z) | Y; Y = Y} is the proportion of students that failed some course taught by Z, and F2(Y, Z) is a formula indicating whether Y failed a course taught by Z:

F2(Y, Z) = max{(1 − approved(X, Y)) · taught(X, Z) | X; X = X}.

Since there can be exactly one winner of the award, we introduce a relation singleWinner of arity zero (i.e., a proposition), and specify:

FsingleWinner = max{awarded(Z) | Z; Z = Z} · threshold{γ · mean{1 − awarded(Z) | Z; Z = Z} | ∅},

where γ = (N − 1)/N and N = |D| is the size of the domain (the number of courses, students and teachers). The graph of the corresponding rbn is shown in Fig. 3. □

This example uses the max combination function to encode an existential quantifier; we will later resort to this trick in our proofs.

Fig. 3. Relational Bayesian network modeling the university domain in Example 2.
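As a sanity check on the encoding in Example 2 (a sketch of ours; the function names are invented), the following verifies with exact rationals that Fawarded reproduces the two-regime function P(awarded(Z) = 1 | p) for every p on a grid, with the threshold subformula switching regimes exactly at p = 0.7.

```python
from fractions import Fraction as Fr

def threshold(qs):
    # threshold{q1,...,qk} = 1 iff max{q1,...,qk} >= 1/2 (max{} = 0)
    return Fr(1) if max(qs, default=Fr(0)) >= Fr(1, 2) else Fr(0)

def f_awarded(p):
    # F_awarded with F1(Z) replaced by its value p (the failure proportion).
    t = threshold([Fr(5, 7) * p])      # t = 1  iff  p >= 7/10
    return t * (Fr(25, 1000) + Fr(4, 10) * (1 - p)) \
        + (1 - t) * (Fr(1, 10) + Fr(15, 100) * (1 - p))

def target(p):
    # The intended piecewise-linear function from the text.
    return Fr(425, 1000) - Fr(4, 10) * p if p >= Fr(7, 10) \
        else Fr(25, 100) - Fr(15, 100) * p

for k in range(101):
    p = Fr(k, 100)
    assert f_awarded(p) == target(p)
```

Note how the two affine pieces were rewritten so that their coefficients stay in [0, 1]: 0.025 + 0.4 · (1 − p) equals 0.425 − 0.4p, and 0.1 + 0.15 · (1 − p) equals 0.25 − 0.15p.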
3. The complexity of counting

We assume familiarity with standard complexity classes such as p, np, pspace and exp, and with the concept of oracle Turing machines [1,29]. As usual, we denote the class of languages decided by a Turing machine in complexity class c with oracle in complexity class o as c^o. We also assume that problems are reasonably encoded using a binary alphabet (say, 0 and 1). A decision problem consists in computing a 0/1-valued function on binary strings. We call an input of a decision problem a yes instance if its output is 1; otherwise we call it a no instance.

The polynomial hierarchy is the collection of complexity classes Σ_k^p, Π_k^p and Δ_k^p, for every nonnegative integer k. These classes are defined inductively by Δ_0^p = Σ_0^p = Π_0^p = p, Σ_k^p = np^{Σ_{k−1}^p}, Δ_k^p = p^{Σ_{k−1}^p}, and Π_k^p = coΣ_k^p. The complexity class ph contains all decision problems in the polynomial hierarchy: ph = ∪_k Σ_k^p (over all integers k). We thus have the following inclusions:

p ⊆ np, conp ⊆ p^np ⊆ Σ_2^p, Π_2^p ⊆ · · · ⊆ Δ_k^p ⊆ Σ_k^p, Π_k^p ⊆ · · · ⊆ ph.

A well-known result by Meyer and Stockmeyer [28] states that if Σ_k^p = Π_k^p for some k then ph = Σ_k^p. In particular, if p = np then ph = p. This suggests that the levels Σ_k^p of the hierarchy are distinct, and that using an oracle in a higher level brings in more computational power.

A probabilistic Turing machine is a standard non-deterministic Turing machine that accepts if at least half of the computation paths accept, and rejects otherwise. The complexity class pp contains decision problems which are accepted by a probabilistic Turing machine in polynomial time. The class pp is known to be closed under complement and union [4]. The class pexp is defined analogously, with polynomial-time replaced with exponential-time.

The counting hierarchy augments the polynomial hierarchy with oracle machines based on pp. It can be defined as the smallest collection of classes containing p and such that if c is in the collection then np^c, conp^c, and pp^c are also in the collection [34,36]. We define the class ch as the union over all nonnegative integers k of the classes C_k^p, where C_k^p = pp^{C_{k−1}^p} and C_0^p = Δ_0^p = p [36]. Toda's celebrated result shows that any problem in ph can be solved in polynomial time by a deterministic Turing machine with an oracle pp [32]. Hence, the class ch contains the entire polynomial hierarchy. In fact, we have the following hierarchy:

ph ⊆ p^pp ⊆ np^pp, conp^pp ⊆ pp^pp ⊆ C_2^p ⊆ · · · ⊆ ch ⊆ pspace ⊆ exp ⊆ pexp.

A many-one reduction from a problem A to a problem B is a polynomial-time function f that maps inputs of A into inputs of B, and such that x is a yes instance of A if and only if f(x) is a yes instance of B. For any complexity class c in the counting hierarchy, we say that a problem A is c-hard if for any problem B in c there is a many-one reduction from B to A. If A is also in c, then A is said to be c-complete.

4. Inferential complexity of relational Bayesian networks
Our goal in this paper is to examine the complexity of the decision variant of the following (single-query) inference problem:

Input: An rbn with graph (V, A), a domain D, a list of random variables r(a), r1(a1), . . . , rn(an), and a rational γ ∈ [0, 1].
Decision: Is P(r(a) = 1 | r1(a1) = 1 ∧ · · · ∧ rn(an) = 1) ≥ γ?

The conditioning event r1(a1) = 1 ∧ · · · ∧ rn(an) = 1 is called evidence, and also denoted as {r1(a1) = 1, . . . , rn(an) = 1} or as {r1(a1) = 1}, . . . , {rn(an) = 1}. We have assumed implicitly in the formulation of the problem above that P(r1(a1) = 1, . . . , rn(an) = 1) > 0 (i.e., that evidence has positive probability). A different strategy would be to define instances where this condition is not observed as no instances (or as yes instances). In this case, to prove membership in some complexity class we would have to first show how to decide P(r1(a1) = 1, . . . , rn(an) = 1) > 0 in that class. For simplicity, we assume that this is always true; it is possible however to show that the complexity of inference is not changed by this simplifying assumption.

If the domain is a singleton, then the rbn is actually a Bayesian network, since probability formulas can be efficiently converted into tables. Moreover, any Bayesian network can be polynomially encoded as an rbn with a singleton domain.
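The decision variant can be illustrated by brute force on the grounded network of Example 1 (a sketch of ours; the grounding and names are invented): enumerate the joint over the relevant groundings, condition on the evidence, and compare with γ.

```python
from itertools import product

# d = difficult(c1), c = committed(s1), a = approved(c1, s1);
# typing evidence fixes course(c1) = student(s1) = 1.
def joint(d, c, a):
    p_d = 0.3 if d else 0.7
    p_c = 0.6 if c else 0.4
    f_a = 0.8 * (0.6 * (1 - d) + 0.4 * c)
    return p_d * p_c * (f_a if a else 1 - f_a)

def decide(gamma):
    # Is P(approved(c1,s1)=1 | committed(s1)=1) >= gamma?
    num = sum(joint(d, 1, 1) for d in (0, 1))                        # query and evidence
    den = sum(joint(d, 1, a) for d, a in product((0, 1), repeat=2))  # evidence alone
    return num / den >= gamma

print(decide(0.5))   # True: the conditional probability is 0.656
print(decide(0.7))   # False
```

Of course, this exhaustive enumeration takes time exponential in the number of groundings; the point of the following sections is precisely to pin down how hard this decision problem is under various restrictions on the rbn.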
Hence, inference in rbns is pp-hard, as this is the case for Bayesian networks [8, Theorems 11.3 and 11.5].

For domains with many elements, complexity might depend on how the domain D is encoded. One possibility is to assume that the domain is some set {1, 2, . . . , N} specified by the integer N (but note that there is no ordering on the elements implied). The integer N can be encoded in different ways. One is this:

Assumption 3. The domain is {1, 2, . . . , N}, where N is encoded in binary notation.

Under the previous assumption, the encoding of the domain contributes O(log N) bits to the input size. Depending on the constructs allowed, this alternative can produce probabilities which require exponential effort to be written down, as they might need O(N) bits. To see this, consider an rbn with a proposition r such that Fr = noisy-or{1/2 | X; X = X}. Then P(r = 0) = 1 − [1 − ∏_{i=1}^{N} 1/2] = 2^{−N}.

Another possibility is to still assume that the domain is {1, . . . , N}, but to encode N using unary notation. In this case, the encoding of the domain contributes O(N) symbols to the input. Thus, this specification is equivalent (up to some constant factor) to encoding the domain by an explicit list with every element it contains. So we might adopt:

Assumption 4. The domain is specified as an explicit list of elements.

Another parameter that affects inferential complexity is the arity of relations. If we place no bound on the arity of relations, then an exponential effort might be required to write down probabilities. For example, consider an rbn with graph r → s where r is a k-ary relation with probability formula Fr(X1, . . . , Xk) = 1/2, and s is a proposition with probability formula Fs = noisy-or{r(X1, . . . , Xk) | X1, . . . , Xk; X1 = X1}; then the inference P(s = 0) requires an exponential number of bits, even when the domain contains as few as two elements. We will thus often adopt:

Assumption 5. The arity of relations is bounded.
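The blowup behind Assumption 3 is easy to reproduce (a sketch of ours): with Fr = noisy-or{1/2 | X; X = X} over domain {1, . . . , N}, exact arithmetic shows P(r = 0) = 2^{−N}, whose binary representation needs about N bits while the binary encoding of N needs only about log2 N bits.

```python
from fractions import Fraction as Fr

def p_r_zero(N):
    # noisy-or{q1,...,qk} = 1 - prod_i (1 - qi); grounding the combination
    # expression over {1,...,N} yields k = N copies of 1/2, so
    # P(r = 1) = 1 - (1/2)^N and P(r = 0) = (1/2)^N.
    prod = Fr(1)
    for _ in range(N):
        prod *= 1 - Fr(1, 2)
    return prod

for N in (1, 8, 64):
    p = p_r_zero(N)
    assert p == Fr(1, 2**N)
    # writing p takes ~N bits, while writing N in binary takes ~log2(N)+1 bits
    assert p.denominator.bit_length() == N + 1
```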
Without further restrictions, the complexity of inference is dominated by the complexity of evaluating combination expressions. For instance, if the combination function in a combination expression is an arbitrary Turing machine (with no bounds on time and space), then inference is undecidable. Thus, we need to constrain the use of combination functions in probability formulas to make the study of inferential complexity more interesting.
4.1. rbns without combination functions

We start our analysis with the simplest case, where no combination function is available. That is, we adopt:

Assumption 6. Every probability formula is either a rational number, or an atom, or a convex combination of probability formulas.

Under the assumption above, every logical variable in a probability formula is free; also, in the acyclic directed graph of relations of an rbn no node can have an arity larger than any of its children.

If the domain is given as an explicit list of elements (or by an integer in unary notation), then we can "ground" an rbn into an auxiliary Bayesian network, and run inference there (consider Example 1). This process generates a polynomially-large Bayesian network if the relations have bounded arity (if necessary, we can insert auxiliary nodes so as to bound the number of parents of any given node). Because inference in Bayesian networks is pp-complete, we have that:

Theorem 7. Inference is pp-complete when the domain is specified as a list, the relations have bounded arity and there is no combination function.
Interestingly, the complexity of inference remains the same even if the domain is specified by its size in binary notation (note that in this case the auxiliary Bayesian network has a size that is exponential in the input):

Theorem 8. Inference is pp-complete when the domain is specified by its size in binary, the relations have bounded arity and there is no combination function.

Proof. Hardness follows from the fact that an rbn can efficiently encode any propositional Bayesian network using only convex combinations of numbers and atoms (as in Example 1). To prove membership, consider the auxiliary Bayesian network that is generated by "grounding" a given rbn. This network may be exponentially large on the input, when the domain is given in binary notation. We show that we only need to generate a polynomially-large fragment of the auxiliary Bayesian network to decide whether P(r(a) = 1 | r1(a1) = 1 ∧ · · · ∧ rn(an) = 1) ≥ γ. Note that to compute that conditional probability we only need the ancestors of the query and evidence nodes. Suppose that Fr is a convex combination; then it may recursively refer to several atoms that are parents of r. These parents may in turn have several parents, and so on. Since there are no combination functions, we can always rename logical variables such that the probability formulas of every ancestor of r contain only logical variables in the probability formula Fr. Thus, if r has arity k, then there can be at most 2^k ancestors of r(a), as this is the number of subsets of the logical variables in Fr. Since we assumed k is bounded by a constant, this number of ancestors is also bounded by a constant. Thus, we can build a Bayesian network consisting of these nodes and the edges between them, together with the associated probability values. Inference must then be run in this sub-network, and this can be done by a polynomial-time probabilistic Turing machine [8, Theorems 11.3 and 11.5]. □
According to Theorem 8, the complexity of inference in combination-function-free rbns matches the complexity of inference in Bayesian networks (if relation arity is bounded). That is, even though an rbn may represent exponentially large Bayesian networks, only a polynomial number of nodes of this network is relevant for inference (and these nodes can be collected in polynomial time).
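The key observation can be sketched in code (the encoding is ours, using the relations of Example 1): in a combination-function-free rbn, every atom in Fr uses a sub-tuple of r's logical variables, so the groundings requisite to a query can be collected directly, without ever building the full grounding, and their number does not depend on the domain size.

```python
# parents[r] = list of (parent relation s, positions of r's argument tuple
# that s uses); this mirrors the atoms occurring in F_r after renaming.
parents = {
    'approved': [('difficult', (0,)), ('committed', (1,)),
                 ('course', (0,)), ('student', (1,))],
    'difficult': [('course', (0,))],
    'committed': [('student', (0,))],
    'course': [], 'student': [],
}

def requisite(r, a, seen=None):
    # Collect all ancestor groundings of r(a) by projecting a onto the
    # argument positions used by each parent atom.
    seen = set() if seen is None else seen
    if (r, a) not in seen:
        seen.add((r, a))
        for s, pos in parents[r]:
            requisite(s, tuple(a[i] for i in pos), seen)
    return seen

# Only 5 groundings matter for approved(c1, s1), whatever the domain size:
print(sorted(requisite('approved', ('c1', 's1'))))
```

Growing the domain adds groundings to the full auxiliary network but never to this requisite set, which is what makes the pp membership argument of Theorem 8 go through.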
4.2. rbns with max combination functions

We now consider rbns where information is only aggregated by maximization. That is, we assume:

Assumption 9. The only combination function used is max.

As we have seen, without combination functions, inference in rbns has the same computational power as inference in Bayesian networks, even when the domain is given by an integer in binary notation. This is not the case when we allow combination functions, even one as simple as maximization:

Theorem 10. Inference is pexp-complete when the domain is specified by its size in binary, the relations have bounded arity and the only combination function is max.
Proof. Membership follows since an rbn satisfying the above assumptions can be "grounded" into an exponentially large Bayesian network. To see this, note that if we have a domain of size 2^m, and a bound of arity k, then each relation produces at most (2^m)^k = 2^{km} (grounded) random variables. Inference can be decided in that network using a probabilistic Turing machine with an exponential bound on time, using the same scheme normally employed to decide an inference for a Bayesian network [8].

To show hardness, we simulate an exponential-time probabilistic Turing machine using an rbn such that we can "count" the number of accepting paths by a single inference. We do so by resorting to a standard Turing machine encoding described by Grädel [18, Theorem 3.2.4]. There are other possible encodings we might use, but this one is quite compact and relatively easy to grasp.
So, take a Turing machine that can decide a pexp-complete language. That is, take a pexp-complete language L and a nondeterministic Turing machine such that an input string is in L if and only if the machine halts within time 2^p with half or more of the computation paths accepting, where n is the length of the input and p denotes a polynomial in n.¹ Denote by σ a symbol in the alphabet of the machine (which includes blank and boundary symbols in addition to 0s and 1s), and by q a state. As usual, we describe a configuration of the machine by a string

σ1 σ2 · · · σi−1 (q σi) σi+1 · · · σ2^p,

where each σj is a symbol in the tape, and (q σi) indicates that the machine is at state q and its head is at cell i. The initial configuration is (q0 σ01) σ02 · · · σ0n followed by 2^p − n blanks, where q0 is the initial state. Let qa and qr be, respectively, states that indicate acceptance or rejection of the input string σ01 · · · σ0n. The transition function δ of the machine takes a pair (q, σ) consisting of a state and a symbol in the tape, and returns triples (q′, σ′, m): the next state q′, the symbol σ′ to be written in the tape (we assume that a blank is never written by the machine), and an integer m in {−1, 0, 1}. Here −1 means that the head is to move left, 0 means that the head is to stay in the current cell, and 1 means that the head is to move right. We make the important assumption that, if qa or qr appear in some configuration, then the configuration is not modified anymore (the transition function goes from this configuration to itself from that point on). This is necessary to guarantee that the number of accepting computations is equal to the number of ways in which the machine can run from the input.

¹ In the usual definition, a probabilistic Turing machine accepts when the majority of computation paths accept; however, given such a machine we can always build a machine that decides the same language, and such that the machine accepts if and only if at least half of the computation paths accept.

Grädel's encoding uses quantified Boolean formulas to represent the Turing machine. As shown by Jaeger [20, Lemma 2.4], probability formulas can encode conjunction and negation, respectively, by multiplication and inversion (and then disjunction and implication are easily obtained). Logical equalities such as X = Y are obtained by the formula F(X, Y) = max{1 | ∅; X = Y}, and quantified formulas such as ∃Y ψ(X, Y) are obtained as F(X) = max{Fψ(X, Y) | Y; Y = Y}, where Fψ(X, Y) encodes ψ(X, Y). Thus we can easily reproduce Grädel's logical encoding of the machine by writing logical formulas as probability formulas. For example, consider a relation less (meaning "less than"), used to order the elements of the domain, and consequently represent the ordering on the computation steps and on the positions of the tape. Since the elements of the domain are ordered, we can interpret them as integers. Define Fless(X, Y) = 1/2. We must impose on less the axioms of linear orderings. To constrain less to be irreflexive, we introduce a proposition irref with probability formula

Firref = 1 − max{less(X, X) | X; X = X}.
We have that Fμirref = 1 if and only if lessμ is irreflexive. Note that the above probability formula simply encodes the quantified Boolean formula ¬∃X less(X, X) in the language of rbns. Similarly, we enforce transitivity by a relation trans with probability formula

Ftrans = 1 − max{less(X, Y) · less(Y, Z) · (1 − less(X, Z)) | X, Y, Z; X = X}.
To understand this expression, note that less(a, b) ∧ less(b, c) → less(a, c) holds if and only if 1 − [less(a, b) · less(b, c) · (1 − less(a, c))] = 1. Finally, we impose trichotomy (that is, if X ≠ Y then less(X, Y) ∨ less(Y, X)) by a relation trich with probability formula

Ftrich = 1 − max{(1 − less(X, Y)) · (1 − less(Y, X)) | X, Y; X ≠ Y}.
Using the relation less we can encode a relation first indicating the smallest element of the domain:

Ffirst(X) = 1 − max{less(Y, X) | Y; Y = Y}.
We can also encode the relation successor(X, Y), which is true if and only if Y is the smallest element larger than X:

Fsuccessor(X, Y) = less(X, Y) · (1 − max{less(X, Z) · less(Z, Y) | Z; Z = Z}).

While probability formulas can encode any quantified Boolean formula using only max as combination function, they are less readable than their logical counterparts. So for the rest of this proof, we will use logical expressions instead of probability formulas.
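The logical encodings above can be checked on a concrete interpretation (a sketch of ours): taking less to be the usual < on {0, 1, 2}, products play the role of conjunction, 1 − x of negation, and max over a bound variable of an existential quantifier; the ordering formulas then evaluate to 1 exactly as intended.

```python
from itertools import product

D = [0, 1, 2]
less = lambda x, y: int(x < y)   # a fixed 0/1 interpretation of `less`

# F_irref, F_trans, F_trich as 0/1-valued expressions over the domain.
f_irref = 1 - max(less(x, x) for x in D)
f_trans = 1 - max(less(x, y) * less(y, z) * (1 - less(x, z))
                  for x, y, z in product(D, repeat=3))
f_trich = 1 - max((1 - less(x, y)) * (1 - less(y, x))
                  for x, y in product(D, repeat=2) if x != y)

# F_first and F_successor, tabulated for every tuple of the domain.
f_first = {x: 1 - max(less(y, x) for y in D) for x in D}
f_succ = {(x, y): less(x, y) * (1 - max(less(x, z) * less(z, y) for z in D))
          for x, y in product(D, repeat=2)}

assert f_irref == f_trans == f_trich == 1   # `<` satisfies all three axioms
assert f_first == {0: 1, 1: 0, 2: 0}        # 0 is the smallest element
assert f_succ[(0, 1)] == 1 and f_succ[(0, 2)] == 0
```

An interpretation of less that violated an axiom (say, a reflexive pair) would drive the corresponding formula to 0, which is precisely how the proof uses evidence on irref, trans and trich to rule such interpretations out.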
Introduce a unary relation stateq for each state q and define Fstateq(X) = 1/2. Atom stateq(X) denotes that the machine is in state q at computation step X. Also, introduce a binary relation symbolσ for each symbol σ in the alphabet, and specify Fsymbolσ(X, Y) = 1/2. Atom symbolσ(X, Y) denotes that σ is written in the Yth position of the tape in the Xth computation step. Finally, introduce a binary relation head with Fhead(X, Y) = 1/2, meaning that the machine head is in the Yth position of the tape in the Xth computation step. We must guarantee that at any computation step the machine is in a single state, each tape position has a single symbol, and the head is at a single tape position. We do so by introducing a proposition r and a probability formula Fr that encodes the logical expression:

∀X: [⋁_q (stateq(X) ∧ ⋀_{q′≠q} ¬stateq′(X))]
  ∧ [∀Y ⋁_σ (symbolσ(X, Y) ∧ ⋀_{σ′≠σ} ¬symbolσ′(X, Y))]
  ∧ [∃Y (head(X, Y) ∧ ∀Z ((Z ≠ Y) → ¬head(X, Z)))].
We introduce a proposition start with a probability formula Fstart imposing the initial configuration of the machine, and a proposition compute with a probability formula Fcompute encoding the transitions of the machine. The formula Fstart is simple: it fixes the first element using the construction ∀X [first(X) → stateq0(X) ∧ head(X, X)]. The formula Fcompute is more elaborate; we only sketch the construction by Grädel. First, Fcompute encodes the conjunction of two logical formulas, one affecting the tape positions that do not change, and the other actually encoding the transitions. The first formula is

∀X, Y, Z, W: ⋀_σ [symbolσ(X, Y) ∧ (Z ≠ Y) ∧ head(X, Z) ∧ successor(X, W) → symbolσ(W, Y)].
The second formula is

∀X, Y, W: ⋀_{q,σ} [stateq(X) ∧ head(X, Y) ∧ symbolσ(X, Y) ∧ successor(X, W)
  → ⋁_{(q′,σ′,m)∈δ(q,σ)} (stateq′(W) ∧ symbolσ′(W, Y) ∧ HEADMOVE)],

where HEADMOVE is the formula ∃Z [(Z = Y + m) ∧ head(W, Z)], and Z = Y + m is syntactic sugar for a construction that can be accomplished with the successor relation. We are still to impose that the constraints we included are in fact enforced by any interpretation. We do so by using evidence {irref = 1}, {trans = 1}, {trich = 1}. We also need to impose that the input is represented on the tape. So use