International Journal of Approximate Reasoning
www.elsevier.com/locate/ijar
The effect of combination functions on the complexity of relational Bayesian networks ✩
Denis Deratani Mauá a,∗, Fabio Gagliardi Cozman b

a Instituto de Matemática e Estatística, Universidade de São Paulo, Brazil
b Escola Politécnica, Universidade de São Paulo, Brazil
a r t i c l e   i n f o

Article history:
Received 23 December 2016
Received in revised form 29 March 2017
Accepted 30 March 2017
Available online 4 April 2017

Keywords:
Relational Bayesian networks
Complexity theory
Probabilistic inference

a b s t r a c t

We study the complexity of inference with Relational Bayesian Networks as parameterized by their probability formulas. We show that without combination functions, inference is pp-complete, displaying the same complexity as standard Bayesian networks (this is so even when the domain is succinctly specified in binary notation). Using only maximization as combination function, we obtain inferential complexity that ranges from pp-complete to pspace-complete to pexp-complete. And by combining mean and threshold combination functions, we obtain complexity classes in all levels of the counting hierarchy. We also investigate the use of arbitrary combination functions and obtain that inference is exp-complete even under a seemingly strong restriction. Finally, we examine the query complexity of Relational Bayesian Networks (i.e., when the relational model is fixed), and we obtain that inference is complete for pp.
© 2017 Elsevier Inc. All rights reserved.
1. Introduction

Bayesian networks provide an intuitive language for the probabilistic description of concrete domains [22]. Jaeger's Relational Bayesian Networks, here referred to as rbns, extend Bayesian networks to abstract domains, and allow for the description of relational, context-specific, deterministic and temporal knowledge [20,21]. There are many languages that also extend Bayesian networks into relational representations [11,12,16,23,24,30]; rbns offer a particularly general and solid formalism.

rbns constitute a specification language containing a small number of constructs: relations, probability formulas, combination functions, and equality constraints. Combination functions are a particularly important modeling feature, as they provide a way of aggregating information from different elements of the domain.

It should not be surprising that the inferential complexity of Bayesian networks specified by rbns depends on the choice of constructs allowed. However, few results have been produced on the relation between the expressivity of such constructs and the complexity of inference.

In this paper, we examine the effect of combination functions on the complexity of inferences with (Bayesian networks specified by) rbns. We first argue that, without combination functions, rbns simply offer a language that is similar to plate models, a well-known formalism to describe models with simple repetitive structure [17,27]. We show that without combination functions inference is pp-complete, irrespective of the encoding of the domain; this matches the complexity of inference in standard (propositional) Bayesian networks. When we allow combination functions and the associated equality constraints into the language, matters complicate considerably. When either the domain is specified in binary or the arity of relations is unbounded, inference is pexp-complete even when the only combination function is maximization. If we place a bound on arity, and specify the domain in unary notation (or equivalently, as an explicit list of elements), and allow only maximization as combination function, then inference is pspace-complete. This is mostly generated by the ability to nest an unbounded number of maximization expressions. In fact, by further restricting the number of nestings of combination functions, we obtain the same power as a probabilistic Turing machine with access to an oracle in the polynomial hierarchy.

We then look at a combination of mean and a threshold: the former allows probabilities to be defined as proportions; the latter allows the specification of piecewise functions. We argue that threshold and mean combined are as powerful as maximization, and thus all previous results hold. And by a suitable constraint on the use of threshold and mean, we show that we can obtain complexity in every class of the counting hierarchy. We also look at the complexity of inference when the combination function is given as part of the input. The challenge here is to constrain the language so as to obtain non-trivial complexity results. We show that requiring polynomial-time combination functions is too weak a condition in that it leads to exp-complete inference. On the other hand, requiring polynomially long probability formulas brings inference down to pp-completeness. These results are summarized in Table 1.

Table 1

Combination functions  Domain spec.  Bounded arity?  Bounded nesting?  Complexity of inference
none                   unary         yes             –                 pp-complete
none                   binary        yes             –                 pp-complete
max                    binary        yes             yes               pexp-complete
max                    unary         no              yes               pexp-complete
max                    unary         yes             no                pspace-complete
max                    unary         yes             yes               pp^{Σ_k^p}-complete
threshold, mean        unary         yes             yes               C_k^p-hard^a
polynomial             unary         yes             yes               exp-complete

a Membership when threshold and mean are used depends on further constraints explained in Section 4.3.

We also investigate the complexity of inference when the rbn is assumed fixed. This is equivalent to the idea of compiling a probabilistic model [6,9]. We show that complexity is either polynomial, if probability formulas can be computed in polynomial time (which includes the case of no combination expressions), or pp-complete, when the combination functions can be computed in polynomial time (which includes the cases of maximization, threshold and mean).

The paper begins with a brief review of rbns (Section 2), and key concepts from complexity theory (Section 3). Our contributions regarding inferential complexity appear in Section 4. The complexity of inference without combination functions appears in Section 4.1. Relational Bayesian networks allowing only combination by maximization are analyzed in Section 4.2, while networks allowing mean and threshold are analyzed in Section 4.3. General polynomial-time computable combination formulas are examined in Section 4.4. Query complexity is discussed in Section 5. We justify our use of decision problems (instead of functional problems) and discuss how our results can be adapted to provide completeness for classes in the functional counting hierarchy in Section 6. A summary of our contributions and open questions is presented in Section 7.

✩ This paper is part of the Virtual special issue on the Eighth International Conference on Probabilistic Graphical Models, edited by Giorgio Corani, Alessandro Antonucci, Cassio De Campos.
* Corresponding author. E-mail address: denis.maua@usp.br (D.D. Mauá).
http://dx.doi.org/10.1016/j.ijar.2017.03.014
0888-613X/© 2017 Elsevier Inc. All rights reserved.
2. Relational Bayesian networks

2.1. Bayesian networks

A Bayesian network is a compact description of a probabilistic model over a propositional language [7,22]. It consists of two parts: an acyclic directed graph G = (V, A) over a finite set of random variables X1, . . . , Xn, and a set of conditional probability distributions, one for each variable and each configuration of its parents. The parents of a variable X in V are denoted by pa(X). In this paper, we consider only 0/1-valued random variables, hence each conditional distribution P(X | pa(X)) can be represented as a table. The semantics of a Bayesian network is obtained by the directed Markov property, which states that every variable is conditionally independent of its non-descendants given its parents. For categorical random variables, this assumption induces a single joint probability distribution by

P(X1 = x1, . . . , Xn = xn) = ∏_{i=1}^{n} P(Xi = xi | pa(Xi) = πi),

where πi is the vector of values for pa(Xi) induced by the assignments {X1 = x1, . . . , Xn = xn}. Bayesian networks can represent complex propositional domains, but lack the ability to represent relational knowledge.

2.2. Definition
rbns can specify Bayesian networks over repetitive scenarios where entities and their relationships are to be represented. To do so, rbns borrow several notions from function-free first-order logic with equality [14], as we now quickly review.

Consider a fixed vocabulary consisting of disjoint sets of names of relations and logical variables (note that we do not allow functions nor constants). We denote logical variables by capital letters (e.g., X, Y, Z), names of relations by lowercase Latin letters (e.g., r, s, t), and formulas by Greek letters (e.g., α, β). Each relation name r is associated with a nonnegative integer |r| describing its arity. An atom has the form r(X1, . . . , X|r|), where r is a relation name, and each Xi is a logical variable. An atomic equality is any expression X = Y, where X and Y are logical variables. An equality constraint is a Boolean formula containing only disjunctions, conjunctions and negations of atomic equalities. For example,

(X = Y) ∧ ((X ≠ Z) ∨ (Y ≠ Z)),

where (X ≠ Y) is syntactic sugar for ¬(X = Y) (we will adopt this notation throughout the text).

The local conditional probability models in an rbn are specified using probability formulas; each probability formula is one of the following:
(PF1) A rational number q ∈ [0, 1]; or
(PF2) An atom r(X1, . . . , X|r|); or
(PF3) A convex combination F1 · F2 + (1 − F1) · F3, where F1, F2, F3 are probability formulas; or
(PF4) A combination expression, that is, an expression of the form

comb{F1, . . . , Fk | Y1, . . . , Ym; α},

where comb is a word from a fixed vocabulary of names of combination functions (disjoint from the sets of relation symbols and logical variables), F1, . . . , Fk are probability formulas, Y1, . . . , Ym is a (possibly empty) list of logical variables, and α is an equality constraint containing only those logical variables and the logical variables appearing in the subformulas. We later define combination functions.

The logical variables Y1, . . . , Ym in a combination expression are said to be bound by that expression; there might be no logical variables bound by a particular combination expression, in which case we write comb{F1, . . . , Fk | ∅; α}.

We define free logical variables in a probability formula F as follows. A rational number has no (free) logical variables. All logical variables in atom r(X1, . . . , X|r|) are free. A variable is free in F1 · F2 + (1 − F1) · F3 if it is free in one of F1, F2 and F3. Finally, a logical variable is free in comb{F1, . . . , Fk | Y1, . . . , Ym; α} if it is free in any of F1, . . . , Fk or α, and it is not bound by comb (i.e., it is not one of Y1, . . . , Ym). An example of a probability formula is

mean{0.6 · r(X) + 0.7 · max{1 − s(X, Y) | X; X = X} | Y, Z; Y ≠ X ∧ Z ≠ X}.

In this formula X is free (even though X is bound by the combination expression headed by max), and Y and Z are bound.
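The inductive definition of free logical variables can be turned directly into a short recursive procedure. The sketch below is ours, not the paper's: probability formulas are encoded as nested tuples, one tag per case PF1–PF4, and free variables are computed exactly as defined above.

```python
# Encoding of probability formulas (our own, for illustration):
#   ('const', q)                     -- PF1: a rational number
#   ('atom', r, (X1, ..., Xk))       -- PF2: an atom r(X1, ..., Xk)
#   ('mix', F1, F2, F3)              -- PF3: F1*F2 + (1-F1)*F3
#   ('comb', name, [F1, ...], [Y1, ...], alpha_vars)
#                                    -- PF4: comb{F1,... | Y1,...; alpha}
def free_vars(f):
    kind = f[0]
    if kind == 'const':
        return set()                 # a rational has no logical variables
    if kind == 'atom':
        return set(f[2])             # all variables in an atom are free
    if kind == 'mix':
        return free_vars(f[1]) | free_vars(f[2]) | free_vars(f[3])
    if kind == 'comb':
        _, _, subs, bound, alpha_vars = f
        inner = set(alpha_vars)
        for g in subs:
            inner |= free_vars(g)
        return inner - set(bound)    # variables bound by comb are removed
    raise ValueError(kind)

# The example formula from the text; 1 - s(X,Y) is the PF3 s*0 + (1-s)*1,
# and the outer coefficients are arranged schematically (they do not affect
# the variable structure).
inner_max = ('comb', 'max',
             [('mix', ('atom', 's', ('X', 'Y')), ('const', 0), ('const', 1))],
             ['X'], ['X'])
F = ('comb', 'mean',
     [('mix', ('const', 0.6), ('atom', 'r', ('X',)), inner_max)],
     ['Y', 'Z'], ['X', 'Y', 'Z'])
```

Running `free_vars` on these terms agrees with the discussion above: X is free in F while Y and Z are bound, and only Y is free in the inner max expression.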
We often write F(X1, . . . , Xn) to indicate that X1, . . . , Xn are the free logical variables in F.

Similarly to Bayesian networks, an rbn consists of two parts: an acyclic directed graph where each node is a relation symbol, and a collection of probability formulas, one for each node. The probability formula Fr associated with node r contains exactly |r| free logical variables and mentions only the relation symbols s ∈ pa(r).

2.3. Semantics

Roughly speaking, one can interpret an rbn with graph G = (V, A) as a rule that takes a set D of elements, called a domain, and produces an auxiliary Bayesian network, as follows. First, generate all groundings r(a) for a ∈ D^|r| and r ∈ V. These groundings are the nodes of the Bayesian network. For each grounding r(a), consider the probability formula at a; that is, Fr(a). This formula depends on many groundings s(a, b) of parents of r; those are the parents of r(a) in the Bayesian network. And for each configuration π of these (grounded) parents, take P(r(a) = 1 | pa(r(a)) = π) = Fr(a), where Fr(a) is evaluated at π. We formalize this procedure later, right after we present an example:

Example 1. Consider a simple model of student performance, inspired by the "University World" often employed in descriptions of Probabilistic Relational Models [16,22]. Suppose a student is taking courses; for each pair student/course, we have approval or not. We have relations difficult and committed, respectively indicating whether a course is difficult and whether a student is committed, and relation approved, indicating whether a student gets an approval grade in a course. The probabilistic relationships between these relations are captured by the graph in Fig. 1, where we used two additional relations student and course to "type" elements of the domain. The idea is that these relations are fully specified by the evidence during inference (that is, every element is typed), so their associated probabilities are not important; we can take for example:

Fstudent(Y) = 1/2 and Fcourse(X) = 1/2.

Fig. 1. Relational Bayesian network modeling student performance.

Fig. 2. Bayesian network obtained by grounding the rbn in Example 1 and removing nodes with evidence.

We need probability formulas for difficult and committed:

Fdifficult(X) = 0.3 · course(X),  Fcommitted(Y) = 0.6 · student(Y).

Also, we have a probability formula for approved:

Fapproved(X, Y) = 0.8 · [0.6 · (1 − difficult(X)) + 0.4 · committed(Y)] · course(X) · student(Y).

Now suppose we have domain D = {s1, s2, c1, c2}. Note that we have 32 groundings (four unary relations and one binary relation), so we obtain a Bayesian network with 32 nodes/random variables. In this network, we have

P(difficult(a) = 1 | course(a) = s) = 0.0 if s = 0 (a is not a course), and 0.3 otherwise;

and likewise for all other groundings of relations. Now suppose that the relation student is given; say we have s1 and s2 as students, and c1 and c2 as courses (hence not students). We denote this evidence E. Given E, we can focus on a smaller Bayesian network obtained by conditioning on this "typing" evidence; such a network is presented in Fig. 2. □
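To make the grounding concrete, here is a small sketch (ours, not the paper's; the function names are invented) that evaluates the Example 1 formulas on the typed groundings and computes the marginal P(approved(c1, s1) = 1) by enumerating the parent groundings difficult(c1) and committed(s1).

```python
from itertools import product

def f_difficult(course):      # F_difficult(X) = 0.3 * course(X)
    return 0.3 * course

def f_committed(student):     # F_committed(Y) = 0.6 * student(Y)
    return 0.6 * student

def f_approved(difficult, committed, course, student):
    # F_approved(X,Y) = 0.8*[0.6*(1-difficult(X)) + 0.4*committed(Y)]
    #                   * course(X) * student(Y)
    return 0.8 * (0.6 * (1 - difficult) + 0.4 * committed) * course * student

# Typing evidence: c1 is a course, s1 is a student.
course_c1, student_s1 = 1, 1

# Marginalize over the two parent groundings of approved(c1, s1).
p = 0.0
for d, c in product([0, 1], repeat=2):
    p_d = f_difficult(course_c1) if d else 1 - f_difficult(course_c1)
    p_c = f_committed(student_s1) if c else 1 - f_committed(student_s1)
    p += p_d * p_c * f_approved(d, c, course_c1, student_s1)

print(p)   # approximately 0.528
```

The result, 0.8 · (0.6 · 0.7 + 0.4 · 0.6) = 0.528, is exactly what grounding the rbn and running inference in the auxiliary Bayesian network of Fig. 2 would produce.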
The probability formulas in the previous example did not use combination expressions. Combination expressions are slightly more difficult to interpret; their semantics are given by combination functions. A combination function is any function that maps a multiset of finitely many numbers in [0, 1] into a single rational number in [0, 1]. Some examples of combination functions are:

• Maximization: max{q1, . . . , qk} = qi, where i is the smallest integer such that qi ≥ qj for j = 1, . . . , k, and max{} = 0;
• Minimization: min{q1, . . . , qk} = 1 − max{1 − q1, . . . , 1 − qk};
• Arithmetic mean: mean{q1, . . . , qk} = (∑_{i=1}^{k} qi)/k, and mean{} = 0;
• Noisy-or: noisy-or{q1, . . . , qk} = 1 − ∏_{i=1}^{k} (1 − qi), and noisy-or{} = 0;
• Threshold: threshold{q1, . . . , qk} = 1 if max{q1, . . . , qk} ≥ 1/2, and threshold{q1, . . . , qk} = 0 otherwise.

The combination function threshold, which does not appear in previous rbns, is useful to encode case-based functions such as piecewise linear functions. It may seem that the comparison with a fixed value 1/2 is too restrictive; however, with a small effort we can compare any probability formula with any constant using this combination function. To see this, suppose we want to compare a probability formula F with a number γ ∈ [0, 1]; that is, we want to determine whether F ≥ γ. If γ > 1/2, then use threshold{(1/(2γ)) · F}. And if γ < 1/2, then use threshold{γ′F + (1 − γ′)}, where γ′ = 1/(2(1 − γ)). Thus we can generate all sorts of piecewise and case-based functions with threshold. We exploit this versatility to encode decision problems related to counting in some of our proofs.

We can now explain the semantics of rbns, which is given by interpretations of a domain D. An interpretation is a function μ mapping each relation symbol r ∈ R into a relation rμ ⊆ D^|r|, and each equality constraint α into its standard meaning αμ. An interpretation μ induces a mapping from a probability formula F with n free logical variables into a function Fμ(a) from D^n to [0, 1], as follows. If F = q, then Fμ is the constant function q. If F = r(X1, . . . , Xn), then Fμ(a) = 1 if a ∈ rμ and Fμ(a) = 0 otherwise. If F = F1 · F2 + (1 − F1) · F3, then Fμ(a) = F1μ(a)F2μ(a) + (1 − F1μ(a))F3μ(a). Finally, if F = comb{F1, . . . , Fk | Y1, . . . , Ym; α}, then Fμ(a) = comb(Q), where this latter comb is the corresponding combination function and Q is the multiset containing a number Fiμ(a, b) for every (a, b) ∈ αμ (even if Fi does not depend on every coordinate).

For example, consider a domain D = {a, b}. The probability formula F = max{0.1, 0.2, 0.3 | Y; Y = Y} is interpreted as max{0.1, 0.2, 0.3, 0.1, 0.2, 0.3}, and the probability formula F(X, Y) = max{q1, q2 | ∅; X ≠ Y} is interpreted as Fμ(a, a) = Fμ(b, b) = max{} = 0, and Fμ(a, b) = Fμ(b, a) = max{q1, q2}.

Finally: Given a finite domain D, an rbn with graph G = (V, A) is associated with a probability distribution over interpretations μ defined by

PD(μ) = ∏_{r∈V} PD(rμ | {sμ : s ∈ pa(r)}),

where

PD(rμ | {sμ : s ∈ pa(r)}) = ∏_{a∈rμ} Frμ(a) ∏_{a∉rμ} (1 − Frμ(a)).
We can now rephrase, in precise terms, the more informal explanation given before Example 1. In essence, an rbn is a template model that for each domain D generates a Bayesian network where each node r is a multidimensional random variable taking values rμ in the set of 2^{N^|r|} possible interpretations rμ, where N = |D|. Alternatively, we can associate with every relation symbol r ∈ R and tuple a ∈ D^|r| a random variable r(a) that takes value 1 when a ∈ rμ and 0 when a ∉ rμ. An rbn and a domain D produce an auxiliary Bayesian network over these random variables, where the parents of a node r(a) are all nodes s(a, b) such that s is a parent of r in the rbn and (a, b) ∈ D^|s|, and the conditional distribution associated with r(a) is given by interpretations of the probability formula Fr at a. This Bayesian network induces a joint distribution

P(∩ {r(a) = ρr(a)}) = ∏_{r∈V} ∏_{a∈D^|r|} P(r(a) = ρr(a) | pa(r(a)) = π) = PD(μ),

where a ∈ rμ if ρr(a) = 1 and a ∉ rμ if ρr(a) = 0, and π is a configuration consistent with those values. Following terminology from first-order logic, we often say that this Bayesian network is the "grounding" of the rbn on domain D. Often, it is possible to generate a smaller grounding, where the parents of r(a) are only those relevant to evaluate Fr(a) (which might not depend on all s(a, b), s ∈ pa(r), a, b ∈ D^|s|). Given this equivalence between an rbn endowed with a domain and an auxiliary Bayesian network, we refer to probabilities such as P(r(a) = 1) to denote P(a ∈ rμ) = ∑_{μ: a∈rμ} PD(μ), and similarly for more complex events (note that the dependency of the distribution on the domain is left implicit in the first case).

The next example illustrates the modeling power of different combination functions.
Example 2. Consider again the university domain in Example 1, and suppose that in addition to students and courses we also have teachers in the domain. At the end of the academic year students vote for the best teacher of the year. An award is then given to the most voted teacher. We use a binary relation taught to indicate whether a course was taught by a teacher in that year. This relation will usually be given as evidence, so its probability is not important; for simplicity we specify

Ftaught(X, Z) = 1/2.

We use a unary relation awarded to indicate whether someone is the recipient of the award. The probability that a teacher is awarded is linearly related to the proportion p of students that failed in at least one course he/she taught, with two regimes:

P(awarded(Z) = 1 | p) = 0.425 − 0.4p,  if p ≥ 0.7;
                        0.25 − 0.15p,  if p < 0.7.

We encode this as

Fawarded(Z) = T(Z) · [0.025 + 0.4 · (1 − F1(Z))] + (1 − T(Z)) · [0.1 + 0.15 · (1 − F1(Z))],

where T(Z) = threshold{5/7 · F1(Z)} denotes whether the proportion of failed students is at least 0.7, F1(Z) = mean{F2(Y, Z) | Y; Y = Y} is the proportion of students that failed some course taught by Z, and F2(Y, Z) is a formula indicating whether Y failed a course taught by Z:

F2(Y, Z) = max{(1 − approved(X, Y)) · taught(X, Z) | X; X = X}.

Since there can be exactly one winner of the award, we introduce a relation singleWinner of arity zero (i.e., a proposition), and specify:

FsingleWinner = max{awarded(Z) | Z; Z = Z} · threshold{γ · mean{1 − awarded(Z) | Z; Z = Z} | ∅},

where γ = (N − 1)/N and N = |D| is the size of the domain (the number of courses, students and teachers). The graph of the corresponding rbn is shown in Fig. 3. □

This example uses the max combination function to encode an existential quantifier; we will later resort to this trick in our proofs.

Fig. 3. Relational Bayesian network modeling the university domain in Example 2.
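As a sanity check on the encoding in Example 2 (a sketch of ours; the function names are invented), the following verifies with exact rationals that Fawarded reproduces the two-regime function P(awarded(Z) = 1 | p) for every p on a grid, with the threshold subformula switching regimes exactly at p = 0.7.

```python
from fractions import Fraction as Fr

def threshold(qs):
    # threshold{q1,...,qk} = 1 iff max{q1,...,qk} >= 1/2 (max{} = 0)
    return Fr(1) if max(qs, default=Fr(0)) >= Fr(1, 2) else Fr(0)

def f_awarded(p):
    # F_awarded with F1(Z) replaced by its value p (the failure proportion).
    t = threshold([Fr(5, 7) * p])      # t = 1  iff  p >= 7/10
    return t * (Fr(25, 1000) + Fr(4, 10) * (1 - p)) \
        + (1 - t) * (Fr(1, 10) + Fr(15, 100) * (1 - p))

def target(p):
    # The intended piecewise-linear function from the text.
    return Fr(425, 1000) - Fr(4, 10) * p if p >= Fr(7, 10) \
        else Fr(25, 100) - Fr(15, 100) * p

for k in range(101):
    p = Fr(k, 100)
    assert f_awarded(p) == target(p)
```

Note how the two affine pieces were rewritten so that their coefficients stay in [0, 1]: 0.025 + 0.4 · (1 − p) equals 0.425 − 0.4p, and 0.1 + 0.15 · (1 − p) equals 0.25 − 0.15p.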
3. The complexity of counting

We assume familiarity with standard complexity classes such as p, np, pspace and exp, and with the concept of oracle Turing machines [1,29]. As usual, we denote the class of languages decided by a Turing machine in complexity class c with oracle in complexity class o as c^o. We also assume that problems are reasonably encoded using a binary alphabet (say, 0 and 1). A decision problem consists in computing a 0/1-valued function on binary strings. We call an input of a decision problem a yes instance if its output is 1; otherwise we call it a no instance.

The polynomial hierarchy is the collection of complexity classes Σ_k^p, Π_k^p and Δ_k^p, for every nonnegative integer k. These classes are defined inductively by Δ_0^p = Σ_0^p = Π_0^p = p, Σ_k^p = np^{Σ_{k−1}^p}, Δ_k^p = p^{Σ_{k−1}^p}, and Π_k^p = coΣ_k^p. The complexity class ph contains all decision problems in the polynomial hierarchy: ph = ∪_k Σ_k^p (over all integers k). We thus have the following inclusions:

p ⊆ np, conp ⊆ p^np ⊆ Σ_2^p, Π_2^p ⊆ · · · ⊆ Δ_k^p ⊆ Σ_k^p, Π_k^p ⊆ · · · ⊆ ph.

A well-known result by Meyer and Stockmeyer [28] states that if Σ_k^p = Π_k^p for some k then ph = Σ_k^p. In particular, if p = np then ph = p. This suggests that the levels Σ_k^p of the hierarchy are distinct, and that using an oracle in a higher level brings in more computational power.

A probabilistic Turing machine is a standard non-deterministic Turing machine that accepts if at least half of the computation paths accept, and rejects otherwise. The complexity class pp contains decision problems which are accepted by a probabilistic Turing machine in polynomial time. The class pp is known to be closed under complement and union [4]. The class pexp is defined analogously, with polynomial-time replaced with exponential-time.

The counting hierarchy augments the polynomial hierarchy with oracle machines based on pp. It can be defined as the smallest collection of classes containing p and such that if c is in the collection then np^c, conp^c, and pp^c are also in the collection [34,36]. We define the class ch as the union over all nonnegative integers k of the classes C_k^p, where C_k^p = pp^{C_{k−1}^p} and C_0^p = Δ_0^p = p [36]. Toda's celebrated result shows that any problem in ph can be solved in polynomial time by a deterministic Turing machine with an oracle pp [32]. Hence, the class ch contains the entire polynomial hierarchy. In fact, we have the following hierarchy:

ph ⊆ p^pp ⊆ np^pp, conp^pp ⊆ pp^pp ⊆ C_2^p ⊆ · · · ⊆ ch ⊆ pspace ⊆ exp ⊆ pexp.

A many-one reduction from a problem A to a problem B is a polynomial-time function f that maps inputs of A into inputs of B, and such that x is a yes instance of A if and only if f(x) is a yes instance of B. For any complexity class c in the counting hierarchy, we say that a problem A is c-hard if for any problem B in c there is a many-one reduction from B to A. If A is also in c, then A is said to be c-complete.

4. Inferential complexity of relational Bayesian networks
Our goal in this paper is to examine the complexity of the decision variant of the following (single-query) inference problem:

Input: An rbn with graph (V, A), a domain D, a list of random variables r(a), r1(a1), . . . , rn(an), and a rational γ ∈ [0, 1].
Decision: Is P(r(a) = 1 | r1(a1) = 1 ∧ · · · ∧ rn(an) = 1) ≥ γ?

The conditioning event r1(a1) = 1 ∧ · · · ∧ rn(an) = 1 is called evidence, and also denoted as {r1(a1) = 1, . . . , rn(an) = 1} or as {r1(a1) = 1}, . . . , {rn(an) = 1}. We have assumed implicitly in the formulation of the problem above that P(r1(a1) = 1, . . . , rn(an) = 1) > 0 (i.e., that evidence has positive probability). A different strategy would be to define instances where this condition is not observed as no instances (or as yes instances). In this case, to prove membership in some complexity class we would have to first show how to decide P(r1(a1) = 1, . . . , rn(an) = 1) > 0 in that class. For simplicity, we assume that this is always true; it is possible however to show that the complexity of inference is not changed by this simplifying assumption.

If the domain is a singleton, then the rbn is actually a Bayesian network, since probability formulas can be efficiently converted into tables. Moreover, any Bayesian network can be polynomially encoded as an rbn with a singleton domain.
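The decision variant can be illustrated by brute force on the grounded network of Example 1 (a sketch of ours; the grounding and names are invented): enumerate the joint over the relevant groundings, condition on the evidence, and compare with γ.

```python
from itertools import product

# d = difficult(c1), c = committed(s1), a = approved(c1, s1);
# typing evidence fixes course(c1) = student(s1) = 1.
def joint(d, c, a):
    p_d = 0.3 if d else 0.7
    p_c = 0.6 if c else 0.4
    f_a = 0.8 * (0.6 * (1 - d) + 0.4 * c)
    return p_d * p_c * (f_a if a else 1 - f_a)

def decide(gamma):
    # Is P(approved(c1,s1)=1 | committed(s1)=1) >= gamma?
    num = sum(joint(d, 1, 1) for d in (0, 1))                        # query and evidence
    den = sum(joint(d, 1, a) for d, a in product((0, 1), repeat=2))  # evidence alone
    return num / den >= gamma

print(decide(0.5))   # True: the conditional probability is 0.656
print(decide(0.7))   # False
```

Of course, this exhaustive enumeration takes time exponential in the number of groundings; the point of the following sections is precisely to pin down how hard this decision problem is under various restrictions on the rbn.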
Hence, inference in rbns is pp-hard, as this is the case for Bayesian networks [8, Theorems 11.3 and 11.5].

For domains with many elements, complexity might depend on how the domain D is encoded. One possibility is to assume that the domain is some set {1, 2, . . . , N} specified by the integer N (but note that there is no ordering on the elements implied). The integer N can be encoded in different ways. One is this:

Assumption 3. The domain is {1, 2, . . . , N}, where N is encoded in binary notation.

Under the previous assumption, the encoding of the domain contributes O(log N) bits to the input size. Depending on the constructs allowed, this alternative can produce probabilities which require exponential effort to be written down, as they might need O(N) bits. To see this, consider an rbn with a proposition r such that Fr = noisy-or{1/2 | X; X = X}. Then P(r = 0) = 1 − [1 − ∏_{i=1}^{N} 1/2] = 2^{−N}.

Another possibility is to still assume that the domain is {1, . . . , N}, but to encode N using unary notation. In this case, the encoding of the domain contributes O(N) symbols to the input. Thus, this specification is equivalent (up to some constant factor) to encoding the domain by an explicit list with every element it contains. So we might adopt:

Assumption 4. The domain is specified as an explicit list of elements.

Another parameter that affects inferential complexity is the arity of relations. If we place no bound on the arity of relations, then an exponential effort might be required to write down probabilities. For example, consider an rbn with graph r → s where r is a k-ary relation with probability formula Fr(X1, . . . , Xk) = 1/2, and s is a proposition with probability formula Fs = noisy-or{r(X1, . . . , Xk) | X1, . . . , Xk; X1 = X1}; then the inference P(s = 0) requires an exponential number of bits, even when the domain contains as few as two elements. We will thus often adopt:

Assumption 5. The arity of relations is bounded.
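The blowup behind Assumption 3 is easy to reproduce (a sketch of ours): with Fr = noisy-or{1/2 | X; X = X} over domain {1, . . . , N}, exact arithmetic shows P(r = 0) = 2^{−N}, whose binary representation needs about N bits while the binary encoding of N needs only about log2 N bits.

```python
from fractions import Fraction as Fr

def p_r_zero(N):
    # noisy-or{q1,...,qk} = 1 - prod_i (1 - qi); grounding the combination
    # expression over {1,...,N} yields k = N copies of 1/2, so
    # P(r = 1) = 1 - (1/2)^N and P(r = 0) = (1/2)^N.
    prod = Fr(1)
    for _ in range(N):
        prod *= 1 - Fr(1, 2)
    return prod

for N in (1, 8, 64):
    p = p_r_zero(N)
    assert p == Fr(1, 2**N)
    # writing p takes ~N bits, while writing N in binary takes ~log2(N)+1 bits
    assert p.denominator.bit_length() == N + 1
```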
Without further restrictions, the complexity of inference is dominated by the complexity of evaluating combination expressions. For instance, if the combination function in a combination expression is an arbitrary Turing machine (with no bounds on time and space), then inference is undecidable. Thus, we need to constrain the use of combination functions in probability formulas to make the study of inferential complexity more interesting.
4.1. rbns without combination functions

We start our analysis with the simplest case, where no combination function is available. That is, we adopt:

Assumption 6. Every probability formula is either a rational number, or an atom, or a convex combination of probability formulas.

Under the assumption above, every logical variable in a probability formula is free; also, in the acyclic directed graph of relations of an rbn no node can have an arity larger than any of its children.

If the domain is given as an explicit list of elements (or by an integer in unary notation), then we can "ground" an rbn into an auxiliary Bayesian network, and run inference there (consider Example 1). This process generates a polynomially-large Bayesian network if the relations have bounded arity (if necessary, we can insert auxiliary nodes so as to bound the number of parents of any given node). Because inference in Bayesian networks is pp-complete, we have that:

Theorem 7. Inference is pp-complete when the domain is specified as a list, the relations have bounded arity and there is no combination function.
Interestingly, the complexity of inference remains the same even if the domain is specified by its size in binary notation (note that in this case the auxiliary Bayesian network has a size that is exponential in the input):

Theorem 8. Inference is pp-complete when the domain is specified by its size in binary, the relations have bounded arity and there is no combination function.

Proof. Hardness follows from the fact that an rbn can efficiently encode any propositional Bayesian network using only convex combinations of numbers and atoms (as in Example 1). To prove membership, consider the auxiliary Bayesian network that is generated by "grounding" a given rbn. This network may be exponentially large on the input, when the domain is given in binary notation. We show that we only need to generate a polynomially-large fragment of the auxiliary Bayesian network to decide whether P(r(a) = 1 | r1(a1) = 1 ∧ · · · ∧ rn(an) = 1) ≥ γ. Note that to compute that conditional probability we only need the ancestors of the query and evidence nodes. Suppose that Fr is a convex combination; then it may recursively refer to several atoms that are parents of r. These parents may in turn have several parents, and so on. Since there are no combination functions, we can always rename logical variables such that the probability formulas of every ancestor of r contain only logical variables in the probability formula Fr. Thus, if r has arity k, then there can be at most 2^k ancestors of r(a), as this is the number of subsets of the logical variables in Fr. Since we assumed k is bounded by a constant, this number of ancestors is also bounded by a constant. Thus, we can build a Bayesian network consisting of these nodes and the edges between them, together with the associated probability values. Inference must then be run in this sub-network, and this can be done by a polynomial-time probabilistic Turing machine [8, Theorems 11.3 and 11.5]. □
According to Theorem 8, the complexity of inference in combination-function-free rbns matches the complexity of inference in Bayesian networks (if relation arity is bounded). That is, even though an rbn may represent exponentially large Bayesian networks, only a polynomial number of nodes of this network is relevant for inference (and these nodes can be collected in polynomial time).
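The key observation can be sketched in code (the encoding is ours, using the relations of Example 1): in a combination-function-free rbn, every atom in Fr uses a sub-tuple of r's logical variables, so the groundings requisite to a query can be collected directly, without ever building the full grounding, and their number does not depend on the domain size.

```python
# parents[r] = list of (parent relation s, positions of r's argument tuple
# that s uses); this mirrors the atoms occurring in F_r after renaming.
parents = {
    'approved': [('difficult', (0,)), ('committed', (1,)),
                 ('course', (0,)), ('student', (1,))],
    'difficult': [('course', (0,))],
    'committed': [('student', (0,))],
    'course': [], 'student': [],
}

def requisite(r, a, seen=None):
    # Collect all ancestor groundings of r(a) by projecting a onto the
    # argument positions used by each parent atom.
    seen = set() if seen is None else seen
    if (r, a) not in seen:
        seen.add((r, a))
        for s, pos in parents[r]:
            requisite(s, tuple(a[i] for i in pos), seen)
    return seen

# Only 5 groundings matter for approved(c1, s1), whatever the domain size:
print(sorted(requisite('approved', ('c1', 's1'))))
```

Growing the domain adds groundings to the full auxiliary network but never to this requisite set, which is what makes the pp membership argument of Theorem 8 go through.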
4.2. rbns with max combination functions

We now consider rbns where information is only aggregated by maximization. That is, we assume:

Assumption 9. The only combination function used is max.

As we have seen, without combination functions, inference in rbns has the same computational power as inference in Bayesian networks, even when the domain is given by an integer in binary notation. This is not the case when we allow combination functions, even one as simple as maximization:

Theorem 10. Inference is pexp-complete when the domain is specified by its size in binary, the relations have bounded arity and the only combination function is max.
Proof. Membership follows since an rbn satisfying the above assumptions can be "grounded" into an exponentially large Bayesian network. To see this, note that if we have a domain of size 2^m, and a bound of arity k, then each relation produces at most (2^m)^k = 2^{km} (grounded) random variables. Inference can be decided in that network using a probabilistic Turing machine with an exponential bound on time, using the same scheme normally employed to decide an inference for a Bayesian network [8].

To show hardness, we simulate an exponential-time probabilistic Turing machine using an rbn such that we can "count" the number of accepting paths by a single inference. We do so by resorting to a standard Turing machine encoding described by Grädel [18, Theorem 3.2.4]. There are other possible encodings we might use, but this one is quite compact and relatively easy to grasp.
So, take a Turing machine that can decide a pexp-complete language. That is, take a pexp-complete language L and a nondeterministic Turing machine such that an input string is in L if and only if the machine halts within time 2^p with half or more of the computation paths accepting, where n is the length of the input and p denotes a polynomial in n.¹ Denote by σ a symbol in the alphabet of the machine (which includes blank and boundary symbols in addition to 0s and 1s), and by q a state. As usual, we describe a configuration of the machine by a string

σ1 σ2 · · · σi−1 (q σi) σi+1 · · · σ2^p,

where each σj is a symbol in the tape, and (q σi) indicates that the machine is at state q and its head is at cell i. The initial configuration is (q0 σ01) σ02 · · · σ0n followed by 2^p − n blanks, where q0 is the initial state. Let qa and qr be, respectively, states that indicate acceptance or rejection of the input string σ01 · · · σ0n. The transition function δ of the machine takes a pair (q, σ) consisting of a state and a symbol in the tape, and returns triples (q′, σ′, m): the next state q′, the symbol σ′ to be written in the tape (we assume that a blank is never written by the machine), and an integer m in {−1, 0, 1}. Here −1 means that the head is to move left, 0 means that the head is to stay in the current cell, and 1 means that the head is to move right. We make the important assumption that, if qa or qr appear in some configuration, then the configuration is not modified anymore (the transition function goes from this configuration to itself from that point on). This is necessary to guarantee that the number of accepting computations is equal to the number of ways in which the machine can run from the input.

¹ In the usual definition, a probabilistic Turing machine accepts when the majority of computation paths accept; however, given such a machine we can always build a machine that decides the same language, and such that the machine accepts if and only if at least half of the computation paths accept.

Grädel's encoding uses quantified Boolean formulas to represent the Turing machine. As shown by Jaeger [20, Lemma 2.4], probability formulas can encode conjunction and negation, respectively, by multiplication and inversion (and then disjunction and implication are easily obtained). Logical equalities such as X = Y are obtained by the formula F(X, Y) = max{1 | ∅; X = Y}, and quantified formulas such as ∃Y ψ(X, Y) are obtained as F(X) = max{Fψ(X, Y) | Y; Y = Y}, where Fψ(X, Y) encodes ψ(X, Y). Thus we can easily reproduce Grädel's logical encoding of the machine by writing logical formulas as probability formulas. For example, consider a relation less (meaning "less than"), used to order the elements of the domain, and consequently represent the ordering on the computation steps and on the positions of the tape. Since the elements of the domain are ordered, we can interpret them as integers. Define Fless(X, Y) = 1/2. We must impose on less the axioms of linear orderings. To constrain less to be irreflexive, we introduce a proposition irref with probability formula

Firref = 1 − max{less(X, X) | X; X = X}.
We have that Fμirref = 1 if and only if lessμ is irreflexive. Note that the above probability formula simply encodes the quantified Boolean formula ¬∃X less(X, X) in the language of rbns. Similarly, we enforce transitivity by a relation trans with probability formula

Ftrans = 1 − max{less(X, Y) · less(Y, Z) · (1 − less(X, Z)) | X, Y, Z; X = X}.
To understand this expression, note that less(a, b) ∧ less(b, c) → less(a, c) holds if and only if 1 − [less(a, b) · less(b, c) · (1 − less(a, c))] = 1. Finally, we impose trichotomy (that is, if X ≠ Y then less(X, Y) ∨ less(Y, X)) by a relation trich with probability formula

Ftrich = 1 − max{(1 − less(X, Y)) · (1 − less(Y, X)) | X, Y; X ≠ Y}.
Using the relation less we can encode a relation first indicating the smallest element of the domain:

Ffirst(X) = 1 − max{less(Y, X) | Y; Y = Y}.
We can also encode the relation successor(X, Y), which is true if and only if Y is the smallest element larger than X:

Fsuccessor(X, Y) = less(X, Y) · (1 − max{less(X, Z) · less(Z, Y) | Z; Z = Z}).

While probability formulas can encode any quantified Boolean formula using only max as combination function, they are less readable than their logical counterparts. So for the rest of this proof, we will use logical expressions instead of probability formulas.
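The logical encodings above can be checked on a concrete interpretation (a sketch of ours): taking less to be the usual < on {0, 1, 2}, products play the role of conjunction, 1 − x of negation, and max over a bound variable of an existential quantifier; the ordering formulas then evaluate to 1 exactly as intended.

```python
from itertools import product

D = [0, 1, 2]
less = lambda x, y: int(x < y)   # a fixed 0/1 interpretation of `less`

# F_irref, F_trans, F_trich as 0/1-valued expressions over the domain.
f_irref = 1 - max(less(x, x) for x in D)
f_trans = 1 - max(less(x, y) * less(y, z) * (1 - less(x, z))
                  for x, y, z in product(D, repeat=3))
f_trich = 1 - max((1 - less(x, y)) * (1 - less(y, x))
                  for x, y in product(D, repeat=2) if x != y)

# F_first and F_successor, tabulated for every tuple of the domain.
f_first = {x: 1 - max(less(y, x) for y in D) for x in D}
f_succ = {(x, y): less(x, y) * (1 - max(less(x, z) * less(z, y) for z in D))
          for x, y in product(D, repeat=2)}

assert f_irref == f_trans == f_trich == 1   # `<` satisfies all three axioms
assert f_first == {0: 1, 1: 0, 2: 0}        # 0 is the smallest element
assert f_succ[(0, 1)] == 1 and f_succ[(0, 2)] == 0
```

An interpretation of less that violated an axiom (say, a reflexive pair) would drive the corresponding formula to 0, which is precisely how the proof uses evidence on irref, trans and trich to rule such interpretations out.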
Introduce a unary relation stateq for each state q and define Fstateq(X) = 1/2. Atom stateq(X) denotes that the machine is in state q at computation step X. Also, introduce a binary relation symbolσ for each symbol σ in the alphabet, and specify Fsymbolσ(X, Y) = 1/2. Atom symbolσ(X, Y) denotes that σ is written in the Yth position of the tape in the Xth computation step. Finally, introduce a binary relation head with Fhead(X, Y) = 1/2, meaning that the machine head is in the Yth position of the tape in the Xth computation step. We must guarantee that at any computation step the machine is in a single state, each tape position has a single symbol, and the head is at a single tape position. We do so by introducing a proposition r and a probability formula Fr that encodes the logical expression:

∀X: [⋁_q (stateq(X) ∧ ⋀_{q′≠q} ¬stateq′(X))]
  ∧ [∀Y ⋁_σ (symbolσ(X, Y) ∧ ⋀_{σ′≠σ} ¬symbolσ′(X, Y))]
  ∧ [∃Y (head(X, Y) ∧ ∀Z ((Z ≠ Y) → ¬head(X, Z)))].
We introduce a proposition start with a probability formula Fstart imposing the initial configuration of the machine, and a proposition compute with a probability formula Fcompute encoding the transitions of the machine. The formula Fstart is simple: it fixes the first element using the construction ∀X [first(X) → stateq0(X) ∧ head(X, X)]. The formula Fcompute is more elaborate; we only sketch the construction by Grädel. First, Fcompute encodes the conjunction of two logical formulas, one affecting the tape positions that do not change, and the other actually encoding the transitions. The first formula is

∀X, Y, Z, W: ⋀_σ [symbolσ(X, Y) ∧ (Z ≠ Y) ∧ head(X, Z) ∧ successor(X, W) → symbolσ(W, Y)].
The second formula is

∀X, Y, W: ⋀_{q,σ} [stateq(X) ∧ head(X, Y) ∧ symbolσ(X, Y) ∧ successor(X, W)
  → ⋁_{(q′,σ′,m)∈δ(q,σ)} (stateq′(W) ∧ symbolσ′(W, Y) ∧ HEADMOVE)],

where HEADMOVE is the formula ∃Z [(Z = Y + m) ∧ head(W, Z)], and Z = Y + m is syntactic sugar for a construction that can be accomplished with the successor relation. We are still to impose that the constraints we included are in fact enforced by any interpretation. We do so by using evidence {irref = 1}, {trans = 1}, {trich = 1}. We also need to impose that the input is represented on the tape. So use