• Nenhum resultado encontrado

Learning Probabilistic Sentential Decision Diagrams by Sampling

N/A
N/A
Protected

Academic year: 2022

Share "Learning Probabilistic Sentential Decision Diagrams by Sampling"

Copied!
42
0
0

Texto

(1)

Learning Probabilistic Sentential Decision Diagrams by Sampling

KDMiLe 2020

Renato Geh1 Denis Mauá1 Alessandro Antonucci2

1Institute of Mathematics and Statistics, University of São Paulo, Brazil {renatolg,ddm}@ime.usp.br

2Istituto Dalle Molle di Studi sull’Intelligenza Artificiale, Switzerland [email protected]

(2)

Background

(3)

Probabilistic Sentential Decision Diagrams (PSDDs)

vtree

B A D C

D : θ

D

C ¬B

B A ¬B ⊥ B ¬A D C ¬D ⊥

0.35 0.4 0.25

1.0 0.0 0.0 1.0 0.3 0.7

Kisa et al. 2014

(4)

vtree

B A D C

D : θ

D

C ¬B

B A ¬B ⊥ B ¬A D C ¬D ⊥

OR gate

AND gate

0.35 0.4 0.25

1.0 0.0 0.0 1.0 0.3 0.7

Leaves are eitherliterals,constants orBernoulli distributions.

(5)

vtree

B A D C

Elements...

D : θ

D

C ¬B

B A ¬B ⊥ B ¬A D C ¬D ⊥

0.35 0.4 0.25

1.0 0.0 0.0 1.0 0.3 0.7

(A∧B)

| {z }

prime

∧(D∨ ¬D)

| {z }

sub

(6)

vtree

B A D C

Partition...

D : θ

D

C ¬B

B A ¬B ⊥ B ¬A D C ¬D ⊥

0.35 0.4 0.25

1.0 0.0 0.0 1.0 0.3 0.7

(A∧B)

| {z }

e1

∨((¬A∧B)∧C)

| {z }

e2

∨((C∧D)∧ ¬B)

| {z }

e3

(7)

Related Works

(8)

LearnPSDD

Given. A vtree and a circuit.

Idea. Learn a PSDD as an expansion of initial circuit.

How?

1. Recursively apply small changes;

2. Evaluate score;

3. Greedily accept changes.

Score(S,S0|D) = logp(S0|D)−logp(S|D)

|S0| − |S|

What are these “small” changes?

Liang, Bekker, and G. V. d. Broeck 2017

(9)

Splitan element...

α

β γ

A ¬A

Split onA

α

βA

βA γ

A ¬A

(10)

Clonea partition...

α

Clone

α α

(11)

Strudel

Given. A Chow-Liu Tree (CLT).

Idea. Learn a PSDD and its vtree.

How?

1. Learn a CLT.

2. Extract a vtree from CLT.

3. Compile CLT into circuit.

4. “Grow” circuit with Split.

Dang, Vergari, and G. v. d. Broeck 2020

(12)

Dang, Vergari, and G. v. d. Broeck 2020

(13)

Learning by Sampling

(14)

Monte-Carlo Structure Learning

Previous attemptsgrewor compiledexisting models.

Instead, we want tobuild a circuit solely from knowledge.

Our approach. Start off with a logic formula:

φ(A,B,C,D,E)=(B∧C∧((A∧D∧¬E)∨(¬A∧E)))∨(¬A∧((¬C∧D∧¬E)∨¬B))

Recursively decomposeφ top-down through subsequent partitions.

(15)

vtree

A B C D E

How to decompose formula into elements...

(B∧C∧((A∧D∧¬E)∨(¬A∧E)))∨(¬A∧((¬C∧D∧¬E)∨¬B))

ABC D∧ ¬E

AB∧ ¬C

A∧ ¬B

¬ABC E

¬AB∧ ¬C D∧ ¬E

A∧ ¬B

>

...while ensuring primes are exhaustive and mutually exclusive?

(16)

Given a prime ordering, such asO= (B, C, A), return a set of primes.

B

¬B

BC

B¬C

¬BC

¬B¬C

BCA BC¬A

¬BCA

¬BC¬A B¬CA B¬C¬A

¬B¬CA

¬B¬C¬A

(17)

Given a prime ordering, such asO= (B, C, A), return a set of primes.

B

¬B

BC

B¬C

¬BC

¬B¬C

BCA BC¬A

¬BCA

¬BC¬A B¬CA B¬C¬A

¬B¬CA

¬B¬C¬A

(18)

Given a prime ordering, such asO= (B, C, A), return a set of primes.

B

¬B

BC

B¬C

¬BC

¬B¬C

BCA BC¬A

¬BCA

¬BC¬A B¬CA B¬C¬A

¬B¬CA

¬B¬C¬A

(19)

Given a prime ordering, such asO= (B, C, A), return a set of primes.

B

¬B

BC

B¬C

¬BC

¬B¬C

BCA BC¬A

¬BCA

¬BC¬A B¬CA B¬C¬A

¬B¬CA

¬B¬C¬A

(20)

Given a prime ordering, such asO= (B, C, A), return a set of primes.

B

¬B

BC

B¬C

¬BC

¬B¬C

BCA BC¬A

¬BCA

¬BC¬A B¬CA B¬C¬A

¬B¬CA

¬B¬C¬A

This gives an exponential number of elements!

(21)

Instead, let’s “forget” a variable when it does not “matter”, i.e.

when

φ|x=φ|¬x.

B

¬B

B

¬BC

¬B¬C

BA

B¬A

¬BCA

¬BC¬A

¬B¬CA

¬B¬C¬A

(22)

Instead, let’s “forget” a variable when it does not “matter”, i.e.

when

φ|x=φ|¬x.

B

¬B

B

¬BC

¬B¬C

BA

B¬A

¬BCA

¬BC¬A

¬B¬CA

¬B¬C¬A

(23)

Instead, let’s “forget” a variable when it does not “matter”, i.e.

when

φ|x=φ|¬x.

B

¬B

B

¬BC

¬B¬C

BA

B¬A

¬BCA

¬BC¬A

¬B¬CA

¬B¬C¬A

(24)

Instead, let’s “forget” a variable when it does not “matter”, i.e.

when

φ|x=φ|¬x.

B

¬B

B

¬BC

¬B¬C

BA

B¬A

¬BCA

¬BC¬A

¬B¬CA

¬B¬C¬A

(25)

Instead, let’s “forget” a variable when it does not “matter”, i.e.

when

φ|x=φ|¬x.

B

¬B

B

¬BC

¬B¬C

BA

B¬A

¬BCA

¬BC¬A

¬B¬CA

¬B¬C¬A

We avoid computing an exponential number of primes!

(26)

In other words...

Given.

I A random prime ordering O={X1, . . . , Xn};

I A formula φ.

Generate a set of primes{pi}ni=1, and then subs si=φ|pi. But what ifφ≡ >or ∀Xi∈O,Xi 6∈φ?

(27)

Then we have even more freedom! Stochastically marginalize with some probabilityp.

B

¬B

B

¬BC

¬B¬C

BA

B¬A

¬BCA

¬BC¬A

¬B¬CA

¬B¬C¬A

We can marginalize any variable we want, since any operation onφ withXi is idempotent.

(28)

Then we have even more freedom! Stochastically marginalize with some probabilityp.

C

¬C

CA

C¬A

¬CA

¬C¬A

We can marginalize any variable we want, since any operation onφ withXi is idempotent.

(29)

(B∧C∧((A∧D∧¬E)∨(¬A∧E)))∨(¬A∧((¬C∧D∧¬E)∨¬B))

ABC D∧ ¬E

AB∧ ¬C

A∧ ¬B

¬ABC E

¬AB∧ ¬C D∧ ¬E

A∧ ¬B

>

(30)

(B∧C∧((A∧D∧¬E)∨(¬A∧E)))∨(¬A∧((¬C∧D∧¬E)∨¬B))

ABC D∧ ¬E

¬ABC E

¬AB∧ ¬C D∧ ¬E

A∧ ¬B

>

(31)

Compression

e1 e2 e3

α β γ β ... ...

Compress

e1 e2

α∨γ β ... ... Compress elements(p1, s) and(p2, s)by replacing them with (p1∨p2, s).

Compressiongenerates different distributions consistent with formula.

(32)

(B∧C∧((A∧D∧¬E)∨(¬A∧E)))∨(¬A∧((¬C∧D∧¬E)∨¬B))

ABC D∧ ¬E

¬AB∧ ¬C D∧ ¬E

¬ABC E

A∧ ¬B

>

(33)

(B∧C∧((A∧D∧¬E)∨(¬A∧E)))∨(¬A∧((¬C∧D∧¬E)∨¬B))

(¬AB∧ ¬C)(¬AB∧ ¬C) D∧ ¬E

¬ABC E

A∧ ¬B

>

(34)

If there arencompressable elements...

1 2 · · · n

α1 β α2 β αn β ... ...

We can generatePn k=1

n k

different circuits.

(35)

If there arencompressable elements...

1 2 · · · n

α1 β α2 β αn β ... ...

Uniformly sample a circuit from Pascal Triangle’sk-th row.

(36)

If there arencompressable elements...

1 2 · · · n

α1 β α2 β αn β ... ...

1 2

α1∨α2∨. . .∨αn

β ... ...

But how to efficiently compute potentially complex disjunctions?

(37)

Binary Decision Diagrams (BDDs)

A

B B

C

>

Operation Description Complexity Reduce canonical form ofφ O(n·logn) Apply φ1φ2 O(n1·n2) Restrict φ|x O(n·logn) Forget φ|xφ|¬x O(n2)

φ(A, B, C) = (A∨ ¬B)∧(¬B∨C)

(38)

Results

(39)

Likelihood

Data #vars Worst Best Average LSPN-Opt LSPN-CV CLT CNet 1 10 -449.89 -413.51 -415.60 -422.37 -444.62 -456.49 -585.23 2 15 -745.20 -693.51 -695.82 -695.09 -739.49 -803.41 -1070.29 3 20 -1024.01 -969.51 -971.80 -959.03 -1003.93 -1075.31 -1855.05 4 25 -1268.82 -1208.22 -1210.27 -1185.28 -1254.47 -1290.94 -2033.58 5 30 -1548.92 -1440.27 -1442.58 -1441.90 -1543.35 -1535.54 -2048.16 6 100 -5169.14 -4995.53 -4997.83 -4958.06 -5232.04 -5712.73 -10326.73 7 100 -5329.17 -5153.51 -5155.82 -4900.65 -5206.54 -5710.64 -9948.88 8 14 -472.15 -423.32 -425.62 -490.21 -506.62 -486.06 -601.31

Data Logic formula

1 φ1= (X1X3)(X4∧ ¬X2)(X5∧ ¬X10)

2 φ2= (X1X3)(X4∧ ¬X2)(X5∧ ¬X10)(X12∧ ¬X13X15∧ ¬X14) 3 φ3= (X1X3)(X4∧ ¬X2)(X5∧ ¬X10)(X12∧ ¬X13X15∧ ¬X14) 4 φ4= (X1X3)(X4∧ ¬X2)(X5∧ ¬X10)(X12∧ ¬X13X15∧ ¬X14) 5 φ5= (X2X30)(¬X15∨ ¬X10)(¬X25X5X15∨ ¬X1)(X1X15∨ ¬X30)

6 φ6= (X10∨X30)(¬X1X5)(¬X10X14X23)∧(X2∨ ¬X27X35)(X98∨ ¬X78∨ ¬X27X8) 7 φ7=>

8 φ8=d0d1d2d3d4d5d6d7d8d9

(40)

Thank You!

Questions?

(41)

References

(42)

References I

Dang, Meihua, Antonio Vergari, and Guy van den Broeck (2020).

“Strudel: Learning Structured-Decomposable Probabilistic Circuits”.In: Proceedings of the Tenth International Conference on Probabilistic Graphical Models.

Kisa, Doga et al. (2014).“Probabilistic Sentential Decision Diagrams”.In: Knowledge Representation and Reasoning Conference. URL: https://www.aaai.org/ocs/index.php/

KR/KR14/paper/view/8005.

Liang, Yitao, Jessa Bekker, and Guy Van den Broeck (2017).

“Learning the Structure of Probabilistic Sentential Decision Diagrams”.In: Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, UAI 2017, Sydney, Australia, August 11-15, 2017. Ed. by Gal Elidan, Kristian Kersting, and Alexander T. Ihler. AUAI Press. URL:

http://auai.org/uai2017/proceedings/papers/291.pdf.

Referências

Documentos relacionados

Caso o rompimento do relacionamento não seja consensual, deverá aquele cônjuge que deixou o imóvel buscar medida efetiva a fim de assegurar seus direitos sobre o bem, devendo

These proceedings include papers presented at the International Conference on “Research and Advanced Technology in Fire Safety” FireSafety 2017 which took place

Neste sentido, para contextualizarmos e compreendermos o funcionamento normativo-legal dos estabelecimentos do ensino secundário de STP e também conhecer as atividades e as

The UWPE measurements from both samples, that is, standard Escherichia coli culture samples and river’s water samples, using the implemented measurement system showed

Idosos que praticam atividades físicas, que não possuem doenças, que estão inseridos no convívio social, irão apresentar autonomia boa para realizarem suas tarefas

Mas, no caso particular da família de escravos (o que, no Brasil, inclui questões da ordem étnico-racial), filhos e mulheres não faziam do homem seu chefe, pois sua condição

Therefore, we tried to find a set of models capable of including all interactive films: (a) the arborescent model, based on a simple one-off choice made at certain moments in

No cruzamento das classes com a categoria de resposta se tem filhos verificamos que dentro das classes de economias e poupanças, de hábitos de consumo e de lazer se regista um