Consistency Checking - Revision of Boolean Logical Models of Biological Regulatory Networks usi

4.2.2 Encoding Stable State Consistency

In stable state consistency, a compound is consistent if it is able to reproduce its observed value, given the observed values of its regulators. If all compounds can reproduce their observed value in this fashion, then the model is consistent. In this subsection, we will present how we have encoded the verification of stable state consistency.

In order to determine if a compound is active in an experiment, we need to know if the implicants (terms) that define its function are active or not. If even a single implicant is active, the entire function will evaluate to true (since it is written in the BCF form, which is a special case of the disjunctive normal form), and so the compound will be active. On the other hand, if no implicants are active, then the function evaluates to false, which means the compound is inactive. We will now take a look at how we define the implicant_inactive predicate.

implicant_inactive(E,C,I_NO) :- function(C,I), term(C,I_NO,R), regulates(R,C,0), curated_observation(E,R,0).

implicant_inactive(E,C,I_NO) :- function(C,I), term(C,I_NO,R), regulates(R,C,1), curated_observation(E,R,1).

To know whether an implicant I_NO of a compound C's function is inactive in some experiment E, we need to look at the observations of the regulating compounds R that belong to that implicant. In the first rule, if we have an observation stating that regulator R is inactive, and if R is an activator, then the implicant R belongs to will evaluate to 0 (i.e., be inactive). A similar process occurs in the second rule, but for the cases where R is an inhibitor and the observation states that it is active. Note that we keep track of which implicants are inactive per experiment, since each experiment is its own self-contained environment. One question that may arise upon seeing this predicate is whether we could have instead defined an implicant_active predicate. That would also be a viable strategy, albeit less simple to implement. To say that an implicant is inactive, it suffices to verify that a single regulator makes it evaluate to 0 (an inactive activator or an active inhibitor). However, to say that an implicant is active, we would need to verify that every regulator in that implicant contributes a 1. This is entirely possible, but it would require additional rules. As such, the simpler of the two alternatives was used.
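To make the reading of these two rules concrete, the following Python sketch mirrors them on a toy example. It is not part of the encoding: the dictionaries SIGN and TERMS and the function name are hypothetical stand-ins for regulates/3 and term/3, and the data is made up for illustration.

```python
# Hypothetical toy data; the names only mirror the ASP predicates.
# SIGN[(R, C)] is 0 if R activates C and 1 if R inhibits C (cf. regulates/3).
SIGN = {("a", "c"): 0, ("b", "c"): 1}
# TERMS[C][I_NO] lists the regulators of implicant I_NO of C (cf. term/3).
TERMS = {"c": {1: ["a", "b"]}}


def implicant_inactive(obs, compound, i_no):
    """Return True if one regulator alone makes the implicant evaluate to 0.

    `obs` maps each regulator to its observed value (0 or 1) in one experiment.
    An activator (sign 0) observed at 0, or an inhibitor (sign 1) observed
    at 1, is enough, so the check reduces to `obs[r] == sign`.
    """
    return any(obs[r] == SIGN[(r, compound)] for r in TERMS[compound][i_no])
```

Here the single implicant of c is "a and not b", so it is inactive as soon as a is observed at 0 or b is observed at 1. A single witness suffices, which is the reason given above for preferring implicant_inactive over implicant_active.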

Depending on whether every implicant is inactive, we are able to determine if a function is active or inactive. If at least one implicant is active, then the function evaluates to true, which means that the compound whose behavior that function describes is active.

To clarify, active compounds in the computational model may not necessarily be those that have a corresponding positive observation in the real model (i.e., an observation in which they are active). While the observations tell us what was observed in the real system, what we want to verify is if our model is capable of replicating them or not. That is essentially what this next rule will help us do.

4 . 2 . C O N S I S T E N C Y C H E C K I N G

active(E,C) :- function(C,I), not implicant_inactive(E,C,I_NO), experiment(E), term(C,I_NO,_).

The above rule is saying that, if for experiment E, compound C has some implicant I_NO which is not inactive, then C is active for experiment E.

After determining that a compound is active in an experiment, we can then contrast that information with the observations that we have of that experiment for that compound and see if they match up. If, given the observed states of its regulators, we determine that a compound should be active, but in reality the compound was observed to have been inactive, then that means we have an inconsistency. Likewise, if the compound is inactive in our model, but was observed as being active in the real-world system, we also have an inconsistency. This is encoded in the next two rules.

inconsistent(E,C,1) :- active(E,C), curated_observation(E,C,0).

inconsistent(E,C,0) :- not active(E,C), curated_observation(E,C,1).

The inconsistent predicate has three arguments: E, the experiment where the inconsistency took place, C, the inconsistent compound, and a third argument which is the state that the model computes for the compound and that contradicts the observation (0 if the model computes the compound as inactive when it should have been active, and 1 if the model computes it as active when it should have been inactive). The third argument's main purpose is simply to aid in the identification of the type of inconsistency that was found.
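The chain from implicants to active compounds to inconsistencies can be sketched in Python as follows. This is only an illustration under our own assumptions: observations are complete, a function is a list of implicants (each a list of regulators), and all names and data are hypothetical rather than taken from the actual implementation.

```python
def implicant_inactive(obs, sign, target, regulators):
    # One inactive activator (sign 0, value 0) or one active inhibitor
    # (sign 1, value 1) is enough to switch the implicant off.
    return any(obs[r] == sign[(r, target)] for r in regulators)


def active(obs, sign, terms, target):
    # A compound is active if at least one of its implicants is not inactive.
    return any(
        not implicant_inactive(obs, sign, target, regs)
        for regs in terms.get(target, [])
    )


def inconsistencies(obs, sign, terms):
    # Collect (compound, model_state) pairs where the state computed by the
    # model disagrees with the observed one; model_state plays the role of
    # the third argument of inconsistent/3.
    found = []
    for compound in sorted(terms):
        model_state = int(active(obs, sign, terms, compound))
        if model_state != obs[compound]:
            found.append((compound, model_state))
    return found


# Hypothetical toy function: c = (a AND NOT b) OR d
SIGN = {("a", "c"): 0, ("b", "c"): 1, ("d", "c"): 0}
TERMS = {"c": [["a", "b"], ["d"]]}
```

With a = 1, b = 1, d = 0, both implicants of c are inactive, so the model computes c as inactive. If c was observed as active, the sketch reports the pair ("c", 0), which corresponds to an atom inconsistent(E,c,0).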

Yet, some compounds require special treatment. When we are dealing with input compounds (input nodes, as we have seen in Subsection 2.2.2.1), such compounds can take any value, but this value must not change over time. The way we logically represent the functions of these compounds is by defining that they are functions with a single term, and that term contains only one regulator, which is the compound itself, as an activator. In this manner, the compound will always maintain its value, once that value is determined. The rule used to identify these input compounds is as follows:

input_compound(C) :- compound(C), function(C,1), regulates(C,C,0), #count{R : term(C,1,R)} = 1.

What this rule is saying is that if we have a compound C which has a function with a single term, that compound regulates itself as an activator, and the one term of the function contains only one regulator (which can only be the compound itself, since we already know that it regulates itself and the function has only a single regulator), then the compound has to be an input compound.
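The same test can be written as a small Python predicate. This is a sketch only: the representation (a function as a list of implicants, a sign table keyed by regulator and target) and the names are our own assumptions for illustration.

```python
def is_input_compound(compound, terms, sign):
    """Single term, single regulator, and that regulator is the compound
    itself acting as an activator (sign 0)."""
    function = terms.get(compound, [])
    return (
        len(function) == 1
        and set(function[0]) == {compound}
        and sign.get((compound, compound)) == 0
    )


# Hypothetical toy model: only "i" satisfies all three conditions.
TERMS = {
    "i": [["i"]],         # self-activation only: input compound
    "c": [["i", "c"]],    # one term, but two regulators
    "d": [["i"], ["d"]],  # two terms
    "n": [["n"]],         # self-regulation, but as an inhibitor
}
SIGN = {
    ("i", "i"): 0, ("i", "c"): 0, ("c", "c"): 0,
    ("i", "d"): 0, ("d", "d"): 0, ("n", "n"): 1,
}
```

Each of the three rejected compounds fails exactly one of the conditions in the rule body, which shows why all of them are needed.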

Then, in order to determine if these input compounds are active or not, we use this rule (recall what we have seen in Subsection 2.1.4.2):

0 {active(E,C)} 1 :- experiment(E), input_compound(C).

Lastly, we should address the cases where the given sets of observations are incomplete. When this happens, our solution is capable of filling in the missing observations.

How is this accomplished? That question brings us to the following rule:

1 {curated_observation(E,C,0); curated_observation(E,C,1)} 1 :- not observation(E,C,_), experiment(E), compound(C).

What this rule is saying is that, if we have no observation for compound C in experiment E, then we create a curated_observation in which the compound can be either active or inactive. Note that experiment(E) and compound(C) are required in the body of this rule because E and C would be unsafe variables otherwise, as they would not occur in any positive body literals (clingo always requires that variables be safe).

However, why can we not simply write this rule using only the observation predicate, and instead need to use curated_observation? Because if we used observation, we would essentially be saying that not observation implies observation, which would lead to a contradiction. As such, all our observations are converted into curated observations, and we simply work with curated observations instead. This means that none of the rules we saw above use the observation predicate in their bodies; they use the curated_observation predicate instead, as will be displayed in the full encoding below.

The following rule is responsible for the conversion of the existing observations into curated ones:

curated_observation(E,C,S) :- observation(E,C,S).
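The combined effect of the conversion rule and the choice rule can be pictured as enumerating every way of completing a partial set of observations. The Python generator below is a sketch of that idea under our own assumptions (observations of one experiment as a dictionary, hypothetical names); clingo explores the same candidates through its answer sets rather than through an explicit loop.

```python
from itertools import product


def curated_candidates(observed, compounds):
    """Yield every completion of `observed` over `compounds`.

    Existing observations are copied unchanged, and each compound without an
    observation receives exactly one value, 0 or 1, per candidate.
    """
    missing = sorted(c for c in compounds if c not in observed)
    for values in product((0, 1), repeat=len(missing)):
        yield {**observed, **dict(zip(missing, values))}
```

For example, with a observed as 1 and b and c unobserved, the generator yields four candidates, all of which keep a at 1.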

It should be noted that the choices made above may give rise to more or fewer inconsistencies, in the cases where the filled-in observations cause compounds to report states that go against what the model would be able to produce, or the chosen states of the input compounds do not match up with the observations. As such, and with no way of knowing the values that were produced by the real-world system, a choice had to be made regarding what values to use. In this work, we assumed that the model is defined in such a way that it is able to emulate the behavior of the real-world system as closely as possible (which is a reasonable assumption to make, as that is the goal of any such model). Under this assumption, it follows that the missing observations should be those that lead to as few inconsistencies as possible. So, in order to minimize the occurrence of inconsistencies, we make use of the following optimization rule:

#minimize{1,E,C : inconsistent(E,C,_)}.

What we are saying here is that we want to minimize the number of inconsistent atoms produced. Note that in this case, because we are not interested in the value of the third argument (we are simply trying to minimize our inconsistencies, regardless of the state of the inconsistent compound), we make use of an anonymous variable, by using an underscore instead of a regular variable name. This feature is useful when we are dealing with variables that do not recur anywhere in a rule. It is also important to mention that, were it not for the need to minimize inconsistencies, the process of consistency checking would be entirely non-combinatorial. This means that an implementation using an imperative language would be entirely plausible. However, even without the combinatorial aspect, using a declarative language such as ASP would still be advantageous, in the sense that it allows for an easier understanding of how this process is implemented, thus making it easier to verify its correctness or modify it, if need be.
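To illustrate where the combinatorial aspect comes from, the following Python sketch picks, by brute force, the completion of the missing observations that yields the fewest inconsistencies. It is only an analogy for what the #minimize statement asks of clingo, not the method used in this work; the names and the toy model are hypothetical. Input compounds are represented by their self-activating single-term function, which always agrees with whatever value is chosen for them, so they need no separate treatment here.

```python
from itertools import product


def model_state(obs, sign, terms, target):
    # 1 if some implicant has no inactive activator and no active inhibitor.
    return int(any(
        all(obs[r] != sign[(r, target)] for r in regs)
        for regs in terms[target]
    ))


def inconsistencies(obs, sign, terms):
    return [
        (c, model_state(obs, sign, terms, c))
        for c in sorted(terms)
        if model_state(obs, sign, terms, c) != obs[c]
    ]


def best_completion(observed, sign, terms):
    # Try every 0/1 assignment of the unobserved compounds and keep one with
    # the fewest inconsistencies (exponential in the number of gaps).
    missing = sorted(c for c in terms if c not in observed)
    candidates = (
        {**observed, **dict(zip(missing, values))}
        for values in product((0, 1), repeat=len(missing))
    )
    return min(candidates, key=lambda obs: len(inconsistencies(obs, sign, terms)))


# Hypothetical toy model: inputs a and b, and c = a AND b.
TERMS = {"a": [["a"]], "b": [["b"]], "c": [["a", "b"]]}
SIGN = {("a", "a"): 0, ("b", "b"): 0, ("a", "c"): 0, ("b", "c"): 0}
```

With a and c observed as active and b unobserved, choosing b = 0 would make c inconsistent, whereas b = 1 yields no inconsistency, so the latter completion is selected.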

Before we show the full encoding, let us briefly go over the display section of the encoding:

#show experiment/1.

#show inconsistent/3.

#show curated_observation/3.

After solving, only the predicates experiment, inconsistent and curated_observation are output. The inconsistent predicate is required, since we use it to know which compounds are inconsistent in which experiments. The experiment and curated_observation predicates are also output, since we will need the curated observations produced here when repairing the model, as we must ensure that it is able to replicate those observations. Python then takes these outputs from clingo, processes them and feeds them as inputs to the repairing part of the revision process.
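As an illustration of this post-processing step, the sketch below groups the shown atoms by predicate name. It assumes that the atoms of one answer set are available as a single whitespace-separated line of text, as in clingo's default textual output; how the actual implementation obtains them is not described here, and the function name and sample line are hypothetical.

```python
import re

# Matches flat atoms such as inconsistent(e1,c2,0); nested terms are not handled.
ATOM = re.compile(r"(\w+)\(([^()]*)\)")


def parse_answer_set(line):
    """Group the atoms found in `line` by predicate name.

    Returns a dict mapping each predicate name to a list of argument tuples
    (arguments are kept as strings).
    """
    parsed = {}
    for name, args in ATOM.findall(line):
        parsed.setdefault(name, []).append(
            tuple(a.strip() for a in args.split(","))
        )
    return parsed


# Hypothetical answer-set line, for illustration only.
SAMPLE = "experiment(e1) curated_observation(e1,c1,1) inconsistent(e1,c2,0)"
```

The inconsistent entries identify what has to be repaired, and the curated_observation entries are the observations the repaired model must reproduce.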

This covers the entirety of the encoding for stable state consistency. Listing 4.4 displays the full encoding for stable state consistency, alongside some comments that also cover what we have just seen.

Listing 4.4: Listing for the stable state consistency encoding.

%Define
%If a compound C has a single term, with a single regulator in that term,
%and it has itself as an activator, then C is an input compound
input_compound(C) :- compound(C), function(C,1), regulates(C,C,0),
    #count{R : term(C,1,R)} = 1.

%Input compounds can take any value
0 {active(E,C)} 1 :- experiment(E), input_compound(C).

%Input observations are converted to curated observations
curated_observation(E,C,S) :- observation(E,C,S).

%If compound C has no observation for it, fill in the missing observation
1 {curated_observation(E,C,0); curated_observation(E,C,1)} 1 :-
    not observation(E,C,_), experiment(E), compound(C).

%For a given compound C with I implicants, if there exists a regulator
%in implicant number I_NO which is an inactive activator
%(outputs 1 when active) in an observation E, then implicant
%I_NO evaluates to 0 for that observation E, and so is inactive
implicant_inactive(E,C,I_NO) :- function(C,I), term(C,I_NO,R),
    regulates(R,C,0), curated_observation(E,R,0).

%Variant of the above rule for active inhibitors
implicant_inactive(E,C,I_NO) :- function(C,I), term(C,I_NO,R),
    regulates(R,C,1), curated_observation(E,R,1).

%For a given compound C with I implicants, if for experiment E
%there exists some implicant which is not inactive, then C has
%at least one active implicant and so is active
active(E,C) :- function(C,I), not implicant_inactive(E,C,I_NO),
    experiment(E), term(C,I_NO,_).

%If compound C is active but there is an observation stating
%it should be inactive, then the model is inconsistent
inconsistent(E,C,1) :- active(E,C), curated_observation(E,C,0).

%Variant of the above rule for when compound C is inactive
inconsistent(E,C,0) :- not active(E,C), curated_observation(E,C,1).

%Display
#show experiment/1.
#show inconsistent/3.
#show curated_observation/3.

%Optimize for the smallest number of inconsistent compounds
#minimize{1,E,C : inconsistent(E,C,_)}.


4.2.3 Encoding Time Series Consistency

Time series consistency takes into consideration a time component, unlike stable state consistency. We cover two time-series modes: synchronous mode, and asynchronous mode. For the most part, the encodings for both of these modes follow the same logic that we have already seen for stable state consistency, but with the additional time (T) variable.

As such, in this section, we will focus only on covering the key aspects where these time-series encodings differ from the stable-state one. The differences between synchronous and asynchronous modes are covered in their respective sections below.

The main difference for time-series consistency encodings is the introduction of a time component. Because time will be present in so many rules, we give it its own predicate, to make the encodings more readable. The following rule does just that.

time(T) :- observation(_,T,_,_).

We are simply using the time component present in our time-series observations, and giving it its own predicate. But how exactly are we using this time component in our time-series consistency checking? In short, we look at what happens at a time step T, in order to determine what is going to happen at a time step T+1. For example, let us examine the rule that defines which compounds are active, but now modified for time-series consistency:

active(E,T+1,C) :- function(C,I), not implicant_inactive(E,T,C,I_NO), experiment(E), term(C,I_NO,_), time(T).

What we do differently is that now, if we want to know if a compound is active at a given moment, we need to look at what happened in the previous moment. So, if at time step T of experiment E we had some active implicant in the function of compound C, then we can say that at T+1 of that same experiment, C will be active. For clarity, it is important to highlight that this is the only rule in which this transition from T to T+1 takes place, since this is the rule responsible for using the information gathered at time step T to determine the state at time step T+1. For all other rules, the time component remains the same. E.g., for the rules responsible for determining which implicants are inactive:

implicant_inactive(E,T,C,I_NO) :- function(C,I), term(C,I_NO,R), regulates(R,C,0), curated_observation(E,T,R,0).

implicant_inactive(E,T,C,I_NO) :- function(C,I), term(C,I_NO,R), regulates(R,C,1), curated_observation(E,T,R,1).

We look at the observations at time step T, to know which implicants are inactive at that time step T. Based on the atoms produced from these rules, the compound's state is then calculated for T+1 using the rule that defines active compounds.

One last thing that should be noted is that, when we check for time series consistency, we can only check it for T > 0, since for T = 0 we have no information regarding the events that happened in the previous T-1 time step that would allow us to say whether the compounds at T = 0 are consistent or not.
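The T to T+1 reading can be sketched in Python for the case where every function is applied at each step and all observations are present. This is an illustration under those assumptions, with hypothetical names and a made-up two-compound network; it does not handle missing observations or input compounds.

```python
def predicted_next(state, terms, sign):
    # State at T+1 computed from the observed state at T: a compound is
    # active if some implicant has no inactive activator and no active
    # inhibitor at T.
    return {
        c: int(any(
            all(state[r] != sign[(r, c)] for r in regs)
            for regs in function
        ))
        for c, function in terms.items()
    }


def time_series_inconsistencies(series, terms, sign):
    # series[t] is the observed state at time step t. Only steps with
    # T > 0 can be checked, since T = 0 has no predecessor.
    found = []
    for t in range(len(series) - 1):
        predicted = predicted_next(series[t], terms, sign)
        for compound, value in predicted.items():
            if value != series[t + 1][compound]:
                found.append((t + 1, compound, value))
    return found


# Hypothetical toy network: x activates y, and y inhibits x.
TERMS = {"x": [["y"]], "y": [["x"]]}
SIGN = {("y", "x"): 1, ("x", "y"): 0}
```

For the series (x=1,y=0), (x=1,y=1), (x=1,y=1), the first transition is reproduced, but at T = 2 the model computes x as inactive (its inhibitor y was active at T = 1) while x was observed as active, giving the triple (2, "x", 0).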

4.2.3.1 Encoding Synchronous Consistency

In synchronous consistency, all regulatory functions are applied at each time step. If each compound can replicate the observed value at each time step, given the value its regulators had in the previous time step, then the model is consistent.

Listing 4.5 displays the full encoding for synchronous consistency. It is almost identical to the encoding we have seen in Listing 4.4, aside from the differences that we have already discussed.

Listing 4.5: Listing for synchronous consistency.

%Encoding to determine a model’s consistency using synchronous observations

%Define
%Time must occur in some rule head
time(T) :- observation(_,T,_,_).

%If a compound C has a single term, with a single regulator in that term,
%and it has itself as an activator, then C is an input compound
input_compound(C) :- compound(C), function(C,1), regulates(C,C,0),
    #count{R : term(C,1,R)} = 1.

%Input compounds can take any value,
%but that value must be the same throughout time
0 {input_active(E,C)} 1 :- experiment(E), input_compound(C).
active(E,T,C) :- time(T), input_active(E,C).

%In T = 0, if a compound has a missing observation, clingo decides
%whether that compound should be active or inactive.
%This will be done in such a way that the
%curated_observation that is defined for this missing observation
%leads to the least amount of inconsistencies possible
%(this choice is required because we lack the information of T-1 that
%would enable us to know for sure if the compound is active or not)
0 {active(E,0,C)} 1 :- experiment(E), compound(C), not input_compound(C),
    not observation(E,0,C,_).

%Input observations are converted to curated observations
curated_observation(E,T,C,S) :- observation(E,T,C,S).

%If compound C is active at time T and there is no observation for it,
%fill in the missing observation
curated_observation(E,T,C,1) :- active(E,T,C),
    not observation(E,T,C,_), time(T), experiment(E), compound(C).

%If compound C is inactive at time T and there is no observation for it,
%fill in the missing observation
curated_observation(E,T,C,0) :- not active(E,T,C),
    not observation(E,T,C,_), time(T), experiment(E), compound(C).

%For a given compound C with I implicants, if there exists a regulator in
%implicant number I_NO which is an inactive activator (outputs 1 when active)
%in an observation E at time T, then implicant I_NO evaluates to 0 for that
%observation E at time T, and so is inactive at time T
implicant_inactive(E,T,C,I_NO) :- function(C,I), term(C,I_NO,R),
    regulates(R,C,0), curated_observation(E,T,R,0).

%For a given compound C with I implicants, if there exists a regulator in
%implicant number I_NO which is an active inhibitor (outputs 1 when inactive)
%in an observation E at time T, then implicant I_NO evaluates to 0 for that
%observation E at time T, and so is inactive at time T
implicant_inactive(E,T,C,I_NO) :- function(C,I), term(C,I_NO,R),
    regulates(R,C,1), curated_observation(E,T,R,1).

%For a given compound C with I implicants, if for experiment E at time T
%there exists some implicant I_NO which is not inactive, then C has
%at least one active implicant at time T and so will be active at time T+1
active(E,T+1,C) :- function(C,I), not implicant_inactive(E,T,C,I_NO),
    experiment(E), term(C,I_NO,_), time(T).

%If compound C is active but there is an observation stating
%it should be inactive, then the model is inconsistent
inconsistent(E,T,C,1) :- active(E,T,C),
    curated_observation(E,T,C,0), T > 0.

%If compound C is inactive but there is an observation stating
%it should be active, then the model is inconsistent
inconsistent(E,T,C,0) :- not active(E,T,C),
    curated_observation(E,T,C,1), T > 0.

%Display
#show experiment/1.
#show curated_observation/4.
#show inconsistent/4.

%Optimize
%Optimize for the smallest number of inconsistent compounds (applicable for
%when we’re dealing with incomplete observations)
#minimize{1,E,T,C : inconsistent(E,T,C,_)}.

4.2.3.2 Encoding Asynchronous Consistency

In asynchronous consistency, only one regulatory function may be applied at each time step. If the compound whose regulatory function is applied can replicate the observed value at the time step at which it is applied, given the value its regulators had in the previous time step, then the model is consistent.

This encoding closely resembles the synchronous encoding, with the additional concern that we now need to keep track of how many compound states change from time T to T+1. If more than one state changes, that means more than one function was applied, and so there is something wrong with the observations. The rule responsible for identifying changed compounds is as follows:

changed_compound(E,T,C) :- curated_observation(E,T,C,S1), curated_observation(E,T+1,C,S2), S1 != S2.

If a compound C had state S1 at time T in experiment E, and if at time T+1 that state changed to an S2 different from S1, then we know that there was a change in the state of compound C. The integrity constraint that ensures we can have no more than one state change at a time is the next one:

:- changed_compound(E,T,C1), changed_compound(E,T,C2), C1 != C2.

If, in the same experiment and time step, two distinct compounds C1 and C2 changed their state, then we have a logical inconsistency because the restrictions of asynchronous interactions were not respected.
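The rule and the integrity constraint can be mirrored in Python as follows. This is a sketch with hypothetical names, assuming complete observations given as a list of states indexed by time step; unlike the integrity constraint, which rules out the candidate answer set altogether, the function below merely reports whether the restriction holds.

```python
def changed_compounds(series, t):
    # Compounds whose observed state differs between time steps t and t+1
    # (the analogue of changed_compound(E,T,C)).
    return {c for c in series[t] if series[t][c] != series[t + 1][c]}


def respects_asynchronous_updates(series):
    # At most one compound may change state between consecutive time steps.
    return all(
        len(changed_compounds(series, t)) <= 1
        for t in range(len(series) - 1)
    )
```

For instance, going from (x=1,y=0) to (x=0,y=1) in a single step changes two compounds at once and therefore violates the asynchronous restriction.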

Additionally, extra care needs to be taken when we are verifying the inconsistency of compounds. Since it is no longer the case that every function is applied at the same time, we cannot simply look at the value of each compound at time T and see if it matches the expected value. We can only look at the compound whose state changed in the preceding time step in order to check for inconsistencies, or else we risk identifying many false positives (many inconsistent compounds that, in reality, are not inconsistent). As such, the rules responsible for identifying inconsistent compound states had to be modified:

inconsistent(E,T,C,1) :- changed_compound(E,T-1,C), active(E,T,C), curated_observation(E,T,C,0), T > 0.

inconsistent(E,T,C,0) :- changed_compound(E,T-1,C), not active(E,T,C), curated_observation(E,T,C,1), T > 0.
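The restriction to changed compounds can be sketched in Python as below. As before, this is an illustration with hypothetical names and a made-up network, assuming complete observations; it is not the implementation used in this work.

```python
def async_inconsistencies(series, terms, sign):
    # series[t] is the observed state at time step t. Only a compound whose
    # state changed between t-1 and t is checked, by comparing its new
    # observed state with the state its function computes from step t-1.
    found = []
    for t in range(1, len(series)):
        previous, current = series[t - 1], series[t]
        for compound in sorted(current):
            if previous[compound] == current[compound]:
                continue  # unchanged: its function is assumed not applied
            predicted = int(any(
                all(previous[r] != sign[(r, compound)] for r in regs)
                for regs in terms.get(compound, [])
            ))
            if predicted != current[compound]:
                found.append((t, compound, predicted))
    return found


# Hypothetical toy network: x activates y, and y inhibits x.
TERMS = {"x": [["y"]], "y": [["x"]]}
SIGN = {("y", "x"): 1, ("x", "y"): 0}
```

In the series (x=1,y=0), (x=0,y=0), compound x switches off although its inhibitor y was inactive, so the model would have kept x active and the change is reported as (1, "x", 1). Compound y did not change and is not checked, even though its function would have switched it on; this is exactly the kind of false positive the modified rules avoid.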

Listing 4.6 displays the full encoding for asynchronous consistency.

Listing 4.6: Listing for asynchronous consistency.

%Encoding to determine a model’s consistency using asynchronous observations

%Define
%Time must occur in some rule head
time(T) :- observation(_,T,_,_).

%If a compound C has a single term, with a single regulator in that term,
%and it has itself as an activator, then C is an input compound
input_compound(C) :- compound(C), function(C,1), regulates(C,C,0),
    #count{R : term(C,1,R)} = 1.

%Input compounds can take any value,
%but that value must be the same throughout time
0 {input_active(E,C)} 1 :- experiment(E), input_compound(C).
active(E,T,C) :- time(T), input_active(E,C).

%In T = 0, if a compound has a missing observation, clingo decides
%whether that compound should be active or inactive.
%This will be done in such a way that the
%curated_observation that is defined for this missing observation
%leads to the least amount of inconsistencies possible
%(this choice is required because we lack the information of T-1 that
%would enable us to know for sure if the compound is active or not)
0 {active(E,0,C)} 1 :- experiment(E), compound(C), not input_compound(C),
    not observation(E,0,C,_).

%Input observations are converted to curated observations
curated_observation(E,T,C,S) :- observation(E,T,C,S).

%If compound C is active at time T and there is no observation for it,
%fill in the missing observation
curated_observation(E,T,C,1) :- active(E,T,C),
    not observation(E,T,C,_), time(T), experiment(E), compound(C).

%If compound C is inactive at time T and there is no observation for it,
%fill in the missing observation
curated_observation(E,T,C,0) :- not active(E,T,C),
    not observation(E,T,C,_), time(T), experiment(E), compound(C).

%For a given compound C with I implicants, if there exists a regulator in
%implicant number I_NO which is an inactive activator (outputs 1 when active)
%in an observation E at time T, then implicant I_NO evaluates to 0 for that
%observation E at time T, and so is inactive at time T
implicant_inactive(E,T,C,I_NO) :- function(C,I), term(C,I_NO,R),
    regulates(R,C,0), curated_observation(E,T,R,0).

%For a given compound C with I implicants, if there exists a regulator in
%implicant number I_NO which is an active inhibitor (outputs 1 when inactive)
%in an observation E at time T, then implicant I_NO evaluates to 0 for that
%observation E at time T, and so is inactive at time T
implicant_inactive(E,T,C,I_NO) :- function(C,I), term(C,I_NO,R),
    regulates(R,C,1), curated_observation(E,T,R,1).

%For a given compound C with I implicants, if for experiment E at time T
%there exists some implicant I_NO which is not inactive, then C has
%at least one active implicant at time T and so will be active at time T+1
active(E,T+1,C) :- function(C,I), not implicant_inactive(E,T,C,I_NO),
    experiment(E), term(C,I_NO,_), time(T).

%If a compound’s state has been changed from time T to T+1,
%identify that compound
changed_compound(E,T,C) :- curated_observation(E,T,C,S1),
    curated_observation(E,T+1,C,S2), S1 != S2.

%If a changed compound C is active but there is an
%observation stating it should be inactive,
%then the model is inconsistent
inconsistent(E,T,C,1) :- changed_compound(E,T-1,C),
    active(E,T,C), curated_observation(E,T,C,0), T > 0.

%If a changed compound C is inactive but there is an
%observation stating it should be active,
%then the model is inconsistent
inconsistent(E,T,C,0) :- changed_compound(E,T-1,C),
    not active(E,T,C), curated_observation(E,T,C,1), T > 0.

%Test
%In asynchronous mode, only one compound state
%may change at each time step
:- changed_compound(E,T,C1),
    changed_compound(E,T,C2), C1 != C2.

%Display
#show experiment/1.
#show curated_observation/4.
#show inconsistent/4.

%Optimize
%Optimize for the smallest number of inconsistent compounds (applicable for
%when we’re dealing with incomplete observations)
#minimize{1,E,T,C : inconsistent(E,T,C,_)}.

From the document Revision of Boolean Logical Models of Biological Regulatory Networks using Answer-Set Programming (pages 61-72)