Discussion - Revision of Boolean Logical Models of Biological Regulatory Networks using Answer-Set Programming

5.3.1 Results Analysis

As shown in the subsections above, the number of compounds, the number of interactions, and the number of terms all have a clear influence on revision times. This impact can be explained by looking at how each of the two parts of the revision process operates:

• When performing consistency checking, we must ensure that the function of each compound is able to replicate the given observations. Naturally, the more compounds we have in a model, the more functions we have to verify. Moreover, determining the value outputted by each function is dependent on the number of interactions that the compound has. The more terms we have in the compound’s function, and the more regulators each term has, the longer it will take to determine the state of the function. This is because we use the regulators in a term to determine if the term is active or inactive, and then based on whether all terms are inactive or not, we determine the value of the function. Therefore, the more compounds or interactions we have, the longer the consistency checking process takes.

5 . 3 . D I S C U S S I O N

• When performing repairs, we generate as many terms as the inconsistent function has, and start our search there. The more terms that function has, the more terms will be generated, and each term can house as many variables as there are compounds in the model. So not only does the repair time increase with the number of compounds (since that means more possibilities for placing variables in a term), but it also increases with the number of terms, since that further expands the space of possible functions to explore.
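The evaluation step described in the first item can be sketched in Python. This is only an illustration, not the tool's ASP encoding: the names `term_active`, `function_value` and `consistent`, the representation of a term as a set of (regulator, sign) pairs, and the reading of a stable state as one in which every function reproduces its compound's observed value are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation: a literal is (regulator, sign), where sign True
# means the regulator must be active and False means it must be inactive.
Literal = Tuple[str, bool]
Term = FrozenSet[Literal]


def term_active(term: Term, state: Dict[str, bool]) -> bool:
    """A term is active when every regulator in it has the value its sign requires."""
    return all(state[regulator] == sign for regulator, sign in term)


def function_value(terms: List[Term], state: Dict[str, bool]) -> bool:
    """The function outputs True unless all of its terms are inactive."""
    return any(term_active(term, state) for term in terms)


def consistent(functions: Dict[str, List[Term]], state: Dict[str, bool]) -> bool:
    """Stable-state check: each function must reproduce its compound's observed value."""
    return all(function_value(terms, state) == state[compound]
               for compound, terms in functions.items())


# Toy model: a = b, b = a AND NOT c, c = c.
functions = {
    "a": [frozenset({("b", True)})],
    "b": [frozenset({("a", True), ("c", False)})],
    "c": [frozenset({("c", True)})],
}
state_ok = {"a": True, "b": True, "c": False}
state_bad = {"a": True, "b": False, "c": False}
```

The cost of the check grows with the number of compounds (one function each), the number of terms per function, and the number of regulators per term, which mirrors the dependence on model size discussed above.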

In terms of the impact that our repairs have on the model, as was shown before, changes typically occur in the format of the function nodes and in the signs of compounds. This was to be expected, as our optimization criteria state that these are the least important types of changes to minimize. It should be noted, however, that by simply changing the weights of the optimizations inside the repair encodings, it would be very simple to re-assign the priority given to most optimizations. Changing the optimization regarding the minimization of function nodes would also be possible, albeit less trivially, since this would also involve altering the algorithm in Python.

Lastly, the overall low repair times of our method can be explained by three interconnected pieces:

• First, our implementation of the repair process itself. Tackling the repairs by generating the function nodes and ensuring that they are in accordance with the restrictions imposed by the interaction scheme in use has proven to be a sensible strategy, because the functions of most real-world models tend to have no more than three nodes. This makes it easy to prevent repair times from exploding, since the number of nodes to search for is generally low;

• Second, by using iterative deepening to control the number of nodes admitted in our search, we are able to control the search space even further. Consider repairing a function with $C$ compounds at our disposal, which can be placed in $T$ terms. If $T = 1$, we have $2^C$ possibilities, since each compound can be either present in or absent from that term. If $T = 2$, the possibilities grow to about $2^C \times 2^C$. A rough estimate of the number of possibilities is therefore $(2^C)^T = 2^{CT}$. What this means is that the number of compounds in the model is certainly important for function repair, but even more important is the number of terms that we are considering. In all of the real-world models used in our tests, the maximum number of terms a function has is 5, so this number is never too large. On top of that, because we start by searching for functions with the original number of terms before searching for functions with more nodes, we are able to arrive at solutions efficiently. Searching for functions without placing some sort of upper bound on this number beforehand would lead to (practically) unending wait times when dealing with complex models, as there would be an overwhelmingly large number of possibilities to search. As such, with the goal of minimizing the number of changes made to the original function efficiently, while reliably finding repairs whenever they exist, starting the search with functions that have the same number of terms as the original was the logical choice, since in this way we directly control the most critical variable. Should this optimization criterion prove undesirable, it would be possible to change it (as previously mentioned), although performance would most likely suffer considerably if no limit were imposed on the number of function terms to consider;

• Lastly, by leveraging clingo’s powerful and highly optimized solving capabilities, which excel at solving difficult search problems such as this, we are able to make the most out of our encodings and solve them as efficiently as possible.
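To give a feel for the rough estimate in the second item, read as (2^C)^T = 2^(C·T), the short Python sketch below evaluates it for a few sizes. The function name `search_space_size` and the choice of C = 10 are assumptions made purely for illustration; like the estimate itself, the count ignores regulator signs and duplicate or redundant terms.

```python
def search_space_size(num_compounds: int, num_terms: int) -> int:
    """Rough count of candidate functions: each of the T terms may contain
    any subset of the C compounds, giving (2^C)^T = 2^(C*T) combinations."""
    return (2 ** num_compounds) ** num_terms


# Growth for a hypothetical model with 10 compounds.
sizes = {terms: search_space_size(10, terms) for terms in (1, 2, 5)}
for terms, size in sizes.items():
    print(f"T = {terms}: {size} candidate functions")
```

With 10 compounds the estimate goes from 1024 candidates for one term to 2^50 for five terms, which is why bounding the number of terms matters more than bounding the number of compounds.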

5.3.2 Comparisons with ModRev

In this subsection, we compare our solution with ModRev [32]. ModRev is a tool developed primarily in C++, which also performs the revision of Boolean logical models. ASP is used in ModRev’s implementation as well, but solely when checking the consistency of the models, with the repairing part being implemented in C++. When ModRev repairs an inconsistent function, the optimization criteria it employs are as follows:

1. Minimize the changes to the function’s regulators;

2. Minimize the changes to the signs of regulators;

3. Minimize the number of function change operations.

Recall that our optimization criteria are:

1. Minimize the changes to the number of function terms;

2. Minimize the changes to the function’s regulators;

3. Minimize the changes to the signs of regulators;

4. Minimize the changes to the format of each term.

It is worthwhile to mention that ModRev’s minimization of the number of function change operations essentially combines the minimization of changes to the number of function terms and to the format of each term, which are used in our criteria.
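The difference between the two orderings can be illustrated with a small Python sketch that ranks candidate repairs lexicographically. Everything here is hypothetical: the `RepairCost` fields, the candidate values, and in particular the modelling of ModRev's third criterion as the sum of term-count and term-format changes, which is only a loose reading of "essentially combines" and not ModRev's actual definition.

```python
from typing import Dict, NamedTuple


class RepairCost(NamedTuple):
    """Change counts for one candidate repair, in our order of priority."""
    term_count_changes: int
    regulator_changes: int
    sign_changes: int
    term_format_changes: int


def best_by_our_criteria(candidates: Dict[str, RepairCost]) -> str:
    """Tuples compare lexicographically, so earlier fields take priority."""
    return min(candidates, key=lambda name: tuple(candidates[name]))


def best_by_merged_criteria(candidates: Dict[str, RepairCost]) -> str:
    """Assumed stand-in for a ModRev-like ordering: regulators first, then signs,
    then a single merged count of term-count and term-format changes."""
    def key(name: str):
        cost = candidates[name]
        return (cost.regulator_changes, cost.sign_changes,
                cost.term_count_changes + cost.term_format_changes)
    return min(candidates, key=key)


# Made-up candidates, not taken from any experiment.
candidates = {
    "r1": RepairCost(0, 1, 0, 3),
    "r2": RepairCost(0, 0, 2, 5),
    "r3": RepairCost(1, 0, 0, 0),
}
```

Under these assumptions the first ordering selects `r2`, which keeps the number of terms unchanged, while the merged ordering selects `r3`, which changes the number of terms but nothing else.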

In ModRev’s experimental evaluation, the tests were run on an Intel(R) Xeon 2.1GHz Linux machine, with a memory limit of 2GB. The parameters used for the test instances were the same: the five real-world models presented in Section 5.1 were corrupted using the 24 different configurations presented in Table 5.2, and 100 instances of each model were generated for each of those configurations. To revise the corrupted models under stable state observations, the stable states of each model were used, and time-series observations were generated, for both synchronous and asynchronous interaction schemes, in the same fashion as described in Section 5.1. Additionally, a time limit of 3600 seconds was considered for all tests shown here. The ModRev results presented here were taken directly from the PhD thesis in which they were presented [32].

In Table 5.17, we present the statistics regarding the number of successfully revised models and average revision time (for time-series observations), both for our approach and ModRev. Note that, for time-series observations, the statistics shown are with respect to the complete set of observations, with 5 experiments and 20 timesteps. The values presented for our solution were taken from the tables displayed in Section 5.2, while ModRev’s values were taken from the PhD thesis in which they were presented.

We will begin our comparisons by contrasting the performance of the two approaches. In terms of efficiency, our solution has proven to be generally faster than ModRev:

• Under stable state observations, ModRev’s ability to solve the instances varied from 75.67% to 99.75% (Table 5.17), depending on which of the five families of models is considered. Our method was able to solve 100% of instances under the same time limit, with revisions taking far less time than the given threshold (usually less than a quarter of a second, as evidenced by Table 5.4).

• Under synchronous observations, ModRev’s ability to solve the instances varied from 53% to 82.5%, depending on which of the five families of models is considered. Additionally, the average time taken varied from 28s to 382s, also depending on which corrupted instances were considered. Our solution was able to solve 100% of all instances, for all models, and as can be seen in Table 5.14, it never took much longer than 5s to revise a model, with most instances taking less than 1s to be solved.

• Under asynchronous observations, ModRev was almost always able to solve 100% of the instances, with the only exception being the instances of SP, of which only 87.6% were solved. Additionally, the average time taken varied from 0.35s to 58.7s, also depending on which corrupted instances were considered. Our solution was able to solve 100% of all instances, for all models, and revisions almost never took longer than a second, as evidenced by Table 5.15.

This increase in performance can be explained by our implementation of the repair process, by the selected optimization criteria, and by leveraging clingo to solve the encodings, as was discussed in Subsection 5.3.1.

In terms of legibility, we provide an easier way of understanding the implementation of the critical parts of the revision process. By implementing both the consistency checking and the repair procedures in ASP, we are able to provide a more readable, declarative solution. This not only makes the revision process easier to comprehend, but it also makes its correctness easier to ascertain.

Obs. type      Model   Approach       Solved    Avg. (s)
Stable state   MCC     Our solution   100%      -
Stable state   MCC     ModRev         99.75%    -
Stable state   FY      Our solution   100%      -
Stable state   FY      ModRev         95.86%    -
Stable state   TCR     Our solution   100%      -
Stable state   TCR     ModRev         88%       -
Stable state   SP      Our solution   100%      -
Stable state   SP      ModRev         75.67%    -
Stable state   Th      Our solution   100%      -
Stable state   Th      ModRev         87.25%    -
Synchronous    MCC     Our solution   100%      0.141
Synchronous    MCC     ModRev         53.04%    277.921
Synchronous    FY      Our solution   100%      0.137
Synchronous    FY      ModRev         82.54%    382.021
Synchronous    TCR     Our solution   100%      0.763
Synchronous    TCR     ModRev         78.96%    104.566
Synchronous    SP      Our solution   100%      0.456
Synchronous    SP      ModRev         53.38%    72.647
Synchronous    Th      Our solution   100%      0.407
Synchronous    Th      ModRev         73.33%    28.779
Asynchronous   MCC     Our solution   100%      0.107
Asynchronous   MCC     ModRev         100%      58.658
Asynchronous   FY      Our solution   100%      0.092
Asynchronous   FY      ModRev         100%      0.346
Asynchronous   TCR     Our solution   100%      0.537
Asynchronous   TCR     ModRev         100%      1.390
Asynchronous   SP      Our solution   100%      0.303
Asynchronous   SP      ModRev         87.63%    7.666
Asynchronous   Th      Our solution   100%      0.279
Asynchronous   Th      ModRev         100%      24.250

Table 5.17: Comparison of model revision success and revision times between our approach and ModRev, for each family of models (for time-series observations, comparisons were made using the complete set of observations, with 5 experiments and 20 timesteps).

In terms of introducing modifications to the revision process, our implementation gives the user greater flexibility. One can easily change the weights of the minimizations in the repair encoding, for example, in order to fine-tune the search for functions. Changing the optimization regarding the number of nodes would be more intricate, but only because it would also require one Python function to be changed, as opposed to only tweaking the repair encodings. Changes to the inner workings of both the consistency checking and repair mechanisms would also be easy to make, as would the introduction of additional interaction schemes, since so many rules are shared among the encodings (for instance, the addition of a mixed interaction scheme that accepts both synchronous and asynchronous interactions, as is used in some works, could be done in great part by reusing rules already present in the encodings of those two interaction modes).

In terms of the chosen optimization criteria, we opted for a set of criteria that allowed us to create a more readable, modifiable, efficient and reliable solution. Note that ModRev’s most efficient search algorithm may fail to find consistent functions, even though they may exist, and even if they are found there is no guarantee that the function found is optimal. On the other hand, our implementation will always find a consistent function if one exists, and the function that is found is always an optimum. Moreover, if the current criteria are undesirable, the nature of our implementation makes tweaking them to better fit the circumstances of the user a straightforward process. For example, it would be possible to add an optimization statement to the repair encodings that minimizes some function of several parameters (e.g. number of added regulators, removed regulators, changed regulator signs, additional nodes, removed nodes, regulators added to nodes, regulators removed from nodes, etc.), which would yield more customized results, better tailored to the requirements and knowledge one may have of the models at hand.
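As an illustration of such a combined objective, the Python sketch below scores a repair as a weighted sum over kinds of change. In the tool itself this would be expressed as an optimization statement in the repair encodings; the function `weighted_cost`, the parameter names and the weights shown here are hypothetical and serve only to show the arithmetic.

```python
from typing import Dict


def weighted_cost(changes: Dict[str, int], weights: Dict[str, int]) -> int:
    """Weighted sum of change counts; kinds without a weight count as 1."""
    return sum(weights.get(kind, 1) * count for kind, count in changes.items())


# Hypothetical weights: adding a regulator is penalized most heavily.
weights = {"added_regulators": 3, "removed_regulators": 2, "changed_signs": 1}

repair_a = {"added_regulators": 1, "changed_signs": 0}
repair_b = {"removed_regulators": 1, "changed_signs": 2}

cost_a = weighted_cost(repair_a, weights)  # 3*1 + 1*0
cost_b = weighted_cost(repair_b, weights)  # 2*1 + 1*2
```

Unlike a strict priority ordering, a weighted sum lets several cheap changes outweigh one expensive change, so the choice of weights encodes how much the user trusts each part of the original model.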

We believe that our solution can be used as a stepping stone to create even more elegant solutions that leverage ASP. This applies not only to model revision, but potentially also to function synthesis, since with a few tweaks our repair encodings are also capable of creating new functions. For example, it would be possible to implement a rudimentary approach to regulatory function synthesis, based on the observations and prior knowledge we may have of the original system (the fixed predicate can still be used), that uses iterative deepening to find suitable candidates. This could be achieved by changing the repair algorithm to start looking for functions with 1 node, then 2 nodes, and so on until a function is found (instead of starting the search at the same number of nodes as the original function). Then, if the repair encoding is altered so that the optimization statements that minimize the changes to the original function are removed, and the number of available node IDs is changed to the number of nodes we are currently performing the search with, clingo will attempt to find a function for the inputted compound that does not take into consideration the compound’s regulators, their signs, or function format, provided in the model’s encoding. This will make it so that the function found by clingo is capable of replicating the observations, but without attempting to approach the original function in any way.
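The iterative deepening idea behind this synthesis proposal can be sketched in Python, with a brute-force enumeration standing in for clingo. This is a toy under stated assumptions, not the tool's algorithm: the function `synthesize`, the representation of a term as a set of (regulator, sign) pairs, and the observation format (state, expected output) are all invented for the example, and the enumeration is only feasible for a handful of regulators.

```python
from itertools import combinations, product
from typing import Dict, FrozenSet, List, Optional, Sequence, Tuple

Term = FrozenSet[Tuple[str, bool]]
Observation = Tuple[Dict[str, bool], bool]


def synthesize(regulators: Sequence[str],
               observations: Sequence[Observation],
               max_terms: int) -> Optional[List[Term]]:
    """Try functions with 1 term, then 2, ... up to max_terms, and return the
    first one that reproduces every observation (None if there is none)."""
    # Candidate terms: each regulator is absent (None), required active (True)
    # or required inactive (False); the empty term is skipped.
    terms: List[Term] = []
    for signs in product((None, True, False), repeat=len(regulators)):
        term = frozenset((r, s) for r, s in zip(regulators, signs) if s is not None)
        if term:
            terms.append(term)

    def value(function: Sequence[Term], state: Dict[str, bool]) -> bool:
        # The function is active when at least one of its terms is active.
        return any(all(state[r] == s for r, s in term) for term in function)

    for num_terms in range(1, max_terms + 1):  # iterative deepening on term count
        for function in combinations(terms, num_terms):
            if all(value(function, state) == out for state, out in observations):
                return list(function)
    return None


# Observations of an exclusive-or behaviour, which no single term can reproduce.
xor_observations = [
    ({"a": False, "b": False}, False),
    ({"a": True, "b": False}, True),
    ({"a": False, "b": True}, True),
    ({"a": True, "b": True}, False),
]
found = synthesize(["a", "b"], xor_observations, max_terms=3)
```

Because the search tries smaller term counts first, the function it returns for these observations has two terms even though three were allowed, which is the behaviour the iterative deepening scheme is meant to provide.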

6 Conclusion

In this work, we presented an ASP-based tool for the revision of Boolean logical models. The tool is capable of working with three distinct observation types:

• Stable state observations;

• Synchronous observations;

• Asynchronous observations.

The tool is able to verify the consistency of a model given one or more sets of observations of one of the three specified types, and in case of inconsistency, it finds the repairs required to render the model consistent according to the following optimization criteria:

1. Minimize the changes to the number of function terms;

2. Minimize the changes to the function’s regulators;

3. Minimize the changes to the signs of regulators;

4. Minimize the changes to the format of each term.

By utilizing these criteria and leveraging the capabilities of the state-of-the-art ASP system clingo, we were able to develop a declarative, easily understandable, and highly customizable implementation, which is able to revise models very efficiently and, whenever a model is inconsistent, always finds the optimal set of repairs required to make it consistent. To our knowledge, this is the only approach to the revision of Boolean logical models that utilizes ASP both for consistency checking and for repairs.

Compared to imperative solutions, this approach offers greater readability and flexibility, allowing for the inner workings of the revision process to be more easily understood, checked for correctness, and modified to suit the needs of specific models. The tool can either be used as-is with the defined optimization criteria, or tweaked in order to allow for the usage of distinct criteria, whether by using the already existing rules provided in the repair encodings, or by defining new ones that will further tailor the function repair search to the requirements of the user. Should different criteria be employed, special
