respected. As for dynamic components, the extracted models need to be manually modified to describe the dynamic instantiation of components.
7.2 Evaluation of Tool Support
This section presents our evaluation of the tool support provided for our approach. Essentially, we analyse how easy it is to use the LTSE tool and what can influence its performance and scalability. We also discuss some known limitations.
7.2.1 Usability
The principal aim of a tool is to make a task that would otherwise be complex, tedious or time-consuming easier to carry out. Hence, a tool should reduce the complexity of the task and speed up its completion.
In model extraction, the use of a tool is even more important. Converting a program into a model that is tractable by a model-checking tool is generally complex enough to be time-consuming and error-prone. Automating this process saves time and decreases the possibility of errors. Furthermore, it allows even non-experts to apply the approach to obtain models from their code.
Based on this, we consider that the LTSE tool is adequate for its purpose of implementing our approach. Although it does not automate the information gathering and trace generation phases, it automatically generates an FSP description from a set of traces of the system, guided by parameters provided by the user (alphabet, system state and interpretation of actions). This is the most difficult step of the approach and the one most likely to introduce errors. Thus, the process implemented by the LTSE tool is essential, and the combination of this tool with the LTSA tool provides complete support for the model checking process.
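To give an idea of the kind of output involved, the following is a minimal sketch of how a set of transitions between states could be written out as an FSP-style process definition. It is not the LTSE implementation: the class FspSketch, the method toFsp, the state names Q0 and Q1 and the action names are all hypothetical, and the sketch assumes a deterministic model (one target state per action).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; names and data layout are assumptions.
class FspSketch {

    // Renders "state -> (action -> next state)" as an FSP-style description.
    // Assumes at most one target state per (state, action) pair.
    static String toFsp(String processName, String initialState,
                        Map<String, Map<String, String>> transitions) {
        List<String> definitions = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> state : transitions.entrySet()) {
            List<String> choices = new ArrayList<>();
            for (Map.Entry<String, String> t : state.getValue().entrySet()) {
                choices.add(t.getKey() + " -> " + t.getValue());
            }
            // Alternative actions of a state are separated by the choice operator.
            definitions.add(state.getKey() + " = (" + String.join(" | ", choices) + ")");
        }
        // Local definitions are separated by commas and closed by a full stop.
        return processName + " = " + initialState + ",\n"
                + String.join(",\n", definitions) + ".";
    }

    public static void main(String[] args) {
        // Sorted maps keep the output order stable.
        Map<String, Map<String, String>> transitions = new TreeMap<>();
        Map<String, String> q0 = new TreeMap<>();
        q0.put("open", "Q1");
        Map<String, String> q1 = new TreeMap<>();
        q1.put("read", "Q1");
        q1.put("close", "END"); // normal termination
        transitions.put("Q0", q0);
        transitions.put("Q1", q1);
        System.out.println(toFsp("SYSTEM", "Q0", transitions));
    }
}
```

In this sketch the user-provided parameters mentioned above are not modelled; it only shows the final rendering step.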
An important characteristic of the LTSE tool is its portability. As it is implemented in Java, it can run on any machine where a Java Virtual Machine has been installed. This is especially important if one intends to analyse a distributed system running in a heterogeneous network.
7.2.2 Performance and Scalability
Besides being useful and easy to execute, a tool should ideally not require much processing power and should produce results quickly. The process executed by the LTSE tool demands relatively little processing effort, as most of its work consists of operations on files and access to data structures (i.e., the context table and the model structure). At most, the tool operates on two files simultaneously, one to read from and another to write to, when applying one of the necessary mappings.
Since most of the processing is connected to operations on files and data structures, the general performance of the tool depends on the size of the files and data structures it has to handle. Long log files will generate long context files. However, long log files tend to include redundant sequences of actions. Thus, the resulting FSP description file is much smaller than the size of the original logs would suggest.
The size of the data structures is, to a great extent, also dependent on the size of the log files. The more contexts the traces in the files include, the more entries the context table will have and the bigger the model structure will be. Redundant information in the logs can also mean that the structures will not be as large as they could be, in particular the context table.
Nevertheless, the size of the model structure is more easily affected by the length of the logs.
Whereas the context table only grows with the discovery of new contexts, the model structure grows also when new transitions between contexts are detected. In general, this growth is not very significant and the size of the structures is perfectly manageable.
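The difference in growth between the two structures can be sketched as follows. This is a simplified illustration under stated assumptions, not the data structures used by the LTSE tool: the class GrowthSketch, the representation of a trace as an alternating list of contexts and actions, and the string encoding of transitions are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the LTSE data structures.
class GrowthSketch {
    // Context table: one entry per distinct context, mapped to a numeric ID.
    final Map<String, Integer> contextTable = new LinkedHashMap<>();
    // Model structure: one entry per distinct transition between contexts.
    final Set<String> transitions = new LinkedHashSet<>();

    // Returns the ID of a context, adding a table entry only if it is new.
    int contextId(String context) {
        Integer id = contextTable.get(context);
        if (id == null) {
            id = contextTable.size();
            contextTable.put(context, id);
        }
        return id;
    }

    // Processes a trace of the assumed form: context, action, context, ..., context.
    void addTrace(List<String> trace) {
        for (int i = 0; i + 2 < trace.size(); i += 2) {
            int from = contextId(trace.get(i));
            String action = trace.get(i + 1);
            int to = contextId(trace.get(i + 2));
            // Repeated (redundant) transitions do not enlarge the set.
            transitions.add(from + " --" + action + "--> " + to);
        }
    }

    public static void main(String[] args) {
        GrowthSketch g = new GrowthSketch();
        List<String> trace = List.of("c0", "open", "c1", "read", "c1", "close", "c0");
        g.addTrace(trace);
        g.addTrace(trace); // redundant trace: neither structure grows
        g.addTrace(List.of("c1", "open", "c0")); // known contexts, new transition
        System.out.println(g.contextTable.size() + " contexts, "
                + g.transitions.size() + " transitions");
    }
}
```

In the usage above, replaying the same trace leaves both structures unchanged, whereas the last trace adds a transition without adding a context.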
The types and ranges of the attributes of a system also influence the performance of the tool and its memory usage. In order to accelerate the refinement process, we collect the values of all attributes. This means that, irrespective of which attributes will be actually used, each entry in the log file registers the values of every attribute available. However, this information overhead is not carried over to the context files, since they only record context IDs and action names. The ranges of values of the attributes included in the CT affect the CT size, the size of the context files and that of the resulting model structure.
Table 7.2 shows some performance data collected from the programs used as examples in this work, including those presented in Appendix B and the case studies discussed in the previous chapter. These values were obtained by running the tool on a 2.4 GHz Pentium 4 machine with 512 MB of RAM running Windows XP. Rows containing the same log size represent results from the same program with and without attributes being considered, respectively (e.g., the first and the second rows).
The sizes of models and context tables and the processing times are approximate. The model and context table sizes correspond to the amount of memory occupied by these structures during model creation. The log size is the sum of the sizes of all logs used in the model extraction process.
Log size (KB)   CT size (KB)   Model size (KB)   Time (ms)
            8            0.8               0.4          31
            8            1.5               0.5          31
           60            7.5               1.2          63
           60            8.9               1.3          62
          257            3.7               0.4          63
        1,913           21.2              15           312
       96,390           18.9               3.1      14,235
      462,445            7.7               1.5      58,593
Table 7.2: LTSE performance data.
The table shows how the size of the logs influences the sizes of the CT and that of the model.
Note that, as commented before, the increase in the log size does not necessarily mean an increase in the size of the other structures.
Let us take as an example the fifth row of the table. It shows that, though the size of the log is large compared to other log sizes in the table, the sizes of the CT and of the model happen to be smaller than, for example, those shown in the row immediately above, where the log size is about four times smaller. This indicates that the redundancy of information in the log described in the fifth row is much greater than that in the log described in the fourth row.
Cases in which a smaller log generates a larger model than a bigger log (e.g., the sixth row compared to the seventh) result from the quantity and type of attributes used to refine the model, since these refinements can create more contexts and, consequently, more states.
7.2.3 Known Limitations
The LTSE tool has some known limitations. One such limitation is the absence of a graphical interface. Although this means that the execution of the tool requires less processing power, since no graphics processing is required, it would be desirable to have a more user-friendly interface.
The representation of a method execution as an action whose name matches that of the method it describes seems a natural choice. However, it hampers the use of some features provided by programming languages, in particular those related to object-oriented programming. Method overloading cannot be represented in the model, since a call to any version of a method m will result in the introduction of an action m in the model, regardless of the parameters and return type. Hence, polymorphism is not represented either.
Support for inheritance is not provided, as it requires access to the code of the superclasses, which may not be available, especially if they belong to some third-party library. Moreover, as mentioned before, the tool does not distinguish between methods that have the same name, even if they are in different classes. Therefore, possible overriding would not be captured.
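The effect of this naming scheme can be illustrated with a small sketch. It is only an illustration and does not reflect how the LTSE tool collects method names: the classes Channel and Socket, the helper actionName and the use of reflection are assumptions made for the example.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not how the LTSE tool gathers actions.
class OverloadSketch {

    // Hypothetical component with two overloaded versions of send.
    static class Channel {
        void send(int value) { }
        void send(String text) { }
        void close() { }
    }

    // Hypothetical unrelated class that also declares a method named close.
    static class Socket {
        void close() { }
    }

    // Naming scheme described in the text: the action is the bare method name,
    // so parameters, return type and declaring class are all discarded.
    static String actionName(Method method) {
        return method.getName();
    }

    // Collects the action names produced by the methods of the given classes.
    static Set<String> alphabet(Class<?>... classes) {
        Set<String> actions = new TreeSet<>();
        for (Class<?> c : classes) {
            for (Method m : c.getDeclaredMethods()) {
                if (!m.isSynthetic()) {
                    actions.add(actionName(m));
                }
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        // Four declared methods collapse into two actions.
        System.out.println(alphabet(Channel.class, Socket.class)); // prints [close, send]
    }
}
```

Under this scheme, both versions of send become the single action send, and Channel.close and Socket.close become the single action close, which is why overloading and overriding cannot be told apart in the model.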
In order to handle the end of a log file and guarantee that no information is lost, the tool builds models under the assumption that the execution of a component always terminates (normally or abnormally). This means that it introduces a reference either to the predefined END state (for normal termination) or to a FINAL state (for abnormal termination). In cases where the execution does not finish (infinite loop) but is interrupted, the inclusion of a FINAL state means that the tool could not find a next context to connect the last one to and, therefore, it has connected the last context to the FINAL state. During the analysis, this transition may