Ontologically Correct Taxonomies by Construction: A Graph Grammar-Based

To validate our proposal, we formalize the model building operations as a graph grammar incorporating both microtheories. A graph grammar is a formal way to specify an initial graph and a set of rules for transforming the graph.

Context and Motivation

In this work we also use an additional fundamental theory called MLT (Multi-Level Theory) (ALMEIDA et al., 2018; CARVALHO; ALMEIDA, 2018), which allows us to tackle taxonomies that also include types of high order, i.e. multi-level taxonomies. As shown in (FONSECA et al.,2021;BRASILEIRO et al.,2016a;DADALTO et al.,2021), multi-level taxonomies are widely observed in practice and suffer from a large number of modeling errors that can be attributed to the inadequate use of subclassification and instantiation in combination.

Contribution

In other words, a representation system based on this ontological theory is a model language, i.e., a system whose basic building blocks are not low-granularity primitives such as types and relations, but finer-grained models. higher formed by types and relationships.

Approach

To illustrate the concept of the forbidden absence of an edge, consider the transformation rule shown in Figure 3, which is applicable in the graphs in Figure 4(a,b) but not in Figure 4(c). We then use the tool's condition exploration feature to examine the taxonomy forms generated by the grammar. Taxonomies of size 2 also reveal violations of the rules involving species (a species specializing in another species or any other sort number, Figure 20).

The Forbidden Fragment provides the choice of the right base type for the specialization (the base type that represents the Individua class). As shown in Figure 29, the categorizer specializes a base type that is the powertype of the base type in the lower order (to which the categorized base type belongs). The formal verification of the graph grammar presented in Section 4.1 follows the same procedure discussed in Section 3.2 for the grammar of durable types.

Consequently, the goal of sound analysis is to verify that no form of taxonomy in the language is incorrect. To analyze the completeness of the MLT-based grammar, we repeat the same procedures described in Section 3.2.2, where we consider regular and irregular taxonomic forms. Combining the rules given in Figures 14(a-d) and 27 leads to the new rules shown in Figures 45(a-d). a) new first-order type (b) new first-order category.

Structure

Graph Grammars

Graphs

Graphs and diagrams provide a simple and powerful approach to a variety of problems typical of computer science in general and software engineering in particular. These notations produce models that can be easily viewed as graphs, and thus graph transformations are involved.

The GROOVE Graph Transformation Tool

The green (continuous thick) 'creator' elements, in Figure 1 the node Can and the edge labeled 'child' from the node B to it, are created. When the rule is applied, both the edge labeled 'child' from nodeBto nodeD and nodeDa are removed.

Ontological Foundations

As an example, the kind 'Person' can be specialized in the (biological) subtypes 'Man' and 'Woman'. An example of a category is 'Physical Object', which represents properties of all kinds of entities that have masses and spatial extensions (eg people, cars, clocks, buildings); an example of an anti-rigid non-sort is 'Customer', which represents conditional properties for all its instances (ie, no customer is necessarily a customer), which can be of type 'Person' and 'Organization';.

The Multi-Level Theory

These fundamental concepts for multilevel modeling have been formally captured in the multilevel theory of MLT, described in various publications in recent years (ALMEIDA; . FRANK; KÜHNE, 2018; CARVALHO et al., 2017; CARVALHO; ALMEIDA, 2018 ). The following key definitions are proposed for MLT (ALMEIDA et al., 2018): Individuals are those entities that cannot possibly play the role of a type in an instantiation relation. Examples include person, dog, planet, car. Second-order types are those types whose instances are first-order types.

The second-order type can be defined as the power type of the first-order type, and so on. DADALTO et al.,2021), this taxonomy is problematic as 'mayor' is at the same time an example of 'Position' and a specialization of 'Position' (through 'Public Office').

Related Works

Further empirical evidence discussed in (DADALTO et al.,2021;BRASILEIRO et al.,2016a;FONSECA et al.,2021) shows that this is a large-scale problem in practice. Viatra2 (VARRÓ; BALOGH,2007) and Henshin (ARENDT et al.,2010) are model transformation tools that are part of the Eclipse Modeling Framework (EMF), and therefore can handle Ecore models. In the work described in (RUY et al.,2017), the authors argued that instead of proposing methodological rules or semantically motivated syntactic constraints based on the axiomatization of ontological theories, a representational system to model to build according to ontological constraints can use a more productive strategy, which makes use of the fact that formal constraints from the theory impose a correspondence between each specific type of type (characterized by ontological meta-properties) and certain modeling structures (or modeling patterns) .

Our proposal instead formalizes a grammar with concrete ontological patterns as model constructions; (ii) second, OntoClean is focused (even if not explicitly) on object types (also called substantive types) as opposed to more generally addressing taxonomies formed by other durable types (e.g. types whose instances are dependent entities , such as symptoms, marriages, registrations, etc. GUIZZARDI et al.,2018)); (iii) Finally, OntoClean does not address the construction of higher-order types. The first two points mentioned also apply to the graph grammar proposed in (BATISTA et al.,2021).

Graph Transformation Rules for Ontologically Correct Taxonomies 32

Introducing Dependent Types

Introducing Specializations for Existing Non-Sortal Types

Introducing Generalizations for Existing Sortal Types

Summary of the Grammar Rules

Formal Verification

Verifying Soundness

The first step to verify the reliability of the proposed graph grammar is to summarize its language, that is, to construct all possible taxonomy forms that can be achieved by any sequence of rule applications. When the exploration was performed, the tool managed to generate a total of 2,123,196 taxonomy forms up to a limit 𝑁 = 6, with a breakdown of this total per bound value shown in Table 1 . The table also shows that our reliability objective is validated (at least up to 𝑁 =6), with no taxonomy shapes flagged as incorrect by the graph conditions.

Given the inherent exponential growth of the number of possible taxonomic forms with respect to bound𝑁, it was not possible to continue the exploration for 𝑁 =7 and beyond due to memory limitations (execution stopped after several million partially constructed models.) This state space explosion is a common problem for all explicit state model viewers such as GROOVE (GHAMARIAN et al., 2012). Experimental results show that exhaustive testing within a small finite domain actually captures all typical system faults in practice (ROBERSON et al., 2008), and numerous case studies using the Alloy tool have confirmed the hypothesis by performing analysis at different scales. and retrospectively show that a small scale would be sufficient to find all the bugs discovered (JACKSON, 2019).

Verifying Completeness

To perform the completeness analysis, we need to take into account not only the correct taxonomy forms, but also the incorrect ones. To this end, we have developed another fully allowable graph grammar that can be used to create both correct and incorrect models. Given that the allowed grammar produces all possible models (correct and incorrect), we can conclude that the taxonomy grammar of Section 3.1 produces all correct taxonomy forms up to that format, and only the correct ones.

These tables 3,4 and 5 show that none of the taxonomy forms (the same 2,123,196 correct ones shown in table 1, and the incorrect ones) is an incorrect taxonomy form that is not produced by the graph grammar shown in 3.1, providing additional evidence that the taxonomy grammar of Section 3. 1 produces correct forms, and all. In other words, these tables provide additional evidence that the graph grammar of our stable types is not only sound but also complete.

Verification Scope Matters

This chapter presents our second proposed graph grammar that takes into account the constraints imposed on types by Multi-Level Theory (Section 4.1). This graph grammar includes the identified transformation rules for constructing correct multilevel taxonomies, i.e. taxonomies involving types of different levels of classification.

Graph Transformation Rules for Correct Multi-Level Taxonomies

Introducing a New First-Order Type
Introducing a New Order
Introducing a New High-Order Type
Introducing Specialization Relations
Introducing Instantiation Relations
Summary of the Grammar Rules

The model can be extended to accommodate higher type levels by creating a power type of the highest order base type previously present in the model ie. power type without an existing power type in the model (see Figure 28). Note that since all specializations of a base type are instances of its power type (which is a higher-order base type), we omit instances of that higher-order base type as they are redundant. This prevents the situation where an entity is an instance of a subtype without being an instance of a supertype.

The new instance of the high-order type must be an instance of all its supertypes that are not basic types, to ensure that the semantics of specialization are enforced. The introduction of new high-order types, which specialize some high-order basic type and categorize some type at one level below;.

Formal Verification of MLT Constraints

Constraints Related to Basic Types

Further proof of the completeness of our MLT-based grammar can be provided by following a strategy similar to that given in Section 3.2.2. In other words, in this grammar, which is presented in Appendix F, categorization relations do not necessarily occur between types of adjacent levels, i.e., the constraints shown in Figures 38(a, b, c) do not apply, but the constraint shown in Figure 40is. The second grammar, shown in Appendix G, produces taxonomic forms in that the specialization relations do not necessarily respect stratification, so the constraint shown in Figure 40 does not apply.

Finally, a third grammar, presented in Appendix H, produces taxonomy forms in the sense that instantiation relations do not necessarily respect the definitions of categorization and specialization, that is, the constraints presented in Figures 41 and 42 are not enforced. As in 3.2.2, this provides further evidence in favor of the completeness of our MLT-based grammar.

Graph Transformation Rules to Build Ontologically Correct Multi-

Introducing First-Order New Types
Introducing First-Order Dependent Types
Introducing a New Order
Introducing High-Order New Types
Introducing High-Order Dependent Types
Introducing Specializations for Existing Non-Sortal Types
Introducing Generalizations for Existing Sortal Types
Classifying Types
Summary of the Grammar Rules

Again, the instantiations of basic types remain implicit as can be deduced from the power type declaration: all subtypes of the base type are by definition instances of the power type. a) new-higher-order kind (b) new-higher-order category. This is performed by the rules shown in Figures 49(a,b), which are the product of combining the original rules presented in Figures 15, 16 and 29. a) new-higher-order-subspecies (b) new-high-order-antirigid-sortal. The introduction of new first-order types that directly specialize in the basic type that the Individual class represents (introduction of first-order kinds and non-kinds);

The introduction of new high-order types that directly specialize in some basic high-order type and the categorization of a type at a level below (introduction of high-order types and non-types). The introduction of new high order sort numbers that require an existing high order sort number at the same level to inherit an identity principle, categorizing some type at one level below;.

Formal Verification

The typology of persistent types is part of the Unified Foundational Ontology (UFO) (GUIZZARDI et al., 2015), which underlies the Ontology-Driven Conceptual Modeling language OntoUML (GUIZZARDI et al., 2018). The theory of higher-order types was conceived as a basis for multi-level models (ALMEIDA et al.,2018; CARVALHO; ALMEIDA,2018) and was also successfully applied in a number of initiatives, including the definition of a well-founded multi-level modeling language (FONSECA et al., 2021). As a result of the logical characterization of these meta-properties, we have that certain structures (patterns) are imposed on the language primitives that represent these distinctions (RUY et al.,2017).

The work proposed here is inspired by the work in (ZAMBON; GUIZZARDI, 2017) and advances the work started in (BATISTA et al., 2021). This plugin is intended to be used together with the gUFO ontology (a lightweight implementation of UFOs) (ALMEIDA et al., 2019).

Limitations

Using a well-grounded multi-level theory to support the analysis and representation of the Powertype pattern in conceptual modeling. In: AUSTRALIAN COMPUTER SOCIETY, INC.Proceedings of the Sixth Asia-Pacific Conference on Conceptual Modeling-Volume 96. In: IOS PRESS.Formal Ontology in Information Systems: Proceedings of the Seventh International Conference (FOIS 2012).

Future Work