ModOnto: A Suite of Tools for Modularizing Ontologies


“ModOnto: A Suite of Tools for Modularizing Ontologies”, by Camila Bezerra da Silva. M.Sc. Dissertation. Universidade Federal de Pernambuco, posgraduacao@cin.ufpe.br, www.cin.ufpe.br/~posgraduacao. Recife, August 2009.

Universidade Federal de Pernambuco, Centro de Informática, Graduate Program in Computer Science. Camila Bezerra da Silva. “ModOnto: A Suite of Tools for Modularizing Ontologies”. An M.Sc. dissertation presented to the Graduate Program in Computer Science of the Centro de Informática, Universidade Federal de Pernambuco, in partial fulfillment of the requirements for the degree of M.Sc. in Computer Science. Advisor: Prof. Dr. Frederico Freitas. Co-Advisor: Prof. Dr. Jérôme Euzenat. Recife, August 2009.


To my dear mother.

Acknowledgements

First of all, I thank God for His presence in my life and for the support He gave me to finish this work despite the difficulties I went through in the last months. I want to thank my family, especially my dear mother, Jacira, who always supports me, advises me and guides me along the paths of life; were it not for her constant struggle to give me a better life, I would not be where I am today. I want to express my thanks to my advisors, Fred Freitas and Jérôme Euzenat, for their guidance, patience, dedication and encouragement. The work presented in this dissertation would not have been possible without their support and advice. I am especially grateful for the opportunity Jérôme gave me to spend three months at INRIA under his supervision, which increased my knowledge and allowed me to finalize the implementation of ModOnto. I would like to thank my boyfriend, José Dihego, who has been at my side giving support in the difficult moments. I also want to thank all the people who helped me, directly or indirectly, in carrying out this work.

"The greater the obstacle, the more glory in overcoming it." Jean-Baptiste Poquelin (Molière)

Resumo

Owing to the problems related to managing and reasoning over large ontologies, strong interest and consequent active research in ontology modularization have been emerging in the scientific community around the Semantic Web. Since many ontologies are rather large artifacts, their large-scale adoption, for example in the Semantic Web, requires allowing ontology developers to include only the entities and axioms that are relevant to the application they are developing. Besides reuse, ontology modularization is useful for many other tasks, including query support, distributed reasoning, large-scale development and ontology maintenance. Some modularization approaches have been proposed; however, none of them offers a flexible tool that allows not only the definition of modules but also other kinds of tasks, such as syntactic and semantic analysis, a module library and a powerful tool for entity selection. This dissertation proposes a set of tools, called ModOnto, to meet these requirements. It incorporates an approach to ontology modularization that inherits some of the principles of object-oriented Software Engineering, namely encapsulation and information hiding.

Keywords: Ontology modularization, Reuse, Module extraction.

Abstract

Owing to problems with managing and reasoning over large ontologies, strong interest and consequent active research in ontology modularization have been emerging in the Semantic Web research community. Since many ontologies are rather large artifacts, their large-scale adoption, e.g. in the Semantic Web, requires enabling ontology developers to include only those entities and axioms that are relevant for the application they are developing. In addition to reuse, ontology modularization is useful for many other tasks, including query answering, distributed reasoning, scalable evolution and maintenance. Some ontology modularization approaches have been proposed; nevertheless, none of them offers a flexible tool that permits not only ontology module definitions but also other support tasks, such as syntactical and semantic checking of the modules, a module library and powerful entity selection facilities. This dissertation proposes a suite of tools, called ModOnto, to meet these requirements. It incorporates an approach to ontology modularization that inherits two of the main principles of object-oriented Software Engineering, namely encapsulation and information hiding.

Keywords: Ontology modularization, Reuse, Module extraction.

Contents

List of Codes
Glossary
1 Introduction
  1.1 Context and Motivation
  1.2 Goal
  1.3 Contributions
  1.4 Applications
  1.5 Text Organization
2 Ontology
  2.1 Introduction
  2.2 Ontology Components
  2.3 Ontology Engineering
  2.4 Description Logics
    2.4.1 Description Logic Reasoning
  2.5 Ontology Languages
    2.5.1 The first languages
    2.5.2 The Semantic Web and its Languages
    2.5.3 OWL
  2.6 Ontology Tools
    2.6.1 Ontology development tools
      OILEd
      OntoEdit
      OntoLingua Server
      Protégé
    2.6.2 Ontology matching and integration tools
      Chimaera
      ODEMerge
      Alignment API
    2.6.3 Neon ToolKit
    2.6.4 Conclusion

3 Ontology Modularization
  3.1 Reusability
  3.2 Motivation
    3.2.1 Syntactical and Semantic Modularity
      short title
      Semantic Modularization
    3.2.2 Coupling and Cohesion
  3.3 Definition
  3.4 Partitioning Approaches
    3.4.1 Structured-Based Partitioning
    3.4.2 Ontology Partitioning using ε-connections
  3.5 Formal Interpretation
    3.5.1 DDL
    3.5.2 IDDL
  3.6 Approaches
    3.6.1 COWL
    3.6.2 ε-connections
    3.6.3 PDL
  3.7 Selection Approach
    3.7.1 ModTool
    3.7.2 NeOn
  3.8 Discussion
    3.8.1 Alignments
    3.8.2 OO principled ontology modules
    3.8.3 Tool Support for Ontology Modularization
4 ModOnto
  4.1 The OntoCompo Approach
    4.1.1 Language
      Language Syntaxes
      Language Semantics
  4.2 Features and Functionalities
  4.3 Architecture
    4.3.1 Module API
    4.3.2 Module Extractor
    4.3.3 Module Linker

    4.3.4 Module Checker
    4.3.5 Module Reasoner
    4.3.6 Module Library
  4.4 Used Technologies
    4.4.1 OWL API
    4.4.2 Alignment API
    4.4.3 IDDL Reasoner
      Classical Reasoning
      Distributed Reasoning
  4.5 Usage Example
    4.5.1 Context
      Human Readable Syntax
    4.5.2 Creating the module in the ModOnto
    4.5.3 RDF Format
    4.5.4 Alignment process
    4.5.5 Syntactic and Semantic Checker
5 Conclusions and Future Work
  5.1 Contributions
  5.2 Future Work
  5.3 Publications
Bibliography
Appendices
  Diagrams of Classes
  Code of the Usage Example

List of Figures

2.1 Classes and instances
2.2 Instances
2.3 Relations
2.4 Ontology markup languages
2.5 RDF Triple
2.6 OntoEdit
2.7 Protégé
2.8 The steps of the ODEMerge Methodology
2.9 An alignment generated by the Alignment API
2.10 Neon ToolKit
3.1 Distributed system
3.2 C-OWL mapping from the family ontology to the familia ontology
3.3 Link Property
3.4 Swoop
3.5 P-DL example
3.6 ModTool’s algorithm
3.7 ModTool’s methodology
3.8 Modularization plugins for the NeOn Toolkit
3.9 Metamodel for ontology modules
3.10 Metamodel for ontology modules
3.11 Metamodel for ontology modules
3.12 Comparative Evaluation
4.1 Module Language tags and descriptions
4.2 ModOnto Architecture
4.3 Module API Architecture
4.4 Module Extractor
4.5 Module Linker
4.6 Module B uses module A
4.7 Module Tab
4.8 Module Library
4.9 Implementation aspects of the OWL API
4.10 Dependencies between the ontologies and the Menu module

4.11 Importing entities from Pizza Ontology
4.12 Creating the NorthEasternBrazilianPizza class
4.13 Module Menu
1 Renderer Diagram
2 Parser Diagram
3 Syntactical Checker Diagram
4 Library Diagram
5 Reasoner Diagram

List of Tables

2.1 OWL Constructs

List of Codes

3.6.1 Module in XMLSchema format
4.1.1 Module in XMLSchema format
4.1.2 Module in XMLSchema format
4.5.1 Menu Module in RDF Syntax
4.5.2 NorthEasternBrazilianPizza class
4.5.3 Alignment in RDF format between the Pizza ontology and the Menu Module
.0.1 Menu Module in RDF Syntax
.0.2 The contains and the exports of the Menu module in RDF Syntax
.0.3 NorthEasternBrazilianPizza class

Glossary

A
AI: Artificial Intelligence
API: Application Programming Interface

D
DDL: Distributed Description Logics
DL: Description Logics

I
IDDL: Integrated Distributed Description Logics

K
KB: Knowledge Base
KIF: Knowledge Interchange Format
KM: Knowledge Management
KR: Knowledge Representation

O
OBO: Open Biomedical Ontologies
OKBC: Open Knowledge Base Connectivity
OO: Object Oriented
OWL: Web Ontology Language

P
P-DL: Package-based Description Logics

R
RDF: Resource Description Framework

T
Turtle: Terse RDF Triple Language

X
XML: Extensible Markup Language

1 Introduction

1.1. Context and Motivation

The Semantic Web emerged as an extension of the Web that provides automated access to information based on machine-processable semantics of data and on heuristics that use these metadata (Berners-Lee et al. (2001)). The idea is that machines can understand the data available on the Web. The explicit representation of the semantics of data using ontologies will enable the Web to provide a higher quality of service. However, the success of the Semantic Web depends on a number of efforts; one of them is ontology reuse.

Moreover, ontology construction is deemed to be a time-consuming and labour-intensive task. Therefore, it relies heavily on the possibility of reusing existing ontologies. The subfield that provides methodologies to guide developers is known as Ontology Engineering. Most of its methodologies share iterative phases, such as specification, implementation and evaluation, with Software Engineering techniques. However, a culture of building blocks for ontologies, comparable to the component culture that took over object-oriented Software Engineering, has not yet emerged in Ontology Engineering.

Ontology modularization comes up as a way to solve the problem of defining a part of an existing ontology to be reused, so that ontology developers can include only those concepts and relations that are relevant for the application they are modeling.

Ontology editors such as Protégé (http://protege.stanford.edu/) allow the reuse of another ontology by including it in the model being designed (in Protégé this happens through the inclusion of other projects). The Web Ontology Language OWL (http://www.w3.org/TR/owl-features/) offers the possibility to import an OWL ontology by means of the <owl:imports> statement. The semantics of this statement is that all definitions contained in the imported ontology file are included in the importing ontology, as if they were defined in the importing ontology file. Moreover, owl:imports is directed, i.e., only the importing ontology is affected, and transitive, e.g., if A imports B and B imports C, then A imports all definitions contained in C.

The problem with these reuse techniques is that when they import a given ontology, its whole content is included in the new one. It is therefore not possible to import only the relevant parts of the ontology; all definitions are included. This is a serious limitation that leads to several problems:

• Reasoning scalability: reasoners are known to perform well on small-scale ontologies, with performance degrading rapidly as the size of the ontology increases.
• Complexity management: the larger the ontology, the more difficult it is to control the accuracy of the design.
• Understandability: it is easier to understand the contents of an ontology if the ontology is small.

To deal with the lack of ontology modularity, several approaches have been proposed; some of them are presented in Chapter 3. At least one large ontology editor has already implemented modularization, the NeOn toolkit (http://www.neon-toolkit.org/). Indeed, the NeOn project received a budget of almost 15 million euros and gathers 17 European research institutions in order to produce an ontology toolkit that is much more flexible and powerful than the current ones, and one of its relevant branches concerns ontology modularization.

Nevertheless, although several ontology modularization approaches have already been proposed, none of them yet offers a flexible tool that permits not only ontology module definitions but also other support tasks, such as syntactical and semantic checking of the modules, a module library and powerful entity selection facilities. Most modularization approaches focus on linking ontologies (or modules) rather than on building modules that encapsulate foreign parts of ontologies (or other modules) and can therefore be managed more easily.
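The directed and transitive behaviour of owl:imports described above can be illustrated with a small sketch. It is not part of ModOnto or of any OWL library: the import graph, the ontology names A, B and C, and the definition names are hypothetical placeholders chosen to mirror the example in the text.

```python
from collections import deque

# Hypothetical import graph: each ontology maps to the ontologies it imports directly.
IMPORTS = {
    "A": ["B"],
    "B": ["C"],
    "C": [],
}

# Hypothetical definitions declared locally in each ontology file.
DEFINITIONS = {
    "A": {"Menu"},
    "B": {"Pizza", "Topping"},
    "C": {"Food"},
}


def imports_closure(ontology, imports):
    """Return every ontology reachable from `ontology` by following imports."""
    seen = set()
    queue = deque(imports.get(ontology, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also protects against import cycles
        seen.add(current)
        queue.extend(imports.get(current, []))
    return seen


def visible_definitions(ontology, imports, definitions):
    """Local definitions plus everything pulled in through the import closure."""
    result = set(definitions.get(ontology, set()))
    for imported in imports_closure(ontology, imports):
        result |= definitions.get(imported, set())
    return result


if __name__ == "__main__":
    print(sorted(imports_closure("A", IMPORTS)))
    print(sorted(visible_definitions("A", IMPORTS, DEFINITIONS)))
    print(sorted(visible_definitions("C", IMPORTS, DEFINITIONS)))
```

In this sketch A ends up with every definition of B and C, while C is unaffected by the ontologies that import it. There is no way to ask for only part of B, which is the limitation that motivates modularization.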

1.2. Goal

The objective of this dissertation is to propose a suite of tools for ontology modularization that tackles the problems cited above. It aims at providing an environment for building modules by reusing ontologies or other modules. Moreover, it provides other support tasks, such as checking the consistency of modules both syntactically and semantically.

This work is part of a cooperation project between CIN-UFPE and INRIA (Institut National de Recherche en Informatique et en Automatique, http://www.inria.fr/), entitled “Composition and Modules for Ontology Engineering” (OntoCompo) and funded by CNPq (National Council for Scientific and Technological Development, http://www.cnpq.br/) and INRIA. The goal of the project is to build an approach to modularization that inherits two of the main principles of object-oriented Software Engineering, namely encapsulation and information hiding. The approach encompasses the definition of a language, with syntaxes and semantic notions, to represent modules.

1.3. Contributions

The main contribution of this dissertation is the suite of tools that supports building ontology modules based on the OntoCompo approach. The suite offers support tools such as a library of modules, which helps developers choose suitable modules to be reused in a new one. In addition, at least three novel, distinctive features are present which, to the best of our knowledge, are not implemented in any other ontology module tool:

1. Semantic checking of the module being created. This checking is accomplished in a very flexible manner: users can choose between two semantics, the usual DL semantics, which is suitable for ordinary DL ontologies, or the Integrated DL semantics based on IDDL (Zimmermann (2007)), which is adequate for distributed DL ontologies. To achieve this, we had to integrate an IDDL reasoner with the suite.
2. Syntactical checking, which ensures that the module does not contain any syntactical errors.
3. Support for the use of alignments between the imported modules (or ontologies) and the new one.
4. Compliance with a module format that is in the process of standardization, the NeOn module format.

From the literature review carried out for this work, we have come to the conclusion that the different proposals (PDL, ε-connections, C-OWL, etc.) are not well supported by tools. We were not able to find a module development environment that supports not only the creation of modules but also other support tasks, such as syntactical and semantic checking of the modules, and other tasks necessary for the sound and safe development of a module. Therefore, this work can be viewed as an attempt to make up for the lack of this type of tool. An additional research interest was to build such a tool starting from the ontology module language defined in the project.

The suite has been developed in Java and uses three APIs: the OWL API (http://owlapi.sourceforge.net) from Manchester, the Alignment API (Euzenat (2004)) and the IDDL Reasoner (Zimmermann and Duc (2008a)) from INRIA.

1.4. Applications

The present work may have an impact in several application scenarios, including the following:

Collaborative Development: the process of building an ontology usually involves cooperation among several domain experts. ModOnto provides support for the efficient collaborative building of large, modular ontologies.

Ontology Reuse: the lack of modularity in current ontology languages forces an ontology to be reused as a whole, even if only a small fragment of it is actually needed. ModOnto provides mechanisms to build ontology modules by reusing small fragments of several imported ontologies (or modules).

Support Tasks for Module Development: besides module development, ModOnto offers other support tasks, such as a module library.

1.5. Text Organization

This work is described in five chapters. The present chapter gives the reader an overview of the work and of the context and motivation behind it.

Chapter 2 describes the concept of ontology and the methodologies, tools and languages for ontology development. The goal is to show the available mechanisms for developing ontologies and which of them support reuse. Chapter 3 describes ontology modularization, which comes up as a way to develop ontologies by reusing parts of other ontologies, together with the existing approaches. Chapter 4 presents the main contribution of this dissertation, the suite of tools for ontology modularization, and the OntoCompo approach. It also gives an example of using ModOnto, covering module development as well as other support tasks such as the syntactical checker. Chapter 5 presents the conclusions and contributions of this work and proposes future work.

2 Ontology

2.1. Introduction

The word ontology stems from the field of Philosophy, where it means the explanation of being. The goal was to try to define all things in the world. In this context, philosophers try to answer questions like “What is a being?”. In mainstream Computer Science, ontologies were introduced in the 1980s, in the area of Artificial Intelligence, to represent knowledge to be employed in the automatic inferences of knowledge-based systems.

Ontologies have been used as a way of organizing knowledge, mainly with regard to the formal representation of knowledge. Their main purpose is to enable the sharing and reuse of knowledge. There are several definitions in the literature, which provide different and complementary points of view. One of the first definitions was given by Neches et al. (1991), who defined an ontology as follows: “An ontology defines the basic terms and relations comprising the vocabulary of a topic area as well as the rules for combining terms and relations to define extensions to the vocabulary.” This definition provides some vague guidelines: define terms, relations and rules to combine terms. A few years later, Gruber (1993) defined an ontology as “an explicit specification of a conceptualization”. After this, many definitions of what an ontology is were proposed. Borst (1997) later modified Gruber’s definition as follows: “Ontologies are defined as a formal specification of a shared conceptualization”.

Conceptualization refers to an abstract model of some phenomenon in the world, obtained by identifying the relevant concepts of the represented domain. Explicit means that the types of concepts used and the constraints on their use are explicitly defined. Formal refers to the fact that the ontology should be machine readable and processable. Shared reflects the notion that an ontology captures consensual knowledge, that is, knowledge that is not private to some individual but accepted by a group. This definition became the most quoted in the literature and by the ontology community.

An ontology provides a shared vocabulary that can be used to model a domain, that is, the types of concepts that exist and their properties and relations. Ontologies are widely used in different areas: Knowledge Engineering, Databases and Software Engineering.

2.2. Ontology Components

According to Perés and Corcho (2002), an ontology is composed of five components:

1. Classes - represent sets of individuals with common characteristics in a domain. They are organized in taxonomies and can be abstract or concrete concepts, simple or complex, real or unreal. Figure 2.1 illustrates two classes: the Country class, which represents the set of countries of the world, and the Person class, which represents a set of people.

Figure 2.1 Classes and instances.

Figure 2.2 Instances.

2. Instances - represent objects in the domain. Individuals of the classes Country and Person can be seen in Figure 2.2. Japan, for instance, is an individual of the Country class.

3. Relations - represent connections between two classes. Naturally, relations can also be instantiated. For instance, in Figure 2.3 the individuals of the Person class are related to the individuals of the Country class through the “wasBorn” property. Thus the individual Chan of the class Person is related by the relation “wasBorn” to the individual Vietnam of the class Country.

Figure 2.3 Relations.

4. Axioms - statements about the elements that are always true. Example: if an animal is a carnivore, it eats meat, where Animal and Meat are classes and “eats” is the relation between them.

5. Functions - complex structures formed from certain relations that can be used in place of an individual term in a statement. Example: Mother is a function that relates an animal to exactly one female animal.

Having defined the elements of an ontology, we now discuss how ontologies are built, a subarea known as Ontology Engineering.

2.3. Ontology Engineering

According to Gómez-Pérez and colleagues, Ontology Engineering (OE) refers to the set of activities concerning the process, life cycle, methods, methodologies, tools and languages that support ontology development (Gomez-Perez et al. (2004)).
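As an illustration of the components listed in Section 2.2, the sketch below models classes, instances and relations with a small in-memory structure and rebuilds the Country/Person example of Figures 2.1 to 2.3. The `Ontology` class and its method names are hypothetical and exist only for this illustration; they do not correspond to the OWL API or to ModOnto, and axioms and functions are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy container for three of the five components: classes, instances, relations."""
    classes: set = field(default_factory=set)
    instances: dict = field(default_factory=dict)  # individual name -> class name
    relations: set = field(default_factory=set)    # (subject, relation name, object)

    def add_class(self, name):
        self.classes.add(name)

    def add_instance(self, individual, cls):
        # An instance must belong to a class that was declared beforehand.
        if cls not in self.classes:
            raise ValueError(f"unknown class: {cls}")
        self.instances[individual] = cls

    def relate(self, subject, relation, obj):
        # A relation instance connects two individuals that are already known.
        for individual in (subject, obj):
            if individual not in self.instances:
                raise ValueError(f"unknown individual: {individual}")
        self.relations.add((subject, relation, obj))

    def instances_of(self, cls):
        return {name for name, c in self.instances.items() if c == cls}


def build_example():
    """Rebuild the Country/Person example used in the text."""
    onto = Ontology()
    onto.add_class("Country")
    onto.add_class("Person")
    onto.add_instance("Japan", "Country")
    onto.add_instance("Vietnam", "Country")
    onto.add_instance("Chan", "Person")
    onto.relate("Chan", "wasBorn", "Vietnam")
    return onto


if __name__ == "__main__":
    example = build_example()
    print(sorted(example.instances_of("Country")))
    print(sorted(example.relations))
```

Keeping instances and relation triples separate from class declarations mirrors the distinction the text draws between the vocabulary of a domain and the individuals described with it.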

When we intend to represent something in an ontology, it is necessary to make design decisions. Gruber (Gruber (1995)) proposed a set of design criteria for ontology development:

Clarity: an ontology should effectively communicate the intended meaning of defined terms. The definitions should be objective and, as much as possible, independent of social or computational context.

Coherence: an ontology should sanction only inferences that are consistent with the definitions. At the least, the defining axioms should be logically consistent.

Extendibility: an ontology should be designed to anticipate the uses of the shared vocabulary. It should be possible to define new terms for special uses based on the existing vocabulary, in a way that does not require the revision of the existing definitions.

Minimal encoding bias: the conceptualization should be specified at the knowledge level without depending on a particular symbol-level encoding. Encoding bias should be minimized, because knowledge-sharing agents may be implemented in different representation systems and representation styles.

Minimal ontological commitment: an ontology should require the minimal ontological commitment sufficient to support the intended knowledge sharing activities. Thus, an ontology should make as few claims as possible about the domain being modeled, allowing the parties committed to the ontology to specialize and instantiate the ontology as needed.

In recent years, Ontology Engineering has attracted a great deal of attention from researchers. Hence, many methodologies, tools and languages have appeared to assist the task of building ontologies. When a new ontology is about to be built, several questions arise during its development regarding the methodology, tools and language to be used, such as the ones listed below:

• Which methods and methodologies can I use for building ontologies?
• How can I reuse other ontologies?
• How can the applications use the existing ontologies?
• Which language(s) should I use to implement my ontology?

(28) 2.4. DESCRIPTION LOGICS. • Is the language chosen appropriate for exchanging information between different applications? • Is the language chosen supported by any development tool? • What are the inference mechanisms attached to the ontology language? In order to solve these questions it is needed to know about the methodologies, tools, languages, domain of the ontology, and so on. The next sections present the main characteristics of methodologies, tools and languages, which can help to solve the previous questions. First, we explain the Description Logic which is a formalism that supports several languages like OWL.. 2.4. Description Logics. Description logics (DL) means a family of knowledge representation (KR) formalisms with the purpose of representing knowledge of an application domain by defining the relevant concepts of the domain, and using these concepts to specify properties and individuals occurring in the domain. Differently from their predecessors, the name Description Logics indicates the languages which are equipped with a formal and logic-based semantics(Baader et al. (2003)). In DL classes(concepts) can be defined in terms of descriptions that specify the properties that objects must satisfy to belong to the concept. These descriptions are expressed using a language that allows the building of complex descriptions, including restrictions on the binary relationships(called roles) connecting objects. Moreover, provides to the user infer implicitly knowledge from the knowledge that is explicitly contained in a knowledge base. Decidability and complexity of the inference depend on the expressive power of the chosen DL. However, it comes up inference problems of high complexity when is used a very expressive DLs or still the results may even be undecidable. On the other hand, very weak DLs may not be sufficiently expressive to represent the important concepts of a given application. 
A DL knowledge base (KB) comprises two components:

• TBox (Terminological Box): a set of inclusion axioms and equivalence axioms. The TBox introduces the terminology, i.e., the vocabulary of an application. The vocabulary consists of concepts, which denote sets of individuals, and roles, which denote binary relationships between individuals. Example: {Father ⊑ Person, GrandFather ≡ Person ⊓ ∃hasChild.Parent}
• ABox (Assertional Box): contains assertions about named individuals in terms of this vocabulary. Example: {Peter : Father, Peter hasChild John}

Assume that we want to define the concept “a happy woman is married to a good man, has at least one child and has a good job”. It can be described with the following concept description:

HappyWoman ≡ Human ⊓ ¬Man ⊓ (∀married.GoodMan) ⊓ ≥ 1 hasChild ⊓ ∀has.GoodJob

2.4.1. Description Logic Reasoning

Several ontology-based systems reason over the ontologies they use. Reasoning becomes easier when the formalism in which the ontologies are defined already provides reasoning services, as DL does. For instance, DL languages enable the classes defined in an ontology to be organized automatically into a specialisation/generalisation hierarchy (ontology classification), a reasoning task that helps to detect potential modeling errors such as inconsistent class descriptions and missing subclass relationships. Reasoners such as Pellet and FaCT++ provide these services for OWL (Web Ontology Language), the DL-based standard ontology language for the Semantic Web. They are able to infer logical consequences from a set of asserted facts or axioms, among other reasoning tasks.

One of the main services offered by a DL reasoner is to test whether or not one class is a subclass of another class (subsumption service). Another service is to check the consistency of the ontology. Based on the description of a class, the reasoner can check whether or not it is possible for the class to have any instances. A class is deemed to be inconsistent if it cannot possibly have any instances.
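To make the reading of the HappyWoman description above concrete, the following minimal Python sketch evaluates it against one small finite interpretation. This is only a membership check in a given interpretation, not a DL reasoner; the individuals (ana, bob, carl, dan, job1) and the helper names are hypothetical and chosen for illustration. Because this interpretation gives HappyWoman an instance, it also shows that the concept is satisfiable in the sense just described.

```python
# Toy finite interpretation; every individual and helper name here is hypothetical.
DOMAIN = {"ana", "bob", "carl", "dan", "job1"}
CONCEPTS = {
    "Human": {"ana", "bob", "carl", "dan"},
    "Man": {"bob", "carl"},
    "GoodMan": {"bob"},
    "GoodJob": {"job1"},
}
ROLES = {
    "married": {("ana", "bob")},
    "hasChild": {("ana", "dan")},
    "has": {("ana", "job1")},
}


def successors(role, x):
    """All y such that (x, y) is in the extension of the role."""
    return {y for (s, y) in ROLES.get(role, set()) if s == x}


def forall(role, filler):
    """Extension of (forall role.filler): every role-successor lies in filler."""
    return {x for x in DOMAIN if successors(role, x) <= filler}


def at_least(n, role):
    """Extension of (>= n role): individuals with at least n role-successors."""
    return {x for x in DOMAIN if len(successors(role, x)) >= n}


def happy_woman():
    """Human AND NOT Man AND forall married.GoodMan AND >= 1 hasChild AND forall has.GoodJob."""
    return (
        CONCEPTS["Human"]
        & (DOMAIN - CONCEPTS["Man"])
        & forall("married", CONCEPTS["GoodMan"])
        & at_least(1, "hasChild")
        & forall("has", CONCEPTS["GoodJob"])
    )


if __name__ == "__main__":
    print(happy_woman())
```

Note that the universal restriction is vacuously true for individuals with no successors: dan satisfies ∀married.GoodMan although he is not married, and is excluded only by the cardinality restriction on hasChild.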
The standard inference services offered by reasoners are:

• Consistency checking: ensures that an ontology does not contain any contradictory statements.
• Concept satisfiability: determines whether it is possible for a class to have any instances. If a class is unsatisfiable, then defining an instance of that class will cause the whole ontology to be inconsistent.
• Classification: computes the subclass relations between all named classes to create the complete class hierarchy. The class hierarchy can be used to answer queries such as getting all or only the direct subclasses of a class.
• Realization: finds the most specific classes that an individual belongs to; i.e., realization computes the direct types of each individual.

For example, consider the following sentences:

1. Drivers drive vehicles
2. ‘Bus drivers’ drive buses
3. A bus is a vehicle

In DL, these sentences are equivalent to:

Drivers ≡ ∃drive.Vehicle
BusDrivers ≡ ∃drive.Bus
Bus ⊑ Vehicle

Subsumption reasoning allows the inference that “bus drivers are drivers” (BusDrivers ⊑ Drivers), since “vehicle” is more general than “bus”.

An example of inconsistency is given below:

Human ≡ ∀eat.(Meat ⊔ Vegetable) ⊓ ∃eat.Meat ⊓ ∃eat.Vegetable
Meat ⊓ Vegetable ≡ ⊥
Vegetarian ≡ ∀eat.Vegetable
Vegetarian(Mary)
Human(Mary)

These axioms state that humans eat only meat or vegetables and eat at least some of each, and that Meat and Vegetable are disjoint classes. Vegetarians eat only vegetables, and Mary is asserted to be both a human and a vegetarian. As a human, Mary must eat some meat; as a vegetarian, everything she eats must be a vegetable. Since nothing can be both meat and vegetable, the knowledge base is inconsistent.

Next, we discuss some languages in which ontologies can be built.
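Before moving on, the subsumption example above can be checked mechanically. The sketch below is a deliberately tiny illustration, not how reasoners such as Pellet or FaCT++ work: it handles only atomic inclusions and names defined as a single existential restriction, and applies the rule that C ⊑ D implies ∃r.C ⊑ ∃r.D. The function and variable names are hypothetical.

```python
# Told atomic inclusions (Bus is a Vehicle) and definitions of the form
# Name = exists role.Filler, taken from the driver example.
TOLD = {("Bus", "Vehicle")}
DEFINITIONS = {
    "Drivers": ("drive", "Vehicle"),
    "BusDrivers": ("drive", "Bus"),
}


def atomic_subsumed(sub, sup, told=TOLD):
    """True if sub is included in sup by the reflexive-transitive closure of told."""
    if sub == sup:
        return True
    seen, stack = set(), [sub]
    while stack:
        current = stack.pop()
        for (a, b) in told:
            if a == current and b not in seen:
                if b == sup:
                    return True
                seen.add(b)
                stack.append(b)
    return False


def subsumed(sub, sup):
    """Check sub is included in sup; for defined names use:
    same role and filler_sub included in filler_sup."""
    if sub in DEFINITIONS and sup in DEFINITIONS:
        role_a, filler_a = DEFINITIONS[sub]
        role_b, filler_b = DEFINITIONS[sup]
        return role_a == role_b and atomic_subsumed(filler_a, filler_b)
    return atomic_subsumed(sub, sup)


if __name__ == "__main__":
    print(subsumed("BusDrivers", "Drivers"))
    print(subsumed("Drivers", "BusDrivers"))
```

The check succeeds for BusDrivers against Drivers and fails in the opposite direction, which matches the intuition that the inclusion follows from Bus being more specific than Vehicle.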

2.5. Ontology Languages

2.5.1. The first languages

At the beginning of the 1990s, a set of AI-based ontology languages appeared. These languages were based on first-order logic (e.g. KIF), on frames combined with first-order logic (e.g. Ontolingua, OCML (http://kmi.open.ac.uk/projects/ocml/) and FLogic) or on Description Logics (e.g. Loom).

KIF (Hayes and Menzel (2001)) was created in 1992 as an interchange format for diverse KR systems. It is based on First Order Logic (FOL). Ontolingua (Farquhar et al. (1996)) was developed afterwards at Stanford University; it is built on KIF and incorporates the frame KR paradigm. It allows the representation of concepts, taxonomies of concepts, n-ary relations, functions, axioms, instances and procedures. However, the high expressive power of the KIF-based Ontolingua comes at a cost: no general reasoning support for KIF exists.

Loom (MacGregor (1991)) was developed at the same time as Ontolingua at the Information Sciences Institute (ISI) of the University of Southern California. Loom is a language based on Description Logics and production rules. The following ontology components can be represented with this language: concepts, concept taxonomies, n-ary relations, functions, axioms and production rules.

FLogic (Kifer et al. (1990)) was developed in 1995 and combines frames and first-order logic. It provides classes, attributes with domain and range definitions, is-a hierarchies with set inclusion of subclasses and multiple attribute inheritance, and logical axioms that can be used to further characterize the relationships between the elements of an ontology and its instances.

SHOE (Heflin et al. (1999)) was built in 1996 at the University of Maryland as an extension of HTML. It combines frames and rules and uses tags that are not part of the HTML specification. SHOE only allows the representation of concepts, relations, instances and deduction rules, which its inference engine uses to obtain new knowledge. However, it was based on a database perspective rather than on a KR formalism.

OIL (Fensel et al. (2000)) had as its main goal easy adoption by Semantic Web developers. The language adds frame-based KR primitives, and its formal semantics is based on DLs. It was organized in layers to facilitate its use, and it had a reasoner, the FaCT classifier (http://www.cs.man.ac.uk/~horrocks/FaCT/), to perform automatic classification of concepts.

DAML (DARPA Agent Markup Language) is an extension of OIL without an inference mechanism.

2.5.2. The Semantic Web and its Languages

There are billions of documents on the World Wide Web (WWW), used by billions of users globally; as a result, it is increasingly difficult to find, organize, and maintain the information that users require. The notion of a Semantic Web (Berners-Lee et al. (2001)), which provides enhanced information access based on the exploitation of machine-processable metadata, has been proposed to address these problems.

The Semantic Web is regarded as an extension of the current Web in which documents are annotated with meta-information. This meta-information defines how the information contained in the documents is made available in a machine-processable way. The explicit representation of meta-information, accompanied by ontologies, will enable a Web that provides a qualitatively new level of service. Ontologies are a key enabling technology for the Semantic Web, because they provide a semantic model for the data and thereby a common vocabulary for exchanging information.

Figure 2.4 Ontology markup languages

The development of the Semantic Web proceeds in steps, each step building a layer on top of another. Figure 2.4 shows the main layers of the Semantic Web. At the bottom is XML, a language that makes it possible to write structured Web documents with a vocabulary defined by the developer. XML plays an important role in the exchange of a wide variety of data on the Web, but it imposes no semantic constraints on the meaning of these documents.

RDF is a basic data model for writing simple statements about Web objects (resources). The data model can be represented in an XML syntax. It is built from three kinds of objects: resources, properties and triples. A resource is whatever is described by an RDF expression and is identified by a URI (Uniform Resource Identifier). A property is any characteristic used to describe a resource. A triple has the form <resource, property, value>, meaning “the resource has the property with the given value”. For example, figure 2.5 represents the statement “Camila Bezerra is the owner of the Web page http://www.cin.ufpe.br/~cbs”.

Figure 2.5 RDF Triple

RDF Schema provides modeling primitives for organizing Web objects into hierarchies. The main primitives are classes and properties, subclass and subproperty relationships, and domain and range restrictions. OWL is a language for describing properties and classes of RDF resources, with semantics for generalization hierarchies of such properties and classes.

The logic layer makes it possible to write application-specific declarative knowledge. The proof layer involves the deductive process itself, the representation of proofs in Web languages (the languages described previously) and proof validation. Finally, the trust layer will emerge from the use of digital signatures and other kinds of knowledge. Given the importance of OWL to this work, it is discussed in more detail in the next section.

2.5.3. OWL

In 2001, the W3C formed the Web-Ontology (WebOnt) Working Group. Its aim was to create a new ontology markup language for the Semantic Web, called OWL (Web Ontology Language). The group took the features of DAML+OIL as the main input for developing OWL and proposed the first specification of the language (McGuinness and van Harmelen (2004)). OWL has three sublanguages, listed here from the most to the least expressive: OWL Full, OWL DL, and OWL Lite.
• OWL Full: the most expressive OWL sublanguage. It permits the free combination of OWL primitives with RDF and RDF Schema. For example, in OWL Full a class can be treated simultaneously as a collection of individuals and as an individual in its own right; this is not permitted in OWL DL. However, the language is so expressive that it is undecidable, so complete reasoning support is not possible.
• OWL DL: in order to obtain computational efficiency, OWL DL restricts how the constructs from OWL and RDF may be used. Essentially, the application of OWL’s constructors to each other is disallowed, which ensures that the language corresponds to a Description Logic. The advantage is that it permits efficient reasoning support.
• OWL Lite: the most restrictive sublanguage; it covers the expressivity of frames and of a restricted description logic. For example, classes can only be defined in terms of named superclasses, and only certain kinds of class restrictions can be used.

OWL is built on RDF and RDF Schema and uses RDF’s XML-based syntax. It adds more vocabulary for describing properties and classes, for instance relations between classes (e.g. inheritance), cardinality (e.g. “at least four”), characteristics of properties (e.g. functional), and others.

Table 2.1 shows some constructs of the OWL language. “intersectionOf”, “unionOf” and “complementOf” are constructs derived from set theory. “allValuesFrom” and “someValuesFrom” are property restrictions; for example, a class may have a “someValuesFrom” restriction on a property, which requires that at least one value of that property is of a certain type. In Table 2.1, ∃hasChild.Woman describes the individuals that have at least one child who is a woman. “maxCardinality” and “minCardinality” are cardinality restrictions; that is, they constrain the number of values that the property may have on instances of the class. For example, ≤ 3 hasChild.Woman describes the individuals that have at most three children who are women.

Constructor       DL Syntax          Example
intersectionOf    C1 ⊓ ... ⊓ Cn      Person ⊓ Employed
unionOf           C1 ⊔ ... ⊔ Cn      Man ⊔ Woman
complementOf      ¬C                 ¬Man
allValuesFrom     ∀P.C               ∀hasChild.Mother
someValuesFrom    ∃P.C               ∃hasChild.Woman
maxCardinality    ≤ n P              ≤ 3 hasChild.Woman
minCardinality    ≥ n P              ≥ 1 hasChild.Woman

Table 2.1 OWL Constructs

All the conditions used in the class descriptions so far are “necessary” conditions. Necessary conditions can be read as, “If something is a member of this class then it is necessary to fulfil these conditions”. With necessary conditions alone, we cannot say that, “If something fulfils these conditions then it must be a member of this class”. A class that has only necessary conditions is known as a Primitive (partial) Class. A class that has at least one set of necessary and sufficient conditions is known as a Defined (complete) Class. In our Pizza ontology, CheesyPizza is a defined class (i.e. with a necessary and sufficient condition).

2.6. Ontology Tools

In recent years, a large number of environments for dealing with ontologies have appeared. Ontology tools can be classified as follows (Gomez-Perez et al. (2002)):

Ontology development tools: this class includes tools that can be used for building a new ontology from scratch or by reusing existing ontologies. Beyond the usual editing and browsing functionality, these tools usually include ontology documentation, ontology export and import functions for different formats, graphical views of the ontologies, ontology libraries, inference engines, etc.

Ontology evaluation tools: these are support tools that ensure that ontologies and their related technologies have a given level of quality. For example, OntoAnalyser and OntoGenerator, both supported by Ontoprise (http://www.ontoprise.de) and Institute AIFB (http://www.aifb.uni-karlsruhe.de), were implemented as plugins for OntoEdit (Sure et al. (2002)), a graphically oriented ontology engineering environment. OntoAnalyser focuses on the evaluation of ontology properties, in particular language conformity and consistency. OntoGenerator focuses on the evaluation of ontology-based tools, in particular performance and scalability.

Ontology-based annotation tools: these tools have been designed to allow users to insert and maintain ontology-based markup in Web pages (semi)automatically.

For example, COHSE (http://cohse.semanticweb.org/) was developed by the Manchester Information Management Group and Southampton’s Intelligence, Agents and Multimedia Group (IAM). The aim of this set of tools is to use metadata to support the construction and navigation of links in the Semantic Web.

Ontology storage and querying tools: these tools have been created to make it easy to use and query ontologies. An example is Sesame (www.openrdf.org), which is being developed by Aidministrator Nederland. It is a system consisting of a repository, a query engine and an administration module for adding and deleting RDF data and Schema information.

Ontology learning tools: they are used to derive ontologies (semi)automatically from natural language texts. Ontology learning is a complex multi-disciplinary field that draws on natural language processing, text and Web data extraction, machine learning and Ontology Engineering. For example, Text2Onto (http://ontoware.org/projects/text2onto/) is a framework for ontology learning from textual resources, developed at the University of Karlsruhe. The idea is to obtain better extraction results and to reduce the need for an experienced ontology engineer by using several different extraction approaches and then combining their results. Its architecture is based on data import, natural language processing, an algorithm library (statistical, data mining, etc.), result presentation and an ontology engineering environment.

Ontology merge and integration tools: these tools aim to solve the problem of merging or integrating different ontologies on the same domain.

Of these classes, ontology development tools and ontology merge and integration tools receive more attention in this work because they are closely related to Ontology Modularization. The next sections give a broad overview of some of the available tools and environments for building and merging ontologies.

2.6.1. Ontology development tools

Many ontology development tools exist, but they need to interoperate. The lack of interoperability between these tools causes problems, for example when integrating an ontology into the ontology library of a different tool. Another problem is that they do not cover all the activities of the ontology life cycle. The following paragraphs present some ontology development tools.

OilEd

OilEd (Horrocks et al. (2000)) is a graphical ontology editor implemented in Java and developed by the University of Manchester. The knowledge model of OilEd is based on DAML+OIL. Thus, OilEd offers a familiar frame-like paradigm for modeling while still supporting the rich expressiveness of DAML+OIL where required. Classes are defined in terms of their superclasses and property restrictions, with additional axioms capturing further relationships such as disjointness. The expressive knowledge model allows complex composite descriptions to be used as role fillers. This is in contrast to other existing frame-based editors, where such anonymous frames must be named before they can be used as models. Moreover, OilEd provides the FaCT reasoner (http://www.cs.man.ac.uk/~horrocks/FaCT/) to check the consistency of ontologies and to add implicit subClassOf relations, and it can export ontologies to different formats, including the Ontology Interchange Language Resource Description Framework (OIL-RDF) and the DARPA Agent Markup Language Resource Description Framework (DAML-RDF).

OntoEdit

OntoEdit (Sure et al. (2002)) is an Ontology Engineering Environment (http://www.ontoknowledge.org/tools/ontoedit.shtml) that has a graphical interface and supports the development and maintenance of ontologies. It is built on top of a powerful internal ontology model, which can be serialized using XML and which supports the internal file handling. The tool supports a representation language for modeling concepts, relations and axioms. Furthermore, it provides graphical views to support modeling in the different phases of the Ontology Engineering cycle. The tool allows the user to edit a hierarchy of concepts or classes (figure 2.6). Concepts may be abstract or concrete, which indicates whether or not instances of the concept may be created. Concepts can be reorganized within the hierarchy in a way similar to the well-known “copy-and-paste” functionality.

Figure 2.6 OntoEdit

Ontolingua Server

The Ontolingua Server (Farquhar et al. (1996)) has been developed by the Knowledge Systems Laboratory (KSL) at Stanford University. It consists of a set of tools and services that support the building of shared ontologies by distributed groups. The architecture provides access to a library of ontologies, translators to languages such as Prolog, and an editor to create and browse ontologies. Furthermore, it allows remote editors to browse and edit ontologies. Remote or local applications can access any of the ontologies in the ontology library using the OKBC (Open Knowledge Base Connectivity) protocol.

Protégé

Protégé (http://protege.stanford.edu/) is a free, open-source platform that provides a suite of tools for building domain models and knowledge-based applications with ontologies. Protégé implements a set of knowledge-modeling structures and actions that support the creation, visualization, and manipulation of ontologies in various representation formats. The user can customize Protégé for creating knowledge models. Moreover, Protégé can be extended through a plug-in architecture for building knowledge-based tools and applications.

Protégé offers tab plugins that provide capabilities such as visualization, ontology merging and version management, inferencing, and so on. The OntoViz and Jambalaya tabs, for example, present different graphical views of a knowledge base; the Jambalaya tab allows interactive navigation, zooming in on particular elements in the structure, and different layouts of nodes in a graph to highlight connections between clusters of data.

The Protégé-OWL editor (see figure 2.7) is an extension of Protégé that supports OWL, the most recent development in standard ontology languages, which is endorsed by the World Wide Web Consortium (W3C). The Protégé-OWL editor enables users to load and save OWL and RDF ontologies, edit and visualize entities (classes, properties, instances and SWRL (http://www.w3.org/Submission/SWRL/) rules), define logical class characteristics as OWL expressions, execute reasoners such as Description Logic classifiers, and edit OWL individuals for Semantic Web markup.

Figure 2.7 Protégé

2.6.2. Ontology matching and integration tools

The area of ontology matching has received considerable interest from the community because it helps to solve problems related to ontology heterogeneity. The diversity of the information available on the Web must coexist with interaction between systems. One option for making different ontologies compatible is to establish mappings between them and to merge them at run-time.

Ontology merging is the process of generating a single, coherent ontology from two or more existing and different ontologies related to the same subject. The merged ontology includes information from all source ontologies, which is kept more or less unchanged. The original ontologies have similar or overlapping domains, but they are distinct ontologies and not revisions of the same ontology. Ontology merging is also very important at design time, since mergers of companies, or of organizations in general, can require their ontologies to be merged. It may also be necessary to merge several ontologies in order to obtain one of better quality.

Ontology alignment aims to find relationships between concepts belonging to different ontologies. The ontology alignment problem can be described in one sentence: given two ontologies each describing a set of discrete entities (which can be classes, properties, rules, predicates, etc.), find the relationships (e.g., equivalence or subsumption) holding between these entities. Below we present some of the ontology matching and integration tools.

Chimaera

Chimaera (McGuinness et al. (2000)) is a web-based, browser-accessible environment for merging and diagnosing ontologies. It supports merging multiple ontologies together and diagnosing individual or multiple ontologies. Its development drew on other user interfaces for knowledge applications, such as Ontolingua, the Stanford CML editor and the Stanford Java Ontology Tool (JOT).

Chimaera contains an editing environment and also allows the user to work with other editor/browser environments such as Ontolingua. It facilitates merging by allowing users to upload existing ontologies into a new workspace or into an existing ontology. Chimaera allows users to selectively run a diagnostic suite of tests to analyze the ontologies. The tests include incompleteness tests, syntactic checks, taxonomic analysis, and semantic checks.

ODEMerge

ODEMerge (Ramos (2004)) is a tool for merging ontologies that is integrated in WebODE, the software platform for building ontologies developed by the Ontology Group at the Technical University of Madrid. It is therefore a client-server tool that works on the Web. ODEMerge provides partial software support for the methodology for merging ontologies elaborated by Diego (R (2001)). This methodology proposes the following steps (see Figure 2.8): transformation of the formats of the ontologies to be merged, evaluation of the ontologies, merging of the ontologies, evaluation of the result, and transformation of the format of the resulting ontology to suit the application where it will be used.

Figure 2.8 The steps of the ODEMerge Methodology

An important characteristic of ODEMerge is that it can merge ontologies in as many ontology implementation languages as WebODE processes, since WebODE is the host platform of ODEMerge.

Alignment API

This API for manipulating alignments was developed at INRIA Grenoble (http://www.inrialpes.fr/). Its authors have designed a format for expressing alignments in a uniform way (Euzenat (2004)). The goal of this format is to allow alignments to be shared on the Web. The format is expressed in RDF. The API itself is a Java description of tools for accessing the common format. It defines four main interfaces (Alignment, Cell, Relation and Evaluator) and offers the following services (see http://alignapi.gforge.inria.fr/):

• Storing, finding and sharing alignments;
• Piping alignment algorithms (improving an existing alignment);
• Manipulating alignments (thresholding and hardening);
• Generating processing output (transformations, axioms, rules);
• Comparing alignments;
• Publishing ontology alignments on the Web.

The alignment description is composed of the following components:

• a pair of ontologies between which the correspondences are established;
• a level used for characterizing the type of correspondence;
• a set of correspondences which express the relation holding between entities of the first ontology and entities of the second ontology;
• an arity (default 1:1), for which the usual notations are 1:1, 1:m, n:1 or n:m.

The alignment shown in figure 2.9 relates entities from two bibliographic ontologies. It contains two correspondences; the first one relates the class reviewed article from onto1 to the class article from onto2, with a confidence factor of 0.63.

2.6.3. NeOn Toolkit

The NeOn Toolkit (http://www.neon-toolkit.org/) is a free, open-source, extensible Ontology Engineering Environment. It has been developed by the NeOn project, a 14.7 million euro project involving 14 European partners.
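As an aside on the alignment description of Section 2.6.2 (a pair of ontologies, a level, an arity and a set of correspondences with confidence values), the following minimal Python sketch models that structure and the thresholding operation. It is not the Alignment API, which is a Java library; the class names, field names and the default level value are hypothetical, the relation symbol of the first correspondence is assumed because the text does not state it, and the second correspondence is invented purely to show the effect of a threshold.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Cell:
    """One correspondence between an entity of onto1 and an entity of onto2."""
    entity1: str
    entity2: str
    relation: str      # e.g. "=" for equivalence, "<" for subsumption (notation assumed)
    confidence: float  # confidence factor in [0, 1]


@dataclass
class Alignment:
    """Illustrative container mirroring the four components listed in the text."""
    onto1: str
    onto2: str
    level: str = "0"    # assumed label for simple entity-to-entity correspondences
    arity: str = "1:1"  # default arity mentioned in the text
    cells: List[Cell] = field(default_factory=list)

    def threshold(self, minimum: float) -> "Alignment":
        """Return a new alignment keeping only cells with confidence >= minimum."""
        kept = [c for c in self.cells if c.confidence >= minimum]
        return Alignment(self.onto1, self.onto2, self.level, self.arity, kept)


example = Alignment(
    onto1="onto1",
    onto2="onto2",
    cells=[
        # From the text: reviewed article -> article, confidence 0.63 (relation assumed).
        Cell("reviewed article", "article", "<", 0.63),
        # Hypothetical second correspondence, not taken from the source.
        Cell("journal", "periodical", "=", 0.40),
    ],
)

if __name__ == "__main__":
    for cell in example.threshold(0.5).cells:
        print(cell)
```

Returning a new object from threshold, rather than filtering in place, keeps the original alignment available for comparison, which fits the "comparing alignments" service listed above.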

Figure 2.9 An alignment generated by the Alignment API

The NeOn Toolkit contains plugins for ontology management and visualization. The core features include basic editing of XML, visualization/browsing, and import/export to F-Logic, (subsets of) RDF(S) and OWL. It is part of the reference implementation of the NeOn architecture. The toolkit combines several features of the approaches discussed above, such as the development, alignment and evaluation of ontologies.

The NeOn Toolkit (see figure 2.10) is designed as an open and modular architecture, which includes infrastructure services, such as a registry and a repository, and supports distributed components for ontology management, reasoning and collaboration in networked environments. It places a strong emphasis on networked ontology management, i.e. support for engineering ontologies that are embedded in a network of ontologies via rich semantic relationships, including models for modular ontologies and mappings across ontologies (Haase et al. (2008)). Based on Eclipse, the Toolkit defines an open framework for plugin developers.

2.6.4. Conclusion

There are several tools, methodologies and techniques for developing and maintaining ontologies, but none of them provides good support for ontology reuse.

Figure 2.10 NeOn Toolkit

It would therefore be very useful if ontology developers could build an ontology by reusing parts of other ontologies. Many ontology development methodologies and methods, such as the Ontology 101 method and Methontology (described in the previous sections), include a reuse step that allows ontology developers to integrate an already developed ontology into the ontology that they are currently designing and implementing.

Concerning languages, only OWL gives some support for the reuse of ontologies. OWL provides the <owl:imports> construct for linking multiple OWL ontologies to form a larger OWL ontology, i.e., the whole ontology is imported. However, this syntactic import mechanism lacks support for the partial reuse of ontologies.

With respect to tools, Protégé allows another ontology to be reused by including it in the model being designed, through the inclusion of other projects. In addition, a Protégé plugin called Prompt makes it possible to extract part of an ontology: the user selects a class, and the system recursively follows the properties around the selected class until a given distance is reached. The NeOn Toolkit has a modularization plugin that makes it possible to create modules by reusing parts of an ontology and of other modules. However, it does not support some important reuse-related tasks, for example assigning restrictions, such as disjointness of classes and existential restrictions, to new entities (classes, properties and instances) using the imported entities.

The present work proposes a concrete tool that tries to make up for this lack of mechanisms for the modularization of ontologies. The next chapter discusses the modularization of ontologies.
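The distance-bounded extraction attributed to Prompt above can be pictured as a breadth-first traversal. The sketch below is only an illustration of that idea under simplifying assumptions, not Prompt's actual algorithm: the ontology is reduced to a hypothetical dictionary that maps each class to its outgoing (property, target class) pairs, and only outgoing properties are followed.

```python
from collections import deque


def extract_neighbourhood(graph, start, max_distance):
    """Return the classes reachable from `start` in at most `max_distance` property steps.

    `graph` maps a class name to a list of (property, target_class) pairs.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] >= max_distance:
            continue  # do not expand beyond the requested distance
        for _prop, target in graph.get(node, []):
            if target not in distance:
                distance[target] = distance[node] + 1
                queue.append(target)
    return set(distance)


# Hypothetical toy ontology used only for this example.
PIZZA = {
    "Pizza": [("hasTopping", "Topping"), ("hasBase", "Base")],
    "Topping": [("hasSpiciness", "Spiciness")],
    "Spiciness": [("measuredIn", "Scale")],
}

if __name__ == "__main__":
    print(sorted(extract_neighbourhood(PIZZA, "Pizza", 1)))
    print(sorted(extract_neighbourhood(PIZZA, "Pizza", 2)))
```

With distance 1 the extracted part contains Pizza and the classes it refers to directly; raising the distance to 2 also pulls in Spiciness, which shows how the distance parameter controls the size of the extracted part.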

3 Ontology Modularization

The notion of modularization comes from Software Engineering, where it refers to developing software in a structured way that supports the combination of self-contained components that are easier to build, reuse and maintain. “From an Ontology Engineering perspective, modularization should be considered as a way to structure ontologies, meaning that the construction of a large ontology should be based on the combination of self-contained, independent and reusable knowledge components”(d’Aquin et al. (2007)). By separating knowledge entities, modularization allows developers to focus only on the elements that are relevant for a given application at a given time. Modularization therefore improves performance by reducing the amount of knowledge that has to be manipulated by ontology-based tools, including reasoners and editors.

3.1. Reusability

Ontologies have been used to solve problems of distributed knowledge, to integrate information across applications and to permit efficient communication between agents, for instance in e-commerce. However, obtaining successful results in these applications requires the ability to share and reuse existing ontologies.

Reusability is one of the most significant aspects of Software Engineering in general. For example, software engineers have developed libraries of software that are reused in other applications. In Ontology Engineering, enabling knowledge reusability is an important goal when building ontologies. Moreover, increasing the reusability of knowledge maximizes its usage across several activities. The main benefits of ontology reuse are (Jarrar (2005)):

(47) 3.2. MOTIVATION. • Savings in cost, time and efforts: this means that instead of creating an entire ontology with elements already built elsewhere, one may reuse an existent ontology or some parts of it that satisfy your needs. • Increasing reliability: a reusable ontology gives indication that is approved and generally accepted. Nevertheless, there are challenges that hamper ontology reusability. The main problem resides on the dependency up the purpose that an ontology is made for. Ontologies are intended to capture knowledge at the domain level. Thus, when reuse knowledge for a different purpose, the usability perspectives for both purposes may differ. Therefore, ontology reusability will be restricted depending on how different the usability perspectives are. Another important issue is the difficulty on identifying and isolating the reusable parts, i.e, allowing the general-purpose parts of an ontology to be reused instead of reusing the whole ontology. One possible way to achieve reusability is through the process of modularization that consists on identifying and extracting significant parts of an existent ontology. In this context, the set of extracted entities is called module.. 3.2. Motivation. The benefits of modularization are well known from many areas of Computer Science. For example, in Software Engineering there is the notion of components where you build a software reusing other components what provides quick development, organization, reusability and easy understanding. Stuckenschmidt2003 (2003) mentions three benefits for Ontology Modularization: Distributed Systems: ontologies in different places are built independent of each other and can be assumed to be heterogeneous. Unrestricted referencing to concepts in a remote ontology can therefore lead to serious semantic problems as the domain of interpretation may differ even if concepts appear to be the same on a conceptual level. 
The introduction of modules with local semantics and clearly specified interfaces can help to overcome this problem.

Large Ontologies: ontologies that sometimes contain more than a hundred thousand concepts are hard to maintain, as changes are not contained locally but can affect large parts of the model. Another argument for modularization in the presence of large ontologies is reuse: in most cases, we are not interested in the complete ontology when building a new system, but only in a specific part. Experience from Software Engineering shows that modules provide a good level of abstraction to support maintenance and reuse.

Efficient Reasoning: a specific problem that occurs in distributed ontologies and very large models is that of efficient reasoning. The introduction of modules with local semantics and clear interfaces helps to resolve this problem and provides a basis for the development of methods for localizing inference.

Moreover, modularization can help ontology developers to identify and select only those concepts and relations that are relevant for the application they are modeling an ontology for (Bezerra et al. (2008)). From a Software Engineering point of view, modularization also brings benefits “as a mechanism for improving the flexibility and comprehensibility of a system while allowing the shortening of its development time” (Parnas (2002)). Some authors in Ontology Engineering believe that modularity may bring similar benefits for the development and management of large ontologies. The expected engineering benefits of ontology modularity include the following (Bao (2007)):

Collaboration: a module can be constructed collaboratively more easily, with separate groups working on different modules of the ontology.
Flexibility: a module can easily be built by composition or decomposition of parts of other modules and ontologies; an ill-designed or obsolete module can be replaced with a new module, with controlled impact on other modules.
Understandability: thanks to the structure of a module, its contents are easier to understand.
Debugging: well-defined semantic modularity can help developers to find and solve problems quickly.
Scalability: many ontology tools, e.g., reasoners, editors and query engines, are known to perform well on small-scale ontologies, but their performance degrades drastically as the size of the ontology increases.

3.2.1. Syntactical and Semantic Modularity

Bao (2007) suggests that the need for modularity can be viewed along two basic dimensions: syntactical modularity and semantic modularity. Syntactical modularity addresses the need to organize large ontologies into multiple manageable, compact modules, so that interactions between ontology modules are well controlled for more efficient ontology construction, revision and reuse. Semantic modularity tackles the need to allow localized and contextual points of view of the autonomous contributors of different ontology modules, as well as distributed reasoning.

Syntactical Modularization

We can consider the following aspects of syntactical modularity:

Loose Coupling: consists of minimizing the interactions between modules. It can be measured by the connectedness of modules, i.e., the number of shared symbols between axioms in different modules. To allow efficient reasoning, modules must be loosely connected.
Organizational Structure: consists of organizing symbols and relations into moderate-sized units that are easy to design, use and reuse.
Syntactical Partial Reuse: the ability to reuse only the relevant parts of an ontology. A modular structure enables flexible and efficient partial reuse of the ontology.

Semantic Modularization

Typically, ontologies in the Semantic Web capture contextual knowledge. Such knowledge may depend on implicit assumptions that the ontology users are not always aware of. Some contextual aspects include the following:

Implicit Domain of Discourse: reuse of knowledge without regard to the applicable context can lead to unintended consequences.
Differences in the Universe of Discourse: suppose we query two ontologies about professors in two different departments. In the first ontology, the individuals are explicitly enumerated by name, e.g., Professor = {John, Michel}.
In the same way, the second ontology has Professor = {Peter, Mary}, but these names are different from, and disjoint with, the ones in the first ontology. Hence, the two ontologies disagree on the members of Professor, which has different interpretations in the two contexts.
Subjectivity: different contexts can arise from the different political, cultural and social points of view of ontology developers. For example, in Western countries the notion of Weekend typically refers to Saturday and Sunday, while in Muslim countries it is Friday and Saturday. This can cause inconsistencies if both ontologies are reused in a module.
Spatio-Temporal Contexts: the same sentence can have different meanings in different places and at different times. For example, the sentence “Today is Monday” can be true or false depending on where in the world you are and when.

Moreover, low coupling and strong cohesion are important; both are explained below.

3.2.2. Coupling and Cohesion

Coupling focuses on the relationships between modules, whereas cohesion emphasizes the internal consistency of a module. Coupling is the level of inter-dependency among modules. Coupling is directly related to cohesion: the higher the coupling, the lower the cohesion. This is because a module that depends very strongly on another module or class is not “strong” enough to carry out tasks on its own. Modules with high coupling are more difficult to understand and maintain. Cohesion reflects the consistency, or conceptual unity, of a module in relation to the other modules. Cohesion depends on information hiding (isolating the internal implementation details of the module). A cohesive module should ideally have a single responsibility that can be fulfilled through a public service interface with its environment. Therefore, low coupling and strong cohesion among modules are desirable.

3.3. Definition

Generally speaking, a module is some subset of a whole that makes sense and can somehow exist separately from the whole, although not necessarily supporting the same functionality as the whole.
From an application perspective, the module must be able to provide a reasonable answer to at least some of the queries which it is intended to support.
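To make the idea of a module as a subset of a whole more concrete, and to relate it to the connectedness measure mentioned under Loose Coupling in Section 3.2.1, the following minimal Python sketch may help. It is only an illustration under strong assumptions and is not part of ModOnto: the toy ontology, the axiom names, the three modules and all function names are hypothetical. Each axiom is reduced to the set of symbols it mentions, connectedness is read as the number of symbols that two modules share, and the query check is a crude syntactic proxy rather than actual reasoning.

```python
# Hypothetical toy ontology (not taken from ModOnto): each axiom is reduced
# to the set of symbols (class and property names) it mentions.
AXIOMS = {
    "a1": {"Professor", "Person"},             # Professor is a kind of Person
    "a2": {"Professor", "teaches", "Course"},  # Professors teach Courses
    "a3": {"Course", "Department"},            # Courses belong to Departments
    "a4": {"Weekend", "Day"},                  # Weekend is a kind of Day
}


def is_module(module, axioms):
    """Assumption: a module is any subset of the ontology's axioms."""
    return set(module) <= set(axioms)


def signature(module, axioms):
    """Collect every symbol mentioned by the axioms of a module."""
    return set().union(*(axioms[name] for name in module))


def connectedness(module_a, module_b, axioms):
    """Count the symbols shared by the axioms of two different modules."""
    return len(signature(module_a, axioms) & signature(module_b, axioms))


def mentions_all(module, query_symbols, axioms):
    """Crude proxy: does the module mention every symbol used by a query?"""
    return set(query_symbols) <= signature(module, axioms)


# Assumed partition of the toy ontology into three modules.
PEOPLE = {"a1", "a2"}
COURSES = {"a3"}
CALENDAR = {"a4"}

print(connectedness(PEOPLE, COURSES, AXIOMS))   # the two share only "Course"
print(connectedness(PEOPLE, CALENDAR, AXIOMS))  # no shared symbols
print(mentions_all(PEOPLE, {"Professor", "teaches"}, AXIOMS))
```

Under this reading, PEOPLE and CALENDAR are completely decoupled, while PEOPLE and COURSES interact only through the symbol Course. A lower count between two modules corresponds to looser coupling in the sense described above.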