Developing a University Ontology in Education Domain using Protégé for Semantic Web
SANJAY KUMAR MALIK
University School of Information Technology, GGS Indraprastha University, New Delhi
NUPUR PRAKASH
University School of Information Technology, GGS Indraprastha University, New Delhi
S.A.M RIZVI
School of Computer Sciences, Jamia Millia Islamia University
Abstract- The current web is based on HTML, which cannot be exploited by information retrieval techniques; processing of information on the web is therefore mostly restricted to manual keyword searches, which often retrieve irrelevant information. This limitation may be overcome by a new web architecture known as the semantic web, an intelligent and meaningful web proposed by Sir Tim Berners-Lee. In his roadmap for the semantic web, ontology plays a pivotal role in information exchange, in the use and re-use of knowledge, and in establishing a shared and common understanding of a domain that can be communicated between people and across application systems, which is the goal of the semantic web. An ontology is used to capture knowledge about a domain of interest, with the objective of incorporating machine-understandable data into the current human-readable web. The Web Ontology Language (OWL) is a semantic markup language for sharing ontologies on the web and is designed for use by software agents, enabling them to comprehend the meaning of web documents.
Ontology is a broad term covering a wide range of activities, complexities and issues, among which ontology development is one of the most fundamental and significant concerns. Various methodologies and tools exist for ontology development. In this paper, we consider the education domain and demonstrate the development of a University Ontology using the Protégé 3.4 editor. Indraprastha University, Delhi, India has been taken as the example, and aspects such as the superclass and subclass hierarchy, creation of a subclass, instances of classes, the query retrieval process, and the graph corresponding to a subclass using TGViz are demonstrated.
Keywords: Semantic Web, Ontology, Protégé, Query, TGViz, Intelligent web.
1. Introduction
1.1 Semantic Web
1.2 Ontology
To compare conceptual information across two knowledge bases on the web, a program must have a way to discover common meanings; the solution is to collect this information in a common place called an ontology. An ontology formally describes a list of terms that represent important concepts, such as classes of objects and the relationships between them, in order to represent an area of knowledge [2]. Ontologies provide a formal semantics that can be employed to process and integrate information on the web. Gruber [3] describes an ontology as an explicit specification of a conceptualization [4]. Ontologies play a pivotal role in providing a vocabulary of unambiguous definitions for terms, which can serve as a formal support for communication between software agents. They provide a communication mechanism for users and software agents, clearly define the semantics of different domains for the purpose of interactions on the web, and help in creating a knowledge base that enables people to work on a particular domain [5]. In order to perform useful automatic reasoning tasks on web data, there is a need to go beyond the basic semantics of XML Schema and RDF Schema, towards a more expressive language that supports reasoning and extends RDF with additional vocabulary. The Web Ontology Language (OWL), a W3C recommendation, is widely used to construct domain ontologies.
Fig 1.1 Ontology towards Semantic Web
Besides the significant role of ontology in the semantic web, various other factors play a key role in achieving intelligent and efficient retrieval of information on the web, as shown in Figure 1.1. XML (Extensible Markup Language) is used for data exchange and to add meaning to data, and XMLS (XML Schema) extends the capabilities of XML. RDF (Resource Description Framework) represents knowledge resources on the web and uses the web identifier URI (Uniform Resource Identifier) to identify those resources, while RDFS (RDF Schema) provides a vocabulary for describing them. SPARQL (SPARQL Protocol and RDF Query Language) is used to extract information from RDF graphs for machine-understandable representation. Agents are programs with a specific purpose that carry meaningful information from one machine to another, and semantic search engines and browsers are used for semantic traversal. The semantic web may provide a promising platform for knowledge management systems and vice versa, since each has the potential to give the other real substance for publishing and interlinking machine-understandable web resources that support intelligent search. Semantic annotation enables highlighting, indexing and retrieval. In the context of the semantic web, social networks such as Twitter assist people in finding common relationships and discussion forums for the exchange of information, and they play a crucial role in information credibility and trustworthiness.
[Figure 1.1 components: Ontology (creation using an editor such as Protégé, adding to an existing ontology, merging, mapping, evaluation, maintenance); Knowledge Management and Semantic Annotation; Programming (Jena/Java 1.6, code-editing tools such as Eclipse); Intelligent Agents, Search Engines, Browsers, Web Scraping, Social Networks; XML, XMLS, RDF, RDFS, RDF/XML, URI, SPARQL; all leading towards the Semantic Web.]
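As a minimal sketch of the RDF idea described above, statements can be viewed as subject-predicate-object triples whose terms are URIs, and a query can be viewed as a pattern with wildcards. The namespace http://example.org/ipu# and the helper match are hypothetical names introduced only for illustration; the class names are taken from the university example used later in this paper. A real application would use an RDF library such as Jena and a SPARQL engine rather than this hand-written matcher.

```python
# Hypothetical namespace, used only for illustration.
EX = "http://example.org/ipu#"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Each RDF statement is a (subject, predicate, object) triple of URIs.
triples = [
    (EX + "Academics", RDFS_SUBCLASS, EX + "IPU_Campus_Administration"),
    (EX + "IPU_Campus_Administration", RDFS_SUBCLASS, EX + "GGS_IP_University"),
    (EX + "Affiliated_Institutes", RDFS_SUBCLASS, EX + "GGS_IP_University"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard,
    loosely like a variable in a SPARQL triple pattern."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Which classes are direct subclasses of GGS_IP_University?
direct = [s for s, _, _ in match(triples, p=RDFS_SUBCLASS, o=EX + "GGS_IP_University")]
print(direct)
```

Leaving a position as None plays the role of a query variable, which is the essence of how a SPARQL triple pattern selects parts of an RDF graph.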
1.3 Protégé for Ontology Development
A number of ontology editors are available for developing an ontology, e.g., Protégé, SWOOP, OntoEdit, Altova SemanticWorks, OntoStudio, and so forth. Protégé, however, is the most widely used by researchers, professionals, programmers, and others alike [6]. Protégé is an open-source, freely available ontology editor and knowledge base framework [7]. According to a survey conducted by Cardoso, Protégé is the most widely used editor and education the most widely used domain for the development of ontologies (see Fig. 1(a) and Fig. 1(b)). In this paper, Section 2 highlights ontology development in the education domain. Section 3 illustrates the development of the Indraprastha University (Delhi, India) ontology using the Protégé editor. It presents query retrieval using the Queries tab of Protégé and also illustrates the TGViz tab, which displays the ontology as a graph showing the route to any class or subclass. Construction of a class hierarchy (also called a taxonomy) for the University domain is illustrated in Section 4 with the help of code snippets in XML, RDF, and OWL. Section 5 summarizes key inferences, and Section 6 concludes the paper with future scope.
2. Ontology Development in Education Domain
Jorge Cardoso carried out a survey of the most widely used ontology editors and the most widely used domains for ontology development [6]. The survey found that Protégé was used by 68.2% of respondents, followed by Swoop, OntoEdit, text editors, Altova SemanticWorks, and so forth, and that ontologies were mostly developed in the field of education (31%). Therefore, in this paper, the University domain ontology is constructed using Protégé and OWL. Figures 1(a) and 1(b) below are graphical representations of Cardoso's findings.
[Bar chart, Ontology Editors (percentage of respondents): Protégé 68.2%, Swoop 13.6%, OntoEdit 12.2%, TextEditor 10.3%, Altova SW 10.3%, OilED 7.3%.]
Figure 1(a) Ontology editors used by respondents (researcher, professional, programmer, etc.) [Source: Jorge Cardoso, “The Semantic Web Vision: Where are We?” IEEE Intelligent Systems, September/October 2007, pp.22-26, 2007].
Figure 1(b) Development of ontologies in different domains [Source: Jorge Cardoso, “The Semantic Web Vision: Where are We?” IEEE Intelligent Systems, September/October 2007, pp.22-26, 2007].
According to this survey, education is the most widely used domain for ontology development. Creating an ontology, and in particular a good-quality ontology that is well structured and free from contradictions, is not easy. The designer of a good-quality ontology needs the ability to conceptualize and articulate ideas and a skill for modeling abstractions. The designer should also have a good knowledge of the syntax of the ontology languages so that the model is expressed correctly [1]. Ontologies therefore need to be well developed and accepted. Although people prefer to create smaller ontologies for special purposes, in practice ontologies may need to be used on a larger scale, combined with parts of other ontologies, or modified over time. Building ontologies is divided into three steps: ontology capture, ontology coding, and possible integration with existing ontologies [8]. The ontology life cycle involves steps such as specification, conceptualization, formalization, integration, implementation and maintenance [9]. The fundamental steps in the development of an ontology using a tool are [10]:
Obtain domain knowledge: A deep insight into, and a thorough knowledge of, the respective domain is a prerequisite to the construction of any domain ontology.
Identify the key concepts: Concepts that represent the domain are identified and then implemented by means of classes.
Build the taxonomy: The class hierarchy is built by creating the classes, their respective subclasses, and instances of the classes.
Identify relationships between classes: Properties are used to represent relationships between classes.
Consistency checking: The constructed domain ontology must be checked for consistency using reasoners.
Implementation of ontology: This involves deployment of the ontology to enable machine-to-machine communication.
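The steps above can be sketched in a few lines of code. The following is a minimal illustration, not the method used by Protégé or by any reasoner: the taxonomy is a plain child-to-parent mapping, the property reportsTo is hypothetical, the placement of Controller_of_Examinations under IPU_Campus_Administration is an assumption, and the check shown is only a structural sanity check that stands in for reasoner-based consistency checking.

```python
# Steps 2-3: key concepts as classes, arranged in a taxonomy (child -> parent).
# Class names follow this paper's example; None marks the root class.
# Placing Controller_of_Examinations under the administration is an assumption.
taxonomy = {
    "GGS_IP_University": None,
    "Affiliated_Institutes": "GGS_IP_University",
    "IPU_Campus_Administration": "GGS_IP_University",
    "Academics": "IPU_Campus_Administration",
    "Controller_of_Examinations": "IPU_Campus_Administration",
}

# Step 4: a relationship between classes, modeled as a property with a
# (domain, range) pair. "reportsTo" is a hypothetical property.
properties = {
    "reportsTo": ("Controller_of_Examinations", "IPU_Campus_Administration"),
}


def structural_problems(taxonomy, properties):
    """Step 5 stand-in: report undeclared classes and superclass cycles.
    A real reasoner checks far more than this."""
    problems = []
    for cls, parent in taxonomy.items():
        if parent is not None and parent not in taxonomy:
            problems.append(f"{cls}: unknown superclass {parent}")
        # Walk up the superclass chain and stop if a class repeats.
        seen, cur = set(), cls
        while cur is not None and cur in taxonomy:
            if cur in seen:
                problems.append(f"{cls}: cycle in superclass chain")
                break
            seen.add(cur)
            cur = taxonomy[cur]
    for name, (domain, rng) in properties.items():
        for end in (domain, rng):
            if end not in taxonomy:
                problems.append(f"{name}: unknown class {end}")
    return problems


print(structural_problems(taxonomy, properties))
```

An empty list means that every superclass and every property end point refers to a declared class and that no class is its own ancestor.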
An ontology building methodology may have the following layers [11]:
Top layer: Compliance of the building process with the software development process.
Middle layer: Generic constraints and guidelines to specify major steps.
Bottom layer: The most fine-grained guidelines, such as those for the class identification process.
The challenge is that changes in ontologies are caused either by changes in domain or shared conceptualization or changes in specification[12].
3. Illustrating Ontology Development with an Example: University Ontology(Indraprastha University, Delhi, India) in Education Domain.
Indraprastha University, Delhi, India has been taken as an example for ontology development using the Protégé editor.
Figure 3.1: Opening a recent project or creating a new project in Protégé.
Figure 3.2: Selecting type of OWL to be used.
Figure 3.2 illustrates the selection of options such as OWL DL, OWL Lite or OWL Full for a new project.
The Protégé editor is used to create the Indraprastha University ontology. The superclass and subclass hierarchy is illustrated, where “GGS IP University” is the superclass and “affiliated institutes” and “ipu campus administration” are some of its subclasses, which in turn have further subclasses such as “govt maintained institutes”, “academics”, “establishment” and “controller of examinations”. We therefore create the classes and subclasses to describe the major concepts and then add properties (slots) and features (facets) to the classes to describe the internal structure of these concepts. Then, using the form editor and form browser, we create slots such as “Name”, “Designation”, “Date of Joining (DJ)” and “ID” for a subclass, say “controller of examination”, as shown in Figure 3.4.
Figure 3.4: Creating slots such as “Name” and “ID” for a subclass such as “controller of examination”.
The snapshot in Figure 3.5 illustrates instances of the Indraprastha University ontology. It gives details of the corresponding classes that may be useful in finding information about a subclass such as “controller of examination” through slot values such as “Name” or “ID”.
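The relationship between classes, slots and instances described above can be sketched as follows. This is a simplified frame-style model written for illustration and is not Protégé's internal representation: the name FrameClass and its methods are hypothetical, placing “controller of examination” directly under “ipu campus administration” is an assumption, and the slot values are placeholders rather than data from the university ontology.

```python
from dataclasses import dataclass


@dataclass
class FrameClass:
    """A frame-style class: a name, an optional superclass, and its own slots."""
    name: str
    parent: "FrameClass | None" = None
    slots: tuple = ()

    def all_slots(self):
        # Slots are inherited from the superclass chain, then extended locally.
        inherited = self.parent.all_slots() if self.parent else ()
        return inherited + self.slots

    def new_instance(self, **values):
        # Reject values for slots that the class does not define.
        unknown = set(values) - set(self.all_slots())
        if unknown:
            raise ValueError(f"unknown slots: {sorted(unknown)}")
        return {"class": self.name, **values}


university = FrameClass("GGS IP University")
administration = FrameClass("ipu campus administration", parent=university)
controller = FrameClass(
    "controller of examination",
    parent=administration,
    slots=("Name", "Designation", "DOJ", "ID"),
)

# Placeholder slot values, not data from the paper.
person = controller.new_instance(Name="A. Example", Designation="Controller", ID="1")
print(person)
```

Restricting an instance to the slots declared on its class or inherited from its superclasses mirrors the way the form for a class only offers the slots defined for it.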
Query Retrieval Process
In Figure 3.6, the “Queries” tab is used to show how a query can be run to find information about particular instances or classes. When the query is run with an ID value of, say, “1”, the matching instance of the “controller of examination” subclass is returned with its slot values such as Name, Designation, DOJ, Ph.No. and ID.
Figure 3.6: Query retrieval process
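The query retrieval process can be thought of as filtering the instances of a class by a slot value. The following minimal sketch assumes instances are stored as simple dictionaries; the function name query and all slot values are hypothetical placeholders, and the Queries tab of Protégé offers more options than this exact-match filter.

```python
# Placeholder instances; the slot values are invented for illustration.
instances = [
    {"class": "controller of examination", "ID": "1",
     "Name": "A. Example", "Designation": "Controller"},
    {"class": "controller of examination", "ID": "2",
     "Name": "B. Example", "Designation": "Deputy Controller"},
    {"class": "academics", "ID": "1", "Name": "C. Example"},
]


def query(instances, cls, slot, value):
    """Return the instances of class `cls` whose `slot` equals `value`."""
    return [
        inst for inst in instances
        if inst["class"] == cls and inst.get(slot) == value
    ]


# Find the "controller of examination" instance whose ID is "1".
result = query(instances, "controller of examination", "ID", "1")
print(result)
```

The class restriction matters here: the “academics” instance also has ID “1” but is not returned, because the query is scoped to one subclass.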
Using TGViz tab for route graph
In Figure 3.7, the TGViz tab is used to show a graph corresponding to the subclass “controller of examination”. The graph shows the possible routes by which one can reach any class or subclass from any other class or subclass. The path distances between the nodes/classes can be varied using the radius option.
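The effect of a radius setting on such a graph can be sketched with a breadth-first search. This is an illustration under assumptions and not the TGViz implementation: it assumes that the radius limits how many links away from the focused class are shown, and the placement of “govt maintained institutes” and “controller of examination” in the hierarchy is assumed from the description in this section.

```python
from collections import deque

# Undirected class graph built from subclass links (names from the example).
edges = [
    ("GGS IP University", "affiliated institutes"),
    ("GGS IP University", "ipu campus administration"),
    ("affiliated institutes", "govt maintained institutes"),
    ("ipu campus administration", "academics"),
    ("ipu campus administration", "controller of examination"),
]

graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def within_radius(graph, start, radius):
    """Breadth-first search: map each node within `radius` links of
    `start` to its distance (number of links) from `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand beyond the radius
        for nxt in sorted(graph[node]):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


print(within_radius(graph, "controller of examination", 2))
```

Increasing the radius brings more distant classes into view, while a radius of 1 shows only the focused class and its immediate neighbours.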
4. OWL/RDF/XML SNIPPETS OUTPUT
The OWL, RDF and XML code snippets of the output are as follows:

OWL

<owl:Class rdf:ID="Academics">
  <rdfs:subClassOf>
    <owl:Class rdf:ID="IPU_Campus_Administration"/>
  </rdfs:subClassOf>
  <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Academics</rdfs:label>
</owl:Class>
<owl:Class rdf:ID="Asst_Registrars">
  <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Asst Registrars</rdfs:label>
  <rdfs:subClassOf>
    <owl:Class rdf:ID="Dyp_Registrar"/>
  </rdfs:subClassOf>
</owl:Class>

RDF/RDFS

<rdf:RDF>
  <rdfs:Class rdf:about="&rdf_;Academics" rdfs:label="Academics">
    <rdfs:subClassOf rdf:resource="&rdf_;IPU_Campus_Administration"/>
  </rdfs:Class>
  <rdf:Property rdf:about="&rdf_;Add" rdfs:label="Add">
    <rdfs:range rdf:resource="&rdfs;Literal"/>
  </rdf:Property>
  <rdfs:Class rdf:about="&rdf_;Affiliated_Institutes_AI_" rdfs:label="Affiliated Institutes(AI)">
    <rdfs:subClassOf rdf:resource="&rdf_;GGS_IP_University"/>
  </rdfs:Class>
</rdf:RDF>

XML

<?xml version="1.0" ?>
<class>
  <name>Academics</name>
  <type>:STANDARD-CLASS</type>
  <own_slot_value>
    <slot_reference>:ROLE</slot_reference>
    <value value_type="string">Concrete</value>
  </own_slot_value>
  <superclass>IPU Campus Administration</superclass>
</class>
<class>
  <name>Examination</name>
  <type>:STANDARD-CLASS</type>
  <own_slot_value>
    <slot_reference>:ROLE</slot_reference>
    <value value_type="string">Concrete</value>
  </own_slot_value>
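To show how the OWL output can be consumed by a program, the following sketch extracts the subclass relations from the OWL snippet in this section using Python's standard xml.etree.ElementTree module. The rdf:RDF wrapper and its namespace declarations are added here as an assumption so that the fragment parses on its own, and the rdfs:label elements are omitted for brevity; the paper's own tool chain mentions Jena, which would be the more usual choice for full RDF processing.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# The OWL classes from this section, wrapped in an rdf:RDF root with
# namespace declarations (an assumption) so the fragment parses standalone.
doc = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Academics">
    <rdfs:subClassOf>
      <owl:Class rdf:ID="IPU_Campus_Administration"/>
    </rdfs:subClassOf>
  </owl:Class>
  <owl:Class rdf:ID="Asst_Registrars">
    <rdfs:subClassOf>
      <owl:Class rdf:ID="Dyp_Registrar"/>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>"""

root = ET.fromstring(doc)

# Collect (subclass, superclass) pairs from the nested owl:Class elements.
pairs = []
for cls in root.findall(f"{{{OWL}}}Class"):
    child = cls.get(f"{{{RDF}}}ID")
    for parent in cls.findall(f"{{{RDFS}}}subClassOf/{{{OWL}}}Class"):
        pairs.append((child, parent.get(f"{{{RDF}}}ID")))

print(pairs)
```

This handles only the nested form of rdfs:subClassOf that appears in the snippet; a superclass given through an rdf:resource attribute, as in the RDF/RDFS snippet, would need an additional branch.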
5. Inferences
Some key ontology inferences are as follows [6]:
Ontology Concern | Maximum Used
Ontology editor | Protégé
Protégé tab features for development | Classes, Slots, Forms, Instances
Ontology language | OWL
Domain for developing ontologies | Education
Methodology | Not specific; varies
Why to use ontology | To share common understanding of the structure of information among people or software agents
Ontology technique (mapping, integrating, merging, aligning) | Mapping
6. Conclusions and Future Work
The current web is mainly based on HTML, which describes how information is to be displayed on a web page for humans to read, and it cannot be exploited by machines. The semantic web aims to bring the present web to a state where, besides humans, machines also understand it and perform various tasks, which involves various complexities and challenges. Elements that play a key role in realizing the vision of the semantic web include XML (Extensible Markup Language) and XML Schema, RDF (Resource Description Framework) and RDF Schema, URI (Uniform Resource Identifier), Unicode, SPARQL (SPARQL Protocol and RDF Query Language), search engines and agents, logic, proof and trust, web browsers, semantic annotation, web usage mining and ontology. Ontology is significant, and ontology development is its most fundamental step; illustrating that step with an example from the education domain has been the objective of this paper. Various methodologies and challenges exist in the process. This paper may be useful for researchers and professionals who wish to work in ontology development, and the work can be extended to the development and deployment of large and complex ontologies and to solutions for other critical ontology issues on the way to the semantic web.
7. References
[1] Thomas B. Passin, “Explorer’s Guide to the Semantic Web”, Manning Publications, p. 152.
[2] “Thinking on the Web: Berners-Lee, Gödel, and Turing”, Wiley, pp. xv, xxv, xxvi, 108.
[3] Thomas B. Passin, “Explorer’s Guide to the Semantic Web”, Manning Publications, p. 18.
[4] T. R. Gruber. “A translation approach to portable ontology specifications”, Knowledge Acquisition, 5:199–220, 1993.
[5] T.S. Dillon, E. Chang, P. Wongthongthom, “Ontology-Based Software Engineering—Software Engineering 2.0”, 19th Australian Conference on Software Engineering.
[6] Jorge Cardoso, “The Semantic Web Vision: Where are We?” IEEE Intelligent Systems, September/October 2007, pp.22-26, 2007.
[7] Protégé, http://protege.stanford.edu
[8] John Davies, Rudi Studer, Paul Warren, “Semantic Web Technologies: Trends and Research in Ontology-based Systems”, Wiley, pp. 22-23.
[9] Xiaomeng Su, “Semantic enrichment for ontology mapping”, pp 29.
[10] Michael Denny, “Ontology Building: A Survey of Editing Tools” November 06, 2002 available at : http://www.xml.com/pub/a/2002/11/06/ontologies.html?pp-1