In September 2004 the ontology management working group was founded to develop an ontology management system for use by several DERI projects and working groups. The group is coordinated by DERI and also includes Ontotext and other partners. This document outlines how we intend to reach this ambitious target.
Section 2 describes the required functionalities, which Section 3 breaks down to the component level. Section 4 describes the implementation decisions made so far, Section 5 proposes a schedule for the individual tasks, and Section 6 concludes the document.
In this section we describe the functionalities that will be provided by our Ontology Management System (OMS). We will consider versioning, merging/aligning, editing/browsing, storing and querying.
The versioning function handles different versions of an ontology. It covers versioning of ontologies together with instance updates and backward-compatibility support. This functionality is linked with mediation in order to keep versions consistent.
Because multiple users work with the tool, versioning functionality is required. Each time an ontology is modified, the old version is stored with a unique version number, and the user can undo or redo her changes. These changes are formally defined; they can be either basic, for example a class modification, or complex, for example moving a subtree of the ontology. Complex changes are simply compositions of basic changes.
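The relationship between basic and complex changes can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not part of the OMS design; the names `RenameClass`, `CompositeChange` and `VersionedOntology` are hypothetical, an ontology is reduced to a set of class names, and how undo and redo interact with version numbers is deliberately left open.

```python
from dataclasses import dataclass


@dataclass
class RenameClass:
    """A basic change (hypothetical example): rename a single class."""
    old: str
    new: str

    def apply(self, classes):
        classes.remove(self.old)
        classes.add(self.new)

    def revert(self, classes):
        classes.remove(self.new)
        classes.add(self.old)


@dataclass
class CompositeChange:
    """A complex change: nothing more than an ordered list of basic changes."""
    parts: list

    def apply(self, classes):
        for change in self.parts:
            change.apply(classes)

    def revert(self, classes):
        # Undo the parts in reverse order of application.
        for change in reversed(self.parts):
            change.revert(classes)


class VersionedOntology:
    """Toy ontology: a set of class names plus a history of old versions."""

    def __init__(self, classes):
        self.classes = set(classes)
        self.version = 0
        self.history = {}  # version number -> frozen snapshot of that version
        self._done, self._undone = [], []

    def commit(self, change):
        # Store the old version under its number before modifying anything.
        self.history[self.version] = frozenset(self.classes)
        change.apply(self.classes)
        self._done.append(change)
        self._undone.clear()
        self.version += 1
        return self.version

    def undo(self):
        change = self._done.pop()
        change.revert(self.classes)
        self._undone.append(change)

    def redo(self):
        change = self._undone.pop()
        change.apply(self.classes)
        self._done.append(change)


onto = VersionedOntology({"Person", "Car"})
onto.commit(CompositeChange([RenameClass("Person", "Human"),
                             RenameClass("Car", "Vehicle")]))
onto.undo()  # reverts both renames, because the composite is undone as a unit
```

Treating a complex change as a list of basic ones means that undo, redo and any later diff computation only need to understand the basic change types.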
As changes can affect the consistency of the ontology, the versioning part of the OMS includes consistency checking (see also Editing and Browsing). All users must be informed when a change has occurred, and their local ontologies are synchronized automatically. Modifications to an ontology can create inconsistencies with the instance bases; this is the main reason for keeping older versions of a modified ontology accessible and for making clear to the user which version she is working on.
The OMS system provides security checking to restrict access to certain profiles and certain parts of the ontology.
Merging and Aligning
An important set of OMS functionalities is mediation. The term mediation covers all functions that help to merge and map ontologies and their instances. Mapping is semi-automatic and relies on mapping patterns or mapping rules built on a mapping language. The patterns are stored in a pattern library, which the user can browse with the help of a search tool to find relevant patterns. Mappings are created through a user-friendly interface that offers suggested mappings and drag-and-drop. The mappings are validated by consistency checking.
The OMS is able to mediate between different ontologies to answer the user queries; the system is able to execute queries on mapped ontologies. It mediates between them using query rewriting and unification of the resulting instances. It also uses the mappings to merge ontologies.
Editing and Browsing
The OMS allows the user to browse or edit ontologies via a user-friendly graphical interface. This interface includes a visualization tool capable of representing even very large sets of ontologies and their instances. The user can create ontologies and instance sets, import and export them from and to ontologies coded in different languages, and edit them. Creating and editing can introduce inconsistencies into the ontologies; this is why the OMS editor includes an inference engine for consistency checking.
Once the classes and instances are in the system, the user needs to manage them. The OMS provides tools to manage this library of different large-scale ontologies and sets of instances. These tools allow the user to search through the different ontologies as well as through their sets of instances. Search is carried out via classical browsing, a query language or an easy-to-use graphical query interface, and relies on an internal indexing system. The search tool is supported by suitable documentation of the ontologies: authors, keywords, and searchable natural-language descriptions.
As the ontologies may be geographically dispersed and used by different users at the same time, the OMS is able to manage concurrent access and geographically dispersed versions. Concurrent access is detailed in the versioning paragraph of this document.
To align with the WSMO editor, the ontology schema and the instances are stored in separate repositories. The software architecture is modular; it can interoperate with existing ontology management tools, and the use of standard languages is intended to let it interoperate with future tools as well.
Storage and Representation
The ontologies and the sets of instances are stored in a repository which can efficiently deal with large ontologies. The ontologies are tightly connected to the other layers of the system and the provenance of their different parts is known due to the tracking process.
The OMS repository supports different ontology languages and semantics; it is based on a semi-structured, graph-based data model.
The repository is accessed through a query language supporting conjunctive queries; these queries can also include data types, aggregation functions or range queries. Literals can be searched by keyword.
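As an illustration of what a conjunctive query with a range condition and an aggregate means over a graph-based model, consider the following sketch. It is written in Python purely for illustration; the triple representation, the `?x` variable syntax and the functions `match` and `query` are assumptions made for this example and do not describe the OMS query language, which is not yet defined.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.

    Terms starting with '?' are variables. Returns the extended binding,
    or None if the pattern does not match.
    """
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if isinstance(term, str) and term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None
    return result


def query(triples, patterns, filters=()):
    """Evaluate a conjunction of triple patterns, then apply filter predicates."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return [b for b in bindings if all(f(b) for f in filters)]


triples = {
    ("alice", "type", "Person"), ("alice", "age", 34),
    ("bob", "type", "Person"), ("bob", "age", 17),
}

# Conjunctive query with a range condition: persons aged 18 or over.
adults = query(
    triples,
    [("?x", "type", "Person"), ("?x", "age", "?age")],
    filters=[lambda b: b["?age"] >= 18],
)

# Aggregation over the result of another query.
total_age = sum(b["?age"] for b in query(triples, [("?x", "age", "?age")]))
```

Each result is a set of variable bindings, which is one simple way to obtain structured query results.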
The query results are structured and can be presented like search engine results pages. The query language provides closure.
The OMS repository interface adheres to a standard; it integrates multiple ontology repositories into a single logical one. The system is able to deal with multiple users (see versioning) and adheres to the ACID paradigm; it includes transactions, logging and locking.
An important feature of the storage and representation model is that it allows for integration of (non-ontological) data sets such as databases. Specific wrappers make such data available in a form compliant with the OMS data model.
Further, it allows modularization of the data and knowledge, making it possible for the ontologies to be partitioned into multiple data sets, each of which can be described with non-functional properties and managed separately. There is support for both explicitly defined data sets and views (which represent, in a nutshell, a restriction over larger data sets).
The components of the Ontology Management System need a query language that sits between the user interfaces and the repository and is used to query the ontologies. This language allows the user to manipulate the structure of the ontologies: classes, functions, relations and axioms. It also includes commands to manipulate the users, the connection to the repository and views of the ontologies. It is SQL-like, with CREATE, DROP and ALTER commands.
In this section we describe the components necessary to realize the functionalities introduced in the last section.
The ontology versioning layer will have the following components:
Version space component
This component will handle the version space of ontologies and elements stored in ORDI. It will also provide an explicit API for the manipulation of versions – retrieval or creation.
The versioning layer will act as a facade over the ORDI ontology manipulation and storage API so that versioning is handled properly.
Every version in ORDI must have an author associated with it; therefore the authorization component will make sure that the user identity is available to the other components.
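The interplay of facade, version space and authorization can be sketched as follows. The sketch is in Python purely for illustration. `InMemoryRepository` is a hypothetical stand-in for the ORDI storage API, whose real interface is not shown in this document, and `VersioningFacade` with its method names is likewise an assumption.

```python
class InMemoryRepository:
    """Hypothetical stand-in for the ORDI storage API."""

    def __init__(self):
        self.ontologies = {}

    def store(self, name, content):
        self.ontologies[name] = content

    def load(self, name):
        return self.ontologies[name]


class VersioningFacade:
    """Offers the same store/load calls, but records every stored state."""

    def __init__(self, repository, current_user):
        self.repository = repository
        self.current_user = current_user  # callable from the authorization component
        self.versions = {}                # name -> list of (author, content)

    def store(self, name, content):
        author = self.current_user()
        if not author:
            raise PermissionError("every version must have an author")
        self.versions.setdefault(name, []).append((author, content))
        self.repository.store(name, content)
        return len(self.versions[name])   # version number, starting at 1

    def load(self, name, version=None):
        if version is None:
            return self.repository.load(name)      # latest state
        return self.versions[name][version - 1][1]  # explicit version retrieval

    def authors(self, name):
        return [author for author, _ in self.versions[name]]


facade = VersioningFacade(InMemoryRepository(), current_user=lambda: "alice")
facade.store("travel", {"Trip"})
facade.store("travel", {"Trip", "Ticket"})
first = facade.load("travel", version=1)
```

Because the facade mirrors the calls of the underlying repository, a client written against the repository could be pointed at the facade without changes, which is the property the facade component aims for.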
Validation, Diff, Impact analysis and Change propagation components
These components will provide support for ontology/instance validation, computing the differences between versions, impact analysis and change propagation tasks. These components will be specified in further detail after the first prototype is ready in December 2004.
The alignment tool will be developed following the solution defined in the first version of this deliverable, extended with support drawn from recent advances in our research.
Ontology merging and ontology aligning both require the use of mappings: between the two source ontologies and the newly merged one for the former, and between the two aligned ontologies for the latter. Mapping specification is currently a semi-automatic task for which many algorithms exist. In the first version of this deliverable we presented one based on PROMPT (see section 2.6) and suggested using it in our system. As new algorithms are likely to emerge from the research community, the alignment tool should be able to include them, and the user should be able to choose her preferred one. With this in mind we will develop a general alignment API on which different algorithms can be implemented.
The alignment tool will satisfy all the requirements raised in the first version of this working draft. We will next present its architecture.
The alignment tool contains two components:
The mapping module helps the user to create mappings and construct merged ontologies.
The runtime module uses the created mappings to perform the tasks required by the external components.
We will next detail the composition of each module.
• Mapping language
As seen in section 4.1, the mappings are based on a general mapping language.
Patterns are templates that match the most common mismatches between two ontologies. The use of predefined patterns considerably reduces the work of the mapping designer. In this solution we propose a pattern language to define them and a pattern library in which they can be stored and retrieved efficiently.
• Mapping algorithms interface
The architecture of the module allows the use of different mapping algorithms. These algorithms are stored and can be combined to create efficient mappings. The interface specifies the ontology language as input and the mapping language as output.
• Graphical user interface
This interface plays the main role in the mapping module. It allows the user to graphically create or modify mappings by linking similar entities. Mapping proposals as results of the mapping algorithms are also integrated in this part of the component.
The runtime module is used by the reasoning part of the ontology management system. It can also be implemented as a web service, but we do not discuss this here.
This module uses the mappings to perform the following tasks:
• Query rewriting
Used to rewrite a query written for an ontology into one for another ontology. This process uses the mapping between the two ontologies or proposes to create one using the mapping module.
• Instance transformation
Used to transform instances from one ontology to another. This process also uses the mapping between the two ontologies.
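Both runtime tasks can be illustrated with a deliberately simple sketch, in Python purely for illustration. A mapping is reduced here to a dictionary from source terms to target terms; the real mapping language is richer, and all names below (`MAPPINGS`, `find_mapping`, `rewrite_query`, `transform_instance`, and the ontology names) are hypothetical.

```python
# Hypothetical mapping store: (source ontology, target ontology) -> term mapping.
MAPPINGS = {
    ("people-a", "people-b"): {"Person": "Human", "name": "fullName"},
}


def find_mapping(source, target):
    """Return the mapping between two ontologies, or ask for one to be created."""
    try:
        return MAPPINGS[(source, target)]
    except KeyError:
        raise LookupError(
            f"no mapping from {source} to {target}; create one in the mapping module"
        ) from None


def rewrite_query(patterns, mapping):
    """Rewrite (subject, predicate, object) patterns into the target vocabulary."""
    rewritten = []
    for subject, predicate, obj in patterns:
        if predicate == "type":
            obj = mapping.get(obj, obj)  # class names are mapped
        rewritten.append((subject, mapping.get(predicate, predicate), obj))
    return rewritten


def transform_instance(instance, mapping):
    """Rename the attributes and the class of one instance; values are kept."""
    return {
        mapping.get(attr, attr): (mapping.get(value, value) if attr == "type" else value)
        for attr, value in instance.items()
    }


mapping = find_mapping("people-a", "people-b")
query_b = rewrite_query([("?x", "type", "Person"), ("?x", "name", "?n")], mapping)
instance_b = transform_instance({"type": "Person", "name": "Alice"}, mapping)
```

Variables such as `?x` and attribute values pass through unchanged; only vocabulary terms covered by the mapping are replaced.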
The Ontology explorer will contain the following components:
The ontology browser/editor is a UI component that will show the list or hierarchy of available ontologies and their contents. Upon selection of an element in this hierarchy, the properties and sub-elements of the element will be shown and made editable. The list will allow the removal or creation of ontologies and elements. The focus of this component will be on showing the class/subclass hierarchies, the attribute definitions on classes, the relations between classes and the axioms/rules as the basic ontology modeling blocks.
The instance browser/editor is a UI component similar to the ontology browser/editor but dealing with independent collections of instances. The focus of this component will be on showing the instances with their attribute values and class memberships, the relation instances and possibly axioms as the basic building blocks for data representation.
Versioning UI support
This UI component will help the user with managing versioning by providing access to the functionality of the versioning high-level component (see below). It will display the list of versions for an ontology or an element within an ontology, and likewise for the instances, possibly in connection with the ontology browser/editor and instance browser/editor components. This component will also allow the user to create a new version or retrieve previous versions.
These UI components will provide assistance to the user with the tasks of merging, alignment, mapping and factoring of ontologies. These components will be specified in further detail after the first prototype is ready in December 2004.
DDL interpreter UI support
The DDL interpreter will be an independent component that will interpret DDL into ORDI invocations, thus providing batched processing of change descriptions. The editor tool will provide a basic support for the user to process a DDL file by the interpreter and to view the results.
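How a DDL interpreter could turn a batch of statements into repository invocations is sketched below, in Python purely for illustration. The concrete DDL syntax is not defined in this document, so the statements `CREATE CONCEPT`, `DROP CONCEPT` and `ALTER CONCEPT ... RENAME TO` are invented for this example, and `FakeOrdi` is a hypothetical stand-in for ORDI.

```python
class FakeOrdi:
    """Hypothetical stand-in for ORDI; it only keeps a set of concept names."""

    def __init__(self):
        self.concepts = set()

    def create_concept(self, name):
        self.concepts.add(name)

    def drop_concept(self, name):
        self.concepts.remove(name)

    def rename_concept(self, old, new):
        self.concepts.remove(old)
        self.concepts.add(new)


def run_ddl(script, ordi):
    """Interpret ';'-separated statements; return one result line per statement."""
    results = []
    for raw in script.split(";"):
        words = raw.split()
        if not words:
            continue  # skip empty statements, e.g. after the final ';'
        text = " ".join(words)
        command = [w.upper() for w in words[:2]]
        try:
            if command == ["CREATE", "CONCEPT"] and len(words) == 3:
                ordi.create_concept(words[2])
            elif command == ["DROP", "CONCEPT"] and len(words) == 3:
                ordi.drop_concept(words[2])
            elif (command == ["ALTER", "CONCEPT"] and len(words) == 6
                  and [w.upper() for w in words[3:5]] == ["RENAME", "TO"]):
                ordi.rename_concept(words[2], words[5])
            else:
                raise ValueError("unrecognized statement")
            results.append("ok: " + text)
        except (KeyError, ValueError) as error:
            # A failing statement is reported; the batch continues.
            results.append(f"error: {text} ({error})")
    return results


ordi = FakeOrdi()
report = run_ddl(
    "CREATE CONCEPT Person; ALTER CONCEPT Person RENAME TO Human; DROP CONCEPT Ghost;",
    ordi,
)
```

The returned list of result lines is the kind of output the editor tool could display after processing a DDL file. Whether a failing statement should abort the whole batch is a design decision this sketch does not settle.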
The ontology representation and data integration functionality will be realized with respect to the ORDI framework [Kiryakov et al., 2004]. The included data and ontology models are described below.
The overall scheme is that ORDI will act as middleware, providing the OM tools and other applications with uniform access to various reasoners, repositories and other data sources. This strategy will be implemented through wrappers.
We ground our data representation on the RDF data model ([Klyne and Carroll, 2004]), since it is well-founded and detached from the semantics of the various knowledge representation, ontology, and semantic web languages used today. Another argument is that there have been no major changes to its specification recently, which indicates that it has reached a certain degree of maturity. Finally, most of the formalisms used today to define the formal semantics of the different languages can easily deal with raw RDF data. To state it more explicitly, we see the data that will be used or manipulated through ORDI as an RDF graph, defined as a set of RDF statements (triples).
Structured bodies of data represented in this model are called data graphs (in order to avoid the term RDF graph and its inappropriate connotations of RDFS semantics).
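A data graph, and a wrapper that exposes non-ontological data as one, can be sketched in a few lines of Python (used purely for illustration). The function names, the `table/key` subject scheme and the use of plain strings instead of URIs are simplifying assumptions made for this example.

```python
def rows_to_data_graph(table, rows, key):
    """Wrap relational rows as a data graph: a set of (subject, predicate, object) triples."""
    graph = set()
    for row in rows:
        subject = f"{table}/{row[key]}"        # hypothetical identifier scheme
        graph.add((subject, "type", table))
        for column, value in row.items():
            if column != key and value is not None:
                graph.add((subject, column, value))  # NULL columns yield no triple
    return graph


def objects(graph, subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


rows = [
    {"id": 1, "name": "Innsbruck", "country": "AT"},
    {"id": 2, "name": "Sofia", "country": None},
]
graph = rows_to_data_graph("city", rows, key="id")
names = objects(graph, "city/2", "name")
```

Because the result is just a set of triples, it carries no commitment to any particular ontology language, which is the property the data model is chosen for.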
The ontology model in ORDI will be based on the one defined in WSMO (see Listing 1).
entity ontology
    nonFunctionalProperties ofType nonFunctionalProperties
    importedOntologies ofTypeSet ontology
    usedMediators ofTypeSet ooMediator
    concepts ofTypeSet concept
    relations ofTypeSet relation
    functions ofTypeSet function
    instances ofTypeSet instance
    axioms ofTypeSet axiom
This model is defined on a conceptual (epistemological) level, which means that it is formal enough to allow conceptualization but still makes only minimal commitments to the semantics of the ontologies.
Details on the formal representation of such a conceptual model and its mapping to data graphs are given in [Kiryakov et al., 2004].
Ontology Query Language
In order to realize the querying functionality we will develop a query language and a corresponding interpreter.
In this section we describe the implementation decisions that have been made so far.
Currently, development of the versioning tool is in the architecture and design phase; few implementation choices have therefore been made, and few implementation options (choices we will have to make) are apparent.
The agreement is to implement versioning support as a layer above ORDI, with the UI as part of the editor/browser tool. It has been decided that versioning will be done on the conceptual (semantic) level, as opposed to other popular approaches such as storing the ontologies in syntactic form and using an existing syntactic versioning system such as CVS.
The version space component faces the choice of where and how the version space will actually be represented and what will be stored in ORDI. The options are currently unclear and under active discussion.
The ORDI facade component will resemble the underlying ORDI API as much as possible to ease the inclusion of the versioning component in the rest of the system during integration tasks.
Due to time limitations, the authorization component is likely to be very simple and insecure: it will simply ask users for their name and rely on their honesty. This will be fully sufficient for the purposes of the initial prototype and demo.
Merging & Alignment
So far no decisions have been made about the implementation of the merging and alignment component.
Editing & Browsing
Currently, development of the editing and browsing tool is in the architecture and design phase; few implementation choices have therefore been made, and few implementation options (choices we will have to make) are apparent.
Both the ontology and instance editors/browsers will be based on the Eclipse platform which has been evaluated as the best basis for the tool. The UI widgets provided by Eclipse will be reused as much as possible.
We will attempt to provide a graphical representation of ontologies and instances in the editor/browser components. The options currently under consideration are trees, hyperbolic-plane trees and unconstrained graphs with automatic or user-driven layout. These options need to be evaluated with regard to two goals: handling large-scale ontologies and instance collections efficiently, and supporting the layout of automatically generated ontologies or instance collections, which tends to become confusing as the number of components grows.
Finally, it is not yet clear how closely the versioning UI support will be integrated with the two editor/browser components; the options depend on the unresolved question of how the version space is represented in the versioning layer.
Ontology representation and data integration
The core ontology representation API will be an extension of the WSMO Ontology API. In addition, OMS will also specify a set of functional interfaces (for storage and retrieval, for versioning, etc.).
Based on the agreement reached in the DIP project, the ontology representation and data integration framework will be developed in Java.
The ontology representation and data integration API shall be based on an existing repository tool. Two widespread alternatives are Jena and Sesame. The table below summarizes the features available in each tool.
As a general approach, ORDI will be coupled with multiple wrappers for repositories and ontology servers. A wrapper for KAON 2 is a high priority.
Ontology Query Language
So far no decisions have been made about the implementation of the query language component.
Efforts / Timetable
The necessary efforts and an appropriate schedule still have to be discussed. A first draft is summarized in Table 1 and will be described in more detail in the following sections.
|Versioning||Merging & Alignment||Editing & Browsing||Representation & Repository||DDL|
Table 1: Deadlines and responsibilities
The versioning tool as described in 2.1 shall be realized according to the following sections.
The requirements of the versioning tool will be analyzed by DERI Innsbruck. The responsible person is Jacek Kopecký. The document will be finished by October 4th 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d6/d6.1 .
The architecture of the versioning tool will be designed by DERI Innsbruck. The responsible person is Jacek Kopecký. The document will be finished by October 4th 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d6/d6.2 .
The versioning tool will be implemented by DERI Innsbruck. The responsible person is Jacek Kopecký. The prototype will be finished by December 31st 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d6/d6.3 .
Merging and Alignment Tool
The merging and alignment tool as described in Section 2 shall be realized according to the following sections.
The requirements of the merging and alignment tool will be analyzed by DERI Innsbruck. The responsible person is Francois Scharffe. The document will be finished by December 31st 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d7/d7.1 .
The architecture of the merging and alignment tool will be designed by DERI Innsbruck. The responsible person has not yet been assigned. The document will be finished by April 30th 2005. The latest version can always be found at http://www.omwg.org/TR/2004/d7/d7.2 .
The merging and alignment tool will be implemented by DERI Innsbruck. The responsible person has not yet been assigned. The prototype will be finished by June 30th 2005. The latest version can always be found at http://www.omwg.org/TR/2004/d7/d7.3 .
Editing and Browsing Tool
The editing and browsing tool as described in Section 2 shall be realized according to the following sections.
The requirements of the editing and browsing tool will be analyzed by DERI Innsbruck. The responsible person is Jacek Kopecký. The document will be finished by October 4th 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d8/d8.1 .
The architecture of the editing and browsing tool will be designed by DERI Innsbruck. The responsible person is Jan Henke. The document will be finished by October 4th 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d8/d8.2 .
The editing and browsing tool will be implemented by DERI Innsbruck. The responsible person is Jan Henke. The prototype will be finished by December 31st 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d8/d8.3 .
Representation and Repository
The repository as described in Section 2 shall be realized according to the following sections.
The requirements of the repository are managed by Ontotext. The responsible person is Atanas Kiryakov. They are already available in [Kiryakov et al., 2004].
The architecture of the representation API and the approach for wrapping existing repositories and data sources will be delivered by Ontotext. The responsible person is Atanas Kiryakov. The most important design questions are already covered in [Kiryakov et al., 2004]. The next version will be presented after the implementation of the first phase.
The ontology representation API will be implemented by Ontotext. It is currently being implemented as an extension of the WSMO Ontology API. The first version will be available by October 18th 2004.
The repository will be implemented by Ontotext. The responsible person is Atanas Kiryakov. The prototype will be finished by December 31st 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d9/d9.3 .
DDL for Ontologies
The DDL as described in Section 2 shall be realized according to the following sections.
The requirements of the DDL still have to be analyzed. The responsible person has not yet been assigned. The document will be finished by December 31st 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d10/d10.1 .
The architecture of the DDL still has to be designed. The responsible person has not yet been assigned. The document will be finished by April 30th 2005. The latest version can always be found at http://www.omwg.org/TR/2004/d10/d10.2 .
The DDL still has to be implemented. The responsible person has not yet been assigned. The prototype will be finished by June 30th 2005. The latest version can always be found at http://www.omwg.org/TR/2004/d10/d10.3 .
In this document we outlined the way to go in order to reach the ambitious target of creating a general ontology management system to be used by several DERI projects and working groups.
The work is funded by the European Commission under the projects DIP, Knowledge Web, Ontoweb, SEKT, SWWS, Esperonto and h-TechSight; by Science Foundation Ireland under the DERI-Lion project; and by the Vienna city government under the CoOperate programme.