
Copyright © 2004 DERI®, All Rights Reserved. DERI liability, trademark, document use, and software licensing rules apply.
The DIP ontology versioning tool will be integrated in the suite of DIP tools, and as such should be able to communicate with these. More importantly, it can benefit from the features offered by other tools in the ontology management suite. With this requirement this tool contributes to the overall interoperability and compatibility requirement for the tool suite. Deliverable 4, Section 2 situates the versioning tool in the DIP framework.
The versioning principles should not be based on one particular ontology language, but should provide a solution for as many ontology languages and paradigms as possible. Defining a fully model-independent versioning framework and tool might be utopian, but by working in this direction we avoid discriminating against particular languages.
An example of this is WSMO studio [todo ref], which supports conversions to and from various ontology languages. The versioning tool can utilize this feature to support interoperability and compatibility between versions, even if they are potentially written in different languages.
The ontology language adopted here is WSMO Core. The storage will be supported by ORDI.
In this section we elaborate on the functional requirements from a research point of view. For a detailed implementation priority list with respect to these requirements we refer to Section 4.
The set of all versions stored in the repository can be represented by a cloud of nodes in a version space. The change specification (in fact a transformation) between two versions is represented by a directed edge connecting the respective version nodes. This results in a graph of versions. Some transformations are reversible, resulting in undirected edges (Figure 1).
Figure 1: An illustration adopted from [De Leenheer, 2004] of an arbitrary set of possible ontologies Ωkl. The dashed and solid arrows between the ontologies reflect transformations; each ontology is trivially accessible from itself, so reflexive arrows are left implicit.
In general, we thus consider two types of first-class citizens in the repository: (i) versions (Requirement R4) and (ii) their inter-relationships (or transformations) (Requirement R7). Furthermore, each version has (a) a unique identification (Requirement R5), and (b) an additional meta-information block (Requirement R6).
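The version space described above can be sketched as a small directed graph. This is only an illustration under our own assumptions: the names `Version`, `VersionSpace`, `add_transformation` and `reachable_from` are hypothetical and do not belong to any DIP tool or to ORDI, and the meta-information block is reduced to a plain tuple.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    """A node in the version space: an identifier plus a meta-information block."""
    version_id: str
    meta: tuple = ()  # hypothetical layout, e.g. (("author", "alice"),)


@dataclass
class VersionSpace:
    """Holds the two first-class citizens: versions and transformations."""
    versions: dict = field(default_factory=dict)  # version_id -> Version
    edges: dict = field(default_factory=dict)     # (source_id, target_id) -> description

    def add_version(self, version):
        if version.version_id in self.versions:
            raise ValueError(f"duplicate version id: {version.version_id}")
        self.versions[version.version_id] = version

    def add_transformation(self, source_id, target_id, description, reversible=False):
        for vid in (source_id, target_id):
            if vid not in self.versions:
                raise KeyError(f"unknown version: {vid}")
        self.edges[(source_id, target_id)] = description
        if reversible:
            # A reversible transformation behaves like an undirected edge.
            self.edges[(target_id, source_id)] = f"inverse of: {description}"

    def reachable_from(self, start_id):
        """Return the ids of all versions reachable by following transformations."""
        seen, stack = {start_id}, [start_id]  # a version is trivially reachable from itself
        while stack:
            current = stack.pop()
            for (src, dst) in self.edges:
                if src == current and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen


space = VersionSpace()
for vid in ("v1", "v2", "v3"):
    space.add_version(Version(vid))
space.add_transformation("v1", "v2", "add class Person", reversible=True)
space.add_transformation("v2", "v3", "drop slot age")
```

In this toy space, `v1` can reach every version, while `v3` can only reach itself because the last transformation is not reversible.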
This version space representation gives the user the opportunity to revert to a previous version of the ontology, either for the limited duration of a session or permanently. It acts first as a conceptual overview of the requirements that are elaborated in the following subsections, and second as a possible visualisation of the version space.
Trivially, a user must be able to store his version of an ontology for later retrieval. From this requirement it should be clear how we will physically store a version. Principles from classic versioning mechanisms such as CVS [Berliner, 1990] and Subversion [todo - Collins-Sussman] can be adopted here.
Decisions have to be made about what will be stored as a new version. Different levels of granularity are possible. We adopt and present here some principles from data schema versioning [todo - Andany et al., 1991]:
or even:
The granularity will only be increased if performance and storage problems arise.
Permissions and restrictions for editing ontology versions should be foreseen, and supported by the underlying storage layer. If ORDI is chosen as storage layer, this is done automatically.
Ontology versions are given different labels, describing the state they are in. We identify essentially three:
Naturally, administrator privileges are not subject to these restrictions.
The transformation transaction control (Requirement R8) must be aware of this requirement.
For each version stored in the system, there should be a persistent and unique identifier. This identifier is generated by an identification mechanism that conforms to Requirement R4. The latter means that the identification mechanism must be able to identify different versions of multiple ontologies, as well as versions of knowledge elements such as concepts.
Again we can refer to existing principles from software versioning [todo - Brown], and related work [todo - Klein et al.].
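One possible shape for such an identifier is sketched below. The scheme (ontology name, revision counter, optional knowledge element) and the names `VersionId` and `IdGenerator` are our own assumptions for illustration only; the sketch keeps its counters in memory, so the persistence that the requirement asks for would have to come from the storage layer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionId:
    """Hypothetical identifier: ontology name, revision number, optional element."""
    ontology: str
    revision: int
    element: Optional[str] = None  # e.g. a concept name; None for a whole ontology

    def __str__(self):
        base = f"{self.ontology}@{self.revision}"
        return f"{base}#{self.element}" if self.element else base


class IdGenerator:
    """Issues increasing revision numbers per (ontology, element) pair."""

    def __init__(self):
        self._next = {}  # (ontology, element) -> next revision number

    def new_id(self, ontology, element=None):
        key = (ontology, element)
        revision = self._next.get(key, 1)
        self._next[key] = revision + 1
        return VersionId(ontology, revision, element)


gen = IdGenerator()
first = gen.new_id("travel")
second = gen.new_id("travel")
concept = gen.new_id("travel", element="Person")
```

Because the identifiers are immutable and hashable, they can serve directly as keys in a version repository.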
Each version has important meta-information that is not necessarily intended for computer processing, but rather for guiding the human ontology engineer in his versioning process. Apart from other potential elements, the following elements are essential:
A more advanced way is modelling the information according to a certain (versioning) ontology, and some basic concepts of such an ontology have been highlighted by [todo - Klein]. However the relevance and added value of such an extra elaboration should be considered first.
The evolution process between two ontology versions should be formally specified. [Klein 2004] provides different complementary alternatives (change logs, conceptual relations, and transformation sets) that can give a rich description of the change that the original ontology has undergone. We first discuss more basic needs.
A formal change specification describes unambiguously and correctly how exactly an evolution process of one ontology version into another one occurred. In the simplest case, an evolution process is a transformation described as a sequence of elementary transformations applied to a particular ontology.
Determining a set of possible elementary change operators goes hand in hand with determining which knowledge elements can evolve in an ontological definition, where the latter depends on the chosen ontology paradigm. We have stated in Requirement R2 that our versioning framework should be model-independent, so we assume there exists a finite set of atomic change operators, and that this set is available.
[todo - Banerjee et al., 1987] presents a taxonomy of change operators that can be applied to the ORION object-oriented data model. Other researchers did this likewise for other paradigms such as relational data schemas [e.g., todo - Roddick, 1993], conceptual data schemas [e.g., De Troyer, 1993, Halpin, 1989], etc.
As an illustration, consider the RDFS model, where the evolvable knowledge elements are classes, slots, constraints, etc. This would result in the operators add class, drop class; add slot, drop slot; add constraint, drop constraint, etc., respectively. In general, when defining a taxonomy we can structure mutators in at least three categories: (i) for the specialisation/generalisation hierarchy, (ii) for the concept definitions, and finally (iii) for the instance data. All ontology models should recognize these categories.
Basically, this set of operators must not restrict the possible transformations. In other words, it should be sound and complete.
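A transformation as a sequence of elementary operators can be illustrated with a toy model. Here we assume, purely for illustration, that an ontology version is a dictionary mapping class names to sets of slot names; the operator names follow the add/drop pairs mentioned above, and `apply_transformation` is a hypothetical helper.

```python
import copy


def add_class(name):
    def op(onto):
        if name in onto:
            raise ValueError(f"class already exists: {name}")
        onto[name] = set()
    return op


def drop_class(name):
    def op(onto):
        del onto[name]  # raises KeyError if the class is absent
    return op


def add_slot(cls, slot):
    def op(onto):
        onto[cls].add(slot)
    return op


def drop_slot(cls, slot):
    def op(onto):
        onto[cls].remove(slot)  # raises KeyError if the slot is absent
    return op


def apply_transformation(onto, operators):
    """Apply the operators in order to a copy, leaving the input version untouched."""
    result = copy.deepcopy(onto)
    for op in operators:
        op(result)
    return result


v1 = {"Person": {"name"}}
v2 = apply_transformation(
    v1,
    [add_class("Company"), add_slot("Company", "address"), drop_slot("Person", "name")],
)
```

The original version `v1` is preserved, so both nodes of the corresponding edge in the version space remain available.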
Next to the hard requirement for a finite, sound and complete set of atomic transformations (Requirement R7.1), we also require support for more complex changes. Complex changes allow the user to express his/her intent in a more meaningful (and high-level) manner.
Complex changes are built from a sequence of atomic changes, although the same sequence can have a potentially different meaning; this is a very important distinction. Indicating a complex change defines exactly what atomic changes need to be made, but a sequence of atomic changes that matches those defined by a complex change does not necessarily imply that the complex change has been made. As an example, consider two classes A and B, having slots a and b respectively. The sequence of deleting a in A and subsequently adding a in B is different from moving slot a from class A to B [Lerner, 2000].
The idea of complex change operators has been rarely studied in data schema evolution [Lerner, 2000], but has inspired related work in ontology evolution [todo - Stojanovic, Klein].
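The distinction between a complex change and its atomic expansion can be made concrete in a change log. The sketch below is an assumption-laden illustration: `Atomic`, `MoveSlot` and `flatten` are hypothetical names, and only the move-slot example from [Lerner, 2000] is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atomic:
    operator: str  # "add_slot" or "drop_slot"
    cls: str
    slot: str


@dataclass(frozen=True)
class MoveSlot:
    """A complex change that records the engineer's intent."""
    slot: str
    source: str
    target: str

    def expand(self):
        # The complex change fixes exactly which atomic changes are needed.
        return [
            Atomic("drop_slot", self.source, self.slot),
            Atomic("add_slot", self.target, self.slot),
        ]


def flatten(log):
    """Replace every complex entry in a change log by its atomic expansion."""
    atoms = []
    for entry in log:
        atoms.extend(entry.expand() if hasattr(entry, "expand") else [entry])
    return atoms


log_with_intent = [MoveSlot("a", "A", "B")]
log_without_intent = [Atomic("drop_slot", "A", "a"), Atomic("add_slot", "B", "a")]
```

Both logs flatten to the same atomic sequence, yet only the first one states that slot a was moved; the second leaves open whether the new slot in B is related to the dropped one.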
More advanced, a complex evolution process can be defined as transformations between subconstructs of the ontology. If the ontology is representable by a graph, a custom change could be the morphing from one subgraph into another. In general, a library can be kept of custom transformations. This idea is adopted from software evolution [Mens, 1999].
After the engineer has specified his transformation specification, he can consider it as a logical unit of work and store it in the version repository server. This transaction can have one of two outcomes. If it completes successfully, the transaction is said to have committed. On the other hand if it was not successful, the transaction is aborted. In the latter case the version repository is rolled back to its previous consistent state, and the user is notified. We refer to the ACID properties [todo - Haerder and Reuter, 1983] from the DB community here.
First of all, the system should check whether the version is revisable by reading its permissions (cf. authorization, Requirement R4.2). Then and only then can and must the system check whether the transaction preserves the logical integrity of the version space before committing it to the server. The logical integrity consists of the following two strong criteria:
And two less important ones:
Eventually the system could ask:
And decide whether or not to also allow updates of old versions of the ontology [Kim and Chou, 1988].
Any failure caused by a transaction in order to preserve/achieve logical integrity results in a roll-back of that transaction.
A last requirement is that each update of the version space should be persistent.
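The commit-or-roll-back behaviour described in this requirement can be sketched as follows. All names (`VersionRepository`, `TransactionAborted`, `no_dangling_parent`) are hypothetical, the integrity check is a stand-in chosen for the example rather than one of the criteria of this requirement, and persistence is left out.

```python
import copy


class TransactionAborted(Exception):
    """Raised when a transaction cannot be committed; the repository is unchanged."""


class VersionRepository:
    def __init__(self, state, revisable=True):
        self.state = state          # toy ontology: class name -> parent class (or None)
        self.revisable = revisable  # stand-in for the permission check

    def commit(self, operators, integrity_checks):
        # Step 1: the permission check comes first.
        if not self.revisable:
            raise TransactionAborted("version is not revisable")
        # Step 2: work on a private copy so a failure never touches the stored state.
        candidate = copy.deepcopy(self.state)
        try:
            for op in operators:
                op(candidate)
        except Exception as exc:
            raise TransactionAborted(f"operator failed: {exc!r}") from exc
        # Step 3: verify logical integrity before making the change visible.
        for check in integrity_checks:
            if not check(candidate):
                raise TransactionAborted(f"integrity check failed: {check.__name__}")
        self.state = candidate


def no_dangling_parent(onto):
    """Example check: every parent class must itself be defined."""
    return all(parent is None or parent in onto for parent in onto.values())


def add_student(onto):
    onto["Student"] = "Person"


def drop_thing(onto):
    del onto["Thing"]


repo = VersionRepository({"Thing": None, "Person": "Thing"})
repo.commit([add_student], [no_dangling_parent])
try:
    repo.commit([drop_thing], [no_dangling_parent])
    aborted = False
except TransactionAborted:
    aborted = True
```

The second transaction is aborted because dropping Thing would leave Person without a defined parent, and the repository keeps the state produced by the first, successful commit.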
In data schema evolution several principles were defined to keep the schema consistent after each change of its definition. These solution principles are referred to as semantics of change.
A possibility is to define invariant properties intrinsic to the model to ensure semantic and structural integrity; and then to define rules or primitives for effecting the changes, by preserving these invariants [todo - Banerjee et al., 1987]. Mostly there are multiple alternative ways to preserve the invariants; the transformation rules are then responsible for choosing the most meaningful way.
Another possibility is to carefully restrict the set of mutators. In the relational model, only atomic change operators that preserve consistency are allowed. More complex well-ordered sequences of such atomic change operators are allowed if there are constraints on their application order. [Roddick, 1993] requires the atomic change operators to be expressible in relational algebra.
Once the ontology has been revised, its interpretation mapping with some committing applications might be broken, resulting in wrong interpretations of instance data and semantics. The person responsible for the committing application has several alternatives:
A change in an ontology is always a decision that has been agreed on by a community. Each person responsible for an application then has to decide individually whether to follow the trend and change along. However, if the backwards/forwards compatibility requirement (see R8) is fulfilled, there is no need to revise the application model or interpretation mapping.
Dropping a concept: Suppose the decision is taken to drop the concept PERSON from an ontology. This means that the majority of the community members are no longer interested in interpreting the concept. Members that do not agree keep their commitment to the old version, and the new version is backwards compatible with the old one.
Adding a concept: A new concept is introduced in the ontology, resulting in a new version. Old application models will still be compatible with the new version.
Updating a concept: In some paradigms this amounts to adding or dropping properties of a concept, in others to dropping or adding relationships between concepts. Potential (consistency) problems must be anticipated here.
Changing constraints and rules: The support needed here depends on the expressiveness of the language. Here too, consistency problems might arise and must be anticipated.
The user should be able to view differences between versions in an easy and intuitive manner. A good starting point here can be the work done in PROMPTDiff. The custom transformations mentioned in Requirement R7.2 could be used in the comparison.
This feature unlocks possibilities for new applications such as forking/joining parallel ontology versions. When an initial ontology is used and evolved several times by different independent user groups (forking), it could be interesting to investigate the differences between the two resulting versions, and merge them. Note that ontology merging is not within the scope of this WP.
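As a very rough illustration of what a difference view between two versions could report, the sketch below compares two toy ontologies represented as dictionaries from class names to sets of slot names. This is a plain structural set difference under our own assumptions, not the PROMPTDiff algorithm, and `diff_versions` is a hypothetical name.

```python
def diff_versions(old, new):
    """Report added and dropped classes and slots between two toy ontology versions."""
    old_classes, new_classes = set(old), set(new)
    changes = {
        "added_classes": sorted(new_classes - old_classes),
        "dropped_classes": sorted(old_classes - new_classes),
        "added_slots": [],
        "dropped_slots": [],
    }
    # Slots are only compared for classes present in both versions.
    for cls in sorted(old_classes & new_classes):
        changes["added_slots"] += [(cls, s) for s in sorted(new[cls] - old[cls])]
        changes["dropped_slots"] += [(cls, s) for s in sorted(old[cls] - new[cls])]
    return changes


fork_a = {"Person": {"name", "age"}, "Company": {"address"}}
fork_b = {"Person": {"name", "email"}, "Project": set()}
report = diff_versions(fork_a, fork_b)
```

Such a report lists atomic differences only; recognising that a set of them forms a custom transformation would need additional matching logic.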
The impact of a certain ontology evolution, both on the conceptual and the instance level, should be calculated. Deployment of ontologies in applications typically comes down to committing [Guarino, 1998, Meersman, 2001] or mapping the application's information assets (such as data schemas) to the ontological model. Any change in the ontology might thus break the integrity of this commitment.
For example, if the ontology has undergone an evolution process, impact analysis means detecting that the semantics of the web services are broken, and that the existing service clients may now be getting answers that are interpreted wrongly. Requirement R9 tackles this problem.
Further, there is a need to calculate the impact of the consequence in case of change cascades. If ontology A is included in another ontology B, then there may be consequences for B if A is revised.
Note that impact analysis only analyses the possible impact, and informs the engineer. It does not solve any of the problems it detected.
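The cascade part of impact analysis amounts to a traversal of the inclusion (or commitment) relation. The sketch below assumes, for illustration only, that this relation is available as a dictionary from each ontology or application to the ontologies it includes; `impacted_by` is a hypothetical name. In line with the note above, it only reports what may be affected.

```python
from collections import deque


def impacted_by(changed, includes):
    """Return everything that directly or transitively includes `changed`."""
    # Invert the relation: ontology -> set of things that include it.
    dependents = {}
    for user, used in includes.items():
        for target in used:
            dependents.setdefault(target, set()).add(user)

    impacted, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            if user != changed and user not in impacted:
                impacted.add(user)
                queue.append(user)
    return impacted


includes = {
    "B": ["A"],          # ontology B includes ontology A
    "C": ["B"],          # ontology C includes ontology B
    "booking_app": ["C"],
    "D": [],
}
affected = impacted_by("A", includes)
```

Revising A may therefore have consequences for B, C and the application committing to C, while D is unaffected.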
Ontology construction and deployment is an extensive task that is typically tackled by multiple teams of knowledge engineers and/or domain experts. They work concurrently on the same or different parts of the ontology, and regular synchronisations are needed, resulting in frequently generated stable versions.
This subsection specifies the essential client-side requirements. Much of it is based on the ideas in [De Leenheer, 2004] and [todo - Stojanovic, 2002].
A graphical version space browser (according to Requirement R3) provides a convenient view of all first-class citizens (ontologies and their inter-relationships) stored on the server. Further, a zoom feature on all first-class citizens enables:
The engineer needs an editor where he can evolve ontologies in two ways:
The tool needs an agent to manage and broadcast messages and notifications to the users. For example, when an engineer has committed a transformation, the notification agent is responsible for notifying users of possible implications such as cascading changes and forced roll-backs, as illustrated in Requirement R8.
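A notification agent of this kind can be sketched with a simple publish/subscribe structure. The class and method names below are hypothetical, and the in-memory inboxes merely stand in for whatever delivery channel the tool would actually use.

```python
class NotificationAgent:
    """Keeps one inbox per subscribed user and broadcasts messages to them."""

    def __init__(self):
        self._inboxes = {}  # user -> list of received messages

    def subscribe(self, user):
        self._inboxes.setdefault(user, [])

    def broadcast(self, message, exclude=None):
        # Deliver to every subscriber except, optionally, the originating user.
        for user, inbox in self._inboxes.items():
            if user != exclude:
                inbox.append(message)

    def inbox(self, user):
        return list(self._inboxes[user])


agent = NotificationAgent()
for engineer in ("alice", "bob", "carol"):
    agent.subscribe(engineer)

# alice commits a transformation; the others are told about possible implications.
agent.broadcast("ontology A revised: ontology B may need review", exclude="alice")
agent.broadcast("transaction on ontology C rolled back")
```

Here bob and carol receive both messages, while alice only receives the roll-back notice.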
For the implementation we have distinguished the following phases:
The Xs indicate in which phase which requirement is being initially tackled.
| Req ID | Versioning Requirement | Version 0 | Version 1 | Version 2 | Version 3 | Priority |
|---|---|---|---|---|---|---|
| R1 | Interoperability/Compatibility | X | | | | (affects all implementation) |
| R2 | Genericity | X | | | | (affects all implementation) |
| R4 | Versioning Strategy | X | | | | HIGHEST |
| R5 | Version Identification | X | | | | HIGHEST |
| R6 | Additional Meta Information | X | | | | HIGH |
| R3 | Version Space Representation | X | | | | HIGHEST |
| R13 | GUI Version Browser | X | | | | HIGHEST |
| R7 | Formal Transformation Specification | X | | | | HIGHEST |
| R14 | GUI Transformation Editor | X | | | | HIGH |
| R10 | Difference Analysis | X | | | | HIGHEST |
| R9 | Propagation of Changes | X | | | | HIGH |
| R8 | Version Validation Control | X | | | | MEDIUM |
| R11 | Impact Analysis | X | | | | HIGH |
| R15 | GUI Notification Agent | X | | | | HIGH |
| R12 | Distributed Environment Support | X | | | | LOW |
The work is funded by the European Commission under the projects DIP, Knowledge Web, Ontoweb, SEKT, SWWS, Esperonto and h-TechSight; by Science Foundation Ireland under the DERI-Lion project; and by the Vienna city government under the CoOperate programme.
The authors would like to thank all the members of the OMWG working group for their advice and input to this document.
[Banerjee and Kim, 1987] Banerjee, J. and Kim, W. (1987) Semantics and Implementation of Schema Evolution in Object-oriented Databases. ACM SIGMOD Conf., SIGMOD Record 16(3):311--322.
[Berliner, 1990] Berliner, B. (1990) CVS II: Parallelizing Software Development. In Proc. of the USENIX Winter 1990 Technical Conf. (Berkeley, CA), pp 341--352, USENIX Association.
[De Leenheer, 2004] De Leenheer, P. (2004) Revising and Managing Multiple Ontology Versions in a Possible Worlds Setting. In Proceedings of the OTM 2004 Workshops, LNCS, 2004. Springer Verlag.
[De Troyer, 1993] De Troyer, O. (1993) On Data Schema Transformation, PhD. thesis, University of Tilburg (K.U.B.), Tilburg, The Netherlands.
[Collins-Sussman et al., 2004] Collins-Sussman, B., Fitzpatrick, B.W., and Pilato, C.M. (2004) Version Control with Subversion, O'Reilly.
[Guarino, 1998] Guarino, N. (1998) Formal Ontology and Information Systems. In Proc. of the 1st Int'l Conf. on Formal Ontologies in Information Systems (FOIS98) (Trento, Italy), IOS Press, pp. 3-15.
[Haerder et al., 1983] Haerder, T., and Reuter, A. (1983) Principles of transaction-oriented database recovery. ACM Computing Surveys 15 (1983), pp. 287-317.
[Halpin, 1989] Halpin, T. (1989) A logical analysis of information systems: static aspects of the data-oriented perspective. PhD thesis, University of Queensland, Brisbane, Australia.
[Kim and Chou, 1988] Kim, W. and Chou, H. (1988) Versions of Schema for Object-oriented Databases. In Proc. of the 14th Int'l Conf. on Very Large Data Bases (VLDB 1988) (Los Angeles, CA), Morgan-Kaufmann, pp. 148--159.
[Klein et al., 2002] Klein, M., Fensel, D., Kiryakov, A., and Ognyanov, D. (2002) Ontology Versioning and Change Detection on the Web. In Proc. of the 13th Int'l Conf. on Knowledge Engineering and Knowledge Management (EKAW02) (Siguenza, Spain), Springer-Verlag, pp. 197--212.
[Klein, 2004] Klein, M. (2004) Change Management for Distributed Ontologies. PhD thesis, Vrije Universiteit Amsterdam.
[Lerner, 2000] Lerner, B. (2000) A model for compound type changes encountered in schema evolution. ACM Transactions on Database Systems (TODS), 25(1):83--127, ACM Press.
[Meersman, 2001] Meersman, R. (2001) Ontologies and Databases: More than a Fleeting Resemblance. In Proc. of the Int'l Workshop on Open Enterprise Solutions: Systems, Experiences, and Organisations (OES-SEO2001) (Rome, Italy), Luiss Publications.
[Mens, 1999] Mens, K. (1999) A Formal Foundation for Object-Oriented Software Evolution, PhD. Thesis, Vrije Universiteit Brussel, Belgium.
[Roddick et al., 1993] Roddick, J., Craske, N., and Richards, T. (1993) A Taxonomy for Schema Versioning Based on the Relational and Entity Relationship Models, In Proc. the 12th Int'l Conf. on Conceptual Modeling / the Entity Relationship Approach (Dallas, TX), Springer-Verlag, pp. 143--154.
[Stojanovic et al., 2002(1)] Stojanovic, L., Maedche, A., Motik, B., and Stojanovic, N. (2002) User-driven Ontology Evolution Management. In Proc. of the 13th European Conf. on Knowledge Engineering and Knowledge Management (EKAW02) (Siguenza, Spain), Springer-Verlag, pp. 285--300.
[Stojanovic et al., 2002(2)] Stojanovic, L., Stojanovic, N., and Handschuh, S. (2002) Evolution of the Metadata in the Ontology-based Knowledge Management. In Proc. of the German Workshop on Experience Management (Berlin, Germany), GI, pp. 65--77.
$Date: 2005/04/11 14:39:24 $