
OMWG D6.1: Requirements for Versioning Tool

DERI OMWG Working Draft 22 October 2004

This version:
http://www.omwg.org/TR/2004/d6/d6.1/v0.1/20041022/
Latest version:
http://www.omwg.org/TR/2004/d6/d6.1/v0.1/
Previous version:
http://www.omwg.org/TR/2004/d6/d6.1/v0.1/20041008/
Authors:
Pieter De Leenheer
Carlo Wouters
Editors:
Jacek Kopecký
Pieter De Leenheer

Copyright © 2004 DERI®, All Rights Reserved. DERI liability, trademark, document use, and software licensing rules apply.


Table of contents

1 Tool Requirements
1.1 Interoperability and compatibility
1.2 Genericity
2 Functional Requirements
2.1 Version Space Representation
2.2 Versioning Strategy
2.2.1 Granularity of versions
2.2.2 Authorization for revision
2.2.3 Version Status Identification
2.3 Version Identification
2.4 Meta-information about versions
2.5 Formal Transformation Specification
2.5.1 Elementary transformations
2.5.2 More complex transformations
2.6 Version Validation Control
2.7 Propagation of Changes
2.8 A Feature for Analysing Differences between Versions
2.9 Impact Analysis
2.10 Distributed Environment Support
3 Interface Requirements
3.1 Version Browser
3.2 Transformation Editor
3.3 Notification Agent
4 Implementation Priority List
5 Global Architecture
6 Conclusion
7 Acknowledgement
Appendix A. References

1 Tool Requirements

1.1 Interoperability and compatibility

The DIP ontology versioning tool will be integrated in the suite of DIP tools, and as such should be able to communicate with these. More importantly, it can benefit from the features offered by other tools in the ontology management suite. With this requirement this tool contributes to the overall interoperability and compatibility requirement for the tool suite. Deliverable 4, Section 2 situates the versioning tool in the DIP framework.

1.2 Genericity

The versioning principles should not be tied to one particular ontology language, but should provide a solution for as many ontology languages and paradigms as possible. Defining a fully model-independent versioning framework and tool might be utopian, but by working in this direction we avoid favouring any single language or paradigm.

An example of this is WSMO Studio [todo ref], which supports conversions to and from various ontology languages. The versioning tool can use this feature to support interoperability and compatibility between versions, even if they are written in different languages.

The ontology language adopted here is WSMO Core. The storage will be supported by ORDI.

2 Functional Requirements

In this section we elaborate on the functional requirements from a research point of view. For a detailed implementation priority list with respect to these requirements we refer to Section 4.

2.1 Version Space Representation

The set of all versions stored in the repository can be represented by a cloud of nodes in a version space. The change specification (in fact a transformation) between two versions is represented by a directed edge connecting the respective version nodes. This results in a graph of versions. Some transformations are reversible, resulting in undirected edges (Figure 1).

Figure 1: An illustration adopted from [De Leenheer, 2004] of an arbitrary set of possible ontologies Ωkl. The dashed and solid arrows between the ontologies reflect transformations; each ontology is trivially accessible from itself so reflexive arrows are left implicit.

In general, we thus consider two types of first-class citizens in the repository: (i) versions (Requirement 2.2) and (ii) their inter-relationships, or transformations (Requirement 2.5). Furthermore, each version has (a) a unique identification (Requirement 2.3) and (b) an additional meta-information block (Requirement 2.4).

This version space representation gives the user the opportunity to undo changes, or to revert to a previous version of the ontology, either for the limited duration of a session or permanently. It serves in the first place as a conceptual overview of the requirements elaborated in the following subsections, and in the second place as a possible visualisation of the version space.
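The version space described above can be sketched as a small graph structure. This is only an illustration under simplifying assumptions, not the design of the tool: the names VersionSpace, add_transformation and reachable_from are hypothetical, and versions are reduced to plain string identifiers.

```python
from collections import defaultdict


class VersionSpace:
    """Versions are nodes; transformations are directed edges (hypothetical sketch)."""

    def __init__(self):
        self.versions = set()
        self.edges = defaultdict(set)  # source version -> set of target versions

    def add_version(self, version_id):
        self.versions.add(version_id)

    def add_transformation(self, source, target, reversible=False):
        # Both endpoints must already be stored in the repository.
        if source not in self.versions or target not in self.versions:
            raise KeyError("unknown version")
        self.edges[source].add(target)
        if reversible:
            # A reversible transformation behaves like an undirected edge.
            self.edges[target].add(source)

    def reachable_from(self, start):
        """Return every version reachable from `start`, including `start` itself."""
        seen, stack = {start}, [start]
        while stack:
            for nxt in self.edges[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


space = VersionSpace()
for v in ("v1", "v2", "v3"):
    space.add_version(v)
space.add_transformation("v1", "v2", reversible=True)
space.add_transformation("v2", "v3")
```

In this sketch, reverting from v2 to v1 is possible because the edge between them is reversible, whereas nothing but v3 itself is reachable from v3.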

2.2 Versioning Strategy

Trivially, a user must be able to store his version of an ontology for later retrieval. This requirement should make clear how a version is physically stored. Principles from classic versioning mechanisms such as CVS [Berliner, 1990] and Subversion [todo - Collins-Sussman] can be adopted here.

2.2.1 Granularity of versions

A decision has to be made about what is stored as a new version. Several alternative levels of granularity are possible. We adopt and present here some principles from data schema versioning [todo - Andany et al., 1991]:

or even:

The granularity will be increased only if performance or storage problems arise.

2.2.2 Authorization for revision

Permissions and restrictions for editing ontology versions should be provided, and supported by the underlying storage layer. If ORDI is chosen as the storage layer, this is done automatically.

2.2.3 Version Status Identification

Ontology versions are given different labels, describing the state they are in. We identify essentially three:

  1. working version: the version is not stable, finished or closed: it is fully revisable in all its aspects;
  2. stable version: the version is complete and useful; versioning is possible;
  3. final version: the ontology has been agreed on and can never again be versioned or deleted.

Naturally, administrator privileges are not subject to these restrictions.

The transformation transaction control (Requirement 2.6) must be aware of this requirement.
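The three status labels and the administrator exception can be sketched as follows. The function names are hypothetical, and the sketch makes one assumption that the text does not state explicitly: a stable version may serve as the basis for new versions but is no longer edited in place.

```python
from enum import Enum


class Status(Enum):
    WORKING = "working"  # fully revisable in all its aspects
    STABLE = "stable"    # complete and useful; versioning is possible
    FINAL = "final"      # agreed on; cannot be versioned or deleted


def may_edit_in_place(status, is_admin=False):
    """Assumption: only working versions are revised in place."""
    return is_admin or status is Status.WORKING


def may_derive_version(status, is_admin=False):
    """New versions may be derived from anything except a final version."""
    return is_admin or status is not Status.FINAL


def may_delete(status, is_admin=False):
    """Final versions cannot be deleted, except by an administrator."""
    return is_admin or status is not Status.FINAL
```

A transaction control component could call such checks before accepting a transformation.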

2.3 Version Identification

For each version stored in the system, there should be a persistent and unique identifier. This identifier is generated by an identification mechanism that conforms to Requirement 2.2. This means that the identification mechanism must be able to identify different versions of multiple ontologies, as well as versions of knowledge elements such as concepts.

Again we can refer to existing principles from software versioning [todo - Brown], and related work [todo - Klein et al.].
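One possible shape for such identifiers is sketched below. The format ontology#element@revision and the names VersionId and IdRegistry are assumptions made for illustration only; the document does not prescribe an identification scheme.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionId:
    ontology: str
    revision: int
    element: Optional[str] = None  # e.g. a concept name; None for the whole ontology

    def __str__(self):
        target = self.ontology if self.element is None else f"{self.ontology}#{self.element}"
        return f"{target}@{self.revision}"


class IdRegistry:
    """Issues increasing revision numbers per ontology or per knowledge element."""

    def __init__(self):
        self._latest = {}

    def issue(self, ontology, element=None):
        key = (ontology, element)
        self._latest[key] = self._latest.get(key, 0) + 1
        return VersionId(ontology, self._latest[key], element)


registry = IdRegistry()
first = registry.issue("travel")
second = registry.issue("travel")
concept = registry.issue("travel", "Person")
```

Because the identifier is an immutable value, it can be used as a key in the version repository; persistence of the registry itself is left out of this sketch.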

2.4 Meta-information about versions

Each version has important meta-information that is not necessarily intended for computer processing, but rather for guiding the human ontology engineer in his versioning process. Apart from other potential elements, the following elements are essential:

A more advanced way is modelling the information according to a certain (versioning) ontology, and some basic concepts of such an ontology have been highlighted by [todo - Klein]. However the relevance and added value of such an extra elaboration should be considered first.

2.5 Formal Transformation Specification

The evolution process between two ontology versions should be formally specified. [Klein 2004] provides different complementary alternatives (change logs, conceptual relations, and transformation sets) that can give a rich description of the change that the original ontology has undergone. We first discuss more basic needs.

A formal change specification describes unambiguously and correctly how exactly an evolution process of one ontology version into another one occurred. In the simplest case, an evolution process is a transformation described as a sequence of elementary transformations applied to a particular ontology.

2.5.1 Elementary transformations

Determining a set of possible elementary change operators goes hand in hand with determining which knowledge elements can evolve in an ontological definition, where the latter depends on the chosen ontology paradigm. We have stated in Requirement 1.2 that our versioning framework should be model-independent, so we assume that a finite set of atomic change operators exists, and that this set is available.

[todo - Banerjee et al., 1987] presents a taxonomy of change operators that can be applied to the ORION object-oriented data model. Other researchers have done likewise for other paradigms such as relational data schemas [e.g., todo - Roddick, 1993], conceptual data schemas [e.g., De Troyer, 1993, Halpin, 1989], etc.

As an illustration, consider the RDFS model, where the evolvable knowledge elements are classes, slots, constraints, etc. The corresponding operators would be, respectively, add class and drop class; add slot and drop slot; add constraint and drop constraint; etc. In general, when defining a taxonomy we can structure mutators in at least three categories: (i) for the specialisation/generalisation hierarchy, (ii) for the concept definitions, and possibly (iii) for the instance data. All ontology models should recognize these categories.

Basically, this set of operators must not restrict the possible transformations. In other words, it should be sound and complete.
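A transformation as a sequence of elementary operators can be sketched as follows. This is a minimal illustration that covers only the add class and drop class operators mentioned above; an ontology version is reduced to a set of class names, and the names AddClass, DropClass and transform are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddClass:
    name: str

    def apply(self, classes):
        if self.name in classes:
            raise ValueError(f"class {self.name!r} already exists")
        return classes | {self.name}


@dataclass(frozen=True)
class DropClass:
    name: str

    def apply(self, classes):
        if self.name not in classes:
            raise ValueError(f"class {self.name!r} does not exist")
        return classes - {self.name}


def transform(classes, operators):
    """Apply a sequence of elementary operators and return the new version."""
    result = frozenset(classes)
    for op in operators:
        result = op.apply(result)
    return result


old_version = {"Person"}
new_version = transform(old_version, [AddClass("Organisation"), DropClass("Person")])
```

The old version is left untouched, so both versions remain available for storage in the version space.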

2.5.2 More complex transformations

Next to the hard requirement for a finite, sound and complete set of atomic transformations (Requirement 2.5.1), we also require support for more complex changes. Complex changes allow the user to express his/her intent in a more meaningful (and high-level) manner.

Complex changes are built from a sequence of atomic changes, although the same sequence can have a potentially different meaning; this is a very important distinction. Indicating a complex change defines exactly which atomic changes need to be made, but a sequence of atomic changes that matches those defined by a complex change does not necessarily imply that the complex change has been made. As an example, consider two classes A and B, having slots a and b respectively. The sequence of deleting a in A and subsequently adding a in B is different from moving slot a from class A to B [Lerner, 2000].
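The distinction can be made concrete with a small sketch. The class names DropSlot, AddSlot and MoveSlot are hypothetical and stand in for whatever operators the chosen ontology paradigm provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DropSlot:
    cls: str
    slot: str


@dataclass(frozen=True)
class AddSlot:
    cls: str
    slot: str


@dataclass(frozen=True)
class MoveSlot:
    """Hypothetical complex change: move `slot` from class `source` to class `target`."""
    source: str
    target: str
    slot: str

    def expand(self):
        # The complex change fully determines the atomic changes to be made.
        return [DropSlot(self.source, self.slot), AddSlot(self.target, self.slot)]


move = MoveSlot("A", "B", "a")
atomic_log = [DropSlot("A", "a"), AddSlot("B", "a")]
same_atomic_changes = move.expand() == atomic_log
intent_recorded = any(isinstance(change, MoveSlot) for change in atomic_log)
```

Expanding the move yields exactly the atomic sequence, but the atomic log on its own carries no record that a move was intended, which is why the complex change has to be stored as such.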

The idea of complex change operators has rarely been studied in data schema evolution [Lerner, 2000], but has inspired related work in ontology evolution [todo - Stojanovic, Klein].

At a more advanced level, a complex evolution process can be defined as transformations between subconstructs of the ontology. If the ontology is representable by a graph, a custom change could be the morphing of one subgraph into another. In general, a library of custom transformations can be kept. This idea is adopted from software evolution [Mens, 1999].

2.6 Version Validation Control

After the engineer has specified his transformation specification, he can consider it as a logical unit of work and store it in the version repository server. This transaction can have one of two outcomes. If it completes successfully, the transaction is said to have committed. On the other hand if it was not successful, the transaction is aborted. In the latter case the version repository is rolled back to its previous consistent state, and the user is notified. We refer to the ACID properties [todo - Haerder and Reuter, 1983] from the DB community here.

First of all, the system should check whether the version is revisable by reading its permissions (cf. authorization, Requirement 2.2.2). Then, and only then, the system can and must check whether the transaction preserves the logical integrity of the version space before committing it to the server. The logical integrity consists of the following two strong criteria:

And two less important ones:

Optionally, the system could ask:

And decide whether to allow also updates of old versions of the ontology or not [Kim and Chou, 1988].

Any failure caused by a transaction in order to preserve/achieve logical integrity results in a roll-back of that transaction.

A last requirement is that each update of the version space should be persistent.
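The commit-or-abort behaviour can be sketched as follows. This is a simplified illustration, not the tool's transaction mechanism: the repository is an in-memory dictionary, the integrity check is passed in as a function because the concrete criteria are not fixed here, and the names commit and TransactionAborted are hypothetical. Authorization checks and persistent storage are left out.

```python
import copy


class TransactionAborted(Exception):
    """Raised to notify the caller that the repository was left unchanged."""


def commit(repository, transformation, is_consistent):
    """Apply `transformation` to a copy and keep the result only if it is consistent."""
    candidate = copy.deepcopy(repository)
    transformation(candidate)  # may raise; `repository` is untouched either way
    if not is_consistent(candidate):
        # Dropping the candidate plays the role of the roll-back.
        raise TransactionAborted("integrity check failed; repository unchanged")
    repository.clear()
    repository.update(candidate)


def no_empty_versions(repo):
    # Stand-in integrity criterion, assumed for this example only.
    return all(repo.values())


repo = {"v1": {"Person"}}
commit(repo, lambda r: r.update(v2={"Person", "Organisation"}), no_empty_versions)

try:
    commit(repo, lambda r: r.update(v3=set()), no_empty_versions)
    aborted = False
except TransactionAborted:
    aborted = True
```

Working on a copy means that a failed transaction never leaves the repository in an intermediate state; a real implementation would rely on the transaction support of the storage layer instead.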

In data schema evolution several principles were defined to keep the schema consistent after each change of its definition. These solution principles are referred to as semantics of change.

A possibility is to define invariant properties intrinsic to the model to ensure semantic and structural integrity; and then to define rules or primitives for effecting the changes, by preserving these invariants [todo - Banerjee et al., 1987]. Mostly there are multiple alternative ways to preserve the invariants; the transformation rules are then responsible for choosing the most meaningful way.

Another possibility is to carefully restrict the set of mutators. In the relational model, only atomic change operators that preserve consistency are allowed. More complex well-ordered sequences of such atomic change operators are allowed if there are constraints on their application order. [Roddick et al., 1993] requires the atomic change operators to be expressible in relational algebra.

2.7 Propagation of Changes

Once the ontology has been revised, its interpretation mapping with some committing applications might get broken, resulting in wrong interpretations of instance data and semantics. The responsible person for the committing application has several alternatives:

A change in an ontology is always a decision that has been agreed on by a community. Furthermore, the person responsible for each application has to decide whether she or he will follow the trend and change along. However, if the backwards/forwards compatibility requirement (see 2.6) is fulfilled, there is no need to revise the application model or interpretation mapping.

Dropping a concept: Suppose the decision is taken to drop the concept PERSON from an ontology. This means that the majority of the community members are no longer interested in interpreting the concept. Members that do not agree keep their commitment to the old version, and the new version is backwards compatible with the old one.

Adding a concept: A new concept is introduced in the ontology, resulting in a new version. Old application models will still be compatible with the new version.

Updating a concept: In some paradigms this corresponds to adding or dropping properties of a concept; in others it means dropping or adding relationships between concepts. Potential (consistency) problems must be anticipated here.

Changing constraints and rules: The support needed here depends on the expressiveness of the language. Here too, consistency problems might arise and must be anticipated.
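The first two cases can be illustrated with a simple check. The sketch assumes, for illustration only, that an application commitment is just the set of concept names the application interprets, and that an ontology version is the set of concept names it defines; the function name broken_commitments is hypothetical.

```python
def broken_commitments(commitment, version):
    """Concepts an application commits to that `version` does not define."""
    return set(commitment) - set(version)


old_version = {"PERSON", "ADDRESS"}
after_drop = {"ADDRESS"}                        # PERSON was dropped
after_add = {"PERSON", "ADDRESS", "COMPANY"}    # COMPANY was added

commitment = {"PERSON"}
broken_by_drop = broken_commitments(commitment, after_drop)
broken_by_add = broken_commitments(commitment, after_add)
```

Under these assumptions, an application that interprets PERSON has to stay with the old version after the drop, while adding a concept leaves its commitment intact.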

2.8 A Feature for Analysing Differences between Versions

The user should be able to view differences between versions in an easy and intuitive manner. A good starting point here can be the work done in PROMPTDiff. The custom transformations mentioned in Requirement 2.5.2 could be used in the comparison.

This feature unlocks possibilities for new applications such as forking/joining parallel ontology versions. When an initial ontology is used and evolved several times by different independent user groups (forking), it could be interesting to investigate the differences between the two resulting versions, and merge them. Note that ontology merging is not within the scope of this WP.
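A naive structural comparison can be sketched as follows. This is not the PROMPTDiff algorithm, only a set-based illustration; a version is assumed to be a mapping from class names to sets of slot names, and the function name diff_versions is hypothetical.

```python
def diff_versions(old, new):
    """Report classes that were added, removed, or whose slots changed."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(c for c in old.keys() & new.keys() if old[c] != new[c])
    return {"added": added, "removed": removed, "changed": changed}


base = {"A": {"a"}, "B": {"b"}}
fork1 = {"A": set(), "B": {"a", "b"}}          # one group moved slot a to B
fork2 = {"A": {"a"}, "B": {"b"}, "C": set()}   # another group added class C

diff_base_fork1 = diff_versions(base, fork1)
diff_between_forks = diff_versions(fork1, fork2)
```

Such a comparison only reports what differs; recognising that the change in fork1 was a move of slot a would require the complex transformations discussed earlier.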

2.9 Impact Analysis

The impact of a given ontology evolution, both on the conceptual and on the instance level, should be calculated. Deployment of ontologies in applications typically comes down to committing [Guarino, 1998, Meersman, 2001] or mapping the application's information assets (such as data schemas) to the ontological model. Any change in the ontology might thus break the integrity of this commitment.

For example, if the ontology has undergone an evolution process, impact analysis means to detect that the semantics of the web services are broken, and that the existing service clients may now be getting answers that are interpreted wrongly. Requirement 2.7 tackles this problem.

Further, there is a need to calculate the impact of change cascades. If ontology A is included in another ontology B, then there may be consequences for B if A is revised.

Note that impact analysis only analyses the possible impact, and informs the engineer. It does not solve any of the problems it detected.
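Detecting which ontologies a change cascade may reach can be sketched as a traversal of the inclusion relation. The sketch assumes that inclusion is available as a mapping from each ontology to the ontologies it includes; the function name impacted_by is hypothetical. In line with the note above, it only reports the possibly affected ontologies and repairs nothing.

```python
from collections import deque


def impacted_by(revised, includes):
    """Return all ontologies that directly or transitively include `revised`.

    `includes` maps each ontology to the set of ontologies it includes.
    """
    # Invert the inclusion relation: ontology -> ontologies that include it.
    included_by = {}
    for parent, children in includes.items():
        for child in children:
            included_by.setdefault(child, set()).add(parent)

    impacted, queue = set(), deque([revised])
    while queue:
        for parent in included_by.get(queue.popleft(), ()):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    return impacted


includes = {"B": {"A"}, "C": {"B"}, "D": set()}
affected_by_a = impacted_by("A", includes)
affected_by_d = impacted_by("D", includes)
```

Here a revision of A possibly affects B, which includes A, and C, which includes B.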

2.10 Distributed Environment Support

Ontology construction and deployment is an extensive task which is typically tackled by multiple teams of knowledge engineers and/or domain experts. They work concurrently on the same or different parts of the ontology, and regular synchronisations are needed, resulting in frequently generated stable versions.

3 Interface Requirements

This section specifies the essential client-side requirements. Much of it is based on the ideas in [De Leenheer, 2004] and [todo - Stojanovic, 2002].

3.1 Version Browser

The tool needs a graphical version space browser (according to Requirement 2.1) that provides a convenient view on all first-class citizens (ontologies and their inter-relationships) stored in the server. Further, it needs a zoom feature on all first-class citizens, enabling:

3.2 Transformation Editor

The engineer needs an editor in which he can evolve ontologies in two ways:

  1. either he knows precisely how to change the ontology: he can define the transformation syntactically;
  2. or he does not know exactly how to change the ontology, but has a clear idea of what the resulting ontology should look like: based on the intended result, the tool can try to generate the transformation for him.

3.3 Notification Agent

The tool needs an agent to manage and broadcast messages and notifications to the users. For example, when an engineer has committed a transformation, the notification agent is responsible for notifying users of possible implications such as cascading changes and forced roll-backs, as described in Requirement 2.6.

4 Implementation Priority List

For the implementation we have distinguished three phases:

The Vs indicate in which phase which requirement is being initially tackled.

Req. ID. Versioning Requirement Version 1 Version 2 Version 3 Priority
V1 Interoperability/Compatibility V (affects all implementation)
V2 Genericity V (affects all implementation)
V3 Version Space Representation V HIGHEST
V4 Versioning Strategy V HIGHEST
V5 Version Identification V HIGHEST
V6 Additional Meta Information V HIGH
V7 GUI Version Browser V HIGHEST
V8 Formal Transformation Specification V HIGHEST
V9 GUI Transformation Editor V HIGH
V10 Propagation of Changes V HIGH
V11 Version Validation Control V MEDIUM
V12 Difference Analysis V HIGHEST
V13 Impact Analysis V HIGH
V14 GUI Notification Agent V HIGH
V15 Distributed Environment Support V LOW

5 Global Architecture

[Figure: global architecture]

6 Conclusion

todo

7 Acknowledgement

The work is funded by the European Commission under the projects DIP, Knowledge Web, Ontoweb, SEKT, SWWS, Esperonto and h-TechSight; by Science Foundation Ireland under the DERI-Lion project; and by the Vienna city government under the CoOperate programme.

The authors would like to thank all the members of the OMWG working group for their advice and input to this document.

Appendix A. References

[Banerjee and Kim, 1987] Banerjee, J. and Kim, W. (1987) Semantics and Implementation of Schema Evolution in Object-oriented Databases. ACM SIGMOD Conf., SIGMOD Record 16(3):311--322.

[Berliner, 1990] Berliner, B. (1990) CVS II: Parallelizing Software Development. In Proc. of the USENIX Winter 1990 Technical Conf. (Berkeley, CA), pp 341--352, USENIX Association.

[De Leenheer, 2004] De Leenheer, P. (2004) Revising and Managing Multiple Ontology Versions in a Possible Worlds Setting. In Proceedings of the OTM 2004 Workshops, LNCS, 2004. Springer Verlag.

[De Troyer, 1993] De Troyer, O. (1993) On Data Schema Transformation, PhD. thesis, University of Tilburg (K.U.B.), Tilburg, The Netherlands.

[Collins-Sussman et al., 2004] Collins-Sussman, B., Fitzpatrick, B.W., and Pilato, C.M. (2004) Version Control with Subversion, O'Reilly.

[Guarino, 1998] Guarino, N. (1998) Formal Ontology and Information Systems. In Proc. of the 1st Int'l Conf. on Formal Ontologies in Information Systems (FOIS98) (Trento, Italy), IOS Press, pp. 3-15.

[Haerder and Reuter, 1983] Haerder, T., and Reuter, A. (1983) Principles of transaction-oriented database recovery. ACM Computing Surveys 15 (1983), pp. 287-317.

[Halpin, 1989] Halpin, T. (1989) A logical analysis of information systems: static aspects of the data-oriented perspective. PhD thesis, University of Queensland, Brisbane, Australia.

[Kim and Chou, 1988] Kim, W. and Chou, H. (1988) Versions of Schema for Object-oriented Databases. In Proc. of the 14th Int'l Conf. on Very Large Data Bases (VLDB 1988) (Los Angeles, CA), Morgan-Kaufmann, pp. 148--159.

[Klein et al., 2002] Klein, M., Fensel, D., Kiryakov, A., and Ognyanov, D. (2002) Ontology Versioning and Change Detection on the Web. In Proc. of the 13th Int'l Conf. on Knowledge Engineering and Knowledge Management (EKAW02) (Siguenza, Spain), Springer-Verlag, pp. 197--212.

[Klein, 2004] Klein, M. (2004) Change Management for Distributed Ontologies. PhD thesis, Vrije Universiteit Amsterdam.

[Lerner, 2000] Lerner, B. (2000) A model for compound type changes encountered in schema evolution. ACM Transactions on Database Systems (TODS), 25(1):83--127, ACM Press.

[Meersman, 2001] Meersman, R. (2001) Ontologies and Databases: More than a Fleeting Resemblance. In Proc. of the Int'l Workshop on Open Enterprise Solutions: Systems, Experiences, and Organisations (OES-SEO2001) (Rome, Italy), Luiss Publications.

[Mens, 1999] Mens, K. (1999) A Formal Foundation for Object-Oriented Software Evolution, PhD. Thesis, Vrije Universiteit Brussel, Belgium.

[Roddick et al., 1993] Roddick, J., Craske, N., and Richards, T. (1993) A Taxonomy for Schema Versioning Based on the Relational and Entity Relationship Models, In Proc. the 12th Int'l Conf. on Conceptual Modeling / the Entity Relationship Approach (Dallas, TX), Springer-Verlag, pp. 143--154.

[Stojanovic et al., 2002(1)] Stojanovic, L., Maedche, A., Motik, B., and Stojanovic, N. (2002) User-driven Ontology Evolution Management. In Proc. of the 13th European Conf. on Knowledge Engineering and Knowledge Management (EKAW02) (Siguenza, Spain), Springer-Verlag, pp. 285--300.

[Stojanovic et al., 2002(2)] Stojanovic, L., Stojanovic, N., and Handschuh, S. (2002) Evolution of the Metadata in the Ontology-based Knowledge Management. In Proc. of the German Workshop on Experience Management (Berlin, Germany), GI, pp. 65--77.



$Date: 2004/10/22 16:12:55 $
