Ch4: LarKC knowledge representation language

Frank, 16 Sep 2008

4.1 Intro

There is tension between two design goals:

  1. maximise interoperability between LarKC plugins by fixing a single representation language
  2. maximise the variety of applications for which LarKC can be used by allowing a flexible representation language

LarKC can achieve the best of both worlds by doing both things at once:

  1. Define a mandatory representation language. Every plugin that calls itself a LarKC plugin must support this language.
  2. Define "language profiles", which allow representational extensions and restrictions w.r.t. the mandatory representation language.

4.2 Definition of "mandatory representation language"

See earlier chapters in this deliverable for requirements on the language from the use-cases.

The definition of the mandatory language must include the following:

Representation-model: The obvious choice of representation-model is the labelled graphs of RDF (URI-labelled graphs, including blank nodes).
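As a rough illustration of this representation-model, the sketch below stores a graph as a set of (subject, predicate, object) triples whose edge labels are URIs and whose nodes may be blank. The class names, the helper functions and the example.org namespace are assumptions made for this sketch, not part of any LarKC API; RDF literals are left out for brevity.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str  # only meaningful inside one graph


_ids = count()


def fresh_blank() -> BlankNode:
    """Create a blank node with a label not used before in this process."""
    return BlankNode(f"b{next(_ids)}")


def triple(s, p, o):
    """Build one labelled edge; the edge label (predicate) must be a URI."""
    if not isinstance(p, URI):
        raise TypeError("edge labels must be URIs")
    return (s, p, o)


EX = "http://example.org/"  # hypothetical namespace

# A two-edge graph: a document whose author is an unnamed (blank) node.
graph = set()
author = fresh_blank()
graph.add(triple(URI(EX + "doc1"), URI(EX + "author"), author))
graph.add(triple(author, URI(EX + "worksAt"), URI(EX + "org1")))


def objects(g, s, p):
    """All nodes reachable from s via an edge labelled p."""
    return {o for (s2, p2, o) in g if s2 == s and p2 == p}
```

For example, objects(graph, author, URI(EX + "worksAt")) follows the edge that leaves the blank node.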

Serialisation syntax: again an obvious (although painful) choice is W3C's RDF/XML encoding, but there are many others (N3, Turtle, the OWL2/XML serialisation, etc.). Figure 5.6 of the current deliverable provides a useful overview of these and other possible serialisations and their relations.

Semantics: in order to deal with large scale applications, this language must obviously be rather inexpressive, and have a semantics that is cheap to compute. An obvious choice here is just taking RDF+RDF(S), or something close.
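To give a feel for what "cheap to compute" can mean here, the following sketch forward-chains just two RDFS entailment rules (rdfs9 for class membership and rdfs11 for subclass transitivity) until nothing new is derived. It is a naive illustration, not a proposal for the LarKC reasoner; the prefixed names are plain strings and the ex: terms are made up.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply rdfs9 and rdfs11 repeatedly until a fixpoint is reached."""
    closed = set(triples)
    while True:
        new = set()
        for (s, p, o) in closed:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closed:
                # rdfs11: s subClassOf o, o subClassOf o2 => s subClassOf o2
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))
                # rdfs9: s2 type s, s subClassOf o => s2 type o
                if p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))
        if new <= closed:
            return closed
        closed |= new


facts = {
    ("ex:Gene", SUBCLASS, "ex:BioEntity"),
    ("ex:BioEntity", SUBCLASS, "ex:Thing"),
    ("ex:brca1", RDF_TYPE, "ex:Gene"),
}
closure = rdfs_closure(facts)
```

Each pass only compares pairs of triples, so the work per pass is quadratic in the size of the graph; a real implementation would use indexes rather than nested loops.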

4.3 Definition of "language profiles"

This approach is well tried and tested in Web-based knowledge representation languages, e.g. the "layering" of OWL1, and the "language profiles" in OWL2.

The profiles are used to annotate

Upward/downward compatibility:

The "language profiles" differ not only in the syntactic elements of the language (with their respective semantics), but also in the various styles of semantics, such as Open World Assumption, Unique Name Assumption, and varieties of these (e.g. limited Open World Assumption), etc.

We must think of some way in which these language profiles are stated. The obvious solution is some kind of "hierarchy of languages" ordered by both syntactic and semantic embedding. This would require a "meta-vocabulary" of (at least) names of different syntactic varieties and different semantic assumptions, plus a way of stating particular combinations of these to make up a particular profile.

Of course this meta-vocabulary (defining the set of language profiles) should be open, in the sense that other people should be able to define their own language profiles and embed them in the hierarchy of already existing profiles.
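One possible shape for such an open meta-vocabulary is sketched below: a profile is a named combination of syntactic varieties and semantic assumptions, and anyone can register a new one. Everything here is hypothetical, including the feature names and the profile names. The embedding test is a deliberately conservative simplification (a subset of syntactic constructs under identical semantic assumptions); a real definition of semantic embedding would need more care.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    syntax: frozenset      # names of syntactic varieties (hypothetical)
    semantics: frozenset   # names of semantic assumptions (hypothetical)

    def embeds_in(self, other: "Profile") -> bool:
        """Assumed ordering: fewer constructs, same semantic assumptions."""
        return self.syntax <= other.syntax and self.semantics == other.semantics


REGISTRY = {}


def register(profile: Profile) -> Profile:
    """Open registry: third parties add profiles under a fresh name."""
    if profile.name in REGISTRY:
        raise ValueError(f"profile {profile.name!r} already defined")
    REGISTRY[profile.name] = profile
    return profile


def extensions_of(profile: Profile):
    """Names of registered profiles that the given profile embeds into."""
    return sorted(
        p.name for p in REGISTRY.values()
        if p != profile and profile.embeds_in(p)
    )


mandatory = register(Profile(
    "mandatory",
    frozenset({"rdf-graph", "rdfs-vocabulary"}),
    frozenset({"open-world"}),
))
richer = register(Profile(
    "example-extension",
    frozenset({"rdf-graph", "rdfs-vocabulary", "transitive-property"}),
    frozenset({"open-world"}),
))
una = register(Profile(
    "example-unique-names",
    frozenset({"rdf-graph", "rdfs-vocabulary"}),
    frozenset({"open-world", "unique-names"}),
))
```

With these three registrations, extensions_of(mandatory) lists only "example-extension": the unique-names variant changes the semantic assumptions, so under the assumed ordering it is not treated as an embedding.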

The "LarKC KR Ontology" from section 5 is a very ambitious approach to defining language profiles, namely by identifying the fine-grained building blocks of languages (the different "modelling elements") and then (presumably) defining profiles by choosing different combinations of these. A more coarse-grained approach would be the one taken in the OWL2 WG, namely simply defining various named profiles, and not allowing arbitrary combinations of the modelling elements outside these profiles. In the OWL2 approach, the meta-vocabulary simply consists of atomic tags for different profiles, organised in an inclusion hierarchy, but the actual syntactic and semantic definitions of the profiles are "informal" (not machine accessible/processable). The profiles are simply the atomic terms in an ontology of language profiles.
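The coarse-grained alternative needs very little machinery, as the sketch below suggests: profiles are opaque tags and the only machine-readable information is which tag is included in which. The tag strings are illustrative labels rather than official identifiers, and the single-parent table is a simplifying assumption; it mirrors the way the EL, QL and RL profiles are sublanguages of OWL 2 DL, which in turn is contained in OWL 2 Full.

```python
# Each tag maps to the next larger language, or None at the top.
PARENT = {
    "OWL2-EL": "OWL2-DL",
    "OWL2-QL": "OWL2-DL",
    "OWL2-RL": "OWL2-DL",
    "OWL2-DL": "OWL2-Full",
    "OWL2-Full": None,
}


def included_in(sub: str, sup: str) -> bool:
    """True if profile `sub` equals `sup` or lies below it in the hierarchy."""
    current = sub
    while current is not None:
        if current == sup:
            return True
        current = PARENT[current]
    return False


def common_upper_profile(a: str, b: str):
    """Smallest profile in the table that includes both a and b, if any."""
    current = a
    while current is not None:
        if included_in(b, current):
            return current
        current = PARENT[current]
    return None
```

A check such as included_in("OWL2-EL", "OWL2-QL") fails because neither tag lies above the other, while common_upper_profile("OWL2-EL", "OWL2-QL") walks up to "OWL2-DL". Nothing in the table says what constructs a profile contains, which is exactly the "informal" aspect noted above.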

I'm currently unsure whether the "LarKC system ontology" in section 5 will be helpful in defining either the mandatory language or the profiles. It would seem more useful as a representation-model underlying the API design reported in D1.2.1 (initial operational framework).

LarkcProject/WP1/Chapter4Outline (last edited 2008-09-17 10:13:13 by FrankVanHarmelen)