Ch4: LarKC knowledge representation language
Frank, 16 Sep 2008
4.1 Intro
There is tension between two design goals:
- maximise interoperability between LarKC plugins by fixing a single representation language
- maximise the variety of applications for which LarKC can be used by enabling a flexible representation language
LarKC can achieve the best of both worlds by doing both things at once:
- define a mandatory representation language. This language must be supported by every plugin that calls itself a LarKC plugin.
- define "language profiles" which allow different representational extensions and restrictions w.r.t. the mandatory representation language.
4.2 Definition of "mandatory representation language"
See earlier chapters in this deliverable for requirements on the language from the use-cases.
The definition of the mandatory language must include:
- a representation-model (e.g. term/formula-trees in FOL, labelled graphs in RDF(S)/OWL, S-expressions in KIF, etc.)
- a serialisation syntax for transmitting expressions
- a semantics
Representation-model: The obvious choice for the representation-model is the labelled graphs from RDF (URI-labelled graphs including blank-nodes).
Serialisation syntax: again an obvious (although painful) choice is W3C's RDF/XML encoding, but there are many others (N3, Turtle, the OWL2/XML serialisation, etc.). Figure 5.6 of the current deliverable provides a useful overview of these and other possible serialisations and their relations.
Semantics: in order to deal with large scale applications, this language must obviously be rather inexpressive, and have a semantics that is cheap to compute. An obvious choice here is just taking RDF+RDF(S), or something close.
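To make the three ingredients concrete, here is a minimal Python sketch, assuming the RDF + RDF(S) choice suggested above. Triples stand in for the labelled graph, and only two of the RDFS entailment rules (rdfs9 and rdfs11 from the W3C RDF Semantics document) are forward-chained. All "ex:" names are hypothetical and URIs are abbreviated as prefixed strings; this illustrates how cheap the semantics can be, and is not a proposed LarKC implementation.

```python
# Hypothetical illustration: URIs are abbreviated as prefixed strings.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Forward-chain two RDFS entailment rules until a fixpoint is reached."""
    closed = set(triples)
    while True:
        new = set()
        sub = [(s, o) for (s, p, o) in closed if p == RDFS_SUBCLASS]
        for (a, b) in sub:
            # rdfs11: rdfs:subClassOf is transitive
            for (c, d) in sub:
                if b == c:
                    new.add((a, RDFS_SUBCLASS, d))
            # rdfs9: an instance of a subclass is an instance of the superclass
            for (s, p, o) in closed:
                if p == RDF_TYPE and o == a:
                    new.add((s, RDF_TYPE, b))
        if new <= closed:
            return closed
        closed |= new


# A small labelled graph: a set of (subject, predicate, object) triples.
graph = {
    ("ex:Gene", RDFS_SUBCLASS, "ex:BiologicalEntity"),
    ("ex:BiologicalEntity", RDFS_SUBCLASS, "ex:Entity"),
    ("_:b1", RDF_TYPE, "ex:Gene"),  # blank node as subject
}
inferred = rdfs_closure(graph)
```

The closure adds three triples to the original three, including ("_:b1", "rdf:type", "ex:Entity").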
4.3 Definition of "language profiles"
This approach is well tried and tested in Web-based knowledge representation languages, e.g. the "layering" of OWL1, and the "language profiles" in OWL2.
The profiles are used to annotate
- datasets (to state in which language they have been expressed)
- plugins (to state which language they can process)
- requests (to state under which semantics they expect to be processed)
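Since the mandatory language is likely to be RDF-based, one option is to state these annotations as triples themselves. The Python sketch below assumes this; the property names (larkc:expressedIn, larkc:canProcess, larkc:expectsSemantics) and all "ex:" resources are hypothetical and only illustrate the three kinds of annotation listed above.

```python
# All "larkc:" and "ex:" names below are hypothetical, for illustration only.
annotations = {
    ("ex:dataset1", "larkc:expressedIn", "ex:ProfileRDFS"),       # dataset
    ("ex:pluginA", "larkc:canProcess", "ex:ProfileRDFS"),         # plugin
    ("ex:request7", "larkc:expectsSemantics", "ex:ProfileRDFS"),  # request
}


def profiles_of(resource, prop, triples):
    """Return the set of profiles attached to a resource via the given property."""
    return {o for (s, p, o) in triples if s == resource and p == prop}


dataset_profiles = profiles_of("ex:dataset1", "larkc:expressedIn", annotations)
plugin_profiles = profiles_of("ex:pluginA", "larkc:canProcess", annotations)
```

Keeping the annotations in the same representation-model as the data means that any plugin supporting the mandatory language can also read them.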
Upward/downward compatibility:
- plugins should be able to process weaker language profiles than their own because weaker profiles are both syntactic and semantic subsets.
- plugins should be able to partially process stronger language profiles, by simply ignoring any specific semantics of language elements in more expressive profiles.
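The two compatibility rules above can be sketched in Python as a lookup in an inclusion hierarchy of profiles. The profile names and the EXTENDS table are hypothetical. The "undetermined" outcome for incomparable profiles is an assumption of this sketch, since the text only covers weaker and stronger profiles.

```python
# Hypothetical profile names; each profile maps to the profiles it directly extends.
EXTENDS = {
    "RDF": set(),
    "RDFS": {"RDF"},
    "OWL-like": {"RDFS"},
    "Rules-like": {"RDFS"},
}


def includes(stronger, weaker):
    """True if `weaker` is `stronger` itself or is reachable through EXTENDS."""
    if stronger == weaker:
        return True
    return any(includes(parent, weaker) for parent in EXTENDS[stronger])


def processing_mode(plugin_profile, data_profile):
    """Decide how a plugin can treat data expressed in a given profile."""
    if includes(plugin_profile, data_profile):
        return "full"          # weaker or equal profile: a syntactic and semantic subset
    if includes(data_profile, plugin_profile):
        return "partial"       # stronger profile: ignore the extra semantics
    return "undetermined"      # incomparable profiles: not covered by the text (assumption)
```

For example, processing_mode("RDFS", "RDF") yields "full", while processing_mode("RDFS", "OWL-like") yields "partial".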
The "language profiles" differ not only in the syntactic elements of the language (with their respective semantics), but also in the various styles of semantics, such as Open World Assumption, Unique Name Assumption, and varieties of these (e.g. limited Open World Assumption), etc.
We must think of some way in which these language profiles are stated. The obvious solution is some kind of "hierarchy of languages" ordered by both syntactic and semantic embedding. This would require a "meta-vocabulary" of (at least) names of different syntactic varieties and different semantic assumptions, plus a way of stating particular combinations of these to make up a particular profile.
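As a rough Python illustration of such a meta-vocabulary, a profile could be stated as a combination of named syntactic constructs and named semantic assumptions. Every name below is hypothetical. The embedding test is a deliberate simplification: it requires a subset of constructs and identical assumptions, whereas a real semantic embedding would need a more careful definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    constructs: frozenset    # names of syntactic varieties / modelling elements
    assumptions: frozenset   # names of semantic assumptions (e.g. "OWA", "UNA")

    def embeds_in(self, other):
        """Simplified embedding: subset of constructs, identical assumptions."""
        return (self.constructs <= other.constructs
                and self.assumptions == other.assumptions)


# Hypothetical profiles built from hypothetical vocabulary names.
rdfs = Profile("rdfs-owa",
               frozenset({"triple", "subClassOf"}),
               frozenset({"OWA"}))
richer = Profile("richer-owa",
                 frozenset({"triple", "subClassOf", "inverseOf"}),
                 frozenset({"OWA"}))
with_una = Profile("rdfs-owa-una",
                   frozenset({"triple", "subClassOf"}),
                   frozenset({"OWA", "UNA"}))
```

Here rdfs embeds in richer (same assumptions, fewer constructs) but not in with_una, because the two differ in their semantic assumptions even though their syntax is identical.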
Of course this meta-vocabulary (defining the set of language profiles) should be open, in the sense that other people should be able to define their own language profiles, and embed them in the hierarchy of already existing profiles.
The "LarKC KR Ontology" from section 5 is a very ambitious approach to defining language profiles, namely by identifying the fine-grained building blocks of languages (the different "modelling elements") and then (presumably) defining profiles by choosing different combinations of these. A more coarse-grained approach would be the one taken in the OWL2 WG, namely simply defining various named profiles, and not allowing arbitrary combinations of the modelling elements outside these profiles. In the OWL2 approach, the meta-vocabulary simply consists of atomic tags for different profiles, organised in an inclusion hierarchy, but the actual syntactic and semantic definitions of the profiles are "informal" (not machine accessible/processable). The profiles are simply the atomic terms in an ontology of language profiles.
I'm currently unsure whether the "LarKC system ontology" that is in section 5 will be helpful in defining either the mandatory language or the profiles. It would seem more useful as a representation-model underlying the API design reported in D1.2.1 (initial operational framework).
