WP1: Conceptual Framework and Evaluation

D1.2.2: Improved Operational Framework

Telcos and Meetings

Objective

The WP1 parallel session was essentially a brainstorming session intended to clarify our task of defining the improved operational framework, which will lead up to the deliverable "D1.2.2 Improved operational framework", led by VUA.

In this deliverable we will attempt to map out the landscape for combining techniques from multiple disciplines to achieve web-scale reasoning. No easy task.

The platform and plug-ins we have delivered so far are too coarse-grained, as can be seen from the fact that we have mostly wrapped existing components. In future we should look at exposing more fine-grained elements of plug-ins for re-use, i.e. individual functions, algorithms, data structures, etc. For example, I could imagine a deductive reasoner replacing a sorted index with a cognitively inspired selection algorithm.

As I said in the session, the main goal of this deliverable should be to identify as many potential situations as possible where fine-grained elements can be re-used. These situations will be identified in close consultation with the technical work packages 2, 3 and 4.

Hopefully, the results of this 'survey' will uncover some real examples of beneficial re-use/integration. Perhaps more helpful still would be a matrix showing where closer integration of components might bring benefits, to inspire plug-in writers to explore these areas for themselves.

Methodology

The methodology is rather vague at this stage. We agreed that a good place to start would be a survey of all LarKC personnel actively involved in plug-in creation. This will include the leaders of WP2, WP3 and WP4, plus everyone project-wide who writes plug-ins or wraps existing components as plug-ins. We should ask them to list:

People will need to be imaginative and should not limit themselves to source code. Components for re-use could be anything from database structures and Java classes to third-party libraries and services.

We will then try to match the availability of certain sub-components with gaps in functionality already identified. We can also identify functional areas which appear to solve similar problems, thus suggesting where tighter integration might be beneficial.
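
As a rough illustration of this matching step, the sketch below assumes that each survey answer has been reduced to a set of functional-area tags. The component names, gap descriptions and tags are all invented for the example and are not actual survey results. It lists, for each gap, the offered sub-components that cover it, and reports the areas covered by more than one sub-component, which is one possible hint that tighter integration may be worth exploring.

```python
from collections import defaultdict

# Hypothetical survey answers: each offered sub-component is tagged with the
# functional areas it covers. All names and tags are invented for illustration.
offered = {
    "sorted triple index": {"indexing"},
    "random indexing model": {"selection", "query refinement"},
    "interest-based ranker": {"selection", "query refinement"},
    "rule dependency graph": {"rule selection"},
}

# Hypothetical functionality gaps, each described by the single area it needs.
gaps = {
    "reasoner needs faster data access": "indexing",
    "decider needs cheaper candidate filtering": "selection",
    "stream reasoner needs windowing": "stream processing",
}


def match_gaps(offered, gaps):
    """Map each gap to the sorted names of offered sub-components covering it."""
    return {
        gap: sorted(name for name, areas in offered.items() if area in areas)
        for gap, area in gaps.items()
    }


def overlapping_areas(offered):
    """Return the areas that are covered by more than one sub-component."""
    by_area = defaultdict(list)
    for name, areas in offered.items():
        for area in areas:
            by_area[area].append(name)
    return {area: sorted(names) for area, names in by_area.items() if len(names) > 1}


matches = match_gaps(offered, gaps)
overlaps = overlapping_areas(offered)
```

A gap with an empty match list (here the invented "stream processing" gap) would mark functionality that no surveyed sub-component currently offers.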

My personal feeling is that generalising these results into a more fine-grained API/framework might not be possible. Therefore, I would suggest proceeding on a case-by-case basis and reviewing the results at a later date to tease out any potential abstractions.

Approach

The specification of the operational framework could take the form of a series of guidelines supporting both the development of plug-ins and the development of applications that use plug-ins, i.e. guidelines for both plug-in and application developers. The description would include a specification of design patterns describing different plug-in interaction scenarios (see discussions in WP5 and the initial set of design patterns) and how they can be combined to maximize scalability (or another aspect of a given task).

Outline of the Deliverable

  1. Introduction

  2. Motivation of Improved Framework (STI, VUA)

    • Write up the motivation for an improved operational framework based on the weaknesses and missing aspects of the previous deliverable
    • Don't forget to mention the notion of workflow rather than pipeline and justify why we decided to change to workflow (not only sequential execution of plug-ins)
  3. Dealing with Scalability Through Tighter Integration of Components (ALL)

    1. Note: The purpose of this chapter is to describe how tighter (fine-grained) integration of components in LarKC can lead to scalability
    2. Integration of Selection and Query, as well as comparison with query refinement from the perspective of scalability (Yi Zeng)

    3. Tighter integration with the Data Layer (Onto?)

      • Note: Recall the session on "pushing computation to the data" in Munich.
      • See D5.5.2

  4. Specification and description of Identify/Selection Plug-in

    1. Overview of the functionality provided by the plug-in type (see Note [2])
    2. Reusable aspects of the plug-in (see Note [1])
      • Query refinement methods: based on user interests (Yi Zeng).

      • Query refinement methods: Random Indexing (USFD, Prior Knowledge selector, ask Jose and Danica).

      • ?GrowingDataSetSelecter (Barry, Mick) ?

      • WP7b's prior selector (Mark?)

  5. Specification and description of Transformation Plug-in

    1. Overview of the functionality provided by the plug-in type (see Note [2])
    2. Reusable aspects of the plug-in ?? (Yi Huang (SIEMENS)?) (see Note [1])

  6. Specification and description of Reasoner Plug-in

    1. Overview of the functionality provided by the plug-in type (see Note [2])
    2. Reusable aspects of the plug-in (see Note [1])
      • Rule-based reasoner (STI)

      • Approximate Reasoner (interleaving selection and reasoning and PION work. VUA)

      • Stream-based Reasoner (Emanuelle (CEFRIEL)?)

  7. Description of the guidelines/Specification of operational framework

    • 7.1. Design Patterns (HLRS, Georgina)

      • Expensive pre-computation (once) followed by cheaper computation many times
      • Expensive computation many times plus cheaper computation many times plus accessing external resources
      • Expensive computation continuous
      • Replication plus Data partitioning
      • Some expensive plus some cheaper, but no one of them dominating
  8. Conclusion
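
To illustrate the workflow-versus-pipeline point in item 2 of the outline, here is a minimal sketch of executing plug-in steps as a dependency graph rather than as a fixed sequence. The step names are borrowed from the plug-in types in this outline, but the functions, their signatures and the data are hypothetical and do not reflect the LarKC platform API. The sketch uses the `graphlib` module from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


# Hypothetical plug-in steps. Each takes a dict holding the results of the
# steps it depends on. None of this is the actual LarKC plug-in interface.
def identify(inputs):
    return ["doc1", "doc2", "doc3"]


def transform(inputs):
    return [d.upper() for d in inputs["identify"]]


def select(inputs):
    return [d for d in inputs["identify"] if d != "doc2"]


def reason(inputs):
    # Joins two branches, which a strictly sequential pipeline cannot express.
    selected = {d.upper() for d in inputs["select"]}
    return [d for d in inputs["transform"] if d in selected]


steps = {"identify": identify, "transform": transform, "select": select, "reason": reason}

# The workflow as a graph: each step maps to the steps whose output it needs.
depends_on = {
    "identify": set(),
    "transform": {"identify"},
    "select": {"identify"},
    "reason": {"transform", "select"},
}


def run_workflow(steps, depends_on):
    """Run every step after all of its dependencies; return order and results."""
    results = {}
    order = list(TopologicalSorter(depends_on).static_order())
    for name in order:
        inputs = {dep: results[dep] for dep in depends_on[name]}
        results[name] = steps[name](inputs)
    return order, results


order, results = run_workflow(steps, depends_on)
```

A pipeline is the special case in which every step depends only on the one before it. In this sketch `transform` and `select` do not depend on each other, so a scheduler would be free to run them in parallel.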

Important Notes

Note [1]: The purpose of this section (and of the corresponding sections in chapters 5 and 6) is to describe the algorithms, data structures, and (sub)tasks that make up a given plug-in and that can potentially be reused by other components (plug-ins) in LarKC. More specifically, the idea is to take a component, e.g. a selection plug-in, a reasoner plug-in, a transformation plug-in, a decider, the data layer component, etc., and break it apart in order to identify its subcomponents and describe the input/output and functionality of each subcomponent. By doing the same with all LarKC components (including, for example, the data layer component) we should be able to define a framework of multiple components that can be used for problem solving in LarKC. A further benefit is that we can identify reusable parts that other LarKC components can use.

For example, in Barry's example of a rule-based reasoner we could identify several such subcomponents: a selection mechanism for choosing which rules to apply, a selection mechanism for choosing the portion of the data the reasoner will reason over, a module for indexing the data and allowing efficient access, a module to process the query sent to the reasoner (e.g. to split the query into multiple sub-queries that can be solved in parallel inside the reasoner), etc. The same analysis can be done for any other component in LarKC.

The key is also to analyze these components independently of LarKC. Note, for example, that this way of breaking the rule-based reasoner apart has nothing to do with LarKC, i.e. any rule-based reasoner could be modeled in a similar way. If we do this for every other component in the LarKC architecture we should be able to spot the parts that can be reused by other components in LarKC. For instance, the selection method in the rule-based reasoner could be abstracted and implemented as a selection plug-in. Finally, we could also identify data structures that can be reused, for example indexes over triplesets.

We suggest ignoring the LarKC plug-in architecture for the moment and concentrating on components that you have implemented (or reused, or simply understand to some extent), thinking about each of their individual parts (algorithms, data structures, processes, etc.).

This is the kind of description that we want to have for each LarKC plug-in, ideally one that includes UML diagrams depicting the different parts of the plug-in. If we are able to identify as many such components/parts as possible for every LarKC plug-in, we should be able to formulate guidelines and design patterns to support plug-in and workflow writers (the guidelines being part of the framework).
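
The decomposition of the rule-based reasoner described in Note [1] can be sketched as four small, separately reusable functions plus one function that composes them. This is only an illustration under strong simplifying assumptions: the rule format (one predicate implies another), the comma-separated query syntax, and all function names and data are hypothetical, and none of it is taken from an existing LarKC reasoner.

```python
from collections import defaultdict
from typing import NamedTuple


class Rule(NamedTuple):
    """Toy rule format (an assumption): (s, body, o) implies (s, head, o)."""
    body: str
    head: str


def select_rules(rules, goal):
    """Rule selection: keep only the rules that can contribute to `goal`."""
    needed, selected = {goal}, []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.head in needed and rule not in selected:
                selected.append(rule)
                needed.add(rule.body)
                changed = True
    return selected, needed


def select_data(triples, predicates):
    """Data selection: keep only the triples whose predicate is relevant."""
    return [t for t in triples if t[1] in predicates]


def build_index(triples):
    """Indexing: group (subject, object) pairs by predicate for direct lookup."""
    index = defaultdict(set)
    for s, p, o in triples:
        index[p].add((s, o))
    return index


def split_query(query):
    """Query processing: split a multi-goal query into single-goal sub-queries."""
    return [goal.strip() for goal in query.split(",")]


def answer(triples, rules, goal):
    """Compose the four parts to answer one single-predicate sub-query."""
    rules, predicates = select_rules(rules, goal)
    index = build_index(select_data(triples, predicates))
    changed = True
    while changed:  # naive forward chaining until nothing new is derived
        changed = False
        for rule in rules:
            new = index[rule.body] - index[rule.head]
            if new:
                index[rule.head] |= new
                changed = True
    return sorted(index[goal])


RULES = [Rule("hasMother", "hasParent"), Rule("hasParent", "hasRelative"), Rule("likes", "knows")]
TRIPLES = [("ann", "hasMother", "bea"), ("bea", "hasParent", "cat"), ("ann", "likes", "dan")]

# Each sub-query is answered independently of the others.
answers = {goal: answer(TRIPLES, RULES, goal) for goal in split_query("hasRelative, knows")}
```

Because `select_rules`, `select_data`, `build_index` and `split_query` share no state, each could in principle be lifted out and offered to other components, and the independent sub-queries are the kind of work that could be solved in parallel.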

Note [2]: Describe any extensions/changes to the functionality of the plug-in since the last version of the platform as documented in D1.2.1

Timeline

Deadline: M24 (end of March)

Resources

The link to the SVN repository is the following: D1.2.2

LarkcProject/WP1/D1.2.2 (last edited 2010-02-25 10:37:12 by GastonTagni)