Feature Requests and Feedback from other LarKC WPs

WP2 Retrieval and Selection

(Feedback from Danica Damljanovic, USheff)

WP3 Abstraction and Learning

(Feedback from Florian Steinke, Siemens) A few comments and suggestions that I think would help improve/accelerate the start-up phase:

  1. Ant build: On Windows, one has to set the environment variable ANT_OPTS as follows before calling ant:
    • set ANT_OPTS=-Xmx256m
    • Otherwise, the compilation stops with a memory error.
  2. After downloading the code, it would be nice if a demo pipeline were pre-installed (e.g. the Urban pipeline). Then one could just run "run-larkc.bat" and do first tests. At the moment, the default version does not reference any decider plugin and "run-larkc.bat" fails. Such a pre-installation would be much easier than first having to copy some plugins somewhere. Moreover, at the moment it seems one has to look into the source code to know which plugins belong together, and one first has to figure out that only one decider plugin is allowed in the plugin directory (which is the reason not to copy all plugins to the plugin directory).
  3. The simple SPARQL client (+ source code) http://shodan.ijs.si/client.zip should also be distributed with the LarKC platform. It is necessary in order to see some results from the platform. Again, this would make start-up easier.

  4. Can one make one large Eclipse project/environment/workbench which includes everything (including all plugins) and where I do not have to set paths etc.? My idea is that I just have to open something, press the debug button, and start looking into the platform code to find out how things work.
  5. Creating a new plugin: Do I understand it correctly that I have to copy the respective template directory and manually change the class name within several text files and filenames? Any way to automate this?
  6. Documentation of the LarKC platform and how it works (including multithreading, the multiple-calls-of-identifier-plugin paradigm, how to get an RDF store, the ORDI data structure, etc.) is needed.
  7. LarKC includes the OWLIM Sesame RDF store, right? Can I use all the functions of Sesame? How do the Sesame functions/models/RDF stores translate to the LarKC API? Why is the LarKC API so restricted?
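
Regarding item 5, the copy-and-rename step could probably be scripted. The following is only a minimal sketch, assuming a template directory whose file names and file contents contain a placeholder class name; the class `PluginTemplateCopier` and the placeholder name are hypothetical and do not reflect the actual layout of the LarKC plug-in templates.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class PluginTemplateCopier {

    /**
     * Copies templateDir to targetDir, replacing oldName with newName in
     * every file or directory name and in the content of every file.
     * Assumes all template files are UTF-8 text (hypothetical template layout).
     */
    static void instantiate(Path templateDir, Path targetDir,
                            String oldName, String newName) throws IOException {
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(templateDir)) {
            // Files.walk lists a directory before its entries, so parent
            // directories are created before the files inside them.
            sources = walk.collect(Collectors.toList());
        }
        for (Path source : sources) {
            String relative = templateDir.relativize(source).toString()
                    .replace(oldName, newName);
            Path target = targetDir.resolve(relative);
            if (Files.isDirectory(source)) {
                Files.createDirectories(target);
            } else {
                String text = new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
                Files.write(target, text.replace(oldName, newName).getBytes(StandardCharsets.UTF_8));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: PluginTemplateCopier <templateDir> <targetDir> <oldName> <newName>");
            return;
        }
        instantiate(Paths.get(args[0]), Paths.get(args[1]), args[2], args[3]);
    }
}
```

Plain text replacement is crude (it would also rename unrelated occurrences of the placeholder), so a real tool would need a more distinctive placeholder or a proper project archetype.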

WP6 Urban Computing Use Case

  1. Caching at platform level configurable at plug-in level
    • Description:

    • Status:

      • we had some discussions with Ontotext regarding this topic at the Berlin meeting. Data generated by a workflow is kept in a so-called "named graph". The named graph is accessible via a URI. The Data Layer offers the necessary tools to manage it.
      • A first experimental prototype is currently being developed by Spyros (VUA), to be presented during the Munich meeting.
    • Next steps (responsible):

      • provide more info on methods exposed by the Data Layer and how to make use of them (Ontotext)
      • Experimental prototype being developed by Spyros after the Amsterdam WP5 Performance workshop
    • Deadline: tbd

  2. Split, join and (conditional) loops in the workflow
    • Description:

    • Status:

      • split / join connectors are currently known as "multiplexers". Their implementation as components of the platform is in progress by Barry/Florian (UIBK), following the Amsterdam meeting.
      • regarding loops, this should be done by the Decider at workflow runtime. AFAIK we still haven’t implemented any example of this. We will add it to our list of requested features.
    • Next steps (responsible):

      • First implementation of multiplexers (Cyc, Luka)
      • Integration / adaptation of the implemented multiplexers to the Distribution Model (Alexey)
    • Deadline: tbd
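
To make the request concrete, here is a minimal sketch of what split, join and a conditional loop mean at the workflow level. It uses plain Java functions as stand-ins for plug-ins; the names `splitAndJoin` and `loopWhile` are hypothetical and say nothing about how the platform multiplexers or the Decider are actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

final class SplitJoinSketch {

    /** Split: send the same input to every branch. Join: concatenate the branch outputs in order. */
    static <I, O> List<O> splitAndJoin(I input, List<Function<I, List<O>>> branches) {
        List<O> joined = new ArrayList<>();
        for (Function<I, List<O>> branch : branches) {
            joined.addAll(branch.apply(input));
        }
        return joined;
    }

    /** Conditional loop: re-apply a step while the condition holds, with a hard iteration limit. */
    static <T> T loopWhile(T initial, UnaryOperator<T> step, Predicate<T> keepGoing, int maxIterations) {
        T current = initial;
        for (int i = 0; i < maxIterations && keepGoing.test(current); i++) {
            current = step.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Two hypothetical branches standing in for plug-ins.
        Function<String, List<String>> events = city -> Arrays.asList("event:" + city);
        Function<String, List<String>> monuments = city -> Arrays.asList("monument:" + city);

        System.out.println(splitAndJoin("Milano", Arrays.asList(events, monuments)));
        // prints [event:Milano, monument:Milano]

        System.out.println(loopWhile(1, n -> n * 2, n -> n < 100, 10));
        // prints 128
    }
}
```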

  3. Possibility to subtype the type Query at plug-in level
    • Description: (Daniele's description and suggestions to solve it)

      • I will use the alpha Urban LarKC Event Workflow to explain the problem: the first two plug-ins of this workflow are a Query Transformer and an Identifier. The first converts the SparqlQuery into a CityQuery, a subclass of Query containing information about the city involved in the input query, while the second receives the CityQuery object as input and uses it to retrieve the events. In the new code structure we found two problems:

        • the class in the LarKC model schema: the class Query and its subclasses are described in an RDF model used by the platform. CityQuery was not described in the model... Luka and Alexey solved the problem temporarily by extending the RDF model in the platform, but this cannot be a general solution for the future.

        • the location of the CityQuery class: the Transformer and the Identifier are in two separate projects, so where should the CityQuery class be located? If I put its source code in both projects, it won't be recognized as the same class. I tried to import it as a library and it seems to work (if I remember correctly); then, after talking with Luka and Alexey, we put it in the platform project (the deadline for the ISWC tutorial was near and we took the quickest solution).

      • Summarizing this point, we'd like to have a way to define new kinds of Query (and, more generally, LarKC types passed from one plug-in to another), which requires:
        • a smart way to extend the RDF model of LarKC types, possibly without directly extending the RDF model of the platform; otherwise, if the number of subclasses (defined by each plug-in developer) grows, this model will become unnecessarily big. It would be great if the LarKC platform could load only those "model extensions" that define the types used by the workflows involved in the execution;
        • some indication of how to interact with the extended class: is using a library containing the custom types the right way to do it?
    • Status: to be discussed in the Munich meeting

    • Next steps (responsible):

    • Deadline: tbd
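
The library approach mentioned above can be sketched as follows. In Java, a class is identified by its name together with the class loader that defined it, which is presumably why two copies of the source in two projects are not treated as the same class. The types below are simplified, hypothetical stand-ins (not the platform's real Query or SparqlQuery classes); the assumption is that the custom type lives in one shared library that both plug-in projects depend on.

```java
import java.util.Arrays;
import java.util.List;

final class QuerySubtypeSketch {

    /** Stand-in for the platform's base query type (hypothetical shape). */
    interface Query {
        String text();
    }

    static class SparqlQuery implements Query {
        private final String text;
        SparqlQuery(String text) { this.text = text; }
        public String text() { return text; }
    }

    /** Custom subtype; assumed to be packaged once in a shared library. */
    static final class CityQuery extends SparqlQuery {
        private final String city;
        CityQuery(String text, String city) { super(text); this.city = city; }
        String city() { return city; }
    }

    /** Query Transformer stand-in: wraps the SPARQL query together with the city. */
    static CityQuery transform(SparqlQuery in, String city) {
        return new CityQuery(in.text(), city);
    }

    /** Identifier stand-in: accepts the base type and requires the subtype at runtime. */
    static List<String> identify(Query query) {
        if (query instanceof CityQuery) {
            return Arrays.asList("events-in-" + ((CityQuery) query).city());
        }
        throw new IllegalArgumentException(
                "Expected a CityQuery but got " + query.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        SparqlQuery input = new SparqlQuery("SELECT ?e WHERE { ?e a <http://example.org/Event> }");
        System.out.println(identify(transform(input, "Milano")));
        // prints [events-in-Milano]
    }
}
```

This only covers the Java side of the question; how the corresponding RDF description of the new type should be loaded by the platform remains open.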

  4. Possibility to send different queries to different components in the Workflow
    • Description: (Daniele's description)

      • if we use the Pipeline class we can set only one input query for all the plug-ins involved in the workflow; in order to send different queries to different plug-ins we avoided the use of this class. We'd like to know if the Pipeline class will be extended to support this feature or if we should continue to write deciders without using that class.
    • Status: To be discussed in the Munich meeting

    • Next steps (responsible): tbd

    • Deadline: tbd
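
One way to picture the requested behaviour is a pipeline that takes a default query plus optional per-plug-in overrides. This is only a sketch of the idea: `Step`, `runPipeline` and the override map are hypothetical names and are not part of the existing Pipeline class.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class PerPluginQuerySketch {

    /** Hypothetical stand-in for a plug-in invocation; not the LarKC plug-in API. */
    interface Step {
        String name();
        String run(String query, String upstream);
    }

    /** Creates a dummy step that records which query it was given. */
    static Step step(final String name) {
        return new Step() {
            public String name() { return name; }
            public String run(String query, String upstream) {
                String tag = name + "[" + query + "]";
                return upstream.isEmpty() ? tag : upstream + " > " + tag;
            }
        };
    }

    /** Runs the steps in order; each step gets its override if present, else the default query. */
    static String runPipeline(List<Step> steps, String defaultQuery, Map<String, String> overrides) {
        String data = "";
        for (Step s : steps) {
            String query = overrides.getOrDefault(s.name(), defaultQuery);
            data = s.run(query, data);
        }
        return data;
    }

    public static void main(String[] args) {
        List<Step> steps = Arrays.asList(step("identifier"), step("selecter"));
        Map<String, String> overrides = Collections.singletonMap("selecter", "Q1");
        System.out.println(runPipeline(steps, "Q0", overrides));
        // prints identifier[Q0] > selecter[Q1]
    }
}
```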

  5. Concurrent write and read access to the data layer by independent JVMs
    • Description: (Emanuele's description)

      • I may be wrong, but the method that makes the Data Layer transactional only works if the threads accessing the data layer are all in the same JVM. If you have two programs running in two JVMs (e.g., when you distribute plug-ins), the Data Layer stops working. I can reproduce the error if needed.
    • Status:

      • AFAIK this was an issue with Java threads that was addressed by Luka and Daniele, right? I don’t know the current status (how/if this was solved). @Emanuele: Can you please inform us about it?
      • Check with Vassil!
    • Next steps (responsible):

      • Check with Daniele and Vassil during Munich meeting
    • Deadline: tbd
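
For background on why in-JVM synchronization does not help here: Java monitors and locks are only visible inside one JVM, so two processes need some external coordination point. The sketch below shows one generic option, an operating-system file lock via `java.nio`, purely as an illustration. It is not how the Data Layer works, and the lock file and the helper name `withExclusiveLock` are assumptions.

```java
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Callable;

final class CrossJvmLockSketch {

    // File locks are held on behalf of the whole JVM, so threads inside one
    // JVM still need an ordinary monitor to take turns.
    private static final Object IN_JVM_MONITOR = new Object();

    /** Runs the action while holding an exclusive lock on lockFile (blocks until acquired). */
    static <T> T withExclusiveLock(Path lockFile, Callable<T> action) throws Exception {
        synchronized (IN_JVM_MONITOR) {
            try (FileChannel channel = FileChannel.open(lockFile,
                         StandardOpenOption.CREATE, StandardOpenOption.WRITE);
                 FileLock lock = channel.lock()) {
                return action.call();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical lock file; every cooperating JVM must use the same path.
        Path lockFile = Files.createTempFile("datalayer", ".lock");
        String result = withExclusiveLock(lockFile, () -> "write done");
        System.out.println(result);
    }
}
```

File locks are advisory on many platforms, so this only works if every process accessing the store goes through the same helper.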

  6. A way to configure plug-ins
    • Description: (Daniele's description)

      • for example, we'd like RemoteGraphLoaderIdentifier to be initialized with a URL from which to retrieve a remote RDF graph (storing its content in the Data Layer). I remember Luka telling me that you are developing a mechanism to do this... is there any news about it?

    • Status: To be discussed in the Munich meeting

    • Next steps (responsible): tbd

    • Deadline: tbd
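
As a starting point for the discussion, plug-in configuration could be as simple as passing named parameters at initialization time. The sketch below is an assumption about what such a mechanism might look like: the `ConfigurablePlugin` interface, the `initialise` method and the `graphUrl` parameter name are all hypothetical, and the class is only a stand-in for the real RemoteGraphLoaderIdentifier.

```java
import java.net.URI;
import java.util.Collections;
import java.util.Map;

final class PluginConfigSketch {

    /** Hypothetical configuration hook; not an existing LarKC interface. */
    interface ConfigurablePlugin {
        void initialise(Map<String, String> parameters);
    }

    /** Stand-in for the identifier that loads a remote RDF graph. */
    static final class RemoteGraphLoaderIdentifier implements ConfigurablePlugin {
        private URI graphUrl;

        public void initialise(Map<String, String> parameters) {
            String raw = parameters.get("graphUrl");
            if (raw == null) {
                throw new IllegalArgumentException("Missing required parameter: graphUrl");
            }
            // URI.create rejects syntactically invalid values with IllegalArgumentException.
            graphUrl = URI.create(raw);
        }

        URI graphUrl() {
            return graphUrl;
        }
    }

    public static void main(String[] args) {
        RemoteGraphLoaderIdentifier plugin = new RemoteGraphLoaderIdentifier();
        plugin.initialise(Collections.singletonMap("graphUrl", "http://example.org/graph.rdf"));
        System.out.println("Would load graph from " + plugin.graphUrl());
    }
}
```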

  7. Workflows running at batch time
    • Description: (Florian Steinke proposed a possible solution to improve the time performance of the Alpha Urban LarKC)

      • some workflows run at batch time, moving the data that the alpha Urban LarKC will need (monument and event information) from the Web to the LarKC data layer;
      • workflows invoked by users of the alpha Urban LarKC do not interact with external sources but query the data layer directly.
      • We know how to realize the second kind of workflow (the "reading" ones), but there are different ways to realize the first kind (the "writing" ones): they can be realized as normal plug-ins invoked periodically by an application, or with a "special" decider that works without input and executes the workflow periodically. We think that the second solution (a decider that can schedule the execution of workflows) is more general and can easily be reused in different scenarios. What do you think about it?
    • Status: this is related to the "warming-up" concept. To be discussed in the Munich meeting, during the joint WP1/WP2/WP4/WP5 session (anywhere else?)

    • Next steps (responsible): tbd

    • Deadline: tbd
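
The "scheduling decider" idea can be sketched with the standard `ScheduledExecutorService`. The workflow is represented here by a plain `Runnable`, and `runPeriodically` is a hypothetical helper; how a real decider would be started without an input query is exactly the open question above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class SchedulingDeciderSketch {

    /**
     * Starts executing the given "writing" workflow immediately and then
     * once per period. The caller shuts the returned scheduler down.
     * Note: if the workflow throws, scheduleAtFixedRate suppresses all
     * later executions, so a real decider should catch and log failures.
     */
    static ScheduledExecutorService runPeriodically(Runnable workflow, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(workflow, 0, period, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch threeRuns = new CountDownLatch(3);
        ScheduledExecutorService scheduler = runPeriodically(() -> {
            // Placeholder for "fetch monument and event data and store it in the data layer".
            System.out.println("refreshing data layer...");
            threeRuns.countDown();
        }, 100, TimeUnit.MILLISECONDS);

        threeRuns.await();        // wait for three executions
        scheduler.shutdownNow();  // then stop the periodic workflow
    }
}
```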

WP7a Early Clinical Drug Development

LarkcProject/WP5/FeatureRequestsFromOtherWPs (last edited 2011-05-31 14:28:39 by NorbertLanzanasto)