Quality of Service Aspects
The purpose of this page is to describe and specify an initial set of basic QoS parameters for each of the LarKC plugins. Its content comes out of the discussions held during the "QoS and Anytime Workshop" in Innsbruck (QosAndAnytimeWorkshop). Relevant documentation can be found at LarkcProject/WP5/docs/Bibliography/QoS.
Since LarKC is also about managing resources to get approximate and/or anytime behaviour, the plugin interfaces must capture Quality of Service aspects as well as functional behaviour. We have noticed in other projects (on web services) that there is very little agreement on a common vocabulary to describe QoS aspects. Here we give some examples of what these aspects could be for the various plugin types.
As a starting point we could use the following ideas outlined in ApiConsiderations:
DECIDE Plugin: This plugin receives QoS constraints from the user (time, memory, allowed user interactions, allowed network traffic), and in general any QoS constraint that is meaningful to the user formulating the query. These must then be translated into QoS constraints on the other plugins, in terms that are meaningful to plugin developers. An example of such a translation is a maximum response time dictated by the user, which is translated into a maximum number of triples to be retrieved by the RETRIEVE plugin, in order to limit the computation time required by the INFERENCE plugin.
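The translation described above can be sketched as follows. This is only an illustration, not part of the LarKC API: the names `UserQoS` and `max_triples_for`, the throughput figure, and the assumption that inference time grows linearly with the number of triples are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserQoS:
    """User-facing constraint (hypothetical name, not a LarKC type)."""
    max_response_time_s: float


def max_triples_for(user_qos: UserQoS,
                    inference_triples_per_s: float,
                    fixed_overhead_s: float = 0.0) -> int:
    """Translate a response-time limit into a triple limit for RETRIEVE.

    ASSUMPTION: inference time is linear in the number of triples and the
    throughput of the INFERENCE plugin is known in advance.
    """
    if inference_triples_per_s <= 0:
        raise ValueError("throughput must be positive")
    # Time left for inference once fixed costs (assumed) are subtracted.
    usable_time_s = max(0.0, user_qos.max_response_time_s - fixed_overhead_s)
    return int(usable_time_s * inference_triples_per_s)


# Example: 10 s budget, 2 s assumed overhead, 50,000 triples/s assumed throughput.
limit = max_triples_for(UserQoS(max_response_time_s=10.0),
                        inference_triples_per_s=50_000.0,
                        fixed_overhead_s=2.0)
print(limit)  # 400000
```

A real decider would probably need a measured performance profile rather than a single throughput number, since execution time is noted further down this page as hard to estimate.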
RETRIEVE Plugin: Examples of relevant QoS constraints could be:
- upper/lower bounds on the number of triples required;
- importance ranking on input identifiers (must-have, nice-to-have);
- required quality/trust values on retrieved resources.
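One possible way to carry these three constraints in a single object is sketched below. All names (`RetrieveConstraints`, `Importance`, `satisfied_by`) are hypothetical, and representing trust as a number between 0 and 1 is an assumption made only for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Set


class Importance(Enum):
    MUST_HAVE = "must-have"
    NICE_TO_HAVE = "nice-to-have"


@dataclass
class RetrieveConstraints:
    """Hypothetical container for RETRIEVE QoS constraints."""
    min_triples: int = 0
    max_triples: Optional[int] = None          # None = no upper bound
    importance: Dict[str, Importance] = field(default_factory=dict)
    min_trust: float = 0.0                     # ASSUMPTION: trust in [0, 1]

    def __post_init__(self) -> None:
        if self.min_triples < 0:
            raise ValueError("min_triples must not be negative")
        if self.max_triples is not None and self.max_triples < self.min_triples:
            raise ValueError("max_triples must not be below min_triples")
        if not 0.0 <= self.min_trust <= 1.0:
            raise ValueError("min_trust must lie in [0, 1]")

    def admits(self, trust: float) -> bool:
        """True if a resource with this trust value may be retrieved."""
        return trust >= self.min_trust

    def satisfied_by(self, triple_count: int, found: Set[str]) -> bool:
        """True if the count is within bounds and every must-have identifier was found."""
        if triple_count < self.min_triples:
            return False
        if self.max_triples is not None and triple_count > self.max_triples:
            return False
        must_have = {ident for ident, level in self.importance.items()
                     if level is Importance.MUST_HAVE}
        return must_have <= found


constraints = RetrieveConstraints(
    min_triples=10,
    max_triples=1000,
    importance={"ex:Alice": Importance.MUST_HAVE,
                "ex:Bob": Importance.NICE_TO_HAVE},
    min_trust=0.5,
)
print(constraints.satisfied_by(200, {"ex:Alice"}))  # True
print(constraints.satisfied_by(200, {"ex:Bob"}))    # False: must-have missing
```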
First attempt at terminology for QoS parameters
This is currently being brainstormed, so treat everything below as provisional.
Method:
- Capture all parameters together - will separate into functional/meta-data/QoS afterwards.
QoS Constraints (Preferences) from user's perspective
- Required starting time (batch) - not pipeline QoS, but config params
- Max time for final answer
- Max time for first answer
- Desired number of results
- Max number of results
- Attempt completeness (yes/no)
- Attempt soundness (yes/no)
- Type of dataset to identify (RDF, text, spreadsheet) - config params
- Want justifications
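As a rough sketch, the user-side preferences above could be collected in one record with a basic consistency check. The class name, field names and default values are assumptions; the starting time and dataset type are left out because the list marks them as configuration parameters rather than QoS.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UserPreferences:
    """Hypothetical record of the user-facing QoS preferences."""
    max_time_first_answer_s: Optional[float] = None
    max_time_final_answer_s: Optional[float] = None
    desired_results: Optional[int] = None
    max_results: Optional[int] = None
    attempt_completeness: bool = False   # default is an assumption
    attempt_soundness: bool = True       # default is an assumption
    want_justifications: bool = False    # default is an assumption

    def problems(self) -> List[str]:
        """Return human-readable inconsistencies; an empty list means none found."""
        found: List[str] = []
        if (self.max_time_first_answer_s is not None
                and self.max_time_final_answer_s is not None
                and self.max_time_first_answer_s > self.max_time_final_answer_s):
            found.append("first answer is allowed more time than the final answer")
        if (self.desired_results is not None
                and self.max_results is not None
                and self.desired_results > self.max_results):
            found.append("desired number of results exceeds the maximum")
        return found


ok = UserPreferences(max_time_first_answer_s=2.0, max_time_final_answer_s=30.0,
                     desired_results=10, max_results=100)
bad = UserPreferences(max_time_first_answer_s=60.0, max_time_final_answer_s=30.0,
                      desired_results=500, max_results=100)
print(ok.problems())        # []
print(len(bad.problems()))  # 2
```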
QoS from plug-in/decider interface perspective
- All plug-ins:
  - Cluster: Min/max number of nodes (min = must have, max = fastest operation); 1 = not parallelisable
  - Cluster: Estimated total execution time - meta-data (HARD, not a single value)
  - Cluster: Min memory requirements
  - Cluster: Max memory requirements
  - Deployment factor: Speed of connection
- Identify:
  - Provide performance data (graph?, function fit?), includes resources/sec - QoS
  - What kind of data-set: RDF document, text, spreadsheet - functional
  - Ranking
  - Number of indexed sources
- Transform (data):
  - From syntax + vocabulary(ies) - functional
  - To syntax + vocabulary(ies) - functional
  - Performance - triples/sec?
  - 1:1 mapping or something else (1:many, many:1, etc.)
  - Equality (or something else)
  - Label dependent (yes/no)
  - Scalability (based on size of ontology)
  - Text to triples: Language
  - Text to triples: Benchmarked accuracy
- Transform (query):
  - From query form - functional
  - To query form - functional
- Select:
  - Data-set type
  - Completeness - no, there is only one complete selection component
  - Selection procedure (provenance/cost-benefit/reasoning context/random/syntactic distance)
  - Representation??? (none/ etc etc)
- Reason:
  - Performance
  - Type of reasoning/expressivity: RDF, RDFS, OWL, RIF, SWRL, WSML
  - Load time
  - Query time
  - Completeness
  - Soundness
  - Scalability (good/bad) (abox/tbox) (linear, sub-linear, polynomial)
- Decider:
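To illustrate how a decider might use the cluster parameters listed under "All plug-ins", here is a minimal sketch. The names `ClusterRequirements` and `nodes_to_allocate` are hypothetical, as is the choice of megabytes as the memory unit; only the min/max semantics (min = must have, max = fastest operation, 1 = not parallelisable) come from the list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClusterRequirements:
    """Hypothetical per-plug-in cluster description."""
    min_nodes: int = 1                    # must have at least this many
    max_nodes: int = 1                    # fastest operation; 1 = not parallelisable
    min_memory_mb: int = 0                # unit is an assumption
    max_memory_mb: Optional[int] = None   # None = not specified

    def __post_init__(self) -> None:
        if self.min_nodes < 1:
            raise ValueError("min_nodes must be at least 1")
        if self.max_nodes < self.min_nodes:
            raise ValueError("max_nodes must not be below min_nodes")

    @property
    def parallelisable(self) -> bool:
        return self.max_nodes > 1


def nodes_to_allocate(req: ClusterRequirements,
                      available_nodes: int) -> Optional[int]:
    """Return how many nodes to give the plug-in, or None if it cannot run."""
    if available_nodes < req.min_nodes:
        return None
    # More nodes than max_nodes would bring no further speed-up.
    return min(available_nodes, req.max_nodes)


reasoner = ClusterRequirements(min_nodes=2, max_nodes=8)
print(nodes_to_allocate(reasoner, 5))   # 5
print(nodes_to_allocate(reasoner, 1))   # None
print(nodes_to_allocate(ClusterRequirements(), 16))  # 1
```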
Walkthrough of a simple query
Invent rules that apply the above parameters in context, e.g. 1. If the input data is big and the reasoner is not scalable then ...
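Such rules could be written as condition/action pairs evaluated against the current context. The sketch below is purely illustrative: `Context`, `Rule`, `fired_actions` and the threshold for "big" are invented for this example, and the action is kept as a placeholder because the consequent of the example rule has not been decided yet.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Context:
    """Hypothetical snapshot of the parameters a rule may inspect."""
    input_triples: int
    reasoner_scalable: bool


@dataclass(frozen=True)
class Rule:
    name: str
    applies: Callable[[Context], bool]
    action: str  # placeholder label; the consequent is still open


# ASSUMPTION: what counts as "big" is not defined in these notes.
BIG_INPUT_THRESHOLD = 1_000_000

RULES: List[Rule] = [
    Rule(
        name="big-input-unscalable-reasoner",
        applies=lambda c: (c.input_triples > BIG_INPUT_THRESHOLD
                           and not c.reasoner_scalable),
        action="TO BE DECIDED",
    ),
]


def fired_actions(ctx: Context, rules: List[Rule]) -> List[str]:
    """Return the actions of all rules whose condition holds, in rule order."""
    return [rule.action for rule in rules if rule.applies(ctx)]


print(fired_actions(Context(5_000_000, reasoner_scalable=False), RULES))
# ['TO BE DECIDED']
print(fired_actions(Context(5_000_000, reasoner_scalable=True), RULES))
# []
```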
Mapping user QoS constraints to plug-in/decider constraints
