Addressing 1st Review Comments
excerpt from the file LarKC215535M1M14review.pdf
- D6.1 Requirements summary and initial data repository
- Approved in full
- The deliverable presents a thorough requirements analysis and state of the art survey for the Urban Computing use case. Various material including papers and wiki contents are attached as appendices.
NO ACTION NEEDED
- D6.2 Templates of periodic report on data and performances
- Approved subject to the conditions listed under remarks
- The deliverable considers the factors that will be relevant for characterizing data sources and for evaluating the performance of (the urban computing) application(s). Careful evaluation will be critical for measuring the success or otherwise of the project. This report makes a useful start in establishing a suitable framework, but it will no doubt need to be refined as the project progresses. One problem is that very many measures are considered. It will be useful to fix on a smaller number of critical measurements that can be used to compare the performance of LarKC with that of other systems and technologies.
NO RESUBMISSION NEEDED
- I nonetheless suggest writing a few lines to explain and answer this remark (even if the main explanation will be in D6.6)
- D6.3 Urban Computing environment specification
- Approved in full
- This deliverable reports on the development of the Urban Computing prototype as well as requirements for the fully fledged application. The progress made on the prototype has been impressive, although no part is yet being played by large data sets. Addressing this issue will be crucial in the fully fledged application.
NO ACTION NEEDED
- D6.4 1st periodic report on data and performances
- Approved subject to the conditions listed under remarks
- At this stage of the project only information on data sources is provided. Very large scale data sources still seem to be lacking.
NO RESUBMISSION NEEDED
- I nonetheless suggest writing a few lines to explain and answer this remark (even if the main explanation will be in D6.6)
How to address the review comments (to be inserted in D6.6)
Performance evaluation issue
[Kono] Kono mail (WP6 ML 7/27/2009): WP6 partners raised the same question while the deliverable was being written. However, we concluded that identifying new measures later would cost more than identifying them in advance and selecting among them when measuring. The large number of measures can also be explained partly by the fact that LarKC uses many technologies, such as information retrieval, reasoning, natural language processing, and machine learning. We grouped the performance measures into four categories: Scale, Heterogeneity, Reusability, and Statistical Measurement. We will focus on measuring Scale first and then Heterogeneity, because Scale is closest to LarKC's purpose of “web-scale reasoning” and Heterogeneity is another main criterion for measuring reasoning power. Reusability and Statistical Measurement will have lower priority, although Reusability can measure how many functionalities plug-ins can provide and Statistical Measurement can measure the performance of LarKC's machine learning modules.
- [Daniele] In D6.6 we define three kinds of tests to evaluate LarKC and the Alpha Urban LarKC. In the next evaluation deliverables we will run these tests again (extending them if necessary).
Large data issue
[Florian] Traffic data: we have traffic information about Milano (D6.6/data section), with data on the order of 10^9/10^10. We want to use it with ML techniques, so an RDF format does not seem to be necessary, but we could try to convert it anyway if we want to perform other kinds of experiments. The conversion will raise a modeling issue: what is the best RDF model to represent this data? How should timestamps be managed? Could WP4 cope with this problem?
[Daniele] LOD: currently we interact with the LOD (dbpedia) via Sindice to retrieve general information about Milano. We could try to access the LOD with other approaches (for example WP7's LDSR) and compare them; moreover, we can prepare queries to retrieve a larger amount of data from the LOD.
[Daniele] OpenStreetMap and LinkedGeoData: a recent initiative, called LinkedGeoData, makes the data contained in OpenStreetMap available in RDF format. This is a good starting point for studying the extension of our work from Milano to a larger scale (state/world). It will let us work with a larger amount of data and will introduce scalability issues that we will have to cope with.
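As a starting point for the traffic-data modeling question raised above (which RDF model to use and how to manage timestamps), one possible shape is sketched here. It is only an illustration under stated assumptions: the namespace http://example.org/traffic/, the property names (observedBy, observedAt, vehicleCount) and the record fields are hypothetical and do not come from the Milano data set or from any LarKC deliverable. Each observation becomes its own node, and the timestamp is attached as an xsd:dateTime typed literal normalized to UTC.

```python
from datetime import datetime, timezone

# Hypothetical namespace and property names, used for illustration only.
BASE = "http://example.org/traffic/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def observation_to_ntriples(obs_id, sensor_id, timestamp, vehicle_count):
    """Return N-Triples lines describing one (assumed) traffic observation."""
    subject = f"<{BASE}observation/{obs_id}>"
    sensor = f"<{BASE}sensor/{sensor_id}>"
    # Normalize to UTC so that timestamps compare consistently across records.
    stamp = timestamp.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return [
        f"{subject} <{BASE}observedBy> {sensor} .",
        f'{subject} <{BASE}observedAt> "{stamp}"^^<{XSD}dateTime> .',
        f'{subject} <{BASE}vehicleCount> "{vehicle_count}"^^<{XSD}integer> .',
    ]


# Example record with made-up values.
lines = observation_to_ntriples(
    1, "s42", datetime(2009, 7, 27, 8, 30, tzinfo=timezone.utc), 17
)
print("\n".join(lines))
```

With this shape every observation produces three triples, so the triple count would be three times the number of observations converted; that factor is worth keeping in mind when judging whether a conversion is practical at the data sizes mentioned above.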
