Skype teleconference May 26th 15:00 CET
The Goal of the Telco
Within-plugin parallelisation for selection: priorities and plans. Current status: http://wiki.larkc.eu/LarkcProject/WP2/parallelisation
Participants
- Mihai Lupu, IRF
- Danica Damljanovic, USFD
- Jose Quesada, MPG
- Lael Schooler, MPG
- Yi Zeng, WICI
- Yan Wang, WICI
- Mark Greenwood, USFD
- Alex Chepsov, HLRS
- Matthias Assel, HLRS
- Ivan Peikov, Onto
Agenda and Minutes
- WP2 to make a plan on what the priority/expectations for the review are:
- we aim to show RI running on the whole LLD; this is the most important;
- HLRS to report issues/progress/plans on the Airhead work during the telco
- HLRS have an implementation of the parallelised search; the next step is porting the application to the cluster and then testing different parallelisation configurations (using small datasets); finally, once we have found a suitable configuration, we test scalability on the whole LLD; 'the more the better' for HLRS :);
- Timeline: by the end of June we will have the application running on the cluster
- LarKC platform, workflow running RI and interest-based selecter/reasoner: plan for the review: after we have seen RI running fast
- MPI to send data if we want semanticVectors considered
- data ready, to be sent to HLRS by tomorrow
- subsetting: still no performance progress although precision/recall is good
- demo market MPI: subsetting; reduce the time for every single query? have a list of queries for which it does make sense, and also for which it does not make sense, to apply subsetting
- ESA would be better candidate for parallelisation, so that we do not do the things twice; but this will be considered after the review
- Airhead or sv? Airhead should be our goal for the review; afterwards we can have a look at how the code differs from SemanticVectors: it should be doing the same thing, so it might be easy to parallelise
- HLRS to give opinion about WICI's code
Matthias: according to the documentation, there are two main parts where we can parallelise: 1. loading of RDF; 2. querying over the 10 subsets. For 1, we tried multithreading and performance improved by 30%; final improvement of 50% with the first 4 RDF. Matthias is looking into the code and will do measurements. For the review: if there is no parallelisation we will show refinement; there is an online demo. InterestBasedSelecter and InterestBasedReasoner are plugins on the platform but there is no workflow.
- ACTIONS: Matthias and Yi to discuss this offline, but it seems we have a concrete plan for before/after the review
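For illustration only, the second parallelisation point Matthias mentions (querying over the subsets) could be sketched roughly as below. This is not WICI's or HLRS's code: the names query_subset and query_all, the tuple representation of triples, and the worker count are all hypothetical, and a thread pool is just one possible way to fan the query out.

```python
from concurrent.futures import ThreadPoolExecutor


def query_subset(subset, term):
    """Hypothetical per-subset query: return the triples that mention `term`."""
    return [triple for triple in subset if term in triple]


def query_all(subsets, term, workers=4):
    """Run the query on every subset concurrently and merge the results.

    pool.map preserves input order, so hits come back in subset order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda s: query_subset(s, term), subsets)
        return [hit for hits in partial for hit in hits]


# Toy data standing in for the RDF subsets (assumption: triples as tuples).
subsets = [
    [("alice", "knows", "bob"), ("bob", "likes", "tea")],
    [("carol", "knows", "alice")],
    [("dave", "likes", "coffee")],
]
hits = query_all(subsets, "alice")
```

In Python, threads mainly help when the per-subset query waits on I/O; for CPU-bound work a process pool or a cluster job per subset would be the closer analogue of what is planned here.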
- The exchange program (HLRS and USFD)
- Alex to visit Sheffield; final Airhead library on cluster; how to create workflow and provide LarKC support; for how long? 2 weeks; a 1 or 2 day workshop after his visit; July is better, after the 19th; maybe workshop on the 26th?; Mark, Jose, Mihai OK; Jose: do we need to participate if we are not interested in the parallelisation work? This depends on the progress by July; Yi: visa needs 2 months in advance, so might be problematic;
- ACTIONS: Alex and Danica to talk about the timeline and propose the dates for the others
