Early Clinical Development use case plans and collaboration activities

This document gives an overview of the collaboration activities between the Early Clinical Development use case and the LarKC technology work packages. Linked Life Data (http://linkedlifedata.com) hosts a massive knowledge base composed of billions of facts generated from:

Further information is derived as:

All existing knowledge may be interpreted and rearranged in new, unexpected ways that were not foreseen during the initial information model design. LLD offers a fully compliant SPARQL endpoint, full-text search functionality and basic data visualization.
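As a rough illustration of how a client might address the SPARQL endpoint, the sketch below builds a GET request URL in the style of the SPARQL protocol, where the query text is passed in the `query` parameter. The endpoint path and the query itself are assumptions made for this example and are not taken from this page; no network request is sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: illustrative endpoint path, not confirmed by this page.
ENDPOINT = "http://linkedlifedata.com/sparql"

# Generic query that lists a handful of arbitrary triples.
QUERY = """SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10"""


def build_sparql_request(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_sparql_request(QUERY)
print(url)
```

The resulting URL could be fetched with any HTTP client; the result format depends on what the endpoint supports.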

Investigation of causal relationships

In the use case "Semantic Integration for Early Clinical Development", a main research interest is the investigation of causal relationships and whether it can generate good hypotheses for the complex questions scientists face. The search for causality between information types such as gene, disease, biological process and environmental factor can be used to give new directions in early clinical/translational medicine and make it possible to:

For example, a causal relationship is the link between (1) a gene and a disease, (2) a gene and a biological process (up-regulation, down-regulation), or (3) a biological process and an environmental factor. (TODO: Somebody with the expertise should revise the examples.)

The causal relationships could be derived from existing data sources or generated using new methods (do we have the expertise to develop and assess these methods?).

The ideas so far

TC to initiate this hypothesis generation/validation collaboration

A Doodle poll will be sent out to decide the TC day and time.

Background information with the rationale and the relation to our use case can be found here.

User Retained Interests Based Selection vs Query Refinement

Partners: WICI (leader), Ontotext (LLD support), AstraZeneca (end-user evaluation)
Status: Active development
Reference documents:
Medical Search Refinement
Involved WP: WP2, WP7a

A practical example presenting the problem and the expected solution: search-refinement-WICI

Current status: The program for interests-based SPARQL query rewriting has been finished (it will be reported at the Munich meeting). The interests of Medline authors are now being calculated. User Retained Interests Based Selection has been shown to be more efficient than query refinement on the DBLP dataset, and similar evidence is expected to be obtained on the Medline dataset before the Year 2 review.

Random Indexing of Medline/Pubmed

Partners: USFD (leader), Ontotext (implementation support and dataset), AstraZeneca?, other WP2 members?
Status: under specification/investigation
Reference documents: yet to be done
Involved WP: WP2, WP7a

The LLD KB contains semantic annotations in RDF format:

http://www.linkedlifedata.com/explore?resource=pubmed-article%3A18703625&limit=100

The predicates lifeskim:mentions (high recall) and lifeskim:mentionsStrict (high precision) link the PubMed document to semantic annotations drawn from UMLS.
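Since this collaboration is still under specification, the following is only a minimal sketch of random indexing over document-to-concept annotations, assuming the annotations have already been exported as a mapping from document identifier to mentioned concepts. Each document gets a sparse random index vector with a few +1/-1 entries, and each concept vector is the sum of the index vectors of the documents that mention it. The identifiers, the dimensionality and the helper names are hypothetical and chosen only for the example.

```python
import random
from collections import defaultdict

DIM = 64      # dimensionality of the reduced space (illustrative value)
NONZERO = 4   # number of +1/-1 entries per index vector (illustrative value)


def index_vector(key: str, dim: int = DIM, nonzero: int = NONZERO) -> dict:
    """Sparse ternary index vector as {position: sign}, reproducible per key."""
    rng = random.Random(key)  # seeding with the key makes the vector repeatable
    positions = rng.sample(range(dim), nonzero)
    return {pos: rng.choice((-1, 1)) for pos in positions}


def build_concept_vectors(mentions: dict) -> dict:
    """Sum, for every concept, the index vectors of the documents mentioning it."""
    vectors = defaultdict(lambda: [0] * DIM)
    for doc_id, concepts in mentions.items():
        doc_index = index_vector(doc_id)
        for concept in concepts:
            for pos, sign in doc_index.items():
                vectors[concept][pos] += sign
    return dict(vectors)


def cosine(a, b) -> float:
    """Cosine similarity of two dense vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm if norm else 0.0


# Hypothetical toy input: document id -> concepts linked via lifeskim:mentions
mentions = {
    "doc-1": ["concept-A", "concept-B"],
    "doc-2": ["concept-A", "concept-B"],
    "doc-3": ["concept-C"],
}
vectors = build_concept_vectors(mentions)
print(cosine(vectors["concept-A"], vectors["concept-B"]))
```

Concepts that are mentioned in the same documents end up with similar vectors, which is the property a real experiment on Medline would rely on. Whether to build the vectors from lifeskim:mentions or lifeskim:mentionsStrict is a recall/precision trade-off left open here.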

Concept mapping with machine learning methods

NB: The description could be narrowed down to causative relationships only (originally, general semantic relationships were proposed, which also included identity mapping, etc.).

Partners: Siemens (?), Ontotext (?)
Status: initial idea
Reference documents:
Involved WP: WP3, WP7a

LLD integrates concepts from more than 20 different sources. Each source describes different types of entities such as genes, proteins and diseases. The ultimate objective is to create a knowledge base with controlled redundancy (e.g. concepts with similar identity are connected with a set of predicates).

For instance, the concept with the label COPD is mentioned as:

The task is to connect all entities (if not already connected) with a list of predicates that indicate "exact match", "close match", "just related" or "unrelated". We have identified many features, such as exact labels, exact URI local names, network connection and same type. Using the mappings from existing sources or special meta-rules crafted by Ontotext, we can provide large amounts of training and test data.
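To make the feature list more concrete, here is a minimal sketch of how three of the features (exact label, exact URI local name, same type) could be computed for a pair of concepts before being passed to a learner. The `Concept` record, the URIs and the identifiers are hypothetical, and the network-connection feature is left out because it would require access to the graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """Hypothetical minimal view of a concept: URI, label and entity type."""
    uri: str
    label: str
    type: str


def local_name(uri: str) -> str:
    """Return the part of the URI after the last '#' or '/'."""
    return uri.rstrip("/#").replace("#", "/").rsplit("/", 1)[-1]


def pair_features(a: Concept, b: Concept) -> dict:
    """Boolean features for a candidate mapping between two concepts."""
    return {
        "exact_label": a.label == b.label,
        "exact_local_name": local_name(a.uri) == local_name(b.uri),
        "same_type": a.type == b.type,
    }


# Hypothetical concepts from two different sources describing the same entity.
a = Concept("http://example.org/source-a/ID-0001", "COPD", "Disease")
b = Concept("http://example.org/source-b#ID-0001", "COPD", "Disease")
features = pair_features(a, b)
print(features)
```

Each feature dictionary, paired with a label such as "exact match" or "unrelated" taken from the existing mappings, would form one training example.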

LarkcProject/WP7a/Cross WP collaboration ideas (last edited 2010-02-04 12:33:03 by BoAndersson)