D7a.2.1 Pathway and Interaction Knowledge Base

Type: other (knowledge base)
Scope: public
Delivery date: M18
09-16 September 09 (1 week), Quality Assessor = person outside the WP in which the deliverable is produced,
16-23 September 09 (1 week), Quality Controller = WP leader,
23-30 September 09 (1 week), buffer/ check by Frank van Harmelen,
30 September 09, submission to EC.

Introduction (Problem description)

Questions

The questions are currently presented on the page LarkcProject/WP7a/Questions

Types of entities, relations and data sources

This chapter outlines the core entity types and relations, and the data sources used to generate them. Some of the information will be derived by information extraction from various textual documents.

Basic Concept Types

|| Concept || Identifier || Data source || Example (all URIs should be resolvable!) || Comment ||
|| Gene || EntrezGene identifier || EntrezGene || http://linkedlifedata.com/resource/EntrezGene/id/7157 || ||
|| Protein || Uniprot primary accession number || Uniprot || http://purl.uniprot.org/uniprot/P02340 || ||
|| Pathway + classification || || PathwayCommons (Intact, Reactome, BioGRID, NCI-Nature) + Pathway Ontology (SKOS) || || ||
|| Disease and disorder || Disease ontology identifier || Disease ontology (OBO) || http://linkedlifedata.com/resource/diseaseontology/id/DOID:766 || The category may be too broad ||
|| Drug Active Substance || DrugBank identifier / CAS number || DrugBank (LODD) || http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00002 || ||
|| Symptom || Symptom ontology identifier || Symptom ontology (OBO) || ? || ||
|| Phenotype || Human phenotype ontology identifier || Human phenotype (OBO) || ? || ||
|| Anatomical loci || ? || NCI (OBO) || ? || ? ||
|| Tissue || ? || BRENDA tissue / enzyme source? (OBO) || ? || ||
|| Cell type || ? || BRENDA tissue / enzyme source? (OBO) + ontology population + Cell type (OBO) || ? || BioTagger has a trained model to recognize the concepts ||
|| Cell line || ? || BRENDA tissue / enzyme source? (OBO) + ontology population || ? || BioTagger has a trained model to recognize the concepts ||
|| Company || Wikipedia page || DBPedia || http://dbpedia.org/resource/AstraZeneca || ||
|| Cellular component || GO identifier || GO (OBO/RDF?) || http://linkedlifedata.com/resource/GeneOntology/id/GO:0005739 || ||
|| Biological process || GO identifier || GO (OBO/RDF?) || http://linkedlifedata.com/resource/GeneOntology/id/GO:0030154 || ||
|| Molecular function || GO identifier || GO (OBO/RDF?) || http://linkedlifedata.com/resource/GeneOntology/id/GO:0006355 || ||
|| Document || PubMed id + other? || PubMed + other? || http://linkedlifedata.com/resource/pubmed/pmid/12853137 || Other documents could also be integrated ||
|| Author || || PubMed authors || Generated from the first name, last name and initials || Duplications are possible ||
|| Chemical || CAS || ? || ? || ? ||
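The example URIs listed for the concept types follow a small number of patterns, so a resource URI can be derived from a concept type and an identifier. The following is a minimal sketch under that assumption: the template table is generalized from the single example given per concept type, and the names `URI_TEMPLATES` and `resource_uri` are hypothetical. Concept types without an example URI are deliberately left out.

```python
# Hypothetical URI templates, generalized from the one example URI that the
# document gives for each concept type. They are assumptions, not a
# confirmed naming scheme of the data sources.
URI_TEMPLATES = {
    "Gene": "http://linkedlifedata.com/resource/EntrezGene/id/{id}",
    "Protein": "http://purl.uniprot.org/uniprot/{id}",
    "Disease and disorder": "http://linkedlifedata.com/resource/diseaseontology/id/{id}",
    "Drug Active Substance": "http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/{id}",
    "Company": "http://dbpedia.org/resource/{id}",
    "Cellular component": "http://linkedlifedata.com/resource/GeneOntology/id/{id}",
    "Biological process": "http://linkedlifedata.com/resource/GeneOntology/id/{id}",
    "Molecular function": "http://linkedlifedata.com/resource/GeneOntology/id/{id}",
    "Document": "http://linkedlifedata.com/resource/pubmed/pmid/{id}",
}


def resource_uri(concept_type, identifier):
    """Build a resource URI for a concept type that has a known template."""
    try:
        template = URI_TEMPLATES[concept_type]
    except KeyError:
        # Concept types whose example is still "?" have no template yet.
        raise ValueError(f"no URI template for concept type {concept_type!r}")
    return template.format(id=identifier)


gene_uri = resource_uri("Gene", "7157")
go_uri = resource_uri("Cellular component", "GO:0005739")
```

Raising an error for unknown concept types, rather than guessing a URI, keeps the open "?" entries visible until a data source is chosen for them.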

Relations (all could be subject to an information extraction process; the arguments may be optional)

|| Relation || Identifier || Data source || Comment ||
|| Interaction (PhysicalEntity(Gene/Protein)+, Pathway?) || ? || PathwayCommons? || Possibly redundant ||
|| Target (Gene/Protein, Drug Active Substance) || DrugBank target identifier || DrugBank || http://www4.wiwiss.fu-berlin.de/drugbank/resource/targets/228 ||
|| Drug Product (Company name, Drug Active Substance+, Route of Administration, Region) || ? || Wikipedia/FDA/Dailymed? || ||
|| Indication (Drug Product/Drug Active Substance?, Disease/Symptom) || ? || DrugBank || Information extraction indication field ||
|| Gene function (Gene, Molecular function/Cellular component/Biological process) || ? || EntrezGene / Uniprot / GO || ||
|| Treatment (???) || ? || ? || Very unclear concept; it may be better to replace it with clinical trial ||

Concepts or relations that have been excluded from the initial list:

Data source transformation & instances mappings

To describe the OBO to SKOS transformations

To describe the other required transformations in order to "link data"

WP7a transformations.png
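The OBO to SKOS transformation is not yet specified here. As a rough illustration only, the sketch below converts one OBO `[Term]` stanza into SKOS-style triples. The mapping it uses (`id` to a `skos:Concept`, `name` to `skos:prefLabel`, `is_a` to `skos:broader`) is an assumption rather than the project's agreed transformation. The function name, base URI and sample term are hypothetical.

```python
def obo_term_to_skos(stanza, base_uri):
    """Convert the text of one OBO [Term] stanza into SKOS-style triples.

    Assumed mapping (not confirmed by the document):
      id   -> subject URI, typed as skos:Concept
      name -> skos:prefLabel
      is_a -> skos:broader
    The sketch relies on the id tag appearing before name and is_a.
    """
    triples = []
    subject = None
    for line in stanza.strip().splitlines():
        line = line.strip()
        if not line or line == "[Term]":
            continue
        tag, _, value = line.partition(":")
        # In OBO, " ! " starts a trailing comment (often the parent's name).
        value = value.split(" ! ")[0].strip()
        if tag == "id":
            subject = base_uri + value
            triples.append((subject, "rdf:type", "skos:Concept"))
        elif tag == "name":
            triples.append((subject, "skos:prefLabel", value))
        elif tag == "is_a":
            triples.append((subject, "skos:broader", base_uri + value))
        # All other tags (def, synonym, xref, ...) are ignored in this sketch.
    return triples


# Hypothetical stanza with made-up identifiers, for illustration only.
sample = """
[Term]
id: EX:0002
name: example child term
is_a: EX:0001 ! example parent term
"""
skos_triples = obo_term_to_skos(sample, "http://example.org/id/")
```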

Reasoning schema & requirements

Schema reasoning

To successfully implement the WP7a M18 prototype we require at least support for the SKOS schema, which involves heavy usage of:

Example:

<A> skos:broader <B> .
<B> skos:broader <C> .

entails

<A> skos:broaderTransitive <B> .
<B> skos:broaderTransitive <C> .
<A> skos:broaderTransitive <C> .
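The entailment above can be reproduced with a small closure computation. This is a minimal sketch, not the reasoner used in the prototype: triples are plain Python tuples with prefixed names as strings, and the function name `entail_broader_transitive` is hypothetical.

```python
from collections import defaultdict

BROADER = "skos:broader"
BROADER_TRANSITIVE = "skos:broaderTransitive"


def entail_broader_transitive(triples):
    """Return the skos:broaderTransitive triples entailed by the input."""
    # skos:broader is a sub-property of skos:broaderTransitive, so every
    # skos:broader edge also counts as a skos:broaderTransitive edge.
    successors = defaultdict(set)
    for s, p, o in triples:
        if p in (BROADER, BROADER_TRANSITIVE):
            successors[s].add(o)

    entailed = set()
    for start in list(successors):
        # Depth-first walk: collect every node reachable from `start`.
        stack = list(successors[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(successors.get(node, ()))
        entailed.update((start, BROADER_TRANSITIVE, node) for node in seen)
    return entailed


asserted = {("A", BROADER, "B"), ("B", BROADER, "C")}
inferred = entail_broader_transitive(asserted)
```

The `seen` set also makes the walk terminate on cyclic hierarchies, which can occur when thesauri from different sources are merged.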

Another example, used for semantic data integration, is the alignment of different biomedical thesauri:

<A> skos:broadMatch <B> .

entails

<A> skos:mappingRelation <B> .
<A> skos:broader <B> .
<A> skos:broaderTransitive <B> .
<A> skos:semanticRelation <B> .
<A> rdf:type skos:Concept .
<B> rdf:type skos:Concept .
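The six entailed triples follow from the SKOS sub-property hierarchy together with the domain and range of `skos:semanticRelation`. The sketch below applies those two rule kinds until nothing new is derived. It assumes the sub-property table shown in the code, which covers only the properties used in this example, and the function name `entail` is hypothetical.

```python
# Sub-property axioms needed for this example (a subset of the SKOS schema).
SUB_PROPERTY_OF = {
    "skos:broadMatch": ["skos:broader", "skos:mappingRelation"],
    "skos:broader": ["skos:broaderTransitive"],
    "skos:broaderTransitive": ["skos:semanticRelation"],
    "skos:mappingRelation": ["skos:semanticRelation"],
}


def entail(triples):
    """Return the triples newly entailed by the sub-property and
    domain/range rules (the asserted triples themselves are excluded)."""
    asserted = set(triples)
    known = set(asserted)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(known):
            # Sub-property rule: (s p o) and p subPropertyOf q gives (s q o).
            new = {(s, q, o) for q in SUB_PROPERTY_OF.get(p, ())}
            # Domain and range of skos:semanticRelation are skos:Concept.
            if p == "skos:semanticRelation":
                new.add((s, "rdf:type", "skos:Concept"))
                new.add((o, "rdf:type", "skos:Concept"))
            if not new <= known:
                known |= new
                changed = True
    return known - asserted


derived = entail({("A", "skos:broadMatch", "B")})
```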

Inconsistency rules

<Love> skos:prefLabel "love"@en ; skos:prefLabel "adoration"@en .
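The statement above is inconsistent because SKOS allows a resource at most one `skos:prefLabel` per language tag, and here `<Love>` has two for `en`. A minimal sketch of such a check follows. Labels are given as `(resource, text, language)` tuples, the function name `find_pref_label_conflicts` is hypothetical, and the French label is added only to show that labels in different languages do not conflict.

```python
from collections import defaultdict


def find_pref_label_conflicts(pref_labels):
    """Return {(resource, language): [labels]} for every resource that has
    more than one distinct preferred label in the same language."""
    by_key = defaultdict(set)
    for resource, text, lang in pref_labels:
        by_key[(resource, lang)].add(text)
    return {
        key: sorted(texts)
        for key, texts in by_key.items()
        if len(texts) > 1
    }


pref_labels = [
    ("Love", "love", "en"),
    ("Love", "adoration", "en"),
    ("Love", "amour", "fr"),  # illustrative addition, not from the example
]
conflicts = find_pref_label_conflicts(pref_labels)
```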

Relation extraction validation

LarkcProject/WP7a/M18prototype (last edited 2009-07-21 07:19:18 by VassilMomtchev)