Information Extraction
See also information in the requirements deliverable
See also WP7a information extraction
Other corpora
A few other corpora that may prove useful, in addition to those already suggested by WP7a:
- PennBioIE CYP (various)
- PennBioIE Oncology (various)
- EBI (diseases)
- BioText (diseases, treatments)
- Wisconsin / Craven (diseases)
- IEPA (diseases, disease relations)
- TREC genomics
- LLL (protein / gene interactions)
- Yapex (proteins)
- CLEF (diseases, anatomy - but clinical documents, and restricted)
Entities to extract
The full list of entities required by WP7b:
- genes
- proteins
- biomolecules
- anatomical sites
- diseases, disorders, symptoms etc.
- small chemical entities
- species
- MeSH terms
- person (from structured fields)
- organisation (from structured fields)
- lexical features
- dependency relations
- biomedical relations: TBD
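
As a rough illustration only, the list above could be held as a small lookup table so that downstream components can check which annotation types are expected. This is a minimal sketch: the names `Kind`, `WP7B_TARGETS` and `targets_of` are hypothetical, and the grouping into four kinds is an assumption made for the example, not something specified by WP7b.

```python
from enum import Enum


class Kind(Enum):
    """Hypothetical grouping of the WP7b targets (an assumption, not from WP7b)."""
    TEXT_ENTITY = "entity found in free text"
    STRUCTURED_FIELD = "entity taken from structured fields"
    LINGUISTIC = "linguistic annotation"
    RELATION = "relation between entities"


# Keys follow the WP7b list above; the Kind assigned to each is illustrative.
WP7B_TARGETS = {
    "gene": Kind.TEXT_ENTITY,
    "protein": Kind.TEXT_ENTITY,
    "biomolecule": Kind.TEXT_ENTITY,
    "anatomical site": Kind.TEXT_ENTITY,
    "disease, disorder or symptom": Kind.TEXT_ENTITY,
    "small chemical entity": Kind.TEXT_ENTITY,
    "species": Kind.TEXT_ENTITY,
    "MeSH term": Kind.TEXT_ENTITY,
    "person": Kind.STRUCTURED_FIELD,
    "organisation": Kind.STRUCTURED_FIELD,
    "lexical feature": Kind.LINGUISTIC,
    "dependency relation": Kind.LINGUISTIC,
    "biomedical relation": Kind.RELATION,  # exact relation types are still TBD
}


def targets_of(kind):
    """Return the target names of the given kind, sorted alphabetically."""
    return sorted(name for name, k in WP7B_TARGETS.items() if k is kind)
```

For example, `targets_of(Kind.STRUCTURED_FIELD)` lists the two targets that come from structured fields rather than from free text.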
