Pointers to work on TripleStores here.
Useful intros on RDF:
List with signed statements on the number of triples handled by various stores. It is only marginally informative, since the statements don't really tell you what the stores can do with these triples: just retrieval, or also inference? If inference, what semantics is supported?
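To make the retrieval-versus-inference distinction concrete, here is a minimal sketch in plain Python. It is a toy in-memory store, not the API of any store listed on this page. The `ex:` data, the function names, and the choice to implement only two RDFS rules (rdfs9 and rdfs11, both about `rdfs:subClassOf`) are assumptions made for illustration.

```python
# Toy in-memory triple store: a set of (subject, predicate, object) tuples.
# All names and data here are hypothetical, for illustration only.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("ex:Dog", SUBCLASS_OF, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
}


def match(store, s=None, p=None, o=None):
    """Plain retrieval: return triples matching a pattern; None is a wildcard."""
    return {
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def rdfs_subclass_closure(store):
    """Forward-chain two RDFS rules until nothing new is derived.

    rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    rdfs9:  (x type A), (A subClassOf B)       => (x type B)
    """
    inferred = set(store)
    changed = True
    while changed:
        new = set()
        for (a, _, b) in match(inferred, p=SUBCLASS_OF):
            for (_, _, c) in match(inferred, s=b, p=SUBCLASS_OF):
                new.add((a, SUBCLASS_OF, c))
            for (x, _, _) in match(inferred, p=RDF_TYPE, o=a):
                new.add((x, RDF_TYPE, b))
        changed = not new <= inferred
        inferred |= new
    return inferred


# Retrieval only sees the asserted type of ex:rex.
retrieved = match(triples, s="ex:rex", p=RDF_TYPE)
# With the subclass rules applied, ex:rex is also a Mammal and an Animal.
entailed = match(rdfs_subclass_closure(triples), s="ex:rex", p=RDF_TYPE)
```

Two stores holding the same three triples would answer the same query differently depending on whether they do anything like the second step, which is why a raw triple count says little on its own.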
Sesame is an easy-to-install triple store to play with: http://www.openrdf.org/
OWLIM is a high-performance triple store on top of Sesame, developed by LarKC partner OntoText: http://www.ontotext.com/owlim/ plus some good slides on what they can do: OWLIMPres.pdf
The YARS triple store was the first to break the billion-triple barrier (YARS-DERI-TR-2007-04-20.pdf), but it doesn't support RDFS/OWL inference.
Slides on the design philosophy of the Mulgara store (by Zepheira): netkernel.pdf
Possibly the TripCom Triple Store model is another suitable storage paradigm for LarKC?
Some slides on how TripCom approaches scalability: TripCom.ppt, as well as a general TripCom paper: triple-space-computing.pdf
Some recent benchmarking attempts:
Only very loosely connected to the topic of triple stores: OceanStore is an approach to world-wide distributed storage and retrieval of persistent information. Is this where distributed triple-stores should go? oceanstore asplos00.pdf oceanstore fast2003-pond.pdf
Again somewhat loosely related, but still relevant: the DB world is also waking up to the fact that the world's data is now too interconnected, and too underorganised, to fit in a database. This has prompted work on what they call "dataspaces": "[data comes in] a large number of diverse, interrelated data sources, [with] no way to manage their dataspaces in a convenient, integrated, or principled fashion." (Now what does that remind you of?)
The paper launching this idea is dataspaces SIGMOD Dec05.pdf; a later paper is dataspaces-PODS06.pdf
Although not strictly about triple stores, the Tuple Space model is a potential model for the LarKC storage component (and even for communication between the plugins). A useful survey appeared recently: tuple-space-survey.pdf
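For readers unfamiliar with the Tuple Space model, the following is a minimal Linda-style sketch in Python. It is not taken from the survey or from any LarKC or TripCom code. The class name `TupleSpace`, the method names `out`, `rd` and `in_`, and the use of `None` as a wildcard are assumptions chosen for illustration. Producers write tuples into a shared space, and consumers read or take them by pattern, so the two sides never address each other directly.

```python
import threading


class TupleSpace:
    """Minimal Linda-style tuple space (hypothetical sketch, not a LarKC API)."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        """Write a tuple into the space and wake any waiting readers."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern acts as a wildcard field.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def _find(self, pattern):
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def rd(self, pattern, timeout=None):
        """Non-destructive read: return a matching tuple, or None on timeout."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout
            )
            return self._find(pattern) if found else None

    def in_(self, pattern, timeout=None):
        """Destructive read: remove and return a matching tuple, or None on timeout."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout
            )
            if not found:
                return None
            tup = self._find(pattern)
            self._tuples.remove(tup)
            return tup


space = TupleSpace()
space.out(("ex:rex", "rdf:type", "ex:Dog"))         # a plugin publishes a triple
seen = space.rd(("ex:rex", None, None))             # another plugin reads it
taken = space.in_((None, "rdf:type", None))         # a third plugin consumes it
gone = space.rd(("ex:rex", None, None), timeout=0)  # nothing left to read
```

A triple fits naturally as a three-field tuple, which is what makes the model a candidate both for storage and for passing work between plugins. A real deployment would also need persistence and distribution, which this sketch ignores.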
Members of W3C's Semantic Web Deployment Working Group have published a Group Note for "Best Practice Recipes for Publishing RDF Vocabularies" http://www.w3.org/TR/2008/NOTE-swbp-vocab-pub-20080828/
Chris Bizer has developed a benchmark for RDF stores, and has run it to compare native RDF/SPARQL stores against SQL databases (wrapped as RDF stores). The results are rather interesting, see http://lists.w3.org/Archives/Public/semantic-web/2008Sep/0128.html
