|
See also swecr

Java API *javaapi*

This module has one clearly defined API to the outside. The implementation should be able to run on a desktop. This API should expose:
- an RdfModel - API: RDF2Go ModelSet, impl: Sesame2
- a BinaryStore - API: BinaryStore, impl: JCR / JackRabbit
- a TextIndex - API: Lucene, impl Lucene
- Versioning - API: ?, impl JackRabbit or SubVersion
- AccessRights - API: ?, impl ?

Updates
- add triple (remove is reverse)
  - add to versioned persistence layer
  - add triple to RdfModel for indexing
  - if triple contains literal, index in lucene
- add binary (remove is reverse)
  - add to versioned persistence layer
  - for given mime-type
    - extract text and index in lucene
    - extract RDF and index in RdfModel
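
The triple update path above can be sketched with in-memory stand-ins. This is only an illustration under stated assumptions: `Triple`, `UpdateSketch`, and the three collections are hypothetical names that replace the versioned persistence layer, the RdfModel, and the Lucene index. None of them is part of RDF2Go, Sesame, or Lucene.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical value type; a real implementation would use RDF2Go statements.
class Triple {
    final String subject;
    final String predicate;
    final String object;
    final boolean literalObject; // true if the object is a literal, false if it is a URI

    Triple(String subject, String predicate, String object, boolean literalObject) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
        this.literalObject = literalObject;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Triple)) return false;
        Triple t = (Triple) o;
        return subject.equals(t.subject) && predicate.equals(t.predicate)
                && object.equals(t.object) && literalObject == t.literalObject;
    }

    @Override
    public int hashCode() {
        return Objects.hash(subject, predicate, object, literalObject);
    }
}

// In-memory stand-ins for the three layers touched by an update (all assumptions).
class UpdateSketch {
    final List<String> versionLog = new ArrayList<>();          // versioned persistence layer
    final Set<Triple> rdfModel = new HashSet<>();               // RdfModel
    final Map<String, Set<Triple>> textIndex = new HashMap<>(); // text index: literal -> triples

    void addTriple(Triple t) {
        versionLog.add("add " + t.subject + " " + t.predicate + " " + t.object);
        rdfModel.add(t);
        if (t.literalObject) {
            // Only literals are sent to the text index.
            textIndex.computeIfAbsent(t.object, k -> new HashSet<>()).add(t);
        }
    }

    // Remove is the reverse of add.
    void removeTriple(Triple t) {
        versionLog.add("remove " + t.subject + " " + t.predicate + " " + t.object);
        rdfModel.remove(t);
        Set<Triple> hits = textIndex.get(t.object);
        if (t.literalObject && hits != null) {
            hits.remove(t);
            if (hits.isEmpty()) textIndex.remove(t.object);
        }
    }
}
```

The text index keeps a set of triples per literal so that removing one triple does not drop a literal that another triple still uses.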

Queries
- rdf query -> ask RdfModel
- fulltext query -> ask TextIndex
- mixed query:
  - split into rdf part and fulltext part
  - query RdfModel with rdf part
  - query TextIndex with fulltext part
  - join result sets
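
The last step of a mixed query, joining the two result sets, might look like the following sketch. It assumes the rdf part has already returned rows of (item, literal) bindings and the fulltext part has returned the set of matching literals. `MixedQuerySketch` and `join` are hypothetical names, and the join is a plain in-memory hash join, not what LARQ or LuceneSAIL do internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MixedQuerySketch {
    /**
     * Joins the two partial results on the shared literal.
     *
     * @param rdfRows  rows from the RdfModel, each row is {item, literal}
     * @param textHits literals that the TextIndex reported as matching
     * @return items whose literal matched the fulltext part, without duplicates
     */
    static List<String> join(List<String[]> rdfRows, Set<String> textHits) {
        Set<String> items = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        for (String[] row : rdfRows) {
            String item = row[0];
            String literal = row[1];
            if (textHits.contains(literal)) {
                items.add(item);
            }
        }
        return new ArrayList<>(items);
    }
}
```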

Sample query (taken from the LARQ documentation):
SELECT ?item WHERE {
  ?lit swecr:textMatch '+text' .
  ?item swecr:hasContent ?lit
}

Versioning
- takeSnapshot - store current state as a version
- getCurrentVersion - return a global number, like in SubVersion
- restoreVersion(x) - copy old version as next version
- rollbackto(x) - remove all versions after x from the versioning system
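
To make the difference between restoring and rolling back concrete, here is a minimal in-memory sketch of the four operations. `VersioningSketch` is a hypothetical class. Two details are assumptions not stated above: version numbers start at 1, and a rollback also resets the working state to version x.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class VersioningSketch {
    // versions.get(i) holds the snapshot with global version number i + 1.
    private final List<Set<String>> versions = new ArrayList<>();
    // Working state; strings stand in for triples and binaries.
    final Set<String> current = new HashSet<>();

    // Store the current state as a new version and return its number.
    int takeSnapshot() {
        versions.add(new HashSet<>(current));
        return versions.size();
    }

    // One global, monotonically growing number for the whole repository.
    int getCurrentVersion() {
        return versions.size();
    }

    // Copy old version x as the next version; history is kept.
    int restoreVersion(int x) {
        checkExists(x);
        current.clear();
        current.addAll(versions.get(x - 1));
        return takeSnapshot();
    }

    // Remove all versions after x; history after x is lost.
    void rollbackTo(int x) {
        checkExists(x);
        while (versions.size() > x) {
            versions.remove(versions.size() - 1);
        }
        current.clear();
        current.addAll(versions.get(x - 1)); // assumption: working state follows the rollback
    }

    private void checkExists(int x) {
        if (x < 1 || x > versions.size()) {
            throw new IllegalArgumentException("no such version: " + x);
        }
    }
}
```

After a restore the version counter grows, whereas after a rollback it shrinks back to x.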

AccessRights
- filter access on add/remove of data (rdf and binaries)
- filter query results
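
Both filter points can be sketched as a thin wrapper around the store. Everything here is an assumption for illustration: the per-user read and write grants, the class name `AccessFilterSketch`, and the choice to reject a forbidden add by returning `false`. The page leaves the actual access model open.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AccessFilterSketch {
    private final Map<String, Set<String>> readable = new HashMap<>(); // user -> readable URIs
    private final Map<String, Set<String>> writable = new HashMap<>(); // user -> writable URIs
    private final List<String> resources = new ArrayList<>();          // stand-in for the store

    void grantRead(String user, String uri) {
        readable.computeIfAbsent(user, k -> new HashSet<>()).add(uri);
    }

    void grantWrite(String user, String uri) {
        writable.computeIfAbsent(user, k -> new HashSet<>()).add(uri);
    }

    // Filter on add: the data only reaches the store if the user may write it.
    boolean add(String user, String uri) {
        if (!writable.getOrDefault(user, Collections.<String>emptySet()).contains(uri)) {
            return false;
        }
        resources.add(uri);
        return true;
    }

    // Filter on query: results the user may not read are dropped.
    List<String> query(String user) {
        Set<String> allowed = readable.getOrDefault(user, Collections.<String>emptySet());
        List<String> visible = new ArrayList<>();
        for (String uri : resources) {
            if (allowed.contains(uri)) {
                visible.add(uri);
            }
        }
        return visible;
    }
}
```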

web api

swecr web server

This server exposes javaapi to the web as a restapi.

swecr web client

Consumes a restapi and exposes the result as an implementation of javaapi. This allows swecr to be used remotely or locally. Locally, no HTTP is used.

swecr implementation

We surely need to re-use an RDF triple store.
- We plan to use RDF2Go as a triple store abstraction layer. Underneath, we are currently investigating Jena and OpenRDF, with a slight preference for OpenRDF (we use it in NEPOMUK, too).
- The store gives us: sparql
We also need a binary store, which essentially maps URIs to Input- and OutputStreams.
- For non-versioning we have: IBinStore. But we also need versioning. So we look into:
- We expect to get: binary+versioning
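
The core of a binary store, mapping URIs to input and output streams, fits in a few lines. The sketch below is a non-versioned, in-memory illustration only. `InMemoryBinStore` and its method names are hypothetical and are not the IBinStore interface mentioned above. Content becomes visible to readers once the writer closes its stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class InMemoryBinStore {
    private final Map<URI, byte[]> data = new HashMap<>();

    // Returns a stream to write the content for the given URI.
    // The bytes are stored when the stream is closed.
    OutputStream write(final URI uri) {
        return new ByteArrayOutputStream() {
            @Override
            public void close() {
                data.put(uri, toByteArray());
            }
        };
    }

    // Returns a stream over the stored content for the given URI.
    InputStream read(URI uri) throws FileNotFoundException {
        byte[] bytes = data.get(uri);
        if (bytes == null) {
            throw new FileNotFoundException(uri.toString());
        }
        return new ByteArrayInputStream(bytes);
    }
}
```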
Next we will probably use
- Lucene. It is used internally in JackRabbit (a JCR implementation). Lucene is also used by LuceneSAIL and LARQ. We will see if we can tweak the two to use the same Lucene.
- This should give us: sparql+fulltext+binary
- We also need some tools to extract full-text from binaries, to get binary+fulltext.
- Using Aperture we can enhance the possibilities for asking queries (Aperture extracts RDF from binaries). So we get a better sparql+binary.
Next in the list is versioning of the RDF.
- We plan to look again into SemVersion and create a new version of it as an RDF2Go layer. This should give us sparql+versioning.
For access rights, well, we should use something on top of the other layers.
- Any ideas for: access?
- I once supervised a very good diploma thesis on the topic (PDF, in German).
Locking can probably be implemented easily by using the triple store to record the current state of a resource.

We are left with the problems:
- uris for binaries: this is so easy to achieve, we don't give it a name
  - we just need to map any kind of ID from the binary store to URIs. Almost ...
- xhtmlelement
  - To annotate XML elements, we plan to use the XML ID approach, which gives every DOM node that needs one a URI. See also XhtmlAndRDF.
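
Returning to the locking idea mentioned earlier (recording a resource's current state in the triple store), one possible shape is a single "locked by" statement per resource. This is a sketch under assumptions: the predicate `urn:example:lockedBy` is made up, a map stands in for the triple store, and `TripleLockSketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

class TripleLockSketch {
    // Hypothetical predicate; the page does not define a locking vocabulary.
    static final String LOCKED_BY = "urn:example:lockedBy";

    // Stand-in for the triple store: key is "subject predicate", value is the object.
    private final Map<String, String> triples = new HashMap<>();

    private static String key(String resource) {
        return resource + " " + LOCKED_BY;
    }

    // Succeeds if the resource is unlocked or already locked by the same user.
    synchronized boolean lock(String resource, String user) {
        String holder = triples.get(key(resource));
        if (holder != null && !holder.equals(user)) {
            return false;
        }
        triples.put(key(resource), user);
        return true;
    }

    // Only the current holder may remove the lock statement.
    synchronized boolean unlock(String resource, String user) {
        if (!user.equals(triples.get(key(resource)))) {
            return false;
        }
        triples.remove(key(resource));
        return true;
    }

    // Returns the current holder, or null if the resource is not locked.
    synchronized String lockedBy(String resource) {
        return triples.get(key(resource));
    }
}
```

Against a real triple store, the check and the write in `lock` would have to run in one transaction to stay atomic.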
|