Updated Mar 26, 2007 by max.at.xam.de
Architecture  
Textual representation of a possible architecture

See also

swecr Java API *javaapi*

This module has one clearly defined API to the outside. The implementation should be able to run on a desktop.

This API should expose:

  • an RdfModel - API: RDF2Go ModelSet, impl: Sesame2
  • a BinaryStore - API: BinaryStore, impl: JCR / JackRabbit
  • a TextIndex - API: Lucene, impl: Lucene
  • Versioning - API: ?, impl JackRabbit or SubVersion
  • AccessRights - API: ?, impl ?

Updates

  • add triple (remove is reverse)
    • add to versioned persistence layer
    • add triple to RdfModel for indexing
    • if triple contains literal, index in lucene
  • add binary (remove is reverse)
    • add to versioned persistence layer
    • for given mime-type
      • extract text and index in lucene
      • extract RDF and index in RdfModel
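
The triple half of the update flow above can be sketched in plain Java. This is only an in-memory illustration: `Triple`, `UpdatePipeline`, and the three collections are hypothetical stand-ins for the versioned persistence layer, the RdfModel, and Lucene, and none of them is the RDF2Go, Sesame2, or Lucene API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical triple; objectIsLiteral marks a literal object. */
final class Triple {
    final String subject;
    final String predicate;
    final String object;
    final boolean objectIsLiteral;

    Triple(String subject, String predicate, String object, boolean objectIsLiteral) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
        this.objectIsLiteral = objectIsLiteral;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Triple)) return false;
        Triple t = (Triple) o;
        return subject.equals(t.subject) && predicate.equals(t.predicate)
                && object.equals(t.object) && objectIsLiteral == t.objectIsLiteral;
    }

    @Override public int hashCode() {
        return Objects.hash(subject, predicate, object, objectIsLiteral);
    }

    @Override public String toString() {
        return subject + " " + predicate + " " + object;
    }
}

/** In-memory stand-in for the three layers touched by a triple update. */
class UpdatePipeline {
    final List<String> changeLog = new ArrayList<>();           // assumed: versioned persistence layer
    final Set<Triple> rdfModel = new HashSet<>();               // assumed: RdfModel
    final Map<String, Set<Triple>> textIndex = new HashMap<>(); // assumed: text index, token -> triples

    void addTriple(Triple t) {
        changeLog.add("add " + t);   // 1. record in the persistence layer
        rdfModel.add(t);             // 2. make the triple queryable
        if (t.objectIsLiteral) {     // 3. index literals for full-text search
            for (String token : tokens(t.object)) {
                textIndex.computeIfAbsent(token, k -> new HashSet<>()).add(t);
            }
        }
    }

    /** Remove is the reverse of add, applied to the same three layers. */
    void removeTriple(Triple t) {
        changeLog.add("remove " + t);
        rdfModel.remove(t);
        if (t.objectIsLiteral) {
            for (String token : tokens(t.object)) {
                Set<Triple> hits = textIndex.get(token);
                if (hits != null) {
                    hits.remove(t);
                    if (hits.isEmpty()) textIndex.remove(token);
                }
            }
        }
    }

    /** Naive whitespace tokenizer; a real index would use an analyzer. */
    private static List<String> tokens(String literal) {
        List<String> result = new ArrayList<>();
        for (String token : literal.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) result.add(token);
        }
        return result;
    }
}
```

The binary path would follow the same shape, with a mime-type-specific extractor feeding the text index and the RdfModel.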

Queries

  • rdf query -> ask RdfModel
  • fulltext query -> ask TextIndex
  • mixed query:
    • split in rdf part and fulltext part
    • query RdfModel with rdf part
    • query TextIndex with fulltext part
    • join result sets

Sample query (taken from the LARQ documentation):

  SELECT ?item WHERE {
    ?lit swecr:textMatch '+text' .
    ?item swecr:hasContent ?lit
  }
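
One way to read the mixed-query steps is as a join on the shared variable (`?lit` in the sample query). The sketch below assumes the rdf part has already produced rows binding `item` and `lit`, and it uses a naive substring match as a stand-in for the TextIndex. `MixedQuery`, `textMatch`, and `join` are hypothetical names, not part of LARQ or Lucene.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MixedQuery {

    /** Stand-in for the fulltext part: literals that contain the search term. */
    static Set<String> textMatch(Collection<String> literals, String term) {
        Set<String> hits = new HashSet<>();
        String needle = term.toLowerCase();
        for (String literal : literals) {
            if (literal.toLowerCase().contains(needle)) hits.add(literal);
        }
        return hits;
    }

    /**
     * Joins the two result sets on ?lit.
     * rdfRows: results of the rdf part, each row binding "item" and "lit".
     * textHits: literals returned by the fulltext part.
     */
    static List<String> join(List<Map<String, String>> rdfRows, Set<String> textHits) {
        List<String> items = new ArrayList<>();
        for (Map<String, String> row : rdfRows) {
            if (textHits.contains(row.get("lit"))) {
                items.add(row.get("item"));
            }
        }
        return items;
    }
}
```

Putting the fulltext hits into a set keeps the join linear in the number of rdf rows. Whether the real system should instead push the text match into the triple store is left open here.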

Versioning

  • takeSnapshot - store current state as a version
  • getCurrentVersion - return a global number, like in SubVersion
  • restoreVersion (x) - copy old version as next version
  • rollbackto(x) - remove all versions after x from versioning system
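
The four operations can be illustrated with a small in-memory model in which every snapshot is a full copy of the state. `SnapshotStore` is a hypothetical name, and representing the state as a set of strings is an assumption. `rollbackTo` corresponds to rollbackto(x) above. Resetting the working state on rollback is also an assumption, since the list above only says which versions are removed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SnapshotStore {
    // versions.get(i) holds version i + 1; version numbers are global, as in SubVersion
    private final List<Set<String>> versions = new ArrayList<>();
    final Set<String> current = new HashSet<>(); // working state

    /** Stores the current state as a new version and returns its number. */
    int takeSnapshot() {
        versions.add(new HashSet<>(current));
        return versions.size();
    }

    /** Returns the latest version number, or 0 if no snapshot exists yet. */
    int getCurrentVersion() {
        return versions.size();
    }

    /** Copies old version x as the next version; history is kept. */
    int restoreVersion(int x) {
        Set<String> old = version(x);
        current.clear();
        current.addAll(old);
        return takeSnapshot();
    }

    /** Removes all versions after x; history after x is lost. */
    void rollbackTo(int x) {
        Set<String> old = version(x);
        current.clear();
        current.addAll(old); // assumption: working state is reset to version x
        versions.subList(x, versions.size()).clear();
    }

    private Set<String> version(int x) {
        if (x < 1 || x > versions.size()) {
            throw new IllegalArgumentException("no such version: " + x);
        }
        return versions.get(x - 1);
    }
}
```

The difference between the last two operations is visible in the version counter: restoreVersion increases it, rollbackTo decreases it.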

AccessRights

  • filter access on add/remove of data (rdf and binaries)
  • filter query results
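
Since the access model itself is still open (API: ?, impl: ?), the following is only a sketch of where the two filters could sit: a wrapper that checks a write predicate on add/remove and a read predicate on query results. `AccessFilteredStore` and both predicates are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiPredicate;

class AccessFilteredStore {
    private final BiPredicate<String, String> mayWrite; // (user, resource URI) -> allowed?
    private final BiPredicate<String, String> mayRead;  // (user, resource URI) -> allowed?
    private final Set<String> resources = new LinkedHashSet<>(); // stand-in for rdf and binaries

    AccessFilteredStore(BiPredicate<String, String> mayWrite,
                        BiPredicate<String, String> mayRead) {
        this.mayWrite = mayWrite;
        this.mayRead = mayRead;
    }

    /** Filters access on add; returns false if the user may not write. */
    boolean add(String user, String resourceUri) {
        if (!mayWrite.test(user, resourceUri)) return false;
        resources.add(resourceUri);
        return true;
    }

    /** Filters access on remove; returns false if denied or not present. */
    boolean remove(String user, String resourceUri) {
        if (!mayWrite.test(user, resourceUri)) return false;
        return resources.remove(resourceUri);
    }

    /** Filters query results down to what the user may read. */
    List<String> query(String user) {
        List<String> visible = new ArrayList<>();
        for (String uri : resources) {
            if (mayRead.test(user, uri)) visible.add(uri);
        }
        return visible;
    }
}
```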

web api

swecr web server

This server exposes javaapi to the web as a restapi.

swecr web client

Consumes a restapi and exposes the result as an implementation of javaapi. This allows swecr to be used remotely or locally. Locally, no HTTP is used.

swecr implementation

We surely need to re-use an RDF triple store.

  • We plan to use RDF2Go as a triple store abstraction layer. Underneath, we are currently investigating Jena and OpenRDF, with a slight preference for OpenRDF (we use it in NEPOMUK, too).
    • The store gives us: sparql

We also need a binary store, which essentially maps URIs to Input- and OutputStreams.

  • For non-versioning we have: IBinStore. But we also need versioning. So we look into:
  • We expect to get: binary+versioning
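
The non-versioned part, mapping URIs to Input- and OutputStreams, could look roughly like the in-memory sketch below. `InMemoryBinaryStore`, `openOutput`, and `openInput` are hypothetical names. This is not the IBinStore interface and it does no versioning.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class InMemoryBinaryStore {
    private final Map<URI, byte[]> blobs = new HashMap<>();

    /** Returns a stream whose content is stored under the URI once it is closed. */
    OutputStream openOutput(final URI uri) {
        return new ByteArrayOutputStream() {
            @Override public void close() throws IOException {
                super.close();
                blobs.put(uri, toByteArray());
            }
        };
    }

    /** Returns the stored content for the URI, or fails if nothing is stored. */
    InputStream openInput(URI uri) throws FileNotFoundException {
        byte[] data = blobs.get(uri);
        if (data == null) {
            throw new FileNotFoundException(uri.toString());
        }
        return new ByteArrayInputStream(data);
    }

    boolean contains(URI uri) {
        return blobs.containsKey(uri);
    }
}
```

Content only becomes visible on close, so a half-written binary is never readable. That is a design choice of this sketch, not a requirement stated above.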

Next we probably use

  • Lucene. It is internally used in JackRabbit (a JCR implementation). Also, Lucene is used by LuceneSAIL and LARQ. We will see if we can tweak the two to use the same Lucene.
    • This should give us: sparql+fulltext+binary
    • We also need some tools to extract full-text from binaries, to get binary+fulltext.
    • Using Aperture, we can extend the range of queries that can be asked (Aperture extracts RDF from binaries). So we get a better sparql+binary.

Next in the list is versioning of the RDF.

  • We plan to look at SemVersion again and create a new version of it as an RDF2Go layer. This should give us sparql+versioning.

For access rights, well, we should use something on top of the other layers.

  • Any ideas for: access?
  • I once supervised a very good diploma thesis on the topic (PDF, in German).

Locking can probably be implemented easily by using the triple store to record the current state of a resource.
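
A minimal sketch of that idea: the lock state of a resource is a single statement (resource, lockedBy, user), and acquiring a lock means adding that statement if no conflicting one exists. The map below stands in for the triple store. `TripleLockManager` and the `urn:example:lockedBy` predicate are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class TripleLockManager {
    static final String LOCKED_BY = "urn:example:lockedBy"; // hypothetical predicate

    // Stand-in for the triple store: "subject predicate" -> object
    private final Map<String, String> statements = new HashMap<>();

    /** Adds (resource, lockedBy, user) unless another user already holds the lock. */
    synchronized boolean lock(String resource, String user) {
        String holder = statements.get(key(resource));
        if (holder != null && !holder.equals(user)) {
            return false;
        }
        statements.put(key(resource), user);
        return true;
    }

    /** Removes the lock statement, but only if this user holds the lock. */
    synchronized boolean unlock(String resource, String user) {
        return statements.remove(key(resource), user);
    }

    /** Returns the current lock holder, or null if the resource is unlocked. */
    synchronized String holder(String resource) {
        return statements.get(key(resource));
    }

    private static String key(String resource) {
        return resource + " " + LOCKED_BY;
    }
}
```

With a real store, the check and the insert would have to happen atomically, for example inside one transaction.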

We are left with the problems:

  • uris for binaries: this is so easy to achieve, we don't give it a name
    • we just need to map any kind of ID from the binary store to URIs. Almost ...
  • xhtmlelement
    • To annotate XML elements, we plan to use the XML ID approach, which gives every DOM node that needs one a URI. See also XhtmlAndRDF.

