For an evaluation of the existing state of the art with respect to these features, see StateOfTheArt.
Step 1: Generalised Requirements
From the original PowerPoint file:
- Varying content granularity
- From a single word (such as a concept or wikiname) to a document (such as a wiki page content)
- Should be able to represent semantic concept-maps (iMapping)
- Annotation of everything
- Even of parts of the content
- Semantic statements about everything
- Concepts, resources, binaries, statements
- Versioning of everything (binaries, content, model)
- Resources can be versioned individually, RDF is best versioned on the model
- Full-text search
- Autocomplete support
- For links and plain text
- Client-Server
- To enable real-time collaboration, just like in a wiki
- Synchronisation
- Mirror a content store constantly
- Commit local repository to a shared repository – and vice versa
- Import/Export
- From other repositories, from RDF, from web sites, from applications
- Compatible with semantic web and web architecture
- Implementable
Step 2: Identified orthogonal requirements
Representation on two layers
- Content Resources: map URIs to content (strings or binaries), might be local or remote
- --> need for binary store
- Semantic Metadata (statements about URIs), locally managed
- --> RDF store
- Queries for named items (such as Wikinames) needed for auto-completion support
- Versioning of content – on a per-resource basis
- Versioning of semantic model – on a model-basis
- Needs an index of all content resources
- Queries to the lexicon are needed for full-text auto-completion
- RESTful API
- Clear export format
- Support for synchronisation between two repositories needed
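Both auto-completion queries listed above (over named items such as Wikinames, and over the full-text lexicon) boil down to prefix lookups in a sorted list of terms. The following is only a minimal sketch of that idea; the class `Lexicon`, its methods and the choice to lower-case all terms are assumptions made for illustration, not part of any existing swecr API.

```python
import bisect


class Lexicon:
    """Sorted list of terms that answers prefix queries (hypothetical helper)."""

    def __init__(self, terms=()):
        # Assumption: matching is case-insensitive, so terms are lower-cased.
        self._terms = sorted({t.lower() for t in terms})

    def add(self, term):
        """Insert a term at its sorted position, ignoring duplicates."""
        term = term.lower()
        i = bisect.bisect_left(self._terms, term)
        if i == len(self._terms) or self._terms[i] != term:
            self._terms.insert(i, term)

    def complete(self, prefix, limit=10):
        """Return up to `limit` terms starting with `prefix`, in sorted order."""
        prefix = prefix.lower()
        # All terms sharing the prefix are adjacent, starting at this position.
        start = bisect.bisect_left(self._terms, prefix)
        matches = []
        for term in self._terms[start:]:
            if not term.startswith(prefix) or len(matches) >= limit:
                break
            matches.append(term)
        return matches
```

The same structure could serve both the index of named items and the lexicon of the full-text index; a real implementation would probably keep the original spelling next to the lower-cased key.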
Step 3
Data Model
Binaries
Binary files cannot be stored in RDF. They need to be stored elsewhere. We call this feature binary. See BinaryStore.
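To illustrate what the binary feature amounts to, here is a minimal in-memory sketch of a store that maps URIs to byte content. The class name `InMemoryBinaryStore`, its methods and the returned SHA-256 digest are assumptions for illustration only; the actual BinaryStore may look quite different.

```python
import hashlib


class InMemoryBinaryStore:
    """Maps URIs to byte content; a stand-in for a real binary store."""

    def __init__(self):
        self._blobs = {}  # uri -> bytes

    def put(self, uri, data):
        """Store `data` under `uri` and return a hex digest of the content."""
        if not isinstance(data, (bytes, bytearray)):
            raise TypeError("binary content must be bytes")
        self._blobs[uri] = bytes(data)
        # Assumption: a content hash is handy for detecting changes later.
        return hashlib.sha256(self._blobs[uri]).hexdigest()

    def get(self, uri):
        """Return the bytes stored under `uri` (raises KeyError if unknown)."""
        return self._blobs[uri]

    def __contains__(self, uri):
        return uri in self._blobs
```

Because every binary is addressed by a URI, RDF statements can later refer to it without the content itself ever entering the RDF store.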
Full-text search
Full-text search means the ability to find partial string matches within a collection of strings. Lucene is a widely used implementation of this. We call this feature fulltext. See TextIndex.
We also have derived requirements, such as the ability to run full-text queries over binary content. This of course only works if some subsystem has extracted the text, e.g. from a PDF or DOC file. We call such combined requirements binary+fulltext, which should not be confused with the requirement to have both binary and fulltext as separate features.
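The difference between fulltext and binary+fulltext can be sketched with a toy inverted index. This is not Lucene and not the real TextIndex; the class `TinyTextIndex`, the extractor registry keyed by MIME type and the linear scan over tokens are all simplifying assumptions.

```python
import re
from collections import defaultdict


class TinyTextIndex:
    """Toy inverted index: token -> set of URIs. Not Lucene."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._extractors = {}  # MIME type -> function(bytes) -> str

    def register_extractor(self, mime_type, extractor):
        """Register a text extractor for one binary format (hypothetical hook)."""
        self._extractors[mime_type] = extractor

    def index_text(self, uri, text):
        """fulltext: index the words of a plain string under `uri`."""
        for token in re.findall(r"\w+", text.lower()):
            self._postings[token].add(uri)

    def index_binary(self, uri, data, mime_type):
        """binary+fulltext: only possible if an extractor knows the format."""
        extractor = self._extractors.get(mime_type)
        if extractor is None:
            return False
        self.index_text(uri, extractor(data))
        return True

    def search(self, fragment):
        """Return URIs that have at least one token containing `fragment`."""
        fragment = fragment.lower()
        hits = set()
        # Linear scan over all tokens: fine for a sketch, too slow for real use.
        for token, uris in self._postings.items():
            if fragment in token:
                hits |= uris
        return hits
```

A binary without a registered extractor is still stored and retrievable (binary), it just never shows up in search results, which is exactly why binary+fulltext is a requirement of its own.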
RDF/SPARQL
RDF can handle triples and short (optionally typed) strings, and allows powerful queries (SPARQL). We call the query feature sparql. See RdfModel.
We need a way to add metadata to binaries, so we need sparql+binary. In practice, this means we need to assign URIs to stored binaries.
For RDF we also need sparql+fulltext (fulltext search in all literals) and sparql+binary+fulltext (fulltext search in the binaries AND the literals).
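As a rough illustration of sparql+binary and sparql+fulltext, the sketch below stores triples in memory, matches single triple patterns (the basic building block of a SPARQL query, not SPARQL itself) and searches inside literals. `Literal`, `TinyTripleStore` and the `urn:example:` URIs are hypothetical names; URIs are represented as bare strings for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A plain string value, as opposed to a URI (a bare str in this sketch)."""
    value: str


class TinyTripleStore:
    """In-memory triples with wildcard pattern matching (None = wildcard)."""

    def __init__(self):
        self._triples = {}  # dict used as an insertion-ordered set

    def add(self, subject, predicate, obj):
        self._triples[(subject, predicate, obj)] = True

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples that agree with every non-None position."""
        pattern = (subject, predicate, obj)
        return [
            triple for triple in self._triples
            if all(p is None or p == v for p, v in zip(pattern, triple))
        ]

    def subjects_with_literal_containing(self, fragment):
        """Stand-in for sparql+fulltext: search inside literal objects only."""
        fragment = fragment.lower()
        return [
            s for (s, _p, o) in self._triples
            if isinstance(o, Literal) and fragment in o.value.lower()
        ]
```

Metadata for a stored binary is then just a set of triples whose subject is the binary's URI, for example `store.add("urn:example:report.pdf", "urn:example:title", Literal("Quarterly Report"))`.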
XHTML
XHTML is a very widely deployed content format for structured content. It can represent text documents, tables and multimedia content. So we define XHTML as our basic content unit. We also need to
- annotate individual elements of an XHTML snippet (xhtmlelement)
- link XHTML files to RDF URIs (easy)
- version the XHTML snippets (can be done with the binary+versioning functionality)
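One possible way to make individual XHTML elements annotatable (xhtmlelement) is to give each element with an `id` attribute its own URI, formed from the document URI plus a fragment identifier. This is only a sketch under that assumption; the function names and the `urn:example:` URIs are made up, and elements without an `id` are simply not addressable here.

```python
import xml.etree.ElementTree as ET


def element_uris(document_uri, xhtml_source):
    """Map `<document_uri>#<id>` to each element that carries an `id` attribute."""
    root = ET.fromstring(xhtml_source)
    uris = {}
    for element in root.iter():
        element_id = element.get("id")
        if element_id is not None:
            uris[document_uri + "#" + element_id] = element
    return uris


def text_of(element):
    """Return the concatenated text content of an element and its children."""
    return "".join(element.itertext()).strip()


snippet = (
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    '<p id="intro">Hello <em>wiki</em></p>'
    '<p>No id here</p>'
    '<p id="outro">Bye</p>'
    '</div>'
)
uris = element_uris("urn:example:page1", snippet)

# An annotation is then an ordinary statement about the element's URI.
annotation = ("urn:example:page1#intro", "urn:example:comment", "Needs a source")
```

With element URIs in place, annotating part of the content needs nothing beyond the RDF layer, since a fragment URI can be the subject of a statement like any other URI.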
Orthogonal Features
Versioning
Versioning is the ability to turn back time and get back to older states. We call this feature versioning.
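A minimal sketch of per-resource versioning, assuming every commit simply appends a full copy of the content to that resource's history. The class `VersionedStore` and its methods are hypothetical; versioning the RDF model as a whole would need a different scheme.

```python
class VersionedStore:
    """Keeps every state of each resource; versions are numbered from 1."""

    def __init__(self):
        self._history = {}  # uri -> list of contents, oldest first

    def commit(self, uri, content):
        """Append a new state and return its version number."""
        versions = self._history.setdefault(uri, [])
        versions.append(content)
        return len(versions)

    def get(self, uri, version=None):
        """Return the latest state, or the state at the given version."""
        versions = self._history[uri]
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{uri} has no version {version}")
        return versions[version - 1]

    def revert(self, uri, version):
        """Turn back time by committing an old state as the newest version."""
        return self.commit(uri, self.get(uri, version))
```

Reverting by adding a new version, rather than deleting newer ones, keeps the history complete, so a revert can itself be undone.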
Access Rights
Not everybody should be allowed to see or change everything. Access therefore has to be restricted; we call this feature accesscontrol.
Concurrency
The system should also work in a shared-use scenario, so we need the ability to work collaboratively on the same repository. This can be done simplistically with methods like startEditingResourceX, cancelEdit and commit. We call this feature locking.
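The simplistic locking scheme could look roughly like the sketch below. The method names follow startEditingResourceX, cancelEdit and commit from the text, adapted to Python naming; everything else (the class `LockingRepository`, the `LockError` exception, identifying users by a plain string) is an assumption. The sketch is not thread-safe and has no lock timeouts.

```python
class LockError(Exception):
    """Raised when a user acts on a resource without holding its lock."""


class LockingRepository:
    """One exclusive edit lock per resource; not thread-safe."""

    def __init__(self):
        self._content = {}  # uri -> committed content
        self._locks = {}    # uri -> user currently editing

    def start_editing(self, uri, user):
        """Take the lock and return the current content (None if new)."""
        holder = self._locks.get(uri)
        if holder is not None and holder != user:
            raise LockError(f"{uri} is being edited by {holder}")
        self._locks[uri] = user
        return self._content.get(uri)

    def cancel_edit(self, uri, user):
        """Release the lock without changing the content."""
        self._require_lock(uri, user)
        del self._locks[uri]

    def commit(self, uri, user, new_content):
        """Store the new content and release the lock."""
        self._require_lock(uri, user)
        self._content[uri] = new_content
        del self._locks[uri]

    def read(self, uri):
        """Reading never needs a lock."""
        return self._content.get(uri)

    def _require_lock(self, uri, user):
        if self._locks.get(uri) != user:
            raise LockError(f"{user} does not hold the lock on {uri}")
```

While one user holds the lock, a second call to `start_editing` for the same resource by another user raises `LockError`; readers are never blocked.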
API
We need both a Java API and a web API.
- Java API: we call this javaapi.
- Web (REST) API: we call this restapi.
Also, we need the ability to deploy the system on a local desktop, which we call desktop.
- derived requirement: small memory footprint
- derived requirement: acceptable performance for everyday tasks
Summary
In swecr, we need
- SPARQL: sparql
- URIs for binaries: this is so easy to achieve that we don't give it a name
- full-text search over all literals and binaries, integrated with SPARQL: sparql+fulltext+binary
- versioning of binaries: binary+versioning
- versioning of RDF: sparql+versioning
- access rights for RDF and binaries: accesscontrol
- Java and REST API: javaapi, restapi
- desktop
- xhtmlelement
- locking
