DesignNotes  
initial design notes on the tabular API
Updated Dec 22, 2010 by alimanfoo@googlemail.com

Non-Functional Requirements

  • provenance & traceability - for every record, it should always be possible to unambiguously describe the provenance of that record, i.e., its complete derivation graph and the sequence of operations that have been applied

  • incrementally improving data quality - bad data is ok to begin with. data with missing values, invalid datatypes or broken constraints needs to be accommodated. the design needs to support an incremental approach to improving data quality.

Functional Requirements

Tables

REQUIREMENT: create a new table

given a tables collection IRI, e.g., http://example.org/tabular/tables

POST a table specification document to the tables collection IRI

a table specification document is an atom entry document with inline content type "application/x.tabular+xml;type=tablespec" e.g.

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:title>[table name]</atom:title>
    <atom:content type="application/x.tabular+xml;type=tablespec">
        <tbl:tablespec xmlns:tbl="http://purl.org/net/tabular/xmlns">
            <tbl:label>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized table label]</tbl:localization>
            </tbl:label>
            <tbl:description>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized table description]</tbl:localization>
            </tbl:description>
            <tbl:configuration>
                <tbl:property>
                    <tbl:key>foo</tbl:key>
                    <tbl:value>bar</tbl:value>
                </tbl:property>
            </tbl:configuration>
        </tbl:tablespec>
    </atom:content>
</atom:entry>
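
A minimal sketch of how a client might assemble the table specification document above before POSTing it, using only the Python standard library. The helper name build_tablespec_entry and its parameters are hypothetical; the element names, namespaces and content type are copied from the example.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
TBL = "http://purl.org/net/tabular/xmlns"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
TABLESPEC_TYPE = "application/x.tabular+xml;type=tablespec"

# Use the same prefixes as the examples in these notes.
ET.register_namespace("atom", ATOM)
ET.register_namespace("tbl", TBL)


def build_tablespec_entry(name, label, description, lang="en", properties=None):
    """Return a table specification document as UTF-8 bytes (hypothetical helper)."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = name
    content = ET.SubElement(entry, f"{{{ATOM}}}content", {"type": TABLESPEC_TYPE})
    spec = ET.SubElement(content, f"{{{TBL}}}tablespec")
    # One plain-text localization each for the label and the description.
    for tag, text in (("label", label), ("description", description)):
        parent = ET.SubElement(spec, f"{{{TBL}}}{tag}")
        loc = ET.SubElement(parent, f"{{{TBL}}}localization", {XML_LANG: lang, "type": "text"})
        loc.text = text
    config = ET.SubElement(spec, f"{{{TBL}}}configuration")
    for key, value in (properties or {}).items():
        prop = ET.SubElement(config, f"{{{TBL}}}property")
        ET.SubElement(prop, f"{{{TBL}}}key").text = key
        ET.SubElement(prop, f"{{{TBL}}}value").text = value
    return ET.tostring(entry, encoding="utf-8")


body = build_tablespec_entry(
    "people", "People", "One record per person", properties={"foo": "bar"}
)
```

The resulting bytes would be sent as the body of the POST to the tables collection IRI.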

TODO standard configuration properties, e.g., ...

  • mutable-fields - can fields be added, removed or modified following table creation? (implies way of specifying initial fields if not)
  • mutable-records - can records be added, removed or modified following table creation? (implies way of specifying initial records if not)
  • strict-datatype-validation - should new records be rejected if they do not conform to datatype constraints specified in the fields? or should the data be accepted and stored regardless?
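
To make the strict-datatype-validation question concrete, here is a sketch of the two behaviours, assuming the only constraint checked is a maximum length. The function names and the shape of the return value are assumptions for illustration, not part of the design.

```python
def check_value(value, max_length):
    """Return a list of problems for one field value (empty if it conforms)."""
    problems = []
    if max_length is not None and len(value) > max_length:
        problems.append(f"longer than maxlength {max_length}")
    return problems


def accept_record(record, max_lengths, strict):
    """Decide whether to store a record.

    Returns (accepted, problems). In strict mode any problem rejects the
    record; otherwise the record is stored anyway and the problems are
    reported so data quality can be improved later.
    """
    problems = {}
    for field, value in record.items():
        found = check_value(value, max_lengths.get(field))
        if found:
            problems[field] = found
    accepted = not (strict and problems)
    return accepted, problems


record = {"givenname": "Alistair", "gender": "unknown"}
limits = {"gender": 1}
strict_result = accept_record(record, limits, strict=True)
lenient_result = accept_record(record, limits, strict=False)
```

With strict validation off, the record is accepted and the list of problems is what would feed the "list invalid records" requirement further down.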

successful response is 201 Created

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:id>[table id]</atom:id>
    <atom:published>[creation date]</atom:published>
    <atom:updated>[date last updated]</atom:updated>
    <atom:author>[identity of person who created the table]</atom:author>
    <atom:link rel="edit" href="http://example.org/tabular/tables/123"/>
    <atom:link rel="http://purl.org/net/tabular/rel/fields" href="http://example.org/tabular/tables/123/fields"/>
    <atom:link rel="http://purl.org/net/tabular/rel/records" href="http://example.org/tabular/tables/123/records"/>
    <atom:title>[table name]</atom:title>
    <atom:content type="application/x.tabular+xml;type=tablespec">
        <tbl:tablespec xmlns:tbl="http://purl.org/net/tabular/xmlns">
            <tbl:label>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized table label]</tbl:localization>
            </tbl:label>
            <tbl:description>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized table description]</tbl:localization>
            </tbl:description>
            <tbl:configuration>
                <tbl:property>
                    <tbl:key>foo</tbl:key>
                    <tbl:value>bar</tbl:value>
                </tbl:property>
            </tbl:configuration>
        </tbl:tablespec>
    </atom:content>
</atom:entry>

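notice two special links: the "http://purl.org/net/tabular/rel/fields" link, which gives the fields collection IRI, and the "http://purl.org/net/tabular/rel/records" link, which gives the records collection IRI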
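
A client has to pull these link relations out of the entry before it can add fields or records. The sketch below shows one way to do that with the standard library; find_link is a hypothetical helper and the response body is a trimmed copy of the example above.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
REL_FIELDS = "http://purl.org/net/tabular/rel/fields"
REL_RECORDS = "http://purl.org/net/tabular/rel/records"


def find_link(entry_xml, rel):
    """Return the href of the first atom:link with the given rel, or None."""
    entry = ET.fromstring(entry_xml)
    for link in entry.findall(f"{ATOM}link"):
        if link.get("rel") == rel:
            return link.get("href")
    return None


response_body = """<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:link rel="edit" href="http://example.org/tabular/tables/123"/>
    <atom:link rel="http://purl.org/net/tabular/rel/fields" href="http://example.org/tabular/tables/123/fields"/>
    <atom:link rel="http://purl.org/net/tabular/rel/records" href="http://example.org/tabular/tables/123/records"/>
</atom:entry>"""

fields_iri = find_link(response_body, REL_FIELDS)
records_iri = find_link(response_body, REL_RECORDS)
```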

REQUIREMENT: add a field to a table

given a table IRI, make a GET to retrieve the table specification document

find the fields collection IRI via the "http://purl.org/net/tabular/rel/fields" link

POST a field specification document to the fields collection IRI

a field specification document is an atom entry document with inline content type "application/x.tabular+xml;type=fieldspec" e.g.

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:title>[field name]</atom:title>
    <atom:content type="application/x.tabular+xml;type=fieldspec">
        <tbl:fieldspec xmlns:tbl="http://purl.org/net/tabular/xmlns">
            <tbl:label>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized field label]</tbl:localization>
            </tbl:label>
            <tbl:description>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized field description]</tbl:localization>
            </tbl:description>
            <tbl:datatype base="[XML schema built-in datatype]">
                <!-- optional restrictions depending on facets of base datatype, e.g., ... -->
                <tbl:maxlength>12</tbl:maxlength>
            </tbl:datatype>
            <tbl:default>[default field value]</tbl:default>
        </tbl:fieldspec>
    </atom:content>
</atom:entry>

N.B. the given datatype represents an aspiration for the field, and (depending on configuration?) the service is expected to permit storage of data in that field that does not conform to the lexical space of the referenced datatype

TODO atomic datatypes only? what about list and union datatypes?

TODO what about referencing datatypes defined in a data dictionary? or would you never define datatypes independently of a field?

TODO what about not null constraint?

TODO what about unique constraint?

successful response is 201 Created

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:id>[field id]</atom:id>
    <atom:published>[creation date]</atom:published>
    <atom:updated>[date last updated]</atom:updated>
    <atom:author>[identity of person who created the field]</atom:author>
    <atom:link rel="edit" href="http://example.org/tabular/tables/123/fields/456"/>
    <atom:link rel="http://purl.org/net/tabular/rel/owner" href="http://example.org/tabular/tables/123"/>
    <atom:title>[field name]</atom:title>
    <atom:content type="application/x.tabular+xml;type=fieldspec">
        <tbl:fieldspec xmlns:tbl="http://purl.org/net/tabular/xmlns">
            <tbl:label>
                <tbl:localization xml:lang="[lang tag]">[localized field label]</tbl:localization>
            </tbl:label>
            <tbl:description>
                <tbl:localization xml:lang="[lang tag]">[localized field description]</tbl:localization>
            </tbl:description>
            <tbl:datatype base="[XML schema built-in datatype]">
                <!-- optional restrictions depending on facets of base datatype -->
                <tbl:maxlength>12</tbl:maxlength>
            </tbl:datatype>
        </tbl:fieldspec>
    </atom:content>
</atom:entry>

TODO what about namespaces? i.e., do you want people to be able to specify a namespace for the field when represented in a tabular record document? probably not, makes life complicated

REQUIREMENT: list all fields specified for a table

given a table IRI, make a GET to retrieve the table specification document

find the fields collection IRI via the "http://purl.org/net/tabular/rel/fields" link

make a GET request to the fields collection IRI

successful response is 200 OK with an atom feed document where all entries have inline content type "application/x.tabular+xml;type=fieldspec" e.g.

<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:id>[fields collection id]</atom:id>
    <atom:title>All fields for table [table name]</atom:title>
    <atom:updated>[date last updated]</atom:updated>
    <atom:link rel="self" href="http://example.org/tabular/tables/123/fields"/>
    <atom:link rel="http://purl.org/net/tabular/rel/owner" href="http://example.org/tabular/tables/123"/>
    <app:collection xmlns:app="http://www.w3.org/2007/app" href="http://example.org/tabular/tables/123/fields">
        <app:accept>application/atom+xml;type=entry</app:accept>
        <appx:acceptinline xmlns:appx="http://purl.org/net/appx/xmlns">application/x.tabular+xml;type=fieldspec</appx:acceptinline>
    </app:collection>
    <atom:entry>
        <atom:id>[field id]</atom:id>
        <atom:published>[creation date]</atom:published>
        <atom:updated>[date last updated]</atom:updated>
        <atom:author>[identity of person who created the field]</atom:author>
        <atom:link rel="edit" href="http://example.org/tabular/tables/123/fields/456"/>
        <atom:link rel="http://purl.org/net/tabular/rel/owner" href="http://example.org/tabular/tables/123"/>
        <atom:title>[field name]</atom:title>
        <atom:content type="application/x.tabular+xml;type=fieldspec">
            <tbl:fieldspec xmlns:tbl="http://purl.org/net/tabular/xmlns">
                <tbl:label>
                    <tbl:localization xml:lang="[lang tag]">[localized field label]</tbl:localization>
                </tbl:label>
                <tbl:description>
                    <tbl:localization xml:lang="[lang tag]">[localized field description]</tbl:localization>
                </tbl:description>
                <tbl:datatype base="[XML schema built-in datatype]">
                    <!-- optional restrictions depending on facets of base datatype -->
                    <tbl:maxlength>12</tbl:maxlength>
                </tbl:datatype>
            </tbl:fieldspec>
        </atom:content>
    </atom:entry>
    <atom:entry>
        <atom:id>[field id]</atom:id>
        <atom:published>[creation date]</atom:published>
        <atom:updated>[date last updated]</atom:updated>
        <atom:author>[identity of person who created the field]</atom:author>
        <atom:link rel="edit" href="http://example.org/tabular/tables/123/fields/457"/>
        <atom:link rel="http://purl.org/net/tabular/rel/owner" href="http://example.org/tabular/tables/123"/>
        <atom:title>[field name]</atom:title>
        <atom:content type="application/x.tabular+xml;type=fieldspec">
            <tbl:fieldspec xmlns:tbl="http://purl.org/net/tabular/xmlns">
                <tbl:label>
                    <tbl:localization xml:lang="[lang tag]">[localized field label]</tbl:localization>
                </tbl:label>
                <tbl:description>
                    <tbl:localization xml:lang="[lang tag]">[localized field description]</tbl:localization>
                </tbl:description>
                <tbl:datatype base="[XML schema built-in datatype]">
                    <!-- optional restrictions depending on facets of base datatype -->
                    <tbl:maxlength>12</tbl:maxlength>
                </tbl:datatype>
            </tbl:fieldspec>
        </atom:content>
    </atom:entry>
</atom:feed>

N.B. fields must be presented in the order in which values are expected in records, usually in order of ascending creation date (most recent last)
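
Because the feed order carries meaning, a client should read the entries in document order rather than re-sorting them. A small sketch, assuming the field name is carried in atom:title as in the examples (the function name and the sample feed are illustrative only):

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def field_names_in_order(feed_xml):
    """Return field names in feed order, i.e. the order values are expected in records."""
    feed = ET.fromstring(feed_xml)
    return [entry.findtext(f"{ATOM}title") for entry in feed.findall(f"{ATOM}entry")]


feed_xml = """<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:title>All fields for table person</atom:title>
    <atom:entry>
        <atom:published>2010-12-01T09:00:00Z</atom:published>
        <atom:title>givenname</atom:title>
    </atom:entry>
    <atom:entry>
        <atom:published>2010-12-01T09:05:00Z</atom:published>
        <atom:title>familyname</atom:title>
    </atom:entry>
    <atom:entry>
        <atom:published>2010-12-02T14:30:00Z</atom:published>
        <atom:title>gender</atom:title>
    </atom:entry>
</atom:feed>"""

names = field_names_in_order(feed_xml)
```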

REQUIREMENT: add a record to a table

given a table IRI, make a GET to retrieve the table specification document

find the records collection IRI via the "http://purl.org/net/tabular/rel/records" link

POST a record document to the records collection IRI

a record document is an atom entry document with inline content type "application/x.tabular+xml;type=record" e.g.

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:content type="application/x.tabular+xml;type=record">
        <[table name] xmlns="">
            <[field name]>[field value]</[field name]>
            <[field name]>[field value]</[field name]>
            <[field name]>[field value]</[field name]>
            <!-- ... -->
        </[table name]>
    </atom:content>
</atom:entry>

e.g.

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:content type="application/x.tabular+xml;type=record">
        <person xmlns="">
            <givenname>Alistair</givenname>
            <familyname>Miles</familyname>
            <gender>M</gender>
        </person>
    </atom:content>
</atom:entry>
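
The record document above could be generated from a list of field/value pairs along these lines. This is a sketch only: build_record_entry is a hypothetical helper, and it assumes table and field names are valid XML element names.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
RECORD_TYPE = "application/x.tabular+xml;type=record"

ET.register_namespace("atom", ATOM)


def build_record_entry(table_name, values):
    """Return a record document as UTF-8 bytes.

    values is a list of (field name, field value) pairs, in field order.
    """
    entry = ET.Element(f"{{{ATOM}}}entry")
    content = ET.SubElement(entry, f"{{{ATOM}}}content", {"type": RECORD_TYPE})
    # Record elements are in no namespace. Since the atom elements use a
    # prefix, this is equivalent to the xmlns="" in the example even though
    # the serializer does not write that attribute out.
    record = ET.SubElement(content, table_name)
    for field_name, value in values:
        ET.SubElement(record, field_name).text = value
    return ET.tostring(entry, encoding="utf-8")


body = build_record_entry(
    "person",
    [("givenname", "Alistair"), ("familyname", "Miles"), ("gender", "M")],
)
```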

successful response is 201 Created

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:id>[record id]</atom:id>
    <atom:published>[creation date]</atom:published>
    <atom:updated>[date last updated]</atom:updated>
    <atom:author>[identity of person who created the record]</atom:author>
    <atom:link rel="edit" href="http://example.org/tabular/tables/123/records/456"/>
    <atom:link rel="http://purl.org/net/tabular/rel/owner" href="http://example.org/tabular/tables/123"/>
    <atom:title/>
    <atom:content type="application/x.tabular+xml;type=record">
        <person xmlns="">
            <givenname>Alistair</givenname>
            <familyname>Miles</familyname>
            <gender>M</gender>
        </person>
    </atom:content>
</atom:entry>

TODO rules about missing fields, accept but get populated with defaults? configuration property here on the table controlling behaviour?

TODO rules about unrecognised fields, get auto-pruned? probably, otherwise harder to implement if repository is expected to store them as well as defined fields
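
One possible answer to the two TODOs above, sketched as a single normalisation step: unrecognised fields are pruned and missing fields are populated with the field default. This is an assumption about a behaviour that is still undecided, not a settled rule, and normalise_record is a hypothetical name.

```python
def normalise_record(record, field_defaults):
    """Prune unknown fields and fill missing ones with their defaults.

    field_defaults maps each defined field name to its default value
    (None when the field has no default). The result keeps the field order.
    """
    return {name: record.get(name, default) for name, default in field_defaults.items()}


defined_fields = {"givenname": None, "familyname": None, "gender": "U"}
submitted = {"givenname": "Alistair", "familyname": "Miles", "shoesize": "9"}
stored = normalise_record(submitted, defined_fields)
```

Here "shoesize" is dropped because no such field is defined, and "gender" is stored with its default.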

REQUIREMENT: list records in a table

TODO

REQUIREMENT: page through records in a table

TODO

REQUIREMENT: update a record

TODO

REQUIREMENT: delete a record

TODO

REQUIREMENT: delete a field from a table

TODO

REQUIREMENT: update a field in a table

TODO

REQUIREMENT: insert a field in a table at a given position (optional)

TODO

REQUIREMENT: change the position of a field in a table (optional)

TODO

REQUIREMENT: add a record to a table via an HTML form

TODO

REQUIREMENT: list revisions of a record

TODO

REQUIREMENT: snapshot a table

TODO want an immutable snapshot but still want to be able to page through records

REQUIREMENT: add multiple records to a table in a single request

TODO

REQUIREMENT: express multi-field constraints on a table

TODO express constraints involving multiple fields of a table

REQUIREMENT: list invalid records

TODO

REQUIREMENT: list valid records

TODO

Queries

REQUIREMENT: list records in a table matching a simple query

TODO

GET /tabular/tables/123/records/query?[field name]=[value]&[field name]=[value]&... HTTP/1.1
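
A sketch of how a client might build that request URL. It assumes the query endpoint is the records collection IRI with "/query" appended, as in the request line above; the next TODO notes that a link relation may replace this convention.

```python
from urllib.parse import urlencode


def simple_query_url(records_iri, criteria):
    """Build a simple query URL from field name/value pairs (hypothetical helper)."""
    return f"{records_iri}/query?{urlencode(criteria)}"


url = simple_query_url(
    "http://example.org/tabular/tables/123/records",
    {"familyname": "Miles", "gender": "M"},
)
```

urlencode takes care of escaping, so values containing spaces or other reserved characters are safe to pass in.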

TODO link relation to find the query endpoint?

REQUIREMENT: list records in a table matching a complex query

TODO

Data Dictionaries

REQUIREMENT: create a stand-alone data dictionary

TODO what is a data dictionary exactly? collection of fieldsets each of which has a collection of fieldspecs? or just a collection of fieldspecs?

REQUIREMENT: add a fieldset to a dictionary

TODO

REQUIREMENT: add a field to a fieldset

TODO

REQUIREMENT: create a table with fields referencing entries in a stand-alone data dictionary

TODO

Views

REQUIREMENT: create a transformation on a table (view)

given a views collection IRI, e.g., http://example.org/tabular/views

POST a view specification document to the views collection IRI

a view specification document is an atom entry document with inline content type "application/x.tabular+xml;type=viewspec" e.g.

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:title>[view name]</atom:title>
    <atom:link rel="http://purl.org/net/tabular/rel/source" href="http://example.org/tabular/tables/123"/>
    <atom:content type="application/x.tabular+xml;type=viewspec">
        <tbl:viewspec xmlns:tbl="http://purl.org/net/tabular/xmlns">
            <tbl:label>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized view label]</tbl:localization>
            </tbl:label>
            <tbl:description>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized view description]</tbl:localization>
            </tbl:description>
        </tbl:viewspec>
    </atom:content>
</atom:entry>

successful response is 201 Created

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:id>[view id]</atom:id>
    <atom:published>[creation date]</atom:published>
    <atom:updated>[date last updated]</atom:updated>
    <atom:author>[identity of person who created the view]</atom:author>
    <atom:link rel="edit" href="http://example.org/tabular/views/789"/>
    <atom:link rel="http://purl.org/net/tabular/rel/fields" href="http://example.org/tabular/views/789/fields"/>
    <atom:link rel="http://purl.org/net/tabular/rel/records" href="http://example.org/tabular/views/789/records"/>
    <atom:title>[view name]</atom:title>
    <atom:link rel="http://purl.org/net/tabular/rel/source" href="http://example.org/tabular/tables/123" title="source"/>
    <atom:content type="application/x.tabular+xml;type=viewspec">
        <tbl:viewspec xmlns:tbl="http://purl.org/net/tabular/xmlns">
            <tbl:label>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized view label]</tbl:localization>
            </tbl:label>
            <tbl:description>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized view description]</tbl:localization>
            </tbl:description>
        </tbl:viewspec>
    </atom:content>
</atom:entry>

REQUIREMENT: add a field to a view

given a view IRI, make a GET to retrieve the view specification document

find the fields collection IRI via the "http://purl.org/net/tabular/rel/fields" link

POST a field specification document to the fields collection IRI, e.g.

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:title>[field name]</atom:title>
    <atom:content type="application/x.tabular+xml;type=fieldspec">
        <tbl:fieldspec xmlns:tbl="http://purl.org/net/tabular/xmlns">
            <tbl:label>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized field label]</tbl:localization>
            </tbl:label>
            <tbl:description>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized field description]</tbl:localization>
            </tbl:description>
            <tbl:datatype base="[XML schema built-in datatype]">
                <!-- optional restrictions depending on facets of base datatype, e.g., ... -->
                <tbl:maxlength>12</tbl:maxlength>
            </tbl:datatype>
            <tbl:default>[default field value]</tbl:default>
            <tbl:constructor>
                <!-- field is copy of source field -->
                <tbl:equals>[source title].[field name]</tbl:equals>
            </tbl:constructor>
        </tbl:fieldspec>
    </atom:content>
</atom:entry>

successful response is 201 Created

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:id>[field id]</atom:id>
    <atom:published>[creation date]</atom:published>
    <atom:updated>[date last updated]</atom:updated>
    <atom:author>[identity of person who created the field]</atom:author>
    <atom:link rel="edit" href="http://example.org/tabular/views/789/fields/012"/>
    <atom:link rel="http://purl.org/net/tabular/rel/owner" href="http://example.org/tabular/views/789"/>
    <atom:title>[field name]</atom:title>
    <atom:content type="application/x.tabular+xml;type=fieldspec">
        <tbl:fieldspec xmlns:tbl="http://purl.org/net/tabular/xmlns">
            <tbl:label>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized field label]</tbl:localization>
            </tbl:label>
            <tbl:description>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized field description]</tbl:localization>
            </tbl:description>
            <tbl:datatype base="[XML schema built-in datatype]">
                <!-- optional restrictions depending on facets of base datatype, e.g., ... -->
                <tbl:maxlength>12</tbl:maxlength>
            </tbl:datatype>
            <tbl:default>[default field value]</tbl:default>
            <tbl:constructor>
                <!-- field is copy of source field -->
                <tbl:equals>[source title].[field name]</tbl:equals>
            </tbl:constructor>
        </tbl:fieldspec>
    </atom:content>
</atom:entry>

TODO other types of field constructor, e.g., ...

  • mathematical expression on single source column
  • mathematical expression on two or more source columns

probably need to find lowest common denominator between sqlite view support and xpath, e.g.,

<tbl:constructor>
    <tbl:expression>concat(source.givenname, ' ', source.familyname)</tbl:expression>
</tbl:constructor>

...bearing in mind that this would need to be translated into both xpath and sqlite sql in the underlying implementation.

maybe identify a subset of xpath that could be mapped to sqlite sql?
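
As a rough illustration of such a mapping, the sketch below renders a single concat constructor in both targets: SQLite has the || operator, XPath has the concat() function. The argument representation and function names are invented for this example, and field names are assumed to be plain identifiers.

```python
import sqlite3


def sqlite_literal(text):
    """Quote a string literal for SQLite, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def xpath_literal(text):
    """Quote a string literal for XPath; this sketch does not handle quotes."""
    if "'" in text:
        raise ValueError("sketch does not handle quotes in XPath literals")
    return "'" + text + "'"


def concat_to_sqlite(args):
    """args is a list of ('field', name) or ('literal', text) pairs."""
    parts = [value if kind == "field" else sqlite_literal(value) for kind, value in args]
    return " || ".join(parts)


def concat_to_xpath(args):
    parts = [value if kind == "field" else xpath_literal(value) for kind, value in args]
    return "concat(" + ", ".join(parts) + ")"


args = [("field", "givenname"), ("literal", " "), ("field", "familyname")]
sql_expr = concat_to_sqlite(args)
xpath_expr = concat_to_xpath(args)

# Check the SQLite rendering against an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (givenname TEXT, familyname TEXT)")
conn.execute("INSERT INTO person VALUES ('Alistair', 'Miles')")
row = conn.execute(f"SELECT {sql_expr} FROM person").fetchone()
```

A real implementation would need a parser for the expression language and a whitelist of functions available in both targets; this only shows that the two renderings can be generated from one structure.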

REQUIREMENT: list fields of a view

TODO

REQUIREMENT: list records from a view

TODO as for a table

N.B., the underlying implementation may evaluate the records each time the records are retrieved. also, N.B., if a table is the view source, then the records in the view will change if the table changes

TODO can a view reference a table snapshot? yes, should be able to, would be required for unambiguous derivation.

REQUIREMENT: create a new table from a view

the requirement here is to be able to persist the data generated from a view, either as an immutable table to use as input to further transformations, or to start editing.

this is a bit like "materialising" a view (as a table).

given a tables collection IRI, e.g., http://example.org/tabular/tables

POST a table specification document to the tables collection IRI

a table specification document is an atom entry document with inline content type "application/x.tabular+xml;type=tablespec"

include a "http://purl.org/net/tabular/rel/source" link referring to the view from which to load the data, e.g.

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:title>[table name]</atom:title>
    <atom:link rel="http://purl.org/net/tabular/rel/source" href="http://example.org/tabular/views/789"/>
    <atom:content type="application/x.tabular+xml;type=tablespec">
        <tbl:tablespec xmlns:tbl="http://purl.org/net/tabular/xmlns">
            <tbl:label>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized table label]</tbl:localization>
            </tbl:label>
            <tbl:description>
                <tbl:localization xml:lang="[lang tag]" type="[text|html|xhtml]">[localized table description]</tbl:localization>
            </tbl:description>
            <tbl:configuration>
                <tbl:property>
                    <tbl:key>foo</tbl:key>
                    <tbl:value>bar</tbl:value>
                </tbl:property>
            </tbl:configuration>
        </tbl:tablespec>
    </atom:content>
</atom:entry>

N.B., after creation, the "source" link would have to be made immutable otherwise a client could destroy the provenance information

N.B., an implementation would have to initialise the fields based on fields found in the source, then load data from the source, as part of the creation process.
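
If the underlying implementation were SQLite (an assumption, suggested by the mention of sqlite view support earlier), the creation process could look roughly like this. The table and view names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (givenname TEXT, familyname TEXT)")
conn.execute("INSERT INTO person VALUES ('Alistair', 'Miles')")

# A live view: evaluated each time its records are retrieved.
conn.execute(
    "CREATE VIEW fullname AS "
    "SELECT givenname || ' ' || familyname AS name FROM person"
)

# Materialise the view: the new table's fields are initialised from the
# view's columns and the data is loaded in the same statement.
conn.execute("CREATE TABLE fullname_table AS SELECT * FROM fullname")

# A later change to the source table shows up in the view but not in the
# materialised table.
conn.execute("INSERT INTO person VALUES ('Ada', 'Lovelace')")
view_rows = conn.execute("SELECT name FROM fullname").fetchall()
table_rows = conn.execute("SELECT name FROM fullname_table").fetchall()
columns = [info[1] for info in conn.execute("PRAGMA table_info(fullname_table)")]
```

After the second insert the view returns two records while the materialised table still holds one, which is the property needed for the table to serve as a stable input to further transformations.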

successful response is 201 Created

TODO e.g. response

TODO N.B. there is a problem here, because the view could be modified after this table was created, destroying provenance information. how to fix the provenance? require views to be locked? snapshot views? snapshot view automatically as part of table creation process?

TODO maybe differentiate live views from stable views? i.e., live view is sourced directly from a mutable table, and the view may also be changed, whereas a stable view is always sourced from a table snapshot and cannot be modified once created?

REQUIREMENT: list only top 10 records from a view (preview)

TODO

REQUIREMENT: create a view joining records from two or more tables

TODO

REQUIREMENT: create a view melting records from a single table

TODO melt view, i.e., one-to-many rows

REQUIREMENT: create a view flattening records from a single table

TODO reduce rows, i.e., many-to-one rows, same as join with sources being the same?

REQUIREMENT: add records from a view to a table that already exists

TODO

Parsers

REQUIREMENT: create a parser to extract records from a resource

TODO

REQUIREMENT: list records from a parser

TODO

REQUIREMENT: list top 10 records from a parser (preview)

TODO

REQUIREMENT: create a new table from a parser

TODO

REQUIREMENT: add records from a parser to a table that already exists

TODO

Merge Views

REQUIREMENT: create a merge view of two or more tables

TODO would need to define the join fields up front?

REQUIREMENT: add a join field to a merge view

TODO

REQUIREMENT: add a copy field to a merge view

TODO

REQUIREMENT: add a merge field to a merge view

TODO

REQUIREMENT: list records in a merge view

TODO

REQUIREMENT: list first 10 records in a merge view (preview)

TODO

REQUIREMENT: delete a field from a merge view

TODO

REQUIREMENT: update a field in a merge view

TODO

Merge Tables

REQUIREMENT: create a merge table from a merge view

TODO

REQUIREMENT: list records in a merge table

TODO

REQUIREMENT: update a record in a merge table

TODO

REQUIREMENT: update many records in a merge table in a single request

TODO

REQUIREMENT: create a table from a merge table (with no conflicts)

TODO

Working Sets

A working set is a convenient grouping of tables, views, parsers, merge views and merge tables. Any of these data resources can appear in zero or more working sets. Working sets are purely for the convenience of users, to provide ways of logically grouping resources that are being worked on together.
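
As a data structure this is a many-to-many grouping. An in-memory sketch, with hypothetical class and method names, where resources are identified by IRI:

```python
from collections import defaultdict


class WorkingSets:
    """Each working set is a named set of resource IRIs."""

    def __init__(self):
        self._members = defaultdict(set)

    def add(self, working_set, resource_iri):
        self._members[working_set].add(resource_iri)

    def remove(self, working_set, resource_iri):
        # discard() is a no-op if the resource is not in the set.
        self._members[working_set].discard(resource_iri)

    def sets_containing(self, resource_iri):
        """Return the names of all working sets that include the resource."""
        return sorted(
            name for name, members in self._members.items() if resource_iri in members
        )


ws = WorkingSets()
ws.add("cleaning", "http://example.org/tabular/tables/123")
ws.add("cleaning", "http://example.org/tabular/views/789")
ws.add("reporting", "http://example.org/tabular/tables/123")
```

Removing a resource from a working set leaves the resource itself untouched; only the grouping changes.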

REQUIREMENT: create a working set

TODO

REQUIREMENT: add a resource to a working set

TODO

REQUIREMENT: remove a resource from a working set

TODO

Service

REQUIREMENT: discover a tables collection

TODO

REQUIREMENT: discover a views collection

TODO
