Issue 45: Clarify use of void:vocabulary
Status:  Fixed
Owner:  richard....@gmail.com
Closed:  Oct 2010


Project Member Reported by richard....@gmail.com, Nov 10, 2009
Which URI exactly do I use for any given vocabulary?

We currently say it is the one that's the object of rdfs:isDefinedBy triples for the vocab terms, but
rdfs:isDefinedBy is often not used in real-world vocabularies.

Should we say "downloadable location"? Should we say "namespace URI"? What about trailing
hashes: leave them or remove them?

I would prefer having some really clear guidance in the Guide.
Nov 10, 2009
Project Member #1 K.J.W.Al...@gmail.com
This is a good question.

An advantage of specifying that the property should simply point to a "downloadable
location" is that data authors could specify a location where a particular version
can be downloaded (this was suggested in another issue on versioning). The dataset
author could add triples like:

:Dataset void:vocabulary <vocab-location-2007-08-09#> .
<vocab-location-2007-08-09#> vann:preferredNamespaceUri <http://xmlns.com/foaf/0.1/> .

The disadvantage of this is that it complicates discovering which datasets use
which vocabularies, as the actual vocabulary URI might be found in one of two places
in the graph pattern. You would need to do something like a UNION, with one graph
pattern using the OPTIONAL {} FILTER !bound pattern to exclude the other graph pattern.
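
To illustrate the "two places" lookup described above, here is a minimal Python sketch over plain (subject, predicate, object) tuples rather than a real SPARQL endpoint. The function name, the example.org URIs and the sample triples are all hypothetical; only the void: and vann: property URIs are real.

```python
VOID_VOCABULARY = "http://rdfs.org/ns/void#vocabulary"
VANN_PREFERRED_NS = "http://purl.org/vocab/vann/preferredNamespaceUri"


def vocabularies_used(triples):
    """Map each dataset to the vocabulary URIs it declares.

    If the target of void:vocabulary carries a vann:preferredNamespaceUri,
    that value is used; otherwise the target itself is taken as the
    vocabulary URI. This is the either/or lookup the UNION would express.
    """
    preferred = {s: o for s, p, o in triples if p == VANN_PREFERRED_NS}
    used = {}
    for s, p, o in triples:
        if p == VOID_VOCABULARY:
            used.setdefault(s, set()).add(preferred.get(o, o))
    return used


# Hypothetical sample data: one versioned location, one plain namespace URI.
triples = [
    ("http://example.org/dataset", VOID_VOCABULARY,
     "http://example.org/foaf-2007-08-09#"),
    ("http://example.org/foaf-2007-08-09#", VANN_PREFERRED_NS,
     "http://xmlns.com/foaf/0.1/"),
    ("http://example.org/dataset", VOID_VOCABULARY,
     "http://purl.org/dc/terms/"),
]
result = vocabularies_used(triples)
```

Every consumer of void:vocabulary would need this extra indirection, which is the cost being weighed against the versioning use case.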

I'm not sure such a complication is justified (yet) by real-world demand for the
versioning use case.

Hmm, I would be inclined to simply say it should be the namespace URI of the
vocabulary, including the trailing hash/slash. We can dispense with mention of
rdfs:isDefinedBy. 

What do we mean by vocabulary? A collection of terms under the same namespace, where
the terms appear in the dataset as either { ?s ?term ?o } or { ?s a ?term }?
Or is this too narrow, precluding uses of SKOS for example?
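
As a rough illustration of that narrow definition, the sketch below collects terms only from the predicate position and from the object of rdf:type. The function name and the example.org data are hypothetical. A concept URI used as the object of some other property (as is typical with SKOS) is not picked up, which is the limitation raised in the question.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def vocabulary_terms(triples):
    """Collect terms used as predicates or as the object of rdf:type."""
    terms = set()
    for _s, p, o in triples:
        terms.add(p)          # matches { ?s ?term ?o }
        if p == RDF_TYPE:
            terms.add(o)      # matches { ?s a ?term }
    return terms


# Hypothetical sample data.
triples = [
    ("http://example.org/alice", RDF_TYPE, "http://xmlns.com/foaf/0.1/Person"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.org/doc", "http://purl.org/dc/terms/subject",
     "http://example.org/concepts/linked-data"),
]
terms = vocabulary_terms(triples)
```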


Nov 12, 2009
Project Member #2 richard....@gmail.com
From Keith:

http://myadmin.kwijibo.talis.com/kwijibo-dev3/services/sparql?query=PREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0APREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0D%0APREFIX+owl%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E%0D%0APREFIX+dcterms%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%3E%0D%0Aprefix+void%3A+%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E%0D%0A%0D%0ASELECT+%3Fvocab+%28count%28%3Fdataset%29+as+%3Fno%29+%7B+%0D%0A%3Fdataset+void%3Avocabulary+%3Fvocab+.%0D%0A%7D+GROUP+BY+%3Fvocab%0D%0AORDER+BY+DESC%28%3Fno%29%0D%0A%0D%0A
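
For readability, the query part of the URL above can be decoded with Python's standard library. The sketch below decodes only the SELECT portion (the PREFIX declarations are left out for brevity) and then mirrors the same count-and-sort aggregation over a few made-up (dataset, vocabulary) pairs. The helper name and the example.org dataset names are assumptions for illustration only.

```python
from collections import Counter
from urllib.parse import unquote_plus

# SELECT portion of the URL-encoded query, copied from the link above.
encoded = (
    "SELECT+%3Fvocab+%28count%28%3Fdataset%29+as+%3Fno%29+%7B+%0D%0A"
    "%3Fdataset+void%3Avocabulary+%3Fvocab+.%0D%0A"
    "%7D+GROUP+BY+%3Fvocab%0D%0A"
    "ORDER+BY+DESC%28%3Fno%29"
)
query = unquote_plus(encoded)  # '+' becomes a space, %XX escapes are decoded


def vocabulary_usage(pairs):
    """Count datasets per vocabulary, most used first (like the query)."""
    return Counter(vocab for _dataset, vocab in pairs).most_common()


# Hypothetical (dataset, vocabulary) pairs.
pairs = [
    ("http://example.org/ds1", "http://xmlns.com/foaf/0.1/"),
    ("http://example.org/ds2", "http://xmlns.com/foaf/0.1/"),
    ("http://example.org/ds1", "http://purl.org/dc/terms/"),
]
usage = vocabulary_usage(pairs)
```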
Nov 12, 2009
Project Member #3 K.J.W.Al...@gmail.com
http://tinyurl.com/yz5bbas is a query showing the usage of vocabularies in dataset descriptions
Nov 12, 2009
Project Member #5 K.J.W.Al...@gmail.com
http://tinyurl.com/yfm83el is a similar query for rkbexplorer.

The only misuse I can see is that the coref namespace doesn't have a trailing hash or slash.
Jan 7, 2010
Project Member #6 Michael.Hausenblas
as of 2010-01-07 telco
Labels: Milestone-Release2.0
Jan 18, 2010
Project Member #7 Michael.Hausenblas
(No comment was entered for this change.)
Labels: Prodcut-vocab
Jan 18, 2010
Project Member #8 Michael.Hausenblas
(No comment was entered for this change.)
Labels: Product-vocab
Jan 18, 2010
Project Member #9 Michael.Hausenblas
(No comment was entered for this change.)
Labels: -Prodcut-vocab
Apr 15, 2010
Project Member #10 K.J.W.Al...@gmail.com
So, we agreed that we should say void:vocabulary should point to a namespace URI for
the vocabulary.

As far as I remember, we came to the conclusion that there is a basic lack of
consensus in practice about which URI to use to identify a vocabulary/ontology. Some
ontologies don't use owl:Ontology, some don't use rdfs:isDefinedBy, many are
published at more than one location, etc.

For void:vocabulary to be useful for dataset selection, voiD authors need to use
canonical URIs for the vocabularies/ontologies they link to. I consider that the most
widely understood mechanism for deriving a canonical URI for a vocab is to use the
same mechanism as QNames etc. for stripping the local name off a vocabulary term URI.
e.g.: for http://www.w3.org/2002/07/owl#sameAs , http://www.w3.org/2002/07/owl#
      for http://xmlns.com/foaf/0.1/name , http://xmlns.com/foaf/0.1/
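
A minimal sketch of that QName-style stripping, assuming the local name is everything after the last "#" or "/" in the term URI. The function name is hypothetical, and the split is a simplification that ignores unusual URIs such as those with a "/" inside the fragment.

```python
def namespace_uri(term_uri):
    """Strip the local name, keeping the trailing '#' or '/'."""
    cut = max(term_uri.rfind("#"), term_uri.rfind("/"))
    if cut == -1:
        raise ValueError("no '#' or '/' in term URI: " + term_uri)
    return term_uri[:cut + 1]


owl_ns = namespace_uri("http://www.w3.org/2002/07/owl#sameAs")
foaf_ns = namespace_uri("http://xmlns.com/foaf/0.1/name")
```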

May 5, 2010
Project Member #11 richard....@gmail.com
I agree for slash URIs but not for hash URIs. The problem is that <http://www.w3.org/2002/07/owl#> is not 
a resource that has any description. If you try to dereference it, you actually get 
<http://www.w3.org/2002/07/owl> because of hash stripping, so this URI actually represents an RDF 
document. If you look into that file, it furthermore states that <http://www.w3.org/2002/07/owl> is an 
owl:Ontology and various other metadata. It says nothing about <http://www.w3.org/2002/07/owl#>. 
Without having made a survey, I'd expect to see the same thing for most other hash namespaces. It's certainly 
what we implement in Neologism.
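
The hash stripping mentioned above can be seen without making any network request: the fragment is never part of the URL sent to the server. A small sketch using Python's standard urllib.parse.urldefrag (the helper name document_url is hypothetical):

```python
from urllib.parse import urldefrag


def document_url(uri):
    """Return the URL a client would actually request for this URI."""
    return urldefrag(uri).url


owl_hash = document_url("http://www.w3.org/2002/07/owl#")
owl_term = document_url("http://www.w3.org/2002/07/owl#sameAs")
foaf_term = document_url("http://xmlns.com/foaf/0.1/name")
```

Both owl URIs resolve to the same document URL, while the slash URI is requested unchanged.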

So if we say that the void:vocabulary should point to <http://www.w3.org/2002/07/owl>, then we actually 
end up with nicely linked data. If we say it should point to <http://www.w3.org/2002/07/owl#>, we just 
point at nothing.

It is true that just using the namespace URI (including hash or slash) would be slightly simpler in terms of 
specification and implementation, but removing the hash leads to an actual dereferenceable document that 
typically includes a helpful description of the document itself, and this tighter interlinking is worth the little 
bit of additional complexity IMO.

Therefore my proposed text:

Every value of void:vocabulary SHOULD be the namespace URI of a vocabulary or ontology that is used in the 
dataset. A vocabulary's namespace URI is the URI of any class or property in the vocabulary, with the local 
name stripped, that is, everything after the last "/" or "#" is removed. If the namespace URI ends in a "#", then 
this trailing hash is also removed; if it ends in a slash, the slash is kept.
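
The rule in the proposed text can be sketched in Python as follows. This is only an illustration under the assumption that the local name is everything after the last "/" or "#"; the function name is hypothetical and not part of voiD or any library.

```python
def vocabulary_uri(term_uri):
    """Derive the void:vocabulary value from a class or property URI.

    1. Remove everything after the last '/' or '#'.
    2. If the result ends in '#', remove that hash as well.
    3. If it ends in '/', keep the slash.
    """
    cut = max(term_uri.rfind("#"), term_uri.rfind("/"))
    if cut == -1:
        raise ValueError("no '#' or '/' in term URI: " + term_uri)
    namespace = term_uri[:cut + 1]
    return namespace[:-1] if namespace.endswith("#") else namespace


owl_vocab = vocabulary_uri("http://www.w3.org/2002/07/owl#sameAs")
foaf_vocab = vocabulary_uri("http://xmlns.com/foaf/0.1/name")
void_vocab = vocabulary_uri("http://rdfs.org/ns/void#Dataset")
```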
May 6, 2010
Project Member #12 richard....@gmail.com
I said: “Without having made a survey, I'd expect to see [the hash-less URI for owl:Ontology and rdfs:isDefinedBy] 
for most other hash namespaces.”

This may actually be wrong, so I retract that statement and will try to actually do a little survey.

Regardless of the survey's outcome, I believe that my proposal is the right choice, because it treats the 
vocabulary and the document it is defined in as the same thing and avoids the introduction of an unnecessary 
extra resource.
May 6, 2010
Project Member #13 richard....@gmail.com
Survey results: http://groups.google.com/group/pedantic-web/msg/505c158813c9bff2

So my guess was actually right: only 20% of owl:Ontology statements and 20% of rdfs:isDefinedBy targets go to
URIs ending in a hash.
May 7, 2010
Project Member #14 K.J.W.Al...@gmail.com
Good work Richard!
I am enticed by your linked data argument, but neither of our proposals consistently
results in linked data. For instance, with SIOC, if you dereference the namespace URI
without the hash, you don't get any triples about that hashless URI, only about
http://rdfs.org/sioc/ns#

So I guess we have to decide what is more important: linking to a URI you can find in
the dereferenced graph (which may not always be possible, but this needn't invalidate
the approach), or linking to the "most canonical" URI.
May 7, 2010
Project Member #15 richard....@gmail.com
As per my post to pedantic-web: SIOC predates httpRange-14, they didn't know what they were doing. Still, 
http://rdfs.org/sioc/ns is at least the identifier of a document that has a representation. 
http://rdfs.org/ns/void# is not the identifier of anything at all. Hence, stripping the hash is more linky and 
works for more vocabularies (47.5% vs 22.5%).
Sep 15, 2010
Project Member #16 richard....@gmail.com
Per today's call, we resolved to adopt the text from comment 11, and update section 1.7 accordingly.
Oct 15, 2010
Project Member #17 Michael.Hausenblas
Reviewed section 1.7 as per my action from last time and this is fine by me.
Oct 29, 2010
Project Member #18 K.J.W.Al...@gmail.com
Looks good, except I am concerned that referring to the URI as "namespace URI" might be confusing or inaccurate. In common parlance, "namespace URI" is steps 1 and 2, but not 3, and Richard's argument for step 3 was that "namespace URI" is just a string you concatenate a local name to, not a URI you dereference. So what step 3 really defines is a "canonical vocabulary document URI".
Oct 29, 2010
Project Member #19 richard....@gmail.com
@Keith, in Revision 140 I changed the wording to this:

“Every value of void:vocabulary SHOULD be a URI that identifies a vocabulary or ontology that is used in the dataset. These URIs can be found as follows: …”

Does this address your comment?
Oct 29, 2010
Project Member #20 K.J.W.Al...@gmail.com
yep
Oct 29, 2010
Project Member #21 richard....@gmail.com
Great :-) Closing.
Status: Fixed