rdflib
A Python library for working with RDF, a simple yet powerful language for representing information.
  
  
  
    
Updated Mar 13, 2008 by risukun
Labels: Intro
IntroSparql  
Introduction to using SPARQL to query an rdflib graph

Create an rdflib Graph

You might parse some files into a new graph (IntroParsing) or open an on-disk rdflib store.

from rdflib.Graph import Graph
g = Graph()
g.parse("http://bigasterisk.com/foaf.rdf")
g.parse("http://www.w3.org/People/Berners-Lee/card.rdf")

LiveJournal produces FOAF data for their users, but they seem to use foaf:member_name for a person's full name. For this demo, I made foaf:name act as a synonym for foaf:member_name (a poor man's one-way owl:equivalentProperty):

from rdflib import Namespace
FOAF = Namespace("http://xmlns.com/foaf/0.1/")
g.parse("http://danbri.livejournal.com/data/foaf") 
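# Copy each foaf:member_name statement to foaf:name. list() snapshots the
# matches so the graph is not modified while it is being iterated.
for s, _, n in list(g.triples((None, FOAF['member_name'], None))):
    g.add((s, FOAF['name'], n))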
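
The one-way copy can also be tried without any network access. The following is a minimal sketch that uses a plain Python set of tuples as a stand-in for the graph; the add_synonym helper, the "ex:" subjects, and the sample triples are hypothetical and not part of rdflib.

```python
# Stand-in for a graph: a set of (subject, predicate, object) tuples.
FOAF = "http://xmlns.com/foaf/0.1/"

def add_synonym(triples, source_pred, target_pred):
    """Copy every source_pred statement to target_pred (one direction only)."""
    # Snapshot the matches first so the set is not modified while iterating.
    matches = [(s, o) for (s, p, o) in triples if p == source_pred]
    for s, o in matches:
        triples.add((s, target_pred, o))
    return len(matches)

# Hypothetical sample data: one LiveJournal-style name, one regular name.
triples = {
    ("ex:danbri", FOAF + "member_name", "Dan Brickley"),
    ("ex:drewp", FOAF + "name", "Drew Perttula"),
}
copied = add_synonym(triples, FOAF + "member_name", FOAF + "name")
```

The copy is one-way: statements that already use foaf:name are left alone, and nothing is written back to foaf:member_name.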

Run a Query

for row in g.query('SELECT ?aname ?bname WHERE { ?a foaf:knows ?b . ?a foaf:name ?aname . ?b foaf:name ?bname . }', 
                   initNs=dict(foaf=Namespace("http://xmlns.com/foaf/0.1/"))):
    print "%s knows %s" % row

The results are tuples of values in the same order as your SELECT arguments.

Timothy Berners-Lee knows Edd Dumbill
Timothy Berners-Lee knows Jennifer Golbeck
Timothy Berners-Lee knows Nicholas Gibbins
Timothy Berners-Lee knows Nigel Shadbolt
Dan Brickley knows binzac
Timothy Berners-Lee knows Eric Miller
Drew Perttula knows David McClosky
Timothy Berners-Lee knows Dan Connolly
...
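
Because each row is a tuple in SELECT order, it can be unpacked directly in the for statement. This is a small sketch, assuming plain strings in place of the rdflib Literal values a real query returns; the rows are copied from the output above rather than fetched from a graph.

```python
# Sample rows copied from the output above; a real run would get them
# from g.query(...), where each value is an rdflib term, not a plain str.
rows = [
    ("Timothy Berners-Lee", "Edd Dumbill"),
    ("Timothy Berners-Lee", "Jennifer Golbeck"),
    ("Dan Brickley", "binzac"),
    ("Drew Perttula", "David McClosky"),
]

# Unpack each row by position: first ?aname, then ?bname.
known_by = {}
for aname, bname in rows:
    known_by.setdefault(aname, []).append(bname)

for aname in sorted(known_by):
    print("%s knows %s" % (aname, ", ".join(known_by[aname])))
```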

Namespaces

The Graph.query 'initNs' argument is a dictionary of namespaces to be expanded in the query string. In a large program, it's common to use the same dict for every query. You might even hack your graph instance so that the initNs arg is already filled in.
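
A gentler alternative to patching the graph instance is a small wrapper that fills in initNs for you. The sketch below is only one possible approach: bind_namespaces is a hypothetical helper, and it assumes nothing about the graph beyond a query(query_string, initNs=...) method as used above. FakeGraph is a stand-in so the example runs without rdflib installed.

```python
def bind_namespaces(graph, namespaces):
    """Return a query function with initNs already filled in."""
    def query(query_string, **kwargs):
        # Callers may still pass initNs explicitly to override the default.
        kwargs.setdefault("initNs", namespaces)
        return graph.query(query_string, **kwargs)
    return query

class FakeGraph(object):
    """Hypothetical stand-in that records what query() receives."""
    def __init__(self):
        self.calls = []

    def query(self, query_string, initNs=None, initBindings=None):
        self.calls.append((query_string, initNs, initBindings))
        return []

ns = {"foaf": "http://xmlns.com/foaf/0.1/"}
fake = FakeGraph()
query = bind_namespaces(fake, ns)
rows = query("SELECT ?name WHERE { ?p foaf:name ?name }")
```

With a real graph you would call bind_namespaces(g, dict(foaf=FOAF)) once and use the returned function everywhere else.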

If someone knows how to use the empty prefix (e.g. "?a :knows ?b"), please write about it here and in the Graph.query docs.

Bindings

Just like with SQL queries, it's common to run the same query many times with only a few terms changing. rdflib calls this initBindings:

from rdflib import Namespace, URIRef
FOAF = Namespace("http://xmlns.com/foaf/0.1/")
ns = dict(foaf=FOAF)
drew = URIRef('http://bigasterisk.com/foaf.rdf#drewp')
for row in g.query('SELECT ?name WHERE { ?p foaf:name ?name }', initNs=ns, initBindings={'?p' : drew}):
    print row

Output:

(rdflib.Literal('Drew Perttula', language=None, datatype=None),)
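
To run the same query for several people, only the binding needs to change between calls. The following is a sketch under stated assumptions: names_for is a hypothetical helper, FakeGraph is a stand-in that answers from a dict so the example runs without rdflib, and the second URI is made up to show the empty-result case.

```python
QUERY = 'SELECT ?name WHERE { ?p foaf:name ?name }'

def names_for(graph, people, namespaces):
    """Run the same query once per person, changing only the ?p binding."""
    result = {}
    for person in people:
        rows = graph.query(QUERY, initNs=namespaces,
                           initBindings={'?p': person})
        # Each row is a 1-tuple because the query selects a single variable.
        result[person] = [row[0] for row in rows]
    return result

class FakeGraph(object):
    """Hypothetical stand-in: answers from a dict instead of RDF data."""
    def __init__(self, names):
        self.names = names

    def query(self, query_string, initNs=None, initBindings=None):
        person = initBindings['?p']
        return [(name,) for name in self.names.get(person, [])]

drew = 'http://bigasterisk.com/foaf.rdf#drewp'
nobody = 'http://example.org/nobody'  # made-up URI with no data
fake = FakeGraph({drew: ['Drew Perttula']})
found = names_for(fake, [drew, nobody], {})
```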

Comment by ewan.klein, Feb 03, 2008

I was curious about the issue with empty prefixes -- a statement like this in the query prolog causes an error:

PREFIX : <http://xmlns.com/foaf/0.1/>

The reason seems to be the definition of convertTerm() in rdflib/sparql/bison/SPARQLEvaluate.py, in particular:

    elif isinstance(term,QName):
        #QNames and QName prefixes are the same in the grammar
        if not term.prefix:
            return URIRef(queryProlog.baseDeclaration + term.localname)

That is, the inner condition is true because term.prefix is the empty string, but queryProlog.baseDeclaration defaults to None, so the concatenation fails. A workaround seems to be to replace the PREFIX statement above with:

BASE <http://xmlns.com/foaf/0.1/>

Then queryProlog.baseDeclaration has the required value, and the return statement succeeds.
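
The behaviour described above can be reproduced in isolation. This is a simplified model of the quoted branch, not the real rdflib code: resolve_qname and its parameters are hypothetical, and plain strings stand in for URIRef.

```python
def resolve_qname(prefix, localname, base, prefixes):
    """Simplified model of the branch quoted above; not the real rdflib code."""
    if not prefix:
        # Empty prefix: the URI is built from the BASE declaration only,
        # so any PREFIX mapping for "" is never consulted.
        return base + localname
    return prefixes[prefix] + localname

FOAF = "http://xmlns.com/foaf/0.1/"

# With BASE declared, the empty prefix resolves as expected.
with_base = resolve_qname("", "knows", FOAF, {})

# Without BASE, base is None and the concatenation raises TypeError,
# even though a mapping for the empty prefix was supplied.
try:
    resolve_qname("", "knows", None, {"": FOAF})
    failed = False
except TypeError:
    failed = True
```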

